GCC 6 Will Warn You About Misleading Code Indentations

jimrandomh · on Jan 10, 2016

> "Specifically, a warning is issued for if, else, while, and for clauses with a guarded statement that does not use braces, followed by an unguarded statement with the same indentation."

> "The warning is not issued for code involving multiline preprocessor logic ... The warning is not issued after a @code{#line} directive, since this typically indicates autogenerated code"

Seems like a good thing to have, and pretty conservative about not causing false positives. In particular, constructs like double-iteration should be fine:

    for(y=0; y<ymax; y++)
    for(x=0; x<xmax; x++)
    {
        //...
    }

Which I checked because I like this construct and people very rarely think about it.

JoshTriplett · on Jan 10, 2016

That pattern still seems like obfuscation to me. Why not indent the inner for loop, to make the control flow more obvious?

For instance, consider how "break" or "continue" inside the braces would work.

jimrandomh · on Jan 10, 2016

The reason for doing it that way, is to emphasize when two loops together are conceptually iterating over a single thing, and have some symmetry, like iterating over the cells in a 2D grid. It makes it possible to line up the elements of the two loops, emphasizing the similar structure.

I agree that it would be very bad form to use "break" inside a block like that. But in the cases where this construct applies, it would still be bad form to break if the outer block had curly braces and indentation, because skipping the rest of a row isn't a natural operation in the sort of way that skipping the rest of a collection is.

S4M · on Jan 10, 2016

Even with that explanation I am not convinced of the soundness of that notation. For example, if you think you have a particular row that is causing you some problem and want to add a print statement for the row number, your code becomes, if you are too quick and forget to add braces:

    for(y=0; y<ymax; y++)
    printf("y=%i\n",y);
    for(x=0; x<xmax; x++)
    {
        //...
    }

And it will not work anymore as intended.

Granted it's a small mistake and it will not take more than 5 minutes to fix, but it still add a small load to your already busy mind.

jimrandomh · on Jan 10, 2016

This doesn't seem like a realistic mistake to me. The printf is just at the wrong indent level; this is the same mistake as:

    for(i=0; i<imax; i++)
    printf("i=%i\n",i)
    {
        //...
    }

Which also compiles and which is wrong in the same way. I don't recall ever having made or having seen this.

S4M · on Jan 10, 2016

Well, not really, because in the example you just gave, you would put the printf right between the braces as it should be. In the case originally discussed, you have no braces so you would have to add them, and that can be forgotten.

GFK_of_xmaspast · on Jan 10, 2016

Totally agreed, but in this case if the author had declared the variable in the for-loop:

  for(int y=0; y < ymax; y++) 
     ...

and didn't shadow y from an enclosing scope, that would have given the compiler a chance to catch that error.

seiji · on Jan 10, 2016

If the option is between "something we personally like but has edge cases or requires other understanding that could confuse some readers" versus "something everybody will understand," the first case should never happen.

It's almost like everybody forgot about https://matt.sh/howto-c#_formatting already. :(

rdtsc · on Jan 10, 2016

However one can say they are not quite the same thing, as you mentioned, how "break"s work, so it is obfuscating things a bit. But even performance wise: in order in which you iterate over the arrays can make a significant difference because of cache effects. So it is important to make obvious which one is the nested loop and which one is the outer one.

jlarocco · on Jan 11, 2016

> The reason for doing it that way, is to emphasize when two loops together are conceptually iterating over a single thing, and have some symmetry, like iterating over the cells in a 2D grid. It makes it possible to line up the elements of the two loops, emphasizing the similar structure.

I'm not convinced. Even iterating over a 2D grid, one loop is iterating over the columns, and the other the rows, and indentation (and braces) can help see which is which.

contravariant · on Jan 10, 2016

I don't see how 'continue' would be confusing, it would stop the current calculation and start the next one, same as with normal loops.

The behaviour of 'break' might be confusing but then that's always the case in nested loops, no matter how you notate them.

wasyl · on Jan 10, 2016

I would argue that `break` intuitively makes you go back an indent, so if you don't indent after the first `for`, one can assume `break` just exits both loops (goes back from lvl 1 to lvl 0 indent)

jimrandomh · on Jan 10, 2016

This doesn't quite work, since break skips over if statements (and most breaks have at least one if statement around them). So you have to look glance at the keyword at the elbow of each indent anyways, to know where it's going.

cnvogel · on Jan 10, 2016

While I understand that it's mostly a question of aesthetics, I don't like this also because it might break the expected semantics of continue/break in such a loop. When I iterate over things more complex than a single variable where the conditions in a for() would become ugly, I tend to put the domain-specific "increment" into a small static function:

    static int
    next_pixel(*x, *y, width, height) {
        (*x)++;
        ...
        if (last pixel...)
            return 0;
        return 1;
    }

    /* later on... */

    do {
        /* do something with coordinates x, y */
    } while (next_pixel(&x, &y, WIDTH, HEIGHT));

This is, of course, not only restricted to 2D arrays, but could cover walking of a tree or a filesystem...

noobie · on Jan 10, 2016

I apologize for the stupidity of my question but shouldn't that be

    for(y=0; y<ymax; y++)
    {
    for(x=0; x<xmax; x++)
    {
        //...
    }
    }

?

Edit: Oh, I get it, {} are only "mandatory" if we have more than one instruction, the second for loop gets counted as one instruction and hence the omitting of {}. Thanks!

halosghost · on Jan 10, 2016

In C, control flow like if, for, while, do, else, etc., take the rest of the line of code (i.e., till the next semi-colon) and runs it under the conditions of the clause.

braces—{}—allow you to take multiple lines of code and have them be guarded at the same time rather than having to repeat the condition. GP is taking advantage of the fact that the next for clause will be considered a single statement, and therefore the second set of braces are not technically needed.

However, omitting braces for single-line snippets has caused plenty of havoc in the past (cf. Apple's gotofail). Some would say it is better to always include braces in control flow.

ddevault · on Jan 10, 2016

It works without the additional braces. Consider this:

    for (i=0;i<imax;i++)
        printf("%d\n", i);

You can omit the braces if the body of the loop is a single block, just like an if statement. You can also do this:

    if (a == b)
        if (b == c)
        {
            do_thing();
        }

Same deal with the for loop. The nested loop is a single block.

jimrandomh · on Jan 10, 2016

If I were doing it that way, I wouldn't give the outer blocks' curly braces their own lines. For example in C#, where pretty much every file has a single "namespace" block containing a single "class" block, I write it as:

    namespace SomeNamespace {
    class SomeClass
    {
        //...
    }}

Which avoids having shifted the whole file an extra indent level to the right. But I haven't seen anyone else doing this.

Avshalom · on Jan 10, 2016

I'm not writing C# but yeah: AS3 and Java both have the Package{class{ two-step before you actually write any code. I don't even indent inside class:

  package{
  class{
  main{
    now we indent
  }
  }}

threeseed · on Jan 10, 2016

You don't indent after a package e.g.

https://github.com/apache/spark/blob/master/core/src/main/ja...

And it's fair enough to indent with a main since it is so infrequent. Ideally you only have one main entry point.

avar · on Jan 10, 2016

No, the two mean the same thing in C, e.g.:

    if (x)
        y;

Is the same as:

    if (x) {
        y;
    }

Macha · on Jan 10, 2016

Just in case anyone who has never used C (or Java, or C# etc etc) is confused, it's not the indentation that is causing this meaning here:

    if (x) y;

also has the same meaning.

recursive · on Jan 10, 2016

Why would it be? Certainly not for readability. I think it was posted as intended.

chris_wot · on Jan 10, 2016

Ew! That's not any better at all...

Man's gotta have an opinion. At least C doesn't have optional semi-colons.

thiht · on Jan 11, 2016

If I see this, I'll immediately think it's a bug, the difference between x and y is not even obvious at first sight, it looks like the line was duplicated accidentally.

Spl3en · on Jan 10, 2016

What about this ?

    for(y=0; y<ymax; y++) {
    for(x=0; x<xmax; x++) {
        //...
    }
    }

eru · on Jan 11, 2016

I'd put the two loop-headers on the same line in that case.

seanwilson · on Jan 11, 2016

I wish more languages would have what are essentially compiler enforced style guidelines (e.g. indentation, variable/function naming, file naming). So much time is wasted debating style, dealing with code review comments about style and making/using related tools.

yoz-y · on Jan 11, 2016

I agree, although I think it should be done by a tool separate from the compiler. Purely because I'd have a stable compiler first and more code = more bugs.

KindDragon · on Jan 10, 2016

PVS Studio support similar warning V640 http://www.viva64.com/en/d/0258/

rwmj · on Jan 10, 2016

I wonder if it'll be able to make sense of our macro-abusing XML-in-C: https://github.com/libguestfs/libguestfs/blob/master/src/lau... (Scroll up a couple of pages to see the definitions of the macros)

userbinator · on Jan 10, 2016

This would certainly be a useful feature, especially for those learning C, but I wonder whether checking stylistic things should really be in the compiler or part of a separate tool.

pcwalton · on Jan 10, 2016

> I wonder whether checking stylistic things should really be in the compiler or part of a separate tool.

These checks absolutely should be part of the compiler (and this isn't a stylistic warning; it's a correctness warning). Our software ecosystem would have been in a far worse place if C compiler developers had decided to not support the -Wall switch in favor of lint(1). There's a lot more friction involved with using an external tool compared to just modifying CFLAGS.

chris_wot · on Jan 10, 2016

The genie is out of the bottle already, gcc quite sensibly will warn you if it sees an assignment operator in an if statement. I'm sure it's saved countless man hours and pretty awful bugs since it was introduced :-)

PeCaN · on Jan 10, 2016

Specifically, it doesn't warn you if the assignment statement is in parenthesis, so while ((xs = xs->tail)) doesn't warn. I really like how GCC handles that.

chris_wot · on Jan 13, 2016

I just noticed this comment. OK, you've definitely got a good point. That is elegant.

ifdefdebug · on Jan 10, 2016

I hope it doesn't warn for things like

if(0 != (i = getValue()))

tspiteri · on Jan 10, 2016

It doesn't since the assignment is in brackets. You can also write

    if ((i = getValue()))

sigjuice · on Jan 10, 2016

Why not be absolutely sure?

  if (0 != (0 != (i = getValue())))

ifdefdebug · on Jan 11, 2016

I am aware of the redundancy, this was about the new warnings. If you can't do if(i = getValue()) without warning, I was wondering if assignments as a part of a larger expression would pass.

chris_wot · on Jan 10, 2016

I hope it does!

PeCaN · on Jan 10, 2016

Why would that be a warning? It's quite clear what that code is doing and is a reasonably common C idiom.

chris_wot · on Jan 10, 2016

Yeah, fair enough. It's not bad - I had to think about my objection to it and it's probably largely unfounded.

malka · on Jan 10, 2016

the problem being refered to here is :

> if (i = 0)

instead of

> if (i == 0)

archgoon · on Jan 10, 2016

Any tool that does syntax checking of a c++ code (correctly) is rerunning the entire compilation process except for the generation of object code and optimization. Since build times are already an issue for C++ projects (even unoptimized), running the compilation process twice (or more) would not make developers happy. Also, I suspect that C++ devs would be philosophically opposed to spending clock cycles on recomputation ;).

Clang lets you write plugins (which at least solves the code reuse problem), however, these are currently strongly coupled to the compiler version itself. See for example, the chromium guide to writing clang plugins "Don't Write A Clang Plugin"[1]

https://chromium.googlesource.com/chromium/src/+/master/docs...

tomjakubowski · on Jan 10, 2016

You don't need to do a full semantic analysis (e.g. type checking, const checking) for this kind of check, do you?At worst you'd need to expand macros and build an AST with information about white space preserved.

JoshTriplett · on Jan 10, 2016

This doesn't represent a style check, but rather a likely indication of incorrect code. By contrast, the compiler won't warn for this code:

    if (condition) {
        foo();
    bar();
        baz(); }

__david__ · on Jan 10, 2016

Nice, maybe this will finally quiet the wrongheaded opinion that braces should always be used.

yoz-y · on Jan 11, 2016

I do not think that the opinion of mandatory braces is wrong - in a language that does have braces.

If not for anything else, then for the fact that yi{ will _always_ yank the current block for me. Omitting braces will ruin this.

ArkyBeagle · on Jan 10, 2016

Seems like simply using astyle would be an improvement over catching this at compile-time.

nn3 · on Jan 10, 2016

The point of the warning is to find bugs, not enforce a particular coding style.

Running astyle would make it worse because you would hide the bug.

ArkyBeagle · on Jan 10, 2016

Having used astyle for exactly this, I simply have to disagree. Of course, you have to look at the code between running astyle and compiling it.

hoodoof · on Jan 10, 2016

So whitespace matters huh?

leni536 · on Jan 10, 2016

No more "goto fail;"?

jongraehl · on Jan 10, 2016

clang-format is better

jsizz · on Jan 10, 2016

And thus, every boneheaded project that ships with -Werror enabled will inflict an excrement tornado of FAIL on its grateful users and packagers.

THIS IS WHY YOU SHOULD NOT USE -Werror. NOT EVER. (And if you can't bring yourself to do anything about mere warnings, you shouldn't be in software engineering.)

tomjakubowski · on Jan 10, 2016

From the article:

> This new warning isn't enabled by default.

And even if it were, it's a matter of adding -Wno-misleading-indentation to make builds green again. Not a big deal. And besides, it's likely that fixing those warnings could be done mechanically. (and it's hardly true that every project that ships with -Werror has such misleading indentation somewhere in its code)

cperciva · on Jan 11, 2016

And even if it were, it's a matter of adding -Wno-misleading-indentation to make builds green again

How does that help the person who is building the release you shipped six months ago with a new compiler?

legulere · on Jan 10, 2016

You should have -Werror enabled on debug builds when developing and disabled on release builds.

MereInterest · on Jan 10, 2016

Huh. I would think it to be the other way around. When debugging, let me comment stuff out without erroring due to an unused variable or an unused parameter. Let me use a C-style cast to check the bit representation of an object. Then when building the release build, enable -Werror to make sure that I'm not sweeping anything under the rug.