What If? Declaring variables in if statements, and the curiosities of scope that follow 🔭

tl;dr: Variables declared in if statement conditions are accessible in the else statement as well.

Like all good blog posts, this one first came to me while I was banging my head against the keyboard, cursing and screaming out for a maternal figure to hold me while I incoherently babbled about a crash caused by seemingly-nonsensical code.

“This shouldn’t compile! This CAN’T compile!”

And yet, despite the odds, C++ had decided yet again to show me a new side of itself. Please take a moment to peruse the following example code, based on the original that caused me such pain, and see if you can spot the bug that causes a null pointer dereference.

Of course, to compile this properly we need some further definitions. Check out https://repl.it/@Winwardo/IfElse-Chain-With-Bug for the full example.

To save you the trouble, the error is on line 12, where I attempt to access asDerivedA->predicate.

That’s right: despite declaring asDerivedA in a separate if statement above, I’m able to access it in the else-if statements below. No, this is not a compiler bug; this is clearly and unambiguously defined in the C++ specification. Before we get to that, let’s take a look at some code.

If you’re new to the syntax that’s used in the code sample, if (int i = 5) { is a perfectly valid way of declaring and defining a variable, then using it inside the given if statement. It allows us to write terser, clearer code, while also limiting the scope of a variable. It’s good practice to keep the scope of a variable as small as possible, both to avoid accidental re-use, and to lower the mental overhead on future programmers.

Wait, what’s scope?

If you’ve not come across the term before, the scope of a variable is the code it’s accessible in. Variables defined within a function, if statement, or other control block are scoped to that block. This is important to stop us accidentally re-using variables that were declared elsewhere in code.

The concept of scope exists in virtually every major language, including C++, JavaScript, C#, Rust, and even Haskell. Note that specific definitions and rules vary from language to language.

You might also have heard of global scope, where a variable is accessible from anywhere, inside or outside a function or block. Relying on it is considered poor practice in many cases.

What’s happening when we declare inside an if statement? 🐿️

Scope in C++ isn’t just for functions: any time you open a new pair of curly braces (a block), you create a new scope. You should also know that if we’re declaring (and defining) a variable inside the if condition, the condition becomes the value of that variable, converted to a boolean. For converting a number to boolean, we ask the question “Is this number not 0?” If the number is 0, the boolean is false, otherwise it’s true.

Knowing this, we can transform a simple if declaration as such:

Both first and second are semantically the same, and produce identical assembly output from the compiler. You can prove this to yourself here: https://godbolt.org/g/zRC8BT

If you think this is a fluke or want further proof that this is the case, here you can see the C++ specification Decreeing That It Must Be So, in section [stmt.select]. Don’t worry if this section of this blog-post is confusing!

The C++ specification shows that any variable declared inside the if condition is effectively hoisted into an enclosing scope, and is available inside both the if and else statements. You can view the current C++ specification draft at http://eel.is/c++draft/stmt.select

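Because of this choice, any variable declared inside the condition (or, since C++17, the init-statement) must be accessible to the else statement. This alone may not seem enough to cause the bug above. However, an else-if is not a special construct in C++: it is simply an else whose statement is another if, with each else binding to the nearest unmatched if, which is how C++ resolves the dangling else problem. The second if therefore sits inside the scope of the first, so those variables must also be visible in any else-if statements.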

If this is not clear, note that the following two functions are identical:

This is a really neat trick! I can already see where I can use this! 😍

NO.

STOP RIGHT THERE.

Don’t.

YOU.

DARE.

Having discussed this with several colleagues, we’ve been unable to come up with a single legitimate use case for this behaviour. Every example we’ve considered can be rewritten in a much clearer and less surprising fashion.

To entertain the idea however, here is the sort of code you may have the misfortune of stumbling across.

Note that we’ve encoded some state that can be accessed via operator bool(), which will be called when we convert the data to a boolean type.

This is not immediately clear code, and it’s certainly astonishing. Intent has been obscured behind layers of abstraction that don’t provide us any real benefit.

This is how your colleagues will feel if they come across you using this anti-pattern. https://unsplash.com/photos/2Ts5HnA67k8

If you’re suggesting against this, why was it in your code base? 🤔

That’s a fair question. I was working on a data-exporter, which had to legitimately cast a pointer to one of several potential types in order to extract information for serialization. (No, polymorphism would not have been the solution here.)

It was an if-else chain, and I had copy-pasted one part, forgetting to update the variable name in the process. As shown by PVS-Studio’s brilliant article titled The Last Line Effect, copying and pasting similar lines downwards regularly results in a bug.

To conclude

C++ is a big language, and despite my having worked in it for five years, it still reveals new parts of itself to me every day. It’s a nice reminder that, even if you think you’re writing code well, without static analysis tools and effective automated testing, it’s very easy to miswrite some code and introduce a crash to a system.

Have you come across this curiosity anywhere? Can you think of an example where it would be useful to use? Let us know in the comments 😄

Thanks to Jessica Baker, Chantelle Porritt, Andy Bastable and Stuart Milne for proof-reading and discussions.