Automation in C++
Shreemoyee describes the elegance and beauty of not having to explicitly mention variable types in C++ by using the ‘auto’ keyword.
By: Shreemoyee Sarkar, a Quantitative Developer in the Financial Modeling Group
Prologue
It was a beautiful summer night in Budapest. The Danube was babbling in front of me with the baroque Budapest skyline sparkling as its backdrop. I was basking in the dreamy moonlight and marveling at the simple joys of life: writing x = 2 in Python as the interpreter happily took care of type deduction. That's when it occurred to me that I would be bad at writing relatable romantic literature!
Life-altering epiphanies aside, I also realized that the beauty of not having to explicitly mention variable types extends beyond Python, all the way to C++.
Enter the auto keyword.
PS. This is Part II of a series about features of modern C++ used for developing BlackBird, a next-generation, state-of-the-art security analytics platform. Part I covered smart pointers.
Introduction
The auto keyword in C++ is a placeholder type specifier that:
- For variable declarations, specifies that the type will be automatically deduced from the initializer.
- For functions, specifies that the return type will be deduced from its return statements.
- For non-type template parameters, specifies that the type will be deduced from the arguments.
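To make the three bullets concrete, here is a minimal sketch with one example of each use. The names answer, half and Constant are purely illustrative, and the third case assumes a compiler in C++17 mode (the second needs C++14).

```cpp
#include <type_traits>

// 1. Variable declaration: the type comes from the initializer.
auto answer = 42;  // int

// 2. Function (C++14 and later): the return type comes from the return statement.
auto half(double x) { return x / 2.0; }  // returns double

// 3. Non-type template parameter (C++17 and later): the type comes from the argument.
template <auto N>
struct Constant {
    static constexpr auto value = N;
};

// Compile-time checks of what was deduced in each case.
static_assert(std::is_same<decltype(answer), int>::value, "variable");
static_assert(std::is_same<decltype(half(1.0)), double>::value, "function");
static_assert(std::is_same<decltype(Constant<5>::value), const int>::value, "int argument");
static_assert(std::is_same<decltype(Constant<'c'>::value), const char>::value, "char argument");
```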
I can pretty much end the article here and happily assume that the reader is now fully acquainted with auto. But my love for verbosity and my need for intelligibility would not let me.
Deep dive
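Using the keyword auto, we can replace a declaration that spells out the full type of a variable with one that lets the compiler deduce the type from the initializer.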
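As a small sketch of that kind of replacement (makeNames and both function names are hypothetical, chosen only for illustration), the two functions below do exactly the same thing:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper that hands back some data to work with.
std::vector<std::string> makeNames() { return {"Ada", "Grace"}; }

// Every type is written out by hand.
std::size_t countExplicit() {
    std::vector<std::string> names = makeNames();
    std::vector<std::string>::size_type count = names.size();
    return count;
}

// The compiler deduces the very same types from the initializers.
std::size_t countAuto() {
    auto names = makeNames();
    auto count = names.size();
    return count;
}
```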
This is just one of the magical things that can be accomplished using auto. Let’s take a look at a list of them.
1. The curse of uninitialized variables
Imagine declaring a variable with an explicit type, such as int x;, and forgetting to initialize x. Uninitialized variables cause the worst kind of bugs in C++ (believe me, I have been a victim). However, just writing auto x; produces a compile error, forcing the coder to initialize the variable and therefore reducing the avenues for such errors.
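A minimal sketch of the difference follows; the function name is illustrative, and the two problematic declarations are left as comments so that the snippet compiles.

```cpp
int initializedByConstruction() {
    // int x;     // compiles, but x holds an indeterminate value until assigned
    // auto y;    // does not compile: auto needs an initializer to deduce from
    auto x = 0;   // deduced as int, and necessarily initialized
    return x;
}
```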
2. Verbose variable declarations
Imagine this: you have a variable that stores the serial number and names of certain entities in an unordered map. Now, just to complicate matters further, each serial number is associated with an arbitrary number of entities. So it looks something like this:
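A minimal sketch of such a map, with illustrative names and data (makeEntities is a hypothetical helper, not part of any library):

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Each serial number maps to an arbitrary number of entity names.
std::unordered_map<int, std::vector<std::string>> makeEntities() {
    return {
        {1, {"alpha", "beta"}},
        {2, {"gamma"}},
        {3, {}},
    };
}
```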
Awesome. Now we wish to print the contents of the map by simply iterating over it. So we write:
I won’t lie, that was exhausting to type. Not to mention, exhausting to read.
So instead, we write:
Much cleaner, right? We let the compiler deduce the type from the initialization. Let’s go a step further:
Now we are talking. Combining auto with a C++11-style range-based for loop shortened the code significantly.
3. Ability to hold lambda closures
auto also allows us to store lambda expressions in named variables the same way as any ordinary value, so we can reuse the lambda without having to repeat code. For example, we can store a lambda that compares two instances of class Entity represented by pointers.
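A minimal sketch of such a comparison follows. The Entity struct with a single serial field is hypothetical, and std::unique_ptr is my assumption for the pointer type, since the text does not say which kind of pointer is used.

```cpp
#include <memory>

// Hypothetical Entity with one comparable field.
struct Entity {
    int serial;
};

// The closure type has no name we could spell, so auto is the natural way to hold it.
// The lambda compares the pointed-to entities, not the pointer values.
auto entityLess = [](const std::unique_ptr<Entity>& a, const std::unique_ptr<Entity>& b) {
    return a->serial < b->serial;
};
```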
How else might we go about writing this? Using a std::function object. std::function is a standard library template that generalizes the idea of a function pointer: it can refer to any “callable object” (i.e. anything that may be invoked like a function). When creating a std::function object, we are obliged to explicitly specify the signature of the function it will refer to. So the above lambda can alternatively be stored in a std::function whose template argument spells out the full signature of the comparison.
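A sketch of the std::function alternative, using the same hypothetical Entity struct and std::unique_ptr assumption as an illustration:

```cpp
#include <functional>
#include <memory>

// Hypothetical Entity with one comparable field.
struct Entity {
    int serial;
};

// The whole signature has to be repeated in the template argument.
std::function<bool(const std::unique_ptr<Entity>&, const std::unique_ptr<Entity>&)>
    entityLessFn = [](const std::unique_ptr<Entity>& a, const std::unique_ptr<Entity>& b) {
        return a->serial < b->serial;
    };
```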
Obviously, writing something as elaborate as that opens up possibilities for errors. But that is not the only downside of using std::function. A std::function variable holding a closure has a fixed size for any given signature (being an instantiation of the std::function template). This size may not be adequate for the closure it’s asked to store, and when that’s the case, the std::function constructor allocates heap memory to store the closure. It therefore typically uses more memory than an auto-declared closure, and it is usually slower as well, because its implementation tends to restrict inlining and adds an indirect function call.
4. Writing functions whose return type is not known
Suppose we need to write a template function that takes a container as well as an index and performs some operation on the element of the container at that index. What’s important here is that the type returned by a container’s operator[] depends on the container. Since the types of the container and the index are known only once the function's parameters have been declared, the same goes for the return type. So we write the function with a leading auto and a trailing return type. When we do that, the function returns whatever type operator[] returns when applied to the passed-in container, as desired: the decltype expression following the parameter list supplies the type that the leading auto stands in for.
Puzzles
The puzzles here are going to be rather simple, as can be imagined. For variables A through H in the auto statements below, you have to guess what type the compiler will deduce.
Solutions
1. std::initializer_list<int>
2. std::initializer_list<int>
3. int
4. int*
5. const int*
6. const int*
7. const int&
8. std::vector<int>::size_type
(In number 6 the top-level const is removed. Whether the pointer itself is constant does not matter for a copy of it, so that const is discarded, but the const-ness of the memory it points to still sticks.)
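For readers who want to experiment, here is a hypothetical set of declarations that yields exactly the eight types listed above. These are my own illustrative statements, not necessarily the ones from the puzzle, and the helper variables i, ci, cpc and v are assumptions as well. The static_asserts let the compiler confirm each answer.

```cpp
#include <initializer_list>
#include <type_traits>
#include <vector>

std::vector<int>::size_type puzzleSketch() {
    int i = 0;
    const int ci = 1;
    const int* const cpc = &ci;  // const pointer to const int
    std::vector<int> v{1, 2, 3};

    auto A = {1, 2, 3};   // std::initializer_list<int>
    auto B = {7};         // std::initializer_list<int>, even with one element
    auto C = 7;           // int
    auto D = &i;          // int*
    auto E = &ci;         // const int*
    auto F = cpc;         // const int*: the pointer's own const is dropped
    const auto& G = i;    // const int&
    auto H = v.size();    // std::vector<int>::size_type

    static_assert(std::is_same<decltype(A), std::initializer_list<int>>::value, "A");
    static_assert(std::is_same<decltype(B), std::initializer_list<int>>::value, "B");
    static_assert(std::is_same<decltype(C), int>::value, "C");
    static_assert(std::is_same<decltype(D), int*>::value, "D");
    static_assert(std::is_same<decltype(E), const int*>::value, "E");
    static_assert(std::is_same<decltype(F), const int*>::value, "F");
    static_assert(std::is_same<decltype(G), const int&>::value, "G");
    static_assert(std::is_same<decltype(H), std::vector<int>::size_type>::value, "H");
    return H;
}
```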
What NOT to do?
auto can be bittersweet, which is why using it is more of a choice than a mandate. Some things should be avoided while using it, as listed:
1. Using auto with a "proxy class"
A proxy class is a class that exists for the purpose of emulating and augmenting the behavior of some other type. Some proxy classes, such as std::shared_ptr and std::unique_ptr, are designed to be apparent to clients. Other proxy classes are designed to act more or less invisibly. std::vector<bool>::reference is an example of such “invisible” proxies, as is its std::bitset compatriot, std::bitset::reference.
What it means is this. Suppose I have a function that returns a vector of bools for an Entity object, where each bool represents whether the entity has a particular feature or not. Now we need to access the 2nd element of the feature vector for our particular Entity, store it in a bool, and use it to compute some other function. There is nothing wrong with doing this. However, if we use auto instead of bool to hold the element in step 2, the later use results in undefined behavior. This is because even though std::vector<bool> conceptually holds bools, operator[] for std::vector<bool> doesn’t return a reference to an element of the container (which is what std::vector::operator[] returns for every type except bool). Instead, it returns an object of type std::vector<bool>::reference (a class nested inside std::vector<bool>). With auto, the variable is deduced as that proxy type rather than bool, and the proxy refers back into the temporary vector returned by the function, which is destroyed at the end of the statement. Hence passing it to a function that expects a bool results in undefined behavior.
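The sketch below shows the safe variants, with the problematic line left as a comment so nothing undefined is actually executed. Entity, features and both accessor functions are hypothetical stand-ins for the code described above.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical Entity type.
struct Entity {
    int serial;
};

// Hypothetical: one flag per feature, returned by value,
// so the caller receives a temporary vector.
std::vector<bool> features(const Entity&) {
    return {true, false, true};
}

bool secondFeature(const Entity& e) {
    // operator[] on std::vector<bool> yields a proxy object, not bool&.
    static_assert(std::is_same<decltype(features(e)[1]),
                               std::vector<bool>::reference>::value,
                  "proxy type");
    bool flag = features(e)[1];  // proxy converts to bool before the temporary dies
    return flag;
}

bool secondFeatureWithAuto(const Entity& e) {
    // auto flag = features(e)[1];  // would deduce the proxy, left dangling afterwards
    auto flag = static_cast<bool>(features(e)[1]);  // the cast forces a real bool
    return flag;
}
```

If auto is preferred, the explicit static_cast documents that a conversion away from the proxy is intended.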
So, how can we tell if a given class is a proxy class? Libraries using them often document that they do so. The more we familiarize ourselves with the basic design decisions of the libraries we use, the less likely we are to be blindsided by proxy usage within those libraries.
2. Not using auto& when the need calls for it
Simply using auto will never produce a reference. For example, even if you have const int& f(){...}, then auto x = f(); deduces x as int, and not const int&.
Hence if a reference is needed, it should be explicitly requested by writing auto& x = f(); now the type of x is const int&. In the same way, when using auto for the loop variable of a range-based for loop, make sure to use auto& if the elements are to be modified in the loop.
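A small sketch of both points: f is given a trivial body here purely for illustration, and deductionCheck and doubleAll are illustrative names.

```cpp
#include <type_traits>
#include <vector>

// Illustrative body for f(): returns a reference to a long-lived constant.
const int& f() {
    static const int value = 42;
    return value;
}

void deductionCheck() {
    auto x = f();   // int: reference and const are both dropped, x is a copy
    auto& y = f();  // const int&: y refers to the original object
    static_assert(std::is_same<decltype(x), int>::value, "x is a copy");
    static_assert(std::is_same<decltype(y), const int&>::value, "y is a reference");
}

void doubleAll(std::vector<int>& values) {
    for (auto& value : values) {  // with plain auto, only a copy would be doubled
        value *= 2;
    }
}
```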
3. Using auto when the return type is unclear/ambiguous
Suppose you come across code that stores the result of some function call in a variable declared as auto res. You will probably ask: what is this res? It is definitely not clear from context. Some IDEs might show the type when you hover over res, while others might not. Such code defeats the purpose of using auto by actually reducing readability. It would be better to spell out the type in the declaration instead.
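As an illustration, take a hypothetical function process whose name says little about what it returns (the function and its return type are made up for this sketch):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical function with an uninformative name.
std::vector<std::string> process(int serial) {
    return {"entity-" + std::to_string(serial)};
}

std::size_t unclear() {
    auto res = process(7);  // what is res? the reader has to go and look up process()
    return res.size();
}

std::size_t explicitType() {
    std::vector<std::string> res = process(7);  // the type is visible right here
    return res.size();
}
```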
Epilogue
auto is very useful. However, it warrants caution. But then again, what in life does not?
Thank you for making it all the way to the end! Hope you enjoyed it. And I apologize for the pun in the title!