The worst bugs (C++)
Bugs are our enemies. Most of them are stupid and easy-to-find: an exception, a segmentation fault, an obvious memory overhead, etc. They show themselves quite soon. But there are some, that are very well hidden, like silent killers, harming unnoticeably. These are the worst ones: they’re hard to find, even harder to notice. Usually, they are noticed and fixed so late, that their damage is irreversible.
Let’s take a look at the following example.
This article contains gist embeds, which might not be shown in the mobile app.
The code
Consider such a situation: you have Employee
objects and you want to pay a salary from a limited fund to those who need it the most, i.e. have less balance. Here is a code that does that.
Let’s quickly break down the code a little, so that no one will think “this is not the C++ I used to know” and leave.
- The
Employee
class is as simple as it can be. - The
employee_less_than
is a lambda function that compares twoEmployee
-s, saying whether the first one is “less than” the second one or not. It effectively compares their current balances. - The
employees_sorted
is amultiset
ofEmployee
-s, using the comparator defined above to keep them sorted. It’s necessary because theEmployee
doesn’t have anoperator<
defined, and there might be more than one employees with the same balance, hence themultiset
instead of theset
. - The
for
loop then iterates over theemployees_sorted
container, starting from the beginning, where the employee with the minimum balance is. In it, we pay their salaries if we have enough funds left.
As the topic suggests, this article is about the bugs, and, of course, this code has one too. Can you figure out where it is and how to fix it?
If you have some experience with C++, you’ve probably heard that a const_cast
is a bad thing, so you’ve probably started your search from there.
An important lesson of const correctness
Let’s take a closer look at that part.
for (auto& e : employees_sorted) {
if (e.get_salary() <= total_funds) {
total_funds -= e.get_salary();
const_cast<Employee&>(e).pay_salary();
}
}
This looks weird, cause we are const_cast
-ing e
to Employee&
, the e
we expect to be a non-const reference to Employee
, hence the auto&
. But people write const_cast
only to force the code to compile. It’s easy to ascertain that this cast is necessary. Unlike get_salary
, pay_salary
is a non-const member function, and e
surprisingly is const Employee&
. But how come e
is deduced to be a const reference? It’s automatically deduced to whatever we’d get dereferencing an iterator of employee_sorted
.
To see what exactly happens under the hood of the rather new C++ features I highly recommend this great website cppinsights.io.
As the employee_sorted
is not const, we expect the for loop to take a non-const iterator, dereference it and let us work with an Employee&
, except, the multimap
doesn’t have a non-const iterator. Yes, it does have two type definitions, iterator
and const_iterator
, but they’re both logically const and effectively the same. The iterator
definition then might be considered odd and confusing, and it’s actually there just for the consistency, which is a good point. But why does it only have a const iterator?
Const correctness is an important matter, especially in STL. If something is const, there’s a reason for that, and in this case, it’s quite a strong one. The underlying data structure of the multiset
(also set
, map
and multimap
) is a binary search tree, to keep the elements sorted and easily accessible.
Consider this tree. Imagine if we would forcefully change the value 3
to 10
. We would lose the track of 4
, 6
and 7
. Lots of different scenarios are possible here. If we are lucky and the damage is somewhere critical, inaccuracies will be often and the behavior will be suspicious. But, we might damage a leaf and hardly ever notice that something is wrong.
const_cast
is not illegal, it just indicates that either you, the user of someone else’s interface, are doing something wrong, or that someone else has designed the interface poorly.
To sum up, when a constness of something disturbs you and you want to get rid of it, it probably means they were telling you not to push that button.
Other associative containers also have this characteristic. The
set
,multiset
,unordered_set
andunordered_multiset
have const iterators only, andmap
,multimap
,unordered_map
andunordered_multimap
have their keys const.
Now, back to our code, how to fix it? The easiest and straightforward way is to store the Employee
-s in a vector
, and sort it after filling up.
This is asymptotically equivalent performance-wise and uses even less memory.
I’ve actually come across a similar problem, but the situation was a bit different there. Changing the container would be too much work, changes everywhere, so was not an option. Someone had cast away the constness of a key in a map
. Something like this:
std::map<std::string, int> container = {
{ "one", 1 },
{ "five", 2},
{ "three", 3 }
};
auto it = container.find("five");
if (it != container.end()) {
const_cast<std::string&>(it->first) = "two";
}
At that time, it was practically impossible to change the whole container: it was not so local and it had to stay a map. So, to fix it, I removed the old value and inserted a new one.
auto it = container.find("five");
if (it != container.end()) {
container.insert(std::make_pair("two", it->second));
it = container.erase(it);
}
Is that the best I can do? It was back then, but now, with C++17, we can do better. It has brought us a new function for associative containers.
node_type extract( const_iterator position );
It unlinks the node from the container and returns a handler to it. The returned object is of a move-only node_type
type.
auto it = container.find("five");
if (it != container.end()) {
auto node_handler = container.extract(it);
node_handler.key() = "two";
it = container.insert(std::move(node_handler)).position;
}
The insert
then simply links it back into the container, in the correct place of course. It returns an insert_return_type
object.
This way we change the key of a value in a map without any reallocation, which is a big advantage.
Appendix
There is something else in the code snippet at the beginning worth noting. It’s not a bug, more of a source of misunderstanding, which may grow up and become a bug. The Employee
has an operator==
, pretty well formed. On the other hand, we have a multiset
of Employee
-s with a custom comparator. It’s important to note, that multiset
doesn’t care for that operator==
, it’ll use the comparator to check equality:
if (!custom_less_than(a, b) && // a is not less than b
!custom_less_than(b, a)) { // b is not less than a
// a equals b
}
Thus, there might be situations when two Employee
-s are equal in the container but not in real life. Basically, it’s odd when a type has one comparison operator but not the others, so we should either get rid of it or add the rest of the ensemble.
“A bug is never just a mistake. It represents something bigger. An error of thinking that makes you who you are.”
-Elliot Alderson
📝 Read this story later in Journal.
👩💻 Wake up every Sunday morning to the week’s most noteworthy stories in Tech waiting in your inbox. Read the Noteworthy in Tech newsletter.