The anxiety of working with C++ enums
Nowadays everyone knows the drill. When you can list all possible values of a type, it's a good idea to make it into an enumeration.
enum Fruit {
    Apple,
    Orange,
    Peach
};
Simple and clean. This way, your IDE can give you a neat auto-complete list, the compiler can give you a heads-up when you miss a value in a switch, and everyone is happy. End of story, right?
Unfortunately not.
In practice, the first thing that you notice is that all enum values belong in the parent namespace and get mixed up.
enum Fruit { Apple, Orange, Peach };
enum Color { Red, Green, Blue };

int main() {
    Fruit favorite_fruit = Apple; // not Fruit::Apple
    Color favorite_color = Red; // not Color::Red
}
This is not a good idea on many different levels. When declaring a value of type Fruit, using Apple instead of Fruit::Apple leads me to think that since Red is in the same namespace, it might also be a valid Fruit value, when in fact it is a Color. It also gives me bad intuition about what Fruit values are overall, and quite often, I have to get off my chair and look up the declaration of Fruit to be completely sure.
In addition, this practice clutters the namespace with many global values, making it harder to navigate and distinguish between objects.
And lastly, you can't simultaneously have a fruit named Orange and a color named Orange in a single namespace.
In C++11, these problems are solved with scoped enumerations, which are declared with enum class and whose values must be qualified with the enum name. Here's how it works.
enum class Fruit { Apple, Orange, Peach };
enum class Color { Red, Green, Blue };

int main() {
    Fruit favorite_fruit = Fruit::Apple;
    Color favorite_color = Color::Red;
}
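To make the difference concrete, here is a small sketch of what scoped enums allow. I added an Orange to Color purely for illustration, and describe and as_int are hypothetical helpers, not part of any library.

```cpp
#include <string>

enum class Fruit { Apple, Orange, Peach };
enum class Color { Red, Orange, Green, Blue };  // Orange added for illustration

// Hypothetical helper. There is no `default:` label, so GCC and Clang can
// warn (-Wswitch, part of -Wall) when an enumerator is left unhandled.
std::string describe(Fruit fruit) {
    switch (fruit) {
        case Fruit::Apple:  return "apple";
        case Fruit::Orange: return "orange (the fruit)";
        case Fruit::Peach:  return "peach";
    }
    return "unknown";  // only reachable for values outside the enumeration
}

// Color::Orange and Fruit::Orange coexist without a name clash.
std::string describe(Color color) {
    return color == Color::Orange ? "orange (the color)" : "some other color";
}

// Scoped enumerators do not convert to int implicitly; a cast is required.
int as_int(Fruit fruit) { return static_cast<int>(fruit); }
```

Both Orange values live happily side by side, and the two describe overloads cannot be mixed up because Fruit and Color are distinct types.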
Happy now? Still not quite.
What if I wrote a simple library around my two enums and used it for some time in fruit-related programs. Then, after a good vacation in the Caribbean I decided to create libtropical, which would add some more fruit?
The way my fruit enums are declared now, I would have no way of extending them apart from going to the original library and adding new values to the enum declaration, likely breaking my code in multiple places in the process.
This is no longer a good use case for enumerations. Runtime polymorphism to the rescue, am I right?
namespace fruits {
    struct FruitBase {};

    struct Apple: FruitBase {};
    struct Orange: FruitBase {};
    struct Peach: FruitBase {};
}

// Later, in libtropical:
namespace fruits {
    struct Pineapple: FruitBase {};
    struct Coconut: FruitBase {};
}
This looks nice and clean again. And later, any library can go forward and declare some extensions of FruitBase in its namespace. But how do you use this concept in practice?
Let's now leave the happy fruit and color world and imagine a specific use case, where you are trying to communicate with some other process using shared memory in order to read some parameters.
You originally had a method, which worked with an enum. Nothing pretty but it worked quite well given the high rate of incoming messages with parameter data.
enum Param { First, Second, Third };

struct FirstParamData { /* ... */ };
struct SecondParamData { /* ... */ };
struct ThirdParamData { /* ... */ };

FirstParamData data1;
SecondParamData data2;
ThirdParamData data3;

// This method gets called when data arrives.
// `input_data` must point to 32 bytes of memory
void read_param(Param param, const void* input_data) {
    switch (param) {
        case Param::First:
            data1 = *(FirstParamData*) input_data;
            break;
        case Param::Second:
            data2 = *(SecondParamData*) input_data;
            break;
        case Param::Third:
            data3 = *(ThirdParamData*) input_data;
            break;
    }
}
Let's now see what happens when we attempt to give it the same extensible polymorphic treatment we described above.
We move everything from the read method into the declaration of the param structures, ending up with a bunch of structs and virtual methods.
#include <memory>

struct ParamBase {
    virtual void read_data(const void* input_data) const = 0;
};

struct First: ParamBase {
    virtual void read_data(const void* input_data) const override {
        data1 = *(FirstParamData*) input_data;
    }
};

struct Second: ParamBase {
    virtual void read_data(const void* input_data) const override {
        data2 = *(SecondParamData*) input_data;
    }
};

// This method gets called when data arrives.
// `input_data` must point to 32 bytes of memory
void read_param(std::shared_ptr<ParamBase> param,
                const void* input_data) {
    param->read_data(input_data);
}
That's not bad, but it's not good either.
First, notice the code for the parameter structures looks nearly identical in both cases. Repeating yourself a lot when coding is not a good sign.
Also, virtual calls add an indirection and are hard for the compiler to inline, which can hurt in high-performance applications. And someone is going to have to allocate and deallocate those shared pointers every time a data packet arrives. Since the objects carry no usable information apart from the parameter discriminator, this is not a good trade.
Is there a better solution? I think I may have found one.
We can make use of the typeid() operator along with some templates to bypass runtime polymorphism altogether and obtain something as extensible and neat-looking as the fruit examples without the hassle of constant memory reallocation and virtual method calls.
For those like me, who originally had no idea what typeid() does: it gives you a reference to a std::type_info object describing a type or an expression. For types and non-polymorphic expressions the result is determined at compile time, and you can then use it at runtime to extract the type name or compare the types of different variables. There is a small catch, however: std::type_info is not CopyConstructible, so you can't pass it around by value. For this, you can use std::type_index, a copyable wrapper around std::type_info that supports comparison and hashing. The reference pages for std::type_info and std::type_index have more information.
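If you want to get a feel for it first, here is a minimal sketch of std::type_index in action. The Apple and Orange structs, the labels, and the helper names label_for and same_type are made up for this example.

```cpp
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

struct Apple {};
struct Orange {};

// std::type_index is copyable and hashable, so it can be used as a map key.
// The labels are illustrative only.
std::string label_for(std::type_index type) {
    static const std::unordered_map<std::type_index, std::string> labels = {
        {std::type_index(typeid(Apple)), "apple"},
        {std::type_index(typeid(Orange)), "orange"},
    };
    auto it = labels.find(type);
    return it != labels.end() ? it->second : "unknown";
}

// Compares the types of two objects at runtime.
template <typename A, typename B>
bool same_type(const A& a, const B& b) {
    return std::type_index(typeid(a)) == std::type_index(typeid(b));
}
```

Calling label_for(typeid(Apple)) works because std::type_index converts implicitly from the std::type_info reference that typeid() produces.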
Let's start over with parameter structs in a namespace. Except now, have every parameter declare its data structure type.
struct FirstParamData { /* ... */ };
struct SecondParamData { /* ... */ };
struct ThirdParamData { /* ... */ };

namespace param {
    struct ParamBase {};

    struct First: ParamBase { using data_type = FirstParamData; };
    struct Second: ParamBase { using data_type = SecondParamData; };
    struct Third: ParamBase { using data_type = ThirdParamData; };
}
Then, create a templated helper function, which copies the given memory into a parameter-specific structure, but only if the type assertion is correct. The target type is taken from ParamType::data_type rather than deduced from the argument, so passing the wrong structure is rejected by the compiler.
#include <typeindex>
#include <typeinfo>

// `data` must point to 32 bytes of memory
template<typename ParamType>
void read_param_helper(std::type_index param,
                       typename ParamType::data_type& target,
                       const void* data) {
    using TargetType = typename ParamType::data_type;
    if (param != typeid(ParamType)) return;
    target = *static_cast<const TargetType*>(data);
}
Its usage in the original read method is then rather simple and elegant.
FirstParamData data1;
SecondParamData data2;
ThirdParamData data3;

// This method gets called when data arrives.
// `input` must point to 32 bytes of memory
void read_param_data(std::type_index param, const void* input) {
    read_param_helper<param::First>(param, data1, input);
    read_param_helper<param::Second>(param, data2, input);
    read_param_helper<param::Third>(param, data3, input);
}
This method does slightly more work than the original enum-based one, because every helper is invoked and each performs its own comparison (in the original, only one branch of the switch was executed). In practice this rarely matters: an optimizing compiler will typically inline the helpers, leaving a short chain of type comparisons. And there is no heap allocation and no virtual call like in the previous attempt.
If a new version with more parameters comes along, we can safely extend the param namespace from another library, link it with the original library and nothing should break.
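Here is a rough sketch of what such an extension could look like. Everything in it is an assumption made for illustration: the Fourth parameter, the fields of the data structs, and the read_if and read_param_extended functions are hypothetical. I also swapped the pointer cast for std::memcpy, which works for trivially copyable structs and does not care about the alignment of the incoming buffer.

```cpp
#include <cstring>
#include <typeindex>
#include <typeinfo>

// Stand-ins for the article's data structures; the fields are made up.
struct FirstParamData { int value; };
struct FourthParamData { double value; };

namespace param {
    struct ParamBase {};
    struct First : ParamBase { using data_type = FirstParamData; };
}

// "Later, in another library": reopen the namespace and add a parameter.
namespace param {
    struct Fourth : ParamBase { using data_type = FourthParamData; };
}

// Same idea as read_param_helper, but it reports whether the type matched
// and copies with std::memcpy (data_type must be trivially copyable).
template <typename ParamType>
bool read_if(std::type_index which,
             typename ParamType::data_type& target,
             const void* data) {
    if (which != std::type_index(typeid(ParamType))) return false;
    std::memcpy(&target, data, sizeof(target));
    return true;
}

FirstParamData first_data{};
FourthParamData fourth_data{};

// Hypothetical reader living in the new library. The || chain stops at
// the first helper that matches.
bool read_param_extended(std::type_index which, const void* input) {
    return read_if<param::First>(which, first_data, input)
        || read_if<param::Fourth>(which, fourth_data, input);
}
```

Returning bool from the helper is a small design choice of mine: it lets the reader short-circuit after the first match and tell the caller when a parameter type was not recognized at all.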
Also, we get the added benefit of compile-time type checking. This is something we would not get in the original enum-based system or in the first ugly attempt at a polymorphic solution.
// Compiles:
FirstParamData data1;
read_param_helper<param::First>(param, data1, input);

// Does not compile: data2 is not a param::First::data_type (FirstParamData):
SecondParamData data2;
read_param_helper<param::First>(param, data2, input);
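If you would rather see the rejection without breaking your build, the check can be expressed as a trait. This is only a sketch under one assumption: the helper takes its target as typename ParamType::data_type&, so the type is not deduced from the argument. The helper_accepts trait and the struct fields are hypothetical.

```cpp
#include <type_traits>
#include <typeindex>
#include <typeinfo>
#include <utility>

struct FirstParamData { int value; };
struct SecondParamData { int value; };

namespace param {
    struct First  { using data_type = FirstParamData; };
    struct Second { using data_type = SecondParamData; };
}

// Assumed shape of the helper: the target type comes from ParamType and is
// not deduced from the argument.
template <typename ParamType>
void read_param_helper(std::type_index which,
                       typename ParamType::data_type& target,
                       const void* data) {
    if (which != std::type_index(typeid(ParamType))) return;
    target = *static_cast<const typename ParamType::data_type*>(data);
}

// Hypothetical trait: true when read_param_helper<ParamType> can be called
// with a Target& as its second argument.
template <typename ParamType, typename Target, typename = void>
struct helper_accepts : std::false_type {};

template <typename ParamType, typename Target>
struct helper_accepts<ParamType, Target,
    decltype(read_param_helper<ParamType>(std::declval<std::type_index>(),
                                          std::declval<Target&>(),
                                          static_cast<const void*>(nullptr)))>
    : std::true_type {};

static_assert(helper_accepts<param::First, FirstParamData>::value,
              "matching data type is accepted");
static_assert(!helper_accepts<param::First, SecondParamData>::value,
              "mismatched data type is rejected");
```

The two static_assert lines document the behaviour: the call is well-formed only when the target matches the parameter's data_type.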
And lastly, we can avoid structure names like FirstParamData entirely by referring to the data_type alias in declarations.
param::First::data_type data1;
This way, if we later decide to replace FirstParamData with some other structure, we can do it in one place, and the rest of the code adapts automatically.
