Why Standardizing flat_map is a Bad Idea
90+% of of std::map usage is wrong.
If you care about performance and you do not put a lot of weight on the number of source code characters. And std::flat_map will not fix this.
Brief History of Container Disappointments
C++98 did not have hash maps/sets. The things that are today called std::unordered_map/set. Then C++0x got delayed(add concepts to working paper here, remove concepts from working paper there and soon we are talking 13 years).
Finally in C++11 hash map was added to the Standard. But…
unordered_map has 2 big problems:
- Ugly long name(because compatibility hash_map was not used)
- Standard requires less than optimal implementation(performance).
Users that do not care about ordering(which is most of the time) can choose short name/slow std::map or long name/less slow std::unordered_map.
There is a proposal to add flat_map, that is basically a sorted vector. It is faster than std::map but slower than hash_map for “large” number of elements(although more space efficient) and inserts are linear.
Problems With flat_map
Original boost and proposed std::flat_map claim to be drop-in replacement for std::map. This is ≈true. But it is not a good thing.
As professional comedian and part time C++/D developer Alexandrescu once said(while sharing his love for JAVA).
Encapsulation of complexity should be a crime.
And although here we are not talking about same problem it is similar. Perfectly valid std::map code ported to “better” std::flat_map starts performing horribly slow immediately, or even worse it becomes ticking time bomb.
[future postmortem] For months after commit everything worked fine, until one day consumer X decided to process enormous(1ook lines) CSV file.
Similarly new users of C++ writing new code may consider that flat_map insert is “fast” because map unlike std::vector is for inserting and looking up things.
“But flat_map is widely used by smart people with success”
True, but that is actually not an argument for standardizing it. It is used by who? By people who understand cache locality, implementation of std::map, know the bounds of their data size, profile their code, know not to write insert that is linear...
Standardizing flat_map will expose very useful but fringe container to the masses of programmers. I do not want flat_map on the same level of visibility(for example it being listed under containers ) as general purpose containers.
flat_map is enabler
As mentioned earlier users usually do not need their map to be sorted. std::map with this expensive guarantee became the default since it was only container available for a long time and/or it is 10 chars less to write than std::unordered_map.
So while flat_map is indeed faster than hash map for some sizes/type crossover point is generally probably one or two orders of magnitude less than you expected and generally it should not be used instead of hash_map.
flat_map is a Special Snowflake, just like every other immutable container
At first flat_map may seem just like a smart optimization of tree based std::map. It is true, under limitations discussed before.
But it is actually immutable container. With insert method. So C++ instead of collection of containers named properly with API matching their implementation/operation complexity C++ gets one container with an API matching the mutable API.
flat_map is so good at being bad proposal
Unfortunately flat_map fits all the usual criteria for a Nice Proposal(tm).
- It is widely used
- It has existing implementation(boost)
- It improves on existing container(std::map)
- It is a small and isolated proposal. No need to change core language, no need to add overloads to other parts of standard library or deprecate anything…
First 3 points were discussed previously. Forth is not a problem in itself, but it becomes if Grand Unifying Proposal would be better in the long run.
What Should be Standardized
Unordered elephant in the room
As explained before the biggest problem with std::map is not that it is tree based. It is the best design for it’s requirements when it comes to general purpose containers.
Problem is that proper default is slower than it can be, and it has an ugly name.
- introduce hash_map without performance problems of unordered_map
- deprecate unordered_map(remove it in C++22)
…that will not happen…
- much more fun to add new stuff than to remove old, just like new code is more fun than bug fixing
- psychology — hard to blame lack of move semantic for unordered_map, like you can with auto_ptr, and it is kind of embarrassing 😊 to mess up the most useful data structure in the world.
- backwards compatibility and it’s infinite 😜 weight when it comes to grading proposals
… but Immutability could change things
Immutable containers(hey Bjarne why my const vector has capacity pointer? You lied to me Bjarne 😉) added to the language as new category of containers where users are expected to learn about new API and tradeoffs would be nice.
So flat_map would become imm_map without insert.
Why do I Hate flat_map so much?
flat_map is great. I have used it, I recommend it in cases when there is need for it and user understands the trade offs and maintenance burden. But I do not feel it belongs in std::. There is a place for general use algorithms/containers and place for specific algorithms/containers. I do not feel that specific should live inside std::. I think boost/folly/absl are more appropriate homes for them.
flat_map is sometimes useful and while C++ situation with associative containers is not great flat_map does not belong in C++ Standard Library.