UnifiedMap: How it works?
Eclipse Collections comes with it’s own implementations of List, Set and Map. It also has additional data structures like Multimap, Bag and an entire Primitive Collections hierarchy. In this blog we will take a look under the hood of UnifiedMap.
UnifiedMap is the Map implementation of Eclipse Collections and is implemented very differently than a JDK HashMap.
java.util.HashMap is backed by a table of Entry objects. The Entry implementation in HashMap has
hashcode, key, value, next as members, HashMap essentially caches the hashcode of keys.
A UnifiedMap is backed by an array where keys and values occupy alternate slots in the array :
protected transient Object table . The flattened array structure stores only the keys and values there by creating a leaner Map implementation with reduced memory footprint. Added benefit of having keys and values in consecutive slots is improved cache locality.
Collisions in the main array are handled by putting a special object called
CHAINED_KEY in key slot. The value slot of
CHAINED_KEY (slot after
CHAINED_KEY slot) contains another
Object array similar to main array called
CHAINED_OBJECT with keys and values in alternate slots.
Look ups in UnifiedMap use a standard hashcode index algorithm to find the location of a key in the array. If a key is not a
CHAINED_KEY then the next slot contains the value. If a key is a
CHAINED_KEY then the
CHAINED_OBJECT is evaluated linearly to find the required key and the next slot in
CHAINED_OBJECT contains the value.
Since UnifiedMap does not cache the hashcode, for each look up, hashcode needs to be computed. So, the performance of UnifiedMap is directly dependent on the hashcode implementation of the key.
Below are a few memory and performance comparisons between JDK 1.8 HashMap and Eclipse Collections 9.0.0 UnifiedMap.
Memory Footprint (lower number the better)
Performance Tests (higher number the better)
Source code for memory tests and performance tests is available on GitHub.
Summary: Eclipse Collections UnifiedMap has ~50% smaller memory footprint compared to JDK HashMap, but that comes with a minor performance implication.
Eclipse Collections is open for contributions. If you like it star it.