Dealing with Ranges using Java TreeMaps

Storing and querying non-overlapping ranges using one of NavigableMap implementation

Isuru Weerarathna
The Startup
6 min readDec 16, 2019

--

Photo by veeterzy on Unsplash

As soon as you hear word TreeMaps, you always remember that it is a map type that can iterate keys in order. In this article, I will show you that TreeMap can actually do a little bit more than that in some different scenarios, such as, when dealing with ranges.

Have you ever faced scenarios where you want to get the corresponding falling range for a given value? For instance, the range [10, 20], providing any number between 10 to 20, inclusively, should return this range, or range’s associated value. Hypothetically, think you are writing a just-another-static-analytic-tool and you want to find the owner of each line using stored git blame to assign the reported violations. Also think that you are creating a game, and you want to show the required score to get for the next level based on the current score.

We can solve these range problems using a TreeMap.

Revision: Maps matches with exact keys

If you query using the get method of a Map, it will return the associated value of that exactly matched key from that map. If such does not exist, it will return null. This is what maps do. It gives what you ask.

But, if you ask differently and no matching entry exists for a given key, it can still give you a value (depending on how you ask), which could be useful. And, guess what? TreeMap has such methods.

What is a TreeMap?

TreeMap’s main use case is the ability to iterate keys in key’s natural order, where there is no guarantee about the key ordering in other java map types (such as, HashMap). In a programming challenge or a coding interview, if you are asked to print some key values in an ordered manner, I assume you would have already demonstrated it using a TreeMap.

However, it can do better.

Interesting thing here lies in Tree Map is its implementation of NavigableMap.

NavigableMap is a kind of SortedMap which has methods to return closest matches for given search keys. NavigableMap has an additional set of methods that you can query value for the nearest key. These methods are useful when finding for range.

Let’s explore those methods by taking a use case.

Example: Git Blame

Assume that you are asked to develop a data structure for a next-generation static code analysis tool to return the latest author for a given line to assign the found violations on the file.

If the file has N lines, we know that only one author (latest) can be set for a single line, and every line has an author as well. Some contiguous lines could be authored by the same person. So, your data structure should support getting the author when given any line number.

We could have implemented this using a simple HashMap by having a key for each line and author as the value. In other words, you have to duplicate each continuous line with the same author (block). Even if a file is only touched by a single user, you have to have duplicate entries for each and every line. That is not efficient, because it takes unnecessary space.

To overcome this, we can use a TreeMap.

By using a TreeMap, we can reduce the above problem to a problem of range search. Assume that your git parser have returned the set of contiguous blocks in below format [start, end] -> author. Consider the below blocks for a file having only a total of 45 lines.

Note that the start and end line numbers are inclusive. So, let’s fill this information in the map using put function, assuming the functiongetBlocks will return blocks parsed from git blame command.

We have used only the startLine to record user and we have not inserted all line numbers starting startLine to endLine, because as I mentioned earlier, storing such a way is really space waste in case there are few authors in a whole file.

We’ve added starting lines of all blocks into the map, now we are ready to query.

Querying

Let’s say we want to know who owns the line number 8. It should return the result as ‘Harry’ by using get method. Since there is an exact match for the given key (startLine), the expected result returns.

However, when we query the line number 11, it should return ‘Smith’ despite not having an entry for 11 in our TreeMap instance.

TreeMap does not differ from the conventional map when using the method get. It exactly searches for the given key in its space, which in this case, there is none! Therefore, to get the correct blame author, we need to match this number to the range [10, 21]. So, how do we match this? Which method we can use to map the given key to the expected range?

New Method: ‘floorEntry’ comes to the rescue!

For an exact non-matching key, floorEntry returns the entry with the greatest key less than or equal to the given key. If the key is matched, then that corresponding entry will return.

That’s why this method has a prefix called floor as in Math.

If we use this function, it will match the expected range [10, 21]. Because, for that key 11, the greatest key less than in the map is 10, and its entry is [10 -> Smith]. Therefore, we can verify that the owner of line 11 is indeed Smith.

Method floorEntry is very useful where we want to find out starting boundary of the range the given value falls, when the ranges are stored only using the start boundaries.

Fixing out of range

If you query the map with a number, say 50, which lies outside the range, it will return Roy, even if the file has only 45 lines. This is incorrect!

We can fix it by adding another extra entry with (last line number + 1) to the map having a special value flag, simply null.

This additional entry will mark the end of ranges by setting the value as null. After this, when you query with number 50, it will return null which is the indication of out of range.

Further, if you query with a negative value, say -1, then also the return will be null entry, because there is no entry having a key lesser than -1.

Difference between floorEntry and lowerEntry

If you examine the interface of NavigableMap, you could see another similar method called, ‘lowerEntry’ which we cannot use for the above example though it sounds similar.

As said earlier, if the key exactly matches with an available key, it will return in methodfloorEntry. But inlowerKey it always returns the greatest key strictly less than for the given key.

For example, if we query the map for line 10 in both methods, the result would be different.

So, the lowerEntry method check for a strictly less key in the map. Hence, obviously the result could be different for the same input.

Methods ceilingEntry and higherEntry

ceilingEntry method does the opposite of floorEntry while returning the value of the smallest key greater than the given key. higherEntry is the strict version of ceilingEntry similar to the lowerEntry.

When you want to find out the range ending value for a given value, this ceilingEntry method can be used.

This method is useful in scenarios, like in games, finding the next level target based on the current score.

Time Cost for Retrievals

The methodget mostly has O(1) complexity. But the methods mentioned above has O(log n) time cost because it has to deal with searching the nearest one if such exists. This O(log n) comes from that, TreeMap uses binary search to find out the floor or ceiling keys when one of the above method is called.

Notes

  • The sound of floorEntry or ceilingEntry might not be best in terms if you are providing a reusable data structure to other developers. If you like, you can extend TreeMap and provide a meaningful method (such as getRangeValue) so that others can understand properly.
  • TreeMap is not suitable when you want to store overlapping ranges. For example, having blocks of [3,8] and [4,10] with different values will not return both values when querying for number 5 (5 falls in both ranges). With floorEntry or ceilingEntry method, it returns value associated with range [4,10]
  • If you want to deal with such complex overlapping ranges, then you better use Guava’s RangeMap.

I hope you learned something you have never used before by reading this. Cheers!

--

--