A headache in Ruby: Hash default values

Ulysse BUONOMO
Published in Klaxit Tech Blog
Oct 24, 2018


The good old way to initialize default values inside a Hash:
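The original listing is not reproduced in this text, so here is a minimal sketch of what the classic approach typically looks like. The sample data and the names `ENROLLMENTS` and `group_classic` are assumptions made for illustration, not taken from the original post.

```ruby
# Hypothetical sample data: [student, option] pairs (illustrative only).
ENROLLMENTS = [
  ["Alice", :theater],
  ["Bob", :music],
  ["Carol", :theater]
].freeze

# Classic approach: create the array explicitly the first time a key is seen.
def group_classic(enrollments)
  students_by_option = {}
  enrollments.each do |student, option|
    students_by_option[option] ||= []      # set the key if it is missing
    students_by_option[option] << student  # then append to that key's own array
  end
  students_by_option
end

result = group_classic(ENROLLMENTS)
p result[:theater] # Alice and Carol
p result[:music]   # Bob only
```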

This is already a good solution. The trouble starts when you want to get fancy and give your Hash a default value. The new fancy way:
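As a sketch of that fancy attempt (again with hypothetical data and a made-up method name, since the original snippet is not available here), passing an array as the default value might look like this:

```ruby
# Hypothetical sample data: [student, option] pairs (illustrative only).
ENROLLMENTS = [
  ["Alice", :theater],
  ["Bob", :music],
  ["Carol", :theater]
].freeze

# "Fancy" attempt: the array passed to Hash.new is the default for every missing key.
def group_with_shared_default(enrollments)
  students_by_option = Hash.new([])
  enrollments.each do |student, option|
    students_by_option[option] << student # mutates the shared default, sets no key
  end
  students_by_option
end

result = group_with_shared_default(ENROLLMENTS)
p result            # {} : no key was ever set
p result[:theater]  # ["Alice", "Bob", "Carol"] : Bob chose music, yet he is here
p result[:anything] # the very same array, whatever the key
```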

Okay, it looks fancy, but it doesn’t give us the right result, and the result it does give is quite bizarre. There are two issues here, so let’s fix them one by one. First, our theater option seems to contain too many students. In fact, we could have read any key from our hash and gotten the same result. What we changed at line 4, where we append a student, is the default value of our hash. The array that serves as the default value is created once, when it is passed between the parentheses, and Ruby hands back a reference to that same object every time a missing key is read. Hence at line 4 we are appending a student to this shared object, and not to an array linked only to our key (option here).

So a first approach is to pass a block instead of using parentheses.

This way, when we reach line 4, the array that will be modified will be a new one every time: it will be the returned value of our default proc { [] }.
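A minimal sketch of the block form, under the same assumptions as before (hypothetical data and method name), could be:

```ruby
# Hypothetical sample data: [student, option] pairs (illustrative only).
ENROLLMENTS = [
  ["Alice", :theater],
  ["Bob", :music],
  ["Carol", :theater]
].freeze

# Block form: the default proc runs on every read of a missing key.
def group_with_block_default(enrollments)
  students_by_option = Hash.new { [] }
  enrollments.each do |student, option|
    students_by_option[option] << student # appends to a brand-new, throwaway array
  end
  students_by_option
end

result = group_with_block_default(ENROLLMENTS)
p result                                    # {}
p result[:theater]                          # []
p result[:theater].equal?(result[:theater]) # false : a new array on each read
```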

However, there is still an issue here: our hash remains empty. The reason is that reading a missing key does not set it. You access keys and get arrays back, then push values onto those arrays, but you never tell your hash to store those arrays under the keys. When you come across the same key a second time, the default proc simply builds a new array again. So we still need to go one step further.

When initializing a hash, we can give the block two arguments, which correspond to the hash and the key. This allows us to put the initialization inside our hash declaration. I’d say this is slightly better than the first method, since you can see the purpose of your hash right at its declaration.
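Here is a short sketch of that two-argument block, still using hypothetical data and a made-up method name:

```ruby
# Hypothetical sample data: [student, option] pairs (illustrative only).
ENROLLMENTS = [
  ["Alice", :theater],
  ["Bob", :music],
  ["Carol", :theater]
].freeze

# The block receives the hash and the missing key, and stores a new array under it.
def group_with_assigning_default(enrollments)
  students_by_option = Hash.new { |hash, key| hash[key] = [] }
  enrollments.each do |student, option|
    students_by_option[option] << student # the array is now stored in the hash
  end
  students_by_option
end

result = group_with_assigning_default(ENROLLMENTS)
p result[:theater]     # ["Alice", "Carol"]
p result[:music]       # ["Bob"]
p result.key?(:dance)  # false : nothing has read :dance yet
```

One thing to keep in mind with this form: merely reading a missing key with `[]` now stores an empty array under it, so `key?` or `fetch` are safer when you only want to check.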

Conclusion

Default values in Ruby’s Hash class are tricky. A rule of thumb: pass the default between parentheses only when it is an immutable object, for instance grades = Hash.new(10). When the default is a mutable object, don’t forget to set every key, using either the old-fashioned method shown first or the new fancy one shown last.
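To illustrate the rule of thumb, here is a small sketch built around the grades = Hash.new(10) example; the student names and the method name are assumptions added for the demonstration.

```ruby
# An Integer default is safe: integers cannot be mutated in place.
def grades_with_default
  grades = Hash.new(10)
  grades[:alice] += 2 # same as grades[:alice] = grades[:alice] + 2, so the key is set
  grades
end

grades = grades_with_default
p grades[:alice]     # 12
p grades[:bob]       # 10 : the default is untouched
p grades.key?(:bob)  # false : reading did not set the key
```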

It is hard to understand just why the Ruby developers did this. If you have a usage example, don’t hesitate to show your creativity in the comment section. My own idea of a use for this feature is a Proc enhanced with default values.

Working through this topic taught me something: sometimes, you should just not be fancy. Using these default values can cause a headache where none is necessary. It is quite similar to this piece of poetry from Jamie Zawinski that every developer should know:

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.

A tiny bonus

This bonus is about a lesser-known Ruby feature that may cause some trouble. The a += 1 syntax is syntactic sugar for a = a + 1. It creates a new object and rebinds the variable, instead of mutating the previous object. This is why, with a Hash default value, += and << give very different results.
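A quick way to see the difference is to compare object identity before and after each kind of append. This is only a sketch, and the method and variable names are made up for the demonstration.

```ruby
# Compares object identity before and after each kind of append.
def append_identity_demo
  shoveled = []
  shoveled_before = shoveled
  shoveled << 1       # mutates the existing array in place

  reassigned = []
  reassigned_before = reassigned
  reassigned += [1]   # builds a new array and rebinds the variable

  {
    shovel_same_object: shoveled.equal?(shoveled_before),
    plus_equals_same_object: reassigned.equal?(reassigned_before),
    untouched_original: reassigned_before
  }
end

demo = append_identity_demo
p demo[:shovel_same_object]      # true
p demo[:plus_equals_same_object] # false
p demo[:untouched_original]      # [] : the first array was never modified
```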

With that said, can you think of another way to solve the original problem without any issue, using Hash.new([])?

Hint: do not mutate, replace!
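For readers who want to check their answer, here is one possible solution, sketched with the same hypothetical data as a stand-in for the original example. It is not necessarily the one the author had in mind.

```ruby
# Hypothetical sample data: [student, option] pairs (illustrative only).
ENROLLMENTS = [
  ["Alice", :theater],
  ["Bob", :music],
  ["Carol", :theater]
].freeze

# Replace instead of mutate: += assigns a new array to the key each time.
def group_by_replacing(enrollments)
  students_by_option = Hash.new([])
  enrollments.each do |student, option|
    # Same as: students_by_option[option] = students_by_option[option] + [student]
    students_by_option[option] += [student]
  end
  students_by_option
end

result = group_by_replacing(ENROLLMENTS)
p result[:theater] # ["Alice", "Carol"]
p result[:music]   # ["Bob"]
p result[:dance]   # [] : the shared default was never mutated
```

Freezing the default, as in Hash.new([].freeze), is an optional extra guard: an accidental << on it would then raise instead of silently changing every missing key.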
