Ruby Symbols, The Good and Bad Parts

Ruby Symbols are all good and fun but if you’re not careful they can bite you in the a**.

Quick reminder, what are Ruby symbols? From http://rubylearning.com/satishtalim/ruby_symbols.html:

“Symbols are more efficient than strings. Two strings with the same contents are two different objects, but for any given name there is only one Symbol object. This can save both time and memory.”
puts "string".object_id # 21066960
puts "string".object_id # 21066930
puts :symbol.object_id #132178
puts :symbol.object_id #132178
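
To check this without comparing IDs by eye, here is a minimal sketch using `equal?`, which tests object identity rather than content. It assumes the file does not start with the `# frozen_string_literal: true` magic comment (which allows Ruby to reuse identical string literals), and the two helper method names are hypothetical, made up for this illustration.

```ruby
# Illustrative helpers, not part of the original post.
# Assumes there is no `# frozen_string_literal: true` magic comment in this file.

# Evaluates the same string literal twice and compares object identity.
def same_string_object?
  "string".equal?("string")
end

# Does the same for a symbol literal.
def same_symbol_object?
  :symbol.equal?(:symbol)
end

puts same_string_object?  # false: two separate String objects
puts same_symbol_object?  # true: one shared Symbol object
puts "string" == "string" # true: the contents are still equal
```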

As you can see, regular Ruby strings are re-created every time a string literal is evaluated, while a symbol is created once and reused for the entire duration of your program. As rubylearning.com said, this usually leads to faster programs that use less memory. If you find that hard to believe, imagine we’re looping through all our users (let’s say we have 100000 users since we’re a successful Rails startup) and we need to output their names:

User.find_each do |user|
  puts user['name'] # the 'name' string is constructed 100000 times
end

In this case, the string ‘name’ will be allocated 100000 times (on each iteration the string ‘name’ is constructed again as a different object). If you don’t understand why, look at the first example, where each time you reference a string literal you create a new object with a different object_id.

Without delving into garbage collection and lower level discussions, it should be clear why symbols are more performant in this case:

User.find_each do |user|
  puts user[:name] # the symbol :name is created just once!
end
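
The difference can be measured rather than taken on faith. The sketch below is an illustration under a few assumptions: it runs on CRuby (MRI), it uses the CRuby-specific counter `GC.stat(:total_allocated_objects)`, and it swaps the ActiveRecord model for a hypothetical `FakeUser` Struct, which accepts both `user['name']` and `user[:name]`. A Struct is used on purpose: as far as I know, CRuby can skip the allocation when the receiver is a plain Hash, so a Hash would hide the effect. `allocations` and `compare_key_allocations` are made-up helper names.

```ruby
# Hypothetical stand-in for a model object (not ActiveRecord).
FakeUser = Struct.new(:name)

# Returns how many objects were allocated while the block ran.
# Relies on a CRuby-specific GC statistic.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Reads the same attribute n times, once with a string key and once with a symbol key.
def compare_key_allocations(n)
  user = FakeUser.new('Tom Cruise')
  {
    string_key: allocations { n.times { user['name'] } },
    symbol_key: allocations { n.times { user[:name] } }
  }
end

result = compare_key_allocations(100_000)
puts "string key: #{result[:string_key]} objects allocated"
puts "symbol key: #{result[:symbol_key]} objects allocated"
```

On the setups this assumes, the string-key count should be at least as large as the number of iterations, while the symbol-key count should stay close to zero.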

Not only do we get fewer object allocations by using symbols (one allocation instead of 100000!), we also get the JavaScript-like syntax for creating hashes (available since Ruby 1.9):

my_hash = {user_name: 'Tom Cruise', age: 52}

This is a syntax that every new Ruby programmer should be familiar with, since it looks almost identical to JavaScript. It’s also the way the Ruby style guide now encourages you to create hashes: https://github.com/bbatsov/ruby-style-guide#no-mixed-hash-syntaces

The “old” way of creating hashes in Ruby is less aesthetic (though PHP programmers might be familiar with it):

my_ugly_hash = {'user_name' => 'Tom Cruise', 'age' => 52}

Or alternatively using symbols (somewhat less common, I think):

my_ugly_hash = {:user_name => 'Tom Cruise', :age => 52}

The fundamental difference is that the new style always creates symbols for the hash keys. Later I will show you when this can surprise you.
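
To make that difference concrete, here is a small sketch that writes the same data in each style and compares the results. The `hash_styles` helper and its labels are hypothetical, introduced only for this illustration; the quoted-key form with a colon assumes Ruby 2.2 or later.

```ruby
# Hypothetical helper returning the same data written in four ways.
def hash_styles
  {
    new_style:     { user_name: 'Tom Cruise', age: 52 },
    rocket_symbol: { :user_name => 'Tom Cruise', :age => 52 },
    rocket_string: { 'user_name' => 'Tom Cruise', 'age' => 52 },
    # Quotes followed by a colon still produce symbol keys (Ruby 2.2+).
    quoted_new:    { "user_name": 'Tom Cruise', "age": 52 }
  }
end

styles = hash_styles
p styles[:new_style].keys                      # [:user_name, :age]
p styles[:new_style] == styles[:rocket_symbol] # true
p styles[:quoted_new].keys                     # [:user_name, :age]
p styles[:new_style] == styles[:rocket_string] # false
p styles[:rocket_string][:user_name]           # nil
```

Only the hash rocket with quoted keys gives you string keys; everything written with the colon syntax ends up keyed by symbols.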

Symbols, the bad parts:

So what’s the problem with symbols? We’ve shown they lead to better performance and they can also be used to create hashes more elegantly (with syntax very familiar to programmers from other languages).

The problem happened at work the other day. I needed to save a bunch of event data for users, so I decided a Postgres JSON array would be a good structure to hold it.

I had code similar to this example (don’t mind the details so much, just the fact that the hash’s keys are symbols and that the hash is then saved to the database):

user.events_data = {"seen": ['post_1', 'post_2']}
user.save

When deserialising the user back to the HTML, I naively wrote something like this (mind the :seen symbol):

events = user.events_data
puts events[:seen] # BAD!

What could be more natural than using symbols on this object? I mean, if I used symbols before I put it in the database, I can expect symbols when I pull it back, right? Also, the style guide asked me to prefer symbols to strings in hashes, and even my IDE screams at me whenever I use strings as hash keys. So use a symbol, right?

Wrong! It’s a common mistake, and it happens because we use symbols as hash keys all the time without even thinking about it anymore. If you try to access events[:seen], you will get nil.

After deserialisation (e.g. pulling the JSON back from the database into a variable) you simply cannot and must not expect to use symbols on the hash. The information about your original hash (whether its keys were symbols or not) is simply lost when it is serialised into a database. After 3 years with Ruby and Rails I still make silly mistakes like that. Judging by the number of Stack Overflow questions I see on the subject, I’m in good company.

There are ways to deal with this situation that I won’t get into in this post (Rails’ HashWithIndifferentAccess is one of them). If you take anything from this post, take this:

Be careful with using symbols on objects that have been deserialised. For example:

require 'json'

serialised = {name: 'Brad Pitt', age: 52}.to_json
deserialised = JSON.parse(serialised)
deserialised[:name] # NO! This is nil, the keys are strings now
deserialised['name'] # YES, this is 'Brad Pitt'
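
For completeness, here is a minimal sketch of two standard-library safeguards, assuming the data goes through Ruby’s `json` library as in the example above. `json_round_trip` is a hypothetical helper that only simulates a save/load cycle; it is not what happened in the original incident. `symbolize_names: true` is an existing `JSON.parse` option, and `Hash#fetch` raises a `KeyError` instead of silently returning nil.

```ruby
require 'json'

# Hypothetical helper simulating a save/load cycle through a JSON column.
def json_round_trip(hash, parse_options = {})
  JSON.parse(JSON.generate(hash), parse_options)
end

original = { name: 'Brad Pitt', age: 52 }

default_keys    = json_round_trip(original)
symbolized_keys = json_round_trip(original, symbolize_names: true)

p default_keys.keys      # ["name", "age"]
p symbolized_keys.keys   # [:name, :age]
p symbolized_keys[:name] # "Brad Pitt"

# fetch fails loudly on a missing key instead of returning nil.
begin
  default_keys.fetch(:name)
rescue KeyError => e
  puts "KeyError: #{e.message}"
end
```

Whether to symbolize on the way out or to stick with string keys is a judgment call; the point is to pick one convention for deserialised data and make mismatches fail visibly.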

And one last thought: for a language as beginner-friendly and elegant as Ruby, symbols are an oddity. They’re unfamiliar to many programmers coming from different backgrounds, and they can bite you in the ass sometimes. Even Ruby isn’t perfect, I guess.
