Effective Ruby Idioms to Write Better Code

Ruby is a very elegant language. Its idioms can make your code incredibly expressive and help you write readable, maintainable code. However, it’s not the simplest language to master, and things can go wrong if you don’t pay attention to best practices.

The intention of this series of posts is to summarize some useful points from Effective Ruby: 48 Specific Ways to Write Better Ruby that you can start using to improve your code, so I’ll get to it right away!

UNDERSTAND WHAT RUBY CONSIDERS TO BE TRUE

Every language has its own way of dealing with Boolean values. In some of them, the number zero is considered false while all other numbers are true. Empty strings, arrays, and objects can also have different truthiness across languages, so understanding how this works in Ruby can help you find bugs quickly.

In Ruby, the rule for truthiness is very simple, so it’s worth memorizing:

Every value is true except false and nil

Consider the following:

irb> !![]
=> true
irb> !!false
=> false
irb> !!nil
=> false
irb> !!{}
=> true
irb> !!0
=> true

Understanding this will make it easier to know whether an expression will evaluate to true or false. Sometimes, however, you want to differentiate between false and nil. In certain contexts, false can mean that something is disabled, while nil can mean that something wasn’t explicitly specified. The easiest way to achieve this is to use the nil? method.
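As a minimal sketch of that distinction, the hypothetical helper below (describe_setting is not from the book) treats nil as "not specified" and false as "disabled". It compares with false on the left of ==, so the comparison always uses the built-in implementation rather than whatever == the other object defines.

```ruby
# Hypothetical helper: interprets a three-state setting.
def describe_setting(value)
  if value.nil?
    "not specified"   # nothing was passed explicitly
  elsif false == value
    "disabled"        # explicitly switched off
  else
    "enabled"         # every other value is truthy
  end
end

describe_setting(nil)    # => "not specified"
describe_setting(false)  # => "disabled"
describe_setting(0)      # => "enabled" (zero is true in Ruby)
```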

TREAT ALL OBJECTS AS IF THEY COULD BE NIL

undefined method `fubar' for nil:NilClass (NoMethodError)

This error is like a plague, even in well-tested applications. It happens when you call a method on an object that turns out to be that pesky nil object, the one and only instance of NilClass (remember that in Ruby everything is an object, even true, false, and classes!). There are many ways that nil can unexpectedly get introduced into your application. The best defense is to assume that any object might actually be nil. This includes arguments passed to methods and return values from them.

One of the easiest ways to avoid invoking methods on the nil object is to use the nil? method. It returns true if the receiver (the object on which you’re calling the method) is nil and false otherwise. For example:

irb> nil.nil?
=> true
irb> false.nil?
=> false
irb> hash = {}
irb> hash[:test].nil?
=> true

It’s often easier to explicitly convert a variable to the expected type than to worry about nil all the time. The Object class defines several conversion methods which come in handy here. For example:

irb> 42.to_s
=> "42"
irb> nil.to_s
=> ""

As you can see, NilClass#to_s returns an empty string. What makes this really nice is that String#to_s simply returns self without performing any conversion or copying. If a variable is already a string, then calling .to_s on it has minimal overhead. This lets you safely work with an argument that might be nil.
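A small sketch of the idea, using a hypothetical fix_title helper: calling to_s first means the method behaves the same whether it receives a string or nil.

```ruby
# Hypothetical helper: capitalizes a title that may be missing.
def fix_title(title)
  # nil.to_s is "", and String#to_s returns the string itself,
  # so capitalize always has a String receiver.
  title.to_s.capitalize
end

fix_title("effective ruby")  # => "Effective ruby"
fix_title(nil)               # => ""
```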

Almost all the built-in classes have matching conversion methods: .to_a, .to_i and .to_f are some examples.
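For instance, NilClass implements each of these, so the same defensive conversion works for arrays and numbers. The average helper below is a hypothetical illustration, not code from the book.

```ruby
nil.to_a  # => []
nil.to_i  # => 0
nil.to_f  # => 0.0

# Hypothetical helper: averages a list that may be nil.
def average(values)
  list = values.to_a             # nil becomes an empty array
  return 0.0 if list.empty?

  list.sum(&:to_f) / list.size   # to_f also guards each element
end

average(nil)        # => 0.0
average([1, 2, 6])  # => 3.0
```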

BE AWARE OF THE DIFFERENT BEHAVIORS OF SUPER

Even though super is used and acts like a method, it’s actually a language keyword. This difference is important because super changes its behavior based on something that is supposed to be optional in Ruby: parentheses. Choosing to omit parentheses when using super changes the way it works. To see how parentheses change things, let’s review the three ways that super can be written:

  • If you give super at least one argument then it acts like a regular Ruby method and the parentheses are optional. In this form, super passes along the exact arguments it was given to the target method.
  • If no arguments are given and no parentheses are used then super will invoke the target method with all of the arguments which were given to the enclosing method. It also forwards along the method’s block if one is associated with it.
  • When you don’t actually want to pass any arguments to the overridden method you need to write super with an empty set of parentheses, i.e. super(). This use of super looks especially weird and unnatural in Ruby, but it’s the only way to call an overridden method with no arguments (and no block).

Here are the same rules from above expressed in code:
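A minimal sketch of the three forms. The Base class and its subclasses are hypothetical names chosen for illustration.

```ruby
class Base
  def greet(name = "world", punctuation = "!")
    "Hello, #{name}#{punctuation}"
  end
end

class ExplicitArgs < Base
  def greet(name, punctuation)
    super(name.upcase, punctuation)  # passes exactly these arguments
  end
end

class ImplicitArgs < Base
  def greet(name, punctuation)
    super    # forwards name and punctuation (and any block) as given
  end
end

class NoArgs < Base
  def greet(name, punctuation)
    super()  # passes nothing, so Base#greet falls back to its defaults
  end
end

ExplicitArgs.new.greet("ruby", "?")  # => "Hello, RUBY?"
ImplicitArgs.new.greet("ruby", "?")  # => "Hello, ruby?"
NoArgs.new.greet("ruby", "?")        # => "Hello, world!"
```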

PREFER STRUCT TO HASHES FOR STRUCTURED DATA

Hash tables are useful, general-purpose data structures which are employed heavily by Ruby programmers. When it comes to working with key-value pairs, Hash is definitely the go-to class.

However, when working with structured data in an OOP language we often have better choices than hashes.

Say you’re interested in exploring annual weather data from a local weather station. Armed with a CSV file you plan to load the data into an array and play around with it. Each row in the CSV file contains temperature statistics for a single month. You’ve decided to extract the interesting columns from each row and store them in a Hash. Consider this:
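A minimal sketch of such a class, assuming a CSV file with a header row containing date, high, and low columns. Those column names are an assumption made for this example; a real weather export will likely differ.

```ruby
require 'csv'
require 'date'

class AnnualWeather
  def initialize(file_name)
    @readings = []

    # Column names "date", "high" and "low" are assumed for illustration.
    CSV.foreach(file_name, headers: true) do |row|
      @readings << {
        date: Date.parse(row["date"]),
        high: row["high"].to_f,
        low:  row["low"].to_f,
      }
    end
  end

  # Mean temperature of the year: the average of the monthly means.
  def mean
    return 0.0 if @readings.empty?

    total = @readings.reduce(0.0) do |sum, reading|
      sum + (reading[:high] + reading[:low]) / 2.0
    end

    total / @readings.size
  end
end

# Usage, with a hypothetical file name:
#   AnnualWeather.new("weather.csv").mean
```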

Each month in the @readings array has high and low temperatures. You’d like to know the mean temperature of the year, which means you’ll also need to know the mean temperature for each month. However, because a hash can’t carry that logic itself, we resort to folding it into the more general aggregation method, mean.

This array of hashes represents a collection of objects, except that you can’t access their attributes through getter methods. You have to use the hash index operator instead. This might be a minor issue but it’s one which will have an impact on the AnnualWeather class’s interface. Furthermore, every time you want to work with these hashes internally you will need to go back to the initialize method to remind yourself which keys are available. Again, as long as the keys are set in a single method the burden is fairly low.

Using hashes to stand in for really simple classes happens a lot in Ruby. Sometimes it’s completely fine, but more often than not we really should be creating dedicated types for this sort of object. The thought of creating a new class for something so simple seems like an unnecessary chore. Fortunately, that’s exactly what the Struct class is for.

Here’s what our class would look like if we used a Struct object instead:
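A sketch of the Struct-based version, again assuming hypothetical date, high, and low CSV columns. The Reading constant holds the class returned by Struct.new.

```ruby
require 'csv'
require 'date'

class AnnualWeather
  # Struct.new returns a new class with date, high and low accessors.
  Reading = Struct.new(:date, :high, :low)

  def initialize(file_name)
    @readings = []

    # Column names are assumed for illustration.
    CSV.foreach(file_name, headers: true) do |row|
      @readings << Reading.new(Date.parse(row["date"]),
                               row["high"].to_f,
                               row["low"].to_f)
    end
  end

  def mean
    return 0.0 if @readings.empty?

    total = @readings.reduce(0.0) do |sum, reading|
      sum + (reading.high + reading.low) / 2.0  # getters, not hash keys
    end

    total / @readings.size
  end
end
```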

The existing mean method required few changes but now has a much better OOP feel to it. Accessing the high and low attributes through getter methods has a subtle side effect too. Typos in attribute names will now raise a NoMethodError exception. The same isn’t the case when using hashes. Attempting to access an invalid key in a hash doesn’t raise an exception but instead returns nil. This usually means that you’ll wind up with a more obscure TypeError exception later in the code.
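The difference is easy to see side by side. In this sketch the Reading struct is trimmed to two attributes and error_class is a hypothetical helper that reports which exception a block raises.

```ruby
Reading = Struct.new(:high, :low)  # trimmed-down, hypothetical definition

hash_reading   = { high: 30.5, low: 18.0 }
struct_reading = Reading.new(30.5, 18.0)

# Hypothetical helper: returns the class of the error raised by the block.
def error_class
  yield
  nil
rescue StandardError => e
  e.class
end

hash_reading[:hihg]  # => nil, the typo goes unnoticed here

# The nil only causes trouble later, when it is used in arithmetic.
error_class { hash_reading[:low] + hash_reading[:hihg] }  # => TypeError

# The same typo on a Struct fails immediately and names the bad method.
error_class { struct_reading.hihg }                       # => NoMethodError
```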

Another improvement with Struct is that we can now do something which we wanted to do earlier, define a mean method for each month. We can optionally pass a block to Struct::new. For example, we can modify our Struct definition to include the mean method:
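A minimal sketch of that definition. Methods defined in the block become instance methods of the generated class, so the yearly aggregation no longer needs to know how a monthly mean is computed. The sample readings are made up for illustration.

```ruby
require 'date'

Reading = Struct.new(:date, :high, :low) do
  # Mean temperature for this single month.
  def mean
    (high + low) / 2.0
  end
end

# Made-up sample data.
readings = [
  Reading.new(Date.new(2014, 1, 1), 10.0, 0.0),
  Reading.new(Date.new(2014, 2, 1), 20.0, 10.0),
]

readings.first.mean                  # => 5.0
readings.sum(&:mean) / readings.size # => 10.0
```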

For those times when it seems too heavy-handed to create a new class, Struct can be very useful. And unlike a bunch of unrelated yet uniform hashes, Struct lets you define instance and class methods. That’s perfect for when you need to add a few simple behaviors to these otherwise boring objects.

DUPLICATE COLLECTIONS PASSED AS ARGUMENTS BEFORE MUTATING THEM

Most objects are passed around as references and not as actual values. When these types of objects are inserted into a container the collection class is actually storing a reference to the object and not the object itself.

Mutating an element of an array affects the original object which may still be available outside of the array. Similarly, when a method alters one of its arguments, those changes are visible outside of the method. Sometimes that’s the entire point of the method and its author was kind enough to suffix the method name with “!”. Most of the time, however, you really don’t want your objects to be mutated behind your back.

Collections (such as arrays and hashes) tend to be modified heavily during their lifetime. A common mistake is to mutate a collection which was passed as an argument to a method, often without realizing that the original collection will also be modified.

Consider the following:
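A minimal sketch of the problem. The OddPunisher class name comes from the discussion here, but its methods (initialize and punish!) are assumptions made for this example.

```ruby
class OddPunisher
  def initialize(numbers)
    @numbers = numbers  # stores a reference to the caller's array
  end

  def punish!
    @numbers.delete_if(&:odd?)  # mutates that same array in place
  end
end

numbers = [1, 2, 3, 4, 5]
OddPunisher.new(numbers).punish!

numbers  # => [2, 4]
```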

As you can see, the numbers array was modified after it was mutated within the OddPunisher class. The delete_if method mutates the collection instead of returning a new copy. Anyone using this class who expected the numbers array to remain unchanged would be in for a surprise.

One way of fixing this is by using a method that doesn’t mutate the collection (using reject instead of delete_if). However, there are times when you do need to mutate the collection, and for those cases what you want is to create a copy of the collection instead of a reference to the original. This way you can mutate the new collection however you want.
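A short sketch of the non-mutating option, using a hypothetical punish method (without the "!") on the same class. Array#reject builds and returns a new array, leaving its receiver untouched.

```ruby
class OddPunisher
  def initialize(numbers)
    @numbers = numbers
  end

  # Hypothetical non-mutating variant: returns a new array.
  def punish
    @numbers.reject(&:odd?)
  end
end

numbers = [1, 2, 3, 4, 5]

OddPunisher.new(numbers).punish  # => [2, 4]
numbers                          # => [1, 2, 3, 4, 5]
```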

There are two methods in Ruby which allow you to create copies of objects, dup and clone. They’re not equivalent, but you usually want to use dup when you’re planning to mutate the resulting object. Our class then simply needs to be modified like this:

@numbers = numbers.dup

It’s a common pattern to duplicate collection objects given as method arguments.
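Put together, a sketch of the modified class might look like this (the method names are again assumptions). The last lines show one way dup and clone differ: clone preserves the frozen state of the original, while dup returns an unfrozen copy, which is why dup is the usual choice when you intend to mutate.

```ruby
class OddPunisher
  def initialize(numbers)
    @numbers = numbers.dup  # private copy; the caller's array is left alone
  end

  def punish!
    @numbers.delete_if(&:odd?)
  end
end

numbers = [1, 2, 3, 4, 5]

OddPunisher.new(numbers).punish!  # => [2, 4]
numbers                           # => [1, 2, 3, 4, 5]

frozen = [1, 2, 3].freeze
frozen.dup.frozen?    # => false
frozen.clone.frozen?  # => true
```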

One thing to be aware of, though, is that dup and clone return shallow copies. For collection classes this means that the container is duplicated but not the elements. Consider this:

irb> a = ["Polar"]
irb> b = a.dup << "Bear"
=> ["Polar", "Bear"]
irb> b.each {|x| x.sub!('lar', 'oh')}
=> ["Pooh", "Bear"]
irb> a
=> ["Pooh"]

If you need to create a deep copy of a collection and you’re writing a new collection class, you can override the initialize_copy method, which lets you control the depth of the duplication process. If you’re using existing collections then you’ll have to roll your own. There’s a quick solution which works the majority of the time:

irb> a = ["Monkey", "Brains"]
irb> b = Marshal.load(Marshal.dump(a))
irb> b.each(&:upcase!); b.first
=> "MONKEY"
irb> a.last
=> "Brains"

This comes with a lot of limitations. It takes time to serialize and deserialize an object, and you need to consider the amount of memory needed. This method will also not work with some of the core Ruby classes, including the IO and File classes. Moreover, objects which contain closures or which have singleton methods cannot be serialized.
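For the other route mentioned above, overriding initialize_copy in your own collection class, here is a minimal sketch. The Playlist class is hypothetical, and this version only copies one level deep (the array and each string in it), which is enough for this example but is an assumption about how deep you need to go.

```ruby
class Playlist
  attr_reader :tracks

  def initialize(tracks)
    @tracks = tracks
  end

  # Ruby calls this on the new object from both dup and clone,
  # passing the object being copied.
  def initialize_copy(original)
    super
    @tracks = original.tracks.map(&:dup)  # new array holding new strings
  end
end

original = Playlist.new(["Polar", "Bear"])
copy     = original.dup

copy.tracks.each { |track| track.sub!("lar", "oh") }

copy.tracks      # => ["Pooh", "Bear"]
original.tracks  # => ["Polar", "Bear"]
```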

THINGS TO REMEMBER

  • Every value is true except false and nil.
  • Unlike in a lot of languages, the number zero is true in Ruby.
  • If you need to differentiate between false and nil, either use the nil? method or use the “==” operator with false as the left operand.
  • Due to the way Ruby’s type system works, any object can be nil.
  • The nil? method returns true if its receiver is nil and false otherwise.
  • Using super with no arguments and no parentheses is equivalent to passing it all of the arguments which were given to the enclosing method.
  • If you want to use super without passing the overridden method any arguments, you must use empty parentheses, i.e. super().
  • When dealing with structured data which doesn’t quite justify a new class, prefer using Struct to Hash.
  • Assign the return value of Struct::new to a constant and treat that constant like a class.
  • Method arguments in Ruby are passed as references, not values. Notable exceptions to this rule are Fixnum objects (plain Integer objects since Ruby 2.4, which merged Fixnum into Integer).
  • Duplicate collections passed as arguments before mutating them.
  • The dup and clone methods only create shallow copies.

All credit goes to the author of Effective Ruby: 48 Specific Ways to Write Better Ruby. I definitely recommend the book to anyone who’s trying to improve their Ruby code. It contains much more useful information than what is presented here.