A Mental Model for Understanding Encapsulation in Ruby
A discussion of encapsulation, how it is implemented in Ruby, and how exactly it benefits the code we write on a practical level.
Encapsulation in object oriented programming is the grouping of data into objects while making that data unavailable to other parts of a codebase. It’s one of the fundamental conceptual pillars of object oriented programming (along with abstraction, polymorphism and inheritance).
More simply, encapsulation is akin to placing information into a bucket, closing the lid and then hiding that information from the curious and prying eyes of others. Within the context of object oriented programming, the data we’re hiding is the attributes that an object has (which together comprise the state of the object) and the functionality (behaviors) of which the object is capable. Therefore, to say that we have encapsulated data is to say that we have hidden the state and behaviors of an object from the rest of the codebase. To say that we are hiding data is to say that we are making that data inaccessible from the rest of the program, which also protects that data from unintentional manipulation.
In this article I discuss what exactly encapsulation looks like, how methods are used to encapsulate data or expose it to the rest of a codebase, and how exactly encapsulation benefits our ability to design complex programs.
The Hidden Nature of Objects
Ruby uses instance variables to track the values associated with an object’s attributes. For example, we may have a Dog class whose objects have the following attributes: name, weight and age. To track these attributes, our Dog class has instance variables of the same name: @name, @weight and @age. Together, these three instance variables comprise the state of objects instantiated from the Dog class.
Let’s consider the following example:
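The listing below is a minimal sketch of what such a class might look like, not necessarily the exact code under discussion. Passing the values as constructor arguments is an assumption; the attribute names and values come from the text.

```ruby
class Dog
  def initialize(name, weight, age)
    @name = name     # e.g. 'Spot'
    @weight = weight # in lbs
    @age = age       # in years
  end
end

spot = Dog.new('Spot', 12, 4)
puts spot.inspect # prints something like #<Dog:0x... @name="Spot", @weight=12, @age=4>
```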
Here we have initialized a local variable spot to a new object of the Dog class. The state of this object is represented by the instance variables that were initialized when the object was created, which tell us that the @name of this object is 'Spot', that it has a @weight of 12 (lbs) and that its @age is 4. Inspecting spot returns a human readable representation of the object: the class name of our object, an encoding of the object id and, most importantly for this discussion, all of the instance variables that have been initialized within that object along with their values.
At their most basic level, objects are distinct entities that are separate and independent of each other. We could also say that the internal representation of each object is hidden from other objects: one object is unable to directly access information about another. Practically, this means that we cannot interact with an object unless we explicitly expose the state and behaviors of that object to other parts of our code. Ruby uses methods to expose the state of an object, but without these methods an object remains what is essentially an information silo, closed off and isolated from the rest of the world (or, rather, code). With these methods, we can retrieve and manipulate the data within an object.
To use the example above, our dog spot has a number of attributes. However, with the code as currently written there is no way to access them from the rest of the program. To demonstrate this we can attempt to reference one of the instance variables from outside of the Dog class, where the instance variable was initialized.
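A minimal sketch of that attempt might look like the following. It assumes the same hypothetical Dog class as before, with the reference to @name placed at the top level of the program.

```ruby
class Dog
  def initialize(name, weight, age)
    @name = name
    @weight = weight
    @age = age
  end
end

spot = Dog.new('Spot', 12, 4)
puts spot.inspect # the object itself does hold @name="Spot"

p @name           # prints nil: a bare @name here does not reach into spot
```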
As we saw before, the @name instance variable was initialized when spot was created, and it was assigned the value 'Spot'. We also confirmed that the instance variables have been initialized when we inspected spot, which returned the instance variables and their values along with other information about the object. Yet when we attempt to reference @name directly from outside of the class, nil is returned.
There’s a good chance this result isn’t surprising to you, but it might not be surprising for the reason you think. The reason nil is returned when directly referencing the @name instance variable outside of the Dog class is that instance variables are encapsulated in the object where they were initialized. Outside of the class, a bare @name refers to an instance variable of whatever object is self at that point (at the top level of a program, Ruby’s main object), and referencing an instance variable that was never initialized evaluates to nil rather than raising an error. As you recall from the beginning of our discussion, when data is encapsulated it is hidden (that is to say, inaccessible) from other parts of our codebase. Instance variables are hidden by default unless we deliberately create a way to access their values. To expose an object’s instance variables, to take them out of hiding, so to speak, we have to define methods that allow us to access the instance variable’s value outside of the object.
Using Methods to Open Objects to the Rest of the World
Instance variables are scoped at the object level, meaning that an instance variable exists only within the object where it was initialized. Therefore, to gain access to the values assigned to an object’s instance variables we have to consider the object as the gateway to access them.
To access these values, we need to define instance methods that expose them to other parts of our code. When an object’s instance variables are exposed we can retrieve the values assigned to them, or we can manipulate those values altogether. When we don’t have methods that expose them, they remain locked and hidden within the object.
Let’s define instance methods that allow us to retrieve and manipulate the value assigned to one of our instance variables; such methods are called getter and setter methods, respectively. For this example, we’ll use the @name instance variable.
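As a minimal sketch, a getter and setter for @name could be written as follows. The one-argument constructor, the parameter name new_name and the replacement value 'Rover' are assumptions made for illustration.

```ruby
class Dog
  def initialize(name)
    @name = name
  end

  # Getter: returns the value currently assigned to @name.
  def name
    @name
  end

  # Setter: reassigns @name to the argument.
  def name=(new_name)
    @name = new_name
  end
end

spot = Dog.new('Spot')
puts spot.name      # prints Spot
spot.name = 'Rover' # invokes the name= setter
puts spot.name      # prints Rover
```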
First we’ve defined our getter method, name. When we invoke the name method on spot, it simply returns the value assigned to the @name instance variable. Likewise, we’ve defined our setter method, name=(), which reassigns the @name instance variable to the string passed as an argument to the method.
And voilà, we are now able to access the value assigned to @name from outside of the class where it was initialized. With both getter and setter methods in place, we can now both retrieve and reassign the value assigned to @name.
You’ll notice, of course, that the name and name=() instance methods are invoked on our object spot. This is because instance methods can only be invoked on objects of the class where those methods are defined (or of its subclasses). As mentioned previously, if we want to access information about the state of an object, we have to consider the object as the gateway to that information. If we were to invoke, for example, the name method apart from spot, an error would be raised (at the top level of a program, a NameError), because no such method is defined for the object doing the calling.
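To make this concrete, here is a small sketch that assumes the code runs at the top level of a script, where self is Ruby’s main object. The Dog class is the same hypothetical one used above.

```ruby
class Dog
  def initialize(name)
    @name = name
  end

  def name
    @name
  end
end

spot = Dog.new('Spot')
puts spot.name # works: spot is the calling object

begin
  name # no calling object: Ruby looks for name on main, not on spot
rescue NameError => e
  puts e.class # prints NameError
end
```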
Controlling How Objects are Exposed with Method Access Control
Encapsulation allows a programmer to have fine tuned control over what data is hidden within an object and what data is exposed to other parts of a program. Think of it like peeling back the layers of an onion: rather than exposing the entirety of the inner onion, I can peel back as many or as few layers as I want, revealing only as much of the onion as is necessary.
Any discussion of encapsulation necessarily entails a discussion of how exactly Ruby accomplishes the encapsulation of data. Like many other programming languages, Ruby uses the concept of access control to restrict and open access to the methods that allow one to retrieve and manipulate data within an object. Within the context of Ruby, we call this mechanism method access control.
Method access control uses access modifiers to control access to methods. In Ruby, access modifiers are public, private and the less commonly used protected. Public methods are available outside of the class where they are defined, meaning that they can be invoked anywhere without restriction. In Ruby, methods are public unless explicitly declared to be private or protected (initialize is a notable exception, since it is always private). In our code example above where we defined getter and setter methods for the @name instance variable, these are public methods since we did not declare them to be otherwise. We also saw this in practice when we successfully invoked the name and name=() methods from outside of the Dog class.
Private methods, on the other hand, can only be invoked from within the class itself. It is also the case that private methods cannot be invoked with an explicit caller, because they are implicitly invoked on self (with one exception as of Ruby 2.7, noted at the end of this article).
Let’s look at an example of how to declare a method to be private, as well as the implications of doing so.
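The following is a minimal sketch of such a class, assuming the same hypothetical Dog with a one-argument constructor. The exact wording of the error message varies between Ruby versions, so it is printed rather than spelled out.

```ruby
class Dog
  def initialize(name)
    @name = name
  end

  def name=(new_name)
    @name = new_name
  end

  private

  # Everything defined below the private invocation is private.
  def name
    @name
  end
end

spot = Dog.new('Spot')

begin
  spot.name
rescue NoMethodError => e
  puts e.message # the message notes that name is a private method
end
```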
To declare a method as a private method, we simply include the private method invocation, followed by any method definitions we intend to be private. In the above code, we have taken the name getter method and moved it below the private method invocation. And the result? Invoking a private method from outside the class raises a NoMethodError, with a message indicating that the method we have tried to call is a private method. While Ruby typically raises a NoMethodError when it doesn’t find a method in the calling object’s lookup path, in this case it raises the error not because the method doesn’t exist but because making the method private has blocked access to it.
As stated previously, private methods can only be invoked from within the class where the method is defined. Now that we’ve declared the name method as a private method, here’s an example of how we can call it without raising an error.
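One way this might look is sketched below. The speak method is named in the discussion, but the string it returns is an assumption.

```ruby
class Dog
  def initialize(name)
    @name = name
  end

  # Public method that relies on the private getter internally.
  def speak
    "#{name} says woof!"
  end

  private

  def name
    @name
  end
end

spot = Dog.new('Spot')
puts spot.speak # prints Spot says woof!
```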
Here we’ve interpolated the return value of the name method within a new method definition, speak, and invoking speak on spot produces the resulting output. Whereas we could not invoke the private name method from outside of the class, this demonstrates that private methods can be invoked by other instance methods of the same class. It also demonstrates how we can encapsulate methods, since private methods are accessible only within the class itself and are protected from invocation outside of the class.
Protected methods lie between public and private methods. Like private methods, they can only be invoked from within the class where they are defined. However, unlike private methods they can be invoked on a calling object other than self, so long as the calling object is an instance of the same class (or one of its subclasses). Protected methods aren’t commonly used, but one common use case is comparing objects of the same class.
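A minimal sketch of that use case follows. The weight and < methods are the ones discussed here, while the second dog ('Rex') and its values are made up for illustration.

```ruby
class Dog
  def initialize(name, weight, age)
    @name = name
    @weight = weight
    @age = age
  end

  def <(other)
    weight < other.weight # other.weight is allowed: other is also a Dog
  end

  protected

  def weight
    @weight
  end
end

spot = Dog.new('Spot', 12, 4)
rex = Dog.new('Rex', 30, 6)

puts(spot < rex) # prints true

begin
  spot.weight
rescue NoMethodError => e
  puts e.message # the message notes that weight is a protected method
end
```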
In the above example we’ve declared weight as a protected method. This allows us to protect the weight method from being accessed outside of the Dog class, while also allowing us to compare the weight of objects of the Dog class, as defined in the < method. Like private methods, protected methods can only be invoked from within the class, but unlike private methods they can be invoked on both self and other objects of the same class.
The Practical Benefits of Encapsulating Data in Objects
Learning object oriented programming for the first time can be very challenging. Not only is it a major conceptual shift in how we think about programming, but it can be difficult to translate object oriented programming on a conceptual level to how it benefits the code we write on a practical level. This is true as well of understanding encapsulation and how it is implemented in Ruby.
Encapsulating data into objects has two chief benefits:
- It protects data from unintentional manipulation. In other words, in order to change data within an object there must be obvious intention behind doing so. It also means that we can restrict the way in which data is manipulated, so as to prevent it from being manipulated arbitrarily.
- It allows us to hide complex operations while leaving a simple public interface to interact with those more complex operations.
Let’s illustrate these benefits with an example.
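The sketch below is one way such an example could be written. The names start_engine, start_motor, @engine_status, Person, Car and joe come from the discussion; turn_key, engage_battery, ignite_fuel and the status strings are hypothetical stand-ins.

```ruby
class Person
  def initialize(name)
    @name = name
  end

  # The only Car method a Person needs to know about is start_engine.
  def turn_key(car)
    car.start_engine
  end
end

class Car
  attr_reader :engine_status # public getter

  def initialize
    @engine_status = 'off'
  end

  # Simple public interface that hides the steps below.
  def start_engine
    engage_battery
    ignite_fuel
    start_motor
  end

  private

  attr_writer :engine_status # private setter

  # engage_battery and ignite_fuel are hypothetical intermediate steps.
  def engage_battery; end

  def ignite_fuel; end

  def start_motor
    # Setters need an explicit self; Ruby permits this for private setters.
    self.engine_status = 'running'
  end
end

joe = Person.new('Joe')
car = Car.new
joe.turn_key(car)
puts car.engine_status # prints running
```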
The above code illustrates a simple but key point: in order to start the car, the object joe doesn’t need to know the implementation details of every method involved in starting the engine, or even that those methods exist. More specifically, objects of the Person class do not need to access start_motor or the other private methods that are all necessary steps in starting an engine. Rather, the only method that objects of the Person class need to know about is the start_engine method; all other implementation details that follow from this method can remain hidden and inaccessible.
And that is encapsulation in practice. We’ve packaged all of the complex details involved in starting an engine, made them inaccessible outside of the Car class and instead defined a simple public interface (the start_engine method) to handle all of the underlying complexity. In fact, this models the real world implementation of starting a car: one doesn’t need to know the internal mechanics of how exactly a car engine starts. Instead, a person only needs to know how to turn the ignition with a key. The rest of the implementation happens under the hood and out of sight; in other words, it is encapsulated.
Notice as well that while we have defined a public getter method for the @engine_status instance variable, we have made its setter method private. While we may want the status of the engine to be publicly accessible (for example, so that a mobile app is able to check if the engine is running), we want only the internal implementation of the object’s class to be able to reassign it, which protects it from being directly changed from outside the class. In practical terms, we don’t want an object other than a Car to be able to modify @engine_status. Rather, we want it to be changed only as a result of the internal implementation that begins with the public start_engine method and ends with the private start_motor method. We want the value of @engine_status to reflect the actual status of the engine, while also preventing arbitrary changes that don’t. Using method access control to structure our methods this way ensures that @engine_status is manipulated with clear intention and only in the specific way we’ve designed it to be changed in our program.
In summary, encapsulation allows a programmer to group data into objects and then hide that data from the rest of the codebase. Likewise, it also allows a programmer to expose only data that needs to be accessed outside of the class. By encapsulating data, we can prevent arbitrary changes to data, and we can also hide complex operations while providing a simple public interface to interact with them.
- As of Ruby 2.7, private methods can also be invoked with self as an explicit caller (for example, self.name).
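As a brief sketch of that exception, assuming Ruby 2.7 or later and the same hypothetical Dog class and speak method used earlier:

```ruby
class Dog
  def initialize(name)
    @name = name
  end

  def speak
    "#{self.name} says woof!" # explicit self; allowed as of Ruby 2.7
  end

  private

  def name
    @name
  end
end

puts Dog.new('Spot').speak # prints Spot says woof! on Ruby 2.7 or later
```

On earlier versions of Ruby, the same call to self.name would raise a NoMethodError.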