Many students learn the principles of object-oriented programming at university and are told that it is one of the crucial pieces of knowledge they need in order to find a good job in the industry. They graduate thinking that object-oriented programming is the best paradigm there is and that they should apply it everywhere. Many of them don’t know that it isn’t perfect and has its weaknesses. Here we are going to review some of them.
Object-oriented programming focuses on modeling the world with objects. Objects are models of entities that represent what these entities are, not what they can do. The object-oriented approach allows us to:
- Use data abstraction (and abstraction in general) to give a meaningful representation of the data that we care about.
- Use encapsulation (or information hiding), which hides the implementation of behavior, i.e. it lets other objects focus on what the object can do, given what it is, and not on how it does it.
Data Abstraction and Encapsulation
Object-oriented programming gives us tools to abstract away details and hide implementation through encapsulation, so we can use what we need without caring how it is done.
A popular example of data abstraction is a Car class with a fuel property that tells us how full the tank is. By exposing a getter, fuel, that returns how full the tank is as a percentage of its capacity and not in its actual representation in liters, we abstract away the details of the class.
Sometimes we don’t care how many liters are currently in the tank; we only want a sense of how much fuel there is relative to the capacity the tank can hold. A value of 0.1 would tell us that the fuel is low and we need to fill up, whereas a value of 0.9 would tell us that the tank is almost full and won’t need filling any time soon.
Thanks to encapsulation, in the future we can change the actual representation of the data from float to double, or from liters to grams (in case we have solid fuel), and none of the classes that depend on the Car class will even know that something has changed. Nothing will break.
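As a minimal sketch of this idea, the Car class below keeps its fuel in liters internally but only exposes how full the tank is as a fraction of its capacity. The field names, the constructor and the choice of double are assumptions made for illustration.

```java
// Hypothetical Car class: field names and units are assumptions.
final class Car {
    // Internal representation; it never leaves the class.
    private final double tankCapacityLiters;
    private final double fuelLiters;

    Car(double tankCapacityLiters, double fuelLiters) {
        if (tankCapacityLiters <= 0 || fuelLiters < 0 || fuelLiters > tankCapacityLiters) {
            throw new IllegalArgumentException("invalid tank state");
        }
        this.tankCapacityLiters = tankCapacityLiters;
        this.fuelLiters = fuelLiters;
    }

    /** Returns how full the tank is, as a fraction between 0.0 and 1.0. */
    double fuel() {
        return fuelLiters / tankCapacityLiters;
    }
}
```

Callers only ever see the fraction, so switching the private fields to another unit would not require changing them.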
The same idea can be applied to monsters in an RPG game. Each monster has a health bar displayed right above it, and the bar is filled according to the portion of health that has not yet been lost: 100% health displays a full green bar, 50% health displays a half-full yellowish bar, and 10% health displays a nearly empty red bar. All the inner calculations of damage taken and dealt, along with the actual health numbers, are encapsulated inside the object and never leave it until we really need them to, for example to display the actual number of health points of each monster instead of a health bar.
By using data abstraction and encapsulation we hide pieces of data and code behind meaningful names and minimize the amount of work needed when something changes. Changing the fuel tank units from liters to milliliters won’t affect any other component in the system.
We can see the usefulness of this approach right away. Programs are always evolving and rarely stay the same. By detecting the axes of change in a program, we can encapsulate and abstract away the things that are likely to change frequently, so that each such change requires fewer modifications to the code.
- Object-oriented programming ties data and the functions that operate on it together inside an object, where methods hide the data behind abstractions and give it meaningful names. This improves readability and results in a system that is easier to maintain.
- Object-oriented programming gives us tools to put these ideas into practice, such as inheritance, interfaces and polymorphism.
Procedural vs Object-Oriented programming
If object-oriented programming ties data and functions together by hiding data and exposing functions that work on that data, then procedural programming is the complete opposite: we separate the data from the functions. Quite often we see data classes alongside service/helper/utility classes that operate on them. A great quote from the book Clean Code by Robert C. Martin (Uncle Bob):
Objects hide their data behind abstractions and expose functions that operate on that data. Data structures expose their data and have no meaningful functions.
Let’s see some examples of code in object-oriented style versus procedural style.
A very common example uses shapes such as a circle, a rectangle and a square. Let’s say we are doing object-oriented programming and we want our program to be able to calculate the area of a shape. We would give every shape class its own area() method, behind a common Shape interface.
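A minimal sketch of what such an object-oriented version could look like follows. The Shape interface and the area() method are named in the text; the exact fields and constructors are assumptions.

```java
// Every shape hides its data and exposes behavior through the Shape interface.
interface Shape {
    double area();
}

final class Square implements Shape {
    private final double side; // hidden from the outside world

    Square(double side) {
        this.side = side;
    }

    @Override
    public double area() {
        return side * side;
    }
}

final class Rectangle implements Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }

    @Override
    public double area() {
        return width * height;
    }
}

final class Circle implements Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    @Override
    public double area() {
        return Math.PI * radius * radius;
    }
}
```

Code that holds a Shape reference can call area() without knowing which concrete shape it is working with.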
Notice that all properties (fields) of Square are private and hidden from the outside world, and that we expose the function area(), which operates on these private fields. Notice also that we use object-oriented tools such as polymorphism and inheritance.
What would happen if in the future we needed another shape, like a triangle? No problem: we just create another class named Triangle, make it implement the Shape interface with its area() method, and we are done. We do not need to touch any existing code in our project, none! We won’t have to worry that some previously written code has gone bad because we made changes here and there.
Now, let’s look at the procedural equivalent of shapes.
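A procedural version could be sketched like this. The Geometry class and its area() function are named in the text; the field names and the exception thrown for an unknown shape are assumptions.

```java
// Plain data structures: public fields, no meaningful behavior.
final class Square {
    public double side;

    Square(double side) {
        this.side = side;
    }
}

final class Rectangle {
    public double width;
    public double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }
}

final class Circle {
    public double radius;

    Circle(double radius) {
        this.radius = radius;
    }
}

// Stateless helper that holds all the behavior.
final class Geometry {
    private Geometry() {
    }

    static double area(Object shape) {
        if (shape instanceof Square) {
            Square s = (Square) shape;
            return s.side * s.side;
        } else if (shape instanceof Rectangle) {
            Rectangle r = (Rectangle) shape;
            return r.width * r.height;
        } else if (shape instanceof Circle) {
            Circle c = (Circle) shape;
            return Math.PI * c.radius * c.radius;
        }
        // An unsupported shape is only detected at run time.
        throw new IllegalArgumentException("Unknown shape: " + shape);
    }
}
```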
Notice that all shapes are now data classes/structures and all of their fields are public. Since the data and the functions that operate on it are separated, the functions need a way to access that data, and therefore the fields are public. In procedural code we keep data and behavior separate; the behavior is frequently stored inside helper/utility/service classes that hold little to no state.
What would happen if we wanted to add a triangle shape? We would have to go into the Geometry class, visit every function it has (only area() at the moment) and add another else-if case. That already sounds worse than the OOP counterpart.
So, is OOP better? Well, not quite. Let’s think about what would happen if, instead of adding another shape, we wanted to add another behavior, i.e. another function such as shapeLength(). In the OOP version we would have to go to ALL shapes and add another method to each of them, which sounds tiresome. What about the procedural version? We would only have to add a function to the Geometry class right beneath area(), all in one place.
What if we didn’t create these shape classes ourselves, some co-worker wrote them and we can’t really change them, but we still want to add the function shapeLength()? If we are working with OOP, then good luck. In procedural code, we only have to add another function to the Geometry class.
Another great quote from Clean Code:
Procedural code (code using data structures) makes it easy to add new functions without changing the existing data structures. Object oriented code, on the other hand, makes it easy to add new classes without changing existing functions.
Procedural code makes it hard to add new data structures because all the functions must change. Object oriented code makes it hard to add new functions because all the classes must change.
So what is OOP’s solution to this problem? There is no true solution, but the closest is the Visitor design pattern, where we move the behavior out of the class, splitting up data and functions. Sounds procedural, right? Well, kind of.
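To make the idea concrete, here is a minimal Visitor sketch limited to two shapes. The visitor interface, the accessor methods and the ShapeLengthVisitor name are assumptions made for illustration, not a canonical implementation.

```java
// One visit method per concrete shape.
interface ShapeVisitor<R> {
    R visitSquare(Square square);
    R visitCircle(Circle circle);
}

interface Shape {
    <R> R accept(ShapeVisitor<R> visitor);
}

final class Square implements Shape {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    // The visitor needs read access to the data, so it has to be exposed.
    double side() {
        return side;
    }

    @Override
    public <R> R accept(ShapeVisitor<R> visitor) {
        return visitor.visitSquare(this);
    }
}

final class Circle implements Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    double radius() {
        return radius;
    }

    @Override
    public <R> R accept(ShapeVisitor<R> visitor) {
        return visitor.visitCircle(this);
    }
}

// Each behavior lives in its own class; the shape classes stay untouched.
final class AreaVisitor implements ShapeVisitor<Double> {
    @Override
    public Double visitSquare(Square square) {
        return square.side() * square.side();
    }

    @Override
    public Double visitCircle(Circle circle) {
        return Math.PI * circle.radius() * circle.radius();
    }
}

final class ShapeLengthVisitor implements ShapeVisitor<Double> {
    @Override
    public Double visitSquare(Square square) {
        return 4 * square.side();
    }

    @Override
    public Double visitCircle(Circle circle) {
        return 2 * Math.PI * circle.radius();
    }
}
```

The trade-off is mirrored rather than removed: adding a Triangle now means adding a visitTriangle method to every existing visitor.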
One of the advantages that OOP has in its arsenal is interfaces. When we want to add a shapeLength() method to shapes, we add it to the Shape interface first, and most compilers will then point out all the classes that still need to implement it. In the procedural version, when we add a shape, the compiler won’t shout at us that we have missed a shape in some function; we only find out when we hit a run-time exception.
The Weaknesses of Object-Oriented
We have already seen that OOP struggles with adding behavior (more precisely, it takes more work). There are also things that are hard to achieve with object-oriented programming regardless of the amount of work we are willing to put in.
Serialization and Boundaries
Let’s think of our application, which contains business logic and models, as a box.
Our whole application is written in Java and follows the object-oriented paradigm. Object-oriented code lives in a world of high-level abstraction and it likes it there, but that world doesn’t stretch to infinity; it has boundaries.
Those boundaries are what separates the world of high-level abstraction from the world of low-level abstraction.
Almost always our application needs to cross those boundaries in order to achieve the task it was made for, such as communicating with the user via a graphical user interface or communicating with another application (a DB or any other service) via a TCP socket.
When our application wants to communicate with the outside world (i.e. software that isn’t our application), it has to “speak” in a language that others can understand. The most common language is binary, which has a low level of abstraction.
The problem is that objects in the high-abstraction world communicate with other objects by passing messages between them. A socket doesn’t know how to receive and read messages, but it does know how to work with a stream of binary data. For our objects to talk with the outside world, there has to be some object (or component) that knows how to translate objects’ messages into a stream of bytes and vice versa. We shall call it an “abstraction translator” for educational purposes.
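As a rough sketch of such a translator, the code below turns a simple message into bytes and back using the standard java.io data streams. ChatMessage and MessageTranslator are hypothetical names, and the wire format (two length-prefixed strings) is an assumption chosen only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical message that objects pass around inside the application.
final class ChatMessage {
    private final String sender;
    private final String text;

    ChatMessage(String sender, String text) {
        this.sender = sender;
        this.text = text;
    }

    String sender() {
        return sender;
    }

    String text() {
        return text;
    }
}

// The "abstraction translator": it sits at the boundary and speaks bytes.
final class MessageTranslator {
    private MessageTranslator() {
    }

    static byte[] toBytes(ChatMessage message) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(message.sender());
            out.writeUTF(message.text());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static ChatMessage fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            // Fields are read back in the same order they were written.
            String sender = in.readUTF();
            String text = in.readUTF();
            return new ChatMessage(sender, text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The resulting byte array is what would actually be written to a socket's output stream.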
The translator is nice and all when we deliver messages as data, but what do we do if we need to send an object to persistent storage? We need to somehow translate the object into a message, and later into pure data (using our abstraction translator), in order to cross the boundaries of object-oriented code. How are we going to do that? Should we just talk to the object and ask it to give up all of the inner data that it hides and protects? We would break encapsulation!
The boundaries of any application aren’t object-oriented!
Let’s see a concrete example. Say we have a Person class which has private fields and public methods.
In order to send a Person object to persistent storage, or to display its info to the user in a GUI, we have to somehow export its private fields, for example through a toDTO() method that copies them into a PersonData data class.
But now we are invading the privacy of the object. Even though, by adding a method such as toDTO(), the object gives us its data willingly, we are creating a data class (PersonData), and it starts to smell like procedural code! We have integrated an “abstraction translator” into the Person class in the form of the toDTO() method. It still takes more effort to lower the abstraction of the details far enough to transmit this data over a socket, but it is already enough to make the code more procedural than object-oriented.
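A minimal sketch of this approach follows. Person, PersonData and toDTO() are named in the text; the name and age fields and the greeting() method are assumptions added only to have something to export.

```java
// Data class: public fields, no behavior. This is the procedural part.
final class PersonData {
    public final String name;
    public final int age;

    PersonData(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

final class Person {
    private final String name; // assumed field
    private final int age;     // assumed field

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    // Ordinary behavior that keeps the data hidden.
    public String greeting() {
        return "Hi, I am " + name;
    }

    // The built-in "abstraction translator": hands the private data over willingly.
    public PersonData toDTO() {
        return new PersonData(name, age);
    }
}
```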
So what is the more object-oriented way to do this? This is where object-oriented programming struggles, because it doesn’t like low-abstraction data classes. The closest we can get is to introduce a new class, PersonExporter, that Person will know about; Person will use PersonExporter to help it export its data.
The advantage of this approach is that we can create a new “Exporter” for every format we want. However, each exporter is a service class which operates on data given to it by Person, i.e. it is a combination of procedural and object-oriented, like the Visitor design pattern.
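One possible shape for this design is sketched below. Only the PersonExporter and Person names come from the text; the export method signature, the exportWith() method, the assumed name and age fields and the two concrete exporters are hypothetical.

```java
// Hypothetical exporter contract: one implementation per output format.
interface PersonExporter<T> {
    T export(String name, int age);
}

final class Person {
    private final String name; // assumed field
    private final int age;     // assumed field

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    // Person decides what to hand over; callers never touch the fields directly.
    public <T> T exportWith(PersonExporter<T> exporter) {
        return exporter.export(name, age);
    }
}

// Naive exporters for illustration only: no escaping of special characters.
final class CsvPersonExporter implements PersonExporter<String> {
    @Override
    public String export(String name, int age) {
        return name + "," + age;
    }
}

final class JsonPersonExporter implements PersonExporter<String> {
    @Override
    public String export(String name, int age) {
        return "{\"name\":\"" + name + "\",\"age\":" + age + "}";
    }
}
```

Supporting another format means writing another exporter, while Person itself stays unchanged.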
Serializing/exporting and deserializing/importing the private data of objects is a known issue in object-oriented programming, for the simple reason that data should remain private. By exporting the data of an object, even through an external object with a meaningful name, we break encapsulation and lower the abstraction of the data.
There are many different approaches to tackling this problem in the object-oriented world. Notable ones are Yegor’s Veil Objects and Printers Instead of Getters, and OpenJDK’s Project Amber. The standard approach nowadays is to use ORM (object-relational mapping) between objects and their data representation; although not perfect in object-oriented terms, it does its job well enough to be widely used in the industry. The Serializable interface that is built into Java’s standard library offers a naive approach to the problem, with many disadvantages.
That is why it is useful to combine multiple paradigms when building software: each paradigm has its strengths, which we can use in the appropriate scenarios.
- https://en.wikipedia.org/wiki/Visitor_pattern | Wikipedia, Visitor design pattern.
- https://en.wikipedia.org/wiki/Object-relational_mapping | Wikipedia, Object-relational mapping (ORM).
- https://blog.cleancoder.com/ | Robert C. Martin (Uncle Bob), The Clean Coder blog.
- https://books.google.co.il/books/about/Clean_Code.html?id=hjEFCAAAQBAJ&redir_esc=y | Google Books page of the book “Clean Code” by Robert C. Martin (Uncle Bob).
- https://blog.cleancoder.com/uncle-bob/2019/06/16/ObjectsAndDataStructures.html | Robert C. Martin (Uncle Bob), The Clean Coder blog.
- https://www.yegor256.com/2020/05/19/veil-objects.html | Yegor, Veil Objects.
- https://www.yegor256.com/2016/04/05/printers-instead-of-getters.html | Yegor, Printers Instead of Getters.
- http://cr.openjdk.java.net/~briangoetz/amber/serialization.html | OpenJDK, Project Amber, Brian Goetz, Towards Better Serialization.
- https://www.infoworld.com/article/3275924/oracle-plans-to-dump-risky-java-serialization.html | Paul Krill, InfoWorld, Oracle plans to dump risky Java serialization.
- https://refactoring.guru/smells/data-class | Refactoring Guru, Data Class.