Recommendation Algorithm Using Python and RabbitMQ (Part 1): Architecture
This post is the first in a seven-part series where I’m going to teach you how to build your own recommendation system using RabbitMQ and Python.
Check out the steps of this series:
- 👉 Part 1: Architecture ← you are here
- 👉 Part 2: Connecting with RabbitMQ
- 👉 Part 3: Basic Structure (wip)
- 👉 Part 4: Exploring Data (wip)
- 👉 Part 5: Workers (wip)
- 👉 Part 6: Integrating (wip)
- 👉 Part 7: Next steps (wip)
Recently I worked on a recommendation algorithm project that combines many different calculations in order to recommend specific content. When we saw that challenge, it was clear that our biggest concern was speed. A recommendation algorithm cannot be so slow that it breaks the user experience, but it also cannot be so fast that the recommendations suck. It is already tough to achieve good performance with a simple vanilla recommendation, so how could we keep the algorithm fast while applying many different criteria to the recommendation? To achieve that, our team chose a microservices architecture where every calculation is separated into its own service.
It was a good approach. Microservices give us higher availability and better fault isolation, which makes this architecture a natural fit for a recommendation system that uses multiple criteria. But since we use different algorithms to calculate different scores, the processing times of those algorithms will likely differ from one another.
Let’s see an example. Imagine that we want to calculate the weight of your notebook. One possible approach is to calculate the weight of the screen, add it to the weight of the notebook’s body, and then add the weight of the rest of the hardware. Since each calculation (screen weight, body weight, hardware weight) is independent, you don’t need to run this program synchronously.
To be clear, if you want to run your program synchronously, this is what you should get:
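The synchronous version can be sketched like this. The weights (in grams) and the 100 ms delays are made up purely for illustration:

```python
import time

# Hypothetical weights (in grams) and delays, just for illustration.
def screen_weight():
    time.sleep(0.1)  # pretend this calculation takes 100 ms
    return 400

def body_weight():
    time.sleep(0.1)
    return 1000

def hardware_weight():
    time.sleep(0.1)
    return 600

def notebook_weight():
    # Synchronous: each calculation waits for the previous one to finish.
    return screen_weight() + body_weight() + hardware_weight()

start = time.perf_counter()
total = notebook_weight()
elapsed = time.perf_counter() - start
print(total)    # 2000
print(elapsed)  # roughly 0.3 s: the three runtimes add up
```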
In this scenario the total runtime of our program would be:
O(w) = O(s) + O(b) + O(h)
# The total runtime is equal to the sum of the three functions' runtimes
Well, since the three functions don’t depend on one another, you could run them asynchronously, right? And this is what you should get:
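One simple way to sketch this in Python is with a thread pool (the weights and delays are the same made-up values as before):

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Same hypothetical calculations as before (weights in grams).
def screen_weight():
    time.sleep(0.1)
    return 400

def body_weight():
    time.sleep(0.1)
    return 1000

def hardware_weight():
    time.sleep(0.1)
    return 600

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=3) as pool:
    # The three independent calculations run concurrently.
    futures = [pool.submit(f) for f in (screen_weight, body_weight, hardware_weight)]
    total = sum(f.result() for f in futures)
elapsed = time.perf_counter() - start
print(total)    # 2000
print(elapsed)  # roughly 0.1 s: bounded by the slowest function
```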
Now it is starting to get interesting. Our new total runtime would be:
O(w) = max(O(s), O(b), O(h))
# The total runtime is equal to the runtime of the slowest function
And that is why we decided to use microservices in order to achieve a faster recommendation algorithm. But what if one of the functions takes too long to respond?
The turtle issue 🐢
After deciding on our architecture, it was clear that our problem still wasn’t solved. Even with microservices and independent functions running asynchronously, our algorithm would still only be as fast as our slowest function.
Of course, this is the most efficient way to run multiple independent functions. But there is one more problem to solve: a recommendation algorithm doesn’t run just once. It is usually executed many times, by many users.
How could we achieve optimal efficiency when our algorithm serves multiple users? For example, if calculating the hardware weight takes 2x longer than the other calculations, and we receive requests from two different users at the same time, the second user would have to wait for the first user’s recommendation to finish:
RabbitMQ to the rescue 🙏🙏🙏
If you haven’t heard of RabbitMQ yet, it is “the most widely deployed open source message broker”. What the heck is a message broker, you might be asking? It is an intermediary computer program module “that translates a message from the formal messaging protocol of the sender to the formal messaging protocol of the receiver”. In other words, it is like a postman: when you send a message to the broker, it delivers it to the first available worker.
A worker is a service that usually runs complex or expensive computations. In other words, when you use RabbitMQ (or any other message broker) you can delegate heavy calculations to a different program (a.k.a. another server, process, thread, or whatever) in order to achieve better efficiency.
RabbitMQ uses queues to deliver messages to the workers, and you can have multiple workers running at the same time. With that in mind, we could build a simple architecture that uses different queues to delegate the calculations; with that, our system would be able to handle multiple requests at the same time.
Let’s look at an example. Using the same notebook-weight example as before, what should our architecture look like?
As you can see, we have our user (the gray box above) who requests the notebook’s weight. That request triggers a server designed specifically to start our three calculation services. It does so by sending three messages at the same time, to three different RabbitMQ queues. Each of our calculation services (screen, body, and hardware) listens to a different queue, and when it gets a message it automatically starts its calculation.
After that, each of them publishes a message to a new queue. That queue is consumed by a service built to collect the messages, sum the weights, and return the result to our user.
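To make the topology concrete before we touch RabbitMQ itself (that’s Part 2), here is a minimal in-process sketch of the same idea, using Python’s `queue.Queue` and threads as stand-ins for RabbitMQ queues and workers. The part names and weight values are made up:

```python
import queue
import threading

# Hypothetical per-part calculations (stand-ins for real services);
# weights are in grams.
CALCULATIONS = {
    "screen": lambda notebook: 400,
    "body": lambda notebook: 1000,
    "hardware": lambda notebook: 600,
}

# One queue per calculation service, plus one for the results,
# mimicking the RabbitMQ topology described above.
part_queues = {part: queue.Queue() for part in CALCULATIONS}
results = queue.Queue()

def worker(part):
    # Each worker listens on its own queue, like a RabbitMQ consumer.
    while True:
        notebook = part_queues[part].get()
        if notebook is None:  # shutdown signal
            break
        results.put((part, CALCULATIONS[part](notebook)))

threads = [threading.Thread(target=worker, args=(p,)) for p in CALCULATIONS]
for t in threads:
    t.start()

# "Get notebook's weight" service: publish one message per queue.
for q in part_queues.values():
    q.put("notebook-1")

# "Sum weights" service: consume three results and add them up.
total = sum(weight for _, weight in (results.get() for _ in range(3)))
print(total)  # 2000

# Clean shutdown.
for q in part_queues.values():
    q.put(None)
for t in threads:
    t.join()
```

The shape is the same as the diagram: one producer fans out three messages, three independent consumers do the work, and a single aggregator sums the results.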
There are some important gotchas in the example above. You probably noticed that the service that receives the request is not the same one that answers, right? Well, depending on your context, that may not be possible. For example, if you are developing a web application you would need a simple API to respond to the web request. In that scenario, you would add a new box between the User layer and the Get notebook’s weight and Sum weights layers. That API would be responsible for receiving the user’s request, triggering the system, and waiting for the response. Also, the Sum weights service is a little tricky to build because it receives the responses asynchronously, so it needs some sort of correlation id to avoid mixing up data from different notebooks and to sum the right weights.
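The correlation-id bookkeeping can be sketched like this. The message handler, the `EXPECTED_PARTS` constant, and the weight values are hypothetical; the point is only that partial results are grouped by id so two in-flight notebooks never get mixed up:

```python
from collections import defaultdict

# Minimal sketch of the "Sum weights" service: partial results arrive
# asynchronously and must be grouped by a correlation id.
EXPECTED_PARTS = 3

pending = defaultdict(dict)  # correlation_id -> {part: weight}

def handle_message(correlation_id, part, weight):
    """Store a partial result; return the total once all parts have arrived."""
    pending[correlation_id][part] = weight
    if len(pending[correlation_id]) == EXPECTED_PARTS:
        parts = pending.pop(correlation_id)
        return sum(parts.values())
    return None  # still waiting for more parts

# Messages from two notebooks arrive interleaved (weights in grams):
assert handle_message("nb-1", "screen", 400) is None
assert handle_message("nb-2", "screen", 500) is None
assert handle_message("nb-1", "body", 1000) is None
assert handle_message("nb-1", "hardware", 600) == 2000  # nb-1 complete
```

Note that `nb-2` stays in `pending` until its remaining parts arrive, so the two requests can never contaminate each other.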
“Why don’t you simply sync the request API with the calculations? Then just respond when everything has run?”
- When you sync the request (in RabbitMQ we call this RPC) and your server has any issue, you will lose the data. So, bye-bye fault isolation.
- Although it is harder to handle the request/response flow when you use plain messages, you get a much simpler architecture that is far easier to debug.
Ok… But what is the big deal? 🤔
With this architecture, your system does not wait for any process before responding to your users’ requests. For example, when a user asks for the notebook’s weight, the first service just triggers every calculation and is then ready to receive more requests. So, you can run multiple calculations almost at the same time with a simple and efficient architecture.
Take a look at this animation. Every different emoji is a different request:
Cool 😁 Now we understand the benefits of this type of architecture. We just need to build it.
Our goal 🖖
To make things easier, I’m going to propose a little challenge. This will be our objective throughout the series, and in every article I’ll remind us of it.
So, imagine that we’re working for Webflix, a movie streaming platform. Our goal is to develop an algorithm that recommends the most relevant movies to users based on their watch history and their most-watched genres.
It’s pretty simple and basic, but I think it will show us the power of using RabbitMQ in your software architecture 😃
See you guys in the next article. There, we’re going to explore how to connect and use RabbitMQ with Python.
See you soon! 🖖