Migrating an Operational Project to Microservices

Published in Lalafo · Nov 18, 2017 · 10 min read

Dmytro Nemesh — CTO at Lalafo, over 7 years of experience in PHP/Java/Node.js, volunteer at GeekHub — on how a mobile C2C marketplace switched to microservices and the challenges of such a transition.

Lalafo is an app for buying and selling used clothing and cars, as well as looking for real estate, jobs and services. It uses machine learning and computer vision to recognize items in pictures, detect fraud and improve content relevance. Today, Lalafo is active in 4 markets, in 3 of which it is the #1 mobile marketplace. The project currently handles over 900 requests per second while its hardware runs at only 25–30% of maximum capacity.

A detailed analysis showed that the product had a range of problems. There were issues with code quality and portability, since the project had been developed by two different teams on separate occasions.

Why microservices?

We were tasked with combining 7 databases into one. Changing the database meant redesigning the project from scratch, because such changes affect the structure of tables, data entities and logic. Before we picked a direction for the project, we decided to consult with high-profile developers who had worked on high-load projects. This resulted in 3 possible solutions:

  • Rebuild the monolith.
  • Switch to service-oriented architecture.
  • Switch to microservice-oriented architecture.

Microservices seemed to be the most compelling way to go. With this in mind, we spoke to teams that had already adopted microservices and to those that had failed to, and assessed the challenges we would have to face. In the end, we went with microservices.

Challenges of migrating to microservices

We needed to combine 7 databases from 7 markets into one global database, taking into account that all of the data would later be split across the microservice databases. We also had to keep all the connections between entities. What does that mean? For example, we needed to resolve all ID conflicts (each of the 7 databases had a user entry with ID 1). It was also possible for one user to be registered in 7 different countries.
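To make that concrete, here is a minimal sketch (not Lalafo's actual migration code) of how such a merge can assign new global IDs and record an old-to-new map per country; the table and column names are hypothetical, PostgreSQL-style RETURNING is assumed, and deduplicating a user registered in several countries is left out:

```php
<?php
// Hypothetical sketch: copy user rows from one country database into the global
// database, letting the global table assign new IDs and recording an old->new map.
function mergeUsers(PDO $global, PDO $country, string $countryCode): void
{
    $insertUser = $global->prepare(
        'INSERT INTO users (email, name, country) VALUES (?, ?, ?) RETURNING id'
    );
    $insertMap = $global->prepare(
        'INSERT INTO id_map (entity, country, old_id, new_id) VALUES (?, ?, ?, ?)'
    );

    foreach ($country->query('SELECT id, email, name FROM users') as $row) {
        // The global database generates a fresh ID, so "user 1" from every
        // country ends up with its own unique global ID.
        $insertUser->execute([$row['email'], $row['name'], $countryCode]);
        $newId = (int) $insertUser->fetchColumn();
        $insertUser->closeCursor();

        // Keep the mapping so foreign keys and old URLs can be rewritten later.
        $insertMap->execute(['user', $countryCode, (int) $row['id'], $newId]);
    }
}
```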

After the databases were combined, we needed to divide the result into smaller microservice databases. Then we had to reproduce the old API to keep all the applications working. Lalafo has millions of users, and a seamless transition was a critical point in the process.

On top of that, we needed to keep the SEO traffic, meaning that all URLs exposed through the API had to remain available at the same links containing the old IDs.

What kinds of microservice models are there and why did we choose the service-oriented one?

There are 3 models of microservices: star-shaped, the Twitter model and service-oriented. We chose the third one for Lalafo, and here’s why:

The star-shaped model has a piece of the monolith, or a super-microservice, at its center that hands business logic and tasks to the microservices. It’s the most popular model when you have a complete or partial monolith and part of the logic needs to be handled by microservices. A solid setup, but it was not suitable for us, as we needed to rebuild the entire system from scratch.

The Twitter model. We used it: our API is versioned, and versions 1 and 2 follow this model. A front microservice acts like the front controller in MVC frameworks: it receives and verifies requests, then analyzes them and dispatches them to the microservices. Everything is neat and we really liked it, but there was too much traffic on the front microservice, which called for it to be redone. We had developed it in PHP, and the optimal solution would have been to rewrite it in Rust or Go to make it faster. Even though we did not have performance issues at that moment, before starting the redesign we decided to look at other options. And that’s how we came to try the service model.

The service-oriented model. The essence of the service model is that the client works directly with the microservices it needs. This model worked great for us, but only from the 3rd version of our API. It works wonders, but has two small flaws that we can live with.

Flaw #1 — complexity of administration. All our microservices work within a private network and can’t be accessed from outside. To work with a microservice, routes must be created on the balancer. There are already over a hundred such routes across our 24 microservices, which makes the work of our system administrator that much harder.

Flaw #2 — a service-like element creeps into the microservice workflow. When a microservice works directly with one domain area (ads, users, etc.), it ends up automatically checking user access permissions for a specific ad itself. To do that, it connects to the user microservice, gathers the data, analyzes and verifies it, and only then sends a response. This logic causes microservices to gradually grow into full services.
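As an illustration of that second flaw, here is a hedged sketch of what such a cross-domain check can look like. It uses a plain Guzzle client and made-up internal routes and response fields, not Lalafo's own SDK/HttpClient:

```php
<?php
// Sketch of the "flaw #2" pattern: the ad microservice has to ask the user
// microservice whether the current user may edit a given ad.
use GuzzleHttp\Client;

function canEditAd(int $userId, array $ad): bool
{
    $users = new Client(['base_uri' => 'http://user.internal/', 'timeout' => 2.0]);

    // Extra hop into the user domain: fetch the user and their roles.
    $response = $users->get("v1/users/{$userId}");
    $user = json_decode((string) $response->getBody(), true);

    // Business logic that really belongs to the "user" domain starts leaking
    // into the ad microservice; this is how it grows into a full service.
    return $ad['user_id'] === $userId || in_array('moderator', $user['roles'], true);
}
```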

How we migrated to microservices

When we started development, we needed to understand whether we could work with a microservice architecture at all. With this in mind, we developed an SDK (Logs, InfluxDB, Services, Helpers, HttpClient) that helped us talk to the various microservices and speed up development. We also developed a tool that lets us work with microservices as if we were working with an ORM. We called this tool “Object-REST”, or OREST for short. All of this was done to keep development as close to regular monolith development as possible, so it was easy for developers to get used to: it felt like working with a regular monolith.
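OREST itself is internal, so the snippet below is only a guess at the idea, not its real interface: every class, route and field name is hypothetical, and a plain Guzzle client stands in for the SDK's HttpClient. It shows how a remote REST resource can be wrapped so that using it feels like using an ORM:

```php
<?php
// Hypothetical ORM-like wrapper around a microservice's REST resource.
use GuzzleHttp\Client;

class OrestResource
{
    private $http;
    private $resource;

    public function __construct(Client $http, string $resource)
    {
        $this->http = $http;
        $this->resource = $resource;   // e.g. "ads" on the catalog microservice
    }

    /** GET /{resource}?field=value -> list of records, like Model::findAll() */
    public function findAll(array $filter = []): array
    {
        $response = $this->http->get($this->resource, ['query' => $filter]);
        return json_decode((string) $response->getBody(), true);
    }

    /** PATCH /{resource}/{id} -> partial update, like $model->save() */
    public function update(int $id, array $attributes): void
    {
        $this->http->patch($this->resource . '/' . $id, ['json' => $attributes]);
    }
}

// Usage feels like ActiveRecord, but every call is really an HTTP request:
$ads = new OrestResource(new Client(['base_uri' => 'http://catalog.internal/v1/']), 'ads');
foreach ($ads->findAll(['user_id' => 42, 'status' => 'active']) as $ad) {
    $ads->update($ad['id'], ['status' => 'archived']);
}
```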

We also updated all the standard Yii framework components that we used for access control, users and logging. As a result, a developer barely noticed the difference between developing a monolith and a microservice. This approach helped us build an MVP in just 3 months for a product that had been in development for over two years.

We developed a microservice that combined the 7 databases into 1 while preserving all relations. All database entries got new IDs, and the microservice built a map between the old and new IDs. This solved the problem of backward compatibility and keeping the SEO traffic. After that, we created another microservice to split the main database into separate databases for each microservice, and then one more to serve the ID map. That microservice made it possible to open old URLs and see the content stored under new IDs in the microservice databases.

There’s one trick we used to make all of the above work: we reserved the ID range from 0 to 50 million for entries migrated from the old monolith. That means all new entries created after the migration to microservices got IDs starting from the 50 million mark. This helped us understand whether we needed to check the history map for the actual ID or not. The hack is only used by API v1, which will be shut down some day, and by the web client. In approximately a year, the microservice that stores the ID history map will be gone, and new ads will have completely replaced the old content.
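A rough sketch of that check, with hypothetical table and function names, might look like this: any incoming ID below the 50 million mark may have come from the old monolith and goes through the map, while anything above it is already a native microservice ID.

```php
<?php
// Hypothetical sketch of resolving an incoming ad ID against the old->new ID map.
const MIGRATED_ID_CEILING = 50000000;   // IDs below this may come from the old monolith

function resolveAdId(int $requestedId, PDO $mapDb): int
{
    if ($requestedId >= MIGRATED_ID_CEILING) {
        // Created after the migration: the ID is already the real one.
        return $requestedId;
    }

    // Possibly an old monolith ID from an old URL or API v1: consult the ID map.
    $stmt = $mapDb->prepare('SELECT new_id FROM id_map WHERE entity = ? AND old_id = ?');
    $stmt->execute(['ad', $requestedId]);
    $newId = $stmt->fetchColumn();

    return $newId !== false ? (int) $newId : $requestedId;
}
```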

The stack we chose:

  • PHP 7, Yii2, Codeception;
  • Python — our data-science stack, containing computer vision and machine learning;
  • Node.js — the comet server for websockets;
  • PostgreSQL — our main database;
  • Redis — user session storage;
  • ElasticSearch — full-text search;
  • RabbitMQ — main queue;
  • Kafka/Cassandra/Spark — an important part of our internal analytics;
  • InfluxDB + Grafana — application metrics;
  • Graylog2, Zabbix — logging and system monitoring;
  • Google BigData — analytics;
  • Jenkins + Docker — currently being replaced with Kubernetes + GitLab CI;
  • CloudFlare.

As a result, we ended up with the following set of microservices:

Core: user, catalog, chat, sender, moderation, payment, security, fraud

Supplementary: page, location, SEO, translation, merge (the hero that merged old databases), map (the one that stores ID map — result of the merge), mobile-api, cache, analytics, upload, and file node

AI: classify, classify-analytics, duplicates, image processing, and content filtering.

When you’ve spent your whole life developing monoliths and then start working on microservices, you can have 3 types of reactions:

  • Perfect! You work with some of them and think, “Great, this really works!”
  • With others you think, “Something’s not right: these few microservices could be combined into one.”
  • With a few you see that the microservice performs poorly, and it’s not really the microservice’s fault; it needs to be rethought.

Perfect microservices: translations, sender, analytics, security, upload, and classify.

Microservices you want to combine: user, catalog, and location.

Microservices you want to rethink: fraud and moderation.

Importance of testing

It’s impossible to develop a microservice architecture without auto-testing it, especially for a fast-growing product and team. We combine acceptance and functional tests from the Codeception framework — it’s our best option. It also plays well with continuous integration: because microservices are small, test runs are fast, unlike monolith deployments where an auto-test run could take a few hours. When we deploy changes, our CI runs tests only for the microservice that has been changed. And because our CI system is fast enough, we’re able to test every single commit, no matter what branch it belongs to.
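For a flavour of what such a per-microservice suite contains, here is a small Codeception-style API test. The route and JSON fields are invented for illustration, while the REST module actions (sendGET, seeResponseMatchesJsonType and so on) are standard Codeception:

```php
<?php
// Hypothetical API test for a catalog microservice endpoint.
class AdsCest
{
    public function returnsActiveAdsForUser(ApiTester $I)
    {
        $I->haveHttpHeader('Accept', 'application/json');
        $I->sendGET('/v3/ads', ['user_id' => 42, 'status' => 'active']);

        $I->seeResponseCodeIs(200);
        $I->seeResponseIsJson();

        // Verify the JSON schema of the first returned item, not just specific
        // values; this is exactly what is too tedious to re-check by hand.
        $I->seeResponseMatchesJsonType([
            'id'      => 'integer',
            'title'   => 'string',
            'price'   => 'integer|float',
            'user_id' => 'integer',
        ], '$.items[0]');
    }
}
```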

What about manual testing? Because every microservice exposes only a REST interface, manual API testing could take an eternity. Imagine having to test anywhere from one to a few dozen URLs and verify the JSON schema, the logic and the overall correctness of the microservice’s behaviour, and of course all of it with different test data. In the end, by testing microservices manually you can only verify data correctness, not the correctness of the logic or the schema.

The moral of the story is:

  • Autotesting is a must-have.
  • Manual testing is a pain in the kiester.

Microservice communication:

  • REST API with HTTP cache;
  • RabbitMQ/Kafka;
  • global events with Kafka (message bus / event-driven).

The REST API allows us to organize, optimize and version the communication. The system administrator also gets new tools to save the product (and you) when something goes wrong in the middle of the night. Should an issue occur, the system administrator finds the troublesome route and attaches an HTTP cache to it, which makes your microservice almost completely idle. You can then analyze the log files and fix the issue without the application crashing.
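On the application side, this only works if the microservice emits sensible caching headers. Below is a hedged sketch of how a Yii2-based microservice might do that with the standard yii\filters\HttpCache filter; the controller, query and timings are made up, and updated_at is assumed to hold a UNIX timestamp:

```php
<?php
namespace app\controllers;

use yii\rest\Controller;
use yii\filters\HttpCache;

class CategoryController extends Controller
{
    public function behaviors()
    {
        $behaviors = parent::behaviors();

        // Emit Last-Modified / Cache-Control so the balancer (or client) can
        // answer repeat requests itself while we calmly read the logs.
        $behaviors['httpCache'] = [
            'class' => HttpCache::class,
            'only'  => ['index'],
            'lastModified' => function () {
                // Hypothetical query: when did the category tree last change?
                return (int) \Yii::$app->db
                    ->createCommand('SELECT MAX(updated_at) FROM categories')
                    ->queryScalar();
            },
            'cacheControlHeader' => 'public, max-age=300',
        ];

        return $behaviors;
    }
}
```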

RabbitMQ queues — we haven’t had a single problem with RabbitMQ in a year. It’s stable and reliable — we can only say nice things about it.

Kafka is also a cool tool. It’s a lot like a queue, with the main difference being that it’s not so much a queue as a log system with exchange capability. You connect to it as you would to a regular queue, but the same event can be processed multiple times, once per consuming microservice. It has better throughput than RabbitMQ, and we’re now working on having all our events sent to Kafka automatically.
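The article doesn't say which Kafka client Lalafo uses, so the sketch below assumes the common php-rdkafka extension; topic, broker and group names are invented. The point it illustrates is that each microservice subscribes with its own group.id, so every one of them receives its own copy of the same event:

```php
<?php
// Hypothetical consumer: the moderation microservice reading a shared "ad-events"
// topic. Another microservice would run the same loop with a different group.id
// and process the very same events independently.
$conf = new RdKafka\Conf();
$conf->set('group.id', 'moderation-service');      // one consumer group per microservice
$conf->set('metadata.broker.list', 'kafka1:9092');

$consumer = new RdKafka\KafkaConsumer($conf);
$consumer->subscribe(['ad-events']);

while (true) {
    $message = $consumer->consume(10000);          // wait up to 10 s for a message
    switch ($message->err) {
        case RD_KAFKA_RESP_ERR_NO_ERROR:
            $event = json_decode($message->payload, true);
            // ... run moderation checks on the ad described by $event ...
            break;
        case RD_KAFKA_RESP_ERR__PARTITION_EOF:
        case RD_KAFKA_RESP_ERR__TIMED_OUT:
            break;                                 // nothing new yet, keep polling
        default:
            throw new \RuntimeException($message->errstr(), $message->err);
    }
}
```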

Active and passive microservices

If one person is in charge of too many microservices, their attention gets spread thin: a developer can forget to run a test, have no time for it, or simply miss something. We came to the conclusion that the best ratio is:

  • 1 active microservice per developer;
  • 3 passive, well-known microservices per developer.

A microservice is the same as a monolith, just on a different scale.

If there are errors, they stay on the micro scale. Microservices grow, which makes it harder to split them into smaller ones, and you never know which microservice will grow and which will stay the same.

Microservice reliability

If you follow the recommendations for building a microservice architecture to the letter, you need a lot of servers. Say there is Microservice 1 and Microservice 2: for each type of microservice, or each group, there must be a balancer, and each group of microservices must have its own master/slave database. This means roughly 5 servers per group to get maximum fault tolerance, so with 20 microservices we would theoretically need more than a hundred servers. We decided to go the other way.

We have 3 powerful servers that host all our microservices. Still, when dealing with microservices it’s better to have a lot of small servers rather than a few large ones, so we’re currently moving the most loaded microservices to separate, smaller servers. We’re working towards having a large number of smaller servers, and we’re now playing with Kubernetes, which helps us easily manage microservice containers on those servers. But this topic deserves a separate article :)

There’s a viewpoint that microservices are very reliable systems, and I agree that they can be highly reliable. When we started, there were some reliability issues we had to face. However, if you do everything right from the start, you’ll end up with a very reliable application. To achieve this, you need to design with the assumption that any microservice could stop responding at some point. To check whether your microservices are designed correctly, switch some of them off and see how the others cope when one of them crashes or delays a response. Only if you test this (whether manually or automatically) will you have a reliable system.
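In code, that assumption usually boils down to short timeouts and a graceful fallback around every cross-service call. A hedged sketch, using plain Guzzle with a made-up endpoint and fallback rather than Lalafo's SDK:

```php
<?php
// Sketch of the "any microservice may stop responding" mindset.
use GuzzleHttp\Client;
use GuzzleHttp\Exception\TransferException;

function getUserBadges(int $userId): array
{
    $client = new Client([
        'base_uri'        => 'http://user.internal/v1/',
        'timeout'         => 0.5,   // fail fast instead of hanging the whole request
        'connect_timeout' => 0.2,
    ]);

    try {
        $response = $client->get("users/{$userId}/badges");
        return json_decode((string) $response->getBody(), true);
    } catch (TransferException $e) {
        // The user microservice is down or slow: degrade gracefully instead of
        // failing the whole page, and let monitoring pick up the error.
        error_log('user service unavailable: ' . $e->getMessage());
        return [];   // an empty badge list is an acceptable fallback here
    }
}
```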

Things to keep in mind when switching to microservices:

  • Micro-issues are better than mega-problems with a monolith.
  • Containers and autotests are a must-have.
  • Easy scaling for both the product and the team.
  • Easy integration of new technologies.
  • A smooth migration from the monolith is better than a forced one.
  • Powerful and advanced monitoring systems.
  • A good system administrator or DevOps engineer.
  • Containers are a topic that deserves its own separate discussion. In short, using Kubernetes + Docker + GitLab CI takes microservice management and monitoring to another level.

Microservices are a perfect ecosystem for experimenting:

  • No need for complete redesigns. Thanks to consulting with others up front, a year later we have not had to completely redesign a single microservice.
  • Fast CI helped us avoid lots of problems.
  • New features are developed faster than the business can adopt them.
  • Backward compatibility for the whole application. You can always roll back to a previous version of a certain microservice (that is, a previous version of the business logic).
  • Great scalability.
  • Low cost of mistakes. Even if a junior developer colossally messes up, any microservice can be refactored within 1–3 days.
