The technological evolution at Mercado Libre: from the monolith to the multicloud platform

Juliano Marcos Martins
Mercado Libre Tech

--

If you are wondering whether to embrace a multicloud platform or not, then you’ve come to the right place. Welcome to our series of articles where we share our experiences in building the main features of the Fury platform, an Internal Developer Platform created by Mercado Libre that provides developers with a unified and self-service environment to efficiently create, deploy, manage, and monitor applications.

Join us on a journey to the heart of Mercado Libre’s tech team, where we create and orchestrate all the applications that make up our ecosystem. Throughout this exploration, you will witness our technological evolution as we move from a monolithic architecture to an integrated, cloud-agnostic, and flexible platform as a service.

To keep pace with Mercado Libre’s exponential growth, our Tech team has developed a solution that enables massive scaling and provides a world-class user experience to our more than 15,000 engineers. They use this platform daily to create top-notch applications. Managing over 30,000 microservices running on 100,000 instances in production, our team ensures seamless integration, agility, simplicity, and security.

The Mercado Libre ecosystem consists of various pillars or business units such as marketplace, fintech, online shops and ERP, logistics, and advertising. These units rely on a single platform, Fury, for their development and operation. But how did we come up with this innovative solution?

A glimpse into history: Old World

Back in 1999, we released the first version of Mercado Libre’s Marketplace, which was built on a monolithic architecture. At that time, the platform consisted of a few hundred physical servers running on a single Oracle database, with only one code repository in which over 200 developers interacted daily. What started as a small-scale solution quickly grew into a behemoth.

Deployments were carried out on a weekly basis, resulting in frequent delays in the development process as code freezes were required to deliver the solution to the Quality teams. Testing every functionality became a challenge, leading to slow error response times. Additionally, the complex configurations of the environments made it difficult to test new technologies.

Developers faced significant challenges when deploying their changes on the same codebase. This often resulted in conflicting functionalities, where one functionality would work correctly while others would break. Scaling the system became increasingly difficult, ultimately leading to a long time-to-market.

Imagine the chaos: would it be possible to have 3,000 developers deploying more than 10 times a day in this scenario?

MeliCloud: Embracing innovation for complex workflows

Around 2010, Mercado Libre faced a critical decision: to build a “New World” that could break free from constraints and embrace a more scalable infrastructure based on microservices. With the mobile technology boom creating a demand for APIs to handle the growing complexity of workflows, our Cloud & Platform team embarked on a journey of research and experimentation. After careful consideration, we boldly decided to migrate to a cutting-edge platform called MeliCloud.

MeliCloud introduced infrastructure as a service, bringing several advantages, including increased developer speed and flexibility. We were able to efficiently manage a large number of interconnected microservices, including:

  • 17,500 instances
  • Over 1,200 traffic pools
  • Deployment of over 1,400 instances per day (representing over 10% of the total number of instances)
  • Partial service failure support

Now that you are into microservices, is chaos controlled?

As our focus shifted towards microservices, we found ourselves in a precarious situation — was chaos truly under control? Too much flexibility led to a complex scenario where our teams dedicated considerable time to resolving operational production issues instead of focusing on product improvement. We transitioned from a state of utter chaos, where progress seemed impossible, to a state of chaotic freedom, where everyone had unlimited control.

However, this infrastructure posed significant maintenance challenges given the available resources at that time:

  • Numerous environments and configurations
  • Complexity in evolving the infrastructure
  • Substantial disparities between Development, Stage, and Production environments
  • Steep learning curve
  • Extensive infrastructure knowledge required to operate
  • Team expansion and an increase in the number of development centers

What were our needs?

Our main objective was to streamline operational tasks for engineers, providing them with all the necessary tools for simple development and production management while maintaining a clear separation between frontends and APIs. We required a single location for infrastructure, build, deployment, metrics, and services.

We realized that we needed a robust platform that would allow our developers to have an application up and running in just three clicks. Our goal was to enable them to download the app, make changes, and test it while brewing a cup of coffee: easy to run, easier to deploy to production. So, we designed and developed Fury.

Fury: Hello, world

That’s how Fury was born in 2015. It’s an internal developer platform (IDP) built in-house that empowers developers to create, deploy, manage, and monitor applications.

Here are some highlights:

  • A unified platform with a defined technological stack, languages, frameworks, and tools tailored to our needs and quality standards
  • Portability and consistency in deployment by using containerization
  • Similar and replicable environments with reusable components and services
  • Cost management and optimization
  • Multicloud capability with cloud provider abstraction
  • Ready-to-use deployment infrastructure
  • Integrated monitoring tools within the platform
  • Simple development workflow
  • User-centered UI focused on a platform that is easy to manage and use
  • Scalability for products and teams

What benefits does Fury bring to Mercado Libre?

Fury brings several benefits to Mercado Libre:

  1. Streamlined Development Process: Fury provides a centralized, self-service platform where developers can access the necessary tools, services, and infrastructure to build and deploy applications. This improves developer productivity and accelerates the development cycle.
  2. Reduced Cognitive Load: By offering a unified platform, Fury relieves developers of the cognitive load associated with navigating multiple systems and processes. This allows them to focus more on developing high-quality applications.
  3. Enhanced Security and Compliance: Fury enforces standardized policies and controls, ensuring consistency and reducing risks associated with ad-hoc development practices. This strengthens security and helps maintain compliance with relevant regulations.
  4. Flexibility and Scalability: With its multicloud abstraction, Fury enables teams to leverage the most suitable services and resources from different cloud providers. This flexibility and scalability allow Mercado Libre to adapt and grow as needed.

In summary, Fury has significantly contributed to faster, more efficient, and secure software development at Mercado Libre.

Building apps has never been this easy at Mercado Libre

To create an application, developers can use the Fury front end. With just a few clicks, they can select the desired application type and technology. Behind the scenes, the platform automatically generates a GitHub code repository with the necessary team permissions, sets up the CI/CD pipeline, and provides a basic scaffolding with a preconfigured containerized image.
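The creation flow described above can be pictured with a small sketch. This is not Fury's actual API: the class names, field names, the `example-org` organization, the pipeline stage names, and the base image naming are all hypothetical, chosen only to show how one request could fan out into a repository, team permissions, a CI/CD pipeline, and a containerized scaffold.

```python
from dataclasses import dataclass, field


@dataclass
class AppRequest:
    """What a developer selects in the UI (hypothetical field names)."""
    name: str
    app_type: str     # for example "api" or "frontend"
    technology: str   # for example "go" or "java"
    team: str


@dataclass
class ProvisionedApp:
    """Record of what was set up for the new application (hypothetical)."""
    repository: str
    permissions: dict
    pipeline_stages: list
    base_image: str
    steps_done: list = field(default_factory=list)


def create_application(req: AppRequest) -> ProvisionedApp:
    """Derive every provisioned artifact from a single request.

    Nothing is actually created here; the function only builds the
    description of what a platform like this would provision.
    """
    app = ProvisionedApp(
        repository=f"example-org/{req.name}",          # assumed org name
        permissions={req.team: "write"},               # owning team gets write access
        pipeline_stages=["build", "test", "package"],  # assumed stage names
        base_image=f"base-{req.technology}:latest",    # assumed image naming
    )
    app.steps_done = ["repository", "permissions", "pipeline", "scaffold"]
    return app


app = create_application(AppRequest("payments-api", "api", "go", "payments-team"))
print(app.repository, app.base_image)
```

The point of the sketch is that the developer supplies only four values and everything else is derived, which is what keeps creation down to a few clicks.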

Additionally, Fury offers a command-line interface (CLI) that enables developers to download the application locally, run it, test it, and create versions ready to deploy using the terminal on their computers. This seamless integration not only extends to the infrastructure but also to the developer’s local environment, providing a cohesive development experience.

Simplifying application deployment with Scopes

At Mercado Libre, we have introduced the concept of Scopes as a means of managing application versions, deployments, and segmentation. Scopes serve as a simplified representation of Fury within cloud provider containers like AWS or GCP. With Scopes, applications can quickly transition from the development environment to an integrated infrastructure, enabling efficient testing in a dedicated test or demo scope. Once the final version is ready, it can be deployed to a production scope.

When creating a scope, the necessary infrastructure is automatically generated, including a load balancer, autoscaling groups, and instances. Traffic is directed to these resources, generating logs, metrics, and monitoring alerts. Fury also provides insights into infrastructure costs and leverages automated optimizations to reduce them. To ensure reliability, critical applications are divided into separate scopes. This approach prevents any failure in one scope from affecting others, enabling a more efficient response to contingencies.

What does this mean for us? With Fury as our platform, the infrastructure is automatically managed and scaled based on our business needs, adding or removing instances as necessary. This enables us to maximize efficiency and optimize costs.
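To make "adding or removing instances as necessary" concrete, here is a minimal sketch of a target-tracking scaling rule. It is an assumption, not a description of how Fury scales: the article does not say which metric or formula is used. The sketch uses the common proportional rule (scale the instance count by the ratio of observed to target utilization, then clamp to the scope's limits), and `ScalingPolicy` and `desired_instances` are illustrative names.

```python
import math
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Limits and target for one scope (hypothetical names)."""
    min_instances: int
    max_instances: int
    target_cpu_percent: float  # average CPU utilization to aim for


def desired_instances(current: int, avg_cpu_percent: float, policy: ScalingPolicy) -> int:
    """Proportional target tracking: grow or shrink so that average CPU
    moves toward the target, never leaving the [min, max] range."""
    raw = math.ceil(current * avg_cpu_percent / policy.target_cpu_percent)
    return max(policy.min_instances, min(policy.max_instances, raw))


policy = ScalingPolicy(min_instances=2, max_instances=20, target_cpu_percent=50)
print(desired_instances(10, 80, policy))  # load above target: scale out
print(desired_instances(10, 20, policy))  # load below target: scale in
```

With a 50% target, ten instances at 80% CPU become sixteen, and ten instances at 20% CPU shrink to four. The clamp is what keeps a traffic spike or an idle night from pushing a scope outside the bounds its owners chose.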

Efficient code integration with Release Process

Since 2018, we have implemented a Release Process feature to manage code integration into our applications effectively. This process leverages the concepts of Continuous Integration (CI) and Continuous Delivery (CD). It uses open-source tools to ensure a simple and rapid flow that aligns with our development needs.

Powering over 26,000 repositories, our Release Process automatically runs seven tailored quality validations to uphold the integrity of every application we build. These checks cover areas such as dependencies, branching models, CI, code coverage, and hardcoded credentials. If any check fails, the entire flow is interrupted until the issue is resolved. The result? A seamless and efficient code integration process that backs up our developers every step of the way.
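The fail-fast behavior described above can be sketched as a list of checks that run in order and stop at the first failure. This is an illustration only: the three checks shown, the 80% coverage threshold, the allowed branch prefixes, and the credential pattern are assumptions, not the actual validations used in the Release Process.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Change(NamedTuple):
    """A proposed code change (hypothetical shape)."""
    branch: str
    coverage: float  # percent of lines covered by tests
    files: dict      # path -> file contents


# Deliberately simple pattern for illustration; real secret scanners are far more thorough.
SECRET_PATTERN = re.compile(
    r"(password|secret|api_key)\s*=\s*['\"][^'\"]+['\"]", re.IGNORECASE
)


def check_branching_model(change: Change) -> bool:
    return change.branch.startswith(("feature/", "fix/", "release/"))


def check_code_coverage(change: Change) -> bool:
    return change.coverage >= 80.0  # assumed threshold


def check_no_hardcoded_credentials(change: Change) -> bool:
    return not any(SECRET_PATTERN.search(text) for text in change.files.values())


CHECKS = [
    ("branching model", check_branching_model),
    ("code coverage", check_code_coverage),
    ("hardcoded credentials", check_no_hardcoded_credentials),
]


def run_release_checks(change: Change) -> Tuple[bool, Optional[str]]:
    """Run checks in order; return (passed, name of the first failed check)."""
    for name, check in CHECKS:
        if not check(change):
            return False, name  # interrupt the flow at the first failure
    return True, None


leaky = Change("feature/login", 91.0, {"config.py": 'password = "hunter2"'})
print(run_release_checks(leaky))  # (False, 'hardcoded credentials')
```

Returning the name of the failed check, rather than a bare boolean, gives the developer an immediate pointer to what must be fixed before the flow can resume.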

Deploying with different strategies according to scenario criticality

When it comes to deploying our applications, there is no one-size-fits-all approach. Each scenario has its unique requirements and considerations. That’s why Fury offers a range of deployment strategies for developers to choose from according to the scenario’s criticality and the potential impact on the business.

Some strategies involve reusing existing infrastructure, making them fast and cost-effective. However, they can be difficult to roll back in case of errors, potentially causing downtime. These strategies are ideal for testing small changes or low-criticality scopes.

On the other hand, strategies like Blue Green deployment involve creating a new environment and gradually swapping traffic. They are ideal for production scopes that require fast recovery in case of errors. They allow for quick rollback with no downtime and minimize user impact through careful traffic monitoring.

By providing these different deployment strategies, Fury enables developers to make informed decisions based on the specific needs of each scenario, ensuring efficient and reliable application deployments.
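The gradual traffic swap and quick rollback of a Blue Green deployment can be sketched as a simple loop. The step sizes, the 1% error threshold, and the `error_rate_at` callback are assumptions made for illustration; the article does not describe how Fury measures health or how it sizes each traffic step.

```python
from typing import Callable, List, Sequence, Tuple


def blue_green_rollout(
    error_rate_at: Callable[[int], float],
    steps: Sequence[int] = (10, 50, 100),
    max_error_rate: float = 0.01,
) -> Tuple[int, List[int]]:
    """Shift traffic to the new (green) environment step by step.

    error_rate_at(weight) is a hypothetical probe that returns the error
    rate observed while `weight` percent of traffic goes to green.
    Returns the final green weight and the history of weights applied.
    """
    history: List[int] = []
    for weight in steps:
        history.append(weight)
        if error_rate_at(weight) > max_error_rate:
            history.append(0)  # rollback: send all traffic back to blue
            return 0, history
    return steps[-1], history


# A healthy release reaches 100% of traffic.
print(blue_green_rollout(lambda weight: 0.001))
# A release that degrades under load is rolled back at the 50% step.
print(blue_green_rollout(lambda weight: 0.05 if weight >= 50 else 0.0))
```

Because the old (blue) environment stays up until the swap completes, rolling back is only a change of traffic weights, which is why this strategy suits scopes that need fast recovery.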

Empowering developers with app monitoring tools

At Fury, we provide developers with robust monitoring tools to keep their applications running. By focusing on metrics, logs, and monitoring, we detect issues early and minimize potential downtime. From monitoring infrastructure (CPU and memory) to tracking business rules like peak payment processing times, we ensure that nothing slips under our radar.

Developers have complete ownership over their applications and are equipped with customizable alerts for all kinds of errors. This proactive monitoring culture enables us to take action before problems escalate. With Fury, developers can leverage the power of monitoring services like Datadog, New Relic, and Opsgenie, with standard monitors tailored to the specific services they are using.
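As a rough picture of what a customizable alert looks like, the sketch below evaluates threshold monitors over recent metric samples. It does not use the API of any of the monitoring services named above; the `Monitor` fields, the metric names, and the rule of requiring several consecutive samples above the threshold are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Monitor:
    """A threshold alert on one metric (hypothetical shape)."""
    name: str
    metric: str
    threshold: float
    min_samples: int  # consecutive recent samples above threshold (>= 1)


def triggered(monitor: Monitor, samples: List[float]) -> bool:
    """Alert only when the last `min_samples` values all exceed the threshold,
    so a single spike does not page anyone."""
    recent = samples[-monitor.min_samples:]
    return len(recent) == monitor.min_samples and all(
        value > monitor.threshold for value in recent
    )


def evaluate(monitors: List[Monitor], metrics: Dict[str, List[float]]) -> List[str]:
    """Return the names of all monitors that are currently alerting."""
    return [m.name for m in monitors if triggered(m, metrics.get(m.metric, []))]


monitors = [
    Monitor("high cpu", "cpu_percent", threshold=90, min_samples=3),
    Monitor("high memory", "memory_percent", threshold=85, min_samples=2),
]
metrics = {"cpu_percent": [50, 95, 96, 97], "memory_percent": [90, 60]}
print(evaluate(monitors, metrics))  # ['high cpu']
```

The same shape works for business metrics: swapping `cpu_percent` for something like a payment processing time only changes the metric name and threshold, not the evaluation logic.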

Keep driving innovation and growth

The technological evolution at Mercado Libre has been remarkable. Evolving our work culture in this way allowed us to iterate on our products in an agile and decoupled manner. At the same time, it has accelerated our innovation processes to accompany Mercado Libre’s exponential growth.

From the early days of monolithic architecture to the current state of our advanced multicloud platform, we have witnessed tremendous growth and innovation. We have embraced the power of microservices for greater scalability and flexibility. The adoption of containerization technology has revolutionized our development and deployment processes, allowing us to iterate faster and deliver value to our users more efficiently.

The shift towards a multicloud approach has unlocked new possibilities, providing us with the flexibility to leverage the best services and resources from multiple cloud providers. This strategic decision has not only improved our performance and resilience but has also optimized our costs.

Next steps in our journey

At Mercado Libre, we are always seeking new opportunities to innovate and push the boundaries of what is possible. That’s why we have formed a strategic alliance with OpenAI, with whom we are exploring the possibilities of generative artificial intelligence (GenAI). We believe that GenAI holds great potential within our ecosystem to create disruptive solutions that bring value to our developers.

As we continue to evolve and push the boundaries of what is possible, we remain committed to delivering better experiences for our developers. We strive to provide them with the best tools and resources to advance our mission of democratizing commerce and financial services to transform the lives of millions of people in Latin America.

In our upcoming articles, we will delve deep into various aspects such as traffic security, cost optimization, and our multicloud strategy. These topics are vital to our technological evolution and will showcase the innovative solutions we have implemented to ensure the scalability, resilience, and efficiency of our systems.

This amazing journey has only just begun, and we invite you to stay tuned for more exciting updates and insights. Together, we will continue to shape the future of technology and revolutionize the way people engage in commerce and financial services.
