How much does data matter to us?

Mercado Libre Tech
Published in Mercado Libre Tech
5 min read · Sep 4, 2019


Why build a data blog?

With the ultimate goal of improving our users’ experience, Mercado Libre’s data team started with what we called the crusade against data. Soon we will tell you what we did and how, and, why not, build together what’s coming in the future. In short, throughout this blog we intend to tell you how we are adding value to data at Mercado Libre, which techniques we’ve developed, what our data ecosystem is like, and which data-based projects share the main goal of improving our users’ experience.

In a context in which MELI (Mercado Libre’s stock ticker) is growing steadily and the volume of stored data has started to grow exponentially

“The dimension of the data warehouse is in the order of 150TB complemented with a 5PB data lake and fed by more than 1TB per day”

we realized we would need to develop an analytical culture, and that intensive data usage would therefore be decisive to succeed in the e-commerce and online payments industry. But how do we do this? What do we do with all the data we’ve got? Do we give it the value it really has? How do we turn into an organization that makes data-driven decisions?

Recently, our CFO said, “there’s nothing worse than an excellent strategy poorly executed”. This phrase will keep echoing in many of our team members’ minds for a long time. Without being fully aware of it at the time, it led us to take small but firm steps toward our goals instead of trying to accomplish everything at once.

Democratizing data

It’s really difficult to summarize the path we’ve taken in a single release. That’s why in this issue we will focus on the most salient aspects of our work and on our current actions to continue transforming MELI: Tools & Technology and Knowledge Management.

Our first goal was to democratize data access for every team. There is no single formula to accomplish this, but we are certain that there are two clear lines of work. On the one hand, you have “Tools & Technology”. Whether these are open or closed source, we’ve empowered business analysts by facilitating data access while at the same time caring for our data governance strategy. Ultimately, what matters is that every person in the company can independently access the information they need to make better business decisions. On the other hand, you have “Knowledge Management”. The training programs we’ve developed are aimed at ensuring everyone has the necessary skills to use these tools correctly and get the most out of them.

We should be aware that we can’t solve everything with a single tool. Business analysts’ needs probably differ from those of data scientists, so it is important to have different environments for different users.

Regarding “Knowledge Management”, at MELI we’ve created a number of programs oriented toward fulfilling multiple tasks. Everyone, from analysts to our CTO, has been trained in various tools to guarantee smooth access along critical paths, so they can find the information they seek by themselves. Among the training options available to our people are:

  • Key Users program: A key user is a business analyst responsible for learning from cross-expert teams and training their own team.
  • Mass training: Mass trainings are likely to be more successful when users practice what they’ve learned. In our experience, if you set an agenda with different levels of achievement for each group, with a clear growth path, and leverage it with the rest of the programs, you are likely to succeed.
  • Video Recording: Another strategy is to record long and short training sessions. In our case, we have introductory videos for some tools, with a YouTube video as the first point of entry.
  • MELI Fanatical Support: Although it is one of our newest programs, we are getting very good results. The program consists of giving users intensive, dedicated support to develop their project (so far, this methodology exists only for machine learning teams). Both trainer and owner are responsible for the project. From that moment on, the trained data scientist has the independence to work on their own.

If the complexity of the data problem to solve exceeds the business line’s skills — which tends to be the case for a small percentage of information issues — it is necessary to resort to a cross team with the right skillset to solve it.

Organizational Structure

At MELI we have chosen to adopt a mixed organizational structure for data teams. With this strategy, we can foster data analysis teams within several business units as well as encourage cross-team work. Data teams usually work in either a centralized or a decentralized fashion; we believe extremes tend to be bad, since centralization may lead to bottlenecks and decentralization to a lack of communication. Building mixed teams boosts motivation and contributes to developing the skills needed to transform MELI. We take the best from both worlds and try to build an organizational structure that makes everything work dynamically, without barriers.

Further, innovation is not possible if data teams have no time to do research. There are different ways to innovate. In our case, cross teams devote between 10 and 20% of their time to research that does not necessarily serve a business project. Teams can choose to research data science, data engineering, machine learning and more. In the end, we’ve noticed that those innovations generally end up as production services, transforming the way we do things.

Today, September 2019, we’ve grown significantly both in people and in data volume. In terms of our people, we’ve gone from being a company with fewer than 2,000 employees to one with more than 8,000 and growing. MELI boasts over 4,500 analysts managing business information on a daily basis, using different tools to generate their reports and develop their own analyses. As for data, the data warehouse is on the order of 150TB, complemented by a 5PB data lake and fed by more than 1TB per day. This scenario presents a huge challenge when it comes to making resources available for full exploitation and use of the information.

To help us meet this challenge, we invite you to take part, comment, ask, contribute… And, of course, if you want to join our team, you can visit mercadolibre-jobs and apply to the position you are interested in!

If you liked this post, click 👏 and start following us :)

#mercadolibre-datablog
