Technology Blog by Avito: Meet Us on Medium!

AvitoDev · Published in AvitoTech · 6 min read · May 10, 2018

Hi all, today we are launching Avito’s technology blog on Medium. To start, a few words about Avito. Avito is an online classifieds platform for both individuals and businesses. Avito consistently ranks among the top 5 Russian websites and the top 3 global classifieds websites, according to various sources. Items offered for sale on Avito can be brand new or used. The website also publishes job vacancies and CVs.

In this blog, we will tell you about the technologies underlying the Avito platform. Let’s start with a few words about the project’s current status, the functions of its engineering team, and our plans for the near future.

The beginnings

Like many other large projects, Avito was started by a small team. The first version of the website was launched back in 2007, and the first steps followed a trial-and-error approach. The website took its present form only two years later. The web service was initially built by a team of only 4 developers, who dealt with absolutely everything, from infrastructure to frontend. The 2009 version of the website was definitely not something one would submit to a Best Website Design contest. But those who were involved in the project still feel nostalgic for it and take pride in it, because it was accomplished with limited resources and still managed to make a statement and lay the foundation of a successful business.

It is hard to imagine, but until 2012, the size of the development team did not change. However, the project grew ever larger, and we felt the need for new talent. In 2012, the team entered an exponential growth phase. It became subdivided into specializations, areas, projects, teams, and groups. Now, Avito has an entire engineering department employing more than 300 professionals.

Avito today

Via its web and mobile apps, the platform serves more than 35 million users a month. Every day they add approximately a million new ads (the back office has accumulated more than a billion ads) and close over 100,000 transactions. According to Yandex, in some Russian cities (for example, in Moscow), Avito is considered a highload project in terms of page views. Some figures give a better idea of the project’s scale: 300+ servers, more than 20 TB in Postgres, 270 TB of images, 13 Gbit/s of traffic during the evening peak hours, and about a million queries per minute to the backend. Expertise in data processing is therefore critical for our business processes. These volumes of data need not only to be accumulated and stored, but also processed, filtered, classified, and made searchable.

No single tool can cope efficiently with all these tasks, so Avito employs a number of solutions: PostgreSQL (Avito’s PostgreSQL installation is one of the largest and handles some of the world’s highest loads), Tarantool, Vertica, MongoDB, Redis, and other storage systems. We will cover the system’s architecture in upcoming posts.

Tons of data is good for the platform, but it is a challenge for users who want to find exactly what they need. This is where classification and ad search tools come to the user’s aid. Search is the most challenging task. The problem is not so much the volume of data as the human factor. The reality is that users make mistakes all the time, both in ad texts and in the search box. One of the main tasks is to eliminate errors in the ads and understand what the user meant.

To eliminate errors, we use all kinds of reference materials and correction algorithms, as well as more advanced approaches such as computer vision. For example, computer vision can check with very high accuracy (in some categories, higher than 95%) whether the user chose the ad category correctly. In addition, Avito regularly sends machine learning professionals to contests (held by platforms such as machinelearning.ru, Boosters, and Kaggle), whose goal is to find the most effective algorithms for solving various applied problems.
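To make the idea of dictionary-based correction more concrete, here is a minimal sketch in Python. It is not Avito’s actual algorithm: the vocabulary, the `correct_query` function, and the similarity cutoff are all assumptions made for illustration, and the matching relies on the standard library’s `difflib`.

```python
import difflib

# Hypothetical reference vocabulary; a real system would use a far larger
# dictionary built from catalog data and search logs.
VOCABULARY = ["iphone", "samsung", "bicycle", "apartment", "laptop", "stroller"]


def correct_query(query, vocabulary=VOCABULARY, cutoff=0.75):
    """Replace each misspelled word with its closest vocabulary entry.

    Words already in the vocabulary, and words with no sufficiently
    similar entry, are left unchanged.
    """
    corrected = []
    for word in query.lower().split():
        if word in vocabulary:
            corrected.append(word)
            continue
        # get_close_matches returns up to n candidates, best match first,
        # keeping only those with a similarity ratio of at least `cutoff`.
        matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)


print(correct_query("iphon"))  # prints "iphone"
```

A production system would also need to weigh word frequency, keyboard layout mix-ups, and context, which this sketch ignores.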

We use Sphinx for full-text search, regularly share our experience with it, and actively participate in its development.

As already mentioned, users add about a million new ads daily. But few people know that more than half of them are spam. Traditionally, moderation has been used to identify spam. A fun fact: the first version of the moderation system was written in just a week, and it proved so effective that not a single major upgrade has been necessary since then. But, despite the improvements, it is clearly impossible to handle this amount of information manually. Therefore, more advanced methods are used, such as neural networks that are continuously trained on human moderators’ decisions.

Data is not the only challenge. The market keeps dictating new requirements, which translate into increasingly complex business logic. Historically, the platform’s business logic has been implemented in PHP. In 2016, we switched to a new version, PHP 7, and the servers took a breather as the load dropped threefold. Today, PHP is not the only server language used at Avito. The project initially had a monolithic architecture, but it has long been moving toward microservices. Depending on the task and the load, different languages are used, such as Python and Go.

No matter how complex the tasks on the server side are, it’s all hidden from the user. What users see when interacting with the service is the responsibility of the frontend team. Originally, the website was built with the technology available at the time: server rendering and jQuery. But not so long ago, we completely abandoned jQuery in favor of browser APIs and small libraries that handle specific tasks. The frontend team does its best to keep up with the times and use the latest technology and solutions. For example, the new version of JavaScript was adopted as soon as the specification was approved (ECMAScript 2016 is currently in use). In addition, new single-page web apps (SPAs) are being built on React and basis.js. Frontend developers also take part in open source projects (such as CSSO, a CSS minifier with structural optimizations), develop tools, and share their experience at conferences.

Avito appeared at the time of the birth of the mobile platform as we know it today. Naturally, everything started with a web version, and then a web version for mobile devices was launched. But native apps have platform-dependent features. Today, mobile apps are in the spotlight. Separate teams are simultaneously developing several apps for iOS and Android. They take their mission very seriously and share their experience at conferences and on GitHub. One of their projects is Avito’s media picker Paparazzo, which we published last year and which you can read about on maniacdev.com or in the OLX Group technology blog.

Both mobile development teams, iOS and Android, use cutting-edge technology. First, there are Kotlin (which we started to use before the release of version 1.0) and Swift, which have almost completely replaced the Java and Objective-C legacy in our products. Second, we invest in the development and promotion of good engineering practices: CI, CD, code review, and test automation. Third, there is a loosely coupled, scalable architecture, which allows several groups of developers to collaborate smoothly on a large project and respond promptly to user requests.

Initially, Avito had no testing function, and the first QA professionals joined the team in 2012. Today, we have more than 40 of them, one third of whom specialize in automation. The toolkit is a standard one: PHP + PHPUnit and Selenium. We have a system for launching tests that runs an average of 110–120 thousand tests per day. At the peak, this figure reaches 200,000. To organize the interaction between testers and developers, we use an in-house test case management system that lets us store test cases, execute them, and attach bugs in Jira.

This, in brief, is how development at Avito is organized. Of course, many things remain behind the scenes. We will try to address this gap in the near future.

In conclusion

Learn more about Avito’s inner workings from the articles in the OLX Group technology blog.

Here you can see a playlist with videos from the meetings held in our office or from conferences with our speakers in English.

We keep most resources about Avito’s web development in Russian. We post all our news on AvitoTech on Twitter, Facebook, VKontakte, and Telegram. Presentations by our developers at conferences and meetups are on our SlideShare and Speaker Deck.

In this blog, we will give you an insight into what we are doing to provide millions of our users with a convenient and reliable platform for exchanging information and what technologies we use for this.
