Machine learning at Takealot

Takealot Engineering
Aug 22, 2022

By Axel Tidemann and Pieter Rautenbach

Takealot is the biggest e-commerce site in South Africa, with 3 million daily active users and its own supply chain network, generating massive amounts of data. Operating fully in the cloud enables machine learning applications at scale. Use cases include customer search, product recommendations, moderation of public reviews, and driver capacity predictions.

Thanks to the new architecture we have built over the last few years, we can train machine learning models on millions of samples in a few hours. The data is readily available in a data lakehouse, and the cloud provides scalable data processing. On-demand graphics processing units (GPUs) enable fast training of models. The data lakehouse ingests millions of records a day, laying the groundwork for more machine learning models in the future.

Machine learning teams

Takealot has four machine learning teams roughly divided according to business function and domain. These are:

Discovery machine learning: responsible for customer search and product recommendations.

Merchant machine learning: deals with everything to do with in-house sourced retail products and third-party marketplace products.

Supply chain and logistics machine learning: covers everything that moves around (e.g. drivers for Mr D or products in the warehouse).

Machine learning operations (MLOps): provides platforms for doing machine learning at scale, including scalable data pipelines, machine learning pipelines for training models, and large-scale analytical databases.

Machine learning platform architecture

Takealot engineering runs on the Google Cloud Platform on top of Kubernetes, which manages the deployment and scaling of containerised applications. This enables the use of Kubeflow, which labels itself as “the machine learning toolkit for Kubernetes”. The machine learning teams write either vanilla Kubeflow pipelines or TensorFlow Extended (TFX) pipelines, depending on the preference of the developer. Early experimentation is done in Jupyter notebooks, but a key enabler for putting models into production is moving to pipeline development once the initial experimentation phase is over.

The technology stack that enables machine learning at Takealot.

Machine learning pipelines in action

Dataform runs the data pipelines, aggregating data from various tables in BigQuery. A cron job triggers the Dataform pipeline, which in turn triggers the machine learning pipeline. The latter evaluates the resulting model by comparing it to the previously trained model. If the new model is better, it is pushed to Google Cloud Storage, where TensorFlow Serving automatically picks it up.

Combination of Dataform and machine learning pipelines for automatically retraining models with fresh data, serving the models after training has finished.
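The "push only if better" step of the retraining pipeline can be sketched in a few lines of Python. This is a minimal illustration and not Takealot's actual code: `EvalResult`, `should_promote`, `export_path`, the `min_improvement` margin, and the single "higher is better" metric are all assumptions made for the example. The only external fact relied on is that TensorFlow Serving watches a base path and loads models from numbered version subdirectories.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EvalResult:
    """Evaluation summary for one trained model (hypothetical structure)."""
    model_version: str
    metric: float  # assumption: higher is better, e.g. accuracy on held-out data


def should_promote(candidate: EvalResult,
                   baseline: Optional[EvalResult],
                   min_improvement: float = 0.0) -> bool:
    """Return True if the candidate should replace the currently served model."""
    if baseline is None:
        # Nothing is being served yet, so the first model is always promoted.
        return True
    # Promote only on a strict improvement beyond the (assumed) margin.
    return candidate.metric > baseline.metric + min_improvement


def export_path(bucket: str, model_name: str, version: int) -> str:
    """Build a versioned Cloud Storage path (bucket and layout are hypothetical).

    TensorFlow Serving polls a model's base path and serves from numeric
    version subdirectories, so a new version directory is enough to roll out.
    """
    return f"gs://{bucket}/{model_name}/{version}"


if __name__ == "__main__":
    previous = EvalResult("v1", 0.90)
    fresh = EvalResult("v2", 0.91)
    if should_promote(fresh, previous):
        print("push to", export_path("example-bucket", "search-category", 2))
    else:
        print("keep serving", previous.model_version)
```

Requiring a strict improvement means a retrain that merely matches the previous model leaves the served version untouched, which avoids needless rollouts.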

Examples of machine learning models in production

In the search domain, there is a model that predicts the category of what the customer is trying to buy. A search for “Harry Potter”, for instance, is ambiguous: is the customer looking for books, films, or toys? If the customer instead searches for “harry potter toys” or “harry potter books”, the model can work out which category of products to present.

Mr D does not want customers to receive their food cold. So every minute of the day, a model predicts how long it will take a driver to pick up an order, based on the number of orders and available drivers. If the predicted time is too long, the app stops accepting new orders.
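The gating logic around such a prediction can be sketched as follows. Everything here is hypothetical: `estimate_pickup_minutes` is a naive stand-in for the trained model, and the 10-minute pickup time and 20-minute threshold are made-up values chosen only to make the example concrete.

```python
def estimate_pickup_minutes(open_orders: int,
                            available_drivers: int,
                            minutes_per_pickup: float = 10.0) -> float:
    """Stand-in for the trained model: a naive estimate, not the real predictor."""
    if available_drivers <= 0:
        # No drivers at all means no order can be picked up.
        return float("inf")
    # Each driver works through their share of the queue, at least one pickup.
    orders_per_driver = max(1.0, open_orders / available_drivers)
    return orders_per_driver * minutes_per_pickup


def accepting_orders(open_orders: int,
                     available_drivers: int,
                     max_pickup_minutes: float = 20.0) -> bool:
    """Return True while the predicted pickup time stays within the limit."""
    predicted = estimate_pickup_minutes(open_orders, available_drivers)
    return predicted <= max_pickup_minutes


if __name__ == "__main__":
    # Re-evaluated periodically (the article says every minute).
    for orders, drivers in [(10, 10), (50, 10), (5, 0)]:
        print(orders, drivers, accepting_orders(orders, drivers))
```

In a real system the stand-in function would be replaced by a call to the served model, while the threshold check around it could stay this simple.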

There is an automated review system that accepts or rejects reviews. Between 4,000 and 7,000 reviews are written on Takealot every day. The model moderates over 80% of these automatically. The rest, where the model is unsure whether to accept or reject, are sent to manual moderation.
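One common way to implement this kind of "decide when confident, escalate when unsure" behaviour is to apply two thresholds to the model's score. The sketch below assumes the model outputs a probability that a review should be accepted. The names `route_review` and `automation_rate` and the 0.9 and 0.1 thresholds are illustrative assumptions, not details from Takealot's system.

```python
from enum import Enum
from typing import Iterable


class Decision(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    MANUAL = "manual"


def route_review(p_accept: float,
                 accept_above: float = 0.9,
                 reject_below: float = 0.1) -> Decision:
    """Route one review based on the model's (assumed) acceptance probability."""
    if not 0.0 <= p_accept <= 1.0:
        raise ValueError("p_accept must be a probability in [0, 1]")
    if p_accept >= accept_above:
        return Decision.ACCEPT
    if p_accept <= reject_below:
        return Decision.REJECT
    # The model is unsure, so a human moderator decides.
    return Decision.MANUAL


def automation_rate(scores: Iterable[float]) -> float:
    """Fraction of reviews decided without manual moderation."""
    decisions = [route_review(s) for s in scores]
    if not decisions:
        return 0.0
    automated = sum(d is not Decision.MANUAL for d in decisions)
    return automated / len(decisions)


if __name__ == "__main__":
    sample_scores = [0.95, 0.99, 0.02, 0.5, 0.97]  # made-up scores
    print([route_review(s).value for s in sample_scores])
    print(automation_rate(sample_scores))
```

Moving the two thresholds closer together raises the share of reviews handled automatically, at the cost of more automated mistakes. Moving them apart sends more reviews to human moderators.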

We are hiring

Takealot is looking for machine learning engineers, machine learning engineers with a focus on recommendations, machine learning managers, and MLOps engineers. The machine learning space at Takealot is growing. Join us!
