How to ship a machine learning model in days not months?

Adrien Giraud
5 min read · Nov 5, 2021


In this article, you’ll learn how we quickly delivered a machine learning service along with some tips to enhance collaboration between developers and data scientists.

At Doctolib, our priority is to ease the life of health practitioners, especially by automating recurring non-medical tasks and helping them to be fully focused on delivering their expertise.

To achieve this goal, we increasingly use machine learning. Unlike other ways of retrieving information, such as database queries or business rules, features that rely on machine learning can’t be implemented and maintained by developers alone: they require collaborative work with data scientists.

Recently, we deployed a machine learning model to enhance an existing text classification feature that previously relied on business rules (regular expressions).

Feature accuracy evolution during the sprint. After the model deployment in week 2, accuracy jumped from around 57% with a rule-based method to above 86% with a model (+51% in relative terms).

The modelling approach is not detailed here. In short, we used a rather classical approach (a bag-of-words model) that preserves data privacy by relying only on a predefined set of generic keywords. This article focuses instead on some key lessons learnt from this experience.

No patient personal data was used, in accordance with Doctolib’s data confidentiality commitment and the GDPR.

Adapt your data science project timeline

At school and in many data science resources, we learn that a data science project should ideally follow this kind of linear timeline:

Classical linear machine learning project timeline.

It’s perfect in theory, but in practice, especially in an agile environment, it comes with some pitfalls:

  • It makes the project prone to the tunnel effect: many stakeholders start to take the project seriously only at the very end, when they can see concrete output. New requirements also arrive frequently and are hard to refuse, given that we’re “not yet in production”.
  • It’s pretty hard to sync developers and data scientists, given that they can collaborate only at a very late stage. Developers just act as integrators of something they had no say in until the very end. Data scientists miss out on what developers know about production constraints, which can make part of their work pointless. For instance, developers know the response time requirements, which are a key constraint in the development of machine learning models.

From a product manager’s perspective, incorporating machine learning based features can be seen as a risk that could jeopardise the roadmap.

To address these concerns, we mitigated the risks by reorganising the timeline as follows:

Revamped machine learning project timeline.

We skipped the proof-of-concept and prototype stages to directly ship a production-ready service, with all the production constraints set from the beginning. Normally these stages are intended to validate technical feasibility and business traction. Luckily, in our case, the machine learning problem was rather classical (text classification) and the business benefit was evident, as the feature already existed but with only average performance.

To shrink the timeline even more, data scientists provided developers with a dummy model API endpoint: an API with the same behaviour as the final one, but with a dummy model behind it. Developers were therefore able to work on the API integration while data scientists were building a first model. The later switch from the dummy model to the real one was completely transparent for developers.
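The dummy-endpoint idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual Doctolib service: the names `Model`, `DummyModel`, `KeywordModel`, `handle_predict`, the `instances` payload key and the labels are all hypothetical. The point it shows is that the response shape stays the same whichever model sits behind the endpoint.

```python
from typing import Dict, List, Protocol


class Model(Protocol):
    """Interface shared by the dummy model and any later real model."""

    version: str

    def predict(self, instances: List[Dict[str, str]]) -> List[List[str]]:
        ...


class DummyModel:
    """Placeholder model: returns the same label for every instance."""

    version = "dummy-0"  # hypothetical version string

    def predict(self, instances: List[Dict[str, str]]) -> List[List[str]]:
        return [["unknown"] for _ in instances]


class KeywordModel:
    """Toy stand-in for a trained model, used only to show the swap."""

    version = "toy-1"  # hypothetical version string

    def predict(self, instances: List[Dict[str, str]]) -> List[List[str]]:
        return [
            ["priority"] if "urgent" in inst.get("text", "").lower() else ["standard"]
            for inst in instances
        ]


def handle_predict(model: Model, payload: Dict) -> Dict:
    """Endpoint logic: the response shape does not depend on the model."""
    instances = payload["instances"]
    return {
        "model_version": model.version,
        "predictions": model.predict(instances),
    }


payload = {"instances": [{"text": "Urgent request"}, {"text": "Hello"}]}
dummy_response = handle_predict(DummyModel(), payload)
real_response = handle_predict(KeywordModel(), payload)
```

Because both responses expose the same keys, code written by developers against the dummy endpoint keeps working once the model behind it is replaced.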

How about not starting from scratch?

Another key to delivering on time was reusing many components we had developed in previous data science projects. In the diagram below, we list all the components required to deliver a machine learning API endpoint. Apart from the business logic specific to the project, almost all other components are partially or completely reusable.

Machine learning pipeline architecture. Most components have been partially or fully reused from previous projects and boilerplate.

To ease and promote reuse within the data science team, we created a set of boilerplate libraries that contain all the components mentioned above. For a new project, data scientists just have to focus on the specifics of their project, without thinking about commodity concerns such as the autoscaling strategy.

API specification first

When providing an API, agreeing on a contract between the two parties (consumer and provider) is key. This is just as true when you provide an ML API. In previous projects, the agreed contract was only written in the documentation, which was a source of errors and misunderstandings, as documentation can be subject to interpretation. To avoid this, we decided to set a standardised API contract before starting any development, following the OpenAPI specification. OpenAPI is a broadly adopted industry standard for describing APIs.

Apart from classical API specifications like available resources and methods, we defined the following criteria, which are key for an ML API endpoint:

  • The mandatory feature variables. We require consumers of the ML API to provide a minimal set of fields needed to obtain decent prediction performance.
  • Feature types and missing feature behaviour. The data type is enforced, so the ML API refuses to predict when the data typing is wrong. It also greatly simplifies the way the ML pipeline handles missing values.
ML API input content definition example. Using this definition, the number of authorised predictions per call is set at 200 maximum and two features are mandatory (golden_feature_1 and golden_feature_2).
  • Model and inference code versions. To better track performance and bugs, we return two version numbers. The first one corresponds to the inference code and the second is the algorithm version. We have decided to separate the two as they have different life cycles.
ML API output content definition example. It supports multi-label classification (more than one label per instance to classify) as we allow predictions to be an object with multiple arrays.
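As a rough illustration of the criteria listed above, here is a minimal Python sketch of the checks such a contract implies. In the article the contract is expressed in OpenAPI; this sketch hand-codes equivalent checks only to make them concrete. The 200-instance limit and the two feature names come from the captions, while the `instances` key, the feature types, the version strings, the `labels` key and the helper names are assumptions made for the example.

```python
from typing import Any, Dict, List

MAX_INSTANCES = 200  # from the caption: at most 200 predictions per call
# Feature names come from the caption; the types are assumptions.
REQUIRED_FEATURES = {"golden_feature_1": str, "golden_feature_2": int}
INFERENCE_VERSION = "1.2.0"  # hypothetical inference code version
MODEL_VERSION = "3"  # hypothetical model (algorithm) version


class ContractError(ValueError):
    """Raised when a request does not respect the agreed contract."""


def validate_request(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the instances if the payload respects the contract."""
    instances = payload.get("instances")
    if not isinstance(instances, list) or not instances:
        raise ContractError("'instances' must be a non-empty list")
    if len(instances) > MAX_INSTANCES:
        raise ContractError(f"at most {MAX_INSTANCES} instances per call")
    for i, inst in enumerate(instances):
        if not isinstance(inst, dict):
            raise ContractError(f"instance {i}: must be an object")
        for name, expected in REQUIRED_FEATURES.items():
            if name not in inst:
                raise ContractError(f"instance {i}: missing '{name}'")
            # Strict type check: refuse to predict on wrongly typed data.
            if type(inst[name]) is not expected:
                raise ContractError(
                    f"instance {i}: '{name}' must be {expected.__name__}"
                )
    return instances


def build_response(labels: List[List[str]]) -> Dict[str, Any]:
    """Wrap predictions with the two version numbers.

    'predictions' is an object holding arrays, so each instance can
    carry several labels (multi-label classification).
    """
    return {
        "inference_version": INFERENCE_VERSION,
        "model_version": MODEL_VERSION,
        "predictions": {"labels": labels},
    }
```

Rejecting badly typed or incomplete requests at the boundary means the pipeline behind the endpoint only ever sees inputs in the agreed shape.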


Brought together, all of these elements contribute to faster and easier deployment of machine learning models. In summary:

  • change your ML project’s timeline habits and adapt to circumstances
  • mock your service to make the developer’s life easier
  • promote reusable ML components
  • spend time on defining your API contract using a standard like OpenAPI

What I have covered in this article is implemented and used in our data science team. Check out our jobs page if you want to join us.

I hope that you can reuse some of these ideas to speed up your next data science project. We already have some ideas on ways to improve this process, so stay tuned and subscribe to our tech newsletter.