Gearing Up for the Next Stage of Machine Learning

Carola Gerwig
Published in TUI Tech Blog
Feb 11, 2022
Data Science Journey: What we do & how we do it

At TUI we use data science in every facet of our business. We have dedicated data science teams in each of our business units. These teams have implemented machine learning algorithms in a variety of areas, such as improving inventory planning and dynamically adjusting our offerings to customers.

We use bottom-up processes guided by our data scientists to align and innovate on data science life-cycle processes across the company. Besides building new ML-based products, we are gearing up for the next stage of TUI’s machine learning journey. Our next objectives fall into two categories: infrastructure and knowledge development. In this blog post we briefly describe these objectives.

Infrastructure

A good infrastructure should permit easy debugging, safe rollback, fine-grained process monitoring, and reliable scaling for a large number of users. Moreover, it should readily accommodate a wide variety of architectures. Its goal is to enable a quick and seamless transition of models from development to deployment.
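To make "safe rollback" concrete, here is a minimal sketch in Python. It is purely illustrative and not part of TAP: the `ModelRegistry` class and the `pricing-model` version names are hypothetical, and a real setup would persist this history in a model registry or deployment tool rather than in memory.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ModelRegistry:
    """Hypothetical in-memory record of deployed model versions."""

    # Deployment history, oldest first; the last entry is the live version.
    versions: list[str] = field(default_factory=list)

    def deploy(self, version: str) -> None:
        """Make `version` the live model while keeping the earlier history."""
        self.versions.append(version)

    @property
    def live(self) -> str | None:
        """Return the currently live version, or None if nothing is deployed."""
        return self.versions[-1] if self.versions else None

    def rollback(self) -> str:
        """Drop the live version and return the one that becomes live again."""
        if len(self.versions) < 2:
            raise RuntimeError("no earlier version to roll back to")
        self.versions.pop()
        return self.versions[-1]


registry = ModelRegistry()
registry.deploy("pricing-model:1.0")  # hypothetical version names
registry.deploy("pricing-model:1.1")
restored = registry.rollback()  # the faulty 1.1 is withdrawn, 1.0 is live again
```

The point of the sketch is that a rollback is only safe if earlier versions stay addressable after a new deployment, which is why the history is kept rather than overwritten.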

As our machine learning products developed, each team put in place infrastructure that enabled TUI to tackle problems of greater complexity that required more computation and other resources. Since we give maximum autonomy to our teams, each team has developed infrastructure suitable to the specific requirements of their own projects. These independent developments have led to redundancies as different teams have separately developed infrastructures for similar needs. Such redundancies are not only inefficient, but they also prevent a comprehensive understanding that can only be gained by solving similar problems in different contexts.

We explored many end-to-end infrastructure solutions on the market. While useful for initial iterations, they didn’t possess the rigor and flexibility to serve our advanced modeling requirements. So, we developed the TUI Analytics Platform (TAP). TAP is an integrated solution that combines Kubernetes, GitLab, Airflow, MLflow, Datadog, Snowflake, S3-based data products, and other tools to deliver a flexible and comprehensive solution. TAP is continuously being developed, with new capabilities and other improvements added based on feedback from users such as data scientists and ML engineers.

Because we believe in tight coupling between model development and model industrialization, TAP requires data scientists to reason carefully about their requirements. While it doesn’t require them to be fully versed in the intricacies of our platform, it does encourage them to develop an understanding of the tools, which greatly helps them structure their implementations well from the start. Admittedly, there can be a learning curve for some data scientists. One of our major focuses has been to motivate more data scientists to consistently use this platform, which has proven its value in production for many projects.

Outline of the TUI Analytics Platform for Data Science at TUI

Knowledge Development

This brings us to the human dimension of our objectives. We encourage our data scientists to write production-ready code from the get-go. Over the years, different teams have adopted different sets of development practices.

We have observed that a certain set of processes and guidelines inspired by proven software development practices provides the best trade-off between the speed of experimentation and the ease and reliability of industrializing models. To promote these practices, we have developed a comprehensive template that helps data scientists with less software development experience start working with TAP, while serving as canonical guidelines for more experienced colleagues.

In addition, more than any other field of IT, modeling benefits from the perspectives of people with diverse professional skills and experience. So we consider it particularly important to enhance cooperation between members of different teams, who are now distributed all over the globe. At TUI we have regular events, including informal company-wide stand-up meetings to share our experiences, tutorials and workshops on different data science and MLOps topics, and deep-dive sessions on ongoing ML work or research topics. This has helped maintain a strong communal spirit among our data scientists.

Looking Ahead

TUI has rapidly transformed over the last two years to surmount the challenges posed by the pandemic. We have used this challenge to innovate on our data science processes. By unifying infrastructure and ways of working, we can save development resources and benefit from expertise across the organisation while empowering our globally distributed teams to respond quickly to their local teams or market conditions. We intend to continue on this path. Among the tools we have used to push forward are internal competitions; we will describe one in an upcoming blog post.

Thanks to my co-author Jasdeep Singh.
