Data coach’s journey @ TDF

Gaïlé Lejay
TotalEnergies Digital Factory
9 min read · Sep 16, 2021

How do we create a shared vocabulary between businesspeople and data scientists? How do we make sure that the objectives set by the businesspeople are reachable and realistic? How do we translate their needs into technical terms?

These are some of the challenging questions the “data coach” tackles.

We hope that, by the end of this article, you will understand what it means to be a “data coach” at TotalEnergies Digital Factory.

This article is divided into two main parts. The first one gives general information about the “data coach”, his team, the ideation workshops conducted, the framing of the projects and the technical operation carried out by the team, called “Data Exploration”. The second part is a story illustrating some of the daily activities of the “data coach”.

PART 1:

“Data coach” and his team, the Data Studio:

In the factory, there are many different teams with all kinds of expertise. It is not always easy to know what role and mission lie behind each job title. When it comes to the “data coach”, few people have an exact idea of what it is about. And to be honest, it took me a while to understand it myself. So, here is a broad description.

The first thing you need to know is that, at the factory, the data coach is part of the Data Studio, a cross-functional, operational support team dedicated to data activities. The data scientists of this team help Subject Matter Experts (SMEs) with different aspects of their digital projects, such as data preparation, data visualization, and data science. In this team, the coach can be considered the proxy between businesspeople and data scientists.

It is the coach who first interacts with the SMEs and makes the initial assessment of how relevant the request is to the Data Studio’s offerings. If the request is compatible, the coach will stay in continuous contact with the stakeholders throughout the Data Studio’s assistance.

One of the key functions of the “data coach” is to help find a consensus that addresses the needs of the business while ensuring technical feasibility. For that, the coach needs to raise awareness of what can be done with the available data, and to explain the methodology and the associated “data prerequisites”. Thus, it is the coach’s responsibility to explain to the businesspeople what is realistic and what can be expected when working with data.

Ideation workshops & framing:

At TDF, the Data Studio mainly operates during the ideation phase, or at an early stage of the building process of a digital product. In that regard, the coach always has to stay connected with the team in charge of framing the projects.

During this phase, when the product to be built is data-driven, the coach and the data experts from his team ask questions to review the decisive technical aspects. Here are some of the questions that are frequently asked: “What are the data sources?”, “Do we have enough data?”, “Is the data labeled?”, “Do we need to do Machine Learning?”. Those questions help to estimate the feasibility of the data functionalities, to secure them, to know which skills to mobilize and to pave the way for the delivery team to come.

“Data Exploration”:

Sometimes, the technical review conducted is not sufficient, and the Data Studio can be asked to carry out what is called a “Data Exploration”. It is a short, small-scale technical operation on specific datasets. This operation usually lasts between four and six weeks, depending on the business needs. During this period, the data scientists and the businesspeople work together. The exploration can address different data topics, such as data analysis and visualization, proofs of concept of ML models, and optimization.

For example, the SMEs might use an anomaly detection system that raises too many false alarms. In this case, the aim of the “Data Exploration” could be to improve the system by checking the quality of the available data, analyzing it, and finding new and more precise correlations between the data inputs and the events. The system would then be more accurate and send fewer false alarms.
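
To make “fewer false alarms” measurable, one option is to compare the alarms raised by the system with the events that really happened. The sketch below is only an illustration under stated assumptions: the `AlarmStats` and `score_alarms` names and the sample data are hypothetical, and precision and recall are one common choice of metrics, not necessarily the ones used by the Data Studio.

```python
from dataclasses import dataclass


@dataclass
class AlarmStats:
    true_alarms: int     # alarm raised and a real event occurred
    false_alarms: int    # alarm raised but no real event occurred
    missed_events: int   # real event occurred but no alarm was raised

    @property
    def precision(self) -> float:
        """Share of raised alarms that matched a real event."""
        raised = self.true_alarms + self.false_alarms
        return self.true_alarms / raised if raised else 0.0

    @property
    def recall(self) -> float:
        """Share of real events that triggered an alarm."""
        total_events = self.true_alarms + self.missed_events
        return self.true_alarms / total_events if total_events else 0.0


def score_alarms(alarms: list[bool], events: list[bool]) -> AlarmStats:
    """Compare alarms and real events, one entry per time step."""
    if len(alarms) != len(events):
        raise ValueError("alarms and events must have the same length")
    pairs = list(zip(alarms, events))
    return AlarmStats(
        true_alarms=sum(a and e for a, e in pairs),
        false_alarms=sum(a and not e for a, e in pairs),
        missed_events=sum(e and not a for a, e in pairs),
    )


# Hypothetical data: True means "an event happened" / "an alarm was raised".
events = [False, False, True, False, False, True, False, False]
current_alarms = [True, False, True, True, False, True, True, False]
improved_alarms = [False, False, True, False, False, True, True, False]

before = score_alarms(current_alarms, events)
after = score_alarms(improved_alarms, events)
print(f"before: {before.false_alarms} false alarms, precision {before.precision:.2f}")
print(f"after:  {after.false_alarms} false alarms, precision {after.precision:.2f}")
```

Tracking recall next to precision guards against a system that reduces false alarms simply by staying silent on real events.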

For a “Data Exploration” to succeed, there are some prerequisites that the coach needs to communicate to the businesspeople. First of all, the main targets (baselines, thresholds, business metrics) should be known. Secondly, before the data scientists can start any activity, the datasets need to be ready and available. Finally, the collaborators need to be willing to invest their time and to participate in different collaborative workshops during the weeks to come.

To prepare the launch of the “Data Exploration”, the coach organizes a first meeting with the data scientists and the SMEs. During this meeting, called “DIAGNOSTIC”, the coach asks several questions in order to translate the business needs into technical terms and to align the business Key Performance Indicators with statistical metrics. For example, to take preventive action, the SMEs might need a tool that raises alerts when an indicator drops below a certain value. This value is precious, as it can be used as a threshold in the data scientists’ algorithm.
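
The threshold example above can be sketched in a few lines of Python. This is a minimal illustration, not the Data Studio’s actual tooling: the indicator values, the 0.80 threshold, and the `find_alerts` helper are all hypothetical.

```python
def find_alerts(readings, threshold):
    """Return (index, value) for each reading strictly below the threshold."""
    return [(i, value) for i, value in enumerate(readings) if value < threshold]


# Hypothetical daily values of a business indicator.
daily_indicator = [0.92, 0.88, 0.79, 0.85, 0.74]

# Hypothetical value given by the SMEs during the "DIAGNOSTIC".
BUSINESS_THRESHOLD = 0.80

alerts = find_alerts(daily_indicator, BUSINESS_THRESHOLD)
for day, value in alerts:
    print(f"Alert on day {day}: indicator at {value:.2f}, below {BUSINESS_THRESHOLD:.2f}")
```

The point of the sketch is that the business value becomes a single, explicit parameter, which makes it easy to discuss and adjust with the SMEs.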

To facilitate the “DIAGNOSTIC”, the coach uses collaborative tools such as Miro. This is quite handy, as it helps gather information in an interactive way and ensures that the business objectives are well understood.

PART 2:

Storytelling:

Here is a little story that illustrates what the daily activities of the “data coach” at TotalEnergies Digital Factory could look like:

It is the end of summertime and the data coach is enjoying his drink in TDF’s coffee room. After a strong and tasty coffee, the data coach is back at his workstation, carefully reviewing and commenting on the data scientists’ final report, which is going to be presented during the restitution meeting of a nearly completed project.

Soon after, the coach has a meeting with a team that is looking for help to start a data project. Thanks to this conversation, the coach has an idea of the overall context, the main goals, the needs and the urgency.

It is now time to go to the Data Studio’s weekly meeting and to share this information with the rest of the team. Everybody agrees that it is worthwhile to organize a “DIAGNOSTIC” with the team asking for guidance.

A couple of days later, the “DIAGNOSTIC” starts with a quick roundtable. The coach, who is leading the meeting, goes through different key questions to capture all the important information. The ambition is that, by the end of the meeting, the data scientists and the coach have a clear vision of the use case, the problem to solve, the inputs and the expected outcomes.

The coach listens closely while writing down the useful elements on the virtual whiteboard. When something is not explicit enough, the coach rephrases it to be sure it is correctly understood. By doing so, the coach starts identifying some weak points and begins to challenge the participants by asking about the added value of the desired solution and its target users. This facilitation helps in choosing the technical operations to carry out in order to meet the business needs.

During the last part of the “DIAGNOSTIC”, all the people in the meeting agree on the deliverables and on the next steps with the Data Studio.

For this particular request, it is decided that, during the following weeks, the Data Studio will assess the feasibility of the project by developing a small POC of an ML model.

A couple of days later, the two data scientists who are working on the use case and carrying out the “Data Exploration” facilitate the first collaborative workshop with the SMEs. This meeting is really informative and allows the data scientists to work on their own over the following days.

Once the first results emerge, the data scientists organize a second workshop to get feedback and to collect additional information in order to achieve better performance.

But the SMEs tend to be less involved. Indeed, with the data and the information already provided, they assume that the data scientists have everything they need to build the model. However, the data scientists do need their domain knowledge in order to keep making progress.

The coach decides to reach out to the experts to tell them that their expertise is essential and that, for the “Data Exploration” to succeed, it is crucial to work together and to share the same vision.

A small aside: if you want more details about the importance of a close relationship between the data scientists and the business representatives, I encourage you to read Michel Lutz’s article, “Machine learning short time to production: some advices”, on our blog.

The SMEs are now fully aware of that and come to the remaining workshops. It is nearly the end of the Data Exploration, and it is time for the final restitution. The coach helps finalize the report by highlighting the parts that are not sufficiently clear. Indeed, it is important to ensure that people with little or no data expertise can understand the information shared in the report.

The data scientists present their report and give their conclusions and recommendations. The “Data Exploration” helped validate the technical feasibility of the project and showed that the available data has signal and can be used to develop an ML model. The businesspeople in the meeting are reassured and seem to be pleased with the final results. It is now their decision whether to set up a delivery team to build the product.

A few days later, the coach receives the responses to the feedback form that was sent out, and they are positive. The “data coach” happily shares this good news with the data scientists and congratulates them.

It is nearly the end of the day. The coach is looking forward to the days ahead, to carry on his activities and to contribute to the acceleration of TotalEnergies’s digital transformation.

Text and illustrations by Gaïlé Lejay
