A day in the life of a Data Scientist

Cho Zin Tun
Published in Nerd For Tech
Aug 6, 2021
Activities in the day of a data scientist

I am Zin, currently working as a data scientist in the payment industry, in an analytics and consulting team.

Data scientist is often described as the sexiest job of the 21st century, and you might be curious about what a day in the life of a data scientist looks like.

In this article, I am going to share my typical day as a data scientist. My story may not represent all data scientists, but I believe it will give a rough idea of a data scientist's life to those who are interested in joining an industry similar to mine.

As a data scientist in the analytics and consulting team, my main responsibility is to help clients solve their problems, either by providing actionable insights or by building machine learning models to forecast future trends or predict customers’ behaviours.

From the above, it is clear that my typical day involves a good mix of business, data and technology activities.

Data

Around 50 per cent of my day is spent dealing with data. Some days I am exploring, pre-processing or wrangling data; on other days I am building machine learning models or visualisation dashboards.

Knowing and Exploring Data

Knowing data inside out is essential for data scientists. We are the go-to people for multiple cross-functional teams when it comes to data, be it during strategic planning sessions, client meetings, or day-to-day decision-making.

Whenever there is a client engagement, we first have to ask ourselves whether it is a familiar or a new business problem. For a familiar one, we can leverage existing solutions and data. For a new business problem, we have to work out whether it can be addressed using the existing data. If it cannot, we have to consider exploring new datasets. Data is the basic yet crucial element that determines whether we can support a client’s business problem at all.

Data Preparation

To perform an analysis or build a model, we first have to prepare the data. Data preparation plays an important part and can involve several steps, including data preprocessing and data wrangling.

Transforming raw data into a machine-readable format, imputing missing values, handling outliers, reducing dimensions and creating new features are some examples of the data preparation process.
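To make two of these steps concrete, here is a minimal sketch of imputation and outlier handling, using only the Python standard library. The function names, the median fill and the 1.5 x IQR clipping rule are assumptions chosen for illustration, and the amounts are made-up numbers; a real project would pick the techniques that suit its data.

```python
from statistics import median, quantiles


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def clip_outliers_iqr(values, k=1.5):
    """Clip values that fall outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


# Hypothetical transaction amounts with one missing and one extreme value.
amounts = [12.0, 15.0, None, 14.0, 13.0, 500.0, 16.0]
imputed = impute_median(amounts)      # the None is filled with the median
cleaned = clip_outliers_iqr(imputed)  # the extreme value is pulled back in
print(cleaned)
```

Clipping keeps the row instead of dropping it, which can matter when every transaction has to stay in the dataset; removing the row is an equally valid choice in other settings.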

Model Development

We have to maintain and revalidate the existing models to make sure that they are trained on recent data and trends.

From time to time, we also brainstorm new solutions to serve additional business needs. While developing new models, we follow the usual steps: feature extraction, splitting the dataset into train and test sets, cross-validation, out-of-time validation and choosing the best model using relevant performance metrics.
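As a rough illustration of two of these validation steps, the sketch below holds out the most recent period for out-of-time validation and then builds cross-validation folds on the remaining data. It uses only the Python standard library; the function names, the cutoff date and the monthly spend rows are all hypothetical.

```python
from datetime import date


def out_of_time_split(rows, cutoff):
    """Train on rows dated before the cutoff; hold out rows on or after it."""
    train = [r for r in rows if r["date"] < cutoff]
    holdout = [r for r in rows if r["date"] >= cutoff]
    return train, holdout


def k_fold_indices(n_rows, k):
    """Yield (train_idx, valid_idx) pairs for k contiguous folds."""
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        valid = list(range(start, start + size))
        train = [i for i in range(n_rows) if i < start or i >= start + size]
        yield train, valid
        start += size


# Hypothetical monthly records from January to August 2021.
rows = [{"date": date(2021, m, 1), "spend": 100 + m} for m in range(1, 9)]

# The last two months are kept aside as the out-of-time sample.
train, holdout = out_of_time_split(rows, cutoff=date(2021, 7, 1))

# Cross-validation folds are built on the training period only.
folds = list(k_fold_indices(len(train), k=3))
```

The out-of-time sample is never used for fitting or tuning, so it gives a check on how a model behaves on data newer than anything it was trained on.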

I mainly use HiveQL (HQL), a SQL-like language, to get data from Hadoop for data summarisation and ad hoc queries. For exploring new data sources, I use Python to test various third-party APIs. Data preparation and model development are largely implemented in Python.

Building Visualisation Dashboards

To support the business and to understand monthly and quarterly trends, we also build visualisation dashboards. With these, it is easier for non-technical users to understand how the business is performing and to plan strategies for the future.

Tableau is my main Business Intelligence (BI) tool for developing dashboards.

Project Delivery

Another 30 per cent of my day is spent preparing for client project delivery.

After performing analysis or building models, we cannot simply hand the numbers to the clients. We have to interpret the numbers and explain them in layman’s terms. Additionally, we have to give the clients actionable directions so that they can come up with a solid, strategic business plan.

While preparing for project delivery, one of the challenges I have faced is extracting the data that is relevant to the client’s problem. Since we have access to abundant data, we tend to be generous and want to share as much of it as we can. However, too much information diverts us from the main problem statement. Asking “so what?” about each insight helps us stay on the right track.

Interpreting the insights derived from the data, extracting relevant information and providing concise, actionable messages in the client’s language are as important as dealing with data and building models.

Meetings

The remaining 20 per cent of my day is usually spent in meetings.

I usually acquire business knowledge through cross-functional team meetings. Having business knowledge helps us with our daily analysis as well as with model building.

I also have data science meetings where we discuss data science techniques, new solutions and improvements to the pipeline process.

Meetings are opportunities for us to learn from each other.

In summary, a data scientist needs not only analytics and programming skills, but also skills in interpreting results, identifying business problems and delivering relevant messages.

I hope this article helps those who are trying to figure out what a day in the life of a data scientist looks like. If you are interested in becoming a data scientist, I have written an article about how I became one.
