Building your data portfolio is like making a dinner

Paige Tran
Published in CodeX
4 min read · Aug 15, 2022

Some readers asked me to share in more detail how to build data projects or a data portfolio productively, following my previous blog post, 4 Self-learning steps to get started in Data Analytics.

So I have split this topic out and written about it in more detail. This article focuses on a basic workflow that I believe will help you build your own portfolio. These steps are the process I apply daily to most of my tasks at work. I will use one of my favorite analogies to describe it:

“How do you make a great dish for dinner?”

Step 1: Collecting Data

This is the stage when you go to the supermarket and shop. Let’s imagine!

Data is your raw ingredients. Is the meat fresh or spoiled? Do you have every ingredient, or are some missing? The quality of everything you buy at the supermarket affects your dish. Sometimes the dish is not perfect because you are missing some ingredients. Sometimes you can borrow or combine ingredients from different supermarkets, just as you can improvise by using data from other sources.
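To make the "different supermarkets" idea concrete, here is a minimal sketch of combining two data sources on a shared key and checking which ingredients are still missing. The two tables, their column names, and their values are all made up for illustration; in practice you would probably do this with a library such as pandas.

```python
# Hypothetical source 1: orders exported from one system.
orders = [
    {"customer_id": 1, "amount": 30.0},
    {"customer_id": 2, "amount": 12.5},
    {"customer_id": 3, "amount": 8.0},
]

# Hypothetical source 2: customer details from another system.
customers = [
    {"customer_id": 1, "city": "Hanoi"},
    {"customer_id": 2, "city": "London"},
]

# Build a lookup table from the second source, keyed by customer_id.
city_by_customer = {c["customer_id"]: c["city"] for c in customers}

# Left join: keep every order and attach the city when we have one.
combined = [
    {**order, "city": city_by_customer.get(order["customer_id"])}
    for order in orders
]

# Orders with no match in the second source are the missing ingredients.
unmatched = [row["customer_id"] for row in combined if row["city"] is None]
print(unmatched)
```

Checking the unmatched keys right after combining tells you early whether the borrowed data really covers what you need.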

How do you get data to practice with?

As shared in the previous blog post, you can visit open source communities such as Kaggle or GitHub. Some good references for collecting data:

Step 2: EDA (Exploratory Data Analysis) and Cleaning Data

Source: read more in The Ultimate Guide to Data Cleaning

Whenever we talk about EDA or data cleaning, food preparation is one of the best analogies for how to do it. First, this stage needs a lot of patience and love! This is when you clean and prepare your ingredients (cutting, chopping, and so on), get a better sense of their quality, and decide which parts to use and which parts you can improvise on to improve the dish. Likewise, this stage gives data enthusiasts a lot of value:

  • Understand the range of the data and detect outliers and anomalies
  • Identify missing values
  • Decode and conquer imbalanced classes
  • Reformat variable names and types

And so on!
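The checks above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the toy records, the column names, and the 1.5 × IQR outlier rule are choices made for the example, and on a real project you would more likely reach for pandas.

```python
from collections import Counter
from statistics import quantiles

# Hypothetical toy data: column names and values are invented for illustration.
raw_rows = [
    {"Order ID": "1", "Order Amount": "20.5", "Churned": "no"},
    {"Order ID": "2", "Order Amount": "22.0", "Churned": "no"},
    {"Order ID": "3", "Order Amount": "", "Churned": "no"},
    {"Order ID": "4", "Order Amount": "19.0", "Churned": "yes"},
    {"Order ID": "5", "Order Amount": "950.0", "Churned": "no"},
    {"Order ID": "6", "Order Amount": "21.5", "Churned": "no"},
    {"Order ID": "7", "Order Amount": "18.0", "Churned": "no"},
    {"Order ID": "8", "Order Amount": "23.0", "Churned": "no"},
]


def clean_name(name):
    """Turn a header like 'Order Amount' into 'order_amount'."""
    return name.strip().lower().replace(" ", "_")


def to_float(text):
    """Convert text to a float, using None for empty (missing) values."""
    text = text.strip()
    return float(text) if text else None


# Reformat variable names and types.
rows = []
for raw in raw_rows:
    row = {clean_name(key): value for key, value in raw.items()}
    row["order_id"] = int(row["order_id"])
    row["order_amount"] = to_float(row["order_amount"])
    rows.append(row)

# Identify missing values.
missing_amounts = sum(1 for r in rows if r["order_amount"] is None)

# Detect outliers with the 1.5 * IQR rule (one common rule of thumb).
amounts = [r["order_amount"] for r in rows if r["order_amount"] is not None]
q1, _, q3 = quantiles(amounts, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [a for a in amounts if a < low or a > high]

# Check class balance for the label column.
class_counts = Counter(r["churned"] for r in rows)

print("missing amounts:", missing_amounts)
print("outliers:", outliers)
print("class counts:", dict(class_counts))
```

On this toy data the 950.0 order stands out as an outlier, one amount is missing, and the "yes" class is heavily outnumbered, which is exactly the kind of thing you want to know before you start cooking.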

To understand EDA and data cleaning better, I would recommend reading more about:

Step 3: Project techniques

This is the “cooking” stage, when you select the right method to cook. Is this a stew or a grilled dish? The decision is based mainly on the purpose of the dish (when you shopped, you planned to make dish X). However, sometimes you can be more flexible based on the ingredients you got. For example, when half of the veggies are spoiled, you can borrow some from a neighbour or combine them with other veggies you have in the fridge. Likewise, when you lack data to back up your analysis, you can transform the existing data or bring in other market research data.

Some examples of technique selection:

  • Business analysis: Clustering. Even if you only use Excel, SQL, or Power BI, the most important thing is to be clear on the methodology you are applying.
  • Recommendation: Collaborative Filtering
  • Predictions: Classification/Regression model
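As one small illustration of the prediction case, here is a sketch of a simple linear regression fitted by ordinary least squares. The variable names and numbers are hypothetical, and the data is deliberately perfectly linear so the result is easy to check by hand; real data will be noisier and usually calls for a library such as scikit-learn.

```python
from statistics import mean

# Hypothetical toy data: monthly ad spend vs. sales, following sales = 2 * spend + 1.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

x_bar, y_bar = mean(ad_spend), mean(sales)

# Ordinary least squares for one feature:
# slope = covariance(x, y) / variance(x), intercept = y_bar - slope * x_bar
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(ad_spend, sales)) / sum(
    (x - x_bar) ** 2 for x in ad_spend
)
intercept = y_bar - slope * x_bar


def predict(spend):
    """Predict sales for a given ad spend using the fitted line."""
    return intercept + slope * spend


print(slope, intercept, predict(6.0))
```

Whatever tool you use, being able to explain the method in two lines like the comment above is what shows you are clear on the methodology.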

And so on! Some examples of projects with good technique:

Step 4: Insights and recommendations

I love this the most: “How do you serve your food?”

Many people underestimate this stage because they trust the quality of the cooking to speak for itself. However, I believe that if you serve Phở by putting the noodles, soup, and meat separately in different bowls, the dish is not the same anymore. Sometimes it even confuses people about how to eat it.

In comparison, when you present your results, the key question is: “SO WHAT?”

One of the greatest pieces of advice from my former boss is about presenting your analysis:

“Focus on the storytelling and data insights, rather than the methods”

Some examples of good analysis for your inspiration:

There is no right or wrong way to start learning something. These tips come from my experience. We all have very different approaches. Find the one that suits you.

I hope you find these tips helpful.

Data analytics enthusiast | Analytics Leader @Zilch | Ex-data @Curve, @Trip EU (Skyscanner/Travix) @Uber and @Microsoft