Metamorfosa — The Transformation of the Data Team at Blibli.com — Part 1

Willy Anderson
Blibli Product Blog
7 min read · Oct 30, 2018
From brawn to brain

When I first joined blibli.com, I never expected to be trusted with the data team. Although I used to be a developer, I had no background in things like data modeling, cubes, dimensions, or measures, so it felt like a strange decision (trusting me with the data team) from my point of view. Yet a part of me was captivated by the idea.

So I sat down with my supervisor to find out what the expectations were for me and what the most urgent goals were for this team, and to see whether this challenge was something I could handle. In a nutshell, we wanted all of the data collected into one place so it could be used for analytical purposes.

So here I was: a new hire who had never been involved in big data before, trusted to handle all the data in our company.

Oh boy, it was going to be a fun ride.

The beginning

Now, I had a plethora of things to do once I accepted the “quest”:

  • Increase my knowledge of analytics and big data, both on the technical side and in terms of business potential.
  • Find out and learn what the product vision and roadmap for a data team should look like (since normally, you don’t run a data team as a product team).
  • Find out the current state of the data systems and start planning the roadmap.

And yet, before doing any of that, I needed to do something of the utmost importance: talk to the team and identify the roadblocks.

This was unexplored territory for me — I had never been a PM in an area where I lacked deep knowledge — so I depended entirely on the team for the first step. Luckily, I had a great team that was extremely helpful in bringing me up to speed on what they had done, what they were planning to do, and the most important thing: the obstacles they were facing right now.

We found that the main issue the team was facing was the migration from a monolithic architecture to a microservices architecture (you can read more detail about that here).

Before we get to the issue, let me give a brief description of what the team was doing at the time.

Using basic ETL functionality (we were using Pentaho at the time), the team was gathering data from blibli’s OLTP systems into a centralized database (Postgres, for our data warehouse v1.0), which could then be used for other needs.
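The real jobs were built as Pentaho transformations, but conceptually each one boiled down to an extract-and-load step like the minimal Python sketch below. The table names, connection strings, and the psycopg2-based approach are my own illustrative assumptions, not the team’s actual pipeline.

```python
# Minimal sketch of the kind of ETL step the team ran (illustrative only).
# Table names and connections are made up; the real jobs were Pentaho transformations.
import psycopg2


def sync_orders():
    src = psycopg2.connect("dbname=oltp_orders host=oltp-db user=etl")  # OLTP source (hypothetical)
    dwh = psycopg2.connect("dbname=warehouse host=dwh-db user=etl")     # Postgres DWH v1.0 (hypothetical)

    with src.cursor() as read_cur, dwh.cursor() as write_cur:
        # Extract: pull the last day's orders from the transactional system
        read_cur.execute("""
            SELECT order_id, customer_id, total_amount, created_at
            FROM orders
            WHERE created_at >= now() - interval '1 day'
        """)
        rows = read_cur.fetchall()

        # Load: upsert into the warehouse fact table
        # (assumes a unique constraint on order_id in the warehouse)
        write_cur.executemany("""
            INSERT INTO fact_orders (order_id, customer_id, total_amount, created_at)
            VALUES (%s, %s, %s, %s)
            ON CONFLICT (order_id) DO UPDATE
                SET total_amount = EXCLUDED.total_amount
        """, rows)

    dwh.commit()
    src.close()
    dwh.close()


if __name__ == "__main__":
    sync_orders()
```

The pain point becomes obvious here: every time a source table moved into a new microservice, the extract query, the connection, and sometimes the target schema had to be reworked.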

Now, during the architecture changes, every time a new service was created there was a high chance the team would need to change the old ETL process, as the data source might be changed, moved, edited, or even deleted.

The devil’s circle

This devil’s circle was utterly draining: the team had to keep track of the latest changes, update the ETL jobs they had created, and sometimes even backtrack to fix data that had changed during the migration.

They needed help, and that was where I could contribute.

They needed someone to help communicate and prioritize which data was needed now, and to make sure that data would not be moving in the next 3–6 months.

Unfortunately, I found out that this was not easy at all, for two reasons:

  1. We were still operating while the migration happened. This meant we had to take several detours when creating microservices, so we had to go through the devil’s circle again and again, for the sake of our customers.
  2. We were still finding the shape of our microservices. There is no perfect guide on how to break a monolithic system into microservices; it all depends on how your company works. Because of this, a rinse-and-repeat cycle was inevitable, and we couldn’t be selfish and demand that the IT team lock the architecture design.

Based on this, we decided to do it the hard way: we sacrificed ourselves and entered the circle over and over until things stabilized.

This decision didn’t make the team’s life easier, nor was it a great choice on my part, but it was the most cost-effective one.

This was the first phase of our effort, which I call: The brawn.

We just slogged through it, repeating whatever work we needed to do and opening all communication channels so we could respond fast.

It was hard work, primitive and annoying, and yet the best possible option we had.

The middle part

This part of our journey started after things stabilized enough for us to move on to the next big thing… Yeah, no.

I failed to pull the team away from one of the deadliest traps a product team can ever step into: playing with new technology, AKA the shiny new toys.

The team and I decided to change our technology stack. Now, when people change their underlying technology, there are a few common reasons:

  1. There are bugs or issues that can’t be handled with the current technology.
  2. The old technology is obsolete: there is no more support, the community is dead, or the company behind the technology has gone bankrupt.
  3. The cost is too high.
  4. Preparation for what will happen in the future (scale, speed, and so on).

Number four was the reason we changed our tech.

The team identified that with the current system we wouldn’t be able to scale as fast or as big as we needed. That’s why we needed to find better tech to support us, and the winners were Greenplum (as the data warehouse) and Airflow (for the ETL process).
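To make the shift concrete, here is a minimal Airflow DAG sketch of the kind of scheduled load that replaces a Pentaho job. The dag_id, task, and load function are hypothetical placeholders; only the overall shape (a scheduled, retryable task expressed in code) reflects the actual move to Airflow and Greenplum.

```python
# Minimal Airflow DAG sketch (dag_id, task, and load logic are hypothetical placeholders).
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def load_orders_to_greenplum(**context):
    # Placeholder for the real extract/load step, e.g. pulling from an OLTP
    # service and writing into a Greenplum fact table.
    print("loading orders for", context["ds"])


with DAG(
    dag_id="orders_to_greenplum",
    start_date=datetime(2018, 1, 1),
    schedule_interval="@daily",  # one-day data lag to start with
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    PythonOperator(
        task_id="load_orders",
        python_callable=load_orders_to_greenplum,
    )
```

In a sketch like this, cutting the data lag looks as easy as switching the schedule from @daily to @hourly; as the rest of the story shows, the real change was far less trivial.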

Now, everything was fine up to this point, except that when you do something for reason number four, it is hard to justify. It is only a prediction: are you sure we need to revamp the whole thing? How much effort will be “wasted”? We could be building features that bring more immediate benefit.

And this is where we blundered. We tried to prove that the new technology could deliver something the old one couldn’t.

We made the decision to reduce our data lag from one day to one hour.

Big mistake.

If we had made that decision based on our customers’ needs or a technical consideration, that would still have been okay, but we made it to “show off”, and that is a big NO.

None of our users were actually helped by the reduced data lag, although at first it seemed they were. One-hour-delayed data is what I categorize as “niche” data: people look at it, but nothing fruitful can be done with it.

Things would still have been alright if we had been able to handle the changes without problems, but we encountered multiple issues in the implementation, and we didn’t know which one to fix first because the users themselves didn’t need the improvement.

So, what is a PM supposed to do in this position?

They need to make decisions:

  1. Do we continue with the changes?
  2. What do the users actually need?

Question number one was clear: we needed to stop the enhancement and roll it back to remove all the issues that had appeared.

Question number two, however, needed more effort. To cut a long story short (feel free to message me for the long story), I talked — a lot — to the users and found out how they use data. I then categorized their data needs into three:

  • Real-time data for operational purposes
  • Delayed data for analysis
  • Summarized data for decision making

But getting this information was not easy. I had to talk to multiple people, shadow them, and sometimes persuade them that they needed to change. Some users had to change how they read data; others I had to cajole into accepting that they didn’t need detail-level data and that a summary was enough. I even had to convince users that, for analysis, you mostly don’t need real-time data.

Right there, I understood why a data team needs a PM.

The technical team is busy finding the best practices for handling data (scalability, data cleansing, data validation, etc.) and keeping the system working around the clock, so they need somebody who interacts with users and comes back with the conclusion “this is the kind of data our users need”, and who can also tell users “this is how you should use the data”.

They need a PM to humanize the data

A PM is needed to make sure the data we store has meaning for users, because at the end of the day, data comes from one user and will be consumed by another.

This is the second phase, which I dub: The reconstruction.

Here we tried to understand how the company uses data, challenged it, and suggested the best practices we could find.

We also needed to develop multiple systems to support the users, whether a visualization app, reporting enhancements, or even fixing the SOPs to reduce duplication in data processing.

Based on popular feedback, the article feels too long to read in one go, so I’ll split it into two parts (the second part will come next week — hopefully). Feel free to comment and share if you think this can be helpful for others, and see you in the second part of the article.
