Metamorfosa — Transformation of data team in Blibli.com — Part 2

Willy Anderson
Blibli Product Blog
6 min readDec 4, 2018

Sorry for the late article; several things happened so fast that I needed an adjustment period before I could continue writing.

In the previous article, I tried to cover what has happened in the past 2 years, and hopefully it gave you insight into how to lay a foundation for a data team. In this part, I'll cover what we are doing now and what the next options are for us.

The current state

At last we arrive at where we are right now. It has been a short but challenging (and somewhat fun) journey. Now, after all the things we have done, it's time for us to move to the next step.

Guess where we are right now?

I love to use the image above to explain the steps of data analytics. The first layer corresponds to our first stage, and the second and third layers correspond to the second stage. So right now, we are at the fourth layer.

This stage is where the tech team has the chance to contribute more on the business side. We have the data, we know what data the users use, and we know how they use it; now it's time for us to play with the data.

Here, the team scans the data and tries to find out what features and insights they can give to the company, and for that we need to do something first.

We need to free up some time.

So we go back to the planning board and try to find the thing that takes up our time and yet can be discarded. Can you guess what that is?

If your answer is system maintenance, then congratulations, you've earned yourself a cookie.

More than 60% of the team's time is spent maintaining the system: checking whether something has happened to the server, making sure the ETL jobs are working properly, investigating why we didn't get any alert when an error happened, estimating what hardware preparation is needed for data growth, and so on.

Hence, if we can cut this down, the team will have more time to explore, and the cloud seems to be the answer.

On to the cloud we go

I won't talk too much about what we did when migrating to the cloud, as that could become a whole new article by itself, but I can say it seems to have worked, with quite a big impact.

With the "extra time" the team gained, they could play around and create a prediction model. All fine and good, until we found two issues:

The chicken and egg problem

Implementing predictive analytics is never easy in the beginning, for several reasons: the users, the timing, the effort, and so on. Below is a list of the issues that came up when I tried to implement it.

  • The users don't know what predictive analytics can do

This is a standard issue whenever we try to involve users in a brainstorming session. If the users have no creative bone, they will not be able to think of how predictive analytics can help them.

On the other hand, users may be too creative, asking for impossible things such as:

"If a customer buys a toothbrush, can we know whether they are single or not? And if they are, when will they get married?"

Note that the question above is just an illustration; I've never encountered it (yet).

  • Doubting the accuracy of predictive analytics

As the name implies, predictive analytics never talks about absolute results. It can have a high confidence level but never a definite result. Thus users typically doubt whether it is worth the time and effort for something that is only a "possibility".

Because of these two issues, getting the chance to implement a predictive analytics solution is hard. The best way for us to gain the users' confidence is by doing a POC for them.

Here's the catch: building a predictive analytics model is neither easy nor fast, so we would be spending a lot of effort on something that may or may not be used, and we would prefer that the users be committed to implementing the solution. Hence the question comes up:

Do we build the model first or find the users first?

The paradox

For most other approaches, it is quite easy to measure a performance metric, but not for predictive analysis.

Let's use the example of churn prediction.

Assume we can predict, with an 80% confidence level, which customers will leave our services in the next 3 months. We then generate the customer list and give the report to the CRM team (let's assume there are 100 customers on the list). Theoretically, within the next 3 months, 80 people from that list will drop out of our services.

Now the CRM team (theoretically) should not sit silent and do nothing; they will try their best to prevent those customers from leaving.

Then 3 months pass, and we see that only 35 people dropped while the remaining 65 stayed with us, which means only a 35% drop-out rate (compared to the 80% that was predicted).

This would mean our model's confidence level should not be 80% but 35%, yet the lower drop rate can be attributed to the steps the CRM team took.
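To make the paradox concrete, here is a minimal sketch of the arithmetic in the churn example above. The numbers are the illustrative ones from this article, not real figures, and the variable names are my own:

```python
# Hypothetical numbers from the churn example; purely illustrative.
flagged_customers = 100    # customers the model flagged as likely to churn
model_confidence = 0.80    # the model's stated confidence level
actual_churners = 35       # customers who actually left after 3 months

# What the model "promised" vs. what we observed.
expected_churners = flagged_customers * model_confidence   # 80 customers
observed_rate = actual_churners / flagged_customers        # 0.35

print(f"Expected churners: {expected_churners:.0f}")   # 80
print(f"Observed churn rate: {observed_rate:.0%}")     # 35%

# The gap between 80% and 35% is ambiguous on its own: it could mean the
# model underperformed, or that the CRM team's retention actions prevented
# exactly the churn the model predicted. Raw outcomes cannot distinguish
# the two without a control group of flagged customers left untouched.
```

This is why a naive "wait and measure" evaluation cannot, by itself, tell a failing model apart from a successful intervention.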

Thus it raises the question:

Did our model fail to work, or did it work and help the CRM team perform better?

Of course, the easiest way is to generate the model, then wait for 3 months and measure the drop rate; and to get a better model we would probably need to do this for two or three cycles…

Are you seriously considering that?

Back to the PM approach

Again, this episode shows how easy it is to forget basic PM knowledge once you are too immersed in the product. I forgot one of the golden rules of PM:

You build a product for your users; you don't find users for your product.

We had been approaching the problem backward, and this was the issue we had to fix.

Rather than asking, "How can predictive analysis help you?"

We should ask the very basic question: "What is the biggest issue you have that relates to data or to predicting something?"

Once we changed the approach, it became easier for us to find potential implementations of predictive analysis, as we were trying to fix issues users actually face; and once we manage to prove ourselves, the next step will be easier.

How about the paradox? Well, that can also be fixed with a basic PM principle normally known as customer focus. I forgot that the users of the model are the CRM team, not our customers; hence, what we should focus on is not "the accuracy of our confidence level" but rather "why did the 35% still drop out?".

We should use the result to enhance our model and build a better one for the CRM team, one that gives them a list of customers to work on and results in a higher conversion rate.

These two issues taught me a very good lesson: focus. It is very easy to be so distracted by other things that you forget the main reason you wanted to build the product in the first place: the users. Remember them, because a great product is nothing without users.

What's next?

OK, so what's next for the analytics team?

I believe the sky is the limit. There is so much to do and so many things we can explore that I think this question has no wrong answers. As long as we remember the lessons in this article (customer first, and focus), then I trust the data team will be able to define the next step.

It has been quite a roller-coaster journey over these three years, and I hope the data team has enjoyed the ride and looks forward to an even more exciting ride in the future.
