Diversity in Blockchain Series #3: Jensen Yap, The issue of diversity is now inevitable.

Community at Klaytn
Published in Klaytn
7 min read · Dec 2, 2019

Dear readers,

We would like to introduce our data engineer, Jensen Yap, as the second interviewee for the [Diversity in Blockchain] series. In this interview, he shares diverse perspectives on data in blockchain. Please enjoy reading his story and his career journey into the blockchain industry.

Jensen at Tableau Conference

What is your main job in the data engineering team of Klaytn?

My main job is to set up a stable infrastructure that transfers data from multiple sources to our data storage. I have to make sure that the collected and transferred data is safe, clean and stable. We are now collecting data from Ethereum and Klaytn, each of which has a mainnet and one or more testnets. After collecting, compiling and saving the data, the data engineering team analyzes, models, experiments with and reports on it, based on safe and clean data sets. It is about laying a solid foundation for data analysis. That is the first step.

What kind of data do you collect and what do these data mean to us?

There are two types: block data and trace data. Block data accumulates every time a block is created and transactions occur in it. From it, we can also understand what happened in those transactions, such as tokens being created or transferred. However, block data does not show how one block is connected to another. That is why we need to look at trace data as well. With both types of data, we can infer the user journey: where users start, where they are going, and where they are now.
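As a rough illustration of combining the two data types, the sketch below merges block-level rows and trace rows into one ordered journey for a single address. It is not Klaytn's actual pipeline: the `Transfer` record, its field names, and the `user_journey` function are all hypothetical, and it assumes that trace rows record internal calls made inside a transaction, as in Ethereum-style tracing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    # Hypothetical, simplified schema; real block and trace tables have more fields.
    block_number: int
    tx_hash: str
    sender: str
    receiver: str
    source: str  # "block" for a top-level transaction, "trace" for an internal call


def user_journey(address, block_rows, trace_rows):
    """Return every row touching `address`, ordered by position in the chain."""
    rows = [
        r for r in list(block_rows) + list(trace_rows)
        if address in (r.sender, r.receiver)
    ]
    # Within one transaction, "block" sorts before "trace", so the top-level
    # transaction is listed ahead of the internal calls it triggered.
    return sorted(rows, key=lambda r: (r.block_number, r.tx_hash, r.source))


block_rows = [
    Transfer(10, "0xa", "alice", "dex", "block"),
    Transfer(12, "0xb", "bob", "alice", "block"),
]
trace_rows = [
    Transfer(10, "0xa", "dex", "alice", "trace"),
    Transfer(11, "0xc", "dex", "carol", "trace"),
]
journey = user_journey("alice", block_rows, trace_rows)
```

Here the journey for `alice` would contain her transaction in block 10, the internal call it triggered, and the later transfer she received in block 12, while the unrelated row in block 11 is left out.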

The primary reason we look at data from Ethereum and other platforms is that we can find patterns in Ethereum’s historical data and see how certain activities start and go viral. After learning from similar cases, we can apply the same model to our platform and find ways to improve based on the comparison. Examples include our governance models and the policies for operating the platform. We can also identify and assess user behavior that abuses and harms the ecosystem of our platform.

If you have any lessons learned from your analysis, could you share them with us?

According to our analysis, transactions on a certain platform are being created every two seconds. That is not something a human can do consistently. We can assume that users are running bots that automatically create self-transactions at a fast pace. This activity may come from users trying to populate blocks and earn more fees, and it is not a healthy signal for our ecosystem. As each platform is establishing policies to deal with abuse, the past data of other platforms can be a good source for improving our chain and setting up appropriate policies within it.
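One simple way to surface this kind of pattern is to look at the time gaps between consecutive transactions from the same sender. The sketch below is only an assumption about how such a check could look, not the team's actual method: the function name `flag_suspected_bots`, the input shape, and the thresholds (a median gap of two seconds or less over at least five transactions) are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def flag_suspected_bots(transactions, max_median_gap=2.0, min_tx_count=5):
    """Flag senders whose transactions arrive at a machine-like pace.

    `transactions` is an iterable of (sender, receiver, timestamp_seconds).
    Returns {sender: median gap in seconds} for flagged senders only.
    The thresholds are illustrative assumptions, not tuned values.
    """
    times_by_sender = defaultdict(list)
    for sender, _receiver, timestamp in transactions:
        times_by_sender[sender].append(timestamp)

    flagged = {}
    for sender, times in times_by_sender.items():
        if len(times) < min_tx_count:
            continue  # too few transactions to judge the pace
        times.sort()
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        median_gap = median(gaps)
        if median_gap <= max_median_gap:
            flagged[sender] = median_gap
    return flagged


sample = [("0xbot", "0xbot", t) for t in (0, 2, 4, 6, 8, 10)]
sample += [("0xhuman", "0xshop", t) for t in (0, 300, 900, 4000, 9000)]
suspects = flag_suspected_bots(sample)
```

The median is used rather than the mean so that one long pause does not hide an otherwise steady two-second rhythm. A fuller check might also consider how many of a sender's transactions are sent to itself.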

You have worked for Coupang and Korbit, and are currently working for Klaytn as a data engineer. Could you explain your career journey from Coupang to the blockchain industry? What brought you here?

It was right after the coin market went up. I invested in coins at that time. I wanted to know why the price went up and down, but there were no solid facts to convince me. As a data engineer, I wanted to bring a ‘data’ perspective to a new technology, something that no one was doing well.

After moving from Coupang to Korbit, I dealt with trading-related data, because Korbit is an exchange that enables trading of blockchain assets. A fairly large share of the users on this exchange were actually robots, and these robots accounted for more than half of the trading volume in and out. I saw how robots influenced the value of a coin. However, maintaining a high trading volume is quite important for an exchange because it shows how healthy the exchange is, even if the transactions are self-made. We could also identify heavy traders and make them stay by giving them exclusive benefits.

In 2014, Mt. Gox, the largest cryptocurrency exchange in the world, was hacked and 850,000 bitcoins were stolen. Chainalysis was the official investigator in the Mt. Gox bankruptcy case. Since then, Chainalysis, whose business is crime-fighting and regulatory compliance, has been contributing to detecting fraud and preventing money laundering through data analysis.

When I heard that news, I was really inspired by what Chainalysis did and realized that it was exactly what I had been looking for. Data can mean more than numbers. It can tell the truth. As far as I know, not many blockchain companies invest money and time in data analytics, but Klaytn is willing to do so and already has many types of data on its chain. That is why I joined.

As you said, data can be an important source for top management’s decision-making process. However, human behavior and traces, which are the basis of all data, may not always be right because humans are not always rational. What do you think about that?

I think the reason most people are not used to data-driven decisions is that our lives are not always data-driven but based on experience, surroundings, and emotion. I do not necessarily want to force people to always be data-driven. But data-driven decisions need to be part of the conversation within a company.

Let’s take the example of a business meeting in which diverse teams are participating. We look at a presentation saying that 40% of the users are robots. Teams may react emotionally when discussing this issue during the meeting, but the data shows us the truth and helps everyone stay on the same page. There may be different ways to interpret the data depending on the team, but what they are looking at is ultimately the same.

What I want to do is to create a healthy culture that empowers everyone in a company to articulate and tackle every decision based on data. That is why our data team created a data lake. Building a single source of truth for data is necessary to help people discuss the same truth within an organization. Even though our analysis can sometimes be wrong or yield no insights, at least we can give everyone enough space to start a conversation, be more objective and offer diverse views and feedback.

What are the next plans for you and your team?

The mission of our team is to support other teams in making data-based decisions. We, as a platform player, want to look into Klaytn and define who Klaytn users are, what they do, how healthy users differ from abusive users, and what kind of activity is happening on our chain.

At present, the reputation of blockchain technology itself is tied to what the exchanges do. We plan to release some articles or papers to present a professional and unbiased view of blockchain. We would like to show that blockchain is not about the price on the exchange but is a genuine technology.

What I personally want to do in terms of data science is to create more open-source tools for analyzing the chain. Analyzing a chain is very expensive because it is very hard to get clean data. We are continuously developing a service that the community can use when they want to analyze our public Klaytn chain data.

As you have studied and worked in Korea for a long time, you have been exposed to a diverse environment. As a person with diverse experience in different contexts, how do you feel about the ‘Diversity’ issue?

I think that the issue of diversity is something we cannot avoid at this time. Technology has been, and is being, advanced in a way that connects people globally. It is getting easier to travel across borders and to get a job in a different country. It means we can be exposed to a diverse environment easily and naturally.

From a business perspective, your service or product cannot survive in a local market alone. To penetrate the global market, we need to understand global users’ needs based on their diversity, such as ethnicity, age, and gender.

On the other hand, big data, AI, and deep learning are considered hot, trending technologies that will change the world, but even now we do not truly understand what they are or how we can fully utilize them in our daily lives. What we need now is to integrate diverse contexts and backgrounds as technology advances.

Blockchain is in a similar situation. It is still at a very early stage, and it is not simple to apply it to real-world services. We should encourage our community to discuss our technology more, from diverse perspectives, and imagine how blockchain can change our lives. This approach can be a healthy starting point if you don’t want blockchain to remain just a concept.
