EpiK Protocol AMA with DataUnion

A Recap of AMA Q&A

EpiK Protocol
Mar 26, 2021


Guest: Robin, founder of DataUnion.app. He specializes in blockchain, machine learning, entrepreneurship, and IoT, and is also an Ocean Protocol ambassador.

Host: Xiaoxia

Topic: The similarities and differences between Knowledge Mainland and DataUnion.app, both of which use crowdsourcing and a token economy to incentivize data contribution.

How do you see the data economy?

Robin: From my point of view the data economy is still in its infancy.

At the moment, centralized data exchanges dominate the space, and access to datasets depends on these companies. Examples of such exchanges are AWS Data Exchange and Data Republic.

Many types of data cannot be sold as they contain information that cannot be shared e.g. medical data.

The data that is created by users on the internet is harvested by large corporations for their own profit e.g. Facebook or Google.

But we are in a unique situation: projects are now starting that question this status quo and want to challenge the existing models for data creation, reward, and sales, e.g. Ocean Protocol, EpiK Protocol, or DataUnion.app.

And I am really looking forward to governing that space with all these new projects that are forming right now! Together we will be strong and create a new data economy!

Host: I’m with you! In China, we have observed the same problem. Data is controlled by data giants such as Alibaba, Tencent, and Baidu. We found that the existing data market lacks an open database, which holds back the development of the AI industry. As for changing the status quo, we have the development of blockchain to thank, which has made new mechanisms of data distribution possible!

Could you briefly introduce DataUnion.app?

Robin: DataUnion.app is a crowdsourced data platform that creates the tools for participants around the globe to source, annotate, and verify data to be sold to customers.

Our motto is: Giving the power and profit of data back to the people that create it. Get rewards in datatokens and shares in the dataset.

To do that we launched a liquidity pool on the Ocean marketplace and are using these datatokens to reward our contributors.

We are in the process of creating the tools so contributors can help create these datasets. And we are starting with image data as we have the most experience with this type of data.

The data will be sold via a portal/marketplace that is based on our liquidity pool.

Think about it like a photo gallery where customers can select images to train machine learning algorithms on. By using Ocean Protocol’s compute-to-data technology, we can do that without ever having to send the data to our customers, but the customers can still get the algorithms/models that they ordered.

Host: It’s very interesting that you started with image data, because we have ideas of crowdsourcing data based on ImageNet by Li Feifei.

Could you share with us your experience with image data?

Robin: My experience with image data comes from my work in the automotive industry. I have built data pipelines to verify sensor data for new car models by major brands like BMW, VW, or Mercedes.

These companies spend millions of Euros to create data, annotate it, etc. but in the end, they do not reuse the data.

This was always very strange to me and with DataUnion.app we want to change this “waste” of data.

Host: Very impressive work experience! Thank you for sharing with us! We see the same problem with industrial data. For example, a German pharmaceutical company approached us. They would like to build an annotation platform; they purchased a lot of data but used it only once, and only for their own purposes.

Robin: Yes, this spans all industries and is a general opportunity for companies to profit from their already existing data.

Host: Of course, they are the data purchaser. However, if we could reuse the data and use it more efficiently, more and more companies would benefit from it. So what we at EpiK Protocol are trying to do is build an open knowledge graph platform that stores all the KG data and updates it in real time. By doing so, we are removing burdens for SMEs that are trying to access KG data but have few resources.

That brings us to our topic of token incentives. Would you share your thoughts on this?

Robin: I think that people around the world are interested in bolstering their income with digital jobs. Centralized platforms are already building such networks, but we think there is a need for decentralized systems that also give contributors ownership of the part they contributed.

This introduces the new paradigm of the ownership economy — the participants in the data economy are also directly co-owners of it.

This becomes possible through token incentives, as they are not subject to the restrictions that normal money has.

Another important aspect is that we do not even have to know who is contributing. We can abstract from that via blockchain technology (e.g. MetaMask).

Host: I love how you put it “the participants in the data economy are also directly co-owners of it”

Robin: Yes, this is very important for us. We want contributors to have a long-term incentive not only to contribute but also to keep their datatokens. If we create algorithms on top of the data, we also want to give shares of the new datatokens to the contributors, so that everyone on the planet can benefit from AI in the long term, as we see this as one of the major forces of automation in the future. Right now only the big companies are benefiting from that. And that has to change!
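As a rough illustration of the co-ownership idea, revenue from a dataset could be split in proportion to the datatokens each contributor holds. This is only a minimal sketch under that assumption: the AMA does not describe DataUnion.app's actual payout rules, and the function name `split_revenue` and the balances are hypothetical.

```python
from typing import Dict


def split_revenue(revenue: float, holdings: Dict[str, float]) -> Dict[str, float]:
    """Split revenue pro rata by each contributor's datatoken balance."""
    total = sum(holdings.values())
    if total <= 0:
        raise ValueError("no datatokens outstanding")
    return {who: revenue * amount / total for who, amount in holdings.items()}


# Hypothetical balances: Alice holds 3 datatokens, Bob holds 1.
payouts = split_revenue(100.0, {"alice": 3, "bob": 1})
print(payouts)  # {'alice': 75.0, 'bob': 25.0}
```

Under such a rule, a contributor who keeps their datatokens keeps their share of future sales, which is the long-term incentive described above.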

Host: We have mentors from OpenKG and Knowledge Factory, both noted open knowledge bases in China. Based on their experience, we found that the existing data market lacks an efficient incentive mechanism. And we have seen the power of crowdsourcing in the case of ImageNet. If we could leverage everyone’s leisure time, building something great is not impossible!

Robin: There is a crowdsourcing data app by Google without an incentive mechanism and another one by Microsoft that does pay contributors, but neither works with cryptocurrencies.

We see their interest as a strong indicator that we are on the right path, and with our innovation we can show them what is possible.

Host: The reason we believe in blockchain is that it truly lets everyone earn rewards through contribution. You can participate regardless of your location or experience; with only a laptop, the world is all yours. In underdeveloped cities of China, young people have a hard time finding stable jobs, while data annotation is something they can do and are happy to do. Governments and institutions are pushing the campaign forward. By joining the crowdsourced data annotation community, they not only get paid but can also learn new things, funded by the government.

Robin: That is awesome to hear! We are also looking into joining funding programs of the European Union in this direction.

And perhaps, Robin, you could enlighten us on the status quo of the data market in Europe?

Robin: The existing data market is in the hands of centralized data exchanges, and there are major problems with that. The European Union is in conflict with the large U.S. companies because they take the data produced by European citizens and use it for their own profit in the U.S.

At the moment there are major movements to change all of that. One important project in this direction is Gaia-X where Ocean Protocol is a member.

DataUnion.app is looking into how it can become part of that. Gaia-X will form a European data cloud with attached data markets.

There is huge demand, but at the moment the space is not in a state where it can fulfill that demand efficiently.

Companies always create their own data for their specific use case, then annotate and verify it. They then train their own models and use them, but the data itself cannot be reused or resold efficiently.

That is what we are working on changing.

Host: In China, data policy is also tight: data export is strictly prohibited. That’s why we believe in the crypto world, where everyone from every corner of the world gets to collaborate. In China, the economic growth driven by data and knowledge graphs is huge, up to 100 billion CNY in 2024. We are both on the right track!

Robin: This will be an exciting challenge to solve — how could we make it possible to unite China’s policies with Europe’s policies and make sure that everyone can benefit?

Speaking of Ocean Protocol, one of the leaders in the data economy: we all have a lot to learn from them, right?

Robin: I have been involved with Ocean Protocol since 2018 and have also been in talks with their team for a long time. I was even fortunate enough to visit their headquarters in Berlin and meet the team personally. Thank you so much, Ocean Protocol, for making this possible!

With their technology, they are solving issues that no other project is solving at the moment, especially around the pricing of data, which is very difficult. Ocean solves this via liquidity pools. This also takes care of the curation of data, i.e. how useful it is.
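To give a feel for how a liquidity pool can set a price without anyone fixing it by hand, here is a minimal sketch using the simple constant-product rule (reserve A times reserve B stays constant) as a stand-in. This is an assumption for illustration only: the exact pool formula, weights, and fees that Ocean uses are not described in this AMA, and the class name, method names, and reserve numbers are hypothetical.

```python
class ConstantProductPool:
    """Toy pool holding OCEAN and one dataset's datatokens (no fees)."""

    def __init__(self, ocean_reserve: float, datatoken_reserve: float) -> None:
        self.ocean_reserve = ocean_reserve
        self.datatoken_reserve = datatoken_reserve

    def spot_price(self) -> float:
        """Current price of one datatoken, in OCEAN."""
        return self.ocean_reserve / self.datatoken_reserve

    def buy_datatokens(self, amount_out: float) -> float:
        """Take amount_out datatokens from the pool; return the OCEAN cost."""
        if not 0 < amount_out < self.datatoken_reserve:
            raise ValueError("amount must be positive and below the reserve")
        # Keep ocean_reserve * datatoken_reserve unchanged by the trade.
        cost = self.ocean_reserve * amount_out / (self.datatoken_reserve - amount_out)
        self.ocean_reserve += cost
        self.datatoken_reserve -= amount_out
        return cost


pool = ConstantProductPool(ocean_reserve=1000.0, datatoken_reserve=100.0)
print(pool.spot_price())            # price before the purchase
print(pool.buy_datatokens(10.0))    # OCEAN paid for 10 datatokens
print(pool.spot_price())            # price after the purchase is higher
```

In this toy model, demand for a dataset pushes its datatoken price up, which is one way a pool can double as a curation signal.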

With compute-to-data, they tackle the biggest problem — how to sell data without giving the buyer the option to copy the data?

Compute-to-data allows the safe execution of an algorithm on the data and delivers the result back to the buyer. This is earth-shattering for the data economy. It enables everyone to sell all data without having the fear of violating any data laws or the privacy of the data being sold.
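The idea can be sketched in a few lines: the data stays with the provider, the buyer only names an algorithm, and only the result leaves. This is a conceptual toy and not Ocean Protocol's actual API; the class `ComputeToDataProvider`, its `run` method, and the sample values are hypothetical, and a real system would also need to vet algorithms so that a result cannot simply leak the raw data.

```python
from statistics import mean
from typing import Callable, Dict, Sequence, Tuple


class ComputeToDataProvider:
    """Holds private records and runs only pre-approved algorithms on them."""

    def __init__(
        self,
        records: Sequence[float],
        approved: Dict[str, Callable[[Tuple[float, ...]], float]],
    ) -> None:
        self._records = tuple(records)   # never returned to the buyer
        self._approved = dict(approved)  # algorithms the provider has vetted

    def run(self, algorithm_name: str) -> float:
        """Execute an approved algorithm locally and return only its result."""
        if algorithm_name not in self._approved:
            raise PermissionError(f"algorithm {algorithm_name!r} is not approved")
        return self._approved[algorithm_name](self._records)


# Hypothetical private measurements and two vetted aggregate algorithms.
provider = ComputeToDataProvider(
    records=[1.0, 2.0, 3.0, 6.0],
    approved={"mean": mean, "max": max},
)
print(provider.run("mean"))  # 3.0
```

The buyer receives the aggregate but never sees the individual records, which is the property that makes selling sensitive data conceivable.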

And by attaching datatokens to datasets, these datasets can become part of the DeFi space, e.g. lending platforms or other projects.

Additionally, they are developing many more features, like bridges to other blockchains or integrations with other projects. So building on top of their protocol gives DataUnion.app a huge advantage: we are standing on the shoulders of giants!

Host: I got excited when you said that “we are standing on the shoulders of giants”! Yes, we surely are! At EpiK Protocol, we have learned so much from our seniors in the knowledge graph field, from Tsinghua University, Tongji University, and OpenKG to large data companies! They have already gone down the wrong paths for us, so that we can move forward with better plans in mind.

· EpiK APP Download

· EpiK Twitter and Telegram Channel (For the latest news)

· EpiK Telegram Community

· EpiK Reddit, Facebook, and Discord

· EpiK Medium (For the latest articles)

· EpiK GitHub (For the full set of code)

EpiK Protocol

The World’s First Decentralized Protocol for AI Data Construction, Storage and Sharing. https://www.epik-protocol.io/ | https://twitter.com/EpikProtocol