China’s first trusted circulation transaction of AI data completed for US$1,554
As a key production factor of the AI industry, data is a fundamental resource for AI models and their applications. The industry has long faced expensive data acquisition, labelling and governance; data resources that sit idle after analysis; the high cost of continuous data storage; and the inability to reuse and share data elements. As the AI industry develops further, the “confirmation and registration, credible circulation, controlled process, lower costs and increased efficiency” of labelled data will help AI enterprises reduce data acquisition costs, accelerate the optimization of AI model algorithms, and realize the value of labelled AI data. Data that was previously acquired by one entity at a time for analysis and application can instead be made available to multiple entities, enabling reliable reuse on the basis of “measurement on confirmed and registered data, credible circulation and transaction”.
On January 11, 2022, China’s first trusted circulation transaction of AI data was completed for US$1,554, covering 98770DRs of a voice command recognition dataset. The dataset was sold by EpiK Protocol to Shenzhen Zhongke Lucent Technology Co., Ltd. (“Zhongke”) through the Data Ownership Confirmation and Trusted Circulation Platform. Zhongke’s chip products have been adopted by well-known brands such as Transon, Philips, Lenovo, Audio-Technica, Netease, Aqiyi, and Tmall Genie. The dataset will be used to train the company’s smart headset chip to understand simple voice commands.
The “simple voice command recognition dataset” contains clear voice recordings of 1,411 users reading YES and NO five times each, together with annotations for each recording, including multi-dimensional desensitized information such as gender and region.
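As a rough illustration of how a dataset of this shape could be organized, the sketch below models one recording and estimates the total number of recordings. The field names and the `VoiceSample` type are hypothetical and are not EpiK Protocol's actual schema. The count also assumes that each user reads each of the two words five times, which is only one possible reading of the description above.

```python
from dataclasses import dataclass

# Hypothetical record layout for illustration only; the real schema is not
# described in the announcement.
@dataclass(frozen=True)
class VoiceSample:
    speaker_id: int  # anonymized speaker index
    word: str        # spoken command, "YES" or "NO"
    take: int        # repetition number for this word
    gender: str      # desensitized annotation
    region: str      # desensitized annotation

SPEAKERS = 1411          # number of users stated in the announcement
WORDS = ("YES", "NO")    # commands stated in the announcement
TAKES_PER_WORD = 5       # assumption: each word is read five times per user

def expected_sample_count(speakers: int = SPEAKERS,
                          words: tuple = WORDS,
                          takes: int = TAKES_PER_WORD) -> int:
    """Number of recordings if every user reads every word `takes` times."""
    return speakers * len(words) * takes

if __name__ == "__main__":
    print(expected_sample_count())
```

Under that assumption the dataset would hold 1,411 × 2 × 5 recordings; if "five times each" instead means five readings per user in total, the figure would be correspondingly smaller.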
The data collection and labelling platform EpiK Protocol has developed an original “AI data labelling system”, which enables datasets to be collected and labelled through the joint efforts of domain experts and the ecological community. It has also built an “AI data storage system” on knowledge nodes with blockchain-based distributed storage, which keeps AI datasets low-cost, high-quality and available while ensuring their security and trustworthiness.
EpiK Protocol will continue to register and confirm the ownership of the data it collects and labels through the “Data Ownership Confirmation and Trusted Circulation Platform” and to conduct trusted circulation transactions. In doing so, it supports the efficient and credible reuse of labelled data in the artificial intelligence industry and provides comprehensive solutions for training knowledge graphs across industries with big data needs. The platform has already gathered several domain experts and carried out dataset collection and processing in the fields of medical health, financial funds, intelligent transportation, emotional computing, and multimodal machine learning. In the future, it will enable applications in fields such as smart healthcare robotics, financial risk prediction, autonomous driving, commercial advertising and artificial intelligence training.