A Deep-Dive into Deep Knowledge Network (DKN)
In this article, we introduce the design of the Deep Knowledge Network (DKN), a blockchain-based crowdsourced knowledge and reward ecosystem built on a fast, inherently scalable multi-blockchain architecture.
Figure 1 shows the architecture of DKN. The system has several kinds of participants: challenge creators (enterprises or chatbot builders), contributors (domain experts, programmers), and AI service users. In this network, whoever contributes to the system is rewarded with DKN tokens. Contributions can be domain-specific conversational data or responses to specific challenges posted by challenge creators. By combining a knowledge reward system with an AI expert matching system, DKN can become a human knowledge ecosystem that automatically collects and manages domain-specific knowledge.
By creating segments of verified databases annotated with additional information, such as details about the creator, the creator's expert skill level, industry, and feedback, more efficient AI models can be built. This lets AI developers design new algorithms that pick the most relevant data for training a model, which can improve accuracy (by avoiding overfitting and reducing data cleansing) and efficiency (faster training with less GPU energy).
To manage tens of millions of domain-specific conversational records, DKN has a knowledge management system that clusters and organizes this data into a highly structured, scalable knowledge base.
Finally, DKN contains a versatile AI engine that automatically trains various AI tools such as question answering, intent analysis, sentiment analysis and expert recommendation using the constructed knowledge base. These AI models are provided as services for enterprises.
Blockchain-based crowdsourced data collection
DKN is a crowdsourced data collection platform that allows data requesters to post data requirements and receive data from the crowd through chatbots.
The system includes four parties: data requesters, data contributors, domain experts, and data validators. First, a data requester registers a data need by providing data specifications (e.g., domain, format, examples). Chatbot platforms such as Botzup (www.botzup.com, an example of a chatbot builder platform) create chatbots from the specifications and set them up in online communities. A data contributor goes online and chats with these chatbots to earn DKN tokens. The chat data is tokenized and turned into a digital asset, which is registered as part of a domain dataset. The mapping between the asset and its contributor is saved in the blockchain, with the contributor's identity confirmed by digital certificates. Because the blockchain is an immutable ledger, no one can tamper with this data or steal the contributor's work. Placing this mapping in the blockchain also helps contributors prove their identity.
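The registration step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say how a conversation is turned into an asset, so the SHA-256 fingerprint, the `AssetRecord` fields, and the in-memory list standing in for the ledger are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class AssetRecord:
    """Hypothetical ledger entry linking a chat transcript to its contributor."""
    asset_id: str        # SHA-256 fingerprint of the transcript (assumption)
    contributor_id: str  # stands in for the certificate-backed identity
    domain: str          # domain dataset the transcript belongs to


def fingerprint(transcript: list[str]) -> str:
    """Return a deterministic SHA-256 hex digest of a chat transcript."""
    canonical = json.dumps(transcript, ensure_ascii=False, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def register_asset(ledger: list[AssetRecord], transcript: list[str],
                   contributor_id: str, domain: str) -> AssetRecord:
    """Append a record to the ledger, refusing to re-register a known transcript."""
    asset_id = fingerprint(transcript)
    if any(record.asset_id == asset_id for record in ledger):
        raise ValueError("transcript already registered")
    record = AssetRecord(asset_id, contributor_id, domain)
    ledger.append(record)
    return record
```

Because the fingerprint depends only on the transcript, a second party submitting the same conversation is rejected, which is the property the article relies on when it says a contributor's work cannot be stolen.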
Bidding and Auditing
DKN recommends experts to data requesters. Data requesters provide task requirements along with data specifications. Domain experts register on the platform by stating their expertise. The system also builds a profile for each expert based on their history on the platform. The bidding system ranks experts by their expertise scores and automatically recommends them to data requesters.
Contributors build their expert skill profile by participating in more challenges for a specific skill. Each skill has an average score computed from the contributor's previous inputs, along with the number of completed challenges. Each contributor also has a ranking based on the requirements of each challenge. The system manages a distributed database of all experts and their expertise, and this information is stored in the blockchain. When a requester proposes a data requirement, the system performs automatic auditing by matching expert profiles and recommending an appropriate expert to the data requester.
The expert matching feature lets data requesters directly ask experts in a specific domain to fulfill a challenge's requirements while rewarding the contributors.
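As a rough illustration of the ranking step, the sketch below orders contributors for one skill by their average score and uses the number of completed challenges as a tie-breaker. The article does not define the actual scoring formula, so this ordering rule, the `SkillRecord` fields, and the `min_score` threshold are assumptions.

```python
from dataclasses import dataclass


@dataclass
class SkillRecord:
    """Hypothetical per-skill profile entry for one contributor."""
    contributor_id: str
    skill: str
    average_score: float       # mean score of previous inputs for this skill
    completed_challenges: int  # challenges completed for this skill


def recommend_experts(records: list[SkillRecord], skill: str,
                      min_score: float, top_n: int) -> list[str]:
    """Return up to top_n contributor ids for a skill, best first.

    Ordering (an assumption): higher average score first, then more
    completed challenges as a tie-breaker.
    """
    eligible = [r for r in records
                if r.skill == skill and r.average_score >= min_score]
    eligible.sort(key=lambda r: (r.average_score, r.completed_challenges),
                  reverse=True)
    return [r.contributor_id for r in eligible[:top_n]]
```

A requester asking for hospitality experts with a minimum score of 4.0 would receive only the contributors who meet that threshold, with the most experienced listed first among equal scores.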
In addition to contributor matching, DKN supports efficient data selection. When a data contributor provides conversational data, the blockchain system records the identity of the person using digital certificates. The system also creates a profile for each data contributor by assessing their expert skill level, historical contributions, and transaction feedback. The contributor's identity and profile are stored in the blockchain and attached to the collected dataset. This lets AI developers design new algorithms that pick the most relevant data for training a model, which can improve accuracy (by avoiding overfitting and reducing data cleansing) and efficiency (faster training with less GPU energy).
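To make the idea of profile-aware data selection concrete, here is a minimal sketch that filters a dataset by the contributor metadata attached to each record. The `Sample` schema, the 1 to 5 skill scale, and the thresholds are hypothetical; the article only states that contributor profiles accompany the data.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One conversational record with contributor metadata (hypothetical schema)."""
    text: str
    domain: str
    skill_level: int       # contributor's expert skill level, assumed 1 to 5
    feedback_score: float  # average feedback on the contributor's past transactions


def select_training_data(samples: list[Sample], domain: str,
                         min_skill: int, min_feedback: float) -> list[Sample]:
    """Keep only samples from the given domain whose contributor meets both thresholds."""
    return [s for s in samples
            if s.domain == domain
            and s.skill_level >= min_skill
            and s.feedback_score >= min_feedback]
```

Training on the filtered subset rather than the full collection is what would reduce cleansing work and compute, at the cost of discarding data from less established contributors.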
The adoption of blockchain could be a game changer in the way researchers collect and share data. The built-in history auditing of blockchain helps trace transfers of dataset ownership. Introducing blockchain to the conversational data market would eliminate forgery and unauthorized sale of datasets. An interested buyer can view a sample of the data to evaluate its quality and then decide whether to buy it. Buyers can express interest in a particular dataset, and a channel is then opened between the contributor and the buyer where they can negotiate terms and pricing. On successful agreement, the contributor releases the data and transfers its ownership to the buyer, and this change is recorded in the blockchain. All data is classified by domain for ease of access.
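The auditing property described here comes from chaining each record to the one before it. The following sketch shows the general technique with a hash-linked ownership history held in a plain Python list; it is not DKN's implementation, and the entry fields and function names are illustrative.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(body: dict) -> str:
    """Hash an entry body deterministically (keys sorted before serializing)."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_transfer(history: list[dict], dataset_id: str, new_owner: str) -> dict:
    """Append an ownership transfer that links to the previous entry's hash."""
    prev_hash = history[-1]["hash"] if history else GENESIS_HASH
    body = {"dataset_id": dataset_id, "owner": new_owner, "prev_hash": prev_hash}
    entry = dict(body, hash=_entry_hash(body))
    history.append(entry)
    return entry


def verify_history(history: list[dict]) -> bool:
    """Return True only if every entry is unmodified and correctly linked."""
    prev_hash = GENESIS_HASH
    for entry in history:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev_hash or entry["hash"] != _entry_hash(body):
            return False
        prev_hash = entry["hash"]
    return True


def current_owner(history: list[dict]):
    """The owner named in the latest entry, or None for an empty history."""
    return history[-1]["owner"] if history else None
```

Editing any earlier entry changes its hash and breaks the link to every later entry, so a forged ownership claim is detectable by anyone who replays the history.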
In order to encourage the users to contribute more, DKN has several reward programs in place.
Data requesters are given tokens when they sign up for a new account. These DKN tokens let them access data in the network and build AI models using that data.
The trained AI services carry the authorship of the original data contributors, and this authorship information is used to reward contributors. When an AI model reaches a particular accuracy, its developer can share it back to the network and earn additional tokens to access more data in the future; the more their peers use the model, the more tokens they earn. Many business models are possible, including subscriptions, on-demand purchase, and freemium. When a contributor makes a new contribution and it passes the AI screening test, which filters out repeated and unqualified responses, the contributor is awarded the DKN tokens offered by the challenge creator. Moreover, a user in one network can refer a challenge to users in a different network and earn 10% of the tokens as a reward once the referred user has accepted and successfully completed the challenge.
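The referral rule can be written down directly. The sketch below assumes the 10% is computed on the referred user's challenge reward and that token amounts are whole numbers; the article does not specify either point, nor whether the referral is paid on top of the reward or deducted from it.

```python
REFERRAL_RATE_PERCENT = 10  # from the article: referrers earn 10%


def referral_reward(challenge_reward: int, completed: bool) -> int:
    """Tokens owed to the referrer for one referred user.

    Assumptions: the 10% is computed on the referred user's challenge
    reward, amounts are whole tokens (rounded down), and nothing is paid
    unless the referred user accepted and completed the challenge.
    """
    if not completed:
        return 0
    return challenge_reward * REFERRAL_RATE_PERCENT // 100
```

Under these assumptions, referring a user who completes a 100-token challenge yields 10 tokens for the referrer.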
For instance, a hotel concierge manager sets up a conversation challenge to simulate a common scenario in which a hotel guest complains to the concierge staff. The manager defines the expert skill required for each role-play contributor and offers a reward of 100 DKN tokens for each completed contribution, with a total limit of 50,000 DKN tokens for the challenge. The manager expects to collect 500 conversation dialogs through this challenge. The challenge is first routed to the feed pages of the 1,000 contributors (twice the expected number of dialogs) whose expert rankings best match the challenge requirements. If budget remains after 24 hours, the challenge opens to all contributors who meet the expert requirements. Challenges are served on a first-come, first-served basis, and a challenge is marked completed when its defined budget is used up.
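The numbers in this example follow from simple budget arithmetic, sketched below. The `Challenge` class and its method names are illustrative only; the figures (100 tokens per dialog, a 50,000-token budget, and routing to twice the expected number of dialogs) come from the scenario above.

```python
from dataclasses import dataclass


@dataclass
class Challenge:
    """Hypothetical model of a challenge's token budget."""
    reward_per_dialog: int  # tokens paid per completed contribution
    budget: int             # total tokens available for the challenge
    spent: int = 0          # tokens paid out so far

    @property
    def max_dialogs(self) -> int:
        """How many dialogs the budget can pay for."""
        return self.budget // self.reward_per_dialog

    def initial_audience(self, multiplier: int = 2) -> int:
        """Number of top-ranked contributors the challenge is first routed to."""
        return multiplier * self.max_dialogs

    @property
    def completed(self) -> bool:
        """True once the remaining budget cannot cover another reward."""
        return self.budget - self.spent < self.reward_per_dialog

    def pay_next(self) -> bool:
        """Pay the next contributor in arrival order; False if the budget is used up."""
        if self.completed:
            return False
        self.spent += self.reward_per_dialog
        return True
```

With `Challenge(reward_per_dialog=100, budget=50_000)`, `max_dialogs` is 500 and `initial_audience()` is 1,000, matching the scenario; calling `pay_next()` repeatedly succeeds 500 times before the challenge is completed.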
Figure 5 shows the architecture of the blockchain system. The project is a multi-platform blockchain solution that uses the digital tokenization capabilities of Ethereum together with the scalability and high throughput of Hyperledger Fabric. The reward system, an integral module of the application, is built on top of Ethereum and creates an ecosystem of its own around the DKN token. The tokens follow the ERC-20 standard. Data storage is managed by two popular blockchain platforms, Hyperledger Fabric and BigchainDB. Hyperledger Fabric uses channels (a Hyperledger Fabric concept) to keep dealings between DKN token holders and buyers private. The built-in CA server, which handles membership services, establishes the identity of stakeholders by generating certificates. The access control module ensures that unauthorized personnel cannot retrieve data from the blockchain. DKN has its own proprietary blockchain management system that administers deployment of the framework on different servers; in short, enterprises only need to provide physical machines, and the DKN platform handles downloading and clustering itself. The sharing marketplace helps AI developers and buyers share datasets in a more secure and transparent way. DKN also introduces automatic training: because domain-specific data is separated and stored in the blockchain, developers can connect to the blockchain network and use these datasets for a fee.
Participants in the network, such as contributors, validators, and requesters, sign up through Membership Service Providers (MSPs) in the Hyperledger Fabric platform. An MSP abstracts away the cryptographic mechanisms and protocols behind issuing and validating certificates and authenticating users. The business layer exposes the RESTful APIs of the different microservice modules. Each microservice provides a set of functionalities and communicates with Fabric through a Node.js client SDK. The network's endorsement policy requires that data requests be endorsed by data contributor nodes. DKN tokens are used to reward users in the network for data. The token business logic and the management roles are implemented using chaincode and the membership service. Chaincode holds the logic for executing the transactions of the different participants; it runs this logic automatically, without intermediaries, and notifies the requesting participants after execution. Every organization in the network has its own endorsing peers on a default channel, so that all transactions are broadcast to all peers in that channel. The same organization structure is maintained in different zones or regions for resiliency, load balancing, and failover.
Deep Knowledge Network is a team of top-notch scientists and experts aiming to remove the barriers to conversational AI and unlock its full potential. We will explain our solutions and features further in our upcoming whitepaper and articles.