Datawallet Phase 1 Roadmap

In this post we break down the components of Datawallet’s permission-based data exchange ecosystem. We highlight the current status of each component, the next iteration we are currently working towards, as well as our further goals and vision for the set of options the community will develop for each.

Evolution of Ecosystem Components

As described in this blog post and our whitepaper, the Datawallet ecosystem is composed of functionally modular pieces (an overview of the overall architecture can be seen in Figure 7 of the whitepaper). The initial versions of the ecosystem will have only one or two choices for each component, but the mature version will offer a breadth of options. Community demand will guide innovation, and the fully specified modular design will allow a range of contributors to write drop-in replacements that augment a component while providing the minimal functionality required of it.

It is important to note that the current roadmap is built upon a number of assumptions, both technological and about community demand and business needs. Community demand for a self-hosted Datawallet has informed our long-term development goals. However, subsequent research and community engagement may point to more desirable solutions for storing, managing, and exchanging data. The technological assumptions include the emergence of scalable implementations of some currently unscalable solutions and answers to open research questions. We have striven to factor this uncertainty into our roadmap, but the pace at which the necessary technologies emerge can affect the proposed timeline. Exogenous circumstances may therefore affect this roadmap, but significant deviations will be clearly communicated and explained to the community.

Based upon the demand from both enterprise customers and the community, continued development may prioritize the components in a different sequence. In fact, we are optimistic that active engagement with the permissioned data ecosystem will reveal use-cases and demand we have not fully considered, which will in turn prioritize components with novel functionality.

Data Provider Tools

These are the tools that allow data providers to source, collate, and store their data and then permission it for direct compensation and services. Datawallet has been developing and refining these tools for years and therefore already has a mature solution in our native mobile apps, data sourcing and ETL pipelines, and cloud storage system. Our mobile apps provide a unified user experience to source and permission data to specific exchanges. As discussed in the stage map, the next steps include enhancing the current ETL pipelines and finalizing the ones still under development. Additionally, we are planning to develop a suite of general-purpose tools that let community members build their own data collection and sourcing tools and integrate them into their Datawallet in a way that is compliant with the general marketplace. One core component, the provider’s data wallet, offers a compelling range of implementations.

Collated Data Profile

This is the data provider’s Datawallet, the core of the ecosystem and the kernel for the permissioned data economy. There are a number of solutions with different benefits and disadvantages that can be classified coarsely along the four dimensions used in the table below: Location, Managed, Structure, and Encrypted. Location is where the wallet is stored, either in the cloud or on a provider-hosted device such as their PC or smartphone, and also the extent to which it is ‘decentralized.’ Managed refers to whether the provider personally manages their wallet (optionally with open-source tools provided by Datawallet or the community) or contracts a third-party service. Structure refers to how the Datawallet itself is organized, either as a normalized relational database or as a denormalized object store (NoSQL). Finally, Encrypted indicates which parts of the data are encrypted and how. While every possible combination of these dimensions is conceivable, practicality dictates some relationships. For example, only a centralized storage solution would leverage the advantages of a normalized relational database structure.
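As a rough illustration, the four dimensions can be written down as a small data structure. This is only a sketch: the names `Host`, `Manager`, `Structure`, `WalletProfile`, and `practicality_warnings` are hypothetical and do not come from any Datawallet code. Location is split into a host and a `decentralized` flag, which is one possible reading of the description above, and the single practicality rule encoded here is the relational-database example from the text.

```python
from dataclasses import dataclass
from enum import Enum


class Host(Enum):
    CLOUD = "cloud"
    PROVIDER_DEVICE = "provider_device"  # the provider's own PC or smartphone


class Manager(Enum):
    PROVIDER = "provider"        # the provider manages the wallet personally
    THIRD_PARTY = "third_party"  # the provider contracts a service


class Structure(Enum):
    RELATIONAL = "relational"      # normalized relational database
    DENORMALIZED = "denormalized"  # denormalized object store (NoSQL)


@dataclass(frozen=True)
class WalletProfile:
    """One point in the Location / Managed / Structure / Encrypted space."""
    host: Host
    decentralized: bool
    managed_by: Manager
    structure: Structure
    # Hypothetical labels for which parts are encrypted, e.g. {"payload"}.
    encrypted_parts: frozenset = frozenset()

    def practicality_warnings(self) -> list:
        """Flag combinations the text describes as impractical."""
        warnings = []
        if self.structure is Structure.RELATIONAL and self.decentralized:
            warnings.append(
                "a normalized relational structure is only advantageous "
                "with centralized storage"
            )
        return warnings


# Illustrative combinations only; these are not rows of Table 1.
centralized = WalletProfile(Host.CLOUD, False, Manager.THIRD_PARTY,
                            Structure.RELATIONAL)
spread_out = WalletProfile(Host.PROVIDER_DEVICE, True, Manager.PROVIDER,
                           Structure.RELATIONAL, frozenset({"payload"}))
print(centralized.practicality_warnings())
print(spread_out.practicality_warnings())
```

Treating the dimensions as independent fields keeps every combination expressible, while the warning method leaves room to record further practical constraints as they become clear.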

In the table we have outlined four Datawallet storage implementations that have compelling advantages. The DW-cloud-hosted version is the most economical and the easiest for the data provider to set up and maintain. It additionally enables easy data viability checks, which make it the easiest source for data developers to design products for. However, it also has considerable drawbacks, including requiring that Datawallet be a trusted arbiter and presenting a more attractive hacking target. A viable 3rd-Party-Decentralized option would mitigate these latter risks, at the cost of usability disadvantages and a potential need to trust the 3rd-party provider. The self-hosted option allows providers to take physical ownership of their Datawallet and obviates trusting external parties, at the cost of assuming responsibility for storage, security, and usability. The DW-app-hosted wallet would provide many of the advantages of the self-hosted option, with improved usability from its integration into the DW app. The disadvantages of this approach include the obvious storage and network constraints of current smartphones and the added development hurdles for interacting with the wallet compared to the self-hosted PC version.

Table 1. Labeled selection of desirable Datawallet storage systems.

We currently offer the DW-cloud-hosted option and are actively refining this component. We plan to develop a prototype of the self-hosted Datawallet in Q4 this year. However, the data-sourcing pipeline presents some challenges: the design choices of many social-media data APIs require some components to remain cloud-based initially. We are in active discussions with a number of decentralized storage projects to foster development of 3rd-Party-Decentralized components. Development of the DW-app-hosted version will depend upon insights gleaned from progress in the data marketplace (e.g., the minimal viable set of datapoints necessary to interact meaningfully with a sufficient proportion of the data contracts) and on the next generation of smartphone specifications.

Data Product Developer Tools

These are the tools that help data developers (organizations, ML/AI engineers, data scientists, and businesses) develop products and services on providers’ permissioned data. Currently, DX Insights and our personal-insights suite are the only products developed on providers’ permissioned data. The next step, currently underway, is to refine and extensively document our internal data API and make it public to data developers. In parallel, we are designing a web-based developer studio that allows data developers to specify their data request (as described in the whitepaper) and post an exchange on our in-app data marketplace. Data providers who consent to the exchange will grant the data developer an access token to the specified routes of the data API. We are striving to invite the first batch of data developers to the platform in Q3 this year.

Further steps are to make these exchanges smart-contract mediated, obviating our current role as a trusted market-maker. We are currently developing prototypes of encrypted data exchange smart contracts in Solidity. There are a number of economic and technological challenges in making this a scalable and viable solution for micro-data transactions. Even with the emergence of viable off-chain payment projects, the current cost of smart-contract deployment and function calls puts a floor on the value of assets that can be profitably exchanged via smart contract. We are therefore striving to develop the core contracts and ecosystem while remaining chain-agnostic. We plan to offer DW-mediated and smart-contract-mediated exchanges side by side, allowing the marketplace of data developers and data providers to decide when the assurance (and additional friction) of the smart-contract solution is warranted.
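The consent step described in this section, in which a provider grants a data developer an access token limited to specified routes of the data API, might look roughly like the sketch below. Everything here is an assumption made for illustration: the class names, the in-memory registry, and the example route strings are hypothetical and do not describe Datawallet’s actual API or token format.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessGrant:
    """A provider's consent to one exchange (hypothetical structure)."""
    token: str
    provider_id: str
    developer_id: str
    routes: frozenset  # the only data API routes this token may read


class GrantRegistry:
    """In-memory stand-in for whatever service would issue and check tokens."""

    def __init__(self):
        self._grants = {}

    def consent(self, provider_id, developer_id, requested_routes):
        # Called when a provider accepts an exchange posted by a developer.
        # The token is scoped to exactly the routes named in the request.
        token = secrets.token_urlsafe(32)
        grant = AccessGrant(token, provider_id, developer_id,
                            frozenset(requested_routes))
        self._grants[token] = grant
        return grant

    def is_allowed(self, token, route):
        # Unknown tokens and routes outside the grant are both rejected.
        grant = self._grants.get(token)
        return grant is not None and route in grant.routes


registry = GrantRegistry()
grant = registry.consent("provider-1", "developer-42", ["/profile/interests"])
print(registry.is_allowed(grant.token, "/profile/interests"))
print(registry.is_allowed(grant.token, "/profile/location"))
```

Scoping each token to the routes named in the data request means that consenting to one exchange does not expose the rest of the provider’s wallet. In the DW-mediated version this check would sit with Datawallet as the trusted market-maker, which is the role a smart-contract-mediated exchange is meant to remove.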

Conclusion

In conclusion, we are striving to build an end-to-end permissioned data exchange in which members can choose the components that best fit their needs from a range of implementations developed both internally at Datawallet and in the broader community. We are hard at work developing and distributing the first implementation of the necessary and sufficient components to provide value to our data providers. Additionally, we will have prototypes of both the locally hosted wallet and the data developer suite by the end of this year to ensure that our developer community can begin to build the user-permissioned data products of the future.