Data to Go: The Value of Data Portability as a Means to Data Liquidity
By Juliet McMurren and Stefaan G. Verhulst
This article was originally published in Data & Policy, the peer-reviewed, open-access venue dedicated to the potential of data science to address important policy challenges.
If data is the “new oil,” why isn’t it flowing? For almost two decades, data management in fields such as government, healthcare, finance, and research has aspired to achieve a state of data liquidity, in which data can be reused where and when it is needed. For the most part, however, this aspiration remains unrealized. The majority of the world’s data continues to stagnate in silos, controlled by data holders and inaccessible to both its subjects and others who could use it to create or improve services, for research, or to solve pressing public problems.
Efforts to increase liquidity have focused on forms of voluntary institutional data sharing such as data pools or other forms of data collaboratives. Although useful, these arrangements can only advance liquidity so far. Because they vest responsibility and control over liquidity in the hands of data holders, their success depends on data holders’ willingness and ability to provide access to their data for the greater good. While that willingness exists in some fields, particularly medical research, it is much less likely where data holders are commercial competitors and data is the source of their competitive advantage. And even where willingness exists, the ability of data holders to share data safely, securely, and interoperably may not. Without a common set of secure, standardized, and interoperable tools and practices, the best that such bottom-up collaboration can achieve is a disconnected patchwork of initiatives, rather than the data liquidity that proponents are seeking.
Data portability is one potential solution to this problem. As enacted in the EU General Data Protection Regulation (2018) and the California Consumer Privacy Act (2018), the right to data portability asserts that individuals have a right to obtain, copy, and reuse their personal data and transfer it between platforms or services. In so doing, it shifts control over data liquidity to data subjects, obliging data holders to release data whether or not it is in their commercial interests to do so. Proponents of data portability argue that, once data is unlocked and free to move between platforms, it can be combined and reused in novel ways and in contexts well beyond those in which it was originally collected, all while enabling greater individual control.
To date, however, arguments for the benefits of the right to data portability have typically failed to connect this rights-based approach with the larger goal of data liquidity and how portability might advance it. This failure to connect these principles and to demonstrate their collective benefits to data subjects, data holders, and society has real-world consequences. Without a clear view of what can be achieved, policymakers are unlikely to develop interventions and incentives to advance liquidity and portability, individuals will not exercise their rights to data portability, and industry will not experiment with use cases and develop the tools and standards needed to make portability and liquidity a reality.
Toward these ends, we have been exploring the current literature on data portability and liquidity, searching for lessons and insights into the benefits that can be unlocked when data liquidity is enabled through the right to data portability. Below we identify some of the greatest potential benefits for society, individuals, and data-holding organizations. These benefits are sometimes in conflict with one another, making the field a contentious one that demands further research on the trade-offs and empirical evidence of impact. In the final section, we also discuss some barriers and challenges to achieving greater data liquidity.
I. Societal Benefits
- Informing policy and programs to respond to global problems. Once data can flow freely and be transferred and reused, it can be used to inform responses to issues like poverty, disease, and environmental threats. Ported data made possible the Green Button initiative, which gives consumers access to their smart meter energy usage data and average consumption for their neighbourhood. Consumers using Green Button reduced their energy use by up to 18 percent, which helped states reduce overall consumption, meet their energy efficiency goals, and optimize their grid sizing. Reuse of this energy data by third parties has the potential to deliver further benefits through the design and delivery of smart appliances that optimize energy use based on grid demand. A similar government data portability initiative in India proposes to help underbanked individuals and small businesses access finance by porting and using their past financial transactions.
- Enabling research and data altruism. Data liquidity through portability also has the potential to be a major enabler of research, allowing individuals to donate data, such as that collected through wearables, to research initiatives, or to transfer their data among projects for reuse. Such initiatives could advance “data altruism,” dramatically expanding the pool of data available to researchers while greatly reducing the cost and time required for some types of research.
- Designing cheaper and more effective services and programs. Improved intelligence derived from ported data could enable consolidated, targeted, and more personalized social services to be delivered at a lower cost to the taxpayer or funder. Integrating health data from multiple providers — as with the US Blue Button initiative — can enable more seamless and convenient services for patients, with lower costs and fewer risks for providers. The aggregated data from this type of initiative can also generate societal insight and public health findings and applications that can help individual users monitor and improve their health.
- Gains to economic growth and productivity. Reducing the costs and friction associated with transferring data, and increasing the ability to combine data sources, could yield productivity and economic gains. Economic growth can be boosted by supply-side benefits enabled by new technologies, infrastructure, capabilities and skills that support data portability, as well as by potential export opportunities for these technologies. Similarly, as data becomes cheaper and more accessible, firms can use more of it, leading to increased productivity, while combining types of data can reduce production costs by allowing a more precisely targeted product. UK estimates suggest productivity and efficiency benefits to GDP from personal data mobility could be worth £27.8 billion (US $36.1 billion).
- Innovation and development of new technologies and industries. When consumers are free to move their data in response to changes in product or pricing, competition is likely to increase. This is particularly true when markets are highly concentrated, as is the case with the tech sector. More competition may in some cases lead to greater innovation and the development of new technologies and markets, both of which can also lead, in turn, to more economic growth and productivity.
II. Individual Benefits
Data portability greatly increases individuals’ agency over their information and their choices as consumers. The resulting benefits to individuals include:
- Empowerment and informational self-determination. Data portability strengthens individuals’ personal autonomy over data, and their capacity to determine what is shared and with whom. Consumers can choose to share data with certain firms over others, responding to differences in pricing, data handling methods, or other practices. In turn, greater autonomy and control is likely to increase transparency and trust in the relationship between data holders and consumers. Data portability may also empower individuals to engage in citizen science or donate their data for other research purposes.
- Increased choice and innovation. As noted, by enabling consumers to switch companies or providers, data portability can create competition in previously uncompetitive markets. In doing so, data portability can also lead to the development of new and previously unimagined products, such as recommendation services for utilities, clothing or entertainment, through recombinant innovation, the combining of data from different sources in new ways. The advent of open banking, for example, opens markets not just for new banks, but also for providers of other services built on financial data, such as price comparison websites, banking infrastructure and interfaces.
- Better services and lower prices. Through improved targeting, reduced switching costs, increased competition, and reduced information asymmetries, data portability can lower prices and result in enhanced service by data holders and providers. In the European Union, for example, the introduction of mobile number portability from the late 1990s led to reductions in mobile market prices of over four percent and cumulative savings equivalent to €880 million per quarter. These benefits can accrue to commercial data subjects as well as individual consumers. A 2019 Arizona statute gives auto dealers the right to authorize third-party software providers to access their data from their dealer management software (DMS) platforms. This decision, which challenges the hold DMS providers have over their client data, offers auto dealers the potential for new services with improved functionality, and at lower cost.
- Convenience. Data liquidity can enable consumers to preserve and move their contacts, content, data, and social graph — the map of connections between that user and other users of the service — when transferring between providers or services. This ability introduces far more convenience in the everyday life of consumers — for instance, when switching platforms or hardware. In the case of health data, the greater convenience that accrues from portability could even be lifesaving in situations where users are unable to give their medical history.
- Security. Data portability can enable users to back up, organize, and archive their data, recover from account hacking or hijacking, and retrieve data from deprecated services at the end of their use life. These capabilities could greatly increase the security of consumer data and activities, ensuring that data stays within and under the control of consumers, and that it can easily be retrieved if stolen, lost, or otherwise damaged.
III. Benefits to Existing and Potential Data Holders
Existing data holders, especially private companies, often see data portability as a competitive threat. It is true that greater data liquidity through portability shifts agency and control from data holders to the consumers who are the data subjects. Equally, it has the potential to increase competition by making data that was once restricted to one data holder available to other players. Nonetheless, organizations holding data also stand to benefit from the data liquidity created by portability if they are willing and able to adapt. Portability has the potential to increase existing data holders’ access to data by enabling them to access additional, previously unavailable sources of data. It can also enable new players to receive data, establish themselves, and become data holders themselves. Among the benefits we would expect to see:
- Opportunities for existing data holders to collaborate by aggregating data to create synergistic products. When data becomes more liquid through portability, existing data holders can benefit by encouraging and facilitating the porting of data to collaborators or a jointly created third party. Through collaboration, they can create platform-like ecosystems of synergistic products and services whose value is greater than members’ individual products or services. Smaller players can also use collaboration to achieve the critical mass of data needed to mount a collective challenge to market leaders. Within the banking sector, for example, this could produce opportunities to collaborate over infrastructure, products, and customer interfaces.
- Opportunities for both existing and potential data holders to innovate to create new products and services. Recombinant innovation has the potential to drive innovation in product and service design and delivery, business models, platforms, ecosystems, technology, and infrastructure. This recombination could involve a mix of in-house data holdings and ported data, in the case of existing data holders, or entirely ported data, in the case of startups. The resulting innovations could include services that were previously impossible or difficult to deliver. A UK study proposed a potential household budgeting use case that would combine supermarket loyalty cards and purchase data to provide insight into spending and recommendations on money management. Similarly, open banking innovations, which allow for portability of consumer data, have permitted third-party providers to deliver novel products and interfaces and combine financial products with other sectors.
- Reduced barriers to market entry for potential data holders. When the costs of switching between providers are high, market leaders enjoy a substantial advantage, since they typically hold the bulk of customer data. However, if consumers are able to port their data, barriers to entry to the market are reduced, creating fresh opportunities for new entrants. Ultimately, lower barriers to entry, greater competition, and more players lead to more data available to be used, with potential benefits for all.
- Improved market intelligence and product validation. As previously noted, increased liquidity through portability has the potential not only to open up previously inaccessible data, but also to create new data sources through reduced barriers to entry. Both trends can improve companies’ awareness of consumer needs, reducing the risk involved in product development and improving targeting and design of products, services, and marketing. Increased competition can also benefit all companies by validating products, forcing companies to identify and demonstrate their competitive advantage and improve the quality of their products, ensuring a more informed market.
IV. Barriers and Challenges
The potential benefits of data liquidity are wide-ranging. Nonetheless, significant barriers to realizing data liquidity at scale through portability remain, even in jurisdictions where an enabling legal and regulatory framework exists. In this section, we discuss some of the barriers and challenges to data portability.
- Demand: Both data subjects and data holders need to be convinced of the value of portability to create demand for the right (in those jurisdictions where portability does not yet exist) and for the data itself. Building demand calls for robust use cases that go beyond the often-cited example of mobile number portability to demonstrate the potential value of greater data liquidity. It also requires the development of the skills, platforms, and capabilities to exercise the right to portability and make use of any data that is released.
- Governance: Critical regulatory, legislative, and policy issues need to be addressed and clarified before greater data liquidity can be achieved. The concept of data portability itself, along with its distinction from related concepts such as data transfer and data interoperability, remains poorly defined, leading to conflicts of definition and application between jurisdictions. In addition, there is potential for conflicts within jurisdictions between data portability rights and privacy and intellectual property laws, particularly around communal or jointly created data. Data portability is also often too narrowly defined, limiting its potential to provide data liquidity. Finally, regulators will need to decide on appropriate liability models for misuse or accidental or intentional exposure of personal data to ensure compliance and consumer trust, and develop portability impact assessments, modelled on privacy impact assessments, to structure discussion about risks and benefits.
- Standards: Realizing true data liquidity will also require the development of standards for portability, interoperability and transfer, which would reduce the costs and risks to data holders and ensure that data could be successfully transferred and reused in different contexts. While mandating standards may be necessary to ensure consistency, there is a risk that government-enforced data standards could stifle innovation by encouraging the continued use of legacy technologies and data formats.
- Roles: Full implementation of data liquidity will require that the roles, rights and responsibilities of all parties are clearly defined. At present, legislative conflict still exists over fundamental issues, such as whether the right of data portability should be extended beyond individuals to organizations. We anticipate that new roles — such as new data intermediaries, data stewards, personal data managers, and data market organizers — will emerge in the data portability and transfer process, and the rights and responsibilities of these various actors will need to be defined. Finally, regulators will need to resolve issues like the length of time data holders are obligated to hold data, and whether data processed by data holders should be included in the right to portability. These limits will need to be weighed carefully against the rights of data subjects, but also against the public good, particularly when dealing with data of high commercial value or research interest.
- Tools: Achieving data liquidity will require the development of technologies and services for safe, secure, and easy management and transfer of data. Data portability creates the potential to multiply the impact of identity theft and fraud, so the security of the transfer process must be a priority if use is to be incentivized. Current tools are also too complex for the average data subject to use. Without a guarantee of security and a frictionless end-to-end experience, neither data subjects nor data holders are likely to want to expose themselves to the risks and frustrations involved in the process of transfer.
- Data quality and quantity: Data portability poses potential problems for the quality and quantity, and consequently the value, of data. Since most models of portability involve transfer at the scale of the individual data subject, the quantities of data being transferred may fall short of the critical mass required to be valuable or useful for aggregated analysis unless the transfer process is appropriately incentivized and made frictionless, or unless processes for collective portability are created that can overcome existing asymmetries. Without standards for data collection, decentralized data provision also has the potential to compromise data quality.
Although the benefits to individuals, organizations holding data, and the public good are real, there is work still to be done in making the public case for data liquidity and in building the structures, tools, processes, and legal framework required to go beyond data portability. As we have suggested above, building awareness among citizens and policymakers is a critical step. To achieve that awareness, we need more robust collections of use cases and insights derived from those cases. In addition, we need a more robust research agenda that recognizes the potential for conflict between the interests and benefits of stakeholders within the portability process and seeks to identify ways to resolve these conflicts.
We have begun this process at the Open Data Policy Lab (an initiative of The GovLab), and we continue to engage with the emerging field of knowledge globally. In particular, we have been building on our own and others’ work on open data and data collaboratives, all of which offer some important insights into what works and what doesn’t. We invite you to join us in this endeavor by sending us examples of successful (or not so successful) liquidity initiatives, as well as insights and lessons learned.
The authors would like to thank David Osimo for his excellent substantive input and Andrew J. Zahuranec at The GovLab for editorial review.
About the authors:
Juliet McMurren is a Senior Fellow at The GovLab.
Stefaan G. Verhulst is Co-Founder and Chief Research and Development Officer of The GovLab.