The Great Untangle

Reading the final report of the Australian Productivity Commission on Data Availability and Use

Today the Australian government released the Productivity Commission’s final report on Data Availability and Use, an ambitious plan for data reform in Australia.

An overview and full text of the report (the full text is over 650 pages) are available here.

The Productivity Commission is the Australian Government’s independent advisory body on a range of social, economic and environmental issues, so its recommendations in the final report are simply that: recommendations. It’s now up to the Australian Government to decide how it will respond to and implement them.

Untangling Australia’s data landscape issues (image: Tangle by Peter Zoon CC BY 2.0)

Making Australia’s complex data landscape simpler…in a complicated way

The scope of the Terms of Reference provided to the Productivity Commission by the Australian government, and reflected in the final report, was ambitious. The Productivity Commission roams over issues associated with personal data, open data, restrictions on data sharing, data monetisation, data governance, and enabling greater data integration and use.

It’s a sprawling, ambitious, and at times contradictory final report that will take a few weeks to digest. I’ve put down some of my immediate observations below, based on a close read of the Productivity Commission’s overview of the report (60+ pages!), which includes the key findings and recommendations. Note: I haven’t gotten into the much longer full report yet, so there might be nuance in it that I’m missing.

Quick summary for those who don’t want to get into the detail:

  • A new definition of ‘consumer data’ is proposed, which extends beyond personal data. I see problems with this. ‘Consumer data’ is to be treated separately from ‘Australia’s data’, which includes public sector data, publicly-funded research data and national interest datasets. Hmmmm.
  • Organisations may charge consumers for access to, edits to, and transfers of their ‘consumer data’. The ACCC will oversee this. ‘Consumer data’ includes personal data, so it’s not clear how this recommendation would interact with existing rights under privacy laws (which don’t allow for charging, I don’t think? Update: @MsLods has pointed out that under the Australian Privacy Principles charges for access to personal data are permitted in certain circumstances, but not for correction).
  • There are lots of good recommendations around appointing a National Data Custodian, creating a presumption of openness for non-sensitive datasets and identifying and making accessible National Interest Datasets. There’s some devil in the detail, however, that will require ironing out.

I don’t go into every recommendation. The recommendations cover so much — too much — ground. At its core, that’s my problem with the final report and the terms of reference the Productivity Commission were given. In having so much ground to cover, the Productivity Commission couldn’t help but barely scratch the surface of some of the issues it unearths.

The report speaks to the complexity of the data landscape in Australia. And the Productivity Commission has really pushed the boat out on where reform of our data regime might take us. In some recommendations, however, the desire to simplify in practice only makes the data landscape more complicated.

The final report offers a solid, but by no means conclusive basis for the Australian government to consider modernising data sharing and use in Australia. I think more can be done to empower people with a voice and control over how their data is used.

A comprehensive Consumer Right for Australians

As expected, the final report calls for the introduction of a comprehensive consumer right for Australians to control — to some extent — how digital data that concerns them is accessed and used. I say ‘data that concerns them’ because ‘consumer data’, a new term introduced by the Productivity Commission, envisages the right extending beyond personal data as defined by the Australian Privacy Act. SMEs as well as individuals would be able to avail themselves of the right.

The final conception of the consumer right is…messy. The definition of ‘consumer data’ straddles both privacy and IP laws. I don’t think the introduction of ‘consumer’ data makes how organisations manage data about or related to people any simpler.

The PC envisages that consumers may be charged for access, edits to, and transfers of their data. How any charges for ‘consumer data’ would interact with existing rights and permitted circumstances for charging in relation to personal data is unclear.

The Productivity Commission recommends that what ‘consumer data’ looks like in practice be determined by industry, sector by sector, rather than by government. Whether industries are obliged to consult with consumer representatives and public interest advocates in determining what’s in scope of ‘consumer data’ isn’t explicit. The Australian Competition & Consumer Commission is the recommended entity for oversight and implementation of the consumer right.

At minimum, however, the PC recommends that ‘consumer data’ include:

  • personal information, as defined in the Privacy Act 1988 (Cth), that is in digital form
  • information posted online by the consumer
  • data created from consumers’ online transactions, Internet-connected activity, or digital devices
  • data purchased or obtained from a third party that is about the identified consumer
  • other data associated with transactions or activity that is relevant to the transfer of data to a nominated third party.

This is broad. Data comes in many formats, shapes and sizes. It can be text. It can be numbers. It can be audio, video, images. The PC’s definition expands ‘consumer data’ to non-personal information created by the consumer (which they may have rights over under copyright law), and third-party content. But how the rights envisaged with respect to ‘consumer data’ interact with rights to personal data, data that you own under copyright law and data that someone else might own isn’t really expanded on. There’s also a new definition for derived data called ‘imputed data’.

The comprehensive consumer right does not include a right of deletion, or a right to control *who* has access to data. In expanding the definition of ‘consumer data’ beyond personal data, the comprehensive right looks less like the kinds of data rights being envisaged in other countries. I wrote about the divergence of the PC’s comprehensive consumer right in the draft report from the rights of people under the European Union’s General Data Protection Regulation here.

The PC does recommend that entities selling, transacting or otherwise sharing ‘consumer data’ publish openly online a list of the organisations that have access to that data. This is a positive step forward in requiring openness about how organisations are managing personal data.

The Productivity Commission acknowledges that actually enabling data transfer will be tricky, and across sectors cooperation will be necessary on data standards, formats and data within the scope of ‘consumer data’. Again, standardisation should be industry-led. What if data that falls within the definition of ‘consumer data’ may also be considered ‘national interest’ data? E.g. our health records. I’ll get to the requirements around ‘national interest’ data next.

A ‘framework’ for data availability and use overseen by the National Data Custodian

The PC’s framework for implementing the final report recommendations separates consumer data from ‘Australia’s data’: data held by the public sector, research data, entities funded for public interest purposes, and national interest datasets.

Productivity Commission final report on data availability and use

In what I think is a great step, the PC recommends the appointment of a National Data Custodian, a statutory officer charged with implementing and overseeing a new data regime. But the Office of the National Data Custodian, with the advice of a small board, has its work cut out for it.

As well as advising on implementation generally and ‘managing the broader ethical considerations’ associated with data availability and use, the NDC will:

  • accredit the processes and abilities of Accredited Release Authorities (ARAs), who will oversee both transformation of National Interest Datasets (NIDs) and access to and use of sensitive data
  • publish guidance around data use including metadata, standards, data security, de-identification and data ethics
  • assess and designate NIDs
  • audit ARAs
  • audit de-identification practices generally
  • manage complaints about ARA processes
  • manage annual reports from data custodians about their requests for data access

This is a significant body of work, designed to streamline existing data request and access processes across various sectors and provide a single point of coordination. But sometimes variation is good and necessary. Not all data sources are the same in terms of their potential uses, the risks and sensitivities they carry, and the context within which use takes place.

I think establishing an Office of the National Data Custodian would be a great step forward, but careful thought needs to be given to how it interacts with existing bodies for best practice data management, including the Office of the Australian Information Commissioner, the Australian National Data Service, Data61 and data bodies and committees in health, medicine and other sectors.

The National Data Custodian accredits Accredited Release Authorities (ARAs), bodies to be funded by the Commonwealth to do more than manage trusted access to sensitive datasets. The PC recommends the introduction of ARAs for:

  • Dataset development and integration from across a sector
  • Assisting custodians curating and improving quality of datasets
  • Facilitating timely updates and ongoing dataset maintenance
  • Approving trusted users of more sensitive data
  • Determining whether a dataset they’re responsible for should be shared or released.

At a roundtable a few weeks ago I commented that eventually Australia will need to move towards sector-specific bodies — perhaps with Commonwealth funding — who are tasked with activities that help make better data sharing and integration a reality: guidance and rules around data standards, taxonomies, conditions of access, as well as support for and investment in key data sources. How these would be governed and the extent of their responsibilities needs debate, and will likely vary from sector to sector.

National Interest Datasets

The Productivity Commission recommends that certain public (and sensitive) datasets be designated as ‘National Interest Datasets’, where identified as such by the National Data Custodian. In determining NIDs, the National Data Custodian should apply a public interest test to establish that through sharing or release, this dataset would be likely to generate significant additional community-wide net benefits beyond those obtained by the original data holder.

Data held by private and non-profit organisations, as well as public sector bodies, may be designated as NIDs. The PC foresees circumstances in which access to and use of NIDs may be charged for (where this would not be inconsistent with the public interest purpose for which the NID may be used).

The PC defines the characteristics of ‘National Interest Datasets’, quite narrowly, as datasets that are:

  • Relatively few in number, not in the thousands
  • Linked, integrated, transformed (for example by de-identification or use of AI) to suit a prior determined scope of outcomes
  • Offering clearly described public interest benefits of national application
  • Confirmed by public review (a parliamentary committee is proposed).

What happens to datasets when they are designated as NIDs gets a bit confusing. The Productivity Commission recommends the creation of a new Data Sharing and Release Act, to modernise and streamline processes for digital data. For NIDs, the PC recommends that the Data Sharing and Release Act (DSR Act):

where possible, override secrecy provisions or restrictions on use that prevent original custodians actively providing access to data to other public sector data custodians and Accredited Release Authorities (ARAs).

The Data Sharing and Release Act would also introduce the new comprehensive consumer right. The Commonwealth Privacy Act would continue to apply.

The Productivity Commission makes a clear case for some kind of legislative reform. The existing legislative landscape for data is fragmented, reflecting a patchwork of needs and priorities. How we knit together this existing landscape, identifying components of legislation to be overridden or superseded, gaps to be resolved and inconsistencies to be corrected, is important work. I have doubts about the efficacy of a streamlined Data Sharing and Release Act as specifically proposed by the PC, however. How this Act would reconcile the inconsistencies between the proposed ‘consumer right’ and existing privacy and IP laws is just one area requiring further dedicated thought.

Other bits and pieces

There are lots of other recommendations and observations from the PC worth noting:

  • that non-sensitive data held by the public sector should presumptively be open (with risk assessment processes in place) (yes!)
  • that data that is created via publicly funded research should be open (yes!)
  • that government agencies should create comprehensive, easy to access registers of data, including metadata and linked datasets, that they fund or hold. (yes!)
  • that agencies might be able to charge marginal costs for ‘minimally processed datasets’ and more for value added services (hmmmm)
  • that where services are undertaken by private entities using public funding, government ‘retain the right to access or purchase data’ created via that service (hmmm — I’d argue that where services are provided using public funding, any data created should be owned by government and able to be made available. Not a ‘right to access or purchase’)
  • that industry is likely to be best placed to determine sector-specific data sharing standards (hmmmm — not in all cases, that’s for sure. Particularly where industry needs could be in conflict with government policy, consumer needs or new market entrants)
  • that current policy obligations to destroy datasets and linkage keys on the completion of research be abolished (I don’t think all linkage keys and datasets *should* be destroyed, but I foresee this being controversial)

There’s more too that I’ve missed. You can read the whole thing yourself here.