Drafting a Better Open Data Policy


I’ve been lucky to be part of the Technology and Innovation Commission for Long Beach. I love the city and want to keep it a great place to live. Over the last several months the commission has been consulting the public and helping draft the city’s Open Data Policy. An Open Data Policy clarifies how data will be used internally and made available for the public to use. Many view an Open Data Policy as a first step toward open government: better communication, greater transparency, and shared decision-making between government and residents.
Bryan Sastokas, Chief Innovation Officer & Head of Technology and Innovation for the City of Long Beach, made a draft of the policy public on the SpeakupLB.org site. Long Beach deserves recognition for making this a truly public process. Open Data Policies often just reflect boilerplate language used at the federal, state, and local levels. Even though public policy affects people’s lives, writing it can be a very dry process that few people notice. So it’s in the spirit of openness that I’m making my feedback public for Long Beach residents, academics, policy geeks, and “civic hackers” in Long Beach and worldwide.
Long Beach’s Open Data Policy
There is much to praise in the current draft of the open data policy. It’s clearly written, concise, and presents a cogent vision for the city’s future prosperity. It describes an implementation strategy that fits the city’s internal needs. The early definitional statements give residents the information they need to understand the policy. The draft policy also wisely suggests creating an Open Data Analyst position that would work with designated data coordinators within each department. This clarifies who is responsible for data at the departmental level and for the city as a whole.
That said, an open data policy is not the same as an internal data policy. An Open Data Policy is foremost about clarifying how government will communicate with the public and the ideals that motivate that communication. Government should ensure that requesting, interpreting, and re-using data assets is safe and easy. Ultimately, this helps government become more responsible and responsive. Residents, for their part, can participate civically through data and around data issues. To this end, I believe the policy could be improved in two primary areas.
First, the Open Data Policy draft lacks components that connect government with public interests. For open data to be meaningful, it must be released according to the interests and needs of residents. Government must provide a clear mechanism for requesting data, along with other channels of communication between government employees and residents. Second, the Technology and Innovation Commission ran a set of public forums and surveys to solicit residents’ input on open data. The results are a valuable data set that should in turn inform how the policy is written. I would like to see more of this data (on open data) influence the writing of the open data policy.
Process for Soliciting Public Input
There is currently no mechanism in the open data policy draft for residents to request data sets from the city. There is also no way for residents to request that a data set be removed, or to learn why city officials removed a data set (p. 5, top paragraph). I suggest that Long Beach designate the city Open Data Analyst as the person who receives and responds to public requests. The city should set a process for requesting open data similar to Los Angeles’ Playbook, which lays out clearly and concisely what will be released when, and the procedures data must go through before release. Information on removed data sets could be listed on the same landing page as the old data set, with a description of why and when each was removed.
Timeline and Cost for Releasing Data
The timeline and cost for making data public should be explicitly stated. Timeliness and frequency are mentioned on p. 1 (“frequently update dataset”), p. 2 (“timely release and refresh of data”), and elsewhere. Yet what exactly is meant by “timely” or “frequent” is never defined. The California Public Records Act sets the time to respond at 10 days, and an agency that does not release the records must justify withholding them by demonstrating that the public interest in confidentiality outweighs the public interest in disclosure. At the federal level, the Freedom of Information Act (FOIA) sets the time to acknowledge requests at 10 days and the time to respond at 20 days, although agencies routinely take far longer. It seems reasonable for Long Beach’s data policy to require acknowledgment of a request within 10 days and, within 30 days, either the data or a justification for withholding it. Requested data would then be integrated into the ongoing prioritization list (discussed below). The data refresh rate could be set at quarterly or yearly, depending on how quickly each data set changes. Cost should be explicitly stated as free to the public, or the policy should state the cases in which data would not be free.
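To make the proposed timeline concrete, here is a minimal sketch of how a request-tracking tool might compute the two deadlines. The function and field names are hypothetical, and counting in calendar days is my assumption; the policy would need to say whether it means calendar or business days.

```python
from datetime import date, timedelta

# Values from the proposal above. Calendar days are an assumption:
# the policy would have to specify calendar vs. business days.
ACKNOWLEDGE_DAYS = 10
RESPOND_DAYS = 30


def request_deadlines(received: date) -> dict:
    """Return the dates by which a data request should be acknowledged
    and by which the data (or a justification for withholding) is due."""
    return {
        "acknowledge_by": received + timedelta(days=ACKNOWLEDGE_DAYS),
        "respond_by": received + timedelta(days=RESPOND_DAYS),
    }


def is_overdue(received: date, today: date, responded: bool) -> bool:
    """A request is overdue if it is unanswered after the response deadline."""
    return (not responded) and today > request_deadlines(received)["respond_by"]


if __name__ == "__main__":
    deadlines = request_deadlines(date(2017, 1, 2))
    print(deadlines["acknowledge_by"], deadlines["respond_by"])
```

Publishing these computed dates alongside each request would let residents see at a glance whether the city is meeting its own standard.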
Prioritization of Data Sets
Much of the success of an open data portal hinges on the way data sets are prioritized for release. Yet the prioritization of city data sets (p. 5, Prioritization Process) is described only vaguely: it is being worked on by the City Manager, the Technology and Innovation department, and city staff against “a set of criteria to be defined.” We actually have hard data on what residents of Long Beach would like to see released first. The Technology and Innovation Commission ran three public forums and a survey (combined online & offline). These collected, among other information, which data sets the public would like to see prioritized. Commissioner Gwen Shaffer and I assembled a preliminary report, presented at our December meeting, detailing exactly what data sets the public would like to see prioritized. According to that report, the most requested data sets were crime (75%), transit (65%), code enforcement (56%), restaurant inspections (56%), city expenditures (52%), and lobbyist activity (50%). These areas are useful starting points for prioritization. People also need to understand how the requests they make are prioritized and worked through. The priority list should be made public and should reflect the request processes described above for both internal and external parties. A mid-sized city like Long Beach does not have infinite resources, and requests from residents are likely to be frequent at the start, putting more pressure on already-busy public officials. Making the priority list public would reduce the need for time-intensive in-person requests and updates.
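As an illustration of how simple a public priority list could be, the sketch below ranks the data sets by the survey percentages quoted above. The ranking rule (highest share first, ties broken alphabetically) and the function name are my own assumptions; the city’s actual criteria remain “to be defined.”

```python
# Share of respondents requesting each data set, from the preliminary report.
survey_share = {
    "crime": 75,
    "transit": 65,
    "code enforcement": 56,
    "restaurant inspections": 56,
    "city expenditures": 52,
    "lobbyist activity": 50,
}


def priority_list(shares):
    """Rank data sets by requested share, highest first.

    Ties are broken alphabetically. This rule is an assumption made
    for illustration, not part of the draft policy.
    """
    ranked = sorted(shares.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, name, pct) for rank, (name, pct) in enumerate(ranked, start=1)]


if __name__ == "__main__":
    for rank, name, pct in priority_list(survey_share):
        print(f"{rank}. {name} ({pct}%)")
```

New requests from residents or departments could be added to the same table, so the published list always shows where each data set stands.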
Privacy Concerns
Privacy concerns are currently addressed on p. 5 in the section titled Legal Review: “Personally identifiable information shall be excluded from the open data portal to maintain privacy and security.” Yet, the full story of how to ethically use data for civic purposes is more complex. The public already has legal access to some personally identifiable information, such as names and salaries of public employees. We have access to this information because it is legally required and in the public’s best interest.
As an alternative, I would suggest Helen Nissenbaum’s contextual approach to privacy. “Contextual privacy” (Nissenbaum’s own term is “contextual integrity”), which has been widely adopted at the federal level, holds that the potential harms of data should be evaluated based on the contexts in which the data appear and the social norms their release may violate. Further, data sets can be combined to reveal more than any single data set would. Data sets that are not currently released in some form should be reviewed for additional risks that arise from combining them with other data sets.
I believe that these small changes can result in a more honest and progressive relationship between residents of Long Beach and its government. Adding this language will provide clear processes for making data public. It describes the full cycle of data release and interpretation. It’s what the people have said they want. It’s also just the right thing for the city to do.