Extending the Art of the Possible with Big Data to Better Inform COVID-19 Responses
Faizal, our humanitarian data advisor and one of the co-authors of this piece, often recounts what one of his supervisors told him during his days as an information management officer. Those were the days when he would, with 24 hours’ notice, travel to a disaster zone and take charge of sourcing, collating, organising, and analysing the data needed to glean actionable insights for humanitarian actors on the ground. To paraphrase his supervisor: “predictable data is what we need most in times such as this.” As communities of practice come together in solidarity to respond to the COVID-19 crisis, that statement is worth reflecting on, especially for a data innovation facility like ours.
Experiential Knowledge Transfer
Before the onset of the COVID-19 crisis, our work largely focused on applying big data and real-time analytics to development and humanitarian action. Disaster response and climate change in particular are among our key thematic areas, where we merge and analyse non-traditional data alongside more traditional types of data to provide timely insights to decision and policy makers across government and humanitarian agencies. Part of our role today is to extend the art of the possible by coaxing access and analysis from alternative data sets that were never originally meant for the purposes for which they are now being leveraged.
Our research has demonstrated how near real-time data can be harnessed to provide a wealth of information supporting responsive strategies and interventions. Examples include using call detail records, which mobile phone operators produce as part of their billing processes, to pinpoint evacuation destinations following a disaster; analysing public conversations on social media after a disaster to better understand what was happening on the ground; looking down from on high (i.e. through satellite imagery) to automatically detect broken infrastructure; and using crowdsourced maps to identify alternative routes for humanitarian assistance in a post-disaster situation. However, translating this art-of-the-possible into this-is-how-we-normally-do-business is another matter. This is why the comment from Faizal’s supervisor is so important in the ongoing response to COVID-19.
For innovation to work in practice and be mainstreamed, the actors who will use these alternative data to inform decisions must be truly informed and discerning consumers. One cannot become fully informed and discerning about an innovation in the middle of a crisis; it has to happen beforehand. Being informed and discerning means fully understanding:
- the precise ways in which such alternative data will be useful to their work,
- how it complements, improves, or replaces the data that is normally used,
- the precise limitations of such data and techniques, having “red-teamed” and “practiced” the use of these data sets for different situations.
Furthermore, it means having confidence that this data can and will be made available in a responsible and timely manner, so as to make a difference in the decision-making processes that key actors need to undertake in the aftermath of a disaster.
Provenance and Data Priorities
The novelty of the COVID-19 crisis makes it challenging to generalise about data needs, given the multidimensional and intersecting nature of its impact. A comprehensive response to COVID-19 involves several phases: immediate response (efforts to save lives); recovery (steps taken to drive a return to normalcy); mitigation/prevention (actions to reduce adverse effects and avoid a repeat); and preparedness (strategies implemented to prepare for and inform the response to possible recurrences). In each phase, data needs differ depending on the needs of the citizens most affected and the specific tasks at hand.
As Indonesia begins to look towards the recovery phase, one priority area for us is to examine the socio-economic impact on the poor and vulnerable, and to identify which digital solutions and big data sources can be leveraged to enhance the response at both the subnational and national levels. Understanding the origins of data, in terms of how it is recorded, collected and shared, is paramount for gaining a clearer perspective on what it may or may not be used for. These discussions on data provenance are also necessary to build trust among key actors, and they can build confidence in the analytical approach, particularly in whether it can be reproduced to yield similar results. This is particularly important in convincing policy makers of both the accuracy and the shortcomings of big data. In the COVID-19 crisis, one of the challenges we’ve observed is that different humanitarian actors are vying for the same data sets with competing agendas and approaches, which can set costly precedents for the future. Common Operational Data sets (CODs), though, are inherently different from Fundamental Operational Data sets (FODs), with the latter being sector-specific. Whilst both types of data set can be used to create a baseline to inform decision making, it is important to identify data priorities and align these with underlying goals and policy needs.
Informing Government Response
Our current disaster is a global pandemic that is not going away today or tomorrow; it is something that we, as a global community, will have to live through for the foreseeable future. Working alongside our government counterparts in the Ministry of National Development Planning (Bappenas), we have gained substantial experiential knowledge from applying big data approaches to humanitarian action. That knowledge allows us today to play a consultative role and help fast-track approaches within the Government related to the use of alternative data sources for disaster response. The COVID-19 crisis emphasises the need for us to continue researching and developing fit-for-purpose innovations, and to develop our role as a convener and catalyst for a strong data ecosystem.
In Indonesia, we’ve been collaborating with the West Java provincial government (Jabar Digital Service), combining advanced data analytics and remote social research to develop a regional-level aggregate platform. Using Facebook Population Density Maps, we’ve looked into identifying areas in West Java that might have a higher risk of COVID-19 spread, for instance areas densely populated with elderly people, to inform containment strategies, especially in rural areas where access to health services is limited. The disaster response portfolio we’ve amassed to date, covering our work in disaster-prone Pacific Islands and in Indonesia, has given us a comparative advantage in helping government and humanitarian actors understand the shortcomings of using big data for responsive decision making, and the importance of having strategic data partnerships and protocols in place beforehand.
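To make the idea of flagging higher-risk areas more concrete, here is a minimal sketch in Python. It is not the method used in the West Java work; the field names, the density threshold, and the toy grid cells are all hypothetical, chosen only to illustrate ranking areas by elderly population density where health services are not close by.

```python
from dataclasses import dataclass


@dataclass
class AreaCell:
    """One map cell; all fields are hypothetical illustrative inputs."""
    name: str
    elderly_population: float  # estimated number of elderly residents
    area_km2: float            # cell area in square kilometres
    has_nearby_health_facility: bool


def elderly_density(cell: AreaCell) -> float:
    """Elderly residents per square kilometre."""
    return cell.elderly_population / cell.area_km2


def flag_priority_areas(cells, density_threshold):
    """Return cells at or above the threshold that lack a nearby
    health facility, sorted from highest to lowest elderly density."""
    flagged = [
        c for c in cells
        if elderly_density(c) >= density_threshold
        and not c.has_nearby_health_facility
    ]
    return sorted(flagged, key=elderly_density, reverse=True)


# Toy data, invented purely for illustration (not real figures).
cells = [
    AreaCell("cell_a", 1200, 4.0, True),
    AreaCell("cell_b", 900, 2.0, False),
    AreaCell("cell_c", 150, 5.0, False),
    AreaCell("cell_d", 700, 1.0, False),
]

# The threshold of 200 people per km2 is an arbitrary assumption.
priority = flag_priority_areas(cells, density_threshold=200.0)
print([c.name for c in priority])
```

In practice, a threshold like this would need to be set together with health authorities, and the population estimates would carry the limitations of the underlying data source.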
The Path Ahead
Since its establishment, Pulse Lab Jakarta (PLJ) has worked on more than 50 partnership agreements, including with data providers, and in the process has developed networks with technical resource partners that other humanitarian initiatives can draw on. We are also being approached by a broad range of government and development agencies to collaborate, beyond our current resourcing capacity. There is therefore clear demand, and there are clear opportunities, to address interconnected, cross-cutting issues facing our society, where we could do much more if we had the resources. Our proven capacity and experience over the years are relevant to the ongoing discourse on COVID-19 response, particularly in the areas of:
- developing shared-value data partnerships to facilitate data access;
- applying advanced data analytics approaches to glean actionable insights from new and emerging data;
- building on our know-how and experience in this domain to support others in big data adoption; and,
- serving as an interlocutor in dialogues on the protocols and procedures needed to engage effectively with key actors going forward.
The COVID-19 crisis could be called a “data-driven pandemic” and, like every disaster, it is also an opportunity to focus interest and investment. Mobility data and good population data, for instance, are hot commodities these days, but to do any decent modelling of the spatial spread of the disease, we also need good ground-level case data. Our ongoing research suggests there is more work to be done in improving the underlying data infrastructure, which would provide a more comprehensive picture of the situation, including increased visibility of the impact of interventions and of existing gaps. Importantly, improving data infrastructure will also support the increasing need for accountability in international humanitarian and development assistance funding.
This would be a good time to build resilience into our data infrastructure, establish data commons with relevant protocols, and develop user-centered interactive data visualisation tools to support evidence-based decisions. Doing so would go a long way towards making alternative data what Faizal’s supervisor once called “predictable data”.
Authors: Petrarca Karetji (Head), Dwayne Carruthers (Communication Manager) and Faizal Thamrin (Humanitarian Data Advisor)
Pulse Lab Jakarta is grateful for the generous support from the Government of Australia