Mind the gap: five steps to mapping data gaps

Cath Sleeman
Data science at Nesta
9 min read · May 25, 2023

Data is an essential ingredient in tackling social challenges, from inequality to climate change. But what happens when the data doesn’t exist? We discuss the causes and consequences of these data gaps, with a focus on the missing data around obesity, diets and the food environment in Wales. We put forward a vision for creating an open data ecosystem and describe a five-step approach to surfacing data gaps.

Photo by Jamie Street on Unsplash

It’s all about data

At Nesta we tackle three of society’s biggest challenges: inequality in early childhood, obesity, and household carbon emissions. We do that by designing, testing and scaling solutions. An essential ingredient in deploying these methods is data: data that tells us about the scale of these challenges; data that sheds light on their causes; and data that highlights potential solutions. In tackling obesity, our ‘data needs’ extend far beyond statistics on the prevalence of obesity. We are seeking to understand the food environment, which includes factors as varied as the location of food outlets, the nutritional content of food, and our exposure to advertising of HFSS foods (that’s food which is high in fat, salt and sugar).

But what if the data is missing?

Given the importance of data, what happens when it’s missing? The term ‘data gap’ does not refer to a situation where one or two values are missing, but rather to one where the whole series is missing. Perhaps the data was never collected. Perhaps it is collected but simply isn’t granular enough to be useful, such as data that can’t offer regional insights. In other cases the data might only be available for a hefty price, or with a significant delay. These are all forms of data gaps, and when tackling our biggest societal challenges, they are surprisingly common.

Why are data gaps important?

Primarily, data gaps matter because they limit our ability to understand the causes of social challenges, such as obesity, and in turn hamper our efforts to design effective solutions. But data gaps have another consequence: they keep attention away from an issue. With the release of regular, high-quality data comes the scrutiny of the media, which can also show the human stories behind the data. That, in turn, can shift issues up the policy agenda as politicians and industry bodies are compelled to respond and ‘commit to change’. Without data, however, this chain of events is never set in motion.

Case study: Mapping data gaps around obesity, diets and the food environment in Wales

Photo by Joseph Reeder on Unsplash

Over recent months, colleagues and I have been identifying data gaps around obesity, diets and the food environment in Wales. The full list of data gaps that we found is available in our report, but below are a few examples. A number of the largest gaps are related to childhood obesity.

  • There is very limited data on childhood obesity. The Child Measurement Programme (CMP) for Wales measures Body Mass Index (BMI) in 4–5 year-olds only, and there has been incomplete CMP data collection since 2019 due to disruption caused by COVID-19.
  • There has been no recent collection of granular dietary data (eg, nutrient profile of food and drink consumed) for children under the age of 11 or adolescents (12–18 years).
  • No free-to-access data on the foods purchased by Welsh households.
  • No data on exposure to advertising of HFSS foods (high in fat, salt and sugar) in Welsh public spaces.
  • No recent data on delivery cost and slot availability for online groceries.
  • No data on the healthiness of out of home (OOH) options (eg, takeaways) that could be linked to locations.
  • No data tracking the healthiness of school meals.
  • No data tracking commitments to creating a healthier food environment by the food industry in Wales.

All this missing information obscures the food environment in Wales and, more broadly, makes it much more difficult to address this important issue.

Why do data gaps persist?

How do these data gaps arise and, more importantly, why do they persist? Although the cost of data collection might come to mind, these costs are likely to be insignificant when compared to the cost of treating the complications of obesity. Similarly, while data privacy is an important concern, most of the gaps that we identified (such as the nutritional content of fast food) do not require collecting personal information. Moreover, there are already effective systems for safely sharing anonymised data with approved researchers (such as the UK Data Service), as well as new methods such as synthetic data which can ‘stand in’ for personal data.

A more likely reason for data gaps is the distance between data and impact. Collecting new data won’t directly reduce obesity. For data to have impact, it must be carefully analysed and effectively communicated. These extra steps inject distance and uncertainty into the potential impact of data, making it a less appealing proposition than working directly with individuals who are impacted by obesity.

Another reason that data gaps persist is that they’re simply not noticed. While many groups would benefit from better data, typically no single group has this purpose or owns this responsibility. It is easy to presume that the data which is currently available is all there is and, as a result, miss the potential for improvement. When there is a focus on data, it is often assumed that ‘bringing the data together’ is all that is required — typically by building dashboards. However, building dashboards won’t improve decision making if critical data is missing.

Data gaps can also be obscured by stop-gap measures. One-off surveys or purchasing temporary access to data can give the illusion that sufficient information is available. However, if the survey is not repeated, or the paid-for data must be deleted once a contract expires, then the data gap simply re-emerges.

A vision for a data ecosystem

Behind our call for closing data gaps is a larger vision — namely creating an open data ecosystem around obesity and its drivers. The ecosystem would consist of accurate, timely and detailed data that is open to researchers, and is valued as an essential tool in tackling obesity. The emphasis would be on ensuring that the necessary data was provided (by any organisation) rather than being distracted by publishing this data on a single platform.

Photo by Andrew Neel on Unsplash

Effective data ecosystems do already exist; consider, for example, the wealth of data that is available to set monetary policy. Multiple measures of monthly inflation, combined with a huge array of economic and financial data, create a mature data ecosystem that supports in-depth analysis and allows the Bank of England to set interest rates. Another example is the data ecosystem that was rapidly constructed to combat COVID-19. That system gave accurate daily data on infections, hospitalisations and deaths, all with minimal delay and at a high level of geographic granularity.

Five steps to surfacing data gaps

Over the last few months we have developed the following five-step approach to identifying data gaps around a social challenge, which in our case was obesity in Wales.

  1. Map your routes to impact and find areas of focus. Nesta has already identified areas of focus through which we aim to reduce obesity. These areas are improving access to healthy foods, reducing the energy content of food and drink, improving evidence on diets, reducing the promotion of unhealthy food and understanding the attitudes of decision makers. To these five, we added a sixth area: measuring the prevalence of obesity.
  2. Find the purpose of data in each area. In our case, each of Nesta’s focus areas contains a set of intermediate goals. In the area of advertising and promotions, for example, we are aiming to reduce the number of advertisements for HFSS food. This goal immediately gives a purpose for data — namely, tracking the number of advertisements for HFSS food over time. By repeating this exercise across each intermediate goal, we found the purpose of data in each of the six areas.
  3. Build a vision for the ideal data in each area. This requires imagining the data that, if it existed, would perfectly fulfil the purpose of data in each area. In Nesta’s area of food reformulation, for example, the ideal data solution might be a national reformulation database in which companies provide details on changes in the nutritional make-up of their food. This would allow us to perfectly track our goal of reducing the energy density of foods. Even though this ideal may not be achievable, it nevertheless provides a firm direction in each area.
  4. Uncover the parts of the ideal vision that already exist. Concentrating on the ideal data forces us to focus only on those datasets that can usefully contribute to that ideal, as opposed to simply listing all data that is currently available. In areas where some data already exists, we assessed whether or not that data would meet our requirements. We asked questions such as ‘is the data available at a useful level of granularity?’ and ‘can we trust the accuracy of the data?’. This automatically exposed partial data gaps. In a number of areas, however, we found that virtually no data was available and the gaps were sizeable.
  5. Describe the data gaps, and gauge the priority of closing each gap. For each data gap, we assigned a priority rating of low, medium or high based on the following criteria:
       • Time sensitivity: Is the gap related to a policy currently being considered, in which case new data could lead to immediate impact?
       • Potential impact: What is the potential impact on obesity that may result from this data becoming accessible?
       • Effort required: How large is the gap between the data that is available and what is preferable?

These ratings were fine-tuned through conversations with external stakeholders.
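
As a rough illustration of how the output of the five steps could be recorded, the sketch below stores each gap with its focus area, purpose and three criterion ratings, and then derives an overall priority. It is a minimal sketch under stated assumptions, not the method used in the report: the `DataGap` and `Level` names, the scoring rule (urgency and impact raise priority, effort lowers it) and the thresholds are all hypothetical, and the ratings given to the example gaps are invented placeholders. Only the gap descriptions come from this article.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Three-point scale used for each criterion and for the overall rating."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class DataGap:
    focus_area: str          # step 1: area of focus
    purpose: str             # step 2: what the data is for
    gap: str                 # steps 3-5: what is missing relative to the ideal
    time_sensitivity: Level  # is a related policy being considered now?
    potential_impact: Level  # how much could closing the gap affect obesity?
    effort_required: Level   # how far is available data from what is preferable?

    def priority(self) -> Level:
        # Assumed rule, not taken from the report: urgency and impact add to
        # the score, while effort counts against it (HIGH effort adds 1 point,
        # LOW effort adds 3). The score therefore ranges from 3 to 9.
        score = (
            int(self.time_sensitivity)
            + int(self.potential_impact)
            + (4 - int(self.effort_required))
        )
        if score >= 8:
            return Level.HIGH
        if score >= 6:
            return Level.MEDIUM
        return Level.LOW


# Gap descriptions follow the article; every rating below is an invented placeholder.
gaps = [
    DataGap(
        focus_area="Reducing the promotion of unhealthy food",
        purpose="Track the number of advertisements for HFSS food over time",
        gap="No data on exposure to HFSS advertising in Welsh public spaces",
        time_sensitivity=Level.HIGH,
        potential_impact=Level.HIGH,
        effort_required=Level.MEDIUM,
    ),
    DataGap(
        focus_area="Measuring the prevalence of obesity",
        purpose="Track childhood obesity",
        gap="Child Measurement Programme covers 4-5 year-olds only",
        time_sensitivity=Level.MEDIUM,
        potential_impact=Level.MEDIUM,
        effort_required=Level.MEDIUM,
    ),
    DataGap(
        focus_area="Reducing the energy content of food and drink",
        purpose="Track the energy density of foods",
        gap="No national reformulation database",
        time_sensitivity=Level.LOW,
        potential_impact=Level.MEDIUM,
        effort_required=Level.HIGH,
    ),
]

# Highest-priority gaps first; sorted() is stable, so ties keep their input order.
ranked = sorted(gaps, key=lambda g: g.priority(), reverse=True)
for g in ranked:
    print(f"{g.priority().name:<6} {g.focus_area}: {g.gap}")
```

A fixed rule like this can only give a starting point. As noted above, the ratings in our exercise were adjusted after conversations with stakeholders, so any computed priority would need the same review.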

The table below shows a summary of the data gaps surrounding obesity in Wales. They are grouped by Nesta’s areas of focus, which cover many of the drivers of obesity but exclude the likes of exercise and genetic factors. In the full report, we describe the gaps in more detail, provide the priority levels, and propose solutions for closing each gap. In many cases, the solutions are quite simple and amount to extending a survey or boosting a sample to allow for more granular analysis. Even where the solution requires new data infrastructure, the first step would be to trial small-scale solutions. In the case of school meals, for example, the first step might be testing different methods of efficiently collecting menu information from a handful of local councils.

Data gaps around obesity, diets and the food environment in Wales.

What’s next?

Data gap mapping is a relatively new method that we’ve used to shine a light on the missing information around obesity in Wales. That missing information is holding back effective and targeted policies. By surfacing these gaps, we hope to spark a conversation around the need for an open data ecosystem that would empower groups to tackle this important challenge.

Photo by Miguel A Amutio on Unsplash

Multiple organisations can play a part in creating that ecosystem. The Welsh Government, for example, could collect better data on food environments and publicly procured food such as school meals. The Welsh food industry also has a role to play, which could range from opening up data on food reformulations to co-creating data trusts that would allow researchers to securely analyse anonymised sales data. With the rapid development in AI, we should also trial novel solutions that could fill these data gaps. These could include using computer vision to automatically detect unhealthy advertising (as used by Palmer, Green, Boyland et al., 2021) or scraping supermarket websites (with permission) to openly track the nutritional content of new products (similar to foodDB).

At Nesta we plan to continue refining our approach to data gap mapping and roll out the method across our other missions. We would welcome both your feedback and any opportunities to collaborate on closing the data gaps around obesity, diets and the food environment in Wales. Please contact information@nesta.org.uk with ‘Data Gap Mapping’ in the subject line.
