DataJam North East: working across data silos to fix real problems

Organisations working in complex, highly fragmented environments often need to collaborate with other stakeholders, and they need the infrastructure to do it. Just like the best Data Jams.

Last month I had the pleasure of giving the keynote talk on the first day of DataJam North East (a One Team Gov event). Aside from the chance to visit pretty Newcastle-upon-Tyne, it provided an opportunity to see the start of what I hope is a trend — a place-based, collaborative, multi-organisation approach to serving people.

Aside from the policy and cultural barriers to working in this way, there are infrastructure and standards barriers. My talk focused on these barriers. I grounded my points in professional healthcare provision because it’s the main focus of my day job, so the examples came more readily to me. DataJam North East had a broader context for human thriving — in addition to health, it covered education and skills and the impact of location on all these things. Still, I think the points I made apply to those areas too.

This post is an attempt to summarise my talk and capture my experience of the first day of the event.

A place-based, multi-organisation approach to caring for the whole person

Caring for the whole person requires a multi-institutional approach (icon made by Freepik)

In brief, place-based, multi-organisation collaboration enables providers to get a more holistic view of their users and offer their services in ways that better reflect users’ real lives. When Audre Lorde noted that “we don’t lead single issue lives” she could have been talking about the need for holistic healthcare*.

There’s a longer, more considered answer too. Specialisation has enabled a deeper understanding of various elements of the human experience and, by extension, improved our ability to provide effective care. This, coupled with perceived cost-saving opportunities, has shaped the design of organisations and processes within healthcare. Essentially, healthcare is a lot more siloed than it used to be and, because data follows function, there’s been a proliferation of data silos too. This trend looks set to continue.

But there is a growing realisation that this approach isn’t all win; there are drawbacks too. The main ones are that it makes it harder to care for people holistically and, in some cases, erases many of the efficiencies of specialisation. These shortcomings have an impact on patient outcomes.

There’s always a tension between the broad but shallow care of comprehensive, generalist institutions and the focused but narrow care of specialist ones. As is often the case with messy, inconvenient reality, neither offers a silver bullet. The solution lies in managing the tension.

*She wasn’t. She was talking about the incongruity of single-issue politics and the human experience.

What’s infrastructure got to do with it? Well, a lot.

Tina Turner, philosophically musing on what’s love got to do with it? Good question, Ms. Turner. Good question.

So, some degree of specialisation, and the siloed systems and ways of working it spawns, is part of our reality. Consequently, we need infrastructure that supports responsible and sustainable data access across organisational boundaries. In other words, infrastructure needs to let people with a legitimate need access siloed data, and it needs to ensure that such access is done responsibly and with as little friction as possible.

This sort of cross-organisation access is already being facilitated in certain scenarios — think of community pharmacists being able to, with a patient’s permission, get access to Summary Care Record data. But this isn’t a ubiquitous practice and its scope is limited to the primary care sphere. There’s still a lot to be done to broaden the number of health organisations working like this and to increase the number of datasets they have access to.

But in order to support cross-organisation working, infrastructure needs to do more than just manage data discovery and ingestion. It also has to support collaborative handling of data, especially for more sophisticated uses such as advanced analytics, and it needs to help those working with the data to do so in standardised ways. In practice this means, for example, analysts in different organisations using the same versions of their preferred analytic languages to prototype their models, and working from the same training and test data. Doing these things at scale and across organisations requires a good degree of automation.
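To make “the same training and test data” concrete, here’s one way organisations can guarantee identical splits without ever sharing a shuffled file: derive each record’s split assignment from a hash of its identifier. This is a sketch of a common technique, not anything presented at the event; the function name, salt and ratio are all illustrative assumptions.

```python
# Sketch: a deterministic train/test split that any organisation can
# reproduce exactly, given only the shared record IDs, salt and ratio.
# All names and parameters here are illustrative, not from the talk.
import hashlib

def assign_split(record_id: str, salt: str = "shared-salt", test_fraction: float = 0.2) -> str:
    """Hash-based assignment: the same record always lands in the same
    split, regardless of which organisation runs the code or in what
    order the records arrive."""
    digest = hashlib.sha256(f"{salt}:{record_id}".encode()).hexdigest()
    # Map the first 8 hex digits to a number in [0, 1).
    bucket = int(digest[:8], 16) / 16**8
    return "test" if bucket < test_fraction else "train"

# Every participating organisation gets the same answer for the same ID.
splits = {rid: assign_split(rid) for rid in (f"record-{i:04d}" for i in range(10))}
```

The appeal over a seeded shuffle is that the result doesn’t depend on record ordering or on everyone running the same library version, which fits the cross-organisation, automate-at-scale point above.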

For all this effort to be worthwhile, organisations need to be able to get answers to questions that help them serve their own users better. It’s always easier to make the case for investing in cross-organisation infrastructure when the benefit to each contributor is clear. So analysts need to be able to run, within their own organisations, the models they built in collaboration with peers based elsewhere.

A well-configured Data Science Lab environment could help with many of these things.

What I’d like to see next (and, fingers crossed, at the next DataJam NE)

It was a brilliant event and, as I mentioned above, I’m hoping it’s the first of many. As a result, my mind is already turning to the next one (no pressure, Ryan, Celine and team :)). Here are a few things I’m hoping to see more of at the next one:

  1. More ‘behind the scenes’ insights: As I set out above (and in my talk on the day), infrastructure and standards matter, but getting these right requires broad stakeholder engagement. This can’t be left to the techies — that’s how we end up with narrow technocratic solutions that ultimately prove unfit for purpose. For this reason I think it would be great if attendees were exposed, at a high level, to the ‘behind the scenes’ process of understanding a local area and using this to inform stakeholder selection, the challenges of discovering and accessing relevant datasets, and so on.
  2. More ‘inter-strand’ cross-over: The unconference, data hacking and more structured workshop sessions all ran in parallel. As far as I can tell, people tended to stick within the strands they started out in. At the next Data Jam it would be great to see more mixing. Perhaps a change in format might encourage this to happen?
  3. Framing the data hacking process: At the best of times, data is an imperfect representation of the world, and that makes drawing inferences difficult. Add in the messiness of de-contextualised data and the time constraints that are often the hallmarks of data hackathons, and insights become even more tenuous. This isn’t a reason not to do hackathons. There’s a lot of value in the brilliant just-do-it spirit of these events, but we also need to be really intentional about how we present the degree of uncertainty we’re dealing with. Also, so much of the value of hackathons lies in the informal setting they provide for multi-disciplinary working. This makes it even more important to provide guidelines that help teams work out the best way to frame questions, interpret results and so on. I like the DataKind UK one.

Isn’t this really just a story of data transformation?

Yes, it is all data transformation in the end. And it’s not dissimilar to the digital transformation that helped usher in the explosion of digital data we’re currently swimming in. As was the case with digital transformation, infrastructure is necessary but not sufficient. It needs to be combined with the other two transformation building blocks — governance and skills. I didn’t go into a great deal of detail on these, but I noticed that both were covered extensively in the unconference sessions I attended. This is great, because transformation doesn’t happen without them.

Data has the potential to transform services and, ultimately, organisations. DataJams provide motivated stakeholders with an informal space to explore what this transformation could look like. That’s why I’m already looking forward to the next one.