Improving Scotland’s Economy with Meta Data Science — Questions
How will we manage the effects of digital disruption and automation?
On Thursday 28th September, 2017, I attended the Leadership Seminar: Digital People — towards a technology strategy for Scotland. The discussion centered on the question:
Q1: How will we manage the effects of digital disruption and automation in a way that maximises the benefits to the Scottish economy and society and ensures that everyone is included in Digital Scotland?
I came away from the seminar wanting to better understand how a data scientist might contribute to the discussion provoked by such a question. I’d like to share my learning here, in the hope of getting feedback from those more knowledgeable than me and providing some lessons learned to those interested in doing data science for social good.
From Words to Science
Often, as data scientists, we need to move beyond the boundary of our familiar technical domain and push to formulate a quantitative problem from nothing more than words. If we can persuade stakeholders that solving the quantitative problem will add value, then, and only then, can we start doing data science.
So, how do we formulate quantitative problems? When working on problems in industry, my own process ideally goes like this:
1 Get all stakeholders to agree on One Metric That Matters (OMTM). E.g. company valuation, monthly recurring revenue, number of active users.
2 Get all stakeholders to agree on a range of possible actions. E.g. designs of new products, marketing campaigns, new user-interface features.
3 Agree on a process for evaluating which actions move the OMTM in the right direction. E.g. sequential A/B testing, micro-randomised trials.
4 Agree on the risks involved in optimising the OMTM. E.g. investment of resources, negative effects, differences in objectives, ethical concerns.
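To make step 3 a little more concrete, here is a minimal sketch of how one might check whether an action moved the OMTM. It is not the sequential A/B test named above: for simplicity it uses a plain fixed-sample permutation test. The function name permutation_test and the numbers in the example are hypothetical, invented purely for illustration.

```python
import random


def permutation_test(control, treatment, n_permutations=10_000, seed=0):
    """Return (observed lift, one-sided p-value) for treatment vs control.

    The lift is the difference in mean OMTM between the two groups. The
    p-value estimates how often a lift at least this large would appear
    if group labels were assigned at random.
    """
    rng = random.Random(seed)
    n_control = len(control)

    def lift(values):
        # Mean of the "treatment" slice minus mean of the "control" slice.
        control_part = values[:n_control]
        treatment_part = values[n_control:]
        return (sum(treatment_part) / len(treatment_part)
                - sum(control_part) / len(control_part))

    pooled = list(control) + list(treatment)
    observed = lift(pooled)

    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign group labels
        if lift(pooled) >= observed:
            extreme += 1

    # Add-one correction so the estimate is never exactly zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical OMTM readings (made-up numbers, not real data).
control = [10, 12, 11, 9, 10, 11, 12, 10]      # without the action
treatment = [13, 14, 12, 15, 13, 14, 12, 13]   # with the action

observed_lift, p_value = permutation_test(control, treatment)
print(f"lift = {observed_lift:.3f}, p-value = {p_value:.4f}")
```

A small p-value suggests the lift is unlikely to be due to chance alone. It says nothing about the risks in step 4, which still need to be agreed separately.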
It is my personal belief that going through the above steps is the process of data science research design. So as an example, I’m going to start with the question stated at the opening of this article and try to formulate a data science problem.
Improving the Economy
Looking back at Q1, I hear you ask:
What do we mean by ‘digital disruption and automation’? What do we mean by ‘managing the effects’? How do we measure economic benefit so it can be maximised? How do we measure societal benefit so it can be maximised!? What on earth is ‘Digital Scotland’!? How do we identify who is and who isn’t included in Digital Scotland!?!?
When there are many ambiguous elements to a complex question, we should focus on a smaller, simplified problem and discuss the risks involved. So, for now, I’m going to focus only on maximising the economic benefit for Scotland and I’m going to assume that the performance of the companies in Scotland is a nice proxy for Scotland’s economy. Not by coincidence, this aligns with what we do at The Data Lab, and leads to four questions one might ask:
A How should we measure the performance of companies in Scotland and rank them alongside companies from other countries?
B What actions can we take to try to influence the performance and ranking of companies in Scotland?
C How can we evaluate different actions to know which are effective at influencing the performance and ranking of companies in Scotland?
D What are the risks, to Scottish society, in maximising the performance and ranking of companies in Scotland?
Questions A, B, C, and D are intended as precursors to reaching the agreements described in steps 1, 2, 3, and 4, respectively. Each question will be discussed in more detail in its own subsequent article.
Meta Data Science
Given that the world’s most valuable resource is now data [Economist, 2017], it is highly likely that any study of the performance of companies in Scotland is a study of their use of data and data science. Put another way:
To understand which data science strategies and methods are effective, we need to collect data on the impact of data science.
This isn’t such a crazy idea, and the phrase Meta Data Science is discussed in Neil Lawrence’s technical report on Data Readiness Levels. Indeed, the discussion in the previous section might be considered meta data science when the actions being taken relate to the introduction or removal of different data science strategies and methods. This might be hiring new types of data scientists, restructuring or reorganising the data science team, introducing new cloud-based data science solutions, developing an in-house data science platform, exploring new types of research designs, or just classic machine learning experimentation with new types of algorithms. Meta Data Science could aim to understand which of these actions are effective and when they should be applied.
Next Steps
I presented some of these ideas to the Scotland Data Science & Technology Meetup and will be exploring these ideas further over the coming months. Stay tuned, and feel free to follow me on Twitter: @TheLeanAcademic.