What happens to Data Science in a Crisis, why that’s a problem, and what to do.
There are lots of posts offering opinions about Covid-19. This is not one of them.
Instead this is a post about the data science in those posts, and about how those posts, and the forces that generate them and have made them so popular, relate to what often happens in (far) less important corporate data science settings.
You can guess that I have not come to praise those posts.
To ape Mr. Shakespeare, I’ve come to bury them, hopefully, and to make some points about what happens to decision making with data in less than ideal settings and when the pressure is on. I want to tell people how I approach understanding the analysis that other people are producing, and I want to say why I think that more people should approach more data analysis in this way. To do this I will draw on my experience of doing data science under pressure in less than ideal situations for what I thought of as high stakes at the time.
I’m not going to comment on whether the conclusions of the Covid-19 posts are right or wrong — if you want to know why then skip to the end. It’s ok so long as you then come back and read the rest of this.
Ok, so I don’t know about Covid-19, but I do know about some of the nonsense that bloggers have been putting out there. There is a lot of nonsense analysis doing the rounds and I can’t go through all of it and call BS. What I will do is use a couple of the things I’ve seen as examples, with the idea that if you apply the thought process that I use to clobber them you might too. If you fancy.
Example 1 : “Coronavirus: Why You Must Act Now, Tomas Pueyo, Chart 7”
In this article Chart 7 is reproduced from a reputable journal and used to underpin reasoning about the difference between “True” and “Official” cases. Here is a sample of the argument that is constructed from this graph:
“Up until Jan 23rd, when Wuhan closes, you can look at the grey graph: it’s growing exponentially. True cases were exploding. As soon as Wuhan shuts down, cases slow down. On Jan 24th, when another 15 cities shut down, the number of true cases (again, grey) grinds to a halt. Two days later, the maximum number of true cases was reached, and it has gone down ever since.”
This is a classic: a convincing causal relationship is implied, somewhat vaguely, from a visual interpretation of some data. It’s compelling, but it relies on us not looking at the detail. First off, the statement that up until Jan 23rd Wuhan was experiencing exponential growth is false. From 8th Dec to 9th of Jan there is nothing approximating exponential growth. Perhaps that’s not important to readers, but it doesn’t fill me with confidence that the rest of the inferences are rock solid.
The more important fallacy is the implication that the shutdown of Wuhan and 15 cities is the causal driver of the sudden stop of “true” case growth. What if that were true?
For that to be true the virus would have to have an incubation period of about 2 days. The new social distancing measure would prevent people from getting infected from the moment that it was imposed, and the claim is that that is what is being shown in this data. The best science available indicates that the median incubation period is 5.1 days, so people infected just before the shutdown would typically still have been falling ill several days after it.
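The timing argument can be checked with a few lines of arithmetic. The sketch below is my own crude check, not anything from Pueyo or from Wu and McGoogan. It assumes that the grey “true” cases are dated by when people fell ill, it takes the dates from the quote above (with the year 2020 filled in), and all the names in it are hypothetical.

```python
from datetime import date

# Median incubation period quoted in the text (days from infection to symptoms).
MEDIAN_INCUBATION_DAYS = 5.1

# Dates taken from Pueyo's quote; the year 2020 is filled in here.
wuhan_shutdown = date(2020, 1, 23)
other_cities_shutdown = date(2020, 1, 24)
claimed_peak = date(2020, 1, 26)  # "two days later"


def days_between(start, end):
    """Whole days from start to end."""
    return (end - start).days


lag_from_wuhan = days_between(wuhan_shutdown, claimed_peak)
lag_from_other_cities = days_between(other_cities_shutdown, claimed_peak)

for label, lag in [("Wuhan", lag_from_wuhan), ("15 cities", lag_from_other_cities)]:
    # Crude test: a shutdown can only explain the peak if the lag is
    # at least as long as the typical incubation period.
    long_enough = lag >= MEDIAN_INCUBATION_DAYS
    print(f"{label}: peak {lag} days after shutdown, "
          f"median incubation {MEDIAN_INCUBATION_DAYS} days, "
          f"lag long enough: {long_enough}")
```

Either way the lag is shorter than the median incubation period, which is the whole objection.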
Is there anything else that kicks my confidence into tiny little bits about the way Pueyo uses this chart and the implications that are built on top of it? Well, dear reader, yes there is. If you have a look at Jan 13th and follow the arrow connecting it to the annotations beneath you will see that there is a text box that reads “2019-nCoV test kits first available”. Pueyo does not mention this; and it’s important.
I don’t know about Covid-19 testing, but I do know that all the other data from other outbreaks has been produced by a system that has had access to the test kits that the Chinese authorities didn’t have and couldn’t have. This means that *any* comparison with the data from Wuhan before testing came in (my guess is about 21st Jan, which Pueyo helpfully labels as “official cases exploding”) is completely invalid. They are different things: data produced by a testing campaign vs data produced by people turning up in hospital unable to breathe.
Pueyo doesn’t understand the data. Worse, he hasn’t made any effort to understand it: he hasn’t even looked at the generous annotation in the text underneath, provided by Wu and McGoogan, who produced the chart from actual data with all the signifiers and behaviours that tell me that they probably do understand it rather well. We can look right and see other events on the chart, such as the Chinese New Year holiday and associated festivals, and reading Wu and McGoogan’s summary I can see that these are fundamental in the decision making that went on in China. Yet Pueyo doesn’t mention this. He draws our attention to the shape of the graphs, asks us to believe a causal link, and then goes on to make recommendations about social distancing that are cloaked in the armor of data science.
Example 2 : One of these countries is not like the rest : Mark Handley
Handley is a Professor at UCL and produced this graph which has been widely circulated.
When I have expressed my doubts about this, Handley’s qualification has been cited at me to show that this is a pukka analysis. Handley is a professor of networked systems, not of epidemiology or statistics. In any case, as an academic I am sure that he supports the idea of review and analysis.
First of all we should ask what is not in his work. I don’t see South Korea, which is a big centre for Covid-19 infection, or Taiwan or Thailand, which are smaller ones. Why?
Also why plot cases? And what does that mean? Where does case come from in this data? What happens if we plot something else, like deaths?
Finally look left. Note, Professor Handley has plotted using a log scale for Y. I am comfortable with that, that’s ok for me, I know how to interpret these things. I am not the people reading this graph and running to Tesco’s for loo roll.
I’d like to show a correlation graph of loo roll vs something, but instead here is one that shows that the consumption of cheese is correlated to the number of civil engineering doctorates awarded.
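Spurious correlations of that kind are easy to manufacture, because any two series that both drift upwards over time will look strongly related. Here is a minimal sketch with invented numbers (these are not the cheese or doctorate figures), using a hand-rolled `pearson` helper so nothing beyond the standard library is assumed.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def changes(series):
    """Period-over-period changes, which strips out a steady trend."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Invented data: both series trend upwards over time,
# but their period-to-period wiggles move in opposite directions.
series_a = [0.5, 0.5, 2.5, 2.5, 4.5, 4.5, 6.5, 6.5]
series_b = [-0.5, 2.5, 3.5, 6.5, 7.5, 10.5, 11.5, 14.5]

r_levels = pearson(series_a, series_b)
r_changes = pearson(changes(series_a), changes(series_b))
print(f"correlation of levels:  {r_levels:.2f}")
print(f"correlation of changes: {r_changes:.2f}")
```

The levels come out strongly positively correlated purely because of the shared trend, while the period-to-period changes are negatively correlated. A chart of the raw series would tell the wrong story.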
In example 1 evidence is left out, in example 2 patterns are randomly matched. Both behaviors are evident in malformed and poorly constructed analysis that I have seen elsewhere.
Professional Data Science.
I have been a lead data scientist in quite a few crisis meetings where I did have the data, understood the data and I did know what to do with it.
Not only was I certainly the best placed person to do the job, but I was backed by a team of brilliant people who had spent much time training and preparing and armed with technology and methods a decade in advance of the ones that everyone else was using.
Behold I am a God.
Levity aside, the professionals, the folks that do it for a living, are in a hugely better position than everyone else.
But even when everyone is starting from the same place — dealing with a completely novel situation — professionals react quite differently from the amateurs. Typically they go quiet, they don’t even get busy preparing to create answers. This behaviour is a red flag for executives. Executives are action first people (that’s how you get to be an Executive, and it’s also how you both remain an Executive and how you can remain sane as an Executive).
“What”, they (the Executives in charge) ask, often adding emphasis by shaking things, “are they (you) up to?” It’s a bad moment, but what the professionals are up to is trying to work out what needs to be done to get answers that might be both valid and useful. The professionals are focused on that because they have been through painful processes that produced answers that were neither valid nor useful, at the cost of a great deal of stress. It’s the kind of waste of time, career and health disaster that changes people’s approach to problems fundamentally. But it opens a window onto one of the big problems with Data Science in the real world.
Data Science is dangerous. It’s dangerous because it wins arguments, it’s very hard to argue rationally against data, and it’s rather easy to present it very attractively & engagingly. So you can get people to pay attention to it, and you can convince them that you are right, and it’s new. It’s new in the sense that collecting, sharing and analyzing large amounts of data has been very hard and expensive up until ten or fifteen years ago. The techniques to do all of these things are emergent and very few people are familiar with them or can think critically about them.
This is a nasty problem with the Covid crisis. People are publishing poor data-driven analysis and “winning”.
Data Science, in the real world, like all Science, is a social process. You have to get your results out there, they have to be seen and they have to be accepted. When businesses are under pressure there is a premium on speed, and the first mover gets a huge advantage because everyone is waiting for a result and is keen to consume it. Additionally, as noted above, the professionals have often established themselves as “messing about”, “not focused”, “not onboard”, “inflexible”, “useless”. This means that a compelling result produced by an outsider is literally seized on by all stakeholders.
Apart from the professional data scientists.
For the professionals the outsider analysis causes two problems. The first is that it has to be investigated because stakeholders will accept nothing less, and this is almost certainly going to be a big waste of time. The second is that if this isn’t done diplomatically and respectfully then you are likely to alienate all concerned and lose your credibility in a big way. The outsider analyst has a good story on their side — they are working out of their responsibility doing things that are too hard for the professionals; “why are we paying these pointy headed guys to be grumpy?” This is double bad because it plays into the “bad attitude” narrative that gets established by the data science team not leaping to action on first contact with the problem.
Life is not fair though, and everyone expects (and by everyone I include professional data scientists themselves) to give the data science guys a hard time about every aspect of the work that they produce, and often this is not done very well, because often the people doing it are in this situation for the first or maybe second time in their career and they just haven’t learned how.
Table 1 shows how I feel when I am dealing with this.
There are four situations. In my experience outsiders mostly produce invalid (incorrect or unactionable) or obvious (known) results. This is not surprising, because they are not looking for the gotchas that the professionals got got with the first, second, third and nth time they were working on a problem. Worse for them is the fact that they aren’t working in a team. A functional team is a source of support and learning for everyone in it. It’s the cornerstone of a data science effort because it’s the place where the errors and gaps that trip you up get caught (often over a coffee) and squished before you get to the bearpit. This is why many of the criticisms of the work that the professionals do are so frustrating for them: this isn’t the first time that they thought of the problem that’s being raised, so can we please just get on to the good bit?! Of course if you end up on the bottom left you are all court screwed. It’s like a chasm opening under your feet, and it can happen to the best people, because this is work being done in non-ideal circumstances and under time pressure, and anyone can make a mistake.
Sometimes though outsiders do produce the kernel of something that’s worth looking into. Often this requires a lot more work to get it over the finish line, and often that work completely transforms what’s been found and what should be done. But nonetheless this is work that wouldn’t have happened without the outsider input. Any data scientist or data science team worth their salt will be cock-a-hoop about having this happen, and will heap praise and accolades onto the outsider responsible, but it’s very rare and unfortunately still negative in terms of creating a productive dynamic around high pressure data science. The correct response of jubilant discovery of a valid contribution does relieve the need to sort out the current mess, but it serves to reinforce the very wrong conviction that “the data science team just doesn’t have the right stuff and we could replace them with a bunch of guys like the guy from ops who found that thing in the last war room.”
So all of this militates against the professionals; they will be slower, will struggle to explain what they’ve done, will struggle to explain why the outsider analysis or folk wisdom is wrong and might make mistakes. But as observed, the professionals are the only people who really have any chance at all of making progress on these problems.
This is the dynamic that I am seeing playing out in the media and blogosphere with regard to Covid-19. The “outsiders” are launching their half-baked ideas at policymakers and politicians, catching attention with a bit of D3 or plotly while the professionals are trying to get on with actually understanding what’s going on in real time and providing valid, actionable advice. What do we need to do about this?
There are two things, the first is that we need to step back and listen to the quieter voices of the true professionals. They need time and space or they will not be able to do good work. Shaking things and shouting are the wrong reactions, as are running off and doing what the amateurs are saying and then pretending that that was a responsible thing to do.
Second, well, as consumers of data science, whether in a business context or in an epidemic, there are some meta level things you can do to review what you are seeing — as well as treating analysis critically as per Pueyo & Handley above. Remember — if you see evidence left out or patterns sifted from a sea of noise be skeptical. But even if you can’t see those specific problems in the work (and they are hard to spot) you can apply some filters to figure out if you are seeing the real deal or not.
First pay attention to the process that is used to produce what you are seeing. Was there a declaration of intent made before the analysis was whipped up, or did it emerge unannounced? This is important because if people are saying “ok we are going to look at this specific question using this specific data next week” then they have far fewer degrees of freedom to suddenly discover the way, the truth and the light, which then turns out to be a coincidence in the data, and they are working systematically, which is good. So, did they have an evaluation process which checked the analysis? Do they have standards about the data that they are using and how they are treating it? You don’t need to know too much about the details, just that they are using a systematic process and it’s not made of spit and straw.
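The “degrees of freedom” point can be put in rough numbers. This is a back-of-envelope sketch, assuming each extra cut of the data is an independent look with a 5% chance of throwing up a pattern by luck alone. Both assumptions are mine, and real analyses are messier than this.

```python
def chance_of_fluke(num_looks, false_positive_rate=0.05):
    """Probability that at least one of num_looks independent looks
    at the data shows a 'pattern' purely by chance."""
    return 1 - (1 - false_positive_rate) ** num_looks


for looks in (1, 5, 20):
    print(f"{looks:>2} looks at the data -> "
          f"{chance_of_fluke(looks):.0%} chance of a fluke finding")
```

Under those assumptions one pre-declared question keeps the chance of a fluke at 5%, while twenty undeclared cuts of the same data push it past 60%. That is why an announced question and dataset is worth so much.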
Second, look at the team. Is there a team — or is this a lone wolf? Are they working together or are people carving off bits without integration? Are they discussing and interacting and sharing? If you can’t see them together, presenting jointly and fielding questions with a united front then you are exposed to ego and error — beware!
Third, look at experience. Why should you invest in this person’s advice? Some outsiders have experience that the professionals lack. They’ve been there and done that, and if you are looking at work from someone like that then it’s worth taking it seriously. But even then beware the anecdote and the singular perspective. This is not a criminal court where a single proven accusation is more than enough to warrant a guilty verdict; we are looking at big pictures and trying to discern the whole story.
Fourth, look at behavior. Is the team or person posturing, or are they inviting debate? Are they open to discussion or are they asserting authority?
So this is the end, and I owe you an opinion, and some people might even have skipped here to find out what I actually think (why?)
I don’t have a clue about Covid-19 because :
1) I don’t have the data to work from. There is a bag of stuff that I feel that I would need to make any sort of sense of these numbers — the differences for enrolment in testing, the distribution of testing (around clusters or on borders, or generally), demographics of the infected, behaviours of the population, co-morbidities and more and more.
2) I know enough about handling data to know that I don’t understand the data on Covid-19 that I can see, and that if I want to understand this data I need to spend a lot of time learning about it. I don’t have a chance of making a contribution to the discussion until I have done that, and by the time I have done that I am pretty sure that all of this will be over.
3) I also need to think and learn about what to do with data like this, because right now I don’t know. Epidemiologists have spent many months of professional training learning about this kind of analysis, and then many years of professional practice developing this knowledge. As a fellow professional I see what they have done and appreciate the gap between their capability and mine, so I will read their stuff and try and gain a bit of knowledge about a parallel world. I might wonder why they aren’t trying out some whizzo method I learned about when doing something else several years ago — but only in the style of thinking “there’s a reason for that, they know this, they aren’t using it, there’s a reason, I wonder what it is…”
The people who can make these statements as professionals are out there issuing the official advice you are reading from the UK Government. They are going to struggle with the challenges that I’ve talked about in this article, and this will make their advice less attractive, less consumable, than some of the blogs and articles you will read from well meaning people who simply don’t understand the issues of this work.
Think critically about all the work that you read — boring and limited as it might be from those tied by professional practice and the limits of solid inference, or exciting and wide-ranging from people armed with MS Excel, some interactive charts and more confidence than sense. Also mostly, almost entirely wrong.
I am applying my tests and critical thinking to the analysis that I see, and you must too if you are going to remain sane. But I won’t be brewing up any outsider takes to distract anyone else, and I hope that you won’t either.