AI versus Covid-19
Let battle commence
I was first asked to write an article on the impact of AI way back in February, when the world was a very different place. In the UK we were free to move around and the biggest sin at the time was shaking hands with someone. Within a matter of weeks most of us were on lockdown and the very real threat of death was at our doorstep.
As our lives have changed, so has the focus of AI. My own work as an AI adviser has gone from helping organisations create medium-term AI strategies to focusing on ‘what can we do right now’. This has ranged from looking at how customers are changing the way they talk about coronavirus on a week-by-week basis, to seeing how the profile of donors giving to charity is changing week by week (charities are set to be huge losers in this crisis).
I want to therefore talk about how Covid-19 has impacted the world of AI and, much more positively, how Covid-19 can be impacted by AI. I’ll then look at what might be different for AI once ‘normality’ returns.
Covid roughs up AI
The vast majority of artificial intelligence today works on the principle of ‘supervised learning’, meaning that it builds its models based on known information, whether that be labelled images or historical data. For example, if you want your AI to recognise pictures of dogs you need to have a data set of many thousands of pictures of dogs all labelled ‘dog’. You also need just as many pictures that are not of dogs, all labelled ‘not dog’. By feeding these images into the AI, it learns to recognise the features and aspects that best define a dog (to be precise it is only recognising patterns of pixels, but that’s a whole other discussion). Show it a new image, and it is able to work out, with a certain degree of confidence, whether that picture contains a dog or not. In a similar way we can get AI to predict the price of a house by feeding it the features of many houses (number of rooms, garden size, age, etc.) along with their prices so that, when presented with a new house, it is able to give a good estimate for its value.
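To make the house-price example concrete, here is a minimal sketch of supervised learning in Python. It assumes a single feature (number of rooms) and a handful of invented prices, and it uses a plain least-squares line rather than anything a production system would rely on; the names `fit_line` and `predict_price` are illustrative only.

```python
# Toy supervised learning: learn price from number of rooms.
# All figures below are invented purely for illustration.

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x

# "Labelled" historical data: each house's feature and its known price.
rooms = [2, 3, 4, 5, 6]
prices = [150_000, 200_000, 250_000, 300_000, 350_000]

slope, intercept = fit_line(rooms, prices)

def predict_price(n_rooms):
    """Estimate the price of a house the model has never seen."""
    return slope * n_rooms + intercept

print(round(predict_price(7)))  # estimate for a new seven-room house
```

The model knows nothing about houses as such; it has only found the pattern that links the numbers it was given, which is exactly why it depends so heavily on the past resembling the future.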
In that second example, there is a big assumption that we make: what happened historically will probably happen in the future (maybe with an underlying trend up or down as well). But what happens when that assumption is no longer valid? What happens when people are barred from actually being able to purchase houses? What happens when the world goes through an unprecedented event like a pandemic?
‘Unprecedented’ is the word that AI hates the most. Once we get into unprecedented territory then any AI that relies on supervised learning (which is the vast majority) will really struggle to make any sense of it. One answer is to ignore all our old models and build new ones, which is exactly what many data scientists and AI experts are doing, and they are pointing their algorithms directly at Covid-19.
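The failure described in this section can be sketched in a few lines. The example below assumes an invented monthly sales series with a steady upward trend, fits a straight line to it, and then compares the forecast error in two hypothetical futures: one where the trend carries on and one where a lockdown collapses demand. The figures and function names are made up for illustration.

```python
# A trend model trained on 'normal' history, tested against a shock.
# All numbers are invented for illustration.

def fit_trend(values):
    """Least-squares straight line through (0, v0), (1, v1), ..."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    var = sum((x - mean_x) ** 2 for x in range(n))
    slope = cov / var
    return slope, mean_y - slope * mean_x

def mean_abs_error(slope, intercept, start, actuals):
    """Average gap between the forecast and what actually happened."""
    errors = [abs(slope * (start + i) + intercept - actual)
              for i, actual in enumerate(actuals)]
    return sum(errors) / len(errors)

history = [100, 103, 104, 107, 108, 111]   # six months of steady growth
slope, intercept = fit_trend(history)

normal_months = [112, 115]    # the future if history had repeated itself
lockdown_months = [40, 35]    # the future after an unprecedented event

error_normal = mean_abs_error(slope, intercept, len(history), normal_months)
error_lockdown = mean_abs_error(slope, intercept, len(history), lockdown_months)

print(f"error if the trend continues: {error_normal:.1f}")
print(f"error during lockdown:        {error_lockdown:.1f}")
```

The model itself has not changed between the two cases; only the world has. That is why retraining on fresh data, rather than tuning the old model, is usually the more honest response.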
AI fights back
The efforts to fight Covid-19 using AI tend to fall into three areas of AI capability: prediction, Natural Language Processing (NLP) and optimisation, all of which have their own challenges. The first two areas focus on the structured data (that is, the pure numbers: cases, deaths, location, etc.) and the unstructured data (the huge amount of research that is being published on the subject right now) respectively. Work is being done at a grassroots level on both of these fronts: Kaggle is an online platform (owned by Google) that hosts competitions for data scientists, and it currently has two open competitions running, one that provides a huge repository of Covid-19 research to mine for insights, and another that tries to forecast the spread of the virus. The optimisation efforts are much more corporate: this is the ability to find potential cures for the coronavirus through drug discovery modelling.
For the first of those Kaggle challenges the White House and a coalition of research groups have prepared the ‘COVID-19 Open Research Dataset’ or CORD-19. This data set contains over 51,000 scholarly articles about COVID-19, SARS-CoV-2, and related coronaviruses. The idea is that data scientists will use NLP, a sub-set of AI focused on making sense of language, to trawl through all of the research and extract meaningful insights. To give some focus, the competition organisers have provided nine questions to answer, including ‘What do we know about COVID-19 risk factors?’ and ‘What do we know about non-pharmaceutical interventions?’.
The challenge will be sorting the wheat from the chaff. Because of the sheer number of research papers there will be some real rubbish in there (for example, one claims the virus is from outer space). But, perhaps, some of those more left-field theories may actually be the gems that we are looking for.
Another area where NLP is helping relieve the Covid-19 burden is through automating inbound communications. Many back-end processes can be automated using technologies such as Robotic Process Automation (RPA) but they require structured data to work on, and people emailing their bank, for example, do not communicate in a structured way. AI though can take those unstructured, natural language emails (or texts, or social media feeds) and make sense of them so that they can be processed downstream. Think of it as a huge triaging operation for your incoming communications, allowing the human agents to focus on the really important queries.
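As a rough sketch of that triaging idea, the Python below trains a tiny bag-of-words naive Bayes classifier on a few invented customer messages and uses it to route new ones. The categories, the example messages and the function names are all assumptions made for illustration; a real system would be trained on thousands of genuine messages and would hand uncertain cases to a human agent.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    """Lower-case the text and strip basic punctuation from each word."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return [w for w in words if w]

def train(examples):
    """Count words per label for a multinomial naive Bayes model."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab

def classify(model, text):
    """Return the most probable label for a new message."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    tokens = [t for t in tokenize(text) if t in vocab]  # skip unseen words
    best_label, best_score = None, -math.inf
    for label, doc_count in label_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(doc_count / total_docs)
        for token in tokens:
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log((word_counts[label][token] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented training messages with hypothetical routing labels.
examples = [
    ("I cannot afford my mortgage payment this month", "payment_holiday"),
    ("Please can I pause my loan repayments", "payment_holiday"),
    ("I need a break from my mortgage payments", "payment_holiday"),
    ("My card was stolen yesterday", "lost_card"),
    ("I lost my debit card", "lost_card"),
    ("Someone took my card please block it", "lost_card"),
    ("What are your branch opening hours", "general"),
    ("Is my local branch open during lockdown", "general"),
    ("How do I update my address", "general"),
]

model = train(examples)
print(classify(model, "I have lost my card"))
print(classify(model, "Can I pause my mortgage payments?"))
```

Once each message carries a label like this, it becomes structured data that a downstream RPA process can act on.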
The structured data challenge on Kaggle has been set by the Roche Data Science Coalition (RDSC). They have curated a collection of datasets from 20 global sources, including from Johns Hopkins, the WHO, the World Bank and the New York Times. These data sets contain data that describes local and national infection rates, global social distancing policies, and geospatial data on movement of people. They are asking questions such as ‘Which populations are at risk of contracting COVID-19?’ and ‘Which populations of clinicians and patients require protective equipment?’.
The biggest problem that the data scientists will come up against here is the one I described earlier: there is little historical information on which to base those future predictions. But, in a crisis, some data is better than no data, and these are important questions to answer. Caution must be exercised, though, when the actions taken based on the outputs of these models could have a huge impact on tens of thousands of people’s lives. There is a very fine line between ‘quick and dirty’ and ‘very risky’. The example I gave at the top of this piece, which uses AI predictions to help charities identify those people most likely to donate, is an ideal use case: the AI can pick up on the fast-changing dynamics, and the risk of getting it wrong (someone gets an email and they don’t donate) is minimal.
The third major area where AI is trying to help the fight against Covid-19 is in drug discovery. Much of this work is done on very powerful computers simulating thousands of different scenarios of different molecule interactions and is therefore mainly in the realm of big corporates or well-funded startups. This is all great work, but the discovery phase is only the first of many phases, all of which take time and money. So, coming up with, say, 10 possible cures for coronavirus might sound good, but it means that each one has to be rigorously tested in the real world, even just to be excluded as a potential candidate. And then they have to be assessed as to whether they can actually be manufactured at scale. So, we shouldn’t think of AI as the silver bullet here; we need to think of it in the wider context of real-world problems and challenges.
Right now, of course, anything anyone can do to help halt the virus or at least slow its spread is to be welcomed. But what about when all of this is over? How will AI look in a post-Covid world? Will its battle scars look like wounds or medals?
AI emerges into a post-Covid world
When the dust has settled and we start to go about our ‘normal’ lives again, one thing is for certain: many of the predictions that AI had been making so well before the crisis will be royally screwed up by the huge blip that is coronavirus. All that historical data that was showing lovely, incremental trends will either have a massive dip or a huge spike during the first half of 2020. And that will make future predictions much harder. Also, many industries may not return to anything like what they were before the crisis. Already there is talk about the airline sector looking very different, with less business class travel, fewer large jets and higher prices. Again, long-term predictions will be less accurate than before the crisis for these industries.
But I think the overall impact on AI once the crisis is over will be net positive. As in the examples given here, AI will have shown people that it can indeed be relevant and helpful in a fast-changing, dynamic environment. Crucially, when applied to the right use cases, it will have shown that it can provide real, practical benefits, some of which will have saved lives.
The value of data (and there has been a lot of data generated these last few months) should also be clearer than ever. There was a huge collaborative effort to make large swathes of data available to the data scientists, and this openness should hopefully become the new norm. Any organisation (especially from the BigTech stable) that tries to close down data sources or seeks to exploit for profit the data that has been made available will be looked at with new eyes.
I also think that the crisis will have an impact on how AI is done. We all know that the technology is notoriously difficult to implement, but some organisations (in my honest opinion) have tried to make it seem more difficult than it actually is, in order to increase their pricing. The work that was done during the crisis will expose any organisations that try to go back to their old ways once the pandemic is over.
What this crisis is showing us is that amazing things can be done quickly and collaboratively with AI if the motivation and mindset are there. As we look to the future of a post-Covid world, let’s hope we can try to maintain that sense of purpose and drive so that we can realise AI’s full potential.