Data, Prediction and Culture (Or How I Learned to Live Without My Flying Car)

The world is chaotic and we can’t predict anything. Does this really matter?

Pete Sueref
13 min read · Sep 4, 2020
Image via Wikipedia

Two and a half thousand years ago, the ancient Greeks used to visit a temple in Delphi to hear the Oracle of Apollo issue gnomic predictions about their futures.

In ancient Rome, soothsayers would sacrifice a sheep to the gods, cut open the body to extract the liver, dissect and examine it, and then tell the emperor whether or not they would win the next battle against the Goths.

In Renaissance Europe, travellers used to follow the tea routes and then read fortunes in the leaves at the bottom of the cup.

Today, hedge funds use sophisticated models with arcane mathematics to predict the vagaries of the market.

Is one of these things unlike the others? Maybe. Maybe not.

In 2007, Warren Buffett made a million-dollar bet with the hedge fund industry that the market would outperform a selection of handpicked, actively managed funds over a ten-year period. In other words, he bet that the people who are paid to predict whether stocks will rise or fall would do worse than if you had just taken a cross section of stocks from across the market and stuck with them.

In 2017, Buffett won his bet. And yet hedge funds continue. As do horoscopes, fortune tellers and political analysts.

In June I gave a talk at a virtual conference about data science called Data, Prediction and Culture. Flying cars didn’t feature explicitly in the talk — nor do they feature in this article — but they’re a common trope used to illustrate what life may look like in the future, along with robot butlers, spaceflight and high-fidelity instant video communications. Commercial spaceflight is almost here. AI assistants are starting to become prevalent. And this talk was delivered via Zoom with participants attending from across Europe. William Gibson famously said that “The future is already here, it’s just not very evenly distributed”. Our ability to understand trends and outliers is not at the same level as the progress we make year-on-year in science and technology. And this has pervasive effects in the workplace but also in our lives.

This is a theme I’ve explored in previous talks but it feels more relevant given that the COVID-19 crisis is the starkest illustration of how the improbable can blow up all of our best laid plans.

The following article will cover most of the content of that talk, as well as touching on some other points from previous talks around how we live, work and innovate in a world of noise and data.

On Wednesday June 17th 2015, the headline of the sports pages in the Guardian read “Leicester sack three players over racist orgy on Thailand tour”. One of the sacked players was James Pearson, son of the manager. Two weeks later his father Nigel also parted ways with Leicester City, making way for Claudio Ranieri, at that point a journeyman who had just failed spectacularly with the Greek national team. Eleven months later Leicester were crowned champions for the one and only time in their history, having been 5000–1 outsiders at the start of the season. Even late in the season Gary Lineker publicly declared he would present Match of the Day in his underwear should this improbable event occur. Occur it did.

Leicester City lifting the Premier League Trophy
Image via Wikimedia

The following year, two more events occurred that beat the odds. In June, Brexit was voted for by the UK, with bookmakers up until the night before still favouring Remain. And in November, Donald Trump defied the polls to win the US election having started the primaries a year earlier as the rank outsider.

And now, in 2020, a virus has made its way from animal to human and shut down normal life for weeks, months, perhaps years to come.

We have collected billions of data points on football, elections, and viruses. And yet, we have been unable to predict with any reliability either the likelihood, or the severity, of any of these and myriad other events which impact businesses, the economy, public consciousness and ultimately our lives.

And if you think I’m cherry picking specific events to prove a point, this is not to mention the Black Lives Matter movement, floods across the UK, fires engulfing Australia or any of the other multitude of unpredictable occurrences that have happened this year alone. I could also talk about a volcano erupting in Iceland or a tsunami off the coast of Japan or the financial crash of 2008 or Russia’s hostilities towards Crimea. And before that, the UK crashing out of the ERM, the miners’ strike, the oil crisis and the three-day week, world wars, famines, depressions and on and on.

A black swan on water
Image via Wikimedia

Nassim Nicholas Taleb wrote a book about this sort of event called The Black Swan: the things that you cannot predict because you cannot even comprehend them happening. Not all of the examples above are black swans, but all of them have unexpected impacts on human behaviour and create noise in complex systems. Is it possible to predict anything when the world is inherently so volatile?

Usually in a talk I like to invite the audience to participate. This is difficult in a virtual talk with the current technology. Spontaneity is not supported, and as we all get to grips with the technology there are social protocols that we haven’t figured out yet. The raising of a hand, making eye contact, discreet coughs, adjusting the chair — cues that reveal something about the intent and the state-of-mind of the participants — are all difficult to replicate behind a screen.

What’s the solution to this? It’s still hard to say. Which is not to suggest virtual conferences don’t work, but perhaps they’re still missing the secret ingredient that makes them really valuable. Normal conferences have had years of refinement and practice to work out the kinks, and even then many of them may be boring or obvious (mea culpa!). Much of their strength lies in the networking, the random encounters, or the last-minute decisions to visit this stall or listen to that talk. How do we make virtual conferences work like this? It’s difficult to predict.

What will the next ten years bring? As a start, may I humbly suggest some of the following possibilities: full home automation, wide-scale driverless cars, Michelin-starred lab-grown meat, microgrids, digital currency, the end of privacy, another great depression, applied quantum computing, new nation states appearing, war, the break-up of the UK, the break-up of the USA, the growth of the EU, nuclear fusion, humans on Mars, general AI, nanotechnology and human genetic engineering. Many of these are already here, some in their infancy. Others are less likely to happen. But could you say for certain which? And what are the impacts of these events on the economy, on our social systems, on our lives?

A futuristic digital display in a car
Photo by asawin on PxHere

And what are the second or third order effects of these? The Andreessen Horowitz analyst Benedict Evans predicts a future where driverless cars are the death knell for smoking, given that most cigarettes are bought at gas stations, a potential casualty of the coming AI revolution (a driverless electric vehicle will charge itself when convenient). Going further, this would potentially affect tax income through lost cigarette sales, but also health spending due to reduced cases of lung cancer and emphysema. What effects do these have? It’s easy to see that single events have cascading impacts, each like its own Mandelbrot set. Taken as a whole, realistically, any social prediction is unreliable at best and dishonest at worst.

And yet… the pace of technological change continues. Think back to television shows and films from the 70s, 80s, 90s. Star Trek’s instant communications and handheld tricorders. James Bond’s tech-filled watches and a car that could drive itself when needed. These technologies have not just landed, they have become indispensable and ubiquitous in no time at all. The modern web has only been with us for twenty years, smartphones for thirteen, social networks aren’t old enough to vote (even though they arguably hold a lot of sway over our politics). The pace of change is fast.

Working in data science, then, presents a strange dichotomy. Expectations from our customers and colleagues are high, having seen the accelerating technological changes of the last couple of decades. And yet, we are asked to build models to predict an uncertain future, to make sense of a random world. How do we manage the contrast between these competing factors?

The virtual conference is a good case in point — we have incredible technology, the ability to make instantaneous high definition video available across multiple time zones with very little friction. And yet think about every video conference you’ve ever been a part of. “Can you hear me?”. “Am I on mute?”. “Video is lagging, I’m going to audio to preserve bandwidth”. “Sorry, that was my dog / the kids / the postman…”.

Nobody ever had to contend with a WiFi outage or wandering family members in Star Trek or James Bond or The Avengers or Star Wars or Back to the Future. The reality is different from the cultural image that has been burned into our subconscious, over and over, in TV and film.

In fact, this problem of expectation versus reality is really just scratching the surface. The deeper concern that permeates our lives is how far removed we are from understanding the technology and the data presented all around us. The current Coronavirus crisis is a case in point: each day we are treated to statistics, graphs, facts. Many people will have seen log-scale graphs for the first time when looking at relative cases across countries (this is a chart where one of the axes increases by a constant factor at each step rather than by a constant amount, for example 1, 10, 100, 1000…). Or been asked to understand second derivatives when looking at how quickly the death rate is falling or rising. And then have to understand the issues concerning the efficacy of masks, two-metre distancing versus one-metre distancing, whether vitamin supplements help and whether obesity or race are a factor in the severity of symptoms. Each of these is debated by experts, analysed by pundits, and put into action entirely differently across countries.
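To make the log-scale idea concrete, here is a minimal sketch in Python. The case numbers are invented for illustration (a hypothetical outbreak growing 25% a day), not real COVID-19 data. It shows why epidemiologists reach for a log axis: steady percentage growth, which curves ever more steeply on a normal chart, becomes a straight line once you take logarithms.

```python
import math

def exponential_series(start, daily_growth, days):
    """Hypothetical case counts that grow by a fixed percentage each day."""
    return [start * (1 + daily_growth) ** day for day in range(days)]

# Assumed, made-up numbers: 10 cases on day 0, growing 25% per day.
cases = exponential_series(start=10, daily_growth=0.25, days=8)
log_cases = [math.log10(c) for c in cases]

# On a linear axis the day-to-day increase keeps getting bigger...
linear_steps = [later - earlier for earlier, later in zip(cases, cases[1:])]

# ...but on a log axis every day adds the same height, log10(1.25),
# so the curve is drawn as a straight line.
log_steps = [later - earlier for earlier, later in zip(log_cases, log_cases[1:])]

print("linear steps:", [round(step, 1) for step in linear_steps])
print("log steps:   ", [round(step, 4) for step in log_steps])
```

The useful reading of such a chart follows from this: a straight line means constant percentage growth, and a line that bends downwards means growth is slowing, even if the raw numbers are still rising.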

What do we do? Give up on statistics and prediction entirely? Throw our hands up in the air and shout “what’s the point?”. Clearly not. Perhaps I was disingenuous above when suggesting we can’t reliably predict anything. After all, the odds at a bookmaker just reflect the market (at 1000–1, the 1 is still expected to happen roughly once in every thousand attempts). Polling gives a percentage estimate that a particular candidate will be elected and 48% isn’t zero. Some hedge funds do actually make money for their clients (Buffett’s Berkshire Hathaway fund continues to outperform the market).
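The arithmetic behind long odds can be sketched in a few lines of Python. This assumes fractional odds quoted "against" (1000–1 means 1,000 ways to lose for every 1 way to win) and ignores the bookmaker's margin, so real implied probabilities will be somewhat lower than this simple conversion suggests. The function names are my own, chosen for illustration.

```python
def implied_probability(odds_against):
    """Convert fractional odds of N-1 against into a probability, 1 / (N + 1)."""
    return 1.0 / (odds_against + 1.0)

def chance_of_at_least_one(probability, trials):
    """Probability that an event occurs at least once in independent trials."""
    return 1.0 - (1.0 - probability) ** trials

p = implied_probability(1000)
print(f"Implied probability at 1000-1: {p:.4%}")
print(f"Chance of at least one hit in 1000 such bets: {chance_of_at_least_one(p, 1000):.1%}")
```

The second figure is the interesting one. With enough long shots on offer across sport, politics and markets, it would be more surprising if none of them ever came in.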

But the point is that certainty, or even near-certainty, is difficult. And certainty in the future, when looking at the social sphere, is impossible. There is still snake-oil being sold and marketed as “AI”, the opinion that “you can prove anything with statistics” is still prevalent and unhelpful, both in a business and social context, and there is still the expectation that some magic technology solution will solve the world’s ills, from cancer to climate change.

While there isn’t a magic remedy to the problem of expectations versus reality, there are some actions we can take as individuals, as organisations and as a society, that can help us to better face the challenges.

Without wishing to propose widespread policy recommendations, or structural changes to the way our society is organised, perhaps I could humbly offer some simple and relatively easy suggestions that may help, at least a little.

Education. This has been emphasised personally to me given that I am now the primary teacher to three young children while lockdown continues. And while I think education in basic statistics and critical thinking is valuable for school children, it is also essential for adults. One of the things I’m most proud of is that the data science team I run have produced courses to teach basic programming, data science fundamentals and statistics.

Understanding confidence intervals, p-values, the aforementioned log scale and other core concepts is vital for decision makers in businesses. Do you know whether the result of the A/B test you’re running is statistically significant? Are you sure the time series forecast will be accurate three years into the future? Are you certain the correlation between sales and zodiac sign of the customer is meaningful? Statistical literacy helps prevent these mistakes. Education has the effect of inoculating a person against the misuse and misinterpretation of data.
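As an illustration of the A/B test question, here is a minimal sketch of one common check, a two-proportion z-test, using only the Python standard library. The conversion numbers are invented, and the test is one reasonable choice among several rather than a recommendation for every experiment. It assumes large samples and independent visitors.

```python
import math

def two_proportion_z_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Return (z, two-sided p-value) for the difference between two conversion rates."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    # Pooled rate under the null hypothesis that A and B convert equally often.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical test: variant B converts at 11.5% against A's 10%.
_, p_small = two_proportion_z_test(200, 2000, 230, 2000)
# The same rates with ten times as many visitors.
_, p_large = two_proportion_z_test(2000, 20000, 2300, 20000)

print(f"2,000 visitors per arm:  p = {p_small:.3f}")
print(f"20,000 visitors per arm: p = {p_large:.6f}")
```

The same apparent uplift is indistinguishable from noise at the smaller sample size and hard to explain by chance at the larger one. That is the mistake statistical literacy guards against: calling a winner before the data can support it.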

Honesty. A known problem in science is the lack of papers published which show negative results. Only the flashy, attention-grabbing papers with positive outcomes get into the top journals. And as such, the millions of hours of cumulative drudge-work resulting in non-significant results never see the light of day. This gives the impression that science is a steady progression forwards and scientists, engineers and technologists are all alchemists in ivory towers, able to bend the world to their will. This is dangerous and exclusionary, and only deepens the apprehension many have about science and data and technology.

In my Data Science team, we have a monthly show-and-tell of projects we have been working on. Importantly, we try to show all the projects we work on in all the various stages of development, including those that never make it — because they ran out of funding or just didn’t work. Demystifying the process invites more people in. Failure is normal and accepted, particularly when it’s learnt from. The biggest benefit though is cultural — it helps set expectations, removes the veil, and rather than encouraging our customers and colleagues to ask for flying cars, encourages reasoned inquiry, intelligent questioning and potential products that are realistic and impactful.

Collaboration. This probably isn’t the best descriptor. Diversity of opinions while remaining understanding and welcoming of new ideas and approaches? (There may be a German word for this). Working in a diverse group, whether that’s regarding politics, race, gender or sexual orientation, gives us superpowers. It allows us to create things that work across the spectrum, lets us test our assumptions safely and cheaply, and prevents echo chambers which lead to stagnation. A mix of opinions and beliefs is also stronger than the individual in prediction, estimation and thinking: The Wisdom of Crowds by James Surowiecki describes this with dozens of examples. But as well as helping us cognitively, collaboration and diversity let us cope with whatever the future throws at us, predicted or not.
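The statistical side of that claim can be shown with a toy simulation. This is my own sketch, not an example from Surowiecki's book, and it rests on a strong assumption: each person's guess is independent and unbiased, scattered around the true value. Under that assumption the average of the crowd lands much closer to the truth than a typical individual does.

```python
import random
import statistics

def crowd_vs_individuals(truth, noise_sd, crowd_size, seed=0):
    """Compare the error of the crowd's average guess with the average individual error."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    # Assumption: independent, unbiased guesses with normally distributed errors.
    guesses = [rng.gauss(truth, noise_sd) for _ in range(crowd_size)]
    crowd_error = abs(statistics.mean(guesses) - truth)
    typical_individual_error = statistics.mean(abs(guess - truth) for guess in guesses)
    return crowd_error, typical_individual_error

crowd, individual = crowd_vs_individuals(truth=1000.0, noise_sd=150.0, crowd_size=500)
print(f"Error of the crowd's average guess: {crowd:.1f}")
print(f"Average error of an individual:     {individual:.1f}")
```

The caveat matters as much as the result. If everyone shares the same blind spot, the errors no longer cancel and the average inherits the bias, which is one more argument for genuinely diverse groups over large but like-minded ones.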

In a business context, collaboration means working across departments and breaking down barriers. Fiefdoms and empire-building are common in large companies, and these are the enemies of rational, science-based thinking. Product teams comprising elements from across different functions are able to build things together with the same set of beliefs and expectations.

My final point on collaboration is best stated by a slide I’ve started including in recent presentations: “You don’t need a business case for kindness”. This should be the default position. It’s possible to hold differing opinions while remaining respectful. Everybody is dealing with something, particularly in the current crisis, and kindness to our colleagues and customers will help us far more than flying cars or predicting the future.

When I finish a talk, I usually give some book recommendations that inspired the talk as well as some tongue-in-cheek predictions, mostly to give a satisfying conclusion to the event, but also to highlight the absurdity of making predictions in public.

First the book recommendations: three related books (related mostly by the fact that the authors have complicated relationships with each other). The Black Swan by Nassim Nicholas Taleb is packed full of examples, like those above, as well as the impacts they have in a wider context. The Signal and the Noise by Nate Silver is a foray through the world of statistics via political polls, weather forecasting and sports betting, and is very readable. And finally, Superforecasting by Philip Tetlock, who tries to understand why some people actually can make accurate predictions.

Regarding my predictions — I don’t have a great record at the end of these talks. In 2017 I predicted that Donald Trump would be impeached by Christmas. It actually took a couple of years longer. I’ve also predicted that the next US president would be a woman. This wasn’t based on anything other than wish fulfilment. However, I feel like that particular wish has been vindicated based on the performance of women leaders of nations reacting to COVID.

So on a related note, here’s my first prediction: In my lifetime, the CEOs of FTSE 100 companies will remain resolutely male (let’s say 70%). This is gloomy and I hope it is proved wrong. But when you try to predict the weather tomorrow, the most reliable way is to look at the weather today. Perhaps we’ll learn something from the handling of Coronavirus? Perhaps a new social movement will usher in fifth-wave feminism?

My second prediction: the effects of the pandemic will be with us for a long time (1–2 years) and then will be forgotten almost immediately as we return straight back to the world as we left it. The promises of societal upheaval, reduction in carbon, universal basic income — none of them will come to pass as a result of the current pandemic. The world is already rushing to continue business as usual even while at the time of writing there are hundreds of new cases every day.

My final prediction: that something none of us have predicted will come out of the blue in the next ten years and change everything all over again.

I hope it’s something positive.

Cardiff, June 2020

Pete Sueref

Data Science, Innovation and Tech. Father and widower trying to make sense of things. Nobody knows what they're doing.