ICTC’s Tech & Human Rights Series

AI, Ethics, and Existential Risk

A Conversation with Dr. Stuart Armstrong

ICTC-CTIC
June 15, 2020

On 9 March 2020, ICTC spoke with Dr. Stuart Armstrong as part of ICTC’s Tech & Human Rights Series. Dr. Armstrong is a Research Fellow at the University of Oxford’s Future of Humanity Institute, where he focuses on the safety and possibilities of Artificial Intelligence (AI), and he has worked with organizations such as Google’s DeepMind. In this interview, Kiera Schuller and Dr. Armstrong discuss the intersections of AI and ethics, diving into topics such as existential risk, the safety and possibilities of AI, the ability to map humanity’s values onto AI, and the long-term potential for intelligent life across the reachable universe.

Photo by Ross Sokolovski on Unsplash

Kiera: Thank you so much for making the time to speak with me today, Dr. Armstrong! You are a mathematician by training, with a background in probability. Can you share how you came to study futurism, existential risk, and AI at the Future of Humanity Institute at Oxford?

Dr. Armstrong: I was originally working at a medical research company, and I would drop in occasionally at the Future of Humanity Institute, where they were working on various interesting, important and science-fiction-like problems. They seduced me with these problems and made me care deeply about the fate of humanity, which meant that I was then stuck there afterwards!

Kiera: What kinds of problems were they working on at the Institute, at that time?

Dr. Armstrong: A variety of things, especially initially, because at first the aim was to look at all the major threats and possibilities for humanity. We were looking at things such as human enhancement, space flight, space colonization, big pandemics, nuclear war, AI, climate change — all the things of that nature. In the years since then, we have refined it down a bit. We now seek to focus on areas where there is leverage for our work to make the most difference. For example, this is one of the reasons why we don’t do anything on climate change at the moment, as climate change is already saturated with people working on it. We focus on other areas where our work can have impact, like AI risk, pandemic risk (which is topical at the moment), synthetic bio-versions of pandemics, and things of that nature.

We also give less attention to exploring positive possibilities because it seems that our avenues for influence in those positive areas are relatively narrow, and there are a lot of people in the world who are already going to be working on them. As one example, we mainly don’t look at positive human enhancement possibilities because, despite all the worry or hope that people have in this area, these enhancements are not going to have much impact for the foreseeable future. Smartphones have already changed humans more than any drug we could develop.

Kiera: You’ve just touched upon several of the main risks that you look at, but could you expand a bit on these?

Dr. Armstrong: At the moment, the two big risks seem to be: (1) the AI risk, which I’m working on and which has the most uncertainty, and (2) the pandemic (and biotechnical versions of the pandemic) risk, which we are also spending a lot of time on. Now, AI may turn out to be quite weak, with no reason to worry. But it may turn out to be the opposite. One thing we look at, for example, is how likely or possible it is in general to recuperate after a disaster, and this is where AI is unique, because AI has some of the greatest uncertainties of all. If AI goes wrong, it will likely be the hardest disaster for humanity to recuperate from, because there would be an intelligent adversary, and intelligent adversaries don’t go away. In contrast, pandemics and natural disasters can cause huge amounts of destruction, but once they’ve burnt themselves out, they have burnt themselves out.

Kiera: I hope this isn’t too technical, but how do you go about forecasting or modelling these risks, particularly for AI? It seems like a very difficult thing to do.

Dr. Armstrong: It depends on the risk. We can look at two extremes. On one side, let’s take the risk of asteroid impact. Here, it’s fairly simple; it’s science: you’ve got telescopes, estimates for the number of asteroids, and estimates for how likely they are to hit the Earth, and it turns out the risk is quite low. At the opposite end is something like AI, which is so uncertain that it is very hard to get a decent estimate. The main thing I’ve found is that the risk is not low enough to be ignored. The numbers fluctuate a bit, but depending on the most recent data, the risk of a serious disaster this century (such as AI-enabled warfare) is around 10–25%. Anything above 5% is worth paying attention to, so it doesn’t matter much if the estimate fluctuates. If the risk is sufficiently large, then I am committed to working on it.

Now, these are the two extremes: asteroid impact as a very scientific process, and AI, which has extreme uncertainties. In between, there are a variety of other risks: for example, the risk of a war, the risk of economic collapse, or what people would do after a supervolcanic eruption. These take a mix of hard science (for example, estimating what a supervolcanic eruption is really like) and softer science (for example, trying to predict the future of politics or people’s reactions to these disasters). In predicting people’s reactions, we have some evidence for how people behave in actual disasters (generally much better than the stereotype suggests), but there is still no perfect sample that fits exactly what we would encounter. In these mixed areas, the ideal is to investigate the topic long enough to find either that the probability is low enough that we no longer need to work on it, or that it’s high enough that we should shift more focus to it.
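To make this triage logic concrete, here is a minimal Python sketch. It is an editorial illustration rather than anything from Dr. Armstrong’s own models: the 5% attention threshold and the 10–25% range come from the interview, while the asteroid numbers and the code structure are illustrative assumptions.

```python
# A minimal, illustrative sketch of the triage logic described above.
# The 5% attention threshold and the 10-25% AI-disaster range come from
# the interview; the other numbers and the structure are assumptions
# added purely for illustration.

ATTENTION_THRESHOLD = 0.05  # "anything above 5%" deserves attention

# (risk name, low estimate, high estimate): estimates fluctuate, so we
# carry a range rather than a single number.
risk_estimates = [
    ("asteroid impact this century", 0.0001, 0.001),          # illustrative figures
    ("serious AI-related disaster this century", 0.10, 0.25),  # range cited in the interview
]

for name, low, high in risk_estimates:
    if high < ATTENTION_THRESHOLD:
        verdict = "low enough to set aside"
    elif low >= ATTENTION_THRESHOLD:
        verdict = "worth sustained work, even if the estimate fluctuates"
    else:
        verdict = "uncertain: keep investigating until it resolves either way"
    print(f"{name}: {verdict}")
```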

Kiera: You have worked with policymakers, governments, companies and other stakeholders, in relation to managing risk. Do you generally find there is a strong interest outside of academia in understanding and addressing risk?

Dr. Armstrong: Yes. We’ve worked with a range of stakeholders: insurance companies, civil servants, various governments, corporations, and some philanthropists. Although some risks are unhedgeable, some risks that are hedgeable are not hedged simply because of incentives. For example, some of the biggest insurance companies do have the ability to hedge some of the biggest risks, like a coronal mass ejection hitting Earth (which could wipe out the electricity system), but that requires shifting individual incentives and risking profits, which they don’t want to do. So while insurance companies are doing some things in this area, they are often not doing enough, because they don’t have the incentives to do it. Thus, insurers might cover some job losses due to AI, but not mass unemployment, huge social disruption, or major AI-caused disasters.

And who does have the incentives in that area? Alas, very few people and not many organizations. That said, governments, civil servants, and some philanthropists tend to take a longer view and look at disaster preparedness, so there is more willingness there.

Kiera: You have discussed the possibility of “mapping” humanity’s partially defined values (and I like that, because I think “partially defined” is an important qualifier) onto AI in order to steer it in a better direction. Is it possible to map values onto AI and construct AI systems in a way to steer a future we desire? What would that look like? How would you determine which values are selected?

Dr. Armstrong: So, it turns out you can’t just point an AI at human behaviour and get it to deduce what human preferences are, at least not without making assumptions. We humans are very good at making these assumptions ourselves, so we rarely get them wrong, but that is because humans can interpret human behaviour quite well. Current AI, in contrast, cannot.

Even if we gave it infinite amounts of data and processing power, AI could not deduce what human preferences are, because it’s not just a question of data. It doesn’t matter whether the AI is a neural net, a forecasting ML system, or some radically new design. For any AI design, there are some key assumptions we need to put into its algorithm: we need to get a “human theory of mind” into the AI. If that is done, then it is just a question of pointing the AI at humans and having it sort out what the various pieces of our preferences are. And then we must deal with the fact that human preferences are contradictory, changeable, manipulable, and underdefined.
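As a toy illustration of this point (an editorial sketch, not a method from Dr. Armstrong’s work), the Python snippet below pairs two different preference hypotheses with two different assumed links between preference and action. Both fit the same observed behaviour perfectly, so behavioural data alone cannot choose between them; some assumed “human theory of mind” has to be supplied.

```python
# Toy illustration: observed behaviour alone underdetermines preferences.
# Two hypotheses, each combining a preference with an assumed rule linking
# preference to action, fit the same data equally well.

observed_actions = ["A"] * 1000  # the agent is always seen choosing A

# Hypothesis 1: the agent prefers A and acts on its preference.
hyp_rational = {"prefers": "A", "act": lambda prefers: prefers}

# Hypothesis 2: the agent prefers B but systematically acts against its
# preference (a different assumed "theory of mind").
hyp_antirational = {"prefers": "B", "act": lambda prefers: "A" if prefers == "B" else "B"}

def fit(hypothesis, actions):
    """Fraction of observed actions the hypothesis predicts correctly."""
    predicted = hypothesis["act"](hypothesis["prefers"])
    return sum(a == predicted for a in actions) / len(actions)

print(fit(hyp_rational, observed_actions))      # 1.0
print(fit(hyp_antirational, observed_actions))  # 1.0 -- the data cannot tell them apart
```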

Now, human values being contradictory is actually not much of a problem. There are various ways of overcoming this. Human values being changeable is a bit more of a problem. Human values being manipulable is even more of a problem. But the biggest problem is human values being underdefined, because when we try to extend our values to new futures or scenarios that AI may create, we don’t know what that extension will look like. In scenarios where there is very powerful AI that could actually determine these futures, we need to address this. Basically, if the AI is very powerful, the future will come to resemble what the AI would prefer it to be. And so getting its preferences right is vital.

Kiera: How might people go about defining these values going forward?

Dr. Armstrong: At first, it’s a technical problem: you must, at some point, technically define what one person’s values are. After you’ve done that, you then have the issue of reconciling the different values of different humans. This is an issue of politics, of reconciling different sets of preferences and values, and it is what people always talk about because it is so politically difficult to do.

Kiera: Some of your work has to do with very long-term potentials of AI outside of the Earth’s boundaries. For fun, could you comment on that aspect of your work? Is there long-term potential for intelligent life across the reachable universe? What are the risks and potentials of this?

Dr. Armstrong: Yes, there is great potential for intelligent life across the universe, and AI might open up new eras of exploration. With AI, it is much easier, for example, to reach a nearby star; only very small probes (in the kilogram range) would be needed to sustain an intelligent AI, which could build what it needed upon arrival. Sending a viable human colony, by contrast, would require hundreds or thousands of tonnes of equipment.

If AI ends up being powerful, then we will face a dichotomy in which the outcome of the future could be either wonderful or disastrous. If, alternatively, AI is weaker, then we can expect more of the usual “good-to-bad” spread that we find with most issues.

Dr. Stuart Armstrong is a Research Fellow at the University of Oxford’s Future of Humanity Institute, where he focuses on the safety and possibilities of AI, defining the potential goals of AI, and the long-term potential for intelligent life across the reachable universe. He has worked with organizations and experts, including Google’s DeepMind, to help AI designers include these safety methods in their designs. He is the author of the book ‘Smarter Than Us: The Rise of Machine Intelligence.’ You can read more of his work here.
Kiera Schuller is a Research and Policy Analyst at ICTC, with a background in human rights and global governance. Kiera holds an MSc in Global Governance from the University of Oxford. She launched ICTC’s Tech & Human Rights Series in 2020 to explore the emerging ethical and human rights implications of new technologies such as AI and robotics on issues such as privacy, equality, and freedom of expression.

ICTC’s Tech & Human Rights Series:

Our Tech & Human Rights Series dives into the intersections between emerging technologies, social impacts, and human rights. In this series, ICTC speaks with a range of experts about the implications of new technologies such as AI for a variety of issues like equality, privacy, and freedom of expression, whether positive, neutral, or negative. The series also explores questions of governance, participation, and the use of technology for social good.

Information and Communications Technology Council (ICTC) - Conseil des technologies de l’information et des communications (CTIC)