Artificial Intelligence and Nature
Presentation delivered at the AI, Biodiversity and Citizen Science workshop, Australian Museum, Sydney, Oct 31, 2017.
First, I’d like to respectfully acknowledge the traditional custodians of the land on which we’re conducting this workshop today — the Gadigal of the Eora Nation. This is a workshop which is especially relevant to country and custodianship. I’d like to pay my respects to the First Nations Peoples and their elders, past, present and future.
I’d like to also thank the Centre for Biodiversity Analysis at ANU for sponsoring this workshop. Thanks to Craig Moritz and Claire Stephens. I’d like to thank the Australian Museum, Paul Flemons, Kim McKay. And I’d like to thank Erin Roger of ACSA and the NSW Office of Environment and Heritage for supporting and helping organise this event. And everyone who’s participating today.
Alright, so here’s the thing — AI is everywhere at the moment.
Front-page news. Covers of magazines. $40 million per week in venture funding. In the biodiversity space, there are some really exciting developments. Computer vision for species recognition is becoming part of Android camera functions. There are drones that identify foliage, sharks, cliff-dwelling mantids. In China, Tencent’s mobile browser already has plant recognition built in. I’m excited to hear about more developments from the speakers today.
So the AI train is moving fast. Investors are trying to keep up. Careers depend upon it. According to the NY Times this week, there are AI specialists in the US, fresh out of university with just a few years’ experience, earning $300,000 per year. As a professor from the University of Washington puts it: “There’s a giant sucking sound of academics going into industry.”
I began to experience something like this sucking sound myself about a year and a half ago. QuestaGame, a biodiversity technology I helped found, was approached by an AI developer working in computer vision. QuestaGame’s database of expert-verified species photos was growing fast, and this programmer wanted to see if he could use the data to train an app to identify insects.
I wasn’t sure. I wasn’t sure who the photo pixel data belonged to and what the implications might be for QuestaGame’s core mission, which is about connecting people to nature.
At the time, QuestaGame had a clause about AI in its terms and conditions. It still does. Part of it goes like this:
…we believe there are ethical considerations when it comes to training AI systems in the environmental sciences, especially if, at the same time, we neglect to provide opportunities for the amazing, organic supercomputers that exist in people’s skulls.
It goes on to mention how, if AI is going to reflect human intelligence, then we need to ensure that all humans are part of the conversation and so forth.
Some AI applications might help do this. Then again, some might not.
So I began to get in touch with researchers and developers in the AI and biodiversity space — particularly researchers involved in Merlin Bird ID, an app that uses AI to identify birds. I began asking questions — legal questions, process questions, ethics questions (some of the questions have been distributed to everyone here) — and I began to realise I wasn’t alone in not having answers.
A leading AI researcher in the Computer Vision Group at Cornell came to a similar conclusion and asked me to write up a more complete list of questions, and to put together a position paper on Ethics and Computer Vision — which I did, and which became part of the genesis for this workshop.
My background is in communications science; I’ve spent decades exploring the history and cultural implications of communications technology — “artificial intelligence” being the latest manifestation of this — and I’ve designed what are called “collective intelligence” models online. QuestaGame is an example of one such model.
Now Artificial Intelligence is complex.
To show you just how super-complex Artificial Intelligence is, I’m going to put up a slide that diagrams all the different pieces — the neural networks, the machine learning, the tensors and arrays and so forth — all of it on a single slide. It took me quite a long time to fit it all in and it’s a bit difficult to follow, but I’ll walk you through it.
Ok, so it’s not that complex. That’s it. That’s the slide. Communications scientists are simple-minded people. That’s basically how Artificial Intelligence looks to us.
There’s a data input. There’s an artificial device (it can be anything — a book, a machine, Stonehenge is a form of AI). There’s a query of that device. And there’s a response.
And then, at the end, we decide: Hm, was that response intelligent? Should I believe it?
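If it helps to see that picture in code, here’s a deliberately naive sketch of those four pieces. Every name in it is mine, purely for illustration; it isn’t any real system.

```python
# A toy version of the four-part picture: data input, an artificial device,
# a query, a response -- and then a human deciding whether to believe it.
# All names are illustrative only.

class ArtificialDevice:
    """Anything that stores data and answers queries: a book, a punch-card
    machine, a neural network, Stonehenge."""

    def __init__(self):
        self.memory = {}

    def ingest(self, observations):
        """Data input."""
        self.memory.update(observations)

    def respond(self, query):
        """Query in, response out."""
        return self.memory.get(query, "no answer")


device = ArtificialDevice()
device.ingest({"large black bird, white eye, Sydney": "Australian Raven"})

print(device.respond("large black bird, white eye, Sydney"))

# The last step never belongs to the device:
# was that response intelligent? Should I believe it?
```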
This is how AI has worked for thousands of years. I want to look briefly at this genesis now. I’ll refer to Western culture here, but the concepts are paralleled in cultures around the world.
So Aristotle mentions two kinds of Artificial Intelligence. On the one hand we have artificial devices that engage many brains in a single output. Aristotle describes it as “a feast to which many contribute [being] better than a dinner provided out of a single purse.”
Today we call this “Collective Intelligence.” We can trace the general idea to modern times.
It appears, for example, in the 17th century, in Thomas Hobbes’s book, Leviathan. It’s this idea that individual intelligence is simply the distributed knowledge of a larger entity; that our brains are nothing but the nodes of a larger brain, a superorganism, a hive mind. The devices worn by the Borg of Star Trek are examples of collective intelligence technology. Uber would be a good example of this in the real world.
Of course the possibilities for collective intelligence exploded with the arrival of the Internet. The technical possibilities were profound. All inputs could be treated equally. You could have unbiased peer-review. It could weigh expertise, track who knows what, provide answers to very complex questions.
Thinkers such as Pierre Lévy and Henry Jenkins carried the idea into our own times. Not only that, but we discovered that, if we applied the economic ideas of Friedrich Hayek, Joseph Stiglitz, Hal Varian and many others to the system, it could allow private transactions, protect intellectual property, put economic values on knowledge and so forth — all in real time, from people all over the world.
With the right algorithms, you’d get the fastest, most intelligent solutions possible. You see this technology play out with a lot of today’s MMO games. We’re currently doing this with QuestaGame and it’s pretty fun to watch.
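To give a feel for the basic move, here’s a toy sketch of expertise-weighted consensus, the kind of aggregation such systems rely on. The function, the weights and the identifications are all invented for illustration; this is not QuestaGame’s actual algorithm.

```python
# Toy expertise-weighted consensus: each identification carries a weight
# reflecting the contributor's track record; the proposal with the highest
# total weight wins, with a rough confidence score attached.
from collections import defaultdict

def weighted_consensus(identifications):
    """identifications: list of (proposed_species, expertise_weight) pairs."""
    totals = defaultdict(float)
    for species, weight in identifications:
        totals[species] += weight
    best = max(totals, key=totals.get)
    confidence = totals[best] / sum(totals.values())
    return best, confidence

# Three people identify the same photo; the experienced herpetologist's
# vote carries more weight than the casual observer's.
votes = [
    ("Tiliqua scincoides", 0.9),   # experienced herpetologist
    ("Tiliqua rugosa", 0.3),       # casual observer
    ("Tiliqua scincoides", 0.4),   # keen amateur
]
print(weighted_consensus(votes))   # ('Tiliqua scincoides', 0.8125)
```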
But why has it taken so long? We’ve had the technology for over 40 years.
It turns out that humans may not be well-designed as a species to take advantage of such a technology. We’re much more accustomed to evaluating people not on WHAT they know — but on WHO they know; who their friends are, how they look, how many followers they have, their institutional affiliations, race, gender, class, buying power and so forth.
In such a context, it’s difficult to build what the German philosopher Jürgen Habermas called an “Ideal Speech” situation, where expertise can be contributed fairly, without bias, by everyone. Designing Ideal Speech situations online requires Ideal Speech situations offline, which we have rarely, if ever, had.
Aristotle writes about another form of Artificial Intelligence. He mentions robots. He writes about how, with robots, “managers wouldn’t need subordinates. Instruments could do their own work by intelligent anticipation.”
This second category of artificial intelligence is what might be called the Pygmalion approach.
Pygmalion, according to the ancient poet Ovid, was a sculptor — a recluse, maybe a bit nerdy, maybe a misogynist of sorts — who loved his imaginary ideal of women, but wasn’t very interested in women in the real world. In fact, he was deeply in love with a statue of a woman he’d created, so much so that he wished for it to come alive. A compassionate Aphrodite granted him that wish.
So the Pygmalion approach is where we create a new life form, a statue, a golem, a wooden puppet with a growing nose, a villainous mini-me, an operating system with the voice of Scarlett Johansson — whatever. It’s the stuff of computer scientists such as Charles Babbage. It’s an artificially intelligent entity that can process information on its own. It’s what Alan Turing called a “Thinking Machine.”
The concept goes way back. There are first-hand accounts, from China to Persia, of dancing robots, working robots, talking robots and so forth.
Now let’s go back to our super complex diagram of AI.
The only difference between the AI of our time and the AI of ancient times is the amount of data gathered and the speed of gathering and processing it.
Each wave of tech innovation — from ships to trains to telegraphs — has allowed many more people to collect and interpret vast amounts of data. In the last decade, these two different forms of AI — “collective intel” and “thinking machines” — have started to converge and feed off each other; and I think we’ll see that the latest developments in AI and Citizen Science include a combination of collective intelligence and thinking machines. Indeed, it looks like this process will start happening more and more on its own. Thinking machines will ask their own questions, get their own answers and act upon them — and, of course, this is where things get a bit weird.
So there are some terrific opportunities here; but there are serious risks as well. Let me start with the risks. I’m going to give two really extreme, well documented examples of just how serious the dangers can be.
The first example is relatively well known. As the historian Edwin Black has written in an exhaustive book on the subject, Hitler couldn’t have carried out the genocides of WWII without this “thinking machine” here — the Hollerith machine and the punch cards that many of us are familiar with. They were great at gathering, recording, storing, analysing large amounts of data — and they were seemingly benign. It’s just data, right? But artificial intelligence, in the wrong hands, can become very dangerous. Especially when the data it collects is not owned and controlled by the people it’s collected from.
A less known example is what happened with this device here: The Telegraph — circa 1877. The British Empire was using the telegraph to gather massive amounts of information from the remotest corners of its realm.
But as the historian Mike Davis has clearly documented, the data was often biased — both the collection and the queries. Rather than transmit a representation of reality, the electronic telegraph sent out a kind of Linnaean shorthand account of the world as the British saw it: In India, it was all about the value of the rupee, threats of war with Russia to the north, legal rulings, the price of cotton, shipping news and so on.
In the eyes of the artificial intelligence of the time, India looked healthy. It was a grand time, often seen as the pinnacle of the British Empire. This is how India looked during the great Delhi Durbar of 1877.
But in reality, a lot of India looked like this.
Five and a half million people would die of starvation in two years. Queries for information about India didn’t capture this data, because the data about Indians not having rice to eat wasn’t considered valuable to the people who controlled the technology.
Now I realise what you might be thinking: wait a minute, we’re talking about biodiversity here. Holocausts are a bit extreme. And I agree.
But I wouldn’t be the first communications scientist to suggest that the history of Imperialism — colonial expansion, empire building — is less a story of guns and weaponry than a story of data collection and processing power. It’s about big data and bigger data.
In a way, it’s the story of artificial intelligence.
And much of the Big Data of the 17th and 18th centuries was, as the communications scholar Daniel Headrick points out, the Linnaean system of classifying plants and animals. “Taxonomy,” writes Anne Fadiman in her essay ‘Collecting Nature,’ “is a form of imperialism. Take a bird or a lizard or a flower from Patagonia or the South Seas, perhaps one that has had a local name for centuries, rechristen it with a Latin binomial, and presto! It has become a tiny British colony.”
So yes, there are risks. But at the same time, unlike other forms of scientific endeavour, when it comes to biodiversity, a clock is ticking. The subject of our study is disappearing. There’s an urgency here. Galileo could study the stars at his own pace. They weren’t going anywhere. But we all know species are going extinct at an alarming rate (likely between 1000 and 10,000 times as fast as if humans weren’t part of the equation).
So how do we develop AI in a way that contributes to a social good without distorting reality or allowing the technology to dehumanise us?
The biggest risk may not be AI itself, but our differing ideas about what AI is and what we want to do with it.
It’s a classic “Tragedy of the Commons” situation. The technology exists today, right now, to foster greater ecological literacy among hundreds of millions of people. But at the same time, there are also incentives to jump aboard the speeding trend of “Thinking Machines,” take the $300,000-per-year salary, build a Talking Robot, be part of the headlines, increase followers on Facebook, dazzle the world with a “cool new app.” This week alone, venture capitalists invested another $40 million in “Thinking Robots.” Only a tiny fraction of that amount is invested in “Collective Intelligence” for the purpose of biodiversity mapping and education.
I’ll conclude now by suggesting some basic principles that might be worth considering when implementing AI projects:
The first — perhaps an overriding principle — is to put as much effort into developing a human understanding of nature as into developing AI. Celebrate and incentivise the human mind: both its desire to learn and its desire to teach.
As part of my research, I posted a survey link to a forum of eBird users who had used the Merlin Bird ID app. It was a very crude survey, just 21 responses.
One of the questions was:
If Merlin Bird ID gave you a choice between training the AI or training humans how to identify birds, which would you choose?
Now it could be the case that most people just don’t care. But if we don’t include functionality that allows people to teach others, on their own terms, they may not know it’s even possible.
Another principle is about data ownership and agency. When I look around at the terms and conditions of large data collection apps in the field of biodiversity, I find it strange that AI isn’t mentioned. The options need to be very clear. The data is yours. Here are some different ways you can share it — and what the implications of that might be. Again, gaming mechanics have a lot to teach us here.
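To make that concrete, here’s a rough sketch of what explicit, per-observation sharing choices might look like. The option names and fields are hypothetical, not taken from any existing app’s terms; the point is only that the choices, including AI training, are visible to the person who owns the data.

```python
# Hypothetical per-observation sharing choices. None of these option names
# come from a real app's terms; they only illustrate making AI use explicit.
from dataclasses import dataclass, field

@dataclass
class SharingChoices:
    share_with_researchers: bool = True    # aggregated biodiversity records
    share_openly: bool = False             # public, openly licensed
    allow_ai_training: bool = False        # use pixels/labels to train models
    allow_commercial_use: bool = False     # downstream commercial products

@dataclass
class Observation:
    species: str
    photo_path: str
    owner: str
    sharing: SharingChoices = field(default_factory=SharingChoices)

obs = Observation(
    species="Phascolarctos cinereus",
    photo_path="koala_2017.jpg",
    owner="observer_42",
    # The owner opts in to research sharing but not AI training.
    sharing=SharingChoices(share_with_researchers=True, allow_ai_training=False),
)
print(obs.sharing)
```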
Data Bias — there’s a great book called Weapons of Math Destruction by Cathy O’Neil on this topic. Big Data is really hard to keep clean — or even to start clean. A recent paper by Julien Troudet in Scientific Reports, for example, points out some interesting — albeit unsurprising — biases and species gaps in the GBIF data.
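As a rough illustration of the kind of gap Troudet and colleagues describe, you might compare each group’s share of occurrence records against its share of described species. The numbers below are made up, not real GBIF figures.

```python
# Made-up numbers, for illustration only: compare each group's share of
# occurrence records against its share of described species. Ratios far
# above or below 1 flag over- and under-sampled groups.
records = {"Aves": 500_000, "Insecta": 60_000, "Fungi": 8_000}
described_species = {"Aves": 10_000, "Insecta": 1_000_000, "Fungi": 140_000}

total_records = sum(records.values())
total_species = sum(described_species.values())

for group in records:
    record_share = records[group] / total_records
    species_share = described_species[group] / total_species
    print(f"{group}: {record_share / species_share:.2f}x its share of described species")
```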
But just as there will always be species gaps, there are also people gaps. Citizen Science is by far the largest contributor to GBIF data, and yet it’s but a tiny fraction of all the offline knowledge about biodiversity that exists in the world. This results in Group Bias — which means AI projects need to involve a greater variety of expertise — anthropologists, historians, poets — from a more diverse set of people around the world.
And then there’s Individual Bias — we have a tendency to trust machines, especially if we think they’re smarter than us. Cognitive bias is well documented in the work of the psychologists Daniel Kahneman and Amos Tversky — but a whole new set of biases is being discovered about our relationship with thinking machines (in fact, we may even tend to deify them).
Then there’s the importance of outliers. This one’s tricky, but again, collective intelligence systems have ways to tackle it. (This presentation is an example of an outlier; its neglect, a possible example of a poorly designed network.)
And finally, Transparency, which is probably the biggest challenge. Because even if you design technologies that achieve all of the above, it’s difficult to convince people to trust them. Again, I think gaming systems have a lot to teach us here.
So ultimately we need to make smart decisions when developing AI. We can’t plead ignorance after the fact (just because Google and Facebook do it all the time). We need to work together; to use technologies in ways that don’t just help us understand biodiversity, but which strengthen our bond to it, and get us taking action to protect it. The latest AI offers an unprecedented opportunity to do this. Which is why I’m excited to have this workshop today and learn more about the various opportunities from all of you.
Thank you.