Visualizing the progress of machine learning

Enrique Dans
4 min read · Aug 13, 2017


In the photo, a copy of Bonnier. For most people, it’s just an old book. But for thousands of Biology students across Europe, it came to symbolize the torture that was the Botany exam, which involved using Bonnier to identify a certain number of plant species. Among them there were always some grasses with particularly convoluted taxonomies, based on barely discernible attributes that required the use of a binocular magnifying glass. It was an arduous, boring task, and one that took a certain amount of experience to carry out well. By all accounts, botanical species are still identified in the same way and the Bonnier is still used, albeit in a newer version, and it is even required for some exams to become a teacher (link in Spanish).

In my student days in the mid-1980s, we used a Bonnier exactly like the one in the illustration (it always looked old, although the edition was from 1972), a paperback edition on poor-quality paper, in French, and I was obsessed with the idea of digitizing it. I even prepared a database and a simple interface for it with dBASE and Clipper, all on MS-DOS. I did not follow through once I realized what a Herculean task it would be to digitize all those files and attributes for several thousand species of vascular plants. Had I done so, I could have used a screen instead of turning the pages of the book, and in all honesty it would not have been much of a breakthrough.

Now, more than thirty years later, an article in Nature referenced on Boing Boing, “Artificial intelligence identifies plant species for science”, explains the development of a machine learning algorithm. It was trained on some 260,000 digitized images of more than a thousand plant species held in herbaria around the world. (It is estimated that there are around 3,000 herbaria of a certain size in the world, holding about 350 million specimens in total, of which only a small part has been digitized.) After training, the algorithm identifies the plant with a success rate of 80%, and in 90% of cases the correct species is among its first five choices. These success rates exceed those of botanists who are experts in taxonomy (in my time, to pass the exam we needed to correctly identify three plants out of five, and we were just third-year students).
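The two figures quoted above are what is usually called top-1 and top-5 accuracy: how often the correct species is the algorithm's first choice, and how often it appears anywhere among its first five. As a rough illustration of how such a metric can be computed, here is a minimal Python sketch. The function name `top_k_accuracy` and the toy species lists are my own assumptions for the example; they are not taken from the Nature article or from the researchers' code.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    true_labels: Sequence[str],
    k: int,
) -> float:
    """Fraction of samples whose true label is among the first k ranked guesses."""
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    hits = sum(
        1
        for guesses, truth in zip(ranked_predictions, true_labels)
        if truth in guesses[:k]
    )
    return hits / len(true_labels)


# Hypothetical toy data: each inner list is a classifier's guesses for one
# herbarium sheet, ordered from most to least likely.
ranked = [
    ["Poa annua", "Poa trivialis", "Festuca rubra"],
    ["Festuca rubra", "Poa annua", "Lolium perenne"],
    ["Lolium perenne", "Poa annua", "Festuca rubra"],
    ["Poa trivialis", "Bromus sterilis", "Poa annua"],
]
truth = ["Poa annua", "Poa annua", "Lolium perenne", "Festuca rubra"]

top1 = top_k_accuracy(ranked, truth, k=1)  # only the first guess counts
top3 = top_k_accuracy(ranked, truth, k=3)  # any of the first three guesses counts

print(f"top-1 accuracy: {top1:.2f}")
print(f"top-3 accuracy: {top3:.2f}")
```

In this toy data the first guess is right for two of the four sheets, and the correct species appears somewhere in the first three guesses for three of the four, which is the same kind of gap as the one between the 80% and 90% figures reported for the real algorithm.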

When you see an algorithm capable of carrying out a task whose difficulty you can appreciate from your own experience, you realize the potential of machine learning. My rudimentary attempt at digitization was simply about making it easier to identify plant species. Now all that is required is to show the algorithm the digitized image of the plant, and it immediately identifies its genus and species, with an 80% success rate.

What are the implications for the development of botany? In a few years, given the performance of the algorithm and the necessary corrections, no researcher will be able to determine the species of a plant without the aid of the corresponding algorithm: the few still able to will be retirees with their Bonnier on the bookshelf. Of course, it was only thanks to the knowledge of those professionals that it was possible to train the algorithm, and those professionals will be assigned other tasks in a world in which it will no longer be necessary to invest time or effort in identifying a plant, because that will be done automatically. It will be necessary to change the way the discipline is taught, to include other types of exercises, other materials and other disciplines, thus expanding the frontiers of knowledge. Will losing the ability to identify plants by eye be a great loss? No, any more than it is a great loss that hardly anyone today can write in cuneiform on a clay tablet.

Does this have anything to do with “intelligent” robots? No. An algorithm that classifies plant species is doing something that, until recently, only a human could do, but it is not intelligence: it is simply capable of carrying out a very specific task based on a series of attributes. Apply the algorithm to something else, and you would need to make a whole series of adjustments. Intelligence is something else.

Are robots going to replace botanists? No. This is about applying human intelligence to more important tasks, freeing up resources that were not being optimized… it’s about progress in the discipline. And might professionals refuse to collaborate in training an algorithm for fear of eventually being replaced? The idea is absurd, practically offensive. The question we need to be asking ourselves is how many things that we consider exclusively human today will end up being carried out by algorithms, and how many more things that will free us to do.

(In Spanish, here)


Enrique Dans

Professor of Innovation at IE Business School and blogger (in English here and in Spanish at enriquedans.com)