You didn’t need deep learning to generate new molecules

Mostapha Benhenda
Sep 21, 2018

Molecule generation is a hot topic in AI for drug discovery. In a previous blog post, I showed how some methods struggle to generate diverse molecules. Since then, new papers have appeared to address this problem (sometimes without citing my paper raising it). Most of the proposed solutions are quite complicated, like those from Insilico Medicine, Harvard-Toronto-Insilico Medicine, Israel Institute of Technology, or Stanford.

However, these complicated solutions are probably not necessary, because of another, much simpler paper from a team led by Koji Tsuda at the University of Tokyo. They propose a genetic algorithm called ChemGE, for Chemistry Grammatical Evolution.

They consider a fitness score, which evaluates the desired output: druglikeness, expected activity, and so on. They start from a random collection of molecules, keep the fittest half of the population, and eliminate the others (selection). Next, they double the surviving population by random sampling (reproduction), and randomly tweak the chemical formula of the newborn molecules (mutation). They iterate until the population of molecules is acceptable.
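The selection, reproduction, and mutation loop described above can be sketched in a few lines of Python. This is only a toy illustration, not the ChemGE implementation: the "molecules" here are plain strings over a made-up alphabet, and `toy_fitness`, `toy_mutate`, and `evolve` are hypothetical names invented for this example. A real fitness score (druglikeness, expected activity) would need a chemistry toolkit.

```python
import random

ALPHABET = "CNOF"  # toy alphabet (assumption), not real chemistry


def toy_fitness(molecule):
    # Placeholder score (assumption): the number of "C" characters.
    return molecule.count("C")


def toy_mutate(molecule, rng):
    # Mutation: replace one randomly chosen character.
    i = rng.randrange(len(molecule))
    return molecule[:i] + rng.choice(ALPHABET) + molecule[i + 1:]


def evolve(population, fitness, mutate, generations, rng):
    size = len(population)
    for _ in range(generations):
        # Selection: keep the fittest half, eliminate the others.
        ranked = sorted(population, key=fitness, reverse=True)
        survivors = ranked[: max(1, size // 2)]
        # Reproduction + mutation: sample survivors at random and
        # tweak each copy until the population is back to full size.
        children = [
            mutate(rng.choice(survivors), rng)
            for _ in range(size - len(survivors))
        ]
        population = survivors + children
    return population


rng = random.Random(0)
start = ["".join(rng.choice(ALPHABET) for _ in range(8)) for _ in range(20)]
final = evolve(start, toy_fitness, toy_mutate, generations=30, rng=rng)
```

Because the survivors are carried over unchanged, the best score in the population can never decrease from one generation to the next. The sketch stops after a fixed number of generations, where the post describes iterating until the population is acceptable.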

The Tokyo team compared this genetic algorithm with their own deep reinforcement learning algorithm, ChemTS, inspired by AlphaGo. They found that ChemGE performed at least as well as ChemTS: the generated molecules achieved good fitness, and they were sufficiently different from each other. Moreover, ChemGE was much faster.
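This post does not say how "sufficiently different from each other" was measured. One common way to put a number on it is the mean pairwise Tanimoto (Jaccard) distance between molecular feature sets. The sketch below assumes that metric, and it uses character bigrams of a string as a crude stand-in for real molecular fingerprints. The function names are hypothetical.

```python
from itertools import combinations


def char_bigrams(s):
    # Crude stand-in for a molecular fingerprint (assumption):
    # the set of adjacent character pairs in the string.
    return {s[i:i + 2] for i in range(len(s) - 1)}


def tanimoto_distance(a, b):
    # 1 - |A ∩ B| / |A ∪ B|; two empty sets count as identical.
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def internal_diversity(feature_sets):
    # Mean distance over all unordered pairs; 0.0 if fewer than two items.
    pairs = list(combinations(feature_sets, 2))
    if not pairs:
        return 0.0
    return sum(tanimoto_distance(a, b) for a, b in pairs) / len(pairs)


diversity = internal_diversity([char_bigrams(s) for s in ["CCO", "CCN", "NNF"]])
```

A value near 0 means the generated set has collapsed onto near-duplicates, and a value near 1 means the members share almost no features.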

ChemTS, the deep reinforcement learning baseline

Genetic algorithms have a long history in molecule generation. They date back at least to 1995, with a paper by Glen and Payne from Wellcome labs (an ancestor of the big pharma company GSK).

Even before genetic algorithms, there were other methods, like this 1989 paper from Abbott Labs (an ancestor of the big pharma company AbbVie). You can check this 1994 survey for more details (consider Sci-Hub if you don't have a subscription). This historical background can explain why so many people in the pharma industry are skeptical about deep learning: they keep wondering whether it brings anything new to the table.

So the main contribution of this Tokyo paper is the benchmark of genetic algorithms (GA) against deep reinforcement learning (DRL), which is reasonably well executed (BenevolentAI also attempted an earlier benchmark, but it was poorly executed; see an older blog post). The result should not be too surprising: similar observations were made about video game tasks. In April 2017, an OpenAI team led by Ilya Sutskever compared deep reinforcement learning with evolution strategies, and they often found that both have comparable performance.

Another interesting fact is that this Tokyo paper has gone largely unnoticed. It appeared in April 2018, more than five months ago, but apparently the conclusion didn't fit the narrative or agenda of mainstream tech journalism, industry, and academia. This paper makes the field look embarrassingly cheap, and that's bad news for business. The community is often reluctant to perform careful benchmarks. For example, there is this tweet by Olexandr Isayev, an academic from the University of North Carolina and author of a recent paper about deep reinforcement learning:

Olexandr Isayev should know that the adoption of deep learning followed a benchmark, the famous 2012 ImageNet competition, in which a deep learning classifier, AlexNet, outperformed the alternatives by a wide margin. No benchmark, no deep learning revolution.

Shallow vs. Deep learning: it’s all about benchmarks

Benchmarks are the way to track progress, or lack thereof. I outlined a benchmark proposal in February 2018, and I am still looking for sponsors.

In the meantime, if you meet marketing people, from industry or academia, who want you to adopt their deep learning solution, ask them: how is it better than older, simpler, and faster methods?


Koji Tsuda reacts:

Anvita Gupta, from Stanford (paper here), reacts (multi-tweet thread):

