Mostapha Benhenda
1 min read · Jan 11, 2018

It’s an interesting experiment indeed: this ETH Zurich team applied the AstraZeneca method. They allude to the diversity problem here:

Importantly, the newly generated molecules populate the chemical space of the training data, residing within the RXR/PPAR region of the fine-tuning set (Figure 2).

In support of this claim, I only found their rather sketchy Figure 2.

I didn’t find any quantitative evaluation of diversity and, in particular, any mathematical definition of the keyword ‘populate’. Tell me if you spot one.

That’s where I said that important work remains to be done. In this paper, they seem to show only a visualization.

Visualizations are good for data exploration, and they are cool for a popular science blog, but to get a real grasp of what is going on, it’s necessary to introduce equations and quantitative metrics. Where are they?
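To make concrete what I mean by a quantitative metric, here is a minimal sketch of one common candidate: internal diversity, taken as the mean pairwise Tanimoto distance between molecular fingerprints. This is my own illustration, not something defined in the ETH Zurich paper. The fingerprints are toy sets of "on" bit indices that I made up, and the function names are hypothetical.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two fingerprints,
    each given as a set of 'on' bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    # Two empty fingerprints are treated as identical (an assumption).
    return len(a & b) / union if union else 1.0


def internal_diversity(fingerprints):
    """Mean pairwise Tanimoto distance (1 - similarity) over a set.
    0.0 means all molecules look identical, 1.0 means no shared bits."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0
    return sum(1.0 - tanimoto(a, b) for a, b in pairs) / len(pairs)


# Toy fingerprints, invented for illustration only.
generated = [{1, 2, 3}, {1, 2, 4}, {5, 6, 7}]
print(internal_diversity(generated))
```

A single number like this can be reported next to the figure, and it can be recomputed by anyone who has the generated molecules.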

That’s especially important when you want to compare different architectures: for example, to benchmark this AstraZeneca RL method against the Harvard ORGAN. Which one ‘populates’ better?
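One way the word ‘populate’ could be turned into a number, so that two generators can be compared, is a coverage score: the fraction of reference molecules that have at least one generated neighbour above some similarity threshold. The sketch below is only an assumption about what such a definition might look like. The threshold of 0.5, the toy fingerprints, and the names `model_a` and `model_b` are all hypothetical and do not come from any of the three papers.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def coverage(reference, generated, threshold=0.5):
    """Fraction of reference fingerprints that have at least one generated
    fingerprint with similarity >= threshold (threshold is an arbitrary choice)."""
    if not reference:
        return 0.0
    hits = sum(
        1 for r in reference
        if any(tanimoto(r, g) >= threshold for g in generated)
    )
    return hits / len(reference)


# Toy data, invented for illustration only.
reference = [{1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12}]
model_a = [{1, 2, 3}, {1, 2, 4}, {1, 3, 5}]   # clustered around one region
model_b = [{1, 2, 3}, {4, 5, 7}, {7, 8, 9}]   # spread over several regions

print("model_a coverage:", coverage(reference, model_a))
print("model_b coverage:", coverage(reference, model_b))
```

With a definition like this, "which one populates better" becomes a comparison of two numbers computed on the same reference set, rather than a comparison of two pictures.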

Therefore, from the viewpoint of the quantitative evaluation of diversity, this ETH Zurich paper doesn’t seem better than either the AstraZeneca paper or the Harvard paper (but the experimental part looks cool).
