Wengong Jin
Jul 7, 2018



I am the author of this MIT paper. Thanks for pointing out some missing references. I am happy to cite them on arXiv.

Regarding the two AstraZeneca papers you mentioned, they trained their models on >1 million compounds from ChEMBL, while ours is trained on 250K compounds from ZINC, extracted by Gómez-Bombarelli et al. (2016). So the results are not directly comparable. The “state-of-the-art” model with 43.5% validity was trained on ZINC as well. Therefore, I disagree that our paper is “dubious”.

To conclude, I think your post is misleading and slanderous.

