NeurIPS 2019
Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
Paper ID:4132
Title:Symmetry-adapted generation of 3d point sets for the targeted discovery of molecules

Reviewer 1

This paper introduces an autoregressive model based on SchNet. The technical parts seem correct, but the experiment section lacks baselines such as the state-of-the-art method [1]. The paper is well written and clearly organized, yet there are still some points that the authors need to clarify.

In Equation (1), c_i is already included in R_{\le i}, Z_{\le i}, so I am wondering whether it is redundant. I guess the authors want to highlight the role of c_i as the focus point, but it is not actually needed in the final equation, as already illustrated in Equation (4) and Figure 1.

According to Section 3, all the nodes seem to be created sequentially. What, then, about loops/rings in the graph? How is such a structure generated when we assume that new points are bonded to the focus point only? Or will the generation process generate the same points twice when moving along the trajectory?

Line 195 implies the pipeline generated molecules -> SMILES -> bond counts. How does this differ from counting bonds directly on the generated graph?

I think the work is a little incremental, since the key building blocks come from SchNet. In addition, the experiment section lacks baselines like [1]. For the targeted discovery, it would be better to explore more tasks. For example, the remaining 11 tasks in QM9 all share the same molecules, so models for them can be trained quickly starting from the same pre-trained model. According to Section 5.3, the targeted discovery is obtained by using some heuristics (data filtering), so it would be more convincing to compare G-SchNet with other baseline methods to show its advantage in this setting.

[1] You, J., Liu, B., Ying, Z., Pande, V., & Leskovec, J. (2018). Graph convolutional policy network for goal-directed molecular graph generation. In Advances in Neural Information Processing Systems (pp. 6410-6421).
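To make the ring and bond-count questions above concrete, here is a minimal sketch of how bonds (and hence rings) could be read off a generated 3D point set from interatomic distances alone. It is not the authors' pipeline, which according to the review goes through SMILES. The function names, the approximate covalent radii, and the 0.2 Å tolerance are all assumptions made for illustration.

```python
import numpy as np

# Approximate covalent radii in angstroms for the QM9 elements
# (illustrative values, assumed here; keys are atomic numbers).
COVALENT_RADII = {1: 0.31, 6: 0.76, 7: 0.71, 8: 0.66, 9: 0.57}


def infer_bonds(numbers, positions, tolerance=0.2):
    """Return (i, j) pairs, i < j, closer than the sum of covalent radii + tolerance."""
    positions = np.asarray(positions, dtype=float)
    # Pairwise Euclidean distance matrix, shape (n, n).
    dists = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    bonds = []
    for i in range(len(numbers)):
        for j in range(i + 1, len(numbers)):
            cutoff = COVALENT_RADII[numbers[i]] + COVALENT_RADII[numbers[j]] + tolerance
            if dists[i, j] < cutoff:
                bonds.append((i, j))
    return bonds


def count_rings(n_atoms, bonds):
    """Number of independent cycles: edges - vertices + connected components."""
    parent = list(range(n_atoms))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = n_atoms
    for i, j in bonds:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            components -= 1
    return len(bonds) - n_atoms + components


# Three carbons placed one after another on an equilateral triangle (side 1.5 A).
# The third atom lies within bonding distance of BOTH earlier atoms, so the ring
# closes without any atom being generated twice.
triangle = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.75, 1.299, 0.0)]
triangle_bonds = infer_bonds([6, 6, 6], triangle)
print(triangle_bonds, count_rings(3, triangle_bonds))
```

Under these assumptions, a ring appears whenever a newly placed atom ends up within bonding distance of more than one earlier atom, which is possible if positions are conditioned on distances to all previous atoms rather than to the focus point alone.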

Reviewer 2

The authors extend the existing SchNet architecture to a generative framework, called G-SchNet, which can generate rotationally invariant 3D point sets. The auxiliary tokens introduced by the authors are very interesting, as they can impose constraints on the generated points. The paper is a complete piece of work, with enough experiments to demonstrate the effectiveness of the proposed method. Overall, the paper should have some practical impact on research.

Reviewer 3

[Originality] This paper proposes a new task of generating the 3D geometry of molecules. I think the task is original and important for computational chemistry. The underlying generative model is a variant of SchNet that expands the molecule one atom at a time, along with its distances to previous atoms. In that regard, the model is similar to GraphRNN (You et al., 2018), but operates over point clouds instead of graphs. The related work is mostly complete, but I think the authors should discuss how their method differs from Mansimov et al., 2019, which is also a generative model for 3D molecular geometry. AFAIK, Mansimov et al.'s model only generates 3D geometry, while G-SchNet learns to generate both the molecule (atoms and bonds) and its 3D geometry (distances).

[Quality] The submission is technically sound. The experiments cover both conditional and unconditional generation tasks. Overall, the experimental results support the claims. The paper compares against CGVAE in terms of how well the learned distribution matches the training set (number of different atoms, bonds, etc.). It would be better to also compare the distributions of chemical properties (logP, QED, etc.).

[Clarity] Most parts are clear. What is unclear to me is 1) what the relaxation procedure is and why it is necessary for evaluation, and 2) the advantages and disadvantages of the proposed method compared to Mansimov et al., 2019.

[Significance] The paper develops a novel method addressing an important task in chemistry. I think other researchers will be interested in using the method for future research.

=====

Here is my response to the author rebuttal:

1) It is good that the authors provided a comparison with Mansimov et al., 2019 (a very important baseline). The result shows that G-SchNet achieves better performance. It would be helpful if the authors could say more about the setup of Mansimov et al.'s model.

2) The authors explained the relaxation procedure. In a future version, it would be better to provide some examples (e.g., in the appendix) illustrating how relaxation affects the performance (RMSD).

3) I agree with the authors regarding the novelty of the task (generating 3D molecular geometries).

Based on these points, I will keep my original scores.
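For reference, the RMSD mentioned in point 2 is commonly computed after optimally superimposing the two geometries. The following is a minimal sketch of such a computation using the Kabsch algorithm, assuming both structures list the same atoms in the same order. It is not the authors' evaluation code, and the function name `kabsch_rmsd` is hypothetical.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between point sets P and Q (shape (n, 3)) after optimal rigid alignment."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Remove translation by centering both point sets on their centroids.
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    # Cross-covariance matrix and its singular value decomposition.
    H = P.T @ Q
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so the result is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Rotate P onto Q (points are stored as rows, hence the transpose).
    P_aligned = P @ R.T
    return float(np.sqrt(np.mean(np.sum((P_aligned - Q) ** 2, axis=1))))


# Toy check: a rotated and translated copy of a structure has RMSD close to zero.
generated = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 1.2, 0.0], [0.3, 0.2, 1.1]])
rot_z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
moved = generated @ rot_z.T + np.array([1.0, 2.0, 3.0])
print(kabsch_rmsd(generated, moved))
```

In the setting discussed here, `P` would be a generated geometry and `Q` the same molecule after relaxation, so a small value indicates that the generated structure was already close to an equilibrium configuration.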