Sunghwan Kim - PubChem3D: Conformer generation

Document created by Sunghwan Kim on Jul 8, 2015. Last modified by Sunghwan Kim on Jul 9, 2015.
Version 3

  Publication Details (including relevant citation information):

  E.E. Bolton, S. Kim, and S.H. Bryant;

  Journal of Cheminformatics, 2011, 3, 4.




  PubChem, an open archive for the biological activities of small molecules, provides search and analysis tools to assist users in locating desired information. Many of these tools focus on the notion of chemical structure similarity at some level. PubChem3D adds similarity between 3-D conformers of chemical structures to the existing similarity between 2-D chemical structure graphs. It is also desirable to relate theoretical 3-D descriptions of chemical structures to experimental biological activity. As such, it is important to be assured that the theoretical conformer models can reproduce experimentally determined bioactive conformations. In the present study, we investigated the effects of three primary conformer generation parameters (the fragment sampling rate, the energy window size, and the force field variant) upon the accuracy of theoretical conformer models, and determined optimal settings for PubChem3D conformer model generation and conformer sampling.


  Using the software package OMEGA from OpenEye Scientific Software, Inc., theoretical 3-D conformer models were generated for 25,972 small-molecule ligands whose 3-D structures were experimentally determined. Different values for the primary conformer generation parameters were systematically tested to find optimal settings. Employing a greater fragment sampling rate than the default did not improve the accuracy of the theoretical conformer model ensembles. An ever-increasing energy window did increase the overall average accuracy, with rapid convergence observed at 10 kcal/mol and 15 kcal/mol for model building and torsion search, respectively; however, subsequent study showed that an energy threshold of 25 kcal/mol for torsion search gave slightly improved results for larger and more flexible structures. Exclusion of Coulomb terms from the 94s variant of the Merck molecular force field (MMFF94s) in the torsion search stage gave more accurate conformer models at lower energy windows. Overall average accuracy of reproduction of bioactive conformations was remarkably linear with respect to both non-hydrogen atom count ("size") and effective rotor count ("flexibility"). Using these as independent variables, a regression equation was developed to predict the RMSD accuracy with which a theoretical ensemble reproduces bioactive conformations. The equation was modified to give a minimum RMSD conformer sampling value to help ensure that 90% of the sampled theoretical models contain at least one conformer within the RMSD sampling value of a "bioactive" conformation.
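
  The role of the energy window discussed above can be illustrated with a minimal sketch. It assumes the usual meaning of an energy window in conformer generation: a conformer is retained only if its energy lies within the window of the lowest-energy conformer found. The function name and the list-of-energies input are hypothetical and are not part of OMEGA or of the published study.

```python
def within_energy_window(energies, window):
    """Return indices of conformers whose energy is within `window`
    (kcal/mol) of the lowest-energy conformer.

    `energies` is a list of conformer energies in kcal/mol.
    This is an illustrative filter, not OMEGA's implementation.
    """
    if window < 0:
        raise ValueError("window must be non-negative")
    if not energies:
        return []
    lowest = min(energies)
    # Keep a conformer if its energy relative to the minimum fits the window.
    return [i for i, e in enumerate(energies) if e - lowest <= window]
```

  Under this reading, widening the window from 15 to 25 kcal/mol can only add conformers to the ensemble, which is consistent with accuracy improving or levelling off as the window grows.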


  Optimal parameters for conformer generation using OMEGA were explored and determined. An equation was developed that provides an RMSD sampling value based on the relative accuracy with which bioactive conformations are reproduced. The optimal conformer generation parameters and RMSD sampling values determined are used by the PubChem3D project to generate theoretical conformer models.
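
  The shape of such an equation, and the 90% criterion it was tuned to, can be sketched as follows. This is only an illustration under stated assumptions: the abstract says the accuracy is linear in non-hydrogen atom count and effective rotor count but does not give the coefficients, so they are passed in by the caller and no values are implied. All function and parameter names are hypothetical.

```python
import math


def predicted_rmsd(heavy_atoms, effective_rotors, intercept, per_atom, per_rotor):
    """Linear model: RMSD = intercept + per_atom * size + per_rotor * flexibility.

    The coefficients are NOT taken from the paper; the caller must supply them.
    """
    return intercept + per_atom * heavy_atoms + per_rotor * effective_rotors


def coverage(best_rmsds, threshold):
    """Fraction of ensembles whose closest conformer to the bioactive
    conformation lies within `threshold` (angstroms)."""
    if not best_rmsds:
        raise ValueError("need at least one ensemble")
    hits = sum(1 for r in best_rmsds if r <= threshold)
    return hits / len(best_rmsds)


def threshold_for_coverage(best_rmsds, target=0.90):
    """Smallest observed best-RMSD value that covers at least `target`
    of the ensembles."""
    if not best_rmsds:
        raise ValueError("need at least one ensemble")
    ordered = sorted(best_rmsds)
    needed = max(math.ceil(target * len(ordered)), 1)  # ensembles to cover
    return ordered[needed - 1]
```

  Here `best_rmsds` would hold, for each molecule, the lowest RMSD between any conformer in its theoretical ensemble and the experimental structure. A sampling value chosen with `threshold_for_coverage` then satisfies `coverage(best_rmsds, value) >= 0.90` on that data set by construction.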


  Address (URL):