Molecular Partition Coefficient from Machine Learning with Polarization and Entropy Embedded Atom-Centered Symmetry Functions

07 June 2022, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Efficient prediction of the partition coefficient ($\log P$) between polar and non-polar phases could shorten the design cycle for drugs and materials. In this work, we propose a descriptor, named $\langle q-ACSFs \rangle_{conf}$, that accounts for explicit polarization effects in the polar phase and for the conformational ensemble, with its energetic and entropic contributions, in the non-polar phase. Polarization effects are included by embedding partial charges, derived directly from force fields or quantum chemistry calculations, into the atom-centered symmetry functions (ACSFs). Entropy effects are included by averaging over distinct conformations, selected using a similarity matrix, according to their Boltzmann distribution. The model was trained with high-dimensional neural networks (HDNNs) on the public PhysProp dataset ($41039$ samples). Satisfactory $\log P$ prediction performance was achieved on three other datasets, namely Martel ($707$ molecules), Star & Non-Star ($266$) and Huuskonen ($1870$). The $\langle q-ACSFs \rangle_{conf}$ model was also applicable to $n$-carboxylic acids with $2$ to $14$ carbon atoms and to $54$ organic solvents. The method can be readily applied to systems of arbitrary size and yields a transferable atom-based partition coefficient.
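
The two ingredients of the descriptor can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the functional form, so the sketch assumes that each neighbor's contribution to a Behler-type radial symmetry function is multiplied by its partial charge, and that per-conformer descriptors are combined with normalized Boltzmann weights. The function names, the parameters `eta`, `r_s`, `r_c`, and the energy units (kcal/mol) are all hypothetical choices made for illustration.

```python
import numpy as np

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def cosine_cutoff(r, r_c):
    """Behler-type cosine cutoff: decays smoothly to zero at r_c."""
    return np.where(r < r_c, 0.5 * (np.cos(np.pi * r / r_c) + 1.0), 0.0)

def charge_weighted_radial_acsf(coords, charges, eta=0.5, r_s=0.0, r_c=6.0):
    """One radial symmetry function value per atom.

    Assumed form (not taken from the paper): each neighbor j contributes
    q_j * exp(-eta * (r_ij - r_s)**2) * f_c(r_ij).
    """
    coords = np.asarray(coords, dtype=float)
    charges = np.asarray(charges, dtype=float)
    # Pairwise distance matrix r[i, j]
    diff = coords[:, None, :] - coords[None, :, :]
    r = np.linalg.norm(diff, axis=-1)
    terms = charges[None, :] * np.exp(-eta * (r - r_s) ** 2) * cosine_cutoff(r, r_c)
    np.fill_diagonal(terms, 0.0)  # an atom is not its own neighbor
    return terms.sum(axis=1)

def boltzmann_average(descriptors, energies, temperature=298.15):
    """Average per-conformer descriptors with normalized Boltzmann weights."""
    descriptors = np.asarray(descriptors, dtype=float)
    energies = np.asarray(energies, dtype=float)
    # Shift by the minimum energy for numerical stability
    weights = np.exp(-(energies - energies.min()) / (K_B * temperature))
    weights /= weights.sum()
    return weights @ descriptors

# Usage: two conformers of a hypothetical three-atom fragment
conformers = [
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.5, 1.0, 0.0]],
]
charges = [-0.4, 0.2, 0.2]   # illustrative partial charges
energies = [0.0, 0.8]        # illustrative relative energies, kcal/mol
per_conf = [charge_weighted_radial_acsf(c, charges) for c in conformers]
averaged = boltzmann_average(per_conf, energies)
```

In this sketch the averaged vector has one entry per atom, which is the shape an atom-wise network would consume; a full descriptor would stack many such functions with different `eta` and `r_s` values and would typically add angular terms.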

Keywords

partition coefficient
high-dimensional neural networks
polarization
entropy

Supplementary materials

Title: Molecular Partition Coefficient from Machine Learning with Polarization and Entropy Embedded Atom-Centered Symmetry Functions
Description: Additional details on the collected datasets, the generation of descriptors, the computational methods for molecular dynamics (MD) simulations and quantum mechanics (QM) calculations, the hyper-parameter optimization of the high-dimensional neural networks, and the contributions from distinct elements in different environments.
