Insights and Challenges in Correcting Force Field Based Solvation Free Energies Using A Neural Network Potential

29 November 2023, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

We present a comprehensive study of the potential gain in accuracy when absolute solvation free energies (ASFEs) are calculated using a neural network potential for the intramolecular energies. We calculated ASFEs using the Open Force Field (OpenFF) and compared the results to previously calculated ASFEs obtained with the CHARMM General Force Field (CGenFF). By applying a nonequilibrium (NEQ) switching approach between the molecular mechanics (MM) description (either OpenFF or CGenFF) and the machine learning (ML)/MM level of theory (using ANI-2x as the ML potential), we attempted to improve the accuracy of the calculated ASFEs. The predictive performance did not change when this approach was applied to all 589 small molecules in the FreeSolv database that ANI-2x can describe. For a subset of 156 molecules, selected with a focus on compounds where the force fields performed poorly, we saw a slight improvement in the root-mean-square error (RMSE) and mean absolute error (MAE). The majority of our calculations used unidirectional NEQ protocols based on Jarzynski's equation. Additionally, we conducted bidirectional NEQ switching for the subset of 156 FreeSolv molecules. Notably, only a small fraction (10 out of 156) exhibited statistically significant discrepancies between the unidirectional and bidirectional NEQ switching free energy estimates.
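As a minimal sketch of the quantities named in the abstract, the snippet below evaluates a unidirectional free energy estimate from Jarzynski's equation, ΔF = -kT ln⟨exp(-W/kT)⟩, together with the RMSE and MAE used to judge predictive performance. It is not the authors' code: the function names and the work values are hypothetical, and the work values are assumed to be given in units of kT.

```python
import numpy as np


def jarzynski_free_energy(work_kT):
    """Unidirectional estimate dF = -ln <exp(-W)>, with W and dF in units of kT.

    Assumption: `work_kT` holds the nonequilibrium work values of independent
    switching trajectories (e.g. MM -> ML/MM), already divided by kT.
    """
    w = np.asarray(work_kT, dtype=float)
    # Shift by the largest exponent before exponentiating (log-mean-exp)
    # so that large work values do not underflow or overflow.
    shift = np.max(-w)
    return -(shift + np.log(np.mean(np.exp(-w - shift))))


def rmse(predicted, reference):
    """Root-mean-square error between predicted and reference values."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def mae(predicted, reference):
    """Mean absolute error between predicted and reference values."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.mean(np.abs(diff)))


if __name__ == "__main__":
    # Hypothetical work values (in kT) from a handful of switching trajectories.
    work = [1.8, 2.4, 2.1, 3.0, 1.6]
    print("Jarzynski estimate (kT):", jarzynski_free_energy(work))
    print("Mean work (kT):", np.mean(work))
```

Because the exponential average is dominated by low-work trajectories, the estimate never exceeds the mean work, and it converges slowly when the work distribution is broad. This is one reason a comparison against a bidirectional estimate is a useful consistency check.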

Keywords

Solvation free energy
Alchemical pathway
Indirect thermodynamic cycle
Neural Network Potential

Supplementary materials

Title: Supplementary material to the main manuscript
Description: We describe in detail how ASFEs were calculated at the MM level of theory and how the endstate corrections to the ANI-2x potential were applied for the two protocols (UVIE and EXS). Figure S1 shows the error distribution of the MM-calculated ASFEs for three different force fields. Figures S2 and S3 show the structures of the outliers observed in Figure 5 for the two investigated force fields. Figure S4 shows several characteristics of the 10 compounds (of the 156-compound subset) for which the MM -> ML/MM correction differed by more than 1 kT between the estimates from Jarzynski's equation and from Crooks' equation.

Title: Full results for 589 compounds from the FreeSolv database
Description: Summary of the ASFEs at the MM and ML/MM levels for the 589 compounds of the FreeSolv database, calculated using protocol EXS.

Title: Detailed results for the 156 compound subset
Description: Summary of the ASFE values at the MM and ML/MM levels for the two force fields (OpenFF and CGenFF) for the 156 compounds of the combined dataset, calculated with protocols EXS and UVIE. For the CGenFF correction, unidirectional values are indicated with "Jar" and bidirectional values with "Crooks".
