Published in

Oxford University Press, Briefings in Bioinformatics, 2(23), 2022

DOI: 10.1093/bib/bbab590

An inductive transfer learning force field (ITLFF) protocol builds protein force fields in seconds

Journal article published in 2022 by Yanqiang Han, Zhilong Wang, An Chen, Imran Ali, Junfei Cai, Simin Ye, Jinjin Li
This paper is made freely available by the publisher.

Archiving policy (data provided by SHERPA/RoMEO)
Preprint: archiving allowed
Postprint: archiving restricted
Published version: archiving forbidden

Abstract

Accurate simulation of protein folding is a unique challenge in understanding the physical process of protein folding, with important implications for protein design and drug discovery. Molecular dynamics simulation requires advanced, highly accurate force fields to achieve correct folding. However, the current force fields are inaccurate, inapplicable and inefficient. We propose a machine learning protocol, the inductive transfer learning force field (ITLFF), to construct protein force fields in seconds with any level of accuracy from a small dataset. This process is achieved by incorporating an inductive transfer learning algorithm into deep neural networks, which learn knowledge of any high-level calculations from a large dataset of a low-level method. Here, we use a double-hybrid density functional theory (DFT) functional as a case study, but ITLFF is suitable for any high-precision functional. The performance on the 18 selected proteins indicates that, compared with the fragment-based double-hybrid DFT algorithm, the force field constructed by ITLFF achieves considerable accuracy, with a mean absolute error of 0.0039 kcal/mol/atom for energy and a root mean square error of 2.57 kcal/mol/Å for force. It is also more than 30 000 times faster and obtains more significant efficiency benefits as the system size increases. The outstanding performance of ITLFF provides promising prospects for accurate and efficient protein dynamics simulations and makes an important step toward protein folding simulation. Because ITLFF can utilize the knowledge acquired in one task to solve related problems, it is also applicable to various problems in biology, chemistry and materials science.
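The transfer step described in the abstract (pretrain on a large dataset from a cheap method, then adapt with a small dataset from an expensive one) can be illustrated with a toy sketch. This is not the authors' ITLFF implementation: the one-hidden-layer network, the synthetic `low_level` and `high_level` target functions, the choice to freeze the hidden layer during fine-tuning, and every name and hyperparameter below are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(n_in, n_hidden):
    # Small one-hidden-layer network (hypothetical size, not from the paper).
    return {
        "W1": rng.normal(0.0, 1.0, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_hidden, 1)),
        "b2": np.zeros(1),
    }

def forward(p, X):
    H = np.tanh(X @ p["W1"] + p["b1"])           # hidden features
    return H, (H @ p["W2"] + p["b2"]).ravel()    # scalar prediction per sample

def mse(p, X, y):
    return float(np.mean((forward(p, X)[1] - y) ** 2))

def train(p, X, y, lr, steps, freeze_hidden=False):
    # Full-batch gradient descent on the mean squared error.
    p = {k: v.copy() for k, v in p.items()}
    n = len(y)
    for _ in range(steps):
        H, pred = forward(p, X)
        g = (2.0 / n) * (pred - y)[:, None]      # dLoss/dpred, shape (n, 1)
        gW2, gb2 = H.T @ g, g.sum(axis=0)
        if not freeze_hidden:
            gH = (g @ p["W2"].T) * (1.0 - H ** 2)  # backprop through tanh
            p["W1"] -= lr * (X.T @ gH)
            p["b1"] -= lr * gH.sum(axis=0)
        p["W2"] -= lr * gW2
        p["b2"] -= lr * gb2
    return p

# Synthetic stand-ins (assumptions): a cheap "low-level" target and a
# related but more expensive "high-level" target.
def low_level(x):
    return np.sin(x)

def high_level(x):
    return np.sin(x) + 0.3 * x

X_low = rng.uniform(-3.0, 3.0, (2000, 1))    # large, cheap dataset
y_low = low_level(X_low).ravel()
X_high = rng.uniform(-3.0, 3.0, (20, 1))     # small, expensive dataset
y_high = high_level(X_high).ravel()

# Step 1: pretrain the whole network on the large low-level dataset.
pretrained = train(init_params(1, 16), X_low, y_low, lr=0.01, steps=2000)

# Step 2: reuse the learned hidden features and fine-tune only the
# output layer on the small high-level dataset.
finetuned = train(pretrained, X_high, y_high, lr=0.01, steps=2000,
                  freeze_hidden=True)

mse_before = mse(pretrained, X_high, y_high)
mse_after = mse(finetuned, X_high, y_high)
print(f"high-level MSE before fine-tuning: {mse_before:.4f}")
print(f"high-level MSE after fine-tuning:  {mse_after:.4f}")
```

Freezing the hidden layer keeps the fine-tuning problem linear in the remaining parameters, so a small learning rate lowers the error on the 20 high-level samples at every step. A real force field would instead predict energies and forces from atomic environments, and which layers are retrained is a design choice this sketch does not take from the paper.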