Optimization of binding affinities in chemical space with generative pre-trained transformer and deep reinforcement learning

Xu, Xiaopeng; Zhou, Juexiao; Zhu, Chen; Zhan, Qing; Li, Zhongxiao; Zhang, Ruochi; Wang, Yu; Liao, Xingyu; Gao, Xin

Published in

F1000Research, (12), p. 757, 2023

DOI: 10.12688/f1000research.130936.1

F1000Research, (12), p. 757, 2024

DOI: 10.12688/f1000research.130936.2

Journal article published in 2024 by Xiaopeng Xu, Juexiao Zhou, Chen Zhu, Qing Zhan, Zhongxiao Li, Ruochi Zhang, Yu Wang, Xingyu Liao, Xin Gao

This paper is made freely available by the publisher.

Preprint: archiving forbidden

Postprint: archiving forbidden

Published version: archiving allowed

Abstract

Background: A key challenge in drug discovery is finding novel compounds with desirable properties. Among these properties, binding affinity to a target is a prerequisite and is usually evaluated by molecular docking or quantitative structure-activity relationship (QSAR) models.

Methods: In this study, we developed the Simplified molecular input line entry system Generative Pre-trained Transformer with Reinforcement Learning (SGPT-RL), which uses a transformer decoder as the policy network of the reinforcement learning agent to optimize binding affinity to a target. SGPT-RL was evaluated on the Moses distribution learning benchmark and on two goal-directed generation tasks, with Dopamine Receptor D2 (DRD2) and Angiotensin-Converting Enzyme 2 (ACE2) as the targets. Both a QSAR model and molecular docking were used as optimization goals in these tasks. The popular Reinvent method was used as the baseline for comparison.

Results: On the Moses benchmark, SGPT-RL learned good property distributions and generated molecules with high validity and novelty. On the two goal-directed generation tasks, both SGPT-RL and Reinvent generated valid molecules with improved target scores. SGPT-RL achieved better results than Reinvent on the ACE2 task, where molecular docking was the optimization goal. Further analysis shows that SGPT-RL learned conserved scaffold patterns during exploration.

Conclusions: The superior performance of SGPT-RL on the ACE2 task indicates that it can be applied to virtual screening, where molecular docking is widely used as the criterion. In addition, the scaffold patterns that SGPT-RL learned during exploration can help chemists design and discover novel lead candidates.
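The abstract does not spell out the training objective, so the following is only a minimal sketch of the kind of score-driven update used in Reinvent-style reinforcement learning, which the study takes as its baseline. It assumes that a fixed prior model and a trainable agent each assign a log-likelihood to a generated SMILES string, and that a scoring function (for example a QSAR prediction or a rescaled docking score) returns a value where higher is better. The function name, the argument names, and the value of `sigma` are hypothetical choices made for illustration, not taken from the paper.

```python
from typing import Sequence


def augmented_likelihood_loss(
    prior_logp: Sequence[float],
    agent_logp: Sequence[float],
    scores: Sequence[float],
    sigma: float,
) -> float:
    """Mean squared gap between the agent's log-likelihood and the
    score-augmented prior log-likelihood (Reinvent-style objective).

    All names here are illustrative assumptions, not the paper's API.
    """
    if not (len(prior_logp) == len(agent_logp) == len(scores)):
        raise ValueError("all inputs must have the same length")
    if len(prior_logp) == 0:
        raise ValueError("at least one molecule is required")

    total = 0.0
    for p, a, s in zip(prior_logp, agent_logp, scores):
        # Shift the prior log-likelihood upward for high-scoring molecules.
        augmented = p + sigma * s
        # Penalize the agent for deviating from the augmented target.
        total += (augmented - a) ** 2
    return total / len(prior_logp)


# Toy batch of two molecules; the numbers are made up for illustration.
loss = augmented_likelihood_loss(
    prior_logp=[-10.0, -20.0],
    agent_logp=[-10.0, -18.0],
    scores=[0.0, 1.0],
    sigma=2.0,
)
print(loss)
```

In an actual training loop the agent's log-likelihoods would come from the policy network (here, a transformer decoder) and the loss would be minimized by gradient descent, so that molecules with higher scores become more likely under the agent while it stays anchored to the prior.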
