Published in

MDPI, Molecules, 9(27), p. 2980, 2022

DOI: 10.3390/molecules27092980

Links

Tools

Export citation

Search in Google Scholar

UnbiasedDTI: Mitigating Real-World Bias of Drug-Target Interaction Prediction by Using Deep Ensemble-Balanced Learning

This paper is made freely available by the publisher.
This paper is made freely available by the publisher.

Full text: Download

Green circle
Preprint: archiving allowed
Green circle
Postprint: archiving allowed
Green circle
Published version: archiving allowed
Data provided by SHERPA/RoMEO

Abstract

Drug-target interaction (DTI) prediction through in vitro methods is expensive and time-consuming. On the other hand, computational methods can save time and money while enhancing drug discovery efficiency. Most of the computational methods frame DTI prediction as a binary classification task. One important challenge is that the number of negative interactions in all DTI-related datasets is far greater than the number of positive interactions, leading to the class imbalance problem. As a result, a classifier is trained biased towards the majority class (negative class), whereas the minority class (interacting pairs) is of interest. This class imbalance problem is not widely taken into account in DTI prediction studies, and the few previous studies considering balancing in DTI do not focus on the imbalance issue itself. Additionally, they do not benefit from deep learning models and experimental validation. In this study, we propose a computational framework along with experimental validations to predict drug-target interaction using an ensemble of deep learning models to address the class imbalance problem in the DTI domain. The objective of this paper is to mitigate the bias in the prediction of DTI by focusing on the impact of balancing and maintaining other involved parameters at a constant value. Our analysis shows that the proposed model outperforms unbalanced models with the same architecture trained on the BindingDB both computationally and experimentally. These findings demonstrate the significance of balancing, which reduces the bias towards the negative class and leads to better performance. It is important to note that leaning on computational results without experimentally validating them and by relying solely on AUROC and AUPRC metrics is not credible, particularly when the testing set remains unbalanced.