Explainable deep transfer learning model for disease risk prediction using high-dimensional genomic data

Liu, Long; Meng, Qingyu; Weng, Cherry; Lu, Qing; Wang, Tong; Wen, Yalu

Published in

Public Library of Science, PLoS Computational Biology, 7(18), p. e1010328, 2022

DOI: 10.1371/journal.pcbi.1010328

Journal article published in 2022 by Long Liu, Qingyu Meng, Cherry Weng, Qing Lu, Tong Wang, Yalu Wen

This paper is made freely available by the publisher.

Preprint: archiving allowed

Postprint: archiving allowed

Published version: archiving allowed

Abstract

Building an accurate disease risk prediction model is an essential step toward precision medicine. While high-dimensional genomic data are a valuable resource for investigating disease risk, their high noise levels and the complex relationships between predictors and outcomes pose substantial analytical challenges. Deep learning models are state-of-the-art methods for many prediction tasks and offer a promising framework for the analysis of genomic data. However, they generally suffer from the curse of dimensionality and a lack of biological interpretability, both of which have greatly limited their application. In this work, we developed a deep neural network (DNN) based prediction modeling framework. We first proposed a group-wise feature importance score for feature selection, with which genes harboring genetic variants with both linear and non-linear effects are efficiently detected. We then designed an explainable transfer-learning based DNN method, which directly incorporates information from feature selection and accurately captures complex predictive effects. The proposed DNN framework is biologically interpretable because it is built on the selected predictive genes. It is also computationally efficient and can be applied to genome-wide data. Through extensive simulations and real data analyses, we demonstrated that, compared with many existing methods, the proposed method can both efficiently detect predictive features and accurately predict disease risk.
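The abstract does not say how the group-wise feature importance score is computed, so the following is only an illustrative sketch of the general idea of scoring variants gene by gene rather than one at a time. It swaps in a generic technique, joint permutation importance, and is not the authors' score. The function name `group_permutation_importance`, the gene labels, and the synthetic data are all hypothetical.

```python
import numpy as np


def group_permutation_importance(predict, X, y, groups, n_repeats=5, seed=0):
    """Mean increase in MSE when all columns of a group are permuted together.

    Assumption: this is a generic stand-in for a group-wise importance score,
    not the score proposed in the paper.
    """
    rng = np.random.default_rng(seed)
    base_mse = np.mean((y - predict(X)) ** 2)
    scores = {}
    for name, cols in groups.items():
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            perm = rng.permutation(X.shape[0])
            # Shuffle the rows of this group's columns jointly, so the
            # within-gene structure is kept but its link to y is broken.
            X_perm[:, cols] = X[perm][:, cols]
            increases.append(np.mean((y - predict(X_perm)) ** 2) - base_mse)
        scores[name] = float(np.mean(increases))
    return scores


# Synthetic example (hypothetical): only "gene_A" matters, and its effect is
# a non-linear interaction between its two variants.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 6))
y = X[:, 0] * X[:, 1] + 0.1 * rng.standard_normal(500)


def oracle_predict(X):
    # Stand-in for a fitted model; it uses only the two gene_A columns.
    return X[:, 0] * X[:, 1]


groups = {"gene_A": [0, 1], "gene_B": [2, 3], "gene_C": [4, 5]}
scores = group_permutation_importance(oracle_predict, X, y, groups)
print(scores)
```

In this toy setting the gene carrying the interaction effect receives a clearly positive score, while genes the model never uses score zero. Permuting a gene's variants together is what lets a purely non-linear, within-gene effect register at the gene level.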
