Published in

MDPI, Machine Learning and Knowledge Extraction, 3(3), p. 720-739, 2021

DOI: 10.3390/make3030036

Links

Tools

Export citation

Search in Google Scholar

Artificial Neural Network Analysis of Gene Expression Data Predicted Non-Hodgkin Lymphoma Subtypes with High Accuracy

Journal article published in 2021 by Joaquim Carreras ORCID, Rifat Hamoudi ORCID
This paper is made freely available by the publisher.
This paper is made freely available by the publisher.

Full text: Download

Green circle
Preprint: archiving allowed
Green circle
Postprint: archiving allowed
Green circle
Published version: archiving allowed
Data provided by SHERPA/RoMEO

Abstract

Predictive analytics using artificial intelligence is a useful tool in cancer research. A multilayer perceptron neural network used gene expression data to predict the lymphoma subtypes of 290 cases of non-Hodgkin lymphoma (GSE132929). The input layer included both the whole array of 20,863 genes and a cancer transcriptome panel of 1769 genes. The output layer was lymphoma subtypes, including follicular lymphoma, mantle cell lymphoma, diffuse large B-cell lymphoma, Burkitt lymphoma, and marginal zone lymphoma. The neural networks successfully classified the cases consistent with the lymphoma subtypes, with an area under the curve (AUC) that ranged from 0.87 to 0.99. The most relevant predictive genes were LCE2B, KNG1, IGHV7_81, TG, C6, FGB, ZNF750, CTSV, INGX, and COL4A6 for the whole set; and ARG1, MAGEA3, AKT2, IL1B, S100A7A, CLEC5A, WIF1, TREM1, DEFB1, and GAGE1 for the cancer panel. The characteristic predictive genes for each lymphoma subtypes were also identified with high accuracy (AUC = 0.95, incorrect predictions = 6.2%). Finally, the topmost relevant 30 genes of the whole set, which belonged to apoptosis, cell proliferation, metabolism, and antigen presentation pathways, not only predicted the lymphoma subtypes but also the overall survival of diffuse large B-cell lymphoma (series GSE10846, n = 414 cases), and most relevant cancer subtypes of The Cancer Genome Atlas (TCGA) consortium including carcinomas of breast, colorectal, lung, prostate, and gastric, melanoma, etc. (7441 cases). In conclusion, neural networks predicted the non-Hodgkin lymphoma subtypes with high accuracy, and the highlighted genes also predicted the survival of a pan-cancer series.