Published in

Oxford University Press, Bioinformatics, 38(10), p. 2899-2911, 2022

DOI: 10.1093/bioinformatics/btac185

Annotating regulatory elements by heterogeneous network embedding

Journal article published in 2022 by Yurun Lu, Zhanying Feng, Songmao Zhang, Yong Wang

Abstract

Motivation: Regulatory elements (REs), such as enhancers and promoters, are regulatory sequences that function within a heterogeneous regulatory network to control gene expression by recruiting transcription regulators and carrying genetic variants in a context-specific way. Annotating those REs relies on costly and labor-intensive next-generation sequencing and RNA-guided editing technologies in many cellular contexts.

Results: We propose a systematic Gene Ontology Annotation method for Regulatory Elements (RE-GOA) by leveraging the powerful word embedding in natural language processing. We first assemble a heterogeneous network by integrating context-specific regulations, protein–protein interactions and gene ontology (GO) terms. Then we perform network embedding and associate regulatory elements with GO terms by assessing their similarity in a low-dimensional vector space. With three applications, we show that RE-GOA outperforms existing methods in annotating TFs' binding sites from ChIP-seq data, in functional enrichment analysis of differentially accessible peaks from ATAC-seq data, and in revealing genetic correlation among phenotypes from their GWAS summary statistics data.

Availability and implementation: The source code and the systematic RE annotation for human and mouse are available at https://github.com/AMSSwanglab/RE-GOA.

Supplementary information: Supplementary data are available at Bioinformatics online.
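
The abstract says REs are associated with GO terms by assessing their similarity in a low-dimensional vector space, but it does not name the similarity measure. The sketch below assumes cosine similarity and uses small invented vectors. The node names (`RE:enhancer_1`, `GO:term_A` and so on) and the functions `cosine` and `rank_go_terms` are hypothetical and are not taken from the RE-GOA source code. It only illustrates ranking GO terms for one RE once embeddings exist, not how RE-GOA builds its network or learns the embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_go_terms(re_vector, go_vectors, top_k=3):
    """Return the top_k (term, similarity) pairs, most similar first."""
    scored = [(term, cosine(re_vector, vec)) for term, vec in go_vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings: made-up values standing in for the output of a network
# embedding step. Real embeddings would have many more dimensions.
embeddings = {
    "RE:enhancer_1": [0.9, 0.1, 0.0],
    "GO:term_A": [1.0, 0.0, 0.0],
    "GO:term_B": [0.0, 1.0, 0.0],
    "GO:term_C": [0.5, 0.5, 0.0],
}

# Keep only the GO-term nodes as annotation candidates.
go_vectors = {name: vec for name, vec in embeddings.items() if name.startswith("GO:")}

ranking = rank_go_terms(embeddings["RE:enhancer_1"], go_vectors, top_k=2)
for term, score in ranking:
    print(f"{term}\t{score:.3f}")
```

Because REs and GO terms share one embedding space in this setup, annotation reduces to a nearest-neighbour lookup. A different similarity measure could be substituted in `cosine` without changing the ranking logic.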