Published in

Elsevier, Molecular and Cellular Proteomics, 1(15), p. 329-339, 2016

DOI: 10.1074/mcp.m114.047126

Integrated transcriptomic-proteomic analysis using a proteogenomic workflow refines rat genome annotation

Journal article published in 2015 by Dhirendra Kumar, Amit Kumar Yadav, Xinying Jia, Jason Mulvenna, Debasis Dash
This paper is made freely available by the publisher.

Preprint: archiving allowed
Postprint: archiving allowed
Published version: archiving allowed
Data provided by SHERPA/RoMEO

Abstract

Proteogenomic re-annotation and mRNA splicing information can lead to the discovery of various protein forms in eukaryotic model organisms such as rat. However, detecting novel proteoforms from mass spectrometry (MS) proteomics data remains a formidable challenge. We developed EuGenoSuite, an open-source multi-algorithmic proteomic search tool, and used it in our in-house integrated transcriptomic-proteomic (ITP) pipeline to facilitate automated proteogenomic analysis. Using four proteogenomic pipelines (ITP, Peppy, Enosi and ProteoAnnotator) on publicly available RNA-seq and MS proteomics data, we discovered 363 novel peptides in rat brain microglia, which indicated novel proteoforms for 249 gene loci in the rat genome. These novel peptides aided the discovery of novel exons, translation of annotated untranslated regions (UTRs), pseudogenes and splice variants for various loci, many of which have known disease associations, including neurological disorders such as schizophrenia and amyotrophic lateral sclerosis. Novel isoforms were also discovered for genes implicated in cardiovascular diseases and breast cancer, for which rats are considered model organisms. Our integrative multi-omics data analysis not only enables the discovery of new proteoforms but also generates a better reference for human disease studies in model organisms.