Proteogenomic re-annotation and mRNA splicing information can lead to the discovery of various protein forms in eukaryotic model organisms such as the rat. However, detecting novel proteoforms from mass spectrometry (MS) proteomics data remains a formidable challenge. We developed EuGenoSuite, an open-source multi-algorithmic proteomic search tool, and used it in our in-house integrated transcriptomic-proteomic (ITP) pipeline to facilitate automated proteogenomic analysis. Applying four proteogenomic pipelines (ITP, Peppy, Enosi and ProteoAnnotator) to publicly available RNA-seq and MS proteomics data, we discovered 363 novel peptides in rat brain microglia, indicating novel proteoforms for 249 gene loci in the rat genome. These novel peptides aided in the discovery of novel exons, translation of annotated untranslated regions (UTRs), pseudogenes and splice variants for various loci, many of which have known disease associations, including neurological disorders such as schizophrenia and amyotrophic lateral sclerosis. Novel isoforms were also discovered for genes implicated in cardiovascular diseases and breast cancer, for which rats are considered model organisms. Our integrative multi-omics data analysis not only enables the discovery of new proteoforms but also generates a better reference for human disease studies in model organisms.