Published in

Oxford University Press (OUP), Bioinformatics, 15(30), p. 2105-2113

DOI: 10.1093/bioinformatics/btu162

Filling annotation gaps in yeast genomes using genome-wide contact maps

This paper is made freely available by the publisher.

Preprint: archiving allowed
Postprint: archiving allowed
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

MOTIVATION: De novo sequencing of genomes is followed by annotation analyses that aim to identify functional genomic features such as genes, non-coding RNAs or regulatory sequences, taking advantage of diverse datasets. These steps sometimes fail to detect non-coding functional sequences: for example, origins of replication, centromeres and rDNA positions have proven difficult to annotate with high confidence. Here, we demonstrate an unconventional application of the Chromosome Conformation Capture (3C) technique, which typically aims at deciphering the average 3D organization of genomes, by showing how functional information about the sequence can be extracted solely from the chromosome contact map.

RESULTS: Specifically, we describe a combined experimental and bioinformatic procedure that determines the genomic positions of centromeres and ribosomal DNA clusters in yeasts, including species where classical computational approaches fail. For instance, we determined the centromere positions in Naumovozyma castellii, where these coordinates could not be obtained previously. Although the computed centromere positions were characterized by conserved synteny with neighboring species, no consensus sequences could be found, suggesting that centromeric binding proteins or mechanisms have significantly diverged. We also used our approach to refine centromere positions in Kuraishia capsulata and to identify rDNA positions in Debaryomyces hansenii. Our study demonstrates how 3C data can be used to complete the functional annotation of eukaryotic genomes.

AVAILABILITY AND IMPLEMENTATION: The source code is provided in the Supplementary Material. This includes a zipped file with the Python code and a contact matrix of Saccharomyces cerevisiae.

CONTACT: romain.koszul@pasteur.fr

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
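As a rough illustration of how a functional position might be read off a contact map alone, the sketch below is a simplified stand-in, not the authors' procedure (which is distributed in the Supplementary Material). It assumes that yeast centromeres cluster in the nucleus, so bins near a centromere show enriched contacts with other chromosomes. The function names, the moving-average smoothing and the synthetic test map are all hypothetical choices made for this example.

```python
import numpy as np


def trans_contact_profile(contacts, chrom_ids):
    """Sum, for each bin, its contacts with bins on other chromosomes."""
    contacts = np.asarray(contacts, dtype=float)
    chrom_ids = np.asarray(chrom_ids)
    # trans_mask[i, j] is True when bins i and j lie on different chromosomes
    trans_mask = chrom_ids[:, None] != chrom_ids[None, :]
    return (contacts * trans_mask).sum(axis=1)


def estimate_centromere_bins(contacts, chrom_ids, window=3):
    """Return {chromosome: global bin index} of the smoothed trans-contact peak.

    Assumption for this sketch: every chromosome has at least `window` bins.
    """
    chrom_ids = np.asarray(chrom_ids)
    profile = trans_contact_profile(contacts, chrom_ids)
    kernel = np.ones(window) / window  # simple moving average
    result = {}
    for chrom in np.unique(chrom_ids):
        idx = np.flatnonzero(chrom_ids == chrom)
        smoothed = np.convolve(profile[idx], kernel, mode="same")
        result[chrom.item()] = int(idx[np.argmax(smoothed)])
    return result


def synthetic_map(bins_per_chrom, centromeres, width=2.0):
    """Build a toy contact map in which centromere-proximal bins co-localize."""
    chrom_ids = np.repeat(np.arange(len(bins_per_chrom)), bins_per_chrom)
    # Gaussian proximity of each bin to the centromere of its own chromosome
    proximity = np.concatenate([
        np.exp(-0.5 * ((np.arange(n) - c) / width) ** 2)
        for n, c in zip(bins_per_chrom, centromeres)
    ])
    contacts = 1.0 + 10.0 * np.outer(proximity, proximity)
    return contacts, chrom_ids


# Toy data: three chromosomes of 20 bins, centromeres at local bins 5, 12 and 8
contacts, chrom_ids = synthetic_map([20, 20, 20], [5, 12, 8])
estimated = estimate_centromere_bins(contacts, chrom_ids)
```

On real data the map would first need normalization and filtering of poorly covered bins, and the resolution of the estimate is limited by the bin size; those steps are left out here.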