Nakanishi Printing Co, Microbes and environments / JSME, 3(30), p. 208-220, 2015
Whole-genome sequencing has emerged as one of the most effective means of elucidating the biological roles and molecular features of obligate intracellular symbionts (endosymbionts). However, the de novo assembly of an endosymbiont genome remains a challenge when host and/or mitochondrial DNA sequences are present in a dataset and hinder the assembly. By focusing on the traits of genome evolution in endosymbionts, we developed and investigated a genome-assembly strategy that consists of two consecutive procedures. The first, named TBLASTX Contig Selection and Filtering (TCSF), selects endosymbiont contigs from the output of a de novo assembly by means of a TBLASTX search against a reference genome. The second, named Iterative Mapping and ReAssembling (IMRA), iteratively reassembles the genome from the reads mapped onto the selected contigs in order to merge those contigs. To validate this approach, we sequenced two strains of the cockroach endosymbiont Blattabacterium cuenoti and applied the strategy to the resulting datasets. TCSF was highly accurate and sensitive in contig selection, even when the genome of a distantly related free-living bacterium was used as the reference genome. Furthermore, IMRA markedly improved the sequence assemblies: the genomic sequence of an endosymbiont was almost completed from a dataset containing only 3% of the sequences of the endosymbiont's genome. The efficiency of our strategy may facilitate further studies on endosymbionts.
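The contig-selection idea behind TCSF can be illustrated with a minimal sketch. It assumes the TBLASTX results were written in the standard 12-column BLAST tabular format (query ID in column 1, alignment length in column 4, e-value in column 11). The function name `select_contigs` and the cutoffs `max_evalue` and `min_aln_len` are hypothetical choices for illustration, not the parameters or filtering rules used in the study.

```python
from typing import Iterable, Set


def select_contigs(
    tabular_lines: Iterable[str],
    max_evalue: float = 1e-5,   # assumed cutoff, not taken from the paper
    min_aln_len: int = 30,      # assumed cutoff, not taken from the paper
) -> Set[str]:
    """Return IDs of contigs with at least one hit that passes both cutoffs.

    Each input line is expected to hold the 12 standard tab-separated
    columns of BLAST tabular output, with the contig as the query.
    """
    selected: Set[str] = set()
    for line in tabular_lines:
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # malformed row, ignore it
        contig_id = fields[0]
        aln_len = int(fields[3])
        evalue = float(fields[10])
        if evalue <= max_evalue and aln_len >= min_aln_len:
            selected.add(contig_id)
    return selected


if __name__ == "__main__":
    # Made-up rows for demonstration only.
    demo = [
        "contig_1\tref\t60.0\t100\t40\t0\t1\t300\t1\t300\t1e-20\t150",
        "contig_2\tref\t30.0\t20\t14\t0\t1\t60\t1\t60\t0.5\t20",
    ]
    print(sorted(select_contigs(demo)))
```

Contigs kept by such a filter would then serve as the mapping targets for the iterative read-mapping and reassembly rounds (IMRA). That loop depends on external mapping and assembly tools and is not sketched here.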