Oxford University Press, Bioinformatics, 38(6), p. 1497-1503, 2022
DOI: 10.1093/bioinformatics/btac010
Abstract

Motivation: CRAM has established itself as a high compression alternative to the BAM file format for DNA sequencing data. We describe updates to further improve this on modern sequencing instruments.

Results: With Illumina data CRAM 3.1 is 7–15% smaller than the equivalent CRAM 3.0 file, and 50–70% smaller than the corresponding BAM file. Long-read technology shows more modest compression due to the presence of high-entropy signals.

Availability and implementation: The CRAM 3.0 specification is freely available from https://samtools.github.io/hts-specs/CRAMv3.pdf. The CRAM 3.1 improvements are available in a separate OpenSource HTScodecs library from https://github.com/samtools/htscodecs, and have been incorporated into HTSlib.

Supplementary information: Supplementary data are available at Bioinformatics online.
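
The percentages in the Results are relative to two different baselines (CRAM 3.0 and BAM), so they cannot simply be added together. The following is a minimal arithmetic sketch of what the reported ranges would mean for concrete files. The 100 GB BAM and 40 GB CRAM 3.0 sizes are hypothetical inputs chosen for illustration and are not taken from the paper; the function names are likewise illustrative.

```python
def size_after_reduction(original_size: float, percent_smaller: float) -> float:
    """Return the size of a file that is `percent_smaller` percent smaller
    than a file of `original_size` (any unit, e.g. GB)."""
    if not 0 <= percent_smaller <= 100:
        raise ValueError("percent_smaller must be between 0 and 100")
    return original_size * (100 - percent_smaller) / 100


def size_range(original_size: float, low_pct: float, high_pct: float) -> tuple[float, float]:
    """Return (smallest, largest) resulting size for a range of reductions.

    The largest reduction gives the smallest file, so the bounds swap.
    """
    return (
        size_after_reduction(original_size, high_pct),
        size_after_reduction(original_size, low_pct),
    )


# Hypothetical input sizes in GB (assumptions, not figures from the paper).
bam_gb = 100.0
cram30_gb = 40.0

# "50-70% smaller than the corresponding BAM file"
cram31_vs_bam = size_range(bam_gb, 50, 70)

# "7-15% smaller than the equivalent CRAM 3.0 file"
cram31_vs_cram30 = size_range(cram30_gb, 7, 15)

print(f"CRAM 3.1 from a {bam_gb:.0f} GB BAM: "
      f"{cram31_vs_bam[0]:.1f} to {cram31_vs_bam[1]:.1f} GB")
print(f"CRAM 3.1 from a {cram30_gb:.0f} GB CRAM 3.0: "
      f"{cram31_vs_cram30[0]:.1f} to {cram31_vs_cram30[1]:.1f} GB")
```

Each range is computed against its own baseline. Actual savings depend on the sequencing platform and data, as the abstract notes for long-read technology.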