Oxford University Press, Bioinformatics, 37(15), p. 2088-2094, 2021
DOI: 10.1093/bioinformatics/btab062
Abstract
Motivation: Hi-C matrices are cornerstones for qualitative and quantitative studies of genome folding, from its territorial organization to compartments and topological domains. The high dynamic range of genomic distances probed in Hi-C assays is reflected in an inherent stochastic background of the interaction matrices, which inevitably convolves the features of interest with largely non-specific ones.
Results: Here, we introduce and discuss essHi-C, a method to isolate the specific, or essential, component of Hi-C matrices from the non-specific portion of the spectrum compatible with random matrices. Systematic comparisons show that essHi-C improves the clarity of the interaction patterns, makes the identification of topologically associating domains more robust to sequencing depth, allows the unsupervised clustering of experiments in different cell lines and recovers the cell-cycle phasing of single cells based on Hi-C data. Thus, essHi-C provides a means for isolating significant biological and physical features from Hi-C matrices.
Availability and implementation: The essHi-C software package is available at https://github.com/stefanofranzini/essHIC.
Supplementary information: Supplementary data are available at Bioinformatics online.
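The Results describe separating an essential part of the spectrum from a part compatible with random matrices. The following is a minimal illustrative sketch of that general idea on a toy symmetric matrix, not the essHi-C implementation. The function name `spectral_filter`, the parameter `n_keep`, the choice to rank eigenvalues by magnitude, and the synthetic two-block matrix are all assumptions made for illustration; the abstract does not state how essHi-C selects the essential eigenspaces.

```python
import numpy as np


def spectral_filter(matrix, n_keep):
    """Rebuild a symmetric matrix from its n_keep largest-magnitude eigenpairs.

    Hypothetical helper: the selection rule (top eigenvalues by absolute
    value) is an assumption, not the criterion used by essHi-C.
    """
    sym = 0.5 * (matrix + matrix.T)           # enforce exact symmetry
    eigvals, eigvecs = np.linalg.eigh(sym)    # real spectrum, orthonormal vectors
    order = np.argsort(np.abs(eigvals))[::-1][:n_keep]
    kept_vecs = eigvecs[:, order]
    # Sum of lambda_i * v_i v_i^T over the retained eigenpairs.
    return (kept_vecs * eigvals[order]) @ kept_vecs.T


# Toy stand-in for a contact map: two dense blocks plus symmetric noise.
rng = np.random.default_rng(0)
n = 60
signal = np.zeros((n, n))
signal[:30, :30] = 1.0
signal[30:, 30:] = 1.0

noise = rng.normal(scale=0.3, size=(n, n))
noise = 0.5 * (noise + noise.T)
noisy = signal + noise

# The block structure lives in two eigenvectors, so keep two eigenpairs.
filtered = spectral_filter(noisy, n_keep=2)

err_before = np.linalg.norm(noisy - signal)
err_after = np.linalg.norm(filtered - signal)
print(f"Frobenius error before filtering: {err_before:.2f}")
print(f"Frobenius error after filtering:  {err_after:.2f}")
```

In this toy setting the two retained eigenvalues sit far above the bulk produced by the noise, so the low-rank reconstruction is closer to the underlying block pattern than the noisy input is. Real Hi-C matrices need normalization and a principled cutoff, which this sketch does not attempt.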