GCparagon: evaluating and correcting GC biases in cell-free DNA at the fragment level

Spiegl, Benjamin; Kapidzic, Faruk; Röner, Sebastian; Kircher, Martin; Speicher, Michael R

Published in

Oxford University Press, NAR Genomics and Bioinformatics, 4(5), 2023

DOI: 10.1093/nargab/lqad102

Journal article published in 2023 by Benjamin Spiegl, Faruk Kapidzic, Sebastian Röner, Martin Kircher, Michael R Speicher

This paper is made freely available by the publisher.

Preprint: archiving allowed

Postprint: archiving allowed

Published version: archiving allowed

Abstract

Analyses of cell-free DNA (cfDNA) are increasingly being employed for various diagnostic and research applications. Many technologies aim to increase resolution, e.g. for detecting early-stage cancer or minimal residual disease. However, these efforts may be confounded by inherent base composition biases of cfDNA, specifically the over- and underrepresentation of guanine (G) and cytosine (C) sequences. Currently, there is no universally applicable tool to correct these effects on sequencing read-level data. Here, we present GCparagon, a two-stage algorithm for computing and correcting GC biases in cfDNA samples. In the initial step, length and GC base count parameters are determined. Here, our algorithm minimizes the inclusion of known problematic genomic regions, such as low-mappability regions, in its calculations. In the second step, GCparagon computes weights counterbalancing the distortion of cfDNA attributes (correction matrix). These fragment weights are added to a binary alignment map (BAM) file as alignment tags for individual reads. The GC correction matrix or the tagged BAM file can be used for downstream analyses. Parallel computing allows for a GC bias estimation below 1 min. We demonstrate that GCparagon vastly improves the analysis of regulatory regions, which frequently show specific GC composition patterns and will contribute to standardized cfDNA applications.
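The idea of a correction matrix indexed by fragment length and GC base count can be illustrated with a small sketch. This is not GCparagon's implementation: the function names (`gc_count`, `correction_matrix`, `fragment_weight`), the choice of an "expected" fragment set as the unbiased reference, and the neutral weight of 1.0 for unseen combinations are all assumptions made for illustration. Each weight is taken as the ratio of the expected to the observed relative frequency of a (length, GC count) combination, so that overrepresented combinations are weighted down and underrepresented ones are weighted up.

```python
from collections import Counter


def gc_count(seq):
    """Number of G and C bases in a fragment sequence."""
    s = seq.upper()
    return s.count("G") + s.count("C")


def correction_matrix(observed, expected):
    """Map (fragment length, GC count) -> weight.

    observed: fragment sequences from the sample (hypothetical input).
    expected: fragment sequences representing the unbiased reference
              distribution (an assumption of this sketch).
    """
    obs = Counter((len(s), gc_count(s)) for s in observed)
    exp = Counter((len(s), gc_count(s)) for s in expected)
    n_obs = sum(obs.values())
    n_exp = sum(exp.values())
    weights = {}
    for key, o in obs.items():
        e = exp.get(key, 0)
        if e == 0:
            # No reference information: leave the fragment uncorrected.
            weights[key] = 1.0
        else:
            weights[key] = (e / n_exp) / (o / n_obs)
    return weights


def fragment_weight(weights, seq):
    """Look up the weight for one fragment; 1.0 if the combination is unseen."""
    return weights.get((len(seq), gc_count(seq)), 1.0)


# Toy data: GC-rich fragments are overrepresented in the sample.
observed = ["GGCC"] * 3 + ["ATAT"] * 1
expected = ["GGCC"] * 2 + ["ATAT"] * 2
weights = correction_matrix(observed, expected)
corrected_total = sum(fragment_weight(weights, s) for s in observed)
```

In this toy case the GC-rich fragments receive a weight below 1 and the AT-rich fragment a weight above 1, while the sum of weights stays equal to the number of observed fragments. In a real workflow, such per-fragment weights would be stored alongside each alignment (the abstract describes BAM alignment tags) and used in place of raw counts downstream.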
