BioMed Central, Journal of Translational Medicine, 1(21), 2023
DOI: 10.1186/s12967-023-04711-5
Abstract
Background Causative genetic variants cannot yet be found for many disorders with a clear heritable component, including chronic fatigue disorders like myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS). These conditions may involve genes in difficult-to-align genomic regions that are refractory to short read approaches. Structural variants in these regions can be particularly hard to detect or define with short reads, yet may account for a significant number of cases. Long read sequencing can overcome these difficulties but so far little data is available regarding the specific analytical challenges inherent in such regions, which need to be taken into account to ensure that variants are correctly identified. Research into chronic fatigue disorders faces the additional challenge that the heterogeneous patient populations likely encompass multiple aetiologies with overlapping symptoms, rather than a single disease entity, such that each individual abnormality may lack statistical significance within a larger sample. Better delineation of patient subgroups is needed to target research and treatment.
Methods We use nanopore sequencing in a case of unexplained severe fatigue to identify and fully characterise a large inversion in a highly homologous region spanning the AKR1C gene locus, which was indicated but could not be resolved by short-read sequencing. We then use GC–MS/MS serum steroid analysis to investigate the functional consequences.
Results Several commonly used bioinformatics tools are confounded by the homology but a combined approach including visual inspection allows the variant to be accurately resolved. The DNA inversion appears to increase the expression of AKR1C2 while limiting AKR1C1 activity, resulting in a relative increase of inhibitory GABAergic neurosteroids and impaired progesterone metabolism which could suppress neuronal activity and interfere with cellular function in a wide range of tissues.
Conclusions This study provides an example of how long read sequencing can improve diagnostic yield in research and clinical care, and highlights some of the analytical challenges presented by regions containing tandem arrays of genes. It also proposes a novel gene associated with a novel disease aetiology that may be an underlying cause of complex chronic fatigue. It reveals biomarkers that could now be assessed in a larger cohort, potentially identifying a subset of patients who might respond to treatments suggested by the aetiology.