Published in

Cambridge University Press, Bjpsych Open, 2(9), 2023

DOI: 10.1192/bjo.2022.636



Export citation

Search in Google Scholar

DRAGON-Data: a platform and protocol for integrating genomic and phenotypic data across large psychiatric cohorts

This paper is made freely available by the publisher.
This paper is made freely available by the publisher.

Full text: Download

Green circle
Preprint: archiving allowed
Green circle
Postprint: archiving allowed
Green circle
Published version: archiving allowed
Data provided by SHERPA/RoMEO


Background Current psychiatric diagnoses, although heritable, have not been clearly mapped onto distinct underlying pathogenic processes. The same symptoms often occur in multiple disorders, and a substantial proportion of both genetic and environmental risk factors are shared across disorders. However, the relationship between shared symptoms and shared genetic liability is still poorly understood. Aims Well-characterised, cross-disorder samples are needed to investigate this matter, but few currently exist. Our aim is to develop procedures to purposely curate and aggregate genotypic and phenotypic data in psychiatric research. Method As part of the Cardiff MRC Mental Health Data Pathfinder initiative, we have curated and harmonised phenotypic and genetic information from 15 studies to create a new data repository, DRAGON-Data. To date, DRAGON-Data includes over 45 000 individuals: adults and children with neurodevelopmental or psychiatric diagnoses, affected probands within collected families and individuals who carry a known neurodevelopmental risk copy number variant. Results We have processed the available phenotype information to derive core variables that can be reliably analysed across groups. In addition, all data-sets with genotype information have undergone rigorous quality control, imputation, copy number variant calling and polygenic score generation. Conclusions DRAGON-Data combines genetic and non-genetic information, and is available as a resource for research across traditional psychiatric diagnostic categories. Algorithms and pipelines used for data harmonisation are currently publicly available for the scientific community, and an appropriate data-sharing protocol will be developed as part of ongoing projects (DATAMIND) in partnership with Health Data Research UK.