Full text: Download
Abstract Background The use of electronic patient records for assessing outcomes in clinical trials is a methodological strategy intended to drive faster and more cost-efficient acquisition of results. The aim of this manuscript was to outline the data collection and management considerations of a maternity and perinatal clinical trial using data from electronic patient records, exemplifying the DESiGN Trial as a case study. Methods The DESiGN Trial is a cluster randomised control trial assessing the effect of a complex intervention versus standard care for identifying small for gestational age foetuses. Data on maternal/perinatal characteristics and outcomes including infants admitted to neonatal care, parameters from foetal ultrasound and details of hospital activity for health-economic evaluation were collected at two time points from four types of electronic patient records held in 22 different electronic record systems at the 13 research clusters. Data were pseudonymised on site using a bespoke Microsoft Excel macro and securely transferred to the central data store. Data quality checks were undertaken. Rules for data harmonisation of the raw data were developed and a data dictionary produced, along with rules and assumptions for data linkage of the datasets. The dictionary included descriptions of the rationale and assumptions for data harmonisation and quality checks. Results Data were collected on 182,052 babies from 178,350 pregnancies in 165,397 unique women. Data availability and completeness varied across research sites; each of eight variables which were key to calculation of the primary outcome were completely missing in median 3 (range 1–4) clusters at the time of the first data download. This improved by the second data download following clarification of instructions to the research sites (each of the eight key variables were completely missing in median 1 (range 0–1) cluster at the second time point). Common data management challenges were harmonising a single variable from multiple sources and categorising free-text data, solutions were developed for this trial. Conclusions Conduct of clinical trials which use electronic patient records for the assessment of outcomes can be time and cost-effective but still requires appropriate time and resources to maximise data quality. A difficulty for pregnancy and perinatal research in the UK is the wide variety of different systems used to collect patient data across maternity units. In this manuscript, we describe how we managed this and provide a detailed data dictionary covering the harmonisation of variable names and values that will be helpful for other researchers working with these data. Trial registration Primary registry and trial identifying number: ISRCTN 67698474. Registered on 02/11/16.