Biases arising from linked administrative data for epidemiological research: a conceptual framework from registration to analyses

Shaw, Richard J.; Harron, Katie L.; Pescarini, Julia M.; Pinto Junior, Elzo Pereira; Allik, Mirjam; Siroky, Andressa N.; Campbell, Desmond; Dundas, Ruth; Ichihara, Maria Yury; Leyland, Alastair H.; Barreto, Mauricio L.; Katikireddi, Srinivasa Vittal

Published in

Springer, European Journal of Epidemiology, 37(12), p. 1215-1224, 2022

DOI: 10.1007/s10654-022-00934-w

Journal article published in 2022 by Richard J. Shaw, Katie L. Harron, Julia M. Pescarini, Elzo Pereira Pinto Junior, Mirjam Allik, Andressa N. Siroky, Desmond Campbell, Ruth Dundas, Maria Yury Ichihara, Alastair H. Leyland, Mauricio L. Barreto, Srinivasa Vittal Katikireddi

Abstract

Linked administrative data offer a rich source of information that can be harnessed to describe patterns of disease, understand their causes and evaluate interventions. However, administrative data are primarily collected for operational reasons such as recording vital events for legal purposes, and planning, provision and monitoring of services. The processes involved in generating and linking administrative datasets may generate sources of bias that are often not adequately considered by researchers. We provide a framework describing these biases, drawing on our experiences of using the 100 Million Brazilian Cohort (100MCohort) which contains records of more than 131 million people whose families applied for social assistance between 2001 and 2018. Datasets for epidemiological research were derived by linking the 100MCohort to health-related databases such as the Mortality Information System and the Hospital Information System. Using the framework, we demonstrate how selection and misclassification biases may be introduced in three different stages: registering and recording of people’s life events and use of services, linkage across administrative databases, and cleaning and coding of variables from derived datasets. Finally, we suggest eight recommendations which may reduce biases when analysing data from administrative sources.
