Published in

Wiley, Pharmacoepidemiology & Drug Safety, 2023

DOI: 10.1002/pds.5717

IncidencePrevalence: An R package to calculate population‐level incidence rates and prevalence using the OMOP common data model

This paper was not found in any repository, but could be made available legally by the author.

Full text: Unavailable

Preprint: archiving allowed
Postprint: archiving restricted
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Purpose: Real‐world data (RWD) offers a valuable resource for generating population‐level disease epidemiology metrics. We aimed to develop a well‐tested and user‐friendly R package to compute incidence rates and prevalence in data mapped to the Observational Medical Outcomes Partnership (OMOP) common data model (CDM).

Materials and Methods: We created IncidencePrevalence, an R package to support the analysis of population‐level incidence rates and point and period prevalence in OMOP‐formatted data. In addition to unit testing, we assessed the face validity of the package. To do so, we calculated incidence rates of COVID‐19 using RWD from Spain (SIDIAP) and the United Kingdom (CPRD Aurum), and replicated two previously published studies using data from the Netherlands (IPCI) and the United Kingdom (CPRD Gold). We compared the results with those previously published, and measured execution times by running a benchmark analysis across databases.

Results: IncidencePrevalence achieved high agreement with previously published data in CPRD Gold and IPCI, and showed good performance across databases. For COVID‐19, incidence calculated by the package was similar to public data after the first wave of the pandemic.

Conclusion: For data mapped to the OMOP CDM, the IncidencePrevalence R package can support descriptive epidemiological research. It enables reliable estimation of incidence and prevalence from large real‐world data sets. It represents a simple, but extendable, analytical framework to generate estimates in a reproducible and timely manner.
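To make the two headline quantities concrete, the following is a minimal Python sketch of how an incidence rate and a point prevalence can be computed from observation periods and first outcome dates. It is not the package's implementation or API: the record layout, the function names (`incidence_counts`, `incidence_rate`, `point_prevalence`), the toy data, and the conventions used (inclusive day counts, first events only, an outcome treated as persisting once it occurs) are all assumptions chosen for illustration.

```python
from datetime import date

# Hypothetical toy records, not data from the paper:
# (observation_start, observation_end, first_outcome_date or None)
PEOPLE = [
    (date(2020, 1, 1), date(2020, 12, 31), date(2020, 7, 1)),
    (date(2020, 1, 1), date(2020, 12, 31), None),
    (date(2020, 4, 1), date(2020, 12, 31), None),
    (date(2020, 1, 1), date(2020, 6, 30), date(2020, 3, 1)),
]


def incidence_counts(people, start, end):
    """Return (events, person_days) for first outcomes within [start, end]."""
    events, person_days = 0, 0
    for obs_start, obs_end, outcome in people:
        entry, exit_ = max(obs_start, start), min(obs_end, end)
        if exit_ < entry:
            continue  # not observed during the study window
        if outcome is not None and outcome < entry:
            continue  # outcome before entry: not at risk of a first event
        if outcome is not None and outcome <= exit_:
            exit_ = outcome  # time at risk stops at the event
            events += 1
        person_days += (exit_ - entry).days + 1  # inclusive day count (assumption)
    return events, person_days


def incidence_rate(people, start, end, per=100_000):
    """Events per `per` person-years; NaN if nobody contributes time at risk."""
    events, person_days = incidence_counts(people, start, end)
    if person_days == 0:
        return float("nan")
    return events / (person_days / 365.25) * per


def point_prevalence(people, on):
    """Share of people under observation on `on` whose outcome occurred by then."""
    observed = [p for p in people if p[0] <= on <= p[1]]
    if not observed:
        return float("nan")
    cases = sum(1 for _, _, outcome in observed
                if outcome is not None and outcome <= on)
    return cases / len(observed)


if __name__ == "__main__":
    print(incidence_rate(PEOPLE, date(2020, 1, 1), date(2020, 12, 31)))
    print(point_prevalence(PEOPLE, date(2020, 5, 1)))
```

In this sketch the denominator for incidence is person-time at risk, while the denominator for point prevalence is the number of people under observation on the chosen date; period prevalence would follow the same pattern with an interval in place of a single date.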