Published in

Wiley, Biometrics, 4(78), p. 1674-1685, 2021

DOI: 10.1111/biom.13512

Efficient odds ratio estimation under two‐phase sampling using error‐prone data from a multi‐national HIV research cohort

Distributing this paper is prohibited by the publisher

Full text: Unavailable

Preprint: archiving forbidden
Postprint: archiving forbidden
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Persons living with HIV engage in routine clinical care, generating large amounts of data in observational HIV cohorts. These data are often error‐prone, and directly using them in biomedical research could bias estimation and give misleading results. A cost‐effective solution is the two‐phase design, under which the error‐prone variables are observed for all patients during Phase I, and that information is used to select patients for data auditing during Phase II. For example, the Caribbean, Central, and South America network for HIV epidemiology (CCASAnet) selected a random sample from each site for data auditing. Herein, we consider efficient odds ratio estimation with partially audited, error‐prone data. We propose a semiparametric approach that uses all information from both phases and accommodates a number of error mechanisms. We allow both the outcome and covariates to be error‐prone and these errors to be correlated, and selection of the Phase II sample can depend on Phase I data in an arbitrary manner. We devise a computationally efficient, numerically stable EM algorithm to obtain estimators that are consistent, asymptotically normal, and asymptotically efficient. We demonstrate the advantages of the proposed methods over existing ones through extensive simulations. Finally, we provide applications to the CCASAnet cohort.
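
To illustrate the motivation stated in the abstract, the following is a minimal simulation sketch of a two-phase design. It is not the paper's semiparametric EM estimator. It only contrasts two simple baselines: a naive odds ratio computed from error-prone Phase I data, and an odds ratio computed from a randomly audited Phase II subsample alone. Everything in it is an assumption made for illustration: the binary exposure and outcome, the independent 15% misclassification rates, the 20% audit fraction, the true log odds ratio of 1, and the helper names `odds_ratio` and `simulate`.

```python
import math
import random


def odds_ratio(pairs):
    """Odds ratio from binary (x, y) pairs; 0.5 is added to each cell
    (Haldane-Anscombe correction) so empty cells cannot divide by zero."""
    n = [[0.5, 0.5], [0.5, 0.5]]
    for x, y in pairs:
        n[x][y] += 1
    return (n[1][1] * n[0][0]) / (n[1][0] * n[0][1])


def simulate(n=20000, audit_frac=0.2, true_log_or=1.0, flip=0.15, seed=1):
    """Hypothetical two-phase study with a simple random audit.

    Phase I records error-prone versions (x_star, y_star) for everyone.
    Phase II audits a random subset and recovers the true (x, y).
    Errors are independent here, a simplification of the correlated
    errors the abstract allows.
    """
    rng = random.Random(seed)
    phase1, audited = [], []
    for _ in range(n):
        x = int(rng.random() < 0.4)                      # true exposure
        p = 1 / (1 + math.exp(-(-1.0 + true_log_or * x)))
        y = int(rng.random() < p)                        # true outcome
        x_star = x ^ int(rng.random() < flip)            # misclassified exposure
        y_star = y ^ int(rng.random() < flip)            # misclassified outcome
        phase1.append((x_star, y_star))
        if rng.random() < audit_frac:                    # Phase II selection
            audited.append((x, y))
    return odds_ratio(phase1), odds_ratio(audited)


naive_or, audit_or = simulate()
print(f"true OR        : {math.exp(1.0):.2f}")
print(f"naive (Phase I): {naive_or:.2f}")
print(f"audit-only     : {audit_or:.2f}")
```

Under these assumed error rates, the naive estimate is attenuated toward 1, while the audit-only estimate is centered near the true value but discards the unaudited records. The approach described in the abstract aims to use the information from both phases rather than choosing between these two baselines.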