Learning from local to global: An efficient distributed algorithm for modeling time-to-event data

Duan, Rui; Luo, Chongliang; Schuemie, Martijn J.; Tong, Jiayi; Liang, C. Jason; Liang, J. C.; Chang, Howard H.; Boland, Mary Regina; Bian, Jiang; Xu, Hua; Holmes, John H.; Forrest, Christopher B.; Morton, Sally C.; Berlin, Jesse A.; Moore, Jason H.; Mahoney, Kevin B.; Chen, Yong

Published in

Oxford University Press, JAMIA: A Scholarly Journal of Informatics in Health and Biomedicine, 7(27), p. 1028-1036, 2020

DOI: 10.1093/jamia/ocaa044


Journal article published in 2020 by Rui Duan, Chongliang Luo, Martijn J. Schuemie, Jiayi Tong, C. Jason Liang, J. C. Liang, Howard H. Chang, Mary Regina Boland, Jiang Bian, Hua Xu, John H. Holmes, Christopher B. Forrest, Sally C. Morton, Jesse A. Berlin, Jason H. Moore and other authors.

This paper is made freely available by the publisher.

Preprint: archiving allowed

Postprint: archiving restricted

Published version: archiving forbidden

Abstract

Objective: We developed and evaluated a privacy-preserving One-shot Distributed Algorithm to fit a multicenter Cox proportional hazards model (ODAC) without sharing patient-level information across sites.

Materials and Methods: Using patient-level data from a single site combined with only aggregated information from other sites, we constructed a surrogate likelihood function that approximates the Cox partial likelihood function obtained using patient-level data from all sites. By maximizing the surrogate likelihood function, each site obtained a local estimate of the model parameter, and the ODAC estimator was constructed as a weighted average of all the local estimates. We evaluated the performance of ODAC with (1) a simulation study and (2) a real-world use case study using 4 datasets from the Observational Health Data Sciences and Informatics network.

Results: In the simulation study, ODAC provided estimates nearly identical to the pooled estimator, that is, the estimator obtained by analyzing the combined patient-level data from all sites in a single dataset. The relative bias was <0.1% across all scenarios, and the accuracy of ODAC remained high across different sample sizes and event rates. In contrast, the meta-analysis estimator, obtained as the inverse variance weighted average of the site-specific estimates, had substantial bias when the event rate was <5%, with the relative bias reaching 20% when the event rate was 1%. In the Observational Health Data Sciences and Informatics network application, the ODAC estimates had a relative bias <5% for 15 out of 16 log hazard ratios, whereas the meta-analysis estimates had substantially higher bias.

Conclusions: ODAC is a privacy-preserving and noniterative method for implementing time-to-event analyses across multiple sites. It provides estimates on par with the pooled estimator and substantially outperforms the meta-analysis estimator when the event is uncommon, making it well suited to studying rare events and diseases in a distributed manner.
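The abstract compares ODAC against a meta-analysis estimator built as an inverse variance weighted average of site-specific estimates, and reports accuracy as relative bias. The sketch below illustrates only those two quantities for a single log hazard ratio. It is not the ODAC algorithm itself, and the function names (`inverse_variance_pool`, `relative_bias`) and the choice of reference value are assumptions made for illustration, not taken from the paper.

```python
from math import sqrt
from typing import Sequence, Tuple


def inverse_variance_pool(
    estimates: Sequence[float], std_errors: Sequence[float]
) -> Tuple[float, float]:
    """Fixed-effect inverse variance weighted average of site-level estimates.

    Hypothetical helper: each site contributes one estimate (e.g. a log
    hazard ratio) and its standard error; no patient-level data is needed.
    Returns the pooled estimate and its standard error.
    """
    if len(estimates) == 0 or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate, and at least one site")
    if any(se <= 0 for se in std_errors):
        raise ValueError("standard errors must be positive")

    # Weight each site by the inverse of its estimated variance.
    weights = [1.0 / (se ** 2) for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return pooled, pooled_se


def relative_bias(estimate: float, reference: float) -> float:
    """Absolute difference from a reference value, as a fraction of that value.

    Assumption: the reference is whichever value the comparison is made
    against (for example a pooled-data estimate); it must be nonzero.
    """
    if reference == 0:
        raise ValueError("reference must be nonzero")
    return abs(estimate - reference) / abs(reference)
```

For example, `inverse_variance_pool([1.0, 2.0], [1.0, 2.0])` weights the first site four times as heavily as the second, so the pooled value sits closer to 1.0 than to 2.0. The inputs here are illustrative numbers only, not results from the study.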
