Learning competing risks across multiple hospitals: One-shot distributed algorithms
Zhang, D; Tong, J; Jing, N; et al., Journal of the American Medical Informatics Association
Published
April 2024
Journal
Journal of the American Medical Informatics Association
Abstract
Objectives: To characterize the complex interplay between multiple clinical conditions in a time-to-event analysis framework using data from multiple hospitals, we developed two novel one-shot distributed algorithms for competing risk models (ODACoR). By applying our algorithms to EHR data from eight national children's hospitals, we quantified the impacts of a wide range of risk factors on the risk of post-acute sequelae of SARS-CoV-2 (PASC) among children and adolescents.
Materials and methods: Our ODACoR algorithms are designed to be simple and communication-efficient, which makes them straightforward to execute. We evaluated our algorithms via extensive simulation studies as well as an application quantifying the impacts of risk factors for PASC among children and adolescents, using data from eight children's hospitals, including the Children's Hospital of Philadelphia, Cincinnati Children's Hospital Medical Center, and Children's Hospital of Colorado, covering over 6.5 million pediatric patients. The accuracy of the estimation was assessed by comparing the results from our ODACoR algorithms with the estimators derived from the meta-analysis and the pooled data.
Results: The meta-analysis estimator showed a high relative bias (∼40%) when the clinical condition was relatively rare (∼0.5%), whereas the ODACoR algorithms exhibited a substantially lower relative bias (∼0.2%). The estimated effects from our ODACoR algorithms were on par with the estimates from the pooled data, suggesting the high reliability of our federated learning algorithms. In contrast, the meta-analysis estimates failed to identify risk factors such as age, gender, history of chronic conditions, and obesity that were identified using the pooled data.
Discussion: Our proposed ODACoR algorithms are communication-efficient, highly accurate, and suitable for characterizing the complex interplay between multiple clinical conditions.
Conclusion: Our study demonstrates that our ODACoR algorithms are communication-efficient and can be widely applied to analyze multiple clinical conditions in a time-to-event analysis framework.
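Illustrative note: the abstract compares estimators by their relative bias against a reference and uses a meta-analysis of site-level estimates as the baseline. The sketch below shows one common way these two quantities are computed. It is not the ODACoR algorithm. The abstract does not state which meta-analysis estimator was used, so the fixed-effect inverse-variance weighting here is an assumption, and the function names and all numbers are hypothetical.

```python
from math import sqrt


def relative_bias(estimate, reference):
    """Signed relative bias of an estimate against a reference value (as a fraction)."""
    return (estimate - reference) / abs(reference)


def inverse_variance_meta(estimates, std_errors):
    """Fixed-effect meta-analysis: inverse-variance weighted mean and its standard error.

    Assumption: this is a generic baseline, not necessarily the estimator used in the study.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    return pooled, sqrt(1.0 / total)


# Hypothetical site-level log hazard ratios and standard errors (made-up values).
site_estimates = [0.30, 0.50, 0.40]
site_std_errors = [0.10, 0.20, 0.10]

pooled, pooled_se = inverse_variance_meta(site_estimates, site_std_errors)

# Hypothetical reference value, standing in for a pooled-data estimate.
reference = 0.40
print(f"meta-analysis estimate: {pooled:.4f} (SE {pooled_se:.4f})")
print(f"relative bias vs reference: {relative_bias(pooled, reference):.2%}")
```

Only per-site estimates and standard errors are combined here, which is why this kind of baseline can behave poorly when a site observes very few events for a rare condition.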
Authors
Dazheng Zhang, Jiayi Tong, Naimin Jing, Yuchen Yang, Chongliang Luo, Yiwen Lu, Dimitri A Christakis, Diana Güthe, Mady Hornig, Kelly J Kelleher, Keith E Morse, Colin M Rogerson, Jasmin Divers, Raymond J Carroll, Christopher B Forrest, Yong Chen
Keywords
communication-efficient; competing risk model; distributed research network; federated learning; one-shot distributed algorithm
Short Summary
Doctors keep patient information in computer files called electronic health records (EHRs). RECOVER researchers can use these records to learn more about Long COVID, which is when someone feels sick for a long time after having COVID-19. But studying these records is not easy. It can be hard to get certain information from EHRs when the things that might explain why some people get Long COVID happen at different times. It can also be hard, and cost a lot of money, to put information from many hospitals together in one place. In this study, RECOVER researchers made a new tool called ODACoR, which stands for “one-shot distributed algorithms for competing risks model.” This tool was used to look at the EHRs of 6.5 million kids and teens from 8 children’s hospitals. Researchers found that ODACoR was able to find information about things that could make children and teens more likely to get Long COVID. ODACoR could also combine information from different hospitals, which did not always work with older ways of studying health information. This tool gave results that matched what researchers would get if all the hospitals had shared all their information in one place, which is hard to do. This study is important because it can help doctors study other kinds of health problems using information from many hospitals.