2026-01-20 University at Buffalo (UB)
<Related information>
- https://www.buffalo.edu/news/releases/2026/01/Okere-hospitalization-study.html
- https://informatics.bmj.com/content/32/1/e101742
Development of machine learning models to predict risk of hospitalisation and 90-day readmission among patients with cardiovascular risk factors using community health survey data
Arinze Nkemdirim Okere, Tianfeng Li, Md Mohaimenul Islam, …
BMJ Health & Care Informatics, Published: 31 December 2025

Abstract
Objectives This study aimed to develop and validate machine learning (ML) models to predict all-cause hospital admissions and 90-day readmissions using structured, patient-reported survey data.
Methods A cross-sectional survey was conducted between 3 July 2021 and 18 December 2022, among US adults aged ≥18 years with at least one cardiovascular risk factor. Participants were recruited through social media, community pharmacies and outpatient clinics. The final sample included 1318 participants. Primary outcomes were any all-cause hospitalisation and readmission within 90 days. Eight supervised ML models were trained using an 80:20 train–test split and 10-fold cross-validation. Model performance was evaluated using area under the receiver operating characteristic curve (AUROC), precision, recall, F1 score and calibration metrics. SHapley Additive exPlanations (SHAP) values identified key predictors.
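The evaluation scheme described in the Methods can be sketched in plain Python. This is a minimal illustration, not the authors' code: the helper names (`train_test_split_indices`, `k_fold_indices`, `auroc`, `precision_recall_f1`) are hypothetical, the split is a simple unstratified shuffle, and running the 10-fold cross-validation on the training portion only is an assumption, since the abstract does not specify either detail.

```python
import random

def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and split them into (train, test) lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]

def k_fold_indices(indices, k=10):
    """Yield (fit, validation) index lists, one pair per fold."""
    folds = [indices[i::k] for i in range(k)]  # k near-equal, disjoint folds
    for i in range(k):
        fit = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield fit, folds[i]

def auroc(y_true, scores):
    """AUROC as the chance a random positive outscores a random negative.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for binary labels (1 = event)."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# 1318 participants, as in the study; the seed is arbitrary.
train_idx, test_idx = train_test_split_indices(1318, test_fraction=0.2)
folds = list(k_fold_indices(train_idx, k=10))
```

With 1318 participants, an 80:20 split holds out 264 for testing and leaves 1054 for training, so each of the 10 validation folds contains 105 or 106 participants.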
Results Among 1318 participants, 35.0% reported at least one hospitalisation and 10.4% reported a 90-day readmission. The Extra Trees (ET) model demonstrated the best performance across both outcomes. For hospitalisation, ET achieved an AUROC of 0.93, precision of 0.83 and recall of 0.87. For readmission, AUROC was 0.99 with precision of 0.95 and recall of 0.96. SHAP analysis identified heart disease, medication burden, race/ethnicity, employment and insurance status as the most influential predictors.
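The abstract lists the F1 score among the metrics but reports only precision and recall. As a rough check, F1 can be derived as their harmonic mean. The values below are computed from the rounded figures above, so they are approximations and not numbers reported by the authors; `f1_from` is a hypothetical helper name.

```python
def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall for the Extra Trees model, as quoted in the abstract.
reported = {
    "hospitalisation": (0.83, 0.87),
    "readmission": (0.95, 0.96),
}

# Derived values, not figures taken from the paper.
derived_f1 = {outcome: f1_from(p, r) for outcome, (p, r) in reported.items()}

for outcome, f1 in derived_f1.items():
    print(f"{outcome}: F1 is approximately {f1:.2f}")
```

Under these rounded inputs the derived F1 is about 0.85 for hospitalisation and about 0.95 for readmission.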
Discussion Patient-reported data reflecting behavioural, social and clinical factors can predict hospitalisations with high accuracy, complementing traditional EHR-based models.
Conclusions Integrating such patient-reported and behavioural data into electronic health records could enable earlier identification of high-risk individuals and support targeted, preventive interventions to improve healthcare outcomes.


