2026-01-14 University of Texas at Austin (UT Austin)

<Related information>
- https://news.utexas.edu/2026/01/14/ut-undergrad-uncovers-country-specific-factors-linked-to-improved-cancer-outcomes/
- https://www.sciencedirect.com/science/article/pii/S0923753425062751
Machine learning reveals country-specific drivers of global cancer outcomes
M.S. Patel, C.S. Pramesh, N.N. Sanford, E.J.G. Feliciano, P.L. Nguyen, P. Iyengar, T.P. Kingham, J. Willmann, B.A. Mahal, N.Y. Lee, M.J.K. Magsanoc-Alikpala, M. Mutebi, J.F. Wu, J.P.G. Robredo, E.C. Dee
Annals of Oncology Available online: 14 January 2026
DOI:https://doi.org/10.1016/j.annonc.2025.11.014
Highlights
- Machine learning used to identify country-level cancer outcome drivers.
- SHAP analysis reveals key national policy levers for reducing cancer mortality.
- GDP per capita, radiotherapy, and UHC are major global contributors to outcomes.
- Web tool provides country-specific cancer system insights for policymakers.
- Model predicts mortality-to-incidence ratio with high accuracy (R² = 0.852).
Background
Global inequities in access to cancer diagnostics and treatment contribute to wide variation in cancer mortality-to-incidence ratios (MIRs), a proxy for survival. We aimed to develop an interpretable machine learning framework to quantify country-specific health system contributors to MIR and inform policy prioritization.
Materials and methods
We assembled national MIRs from GLOBOCAN 2022 for 185 countries and health system indicators from multilateral sources, including gross domestic product (GDP) per capita, universal health coverage (UHC) index, radiotherapy centers per population, health spending (%GDP), out-of-pocket expenditure, work force densities (physicians; nurses/midwives; surgical work force), pathology availability, Human Development Index, and gender inequality index. A CatBoost gradient-boosting model was trained with repeated leave-one-country-out cross-validation (10 repeats; 1850 predictions). Nested hyperparameter optimization and strict leakage control were used. Model interpretability employed SHapley Additive exPlanations (SHAP; TreeExplainer) to generate global and country-level feature attributions. SHAP values, model-derived metrics quantifying each factor’s contribution to cancer outcomes, were generated. Performance metrics included R², root mean squared error (RMSE), mean absolute error, and Pearson correlation; uncertainty was estimated by bootstrap resampling.
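The leave-one-country-out evaluation and the four performance metrics can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: a one-feature least-squares line stands in for the CatBoost model, the data are synthetic toy values, and the names `fit_line`, `leave_one_out_predictions`, and `metrics` are hypothetical. The repeats, nested hyperparameter search, and bootstrap intervals described above are omitted.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (a simple stand-in, NOT CatBoost)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def leave_one_out_predictions(xs, ys):
    """Hold out each country once, fit on the rest, predict the held-out one."""
    preds = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(a + b * xs[i])
    return preds


def metrics(y_true, y_pred):
    """R^2, RMSE, MAE, and Pearson r between observed and predicted values."""
    n = len(y_true)
    mt = sum(y_true) / n
    mp = sum(y_pred) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mt) ** 2 for t in y_true)
    ss_pred = sum((p - mp) ** 2 for p in y_pred)
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        "pearson_r": cov / math.sqrt(ss_tot * ss_pred),
    }


# Synthetic toy data (assumption): 30 "countries", MIR falling with log GDP.
rng = random.Random(0)
log_gdp = [rng.uniform(6.0, 11.0) for _ in range(30)]
mir = [1.2 - 0.08 * x + rng.gauss(0.0, 0.03) for x in log_gdp]

preds = leave_one_out_predictions(log_gdp, mir)
scores = metrics(mir, preds)
print(scores)
```

Each country receives exactly one out-of-sample prediction per pass, which is why 185 countries evaluated over 10 repeats yield the 1850 predictions reported above. With a stochastic learner, each repeat would typically use a different random seed; the deterministic stand-in here would give identical results on every repeat.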
Results
The model showed strong out-of-sample performance [R² = 0.852, 95% confidence interval (CI) 0.801-0.891; RMSE 0.057, 95% CI 0.050-0.064]; correlation between predicted and observed MIRs was r = 0.923 (P = 8.30 × 10⁻⁷⁸). Global SHAP contributions ranked GDP per capita (22.5%), radiotherapy centers per population (15.4%), and UHC index (12.9%) as the leading determinants. Country-specific SHAP profiles revealed substantial heterogeneity in dominant drivers across settings, enabling tailored policy levers (e.g. infrastructure, coverage expansion, or financial protection). An accompanying web interface provides country-level SHAP summaries for decision support.
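The abstract does not state how the global percentage contributions were derived. One common convention, assumed here, is to take each feature's mean absolute SHAP value across countries and normalize so the shares sum to 100%. The sketch below shows that calculation and a per-country "dominant driver" lookup on made-up SHAP values; the function names, feature labels, and numbers are all hypothetical and are not taken from the paper.

```python
def global_shap_shares(shap_rows, feature_names):
    """Percent share of each feature's mean |SHAP| (assumed convention)."""
    n = len(shap_rows)
    mean_abs = [
        sum(abs(row[j]) for row in shap_rows) / n
        for j in range(len(feature_names))
    ]
    total = sum(mean_abs)
    return {name: 100.0 * v / total for name, v in zip(feature_names, mean_abs)}


def dominant_driver(shap_row, feature_names):
    """Feature with the largest absolute attribution for one country."""
    return max(zip(feature_names, shap_row), key=lambda pair: abs(pair[1]))[0]


# Toy SHAP matrix (assumption): rows are countries, columns are features.
features = ["gdp_per_capita", "radiotherapy_centers", "uhc_index"]
toy_shap = [
    [-0.04, 0.02, -0.02],  # country A: GDP dominates
    [0.01, -0.05, 0.02],   # country B: radiotherapy capacity dominates
]

shares = global_shap_shares(toy_shap, features)
drivers = [dominant_driver(row, features) for row in toy_shap]
print(shares, drivers)
```

The two toy countries have different dominant drivers even though they share one global ranking, which is the kind of heterogeneity the country-specific SHAP profiles are meant to surface.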
Conclusions
An explainable machine learning approach accurately predicts national MIRs and decomposes predictions into country-specific health system attributions. While ecological and noncausal by design, the SHAP profiles translate population-level associations into actionable hypotheses for prioritizing investments, highlighting, across many contexts, radiotherapy capacity and UHC expansion as recurrent levers, and underscoring that higher total health spending alone may be insufficient without strategic allocation. Prospective, country-specific evaluations are warranted to test whether targeting model-identified drivers improves cancer outcomes.

