2025-03-27 University of Cambridge
Microscopic images showing healthy villi (left) and diseased villi (right). (Credit: Florian Jaeckle)
<Related information>
- https://www.cam.ac.uk/stories/AI-and-coeliac-disease#article
- https://ai.nejm.org/doi/full/10.1056/AIoa2400738
Machine Learning Achieves Pathologist-Level Celiac Disease Diagnosis
Florian Jaeckle, Ph.D., James Denholm, Ph.D., Benjamin Schreiber, Ph.D., Shelley C. Evans, B.Med.Sci. (Hons), Mike N. Wicks, M.Sc., James Y. H. Chan, M.B.B.S., Adrian C. Bateman, M.D., Sonali Natu, M.D., Mark J. Arends, Ph.D., and Elizabeth Soilleux, Ph.D.
New England Journal of Medicine AI Published: March 27, 2025
Abstract
Background
The diagnosis of celiac disease (CD), an autoimmune disorder with an estimated global prevalence of around 1%, generally relies on the histologic examination of duodenal biopsies. However, interpathologist agreement for CD diagnosis is estimated at no more than 80%. We aim to improve CD diagnosis by developing an accurate, machine-learning-based diagnostic classifier.
Methods
We present a machine learning model that diagnoses the presence or absence of CD from a set of duodenal biopsies representative of real-world clinical data. Our model was trained on a diverse dataset of 3383 whole-slide images of hematoxylin- and eosin-stained duodenal biopsies from four hospitals featuring five different WSI scanners along with their clinical diagnoses. We trained our model using the multiple-instance-learning paradigm in a weakly supervised manner with cross-validation. We evaluated it on an independent test set featuring 644 unseen scans from a different regional NHS trust. In addition, we compared the model’s predictions with independent diagnoses from four specialist pathologists on a subset of the test data.
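As a rough illustration of the multiple-instance-learning idea mentioned above, the following minimal sketch treats each slide as a "bag" of patches and uses only a slide-level label. Everything here is an assumption made for illustration: the max-pooling aggregation, the linear patch scorer, the names `patch_logit` and `slide_probability`, and the toy feature values. The abstract does not describe the authors' architecture, so this should not be read as their method.

```python
import math


def sigmoid(x):
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def patch_logit(features, weights, bias):
    """Hypothetical linear scorer for one patch (instance) of a slide."""
    return sum(f * w for f, w in zip(features, weights)) + bias


def slide_probability(patches, weights, bias):
    """Max-pooling MIL: the slide (bag) is scored by its most suspicious patch.

    Only the slide-level label is needed for training, which is what makes
    the supervision "weak": no patch is individually annotated.
    """
    return max(sigmoid(patch_logit(p, weights, bias)) for p in patches)


# Toy, made-up parameters and patch features (not taken from the study).
weights = [2.0, -1.0]
bias = -1.0
healthy_slide = [[0.1, 0.9], [0.2, 0.8]]
diseased_slide = [[0.1, 0.9], [1.5, 0.1]]  # one abnormal patch is enough

p_healthy = slide_probability(healthy_slide, weights, bias)
p_diseased = slide_probability(diseased_slide, weights, bias)
print(f"healthy slide:  {p_healthy:.3f}")
print(f"diseased slide: {p_diseased:.3f}")
```

In this sketch a single high-scoring patch drives the slide-level prediction. Attention-based pooling is a common alternative that weights all patches instead of taking the maximum.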
Results
Our model diagnosed CD in an independent test set from a previously unseen source with accuracy, sensitivity, and specificity exceeding 95% and an area under the receiver operating characteristic curve exceeding 99%. These results indicate that the model has the potential to outperform pathologists. In comparing the model’s predictions with diagnoses on unseen test data from four independent pathologists, we found statistically indistinguishable results between pathologist–pathologist and pathologist–model interobserver agreement (P>96%).
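For readers unfamiliar with the metrics reported above, the sketch below shows how accuracy, sensitivity, specificity, and the area under the ROC curve are conventionally computed from binary labels and predicted scores. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions for illustration only and are not data from the study.

```python
def diagnostic_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a fixed decision threshold."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
    }


def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy labels (1 = CD present) and model scores, invented for this example.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.1, 0.2, 0.6, 0.3, 0.7]

metrics = diagnostic_metrics(y_true, y_score)
auc = roc_auc(y_true, y_score)
print(metrics)
print(f"AUC: {auc:.4f}")
```

Unlike the other three metrics, the AUC does not depend on the chosen threshold, which is why it is usually reported alongside them.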
Conclusions
Our model achieved pathologist-level performance in diagnosing the presence or absence of CD from a representative set of duodenal biopsies, including biopsies from a previously unseen hospital. We concluded that our model has the potential to accurately identify or rule out CD, thereby significantly reducing the time required for pathologists to make a diagnosis. (Funded by the National Institute for Health and Care Research [NIHR205502] and others.)