AI explanation features improve radiological diagnostic accuracy (Medical diagnoses: how AI explanations help doctors)

2026-05-25

2026-05-22 LMU Munich

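A research team at LMU Munich in Germany investigated how the "explanations" presented by diagnostic-support AI affect physicians' diagnostic accuracy and reliability. The study tested whether physicians can evaluate AI suggestions more appropriately when the AI presents not only its diagnosis but also the supporting rationale and reasoning process. The results showed that AI with explanations is effective in supporting physicians' understanding and judgment, while inappropriate or overly persuasive explanations may increase the risk of misdiagnosis. In particular, physicians tended to place too much trust in the AI's judgment even when its explanations were not sufficiently accurate. The research team points out that explainability (Explainable AI) in medical AI is not merely a matter of transparency; it must be designed in a form that physicians can use critically. The findings provide important insights for designing systems in which AI and physicians diagnose collaboratively, and for developing safe and reliable medical AI.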

Radiological images such as CT and MRI scans were at the heart of the study. This MRI image of a skull shows diffuse contrast-enhancing lesions in the brain. It is the job of radiologists to correctly classify these as, for example, inflammation, a tumor, or multiple sclerosis. With the right clinical questions, AI can provide support in reaching a diagnosis. © NEJM

＜Related information＞

The effect of medical explanations from large language models on diagnostic accuracy in radiology

Philipp Spitzer, Daniel Hendriks, Jan Rudolph, Sarah Schlaeger, Jens Ricke, Niklas Kühl, Boj Friedrich Hoppe & Stefan Feuerriegel
npj Digital Medicine, Published: 23 April 2026
DOI: https://doi.org/10.1038/s41746-026-02619-0

Abstract

Large language models (LLMs) are increasingly used by physicians for diagnostic support. A key advantage of LLMs is the ability to generate explanations that can help physicians understand the reasoning behind a diagnosis. However, the best-suited format for LLM-generated explanations remains unclear. In this large-scale study, we examined the effect of different formats for LLM explanations on clinical decision-making. For this, we conducted a randomized experiment with radiologists reviewing patient cases with radiological images (N = 2020 assessments). Participants received either no LLM support (control group) or were supported by one of three LLM-generated explanations: (1) a standard output providing the diagnosis without explanation; (2) a differential diagnosis comparing multiple possible diagnoses; or (3) a chain-of-thought explanation offering a detailed reasoning process for the diagnosis. We find that the format of explanations significantly influences diagnostic accuracy. The chain-of-thought explanations yielded the best performance, improving the diagnostic accuracy by 12.2% compared to the control condition without LLM support (P = 0.001). The chain-of-thought explanations are also superior to the standard output without explanation (+7.2%; P = 0.040) and the differential diagnosis format (+9.7%; P = 0.004). We further assessed the robustness of these findings across case difficulty and different physician backgrounds, such as general vs. specialized radiologists. Evidently, in the controlled setting of our vignette study, explaining the reasoning for a diagnosis helps physicians to identify and correct potential errors in LLM predictions and thus improve overall decisions. Altogether, the results highlight the importance of explanations in medical LLMs to support the reasoning processes of physicians, so that medical LLMs can improve diagnostic performance and, ultimately, patient outcomes.
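
To make the three explanation formats in the abstract concrete, the sketch below shows one way they might be requested from an LLM. It is only an illustration: the instruction wording, the `ExplanationFormat` enum, and the `build_prompt` helper are hypothetical and are not taken from the study, whose actual prompts are not given here.

```python
from enum import Enum


class ExplanationFormat(Enum):
    """The three LLM output formats named in the abstract."""
    STANDARD = "standard"                  # diagnosis only, no explanation
    DIFFERENTIAL = "differential"          # compares several possible diagnoses
    CHAIN_OF_THOUGHT = "chain_of_thought"  # detailed reasoning process


# Hypothetical instruction wording (an assumption for illustration only).
_INSTRUCTIONS = {
    ExplanationFormat.STANDARD:
        "State the single most likely diagnosis. Do not explain.",
    ExplanationFormat.DIFFERENTIAL:
        "List the plausible diagnoses, compare them against the findings, "
        "then name the most likely one.",
    ExplanationFormat.CHAIN_OF_THOUGHT:
        "Reason step by step from the findings to the most likely "
        "diagnosis, then state it.",
}


def build_prompt(case_description: str, fmt: ExplanationFormat) -> str:
    """Combine a case description with the instruction for one format."""
    return f"Case:\n{case_description.strip()}\n\nTask: {_INSTRUCTIONS[fmt]}"


if __name__ == "__main__":
    # Example case text adapted from the image caption above.
    case = "MRI of the skull shows diffuse contrast-enhancing lesions in the brain."
    for fmt in ExplanationFormat:
        print(f"--- {fmt.value} ---")
        print(build_prompt(case, fmt))
```

Keeping the case text fixed and varying only the instruction mirrors the idea of comparing formats on the same patient cases; the control condition (no LLM support) needs no prompt at all.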
