2026-06-24 Keio University School of Medicine, Institute of Science Tokyo, Shizuoka University
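◆ Drawing on one of the world's largest conversation corpora for psychiatric disorders, collected over roughly ten years and comprising about 1,300 recordings from more than 500 people, the study analyzed semi-structured interview data from 104 patients with schizophrenia and 101 healthy controls. The researchers extracted 76 linguistic features spanning morphology, syntax, semantics, and discourse, and selected representative indicators by factor analysis. Three features proved strongly associated with schizophrenia: ① reduced use of case particles, ② increased semantic similarity among words, and ③ reduced frequency of adverb use. A model combining the three achieved high discriminative performance on independent data (AUC = 0.87).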
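◆ The study proposes a new framework that explains language disturbance in schizophrenia through three components: reduced grammatical explicitness, a narrowing of semantic space, and a decline in contextual modulation. It is expected to support objective assessment and symptom monitoring in the future.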
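As a rough illustration of what two of these markers measure, the sketch below computes a mean pairwise word similarity (average cosine similarity over all distinct pairs of word vectors) and a part-of-speech rate (share of tokens carrying a given tag). It is a minimal sketch, not the authors' pipeline: the function names, the toy vectors, and the tag label `CASE` are assumptions made for illustration, and the vectors are assumed to be non-zero.

```python
import numpy as np


def mean_pairwise_similarity(vectors):
    """Average cosine similarity over all distinct pairs of word vectors."""
    v = np.asarray(vectors, dtype=float)
    if v.ndim != 2 or v.shape[0] < 2:
        raise ValueError("need at least two word vectors")
    # Scale each row to unit length so that dot products equal cosines.
    unit = v / np.linalg.norm(v, axis=1, keepdims=True)
    sims = unit @ unit.T
    # Strict upper triangle: every unordered pair once, no self-pairs.
    rows, cols = np.triu_indices(unit.shape[0], k=1)
    return float(sims[rows, cols].mean())


def pos_rate(tags, target):
    """Fraction of tokens whose tag equals `target` (hypothetical tag labels)."""
    if not tags:
        raise ValueError("empty tag sequence")
    return sum(tag == target for tag in tags) / len(tags)


# Toy data: two identical vectors and one orthogonal vector.
toy_vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
toy_tags = ["NOUN", "CASE", "VERB", "CASE"]

similarity = mean_pairwise_similarity(toy_vectors)
case_rate = pos_rate(toy_tags, "CASE")
```

Under this reading, a higher similarity value means the words in an utterance sit closer together in the embedding space, which corresponds to the "narrowing of semantic space" described above.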
<Related information>
- https://www.keio.ac.jp/ja/press-release/20260624-press-01/
- https://www.keio.ac.jp/fixed-files/20260624-press-01-v4wl3upx.pdf
- https://www.cambridge.org/core/journals/psychological-medicine/article/an-integrative-nlp-framework-identifies-multilevel-linguistic-phenotypes-of-schizophrenia-across-tasks/C1B85B1D109CC9A88FF5F875DF15D279
An integrative NLP framework identifies multilevel linguistic phenotypes of schizophrenia across tasks
Hironobu Nakamura, Yoshinobu Kano, Genichi Sugihara, Ryo Takemura, Yusei Yamaguchi, Masaaki Shimizu, Shunsuke Takagi, Mari Iizuka, Saaya Tashiro and Momoko Kitazawa,…
Psychological Medicine Published: 23 June 2026
DOI:https://doi.org/10.1017/S0033291726104668

Abstract
Background
Linguistic abnormalities in schizophrenia (SCZ) span morphological, syntactic, semantic, and discourse levels. Converging cross-linguistic evidence suggests that SCZ may involve semantic narrowing alongside reduced syntactic differentiation, yet how these changes co-occur across linguistic domains and whether they represent core, task-general disturbances remains unclear. We applied a multilevel NLP framework to a large Japanese dataset to identify structurally related linguistic markers of SCZ across elicitation contexts.
Methods
Speech from 104 patients with SCZ and 101 healthy controls was collected through semi-structured interviews. Transcripts from free conversation, storytelling, and picture description were analyzed using GiNZA, Word2Vec, TF-IDF, and SentenceBERT to extract 76 morphosyntactic, semantic, and discourse features. Factor analysis identified representative features independent of diagnosis, which were tested using generalized estimating equations and validated with bootstrap and permutation procedures. Cross-task stability was examined to determine core linguistic markers.
Results
In free conversation, reduced Case-particle (Kakujoshi) and Adverb use and increased Mean Pairwise Word Similarity were strongly associated with SCZ (AUC = 0.87, 95% CI: 0.74–0.97). Adverbial, case-particle, and semantic-network measures functioned as cross-task markers.
Conclusions
SCZ involves multidimensional language disturbances characterized by a tripartite linguistic phenotype of diminished morphosyntactic explicitness, semantic narrowing, and reduced modification-based contextual modulation in spontaneous discourse. Extending cross-linguistic evidence, our results indicate that lexical-semantic contraction co-occurs with reduced overt marking of argument relations in Japanese, alongside weakened adverbial elaboration and framing – suggesting convergent, largely task-general dimensions of SCZ language pathology, most evident in free conversation.
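
For readers unfamiliar with how a figure such as "AUC = 0.87, 95% CI: 0.74–0.97" can be obtained, the sketch below computes an AUC from classifier scores and a percentile bootstrap confidence interval. It is a generic illustration under stated assumptions, not the procedure used in the paper: the abstract does not specify the bootstrap variant, and the function names, toy scores, and resampling settings here are hypothetical.

```python
import numpy as np


def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("both classes must be present")
    # Compare every positive with every negative; ties count as half a win.
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())


def bootstrap_auc_ci(scores, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (cases resampled with replacement)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    auc(scores, labels)  # fails early if one class is missing entirely
    rng = np.random.default_rng(seed)
    n = labels.size
    stats = []
    while len(stats) < n_boot:
        idx = rng.integers(0, n, size=n)
        if labels[idx].min() == labels[idx].max():
            continue  # skip resamples that contain only one class
        stats.append(auc(scores[idx], labels[idx]))
    lower, upper = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)


# Toy scores: 1 = patient, 0 = control (made-up values, not study data).
toy_scores = [0.9, 0.4, 0.6, 0.2, 0.7, 0.3]
toy_labels = [1, 1, 1, 0, 0, 0]

point_estimate = auc(toy_scores, toy_labels)
ci_low, ci_high = bootstrap_auc_ci(toy_scores, toy_labels, n_boot=500)
```

With a small held-out set the interval is wide, which is consistent with the fairly broad range reported alongside the point estimate above.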

