2026-04-20 Chinese Academy of Sciences (CAS)
<Related information>
- https://english.cas.cn/newsroom/research-news/202604/t20260421_1157776.shtml
- https://academic.oup.com/aob/advance-article-abstract/doi/10.1093/aob/mcag050/8507051
A framework for identifying the polyploid complex in Rorippa (Brassicaceae): combining trait evolution, herbarium records and machine learning
Ting-Shen Han, Jun-Xian Lv, Yao-Wu Xing
Annals of Botany, Published: 05 March 2026
DOI: https://doi.org/10.1093/aob/mcag050
Abstract
Background and Aims
Species identification in polyploid plants remains challenging owing to morphological continuity and genomic redundancy. Such taxonomic uncertainties obscure evolutionary or ecological inference. A critical solution involves the reassessment of polyploid collections using stable diagnostic traits and integrative approaches. Here, we examined the Rorippa dubia–indica complex (Brassicaceae), a morphologically overlapping tetraploid–hexaploid lineage with a native distribution in East Asia.
Methods
We developed a framework that integrates experimental phenotyping, herbarium reassessment and computational modelling for secondary species assessment of polyploid plants. The framework incorporates spatiotemporal data from 3136 field-collected (2017–2020) and 2015 herbarium (1893–2021) specimens. Species were circumscribed using experimental assessments of anatomical, cytological and morphological traits, interpreted within a phylogenetically informed evolutionary context. Stable diagnostic traits were then applied to re-identify specimens for improved species distribution models. Finally, curated trait and species data were used to train machine learning classification models to reconstruct the diagnostic rationale underlying specimen identification.
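The last step of the framework, training a classifier on curated traits, can be illustrated with a minimal sketch. It is not the authors' model: it fits a one-split decision stump in plain Python, and the trait names (`petal_count`, `leaf_lobing_index`), the species labels `A` and `B`, and every value in `TRAIN` are hypothetical and invented for illustration only.

```python
from typing import List, Tuple

# Hypothetical toy records: (petal_count, leaf_lobing_index, species_label).
# All values are invented for illustration; they are not data from the study.
Record = Tuple[float, float, str]

TRAIN: List[Record] = [
    (0, 0.30, "A"), (0, 0.55, "A"), (0, 0.70, "A"), (1, 0.45, "A"),
    (4, 0.35, "B"), (4, 0.50, "B"), (4, 0.65, "B"), (3, 0.60, "B"),
]
FEATURES = ["petal_count", "leaf_lobing_index"]


def fit_stump(records: List[Record]):
    """Search every feature and midpoint threshold for the single split
    with the highest training accuracy.

    Returns (feature_index, threshold, low_label, high_label, accuracy).
    """
    labels = sorted({r[-1] for r in records})
    best = None
    for f in range(len(FEATURES)):
        values = sorted({r[f] for r in records})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            for low_label in labels:
                for high_label in labels:
                    if low_label == high_label:
                        continue
                    correct = sum(
                        (low_label if r[f] <= thr else high_label) == r[-1]
                        for r in records
                    )
                    acc = correct / len(records)
                    # Strict comparison keeps the first best split found.
                    if best is None or acc > best[4]:
                        best = (f, thr, low_label, high_label, acc)
    return best


def predict(stump, traits: Tuple[float, float]) -> str:
    """Classify one specimen from its trait tuple using the fitted stump."""
    f, thr, low_label, high_label, _ = stump
    return low_label if traits[f] <= thr else high_label


stump = fit_stump(TRAIN)
print(FEATURES[stump[0]], stump[1], stump[4])
```

On this toy set the stump selects the stable trait (`petal_count`) rather than the overlapping leaf index, which mirrors the idea of recovering the diagnostic rationale from the fitted model. A real analysis would use more traits, held-out validation and an established library.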
Key Results
Seed arrangement, number of petals and genome size exhibited clear interspecific differentiation. Phylogenomic analyses based on chloroplast genomes further resolved species circumscription consistent with these traits. According to the revision of specimens and classification models defined by machine learning, we found that initial misidentification rates reached 12–50% across virtual or physical specimens, largely owing to reliance on plastic traits, such as leaf shape. These errors substantially distorted spatial distribution models and future climate projections.
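A misidentification rate of this kind is the share of specimens whose original label differs from the label assigned on re-examination. The sketch below shows that arithmetic per specimen source. The function name, the record layout and the example records are hypothetical, not taken from the study.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Each record: (source, original_label, revised_label).
# Hypothetical layout and values, for illustration only.
Specimen = Tuple[str, str, str]


def misidentification_rates(records: Iterable[Specimen]) -> Dict[str, float]:
    """Return, per source, the fraction of specimens whose original
    identification disagrees with the revised identification."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for source, original, revised in records:
        totals[source] += 1
        if original != revised:
            errors[source] += 1
    return {source: errors[source] / totals[source] for source in totals}


example = [
    ("physical", "A", "A"),
    ("physical", "A", "B"),  # relabelled on re-examination
    ("physical", "B", "B"),
    ("physical", "B", "B"),
    ("virtual", "A", "B"),   # relabelled on re-examination
    ("virtual", "B", "B"),
]
rates = misidentification_rates(example)
print(rates)  # {'physical': 0.25, 'virtual': 0.5}
```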
Conclusions
Our findings underscore the need for secondary specimen evaluation. The framework demonstrates the importance of integrating morphological and phylogenetic inference with machine learning tools to resolve taxonomically difficult polyploid complexes. This approach offers direct applications for biodiversity assessment, evolutionary research and conservation planning.


