Analyzing trait-climate relationships within and among taxa using machine learning and herbarium specimens
Brendan C. Wilde, Jason G. Bragg, William Cornwell
American Journal of Botany Published: 12 April 2023
Continental-scale leaf trait studies can help explain how plants survive in different environments, but large data sets are costly to assemble at this scale. Automating the measurement of digitized herbarium collections could rapidly expand the data available to such studies. We used machine learning to identify and measure leaves from existing, digitized herbarium specimens. The process was developed, validated, and applied to analyses of relationships between leaf size and climate within and among species for two genera: Syzygium (Myrtaceae) and Ficus (Moraceae).
Convolutional neural network (CNN) models were used to detect and measure complete leaves in images. Predictions of a model trained with a set of 35 randomly selected images and a second model trained with 35 user-selected images were compared using a set of 50 labeled validation images. The validated models were then applied to 1227 Syzygium and 2595 Ficus specimens digitized by the National Herbarium of New South Wales, Australia. Leaf area measurements were made for each genus and used to examine links between leaf size and climate.
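The comparison of predicted and labeled leaves can be illustrated with a minimal sketch. It assumes that each leaf is represented as a binary pixel mask and that image scale is known in pixels per centimeter. Intersection over union (IoU) is used here as one common overlap score; the abstract does not state which metric the authors used. The function names `mask_area_cm2` and `mask_iou`, the parameter `pixels_per_cm`, and the toy masks are hypothetical.

```python
import numpy as np


def mask_area_cm2(mask: np.ndarray, pixels_per_cm: float) -> float:
    """Convert a binary leaf mask to an area in cm^2 (assumes a known image scale)."""
    n_pixels = float(mask.astype(bool).sum())
    return n_pixels / (pixels_per_cm ** 2)


def mask_iou(pred: np.ndarray, truth: np.ndarray) -> float:
    """Intersection over union between a predicted and a labeled mask."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        # Both masks are empty, so treat them as a perfect match.
        return 1.0
    return float(np.logical_and(pred, truth).sum()) / float(union)


# Toy masks (made up for illustration, not data from the study).
truth = np.zeros((10, 10), dtype=bool)
truth[2:6, 2:6] = True            # labeled leaf: 4 x 4 = 16 pixels
pred = np.zeros((10, 10), dtype=bool)
pred[3:7, 2:6] = True             # predicted leaf, shifted down by one row

iou = mask_iou(pred, truth)                       # overlap 12 px, union 20 px
area = mask_area_cm2(truth, pixels_per_cm=2.0)    # 16 px at 2 px/cm
print(f"IoU = {iou:.2f}, labeled area = {area:.1f} cm^2")
```

Averaging such a score over the labeled validation images would give one number per model, which is one way two training strategies could be compared.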
For Syzygium, the model trained on user-selected images found more leaves (9347 vs. 8423) using fewer training masks (218 vs. 225) and detected leaves across a greater range of sizes than the model trained on randomly selected images. Within each genus, leaf size was positively associated with temperature and rainfall, consistent with previous observations. However, within species, the associations between leaf size and environmental variables were weaker.
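The distinction between among-species and within-species associations can be sketched as follows. This is an illustrative decomposition, not the authors' statistical method: the among-species slope is fitted to species means, and a pooled within-species slope is fitted to values centered on each species' mean. The function name `among_and_within_slopes` and the toy numbers are hypothetical.

```python
import numpy as np


def among_and_within_slopes(species, climate, log_area):
    """Return (among-species slope, pooled within-species slope).

    Among: least-squares slope through the species means.
    Within: least-squares slope after subtracting each species' own mean
    from both variables.
    """
    species = np.asarray(species)
    climate = np.asarray(climate, dtype=float)
    log_area = np.asarray(log_area, dtype=float)

    # Map each record to an integer index for its species.
    _, idx = np.unique(species, return_inverse=True)
    counts = np.bincount(idx)
    mean_x = np.bincount(idx, weights=climate) / counts
    mean_y = np.bincount(idx, weights=log_area) / counts

    # Slope across species means.
    among = float(np.polyfit(mean_x, mean_y, 1)[0])

    # Slope of species-centered values, pooled over all species.
    dx = climate - mean_x[idx]
    dy = log_area - mean_y[idx]
    within = float((dx * dy).sum() / (dx ** 2).sum())
    return among, within


# Toy records (made up for illustration, not data from the study).
species = ["A", "A", "B", "B"]
temperature = [10.0, 12.0, 20.0, 22.0]   # e.g. mean annual temperature
log_leaf_area = [1.0, 1.1, 3.0, 3.1]     # e.g. log-transformed leaf area

among, within = among_and_within_slopes(species, temperature, log_leaf_area)
print(f"among-species slope = {among:.2f}, within-species slope = {within:.2f}")
```

In this toy case the among-species slope is steeper than the within-species slope, mirroring the qualitative pattern reported above. A full analysis would more likely use a mixed-effects or phylogenetically informed model.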
CNNs detected and measured leaves with levels of accuracy useful for trait extraction and analysis, illustrating the potential for machine learning applied to herbarium specimens to massively increase global leaf trait data sets. Within-species relationships were weak, suggesting that population history and gene flow have a strong effect at this level. Herbarium specimens and machine learning could expand sampling of trait data within many species, offering new insights into trait evolution.