2023-10-06 New York University (NYU)
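◆The research focuses on RNA splicing, a process in the transfer of genetic information, and the team developed a model that traces and predicts this process. The model indicated that specific structures within RNA control splicing, and experiments confirmed that insight. The study is an important step toward making neural networks more transparent and deepening our understanding of biological processes.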
<Related information>
- https://www.nyu.edu/about/news-publications/news/2023/october/researchers-create-a-neural-network-for-genomics-one-that-explai.html
- https://www.pnas.org/doi/10.1073/pnas.2221165120
Deciphering RNA splicing logic with interpretable machine learning
Susan E. Liao, Mukund Sudarshan, and Oded Regev
Proceedings of the National Academy of Sciences Published: October 5, 2023
DOI: https://doi.org/10.1073/pnas.2221165120
Significance
Machine learning approaches are increasingly applied to advancing discovery in the biological sciences. However, despite achieving predictive accuracy, many machine learning models cannot explain how they achieve their predictive success. Here, we demonstrate that bespoke data generation coupled with model design that infuses foundational biological knowledge enables an “interpretable-by-design” approach that advances our understanding of RNA splicing. Our model not only accurately predicts the quantitative splicing outcomes but also explains how specific combinations of RNA features dictate splicing outcomes. We validate the network predictions and interpretations through additional data generation and experimental validation. These results demonstrate that “interpretable-by-design” machine learning represents a powerful approach to harnessing the potential of machine learning toward advancing our understanding of biological processes.
Abstract
Machine learning methods, particularly neural networks trained on large datasets, are transforming how scientists approach scientific discovery and experimental design. However, current state-of-the-art neural networks are limited by their uninterpretability: Despite their excellent accuracy, they cannot describe how they arrived at their predictions. Here, using an “interpretable-by-design” approach, we present a neural network model that provides insights into RNA splicing, a fundamental process in the transfer of genomic information into functional biochemical products. Although we designed our model to emphasize interpretability, its predictive accuracy is on par with state-of-the-art models. To demonstrate the model’s interpretability, we introduce a visualization that, for any given exon, allows us to trace and quantify the entire decision process from input sequence to output splicing prediction. Importantly, the model revealed uncharacterized components of the splicing logic, which we experimentally validated. This study highlights how interpretable machine learning can advance scientific discovery.
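To make the idea of tracing a prediction from input sequence to output concrete, here is a minimal toy sketch. It is not the authors' network: it assumes a simple additive model in which each sequence motif contributes a fixed amount toward exon inclusion or skipping, and a sigmoid turns the summed score into a predicted inclusion fraction. The motifs, weights, bias, and function names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical motif weights for illustration only (not taken from the paper).
# Positive values push toward exon inclusion, negative values toward skipping.
MOTIF_WEIGHTS = {
    "GAAGAA": 1.5,
    "UAGG": -2.0,
}
BIAS = 0.0  # assumed baseline score when no motif is present


def find_all(sequence, motif):
    """Return the start index of every (possibly overlapping) motif occurrence."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


def predict_with_trace(sequence):
    """Return (predicted inclusion fraction, list of per-motif contributions).

    Each trace entry is (motif, position, weight), so the final score can be
    read off as the bias plus the sum of the listed weights.
    """
    trace = []
    for motif, weight in MOTIF_WEIGHTS.items():
        for pos in find_all(sequence, motif):
            trace.append((motif, pos, weight))
    score = BIAS + sum(weight for _, _, weight in trace)
    prediction = 1.0 / (1.0 + math.exp(-score))  # squash score into (0, 1)
    return prediction, trace


prediction, trace = predict_with_trace("CCGAAGAACCUAGGCC")
for motif, pos, weight in trace:
    print(f"{motif} at position {pos}: contribution {weight:+.1f}")
print(f"predicted inclusion fraction: {prediction:.3f}")
```

Because every term in the score is tied to a named feature at a known position, the prediction can be explained by listing the contributions. That is the general property an interpretable-by-design model aims for, although the published model is considerably richer than this sketch.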