Generative AI Model Developed to Predict Protein-Protein Interactions at the Atomic Level (Novel Generative AI Model Enables Atomic-Scale Prediction of Protein-Protein Interactions)

2026-06-17

2026-06-16 Chinese Academy of Sciences (CAS)

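A research team at the Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, has developed "Void-X," a generative AI model that can predict and design protein-protein interactions at the atomic level. The study was published in PNAS. Accurate prediction of protein-protein interactions is a key challenge: it can accelerate the development of antibody drugs, diabetes treatments, and other medicines, and lead to new therapies. Conventional AI-based protein design has mainly been top-down, first generating the overall structure (the scaffold) and then optimizing binding. Void-X instead takes a bottom-up approach focused on atomic packing: it designs stable interactions by directly generating atom clusters that fill the interatomic voids at a protein interface. The team built 8.7 million spherical atom clusters from experimental structures deposited in the Protein Data Bank and trained the model with about 30% of the atoms in each cluster hidden. The resulting model, with 172 million parameters, achieved prediction accuracies of 78.3% for intrachain atom clusters and 68.2% for interchain clusters. Void-X is expected to serve as a new technical foundation for rational molecular design in drug discovery, protein engineering, synthetic biology, and related fields.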
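The prompt/answer split described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `split_cluster`, the toy coordinates, and the uniform random choice of which atoms to hide are all assumptions made for illustration. The source states only that roughly 30% of the atoms in each spherical cluster are masked and the rest serve as context.

```python
import random

def split_cluster(atoms, mask_fraction=0.3, seed=0):
    """Split one atom cluster into context (prompt) and masked (answer) atoms.

    Assumption: atoms to hide are chosen uniformly at random; the source
    does not say how the masked atoms are selected.
    """
    rng = random.Random(seed)
    n_masked = round(len(atoms) * mask_fraction)
    masked_idx = set(rng.sample(range(len(atoms)), n_masked))
    context = [a for i, a in enumerate(atoms) if i not in masked_idx]
    masked = [a for i, a in enumerate(atoms) if i in masked_idx]
    return context, masked

# Toy cluster of (element, x, y, z) tuples; the values are made up.
cluster = [
    ("N", 0.0, 0.0, 0.0), ("C", 1.5, 0.0, 0.0), ("C", 2.0, 1.4, 0.0),
    ("O", 1.2, 2.3, 0.0), ("C", 2.1, -0.8, 1.2), ("S", 3.8, -0.5, 1.6),
    ("N", 3.3, 1.6, 0.1), ("C", 4.0, 2.8, 0.3), ("O", 5.2, 2.9, 0.2),
    ("C", 3.2, 4.0, 0.9),
]

context, masked = split_cluster(cluster)
print(len(context), "context atoms;", len(masked), "masked atoms")
```

In this sketch the model's task would be to recover the `masked` atoms given only `context`; a fixed seed keeps the split reproducible.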

<Related Information>

Void-X: A generative void-filling model for predicting atomic packing in proteins

Jing Yang, Junying Yuan, and James J. Chou
Proceedings of the National Academy of Sciences, Published: June 9, 2026
DOI: https://doi.org/10.1073/pnas.2607035123

Significance

Computational protein design is advancing rapidly from designing protein folds to designing specific protein–protein and protein–drug interactions for expediting drug discovery. The majority of the recent design studies employ a top–down approach whereby the overall protein shape is first generated to pack against a given structural site, followed by sequence design to optimize the interaction. Here, we report a different approach wherein an atomic filling model was developed for learning atomic-level interactions. This model, named Void-X, comprises 172 million parameters and achieves a success rate of 53 to 78% depending on the different types of test applications. Void-X represents a prompt-answer model for predicting atomic packing at protein interaction interfaces, with potentially broad application in designing epitope-specific protein binders.

Abstract

Generative AI algorithms such as the transformer and diffusion models have greatly empowered de novo design of proteins capable of specifically interacting with designated structural sites on another protein. Most of these design methods employ a top–down approach, in which an overall protein shape is generated by an AI model to pack against a given structural site, followed by sequence design to optimize the interaction. Despite being trained on limited protein complex structures available in the database, the top–down approach has yielded encouraging results. Here, we propose a bottom–up approach that generates atom clusters for optimal packing against a specified structured region for informing the design of protein–protein interactions. To this end, we trained a masked discrete diffusion model, named Void-X, that uses the diffusion transformer to learn atomic-level interactions and fill atomic voids in protein interaction interfaces. Void-X was trained using 8.7 million spherical clusters of atoms from experimental structures in the Protein Data Bank. In each cluster, ~70% of the atoms are used as context (or prompt), and ~30% are masked for information recovery (or answer). By training the model with 172 million parameters, Void-X achieves an overall accuracy of 78.3% and 68.2% for intra- and interchain spherical clusters, respectively. Furthermore, we find that information entropy is a reliable indicator of the prediction accuracy for Void-X. This level of performance allows de novo generation of molecular interactions at the atomic level, offering an alternative approach of protein design complementary to the existing ones.
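The abstract's remark that information entropy tracks prediction accuracy can be made concrete with a small sketch. It assumes, hypothetically, that the model outputs a categorical probability distribution over candidate atom types at each masked position and that the entropy in question is the Shannon entropy of that distribution; the abstract does not specify either point, and the probabilities below are invented.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits of a categorical distribution.

    Zero-probability entries contribute nothing and are skipped.
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical predicted distributions over four atom types at one position.
confident = [0.97, 0.01, 0.01, 0.01]   # sharply peaked: low entropy
uncertain = [0.25, 0.25, 0.25, 0.25]   # uniform: maximal entropy (2 bits)

h_confident = shannon_entropy(confident)
h_uncertain = shannon_entropy(uncertain)
print(f"confident: {h_confident:.3f} bits, uncertain: {h_uncertain:.3f} bits")
```

Under these assumptions, a low-entropy prediction is one the model is sure about, so an entropy threshold could be used to flag predictions that deserve less trust.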
