2026-06-16 Chinese Academy of Sciences (CAS)
<Related Information>
- https://english.cas.cn/newsroom/research-news/202606/t20260611_1161564.shtml
- https://www.pnas.org/doi/10.1073/pnas.2607035123
Void-X: A generative void-filling model for predicting atomic packing in proteins
Jing Yang, Junying Yuan, and James J. Chou
Proceedings of the National Academy of Sciences. Published: June 9, 2026
DOI: https://doi.org/10.1073/pnas.2607035123
Significance
Computational protein design is advancing rapidly from designing protein folds to designing specific protein–protein and protein–drug interactions for expediting drug discovery. The majority of the recent design studies employ a top–down approach whereby the overall protein shape is first generated to pack against a given structural site, followed by sequence design to optimize the interaction. Here, we report a different approach wherein an atomic filling model was developed for learning atomic-level interactions. This model, named Void-X, comprises 172 million parameters and achieves a success rate of 53 to 78% depending on the different types of test applications. Void-X represents a prompt-answer model for predicting atomic packing at protein interaction interfaces, with potentially broad application in designing epitope-specific protein binders.
Abstract
Generative AI algorithms such as the transformer and diffusion models have greatly empowered de novo design of proteins capable of specifically interacting with designated structural sites on another protein. Most of these design methods employ a top–down approach, in which an overall protein shape is generated by an AI model to pack against a given structural site, followed by sequence design to optimize the interaction. Despite being trained on limited protein complex structures available in the database, the top–down approach has yielded encouraging results. Here, we propose a bottom–up approach that generates atom clusters for optimal packing against a specified structured region for informing the design of protein–protein interactions. To this end, we trained a masked discrete diffusion model, named Void-X, that uses the diffusion transformer to learn atomic-level interactions and fill atomic voids in protein interaction interfaces. Void-X was trained using 8.7 million spherical clusters of atoms from experimental structures in the Protein Data Bank. In each cluster, ~70% of the atoms are used as context (or prompt), and ~30% are masked for information recovery (or answer). By training the model with 172 million parameters, Void-X achieves an overall accuracy of 78.3% and 68.2% for intra- and interchain spherical clusters, respectively. Furthermore, we find that information entropy is a reliable indicator of the prediction accuracy for Void-X. This level of performance allows de novo generation of molecular interactions at the atomic level, offering an alternative approach of protein design complementary to the existing ones.
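The prompt/answer split and the entropy indicator described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `split_cluster` and `shannon_entropy`, the uniform random choice of masked atoms, the string atom labels, and the use of bits as the entropy unit are all assumptions made for illustration; the abstract states only that about 70% of atoms serve as context, about 30% are masked, and information entropy tracks prediction accuracy.

```python
import math
import random


def split_cluster(atoms, mask_fraction=0.3, seed=0):
    """Split a cluster's atoms into (context, masked) lists.

    Assumption: masked atoms are chosen uniformly at random. The abstract
    gives only the approximate 70/30 ratio, not the selection rule.
    """
    rng = random.Random(seed)
    indices = list(range(len(atoms)))
    rng.shuffle(indices)
    n_masked = round(mask_fraction * len(atoms))
    masked_idx = set(indices[:n_masked])
    # Preserve the original atom order within each subset.
    context = [a for i, a in enumerate(atoms) if i not in masked_idx]
    masked = [a for i, a in enumerate(atoms) if i in masked_idx]
    return context, masked


def shannon_entropy(probs):
    """Shannon entropy (in bits) of a discrete probability distribution.

    Low entropy means the prediction is concentrated on few outcomes;
    zero-probability entries contribute nothing and are skipped.
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)


if __name__ == "__main__":
    # Hypothetical atom labels standing in for a spherical cluster.
    cluster = ["N", "CA", "C", "O", "CB", "CG", "CD", "NE", "CZ", "NH1"]
    context, masked = split_cluster(cluster)
    print(len(context), "context atoms;", len(masked), "masked atoms")

    # A confident prediction has lower entropy than an uncertain one.
    print(shannon_entropy([0.97, 0.01, 0.01, 0.01]))
    print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))
```

Under this reading, a masked position whose predicted distribution is sharply peaked (low entropy) would be expected to be recovered correctly more often than one whose distribution is nearly flat, which is what makes entropy usable as a confidence indicator.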

