2026-04-06 RIKEN

BB-EIT: combining a large language model that reads chemical structural formulas with physical and biochemical features
<Related information>
- https://www.riken.jp/press/2026/20260406_1/index.html
- https://pubs.acs.org/doi/10.1021/acsami.5c25223
BB-EIT: A Generalized Prediction Model for Protein Adsorption on Polymer Brushes Using Augmented Chemical Embeddings
Shiwei Su, Nobuyuki Tanaka, Yoshitaka Ushiku, and Koichi Takahashi
ACS Applied Materials & Interfaces, published April 6, 2026
DOI: https://doi.org/10.1021/acsami.5c25223
Abstract
Precise control of protein adsorption on polymer surfaces is essential in materials science and biomaterial design, with applications in antifouling materials, biosensors, cell culture, and drug delivery systems. However, the complex interactions between polymers and proteins and the limited availability of high-quality interaction data remain major challenges in polymer informatics. Current approaches often lack the generalizability needed to model diverse polymer–protein systems within a single unified framework, and there is a paucity of comprehensive predictive models capable of handling diverse polymer–protein interactions. To address these challenges, we introduce BB-EIT (Biointerface BERT Encoder for Interaction Translation), a novel generalized model designed to accurately predict the amount of diverse protein adsorption on polymer brushes. BB-EIT leverages the pretrained ChemBERTa large language model (LLM) architecture using SMILES strings for robust chemical representation and convenient data augmentation through SMILES enumeration. By adapting the pretrained model with an extended layer integrating a comprehensive set of physicochemical and biochemical features, including polymer thickness, water contact angle, and surface charge as well as protein isoelectric point (pI) and size, the BB-EIT showed state-of-the-art performance and strong generalizability. The model accurately predicted the adsorption behavior in previously unseen polymer and protein systems. This work represents an important step toward the data-driven design of biomaterials with tailored protein adsorption properties.
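The abstract describes extending a pretrained chemical encoder with a layer that takes physicochemical and biochemical features alongside the learned representation. The following is a minimal sketch of that general idea only, assuming a fixed-length embedding vector has already been obtained from some encoder. The names `AuxFeatures`, `fuse`, and `LinearHead`, the units, the embedding length of 8, and the use of a plain linear head are all illustrative assumptions and not the authors' implementation.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class AuxFeatures:
    """Hypothetical container for the feature types named in the abstract.

    Units are assumptions made for this sketch only.
    """
    thickness_nm: float       # polymer brush thickness
    contact_angle_deg: float  # water contact angle
    surface_charge_mv: float  # surface charge
    protein_pi: float         # protein isoelectric point (pI)
    protein_size_kda: float   # protein size

    def as_vector(self) -> List[float]:
        return [
            self.thickness_nm,
            self.contact_angle_deg,
            self.surface_charge_mv,
            self.protein_pi,
            self.protein_size_kda,
        ]


def fuse(embedding: Sequence[float], features: AuxFeatures) -> List[float]:
    """Concatenate a chemical embedding with the auxiliary features."""
    return list(embedding) + features.as_vector()


class LinearHead:
    """Untrained linear regression head over the fused vector (illustrative)."""

    def __init__(self, in_dim: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.weights = [rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
        self.bias = 0.0

    def predict(self, x: Sequence[float]) -> float:
        if len(x) != len(self.weights):
            raise ValueError("input length does not match head dimension")
        return sum(w * v for w, v in zip(self.weights, x)) + self.bias


# Stand-in for an encoder output; a real embedding would be much longer.
embedding = [0.0] * 8
features = AuxFeatures(
    thickness_nm=20.0,
    contact_angle_deg=45.0,
    surface_charge_mv=-10.0,
    protein_pi=4.7,
    protein_size_kda=66.5,
)
fused = fuse(embedding, features)
head = LinearHead(in_dim=len(fused))
prediction = head.predict(fused)  # arbitrary value: the head is untrained
```

In practice the auxiliary features would normally be normalized before concatenation, and the head would be trained jointly with, or on top of, the encoder; both steps are omitted here.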


