2025-06-18 Chinese Academy of Sciences (CAS)
Schematic of machine learning-assisted process analysis and performance prediction (Image by SU Xinwei)
<Related information>
- https://english.cas.cn/newsroom/research_news/math/202506/t20250618_1045792.shtml
- https://www.sciencedirect.com/science/article/abs/pii/S0376738825005794?via%3Dihub
Machine learning modeling assisted intelligent process analysis for high-performance virus filtration
Xinwei Su, Hao Zhang, Tuanfeng Ma, Jianquan Luo, Shicheng Bi, Chunlei Zhou, Rong Fan, Yinhua Wan
Journal of Membrane Science Available online: 27 May 2025
DOI: https://doi.org/10.1016/j.memsci.2025.124266
Highlights
- Machine learning model for the intelligent analysis of virus filtration process.
- Establishment of a database regarding virus filtration for ML modeling.
- Membrane type is the most important variable in the virus filtration process.
- Univariate PDP analyzed the effects of variables on virus retention.
- Increasing flux can minimize negative interactions on virus retention.
Abstract
Therapeutic proteins are a cornerstone of modern medicine, offering targeted and effective clinical treatments. However, potential viral contamination is a crucial threat to the quality and safety of these products. Membrane technology is regarded as a reliable method for viral clearance, but its performance is influenced by a complex interplay of membrane/feed properties and operating parameters. Conventional experimental approaches to identify key factors governing virus breakthrough are often labor-intensive and time-consuming, limiting their utility for efficient process development. Therefore, we developed a machine learning (ML) workflow to unravel key factors, based on a database regarding the membrane processes for virus removal with high-quality data collected from previous publications. The models were trained and tested with 368 data involving eight input variables (membrane type, flux, volumetric throughput, protein concentration, ionic strength, virus concentration, virus type, pH value) and one output variable (log reduction value, LRV). Random Forest was the best-performing model according to its fitness and accuracy. Feature importance analysis revealed the relative importance of the input variables, ranked from highest to lowest as follows: membrane type > flux > volumetric throughput > protein concentration > ionic strength > virus concentration > virus type > pH value. Univariate partial dependence plot (PDP) was used to analyze the individual impact of each variable on LRV, while bivariate PDP revealed synergistic effects, particularly emphasizing the role of increased flux in mitigating adverse interactions. The reliability of the trained models was validated through virus clearance experiments at the model prediction. This study clarified the interactive impacts of critical process parameters on virus retention using ML modeling and provided a pathway for intelligently intensifying multi-factorial biopharmaceutical processes integrated with ML.
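To make two terms from the abstract concrete, here is a minimal sketch in plain Python of how an LRV is commonly computed and how a univariate partial dependence curve is built. It is an illustration under stated assumptions, not the authors' code. The function names, the toy predictor, its coefficients, the example rows, and the units (flux in L/m²/h, throughput in L/m²) are all hypothetical. The toy predictor merely stands in for a trained model such as the Random Forest named in the abstract, and its numbers do not reflect the paper's results.

```python
import math
from statistics import mean


def log_reduction_value(feed_titer, filtrate_titer):
    """Common definition: LRV = log10(feed titer / filtrate titer).

    Both titers must be positive and expressed in the same units.
    """
    if feed_titer <= 0 or filtrate_titer <= 0:
        raise ValueError("titers must be positive")
    return math.log10(feed_titer / filtrate_titer)


def toy_lrv_model(row):
    """Hypothetical stand-in for a trained regressor.

    The coefficients are arbitrary and are NOT fitted to any data.
    """
    return 6.0 - 0.01 * row["throughput"] + 0.005 * row["flux"]


def partial_dependence(predict, rows, feature, grid):
    """Univariate partial dependence of `predict` on `feature`.

    For each grid value, overwrite `feature` in every row, predict,
    and average the predictions over all rows.
    """
    curve = []
    for value in grid:
        predictions = [predict({**row, feature: value}) for row in rows]
        curve.append((value, mean(predictions)))
    return curve


# Hypothetical operating points (assumed units: flux in L/m2/h, throughput in L/m2).
rows = [
    {"flux": 20.0, "throughput": 50.0},
    {"flux": 40.0, "throughput": 150.0},
    {"flux": 60.0, "throughput": 250.0},
]

pdp_flux = partial_dependence(toy_lrv_model, rows, "flux", [20.0, 40.0, 60.0])
for flux_value, avg_lrv in pdp_flux:
    print(f"flux={flux_value:5.1f}  mean predicted LRV={avg_lrv:.2f}")
```

A bivariate PDP follows the same pattern with two features overwritten at once over a two-dimensional grid. With a real fitted model, `predict` would wrap that model's prediction call instead of the toy function.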