Session: Government Agency Student Posters
Paper Number: 173227
Bayesian Inference on Multi-Scale Probabilistic Knowledge Graph Foundation Model to Support Flexible Biomanufacturing Process Monitoring and Mechanism Learning
Biopharmaceuticals provide essential lifesaving treatments for severe and chronic diseases, including cancer, metabolic diseases, and infectious diseases such as COVID-19, often with advantages such as increased efficacy and reduced side effects. However, current biomanufacturing systems have high variability and lack the flexibility to quickly produce existing and new biopharmaceuticals on demand. These issues arise because bioprocessing in biomanufacturing is enormously complex. With cells or other living organisms serving as factories, a biomanufacturing process is fundamentally a Biological System-of-Systems (Bio-SoS) involving hundreds of biological, physical, and chemical factors that interact dynamically at molecular, cellular, and macroscopic scales and impact production outcomes. Bioprocessing mechanisms are not systematically understood, and data are often very limited, sparse, and heterogeneous. To address these challenges, we first introduce a general multi-scale probabilistic knowledge graph (pKG) as a foundation model for Bio-SoS, expressed in stochastic differential equation (SDE) form with a modular design. It can flexibly represent spatial-temporal causal interdependencies from molecular to cellular to macroscopic scales for different biomanufacturing processes. To advance scientific understanding of bioprocessing mechanisms across scales, we then introduce a mechanism-informed Bayesian learning framework, comprising an interpretable metamodel and a computationally efficient Bayesian posterior sampling approach, to sequentially fuse heterogeneous data, learn underlying mechanisms, and track critical latent state variables.
Specifically, the proposed metamodel is constructed by applying the linear noise approximation (LNA) to the pKG foundation model in continuous-time SDE form and coupling it with a sequential learning strategy. It enables us to fuse sparse and noisy measurements collected at various scales from different biomanufacturing processes, and to infer heterogeneous latent state variables so that the otherwise intractable likelihood function can be explicitly approximated. Compared to typical black-box metamodels such as Gaussian processes, the constructed metamodel fully exploits the domain knowledge and structural information of nonlinear bioprocessing dynamics, accounting for molecule-to-molecule interactions, which improves sample efficiency and prediction interpretability. In addition, the proposed Bayesian posterior sampling approach uses Langevin diffusion (LD) to take advantage of the causal dependence information in the derived likelihood function, so that the mechanistic parameters of the foundation model can be learned efficiently. Further, by generalizing the idea of LNA, we bypass the difficulty of selecting the step size for solving LD during posterior learning. Compared to typical approximate Bayesian computation (ABC)-type and Markov chain Monte Carlo (MCMC) Bayesian posterior sampling approaches, the proposed sampling procedure provides fast, provable convergence, accelerating digital twin development and mechanism learning for biomanufacturing systems. A systematic numerical study inspired by real-world biomanufacturing problems demonstrates the effectiveness of the proposed framework, showing its potential to guide the most informative data collection to reduce model uncertainty and to support optimal, robust, and interpretable decision making.
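As a rough illustration of the Langevin diffusion idea mentioned above, the sketch below runs a generic unadjusted Langevin algorithm on a toy conjugate Gaussian problem. It is not the authors' LNA-based procedure: the fixed step size, the toy data, and all names (`langevin_sample`, `grad_log_post`, `sigma`, `tau`) are assumptions made for this example only. The toy problem is chosen because its exact posterior is known, so the samples can be checked against it.

```python
import numpy as np

def langevin_sample(grad_log_post, theta0, step, n_steps, rng):
    """Unadjusted Langevin algorithm (Euler-Maruyama discretization of LD).

    Update: theta <- theta + (step / 2) * grad log p(theta | y) + sqrt(step) * N(0, 1)
    The fixed step size is an assumption of this sketch.
    """
    theta = float(theta0)
    samples = np.empty(n_steps)
    for k in range(n_steps):
        noise = rng.standard_normal()
        theta = theta + 0.5 * step * grad_log_post(theta) + np.sqrt(step) * noise
        samples[k] = theta
    return samples

# Hypothetical toy data: y_i ~ N(theta, sigma^2) with prior theta ~ N(0, tau^2).
rng = np.random.default_rng(0)
sigma, tau = 1.0, 10.0
y = rng.normal(2.0, sigma, size=50)

def grad_log_post(theta):
    # Gradient of the log-likelihood plus gradient of the log-prior.
    return np.sum(y - theta) / sigma**2 - theta / tau**2

# Closed-form posterior for this conjugate model, used only as a reference.
post_prec = len(y) / sigma**2 + 1.0 / tau**2
post_mean = (np.sum(y) / sigma**2) / post_prec
post_std = post_prec ** -0.5

samples = langevin_sample(grad_log_post, theta0=0.0, step=1e-3, n_steps=20000, rng=rng)
burned = samples[2000:]  # discard the initial transient

print("exact posterior mean/std:", post_mean, post_std)
print("Langevin estimate mean/std:", burned.mean(), burned.std())
```

With a fixed step size the discretized chain carries a small bias in its stationary variance, which is one reason step-size selection matters for Langevin-type samplers.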
Presenting Author: Wandi Xu, Northeastern University
Presenting Author Biography: Wandi Xu is a Ph.D. candidate in the Department of Mechanical and Industrial Engineering at Northeastern University, advised by Professor Wei Xie. Her research interests include machine learning, computer simulation, Bayesian learning, and design of experiments, with applications in biomanufacturing. Previously, she obtained an M.S. in Management Science and Engineering from Shanghai Jiao Tong University and a B.S. in Statistics from Jilin University. Her email address is xu.wand@northeastern.edu.
Authors:
Wandi Xu, Northeastern University
Wei Xie, Northeastern University
Paper Type
Government Agency Student Poster Presentation
