In-line Raman spectroscopy combined with accurate quantification models can offer detailed real-time insights into a bioprocess by monitoring key process parameters. However, traditional approaches to model calibration require extensive data collection from multiple bioreactor runs and result in process-specific models that are sensitive to operational changes. These challenges can be tackled by simplifying experimental data generation or by implementing computational methods to obtain synthetic and augmented Raman spectra. In this study, we used a small experimental dataset of 16 single-compound spectra to calibrate quantification models with partial least squares (PLS) and indirect hard modeling (IHM). When applied to yeast batch and fed-batch bioprocesses, the two methods gave comparable relative root mean square errors of prediction (rRMSEP; PLS and IHM, respectively) for glucose (4.8% and 4.2%), ethanol (11.6% and 6.3%), and biomass (16.2% and 10.0%). Subsequently, isolated spectral features extracted during IHM were used to generate fully synthetic spectral datasets for PLS model calibration, resulting in rRMSEPs of 3.2% and 14.5% for glucose and ethanol, respectively. Finally, spectra from a single batch process were augmented with the same isolated spectral features, and calibration with these augmented spectra reduced rRMSEP by 18.6 percentage points (glucose) and 4.3 percentage points (ethanol) compared to models calibrated on process data only. This study demonstrates how different approaches can support robust development and rapid implementation of Raman spectroscopy-based models while minimizing experimental effort, and shows that even complete independence from process data can be achieved.
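
The idea of calibrating a PLS model on fully synthetic spectra can be illustrated with a minimal sketch. It is not the study's implementation: the Gaussian "pure component" spectra, peak positions, concentration ranges, and noise level below are hypothetical placeholders for the isolated spectral features obtained from IHM, the PLS1 routine is a plain NIPALS implementation written for this example, and the rRMSEP is assumed here to be the RMSEP normalized by the range of the reference values (the abstract does not state the normalization).

```python
import numpy as np


def gaussian_peak(axis, center, width, height=1.0):
    """Single Gaussian band on a Raman-shift axis."""
    return height * np.exp(-0.5 * ((axis - center) / width) ** 2)


def make_synthetic_spectra(pure_spectra, conc_ranges, n_samples, noise_sd, rng):
    """Mix unit-concentration spectra linearly and add white noise.

    pure_spectra: array (n_compounds, n_points)
    conc_ranges:  list of (low, high) tuples, one per compound
    """
    lows = np.array([r[0] for r in conc_ranges])
    highs = np.array([r[1] for r in conc_ranges])
    conc = rng.uniform(lows, highs, size=(n_samples, len(conc_ranges)))
    spectra = conc @ pure_spectra
    spectra = spectra + rng.normal(0.0, noise_sd, size=spectra.shape)
    return spectra, conc


def pls1_fit(X, y, n_components):
    """PLS1 regression via NIPALS; returns coefficients and centering terms."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w = w / np.linalg.norm(w)      # weight vector
        t = Xr @ w                     # scores
        tt = t @ t
        p = Xr.T @ t / tt              # X loadings
        c = yr @ t / tt                # y loading
        Xr = Xr - np.outer(t, p)       # deflate X
        yr = yr - c * t                # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)
    return beta, x_mean, y_mean


def pls1_predict(X, beta, x_mean, y_mean):
    return (X - x_mean) @ beta + y_mean


def rrmsep(y_true, y_pred):
    """RMSEP in percent of the reference range (assumed normalization)."""
    rmsep = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return 100.0 * rmsep / (y_true.max() - y_true.min())


rng = np.random.default_rng(0)
shift = np.linspace(400.0, 1800.0, 700)  # Raman shift axis in cm^-1

# Hypothetical pure spectra; band positions are illustrative only.
glucose_like = gaussian_peak(shift, 1125, 15) + gaussian_peak(shift, 520, 20, 0.6)
ethanol_like = gaussian_peak(shift, 880, 12) + gaussian_peak(shift, 1090, 18, 0.4)
pure = np.vstack([glucose_like, ethanol_like])
ranges = [(0.0, 50.0), (0.0, 20.0)]  # assumed concentration ranges in g/L

# Calibrate on synthetic spectra, then predict an independent synthetic set.
X_cal, C_cal = make_synthetic_spectra(pure, ranges, 200, 0.02, rng)
X_val, C_val = make_synthetic_spectra(pure, ranges, 50, 0.02, rng)

beta, x_mean, y_mean = pls1_fit(X_cal, C_cal[:, 0], n_components=2)
glucose_pred = pls1_predict(X_val, beta, x_mean, y_mean)
err_glucose = rrmsep(C_val[:, 0], glucose_pred)
print(f"glucose rRMSEP on synthetic validation set: {err_glucose:.2f}%")
```

In practice the validation set would be measured process spectra rather than a second synthetic set, and the pure spectra would come from the fitted IHM component models instead of hand-placed Gaussian bands. The same `make_synthetic_spectra` helper could in principle be used for augmentation by adding the mixed features to measured spectra, although how the study combined them is not specified in the abstract.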