Abstract
Producing single-cell protein (SCP) from food-processing wastewater offers a sustainable approach to resource recovery, animal feed production, and wastewater treatment. To select operational parameters efficiently, decision-makers need accurate system performance data under variable influent conditions. However, predicting performance under such conditions is challenging because of the complexity of unsteady-state bioreactions. This study trained and tested ensemble learning algorithms, including an ensemble of Support Vector Regression, an ensemble of Gaussian Process Regression (GPR), Random Forest, and Extreme Gradient Boosting, to predict outcomes in a continuous-inflow, sequencing-batch-reactor-based SCP system using industrial soybean-processing wastewater. Interpretable analysis and trials validated feature significance for model optimization. Results show that ensemble-learning models, particularly GPR-based ones, outperform linear regression in predicting key effluent and biomass variables essential for operational decision-making. Notably, GPR-based ensembles using influential features predict biomass production (coefficient of determination (R²) = 0.72) with much greater resistance to overfitting than linear regression (R² = 0.4).
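The abstract compares models by their coefficient of determination. As a minimal sketch of how that metric is commonly computed on held-out data (R² = 1 − SS_res / SS_tot), the following uses only the Python standard library. The function name `r_squared` and all numeric values are hypothetical illustrations, not data or code from the study.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    obs_mean = mean(observed)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: variance around the observed mean.
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical held-out observations and predictions from two models.
# These numbers are made up for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
model_a = [1.2, 1.9, 3.1, 3.8, 5.1]
model_b = [2.5, 2.0, 3.5, 3.0, 4.0]

if __name__ == "__main__":
    print(f"model A test R^2: {r_squared(observed, model_a):.3f}")
    print(f"model B test R^2: {r_squared(observed, model_b):.3f}")
```

Computing the score on data the model did not see during training is what makes it informative about overfitting: a model that fits the training set well but scores poorly here has not generalized.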
Original language | English |
---|---|
Article number | 132561 |
Journal | Bioresource Technology |
Volume | 430 |
Publication status | Published - Aug 2025 |
Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2025 Elsevier Ltd
ASJC Scopus Subject Areas
- Bioengineering
- Environmental Engineering
- Renewable Energy, Sustainability and the Environment
- Waste Management and Disposal
Keywords
- Biomass
- Effluent quality monitoring
- Feature importance
- Gaussian process regression
- Interpretable analysis