Machine Learning Modelling for Predicting the Efficacy of Ionic Liquid-Aided Biomass Pretreatment

Mahanty, Biswanath; Gharami, Munmun; Haldar, Dibyajyoti

doi:10.1007/s12155-024-10747-2

Published: 26 March 2024

BioEnergy Research

Abstract

The influence of ionic liquid (IL) characteristics, lignocellulosic biomass (LCB) properties, and process conditions on LCB pretreatment is not well understood. In this study, 129 experimental data points on LCB (grass, agricultural, and forest residues) pretreatment using imidazolium, triethylamine, and choline-amino acid ILs were compiled to develop machine learning (ML) models for cellulose, hemicellulose, lignin, and solid recovery. Following data imputation, a bilayer artificial neural network (ANN) and a random forest (RF) regression, two of the most widely adopted ML models, were developed. The full-featured ANN, after Bayesian hyperparameter (HP) optimisation, fitted the training data closely (R²: 0.936–0.994), but its cross-validation performance (R²CV) was lower, ranging from 0.547 to 0.761. For the HP-optimised RF models, R² ranged from 0.824 to 0.939 for the regression fit and R²CV from 0.383 to 0.831 in cross-validation. Temperature and pretreatment time were the most important predictors, except for hemicellulose recovery. Bayesian predictor selection combined with HP optimisation improved the R²CV range for the ANN (0.555–0.825) as well as for the RF models (0.474–0.824). Because predictive performance varied with the target response, a larger and more homogeneous dataset may be warranted. The predictive modelling framework for LCB pretreatment developed in this study can be extended to similar biochemical process systems.
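
The gap between training R² and cross-validated R²CV reported above can be illustrated with a minimal sketch. It is not the authors' ANN or RF pipeline: a 1-nearest-neighbour regressor is swapped in as a deliberately overfitting model, the data are synthetic (only the sample size of 129 is taken from the abstract), and the names make_synthetic, knn_predict and cross_validated_r2 are hypothetical. R²CV is computed here from pooled out-of-fold predictions, which is one common convention and is only assumed to resemble the one used in the study.

```python
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def knn_predict(train_x, train_y, query, k=1):
    """Average the responses of the k training points closest to `query`."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k


def cross_validated_r2(xs, ys, n_folds=5, k=1, seed=0):
    """R² computed on pooled out-of-fold predictions (k-fold cross-validation)."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    preds = [0.0] * len(xs)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        for i in fold:
            preds[i] = knn_predict(train_x, train_y, xs[i], k)
    return r_squared(ys, preds)


def make_synthetic(n=129, seed=1):
    """Synthetic stand-in data: two scaled predictors and a noisy response."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        temperature = rng.uniform(0.0, 1.0)  # scaled, hypothetical
        time = rng.uniform(0.0, 1.0)         # scaled, hypothetical
        recovery = 50.0 + 30.0 * temperature + 10.0 * time + rng.gauss(0.0, 5.0)
        xs.append((temperature, time))
        ys.append(recovery)
    return xs, ys


xs, ys = make_synthetic()
# Each training point is its own nearest neighbour, so the training fit is perfect.
train_r2 = r_squared(ys, [knn_predict(xs, ys, x, k=1) for x in xs])
cv_r2 = cross_validated_r2(xs, ys, n_folds=5, k=1)
print(f"training R2 = {train_r2:.3f}, cross-validated R2 = {cv_r2:.3f}")
```

A model that memorises its training data scores a perfect training R², while its R²CV reflects only what generalises to held-out points. The same comparison applies to any regressor, which is why cross-validation scores are the more informative measure on small datasets.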

Data Availability

Data in this study are available on request to the corresponding author.

Acknowledgements

D. H. and B. M. would like to acknowledge Karunya Institute of Technology and Sciences, Coimbatore for providing every essential support to perform the experiments and complete this research work.

Funding

This work is financially supported by Karunya Institute of Technology and Sciences, Coimbatore.

Author information

Authors and Affiliations

Division of Biotechnology, Karunya Institute of Technology and Sciences, Coimbatore, 641114, India
Biswanath Mahanty & Dibyajyoti Haldar
Division of Computer Science and Engineering, Karunya Institute of Technology and Sciences, Coimbatore, 641114, India
Munmun Gharami

Contributions

Biswanath Mahanty: conceptualisation, software, writing — original draft, reviewing and editing. Munmun Gharami: data curation, reviewing and editing. Dibyajyoti Haldar: conceptualisation, data curation, reviewing and editing.

Corresponding authors

Correspondence to Biswanath Mahanty or Dibyajyoti Haldar.

Ethics declarations

Ethics Approval and Consent to Participate

The study does not involve any human participants, human data, or human tissue. Ethics approval is not applicable.

Consent for Publication

Not applicable.

Competing Interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 163 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mahanty, B., Gharami, M. & Haldar, D. Machine Learning Modelling for Predicting the Efficacy of Ionic Liquid-Aided Biomass Pretreatment. Bioenerg. Res. (2024). https://doi.org/10.1007/s12155-024-10747-2

Received: 31 January 2024
Accepted: 20 March 2024
Published: 26 March 2024
DOI: https://doi.org/10.1007/s12155-024-10747-2
