Abstract
Fourteen multivariate regression models were applied to predict the bio-oil yield obtained by pyrolysis, using different combinations of predictor variables. The modeling was carried out separately for each reactor regime: batch and continuous. For the batch reactor, the Cubist model with the radial basis function provided the best bio-oil prediction, with an RMSEP of 0.92%, an R² of 0.99, and an MAE of 0.73%. This result was obtained using the process variables, proximate composition, elemental composition, and lignocellulosic biomass concentration as predictors. For the continuous reactor, the best result was obtained with the Extremely Randomized Trees model applied to the complete set of predictors, with an RMSEP of 2.15%, an R² of 0.96, and an MAE of 1.74%. Both models showed outstanding performance in predicting bio-oil yield for batch and continuous reactors, which are widely used in the chemical industry. The optimization analysis of the models showed that the batch reactor can achieve a bio-oil yield as high as that of the fluidized bed reactor if operated under the right conditions. The particle swarm optimization (PSO) method used for the optimization found the global optimum within the defined analysis ranges.
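For readers unfamiliar with the three figures of merit quoted above, the following minimal Python sketch shows how RMSEP, R², and MAE are conventionally computed on a prediction (test) set. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the yields are made-up numbers, and R² is taken here as the coefficient of determination (1 − SS_res/SS_tot), which may differ from the definition used in the study.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean squared error of prediction, in the units of the response."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error, in the units of the response."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


def r2(y_true, y_pred):
    """Coefficient of determination (assumed definition: 1 - SS_res / SS_tot)."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical bio-oil yields (wt%), for illustration only; not data from the study.
observed = [40.0, 45.0, 50.0, 55.0]
predicted = [41.0, 44.0, 51.0, 54.0]

print(f"RMSEP = {rmsep(observed, predicted):.2f} %")
print(f"MAE   = {mae(observed, predicted):.2f} %")
print(f"R2    = {r2(observed, predicted):.3f}")
```

Because RMSEP squares the residuals before averaging, it penalizes large individual errors more heavily than MAE does, which is why the two are usually reported together.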
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs43153-023-00381-4/MediaObjects/43153_2023_381_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs43153-023-00381-4/MediaObjects/43153_2023_381_Fig2_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs43153-023-00381-4/MediaObjects/43153_2023_381_Fig3_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs43153-023-00381-4/MediaObjects/43153_2023_381_Fig4_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs43153-023-00381-4/MediaObjects/43153_2023_381_Fig5_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs43153-023-00381-4/MediaObjects/43153_2023_381_Fig6_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs43153-023-00381-4/MediaObjects/43153_2023_381_Fig7_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs43153-023-00381-4/MediaObjects/43153_2023_381_Fig8_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs43153-023-00381-4/MediaObjects/43153_2023_381_Fig9_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs43153-023-00381-4/MediaObjects/43153_2023_381_Fig10_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs43153-023-00381-4/MediaObjects/43153_2023_381_Fig11_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs43153-023-00381-4/MediaObjects/43153_2023_381_Fig12_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs43153-023-00381-4/MediaObjects/43153_2023_381_Fig13_HTML.png)
Acknowledgements
The authors thank Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Fundação de Amparo à Pesquisa no Rio de Janeiro (FAPERJ), and Universidade do Estado do Rio de Janeiro (Programa Pró-Ciência) for their financial support. ASL holds research scholarships from UERJ (Programa Pró-Ciência), FAPERJ (Cientista de Nosso Estado), and CNPq.
About this article
Cite this article
Guedes, R.E., Torres, A.R. & Luna, A.S. Modeling and optimization of the prediction of bio-oil yield using generalized approach with different biomass and reactor types. Braz. J. Chem. Eng. (2023). https://doi.org/10.1007/s43153-023-00381-4
DOI: https://doi.org/10.1007/s43153-023-00381-4