Abstract
A combination of the smoothly clipped absolute deviation (SCAD) method and an artificial neural network (ANN) was used as a novel methodology (SCAD-ANN) for quantitative structure-retention index relationship (QSRR) modeling. The SCAD method reduces the dimension of the data before the robust ANN modeling method is applied. The efficiency of the SCAD-ANN method was evaluated by constructing QSRR models between the most relevant molecular descriptors (MDs) and the retention indices (RIs) of two sets of volatile organic compounds. The SCAD method was applied to the training data, and the effective MDs were selected at the λ value with the lowest cross-validation error (λmin) and used as the inputs to the ANN. All ANN parameters were optimized simultaneously. The statistical parameters computed for the constructed QSRR models have acceptable values. In addition, the applicability domain analysis reveals that more than 95% of the data are within the confidence range, indicating that the predictions of the SCAD-ANN models are reliable.
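To make the variable selection step concrete, the following is a minimal Python sketch, not the authors' code. It shows the standard SCAD penalty for a single coefficient (with the commonly used shape parameter a = 3.7), the choice of λmin as the grid value with the lowest cross-validation error, and the retention of descriptors with nonzero coefficients. The function names, the tolerance argument, and the descriptor names in the demo are hypothetical and chosen only for illustration; fitting the penalized regression and the ANN is outside the scope of this sketch.

```python
from typing import List, Sequence


def scad_penalty(beta: float, lam: float, a: float = 3.7) -> float:
    """Standard SCAD penalty for one coefficient (requires lam >= 0, a > 2)."""
    if lam < 0 or a <= 2:
        raise ValueError("SCAD requires lam >= 0 and a > 2")
    b = abs(beta)
    if b <= lam:
        # Small coefficients: same linear penalty as the lasso.
        return lam * b
    if b <= a * lam:
        # Intermediate coefficients: quadratic transition region.
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


def select_lambda_min(lambdas: Sequence[float], cv_errors: Sequence[float]) -> float:
    """Return the lambda whose cross-validation error is lowest."""
    if len(lambdas) == 0 or len(lambdas) != len(cv_errors):
        raise ValueError("lambdas and cv_errors must be non-empty and equal in length")
    best = min(range(len(lambdas)), key=lambda i: cv_errors[i])
    return lambdas[best]


def selected_descriptors(names: Sequence[str], coefs: Sequence[float],
                         tol: float = 0.0) -> List[str]:
    """Keep descriptors whose fitted coefficient is nonzero (|coef| > tol)."""
    return [n for n, c in zip(names, coefs) if abs(c) > tol]


if __name__ == "__main__":
    # Hypothetical lambda grid and cross-validation errors, for illustration only.
    lam_min = select_lambda_min([0.1, 0.5, 1.0], [0.30, 0.20, 0.40])
    # Hypothetical descriptor names and coefficients fitted at lam_min.
    inputs = selected_descriptors(["MD1", "MD2", "MD3"], [0.8, 0.0, -1.2])
    print(lam_min, inputs)
```

The descriptors returned by `selected_descriptors` would then serve as the ANN inputs. Because the penalty becomes constant beyond a·λ, large coefficients are not shrunk further, which is the usual motivation for preferring SCAD over the lasso penalty.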
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13738-021-02488-2/MediaObjects/13738_2021_2488_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13738-021-02488-2/MediaObjects/13738_2021_2488_Fig2_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13738-021-02488-2/MediaObjects/13738_2021_2488_Fig3_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13738-021-02488-2/MediaObjects/13738_2021_2488_Fig4_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13738-021-02488-2/MediaObjects/13738_2021_2488_Fig5_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs13738-021-02488-2/MediaObjects/13738_2021_2488_Fig6_HTML.png)
Availability of data and material
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Code availability
Code for data cleaning and analysis is provided as part of the replication package. It is available at "r-project.org" for review.
Acknowledgements
The authors are thankful to the Shahrood University of Technology Research Council for supporting this work.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Author information
Contributions
ZM: Methodology, Software, Writing—original draft, Investigation, Writing—review and editing. MAC: Supervision, Writing—review and editing, Data curation. MA: Methodology, Software, Validation, Writing—review and editing. NG: Review and editing.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
About this article
Cite this article
Mozafari, Z., Arab Chamjangali, M., Arashi, M. et al. QSRR models for predicting the retention indices of VOCs in different datasets using an efficient variable selection method coupled with artificial neural network modeling: ANN-based QSPR modeling. J IRAN CHEM SOC 19, 2617–2630 (2022). https://doi.org/10.1007/s13738-021-02488-2
DOI: https://doi.org/10.1007/s13738-021-02488-2