Item Response Theory Based Ensemble in Machine Learning

Chen, Ziheng; Ahn, Hongshik

doi:10.1007/s11633-020-1239-y

Research Article
Published: 09 September 2020

Volume 17, pages 621–636 (2020)

International Journal of Automation and Computing

Abstract

In this article, we propose a novel probabilistic framework to improve the accuracy of a weighted majority voting algorithm. To assign higher weights to classifiers that correctly classify hard-to-classify instances, we use the item response theory (IRT) framework to evaluate the samples' difficulty and the classifiers' ability simultaneously. We then assign weights to the classifiers based on their abilities. Three models are created under different assumptions, each suitable for different cases. When making inferences, we balance accuracy against complexity. In our experiments, all base models are single trees constructed via bootstrap. To explain the models, we illustrate how the IRT ensemble model constructs the classification boundary. We also compare the models' performance with other widely used methods and show that our approach performs well on 19 datasets.
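The weighting idea described above can be illustrated with a minimal sketch. It is not the IRT model proposed in the article. As a stated assumption, instance difficulty is approximated here by the fraction of classifiers that misclassify the instance, and classifier ability by the total difficulty of the instances it classifies correctly. The function names `difficulty_weighted_ability` and `weighted_majority_vote` are hypothetical.

```python
from collections import defaultdict


def difficulty_weighted_ability(correct):
    """Crude stand-in for IRT ability (an assumption, not the paper's model).

    correct[j][i] is True if classifier j classified training instance i correctly.
    Difficulty of instance i = fraction of classifiers that got it wrong.
    Ability of classifier j = sum of difficulties of the instances it got right.
    """
    n_clf = len(correct)
    n_inst = len(correct[0])
    difficulty = [
        sum(not correct[j][i] for j in range(n_clf)) / n_clf
        for i in range(n_inst)
    ]
    return [sum(d for d, ok in zip(difficulty, row) if ok) for row in correct]


def weighted_majority_vote(predictions, weights):
    """Return the label whose supporting classifiers have the largest total weight."""
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have the same length")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # On a tie, the label that was seen first wins.
    return max(totals, key=totals.get)


# Three classifiers evaluated on four training instances (toy data).
correct = [
    [True, True, True, False],    # classifier 0 solves two hard instances
    [True, False, False, True],   # classifier 1 solves one hard instance
    [True, False, False, False],  # classifier 2 solves only the easy instance
]
abilities = difficulty_weighted_ability(correct)

# Classifiers 1 and 2 outnumber classifier 0, but classifier 0 carries more weight.
label = weighted_majority_vote(["a", "b", "b"], abilities)
```

In this toy example an unweighted majority vote would pick "b", whereas the ability-weighted vote follows the single classifier that handled the harder instances. A fitted IRT model would replace the proxy in `difficulty_weighted_ability` with estimated ability parameters.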

Author information

Authors and Affiliations

Department of Applied Mathematics and Statistics, Stony Brook University, New York, 11794-3600, USA
Ziheng Chen & Hongshik Ahn

Corresponding author

Correspondence to Ziheng Chen.

Additional information

Recommended by Associate Editor Matjaz Gams

Ziheng Chen received the B.Sc. degree in statistics from Renmin University of China, China in 2016. He is currently a Ph.D. candidate in the Department of Applied Mathematics and Statistics, Stony Brook University, USA.

His research interests include reinforcement learning, recommender systems, tree-structured models and ensemble learning theory.

Hongshik Ahn received the B.Sc. degree in mathematics from Seoul National University, South Korea, and the Ph.D. degree in statistics from the University of Wisconsin-Madison, USA in 1992. From 1992 to 1996, he was a mathematical statistician at the National Center for Toxicological Research, U.S. Food and Drug Administration. He has been a faculty member in the Department of Applied Mathematics and Statistics at Stony Brook University, USA since 1996, where he is currently a professor. He served as the first Vice President of SUNY Korea for two years from 2012. He has published 2 books, 3 book chapters, over 70 papers in peer-reviewed journals, and 25 conference papers.

His research interests include classification of high-dimensional data, tree-structured regression modeling, survival analysis, and multi-step batch testing for infectious diseases.

About this article

Cite this article

Chen, Z., Ahn, H. Item Response Theory Based Ensemble in Machine Learning. Int. J. Autom. Comput. 17, 621–636 (2020). https://doi.org/10.1007/s11633-020-1239-y

Received: 16 February 2020
Accepted: 04 June 2020
Published: 09 September 2020
Issue Date: October 2020
DOI: https://doi.org/10.1007/s11633-020-1239-y
