Examining the Bug Prediction Capabilities of Primitive Obsession Metrics

  • Conference paper
  • First Online:
Computational Science and Its Applications – ICCSA 2021 (ICCSA 2021)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12955)


Abstract

Bug prediction is an approach that helps automate bug detection during software development. A prediction model is built on a bug dataset to locate future bugs. Bug datasets contain information about previous defects in the code, together with process metrics, source code metrics, or other data. As code smells can indicate potential flaws in the source code, they can be used for bug prediction as well.

In our previous work, we introduced several source code metrics to detect and describe occurrences of Primitive Obsession in Java. This paper is a further study of three of these Primitive Obsession metrics. We integrated them into an existing, source code metrics-based bug dataset and studied the effectiveness of the prediction models built upon it. We performed a 10-fold cross-validation on the whole dataset as well as a cross-project validation, and compared the new models with the results obtained on the original dataset. While the cross-validation showed no significant change, in the case of the cross-project validation we found that the amount of improvement exceeded the amount of deterioration by 5%. Furthermore, correlation and PCA calculations confirmed the variance the new metrics added to the dataset.
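For readers who want to experiment with a comparable setup, the sketch below illustrates the two evaluation settings described above: 10-fold cross-validation over the whole dataset and cross-project (leave-one-project-out) validation. It is not the paper's actual pipeline (the authors used Weka); the scikit-learn stack, the random forest learner, the file name, and the column names `buggy` and `project` are assumptions for illustration.

```python
# Hypothetical sketch of the two evaluation settings described in the
# abstract; the learner, file name, and column names are assumptions,
# not the paper's actual Weka setup.
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import (LeaveOneGroupOut, StratifiedKFold,
                                     cross_val_score)

# Class-level source code metrics extended with the three Primitive
# Obsession metrics, plus a binary bug label and a project identifier.
df = pd.read_csv("bug_dataset_with_po_metrics.csv")   # assumed file
X = df.drop(columns=["buggy", "project"])
y = df["buggy"]

clf = RandomForestClassifier(n_estimators=100, random_state=42)

# Setting 1: 10-fold cross-validation over the whole dataset.
cv10 = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
f1_cv = cross_val_score(clf, X, y, cv=cv10, scoring="f1_weighted")
print(f"10-fold CV weighted F-measure: {f1_cv.mean():.3f}")

# Setting 2: cross-project validation (train on all projects but one,
# test on the held-out project).
logo = LeaveOneGroupOut()
f1_cp = cross_val_score(clf, X, y, groups=df["project"],
                        cv=logo, scoring="f1_weighted")
print(f"cross-project weighted F-measure: {f1_cp.mean():.3f}")

# Variance check in the spirit of the abstract's PCA calculation:
# how much variance the leading components capture (assumes >= 10
# metric columns in the dataset).
pca = PCA(n_components=10).fit(X)
print("explained variance ratios:", pca.explained_variance_ratio_)
```

Stratified folds keep the buggy/clean ratio stable across folds, which matters because bug datasets are typically imbalanced.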

This work was partially supported by grant 2018-1.2.1-NKP-2018-00004 “Security Enhancing Technologies for the IoT” funded by the Hungarian National Research, Development and Innovation Office. It was also supported by the European Union, co-financed by the European Social Fund (EFOP-3.6.3-VEKOP-16-2017-00002). The research was supported by the Ministry of Innovation and Technology NRDI Office within the framework of the Artificial Intelligence National Laboratory Program (MILAB).


Notes

  1. https://github.com/stevenalowe/kata-2-tinytypes.

  2. Precision can be calculated as the number of true positive elements divided by the sum of the true and false positive elements. It describes what portion of the identifications was actually correct. (A runnable sketch of the measures in notes 2–7 follows these notes.)

  3. Recall can be calculated as the number of true positive elements divided by the sum of the true positive and false negative elements. It describes what portion of the actually relevant elements was identified correctly.

  4. The F-measure is the harmonic mean of precision and recall.

  5. The ROC curve displays the performance of a classification model at all classification thresholds. The area under the ROC curve (AUC) is an aggregated measure of this performance across all classification thresholds.

  6. Weka calculates the Weighted Avg. F-Measure for classes n and y with the following formula: \(Weighted\ Avg.\ F\text{-}measure = \frac{F\text{-}measure(n) \cdot NumOfInstances(n) + F\text{-}measure(y) \cdot NumOfInstances(y)}{NumOfInstances(n) + NumOfInstances(y)}\), where NumOfInstances(n) and NumOfInstances(y) denote the number of instances in the given class.

  7. Correlation shows the degree to which a pair of variables is linearly related. We used the Pearson correlation for our experiment.
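As a concrete companion to notes 2–7, the toy sketch below computes the same measures with scikit-learn and SciPy. All labels, scores, and values are invented for illustration and are not results from the paper.

```python
# Toy illustration of the evaluation measures defined in the notes above;
# the labels and scores are invented, not data from the paper.
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import (precision_score, recall_score, f1_score,
                             roc_auc_score)

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])    # actual buggy (1) / clean (0)
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])    # model's hard predictions
y_score = np.array([.9, .2, .4, .8, .1, .6, .7, .3])  # predicted probabilities

print("precision: ", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("recall:    ", recall_score(y_true, y_pred))     # TP / (TP + FN)
print("F-measure: ", f1_score(y_true, y_pred))         # harmonic mean
# Weka-style Weighted Avg. F-Measure: per-class F weighted by class size.
print("weighted F:", f1_score(y_true, y_pred, average="weighted"))
# AUC aggregates performance over all classification thresholds.
print("AUC:       ", roc_auc_score(y_true, y_score))

# Pearson correlation between two metrics (note 7); values invented.
metric_a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
metric_b = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
r, p = pearsonr(metric_a, metric_b)
print(f"Pearson r = {r:.3f} (p = {p:.3f})")
```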


Author information


Corresponding author

Correspondence to Edit Pengő.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Pengő, E. (2021). Examining the Bug Prediction Capabilities of Primitive Obsession Metrics. In: Gervasi, O., et al. (eds.) Computational Science and Its Applications – ICCSA 2021. Lecture Notes in Computer Science, vol. 12955. Springer, Cham. https://doi.org/10.1007/978-3-030-87007-2_14


  • DOI: https://doi.org/10.1007/978-3-030-87007-2_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-87006-5

  • Online ISBN: 978-3-030-87007-2

  • eBook Packages: Computer Science, Computer Science (R0)
