Improving the quality of software issue report descriptions in Turkish: An industrial case study at Softtech

Empirical Software Engineering

Abstract

Issue reports are an important part of the software development process. They help developers identify and fix problems in their code. However, the problems described in these reports often lack important information, such as the Observed Behavior (OB), Expected Behavior (EB), and Steps to Reproduce (S2R). As a result, valuable developer time is wasted gathering the missing information. This study aims to address this issue by developing a tool that guides reporters in providing the necessary information in an industrial setting. The study is conducted at Softtech, a software subsidiary of the largest private bank in Turkey. The proposed approach is developed specifically for issue reports written in Turkish. It is motivated by the need for issue report classification tools that can handle the unique characteristics of the Turkish language, such as the presence of many compound words. We first manually analyze and label 1,041 issue reports for the existence of OB, S2R, and EB, and then present the specific patterns we found describing the related information. Next, we use morphological analysis to extract keywords and suffixes, and then use them for classification with a machine learning-based approach. In addition, we conduct a feasibility study to assess the potential of using large language models for issue report classification tasks as a direction for future research. The results indicate that the tool using the machine learning-based approach can guide reporters in improving the quality of issue reports at Softtech, thereby saving valuable developer time.
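To make the idea of keyword and suffix cues concrete, the following is a minimal sketch only, not the tool described in the study. It assumes a hand-written, hypothetical cue table (`CUES`) and simple suffix matching in place of a real morphological analyzer, and a rule-based check in place of the trained classifier. The function names `detect_components` and `missing_components` are likewise illustrative.

```python
import re

# Hypothetical cue lists, chosen for illustration only. They are NOT the
# patterns reported in the study, and plain suffix matching is a crude
# stand-in for real morphological analysis.
CUES = {
    "OB": {"keywords": {"hata", "sorun"},
           "suffixes": ("mıyor", "miyor", "muyor", "müyor")},
    "EB": {"keywords": {"gerekir", "beklenir"},
           "suffixes": ("malı", "meli", "malıdır", "melidir")},
    "S2R": {"keywords": {"adımlar", "sonra"},
            "suffixes": ("dığında", "diğinde")},
}


def tokenize(text):
    # str.lower() does not apply Turkish casing rules (I/ı, İ/i), so this
    # sketch assumes the input is already lowercase.
    return re.findall(r"\w+", text.lower())


def detect_components(text):
    """Return {label: bool} saying whether any cue for that label occurs."""
    tokens = tokenize(text)
    return {
        label: any(
            tok in cues["keywords"] or tok.endswith(cues["suffixes"])
            for tok in tokens
        )
        for label, cues in CUES.items()
    }


def missing_components(text):
    """List the labels (OB, EB, S2R) for which no cue was found."""
    return [label for label, present in detect_components(text).items()
            if not present]


if __name__ == "__main__":
    report = "butona tıklandığında ekran açılmıyor"
    print(detect_components(report))
    print(missing_components(report))
```

A reporting form could call `missing_components` on the draft description and prompt the reporter for whatever is absent. In a setting like the one the abstract describes, the boolean cues would more plausibly serve as input features to a trained classifier than as a final decision.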

Data Availability

Due to commercial and legal restrictions, supporting data is not available.

Notes

  1. https://softtech.com.tr

  2. https://www.isbank.com.tr

  3. http://coltekin.net/cagri/trmorph/

  4. https://github.com/ahmetaa/zemberek-nlp

  5. https://github.com/ethemutku/issueReportQuality

  6. https://figshare.com/projects/issueReportQuality/186061

  7. https://chat.openai.com

Author information

Corresponding author

Correspondence to Ethem Utku Aktas.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by: Bibi Stamatia, Bowen Xu, Xiaofei Xie, Maxime Cordy

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Machine Learning Techniques for Software Quality Evaluation (MaLTeSQuE).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Aktas, E.U., Cakmak, E., Inan, M.C. et al. Improving the quality of software issue report descriptions in Turkish: An industrial case study at Softtech. Empir Software Eng 29, 43 (2024). https://doi.org/10.1007/s10664-023-10434-4

  • DOI: https://doi.org/10.1007/s10664-023-10434-4
