A Hybrid Approach for Arabic Text Summarization Using Domain Knowledge and Genetic Algorithms

Al-Radaideh, Qasem A.; Bataineh, Dareen Q.

doi:10.1007/s12559-018-9547-z

Published: 24 March 2018

Volume 10, pages 651–669 (2018)

Cognitive Computation

Qasem A. Al-Radaideh¹ &
Dareen Q. Bataineh¹

Abstract

Text summarization is the process of producing a shorter version of a given text. Automatic summarization techniques have been applied to various domains, such as the medical, political, news, and legal domains, showing that adapting domain-relevant features can improve summarization performance. Although there is plenty of research on domain-based summarization in English and other languages, such work is lacking in Arabic because of the shortage of existing knowledge bases. In this paper, a hybrid, single-document text summarization approach (abbreviated as ASDKGA) is presented. The approach incorporates domain knowledge, statistical features, and genetic algorithms to extract the important points of Arabic political documents. The ASDKGA approach is tested on two corpora: the KALIMAT corpus and the Essex Arabic Summaries Corpus (EASC). The Recall-Oriented Understudy for Gisting Evaluation (ROUGE) framework is used to compare the summaries generated automatically by the ASDKGA approach with summaries generated by humans. The approach is also compared against three other Arabic text summarization approaches. The ASDKGA approach demonstrated promising results when summarizing Arabic political documents, with an average F-measure of 0.605 at a compression ratio of 40%.
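To make the reported F-measure concrete, the following is a minimal sketch of a ROUGE-1 style score: unigram overlap between a system summary and a human summary, combined into precision, recall, and their harmonic mean. It is an illustration only, not the authors' evaluation setup. The function name `rouge_1`, the whitespace tokenization, and the absence of stemming or stop-word handling are all assumptions made here; the paper may use other ROUGE variants and preprocessing.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Unigram-overlap precision, recall, and F-measure (ROUGE-1 style).

    Assumption: tokens are split on whitespace, with no stemming or
    normalization. Real evaluations of Arabic text usually add both.
    """
    cand_counts = Counter(candidate.split())
    ref_counts = Counter(reference.split())

    # Clipped overlap: a word is matched at most as many times as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)

    return {"precision": precision, "recall": recall, "f_measure": f_measure}


if __name__ == "__main__":
    # Toy strings for illustration; not taken from KALIMAT or EASC.
    scores = rouge_1("a b c d", "a b e")
    print(scores)
```

In this toy case two of the four candidate words and two of the three reference words match, so precision and recall differ and the F-measure falls between them. An average F-measure such as the 0.605 quoted above would be the mean of per-document scores of this kind over a test corpus.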

References

Lloret E, Palomar M. Text summarization in progress: a literature review. Artif Intell Rev. 2010;37(1):1–41.
Radev D, Hovy E, McKeown K. Introduction to the special issue on summarization. Comput linguist. 2002;28(4):399–408.
Ježek, K. and Steinberger, J. Automatic text summarization (the state of the Art 2007 and new challenges). In: the conference Znalosti, Bratislava, Slovakia 2008; p 1–12.
Saggion H. Automatic summarization: an overview. Rev Fr Linguist Appl. 2008;13(1):63–81.
Luhn H. The automatic creation of literature abstracts. IBM J Res Dev. 1958;2(2):159–65.
Reeve L, Han H, Brooks A. The use of domain-specific concepts in biomedical text summarization. Inf Process Manag. 2007;43(6):1765–76.
Chen Y, Foong O, Yong S, Kurniawan I. Text summarization for oil and gas drilling topic. Int J Comput Electr Autom Control Inf Eng World Acad Sci Technol. 2008;2(6):1799–802.
Yeh J, Ke H, Yang W, Meng I. Text summarization using a trainable summarizer and latent semantic analysis. Inf Process Manag. 2005;41(1):75–95.
Moens, M., Uyttendaele, C., and Dumortier, J. Abstracting of legal cases: the SALOMON experience. In: the 6th International Conference on Artificial Intelligence and Law (ICAIL97), Melbourne, Australia. 1997; p 114–122.
De Hollander, G. and Marx, M. Summarization of meetings using word clouds. In: the Computer Science and Software Engineering (CSSE) CSI International Symposium, Tehran 2011; p 54–61.
Summers, E. and Stephens, K. Politwitics: summarization of political tweets. 2012. Retrieved Mar. 10, 2015 from the World Wide Web: http://bid.berkeley.edu/cs294-1-spring13/images/3/34/Politwitics_report.pdf.
Chong L, Chen Y. Text summarization for oil and gas news article. Int J Comput Electr Autom Control Inf Eng World Acad Sci Technol. 2009;3(5):1282–5.
Sarkar K. Using domain knowledge for text summarization in medical domain. Int J Recent Trends Eng. 2009;1(1):200–5.
Imam I, Hamouda A, Khalek H. An ontology-based summarization system for Arabic documents (OSSAD). Int J Comput Appl. 2013;74(17):38–43.
Jr S, Pappa C, Freitas A, Kaestner C. Automatic text summarization with genetic algorithm-based attribute selection. Adv Artif Intell–IBERAMIA Springer. 2004:305–14.
Qazvinian V, Hassanabadi L, Halavati R. Summarising text with a genetic algorithm-based sentence extraction. Int J Knowl Manag Stud. 2008;2(4):426–44.
Fattah M, Ren F. Automatic text summarization. Int J Comput Electr Autom Control Inf Eng. 2008;2(1):90–3.
Litvak M, Last M, Friedman M. A new approach to improving multilingual summarization using genetic algorithms. In: The 48th Annual Meeting of the Association for Computational Linguistics, Uppsala, Sweden; 2010. p. 927–36.
Nandhini K, Balasundaram S. Use of genetic algorithms for cohesive summary extraction to assist reading difficulties. Appl Comput Intell Soft Comput. 2013;2013:1–11.
Hammo B, Abu-Salem H, Evens M. A hybrid Arabic text summarization technique based on text structure and topic identification. Int J Comput Process Lang. 2011;23(01):39–65.
Al-Omour M. Extractive-based Arabic text summarization approach. M.Sc Thesis: Department of Computer Science, Yarmouk University, Irbid, Jordan; 2012.
Ibrahim A, Elghazaly T, Gheith M. A novel Arabic text summarization model based on rhetorical structure theory and vector space model. Int J Comput Linguist Nat Lang Process. 2013;2(8):480–4.
Douzidia, F. and Lapalme, G. Lakhas, an Arabic summarization system. In: the Document Understanding Conference (DUC), Boston, USA. 2004; p 128–135.
Bawakid, A., and Oussalah, M. A semantic summarization system: the University of Birmingham at TAC 2008. In: the first text analysis conference (TAC), Maryland, USA 2008; p 1–6.
Al-Radaideh Q, Afif M. Arabic text summarization using aggregate similarity. In: The international Arab Conference on Information Technology (ACIT’2009). Yemen; 2009. p. 1–8.
Sobh I. An optimized dual classification system for Arabic extractive generic text summarization. M.Sc Thesis: Department of Computer Engineering, Cairo University, Giza, Egypt; 2009.
Hamodeh, A. and Mousa, M. Automatic system for summarizing Arabic comments on social media networks. Al-Majala Al-Dawlia Lelitesalat, Al-Jameia Al-Arabia Lelhasibat. Special Issue. 2013; p 44–56. (In Arabic).
Al-Taani, A. and Al-Rousan, S. Arabic multi-document text summarization. In: the 17th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2016), Turkey 2016.
Oufaida H, Nouali O, Blache P. Minimum redundancy and maximum relevance for single and multi-document Arabic text summarization. J King Saud Univ-Comput Inf Sci. 2014;26(4):450–61.
Al-Khawaldeh F, Samawi V. Lexical cohesion and entailment-based segmentation for Arabic text summarization (LCEAS). World Comput Sci Inf Technol J (WCSIT). 2015;5(03):51–60.
Tran HN, Cambria E, Hussain A. Towards GPU-based common-sense reasoning: using fast subgraph matching. Cogn Comput. 2016;8(6):1074–86.
Xia Y, Cambria E, Hussain A, Zhao H. Word polarity disambiguation using Bayesian model and opinion-level features. Cogn Comput. 2015;7(3):369–80.
Li Y, Pan Q, Yang T, Wang S, Tang J, Cambria E. Learning word representations for sentiment analysis. Cogn Comput. 2017;9(6):843–51.
Al-Radaideh Q, Gh A-Q. Application of rough set-based feature selection for Arabic sentiment analysis. Cogn Comput. 2017;9(4):436–445.
Recupero D, Presutti V, Consoli S, Gangemi A, Nuzzolese A. Sentilo: frame-based sentiment analysis. Cogn Comput. 2015;7(2):211–25.
Dashtipour K, Poria S, Hussain A, Cambria E, Hawalah A, Gelbukh A, et al. Multilingual sentiment analysis: state-of-the-art and independent comparison of techniques. Cogn Comput. 2016;8:757–71.
Mukhtar N, Khan MA, Chiragh N. Effective use of evaluation measures for the validation of best classifier in Urdu sentiment analysis. Cogn Comput. 2017;9(4):446–56.
Lo SL, Cambria E, Chiong R, Cornforth D. Multilingual sentiment analysis: from formal to informal and scarce resource languages. Artif Intell Rev. 2017;48(4):499–527.
Duwairi R, El-Orfali M. A study of the effects of preprocessing strategies on sentiment analysis for Arabic text. J Inf Sci. 2014;40(4):501–13.
El-Khair I. Effects of stop words elimination for Arabic information retrieval: a comparative study. Int J Comput Inf Sci. 2006;4(3):119–33.
Green, S. and Manning, C. Better Arabic parsing: baselines, evaluations, and analysis. In: the 23rd International Conference on Computational Linguistics (COLING), Beijing, China. 2010; p 394–402.
Mustafa S. Word stemming for Arabic information retrieval: the case for simple light stemming. Abhath Al-Yarmouk: Sci Eng Ser. 2012;21(1):123–44.
Singh J, Gupta V. An efficient corpus-based stemmer. Cogn Comput. 2017;9(5):671–88.
Edmundson H. New methods in automatic extracting. J Assoc Comput Mach. 1969;16(2):264–85.
Perumal K, Chaudhuri B. Language independent sentence extraction based text summarization. In: The 9th international conference on natural language processing (ICON), Chennai, India; 2011. p. 213–7.
Kumar Y, Salim N. Automatic multi document summarization approaches. J Comput Sci. 2011;8(1):133–40.
Gupta V, Lehal G. A Survey of text summarization extractive techniques. J Emerg Technol Web Intell. 2010;2(3):258–68.
Miller B, Goldberg D. Genetic algorithms, tournament selection, and the effects of noise. Complex Syst. 1995;9(3):193–212.
El-Haj, M. and Koulali, R. KALIMAT: a multipurpose Arabic corpus. In the Second Workshop on Arabic Corpus Linguistics, Lancaster University, UK. 2011b; p 22–25. http://sourceforge.net/projects/kalimat/.
El-Haj M., Kruschwitz U., and Fox C. Using mechanical Turk to create a corpus of Arabic summaries. In: The 7th international language resources and evaluation conference (LREC), Valletta, Malta. 2010; p 36–39.
Lin, C. ROUGE: a package for automatic evaluation of summaries. In: the ACL Workshop on Text Summarization Branches out, Barcelona, Spain. 2004; p 74–81.
El-Haj M, Kruschwitz U, Fox C. Experimenting with automatic text summarisation for Arabic. Hum Lang Technol Chall Comput Sci Linguist Springer. 2011a:490–9.

Author information

Authors and Affiliations

Faculty of Information Technology and Computer Sciences, Yarmouk University, Irbid, Jordan
Qasem A. Al-Radaideh & Dareen Q. Bataineh

Corresponding author

Correspondence to Qasem A. Al-Radaideh.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Informed Consent

All procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Helsinki declaration of 1975, as revised in 2008 [15].

Human and Animal Rights

This article does not contain any studies with human or animal subjects performed by any of the authors.

About this article

Cite this article

Al-Radaideh, Q.A., Bataineh, D.Q. A Hybrid Approach for Arabic Text Summarization Using Domain Knowledge and Genetic Algorithms. Cogn Comput 10, 651–669 (2018). https://doi.org/10.1007/s12559-018-9547-z

Received: 16 June 2017
Accepted: 29 January 2018
Published: 24 March 2018
Issue Date: August 2018
DOI: https://doi.org/10.1007/s12559-018-9547-z
