Abstract
Recently, metaheuristic-based algorithms have shown good results in automatic multi-document summarization. This paper proposes two algorithms, LexGbhs and GbhsLex, that hybridize the Global-Best Harmony Search metaheuristic with the graph-based LexRank algorithm. The objective function to be optimized combines two features, coverage and diversity. Coverage measures the similarity between each sentence of the candidate summary and the centroid of the sentences in the document collection, while diversity measures how different the sentences that make up a candidate summary are from one another. The two proposed hybrid algorithms were compared with state-of-the-art algorithms using the ROUGE-1, ROUGE-2 and ROUGE-SU4 measures on the DUC2005 and DUC2006 data sets. In a unified ranking across these results, the proposed LexGbhs algorithm placed third, suggesting that hybridizing metaheuristics with graph-based methods for extractive multi-document summarization is a promising line of research.
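The coverage and diversity features described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the whitespace tokenization, the raw term-frequency vectors, the use of cosine similarity, the averaging, and the `weight` parameter that combines the two features are all assumptions made for illustration, and every function name here is hypothetical.

```python
import math
from collections import Counter


def tf_vector(sentence):
    """Term-frequency vector from lowercased whitespace tokens (assumed)."""
    return Counter(sentence.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of the sentence vectors of the collection."""
    total = Counter()
    for v in vectors:
        total.update(v)
    n = len(vectors)
    return {t: c / n for t, c in total.items()}


def coverage(summary_vecs, collection_centroid):
    """Mean similarity of each summary sentence to the collection centroid."""
    if not summary_vecs:
        return 0.0
    sims = [cosine(v, collection_centroid) for v in summary_vecs]
    return sum(sims) / len(sims)


def diversity(summary_vecs):
    """Mean pairwise dissimilarity (1 - cosine) among summary sentences."""
    n = len(summary_vecs)
    if n < 2:
        return 0.0
    dissims = [
        1.0 - cosine(summary_vecs[i], summary_vecs[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return sum(dissims) / len(dissims)


def objective(summary_vecs, collection_centroid, weight=0.5):
    """Weighted sum of coverage and diversity; the weighting is an assumption."""
    cov = coverage(summary_vecs, collection_centroid)
    div = diversity(summary_vecs)
    return weight * cov + (1.0 - weight) * div
```

Under these assumptions, a candidate summary that repeats the same sentence scores near zero on diversity, while a summary of sentences with no shared terms scores 1.0, so a search procedure maximizing `objective` is pushed toward summaries that are both central to the collection and non-redundant.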
Notes
1. WordNet is a lexico-conceptual database of the English language, structured as a semantic network of lexical units and the relationships between them.
2. Prestige and centrality represent the same concept in this proposal; the difference is that the former is usually defined for directed graphs and the latter for undirected graphs.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Cuéllar, C., Mendoza, M., Cobos, C. (2018). Automatic Generation of Multi-document Summaries Based on the Global-Best Harmony Search Metaheuristic and the LexRank Graph-Based Algorithm. In: Castro, F., Miranda-Jiménez, S., González-Mendoza, M. (eds) Advances in Computational Intelligence. MICAI 2017. Lecture Notes in Computer Science, vol 10633. Springer, Cham. https://doi.org/10.1007/978-3-030-02840-4_7
DOI: https://doi.org/10.1007/978-3-030-02840-4_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02839-8
Online ISBN: 978-3-030-02840-4
eBook Packages: Computer Science (R0)