Abstract
A graph-based method for assessing text coherence is proposed; it analyzes the semantic, grammatical, and lexical consistency of the phrases within sentences. The method was evaluated experimentally on English-language corpora. The resulting metrics indicate that it outperforms other modern approaches. The method can be applied to other languages by replacing the linguistic models with ones suited to the target language.
Additional information
Translated from Kibernetika i Sistemnyi Analiz, No. 6, November–December, 2020, pp. 38–45.
About this article
Cite this article
Pogorilyy, S.D., Kramov, A.A. Assessment of Text Coherence by Constructing the Graph of Semantic, Lexical, and Grammatical Consistancy of Phrases of Sentences. Cybern Syst Anal 56, 893–899 (2020). https://doi.org/10.1007/s10559-020-00309-7
DOI: https://doi.org/10.1007/s10559-020-00309-7