Abstract
Natural language processing in specific domains such as financial markets requires knowledge of the domain ontology. Developing a domain-specific lexicon is therefore a worthwhile way to improve sentiment analysis in financial contexts. In this paper, a hybrid approach that combines exploration of a large related corpus with existing lexical resources is proposed to build a lexicon specialized for financial market sentiment analysis. The lexicon is applied to a large dataset gathered from Twitter over nine months. Experimental results demonstrate a significant correlation between the sentiments extracted from the corpus and market trends, which indicates that the lexicon measures market sentiment more effectively than general-purpose dictionaries.
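The evaluation idea in the abstract, scoring tweets with a lexicon and comparing the aggregated sentiment with market movement, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the scoring or aggregation scheme, so the averaging rule, the use of Pearson correlation, the toy lexicon entries, the example tweets, and the return values below are all assumptions made for illustration.

```python
import math
import re

# Hypothetical toy lexicon: term -> polarity score. These entries are
# illustrative only and are not taken from the paper's lexicon.
TOY_LEXICON = {
    "bullish": 1.0,
    "rally": 0.8,
    "gain": 0.5,
    "bearish": -1.0,
    "selloff": -0.8,
    "loss": -0.5,
}


def score_text(text, lexicon):
    """Average polarity of the lexicon terms found in text (0.0 if none)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    hits = [lexicon[t] for t in tokens if t in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


def daily_sentiment(tweets_by_day, lexicon):
    """Mean tweet score per day; a day without tweets scores 0.0."""
    return [
        sum(score_text(t, lexicon) for t in tweets) / len(tweets) if tweets else 0.0
        for tweets in tweets_by_day
    ]


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Made-up example data: three days of tweets and matching daily returns.
tweets_by_day = [
    ["Bullish on EURUSD, strong rally ahead", "Nice gain today"],
    ["Bearish outlook, selloff continues"],
    ["Market flat, small loss", "Rally fades"],
]
returns = [0.012, -0.015, 0.001]

sentiment = daily_sentiment(tweets_by_day, TOY_LEXICON)
correlation = pearson(sentiment, returns)
print(sentiment, correlation)
```

With a real lexicon, the same comparison could be repeated with a general-purpose dictionary in place of `TOY_LEXICON` to contrast the two correlations, which is the kind of comparison the abstract describes.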
Availability of data and material
The data collected and used during the research are available upon request. All data gathered from other sources were publicly available when the research was conducted.
Code availability
The code was written in Python and is available upon request.
About this article
Cite this article
Yekrangi, M., Abdolvand, N. Financial markets sentiment analysis: developing a specialized Lexicon. J Intell Inf Syst 57, 127–146 (2021). https://doi.org/10.1007/s10844-020-00630-9
DOI: https://doi.org/10.1007/s10844-020-00630-9