Hadoop Framework for Entity Recognition Within High Velocity Streams Using Deep Learning

  • Conference paper
Data Engineering and Intelligent Computing

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 542)

Abstract

Social media platforms such as Twitter and Facebook are sources of stream data. They generate unstructured formal text on various topics, containing emotions expressed about persons, organizations, locations, movies, etc. Such stream data is characterized by high velocity and volume, and is often incomplete, incorrect, cryptic and noisy. In our earlier work, a Hadoop framework was proposed for recognising and resolving entities within semi-structured data such as e-catalogs. This paper extends that framework to recognise and resolve entities in unstructured data such as tweets. Such a system can be used for data integration, de-duplication, event detection and sentiment analysis. The proposed framework recognises predefined entities in streams, using Natural Language Processing (NLP) to extract local context features and MapReduce for entity resolution. Test results showed that the proposed entity recognition system could identify predefined location, organization and person entities with an accuracy of 72%.
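
The two stages named in the abstract, local context feature extraction and MapReduce-style entity resolution, can be illustrated with a minimal sketch. It is not the authors' implementation: the feature set, the `normalize` key function and the in-memory `map_phase`/`reduce_phase` helpers are all hypothetical stand-ins, the entity labels are assumed to come from some upstream recognizer, and plain Python replaces both the deep learning model and an actual Hadoop job.

```python
from collections import defaultdict


def context_features(tokens, i):
    """Local context features for token i (an assumed, illustrative feature set)."""
    word = tokens[i]
    return {
        "word": word.lower(),
        "is_capitalized": word[:1].isupper(),
        "is_hashtag": word.startswith("#"),
        "is_mention": word.startswith("@"),
        # Neighbouring tokens, with sentence-boundary markers at the edges.
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
        "next": tokens[i + 1].lower() if i + 1 < len(tokens) else "</s>",
    }


def normalize(mention):
    """Hypothetical resolution key: lowercase, drop leading #/@ and non-alphanumerics."""
    stripped = mention.lower().lstrip("#@")
    return "".join(ch for ch in stripped if ch.isalnum())


def map_phase(records):
    """Map step: emit ((normalized mention, label), (tweet id, surface form)) pairs."""
    for tweet_id, mention, label in records:
        yield (normalize(mention), label), (tweet_id, mention)


def reduce_phase(pairs):
    """Reduce step: group all surface forms that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


if __name__ == "__main__":
    tokens = "Flights to New York delayed".split()
    print(context_features(tokens, 2))

    # (tweet id, recognized mention, entity label) triples, assumed to come
    # from an upstream recognizer.
    records = [(1, "New York", "LOC"), (2, "#NewYork", "LOC"), (3, "Google", "ORG")]
    print(reduce_phase(map_phase(records)))
```

In this sketch, "New York" and "#NewYork" end up under the same key, which is the sense in which differently written mentions are resolved to one entity. On a real cluster the map and reduce functions would run as separate distributed tasks rather than as in-process generators.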

Author information

Correspondence to S. Vasavi.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Vasavi, S., Prabhakar Benny, S. (2018). Hadoop Framework for Entity Recognition Within High Velocity Streams Using Deep Learning. In: Satapathy, S., Bhateja, V., Raju, K., Janakiramaiah, B. (eds) Data Engineering and Intelligent Computing. Advances in Intelligent Systems and Computing, vol 542. Springer, Singapore. https://doi.org/10.1007/978-981-10-3223-3_23

  • DOI: https://doi.org/10.1007/978-981-10-3223-3_23

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-3222-6

  • Online ISBN: 978-981-10-3223-3

  • eBook Packages: Engineering, Engineering (R0)
