Extracting Keyphrases from News Articles Using Crowdsourcing

  • Conference paper
Urban Intelligence and Applications

Part of the book series: Studies in Distributed Intelligence (SDI)


Abstract

Keyphrase extraction is an important task in text mining. However, existing machine-based approaches cannot extract keyphrases from news articles effectively, for various reasons. This paper therefore employs crowdsourcing for keyphrase extraction from news articles. We first design a crowdsourcing mechanism for extracting keyphrases from news articles and then adapt three truth inference algorithms (IMLK, IMLK-I, and IMLK-ED) to integrate the multiple lists of keyphrases provided by workers. The experiments show that crowdsourcing can significantly improve the performance of the machine-based approach (i.e., KeyRank).
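
To make the integration step concrete, here is a minimal sketch of merging several workers' ranked keyphrase lists into one list. It is not IMLK, IMLK-I, or IMLK-ED, whose details are not given in this abstract; it assumes a simple reciprocal-rank vote as a stand-in, and the function name `integrate_keyphrase_lists`, the `top_k` parameter, and the sample data are all hypothetical.

```python
from collections import defaultdict


def integrate_keyphrase_lists(worker_lists, top_k=5):
    """Merge ranked keyphrase lists from several workers into one ranked list.

    Assumption (not from the paper): a phrase at rank r in a worker's list
    earns 1/r points, and phrases are ordered by their total score.
    """
    scores = defaultdict(float)
    for phrases in worker_lists:
        seen = set()  # count each phrase at most once per worker
        for rank, phrase in enumerate(phrases, start=1):
            key = phrase.strip().lower()  # normalize case and whitespace
            if not key or key in seen:
                continue
            seen.add(key)
            scores[key] += 1.0 / rank
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [phrase for phrase, _ in ranked[:top_k]]


# Hypothetical answers from three workers for one news article.
worker_lists = [
    ["stock market", "interest rates", "inflation"],
    ["Interest Rates", "stock market"],
    ["interest rates", "central bank"],
]
result = integrate_keyphrase_lists(worker_lists, top_k=3)
print(result)
```

A truth inference algorithm would typically go further than this baseline, for example by weighting workers according to their estimated reliability rather than treating all lists equally.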



Acknowledgment

This work is partially supported by the US National Science Foundation under grant IIS-1115417, the National Natural Science Foundation of China under grants 61725205, 61876217, 3177167, 31671589, and 31371533, the Key Laboratory of Agricultural Electronic Commerce, Ministry of Agriculture of China under grant AEC2018003, the Anhui Foundation for Science and Technology Major Project under grants 16030701092 and 18030901034, the 2016 Anhui Foundation for Natural Science Major Project of the Higher Education Institutions under grant kJ2016A836, and the Hefei Major R&D Projects of Key Technologies under grant J2018G14.

Author information


Corresponding author

Correspondence to Victor S. Sheng.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Wang, Q., Zhong, J., Gu, L., Yang, K., Sheng, V.S. (2020). Extracting Keyphrases from News Articles Using Crowdsourcing. In: Yuan, X., Elhoseny, M. (eds) Urban Intelligence and Applications. Studies in Distributed Intelligence. Springer, Cham. https://doi.org/10.1007/978-3-030-45099-1_17


  • DOI: https://doi.org/10.1007/978-3-030-45099-1_17


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-45098-4

  • Online ISBN: 978-3-030-45099-1

  • eBook Packages: Computer Science, Computer Science (R0)
