Sentence Difficulty Analysis with Local Feature Space and Global Distributional Difference

  • Conference paper
Convergence and Hybrid Information Technology (ICHIT 2012)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7425)


Abstract

In this paper, we consider the problem of sentence difficulty analysis from several angles. Previous work has focused on designing deterministic scoring algorithms that depend only on semantic and syntactic information. We propose instead to use not only a local feature space, which represents each individual sentence by its syntactic and semantic structure, but also the global distributional difference among corpora. For the local feature space, we select 28 linguistic features and transform them into a conjoined and discretized form. By applying global score classification, we obtain much improved results. We test the proposed model on 1,000 sentences and obtain much higher accuracy than traditional learning models such as SVM and AdaBoost.
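The abstract does not give the details of the feature transformation or of the distributional measure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes equal-width binning for discretization, pairwise conjunctions of the binned features, and Kullback-Leibler divergence with add-alpha smoothing as one common way to measure the difference between two corpus distributions. The feature names (`length`, `depth`), their value ranges, the bin count, and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def discretize(value, lo, hi, n_bins=4):
    """Map a continuous value to an equal-width bin index in [0, n_bins - 1]."""
    if hi <= lo:
        return 0
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(max(idx, 0), n_bins - 1)  # clamp values outside [lo, hi]


def conjoin(bins):
    """Build pairwise conjunction features from per-feature bin indices."""
    names = sorted(bins)
    return [f"{a}={bins[a]}&{b}={bins[b]}" for a, b in combinations(names, 2)]


def kl_divergence(p_counts, q_counts, alpha=1.0):
    """KL(P || Q) over the union of observed events, with add-alpha smoothing."""
    events = set(p_counts) | set(q_counts)
    p_total = sum(p_counts.values()) + alpha * len(events)
    q_total = sum(q_counts.values()) + alpha * len(events)
    kl = 0.0
    for e in events:
        p = (p_counts.get(e, 0) + alpha) / p_total
        q = (q_counts.get(e, 0) + alpha) / q_total
        kl += p * math.log(p / q)
    return kl


# Hypothetical sentence with two of the (unspecified) linguistic features.
sentence = {"length": 12.0, "depth": 3.0}
ranges = {"length": (0.0, 40.0), "depth": (0.0, 10.0)}
bins = {name: discretize(v, *ranges[name]) for name, v in sentence.items()}
local_features = conjoin(bins)

# Hypothetical conjunction-feature counts from an "easy" and a "hard" corpus.
easy = Counter({"depth=1&length=1": 8, "depth=2&length=3": 2})
hard = Counter({"depth=1&length=1": 2, "depth=2&length=3": 8})
global_difference = kl_divergence(easy, hard)
```

In this sketch, `local_features` plays the role of the local representation of a single sentence, while `global_difference` summarizes how far apart two corpora are in the same feature space. How the paper actually combines the two is not stated in the abstract.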




Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kim, YB., Kim, Y., Kim, YS. (2012). Sentence Difficulty Analysis with Local Feature Space and Global Distributional Difference. In: Lee, G., Howard, D., Kang, J.J., Ślęzak, D. (eds) Convergence and Hybrid Information Technology. ICHIT 2012. Lecture Notes in Computer Science, vol 7425. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32645-5_89


  • DOI: https://doi.org/10.1007/978-3-642-32645-5_89

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-32644-8

  • Online ISBN: 978-3-642-32645-5

  • eBook Packages: Computer Science, Computer Science (R0)
