Ensembled Transferred Embeddings

Chapter in Machine Learning for Data Science Handbook

Abstract

Deep learning has become a very popular method for text classification in recent years, due to its ability to improve the accuracy of previous state-of-the-art methods on several benchmarks. However, these improvements have required hundreds of thousands to millions of labeled training examples, which in many cases can be very time-consuming and/or expensive to acquire. This problem is especially significant in domain-specific text classification tasks, where pretrained embeddings and models are not optimal. In order to cope with this problem, we propose a novel learning framework, Ensembled Transferred Embeddings (ETE), which relies on two key ideas: (1) labeling a relatively small sample of the target dataset in a semi-automatic process, and (2) leveraging other datasets from related domains or related tasks that are large-scale and labeled, to extract "transferable embeddings." Evaluation of ETE on a large-scale, real-world item-categorization dataset provided to us by PayPal shows that it significantly outperforms traditional as well as state-of-the-art item categorization methods.
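As a rough, assumption-laden sketch (not the chapter's implementation), the following Python snippet imitates the ETE recipe on synthetic data: two stand-in "transferable embeddings" play the role of representations extracted from models trained on large, labeled, related datasets, one base classifier is fitted per embedding on a small labeled sample, and the base models are combined by stacked generalization (Wolpert, 1992). All names and the synthetic data are illustrative assumptions.

# Minimal sketch of the ETE idea; NOT the authors' implementation. The data
# is synthetic, and the two "transferable embeddings" are stand-ins for
# representations extracted from models trained on large labeled datasets
# from related domains or tasks.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import cross_val_predict, train_test_split

rng = np.random.default_rng(0)

n_items, n_classes = 1970, 5              # small labeled sample, 5 categories
y = rng.integers(0, n_classes, size=n_items)
# Hypothetical transferable embeddings of the same items (weakly informative).
emb_a = rng.normal(size=(n_items, 64)) + 0.4 * y[:, None]
emb_b = rng.normal(size=(n_items, 32)) + 0.3 * y[:, None]

train_idx, test_idx = train_test_split(
    np.arange(n_items), test_size=0.3, random_state=0, stratify=y)

def base_model_probs(emb):
    """One base classifier per embedding: out-of-fold probabilities on the
    training split (avoids leakage into the meta-learner) and ordinary
    probabilities on the test split."""
    clf = LogisticRegression(max_iter=1000)
    oof = cross_val_predict(clf, emb[train_idx], y[train_idx],
                            cv=5, method="predict_proba")
    clf.fit(emb[train_idx], y[train_idx])
    return oof, clf.predict_proba(emb[test_idx])

oof_a, test_a = base_model_probs(emb_a)
oof_b, test_b = base_model_probs(emb_b)

# Ensemble the base models by stacked generalization (Wolpert, 1992): a
# meta-learner maps the concatenated class probabilities to the final label.
meta = LogisticRegression(max_iter=1000)
meta.fit(np.hstack([oof_a, oof_b]), y[train_idx])
pred = meta.predict(np.hstack([test_a, test_b]))
print("weighted F1:", f1_score(y[test_idx], pred, average="weighted"))

In the setting the chapter targets, the embeddings would instead come from, e.g., word2vec- or BERT-style encoders (Mikolov et al., 2013; Devlin et al., 2018) trained on related large-scale datasets, and the small labeled sample would be the semi-automatically labeled target data.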


Notes

  1. We use the term noisy to describe user-generated text that typically contains grammatical errors, nonstandard spellings, abbreviations, etc., as previously done for tweets on Twitter (Baldwin et al., 2015).

  2. The specific number of 1970 instances was chosen to fit our budget constraint of 200 USD.

  3. Our goal here was to demonstrate the advantages of the ETE framework on a large-scale, real-world problem, rather than to pursue the best possible accuracy.

  4. The harmonic mean of the precision and recall of each class, weighted by the class proportion in the data; a toy check of this definition appears after these notes.
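
For concreteness, a hypothetical toy check of the weighted-F1 definition in note 4 (the labels below are made up) against scikit-learn's implementation:

# Toy check of the weighted-F1 definition in note 4 (made-up labels).
from sklearn.metrics import f1_score

y_true = [0, 0, 0, 1, 1, 2]   # class proportions: 3/6, 2/6, 1/6
y_pred = [0, 0, 1, 1, 1, 2]

# Per-class F1: class 0 -> 0.8, class 1 -> 0.8, class 2 -> 1.0.
# Weighted sum: (3/6)*0.8 + (2/6)*0.8 + (1/6)*1.0 = 0.8333...
print(f1_score(y_true, y_pred, average="weighted"))  # 0.8333...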

References

  • Baldwin, T., de Marneffe, M.-C., Han, B., Kim, Y.-B., Ritter, A., & Xu, W. (2015). Shared tasks of the 2015 workshop on noisy user-generated text: Twitter lexical normalization and named entity recognition. In Proceedings of the workshop on noisy user-generated text (pp. 126–135).

  • Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.

  • Erhan, D., Bengio, Y., Courville, A., Manzagol, P.-A., Vincent, P., & Bengio, S. (2010). Why does unsupervised pre-training help deep learning? Journal of Machine Learning Research, 11(Feb), 625–660.

  • Hadar, Y., & Shmueli, E. (2021a). Categorizing items with short and noisy descriptions using ensembled transferred embeddings. Expert Systems with Applications.

  • Hadar, Y., & Shmueli, E. (2021b). Source code for ensembled transferred embeddings. https://github.com/h-yonatan/Ensembled-Transferred-Enbeddings (Accessed: 2021-05-27).

  • Hedderich, M. A., Lange, L., Adel, H., Strötgen, J., & Klakow, D. (2020). A survey on recent approaches for natural language processing in low-resource scenarios. arXiv preprint arXiv:2010.12309.

  • Kiros, R., Zhu, Y., Salakhutdinov, R. R., Zemel, R., Urtasun, R., Torralba, A., et al. (2015). Skip-thought vectors. In Advances in neural information processing systems (pp. 3294–3302).

  • Kozareva, Z. (2015). Everyone likes shopping! Multi-class product categorization for e-commerce. In Proceedings of the 2015 conference of the North American chapter of the Association for Computational Linguistics: Human language technologies (pp. 1329–1333).

  • Krishnan, A., & Amarthaluri, A. (2019). Large scale product categorization using structured and unstructured attributes. arXiv preprint arXiv:1903.04254.

  • Li, M. Y., Kok, S., & Tan, L. (2018). Don't classify, translate: Multi-level e-commerce product categorization via machine translation. arXiv preprint arXiv:1812.05774.

  • Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. arXiv preprint arXiv:1310.4546.

  • Ruder, S. (2019). Neural transfer learning for natural language processing (Unpublished doctoral dissertation). NUI Galway.

  • Sharif Razavian, A., Azizpour, H., Sullivan, J., & Carlsson, S. (2014). CNN features off-the-shelf: An astounding baseline for recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops (pp. 806–813).

  • Werbin-Ofir, H., Dery, L., & Shmueli, E. (2019). Beyond majority: Label ranking ensembles based on voting rules. Expert Systems with Applications, 136, 50–61.

  • Wolpert, D. H. (1992). Stacked generalization. Neural Networks, 5(2), 241–259.

Acknowledgements

This research was funded by PayPal. We would like to thank our colleagues from PayPal: Yaeli, Adam, Omer, and Avihay, who provided meaningful insights and greatly assisted in improving this work.

Copyright information

© 2023 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Hadar, Y., Shmueli, E. (2023). Ensembled Transferred Embeddings. In: Rokach, L., Maimon, O., Shmueli, E. (eds) Machine Learning for Data Science Handbook. Springer, Cham. https://doi.org/10.1007/978-3-031-24628-9_26
