Improving Transfer Learning in Cross Lingual Opinion Analysis Through Negative Transfer Detection

  • Conference paper
Knowledge Science, Engineering and Management (KSEM 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9403)

Abstract

Transfer learning has been used to exploit the language resources available in one language for other, resource-scarce languages. However, class noise accumulates over the iterations of transfer learning and can lead to negative transfer, which degrades performance as more training data is used. In this paper, we propose a novel transfer learning method that can detect negative transfer. After a certain number of iterations, the approach detects high-quality samples and uses them to identify class noise among the newly transferred training samples, which are then removed to reduce misclassifications. Because it can detect and remove bad training samples, our method can make full use of the large amount of unlabeled training data available in the target language. The most important contribution of this paper is the theory of class noise detection: our new detection method overcomes the theoretical flaw of a previous method based on the Gaussian distribution. We apply this transfer learning method with negative transfer detection to cross-lingual opinion analysis. Evaluation on the NLP&CC 2013 cross-lingual opinion analysis dataset shows that the proposed approach outperforms the state-of-the-art systems.
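The general loop described in the abstract (iteratively pseudo-label target-language samples, then filter suspected class noise before adding them to the training set) can be sketched roughly as follows. This is only an illustrative sketch under strong assumptions: the nearest-centroid classifier and the k-nearest-neighbour agreement check are simple stand-ins chosen for brevity, not the authors' noise detection theory, and every function name here is hypothetical.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def fit_centroids(samples):
    """Fit a nearest-centroid classifier: return {label: mean vector}."""
    sums, counts = {}, Counter()
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] += 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda y: sq_dist(centroids[y], x))


def knn_agrees(x, y, trusted, k=3):
    """Stand-in noise check (an assumption, not the paper's method):
    accept (x, y) only if the majority label among the k nearest
    trusted samples equals y."""
    neighbours = sorted(trusted, key=lambda s: sq_dist(s[0], x))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0] == y


def transfer_with_noise_filter(labeled, unlabeled, rounds=3, batch=2, k=3):
    """Iteratively pseudo-label unlabeled samples and keep only those
    that pass the noise check against the original labeled data."""
    train = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        centroids = fit_centroids(train)
        candidates, pool = pool[:batch], pool[batch:]
        for x in candidates:
            y = predict(centroids, x)
            if knn_agrees(x, y, labeled, k):  # drop suspected class noise
                train.append((x, y))
    return train


# Toy data: two well-separated clusters standing in for opinion classes.
labeled = [((0, 0), "neg"), ((0, 1), "neg"), ((1, 0), "neg"),
           ((5, 5), "pos"), ((5, 6), "pos"), ((6, 5), "pos")]
unlabeled = [(0.5, 0.5), (5.5, 5.5), (0.2, 0.8), (5.8, 5.1)]
result = transfer_with_noise_filter(labeled, unlabeled)
```

In this toy setting every pseudo-label passes the check, so all four unlabeled points join the training set. A sample whose pseudo-label disagreed with its trusted neighbours would be discarded instead of accumulating as class noise.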

Correspondence to Ruifeng Xu.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Gui, L., Lu, Q., Xu, R., Wei, Q., Cao, Y. (2015). Improving Transfer Learning in Cross Lingual Opinion Analysis Through Negative Transfer Detection. In: Zhang, S., Wirsing, M., Zhang, Z. (eds) Knowledge Science, Engineering and Management. KSEM 2015. Lecture Notes in Computer Science (LNAI), vol 9403. Springer, Cham. https://doi.org/10.1007/978-3-319-25159-2_36

  • DOI: https://doi.org/10.1007/978-3-319-25159-2_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25158-5

  • Online ISBN: 978-3-319-25159-2

  • eBook Packages: Computer Science, Computer Science (R0)
