Abstract
Transfer learning has been used to exploit the language resources available in resource-rich languages for the benefit of resource-scarce ones. However, class noise accumulates over the iterations of transfer learning and can lead to negative transfer, which degrades performance as more training data is used. In this paper, we propose a novel transfer learning method that can detect negative transfer. After a certain number of iterations, the approach detects high-quality samples and uses them to identify class noise in the newly transferred training samples, which are then removed to reduce misclassification. Because it can detect and remove bad training samples, our method can make full use of the large amount of unlabeled data available in the target language. The most important contribution of this paper is the theory of class noise detection: our new detection method overcomes a theoretical flaw of a previous method based on the Gaussian distribution. We apply this transfer learning method with negative transfer detection to cross-lingual opinion analysis. Evaluation on the NLP&CC 2013 cross-lingual opinion analysis dataset shows that the proposed approach outperforms the state-of-the-art systems.
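The loop described in the abstract (transfer pseudo-labelled samples iteratively, then periodically check the transferred samples for class noise and drop the suspects) can be illustrated with a minimal sketch. This is not the paper's detection method: the statistical test it proposes is not given in the abstract, so a simple k-nearest-neighbour agreement filter stands in for it here. All names (`knn_vote`, `drop_suspected_noise`, `transfer_with_noise_check`) and parameters (`rounds`, `batch`, `check_every`, `k`) are hypothetical, and feature vectors are assumed to be plain tuples of floats.

```python
import math
from collections import Counter


def knn_vote(labeled, x, k=3):
    """Return (majority_label, vote_fraction) among the k nearest labeled points."""
    neighbours = sorted(labeled, key=lambda item: math.dist(item[0], x))[:k]
    votes = Counter(label for _, label in neighbours)
    label, count = votes.most_common(1)[0]
    return label, count / len(neighbours)


def drop_suspected_noise(trusted, transferred, k=3):
    """Keep only transferred samples whose label matches the k-NN vote of all
    other samples. This is a stand-in filter, NOT the paper's statistical test."""
    kept = []
    for i, (x, y) in enumerate(transferred):
        others = trusted + transferred[:i] + transferred[i + 1:]
        label, _ = knn_vote(others, x, k)
        if label == y:
            kept.append((x, y))
    return kept


def transfer_with_noise_check(source, unlabeled, rounds=3, batch=2,
                              check_every=1, k=3):
    """Iteratively pseudo-label target samples, filtering noise every few rounds."""
    transferred = []
    pool = list(unlabeled)
    for r in range(1, rounds + 1):
        if not pool:
            break
        train = source + transferred
        # Score every remaining unlabeled sample and move the most confident ones.
        scored = [(knn_vote(train, x, k), x) for x in pool]
        scored.sort(key=lambda s: s[0][1], reverse=True)
        chosen = scored[:batch]
        transferred += [(x, label) for (label, _), x in chosen]
        chosen_xs = [x for _, x in chosen]
        pool = [x for x in pool if x not in chosen_xs]
        # Periodically remove transferred samples suspected of carrying class noise.
        if r % check_every == 0:
            transferred = drop_suspected_noise(source, transferred, k)
    return transferred


# Toy usage: two well-separated classes in a two-dimensional feature space.
source = [((0.0, 0.0), "neg"), ((0.1, 0.0), "neg"), ((0.0, 0.1), "neg"),
          ((1.0, 1.0), "pos"), ((0.9, 1.0), "pos"), ((1.0, 0.9), "pos")]
unlabeled = [(0.05, 0.05), (0.95, 0.95)]
result = transfer_with_noise_check(source, unlabeled, rounds=1)
```

In this sketch the source-language samples are treated as trusted and only the transferred samples are candidates for removal, which mirrors the abstract's focus on noise in the newly transferred training samples.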
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Gui, L., Lu, Q., Xu, R., Wei, Q., Cao, Y. (2015). Improving Transfer Learning in Cross Lingual Opinion Analysis Through Negative Transfer Detection. In: Zhang, S., Wirsing, M., Zhang, Z. (eds) Knowledge Science, Engineering and Management. KSEM 2015. Lecture Notes in Computer Science, vol 9403. Springer, Cham. https://doi.org/10.1007/978-3-319-25159-2_36
DOI: https://doi.org/10.1007/978-3-319-25159-2_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25158-5
Online ISBN: 978-3-319-25159-2
eBook Packages: Computer Science (R0)