Abstract
Traditional machine learning methods characterize data observations by feature vectors, where each entry of a vector denotes a scalar feature value of a data instance. While this representation facilitates the application of conventional machine learning algorithms, in many cases it is not the best way of extracting all useful information from the data. In this paper we relax the (often unstated) assumption that features of data instances must be vectorized, and allow a more natural representation of the data in tensor format. Tensors are multi-mode (also known as multi-way) arrays, of which vectors (one-mode tensors) and matrices (two-mode tensors) are special cases. We show that the tensor representation captures useful information that is difficult to express in the conventional vector format. More importantly, to effectively utilize the rich information contained in tensors, we propose a novel semi-naive Bayesian tensor classification method (which we call Bat) that builds predictive models directly on data in tensor form, instead of on their vectorizations. We apply Bat to supervised learning problems and perform comprehensive experiments on classifying text documents and graphs, which demonstrate (1) the advantage of the tensor representation over conventional feature-vectorization approaches, and (2) the superiority of the proposed Bat tensor classifier over existing learners.
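To make the representational difference concrete, the following minimal NumPy sketch (our own illustration, not code from the paper) contrasts a two-mode tensor view of a text document with its flattened feature vector; the section/term layout and the counts are hypothetical.

    import numpy as np

    # A document as a 2-mode tensor: rows index document sections
    # (e.g., title vs. body), columns index vocabulary terms.
    # Layout and values are purely illustrative.
    doc_tensor = np.array([
        [2, 0, 1],   # term counts in the title
        [5, 3, 0],   # term counts in the body
    ])

    # Conventional feature vectorization flattens the tensor,
    # discarding the mode structure (which count came from which section).
    doc_vector = doc_tensor.flatten()   # -> array([2, 0, 1, 5, 3, 0])

    print(doc_tensor.shape)   # (2, 3): two modes preserved
    print(doc_vector.shape)   # (6,):   one mode, structure lost

A learner operating on doc_tensor can still distinguish a term appearing in the title from the same term appearing in the body, whereas any classifier trained on doc_vector sees only six undifferentiated scalar features.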
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Liu, W., Chan, J., Bailey, J., Leckie, C., Chen, F., Ramamohanarao, K. (2013). A Bayesian Classifier for Learning from Tensorial Data. In: Blockeel, H., Kersting, K., Nijssen, S., Železný, F. (eds.) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2013. Lecture Notes in Computer Science, vol. 8189. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40991-2_31
DOI: https://doi.org/10.1007/978-3-642-40991-2_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40990-5
Online ISBN: 978-3-642-40991-2