Abstract
This paper proposes two models. The concept association model captures the co-occurrence relationships among keywords in documents. The hierarchical Hamming clustering model reduces the dimensionality of the category feature vector space, addressing the extremely high dimensionality of the documents' feature space. Experimental results indicate that the concept association model captures keyword co-occurrence relations and thereby effectively improves the recall of the classification system, and that the hierarchical Hamming clustering model reduces the category feature vector space to only about 10% of its original dimensionality.
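The abstract does not spell out how either model is computed, so the following is only a minimal sketch of the two underlying ideas, not the authors' method. It assumes documents are given as keyword lists, counts pairwise keyword co-occurrence, and then greedily merges terms whose binary document-occurrence vectors lie within a Hamming-distance threshold. The function names, the toy documents, and the greedy single-level grouping rule are all hypothetical illustrations.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count how many documents each unordered keyword pair appears in together."""
    counts = Counter()
    for keywords in documents:
        # Sort the distinct keywords so each pair has one canonical order.
        for pair in combinations(sorted(set(keywords)), 2):
            counts[pair] += 1
    return counts


def hamming(a, b):
    """Number of positions where two equal-length binary tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def group_by_hamming(term_vectors, max_distance):
    """Greedily group terms by Hamming distance (an assumed, simplified rule).

    A term joins the first group whose representative (its first member)
    is within max_distance; otherwise it starts a new group.
    """
    groups = []
    for term, vec in term_vectors.items():
        for group in groups:
            if hamming(term_vectors[group[0]], vec) <= max_distance:
                group.append(term)
                break
        else:
            groups.append([term])
    return groups


# Toy corpus (hypothetical data, for illustration only).
docs = [
    ["neural", "network", "training"],
    ["neural", "network", "clustering"],
    ["text", "clustering"],
]
pairs = cooccurrence_counts(docs)

# One binary vector per term: 1 if the term occurs in that document.
vocab = sorted({w for d in docs for w in d})
term_vectors = {w: tuple(int(w in d) for d in docs) for w in vocab}
groups = group_by_hamming(term_vectors, max_distance=0)
```

In this toy run, "network" and "neural" occur in exactly the same documents, so they collapse into one feature and five terms become four groups. A hierarchical variant would presumably repeat the merge at successively looser thresholds, but that detail is an assumption here.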
Additional information
Foundation item: Supported by the National 863 Project of China (2001AA142160, 2002AA145090)
Biography: Su Gui-yang (1974-), male, Ph.D. candidate; research interests: information filtering and text classification.
Cite this article
Gui-yang, S., Jian-hua, L., Ying-hua, M. et al. Concept association and hierarchical Hamming clustering model in text classification. Wuhan Univ. J. Nat. Sci. 9, 339–342 (2004). https://doi.org/10.1007/BF02907890