Abstract
We consider an information retrieval (IR) system running on a low-cost, high-performance PC cluster. The IR system replicates Web pages locally, indexes them with an inverted-index file (IIF), and uses the vector space model as its ranking strategy. The IIF is partitioned into pieces using either the lexical or the greedy declustering method, and the pieces are distributed to the cluster nodes and stored on each node's hard disk. The lexical method assigns the terms in the IIF, in lexicographic order, to the processing nodes in turn; the greedy method is based on the probability of co-occurrence of an arbitrary pair of terms in the IIF. For each incoming user query with multiple terms, the terms are sent to the nodes that hold the relevant pieces of the IIF, where they are evaluated in parallel. We study how query performance is affected by the two declustering methods for IIFs of various sizes. In our experiments, the greedy method shows an overall improvement of about 3.7% over the lexical method.
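The two declustering ideas can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the paper. The lexical function follows the round-robin description given above. The greedy function is only one plausible heuristic and rests on assumptions not stated in the abstract: co-occurrence is estimated from a hypothetical sample of term sets (`samples`), and each term is placed on the node where it co-occurs least with the terms already stored there, so that terms likely to appear together can be evaluated on different nodes. All function and parameter names are illustrative.

```python
from collections import defaultdict
from itertools import combinations


def lexical_decluster(terms, num_nodes):
    """Assign terms, in lexicographic order, to the nodes in turn (round-robin)."""
    return {term: i % num_nodes for i, term in enumerate(sorted(terms))}


def cooccurrence_counts(samples):
    """Count how often each unordered pair of terms appears in the same sample."""
    counts = defaultdict(int)
    for sample in samples:
        for a, b in combinations(sorted(set(sample)), 2):
            counts[(a, b)] += 1
    return counts


def greedy_decluster(terms, num_nodes, samples):
    """Hypothetical greedy heuristic (an assumption, not the paper's algorithm):
    place each term on the node whose current terms co-occur with it least,
    breaking ties by node load and then by node index."""
    counts = cooccurrence_counts(samples)

    def co(a, b):
        return counts.get((a, b) if a < b else (b, a), 0)

    nodes = [[] for _ in range(num_nodes)]
    assignment = {}
    for term in sorted(terms):
        best = min(
            range(num_nodes),
            key=lambda n: (sum(co(term, t) for t in nodes[n]), len(nodes[n]), n),
        )
        nodes[best].append(term)
        assignment[term] = best
    return assignment


def route_query(query_terms, assignment):
    """Group the terms of a query by the node that stores their IIF piece."""
    plan = defaultdict(list)
    for term in query_terms:
        if term in assignment:
            plan[assignment[term]].append(term)
    return dict(plan)


if __name__ == "__main__":
    terms = ["apple", "banana", "cherry", "date"]
    samples = [["apple", "cherry"], ["banana", "date"]]
    query = ["apple", "cherry"]
    print(route_query(query, lexical_decluster(terms, 2)))
    print(route_query(query, greedy_decluster(terms, 2, samples)))
```

In this toy example the lexical assignment places "apple" and "cherry" on the same node, so the query is served by a single node, whereas the greedy sketch separates the two terms and lets both nodes work in parallel.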
1 This paper was supported in part by the Korea Science and Engineering Foundation under contract No. 2000-2-30300-002-3.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chung, Y., Kwon, HC., Chung, SH., Ryu, K.R. (2001). Declustering Web Content Indices for Parallel Information Retrieval. In: Zhong, N., Yao, Y., Liu, J., Ohsuga, S. (eds) Web Intelligence: Research and Development. WI 2001. Lecture Notes in Computer Science, vol 2198. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45490-X_41
DOI: https://doi.org/10.1007/3-540-45490-X_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42730-8
Online ISBN: 978-3-540-45490-8
eBook Packages: Springer Book Archive