Declustering Web Content Indices for Parallel Information Retrieval

Chung, Yoo**; Kwon, Hyuk-Chul; Chung, Sang-Hwa; Ryu, Kwang Ryel

doi:10.1007/3-540-45490-X_41

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2198)

Included in the following conference series:

Asia-Pacific Conference on Web Intelligence

Abstract

We consider an information retrieval (IR) system on a low-cost, high-performance PC cluster. The IR system replicates Web pages locally, indexes them with an inverted-index file (IIF), and uses the vector space model as its ranking strategy. The IIF is partitioned into pieces using the lexical and the greedy declustering methods, and the pieces are distributed to the cluster nodes, where each is stored on the node's hard disk. The lexical method assigns the terms in the IIF, in lexicographic order, to the processing nodes in turn; the greedy method is based on the probability of co-occurrence of an arbitrary pair of terms in the IIF. For each incoming user query with multiple terms, the terms are sent to the nodes that hold the relevant pieces of the IIF, so that they are evaluated in parallel. We study how query performance is affected by the two declustering methods for IIFs of various sizes. According to the experiments, the greedy method shows about 3.7% enhancement overall compared with the lexical method.

1 This paper was supported in part by the Korea Science and Engineering Foundation under contract No. 2000-2-30300-002-3.
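
The two declustering strategies named in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not give the greedy algorithm, so the placement rule below (put each term on the node where it co-occurs least with the terms already stored there) is an assumed heuristic, and the function names `lexical_decluster`, `cooccurrence_counts`, `greedy_decluster`, and `route_query` are hypothetical. Co-occurrence is estimated here from arbitrary groups of terms (for example documents or logged queries); which source the paper uses is not stated in the abstract.

```python
from collections import defaultdict
from itertools import combinations


def lexical_decluster(terms, num_nodes):
    """Assign lexicographically sorted terms to the nodes in turn (round-robin)."""
    return {term: i % num_nodes for i, term in enumerate(sorted(terms))}


def cooccurrence_counts(groups):
    """Count how often each unordered pair of terms appears in the same group.

    A group is any collection of terms that occur together; using documents
    or logged queries as groups is an assumption of this sketch.
    """
    counts = defaultdict(int)
    for group in groups:
        for a, b in combinations(sorted(set(group)), 2):
            counts[(a, b)] += 1
    return counts


def greedy_decluster(terms, num_nodes, counts):
    """Assumed greedy heuristic: place each term on the node where it
    co-occurs least with the terms already stored there.

    Ties are broken by the smaller current load, then by the lower node id.
    """
    nodes = [[] for _ in range(num_nodes)]
    assignment = {}
    for term in sorted(terms):
        def cost(node_id):
            overlap = sum(
                counts.get(tuple(sorted((term, other))), 0)
                for other in nodes[node_id]
            )
            return (overlap, len(nodes[node_id]), node_id)

        best = min(range(num_nodes), key=cost)
        nodes[best].append(term)
        assignment[term] = best
    return assignment


def route_query(query_terms, assignment):
    """Group the terms of a query by the node that holds their index piece."""
    plan = defaultdict(list)
    for term in query_terms:
        plan[assignment[term]].append(term)
    return dict(plan)
```

With four terms on two nodes, where "apple" and "cherry" frequently occur together, the lexical assignment places both on the same node, so a query for the two terms is served by one node only. The greedy sketch separates them, so `route_query` yields one term per node and the two lookups can proceed in parallel.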

Author information

Authors and Affiliations

Research Institute of Computer, Information & Communication, Pusan National University, Pusan, 609-735, South Korea
Yoo** Chung
School of Electrical and Computer Engineering, Pusan National University, Pusan, 609-735, South Korea
Hyuk-Chul Kwon, Sang-Hwa Chung & Kwang Ryel Ryu

Editor information

Editors and Affiliations

Department of Systems and Information Engineering, Maebashi Institute of Technology, 460-1 Kamisadori-Cho, Maebashi-City, 371-0816, Japan
Ning Zhong
Department of Computer Science, University of Regina, Regina, Saskatchewan, Canada, S4S 0A2
Yiyu Yao
Department of Computer Science, Hong Kong Baptist University, 224 Waterloo Road, Kowloon, Hong Kong, China
Jiming Liu
Department of Information and Computer Science, Waseda University, 3-4-1 Okubo Shinjuku-Ku, Tokyo, 169, Japan
Setsuo Ohsuga

About this paper

Cite this paper

Chung, Y., Kwon, HC., Chung, SH., Ryu, K.R. (2001). Declustering Web Content Indices for Parallel Information Retrieval. In: Zhong, N., Yao, Y., Liu, J., Ohsuga, S. (eds) Web Intelligence: Research and Development. WI 2001. Lecture Notes in Computer Science, vol 2198. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45490-X_41

DOI: https://doi.org/10.1007/3-540-45490-X_41
Published: 19 October 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42730-8
Online ISBN: 978-3-540-45490-8
eBook Packages: Springer Book Archive
