Abstract
“Text-source discovery” is the problem of identifying the document databases that are likely to contain documents matching a user query. GlOSS [6] is a cost-effective technique for solving this problem. However, GlOSS assumes that the document databases cooperate fully by exporting statistical information about their collections. This paper discusses how GlOSS can be applied in a dynamic and uncooperative Web environment to help users locate relevant Web information sources.
Keywords: text-source discovery, GlOSS, search engines
References
[1] The Web Robots FAQ. URL: http://www.mesquite.com.
[2] D.C. Blair and M.E. Maron. An evaluation of retrieval effectiveness for a full-text document retrieval system. Communications of the ACM, 28(3):290–299, 1985.
[3] S. Feldman. Just the answers, please: Choosing a web search service. The Magazine for Database Professionals, May 1997.
[4] B. Grossan. Search Engines: What They Are? How They Work? URL: http://webreference.com/content/search/features.html.
[5] V.N. Gudivada. Information retrieval on the World Wide Web. IEEE Internet Computing, 1(5):58–68, 1997.
[6] Luis Gravano, Héctor García-Molina, and Anthony Tomasic. The effectiveness of GlOSS for the text-database discovery problem. In Proceedings of the 1994 ACM SIGMOD.
[7] Luis Gravano, Héctor García-Molina, and Anthony Tomasic. Generalizing GlOSS to vector-space databases and broker hierarchies. In Proceedings of the 1995 VLDB Conference, May 1995.
© 2000 Springer-Verlag Berlin Heidelberg
Cite this paper
Ng, CY., Kao, B., Cheung, D. (2000). Text-Source Discovery and GlOSS Update in a Dynamic Web. In: Terano, T., Liu, H., Chen, A.L.P. (eds) Knowledge Discovery and Data Mining. Current Issues and New Applications. PAKDD 2000. Lecture Notes in Computer Science(), vol 1805. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45571-X_50
DOI: https://doi.org/10.1007/3-540-45571-X_50
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-67382-8
Online ISBN: 978-3-540-45571-4
eBook Packages: Springer Book Archive