On the Efficient Determination of Most Near Neighbors
Horseshoes, Hand Grenades, Web Search, and Other Situations When Close Is Close Enough, Second Edition
Reference Work Entry In depth
Chapter
Thanks for persisting with this mixed bag of algorithms, mathematical curiosities, and algorithms engineering. I mean this in the best spirit of the HAKMEM collection of similar oddities from the MIT AI lab fo...
Chapter
The sampling mechanisms described in the previous sections provide unbiased estimators of the standard unweighted Jaccard coefficient, in which all features are treated as being equally important. But some fea...
Chapter
As mentioned in the Forward (sic, as described on page xv), shortly after the initial version of this publication was placed in the hands of my publisher, I received a disconcerting preprint from ** Li: he a...
Chapter
When comparing pages in a corpus, there are some things one has to consider (which we do in the following sections):
What are the features of ...
Chapter
In the 15 years since we worked on Alta Vista (while applying these techniques to Bing, and subsequently), we have discovered a few ways to compute consistent random samples from streams, typically using fewe...
Chapter
In this chapter, we look, or in some cases look yet again, at a few of the applications we have made of these and other sampling techniques.
Chapter
In 1995, just prior to public release, we discovered a problem with the Alta Vista search engine: most of the time searching worked just fine, but sometimes the results were highly repetitive. We knew that the...
Book
Horseshoes, Hand Grenades, Web Search, and Other Situations When Close is Close Enough
Reference Work Entry In depth
Chapter and Conference Paper
The study of integer factoring algorithms and the design of faster factoring algorithms is a subject of great importance in cryptology (cf. [1]), and a constant concern for cryptographers. In this paper we presen...