@xxh
2015-06-20T02:51:01.000000Z
字数 1352
阅读 329
[Nearest Neighbor]LSH Algorithm
LSH Algorithm and Implementation (E2LSH) from MIT:
http://www.mit.edu/~andoni/LSH/
an introduction of nearest neighbor
http://people.csail.mit.edu/gregory/annbook/introduction.pdf
blog:
http://blog.csdn.net/icvpr/article/details/12342159
http://jacoxu.com/?p=496
Approximate Nearest Neighbors Search in High Dimensions and Locality-Sensitive Hashing
https://graphics.stanford.edu/courses/cs468-06-fall/Slides/aneesh-michael.pdf
- applied in high dimensional spaces; one of the method for approximate nearest neighbor searching
- general idea: hash the data points using several hash functions so as to ensure that, for each function, the probability of collision is much higher for points which are close to each other than for those which are far apart. Then, one can determine near neighbors by hashing the query point and retrieving elements stored in buckets containing that point.
- key points:
D(p,q)≤r→Pro[h(p)=h(q)]≥p1
D(p,q)>r(1+ε)→Pro[h(p)=h(q)]≤p2
h:= hash function
- algo:
- s1: offline: build up a hash table with hash function, by the searching(training?) result find the # of hash table L and # of hash functions in each table K
- 希望原本相邻的数据经过LSH hash后,都能够落入到相同的桶内,而不相邻的数据经过LSH hash后,都能够落入到不同的桶中。
- s2: online, for new p0, get the datas in the place: h(p0) and compare their distance
- comment: 减少需要匹配的数据点个数的同时保证查找到最近邻的数据点的概率很大(不是一定找得到)。
- modification: LSH hash functions (>1 hash table and >1 hash function for each table)