ritchie46 / lsh-rs
Locality Sensitive Hashing in Rust with Python bindings
License: MIT License
I'm roughly using the following code:
let query_emb: Vec<f32>;
let doc_emb: Vec<Vec<f32>>; // contains 3 document embeddings
...
let mut lsh = LshMem::new(10, 30, 512).srp().unwrap();
let _x = lsh.store_vecs(&doc_emb[..]);
let result = lsh.query_bucket(&query_emb).unwrap();
println!("lsh-rs: {:?}", result);
Unfortunately, the result is empty. I'm testing the same query and documents with ngt-rs and I do get results there (I'm looking for an alternative to ngt-rs that runs on Windows). Is this a matter of choosing better parameters?
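One way to reason about the parameter question is to estimate how likely a document is to land in the same bucket as the query. The sketch below uses the standard sign-random-projection result that two vectors at angle theta agree on one random hyperplane with probability 1 - theta/pi. It assumes that the first two arguments of `LshMem::new(10, 30, 512)` are the number of projections and the number of hash tables, and that `query_bucket` only returns items sharing a bucket with the query. Neither assumption is confirmed in this thread, and the function names are illustrative rather than part of lsh-rs.

```rust
use std::f64::consts::PI;

/// Probability that two vectors at angle `theta` (radians) fall on the
/// same side of one random hyperplane (one sign-random-projection bit).
fn srp_bit_match(theta: f64) -> f64 {
    1.0 - theta / PI
}

/// Probability that the two vectors share a bucket in at least one of
/// `n_tables` tables, where each bucket key is `n_projections` sign bits.
fn candidate_probability(theta: f64, n_projections: i32, n_tables: i32) -> f64 {
    let same_bucket = srp_bit_match(theta).powi(n_projections);
    1.0 - (1.0 - same_bucket).powi(n_tables)
}

fn main() {
    // Assumed reading of LshMem::new(10, 30, 512): 10 projections, 30 tables.
    for cos_sim in [0.9_f64, 0.7, 0.5, 0.3] {
        let theta = cos_sim.acos();
        let p = candidate_probability(theta, 10, 30);
        println!("cosine {:.1}: P(candidate) = {:.3}", cos_sim, p);
    }
}
```

If the three documents are not very close to the query in cosine terms, an empty bucket is a plausible outcome under these assumptions. Fewer projections per table or more tables would raise the chance of a hit, at the cost of more false candidates.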
I'm trying to run the examples and it seems like the project doesn't build at the moment. The compiler is reporting a few places where what appears to be a private serde module is being used. Did serde update and remove that export? Or am I missing something in order to import private modules?
error[E0603]: module `export` is private
--> lsh-rs/src/hash.rs:8:12
|
8 | use serde::export::PhantomData;
| ^^^^^^ private module
|
note: the module `export` is defined here
--> /Users/isaac/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.120/src/lib.rs:275:5
|
275 | use self::__private as export;
| ^^^^^^^^^^^^^^^^^^^^^^^^^
error[E0603]: module `export` is private
--> lsh-rs/src/table/sqlite.rs:9:12
|
9 | use serde::export::PhantomData;
| ^^^^^^ private module
|
note: the module `export` is defined here
--> /Users/isaac/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.120/src/lib.rs:275:5
|
275 | use self::__private as export;
| ^^^^^^^^^^^^^^^^^^^^^^^^^
error[E0603]: module `export` is private
--> lsh-rs/src/data.rs:4:12
|
4 | use serde::export::fmt::{Debug, Display};
| ^^^^^^ private module
|
note: the module `export` is defined here
--> /Users/isaac/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.120/src/lib.rs:275:5
|
275 | use self::__private as export;
| ^^^^^^^^^^^^^^^^^^^^^^^^^
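The three failing lines import `PhantomData`, `Debug` and `Display` through `serde::export`, which was only ever a hidden re-export of standard-library items. A minimal sketch of the same imports taken directly from `std` follows. The `Tagged` struct and `describe` function are hypothetical and exist only to show the imports compiling, and this has not been tested against the lsh-rs sources.

```rust
// Standard-library paths for the items the failing lines pull in
// through `serde::export`.
use std::fmt::{Debug, Display};
use std::marker::PhantomData;

/// Hypothetical wrapper, only here to show `PhantomData` in use.
struct Tagged<T> {
    label: String,
    _marker: PhantomData<T>,
}

/// Hypothetical helper, only here to show the `Debug + Display` bound.
fn describe<T: Debug + Display>(value: T) -> String {
    format!("{} / {:?}", value, value)
}

fn main() {
    let tagged: Tagged<u32> = Tagged {
        label: "ids".to_string(),
        _marker: PhantomData,
    };
    println!("{} -> {}", tagged.label, describe(7));
}
```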
Building with Rust 1.57.0 and this Cargo.toml:
[dependencies]
clap = "*"
log = "*"
env_logger = "*"
lsh-rs = {version = "*", features = ["blas"]}
ndarray = {version = "*", features = ["blas"]}
results in the following error:
Compiling lsh-rs v0.4.0
error[E0603]: module `export` is private
--> XXX/.cargo/registry/src/github.com-1ecc6299db9ec823/lsh-rs-0.4.0/src/hash.rs:8:12
|
8 | use serde::export::PhantomData;
| ^^^^^^ private module
|
note: the module `export` is defined here
--> XXX/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/lib.rs:276:5
|
276 | use self::__private as export;
| ^^^^^^^^^^^^^^^^^^^^^^^^^
error[E0603]: module `export` is private
--> XXX/.cargo/registry/src/github.com-1ecc6299db9ec823/lsh-rs-0.4.0/src/table/sqlite.rs:9:12
|
9 | use serde::export::PhantomData;
| ^^^^^^ private module
|
note: the module `export` is defined here
--> XXX/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/lib.rs:276:5
|
276 | use self::__private as export;
| ^^^^^^^^^^^^^^^^^^^^^^^^^
error[E0603]: module `export` is private
--> XXX/.cargo/registry/src/github.com-1ecc6299db9ec823/lsh-rs-0.4.0/src/data.rs:4:12
|
4 | use serde::export::fmt::{Debug, Display};
| ^^^^^^ private module
|
note: the module `export` is defined here
--> XXX/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.136/src/lib.rs:276:5
|
276 | use self::__private as export;
| ^^^^^^^^^^^^^^^^^^^^^^^^^
This seems to be caused by a change to serde. Any chance of updating the implementation so that it builds with the latest Rust?
I ran into this error with the latest version of lsh-rs:
Compiling lsh-rs v0.4.0
error[E0432]: unresolved import `serde::export`
--> C:\Users\steve\.cargo\registry\src\index.crates.io-6f17d22bba15001f\lsh-rs-0.4.0\src\hash.rs:8:12
|
8 | use serde::export::PhantomData;
| ^^^^^^ could not find `export` in `serde`
error[E0433]: failed to resolve: could not find `export` in `serde`
--> C:\Users\steve\.cargo\registry\src\index.crates.io-6f17d22bba15001f\lsh-rs-0.4.0\src\data.rs:4:12
|
4 | use serde::export::fmt::{Debug, Display};
| ^^^^^^ could not find `export` in `serde`
error[E0432]: unresolved import `serde::export`
--> C:\Users\steve\.cargo\registry\src\index.crates.io-6f17d22bba15001f\lsh-rs-0.4.0\src\table\sqlite.rs:9:12
|
9 | use serde::export::PhantomData;
| ^^^^^^ could not find `export` in `serde`
Some errors have detailed explanations: E0432, E0433.
For more information about an error, try `rustc --explain E0432`.
error: could not compile `lsh-rs` (lib) due to 3 previous errors
warning: build failed, waiting for other jobs to finish...
I was running the latest serde:
serde = { version = "1.0.187", features = ["derive"] }
Based on this post, I downgraded to serde 1.0.118 and it compiled:
serde = { version = "=1.0.118", features = ["derive"] }
Maybe we need to update lsh-rs to be compatible with their new API?
Following #9 (comment), tracking separately: This makes sense. BLAS is only for squeezing out maximum performance and should definitely be opt-in.
It would be nice to be able to build the Python bindings with BLAS support on Mac.
Hello,
I'm curious to see if I can get this to work with plain K, L parametrized LSH. The setup is that I already have L Vecs, each with K (u32) hashes of the input sample, which I'd like to be able to feed directly into the LSH. The LSH hasher in this case then only queries for exact matches in each of the L hash tables and returns the union of all matches across all tables.
Is such a setup possible using the current API?
Thanks,
Paul
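The setup described above can be sketched with the standard library alone: one exact-match table per precomputed hash vector, with a query returning the union of matching ids. This is an illustration of the idea, not an answer about the lsh-rs API. `PrecomputedLsh` and its methods are hypothetical names.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical index: L exact-match tables keyed by K precomputed hashes.
struct PrecomputedLsh {
    k: usize,
    tables: Vec<HashMap<Vec<u32>, Vec<usize>>>,
}

impl PrecomputedLsh {
    fn new(k: usize, l: usize) -> Self {
        Self { k, tables: vec![HashMap::new(); l] }
    }

    /// Store `id` under its L precomputed keys, each of length K.
    fn insert(&mut self, id: usize, hashes: &[Vec<u32>]) {
        assert_eq!(hashes.len(), self.tables.len(), "expected L hash vectors");
        for (table, key) in self.tables.iter_mut().zip(hashes) {
            assert_eq!(key.len(), self.k, "expected K hashes per vector");
            table.entry(key.clone()).or_default().push(id);
        }
    }

    /// Union of all ids whose key matches exactly in at least one table.
    fn query(&self, hashes: &[Vec<u32>]) -> BTreeSet<usize> {
        assert_eq!(hashes.len(), self.tables.len(), "expected L hash vectors");
        let mut found = BTreeSet::new();
        for (table, key) in self.tables.iter().zip(hashes) {
            if let Some(ids) = table.get(key) {
                found.extend(ids.iter().copied());
            }
        }
        found
    }
}

fn main() {
    // K = 2 hashes per table, L = 2 tables.
    let mut index = PrecomputedLsh::new(2, 2);
    index.insert(0, &[vec![1, 2], vec![3, 4]]);
    index.insert(1, &[vec![1, 2], vec![9, 9]]);
    index.insert(2, &[vec![7, 7], vec![8, 8]]);
    let hits = index.query(&[vec![1, 2], vec![8, 8]]);
    println!("{:?}", hits);
}
```

Because `insert` takes the hashes as input, samples can be added one at a time after the index is built.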
Should we drop the Error return in the builder pattern and just panic? An error could be returned if we could not create a SQLite database.
Hi, I'm a Python user.
I enjoy using your wonderful algorithm.
While using it, I ran into a question.
Is there a function to add samples after indexing?
lsh = SRP(n_projections=19, n_hash_tables=10)
lsh.fit(data_points)
After building the index like this, I want to add some more data points,
but I don't want to re-index all the data points that are already indexed.
So I wondered whether there is a function that can add more data points.
Thanks for your help.
The crate::table::general::HashTables trait has a delete method that is only implemented for the in-memory backend. It could be useful for the SQLite backend as well.
See: https://github.com/ritchie46/lsh-rs/blob/master/lsh-rs/src/table/general.rs
Memory-mapped I/O can make the SQLite backend faster.
https://www.sqlite.org/mmap.html
This could be a configuration option for LSH<H, N, Sqlite<_>, K> that could be set via the builder pattern:
let lsh = LshSql::new(..)
.use_mmap()
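A rough sketch of how such a builder option might be shaped is shown below. `SqlBackendConfig`, `use_mmap` with a byte-count argument, and `pragmas` are all hypothetical and not part of lsh-rs. The only piece taken from SQLite itself is the `PRAGMA mmap_size=N` statement described on the page linked above. The sketch only builds the statement text and leaves executing it on a connection to the backend.

```rust
/// Hypothetical configuration for a SQLite-backed index; not lsh-rs API.
#[derive(Debug, Default)]
struct SqlBackendConfig {
    mmap_bytes: Option<u64>,
}

impl SqlBackendConfig {
    fn new() -> Self {
        Self::default()
    }

    /// Builder-style setter: request memory-mapped I/O up to `bytes`.
    fn use_mmap(mut self, bytes: u64) -> Self {
        self.mmap_bytes = Some(bytes);
        self
    }

    /// Statements the backend would run once after opening the connection.
    fn pragmas(&self) -> Vec<String> {
        self.mmap_bytes
            .map(|bytes| format!("PRAGMA mmap_size={};", bytes))
            .into_iter()
            .collect()
    }
}

fn main() {
    let config = SqlBackendConfig::new().use_mmap(256 * 1024 * 1024);
    for stmt in config.pragmas() {
        println!("{}", stmt);
    }
}
```

Keeping the option as plain data means the default (no mmap) stays unchanged for existing users, and the backend decides when to apply the pragma.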