Comments (6)

RioMichael avatar RioMichael commented on August 23, 2024

For index construction time, it really depends on how you set the parameters (and how strong your CPU is). I used all default parameters, and it took 33893.343 s to build the index (1 thread only, on an Intel Core i5-8500). With more threads (e.g. 4 or more), this time would drop significantly. So I can't tell whether your index construction time is fast or slow (and I haven't tried NSG yet).

For query time, with the default "MaxCheck" value (8192), the average time per 100-neighbor search was 0.003575 s with a recall of 0.91352. With "MaxCheck" set to 16384, the average time was 0.006024 s for a recall of 0.94465. I wonder whether your time includes other things (e.g. loading the index, which takes about 6 s on my PC).
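To make sure index loading is not counted as query time, the two steps can be timed separately. Below is a minimal sketch of that idea in Python 3. `fake_load` and `make_fake_search` are hypothetical stand-ins (a brute-force scan over toy data), not SPTAG calls; replace them with your real load and search calls.

```python
import time


def time_queries(search, queries, k):
    """Return (total_seconds, mean_seconds_per_query, queries_per_second)."""
    start = time.perf_counter()
    for q in queries:
        search(q, k)
    total = time.perf_counter() - start
    mean = total / len(queries)
    qps = len(queries) / total if total > 0 else float("inf")
    return total, mean, qps


def fake_load():
    """Hypothetical stand-in for loading an index from disk."""
    return [[float(i), float(i % 7)] for i in range(1000)]


def make_fake_search(data):
    """Hypothetical stand-in for an ANN search call (exact linear scan)."""
    def search(query, k):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(vec, query)), idx)
            for idx, vec in enumerate(data)
        )
        return [idx for _, idx in dists[:k]]
    return search


# Time the load step on its own so it is not counted as query time.
t0 = time.perf_counter()
data = fake_load()
load_seconds = time.perf_counter() - t0

search = make_fake_search(data)
queries = [[float(i), 1.0] for i in range(20)]
total, mean, qps = time_queries(search, queries, k=10)
print(f"load: {load_seconds:.4f} s, search total: {total:.4f} s, "
      f"mean per query: {mean:.6f} s, QPS: {qps:.1f}")
```

Reporting the load time and the mean per-query time as separate numbers makes results from different machines easier to compare.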

I haven't tried the add and delete functions yet, but I plan to do so soon.
I would also suggest checking this website: http://ann-benchmarks.com/ (maybe you have already done so). All experiments there were run in Docker containers on Amazon EC2 c5.4xlarge instances equipped with an Intel Xeon Platinum 8124M CPU (16 cores available, 3.00 GHz, 25.0 MB cache) and 32 GB of RAM.

My environment:

Python 2.7
Windows 10

from sptag.

MrHwc avatar MrHwc commented on August 23, 2024

I used the official default parameters. It takes about 6 s to load the index, and querying 1000 vectors takes about 28 s. Why is my time longer than yours?
My machine:
Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz
Ubuntu 18.04
python3
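For a like-for-like comparison, the 28 s total can be converted to a per-query figure. The sketch below only redoes the arithmetic on the numbers quoted in this thread; it assumes the 28 s covers search only, and that both runs use the same dataset and the same number of neighbors, none of which the thread confirms.

```python
# Numbers quoted in the thread.
total_seconds = 28.0           # reported time for 1000 queries
num_queries = 1000
reference_per_query = 0.003575  # reported mean time per query at MaxCheck = 8192

per_query = total_seconds / num_queries
ratio = per_query / reference_per_query

print(f"per query: {per_query:.3f} s")       # 0.028 s
print(f"ratio to reference: {ratio:.1f}x")   # 7.8x
```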

RioMichael avatar RioMichael commented on August 23, 2024

For my GIST index, I would need to set an extreme parameter ("MaxCheck" to 500000, which would make no sense in a real-world situation since the data set contains only 1000000 vectors) to get a time similar to yours (with a recall of 0.95308).
I also rebuilt the index with a different configuration (fixing a small bug in the previous index); the search time was longer (with a much higher recall), but still much lower than yours.
I wonder whether other programs were using up your CPU while you ran the search. Other than that, I can't answer your question at the moment.
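Since the timings above are only comparable at equal recall, it helps to compute recall the same way on both sides. The thread does not say how recall was measured; the sketch below uses one common definition (the fraction of the exact top-k neighbor IDs that the approximate search returned, averaged over queries). The function name and the toy ID lists are illustrative only.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Mean fraction of the exact top-k IDs found in the approximate top-k."""
    hits = 0
    for approx, exact in zip(approx_ids, exact_ids):
        hits += len(set(approx[:k]) & set(exact[:k]))
    return hits / (k * len(exact_ids))


# Toy example: two queries, k = 4. The first query misses one true neighbor.
approx = [[1, 2, 3, 9], [4, 5, 6, 7]]
exact = [[1, 2, 3, 4], [4, 5, 6, 7]]
print(recall_at_k(approx, exact, k=4))  # 7 hits out of 8 -> 0.875
```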

MrHwc avatar MrHwc commented on August 23, 2024

Using the same index to reach the same recall (0.9), the server query time is 23.14 s and my PC query time is 13.57 s. Why is the server slower than my PC?
Server machine:
Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz
Ubuntu 18.04

my PC:
Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz 3.40GHz
Ubuntu 18.04

muzzynine avatar muzzynine commented on August 23, 2024

I ran performance tests, including some other algorithms, using the ann-benchmarks tool.
(I did not have time to run them on multiple datasets.)
The y-axis uses a logarithmic scale, so the actual performance gap is larger than it looks.
Given the index size, speed, and performance of the algorithm, I don't think these results are reasonable.
Please comment on my results or suggest additional tests.

Dataset : SIFT-128 1M
Experiment :

  • Perturbation tree : BKT
  • NeighborhoodSize : [4, 8, 16, 32, 64, 96]
  • MaxCheck : [100, 200, 400, 1000, 2000, 4000, 8000]
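The grid above amounts to 6 x 7 = 42 configurations. A rough sketch of how such a sweep could be organized follows, assuming NeighborhoodSize is applied at build time and MaxCheck at search time, so that each index is built once and reused across all MaxCheck values. `fake_build` and `fake_evaluate` are hypothetical placeholders, not part of SPTAG or ann-benchmarks, and they return no real measurements.

```python
neighborhood_sizes = [4, 8, 16, 32, 64, 96]
max_checks = [100, 200, 400, 1000, 2000, 4000, 8000]


def sweep(build, evaluate):
    """Build one index per NeighborhoodSize, then evaluate it at every MaxCheck."""
    results = []
    for ns in neighborhood_sizes:
        index = build(ns)  # expensive step, done once per NeighborhoodSize
        for mc in max_checks:
            row = {"NeighborhoodSize": ns, "MaxCheck": mc}
            row.update(evaluate(index, mc))
            results.append(row)
    return results


build_calls = []


def fake_build(ns):
    """Hypothetical placeholder for building an index."""
    build_calls.append(ns)
    return {"NeighborhoodSize": ns}


def fake_evaluate(index, mc):
    """Hypothetical placeholder; a real version would measure recall and QPS."""
    return {"recall": None, "qps": None}


results = sweep(fake_build, fake_evaluate)
print(len(results), "configurations,", len(build_calls), "index builds")
```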

QPS vs Recall
Index size vs Recall

muzzynine avatar muzzynine commented on August 23, 2024

I set optimization options in the build script and reran the tests.
The index size is a bit large, but I think the performance is reasonable.

QPS vs Recall

Unrelated to this issue: Why did you choose RNG as your data structure?
