Comments (13)
Sure, I can work with @jpountz on the upgrade (and perhaps on config options for enabling quantization in HNSW in Anserini).
from anserini.
Definitely! I'm in the middle of running our regressions, and then planning on merging in #2275 which is a big code dump.
But let's queue up after that?
BTW, are there any new codecs introduced that we gotta upgrade? The HNSW indexer currently hard-codes Lucene95Codec:
https://github.com/castorini/anserini/blob/master/src/main/java/io/anserini/index/IndexHnswDenseVectors.java#L283C14-L283C14
But let's queue up after that?
Sure, no hurry.
are there any new codecs introduced that we gotta upgrade?
Indeed, you'll need to replace Lucene95Codec with Lucene99Codec, and Lucene95HnswVectorsFormat with Lucene99HnswVectorsFormat, when you upgrade. There are new options as well, e.g. you could use Lucene99HnswScalarQuantizedVectorsFormat to enable int8 scalar quantization.
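For reference, the swap in Anserini's indexer might look roughly like the sketch below, assuming the Lucene 9.9 constructors; `analyzer`, `M`, and `beamWidth` stand in for the indexer's existing objects and HNSW parameters (illustrative names, not the actual Anserini code):

```java
import org.apache.lucene.codecs.KnnVectorsFormat;
import org.apache.lucene.codecs.lucene99.Lucene99Codec;
import org.apache.lucene.codecs.lucene99.Lucene99HnswScalarQuantizedVectorsFormat;
import org.apache.lucene.codecs.lucene99.Lucene99HnswVectorsFormat;
import org.apache.lucene.index.IndexWriterConfig;

// Sketch only: replaces the hard-coded Lucene95Codec in IndexHnswDenseVectors.
// M and beamWidth are placeholders for the indexer's existing HNSW settings.
IndexWriterConfig config = new IndexWriterConfig(analyzer);
config.setCodec(new Lucene99Codec() {
  @Override
  public KnnVectorsFormat getKnnVectorsFormatForField(String field) {
    // Plain float32 HNSW:
    //   return new Lucene99HnswVectorsFormat(M, beamWidth);
    // Or int8 scalar quantization:
    return new Lucene99HnswScalarQuantizedVectorsFormat(M, beamWidth);
  }
});
```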
int8
Nice. Is there int16 or float16 as an intermediate step as well?
When we're ready for that, can you and @tteofili work on that together?
Nice. Is there int16 or float16 as an intermediate step as well?
Not at this point; we're missing native support for float16 in the JVM.
Nice. Is there int16 or float16 as an intermediate step as well?
Not at this point; we're missing native support for float16 in the JVM.
And do you have numbers on speed/effectiveness tradeoffs vs. full float32?
If not, I guess we should rerun https://arxiv.org/abs/2308.14963 ?
Mileage varies; the main benefit is that you only need one byte per dimension in RAM to get decent performance, vs. 4 bytes per dimension without scalar quantization. So this allows addressing more data with the same amount of RAM.
It turns out that we accidentally turned on quantization in Lucene's nightly benchmarks between Nov 13th and yesterday; there was a noticeable ~30% speedup, even though all vectors already fit in memory at 4 bytes per dimension. http://people.apache.org/~mikemccand/lucenebench/VectorSearch.html
@benwtrent might have more info than I do.
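To make the one-byte-vs-four-bytes point concrete, here is a back-of-the-envelope estimate with made-up numbers (10M vectors, 768 dimensions); it ignores HNSW graph links and other index overhead:

```java
// Illustrative RAM estimate for raw vector storage only.
public class VectorRamEstimate {
  public static void main(String[] args) {
    long numVectors = 10_000_000L; // hypothetical corpus size
    int dims = 768;                // e.g. a typical BERT-style embedding

    long float32Bytes = numVectors * dims * 4L; // 4 bytes per dimension
    long int8Bytes = numVectors * dims * 1L;    // 1 byte per dimension

    System.out.printf("float32: %.1f GB%n", float32Bytes / 1e9); // 30.7 GB
    System.out.printf("int8:    %.1f GB%n", int8Bytes / 1e9);    // 7.7 GB
  }
}
```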
For reference, there have been lots of performance improvements in 9.8 and 9.9 for sparse retrieval too, see e.g. http://people.apache.org/~mikemccand/lucenebench/OrHighHigh.html over recent months. One optimization in particular, apache/lucene#12444 (annotation FK on the nightly charts, and a blog post that describes the optimization), should help significantly with cases that are hard for dynamic pruning, such as learned sparse representations. So I would expect much better numbers for Lucene if you were to run the benchmarks from https://arxiv.org/abs/2110.11540 again.
And do you have numbers on speed/effectiveness tradeoffs vs. full float32?
The PR that did the change has a few more numbers about speed and effectiveness: apache/lucene#12582 (comment)
re: HNSW - yup, I suppose faster is a given... my question is more about how much you give up in terms of effectiveness...
And do you have numbers on speed/effectiveness tradeoffs vs. full float32?
The PR that did the change has a few more numbers about speed and effectiveness: apache/lucene#12582 (comment)
Thanks, this is good info.
But as I always say... you need a real search task like MS MARCO, BEIR, etc.
The JVM just doesn't support f16. Reading from disk, doing fast vector operations, etc., it's just bad, even in JDK 21.
There have been steps to fix this (finally adding an intrinsic for de/encoding f16), but it's not there yet.
We cannot add f16 until there is something in Panama Vector that handles it.
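For context, the intrinsic mentioned above is presumably the pair of scalar conversion methods that landed in JDK 20, Float.floatToFloat16 and Float.float16ToFloat. They only de/encode one value at a time, which is why bulk f16 vector math still needs Panama Vector support. A minimal demo (requires JDK 20+):

```java
// JDK 20+ scalar float16 conversions; no float16 lane type exists in
// Panama Vector yet, so this cannot be vectorized across a whole array.
public class Float16Demo {
  public static void main(String[] args) {
    float original = 1.5f;                       // exactly representable in f16
    short bits = Float.floatToFloat16(original); // encode to a 16-bit pattern
    float back = Float.float16ToFloat(bits);     // decode back to float32
    System.out.println(original == back);        // prints true

    // Values beyond f16's 10-bit mantissa get rounded on the round trip:
    float pi = 3.14159265f;
    System.out.println(Float.float16ToFloat(Float.floatToFloat16(pi)));
  }
}
```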
Upgrade completed #2302