Comments (7)
@lintool What should happen in the case where we have two file names that would in theory have the same associated symbol?
Ex. Symbol msmarco-v2-passage could be associated with qrels.msmarco-v2-passage.dev.txt and qrels.msmarco-v2-passage.dev2.txt.
Would it be okay to simply require symbol names to be longer?
In your example, msmarco-v2-passage would not be a valid symbol. The user would need to specify either dev or dev2.
As long as the file names are not identical (and they shouldn't be), it's fine. We specify which qrels we're using in the command line to trec_eval.
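To illustrate the point about distinct file names, here is a minimal Java sketch. It assumes a symbol is simply the file name with the `qrels.` prefix and `.txt` suffix removed, which matches the examples in this thread but is not confirmed as the actual rule. The class and method names (`QrelsSymbolCheck`, `symbolFor`, `matches`) are hypothetical and are not part of Anserini.

```java
import java.util.List;
import java.util.stream.Collectors;

public class QrelsSymbolCheck {
  // Assumed convention (not confirmed by the thread): strip the "qrels." prefix
  // and the ".txt" suffix from the file name to obtain the symbol.
  static String symbolFor(String fileName) {
    String s = fileName;
    if (s.startsWith("qrels.")) {
      s = s.substring("qrels.".length());
    }
    if (s.endsWith(".txt")) {
      s = s.substring(0, s.length() - ".txt".length());
    }
    return s;
  }

  // Exact matching only: a bare prefix such as "msmarco-v2-passage" matches nothing.
  static List<String> matches(String symbol, List<String> fileNames) {
    return fileNames.stream()
        .filter(f -> symbolFor(f).equals(symbol))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> files = List.of(
        "qrels.msmarco-v2-passage.dev.txt",
        "qrels.msmarco-v2-passage.dev2.txt");
    System.out.println(matches("msmarco-v2-passage", files));
    System.out.println(matches("msmarco-v2-passage.dev", files));
    System.out.println(matches("msmarco-v2-passage.dev2", files));
  }
}
```

Under this assumption, two files with different names can never share a symbol, so no extra disambiguation rule is needed.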
from anserini.
working on this
@xpbowler any progress here?
Sorry! I've been busy with finals these past 2 weeks. Last exam is tomorrow, so I'll get back on this after.
For the symbols, let's use the bindings here: https://github.com/castorini/anserini/blob/master/src/main/java/io/anserini/eval/Qrels.java
So instead of
java -cp anserini-0.35.1-fatjar.jar trec_eval -c -M 10 -m recip_rank tools/topics-and-qrels/qrels.msmarco-passage.dev-subset.txt run.msmarco-v1-passage.dev.bm25.txt
We can just do:
java -cp anserini-0.35.1-fatjar.jar trec_eval -c -M 10 -m recip_rank msmarco-passage.dev-subset run.msmarco-v1-passage.dev.bm25.txt
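One way the qrels argument could be handled is sketched below in Java. This is only an illustration of the idea: the map `BINDINGS`, the class `QrelsSymbolSketch`, and the method `resolve` are hypothetical names, and the single binding shown is copied from the two commands above rather than from Qrels.java. The sketch assumes that an argument which is not a known symbol should be passed through unchanged as a file path.

```java
import java.util.Map;
import java.util.Optional;

public class QrelsSymbolSketch {
  // Hypothetical binding for illustration only; the real list is the one in Qrels.java.
  static final Map<String, String> BINDINGS = Map.of(
      "msmarco-passage.dev-subset",
      "tools/topics-and-qrels/qrels.msmarco-passage.dev-subset.txt");

  // Returns the bound qrels path for a known symbol;
  // otherwise returns the argument unchanged, treating it as a file path.
  static String resolve(String arg) {
    return Optional.ofNullable(BINDINGS.get(arg)).orElse(arg);
  }

  public static void main(String[] args) {
    System.out.println(resolve("msmarco-passage.dev-subset"));
    System.out.println(resolve("my-own-qrels.txt"));
  }
}
```

Falling back to the raw argument keeps existing commands that pass an explicit qrels path working alongside the shorter symbol form.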
hey @xpbowler just to prevent duplicate work, I think @DanielKohn1208 is on this!