Comments (4)
What I understand from the quote below is that they only use the logits (raw scores) to compare answer spans, instead of the probabilities obtained after applying the softmax function.
to allow comparison and aggregation of results from different segments, we remove the final softmax layer over different answer spans.
@fmikaelian what do you think?
from cdqa.
Here are my takeaways:
- They use Anserini as their document retriever, which is built on the open-source Lucene library. It uses the BM25 ranking function.
- There are 2 types of retrievers: single-stage vs. multi-stage.
- Article retrieval underperforms paragraph retrieval by a large margin.
- To score predictions they use a weighted linear interpolation between the BERT and Anserini scores: S = (1 - μ) * S_anserini + μ * S_bert, with μ = 0.5.
- In production mode they retrieve k = 10 paragraphs.
- Answering a question with this setup takes 2.35 seconds on average with a Tesla P40 GPU.
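As a rough sketch of the interpolation listed in the takeaways, the snippet below ranks a few candidate answers by S = (1 - μ) * S_anserini + μ * S_bert. The function name `interpolate_score` and the candidate scores are made up for illustration; they do not come from the paper or from cdQA.

```python
def interpolate_score(s_anserini: float, s_bert: float, mu: float = 0.5) -> float:
    """Weighted linear interpolation of a retriever score and a reader score."""
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must be in [0, 1]")
    return (1.0 - mu) * s_anserini + mu * s_bert


# Hypothetical candidates: (answer text, Anserini/BM25 score, BERT span score).
candidates = [
    ("answer A", 12.0, 3.0),
    ("answer B", 9.0, 8.0),
    ("answer C", 14.0, 0.5),
]

# Pick the candidate with the highest interpolated score (mu = 0.5 by default).
best = max(candidates, key=lambda c: interpolate_score(c[1], c[2]))
print(best[0])
```

With μ = 0.5 both scores weigh equally, so a paragraph that the retriever ranks highest can still lose to one where the reader is much more confident.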
They modified BERT to compare predictions in a meaningful way (see #36):
to allow comparison and aggregation of results from different segments, we remove the final softmax layer over different answer spans.
But this is to be clarified.
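One possible reading of the quote, shown with made-up numbers: a softmax normalizes within each segment, so a weak span can get a probability near 1 simply because its neighbours in the same segment are even weaker. The logits below are hypothetical and only illustrate why raw scores may be easier to compare across segments.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical span logits for two segments of the same document.
segment_a = [9.0, 8.5, 8.0]    # several strong candidate spans
segment_b = [2.0, -3.0, -4.0]  # one weak span that dominates its own segment

# Per-segment probabilities: each list sums to 1 independently of the other.
best_prob_a = max(softmax(segment_a))
best_prob_b = max(softmax(segment_b))

# Raw logits: directly comparable because no per-segment normalization happened.
best_logit_a = max(segment_a)
best_logit_b = max(segment_b)

print(best_prob_a < best_prob_b)    # segment B wins on probability
print(best_logit_a > best_logit_b)  # segment A wins on raw score
```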
Yes
It would be useful to cross-check with Danqi Chen's thesis, where she mentions something about this in the context of the DrQA system:
https://cs.stanford.edu/~danqi/papers/thesis.pdf
We should also follow this thread: huggingface/transformers#360
In section 5.2.3 of Danqi Chen's thesis:
We apply our trained DOCUMENT READER for each single paragraph that appears in the top 5 Wikipedia articles and it predicts an answer span with a confidence score. To make scores compatible across paragraphs in one or several retrieved documents, we use the unnormalized exponential and take argmax over all considered paragraph spans for our final prediction. This is just a very simple heuristic and there are better ways to aggregate evidence over different paragraphs
This part seems to be implemented here:
And here:
Our predict() function does not currently return this confidence score. How can we get it in our setup and modify it for comparison?
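A minimal sketch of what this could look like, assuming the reader exposes per-token start and end scores for each paragraph before any softmax. The helpers `best_span` and `predict_across_paragraphs`, the start-plus-end span score, and the toy scores are all hypothetical; this is not the actual cdQA or DrQA code.

```python
import math


def best_span(start_scores, end_scores, max_len=15):
    """Return (score, start, end) for the span with the highest raw score.

    The span score is the sum of the raw start and end scores (an assumption).
    """
    best = None
    for i, s in enumerate(start_scores):
        for j in range(i, min(i + max_len, len(end_scores))):
            score = s + end_scores[j]
            if best is None or score > best[0]:
                best = (score, i, j)
    return best


def predict_across_paragraphs(paragraphs):
    """paragraphs: list of (tokens, start_scores, end_scores) tuples."""
    results = []
    for tokens, start_scores, end_scores in paragraphs:
        score, i, j = best_span(start_scores, end_scores)
        results.append({
            "answer": " ".join(tokens[i:j + 1]),
            # Unnormalized exponential: no division by a per-paragraph sum.
            "confidence": math.exp(score),
        })
    # Argmax over the best span of every paragraph.
    return max(results, key=lambda r: r["confidence"])


# Toy example with made-up scores for two paragraphs.
paragraphs = [
    (["Paris", "is", "the", "capital"], [4.0, 0.1, 0.2, 0.3], [3.5, 0.1, 0.0, 1.0]),
    (["It", "rains", "in", "Lyon"], [0.5, 0.2, 0.1, 2.0], [0.1, 0.3, 0.2, 2.5]),
]
result = predict_across_paragraphs(paragraphs)
print(result["answer"])
```

Returning the confidence next to the answer would also let it be combined with a retriever score afterwards.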