
Comments (5)

memray commented on September 26, 2024

Thank you @DevSinghSachan ! Really appreciate your help!

from emdr2.

DevSinghSachan commented on September 26, 2024

Hi @memray,

Thanks for running the code and for the replication effort. I was able to reproduce the issue on my end. It was caused by an incorrect evidence embedding pickle file that I had uploaded for TriviaQA. The correct file can be downloaded with:
wget https://www.dropbox.com/s/ydav0jzny4ihztm/trivia-ssm-step3250.pkl

You should get performance close to (dev: 71.33 / test: 71.38) with this embedding file. Can you try this out and let me know if this works?
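
Before rerunning evaluation, it can help to confirm that the downloaded file unpickles cleanly. The following is only a minimal sketch: the helper name `describe_pickle` is hypothetical, and the internal layout of the embedding file is not described in this thread, so the code reports only the top-level type and length. Unpickling can execute arbitrary code, so it should only be done on files from a trusted source, and the file may need a lot of memory and any libraries its objects depend on.

```python
import pickle
from typing import Any, Dict


def describe_pickle(path: str) -> Dict[str, Any]:
    """Load a pickle file and report its top-level type and length (if any).

    Hypothetical helper for a quick sanity check; it makes no assumption
    about how the embeddings are organised inside the file.
    """
    with open(path, "rb") as handle:
        obj = pickle.load(handle)  # only unpickle files you trust
    length = len(obj) if hasattr(obj, "__len__") else None
    return {"type": type(obj).__name__, "length": length}


# Example usage, assuming the file was downloaded to the working directory:
# print(describe_pickle("trivia-ssm-step3250.pkl"))
```

If the call raises an error or reports an unexpectedly small length, the download is probably incomplete.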

Responses to the other questions:

  1. Currently, we do not plan to release the code for MSS pre-training with this repo. If we do release it, I will update the README.
  2. Good catch. I can add a link to the BERT checkpoint in the README. However, we do not recommend using this BERT checkpoint for dense retrievers. We recommend using the MSS pre-trained retriever, which we always found to be optimal. The BERT checkpoint in the retriever scripts is just a placeholder: it is loaded first and then replaced by the MSS checkpoint.
  3. Currently, this code will work fine for EMDR2. We had our FiD implementations in separate branches, but we did not include them in order to keep the codebase simple. Our point in including these ablations was to show that, while the performance of FiD depends on the accuracy of the initial retriever, the performance of EMDR2 is independent of the initial retriever, provided that we have a good unsupervised retriever such as MSS. So, we report the performance of FiD with the MSS retriever, and then show that when EMDR2 is trained starting from the MSS retriever, answer generation improves considerably. The difference between the scores of FiD (MSS retriever; MSS reader) and EMDR2 (MSS retriever; MSS reader) shows how much end-to-end training improves the retriever's retrieval accuracy and the reader's answer generation.

Hope this helps!


memray commented on September 26, 2024

Hi @DevSinghSachan ,

Thank you very much for the detailed reply and updated resources. I'll test it with the new embedding and let you know if I can reproduce it as soon as I can.

I'd really appreciate it if you could release the MSS pre-training code. It would be very helpful for the community in reproducing your results and exploring novel pre-training ideas.

Regarding the FiD ablation, if I understand it correctly, the retriever of FiD is fixed during the reader training, isn't it? But in EMDR2, the retriever is also updated due to the end-to-end training.

Thank you,
Rui


memray commented on September 26, 2024

Hi @DevSinghSachan ,

I tested the new embedding and it works. Thank you!

Best,
Rui


DevSinghSachan commented on September 26, 2024

Thanks @memray for testing with the updated embedding!

I have now added the details of the MSS training data in the README (https://github.com/DevSinghSachan/emdr2#data-for-masked-salient-spans-mss-training-optional).

  1. For pre-training, I masked one or more named entities and predicted them with the EMDR2 setup, just as we do with question-answer pairs. Hope this is helpful.

  2. Yes, as you mentioned, in FiD, the retriever weights are fixed during training, while in EMDR2 both the reader and the retriever are jointly trained.
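
The masking step in point 1 can be illustrated with a small sketch. This is not the repository's pre-processing code: the function name `make_mss_examples`, the `<mask>` placeholder, and the assumption that named entities arrive as plain strings (for example from an NER tagger) are all hypothetical. As a simplification, it masks a single entity per example and only its first occurrence, producing (masked text, answer) pairs that can then be treated like question-answer pairs.

```python
from typing import List, Tuple

# Hypothetical placeholder; the real mask token depends on the tokenizer in use.
MASK_TOKEN = "<mask>"


def make_mss_examples(sentence: str, entities: List[str]) -> List[Tuple[str, str]]:
    """Build one (masked_sentence, answer) pair per entity found in the sentence.

    Only the first verbatim occurrence of each entity is masked; entities
    that do not appear in the sentence are skipped.
    """
    examples = []
    for entity in entities:
        start = sentence.find(entity)
        if start == -1:
            continue  # entity not present verbatim
        end = start + len(entity)
        masked = sentence[:start] + MASK_TOKEN + sentence[end:]
        examples.append((masked, entity))
    return examples


if __name__ == "__main__":
    text = "Marie Curie was born in Warsaw."
    for question, answer in make_mss_examples(text, ["Marie Curie", "Warsaw"]):
        print(question, "->", answer)
```

Each masked sentence plays the role of the question and the removed entity plays the role of the answer.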

Best,
Devendra

