
Load model from Opus-MT (nematus issue, closed)

lhk avatar lhk commented on August 16, 2024
Load model from Opus-MT

from nematus.

Comments (6)

emjotde avatar emjotde commented on August 16, 2024

Hi, marian-nmt maintainer here. That assumption hasn't been true for a while. Only a certain class of RNN-based models used to be compatible with the old Theano-based Nematus. Not sure if that is still the case.


lhk avatar lhk commented on August 16, 2024

Hi, thanks for the quick feedback :)

The pretrained models seem to be in a very readable format. It's lots of .npy files.

Is there some documentation on the layout? Can I manually reproduce the corresponding model in TensorFlow and read in the weights?
I would love to use marian-nmt or opus-mt, but for deployment it has to be TensorFlow.
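
For anyone who wants to look at such files before attempting a port, here is a minimal sketch that lists the name, shape, and dtype of every `.npy` array in a directory. It assumes one array per file and that the file name doubles as the parameter name; the helper `summarize_weights` and the parameter names in the demo are made up for illustration and do not reflect any documented layout.

```python
import tempfile
from pathlib import Path

import numpy as np


def summarize_weights(model_dir):
    """Return {parameter_name: (shape, dtype)} for every .npy file in model_dir."""
    summary = {}
    for path in sorted(Path(model_dir).glob("*.npy")):
        # allow_pickle=False: only plain arrays are expected, nothing executable.
        array = np.load(path, allow_pickle=False)
        summary[path.stem] = (array.shape, str(array.dtype))
    return summary


if __name__ == "__main__":
    # Demo on a throwaway directory; the names below are hypothetical.
    with tempfile.TemporaryDirectory() as tmp:
        np.save(Path(tmp) / "encoder_Wemb.npy", np.zeros((8, 4), dtype=np.float32))
        np.save(Path(tmp) / "decoder_ff_b.npy", np.zeros((4,), dtype=np.float32))
        for name, (shape, dtype) in summarize_weights(tmp).items():
            print(f"{name}: shape={shape}, dtype={dtype}")
```

A listing like this only tells you what is stored, not how the toolkit uses each array, so it is a starting point for a port rather than a substitute for documentation.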


emjotde avatar emjotde commented on August 16, 2024

The Hugging Face people are cooking something up right now. So maybe just wait? I am sure they will announce it.


rsennrich avatar rsennrich commented on August 16, 2024

If you can't wait, or just for those interested: if multiple toolkits implement the same, well-defined architecture (like the Transformer), it's possible in principle to map the parameters from one to the other. We also wrote such a conversion to port RNN models from our Theano to our TensorFlow codebase: https://github.com/EdinburghNLP/nematus/blob/master/nematus/theano_tf_convert.py
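
To make the idea of mapping parameters concrete, here is a rough sketch of what such a conversion usually boils down to: a table from source names to target names, an optional transpose, and a shape check for every array. This is not the code from `theano_tf_convert.py`; the function `convert`, the `NAME_MAP` table, and all parameter names in it are hypothetical.

```python
import numpy as np

# Hypothetical mapping: source name -> (target name, transpose?).
NAME_MAP = {
    "encoder_l1_self_Wq": ("encoder/layer_1/self_attn/q/kernel", False),
    "encoder_l1_ffn_W1": ("encoder/layer_1/ffn/dense_1/kernel", True),
}


def convert(source_params, target_shapes, name_map=NAME_MAP):
    """Rename (and optionally transpose) arrays, checking every target shape."""
    # Fail loudly on anything the mapping does not cover, instead of dropping it.
    unmapped = set(source_params) - set(name_map)
    if unmapped:
        raise ValueError(f"unmapped source parameters: {sorted(unmapped)}")

    converted = {}
    for src_name, (dst_name, transpose) in name_map.items():
        if src_name not in source_params:
            raise KeyError(f"missing source parameter: {src_name}")
        array = source_params[src_name]
        if transpose:
            array = array.T
        expected = tuple(target_shapes[dst_name])
        if array.shape != expected:
            raise ValueError(
                f"{src_name} -> {dst_name}: shape {array.shape} != {expected}"
            )
        converted[dst_name] = np.ascontiguousarray(array)
    return converted
```

The strict checks are deliberate: a silently skipped or mis-shaped parameter tends to produce a model that loads fine and translates garbage.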

However, the devil is in the details. For example, implementations may have slight differences in the architecture (some well-known variants are pre-norm and post-norm Transformers, see https://arxiv.org/pdf/1906.01787.pdf ), and if the architecture differs, you will not be able to port models between toolkits by just copying the weights.
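
As a small illustration of the pre-norm/post-norm point, the sketch below applies the same sublayer weights in both orderings and shows that the outputs differ. It is simplified on purpose: `layer_norm` has no learned gain or bias, the sublayer is a single matrix product, and all function names are made up for this example rather than taken from any toolkit.

```python
import numpy as np


def layer_norm(x, eps=1e-6):
    """Normalize the last axis to zero mean and unit variance (no gain/bias)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def post_norm_block(x, sublayer):
    """Post-norm: residual add first, then normalize the sum."""
    return layer_norm(x + sublayer(x))


def pre_norm_block(x, sublayer):
    """Pre-norm: normalize the sublayer input, keep the residual path untouched."""
    return x + sublayer(layer_norm(x))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(2, 4))
    w = rng.normal(size=(4, 4))
    sublayer = lambda h: h @ w  # stand-in for attention or a feed-forward layer

    post = post_norm_block(x, sublayer)
    pre = pre_norm_block(x, sublayer)
    print("same output:", np.allclose(post, pre))
```

The weight matrix has the same shape in both variants, so a shape check alone will not catch this kind of mismatch.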


emjotde avatar emjotde commented on August 16, 2024

If it was only architecture, that would be easy :) E.g. between Marian and Fairseq, one of the differences is that the starting point of the positional embeddings is shifted by 1. Try finding that by yourself.

Then again it's an opportunity to learn all the things you never want to know about not one but two toolkits! :)
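
To show why a shifted starting point matters, here is a minimal sketch of sinusoidal positional encodings with a configurable first position. The `start` parameter and both function names are hypothetical, the interleaved sin/cos layout is an arbitrary choice, and the code makes no claim about which toolkit uses which starting value.

```python
import math


def sinusoidal_position(pos, dim):
    """Sinusoidal encoding of a single position, sin and cos interleaved."""
    vec = []
    for i in range(0, dim, 2):
        angle = pos / (10000 ** (i / dim))
        vec.append(math.sin(angle))
        if i + 1 < dim:
            vec.append(math.cos(angle))
    return vec


def position_table(length, dim, start=0):
    """Encodings for `length` tokens, where the first token gets position `start`."""
    return [sinusoidal_position(start + t, dim) for t in range(length)]


if __name__ == "__main__":
    from_zero = position_table(3, 4, start=0)
    from_one = position_table(3, 4, start=1)
    # Row t of the shifted table equals row t + 1 of the unshifted one, so
    # every token would see the encoding intended for its right-hand neighbour.
    print(from_one[0] == from_zero[1])
    print(from_one == from_zero)
```

Nothing about the stored weights reveals such a shift, since sinusoidal encodings are computed rather than saved; it only shows up when comparing outputs or reading the code of both toolkits.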


emjotde avatar emjotde commented on August 16, 2024

Here is something that potentially works. Please direct all questions to the Hugging Face people :)

https://github.com/huggingface/transformers/blob/master/src/transformers/convert_marian_to_pytorch.py

