
Comments (5)

albertz commented on May 24, 2024

Can you post the full dictionary of the s layer (or whatever you called it) in your new decoder?

However, it does not work.

Can you be more specific about that? If there is some exception, post the code in a Gist or Pastebin and link it here.

I am using an older version of RETURNN.

Always check with the latest version of RETURNN. Otherwise we might waste time discussing things which are already solved. (That is probably unlikely in this case, as I don't remember any related changes recently, but you never know...)

What if we want to reuse the entire rec-subnet?

This should already be possible, I think, by setting reuse_params of the rec layer itself. However, there are a lot of problematic edge cases where this will not work. E.g. I think it expects to find all the needed params in the reuse_layer, and will throw an exception otherwise. Maybe we could extend it to automatically fall back to creating a new var if one is not found. But this could be error-prone, e.g. you might miss that it did not actually share the params but just recreated all of them. That is why I usually prefer explicitness over too many automatic fallbacks, which could hide potential bugs.
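To make the trade-off above concrete, here is a minimal sketch in plain Python. It is an illustration only: `get_param_strict`, `get_param_with_fallback` and the string stand-ins for variables are hypothetical and are not RETURNN APIs. It shows how a fallback can hide a mistyped parameter name, while a strict lookup fails loudly.

```python
# Hypothetical illustration only; none of these names are RETURNN APIs.

def get_param_strict(shared_params, name):
    """Return an existing shared parameter, or fail loudly if it is missing."""
    if name not in shared_params:
        raise KeyError("param %r not found in reuse layer, have: %r" % (name, sorted(shared_params)))
    return shared_params[name]


def get_param_with_fallback(shared_params, name, create_new, created_log):
    """Return a shared parameter, or silently create a new one if it is missing."""
    if name in shared_params:
        return shared_params[name]
    created_log.append(name)  # without such a log, nobody would notice the fallback
    return create_new(name)


# Stand-ins for the variables of the layer we want to share with.
shared = {"s/rec/lstm_cell/kernel": "shared-kernel", "s/rec/lstm_cell/bias": "shared-bias"}

# A mistyped name ("kernal"): the fallback quietly returns a fresh parameter,
# so nothing is shared, and no error is raised.
created = []
value = get_param_with_fallback(shared, "s/rec/lstm_cell/kernal", lambda n: "new:" + n, created)
```

With the strict variant, the same typo raises a `KeyError` that lists the available names, which is the behaviour argued for above.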

from returnn-experiments.

iankur commented on May 24, 2024

Can you post the full dictionary of the s layer (or whatever you called it) in your new decoder?

The s sublayer definition is as follows:

"s": {
    "class": "rnn_cell", "unit": "LSTMBlock", "from": ["prev:target_embed", "prev:att"], "n_out": 1000,
    "reuse_params": {"map": {
        "kernel": {"reuse_layer": "base:output", "custom": (lambda reuse_layer, **kwargs: reuse_layer.params["s/rec/lstm_cell/kernel"])},
        "bias": {"reuse_layer": "base:output", "custom": (lambda reuse_layer, **kwargs: reuse_layer.params["s/rec/lstm_cell/bias"])}}},
}

I tried both kernel and lstm_cell/kernel as keys in map (and similarly for bias) to make sure the key name is not the issue.
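The two map entries in the definition above differ only in the parameter name, so they can be built by a small helper. This is only a sketch: `share_param` and `_FakeLayer` are hypothetical names introduced here, and the dict structure (`reuse_layer`, `custom`, `map`) simply mirrors the config posted above. It does not address the variable-scope problem discussed in this thread.

```python
def share_param(reuse_layer_name, param_name):
    """Build one reuse_params map entry that looks up param_name in the reuse layer."""
    def custom(reuse_layer, **kwargs):
        # Same lookup as the lambdas in the posted config.
        return reuse_layer.params[param_name]
    return {"reuse_layer": reuse_layer_name, "custom": custom}


s_layer = {
    "class": "rnn_cell", "unit": "LSTMBlock",
    "from": ["prev:target_embed", "prev:att"], "n_out": 1000,
    "reuse_params": {"map": {
        "kernel": share_param("base:output", "s/rec/lstm_cell/kernel"),
        "bias": share_param("base:output", "s/rec/lstm_cell/bias"),
    }},
}


class _FakeLayer:
    """Hypothetical stand-in for a layer object, only to exercise the helper."""
    params = {"s/rec/lstm_cell/kernel": "K", "s/rec/lstm_cell/bias": "B"}


# Quick check outside of RETURNN: each custom function picks the intended param.
kernel = s_layer["reuse_params"]["map"]["kernel"]["custom"](reuse_layer=_FakeLayer())
bias = s_layer["reuse_params"]["map"]["bias"]["custom"](reuse_layer=_FakeLayer())
```

Using a named inner function instead of a lambda also keeps each entry on a short line, which makes a wrong parameter path easier to spot.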

Can you be more specific about that? If there is some exception, post the code in a Gist or Pastebin and link it here.

I am not getting any error with my RETURNN version. It simply creates the new variables required for this layer (kernel and bias). I also tried setting reuse_params on the new decoder to share the entire decoder. In that case, new variables are created for all the sublayers.

I am having some issues running with the latest code, so I can't say anything with certainty there. However, I guess the latest version will also have this issue, since it may be related to variable scoping. For some reason, the rnn_cell creation logic (here) does not use var_creation_scope, which reuses the variable scope if the parameter is shared (here).

albertz commented on May 24, 2024

Ah I see. I don't exactly remember why that was the case.
Btw, instead of RnnCellLayer, you can also use RecLayer now (should work fine, also inside another rec layer), and I think that uses var_creation_scope correctly.
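As a rough sketch of that suggestion, the s sublayer could be declared as a rec layer instead of an rnn_cell layer. The unit name "nativelstm2" is an assumption chosen for illustration and is not taken from this thread. reuse_params is deliberately left out here: the parameter names under RecLayer may differ from the lstm_cell/kernel and lstm_cell/bias names used above, so any map would need to be adapted and checked.

```python
# Sketch only: the "s" sublayer as a RecLayer ("class": "rec") instead of RnnCellLayer.
# "nativelstm2" is an assumed unit name; use whichever LSTM unit your setup supports.
s_layer_rec = {
    "class": "rec",
    "unit": "nativelstm2",
    "from": ["prev:target_embed", "prev:att"],
    "n_out": 1000,
}
```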

iankur commented on May 24, 2024

Thanks Albert. I will check as per your suggestion and follow up in case of any new issues.

albertz commented on May 24, 2024

I just noticed that you opened this issue in the wrong repo. This is about RETURNN itself, not about the experiments.
I pushed a commit which should fix this.
