Comments (5)
Can you post the full dictionary of the `s` layer (or however you called it) in your new decoder?
> However, it does not work.
Can you be more specific about that? If there is some exception, post the code in a Gist or Pastebin and link it here.
> I am using an older version of RETURNN.
Always check with the latest version of RETURNN; otherwise we might waste time discussing things which are already solved. (Even if that is maybe unlikely in this case, as I don't remember any related changes recently, you never know...)
> What if we want to reuse the entire rec-subnet?
This should already be possible, I think, by setting `reuse_params` of the rec layer itself. However, there are a lot of problematic edge cases where this will not work. E.g. I think it expects to find all the needed params in the `reuse_layer`, and otherwise it will throw an exception. Maybe we could extend it to automatically fall back to creating a new variable if one is not found. But this could be error-prone, e.g. you might miss that it actually did not share the params but just recreated all of them. That is why I usually prefer explicitness, and not too many automatic fallbacks which could hide potential bugs.
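A minimal sketch of what that could look like, assuming a base network whose decoder rec layer is called `output`, and assuming `reuse_params` also accepts a layer name directly as an alternative to the `{"map": ...}` dict form (all names here are illustrative, not from this issue):

```python
# Hypothetical sketch: reuse all params of the base decoder's rec layer
# by setting reuse_params on the rec layer itself. As noted above, it
# expects to find every needed param in the reuse layer and otherwise
# throws an exception; there is no automatic fallback to new variables.
"new_decoder": {
    "class": "rec", "from": [],
    "unit": {
        # ... same sub-network structure as in "base:output" ...
    },
    "reuse_params": "base:output",  # reuse the whole base decoder
}
```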
> Can you post the full dictionary of the `s` layer (or however you called it) in your new decoder?
The `s` sublayer definition is as follows:
"s": {"class": "rnn_cell", "unit": "LSTMBlock", "from": ["prev:target_embed", "prev:att"], "n_out": 1000,
"reuse_params": {"map": {
"kernel": {"reuse_layer": "base:output", "custom": (lambda reuse_layer, **kwargs: reuse_layer.params["s/rec/lstm_cell/kernel"])},
"bias": {"reuse_layer": "base:output", "custom": (lambda reuse_layer, **kwargs: reuse_layer.params["s/rec/lstm_cell/bias"])}}}
}
I tried with both `kernel` and `lstm_cell/kernel` as keys in `map` (similarly for `bias`) to make sure it is not an issue.
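One way to find out which keys actually exist is to print them from inside such a `custom` function (a debugging sketch; `debug_custom` is a hypothetical helper, not RETURNN API):

```python
# Hypothetical debugging aid: list the param names available on the
# reuse layer, to find the right key ("kernel" vs. "lstm_cell/kernel").
def debug_custom(reuse_layer, **kwargs):
    print("available params:", sorted(reuse_layer.params.keys()))
    return reuse_layer.params["s/rec/lstm_cell/kernel"]
```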
> Can you be more specific about that? If there is some exception, post the code in a Gist or Pastebin and link it here.
I am not getting any error with my RETURNN version. It simply creates new variables for this layer (`kernel` and `bias`). I also tried setting `reuse_params` of the new decoder to share the entire decoder; in that case, new variables are created for all the sublayers.
I am having some issues running with the latest code, so I can't say anything with certainty there. However, I guess the latest version will also have this issue, since it may be associated with variable scoping. For some reason, the `rnn_cell` creation logic (here) does not use `var_creation_scope`, which reuses the variable scope if a parameter is shared (here).
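The suspicion, in plain TensorFlow terms: `get_variable` only shares a variable if it runs inside a variable scope that allows reuse; created in a fresh scope, it silently makes a new one. A minimal TF1-style illustration (not RETURNN code):

```python
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()

# Inside a reuse-enabled scope, get_variable returns the same variable.
with tf1.variable_scope("dec", reuse=tf1.AUTO_REUSE):
    a = tf1.get_variable("kernel", shape=[4, 4])
with tf1.variable_scope("dec", reuse=tf1.AUTO_REUSE):
    b = tf1.get_variable("kernel", shape=[4, 4])  # shared with `a`

# In a different scope, a new, unshared variable is silently created.
with tf1.variable_scope("dec2"):
    c = tf1.get_variable("kernel", shape=[4, 4])

assert a is b
assert a is not c
```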
Ah, I see. I don't exactly remember why that was the case.
Btw, instead of `RnnCellLayer`, you can now also use `RecLayer` (should work fine, also inside another rec layer), and I think that uses `var_creation_scope` correctly.
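A hypothetical variant of the `s` layer from above using `RecLayer` (`"class": "rec"`) instead of `RnnCellLayer`; the `"lstm"` unit name is an assumption, and the parameter names inside this layer will differ from the `rnn_cell` variant, so any `reuse_params` map would need to be adapted to the keys actually found in `reuse_layer.params`:

```python
# Sketch only: RecLayer as a per-step LSTM inside the decoder rec loop.
"s": {
    "class": "rec", "unit": "lstm",
    "from": ["prev:target_embed", "prev:att"],
    "n_out": 1000,
}
```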
Thanks, Albert. I will check as per your suggestion and follow up in case of any new issues.
I just noticed that you opened this issue in the wrong repo. This is about RETURNN itself, not about the experiments.
I pushed a commit which should fix this.