sgn's People

Contributors

hwang1996

Forkers

zhushaoquan

sgn's Issues

Unable to download pre-trained model

I tried to download the pre-trained model, but I got this error:

User account '' from identity provider '' does not exist in tenant 'Nanyang Technological University' and cannot access the application '08e18876-6177-487e-b8b5-cf950c1e598c'.

Could you please let me know whether the pre-trained model is available from somewhere else?

About initialization of hidden states in sentence-level ON-LSTM

Hi, thank you for providing the source code of your paper. I am very interested in your work.
I'd like to ask a question about how the hidden states are initialized in ON-LSTM.

From line 126 to 146 of train.py, the code initializes the hidden states before the batch loop starts (L125~L127).
Then the sentence-level ON-LSTM hidden state, namely hidden, is used for training (L146).
However, in the next iteration, this hidden is reused after being detached, even though it is unrelated to the next batch.

I think hidden should be re-initialized to zero vectors at every iteration.
Could you tell me why the previous hidden state is reused during training?

Thanks

    hidden = model.init_hidden(args.batch_size)
    hidd = model.init_hidden(args.batch_size*19)
    hidd_cand = model.init_hidden(args.batch_size*4)
    batch = 0
    acc_list = []
    total_var = 0

    it = tqdm(range(len(train_dataloader)), desc="Epoch {}/{}".format(epoch, args.epochs), ncols=0)
    data_iter = iter(train_dataloader)
    for niter in it:
        input_ids, cand_ids, target = next(data_iter)
        if args.cuda:
            input_ids = input_ids.cuda()
            cand_ids = cand_ids.cuda()
            targets = target.cuda()

        # Detach the hidden states left over from the previous iteration (their values are kept).
        hidden = repackage_hidden(hidden)
        hidd = repackage_hidden(hidd)
        hidd_cand = repackage_hidden(hidd_cand)
        optimizer.zero_grad()

        output, result_prob, hidden, rnn_hs, dropped_rnn_hs, cand_emb = model(input_ids, cand_ids, hidden, hidd, hidd_cand)
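
To make the two options in the question concrete, here is a minimal sketch. It assumes that `repackage_hidden` behaves like the helper of the same name found in common PyTorch language-model training scripts, i.e. it only detaches the state from the old computation graph and keeps its values; the repository's own version is not shown above, so this is an assumption. `next_hidden` and its `carry_over` flag are hypothetical names introduced only for illustration.

```python
def repackage_hidden(h):
    """Detach hidden state(s) from the graph that produced them.

    Accepts a single tensor-like object (anything with a .detach() method)
    or a nested tuple/list of such objects, as multi-layer LSTMs return.
    The values are kept; only the gradient history is cut.
    """
    if hasattr(h, "detach"):
        return h.detach()
    return tuple(repackage_hidden(v) for v in h)


def next_hidden(prev_hidden, init_fn, carry_over):
    """Pick the hidden state that is fed to the next batch.

    carry_over=True  -> reuse the previous values, detached (what the
                        snippet above appears to do)
    carry_over=False -> start again from a fresh initial state (what the
                        question proposes)
    """
    if carry_over:
        return repackage_hidden(prev_hidden)
    return init_fn()
```

Inside the training loop, the reset variant would look roughly like `hidden = next_hidden(hidden, lambda: model.init_hidden(args.batch_size), carry_over=False)`. Carrying the state over is the usual choice when consecutive batches are contiguous chunks of one long text (truncated backpropagation through time); when batches are sampled independently, the carried values come from unrelated examples, which is the concern raised here.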
