
bso's Introduction

Code for *Sequence-to-Sequence Learning as Beam-Search Optimization* (Wiseman and Rush, 2016).

This code is adapted from a much earlier version of Yoon Kim's seq2seq-attn code.

For questions, concerns, or bugs, feel free to contact swiseman at seas.harvard.edu.

Running Experiments

First prepare the data as in data_prep/.
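To sanity-check the prepared files before training, you can list the datasets they contain. A minimal sketch, assuming the standard HDF5 command-line tools are installed (the file names are the ones used in the commands below):

```
# list all datasets in the prepared training file; -r recurses into groups
h5ls -r wo-train.hdf5
```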

All seq2seq baselines use the seq2seq-attn code.
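For reference, that code lives in its own repository (assuming the harvardnlp GitHub mirror):

```
git clone https://github.com/harvardnlp/seq2seq-attn.git
```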

Word-ordering Experiments

Pretrain with:

```
th pretrain.lua -data_file wo-train.hdf5 -val_data_file wo-val.hdf5 -savefile wopt -num_layers 2 -rnn_size 256 -word_vec_size 256 -save_after 10 -adagrad -layer_etas 0.02,0.01,0.2 -epochs 10 -curriculum 1 -dropout 0.2
```

Train (unconstrained) with:

```
th bso_train.lua -data_file wo-train.hdf5 -val_data_file wo-val.hdf5 -savefile wosave -num_layers 2 -rnn_size 256 -word_vec_size 256 -adagrad -layer_etas 0.02,0.02,0.2 -curriculum 0 -epochs 39 -train_from wopt_epoch10.00_*.t7 -dropout 0.2 -max_beam_size 6 -beam_size 2
```

Constrained training is accomplished by adding the argument `-con wo` to the command above.
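That is, the full constrained-training command is the one above with `-con wo` appended. You may also want a different `-savefile` so the unconstrained checkpoints are not overwritten; `conwo` below is a hypothetical name, following the `condep` pattern used in the dependency-parsing section:

```
th bso_train.lua -data_file wo-train.hdf5 -val_data_file wo-val.hdf5 -savefile conwo -num_layers 2 -rnn_size 256 -word_vec_size 256 -adagrad -layer_etas 0.02,0.02,0.2 -curriculum 0 -epochs 39 -train_from wopt_epoch10.00_*.t7 -dropout 0.2 -max_beam_size 6 -beam_size 2 -con wo
```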

Predict with:

```
th predict.lua -val_data_file wo-val.hdf5 -model wosave_epoch39*.t7 -src_file wo-src-val.txt -src_dict wo.src.dict -targ_dict wo.targ.dict -beam_size 5 -con wo -output_file val-unconstrwo-preds.out
```

Train the seq2seq baseline with:

```
th train.lua -data_file wo-train.hdf5 -val_data_file wo-val.hdf5 -savefile wos2s -num_layers 2 -rnn_size 256 -word_vec_size 256 -save_after 10 -param_init 0.1 -adagrad -layer_lrs 0.02,0.01,0.2 -lr_decay 1 -epochs 30 -curriculum 1 -dropout 0.2
```

(and use the epoch with the lowest validation perplexity)
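A minimal sketch for picking that epoch from disk, assuming checkpoints are saved as `<savefile>_epoch<N>_<val-ppl>.t7` (the pattern suggested by names like `wopt_epoch10.00_*.t7` above; adjust if your naming differs):

```
# print the saved checkpoint with the lowest validation perplexity,
# assuming the perplexity is the last underscore-separated field
ls wos2s_epoch*.t7 | awk -F_ '{print $NF, $0}' | sort -n | head -1
```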

Dependency Parsing Experiments

Pretrain with:

```
th pretrain.lua -data_file dep-train.hdf5 -val_data_file dep-val.hdf5 -savefile deppt -num_layers 2 -rnn_size 300 -word_vec_size 300 -save_after 5 -adagrad -layer_etas 0.02,0.02,0.2 -epochs 5 -curriculum 1 -dropout 0.3 -pre_word_vecs_enc dep_src_w2v.h5 -pre_word_vecs_dec dep_targ_w2v.h5
```

Train (constrained) with:

```
th bso_train.lua -data_file dep-train.hdf5 -val_data_file dep-val.hdf5 -savefile condep -num_layers 2 -rnn_size 300 -word_vec_size 300 -save_after 16 -adagrad -curriculum 0 -epochs 16 -train_from deppt_epoch5_*.t7 -dropout 0.3 -max_beam_size 6 -beam_size 2 -layer_etas 0.02,0.02,0.1 -ignore_eos -src_dict dep.src.dict -targ_dict dep.targ.dict -con sr
```

(Unconstrained training can be accomplished by leaving out the `-con sr` argument.)

Predict with:

```
th predict.lua -val_data_file dep-val.hdf5 -model condep_epoch16_*.t7 -gpuid 1 -src_file dep-src-val.txt -src_dict dep.src.dict -targ_dict dep.targ.dict -beam_size 5 -con sr -output_file val-condepb5-preds.out
```

Train the seq2seq baseline with:

```
th train.lua -data_file dep-train.hdf5 -val_data_file dep-val.hdf5 -savefile deps2s -num_layers 2 -rnn_size 300 -word_vec_size 300 -adagrad -layer_lrs 0.02,0.02,0.2 -lr_decay 1 -epochs 25 -curriculum 1 -dropout 0.3 -pre_word_vecs_enc dep_src_w2v.h5 -pre_word_vecs_dec dep_targ_w2v.h5
```

(and use the epoch with the lowest validation perplexity)

MT Experiments

Pretrain with:

```
th pretrain.lua -data_file mixer-train.hdf5 -val_data_file mixer-val.hdf5 -savefile mixerpt -num_layers 1 -rnn_size 256 -word_vec_size 256 -save_after 3 -adagrad -layer_etas 0.02,0.02,0.2 -epochs 3 -curriculum 1 -dropout 0.2
```

Train with:

```
th bso_train.lua -data_file mixer-train.hdf5 -val_data_file mixer-val.hdf5 -savefile mixersave -num_layers 1 -rnn_size 256 -word_vec_size 256 -save_after 21 -adagrad -curriculum 0 -epochs 21 -train_from mixerpt_epoch3_*.t7 -dropout 0.2 -max_beam_size 6 -beam_size 2 -layer_etas 0.02,0.02,0.1 -mt_delt_multiple 1
```

Predict with:

```
th predict.lua -val_data_file mixer-val.hdf5 -model mixersave_epoch21_*.t7 -src_file valid.de-en.de -src_dict mixer.src.dict -targ_dict mixer.targ.dict -beam_size 5 -output_file val-mixer-preds.out
```

Train the seq2seq baseline with:

```
th pretrain.lua -data_file mixer-train.hdf5 -val_data_file mixer-val.hdf5 -savefile mixers2s -num_layers 1 -rnn_size 256 -word_vec_size 256 -adagrad -layer_lrs 0.02,0.02,0.2 -epochs 15 -lr_decay 1 -curriculum 1 -dropout 0.2
```

(and use the epoch with the lowest validation perplexity).

MIT License.


bso's Issues

Error while training the model

On the MT dataset I have successfully pretrained my model. Now, to train it, I execute the command:

```
th bso_train.lua -data_file data_prep/MT/prep-train.hdf5 -val_data_file data_prep/MT/prep-val.hdf5 -savefile mixersave -num_layers 1 -rnn_size 4 -word_vec_size 4 -save_after 21 -adagrad -curriculum 0 -epochs 21 -train_from mixerpt_epoch3_*.t7 -dropout 0.2 -max_beam_size 6 -beam_size 2 -layer_etas 0.02,0.02,0.1 -mt_delt_multiple 1
```

but it fails with the following error:

```
loading data...
done!
Source vocab size: 32011, Target vocab size: 22824
Source max sent len: 283, Target max sent len: 285
layer_etas 0.02 0.02 0.1
loading mixerpt_epoch3_175.20.t7...
Number of parameters: 7137368
/home/singla/torch/install/bin/luajit: /home/singla/Desktop/Singla/BSO/margin_beam_searcher.lua:106: attempt to call method 'floor' (a nil value)
stack traceback:
/home/singla/Desktop/Singla/BSO/margin_beam_searcher.lua:106: in function 'nextSearchStep'
bso_train.lua:536: in function 'train_batch'
bso_train.lua:666: in function 'train'
bso_train.lua:953: in function 'main'
bso_train.lua:956: in main chunk
[C]: in function 'dofile'
...ngla/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
```

How can I fix it? Please help.

Error: Could not find or load main class edu.stanford.nlp.trees.EnglishGrammaticalStructure

When I run the command below (the first one for creating the dependency-parsing dataset):

```
java -cp stanford-parser-3.3.0.jar edu.stanford.nlp.trees.EnglishGrammaticalStructure -treeFile ~/Projects/group/data/wsj/train.txt -conllx -basic > train.conll.txt
```

I get the error:

```
Error: Could not find or load main class edu.stanford.nlp.trees.EnglishGrammaticalStructure
```

I have searched for this error but am unable to resolve it and proceed further.
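This error usually means the class cannot be found on the given classpath, i.e. the path to the jar is wrong or the class is not in that jar. One way to check, using the standard `jar` tool:

```
# list the jar's contents; if nothing matches, the class is not in this jar
# (check the path to the jar, or that the download is not corrupted)
jar tf stanford-parser-3.3.0.jar | grep EnglishGrammaticalStructure
```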
