saikrishnarallabandi / compositionality-expts
This is a repo to hold experiments in the space of compositionality
Home Page: http://www.cs.cmu.edu/~srallaba/VQA/
License: Apache License 2.0
@saikrishnarallabandi @berzentine
not a priority but resolve this
Traceback (most recent call last):
  File "main_VAE.py", line 245, in <module>
    val_loss = evaluate(val_data)
  File "main_VAE.py", line 153, in evaluate
    recon_batch, mu, log_var = model(data, None)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/srallaba/project_777/explorations/lm/model.py", line 45, in forward
    mu, log_var = self.encoder(embedding,hidden)
  File "/home/srallaba/project_777/explorations/lm/model.py", line 29, in encoder
    output, hidden = self.rnn(emb, hidden)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/rnn.py", line 192, in forward
    output, hidden = func(input, self.all_weights, hx, batch_sizes)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/_functions/rnn.py", line 324, in forward
    return func(input, *fargs, **fkwargs)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/_functions/rnn.py", line 288, in forward
    dropout_ts)
RuntimeError: CUDA error: out of memory
Aftr epoch 0 Train KL Loss: 4.0685910552845594e+26 Train CE Loss: 7.474030815264613 Val KL Loss: 4.96021764349418e+26 Val CE Loss: 7.47340297397879 Time: 324.20956802368164
vs Naive update:
Aftr epoch 0 Train KL Loss: 71134.24406546808 Train CE Loss: 7.477112519003681 Val KL Loss: 84342.38469062098 Val CE Loss: 7.476956954997839 Time: 289.88534116744995
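The two logs above differ by more than twenty orders of magnitude in the KL term. The source does not show how the KL loss is computed, so the following is only a sketch under the assumption that it is the standard closed-form KL between a diagonal Gaussian and a unit Gaussian. Under that assumption, the `exp(log_var)` term grows very quickly with `log_var`, and bounding `log_var` is one common way to keep the value finite. The names `gaussian_kl` and `clamp` are hypothetical and do not come from the repo.

```python
import math


def gaussian_kl(mu, log_var):
    """Closed-form KL( N(mu, exp(log_var)) || N(0, 1) ), summed over dimensions.

    Assumption: this is the usual VAE KL term; the repo's actual
    implementation is not shown in the issue.
    """
    return sum(
        0.5 * (math.exp(lv) + m * m - 1.0 - lv)
        for m, lv in zip(mu, log_var)
    )


def clamp(values, low, high):
    """Bound each value to [low, high]; a hypothetical guard for log_var."""
    return [min(max(v, low), high) for v in values]


# A single latent dimension with a large log-variance dominates the sum.
unbounded = gaussian_kl([0.0], [60.0])
bounded = gaussian_kl([0.0], clamp([60.0], -10.0, 10.0))
print(unbounded, bounded)
```

Whether clamping is appropriate here depends on why `log_var` grows in the first place; the sketch only illustrates how sensitive the term is.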
torch.backends.cudnn.CuDNNError: 1: b'CUDNN_STATUS_NOT_INITIALIZED'
Exception ignored in: <bound method CuDNNHandle.__del__ of <torch.backends.cudnn.CuDNNHandle object at 0x7fc57e00fcf8>>
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/backends/cudnn/__init__.py", line 181, in __del__
    check_error(lib.cudnnDestroy(self))
ctypes.ArgumentError: argument 1: <class 'TypeError'>: Don't know how to convert parameter 1
Got this while running from:
~/projects/multimodal/garages/garage_lm
KL Term becomes too low during training
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/home/srallaba/project_777/repos/n2nmn/util/clevr_train/data_reader.py", line 135, in _run_prefetch
    batch = batch_loader.load_one_batch(sample_ids)
  File "/home/srallaba/project_777/repos/n2nmn/util/clevr_train/data_reader.py", line 56, in load_one_batch
    input_seq_batch[:seq_length, n] = question_inds
ValueError: cannot copy sequence with size 47 to array axis with dimension 45
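The error above says a question of 47 tokens is being written into a batch array whose sequence axis holds only 45. Two general ways to handle this are to size the array from the longest sequence in the data, or to clip and pad every sequence to a fixed length. The sketch below shows both in plain Python; `required_length`, `fit_to_length`, and `pad_id` are hypothetical names, and this is not the code in `data_reader.py`.

```python
def required_length(sequences):
    """Return the sequence-axis size needed to hold every sequence."""
    return max((len(seq) for seq in sequences), default=0)


def fit_to_length(token_ids, max_len, pad_id=0):
    """Clip a sequence to max_len tokens, then right-pad it with pad_id."""
    clipped = list(token_ids[:max_len])
    return clipped + [pad_id] * (max_len - len(clipped))


questions = [list(range(47)), list(range(12))]

# Option 1: allocate the batch with enough room for the longest question.
print(required_length(questions))

# Option 2: keep the fixed length of 45 and clip longer questions.
print(len(fit_to_length(questions[0], 45)))
```

Clipping silently drops the trailing tokens of long questions, so sizing the array from the data is the safer choice when memory allows.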
Aftr epoch 0 Train KL Loss: 70880.33538945712 Train CE Loss: 7.475933839819846 Val KL Loss: 83803.62003476919 Val CE Loss: 7.47542022777086 Time: 165.22837734222412
Aftr epoch 1 Train KL Loss: 88129.83940536014 Train CE Loss: 7.475455455475286 Val KL Loss: 90637.69099604593 Val CE Loss: 7.475114441888723 Time: 165.42635655403137
Aftr epoch 2 Train KL Loss: 92180.11840200602 Train CE Loss: 7.47502081677742 Val KL Loss: 93999.66817360037 Val CE Loss: 7.475185290207461 Time: 166.00701189041138
Aftr epoch 3 Train KL Loss: 95777.1658170661 Train CE Loss: 7.4751619369832225 Val KL Loss: 97136.72526997971 Val CE Loss: 7.475099759285574 Time: 169.06363463401794
Aftr epoch 4 Train KL Loss: 98035.97741159725 Train CE Loss: 7.475510899023118 Val KL Loss: 99218.52989204838 Val CE Loss: 7.475828764411288 Time: 171.14043927192688
Aftr epoch 5 Train KL Loss: 101160.41113218335 Train CE Loss: 7.4758301856452976 Val KL Loss: 102355.64565868203 Val CE Loss: 7.47584792600282 Time: 274.885231256485
Aftr epoch 6 Train KL Loss: 103964.74948309254 Train CE Loss: 7.475863241439719 Val KL Loss: 105352.48923403073 Val CE Loss: 7.475834933004341 Time: 165.6708481311798
Aftr epoch 7 Train KL Loss: 104515.88532606945 Train CE Loss: 7.475988712158419 Val KL Loss: 104982.18300688136 Val CE Loss: 7.4761259709724275 Time: 165.33933973312378
Aftr epoch 8 Train KL Loss: 106197.50219496923 Train CE Loss: 7.476182853080396 Val KL Loss: 107269.56916241866 Val CE Loss: 7.476052244550373 Time: 211.1073079109192
Aftr epoch 9 Train KL Loss: 108623.7602032599 Train CE Loss: 7.476069112703245 Val KL Loss: 109815.83511709188 Val CE Loss: 7.476049446442158 Time: 399.2169167995453
Log:
Loaded the question to layout dict from ./gt_layout_%s_new_parse.npy % train2014
Image ID: 78077
Question ID: 1
Image name: COCO_train2014_000000078077
Question: Is this a modern train?
Whole thingy {'question': 'Is this a modern train?', 'image_id': 78077, 'question_id': 1}
Problem: the parse has no entry for question id '1'.
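One way to surface this kind of mismatch before training starts is to list every sample whose question id is absent from the layout dict. The sketch below assumes the layout dict is keyed by question id, which the log does not confirm; `find_missing_question_ids` is a hypothetical helper, and the sample record is copied from the log above.

```python
def find_missing_question_ids(samples, layout_dict):
    """Return the question ids that have no entry in layout_dict.

    Assumption: layout_dict maps question_id -> layout. The real structure
    of the .npy parse file is not shown in the issue.
    """
    return [
        sample["question_id"]
        for sample in samples
        if sample["question_id"] not in layout_dict
    ]


samples = [
    {"question": "Is this a modern train?", "image_id": 78077, "question_id": 1},
]
layout_dict = {}  # stand-in for the loaded parse, empty for illustration

print(find_missing_question_ids(samples, layout_dict))
```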
Epoch 459 Train KL Loss: -0.00021647507542962426 Train CE Loss: 9.346391046369398 Dev KL loss tensor(-0.0002, device='cuda:0') Dev CE Loss: tensor(9.3670, device='cuda:0'
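A KL term that sits near zero, as in the epoch 459 log above, is commonly described as posterior collapse: the decoder learns to ignore the latent code. One widely used mitigation is to anneal the weight on the KL term from 0 to 1 over the early training steps. The sketch below is a minimal illustration of linear annealing, not the schedule used in this repo; `kl_weight`, `vae_loss`, and `warmup_steps` are hypothetical names.

```python
def kl_weight(step, warmup_steps):
    """Linearly ramp the KL weight from 0 to 1 over warmup_steps."""
    if warmup_steps <= 0:
        return 1.0
    return min(1.0, step / warmup_steps)


def vae_loss(ce_loss, kl_loss, step, warmup_steps):
    """Combine reconstruction (CE) loss with the annealed KL term."""
    return ce_loss + kl_weight(step, warmup_steps) * kl_loss


# The KL term contributes nothing at step 0 and its full value after warmup.
for step in (0, 500, 1000, 2000):
    print(step, kl_weight(step, warmup_steps=1000))
```

A closed-form Gaussian KL cannot be negative, so a value such as -0.0002 may also point to rounding or to how the term is averaged. That is worth checking separately from the schedule.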
The rnnlm script we have has a memory leak somewhere. See if you can catch it.
This uses 800MB:
https://github.com/pytorch/examples/blob/master/word_language_model/main.py
Our script uses 9GB:
https://github.com/saikrishnarallabandi/compositionality-expts/blob/debug/garages/garage_lm/main_rnnlm_barebones.py
LM prints different perplexity for val and test
This happens in VAELM. Testing for RNNLM