curiousai / ladder
Ladder network is a deep learning algorithm that combines supervised and unsupervised learning
License: MIT License
Can someone show me the test error curve of the Ladder network on the CIFAR-10 dataset with 4000 labeled training samples? I wrote my own implementation in TensorFlow and reached 78% accuracy after 20 epochs, and almost 75% after only 6 epochs, so I would like to check whether these results are similar to yours. Many thanks. The figure below shows my result.
ladder.pdf
I can run mnist_100_full and get results similar to those in the paper. But when I run mnist_100_conv_gamma, it hangs after "INFO:main.utils:e 0, i 0:V_C_class nan, V_E 90, V_C_de 1", as shown below. Could you please help? Your help is greatly appreciated.
By the way, I had commented out the function pool_2d() in nn.py.
-Anna
$ THEANO_FLAGS='floatX=float32' python run.py train --encoder-layers convf:32:5:1:1-maxpool:2:2-convv:64:3:1:1-convf:64:3:1:1-maxpool:2:2-convv:128:3:1:1-convv:10:1:1:1-globalmeanpool:6:6-fc:10 --decoder-spec 0-0-0-0-0-0-0-0-0-gauss --denoising-cost-x 0,0,0,0,0,0,0,0,0,1 --labeled-samples 100 --unlabeled-samples 50000 --seed 1 -- mnist_100_conv_gamma
ERROR:main:Subprocess returned fatal: Not a git repository (or any parent up to mount point /nfs/home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
INFO:main:Logging into results/mnist_100_conv_gamma11/log.txt
INFO:main:== COMMAND LINE ==
INFO:main:run.py train --encoder-layers convf:32:5:1:1-maxpool:2:2-convv:64:3:1:1-convf:64:3:1:1-maxpool:2:2-convv:128:3:1:1-convv:10:1:1:1-globalmeanpool:6:6-fc:10 --decoder-spec 0-0-0-0-0-0-0-0-0-gauss --denoising-cost-x 0,0,0,0,0,0,0,0,0,1 --labeled-samples 100 --unlabeled-samples 50000 --seed 1 -- mnist_100_conv_gamma
INFO:main:== PARAMETERS ==
INFO:main: zestbn : bugfix
INFO:main: dseed : 1
INFO:main: top_c : 1
INFO:main: super_noise_std : 0.3
INFO:main: batch_size : 100
INFO:main: dataset : mnist
INFO:main: valid_set_size : 10000
INFO:main: num_epochs : 150
INFO:main: whiten_zca : 0
INFO:main: unlabeled_samples : 50000
INFO:main: decoder_spec : ('0', '0', '0', '0', '0', '0', '0', '0', '0', 'gauss')
INFO:main: valid_batch_size : 100
INFO:main: denoising_cost_x : (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0)
INFO:main: f_local_noise_std : 0.3
INFO:main: cmd : train
INFO:main: act : relu
INFO:main: lrate_decay : 0.67
INFO:main: seed : 1
INFO:main: lr : 0.002
INFO:main: save_to : mnist_100_conv_gamma
INFO:main: save_dir : results/mnist_100_conv_gamma11
INFO:main: commit :
INFO:main: contrast_norm : 0
INFO:main: encoder_layers : ('convf:32:5:1:1', 'maxpool:2:2', 'convv:64:3:1:1', 'convf:64:3:1:1', 'maxpool:2:2', 'convv:128:3:1:1', 'convv:10:1:1:1', 'globalmeanpool:6:6', 'fc:10')
INFO:main: labeled_samples : 100
INFO:main:Using 10000 examples for validation
INFO:main.model:Encoder: clean, labeled
INFO:main.model: 0: noise 0
/nfs/home/yan/PycharmProjects/ladder-master_aicurious2/ladder.py:454: UserWarning: The method `getOutputShape` is deprecated use`get_conv_output_shape` instead.
stride, bm))
INFO:main.model: f1: convf, relu, BN, noise 0.00, params [32, 5, 1, 1], dim (1, 28, 28) -> (32, 32, 32)
/nfs/home/yan/PycharmProjects/ladder-master_aicurious2/nn.py:288: UserWarning: pool_2d() will have the parameter ignore_border default value changed to True (currently False). To have consistent behavior with all Theano version, explicitly add the parameter ignore_border=True. On the GPU, using ignore_border=True is needed to use cuDNN. When using ignore_border=False and not using cuDNN, the only GPU combination supported is when ds == st and padding == (0, 0) and mode == 'max'. Otherwise, the convolution will be executed on CPU.
z = pool_2d(z, ds=poolsize, st=poolstride)
INFO:main.model: f2: maxpool, linear, BN, noise 0.00, params [2, 2], dim (32, 32, 32) -> (32, 16, 16)
INFO:main.model: f3: convv, relu, BN, noise 0.00, params [64, 3, 1, 1], dim (32, 16, 16) -> (64, 14, 14)
INFO:main.model: f4: convf, relu, BN, noise 0.00, params [64, 3, 1, 1], dim (64, 14, 14) -> (64, 16, 16)
INFO:main.model: f5: maxpool, linear, BN, noise 0.00, params [2, 2], dim (64, 16, 16) -> (64, 8, 8)
INFO:main.model: f6: convv, relu, BN, noise 0.00, params [128, 3, 1, 1], dim (64, 8, 8) -> (128, 6, 6)
INFO:main.model: f7: convv, relu, BN, noise 0.00, params [10, 1, 1, 1], dim (128, 6, 6) -> (10, 6, 6)
INFO:main.model: f8: globalmeanpool, linear, BN, noise 0.00, params [6, 6], dim (10, 6, 6) -> (10, 1, 1)
INFO:main.model: f9: fc, softmax, BN, noise 0.00, params 10, dim (10, 1, 1) -> (10,)
INFO:main.model:Encoder: corr, labeled
INFO:main.model: 0: noise 0.3
INFO:main.model: f1: convf, relu, BN, noise 0.30, params [32, 5, 1, 1], dim (1, 28, 28) -> (32, 32, 32)
INFO:main.model: f2: maxpool, linear, BN, noise 0.30, params [2, 2], dim (32, 32, 32) -> (32, 16, 16)
INFO:main.model: f3: convv, relu, BN, noise 0.30, params [64, 3, 1, 1], dim (32, 16, 16) -> (64, 14, 14)
INFO:main.model: f4: convf, relu, BN, noise 0.30, params [64, 3, 1, 1], dim (64, 14, 14) -> (64, 16, 16)
INFO:main.model: f5: maxpool, linear, BN, noise 0.30, params [2, 2], dim (64, 16, 16) -> (64, 8, 8)
INFO:main.model: f6: convv, relu, BN, noise 0.30, params [128, 3, 1, 1], dim (64, 8, 8) -> (128, 6, 6)
INFO:main.model: f7: convv, relu, BN, noise 0.30, params [10, 1, 1, 1], dim (128, 6, 6) -> (10, 6, 6)
INFO:main.model: f8: globalmeanpool, linear, BN, noise 0.30, params [6, 6], dim (10, 6, 6) -> (10, 1, 1)
INFO:main.model: f9: fc, softmax, BN, noise 0.30, params 10, dim (10, 1, 1) -> (10,)
INFO:main.model:Decoder: z_corr -> z_est
INFO:main.model: g9: gauss, denois 1.00, dim None -> (10,)
INFO:main.model: g8: 0, , dim (10,) -> (10, 1, 1)
INFO:main.model: g7: 0, , dim (10, 1, 1) -> (10, 6, 6)
INFO:main.model: g6: 0, , dim (10, 6, 6) -> (128, 6, 6)
INFO:main.model: g5: 0, , dim (128, 6, 6) -> (64, 8, 8)
INFO:main.model: g4: 0, , dim (64, 8, 8) -> (64, 16, 16)
INFO:main.model: g3: 0, , dim (64, 16, 16) -> (64, 14, 14)
INFO:main.model: g2: 0, , dim (64, 14, 14) -> (32, 16, 16)
INFO:main.model: g1: 0, , dim (32, 16, 16) -> (32, 32, 32)
INFO:main.model: g0: 0, , dim (32, 32, 32) -> (1, 28, 28)
INFO:main:Found the following parameters: [f_7_b, f_6_b, f_4_b, f_3_b, f_1_b, g_9_a5, f_9_c, f_9_b, g_9_a4, g_9_a3, g_9_a2, g_9_a1, g_9_a10, g_9_a9, g_9_a8, g_9_a7, g_9_a6, f_1_W, f_3_W, f_4_W, f_6_W, f_7_W, f_9_W]
INFO:blocks.algorithms:Taking the cost gradient
INFO:blocks.algorithms:The cost gradient computation graph is built
INFO:main:Balancing 100 labels...
INFO:main.nn:Batch norm parameters: f_1_bn_mean_clean, f_1_bn_var_clean, f_2_bn_mean_clean, f_2_bn_var_clean, f_3_bn_mean_clean, f_3_bn_var_clean, f_4_bn_mean_clean, f_4_bn_var_clean, f_5_bn_mean_clean, f_5_bn_var_clean, f_6_bn_mean_clean, f_6_bn_var_clean, f_7_bn_mean_clean, f_7_bn_var_clean, f_8_bn_mean_clean, f_8_bn_var_clean, f_9_bn_mean_clean, f_9_bn_var_clean
INFO:main:Balancing 100 labels...
INFO:main.nn:Batch norm parameters: f_1_bn_mean_clean, f_1_bn_var_clean, f_2_bn_mean_clean, f_2_bn_var_clean, f_3_bn_mean_clean, f_3_bn_var_clean, f_4_bn_mean_clean, f_4_bn_var_clean, f_5_bn_mean_clean, f_5_bn_var_clean, f_6_bn_mean_clean, f_6_bn_var_clean, f_7_bn_mean_clean, f_7_bn_var_clean, f_8_bn_mean_clean, f_8_bn_var_clean, f_9_bn_mean_clean, f_9_bn_var_clean
INFO:blocks.main_loop:Entered the main loop
/nfs/home/yan/.conda/envs/ladder2/lib/python2.7/site-packages/pandas/core/generic.py:939: PerformanceWarning:
your performance may suffer as PyTables will pickle object types that it cannot
map directly to c-types [inferred_type->mixed-integer,key->block0_values] [items->[0]]
return pytables.to_hdf(path_or_buf, key, self, **kwargs)
INFO:blocks.algorithms:Initializing the training algorithm
INFO:blocks.algorithms:The training algorithm is initialized
INFO:blocks.extensions.monitoring:Monitoring on auxiliary data started
INFO:blocks.extensions.monitoring:Monitoring on auxiliary data finished
INFO:main.utils:e 0, i 0:V_C_class nan, V_E 90, V_C_de 1
Dear Developer:
I am trying to replace the noise injection with a different type of distortion, but I have run into a problem.
I added a piece of code to your program and I am able to see the visible input image. Could you please tell me whether I can see (or print) the variable z(0) = x + noise? If so, could you show me how to do it?
It would be a great help to know how to get (or visualize) the noisy variable at any layer.
Many thanks!
If a conv layer is used and the output dimension is (m, m, c) for an input of (k, k, c), what are the dimensions of mu and v in Equation (1)? Are they just c-dimensional vectors, or tensors of size (m, m, c)?
Here is the error I get when running run.py. Could you help me fix it?
python3 run.py train --encoder-layers 1000-500-250-250-250-10 --decoder-spec gauss --denoising-cost-x 1000,10,0.1,0.1,0.1,0.1,0.1 --labeled-samples 100 --unlabeled-samples 60000 --seed 1 -- mnist_100_full
Traceback (most recent call last):
File "run.py", line 14, in <module>
import theano
File "/home/mvidovic/virtuel_env/Ladder/lib/python3.6/site-packages/theano/__init__.py", line 88, in <module>
from theano.configdefaults import config
File "/home/mvidovic/virtuel_env/Ladder/lib/python3.6/site-packages/theano/configdefaults.py", line 137, in <module>
in_c_key=False)
File "/home/mvidovic/virtuel_env/Ladder/lib/python3.6/site-packages/theano/configparser.py", line 287, in AddConfigVar
configparam.get(root, type(root), delete_key=True)
File "/home/mvidovic/virtuel_env/Ladder/lib/python3.6/site-packages/theano/configparser.py", line 335, in get
self.set(cls, val_str)
File "/home/mvidovic/virtuel_env/Ladder/lib/python3.6/site-packages/theano/configparser.py", line 346, in set
self.val = self.filter(val)
File "/home/mvidovic/virtuel_env/Ladder/lib/python3.6/site-packages/theano/configdefaults.py", line 116, in filter
'You are tring to use the old GPU back-end. '
ValueError: You are tring to use the old GPU back-end. It was removed from Theano. Use device=cuda* now. See https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end%28gpuarray%29 for more information.
Starting from line 205 of ladder.py it looks like only the unlabeled data is considered for the decoder.
This is different from how I understood the paper, where it seem like all data is used for the encoder and decoder.
If that is really not the case, shouldn't then in the fully supervised scenario the ladder turn just into a normal MLP? Where is then the boost in classification accuracy coming from?
Running the command for the mnist dataset
./run.py train --encoder-layers 1000-500-250-250-250-10 --decoder-spec sig --denoising-cost-x 1000,10,0.1,0.1,0.1,0.1,0.1 --labeled-samples 100 --unlabeled-samples 60000 --seed 1 -- mnist_100_full
I get this error:
ERROR:blocks.main_loop:Error occured during training.
Blocks will attempt to run `on_error` extensions, potentially saving data, before exiting and reraising the error. Note that the usual `after_training` extensions will *not* be run. The original error will be re-raised and also stored in the training log. Press CTRL + C to halt Blocks immediately.
Traceback (most recent call last):
File "./run.py", line 649, in <module>
if train(d) is None:
File "./run.py", line 500, in train
main_loop.run()
File "/home/teslah2o/ladder/venv/local/lib/python2.7/site-packages/blocks/main_loop.py", line 188, in run
reraise_as(e)
File "/home/teslah2o/ladder/venv/local/lib/python2.7/site-packages/blocks/utils/__init__.py", line 225, in reraise_as
six.reraise(type(new_exc), new_exc, orig_exc_traceback)
File "/home/teslah2o/ladder/venv/local/lib/python2.7/site-packages/blocks/main_loop.py", line 164, in run
self.algorithm.initialize()
File "/home/teslah2o/ladder/venv/local/lib/python2.7/site-packages/blocks/algorithms/__init__.py", line 224, in initialize
self._function = theano.function(self.inputs, [], updates=all_updates)
File "/home/teslah2o/ladder/venv/local/lib/python2.7/site-packages/theano/compile/function.py", line 300, in function
output_keys=output_keys)
File "/home/teslah2o/ladder/venv/local/lib/python2.7/site-packages/theano/compile/pfunc.py", line 488, in pfunc
no_default_updates=no_default_updates)
File "/home/teslah2o/ladder/venv/local/lib/python2.7/site-packages/theano/compile/pfunc.py", line 216, in rebuild_collect_shared
raise TypeError(err_msg, err_sug)
TypeError: ('An update must have the same type as the original shared variable (shared_var=f_5_b, shared_var.type=TensorType(float32, vector), update_val=Elemwise{sub,no_inplace}.0, update_val.type=TensorType(float64, vector))., If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.\n\nOriginal exception:\n\tTypeError: An update must have the same type as the original shared variable (shared_var=f_5_b, shared_var.type=TensorType(float32, vector), update_val=Elemwise{sub,no_inplace}.0, update_val.type=TensorType(float64, vector))., If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.', 'If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.')
Do you know how to fix it?
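The error reports a float64 update for a float32 shared variable. Other commands quoted on this page prefix run.py with THEANO_FLAGS='floatX=float32', which the command above does not, so setting that flag may be worth trying; this thread does not confirm that it resolves the error. The sketch below shows one way to add the flag to the environment before launching training. `merge_theano_flags` is a hypothetical helper, not part of this repository or of Theano; it only relies on THEANO_FLAGS being a comma-separated list of key=value pairs.

```python
import os


def merge_theano_flags(env, extra):
    """Return a copy of env whose THEANO_FLAGS also contains the flags in extra.

    Flags already present in THEANO_FLAGS are kept; values in extra win
    when the same key appears in both.
    """
    flags = {}
    for item in env.get("THEANO_FLAGS", "").split(","):
        if "=" in item:
            key, value = item.split("=", 1)
            flags[key.strip()] = value.strip()
    flags.update(extra)
    merged = dict(env)
    merged["THEANO_FLAGS"] = ",".join(
        "{}={}".format(key, flags[key]) for key in sorted(flags))
    return merged


env = merge_theano_flags(os.environ, {"floatX": "float32"})
print(env["THEANO_FLAGS"])
# The training command could then be started with this environment, e.g.
# subprocess.call(["./run.py", "train", ...], env=env)
```

Typing the prefix directly in the shell, as in the other commands on this page, has the same effect; the helper is only useful when the run is launched from a Python script.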
Error messages are shown below:
ERROR:blocks.main_loop:Error occured during training.
Blocks will attempt to run `on_error` extensions, potentially saving data, before exiting and reraising the error. Note that the usual `after_training` extensions will *not* be run$
The original error will be re-raised and also stored in the training log. Press CTRL + C to halt Blocks immediately.
Traceback (most recent call last):
File "./run.py", line 652, in <module>
if train(d) is None:
File "./run.py", line 501, in train
main_loop.run()
File "/home/alexchang/ENV/local/lib/python2.7/site-packages/blocks/main_loop.py", line 197, in run
reraise_as(e)
File "/home/alexchang/ENV/local/lib/python2.7/site-packages/blocks/utils/__init__.py", line 258, in reraise_as
six.reraise(type(new_exc), new_exc, orig_exc_traceback)
File "/home/alexchang/ENV/local/lib/python2.7/site-packages/blocks/main_loop.py", line 183, in run
while self._run_epoch():
File "/home/alexchang/ENV/local/lib/python2.7/site-packages/blocks/main_loop.py", line 232, in _run_epoch
while self._run_iteration():
File "/home/alexchang/ENV/local/lib/python2.7/site-packages/blocks/main_loop.py", line 253, in _run_iteration
self.algorithm.process_batch(batch)
File "/home/alexchang/ENV/local/lib/python2.7/site-packages/blocks/algorithms/__init__.py", line 287, in process_batch
self._function(*ordered_batch)
File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 871, in __call__
storage_map=getattr(self.fn, 'storage_map', None))
File "/usr/local/lib/python2.7/dist-packages/theano/gof/link.py", line 314, in raise_with_op
reraise(exc_type, exc_value, exc_trace)
File "/usr/local/lib/python2.7/dist-packages/theano/compile/function_module.py", line 859, in __call__
outputs = self.fn()
ValueError: The hardcoded shape for the number of rows in the image (8) isn't the run time shape (7).
Apply node that caused the error: ConvOp{('imshp', (192, 8, 8)),('kshp', (3, 3)),('nkern', 192),('bsize', 200),('dx', 1),('dy', 1),('out_mode', 'valid'),('unroll_batch', 5),('unrol$
_kern', 2),('unroll_patch', False),('imshp_logical', (192, 8, 8)),('kshp_logical', (3, 3)),('kshp_logical_top_aligned', True)}(Elemwise{Composite{(i0 + (i1 * i2))}}[(0, 2)].0, f_9_$
)
Toposort index: 1201
Inputs types: [TensorType(float32, 4D), TensorType(float32, 4D)]
Inputs shapes: [(200, 192, 7, 7), (192, 192, 3, 3)]
Inputs strides: [(37632, 196, 28, 4), (6912, 36, 12, 4)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[Subtensor{int64::}(ConvOp{('imshp', (192, 8, 8)),('kshp', (3, 3)),('nkern', 192),('bsize', 200),('dx', 1),('dy', 1),('out_mode', 'valid'),('unroll_batch', 5),('un
roll_kern', 2),('unroll_patch', False),('imshp_logical', (192, 8, 8)),('kshp_logical', (3, 3)),('kshp_logical_top_aligned', True)}.0, ScalarFromTensor.0), Subtensor{:int64:}(ConvOp{
('imshp', (192, 8, 8)),('kshp', (3, 3)),('nkern', 192),('bsize', 200),('dx', 1),('dy', 1),('out_mode', 'valid'),('unroll_batch', 5),('unroll_kern', 2),('unroll_patch', False),('imsh
p_logical', (192, 8, 8)),('kshp_logical', (3, 3)),('kshp_logical_top_aligned', True)}.0, ScalarFromTensor.0)]]
Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
File "./run.py", line 652, in <module>
if train(d) is None:
File "./run.py", line 411, in train
ladder = setup_model(p)
File "./run.py", line 182, in setup_model
ladder.apply(x, y, x_only)
File "/home/alexchang/Course_105_1/ML/ML2016/ladder_og/ladder.py", line 203, in apply
noise_std=self.p.f_local_noise_std)
File "/home/alexchang/Course_105_1/ML/ML2016/ladder_og/ladder.py", line 185, in encoder
noise_std=noise)
File "/home/alexchang/Course_105_1/ML/ML2016/ladder_og/ladder.py", line 350, in f
z, output_size = self.f_conv(h, spec, in_dim, gen_id('W'))
File "/home/alexchang/Course_105_1/ML/ML2016/ladder_og/ladder.py", line 452, in f_conv
filter_size), border_mode=bm)
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
Original exception:
ValueError: The hardcoded shape for the number of rows in the image (8) isn't the run time shape (7).
Apply node that caused the error: ConvOp{('imshp', (192, 8, 8)),('kshp', (3, 3)),('nkern', 192),('bsize', 200),('dx', 1),('dy', 1),('out_mode', 'valid'),('unroll_batch', 5),('unroll
_kern', 2),('unroll_patch', False),('imshp_logical', (192, 8, 8)),('kshp_logical', (3, 3)),('kshp_logical_top_aligned', True)}(Elemwise{Composite{(i0 + (i1 * i2))}}[(0, 2)].0, f_9_W
)
Toposort index: 1201
Inputs types: [TensorType(float32, 4D), TensorType(float32, 4D)]
Inputs shapes: [(200, 192, 7, 7), (192, 192, 3, 3)]
Inputs strides: [(37632, 196, 28, 4), (6912, 36, 12, 4)]
Inputs values: ['not shown', 'not shown']
Outputs clients: [[Subtensor{int64::}(ConvOp{('imshp', (192, 8, 8)),('kshp', (3, 3)),('nkern', 192),('bsize', 200),('dx', 1),('dy', 1),('out_mode', 'valid'),('unroll_batch', 5),('un
roll_kern', 2),('unroll_patch', False),('imshp_logical', (192, 8, 8)),('kshp_logical', (3, 3)),('kshp_logical_top_aligned', True)}.0, ScalarFromTensor.0), Subtensor{:int64:}(ConvOp{
('imshp', (192, 8, 8)),('kshp', (3, 3)),('nkern', 192),('bsize', 200),('dx', 1),('dy', 1),('out_mode', 'valid'),('unroll_batch', 5),('unroll_kern', 2),('unroll_patch', False),('imsh
p_logical', (192, 8, 8)),('kshp_logical', (3, 3)),('kshp_logical_top_aligned', True)}.0, ScalarFromTensor.0)]]
Backtrace when the node is created(use Theano flag traceback.limit=N to make it longer):
File "./run.py", line 652, in <module>
if train(d) is None:
File "./run.py", line 411, in train
ladder = setup_model(p)
File "./run.py", line 182, in setup_model
ladder.apply(x, y, x_only)
File "/home/alexchang/Course_105_1/ML/ML2016/ladder_og/ladder.py", line 203, in apply
noise_std=self.p.f_local_noise_std)
File "/home/alexchang/Course_105_1/ML/ML2016/ladder_og/ladder.py", line 350, in f
z, output_size = self.f_conv(h, spec, in_dim, gen_id('W'))
File "/home/alexchang/Course_105_1/ML/ML2016/ladder_og/ladder.py", line 452, in f_conv
filter_size), border_mode=bm)
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
Do you have any idea?
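An 8-versus-7 mismatch like the one above is the kind of difference that pooling an odd-sized feature map can produce, depending on whether the partial border window is kept. The sketch below is only shape arithmetic under stated assumptions: `conv_out` and `pool_out` are hypothetical helpers, "convf" and "convv" are read as full and valid convolutions (inferred from the mnist_100_conv_gamma layer sizes logged earlier on this page, for example 28 -> 32 for a 5x5 convf), and pooling is modelled only for a pool size equal to its stride. Whether this is the actual cause of the error in this issue is not confirmed.

```python
def conv_out(size, kernel, mode):
    """Spatial size after a stride-1 convolution."""
    if mode == "valid":   # only positions where the kernel fits entirely
        return size - kernel + 1
    if mode == "full":    # every position with at least one overlapping pixel
        return size + kernel - 1
    raise ValueError("unknown mode: %r" % mode)


def pool_out(size, pool, ignore_border):
    """Spatial size after pooling with stride equal to the pool size."""
    if ignore_border:
        return size // pool       # a partial last window is dropped
    return -(-size // pool)       # ceiling division: the partial window is kept


# Layer sizes of the MNIST conv model as logged earlier on this page.
size = 28
trace = []
for layer, arg in [("full", 5), ("pool", 2), ("valid", 3),
                   ("full", 3), ("pool", 2), ("valid", 3)]:
    if layer == "pool":
        size = pool_out(size, arg, ignore_border=False)
    else:
        size = conv_out(size, arg, layer)
    trace.append(size)
print(trace)

# An odd input size is where the two pooling conventions disagree.
print(pool_out(15, 2, ignore_border=False), pool_out(15, 2, ignore_border=True))
```

For even sizes such as those in the MNIST model the two conventions agree, which may be why a mismatch only shows up for some architectures.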
I ran the code on my GPU using 100 labeled samples, 10000 labeled validation samples, and 50000 unlabeled samples.
Here are the results:
Saving to results/mnist_100_full0/trained_params
e 150, i 90000:V_C_class nan, V_E nan, V_C_de nan nan nan nan nan nan nan, T_C_de 0.00541 0.0612 0.922 0.351 0.159 0.0551 0.0312, T_C_class 0.000315, VF_C_class nan, VF_E nan, VF_C_de nan nan nan nan nan nan nan
valid_final_error_rate_clean nan
Took 130.8 minutes
What do V_C, T_C, etc. mean?
Thanks.
I am trying to run this but got NaN errors when the program prints validation costs/accuracy. The command I ran is:
THEANO_FLAGS='floatX=float32,device=gpu0,lib.cnmem=0.8' FUEL_DATA_PATH=. \
./run.py train --encoder-layers 1000-500-250-250-250-10 --decoder-spec gauss \
--denoising-cost-x 1000,10,0.1,0.1,0.1,0.1,0.1 --labeled-samples 100 \
--unlabeled-samples 60000 --seed 1 -- mnist_100_full
Logger output:
Using gpu device 0: GeForce GTX 1060 6GB (CNMeM is enabled with initial size: 80.0% of memory, cuDNN not available)
INFO:main:Logging into results/mnist_100_full5/log.txt
INFO:main:== COMMAND LINE ==
INFO:main:./run.py train --encoder-layers 1000-500-250-250-250-10 --decoder-spec gauss --denoising-cost-x 1000,10,0.1,0.1,0.1,0.1,0.1 --labeled-samples 100 --unlabeled-samples 60000 --seed 1 -- mnist_100_full
INFO:main:== PARAMETERS ==
INFO:main: zestbn : bugfix
INFO:main: dseed : 1
INFO:main: top_c : 1
INFO:main: super_noise_std : 0.3
INFO:main: batch_size : 100
INFO:main: dataset : mnist
INFO:main: valid_set_size : 10000
INFO:main: num_epochs : 150
INFO:main: whiten_zca : 0
INFO:main: unlabeled_samples : 60000
INFO:main: decoder_spec : ('gauss',)
INFO:main: valid_batch_size : 100
INFO:main: denoising_cost_x : (1000.0, 10.0, 0.1, 0.1, 0.1, 0.1, 0.1)
INFO:main: f_local_noise_std : 0.3
INFO:main: cmd : train
INFO:main: act : relu
INFO:main: lrate_decay : 0.67
INFO:main: seed : 1
INFO:main: lr : 0.002
INFO:main: save_to : mnist_100_full
INFO:main: save_dir : results/mnist_100_full5
INFO:main: commit : 6df36de9d9a6e69dd55ce0985ebbb489633d118e
INFO:main: contrast_norm : 0
INFO:main: encoder_layers : ('1000', '500', '250', '250', '250', '10')
INFO:main: labeled_samples : 100
INFO:main:Using 0 examples for validation
INFO:main.model:Encoder: clean, labeled
INFO:main.model: 0: noise 0
INFO:main.model: f1: fc, relu, BN, noise 0.00, params 1000, dim (1, 28, 28) -> (1000,)
INFO:main.model: f2: fc, relu, BN, noise 0.00, params 500, dim (1000,) -> (500,)
INFO:main.model: f3: fc, relu, BN, noise 0.00, params 250, dim (500,) -> (250,)
INFO:main.model: f4: fc, relu, BN, noise 0.00, params 250, dim (250,) -> (250,)
INFO:main.model: f5: fc, relu, BN, noise 0.00, params 250, dim (250,) -> (250,)
INFO:main.model: f6: fc, softmax, BN, noise 0.00, params 10, dim (250,) -> (10,)
INFO:main.model:Encoder: corr, labeled
INFO:main.model: 0: noise 0.3
INFO:main.model: f1: fc, relu, BN, noise 0.30, params 1000, dim (1, 28, 28) -> (1000,)
INFO:main.model: f2: fc, relu, BN, noise 0.30, params 500, dim (1000,) -> (500,)
INFO:main.model: f3: fc, relu, BN, noise 0.30, params 250, dim (500,) -> (250,)
INFO:main.model: f4: fc, relu, BN, noise 0.30, params 250, dim (250,) -> (250,)
INFO:main.model: f5: fc, relu, BN, noise 0.30, params 250, dim (250,) -> (250,)
INFO:main.model: f6: fc, softmax, BN, noise 0.30, params 10, dim (250,) -> (10,)
INFO:main.model:Decoder: z_corr -> z_est
INFO:main.model: g6: gauss, denois 0.10, dim None -> (10,)
INFO:main.model: g5: gauss, denois 0.10, dim (10,) -> (250,)
INFO:main.model: g4: gauss, denois 0.10, dim (250,) -> (250,)
INFO:main.model: g3: gauss, denois 0.10, dim (250,) -> (250,)
INFO:main.model: g2: gauss, denois 0.10, dim (250,) -> (500,)
INFO:main.model: g1: gauss, denois 10.00, dim (500,) -> (1000,)
INFO:main.model: g0: gauss, denois 1000.00, dim (1000,) -> (1, 28, 28)
INFO:main:Found the following parameters: [f_6_W, f_5_b, f_5_W, f_4_b, f_4_W, f_3_b, f_3_W, f_2_b, f_2_W, f_1_b, f_1_W, g_6_a5, f_6_c, f_6_b, g_6_a4, g_6_a3, g_6_a2, g_6_a1, g_6_a10, g_6_a9, g_6_a8, g_6_a7, g_6_a6, g_5_a5, g_5_W, g_5_a4, g_5_a3, g_5_a2, g_5_a1, g_5_a10, g_5_a9, g_5_a8, g_5_a7, g_5_a6, g_4_a5, g_4_W, g_4_a4, g_4_a3, g_4_a2, g_4_a1, g_4_a10, g_4_a9, g_4_a8, g_4_a7, g_4_a6, g_3_a5, g_3_W, g_3_a4, g_3_a3, g_3_a2, g_3_a1, g_3_a10, g_3_a9, g_3_a8, g_3_a7, g_3_a6, g_2_a5, g_2_W, g_2_a4, g_2_a3, g_2_a2, g_2_a1, g_2_a10, g_2_a9, g_2_a8, g_2_a7, g_2_a6, g_1_a5, g_1_W, g_1_a4, g_1_a3, g_1_a2, g_1_a1, g_1_a10, g_1_a9, g_1_a8, g_1_a7, g_1_a6, g_0_a5, g_0_W, g_0_a4, g_0_a3, g_0_a2, g_0_a1, g_0_a10, g_0_a9, g_0_a8, g_0_a7, g_0_a6]
INFO:blocks.algorithms:Taking the cost gradient
INFO:blocks.algorithms:The cost gradient computation graph is built
INFO:main:Balancing 100 labels...
INFO:main.nn:Batch norm parameters: f_1_bn_mean_clean, f_1_bn_var_clean, f_2_bn_mean_clean, f_2_bn_var_clean, f_3_bn_mean_clean, f_3_bn_var_clean, f_4_bn_mean_clean, f_4_bn_var_clean, f_5_bn_mean_clean, f_5_bn_var_clean, f_6_bn_mean_clean, f_6_bn_var_clean
INFO:main:Balancing 100 labels...
INFO:main.nn:Batch norm parameters: f_1_bn_mean_clean, f_1_bn_var_clean, f_2_bn_mean_clean, f_2_bn_var_clean, f_3_bn_mean_clean, f_3_bn_var_clean, f_4_bn_mean_clean, f_4_bn_var_clean, f_5_bn_mean_clean, f_5_bn_var_clean, f_6_bn_mean_clean, f_6_bn_var_clean
INFO:blocks.main_loop:Entered the main loop
/home/mren/anaconda2/envs/ladder/lib/python2.7/site-packages/pandas/core/generic.py:939: PerformanceWarning:
your performance may suffer as PyTables will pickle object types that it cannot
map directly to c-types [inferred_type->mixed-integer,key->block0_values] [items->[0]]
return pytables.to_hdf(path_or_buf, key, self, **kwargs)
INFO:blocks.algorithms:Initializing the training algorithm
INFO:blocks.algorithms:The training algorithm is initialized
INFO:blocks.extensions.monitoring:Monitoring on auxiliary data started
INFO:blocks.extensions.monitoring:Monitoring on auxiliary data finished
INFO:main.utils:e 0, i 0:V_C_class nan, V_E nan, V_C_de nan nan nan nan nan nan nan
INFO:blocks.extensions.monitoring:Monitoring on auxiliary data started
INFO:blocks.extensions.monitoring:Monitoring on auxiliary data finished
INFO:main.utils:e 1, i 600:V_C_class nan, V_E nan, V_C_de nan nan nan nan nan nan nan, T_C_de 0.019 0.253 0.994 0.955 0.843 0.679 0.231, T_C_class 0.156
INFO:main.nn:Iter 1, lr 0.002000
INFO:blocks.extensions.monitoring:Monitoring on auxiliary data started
INFO:blocks.extensions.monitoring:Monitoring on auxiliary data finished
INFO:main.utils:e 2, i 1200:V_C_class nan, V_E nan, V_C_de nan nan nan nan nan nan nan, T_C_de 0.0103 0.135 0.975 0.9 0.422 0.2 0.0925, T_C_class 0.0308
INFO:main.nn:Iter 2, lr 0.002000
INFO:blocks.extensions.monitoring:Monitoring on auxiliary data started
INFO:blocks.extensions.monitoring:Monitoring on auxiliary data finished
INFO:main.utils:e 3, i 1800:V_C_class nan, V_E nan, V_C_de nan nan nan nan nan nan nan, T_C_de 0.00909 0.123 0.97 0.863 0.239 0.131 0.0731, T_C_class 0.0143
INFO:main.nn:Iter 3, lr 0.002000
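In the log above, every field starting with V_ is nan while the T_ fields are finite, and the same log reports "Using 0 examples for validation"; whether the empty validation set explains the nan values is not confirmed in this thread. To check which monitored quantities are affected over a long run, a small parser like the following may help. It is a minimal sketch: `parse_monitor_line` and `nan_metrics` are hypothetical helpers, and the line format is assumed from the INFO:main.utils lines shown above.

```python
import math
import re

# Matches the "e <epoch>, i <iteration>:<fields>" part of a monitoring line.
LINE_RE = re.compile(r"e (\d+), i (\d+):(.*)")


def parse_monitor_line(line):
    """Return (epoch, iteration, {name: [values]}) or None if the line does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    epoch, iteration, rest = match.groups()
    metrics = {}
    for field in rest.split(","):
        parts = field.split()
        if len(parts) >= 2:
            # The first token is the metric name, the rest are its values.
            metrics[parts[0]] = [float(value) for value in parts[1:]]
    return int(epoch), int(iteration), metrics


def nan_metrics(metrics):
    """Names of metrics that contain at least one NaN value, sorted."""
    return sorted(name for name, values in metrics.items()
                  if any(math.isnan(value) for value in values))


line = ("INFO:main.utils:e 1, i 600:V_C_class nan, V_E nan, "
        "V_C_de nan nan nan nan nan nan nan, "
        "T_C_de 0.019 0.253 0.994 0.955 0.843 0.679 0.231, T_C_class 0.156")
epoch, iteration, metrics = parse_monitor_line(line)
print(epoch, iteration, nan_metrics(metrics))
```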
Hi, I ran into the error "ValueError: graph contains cycles" in the Theano training main loop.
It only happens when running the gamma model with conv layers, on both the "mnist" and "cifar" datasets. If I run the "Conv-FC" model or the supervised learning model on the two datasets, training works. I am not sure how to debug this error. Could you please help with this issue? Thanks!
I am unable to run any of the examples on Ubuntu. Here is the error:
INFO:main.model: f1: convv, leakyrelu, BN, noise 0.00, params [96, 3, 1, 1], dim (3, 32, 32) -> (96, 30, 30)
Traceback (most recent call last):
File "./run.py", line 652, in <module>
if train(d) is None:
File "./run.py", line 411, in train
ladder = setup_model(p)
File "./run.py", line 182, in setup_model
ladder.apply(x, y, x_only)
File "/home/attaullah/Downloads/dlbook_exercises-master/ladder.py", line 197, in apply
clean = self.act.clean = encoder(input_concat, 'clean')
File "/home/attaullah/Downloads/dlbook_exercises-master/ladder.py", line 185, in encoder
noise_std=noise)
File "/home/attaullah/Downloads/dlbook_exercises-master/ladder.py", line 350, in f
z, output_size = self.f_conv(h, spec, in_dim, gen_id('W'))
File "/home/attaullah/Downloads/dlbook_exercises-master/ladder.py", line 451, in f_conv
filter_size), border_mode=bm)
File "/home/attaullah/.local/lib/python2.7/site-packages/theano/tensor/nnet/conv.py", line 153, in conv2d
return op(input, filters)
File "/home/attaullah/.local/lib/python2.7/site-packages/theano/gof/op.py", line 615, in __call__
node = self.make_node(*inputs, **kwargs)
File "/home/attaullah/.local/lib/python2.7/site-packages/theano/tensor/nnet/conv.py", line 655, in make_node
"inputs(%s), kerns(%s)" % (_inputs.dtype, _kerns.dtype))
NotImplementedError: The image and the kernel must have the same type.inputs(float64), kerns(float32)
I have already set data_path in ".fuelrc" and downloaded the data, but no luck. I am using 64-bit Ubuntu 17.04, Python 2.7, and all the required Python packages on a CPU-only system. Any help in this regard would be much appreciated.
I also get the following error on some examples:
INFO:main.model:Decoder: z_corr -> z_est
Traceback (most recent call last):
File "run.py", line 652, in <module>
if train(d) is None:
File "run.py", line 411, in train
ladder = setup_model(p)
File "run.py", line 182, in setup_model
ladder.apply(x, y, x_only)
File "/home/attaullah/Downloads/dlbook_exercises-master/ladder.py", line 231, in apply
top_g=top_g)
File "/home/attaullah/Downloads/dlbook_exercises-master/ladder.py", line 576, in g
a1 = bi(0., 'a1')
File "/home/attaullah/Downloads/dlbook_exercises-master/ladder.py", line 499, in <lambda>
bi = lambda inits, name: self.bias(inits * np.ones(num_filters),
File "/home/attaullah/.local/lib/python2.7/site-packages/numpy/core/numeric.py", line 192, in ones
a = empty(shape, dtype, order)
TypeError: 'numpy.float64' object cannot be interpreted as an index
I run:
run.py train --encoder-layers 1000-500-250-250-250-10 --decoder-spec sig --denoising-cost-x 1000,10,0.1,0.1,0.1,0.1,0.1 --labeled-samples 100 --unlabeled-samples 60000 --seed 1 -- mnist_100_full
I get:
File "run.py", line 649, in <module>
if train(d) is None:
File "run.py", line 405, in train
in_dim, data, whiten, cnorm = setup_data(p, test_set=False)
File "run.py", line 240, in setup_data
train_set = dataset_class("train")
File "/usr/local/lib/python2.7/dist-packages/fuel/datasets/mnist.py", line 37, in __init__
which_sets=which_sets, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/fuel/datasets/hdf5.py", line 179, in __init__
raise ValueError('`which_sets` should be an iterable of strings')
ValueError: `which_sets` should be an iterable of strings
How could I solve this? Thanks in advance.
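The message asks for an iterable of strings, while the traceback shows run.py calling dataset_class("train") with a bare string. One possible workaround is to pass a one-element tuple instead. The sketch below illustrates the idea with `as_which_sets`, a hypothetical helper that is not part of Fuel or of this repository; the suggested change to run.py is an untested assumption about the installed Fuel version.

```python
def as_which_sets(which_sets):
    """Hypothetical helper: turn a split name or a list of names into a tuple.

    A str is itself iterable (it yields characters), so a bare "train" is
    ambiguous where an iterable of split names is expected; ("train",) is not.
    """
    if isinstance(which_sets, str):
        return (which_sets,)
    return tuple(which_sets)


print(as_which_sets("train"))             # a bare name becomes a 1-tuple
print(as_which_sets(["train", "test"]))   # other iterables are kept, as a tuple

# Applied to run.py (line 240 in the traceback above), the call would read:
#     train_set = dataset_class(("train",))
```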
I am using Theano 0.9 and Python 2.7 on Linux, and I get the following attribute error. Any help?
Thanks
THEANO_FLAGS='floatX=float32' python run.py train --encoder-layers 1000-500-250-250-250-10 --decoder-spec gauss --denoising-cost-x 2000,20,0.1,0.1,0.1,0.1,0.1 --labeled-samples 50 --unlabeled-samples 60000 --seed 1 -- mnist_50_full
/home/me/ladder/venv2/local/lib/python2.7/site-packages/theano/tensor/signal/downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
"downsample module has been moved to the theano.tensor.signal.pool module.")
INFO:main:Logging into results/mnist_50_full1/log.txt
INFO:main:== COMMAND LINE ==
INFO:main:run.py train --encoder-layers 1000-500-250-250-250-10 --decoder-spec gauss --denoising-cost-x 2000,20,0.1,0.1,0.1,0.1,0.1 --labeled-samples 50 --unlabeled-samples 60000 --seed 1 -- mnist_50_full
INFO:main:== PARAMETERS ==
INFO:main: zestbn : bugfix
INFO:main: dseed : 1
INFO:main: top_c : 1
INFO:main: super_noise_std : 0.3
INFO:main: batch_size : 100
INFO:main: dataset : mnist
INFO:main: valid_set_size : 10000
INFO:main: num_epochs : 150
INFO:main: whiten_zca : 0
INFO:main: unlabeled_samples : 60000
INFO:main: decoder_spec : ('gauss',)
INFO:main: valid_batch_size : 100
INFO:main: denoising_cost_x : (2000.0, 20.0, 0.1, 0.1, 0.1, 0.1, 0.1)
INFO:main: f_local_noise_std : 0.3
INFO:main: cmd : train
INFO:main: act : relu
INFO:main: lrate_decay : 0.67
INFO:main: seed : 1
INFO:main: lr : 0.002
INFO:main: save_to : mnist_50_full
INFO:main: save_dir : results/mnist_50_full1
INFO:main: commit : 78956cd
INFO:main: contrast_norm : 0
INFO:main: encoder_layers : ('1000', '500', '250', '250', '250', '10')
INFO:main: labeled_samples : 50
INFO:main:Using 0 examples for validation
INFO:main.model:Encoder: clean, labeled
INFO:main.model: 0: noise 0
INFO:main.model: f1: fc, relu, BN, noise 0.00, params 1000, dim (1, 28, 28) -> (1000,)
INFO:main.model: f2: fc, relu, BN, noise 0.00, params 500, dim (1000,) -> (500,)
INFO:main.model: f3: fc, relu, BN, noise 0.00, params 250, dim (500,) -> (250,)
INFO:main.model: f4: fc, relu, BN, noise 0.00, params 250, dim (250,) -> (250,)
INFO:main.model: f5: fc, relu, BN, noise 0.00, params 250, dim (250,) -> (250,)
INFO:main.model: f6: fc, softmax, BN, noise 0.00, params 10, dim (250,) -> (10,)
INFO:main.model:Encoder: corr, labeled
INFO:main.model: 0: noise 0.3
INFO:main.model: f1: fc, relu, BN, noise 0.30, params 1000, dim (1, 28, 28) -> (1000,)
INFO:main.model: f2: fc, relu, BN, noise 0.30, params 500, dim (1000,) -> (500,)
INFO:main.model: f3: fc, relu, BN, noise 0.30, params 250, dim (500,) -> (250,)
INFO:main.model: f4: fc, relu, BN, noise 0.30, params 250, dim (250,) -> (250,)
INFO:main.model: f5: fc, relu, BN, noise 0.30, params 250, dim (250,) -> (250,)
INFO:main.model: f6: fc, softmax, BN, noise 0.30, params 10, dim (250,) -> (10,)
INFO:main.model:Decoder: z_corr -> z_est
INFO:main.model: g6: gauss, denois 0.10, dim None -> (10,)
INFO:main.model: g5: gauss, denois 0.10, dim (10,) -> (250,)
INFO:main.model: g4: gauss, denois 0.10, dim (250,) -> (250,)
INFO:main.model: g3: gauss, denois 0.10, dim (250,) -> (250,)
INFO:main.model: g2: gauss, denois 0.10, dim (250,) -> (500,)
INFO:main.model: g1: gauss, denois 20.00, dim (500,) -> (1000,)
INFO:main.model: g0: gauss, denois 2000.00, dim (1000,) -> (1, 28, 28)
INFO:main:Found the following parameters: [f_5_b, f_4_b, f_3_b, f_2_b, f_1_b, g_6_a5, f_6_c, f_6_b, g_6_a4, g_6_a3, g_6_a2, g_6_a1, g_6_a10, g_6_a9, g_6_a8, g_6_a7, g_6_a6, g_5_a5, g_5_a4, g_5_a3, g_5_a2, g_5_a1, g_5_a10, g_5_a9, g_5_a8, g_5_a7, g_5_a6, g_4_a5, g_4_a4, g_4_a3, g_4_a2, g_4_a1, g_4_a10, g_4_a9, g_4_a8, g_4_a7, g_4_a6, g_3_a5, g_3_a4, g_3_a3, g_3_a2, g_3_a1, g_3_a10, g_3_a9, g_3_a8, g_3_a7, g_3_a6, g_2_a5, g_2_a4, g_2_a3, g_2_a2, g_2_a1, g_2_a10, g_2_a9, g_2_a8, g_2_a7, g_2_a6, g_1_a5, g_1_a4, g_1_a3, g_1_a2, g_1_a1, g_1_a10, g_1_a9, g_1_a8, g_1_a7, g_1_a6, g_0_a5, g_0_a4, g_0_a3, g_0_a2, g_0_a1, g_0_a10, g_0_a9, g_0_a8, g_0_a7, g_0_a6, f_1_W, f_2_W, f_3_W, f_4_W, f_5_W, f_6_W, g_5_W, g_4_W, g_3_W, g_2_W, g_1_W, g_0_W]
INFO:blocks.algorithms:Taking the cost gradient
INFO:blocks.algorithms:The cost gradient computation graph is built
INFO:main:Balancing 50 labels...
INFO:main.nn:Batch norm parameters: f_1_bn_mean_clean, f_1_bn_var_clean, f_2_bn_mean_clean, f_2_bn_var_clean, f_3_bn_mean_clean, f_3_bn_var_clean, f_4_bn_mean_clean, f_4_bn_var_clean, f_5_bn_mean_clean, f_5_bn_var_clean, f_6_bn_mean_clean, f_6_bn_var_clean
INFO:main:Balancing 50 labels...
INFO:main.nn:Batch norm parameters: f_1_bn_mean_clean, f_1_bn_var_clean, f_2_bn_mean_clean, f_2_bn_var_clean, f_3_bn_mean_clean, f_3_bn_var_clean, f_4_bn_mean_clean, f_4_bn_var_clean, f_5_bn_mean_clean, f_5_bn_var_clean, f_6_bn_mean_clean, f_6_bn_var_clean
INFO:blocks.main_loop:Entered the main loop
/home/me/ladder/venv2/local/lib/python2.7/site-packages/pandas/core/generic.py:1101: PerformanceWarning:
your performance may suffer as PyTables will pickle object types that it cannot
map directly to c-types [inferred_type->mixed-integer,key->block0_values] [items->[0]]
return pytables.to_hdf(path_or_buf, key, self, **kwargs)
INFO:blocks.algorithms:Initializing the training algorithm
ERROR:blocks.main_loop:Error occured during training.
Blocks will attempt to run on_error extensions, potentially saving data, before exiting and reraising the error. Note that the usual after_training extensions will not be run. The original error will be re-raised and also stored in the training log. Press CTRL + C to halt Blocks immediately.
Traceback (most recent call last):
File "run.py", line 653, in <module>
if train(d) is None:
File "run.py", line 502, in train
main_loop.run()
File "/home/me/ladder/venv2/local/lib/python2.7/site-packages/blocks/main_loop.py", line 197, in run
reraise_as(e)
File "/home/me/ladder/venv2/local/lib/python2.7/site-packages/blocks/utils/__init__.py", line 258, in reraise_as
six.reraise(type(new_exc), new_exc, orig_exc_traceback)
File "/home/me/ladder/venv2/local/lib/python2.7/site-packages/blocks/main_loop.py", line 172, in run
self.algorithm.initialize()
File "/home/me/ladder/venv2/local/lib/python2.7/site-packages/blocks/algorithms/__init__.py", line 128, in initialize
self.inputs = ComputationGraph(update_values).inputs
File "/home/me/ladder/venv2/local/lib/python2.7/site-packages/blocks/graph/__init__.py", line 74, in __init__
self._get_variables()
File "/home/me/ladder/venv2/local/lib/python2.7/site-packages/blocks/graph/__init__.py", line 125, in _get_variables
inputs = graph.inputs(self.outputs)
File "/home/me/ladder/venv2/local/lib/python2.7/site-packages/theano/gof/graph.py", line 693, in inputs
vlist = ancestors(variable_list, blockers)
File "/home/me/ladder/venv2/local/lib/python2.7/site-packages/theano/gof/graph.py", line 672, in ancestors
dfs_variables = stack_search(deque(variable_list), expand, 'dfs')
File "/home/me/ladder/venv2/local/lib/python2.7/site-packages/theano/gof/graph.py", line 640, in stack_search
expand_l = expand(l)
File "/home/me/ladder/venv2/local/lib/python2.7/site-packages/theano/gof/graph.py", line 670, in expand
if r.owner and (not blockers or r not in blockers):
AttributeError: 'numpy.float32' object has no attribute 'owner'
Original exception:
AttributeError: 'numpy.float32' object has no attribute 'owner'
Hi, I just set up the environment and hit an issue on my first run. Could you please show me how to solve this problem? Many thanks!
$ python run.py train --encoder-layers 1000-500-250-250-250-10 --decoder-spec gauss --denoising-cost-x 2000,20,0.1,0.1,0.1,0.1,0.1 --labeled-samples 50 --unlabeled-samples 60000 --seed 1 -- mnist_50_full
INFO:main:Logging into results/mnist_50_full6/log.txt
INFO:main:== COMMAND LINE ==
INFO:main:run.py train --encoder-layers 1000-500-250-250-250-10 --decoder-spec gauss --denoising-cost-x 2000,20,0.1,0.1,0.1,0.1,0.1 --labeled-samples 50 --unlabeled-samples 60000 --seed 1 -- mnist_50_full
INFO:main:== PARAMETERS ==
INFO:main: zestbn : bugfix
INFO:main: dseed : 1
INFO:main: top_c : 1
INFO:main: super_noise_std : 0.3
INFO:main: batch_size : 100
INFO:main: dataset : mnist
INFO:main: valid_set_size : 10000
INFO:main: num_epochs : 150
INFO:main: whiten_zca : 0
INFO:main: unlabeled_samples : 60000
INFO:main: decoder_spec : ('gauss',)
INFO:main: valid_batch_size : 100
INFO:main: denoising_cost_x : (2000.0, 20.0, 0.1, 0.1, 0.1, 0.1, 0.1)
INFO:main: f_local_noise_std : 0.3
INFO:main: cmd : train
INFO:main: act : relu
INFO:main: lrate_decay : 0.67
INFO:main: seed : 1
INFO:main: lr : 0.002
INFO:main: save_to : mnist_50_full
INFO:main: save_dir : results/mnist_50_full6
INFO:main: commit : 5a8daa1
INFO:main: contrast_norm : 0
INFO:main: encoder_layers : ('1000', '500', '250', '250', '250', '10')
INFO:main: labeled_samples : 50
Traceback (most recent call last):
File "run.py", line 656, in <module>
if train(d) is None:
File "run.py", line 410, in train
in_dim, data, whiten, cnorm = setup_data(p, test_set=False)
File "run.py", line 245, in setup_data
train_set = dataset_class(["train"])
File "/usr/local/lib/python2.7/dist-packages/fuel/datasets/mnist.py", line 36, in __init__
super(MNIST, self).__init__(self.data_path, which_set, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/fuel/datasets/hdf5.py", line 146, in __init__
"{}.".format(self.available_splits))
ValueError: '['train']' split is not provided by this dataset. Available splits are (u'test', u'train').
`INFO:main.utils:e 0, i 0:V_C_class nan, V_E nan, V_C_de nan
ERROR:blocks.main_loop:Error occured during training.
Blocks will attempt to run on_error extensions, potentially saving data, before exiting and reraising the error. Note that the usual after_training extensions will not be run. The original error will be re-raised and also stored in the training log. Press CTRL + C to halt Blocks immediately.
Traceback (most recent call last):
File "run.py", line 660, in <module>
if train(d) is None:
File "run.py", line 509, in train
main_loop.run()
File "/home/julian/anaconda2/envs/ladder/lib/python2.7/site-packages/blocks/main_loop.py", line 197, in run
reraise_as(e)
File "/home/julian/anaconda2/envs/ladder/lib/python2.7/site-packages/blocks/utils/__init__.py", line 258, in reraise_as
six.reraise(type(new_exc), new_exc, orig_exc_traceback)
File "/home/julian/anaconda2/envs/ladder/lib/python2.7/site-packages/blocks/main_loop.py", line 183, in run
while self._run_epoch():
File "/home/julian/anaconda2/envs/ladder/lib/python2.7/site-packages/blocks/main_loop.py", line 232, in _run_epoch
while self._run_iteration():
File "/home/julian/anaconda2/envs/ladder/lib/python2.7/site-packages/blocks/main_loop.py", line 253, in _run_iteration
self.algorithm.process_batch(batch)
File "/home/julian/anaconda2/envs/ladder/lib/python2.7/site-packages/blocks/algorithms/__init__.py", line 287, in process_batch
self._function(*ordered_batch)
File "/home/julian/anaconda2/envs/ladder/lib/python2.7/site-packages/theano/compile/function_module.py", line 871, in __call__
storage_map=getattr(self.fn, 'storage_map', None))
File "/home/julian/anaconda2/envs/ladder/lib/python2.7/site-packages/theano/gof/link.py", line 314, in raise_with_op
reraise(exc_type, exc_value, exc_trace)
File "/home/julian/anaconda2/envs/ladder/lib/python2.7/site-packages/theano/compile/function_module.py", line 859, in __call__
outputs = self.fn()
ValueError: GpuElemwise. Input dimension mis-match. Input 2 (indices start at 0) has shape[2] == 6, but the output's size on that axis is 5.
Apply node that caused the error: GpuElemwise{Composite{((i0 + (i1 * i2)) + i3)}}[(0, 0)](GpuJoin.0, CudaNdarrayConstant{[[[[ 0.30000001]]]]}, GpuReshape{4}.0, GpuDimShuffle{x,0,x,x}.0)
Toposort index: 1535
Inputs types: [CudaNdarrayType(float32, 4D), CudaNdarrayType(float32, (True, True, True, True)), CudaNdarrayType(float32, 4D), CudaNdarrayType(float32, (True, False, True, True))]
Inputs shapes: [(200, 192, 5, 5), (1, 1, 1, 1), (200, 192, 6, 6), (1, 192, 1, 1)]
Inputs strides: [(4800, 25, 5, 1), (0, 0, 0, 0), (6912, 36, 6, 1), (0, 1, 0, 0)]
Inputs values: ['not shown', CudaNdarray([[[[ 0.30000001]]]]), 'not shown', 'not shown']
Outputs clients: [[GpuElemwise{Composite{Switch(i0, i1, (i2 * i1))},no_inplace}(GpuElemwise{Composite{Cast{float32}(GT(i0, i1))},no_inplace}.0, GpuElemwise{Composite{((i0 + (i1 * i2)) + i3)}}[(0, 0)].0, CudaNdarrayConstant{[[[[ 0.1]]]]}), GpuElemwise{Composite{Cast{float32}(GT(i0, i1))},no_inplace}(GpuElemwise{Composite{((i0 + (i1 * i2)) + i3)}}[(0, 0)].0, CudaNdarrayConstant{[[[[ 0.]]]]})]]
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
Original exception:
ValueError: GpuElemwise. Input dimension mis-match. Input 2 (indices start at 0) has shape[2] == 6, but the output's size on that axis is 5.
Apply node that caused the error: GpuElemwise{Composite{((i0 + (i1 * i2)) + i3)}}[(0, 0)](GpuJoin.0, CudaNdarrayConstant{[[[[ 0.30000001]]]]}, GpuReshape{4}.0, GpuDimShuffle{x,0,x,x}.0)
Toposort index: 1535
Inputs types: [CudaNdarrayType(float32, 4D), CudaNdarrayType(float32, (True, True, True, True)), CudaNdarrayType(float32, 4D), CudaNdarrayType(float32, (True, False, True, True))]
Inputs shapes: [(200, 192, 5, 5), (1, 1, 1, 1), (200, 192, 6, 6), (1, 192, 1, 1)]
Inputs strides: [(4800, 25, 5, 1), (0, 0, 0, 0), (6912, 36, 6, 1), (0, 1, 0, 0)]
Inputs values: ['not shown', CudaNdarray([[[[ 0.30000001]]]]), 'not shown', 'not shown']
Outputs clients: [[GpuElemwise{Composite{Switch(i0, i1, (i2 * i1))},no_inplace}(GpuElemwise{Composite{Cast{float32}(GT(i0, i1))},no_inplace}.0, GpuElemwise{Composite{((i0 + (i1 * i2)) + i3)}}[(0, 0)].0, CudaNdarrayConstant{[[[[ 0.1]]]]}), GpuElemwise{Composite{Cast{float32}(GT(i0, i1))},no_inplace}(GpuElemwise{Composite{((i0 + (i1 * i2)) + i3)}}[(0, 0)].0, CudaNdarrayConstant{[[[[ 0.]]]]})]]
HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
`
There is an error message: ValueError: GpuElemwise. Input dimension mis-match. Input 2 (indices start at 0) has shape[2] == 6, but the output's size on that axis is 5.
Do you have any idea what causes it?
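One possible explanation, not confirmed anywhere in this thread, is a pooling layer that rounds its output size differently in two places (Theano's `pool_2d` has an `ignore_border` option that controls this). The sketch below only does the size arithmetic. It assumes a 32x32 CIFAR-10 input, the CIFAR-10 encoder spec `convv:96:3`, `convf:96:3`, `convf:96:3`, `maxpool:2:2`, `convv:192:3`, `convf:192:3`, `convv:192:3`, `maxpool:2:2`, `convv:192:3`, and that `convv` means a "valid" and `convf` a "full" convolution; all of these are assumptions, since the issue does not show the command that was run.

```python
def conv_out(size, kernel, mode):
    """Spatial output size of a stride-1 convolution."""
    if mode == "valid":
        return size - kernel + 1
    if mode == "full":
        return size + kernel - 1
    raise ValueError("unknown border mode: %r" % mode)


def pool_out(size, pool, ignore_border):
    """Output size of non-overlapping pooling (stride equal to pool size)."""
    if ignore_border:
        return size // pool      # drop the incomplete window at the edge
    return -(-size // pool)      # keep it (ceiling division)


# Assumed layer stack up to the first 192-channel layer after the second pool.
LAYERS = [("valid", 3), ("full", 3), ("full", 3), ("pool", 2),
          ("valid", 3), ("full", 3), ("valid", 3), ("pool", 2),
          ("valid", 3)]


def trace(size, ignore_border):
    """Follow the spatial size through the assumed layer stack."""
    for kind, k in LAYERS:
        if kind == "pool":
            size = pool_out(size, k, ignore_border)
        else:
            size = conv_out(size, k, kind)
    return size


print(trace(32, ignore_border=True))   # 5
print(trace(32, ignore_border=False))  # 6
```

Under these assumptions the two rounding conventions give exactly the 5 and 6 seen in the shapes `(200, 192, 5, 5)` and `(200, 192, 6, 6)`, so checking the pooling call and the installed Theano version may be a reasonable first step.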
When balanced_classes=True in line
https://github.com/arasmus/ladder/blob/master/run.py#L154
the examples from each class are added one after the other, but i_labeled is not shuffled afterwards.
The only shuffling (with dseed) is applied to the entire data set in setup_data, but make_datastream is called after that; it sorts out the labeled examples of each class and thereby undoes the shuffling.
Class-ordered labeled examples can hurt SGD optimization.
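The proposed fix can be sketched as follows: pick the same number of examples per class, then shuffle the combined index list once more. This is an illustration in plain Python, not the code from `run.py`; `balanced_labeled_indices` and its parameters are hypothetical names.

```python
import random


def balanced_labeled_indices(labels, n_labeled, seed=1):
    """Pick the same number of examples per class, then shuffle the result."""
    classes = sorted(set(labels))
    per_class = n_labeled // len(classes)
    i_labeled = []
    for c in classes:
        # Examples of one class are appended one after the other here ...
        members = [i for i, y in enumerate(labels) if y == c]
        i_labeled.extend(members[:per_class])
    # ... so shuffle once more to avoid a class-sorted order.
    random.Random(seed).shuffle(i_labeled)
    return i_labeled


labels = [0, 1, 2] * 10
idx = balanced_labeled_indices(labels, n_labeled=6)
print(idx)
print(sorted(labels[i] for i in idx))  # [0, 0, 1, 1, 2, 2]
```

Whether the data stream reshuffles the labeled examples on every epoch anyway is not something this sketch can tell; it only shows the extra shuffle the report asks for.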
I got the following error when trying to run the cifar10 example:
Traceback (most recent call last):
File "./run.py", line 652, in <module>
if train(d) is None:
File "./run.py", line 410, in train
ladder = setup_model(p)
File "./run.py", line 181, in setup_model
ladder.apply(x, y, x_only)
File "/home/petteri/ladder/ladder.py", line 195, in apply
clean = self.act.clean = encoder(input_concat, 'clean')
File "/home/petteri/ladder/ladder.py", line 183, in encoder
noise_std=noise)
File "/home/petteri/ladder/ladder.py", line 349, in f
z, output_size = self.f_conv(h, spec, in_dim, gen_id('W'))
File "/home/petteri/ladder/ladder.py", line 450, in f_conv
filter_size), border_mode=bm)
File "/home/petteri/anaconda2/lib/python2.7/site-packages/theano/tensor/nnet/conv.py", line 153, in conv2d
return op(input, filters)
File "/home/petteri/anaconda2/lib/python2.7/site-packages/theano/gof/op.py", line 602, in __call__
node = self.make_node(*inputs, **kwargs)
File "/home/petteri/anaconda2/lib/python2.7/site-packages/theano/tensor/nnet/conv.py", line 655, in make_node
"inputs(%s), kerns(%s)" % (_inputs.dtype, _kerns.dtype))
NotImplementedError: The image and the kernel must have the same type.inputs(float64), kerns(float32)
Which got fixed by casting manually at line 180 of ladder.py:
h = T.cast(h, 'float32')
But this later leads to a TypeError:
INFO:blocks.algorithms:Initializing the training algorithm
ERROR:blocks.main_loop:Error occured during training.
Blocks will attempt to run `on_error` extensions, potentially saving data, before exiting and reraising the error. Note that the usual `after_training` extensions will *not* be run. The original error will be re-raised and also stored in the training log. Press CTRL + C to halt Blocks immediately.
Traceback (most recent call last):
File "./run.py", line 652, in <module>
if train(d) is None:
File "./run.py", line 500, in train
main_loop.run()
File "/home/petteri/anaconda2/lib/python2.7/site-packages/blocks/main_loop.py", line 188, in run
reraise_as(e)
File "/home/petteri/anaconda2/lib/python2.7/site-packages/blocks/utils/__init__.py", line 225, in reraise_as
six.reraise(type(new_exc), new_exc, orig_exc_traceback)
File "/home/petteri/anaconda2/lib/python2.7/site-packages/blocks/main_loop.py", line 164, in run
self.algorithm.initialize()
File "/home/petteri/anaconda2/lib/python2.7/site-packages/blocks/algorithms/__init__.py", line 224, in initialize
self._function = theano.function(self.inputs, [], updates=all_updates)
File "/home/petteri/anaconda2/lib/python2.7/site-packages/theano/compile/function.py", line 322, in function
output_keys=output_keys)
File "/home/petteri/anaconda2/lib/python2.7/site-packages/theano/compile/pfunc.py", line 443, in pfunc
no_default_updates=no_default_updates)
File "/home/petteri/anaconda2/lib/python2.7/site-packages/theano/compile/pfunc.py", line 208, in rebuild_collect_shared
raise TypeError(err_msg, err_sug)
TypeError: ('An update must have the same type as the original shared variable (shared_var=f_11_b, shared_var.type=TensorType(float32, vector), update_val=Elemwise{sub,no_inplace}.0, update_val.type=TensorType(float64, vector))., If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.\n\nOriginal exception:\n\tTypeError: An update must have the same type as the original shared variable (shared_var=f_11_b, shared_var.type=TensorType(float32, vector), update_val=Elemwise{sub,no_inplace}.0, update_val.type=TensorType(float64, vector))., If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.', 'If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.')
Any thoughts on where it goes wrong?
The MNIST cases are successfully replicated but I was not able to run the CIFAR10 case.
When running the command:
./run.py train --encoder-layers convv:96:3:1:1-convf:96:3:1:1-convf:96:3:1:1-maxpool:2:2-convv:192:3:1:1-convf:192:3:1:1-convv:192:3:1:1-maxpool:2:2-convv:192:3:1:1-convv:192:1:1:1-convv:10:1:1:1-globalmeanpool:0 --decoder-spec 0-0-0-0-0-0-0-0-0-0-0-0-0 --dataset cifar10 --act leakyrelu --denoising-cost-x 0,0,0,0,0,0,0,0,0,0,0,0,0 --num-epochs 20 --lrate-decay 0.5 --seed 1 --whiten-zca 3072 --contrast-norm 55 --top-c False --labeled-samples 4000 --unlabeled-samples 50000 -- cifar_4k_baseline
I get the following error:
Traceback (most recent call last):
File "run.py", line 651, in <module>
if train(d) is None:
File "run.py", line 405, in train
in_dim, data, whiten, cnorm = setup_data(p, test_set=False)
File "run.py", line 287, in setup_data
whiten.fit(p.whiten_zca, get_data(d.train, d.train_ind))
File "run.py", line 279, in get_data
data = d.get_data(request=i)[d.sources.index('features')]
File "/home/chiehchi/anaconda/lib/python2.7/site-packages/fuel/datasets/hdf5.py", line 532, in get_data
data, shapes = self._in_memory_get_data(state, request)
File "/home/chiehchi/anaconda/lib/python2.7/site-packages/fuel/datasets/hdf5.py", line 545, in _in_memory_get_data
for data_source in self.data_sources]
File "/home/chiehchi/anaconda/lib/python2.7/site-packages/fuel/utils.py", line 241, in index_within_subset
request = self[subset_request]
File "/home/chiehchi/anaconda/lib/python2.7/site-packages/fuel/utils.py", line 118, in __getitem__
if key == slice(None, None, None):
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
How could I solve this? Thanks in advance!
Do you have intuition on what this means? When training, denoising costs are printed and computed -- but I infer not added to the actual cost that is used in the backprop step.
Hi, dear all,
I find that running the training program uses a lot of CPU, and my GPU does not seem to be involved. Are there any suggestions?
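Theano runs on the CPU unless a device is selected, and the commands quoted in this thread set only `floatX`. One way to select the GPU is the `THEANO_FLAGS` environment variable, a comma-separated list of `key=value` pairs. The sketch below merges extra flags into whatever is already set; `merge_theano_flags` is a hypothetical helper, and `device=gpu` is an assumption that fits the older CUDA backend (the newer gpuarray backend uses `device=cuda`).

```python
import os


def merge_theano_flags(existing, **overrides):
    """Merge key=value overrides into a comma-separated THEANO_FLAGS string."""
    flags = {}
    for item in existing.split(","):
        if item.strip():
            key, _, value = item.partition("=")
            flags[key.strip()] = value.strip()
    flags.update(overrides)
    return ",".join("%s=%s" % (k, v) for k, v in sorted(flags.items()))


# Set this before `import theano`, so the flags are in place when Theano loads.
current = os.environ.get("THEANO_FLAGS", "")
os.environ["THEANO_FLAGS"] = merge_theano_flags(
    current, device="gpu", floatX="float32")
print(os.environ["THEANO_FLAGS"])
```

The same effect can be had on the command line by prefixing the run with `THEANO_FLAGS='device=gpu,floatX=float32'`, in the style of the commands above.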
I'm trying to run one of the commands from the README file:
THEANO_FLAGS='floatX=float32' ./run.py train --encoder-layers 1000-500-250-250-250-10 --decoder-spec gauss --denoising-cost-x 1000,1,0.01,0.01,0.01,0.01,0.01 --labeled-samples 60000 --unlabeled-samples 60000 --seed 1 -- mnist_all_full
but keep getting the following error:
Traceback (most recent call last):
File "./run.py", line 651, in <module>
if train(d) is None:
File "./run.py", line 423, in train
step_rule=Adam(learning_rate=ladder.lr))
File "/usr/local/lib/python2.7/site-packages/blocks/algorithms/__init__.py", line 794, in __init__
self.learning_rate = shared_floatx(learning_rate, "learning_rate")
File "/usr/local/lib/python2.7/site-packages/blocks/utils/__init__.py", line 167, in shared_floatx
return theano.shared(theano._asarray(value, dtype=dtype),
File "/usr/local/lib/python2.7/site-packages/theano/misc/safe_asarray.py", line 33, in _asarray
rval = numpy.asarray(a, dtype=dtype, order=order)
File "/usr/local/lib/python2.7/site-packages/numpy/core/numeric.py", line 474, in asarray
return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.
If I look at the parameters passed to "array(...)" in line 474 of numeric.py, I see the following:
a -> learning_rate
type(a) -> <class 'theano.tensor.sharedvar.ScalarSharedVariable'>
dtype -> float64
Could this be due to a version conflict of any of the involved libraries?
I think that in line
https://github.com/arasmus/ladder/blob/c8d26028e54c886b08d8fbccd5a6be1bbe76da52/run.py#L592
you are missing nargs='+'.
By chance, this does not have any effect on how the code runs.
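For readers unfamiliar with the difference, here is a small standalone `argparse` comparison. The two flag names are hypothetical and chosen only to contrast the declarations; they are not the option at the linked line.

```python
import argparse

parser = argparse.ArgumentParser()
# Hypothetical flags, used only to contrast the two declarations.
parser.add_argument("--without-nargs", type=float, default=[0.0])
parser.add_argument("--with-nargs", type=float, nargs="+", default=[0.0])

args = parser.parse_args(
    ["--without-nargs", "0.5", "--with-nargs", "0.5", "0.1"])
print(args.without_nargs)  # 0.5
print(args.with_nargs)     # [0.5, 0.1]
```

Without `nargs='+'` the option yields a single value rather than a list, which goes unnoticed as long as the option is left at its list-valued default or the consuming code accepts a scalar.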
I am able to run MNIST, but running CIFAR-10 fails with this error:
TypeError: pool_2d() got an unexpected keyword argument 'ds'
so I changed line 288 in nn.py to
z = pool_2d(z, ws=poolsize, stride=poolstride)
I tried running with Theano 0.8.0, 0.8.2, and 0.9.0, but now I get this error:
line 283, in pool_2d
assert cuda.dnn.dnn_available()
AttributeError: 'module' object has no attribute 'dnn'
nvcc --version
gives the following:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2013 NVIDIA Corporation
Built on Wed_Jul_17_18:36:13_PDT_2013
Cuda compilation tools, release 5.5, V5.5.0
Any idea how to resolve this error?
Could someone tell me what happens if I use standard batch normalization instead of the variant developed for the Ladder network? I also don't understand why, during evaluation, the test and validation sets need the statistics of the training data.
wangxiao@GTX980:~/Desktop/ladder-master$ ./run.py train --encoder-layers 1000-500-250-250-250-10 --decoder-spec gauss --denoising-cost-x 2000,0,0,0,0,0,0 --labeled-samples 60000 --unlabeled-samples 60000 --seed 1 -- mnist_all_bottom
/usr/local/lib/python2.7/dist-packages/theano/tensor/signal/downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
"downsample module has been moved to the theano.tensor.signal.pool module.")
ERROR:main:Subprocess returned fatal: Not a git repository (or any of the parent directories): .git
INFO:main:Logging into results/mnist_all_bottom0/log.txt
INFO:main:== COMMAND LINE ==
INFO:main:./run.py train --encoder-layers 1000-500-250-250-250-10 --decoder-spec gauss --denoising-cost-x 2000,0,0,0,0,0,0 --labeled-samples 60000 --unlabeled-samples 60000 --seed 1 -- mnist_all_bottom
INFO:main:== PARAMETERS ==
INFO:main: zestbn : bugfix
INFO:main: dseed : 1
INFO:main: top_c : 1
INFO:main: super_noise_std : 0.3
INFO:main: batch_size : 100
INFO:main: dataset : mnist
INFO:main: valid_set_size : 10000
INFO:main: num_epochs : 150
INFO:main: whiten_zca : 0
INFO:main: unlabeled_samples : 60000
INFO:main: decoder_spec : ('gauss',)
INFO:main: valid_batch_size : 100
INFO:main: denoising_cost_x : (2000.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
INFO:main: f_local_noise_std : 0.3
INFO:main: cmd : train
INFO:main: act : relu
INFO:main: lrate_decay : 0.67
INFO:main: seed : 1
INFO:main: lr : 0.002
INFO:main: save_to : mnist_all_bottom
INFO:main: save_dir : results/mnist_all_bottom0
INFO:main: commit :
INFO:main: contrast_norm : 0
INFO:main: encoder_layers : ('1000', '500', '250', '250', '250', '10')
INFO:main: labeled_samples : 60000
Traceback (most recent call last):
File "./run.py", line 651, in <module>
if train(d) is None:
File "./run.py", line 405, in train
in_dim, data, whiten, cnorm = setup_data(p, test_set=False)
File "./run.py", line 240, in setup_data
train_set = dataset_class("train")
File "/usr/local/lib/python2.7/dist-packages/fuel/datasets/mnist.py", line 36, in __init__
super(MNIST, self).__init__(self.data_path, which_set, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/fuel/datasets/mnist.py", line 40, in data_path
return os.path.join(config.data_path, self.filename)
File "/usr/local/lib/python2.7/dist-packages/fuel/config_parser.py", line 101, in __getattr__
"provided: {}.".format(key))
fuel.config_parser.ConfigurationError: Configuration not set and no default provided: data_path.
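wangxiao@GTX980:~/Desktop/ladder-master$ ./run.py train --encoder-layers 1000-500-250-250-250-10 --decoder-spec 0-0-0-0-0-0-gauss --denoising-cost-x 0,0,0,0,0,0,2 --labeled-samples 60000 --unlabeled-samples 60000 --seed 1 -- mnist_all_gamma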
/usr/local/lib/python2.7/dist-packages/theano/tensor/signal/downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.
"downsample module has been moved to the theano.tensor.signal.pool module.")
ERROR:main:Subprocess returned fatal: Not a git repository (or any of the parent directories): .git
INFO:main:Logging into results/mnist_all_gamma0/log.txt
INFO:main:== COMMAND LINE ==
INFO:main:./run.py train --encoder-layers 1000-500-250-250-250-10 --decoder-spec 0-0-0-0-0-0-gauss --denoising-cost-x 0,0,0,0,0,0,2 --labeled-samples 60000 --unlabeled-samples 60000 --seed 1 -- mnist_all_gamma
INFO:main:== PARAMETERS ==
INFO:main: zestbn : bugfix
INFO:main: dseed : 1
INFO:main: top_c : 1
INFO:main: super_noise_std : 0.3
INFO:main: batch_size : 100
INFO:main: dataset : mnist
INFO:main: valid_set_size : 10000
INFO:main: num_epochs : 150
INFO:main: whiten_zca : 0
INFO:main: unlabeled_samples : 60000
INFO:main: decoder_spec : ('0', '0', '0', '0', '0', '0', 'gauss')
INFO:main: valid_batch_size : 100
INFO:main: denoising_cost_x : (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 2.0)
INFO:main: f_local_noise_std : 0.3
INFO:main: cmd : train
INFO:main: act : relu
INFO:main: lrate_decay : 0.67
INFO:main: seed : 1
INFO:main: lr : 0.002
INFO:main: save_to : mnist_all_gamma
INFO:main: save_dir : results/mnist_all_gamma0
INFO:main: commit :
INFO:main: contrast_norm : 0
INFO:main: encoder_layers : ('1000', '500', '250', '250', '250', '10')
INFO:main: labeled_samples : 60000
Traceback (most recent call last):
File "./run.py", line 651, in <module>
if train(d) is None:
File "./run.py", line 405, in train
in_dim, data, whiten, cnorm = setup_data(p, test_set=False)
File "./run.py", line 240, in setup_data
train_set = dataset_class("train")
File "/usr/local/lib/python2.7/dist-packages/fuel/datasets/mnist.py", line 36, in __init__
super(MNIST, self).__init__(self.data_path, which_set, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/fuel/datasets/mnist.py", line 40, in data_path
return os.path.join(config.data_path, self.filename)
File "/usr/local/lib/python2.7/dist-packages/fuel/config_parser.py", line 101, in __getattr__
"provided: {}.".format(key))
fuel.config_parser.ConfigurationError: Configuration not set and no default provided: data_path.
wangxiao@GTX980:~/Desktop/ladder-master$
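The `ConfigurationError` says Fuel has no `data_path` configured. Fuel reads this setting from the `FUEL_DATA_PATH` environment variable or from a `~/.fuelrc` file. A minimal sketch of setting the variable from Python follows; the directory is a hypothetical placeholder and should point at wherever the converted `mnist.hdf5` was written.

```python
import os

# Hypothetical location; use the directory that holds the converted mnist.hdf5.
data_dir = os.path.expanduser("~/fuel_data")

# Only set the variable if the shell has not already provided one.
os.environ.setdefault("FUEL_DATA_PATH", data_dir)
print(os.environ["FUEL_DATA_PATH"])
```

This has to run before `fuel` is imported. Exporting `FUEL_DATA_PATH` in the shell before calling `./run.py` achieves the same without touching the code.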
When running the command for mnist_100_full,
# Full
run.py train --encoder-layers 1000-500-250-250-250-10 --decoder-spec sig --denoising-cost-x 1000,10,0.1,0.1,0.1,0.1,0.1 --labeled-samples 100 --unlabeled-samples 60000 --seed 1 -- mnist_100_full
I get 16.7% error, which is nothing close to the 1.13% reported in the paper.
I ran this code several times and always get a test accuracy of about 98.5%. That is a little lower than the accuracy reported in the paper, isn't it?
Hi I'm reimplementing the ladder and tagger networks in TensorFlow and have found what I believe to be a (minor) bug. Could you please clarify this?
Batch normalization (BN) parameters in the decoder (lines 488-494 in ladder.py) are not annotated as having role BNPARAM. As such, they are not replaced in the graph for the training set statistics by TestMonitoring._get_bn_params. If one attempts to evaluate the model after training with very small batch sizes performance is degraded because the mean and variance for the BN step is computed from a very small sample.
Thanks,
Guillem
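The effect described above can be seen with a toy normalization in plain Python. This is an illustration only, not the code from ladder.py; the function name, the `eps` value, and the "training" statistics are made up for the example.

```python
import math


def batch_norm(values, mean=None, var=None, eps=1e-10):
    """Normalize values with given statistics, or with the batch's own."""
    if mean is None:
        mean = sum(values) / len(values)
    if var is None:
        var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


# Statistics collected from a (made-up) large training set.
train_mean, train_var = 2.0, 4.0

single = [5.0]  # an evaluation "batch" of one example
print(batch_norm(single))                         # [0.0]
print(batch_norm(single, train_mean, train_var))  # approx. [1.5]
```

With its own statistics, a batch of one always normalizes to zero, so the activation carries no information; with the stored training statistics, the same input keeps its relative position.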
Hi!
I'm a newbie in this domain and am studying the method proposed in the paper as a school project.
I fixed the package version errors and downloaded the data, but the second conversion command doesn't work, so I get the following error:
Traceback (most recent call last):
File "./run.py", line 652, in <module>
if train(d) is None:
File "./run.py", line 406, in train
in_dim, data, whiten, cnorm = setup_data(p, test_set=False)
File "./run.py", line 235, in setup_data
}[p.dataset]
KeyError: 'a'
Does anyone have an idea how to solve this error?
Thank you.