atcold / pytorch-cortexnet

PyTorch implementation of the CortexNet predictive model

Home Page: http://tinyurl.com/CortexNet/

Languages: Python 0.52%, Jupyter Notebook 99.34%, Shell 0.13%, Gnuplot 0.01%
Topics: pytorch, video, deep-learning, predictive-modeling, self-supervised, unsupervised-learning

pytorch-cortexnet's Introduction

CortexNet

This repo contains the PyTorch implementation of CortexNet.
Check the project website for further information.

Project structure

The project consists of the following folders and files:

  • data/: contains Bash scripts and a Python class definition for video data loading;
  • image-pretraining/: hosts the code for pre-training TempoNet's discriminative branch;
  • model/: stores several network architectures, including PredNet, an additive feedback Model01, and a modulatory feedback Model02 (CortexNet);
  • notebook/: collection of Jupyter Notebooks for data exploration and results visualisation;
  • utils/: scripts for
    • (current or former) training error plotting,
    • experiments diff,
    • multi-node synchronisation,
    • generative predictions visualisation,
    • network architecture graphing;
  • results@: link to the location where experimental results will be saved within 3-digit folders;
  • new_experiment.sh*: creates a new experiment folder, updates last@, prints a memo about last used settings;
  • last@: symbolic link pointing to a new results sub-directory created by new_experiment.sh;
  • main.py: training script for CortexNet in MatchNet or TempoNet configuration;

Dependencies

  • skvideo: video loading
pip install sk-video
  • tqdm: progress bar
conda config --add channels conda-forge
conda update --all
conda install tqdm

IDE

This project has been realised with PyCharm by JetBrains and the Vim editor. Grip has also been fundamental for crafting decent documentation locally.

Initialise environment

Once you've determined where you'd like to save your experimental results — let's call this directory <my saving location> — run the following commands from the project's root directory:

ln -s <my saving location> results  # replace <my saving location>
mkdir results/000 && touch results/000/train.log  # init. placeholder
ln -s results/000 last  # create pointer to the most recent result

Setup new experiment

Ready to run your first experiment? Type the following:

./new_experiment.sh

GPU selection

Let's say your machine has N GPUs. You can choose to use any of these by specifying the index n = 0, ..., N-1. To do so, type CUDA_VISIBLE_DEVICES=n just before python ... in the following sections.
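
For example, to run the MatchNet training command from the section below on GPU 0:

CUDA_VISIBLE_DEVICES=0 python -u main.py --mode MatchNet <CLI arguments> | tee last/train.log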

Train MatchNet

  • Download e-VDS35 (e.g. e-VDS35-May17.tar) from here.
  • Use data/resize_and_split.sh to prepare your (video) data for training. It resizes videos found in folders of folders (i.e. a directory of classes) and may split them into a training and a validation set. It may also skip short videos and trim longer ones. Check data/README.md for more details.
  • Run the main.py script to start training. Use -h to print the command line interface (CLI) arguments help.
python -u main.py --mode MatchNet <CLI arguments> | tee last/train.log

Train TempoNet

  • Download e-VDS35 (e.g. e-VDS35-May17.tar) from here.
  • Pre-train the forward branch (see image-pretraining/) on an image data set (e.g. 33-image-set.tar from here);
  • Use data/resize_and_sample.sh to prepare your (video) data for training. It resizes videos found in folders of folders (i.e. a directory of classes) and samples them. Videos are then distributed across the training and validation sets. It may also skip short videos and trim longer ones. Check data/README.md for more details.
  • Run the main.py script to start training. Use -h to print the CLI arguments help.
python -u main.py --mode TempoNet --pre-trained <path> <CLI args> | tee last/train.log

GPU selection

To run on a specific GPU, say n, type CUDA_VISIBLE_DEVICES=n just before python ....

pytorch-cortexnet's People

Contributors

atcold, codeac29, shi69


pytorch-cortexnet's Issues

Issue about ConvLSTM

Hi Atcold

Thanks for the wonderful code and also the wonderful tutorials on PyTorch!
I actually have a doubt --
I read two articles detailing the ConvLSTM model:
[1] https://en.wikipedia.org/w/index.php?title=Long_short-term_memory&oldid=784163987#cite_note-20
[2] https://arxiv.org/pdf/1506.04214.pdf -- equation (3)

In both of the above models, the previous cell state (i.e. prev_cell) is used when computing the input, forget, and output gates, whereas you have not used it. Can you please tell me why not?

Thanks
Devraj
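
For reference, here is a minimal ConvLSTM-style cell sketch in PyTorch (hypothetical code, not taken from this repository) that illustrates the distinction being asked about: the gates below are computed from the input and the previous hidden state only, so the peephole terms involving the previous cell state from equation (3) of [2] are omitted, and the cell state enters only the state update.

import torch
import torch.nn as nn

class ConvLSTMCellSketch(nn.Module):
    """Minimal ConvLSTM cell without peephole connections (illustrative only)."""
    def __init__(self, in_channels, hidden_channels, kernel_size=3):
        super().__init__()
        # one convolution producing all four gates at once
        self.gates = nn.Conv2d(in_channels + hidden_channels, 4 * hidden_channels,
                               kernel_size, padding=kernel_size // 2)

    def forward(self, x, state):
        h_prev, c_prev = state
        # gates depend on x and h_prev only; no peephole terms with c_prev
        i, f, o, g = torch.chunk(self.gates(torch.cat((x, h_prev), dim=1)), 4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)
        c = f * c_prev + i * g      # c_prev is used only here, in the state update
        h = o * torch.tanh(c)
        return h, c

# usage sketch
cell = ConvLSTMCellSketch(3, 8)
x = torch.randn(1, 3, 16, 16)
h = torch.zeros(1, 8, 16, 16)
c = torch.zeros(1, 8, 16, 16)
h, c = cell(x, (h, c))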

Regarding the datasets and pre-trained model weights

Hello, it seems that the data link provided in your readme cannot be accessed at ftp://elab-board2.ecn.purdue.edu/e-VDS/e-VDS35-May17.tar. Additionally, I was wondering if you have any pre-trained model weights available?

RuntimeError: bool value of Tensor with more than one value is ambiguous

While running your code "DiscriminativeCell.py", I encountered this error:
line 35, in forward
input_projection = self.first and bottom_up or f.relu(f.max_pool2d(self.from_bottom(bottom_up), POOL, POOL))
RuntimeError: bool value of Tensor with more than one value is ambiguous

Can you give me a hint on how to solve this?
Thank you very much.
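
One possible rewrite of the failing line (an assumption, not a confirmed fix): the self.first and bottom_up or ... idiom performs a truth test on a tensor, which raises the ambiguity error; branching explicitly on the boolean flag avoids it.

# hypothetical replacement for line 35 of DiscriminativeCell.forward
if self.first:
    input_projection = bottom_up
else:
    input_projection = f.relu(f.max_pool2d(self.from_bottom(bottom_up), POOL, POOL))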

Could you tell me your environment?

Thank you very much for sharing the source code.
I am trying to run this project, but there are some errors which I can't solve.

So, I want to use the same running environment as you.

thank you

Execution error with network_bisection.ipynb

Running the code in network_bisection.ipynb, I encountered an error:

my_embedding = torch.zeros(512)
def fun(m, i, o): my_embedding.copy_(o.data)
h = avgpool_layer.register_forward_hook(fun)
h_x = resnet_18(x)
h.remove()

RuntimeError: expand(torch.FloatTensor{[1, 512, 1, 1]}, size=[512]): the number of sizes provided (1) must be greater or equal to the number of dimensions in the tensor (4)
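
One possible workaround (an assumption based on the error message, not verified against the notebook): the hook receives the average-pooling output with shape [1, 512, 1, 1], so flattening it before the copy makes it compatible with the 1-D buffer.

my_embedding = torch.zeros(512)

def fun(m, i, o):
    # flatten [1, 512, 1, 1] -> [512] before copying into the 1-D buffer
    my_embedding.copy_(o.data.view(-1))

h = avgpool_layer.register_forward_hook(fun)
h_x = resnet_18(x)
h.remove()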

RuntimeError: bool value of Variable objects containing non-empty torch.cuda.FloatTensor is ambiguous

Hello, Atcold. First, thank you for releasing such a great program.
After downloading e-VDS35-m256-May17.tar and following the steps posted on GitHub to train a sample MatchNet model, the program crashed in the first epoch.
Here is the stack trace from Python.

Traceback (most recent call last):
  File "main.py", line 402, in <module>
    main()
  File "main.py", line 195, in main
    train(train_loader, model, (mse, nll_final, nll_train), optimiser, epoch)
  File "main.py", line 305, in train
    ce_loss, mse_loss, state, x_hat_data = compute_loss(x[t], x[t + 1], y[t], state)
  File "main.py", line 265, in compute_loss
    (x_hat, state_), (_, idx) = model(V(x_), state_)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 225, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/fung/workspace/pytorch-CortexNet/model/Model02.py", line 79, in forward
    s = state[layer - 1] or V(x.data.clone().zero_())
  File "/usr/local/lib/python3.5/dist-packages/torch/autograd/variable.py", line 122, in __bool__
    torch.typename(self.data) + " is ambiguous")
RuntimeError: bool value of Variable objects containing non-empty torch.cuda.FloatTensor is ambiguous

The error occurs at the line s = state[layer - 1] or V(x.data.clone().zero_()), because it needs to combine the state data and the input x after the first frame. So, the program really crashes at state[layer - 1].

I used Python 3 to run the code, and here is the command I used: python3 main.py --mode MatchNet --cuda | tee last/train.log. Also, the video data is in the default path.

So, what can I do to fix this error or something I missed or did wrong?
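
A hypothetical rewrite of the offending line in Model02.forward (an assumption, not a confirmed fix): or performs a truth test on a multi-element tensor, which PyTorch rejects; checking explicitly for a missing state avoids the test, assuming state[layer - 1] is None until a previous frame has been processed.

# hypothetical replacement for: s = state[layer - 1] or V(x.data.clone().zero_())
s = state[layer - 1]
if s is None:
    s = V(x.data.clone().zero_())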

Paper

Hi @Atcold , can you provide the corresponding paper for this algorithm?

Data Parallel for ConvLSTM

Hi, thanks for sharing this great work!

I want to train my model with a ConvLSTMCell layer on multiple GPUs, but how do I do that?
I'd appreciate it if you could give me a tip! Thanks!
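
A generic multi-GPU sketch using torch.nn.DataParallel (an assumption, not code from this repository): the wrapped module is replicated on every visible GPU and the batch dimension of the input is split across them. Note that for a stateful ConvLSTM the recurrent state returned by the replicas is gathered along the batch dimension too, so carrying state across time steps may need extra care.

import torch
import torch.nn as nn

# placeholder network standing in for a model containing a ConvLSTM cell
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())

if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # replicate on all visible GPUs, split the batch
model = model.cuda()

x = torch.randn(16, 3, 64, 64).cuda()  # batch dimension first
y = model(x)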

RuntimeError: bool value of Tensor with more than one value is ambiguous

Hello,
Thank you for your excellent work.

While attempting to reproduce your results using
python = 3.6
pytorch = 0.4.1

I get the following error
 File "/pytorch-CortexNet-master/model/Model02.py", line 76, in forward
    s = state[layer - 1] or V(x.data.clone().zero_())
RuntimeError: bool value of Tensor with more than one value is ambiguous

This seems related to the Variable & Tensor merge from 0.3 -> 0.4. Could you please provide a suggestion?

System Specifications Required

Hi @Atcold
I am very interested in this work and would like to implement it. Could you please share the specifications of the system you used for training?

Can't load in pre-trained model

I'd like to extract features from a model trained in MatchNet mode, but it won't load the pre-trained model.

    49     print('Load pre-trained weights')
    50     # args.pre_trained = 'image-pretraining/model02D-33IS/model_best.pth.tar'
--> 51     dict_33 = torch.load(args.pre_trained)['state_dict']

TypeError: 'Model02' object is not subscriptable

In [10]: args.pre_trained
Out[10]: './last/model.pth.tar'

Any ideas? The training never saved a model_best.pth.tar file, only a model.pth.tar.
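
A defensive loading sketch (an assumption, not the repository's code): the snippet above expects a checkpoint dictionary with a 'state_dict' key, while ./last/model.pth.tar seems to contain a pickled Model02 object, so handling both cases avoids the TypeError.

import torch

checkpoint = torch.load(args.pre_trained)  # args.pre_trained: path to the checkpoint
if isinstance(checkpoint, dict) and 'state_dict' in checkpoint:
    state_dict = checkpoint['state_dict']
else:
    # a full model object was pickled, so take its state_dict directly
    state_dict = checkpoint.state_dict()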

about the required memory size of GPU and training time

Hi, @Atcold ,

Thanks for releasing such a great package. I'm very interested in this work and want to practise on it. Could you tell me the required GPU memory size (I only have a 980 Ti, 6 GB) to run the training process? Also, could you elaborate on how long you trained your network in each setting (unsupervised and supervised) and which GPUs you used?

Thanks!

error in training

I'm running with your parameters on the data you indicate in the README. It starts okay, but dies in the first epoch. Any ideas?
Thanks.

~/torch_codes/pytorch-CortexNet$ python -u main.py --mode MatchNet --size 3 32 64 128 256 --tau 0 --big-t 10 --log-interval 10 --cuda --view 2 --show-x_hat --epochs 30  --model model_02 --lr-decay 10 10 --data /work/CortexNet_Experiments/VDS35_data/preprocessed-data | tee last/train.log
CLI arguments: --mode MatchNet --size 3 32 64 128 256 --tau 0 --big-t 10 --log-interval 10 --cuda --view 2 --show-x_hat --epochs 30 --model model_02 --lr-decay 10 10 --data /work/CortexNet_Experiments/VDS35_data/preprocessed-data
Current commit hash: bc28dac4e6a1ad9abb11e2fbc48d310a85e9903a
Define image pre-processing
Define train data loader
Define validation data loader
Define model

---------------------------- Building model Model02 ----------------------------
Hidden layers: 4
Net sizing: (3, 32, 64, 128, 256, 970)
Input spatial size: 3 x (256, 256)
Layer 1 ------------------------------------------------------------------------
Bottom size: 3 x (256, 256)
Top size: 32 x (128, 128)
Layer 2 ------------------------------------------------------------------------
Bottom size: 32 x (128, 128)
Top size: 64 x (64, 64)
Layer 3 ------------------------------------------------------------------------
Bottom size: 64 x (64, 64)
Top size: 128 x (32, 32)
Layer 4 ------------------------------------------------------------------------
Bottom size: 128 x (32, 32)
Top size: 256 x (16, 16)
Classifier ---------------------------------------------------------------------
256 --> 970
--------------------------------------------------------------------------------

Create a MSE and balanced NLL criterions
Instantiate a SGD optimiser
Training epoch 1
Traceback (most recent call last):
  File "main.py", line 394, in <module>
    main()
  File "main.py", line 194, in main
    train(train_loader, model, (mse, nll_final, nll_train), optimiser, epoch)
  File "main.py", line 297, in train
    ce_loss, mse_loss, state, x_hat_data = compute_loss(x[t], x[t + 1], y[t], state)
  File "main.py", line 261, in compute_loss
    (x_hat, state_), (_, idx) = model(V(x_), state_)
  File "...anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "...torch_codes/pytorch-CortexNet/model/Model02.py", line 76, in forward
    s = state[layer - 1] or V(x.data.clone().zero_())
  File "...anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 123, in __bool__
    torch.typename(self.data) + " is ambiguous")
RuntimeError: bool value of Variable objects containing non-empty torch.cuda.FloatTensor is ambiguous

Does the prednet accept batches?

Does the prednet accept batches during train/test?

The input is given as-

input_sequence = Variable(torch.rand(T, 1, 1, 4 * 2 ** L, 6 * 2 ** L))

I assumed (time step, batch size, channels, height, width) is the input format. Am I wrong?

TypeError: expected Variable as element 1 in argument 0, but got tuple

Hello,

"TypeError: expected Variable as element 1 in argument 0, but got tuple" Error message is printed at
x = torch.cat((x, s), 1) in Model02.py

I modified some codes following the closed issues.

I tried to debug and fix. But I can't.
Do you know what is the problem?
Thank you
