atcold / pytorch-cortexnet

PyTorch implementation of the CortexNet predictive model

Home Page: http://tinyurl.com/CortexNet/

Languages: Python 0.52%, Jupyter Notebook 99.34%, Shell 0.13%, Gnuplot 0.01%
Topics: pytorch, video, deep-learning, predictive-modeling, self-supervised, unsupervised-learning

pytorch-cortexnet's Introduction

CortexNet

This repo contains the PyTorch implementation of CortexNet.
Check the project website for further information.

Project structure

The project consists of the following folders and files:

  • data/: contains Bash scripts and a Python class definition for video data loading;
  • image-pretraining/: hosts the code for pre-training TempoNet's discriminative branch;
  • model/: stores several network architectures, including PredNet, an additive feedback Model01, and a modulatory feedback Model02 (CortexNet);
  • notebook/: collection of Jupyter Notebooks for data exploration and results visualisation;
  • utils/: scripts for
    • (current or former) training error plotting,
    • experiments diff,
    • multi-node synchronisation,
    • generative predictions visualisation,
    • network architecture graphing;
  • results@: link to the location where experimental results will be saved within 3-digit folders;
  • new_experiment.sh*: creates a new experiment folder, updates last@, prints a memo about last used settings;
  • last@: symbolic link pointing to a new results sub-directory created by new_experiment.sh;
  • main.py: training script for CortexNet in MatchNet or TempoNet configuration;

Dependencies

  • skvideo: video loading
pip install sk-video
  • tqdm: progress bar
conda config --add channels conda-forge
conda update --all
conda install tqdm

IDE

This project has been realised with PyCharm by JetBrains and the Vim editor. Grip has also been fundamental for crafting decent documentation locally.

Initialise environment

Once you've determined where you'd like to save your experimental results — let's call this directory <my saving location> — run the following commands from the project's root directory:

ln -s <my saving location> results  # replace <my saving location>
mkdir results/000 && touch results/000/train.log  # init. placeholder
ln -s results/000 last  # create pointer to the most recent result

Setup new experiment

Ready to run your first experiment? Type the following:

./new_experiment.sh

GPU selection

Let's say your machine has N GPUs. You can choose to use any of these by specifying the index n = 0, ..., N-1. To do so, type CUDA_VISIBLE_DEVICES=n just before python ... in the following sections.
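
For example, to run the MatchNet training command from the section below on GPU 0:

CUDA_VISIBLE_DEVICES=0 python -u main.py --mode MatchNet <CLI arguments> | tee last/train.log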

Train MatchNet

  • Download e-VDS35 (e.g. e-VDS35-May17.tar) from here.
  • Use data/resize_and_split.sh to prepare your (video) data for training. It resizes videos found in folders of folders (i.e. a directory of classes) and may split them into a training and a validation set. It may also skip short videos and trim longer ones. Check data/README.md for more details.
  • Run the main.py script to start training. Use -h to print the command line interface (CLI) arguments help.
python -u main.py --mode MatchNet <CLI arguments> | tee last/train.log

Train TempoNet

  • Download e-VDS35 (e.g. e-VDS35-May17.tar) from here.
  • Pre-train the forward branch (see image-pretraining/) on an image data set (e.g. 33-image-set.tar from here);
  • Use data/resize_and_sample.sh to prepare your (video) data for training. It resizes videos found in folders of folders (i.e. a directory of classes) and samples them. Videos are then distributed across the training and validation sets. It may also skip short videos and trim longer ones. Check data/README.md for more details.
  • Run the main.py script to start training. Use -h to print the CLI arguments help.
python -u main.py --mode TempoNet --pre-trained <path> <CLI args> | tee last/train.log

GPU selection

To run on a specific GPU, say n, type CUDA_VISIBLE_DEVICES=n just before python ....

pytorch-cortexnet's People

Contributors

atcold, codeac29, shi69


pytorch-cortexnet's Issues

Issue about ConvLSTM

Hi Atcold

Thanks for the wonderful code and also the wonderful tutorials on PyTorch!
I actually have a doubt --
I read two articles detailing the ConvLSTM model:
[1] https://en.wikipedia.org/w/index.php?title=Long_short-term_memory&oldid=784163987#cite_note-20
[2] https://arxiv.org/pdf/1506.04214.pdf -- equation (3)

In both of the above models, the previous cell state (i.e. prev_cell) is used when computing the input, forget, and output gates, whereas you have not used it. Can you please tell me why not?

Thanks
Devraj
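
For reference, here is a minimal ConvLSTM-style cell sketch in PyTorch (hypothetical code, not taken from this repository) that illustrates the distinction being asked about: the gates below are computed from the input and the previous hidden state only, so the peephole terms involving the previous cell state from equation (3) of [2] are omitted, and the cell state enters only the state update.

import torch
import torch.nn as nn

class ConvLSTMCellSketch(nn.Module):
    """Minimal ConvLSTM cell without peephole connections (illustrative only)."""
    def __init__(self, in_channels, hidden_channels, kernel_size=3):
        super().__init__()
        # one convolution producing all four gates at once
        self.gates = nn.Conv2d(in_channels + hidden_channels, 4 * hidden_channels,
                               kernel_size, padding=kernel_size // 2)

    def forward(self, x, state):
        h_prev, c_prev = state
        # gates depend on x and h_prev only; no peephole terms with c_prev
        i, f, o, g = torch.chunk(self.gates(torch.cat((x, h_prev), dim=1)), 4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)
        c = f * c_prev + i * g      # c_prev is used only here, in the state update
        h = o * torch.tanh(c)
        return h, c

# usage sketch
cell = ConvLSTMCellSketch(3, 8)
x = torch.randn(1, 3, 16, 16)
h = torch.zeros(1, 8, 16, 16)
c = torch.zeros(1, 8, 16, 16)
h, c = cell(x, (h, c))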

Regarding the datasets and pre-trained model weights

Hello, it seems that the data link provided in your readme cannot be accessed at ftp://elab-board2.ecn.purdue.edu/e-VDS/e-VDS35-May17.tar. Additionally, I was wondering if you have any pre-trained model weights available?

RuntimeError: bool value of Tensor with more than one value is ambiguous

While running your code "DiscriminativeCell.py", I encountered this error:
line 35, in forward
input_projection = self.first and bottom_up or f.relu(f.max_pool2d(self.from_bottom(bottom_up), POOL, POOL))
RuntimeError: bool value of Tensor with more than one value is ambiguous

Can you give me a hint on how to solve this?
Thank you very much.
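
One possible rewrite of the failing line (an assumption, not a confirmed fix): the self.first and bottom_up or ... idiom performs a truth test on a tensor, which raises the ambiguity error; branching explicitly on the boolean flag avoids it.

# hypothetical replacement for line 35 of DiscriminativeCell.forward
if self.first:
    input_projection = bottom_up
else:
    input_projection = f.relu(f.max_pool2d(self.from_bottom(bottom_up), POOL, POOL))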

Could you tell me your environment?

Thank you very much for sharing the source code.
I am trying to run this project, but there are some errors which I can't solve.

So, I want to use the same running environment as you.

thank you

Execution error with network_bisection.ipynb

Running the code in network_bisection.ipynb, I encountered an error:

my_embedding = torch.zeros(512)
def fun(m, i, o): my_embedding.copy_(o.data)
h = avgpool_layer.register_forward_hook(fun)
h_x = resnet_18(x)
h.remove()

RuntimeError: expand(torch.FloatTensor{[1, 512, 1, 1]}, size=[512]): the number of sizes provided (1) must be greater or equal to the number of dimensions in the tensor (4)
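
One possible workaround (an assumption based on the error message, not verified against the notebook): the hook receives the average-pooling output with shape [1, 512, 1, 1], so flattening it before the copy makes it compatible with the 1-D buffer.

my_embedding = torch.zeros(512)

def fun(m, i, o):
    # flatten [1, 512, 1, 1] -> [512] before copying into the 1-D buffer
    my_embedding.copy_(o.data.view(-1))

h = avgpool_layer.register_forward_hook(fun)
h_x = resnet_18(x)
h.remove()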

RuntimeError: bool value of Variable objects containing non-empty torch.cuda.FloatTensor is ambiguous

Hello, Atcold. First, thank you for releasing such a great program.
After downloading e-VDS35-m256-May17.tar and following the steps posted on GitHub to train a sample MatchNet model, the program crashed in the first epoch.
Here is the stack trace from Python.

Traceback (most recent call last):
  File "main.py", line 402, in <module>
    main()
  File "main.py", line 195, in main
    train(train_loader, model, (mse, nll_final, nll_train), optimiser, epoch)
  File "main.py", line 305, in train
    ce_loss, mse_loss, state, x_hat_data = compute_loss(x[t], x[t + 1], y[t], state)
  File "main.py", line 265, in compute_loss
    (x_hat, state_), (_, idx) = model(V(x_), state_)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 225, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/fung/workspace/pytorch-CortexNet/model/Model02.py", line 79, in forward
    s = state[layer - 1] or V(x.data.clone().zero_())
  File "/usr/local/lib/python3.5/dist-packages/torch/autograd/variable.py", line 122, in __bool__
    torch.typename(self.data) + " is ambiguous")
RuntimeError: bool value of Variable objects containing non-empty torch.cuda.FloatTensor is ambiguous

The error occurs at the line s = state[layer - 1] or V(x.data.clone().zero_()), because it needs to combine the state data and the input x after the first frame. So, the program really crashes at state[layer - 1].

I used Python 3 to run the code, and here is the command I used: python3 main.py --mode MatchNet --cuda | tee last/train.log. Also, the video data is in the default path.

So, what can I do to fix this error or something I missed or did wrong?
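
A hypothetical rewrite of the offending line in Model02.forward (an assumption, not a confirmed fix): or performs a truth test on a multi-element tensor, which PyTorch rejects; checking explicitly for a missing state avoids the test, assuming state[layer - 1] is None until a previous frame has been processed.

# hypothetical replacement for: s = state[layer - 1] or V(x.data.clone().zero_())
s = state[layer - 1]
if s is None:
    s = V(x.data.clone().zero_())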

Paper

Hi @Atcold , can you provide the corresponding paper for this algorithm?

Data Parallel for ConvLSTM

Hi, thanks for sharing this great work!

I want to train my model with a ConvLSTMCell layer on multiple GPUs, but how do I do that?
I'd appreciate it if you could give me a tip! Thanks!
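
A generic multi-GPU sketch using torch.nn.DataParallel (an assumption, not code from this repository): the wrapped module is replicated on every visible GPU and the batch dimension of the input is split across them. Note that for a stateful ConvLSTM the recurrent state returned by the replicas is gathered along the batch dimension too, so carrying state across time steps may need extra care.

import torch
import torch.nn as nn

# placeholder network standing in for a model containing a ConvLSTM cell
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())

if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # replicate on all visible GPUs, split the batch
model = model.cuda()

x = torch.randn(16, 3, 64, 64).cuda()  # batch dimension first
y = model(x)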

RuntimeError: bool value of Tensor with more than one value is ambiguous

Hello,
Thank you for your excellent work.

While attempting to reproduce your results using
python = 3.6
pytorch = 0.4.1

I get the following error
 File "/pytorch-CortexNet-master/model/Model02.py", line 76, in forward
    s = state[layer - 1] or V(x.data.clone().zero_())
RuntimeError: bool value of Tensor with more than one value is ambiguous

This seems related to the Variable & Tensor merge from 0.3 -> 0.4. Could you please provide a suggestion?

System Specifications Required

Hi @Atcold
I am very interested in this work and would like to implement it. Could you please share the specifications of the system you used for training?

Can't load in pre-trained model

I'd like to extract features from a model trained in MatchNet mode, but it won't load the pre-trained model.

    49     print('Load pre-trained weights')
    50     # args.pre_trained = 'image-pretraining/model02D-33IS/model_best.pth.tar'
--> 51     dict_33 = torch.load(args.pre_trained)['state_dict']

TypeError: 'Model02' object is not subscriptable

In [10]: args.pre_trained
Out[10]: './last/model.pth.tar'

Any ideas? The training never saved a model_best.pth.tar file, only a model.pth.tar.
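
A defensive loading sketch (an assumption, not the repository's code): the snippet above expects a checkpoint dictionary with a 'state_dict' key, while ./last/model.pth.tar seems to contain a pickled Model02 object, so handling both cases avoids the TypeError.

import torch

checkpoint = torch.load(args.pre_trained)  # args.pre_trained: path to the checkpoint
if isinstance(checkpoint, dict) and 'state_dict' in checkpoint:
    state_dict = checkpoint['state_dict']
else:
    # a full model object was pickled, so take its state_dict directly
    state_dict = checkpoint.state_dict()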

about the required memory size of GPU and training time

Hi, @Atcold ,

Thanks for releasing such a great package. I'm very interested in this work and want to practise on it. Could you tell me the required GPU memory size (I only have a 980 Ti, 6 GB) to run the training process? Also, could you elaborate on how long you trained your network in each setting (unsupervised and supervised) and which GPUs you used?

Thanks!

error in training

I'm running with your parameters on the data you indicate in the README. It starts okay, but dies in the first epoch. Any ideas?
Thanks.

~/torch_codes/pytorch-CortexNet$ python -u main.py --mode MatchNet --size 3 32 64 128 256 --tau 0 --big-t 10 --log-interval 10 --cuda --view 2 --show-x_hat --epochs 30  --model model_02 --lr-decay 10 10 --data /work/CortexNet_Experiments/VDS35_data/preprocessed-data | tee last/train.log
CLI arguments: --mode MatchNet --size 3 32 64 128 256 --tau 0 --big-t 10 --log-interval 10 --cuda --view 2 --show-x_hat --epochs 30 --model model_02 --lr-decay 10 10 --data /work/CortexNet_Experiments/VDS35_data/preprocessed-data
Current commit hash: bc28dac4e6a1ad9abb11e2fbc48d310a85e9903a
Define image pre-processing
Define train data loader
Define validation data loader
Define model

---------------------------- Building model Model02 ----------------------------
Hidden layers: 4
Net sizing: (3, 32, 64, 128, 256, 970)
Input spatial size: 3 x (256, 256)
Layer 1 ------------------------------------------------------------------------
Bottom size: 3 x (256, 256)
Top size: 32 x (128, 128)
Layer 2 ------------------------------------------------------------------------
Bottom size: 32 x (128, 128)
Top size: 64 x (64, 64)
Layer 3 ------------------------------------------------------------------------
Bottom size: 64 x (64, 64)
Top size: 128 x (32, 32)
Layer 4 ------------------------------------------------------------------------
Bottom size: 128 x (32, 32)
Top size: 256 x (16, 16)
Classifier ---------------------------------------------------------------------
256 --> 970
--------------------------------------------------------------------------------

Create a MSE and balanced NLL criterions
Instantiate a SGD optimiser
Training epoch 1
Traceback (most recent call last):
  File "main.py", line 394, in <module>
    main()
  File "main.py", line 194, in main
    train(train_loader, model, (mse, nll_final, nll_train), optimiser, epoch)
  File "main.py", line 297, in train
    ce_loss, mse_loss, state, x_hat_data = compute_loss(x[t], x[t + 1], y[t], state)
  File "main.py", line 261, in compute_loss
    (x_hat, state_), (_, idx) = model(V(x_), state_)
  File "...anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "...torch_codes/pytorch-CortexNet/model/Model02.py", line 76, in forward
    s = state[layer - 1] or V(x.data.clone().zero_())
  File "...anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 123, in __bool__
    torch.typename(self.data) + " is ambiguous")
RuntimeError: bool value of Variable objects containing non-empty torch.cuda.FloatTensor is ambiguous

Does the prednet accept batches?

Does the prednet accept batches during train/test?

The input is given as-

input_sequence = Variable(torch.rand(T, 1, 1, 4 * 2 ** L, 6 * 2 ** L))

I assumed (time step, batch size, channels, height, width) is the input format. Am I wrong?

TypeError: expected Variable as element 1 in argument 0, but got tuple

Hello,

"TypeError: expected Variable as element 1 in argument 0, but got tuple" Error message is printed at
x = torch.cat((x, s), 1) in Model02.py

I modified some codes following the closed issues.

I tried to debug and fix. But I can't.
Do you know what is the problem?
Thank you
