anirudh-chakravarthy / casenet

A PyTorch implementation of CASENet for the Cityscapes Dataset

MATLAB 14.98% Python 85.02%
edge-detection boundary-detection semantic-boundaries cityscapes-dataset casenet pytorch

casenet's Introduction

CASENet: PyTorch Implementation

This is a PyTorch implementation of the paper CASENet: Deep Category-Aware Semantic Edge Detection.

This repository has been adapted to work with the Cityscapes Dataset.

Input pre-processing

The authors' preprocessing repository is used; instructions can be found in the cityscapes-preprocess directory. It generates binary files for each label in the dataset.

For data loading, an HDF5 file containing these binary files needs to be generated. To perform this conversion, run:

python utils/convert_bin_to_hdf5.py

after updating the directory paths (use absolute paths).
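As a rough illustration of what the conversion does (a minimal sketch with hypothetical paths, dtypes, and dataset names, not the repository script), the per-image binary label files are packed into a single HDF5 file:

    # Sketch only: pack per-image binary label files into one HDF5 file.
    # Paths, dtype, and dataset names below are assumptions, not the
    # script's actual choices.
    import glob
    import os

    import h5py
    import numpy as np

    bin_dir = "/abs/path/to/label_bins"   # output of cityscapes-preprocess
    h5_path = "/abs/path/to/labels.h5"

    with h5py.File(h5_path, "w") as h5_file:
        for bin_path in sorted(glob.glob(os.path.join(bin_dir, "*.bin"))):
            # Assume each .bin stores one integer label value per pixel.
            labels = np.fromfile(bin_path, dtype=np.uint32)
            key = os.path.splitext(os.path.basename(bin_path))[0]
            h5_file.create_dataset(key, data=labels, compression="gzip")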

Model

The model used is the ResNet-101 variant of CASENet.

The model configuration (.prototxt) can be found here.

The download links for the pretrained CASENet weights can be found here.

Converting Caffe weights to PyTorch

To use the pretrained caffemodel in PyTorch, use extract-caffe-params to save each layer's weights in numpy format. The code along with instructions for usage can be found in the utils folder.

To load these numpy weights for each layer into a PyTorch model, run:

python modules/CASENet.py

after updating the directory path at this line to the parent directory containing the numpy files (use absolute paths).
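To give a sense of the loading step (a minimal sketch under assumed names; the model constructor and file layout here are hypothetical, and modules/CASENet.py does its own mapping):

    # Sketch only: load per-layer .npy weights, as saved by
    # extract-caffe-params, into a PyTorch state dict.
    import os

    import numpy as np
    import torch

    from modules.CASENet import CASENet_resnet101  # hypothetical import

    numpy_dir = "/abs/path/to/numpy_weights"
    model = CASENet_resnet101(num_classes=19)

    state_dict = model.state_dict()
    for name in state_dict:
        # Assume one .npy file per parameter, named after the parameter.
        npy_path = os.path.join(numpy_dir, name + ".npy")
        if os.path.exists(npy_path):
            state_dict[name] = torch.from_numpy(np.load(npy_path))
    model.load_state_dict(state_dict)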

NOTE: The PyTorch pretrained weights can be downloaded from Google Drive.

Training

For training, run:

python main.py

Optional arguments:
    -h, --help              show help message and exit
    --checkpoint-folder     path to checkpoint dir (default: ./checkpoint)
    --multigpu              use multiple GPUs (default: False)
    -j, --workers           number of data loading workers (default: 16)
    --epochs                number of total epochs to run (default: 150)
    --start-epoch           manual epoch number (useful on restarts)
    --cls-num               number of classes (default: 19 for Cityscapes)
    --lr-steps              iterations at which the learning rate is decayed by a factor of 10
    --acc-steps             number of gradient accumulation steps (default: 1)
    -b, --batch-size        mini-batch size (default: 1)
    --lr                    learning rate (default: 1e-7)
    --momentum              momentum (default: 0.9)
    --weight-decay          weight decay (default: 1e-4)
    -p, --print-freq        print frequency (default: 10)
    --resume-model          path to latest checkpoint (default: None)
    --pretrained-model      path to pretrained checkpoint (default: None)
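For example, to fine-tune from the converted pretrained weights with gradient accumulation (the argument values here are illustrative, not tuned settings):

python main.py --pretrained-model pretrained_models/model_casenet.pth.tar --epochs 150 --lr 1e-7 --acc-steps 4 -b 1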

Visualization

For visualizing feature maps, ground truths and predictions, run:

python visualize_multilabel.py [--model MODEL] [--image_file IMAGE] [--image_dir IMAGE_DIR] [--output_dir OUTPUT_DIR]

For example, to visualize on the validation set of the Cityscapes dataset:

python visualize_multilabel.py -m pretrained_models/model_casenet.pth.tar -f leftImg8bit/val/lindau/lindau_000045_000019_leftImg8bit.png -d cityscapes-preprocess/data_proc/ -o output/ 

Testing

For testing a pretrained model on new images, run:

python get_results_for_benchmark.py [--model MODEL] [--image_file IMAGE] [--image_dir IMAGE_DIR] [--output_dir OUTPUT_DIR]

For example,

python get_results_for_benchmark.py -m pretrained_models/model_casenet.pth.tar -f img1.png -d images/ -o output/

A class-wise prediction map will be generated in the specified output directory.

Additional changes

  1. Added a script to convert edges to contours and visualize them.
  2. Implemented gradient accumulation to increase the effective batch size without increasing memory usage (see the sketch below).
  3. Cleared unused variables to free cached memory.
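The accumulation trick in item 2 amounts to the following pattern (a generic sketch of the technique, not this repository's training loop):

    # Generic gradient accumulation: sum gradients over acc_steps
    # mini-batches before each optimizer step, emulating an effective
    # batch size of batch_size * acc_steps.
    def train_epoch(model, loader, criterion, optimizer, acc_steps=4):
        model.train()
        optimizer.zero_grad()
        for step, (images, targets) in enumerate(loader):
            # Scale the loss so the accumulated gradient matches a large batch.
            loss = criterion(model(images), targets) / acc_steps
            loss.backward()
            if (step + 1) % acc_steps == 0:
                optimizer.step()
                optimizer.zero_grad()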

Acknowledgements

  1. Data preprocessing uses the following repository: https://github.com/Chrisding/cityscapes-preprocess

  2. Conversion from caffemodel to numpy format uses the following repository: https://github.com/nilboy/extract-caffe-params

  3. The model implementation heavily references the following repository for the Semantic Boundaries Dataset: https://github.com/lijiaman/CASENet

  4. The original Caffe implementation of the paper can be downloaded from the following link: http://www.merl.com/research/license#CASENet

casenet's People

Contributors

anirudh-chakravarthy


casenet's Issues

PyTorch pretrained model on Google Drive is broken

First, thanks a lot for your excellent work, which has made things convenient for everyone. However, I have run into a problem: the PyTorch pretrained model on Google Drive is broken. The extract-caffe-params code needs to import caffe, which is very inconvenient in my situation. Could you please upload the PyTorch pretrained model again?

Can you release the final model parameters on Cityscapes?

Hello, thank you for your wonderful work! I am trying to reproduce the results of CASENet based on this repository, so I wonder if you can release the final model parameters for reference, i.e., for testing and showing the performance of the code. I would appreciate it very much!

How to run convert_bin_to_hdf5.py?

Hi, I've finished the pre-processing, but an error occurred after running "python utils/convert_bin_to_hdf5.py":

Traceback (most recent call last):
File "utils/convert_bin_to_hdf5.py", line 66, in
convert_num_to_bitfield(label_data, h, w, npz_name, root_folder, h5_file)
File "utils/convert_bin_to_hdf5.py", line 30, in convert_num_to_bitfield
padded_bit_tensor = torch.cat((torch.zeros(cls_num-actual_len).byte(), bit_tensor.byte()), dim=0)
RuntimeError: $ Torch: invalid memory size -- maybe an overflow? at ..\aten\src\TH\THGeneral.cpp:188

The main reason for the error is that 'cls_num' is smaller than 'actual_len', so the value of 'cls_num - actual_len' is below 0.

I don't understand what 'actual_len' is. What does the actual length mean?
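For readers hitting the same error, here is a guess at the mechanics (an illustration of the assumed semantics, not the repository code): each pixel's label is stored as an integer whose binary digits mark which classes have an edge there, so 'actual_len' would be the bit length of that integer, and the padding fails when the label value needs more bits than cls_num.

    # Illustration only (assumed semantics, not the repository code):
    import torch

    def num_to_bitfield(label_value, cls_num=19):
        bits = [int(b) for b in bin(label_value)[2:]]   # most-significant bit first
        actual_len = len(bits)                          # bit length of the label value
        bit_tensor = torch.tensor(bits, dtype=torch.uint8)
        # torch.zeros raises the reported RuntimeError when actual_len > cls_num,
        # i.e. the label encodes more classes than cls_num expects.
        pad = torch.zeros(cls_num - actual_len, dtype=torch.uint8)
        return torch.cat((pad, bit_tensor), dim=0)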

Ask for converted numpy pretrained weights

Hi,
After reading this issue, I noticed that you have a lot of expertise on this problem.

I have tried a lot to convert the Caffe weights to numpy, but it's very hard to finish; bugs keep occurring.

So, could you please release the converted pretrained weights, or just send them to me?

Thank you so much!

My email: [email protected]

Would it be useful to use CASENet for binary edge detection?

Thank you for sharing your implementation.
I am wondering whether it would be useful to use the CASENet network for binary edge detection on a custom dataset, or to adapt the CASENet network with a Hausdorff distance loss (or even a cross-entropy loss) instead of the multi-label loss.

Here is an example of my ground truth:

Thank you in advance

Where is the dir 'data_aug'?

I ran the code to preprocess the Cityscapes data and found that there is no directory named 'data_aug' containing 'list_train_aug.txt'.

It is really confusing.

Performance of this code

Thank you very much for your implementation!

Could you please report some performance numbers for your code on the Cityscapes dataset?

Thank you again!

Best wishes!

Questions about side-outputs.

Is anyone working on this repo? I trained and tested this model on another dataset, and the side-output pictures looked like this:
[image 1]
You can see that the side outputs don't show any boundaries, but the fused prediction looks OK. I noticed that in the author's paper, the side activations show boundaries of the input images, like this:
[image 2]
I wonder if anyone is having the same problem.

Performance of this code

Thanks for sharing the code. Have you ever reproduced the results with this code repo?
Thanks.

How to get contours like this?

I ran the visualize_multilabel code. It seems the edge information is output separately to different images. How can I get a contour like this? Thank you.
[image]

Failing at testing

Hi! I'm trying to test your code, but something doesn't seem to be working properly. I'm using the command:

python get_results_for_benchmark.py -m pretrained_models/model_casenet.pth.tar -f lena.png -d images/ -o output/

but got the following error:

(truncated)
score_cls_side5.weight is loaded successfully
score_cls_side5.bias is loaded successfully
upsample_cls_side5.weight is loaded successfully
ce_fusion.weight is loaded successfully
ce_fusion.bias is loaded successfully
Traceback (most recent call last):
  File "get_results_for_benchmark.py", line 68, in <module>
    for cls_idx in xrange(num_cls):
NameError: name 'xrange' is not defined

I have downloaded the pre-trained PyTorch model and placed it in the folder pretrained_models/. I downloaded the lena.png image from Wikipedia (just for a simple test) and placed it in the folder images/. What am I missing? Can you help me out?

Thank you!
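The traceback is a Python 2/3 mismatch: xrange exists only in Python 2. Under Python 3, the failing loop needs the built-in range instead, e.g.:

    num_cls = 19  # Cityscapes classes
    for cls_idx in range(num_cls):  # Python 3: range replaces Python 2's xrange
        print(cls_idx)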

process NYUv2

NYUv2 has 40 classes. Can the 'cityscape_process' code process the NYUv2 dataset? Cityscapes only has 19 classes.

How to preprocess the dataset?

Hi, thanks for sharing your code. I downloaded the gtFine_trainvaltest.zip (241 MB) dataset and decompressed it, but I don't quite follow the instructions in the README.

First, the python utils/convert_bin_to_hdf5.py script contains the hardcoded file name val.txt, but where can I find this file? I don't see such a file in the repository or the dataset.

Also, what should I do to run the whole repo from scratch, for example, from dataset preprocessing through training, validation, and testing? Some more detailed instructions would be a great help.

  • Run python3 utils/convert_bin_to_hdf5.py
  • Run python3 main.py

Are the two commands above all that is needed to run the repository?

Thanks!

Loss extremely high and doesn't decrease

Hello,
I followed your code and ran it on a subset of the dataset (50 images). The code runs to the end, but the loss doesn't go down. The loss is also extremely high (about 1,000,000). Is this normal?

Looking forward to your reply
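A hedged note on scale (assuming the multi-label loss is summed rather than averaged over pixels, which this repository may or may not do): a full Cityscapes frame contributes 1024 x 2048 x 19, roughly 4 x 10^7, per-class pixel terms, so an average per-term loss of only ~0.025 already sums to about 10^6, which would make a loss around 1,000,000 unsurprising.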
