
In pixels we trust: From Pixel Labeling to Object Localization and Scene Categorization

This repository contains the original implementation of the solutions described in our IROS 2018 paper.

License

This repository is released under the GNU General Public License (refer to the LICENSE file for details).

Citing

If you make use of this data and software, please cite the following reference in any publications:

@Inproceedings{herranz2018,
  Author    = {Herranz-Perdiguero, C. and Redondo-Cabrera, C. and L\'opez-Sastre, R.~J.},
  Title     = {In pixels we trust: From Pixel Labeling to Object Localization and Scene Categorization},
  Booktitle = {IROS},
  Year      = {2018}
}

Datasets and initial setup

  1. To use this software, clone this repository. We will call that directory PROJECT_ROOT. It contains three folders:
  • PROJECT_ROOT/deeplab_NYU contains the code needed to perform semantic segmentation.
  • PROJECT_ROOT/NYU_depth_v2 contains the NYU Depth v2 dataset.
  • PROJECT_ROOT/Classification contains the code needed to perform scene classification.
  2. Download the NYU Depth v2 dataset. To do so, go to PROJECT_ROOT/NYU_depth_v2 and run the following script:
  cd $PROJECT_ROOT/NYU_depth_v2
  ./Download_NYU_depth.sh

The dataset is saved as a .mat file located in $PROJECT_ROOT/NYU_depth_v2/NYUdepth.

  3. Now, to download a pretrained model to initialize the network and the best of our models, run:
  cd $PROJECT_ROOT/NYU_depth_v2
  ./Download_Models.sh

You can find the file init.caffemodel in $PROJECT_ROOT/deeplab_NYU/NYUdepth/model/resnet. We will use that model to initialize the weights of the segmentation network. In $PROJECT_ROOT/deeplab_NYU/NYUdepth/model/resnet/trained_model you can find the final model used in our IROS 2018 paper.

  4. Finally, we need to extract the images from the downloaded .mat file. Under PROJECT_ROOT/NYU_depth_v2, just run the Matlab script ObtainImages.m. Also, to overcome memory limitations, we split the images into two halves with overlap; we can do that by running SplitWholeDataset.m.
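The idea behind SplitWholeDataset.m can be sketched in Python. Note this is an illustration only: the 16-pixel overlap and the vertical cut direction are assumptions, not the values used by the original Matlab script.

```python
import numpy as np

def split_with_overlap(img, overlap=16):
    """Return left/right halves of `img` that share `overlap` columns."""
    h, w = img.shape[:2]
    mid = w // 2
    left = img[:, : mid + overlap // 2]    # columns 0 .. mid + overlap/2 - 1
    right = img[:, mid - overlap // 2 :]   # columns mid - overlap/2 .. w - 1
    return left, right

# NYU Depth v2 frames are 480x640 RGB.
img = np.zeros((480, 640, 3), dtype=np.uint8)
left, right = split_with_overlap(img)
print(left.shape, right.shape)  # (480, 328, 3) (480, 328, 3)
```

After segmentation, the overlapping strip lets the two half-predictions be stitched back together without a seam at the cut.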

Image Segmentation

  1. First, we need to compile the DeepLab-based model. Run:
  cd $PROJECT_ROOT/deeplab_NYU
  make
  make pycaffe
  make matcaffe
  2. Once the compilation has been successfully completed, run:
  cd $PROJECT_ROOT/deeplab_NYU
  python run_NYUdepth.py

to train or test a model. Please note that the script has a flag to choose between training and testing. By default, we search $PROJECT_ROOT/deeplab_NYU/NYUdepth/model/resnet for a .caffemodel, either to initialize the weights of the network for training, or to find the model to use during testing.

  3. To visualize the results after the test has been completed, run the Matlab script GetSegResults_Split.m in $PROJECT_ROOT/deeplab_NYU/matlab/my_script_NYUdepth, with do_save_results set to 1. We will need these results to perform the scene classification task.

  4. To evaluate the results using the different metrics reported in our paper, run Get_mIOU.m in $PROJECT_ROOT/deeplab_NYU/matlab/my_script_NYUdepth.
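The .caffemodel lookup that run_NYUdepth.py performs (step 2 above) can be sketched as follows. Picking the most recently modified model file is an illustrative assumption; the actual selection rule in the script may differ.

```python
import glob
import os

def find_caffemodel(model_dir):
    """Return the newest .caffemodel under `model_dir`, or None if absent."""
    candidates = glob.glob(os.path.join(model_dir, "*.caffemodel"))
    if not candidates:
        return None
    # Assumption: prefer the most recently written snapshot.
    return max(candidates, key=os.path.getmtime)
```

Dropping init.caffemodel into the directory thus initializes training, while a trained snapshot in the same place is what testing picks up.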
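Among the metrics Get_mIOU.m reports is the mean intersection-over-union. A minimal Python sketch of that metric (not the original Matlab code) is:

```python
import numpy as np

def mean_iou(gt, pred, num_classes):
    """Mean IoU of two flat label arrays, averaged over classes in the GT."""
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    for g, p in zip(gt.ravel(), pred.ravel()):
        conf[g, p] += 1                      # confusion matrix: rows = GT
    inter = np.diag(conf)                    # correctly labeled pixels
    union = conf.sum(0) + conf.sum(1) - inter
    valid = conf.sum(1) > 0                  # classes present in the GT
    return (inter[valid] / union[valid]).mean()

gt = np.array([0, 0, 1, 1])
pred = np.array([0, 1, 1, 1])
print(mean_iou(gt, pred, 2))  # class IoUs 1/2 and 2/3 -> mean 7/12
```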

Scene classification

Once we have the results from our segmentation model, we can use them to categorize different scenes following the approach described in our paper. Simply:

  1. In PROJECT_ROOT/Classification/histograms, run Generate_histograms.m to generate the histogram-like features from the segmented images. The code lets you choose the number of spatial pyramid levels to use.

  2. Under PROJECT_ROOT/Classification/SVM, you can find the models used to perform scene classification with both a linear SVM and an additive kernel SVM. You just need to run run_SVM.m or run_addtive_SVM.m.
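The spatial pyramid features from step 1 can be sketched in Python. This is an illustration, not the original Generate_histograms.m: the pyramid depth and the per-cell L1 normalization are assumptions.

```python
import numpy as np

def pyramid_histogram(seg, num_classes, levels=2):
    """Concatenate L1-normalized label histograms over a spatial pyramid.

    Level l divides the segmentation map into 2^l x 2^l cells.
    """
    feats = []
    for l in range(levels):
        cells = 2 ** l
        for rows in np.array_split(seg, cells, axis=0):
            for cell in np.array_split(rows, cells, axis=1):
                h = np.bincount(cell.ravel(), minlength=num_classes)
                feats.append(h / max(h.sum(), 1))  # L1-normalize each cell
    return np.concatenate(feats)

seg = np.random.randint(0, 4, size=(32, 32))   # toy segmentation map
f = pyramid_histogram(seg, num_classes=4, levels=2)
print(f.shape)  # (20,) -> (1 + 4) cells x 4 classes
```

The per-cell histograms preserve coarse layout information that a single global histogram would discard.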
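As an illustration of the two classifier types in step 2, here is a scikit-learn stand-in (not the original MATLAB models): a linear SVM on the raw histogram features, and an additive kernel SVM approximated via the additive chi-squared feature map. The data, labels, and C value are synthetic.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.kernel_approximation import AdditiveChi2Sampler

rng = np.random.default_rng(0)
X = rng.random((40, 20))           # histogram-like, non-negative features
y = (X[:, 0] > 0.5).astype(int)    # toy scene labels

# Linear SVM directly on the features.
linear = LinearSVC(C=1.0).fit(X, y)

# Additive chi-squared map + linear SVM approximates an additive kernel SVM.
chi2 = AdditiveChi2Sampler(sample_steps=2)
additive = LinearSVC(C=1.0).fit(chi2.fit_transform(X), y)

print(linear.score(X, y), additive.score(chi2.transform(X), y))
```

The chi-squared map is a good fit here because the inputs are L1-normalized histograms, which is exactly the setting additive kernels were designed for.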

