
Segmentation-Aware Convolutional Networks Using Local Attention Masks

[Project Page] [Paper]

Segmentation-aware convolution filters are invariant to backgrounds. We achieve this in three steps: (i) compute segmentation cues for each pixel (i.e., “embeddings”), (ii) create a foreground mask for each patch, and (iii) combine the masks with convolution, so that the filters only process the local foreground in each image patch.
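The three steps above can be sketched in NumPy for a single output pixel. This is a minimal illustration, not the library implementation: the function and argument names are ours, boundary handling is omitted, and an L1 embedding distance is assumed for concreteness.

```python
import numpy as np

def segaware_conv_pixel(feat, emb, weight, ci, cj, k=3):
    """Segmentation-aware convolution at one interior pixel (ci, cj).

    feat:   (C, H, W) input feature map
    emb:    (E, H, W) per-pixel embeddings (step i)
    weight: (C, k, k) filter for one output channel
    """
    r = k // 2
    center = emb[:, ci, cj]
    out = 0.0
    for di in range(-r, r + 1):
        for dj in range(-r, r + 1):
            i, j = ci + di, cj + dj
            # step (ii): soft foreground mask from the embedding distance
            dist = np.abs(emb[:, i, j] - center).sum()
            mask = np.exp(-dist)
            # step (iii): down-weight the neighbor before applying the filter
            out += mask * np.dot(weight[:, di + r, dj + r], feat[:, i, j])
    return out
```

When all embeddings in a patch are identical, every mask is 1 and the result reduces to a plain convolution; neighbors with distant embeddings (likely background) are suppressed.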

Installation

For prerequisites, refer to DeepLabV2. Our setup follows theirs almost exactly.

Once you have the prerequisites, simply run make all -j4 from within caffe/ to compile the code with 4 cores.

Learning embeddings with dedicated loss

  • Use Convolution layers to create dense embeddings.
  • Use Im2dist to compute dense distance comparisons in an embedding map.
  • Use Im2parity to compute dense label comparisons in a label map.
  • Use DistLoss (with parameters alpha and beta) to set up a contrastive side loss on the distances.

See scripts/segaware/config/embs for a full example.
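The loss these layers set up can be summarized in a short sketch. This is illustrative only: it assumes the common contrastive form, with same-label distances penalized above alpha and different-label distances penalized below beta, and it uses an L1 distance (an assumption; the actual Im2dist distance may differ).

```python
import numpy as np

def dist_loss(emb, labels, pairs, alpha=0.5, beta=2.0):
    """Contrastive loss over pixel pairs (a sketch of DistLoss).

    emb:    (N, E) embeddings, one row per pixel
    labels: (N,) segmentation labels
    pairs:  iterable of index pairs (i, j) to compare
    Same-label pairs are pulled within alpha; different-label
    pairs are pushed beyond beta.
    """
    loss = 0.0
    for i, j in pairs:
        d = np.abs(emb[i] - emb[j]).sum()  # dense distance, as from Im2dist
        if labels[i] == labels[j]:
            loss += max(0.0, d - alpha)    # Im2parity says "same label"
        else:
            loss += max(0.0, beta - d)     # Im2parity says "different label"
    return loss
```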

Setting up a segmentation-aware convolution layer

  • Use Im2col on the input, to arrange pixel/feature patches into columns.
  • Use Im2dist on the embeddings, to get their distances into columns.
  • Use Exp on the distances, with scale: -1, to get them into [0,1].
  • Tile the exponentiated distances, with a factor equal to the depth (i.e., channels) of the original convolution features.
  • Use Eltwise to multiply the Tile result with the Im2col result.
  • Use Convolution with bottom_is_im2col: true to matrix-multiply the convolution weights with the Eltwise output.

See scripts/segaware/config/vgg for an example in which every convolution layer in the VGG16 architecture is made segmentation-aware.
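Chained together, the steps above might look like the following prototxt fragment. This is a sketch: the layer names, blob names, and the channel count 64 are illustrative, and the exact parameter spellings should be taken from the configs in scripts/segaware/config/vgg.

```protobuf
layer { name: "conv1_cols" type: "Im2col"  bottom: "data" top: "conv1_cols" }
layer { name: "conv1_dist" type: "Im2dist" bottom: "emb"  top: "conv1_dist" }
layer { name: "conv1_mask" type: "Exp"     bottom: "conv1_dist" top: "conv1_mask"
        exp_param { scale: -1 } }          # distances -> masks in (0,1]
layer { name: "conv1_tile" type: "Tile"    bottom: "conv1_mask" top: "conv1_tile"
        tile_param { tiles: 64 } }         # match the conv feature depth
layer { name: "conv1_prod" type: "Eltwise" bottom: "conv1_tile" bottom: "conv1_cols"
        top: "conv1_prod" eltwise_param { operation: PROD } }
layer { name: "conv1"      type: "Convolution" bottom: "conv1_prod" top: "conv1"
        convolution_param { num_output: 64 bottom_is_im2col: true } }
```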

Using a segmentation-aware CRF

  • Use the NormConvMeanfield layer. As input, give it two copies of the unary potentials (produced by a Split layer), some embeddings, and a meshgrid-like input (produced by a DummyData layer with data_filler { type: "xy" }).

See scripts/segaware/config/res for an example in which a segmentation-aware CRF is added to a ResNet architecture.
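A sketch of the wiring described above (blob names and the DummyData shape are illustrative; the exact syntax is in the configs under scripts/segaware/config/res):

```protobuf
layer { name: "xy"    type: "DummyData" top: "xy"
        dummy_data_param { data_filler { type: "xy" }
                           shape { dim: 1 dim: 2 dim: 65 dim: 65 } } }
layer { name: "split" type: "Split" bottom: "unary" top: "unary1" top: "unary2" }
layer { name: "crf"   type: "NormConvMeanfield"
        bottom: "unary1" bottom: "unary2" bottom: "emb" bottom: "xy"
        top: "crf" }
```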

Replicating the segmentation results presented in our paper

  • Download pretrained model weights here, and put that file into scripts/segaware/model/res/.
  • From scripts, run ./test_res.sh. This will produce .mat files in scripts/segaware/features/res/voc_test/mycrf/.
  • From scripts, run ./gen_preds.sh. This will produce colorized .png results in scripts/segaware/results/res/voc_test/mycrf/none/results/VOC2012/Segmentation/comp6_test_cls. An example input-output pair is shown below:

  • If you zip these results and submit them to the official PASCAL VOC test server, you will get 79.83900% IOU.

If you run this set of steps for the validation set, you can run ./eval.sh to evaluate your results on the PASCAL VOC validation set. If you change the model, you may want to run ./edit_env.sh to update the evaluation instructions.

Citation

@inproceedings{harley_segaware,
  title = {Segmentation-Aware Convolutional Networks Using Local Attention Masks},
  author = {Adam W Harley and Konstantinos G. Derpanis and Iasonas Kokkinos},
  booktitle = {IEEE International Conference on Computer Vision (ICCV)},
  year = {2017},
}

Help

Feel free to open issues here! Also, I'm pretty good with email: [email protected]


segaware's Issues

syncedmem.hpp:31] Check failed: error == cudaSuccess (29 vs. 0) driver shutting down

*** Check failure stack trace: ***
@ 0x7fa08bf23daa (unknown)
@ 0x7fa08bf23ce4 (unknown)
@ 0x7fa08bf236e6 (unknown)
@ 0x7fa08bf26687 (unknown)
@ 0x7fa08c5791e1 caffe::SyncedMemory::~SyncedMemory()
@ 0x7fa08c5c6fb2 boost::detail::sp_counted_impl_p<>::dispose()
@ 0x40a52e boost::detail::sp_counted_base::release()
@ 0x7fa08c5df1e5 caffe::Blob<>::~Blob()
@ 0x7fa08aa8753a (unknown)
@ 0x7fa08c50ac43 (unknown)
Aborted (core dumped)

I hit this issue every time at the end of training or testing. Any ideas?

TwoImageData

Could you please tell me what the "TwoImageData" layer is used for? I can't find the code.
Thanks!

Using Im2col and bottom_is_im2col needs more memory

I am trying to train VGG16 with my own data. I have cropped the images to 224x224. When I train with the VGG16 provided by the Caffe model zoo (https://gist.github.com/ksimonyan/211839e770f7b538e2d8), I can train with batch size 32. After replacing all convolution layers with Im2col followed by a convolution layer with bottom_is_im2col, the maximum batch size I can train with before hitting an "Out of memory" error is 8.
First, I wonder whether this is normal behavior, given that typical convolution layers use im2col internally.
Second, is there a way to reduce the memory requirements?
Thanks in advance,

Check failed: error == cudaSuccess (2 vs. 0) out of memory

I am currently running ./test_res.sh on a GTX 1080 Ti with 11 GB of GPU memory. When I run the script, it immediately throws an error:
I0911 12:46:39.811416 18216 caffe.cpp:252] Running for 1449 iterations.
F0911 12:46:40.284883 18216 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0) out of memory
Since the batch size is already 1, I don't know how much memory is actually needed. What should I do?

Caffe error

Hi,
I built the Caffe from this repo and I am getting the following error:
Error parsing text-format caffe.NetParameter: 62:15: Message type "caffe.LayerParameter" has no field named "deletebottom".

What might be the issue?
Regards, Vijay

Compatibility issue for cudnn

When building Caffe, this error occurs:

CXX .build_release/src/caffe/proto/caffe.pb.cc
CXX src/caffe/syncedmem.cpp
In file included from ./include/caffe/util/device_alternate.hpp:40:0,
                 from ./include/caffe/common.hpp:19,
                 from src/caffe/syncedmem.cpp:1:
./include/caffe/util/cudnn.hpp: In function ‘void caffe::cudnn::createPoolingDesc(cudnnPoolingStruct**, caffe::PoolingParameter_PoolMethod, cudnnPoolingMode_t*, int, int, int, int, int, int)’:
./include/caffe/util/cudnn.hpp:127:41: error: too few arguments to function ‘cudnnStatus_t cudnnSetPooling2dDescriptor(cudnnPoolingDescriptor_t, cudnnPoolingMode_t, cudnnNanPropagation_t, int, int, int, int, int, int)’
         pad_h, pad_w, stride_h, stride_w));
                                         ^
./include/caffe/util/cudnn.hpp:15:28: note: in definition of macro ‘CUDNN_CHECK’
     cudnnStatus_t status = condition; \
                            ^
In file included from ./include/caffe/util/cudnn.hpp:5:0,
                 from ./include/caffe/util/device_alternate.hpp:40,
                 from ./include/caffe/common.hpp:19,
                 from src/caffe/syncedmem.cpp:1:
/usr/local/cuda/include/cudnn.h:799:27: note: declared here
 cudnnStatus_t CUDNNWINAPI cudnnSetPooling2dDescriptor(

It seems to be related to cuDNN version compatibility. The official Caffe builds fine on my computer; could you please update the code to fix this?

How to train the embedding network

I have found that the code only contains the testing part; there is no training code. Is the embedding network trained separately from the DeepLab network?
