resim's Introduction

ReSim

This repository provides the PyTorch implementation of Region Similarity Representation Learning (ReSim) described in this paper:

@Article{xiao2021region,
  author  = {Tete Xiao and Colorado J Reed and Xiaolong Wang and Kurt Keutzer and Trevor Darrell},
  title   = {Region Similarity Representation Learning},
  journal = {arXiv preprint arXiv:2103.12902},
  year    = {2021},
}

TL;DR: ReSim preserves spatial relationships in the convolutional feature maps during instance contrastive pre-training, which is useful for region-related tasks such as object detection, segmentation, and dense pose estimation.

Installation

Assuming a conda environment:

conda create --name resim python=3.7
conda activate resim

# NOTE: if you are not using CUDA 10.2, you need to change the 10.2 in this command appropriately. 
# Code tested with torch 1.6 and 1.7
# (check CUDA version with e.g. `cat /usr/local/cuda/version.txt`)
conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.2 -c pytorch
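
The note about matching `cudatoolkit` to the local CUDA version can be sketched in Python. The helper names below are hypothetical and not part of the ReSim repository. The sketch assumes the legacy `version.txt` format (`CUDA Version 10.2.89`) and only builds the command string; it does not check that a PyTorch 1.6 build exists for the detected toolkit version.

```python
import re


def parse_cuda_version(version_txt: str) -> str:
    """Return 'major.minor' from text such as 'CUDA Version 10.2.89'."""
    match = re.search(r"CUDA Version (\d+)\.(\d+)", version_txt)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_txt!r}")
    return f"{match.group(1)}.{match.group(2)}"


def conda_install_command(cuda_version: str) -> str:
    """Build the install command from this README for a given toolkit version."""
    return (
        "conda install pytorch==1.6.0 torchvision==0.7.0 "
        f"cudatoolkit={cuda_version} -c pytorch"
    )


if __name__ == "__main__":
    # Example input only; read the real string from /usr/local/cuda/version.txt.
    print(conda_install_command(parse_cuda_version("CUDA Version 10.1.243")))
```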

Pre-training

This codebase is based on the original MoCo codebase -- see this README for more details.

To pre-train for 200 epochs using the ReSim-FPN implementation as described in the paper:

python main_moco.py -a resnet50 --lr 0.03 --batch-size 256 \
       --dist-url tcp://localhost:10005 --multiprocessing-distributed --world-size 1 --rank 0 \
       --mlp --moco-t 0.2 --aug-plus --cos --epochs 200 \
       /location/of/imagenet/data/folder
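
For reference, the `--cos` flag in the upstream MoCo codebase enables a half-cosine learning-rate decay from the base rate to zero over the full run. The sketch below assumes ReSim keeps that schedule unchanged; `cosine_lr` is an illustrative name, not a function from the repository.

```python
import math


def cosine_lr(base_lr: float, epoch: int, total_epochs: int) -> float:
    """Half-cosine decay: base_lr at epoch 0, approaching 0 at total_epochs."""
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * epoch / total_epochs))


if __name__ == "__main__":
    # Values matching the command above: --lr 0.03 --epochs 200
    for epoch in (0, 50, 100, 150, 199):
        print(epoch, round(cosine_lr(0.03, epoch, 200), 6))
```

With `--lr 0.03 --epochs 200`, the rate is 0.03 at the start and half of that (0.015) at epoch 100.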

ResNet-50 Pre-trained Models

Checkpoint | Pre-train Epochs | COCO AP @2x | MoCo Checkpoint | Detectron Backbone
ReSim-FPN  | 400              | 41.9        | Download        | Download
ReSim-FPN  | 200              | 41.4        | Download        | Download
ReSim-C4   | 200              | 41.1        | Download        | Download

Detection

See these instructions for more details, but in brief:

# first install detectron2
# then place the COCO-2017 dataset in detection/datasets/coco

cd detection
python convert-pretrain-to-detectron2.py ../resim_fpn_checkpoint_latest.pth.tar detectron_resim_fpn_checkpoint_latest.pth.tar
python train_net.py --dist-url 'tcp://127.0.0.1:17654' \
       --config-file configs/coco_R_50_FPN_2x_moco.yaml --num-gpus 8 \
       MODEL.WEIGHTS detectron_resim_fpn_checkpoint_latest.pth.tar \
       TEST.EVAL_PERIOD 180000 SOLVER.CHECKPOINT_PERIOD 180000 \
       OUTPUT_DIR results/coco2x-resim-fpn
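
The conversion step renames checkpoint keys from the MoCo layout to the names detectron2 expects. The sketch below follows the key-renaming rules of the upstream MoCo conversion script and omits the tensor loading and saving. It is an assumption that ReSim's `convert-pretrain-to-detectron2.py` uses the same rules; it may handle additional keys, such as FPN layers. `rename_key` is an illustrative name.

```python
from typing import Optional

PREFIX = "module.encoder_q."


def rename_key(key: str) -> Optional[str]:
    """Map a MoCo query-encoder key to a detectron2 backbone key.

    Returns None for keys outside the query encoder (e.g. the key encoder
    or the queue), which the conversion drops.
    """
    if not key.startswith(PREFIX):
        return None
    key = key[len(PREFIX):]
    if "layer" not in key:
        key = "stem." + key  # conv1/bn1 before the residual stages
    for t in (1, 2, 3, 4):
        key = key.replace(f"layer{t}", f"res{t + 1}")
    for t in (1, 2, 3):
        key = key.replace(f"bn{t}", f"conv{t}.norm")
    key = key.replace("downsample.0", "shortcut")
    key = key.replace("downsample.1", "shortcut.norm")
    return key


if __name__ == "__main__":
    for k in ("module.encoder_q.conv1.weight",
              "module.encoder_q.layer1.0.bn1.weight",
              "module.encoder_q.layer4.0.downsample.1.bias",
              "module.encoder_k.conv1.weight"):
        print(k, "->", rename_key(k))
```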

License

This project is under the CC-BY-NC 4.0 license. See LICENSE.

resim's People

Contributors

cjrd, tete-xiao

resim's Issues

Config for dense pose estimation

Hi! Thank you for providing this excellent work.

Just a simple question. Do you plan to release your config for the dense pose estimation section in your paper? I'm quite interested in these experiments!

Thank you!

something goes wrong when training

Thanks for your great work!
I ran into a problem when running your code. Please give me a hand.
When I run this command:
python main_moco.py -a resnet34 --lr 0.03 --batch-size 256 --dist-url tcp://localhost:10005 --multiprocessing-distributed --world-size 1 --rank 0 --mlp --moco-t 0.2 --aug-plus --cos --epochs 10 ./my_data
I get the following error:
...
  File "/ReSim-main/moco/models.py", line 306, in forward
    return self._forward_impl(x, return_mocodet_feats=return_mocodet_feats)
  File "/ReSim-main/moco/models.py", line 286, in _forward_impl
    prev_features = self.fpn_lateral5(inst_feats)
  File "/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/anaconda3/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 423, in forward
    return self._conv_forward(input, self.weight)
  File "/anaconda3/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 419, in _conv_forward
    return F.conv2d(input, weight, self.bias, self.stride,
RuntimeError: Given groups=1, weight of size [256, 2048, 1, 1], expected input[64, 512, 7, 7] to have 2048 channels, but got 512 channels instead
I don't know how to solve this problem. Please help me, thank you very much!
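
One likely reading of the traceback, offered as a sketch rather than a confirmed diagnosis: the weight shape `[256, 2048, 1, 1]` means the `fpn_lateral5` 1x1 convolution expects 2048 input channels, which is what a ResNet-50 produces in its last stage. A ResNet-34 uses basic blocks and produces 512 channels there, which matches the `input[64, 512, 7, 7]` in the error. The helper names below are hypothetical and not part of ReSim; whether the codebase is meant to support backbones other than `resnet50` is not stated in the README.

```python
# Last-stage output channels in torchvision-style ResNets are
# 512 times the residual block's expansion factor.
BLOCK_EXPANSION = {
    "resnet18": 1,   # BasicBlock
    "resnet34": 1,   # BasicBlock
    "resnet50": 4,   # Bottleneck
    "resnet101": 4,  # Bottleneck
    "resnet152": 4,  # Bottleneck
}


def final_stage_channels(arch: str) -> int:
    """Channels of the last residual stage's output for a given architecture."""
    return 512 * BLOCK_EXPANSION[arch]


def lateral_is_compatible(arch: str, lateral_in_channels: int = 2048) -> bool:
    """True if a lateral conv with the given input width fits the backbone.

    The default of 2048 is taken from the weight shape in the traceback.
    """
    return final_stage_channels(arch) == lateral_in_channels


if __name__ == "__main__":
    for arch in ("resnet34", "resnet50"):
        print(arch, final_stage_channels(arch), lateral_is_compatible(arch))
```

Under this reading, running with `-a resnet50` as in the README command avoids the mismatch, while other backbones would need the lateral layer's input width adjusted to match.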
