BSNet

This is the official PyTorch implementation of our CVPR 2024 paper:

BSNet: Box-Supervised Simulation-assisted Mean Teacher for 3D Instance Segmentation [Paper]

Jiahao Lu, Jiacheng Deng, Tianzhu Zhang

Get Started

Environment

Requirements

  • Python 3.x
  • PyTorch 1.10
  • CUDA 10.x or higher

The following installation instructions assume python=3.8, pytorch=1.10, and cuda=11.4.

  • Create a conda virtual environment

    conda create -n BSNet python=3.8
    conda activate BSNet
    
  • Install the dependencies

    Install PyTorch 1.10

    pip install spconv-cu114
    conda install pytorch-scatter -c pyg
    pip install -r requirements.txt
    

    Install segmentator from this repo (we provide a wrapper around ScanNet's Segmentator).

  • Setup: install the spformer and pointgroup_ops extensions.

    sudo apt-get install libsparsehash-dev
    python setup.py develop
    cd spformer/lib/
    python setup.py develop
    
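After these steps, a quick import check helps catch a broken install early. The following is a minimal, hypothetical helper (not part of the repo); it assumes spconv 2.x (e.g. spconv-cu114), which exposes the spconv.pytorch submodule.

    # check_env.py -- minimal environment sanity check (hypothetical helper,
    # not part of the repo): verifies the core dependencies import and that
    # PyTorch sees the GPU.
    import torch
    import torch_scatter
    import spconv.pytorch  # spconv 2.x (e.g. spconv-cu114) exposes this submodule
    import segmentator     # the wrapper installed in the previous step

    print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
    print("torch_scatter:", torch_scatter.__version__)
    print("all core dependencies imported successfully")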

Data Preparation

ScanNet v2 dataset

Download the ScanNet v2 dataset.

Put the downloaded scans and scans_test folders as follows.

BSNet
├── data
│   ├── scannetv2
│   │   ├── scans
│   │   ├── scans_test

Split and preprocess data

cd data/scannetv2
bash prepare_data.sh

The script splits the data into train/val/test folders and preprocesses it. After running the script, the ScanNet dataset structure should look like below.

BSNet
├── data
│   ├── scannetv2
│   │   ├── scans
│   │   ├── scans_test
│   │   ├── train
│   │   ├── val
│   │   ├── test
│   │   ├── val_gt

Generate Superpoints

The code comes from: https://github.com/ScanNet/ScanNet/tree/master/Segmentator.

python prepare_superpoint.py
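For reference, below is a minimal sketch of what this step does, assuming open3d is available for mesh loading and that the installed segmentator wrapper exposes segment_mesh(vertices, faces); check the wrapped repo for the exact API and parameters.

    # superpoint_sketch.py -- illustrative only; assumes the wrapped segmentator
    # exposes segment_mesh(vertices, faces) returning one superpoint id per vertex.
    import numpy as np
    import torch
    import open3d as o3d
    import segmentator

    mesh = o3d.io.read_triangle_mesh(
        "data/scannetv2/scans/scene0000_00/scene0000_00_vh_clean_2.ply")
    vertices = torch.from_numpy(np.asarray(mesh.vertices)).float()
    faces = torch.from_numpy(np.asarray(mesh.triangles)).long()

    superpoints = segmentator.segment_mesh(vertices, faces).numpy()  # (num_vertices,)
    np.save("scene0000_00_superpoints.npy", superpoints)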

Generate Pseudo Labels (Coming Soon!)

Here, we provide the pseudo labels generated by our method on Google Drive. The pipeline consists of the following steps (a naive sketch of the underlying box-to-superpoint assignment idea is shown after the list):

  • Generate non-overlapping objects (Google Drive)
  • Train on simulated samples
  • Train the Mean Teacher
  • Generate pseudo labels
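The sketch below shows the plainest possible assignment of superpoints to axis-aligned ground-truth boxes. It illustrates the general box-supervised idea only, not the BSNet procedure, and the helper name is hypothetical.

    # pseudo_label_sketch.py -- naive baseline sketch, NOT the BSNet pipeline:
    # assign each superpoint to the tightest axis-aligned GT box containing it.
    import numpy as np

    def naive_pseudo_labels(points, superpoints, boxes):
        """points: (N, 3) xyz; superpoints: (N,) superpoint id per point;
        boxes: (M, 6) as (xmin, ymin, zmin, xmax, ymax, zmax).
        Returns (N,) box id per point, -1 where no box claims the superpoint."""
        labels = np.full(points.shape[0], -1, dtype=np.int64)
        volumes = np.prod(boxes[:, 3:] - boxes[:, :3], axis=1)
        for sp in np.unique(superpoints):
            mask = superpoints == sp
            centroid = points[mask].mean(axis=0)
            inside = np.all((centroid >= boxes[:, :3]) & (centroid <= boxes[:, 3:]), axis=1)
            if inside.any():
                # ambiguity: prefer the smallest containing box
                candidates = np.where(inside)[0]
                labels[mask] = candidates[np.argmin(volumes[candidates])]
        return labels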

Training

The training steps are the same as in the corresponding original repositories; see SPFormer, ISBNet, and MAFT for details. Please note that the pretrained weights of SPFormer and MAFT are identical to GaPro's.
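For readers unfamiliar with the mean-teacher paradigm the title refers to: a generic sketch of the exponential-moving-average (EMA) teacher update, not the repo's exact training loop, looks like this.

    # mean_teacher_sketch.py -- generic EMA teacher update used by mean-teacher
    # training; a sketch of the paradigm, not this repo's implementation.
    import copy
    import torch
    import torch.nn as nn

    @torch.no_grad()
    def ema_update(teacher, student, momentum=0.999):
        # teacher <- momentum * teacher + (1 - momentum) * student
        for t, s in zip(teacher.parameters(), student.parameters()):
            t.mul_(momentum).add_(s, alpha=1.0 - momentum)

    # toy usage: the teacher starts as a frozen copy of the student
    student = nn.Linear(8, 8)
    teacher = copy.deepcopy(student)
    for p in teacher.parameters():
        p.requires_grad_(False)
    # ... after each optimizer.step() on the student:
    ema_update(teacher, student)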

Pre-trained Models

Dataset     Model            AP    AP_50%  AP_25%  Download
ScanNetv2   SPFormer         56.3  73.9    -       Model Weight
ScanNetv2   ISBNet           54.5  73.1    82.5    Model Weight
ScanNetv2   MAFT             58.4  75.9    -       Model Weight
ScanNetv2   SPFormer + Ours  53.3  72.7    83.4    Model Weight
ScanNetv2   ISBNet + Ours    52.8  71.6    82.6    Model Weight
ScanNetv2   MAFT + Ours      56.2  75.9    85.7    Model Weight
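The downloaded weights are ordinary PyTorch checkpoints. A quick way to inspect one before plugging it into a config (the file name and top-level key below are guesses; the state_dict layout may differ per backbone):

    # inspect_ckpt.py -- assumes a standard PyTorch checkpoint; the exact key
    # layout ('net', 'state_dict', ...) may vary between the three backbones.
    import torch

    ckpt = torch.load("maft_ours.pth", map_location="cpu")  # hypothetical file name
    state = ckpt.get("net", ckpt) if isinstance(ckpt, dict) else ckpt
    for name, value in list(state.items())[:5]:
        shape = tuple(value.shape) if hasattr(value, "shape") else type(value).__name__
        print(name, shape)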

Acknowledgements

This repo is built upon ISBNet, SPFormer, MAFT, and GaPro.

Citation

If you find this project useful, please consider citing:

@inproceedings{lu2024bsnet,
  title={BSNet: Box-Supervised Simulation-assisted Mean Teacher for 3D Instance Segmentation},
  author={Lu, Jiahao and Deng, Jiacheng and Zhang, Tianzhu},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={20374--20384},
  year={2024}
}


bsnet's Issues

About Pretrained

Hi @peoplelu,

Your research is quite interesting and I have just taken a look at it. My question is:
Do you use the 'pretrained' weights from SSTNet for training your model? If so, I'm not sure whether using 'pretrained' weights from a fully supervised model is fair for a weakly supervised task.

Thank you in advance.

Questions about evaluation

Hi @peoplelu ,

Just had a quick look through the BSNet paper, found it interesting, but still have 2 questions. Can you maybe guide me a little bit through them? Sorry if I'm missing something in my questions.

  1. You use ground truth boxes and superpoints (at least for SSTNet, ISBNet, and SPFormer) and mark this evaluation as weakly supervised. But actually, at least for ScanNet, simply intersecting superpoints with ground truth boxes gives you exact (not approximate) ground truth instance masks. So with such inputs the evaluation becomes fully supervised and should match 100% of the baseline performance. So the question is: do I understand this unfairness correctly?

  2. Can you give me the intuition behind such an impressive increase in detection metrics compared to the fully supervised baseline? E.g. 62 mAP50 for ISBNet compared to your 70 mAP50 for ISBNet+BSNet. As I understand it, both of these methods don't predict boxes directly, they just infer them from the predicted instance masks. And in instance segmentation performance, the weakly supervised method is a little lower than the fully supervised one.

Thanks in advance,
Danila
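For context on the second question above: neither ISBNet nor SPFormer regresses boxes directly; an axis-aligned box can be derived from each predicted instance mask, roughly as in this hypothetical helper (not the evaluation code).

    # box_from_mask.py -- hypothetical helper illustrating how a detection box
    # is derived from a predicted instance mask over the point cloud.
    import numpy as np

    def box_from_instance_mask(points, mask):
        """points: (N, 3) xyz; mask: (N,) bool for one predicted instance.
        Returns (6,) axis-aligned box (xmin, ymin, zmin, xmax, ymax, zmax)."""
        pts = points[mask]
        return np.concatenate([pts.min(axis=0), pts.max(axis=0)])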
