Official implementation of the ICCV 2023 paper *Efficient 3D Semantic Segmentation with Superpoint Transformer*.
SPT is a superpoint-based transformer architecture that efficiently performs semantic segmentation on large-scale 3D scenes. The method includes a fast algorithm that partitions point clouds into a hierarchical superpoint structure, as well as a self-attention mechanism that exploits the relationships between superpoints at multiple scales.
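To build intuition for why attending over superpoints is cheap, here is a toy sketch (NOT the SPT implementation; all shapes, names, and the mean-pooling choice are illustrative assumptions): points are pooled into a handful of superpoints, and a single self-attention step then relates those superpoints instead of the raw points.

```python
# Toy illustration of superpoint pooling + self-attention (hypothetical,
# not the SPT code): attention runs over ~10 superpoints, not ~10^5 points.
import numpy as np

rng = np.random.default_rng(0)

n_points, n_superpoints, d = 1000, 8, 16
points = rng.normal(size=(n_points, d))                 # per-point features
assignment = rng.integers(0, n_superpoints, n_points)   # point -> superpoint id

# 1) Pool point features into one descriptor per superpoint (mean pooling)
sp_feats = np.stack([points[assignment == i].mean(axis=0)
                     for i in range(n_superpoints)])    # (8, 16)

# 2) Scaled dot-product self-attention among the few superpoints
#    (learned query/key/value projections omitted for brevity)
q, k, v = sp_feats, sp_feats, sp_feats
scores = q @ k.T / np.sqrt(d)                           # (8, 8)
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)           # row-wise softmax
out = weights @ v                                       # (8, 16) contextualized superpoints

print(out.shape)
```

The quadratic attention cost is paid on the superpoint count, which is orders of magnitude smaller than the point count.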
| SPT in numbers |
| --- |
| SOTA on S3DIS 6-Fold (76.0 mIoU) |
| SOTA on KITTI-360 Val (63.5 mIoU) |
| Near-SOTA on DALES (79.6 mIoU) |
| 212k parameters (PointNeXt ÷ 200, Stratified Transformer ÷ 40) |
| S3DIS training in 3h on 1 GPU (PointNeXt ÷ 7, Stratified Transformer ÷ 70) |
| Preprocessing x7 faster than SPG |
- 06.10.2023 Come see our poster for *Efficient 3D Semantic Segmentation with Superpoint Transformer* at ICCV 2023
- 14.07.2023 Our paper *Efficient 3D Semantic Segmentation with Superpoint Transformer* was accepted at ICCV 2023
- 15.06.2023 Official release
This project was tested with:
- Linux OS
- NVIDIA GTX 1080 Ti 11G, NVIDIA V100 32G, NVIDIA A40 48G
- CUDA 11.8 (`torch-geometric` does not support CUDA 12.0 yet)
- conda 23.3.1
Simply run `install.sh` to install all dependencies in a new conda environment named `spt`.

```bash
# Creates a conda env named 'spt' and installs dependencies
./install.sh
```
Note: See the Datasets page for setting up your dataset path and file structure.
```
└── superpoint_transformer
    │
    ├── configs                     # Hydra configs
    │   ├── callbacks               # Callbacks configs
    │   ├── data                    # Data configs
    │   ├── debug                   # Debugging configs
    │   ├── experiment              # Experiment configs
    │   ├── extras                  # Extra utilities configs
    │   ├── hparams_search          # Hyperparameter search configs
    │   ├── hydra                   # Hydra configs
    │   ├── local                   # Local configs
    │   ├── logger                  # Logger configs
    │   ├── model                   # Model configs
    │   ├── paths                   # Project paths configs
    │   ├── trainer                 # Trainer configs
    │   │
    │   ├── eval.yaml               # Main config for evaluation
    │   └── train.yaml              # Main config for training
    │
    ├── data                        # Project data (see docs/datasets.md)
    │
    ├── docs                        # Documentation
    │
    ├── logs                        # Logs generated by hydra and lightning loggers
    │
    ├── media                       # Media illustrating the project
    │
    ├── notebooks                   # Jupyter notebooks
    │
    ├── scripts                     # Shell scripts
    │
    ├── src                         # Source code
    │   ├── data                    # Data structure for hierarchical partitions
    │   ├── datamodules             # Lightning DataModules
    │   ├── datasets                # Datasets
    │   ├── dependencies            # Compiled dependencies
    │   ├── loader                  # DataLoader
    │   ├── loss                    # Loss
    │   ├── metrics                 # Metrics
    │   ├── models                  # Model architecture
    │   ├── nn                      # Model building blocks
    │   ├── optim                   # Optimization
    │   ├── transforms              # Functions for transforms, pre-transforms, etc.
    │   ├── utils                   # Utilities
    │   ├── visualization           # Interactive visualization tool
    │   │
    │   ├── eval.py                 # Run evaluation
    │   └── train.py                # Run training
    │
    ├── tests                       # Tests of any kind
    │
    ├── .env.example                # Example file for storing private environment variables
    ├── .gitignore                  # List of files ignored by git
    ├── .pre-commit-config.yaml     # Configuration of pre-commit hooks for code formatting
    ├── install.sh                  # Installation script
    ├── LICENSE                     # Project license
    └── README.md
```
Note: See the Datasets page for further details on `data/`.

Note: See the Logs page for further details on `logs/`.
See the Datasets page to set up your datasets.
Use the following commands to evaluate SPT from a checkpoint file `checkpoint.ckpt`:
```bash
# Evaluate SPT on S3DIS Fold 5
python src/eval.py experiment=s3dis datamodule.fold=5 ckpt_path=/path/to/your/checkpoint.ckpt

# Evaluate SPT on KITTI-360 Val
python src/eval.py experiment=kitti360 ckpt_path=/path/to/your/checkpoint.ckpt

# Evaluate SPT on DALES
python src/eval.py experiment=dales ckpt_path=/path/to/your/checkpoint.ckpt
```
Note: The pretrained weights of the SPT and SPT-nano models for S3DIS 6-Fold, KITTI-360 Val, and DALES are available at:
Use the following commands to train SPT on a 32G GPU:
```bash
# Train SPT on S3DIS Fold 5
python src/train.py experiment=s3dis datamodule.fold=5

# Train SPT on KITTI-360 Val
python src/train.py experiment=kitti360

# Train SPT on DALES
python src/train.py experiment=dales
```
Use the following commands to train SPT on an 11G GPU (training time and performance may vary):
```bash
# Train SPT on S3DIS Fold 5
python src/train.py experiment=s3dis_11g datamodule.fold=5

# Train SPT on KITTI-360 Val
python src/train.py experiment=kitti360_11g

# Train SPT on DALES
python src/train.py experiment=dales_11g
```
Note: Encountering CUDA Out-Of-Memory errors? See our dedicated troubleshooting section.
Note: Other ready-to-use configs are provided in `configs/experiment/`. You can easily design your own experiments by composing configs:

```bash
# Train Nano-3 for 50 epochs on DALES
python src/train.py datamodule=dales model=nano-3 trainer.max_epochs=50
```
See Lightning-Hydra for more information on how the config system works and all the awesome perks of the Lightning+Hydra combo.
Note: By default, your logs will automatically be uploaded to Weights and Biases, from where you can track and compare your experiments. Other loggers are available in `configs/logger/`. See Lightning-Hydra for more information on the logging options.
We provide notebooks to help you get started with manipulating our core data structures, configs loading, dataset and model instantiation, inference on each dataset, and visualization.
In particular, we created an interactive visualization tool which can be used to produce shareable HTML files. Demos of how to use this tool are provided in the notebooks. Additionally, examples of such HTML files are provided in `media/visualizations.7z`.
- README - General introduction to the project
- Data - Introduction to `NAG` and `Data`, the core data structures of this project
- Datasets - Introduction to `Datasets` and the project's `data/` structure
- Logging - Introduction to logging and the project's `logs/` structure
Note: We endeavoured to comment our code as much as possible to make this project usable. Still, if you find some parts unclear or in need of more documentation, feel free to let us know by creating an issue!
Here are some common issues and tips for tackling them.
Our default configurations are designed for a 32G GPU. Yet, SPT can run on an 11G GPU, with minor time and performance variations.

We provide configs in `configs/experiment/` for training SPT on an 11G GPU:
```bash
# Train SPT on S3DIS Fold 5
python src/train.py experiment=s3dis_11g datamodule.fold=5

# Train SPT on KITTI-360 Val
python src/train.py experiment=kitti360_11g

# Train SPT on DALES
python src/train.py experiment=dales_11g
```
Having CUDA OOM errors? Here are some parameters you can play with to mitigate GPU memory use, based on when the error occurs.
Parameters affecting CUDA memory.

Legend: [P] Preprocessing | [T] Training | [I] Inference (including validation and testing during training)

| Parameter | Description | When |
| --- | --- | --- |
| `datamodule.xy_tiling` | Splits dataset tiles into `xy_tiling^2` smaller tiles, based on a regular XY grid. Ideal for square-shaped tiles à la DALES. Note this will affect the number of training steps. | [P] [I] |
| `datamodule.pc_tiling` | Splits dataset tiles into `2^pc_tiling` smaller tiles, based on their principal component. Ideal for varying tile shapes à la S3DIS and KITTI-360. Note this will affect the number of training steps. | [P] [I] |
| `datamodule.max_num_nodes` | Limits the number of nodes in the training batches. | [T] |
| `datamodule.max_num_edges` | Limits the number of edges in the training batches. | [T] |
| `datamodule.voxel` | Increasing the voxel size will reduce preprocessing, training, and inference times, but will also reduce performance. | [P] [T] [I] |
| `datamodule.pcp_regularization` | Regularization for the partition levels. The larger, the fewer the superpoints. | [P] [T] [I] |
| `datamodule.pcp_spatial_weight` | Importance of the 3D position in the partition. The smaller, the fewer the superpoints. | [P] [T] [I] |
| `datamodule.pcp_cutoff` | Minimum superpoint size. The larger, the fewer the superpoints. | [P] [T] [I] |
| `datamodule.graph_k_max` | Maximum number of adjacent nodes in the superpoint graphs. The smaller, the fewer the superedges. | [P] [T] [I] |
| `datamodule.graph_gap` | Maximum distance between adjacent superpoints in the superpoint graphs. The smaller, the fewer the superedges. | [P] [T] [I] |
| `datamodule.graph_chunk` | Reduce to avoid OOM when `RadiusHorizontalGraph` preprocesses the superpoint graph. | [P] |
| `datamodule.dataloader.batch_size` | Controls the number of loaded tiles. Each train batch is composed of `batch_size * datamodule.sample_graph_k` spherical samplings. Inference is performed on entire validation and test tiles, without spherical sampling. | [T] [I] |
| `datamodule.sample_segment_ratio` | Randomly drops a fraction of the superpoints at each partition level. | [T] |
| `datamodule.sample_graph_k` | Controls the number of spherical samples in the train batches. | [T] |
| `datamodule.sample_graph_r` | Controls the radius of spherical samples in the train batches. Set `sample_graph_r <= 0` to use the entire tile without spherical sampling. | [T] |
| `datamodule.sample_point_min` | Controls the minimum number of points per spherical sample. | [T] |
| `datamodule.sample_point_max` | Controls the maximum number of points per spherical sample. | [T] |
| `callbacks.gradient_accumulator.scheduling` | Gradient accumulation. Can be used to train with smaller batches, with more training steps. | [T] |
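To get a feel for how these parameters interact, here is a back-of-the-envelope sketch. The numeric values are made up for illustration (they are not the project's defaults); the variable names simply mirror the table entries.

```python
# Plain-number arithmetic for the parameters in the table above
# (illustrative values only, not the project's actual defaults).

# Tiling: how many sub-tiles a dataset tile is split into
xy_tiling = 3
tiles_xy = xy_tiling ** 2        # regular XY grid -> 9 sub-tiles
pc_tiling = 3
tiles_pc = 2 ** pc_tiling        # principal-component splits -> 8 sub-tiles

# Train batch: spherical samples held in GPU memory at once
batch_size = 4                   # datamodule.dataloader.batch_size
sample_graph_k = 2               # datamodule.sample_graph_k
samples_per_batch = batch_size * sample_graph_k          # 8

# Gradient accumulation: samples contributing to one optimizer step
accumulate = 2                   # accumulation factor (callbacks.gradient_accumulator.scheduling)
samples_per_step = samples_per_batch * accumulate        # 16

# Halving batch_size while doubling the accumulation factor keeps the
# effective optimizer step constant but halves peak GPU memory:
alt_per_step = (batch_size // 2) * sample_graph_k * (accumulate * 2)
assert alt_per_step == samples_per_step

print(tiles_xy, tiles_pc, samples_per_batch, samples_per_step)
```

This is why the table lists both `batch_size` and gradient accumulation as OOM levers: they trade GPU memory against the number of training steps while preserving the effective batch size.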
- This project was built using the Lightning-Hydra template.
- The main data structures of this work rely on PyTorch Geometric.
- Some point cloud operations were inspired by the Torch-Points3D framework, although not merged with the official project at this point.
- For the KITTI-360 dataset, some code from the official KITTI-360 repository was used.
- Some superpoint-graph-related operations were inspired by Superpoint Graph.
- The hierarchical superpoint partition is computed using Parallel Cut-Pursuit.
If your work uses all or part of the present code, please include the following citation:

```bibtex
@inproceedings{robert2023spt,
  title={Efficient 3D Semantic Segmentation with Superpoint Transformer},
  author={Robert, Damien and Raguet, Hugo and Landrieu, Loic},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2023}
}
```
You can find our paper on arXiv.

Also, if you like this project, don't forget to give it a star, it means a lot to us!