
MinkLoc3D-SI: 3D LiDAR place recognition with sparse convolutions, spherical coordinates, and intensity

Introduction

3D LiDAR place recognition aims to estimate a coarse localization in a previously seen environment based on a single scan from a rotating 3D LiDAR sensor. Existing solutions to this problem include hand-crafted point cloud descriptors (e.g., ScanContext, M2DP, LiDAR IRIS) and deep learning-based methods (e.g., PointNetVLAD, PCAN, LPD-Net, DAGC, MinkLoc3D), which are often evaluated only on accumulated 2D scans from the Oxford RobotCar dataset. We introduce MinkLoc3D-SI, a sparse convolution-based solution that utilizes spherical coordinates of 3D points and processes the intensity of 3D LiDAR measurements, improving the performance when a single 3D LiDAR scan is used. Our method integrates the improvements typical for hand-crafted descriptors (like ScanContext) with the most efficient 3D sparse convolutions (MinkLoc3D). Our experiments show improved results on single scans from 3D LiDARs (USyd Campus dataset) and great generalization ability (KITTI dataset). Using intensity information on accumulated 2D scans (RobotCar Intensity dataset) improves the performance, even though the spherical representation does not produce a noticeable improvement there. As a result, MinkLoc3D-SI is suited for single scans obtained from a 3D LiDAR, making it applicable in autonomous vehicles.

[Fig. 1]

Citation

@ARTICLE{9661423,
  author={Żywanowski, Kamil and Banaszczyk, Adam and Nowicki, Michał R. and Komorowski, Jacek},
  journal={IEEE Robotics and Automation Letters}, 
  title={MinkLoc3D-SI: 3D LiDAR Place Recognition With Sparse Convolutions, Spherical Coordinates, and Intensity}, 
  year={2022},
  volume={7},
  number={2},
  pages={1079-1086},
  doi={10.1109/LRA.2021.3136863}}
  
@INPROCEEDINGS{9423215,
  author={Komorowski, Jacek},
  booktitle={2021 IEEE Winter Conference on Applications of Computer Vision (WACV)},
  title={MinkLoc3D: Point Cloud Based Large-Scale Place Recognition},
  year={2021},
  pages={1789-1798},
  doi={10.1109/WACV48630.2021.00183}}

This work is an extension of Jacek Komorowski's MinkLoc3D.

Environment and Dependencies

The code was tested with Python 3.8, PyTorch 1.7, and MinkowskiEngine 0.5.0 on Ubuntu 18.04 with CUDA 11.0.

The following Python packages are required:

  • PyTorch (version 1.7)
  • MinkowskiEngine (version 0.5.0)
  • pytorch_metric_learning (version 0.9.94 or above)
  • numba
  • tensorboard
  • pandas
  • psutil
  • bitarray

Modify the PYTHONPATH environment variable to include the absolute path to the project root folder:

export PYTHONPATH=$PYTHONPATH:/.../.../MinkLoc3D-SI

Datasets

The preprocessed University of Sydney Campus dataset (USyd) and the Oxford RobotCar dataset with an intensity channel (IntensityOxford) are available here. Extract the dataset folders in the same directory as the project code, so that it contains three folders: 1) IntensityOxford/, 2) MinkLoc3D-SI/, and 3) USyd/.

The pickle files used for positive/negative example assignment are compatible with the ones introduced in PointNetVLAD and can be generated using the scripts in the generating_queries/ folder. The benchmark datasets (Oxford and In-house) introduced in PointNetVLAD can also be used, following the instructions in PointNetVLAD.

Before training or evaluating the network, run the commands below to generate pickles with positive and negative point clouds for each anchor point cloud.

cd generating_queries/ 

# Generate training tuples for the USyd Dataset
python generate_training_tuples_usyd.py

# Generate evaluation tuples for the USyd Dataset
python generate_test_sets_usyd.py

# Generate training tuples for the IntensityOxford Dataset
python generate_training_tuples_intensityOxford.py

# Generate evaluation tuples for the IntensityOxford Dataset
python generate_test_sets_intensityOxford.py

Training

To train the MinkLoc3D-SI network, prepare the data as described above, then edit the configuration file (config/config_usyd.txt or config/config_intensityOxford.txt); an illustrative fragment is shown after the list:

  • num_points - number of points in the point cloud; points are randomly subsampled or zero-padded during loading if the cloud contains too many or too few points
  • max_distance - maximum distance from the sensor; points farther than max_distance are removed
  • dataset_name - USyd / IntensityOxford / Oxford
  • dataset_folder - path to the dataset folder
  • batch_size_limit - depends on the available GPU memory; in our experiments with 10 GB of GPU RAM, the limit was set to 84 for USyd (23k points) and to 256 for IntensityOxford (4096 points)
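
For illustration only, a fragment covering the parameters above might look like the sketch below. The values and the exact key/section layout are examples, not the shipped defaults; consult the config files in config/ for the authoritative format.

num_points = 23000
max_distance = 100
dataset_name = USyd
dataset_folder = ../USyd
batch_size_limit = 84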

Edit the model configuration file (models/minkloc_config.txt):

  • version - MinkLoc3D / MinkLoc3D-I / MinkLoc3D-S / MinkLoc3D-SI
  • mink_quantization_size - desired quantization (IntensityOxford and Oxford coordinates are normalized to [-1, 1], so the quantization parameters need to be adjusted accordingly!):
    • MinkLoc3D/3D-I: qx, qy, qz units: [m, m, m]
    • MinkLoc3D-S/3D-SI: qr, qtheta, qphi units: [m, deg, deg] (see the conversion sketch after this list)
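
To make the spherical units concrete, here is a minimal sketch of a Cartesian-to-spherical conversion consistent with the [m, deg, deg] quantization above. The function name and axis conventions are assumptions for illustration; the repository's own to_spherical may differ (e.g., in dataset-specific handling).

import numpy as np

def to_spherical_sketch(points):
    # points: (N, 4) array with columns x, y, z, intensity
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x ** 2 + y ** 2 + z ** 2)                   # range r [m]
    theta = np.degrees(np.arctan2(y, x))                    # azimuth theta [deg]
    phi = np.degrees(np.arcsin(z / np.maximum(r, 1e-8)))    # elevation phi [deg]
    return np.stack([r, theta, phi, points[:, 3]], axis=1)  # (N, 4): r, theta, phi, intensity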

To train the network, run:

cd training

# To train the desired model on the USyd Dataset
python train.py --config ../config/config_usyd.txt --model_config ../models/minkloc_config.txt

Evaluation

A pre-trained MinkLoc3D-SI model trained on USyd is available in the weights folder. To evaluate it, run the following commands:

cd eval

# To evaluate the model trained on the USyd Dataset
python evaluate.py --config ../config/config_usyd.txt --model_config ../models/minkloc_config.txt --weights ../weights/MinkLoc3D-SI-USyd.pth

License

Our code is released under the MIT License (see LICENSE file for details).

References

  1. J. Komorowski, "MinkLoc3D: Point Cloud Based Large-Scale Place Recognition", Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2021
  2. M. A. Uy and G. H. Lee, "PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition", 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)


minkloc3d-si's Issues

Question about pos and neg mask

Hi, thanks for your great work.
I'm a new learner and got confused by the make_collate_fn part when reading your code:

def make_collate_fn(dataset: OxfordDataset, version, dataset_name, mink_quantization_size=None):
    # set_transform: the transform to be applied to all batch elements
    def collate_fn(data_list):
        # Constructs a batch object
        clouds = [e[0] for e in data_list]
        labels = [e[1] for e in data_list]
        batch = torch.stack(clouds, dim=0)  # Produces (batch_size, n_points, point_dim) tensor
        if dataset.set_transform is not None:
            # Apply the same transformation on all dataset elements
            batch = dataset.set_transform(batch)
        if mink_quantization_size is None:
            # Not a MinkowskiEngine based model
            batch = {'cloud': batch}
        else:
            if version == 'MinkLoc3D':
                coords = [ME.utils.sparse_quantize(coordinates=e, quantization_size=mink_quantization_size)
                          for e in batch]
                coords = ME.utils.batched_coordinates(coords)
                # Assign a dummy feature equal to 1 to each point
                # Coords must be on CPU, features can be on GPU - see MinkowskiEngine documentation
                feats = torch.ones((coords.shape[0], 1), dtype=torch.float32)
            elif version == 'MinkLoc3D-I':
                coords = []
                feats = []
                for e in batch:
                    c, f = ME.utils.sparse_quantize(coordinates=e[:, :3], features=e[:, 3].reshape([-1, 1]),
                                                    quantization_size=mink_quantization_size)
                    coords.append(c)
                    feats.append(f)
                coords = ME.utils.batched_coordinates(coords)
                feats = torch.cat(feats, dim=0)
            elif version == 'MinkLoc3D-S':
                coords = []
                for e in batch:
                    # Convert coordinates to spherical
                    spherical_e = torch.tensor(to_spherical(e.numpy(), dataset_name), dtype=torch.float)
                    c = ME.utils.sparse_quantize(coordinates=spherical_e[:, :3],
                                                 quantization_size=mink_quantization_size)
                    coords.append(c)
                coords = ME.utils.batched_coordinates(coords)
                feats = torch.ones((coords.shape[0], 1), dtype=torch.float32)
            elif version == 'MinkLoc3D-SI':
                coords = []
                feats = []
                for e in batch:
                    # Convert coordinates to spherical
                    spherical_e = torch.tensor(to_spherical(e.numpy(), dataset_name), dtype=torch.float)
                    c, f = ME.utils.sparse_quantize(coordinates=spherical_e[:, :3],
                                                    features=spherical_e[:, 3].reshape([-1, 1]),
                                                    quantization_size=mink_quantization_size)
                    coords.append(c)
                    feats.append(f)
                coords = ME.utils.batched_coordinates(coords)
                feats = torch.cat(feats, dim=0)
            batch = {'coords': coords, 'features': feats}
        # Compute positives and negatives mask
        # dataset.queries[label]['positives'] is bitarray
        positives_mask = [[dataset.queries[label]['positives'][e] for e in labels] for label in labels]
        negatives_mask = [[dataset.queries[label]['negatives'][e] for e in labels] for label in labels]
        positives_mask = torch.tensor(positives_mask)
        negatives_mask = torch.tensor(negatives_mask)
        # Returns (batch_size, n_points, 3) tensor and positives_mask and
        # negatives_mask which are batch_size x batch_size boolean tensors
        return batch, positives_mask, negatives_mask

    return collate_fn

Assuming batch size = 2 and that the training data has the following structure:

{
  "0": {"query": path/to/file/xxx.bin,
        "positives": 1, 116, 117, 345, 346, ...,
        "negatives": 18671, 15181, 8746, 2052, 7919, ...},
  "1": {"query": path/to/file/xxx.bin,
        "positives": 0, 2, 117, 118, 346, ...,
        "negatives": 10283, 20550, 17938, 4424, 8452, ...},
  ...
}

For query 0, labels is [0, 1], so its corresponding positives_mask entries are dataset.queries[0]['positives'][0] and dataset.queries[0]['positives'][1]. Because dataset.queries has been binarized over range(len(index)) and excludes query_idx itself, in this small-batch case positives_mask should be [1, 1] and, similarly, negatives_mask should be [0, 0].

For queries i and i+1, labels is [i, i+1], and we then fetch the i-th and (i+1)-th positive/negative bit-labels. In effect, we always take the next batch_size labels starting from i.
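
To make this concrete, here is a toy sketch of the mask construction for a hypothetical two-element batch (plain dicts stand in for the bitarrays, and all values are made up):

import torch

# Hypothetical binarized queries: queries[a]['positives'][b] == 1 iff cloud b is a positive of cloud a
queries = {
    0: {'positives': {0: 0, 1: 1}, 'negatives': {0: 0, 1: 0}},
    1: {'positives': {0: 1, 1: 0}, 'negatives': {0: 0, 1: 0}},
}
labels = [0, 1]  # labels of the clouds in the batch

positives_mask = torch.tensor([[queries[a]['positives'][b] for b in labels] for a in labels], dtype=torch.bool)
negatives_mask = torch.tensor([[queries[a]['negatives'][b] for b in labels] for a in labels], dtype=torch.bool)

print(positives_mask)  # tensor([[False,  True], [ True, False]])
print(negatives_mask)  # tensor([[False, False], [False, False]])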

  • Is there anything wrong with my understanding?
  • What is the meaning of positives_mask and negatives_mask?
  • Why not feed the positives and negatives into the network and compute the embedding distance between the anchor and them?

Could you please give me some brief explanation?
Thanks in advance.

Reproducing MinkLoc3D-S results on Oxford dataset

Hi @KamilZywanowski,

first of all, thanks a lot for your great work!

I would like to reproduce the results from your paper (Table III) for the MinkLoc3D-S model on the submap data created by the PointNetVLAD authors. Would it be possible for you to provide the configuration file you used for model training? Many thanks in advance!

Cheers,
Mariia

An error in loss.py

Hi, just wanted to point out that when I tried to run the code for MinkLoc3D, I needed to make the following changes.

In loss.py, the mask variables need to be cast from Tensor to uint8 using

mask = (mask > 0).type(torch.uint8) 

Please see my edits below.

def get_max_per_row(mat, mask):
    # -- Cast to uint8 -- #
    mask = (mask > 0).type(torch.uint8)
    # print(type(mask))
    non_zero_rows = torch.any(mask.bool(), dim=1)
    mat_masked = mat.clone()
    mat_masked[~mask] = 0
    return torch.max(mat_masked, dim=1), non_zero_rows


def get_min_per_row(mat, mask):
    # -- Cast to uint8 -- #
    mask = (mask > 0).type(torch.uint8)
    non_inf_rows = torch.any(mask, dim=1)
    mat_masked = mat.clone()
    mat_masked[~mask] = float('inf')
    return torch.min(mat_masked, dim=1), non_inf_rows
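
For reference, a sketch of the same fix using bool masks instead of uint8; recent PyTorch versions prefer bool tensors for mask indexing, and ~ on a bool tensor is an unambiguous logical negation:

import torch

def get_max_per_row(mat, mask):
    # Cast to bool so that indexing and ~ act as logical mask operations
    mask = mask.bool()
    non_zero_rows = torch.any(mask, dim=1)
    mat_masked = mat.clone()
    mat_masked[~mask] = 0
    return torch.max(mat_masked, dim=1), non_zero_rows

def get_min_per_row(mat, mask):
    mask = mask.bool()
    non_inf_rows = torch.any(mask, dim=1)
    mat_masked = mat.clone()
    mat_masked[~mask] = float('inf')
    return torch.min(mat_masked, dim=1), non_inf_rows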

Question about the KITTI dataset

Hi, just wanted to ask which version of the KITTI dataset you are using. Could you share a link to the raw data you used in your paper? Thanks in advance!

Local performance not matching the result in the paper

Hi, I tried training the model on the USyd dataset locally and evaluated it using the trained weights. I trained with the command from your README and evaluated with
python evaluate.py --config ../config/config_usyd.txt --model_config ../models/minkloc_config.txt --weights ../weights/model_MinkFPN_GeM_20220613_1608_final.pth
This is the result I got:
[results screenshot]
It differs quite a lot from the result obtained with your pre-trained weights, which was 98.98.

Could you tell me whether the config files in the config/ and models/ folders are the optimal configuration? Thanks.

Intensity calibration

Hi, it's very nice to see your wonderful work. May I ask whether you performed intensity calibration in your project?
