mlkorra / pointnet

This project forked from kuzand/pointnet

PyTorch implementation of the PointNet with applications to 3D object classification and part segmentation.

License: MIT License

Python 70.75% Jupyter Notebook 29.25%

PointNet

PyTorch implementation of the PointNet [1], a deep neural network that can directly process point-clouds, without using intermediate representations such as voxels or multi-view images, and is suitable for a variety of point-based 3D recognition tasks such as object classification and part segmentation.

Introduction

PointNet is a neural network that can learn to approximate, to arbitrary accuracy, any uniformly continuous, permutation-invariant (symmetric) function f on finite sets of points (point clouds) by decomposing it into:

f({x_1, ..., x_n}) ≈ (g ∘ POOL)({h(x_1), ..., h(x_n)})

where x_i is the i-th point-vector of size C_in, X_in = {x_1, ..., x_n} is a point-cloud of cardinality n, and h: R^{C_in} -> R^{C_out} and g: R^{C_out} -> R^L are continuous functions. POOL is a symmetric pooling operation, such as max-pooling or avg-pooling, that aggregates information from the points and enforces permutation-invariance of the whole function f. Note that pooling operations are applied component-wise.
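To make the decomposition concrete, here is a minimal pure-Python sketch. The functions `h` and `g` are hypothetical hand-written stand-ins for the learned maps (they are not taken from this repository), and `pool_max` is a component-wise max. The point is only that the output does not depend on the order of the points.

```python
import random

def h(x):
    # Hypothetical per-point feature map R^3 -> R^2 (stand-in for a learned MLP).
    return (x[0] + x[1] + x[2], x[0] * x[1] - x[2])

def pool_max(features):
    # Component-wise max over a collection of feature vectors.
    return tuple(max(column) for column in zip(*features))

def g(z):
    # Hypothetical map on the pooled feature, R^2 -> R (stand-in for a learned MLP).
    return 2.0 * z[0] - z[1]

def f(points):
    # f(X) = g(POOL({h(x) for x in X}))
    return g(pool_max([h(x) for x in points]))

points = [(0.0, 1.0, 2.0), (1.0, -1.0, 0.5), (3.0, 0.0, -2.0)]
shuffled = points[:]
random.Random(0).shuffle(shuffled)

# The same value is obtained for any ordering of the points.
print(f(points), f(shuffled))
```

Because max is taken per component over the whole set, reordering the inputs only reorders the arguments of `max`, which leaves the pooled vector unchanged.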

The continuous functions g and h can be approximated by multi-layer perceptrons (MLPs), i.e. combinations of fully-connected (FC) layers followed by non-linearities. Since the MLP for the function h acts on all the points of a point-cloud identically and independently, we call it PointMLP:

PointMLP({x_1, ..., x_n}) = {MLP(x_1), ..., MLP(x_n)}.

Note that PointMLP is permutation-equivariant by construction.
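A minimal PyTorch sketch of a PointMLP follows. It assumes point-clouds are stored as tensors of shape (batch, n_points, C_in) and uses `nn.Linear`, which acts on the last dimension and therefore applies the same weights to every point. The class name, layer sizes, and tensor layout are illustrative assumptions and may differ from the code in this repository.

```python
import torch
import torch.nn as nn

class PointMLP(nn.Module):
    """Applies one shared MLP to every point independently (illustrative sketch)."""

    def __init__(self, channels):
        super().__init__()
        layers = []
        for c_in, c_out in zip(channels[:-1], channels[1:]):
            # nn.Linear acts on the last dimension, so weights are shared across points.
            layers += [nn.Linear(c_in, c_out), nn.ReLU()]
        self.mlp = nn.Sequential(*layers)

    def forward(self, x):
        # x: (batch, n_points, C_in) -> (batch, n_points, C_out)
        return self.mlp(x)

torch.manual_seed(0)
point_mlp = PointMLP([3, 16, 32])   # hypothetical layer sizes
x = torch.rand(2, 5, 3)             # 2 clouds, 5 points each, 3 coordinates
perm = torch.randperm(5)

out = point_mlp(x)
out_perm = point_mlp(x[:, perm])

# Equivariance: permuting the input points permutes the outputs in the same way.
print(torch.allclose(out[:, perm], out_perm, atol=1e-6))
```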

The vanilla PointNet can thus be written as:

f(X_in) ≈ MLP(MAX(PointMLP(X_in))).
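The formula above can be sketched in a few lines of PyTorch. This is a simplified illustration under assumed layer sizes and an assumed (batch, n_points, C_in) tensor layout, not the architecture implemented in this repository; the class name `VanillaPointNet` and all default values are hypothetical.

```python
import torch
import torch.nn as nn

class VanillaPointNet(nn.Module):
    """f(X) ≈ MLP(MAX(PointMLP(X))) with hypothetical layer sizes."""

    def __init__(self, c_in=3, c_out=64, num_outputs=10):
        super().__init__()
        # Shared per-point MLP (PointMLP): nn.Linear acts on the last dimension.
        self.point_mlp = nn.Sequential(
            nn.Linear(c_in, 32), nn.ReLU(),
            nn.Linear(32, c_out), nn.ReLU(),
        )
        # MLP applied to the pooled (global) feature vector.
        self.global_mlp = nn.Sequential(
            nn.Linear(c_out, 32), nn.ReLU(),
            nn.Linear(32, num_outputs),
        )

    def forward(self, x):
        features = self.point_mlp(x)          # (batch, n_points, c_out)
        pooled = features.max(dim=1).values   # component-wise max over points: (batch, c_out)
        return self.global_mlp(pooled)        # (batch, num_outputs)

torch.manual_seed(0)
model = VanillaPointNet()
x = torch.rand(4, 100, 3)
perm = torch.randperm(100)

y = model(x)
y_perm = model(x[:, perm])

# Invariance: shuffling the points of each cloud leaves the output unchanged.
print(torch.allclose(y, y_perm, atol=1e-5))
```

The max over `dim=1` is what removes the dependence on point order; everything before it is equivariant and everything after it sees only the pooled vector.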

We note that for a PointNet to be able to approximate any continuous set function arbitrarily well, its PointMLP must have a sufficiently large number of output-layer neurons (typically C_out >= n).

In addition to permutation-invariance, it is desirable to have invariance to certain geometric transformations (e.g. rigid transformations) of the point-clouds. TNets...

Architectures

PointNet for classification

PointNet for part segmentation

Note that this network is permutation-equivariant, unlike the PointNet for classification, which is permutation-invariant.
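The equivariance of a segmentation-style network can be checked with a small PyTorch sketch. It follows the general idea described in [1] of concatenating each point's local feature with the max-pooled global feature before predicting per-point scores. The class name, layer sizes, and number of parts are hypothetical, and the actual architecture in this repository is more elaborate.

```python
import torch
import torch.nn as nn

class PointNetSegSketch(nn.Module):
    """Per-point prediction from local + global features (illustrative sketch)."""

    def __init__(self, c_in=3, c_feat=32, num_parts=4):
        super().__init__()
        self.point_mlp = nn.Sequential(nn.Linear(c_in, c_feat), nn.ReLU())
        self.head = nn.Sequential(
            nn.Linear(2 * c_feat, 32), nn.ReLU(),
            nn.Linear(32, num_parts),
        )

    def forward(self, x):
        local = self.point_mlp(x)                             # (batch, n, c_feat)
        global_feat = local.max(dim=1, keepdim=True).values   # (batch, 1, c_feat)
        # Repeat the global feature for every point and concatenate it.
        global_feat = global_feat.expand(-1, local.shape[1], -1)
        return self.head(torch.cat([local, global_feat], dim=-1))  # (batch, n, num_parts)

torch.manual_seed(0)
seg = PointNetSegSketch()
x = torch.rand(2, 50, 3)
perm = torch.randperm(50)

scores = seg(x)
scores_perm = seg(x[:, perm])

# Equivariance: per-point scores follow the permutation of the input points.
print(torch.allclose(scores[:, perm], scores_perm, atol=1e-5))
```

The global feature is invariant to point order, while the local features and the shared head are applied per point, so the per-point outputs are permuted together with the inputs.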

Dependencies

  • PyTorch (1.10.1)
  • NumPy (1.18.5)
  • Matplotlib (3.4.1)
  • Trimesh (3.9.32)
  • h5py (3.1.0)

The versions in parentheses are the ones that were used for creating and testing the code.

Usage

To train the PointNet for 3D object classification on the ModelNet dataset:

cd applications/classification
python train_clf.py --hdf5_path "dataset.hdf5" --device "cuda" --batch_size_train 32 --num_epochs 100 ...

Similarly for evaluation and inference:

python eval_clf.py ...
python infere_clf.py ...

See arg_parser.py for all the possible command-line arguments and run_clf.ipynb for an example.
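For orientation, the four flags shown in the training command above could be declared with `argparse` roughly as follows. This is only a sketch: the helper name `build_parser`, the defaults, and the help strings are assumptions, and the real arg_parser.py defines more options and may differ in detail.

```python
import argparse

def build_parser():
    # Hypothetical parser covering only the flags shown in the command above.
    parser = argparse.ArgumentParser(description="PointNet classification training (sketch)")
    parser.add_argument("--hdf5_path", type=str, required=True,
                        help="path to the HDF5 dataset file")
    parser.add_argument("--device", type=str, default="cpu",
                        help='device string, e.g. "cpu" or "cuda"')
    parser.add_argument("--batch_size_train", type=int, default=32,
                        help="training batch size (assumed default)")
    parser.add_argument("--num_epochs", type=int, default=100,
                        help="number of training epochs (assumed default)")
    return parser

# Parse an explicit argument list instead of sys.argv, for illustration.
args = build_parser().parse_args([
    "--hdf5_path", "dataset.hdf5",
    "--device", "cuda",
    "--batch_size_train", "32",
    "--num_epochs", "100",
])
print(args.hdf5_path, args.device, args.batch_size_train, args.num_epochs)
```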

References

[1] PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation, C. Qi et al., 2016

[2] Original TensorFlow implementation of the PointNet

Contributors

kuzand
