Monte-Carlo Scene Search for 3D Scene Understanding (demo)

This repo contains visualizations of the scene understanding results of the Monte-Carlo Scene Search (MCSS) method on the ScanNet dataset. MCSS estimates the scene layout and retrieves object models and their poses from an RGB-D scan of the scene.

The MCSS method uses Monte-Carlo Tree Search (MCTS) to optimally select, from a pool of layout component and object proposals, the set that best explains the RGB-D data. More specifically, we first generate multiple proposals for each wall and object in the scene from the point cloud of the scene. We then adapt MCTS to select a set of wall and object proposals from this pool, relying on a render-and-compare technique. Our method recovers fine details of complex scene layouts and retrieves objects and their poses even in cluttered scenes. Our quantitative evaluation shows that MCSS outperforms previous methods on the layout estimation and object retrieval tasks on the ScanNet dataset.
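
To give a feel for the tree-search step, here is a minimal sketch of the generic UCB1 rule that standard MCTS uses to decide which child (here, which candidate proposal) to try next. It is not the adapted selection rule used by MCSS, and the reward is only a placeholder for a render-and-compare score. The function name, the (visits, total reward) representation and the exploration constant are assumptions made for illustration.

```python
import math


def ucb1_select(children, exploration=1.4):
    """Return the index of the child with the highest UCB1 score.

    children: list of (visit_count, total_reward) tuples, one per candidate
    proposal (hypothetical representation, not the MCSS data structure).
    Returns -1 if the list is empty.
    """
    total_visits = sum(visits for visits, _ in children)
    best_index, best_score = -1, -math.inf
    for index, (visits, total_reward) in enumerate(children):
        if visits == 0:
            # Unvisited proposals are tried before any exploitation.
            return index
        mean_reward = total_reward / visits
        bonus = exploration * math.sqrt(math.log(total_visits) / visits)
        score = mean_reward + bonus
        if score > best_score:
            best_index, best_score = index, score
    return best_index


# Usage: the second proposal has the higher mean reward at equal visit counts.
chosen = ucb1_select([(10, 6.0), (10, 9.0)])
```

The exploration bonus shrinks as a proposal is visited more often, so rarely tried proposals still get sampled occasionally.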

In this repo, we provide scripts to visualize the results from our method. We also provide evaluation scripts to reproduce the metrics reported in our paper.

Requirements

Clone the repo, then create and activate the virtual environment with the Python dependencies:

conda env create --file=environment.yml
conda activate mcss_demo
mkdir outputs
  • Download the ShapeNetV2 dataset by signing up on the website. Extract the models to $SHAPENET_DIR.

  • Download the Scan2CAD dataset by signing up on their webpage. This is required only if you also run the evaluation scripts. Extract the zip file and let $SCAN2CAD be the path to full_annotations.json.

  • Download the MCSS results from here and extract them to the outputs folder. Finally, your repo directory should contain the following folder structure:

-MCSS_DEMO
    -assets
    -monte_carlo_model_search
    -outputs
        -scans
            -scene0011_00
            -scene0015_00
            .
            .
            

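Before running the demo, it can help to confirm that the results were extracted to the expected location. The helper below is a small sketch, not part of the repo: the function name is hypothetical, and it only assumes the outputs/scans/<sceneID> layout shown above.

```python
from pathlib import Path


def find_scene_dirs(repo_root):
    """Return sorted scene folder names found under <repo_root>/outputs/scans."""
    scans_dir = Path(repo_root) / "outputs" / "scans"
    if not scans_dir.is_dir():
        raise FileNotFoundError(f"Expected results folder not found: {scans_dir}")
    return sorted(
        entry.name
        for entry in scans_dir.iterdir()
        if entry.is_dir() and entry.name.startswith("scene")
    )


# Usage (from the repo root):
#   scenes = find_scene_dirs(".")
#   print(len(scenes), scenes[:3])
```
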
Run Visualization

To visualize the outputs of MCSS on random ScanNet validation scenes, run the following script:

python demo.py --shapenet_dir $SHAPENET_DIR

The above script first downloads the ScanNet scene from the official ScanNet server and then opens an Open3D visualizer. Press 'q' to visualize a different scene. To visualize a particular scene, provide its scene ID:

python demo.py --shapenet_dir $SHAPENET_DIR --scene <sceneID>

Note that we provide MCSS results for only 126 validation scenes for object evaluation and 64 validation scenes for room layout evaluation (these are the Scan2CAD and SceneCAD scenes whose scene IDs end with '_00'). Further, our method considers four main object categories: chair, table, sofa and bed. Please read the paper for more details.
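
As a small illustration of the '_00' convention, the sketch below filters a list of scene IDs down to those ending in '_00'. The function name is hypothetical, and the pattern assumes the usual ScanNet naming of the form scene0011_00. Ending in '_00' is necessary but not sufficient: results are provided only for the subsets described above.

```python
import re

# Assumed ScanNet naming: "scene" + 4-digit space id + "_" + 2-digit scan index.
SCENE_ID_PATTERN = re.compile(r"^scene\d{4}_\d{2}$")


def first_scan_scenes(scene_ids):
    """Keep only well-formed scene IDs whose scan index is '00'."""
    return [
        scene_id
        for scene_id in scene_ids
        if SCENE_ID_PATTERN.match(scene_id) and scene_id.endswith("_00")
    ]


candidates = ["scene0011_00", "scene0011_01", "scene0015_00", "readme.txt"]
selected = first_scan_scenes(candidates)
```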

Run MCSS Evaluation for Objects

We compare the accuracy of our method with a challenging baseline and with other methods that retrieve objects and estimate their poses from an RGB-D scene. We use standard evaluation metrics and provide scripts to replicate the numbers reported in the paper. Run the eval.py commands given in this section to obtain the average precision/recall and the average chamfer distance of the MCSS-retrieved objects on the 126 ScanNet validation scenes. Again, these are the scenes whose scene IDs end with '_00'.

We evaluate object model retrieval and pose estimation accuracy by comparing against the Scan2CAD dataset, which contains manually annotated object models and poses for the ScanNet scenes.
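
For readers unfamiliar with the chamfer distance mentioned above, here is a minimal pure-Python sketch of one common symmetric definition: the mean nearest-neighbor distance from set A to set B plus the same from B to A. This is an assumption for illustration only. The evaluation script may use a different variant (for example squared distances or a different normalization), and the function name is hypothetical.

```python
import math


def chamfer_distance(points_a, points_b):
    """Symmetric chamfer distance between two non-empty 3D point sets.

    Brute-force O(len(a) * len(b)); fine for small illustrative inputs only.
    """
    def mean_nearest(source, target):
        # Average, over source points, of the distance to the closest target point.
        return sum(min(math.dist(p, q) for q in target) for p in source) / len(source)

    return mean_nearest(points_a, points_b) + mean_nearest(points_b, points_a)


set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
distance = chamfer_distance(set_a, set_b)
```

If the point coordinates are in meters, the result is in meters as well.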

python eval.py --download_scenes
python eval.py --scan2cad $SCAN2CAD --shapenet_dir $SHAPENET_DIR

The first command downloads all the ScanNet scenes (this takes a few minutes). The second command computes the metrics by comparing against the Scan2CAD annotations. The outputs are written to the outputs/evalAllScenesMCTS_testrun_0.500000IOU folder. The following files are important:

  • catAP.json - Contains average precision for all categories
  • catAR.json - Contains average recall for all categories
  • mctsChamferDistCat.json - Contains chamfer distance (in meters) of the MCSS retrieved models for all categories
  • s2cChamferDistCat.json - Contains chamfer distance (in meters) of the Scan2CAD annotation models for all categories
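
To inspect these files after an evaluation run, a short loader like the following may be handy. It is a sketch that relies only on the folder and file names listed above; the internal structure of each JSON file is not assumed, and the function name is hypothetical.

```python
import json
from pathlib import Path

EVAL_DIR = Path("outputs/evalAllScenesMCTS_testrun_0.500000IOU")
METRIC_FILES = [
    "catAP.json",
    "catAR.json",
    "mctsChamferDistCat.json",
    "s2cChamferDistCat.json",
]


def load_metrics(eval_dir=EVAL_DIR):
    """Load whichever of the four metric files exist, keyed by file name."""
    metrics = {}
    for name in METRIC_FILES:
        path = Path(eval_dir) / name
        if path.is_file():
            with open(path) as handle:
                metrics[name] = json.load(handle)
    return metrics


# Usage (from the repo root, after running eval.py):
#   for name, content in load_metrics().items():
#       print(name, content)
```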

Run MCSS Evaluation for Room Layouts

We evaluate the precision and recall of the detected room corners and the IOU of the detected room layout polygons. Here you can find our room layout annotations, refined from the original SceneCAD annotations. Please extract the annotations to $LAYOUT_LABELS. The *.json files contain layout polygon instances; every instance is a list of 3D polygon vertices. Then run the evaluation with the following script:

python eval_layouts.py --annotations_path $LAYOUT_LABELS --solutions_path outputs/scans/
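
To illustrate what corner precision and recall measure, here is a rough sketch that greedily matches each predicted corner to the nearest unmatched ground-truth corner within a distance threshold. The greedy one-to-one matching, the threshold parameter and the function name are all assumptions for illustration; eval_layouts.py may match corners differently.

```python
import math


def corner_precision_recall(predicted, ground_truth, max_dist):
    """Return (precision, recall) for 3D corners under greedy matching.

    A predicted corner counts as a true positive if an unmatched ground-truth
    corner lies within max_dist of it (same unit as the coordinates).
    """
    unmatched = list(ground_truth)
    true_positives = 0
    for corner in predicted:
        if not unmatched:
            break
        nearest = min(unmatched, key=lambda gt: math.dist(corner, gt))
        if math.dist(corner, nearest) <= max_dist:
            unmatched.remove(nearest)  # each ground-truth corner is used once
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


predicted_corners = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 0.0)]
annotated_corners = [(0.1, 0.0, 0.0), (1.0, 0.1, 0.0)]
precision, recall = corner_precision_recall(predicted_corners, annotated_corners, max_dist=0.2)
```

In this toy case two of the three predicted corners have a nearby annotated corner, and both annotated corners are found.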

Citation

If you find this work useful for your publication, please consider citing us:

@INPROCEEDINGS{mcss2021,
title={Monte Carlo Scene Search for 3D Scene Understanding},
author={Shreyas Hampali and Sinisa Stekovic and Sayan Deb Sarkar and Chetan Srinivasa Kumar and Friedrich Fraundorfer and Vincent Lepetit},
booktitle = {CVPR},
year = {2021}
}
