
3D scene graph generator implemented in PyTorch.

computer-vision deep-learning deeplearning scenegraph scene-graph robotics robot intelligence 3d-scene-graph 3d-models

3-d-scene-graph's Introduction

3D-Scene-Graph: A Sparse and Semantic Representation of Physical Environments for Intelligent Agents

This work is based on our paper (accepted to IEEE Transactions on Cybernetics, 2019). We propose a new concept called the 3D scene graph, together with a framework for constructing it. Our implementation builds on FactorizableNet and is written in PyTorch.

3D Scene Graph Construction Framework

The proposed 3D scene graph construction framework extracts relevant semantics within environments, such as object categories and relations between objects, as well as physical attributes, such as 3D positions and major colors, while generating 3D scene graphs for the given environments. The framework receives a sequence of observations of the environment in the form of RGB-D image frames. For robust performance, it filters out unstable observations (i.e., blurry images) using the proposed adaptive blurry-image detection algorithm. It then selects keyframe groups, which contain reasonably overlapping frames, to avoid redundant processing of the same information. Next, it extracts semantics and physical attributes from the environment through recognition modules; during recognition, spurious detections are rejected and missing entities are supplemented.
Finally, the gathered information is fused into the 3D scene graph, and the graph is updated upon new observations.
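
As an illustration of the blurry-image filtering step, the following is a minimal sketch of adaptive blur detection. It assumes the widely used variance-of-Laplacian sharpness measure and an exponentially weighted running threshold driven by alpha, gain, and offset parameters (mirroring the --alpha, --gain, and --offset options listed under "Core hyper-parameters" below); the exact algorithm used in this repository may differ.

import cv2


class BlurFilter(object):
    """Hypothetical sketch of adaptive blurry-image detection.

    Sharpness is measured as the variance of the Laplacian; a frame is
    kept only if its sharpness exceeds an adaptive threshold derived from
    an exponentially weighted average of past sharpness values.
    """

    def __init__(self, alpha=0.4, gain=25.0, offset=1.0):
        self.alpha = alpha      # weight for the exponentially weighted average
        self.gain = gain        # scales the running average into a threshold
        self.offset = offset    # additive margin on top of the threshold
        self.running_avg = None

    def is_sharp(self, bgr_image):
        gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
        sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()

        # update the exponentially weighted average of sharpness
        if self.running_avg is None:
            self.running_avg = sharpness
        else:
            self.running_avg = (self.alpha * sharpness
                                + (1 - self.alpha) * self.running_avg)

        # a frame counts as blurry only if it is far less sharp than recent frames
        threshold = self.running_avg / self.gain + self.offset
        return sharpness >= threshold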

Requirements

Installation

Install PyTorch 0.3.1. The code has been tested only with Python 2.7 and CUDA 9.0 on Ubuntu 16.04. Running in a different environment (Python 3+ or PyTorch 0.4+) requires substantial code modifications.

  1. Download the 3D-Scene-Graph repository
git clone --recurse-submodules https://github.com/Uehwan/3D-Scene-Graph.git
  2. Install FactorizableNet
cd 3D-Scene-Graph/FactorizableNet

Please follow the installation instructions in the FactorizableNet repository: follow steps 1 through 6, skip step 7, and download VG-DR-Net in step 8. You do not need to download the other models.

  3. Install 3D-Scene-Graph
cd 3D-Scene-Graph
touch FactorizableNet/__init__.py
ln -s ./FactorizableNet/options/ options
mkdir data
ln -s ./FactorizableNet/data/svg data/svg
ln -s ./FactorizableNet/data/visual_genome data/visual_genome
   
pip install torchtext==0.2.3
pip install setuptools pyyaml graphviz webcolors pandas matplotlib 
pip install git+https://github.com/chickenbestlover/ColorHistogram.git

Alternatively, use the installation script:

   ./build.sh
  4. Download the ScanNet dataset

In order to use the ScanNet dataset, you need to fill out an agreement to the ScanNet Terms of Use and send it to the ScanNet team at [email protected]. Once the request is approved, they will send you a script for downloading the ScanNet dataset.

To download a specific scan (e.g. scene0000_00) using the script (the script only runs on Python 2.7):

download-scannet.py -o [directory in which to download] --id scene0000_00
(then press Enter twice)

After the download is finished, the scan is located in a new folder named scene0000_00. In this folder, the *.sens file contains the RGB-D video along with the camera poses. To extract these, we use SensReader, an extraction tool provided by the ScanNet git repository.

git clone https://github.com/ScanNet/ScanNet.git
cd ScanNet/SensReader/python/
python reader.py \
   --filename [your .sens filepath]  \
   --output_path [ROOT of 3D-Scene-Graph]/data/scene0000_00/ \
   --export_depth_images \
   --export_color_images \
   --export_poses \
   --export_intrinsics
    

Example usage

python scene_graph_tuning.py \
  --scannet_path data/scene0000_00/ \
  --obj_thres 0.23 \
  --thres_key 0.2 \
  --thres_anchor 0.68 \
  --visualize \
  --frame_start 800 \
  --plot_graph \
  --disable_spurious \
  --gain 10 \
  --detect_cnt_thres 2 \
  --triplet_thres 0.065

Core hyper-parameters

Data settings:

  • --dataset : dataset to use, default='scannet'.
  • --scannet_path : path to the ScanNet scan, default='./data/scene0507/'.
  • --frame_start : index of the first frame to process, default=0.
  • --frame_end : index of the last frame to process, default=5000.

FactorizableNet Output Filtering Settings:

  • --obj_thres : object recognition threshold score, default=0.25 (see the filtering sketch below).
  • --triplet_thres : triplet recognition threshold score, default=0.08.
  • --nms : NMS threshold for post-processing object detections (negative disables NMS), default=0.2.
  • --triplet_nms : NMS threshold for post-processing triplets (negative disables NMS), default=0.4.
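
To make the role of these thresholds concrete, here is a minimal, hypothetical sketch of score-based filtering of FactorizableNet outputs; the repository's actual filtering additionally applies NMS and spurious-detection rejection.

def filter_detections(obj_scores, triplet_scores, obj_thres=0.25, triplet_thres=0.08):
    """Keep only objects and triplets whose confidence exceeds the thresholds.

    Returns the indices of the detections that survive filtering.
    """
    kept_objects = [i for i, score in enumerate(obj_scores) if score >= obj_thres]
    kept_triplets = [i for i, score in enumerate(triplet_scores) if score >= triplet_thres]
    return kept_objects, kept_triplets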

Key-frame Extraction Settings:

  • --thres_key : keyframe threshold score, default=0.1 (see the sketch below).
  • --thres_anchor : anchor-frame threshold score, default=0.65.
  • --alpha : weight for the exponentially weighted summation, default=0.4.
  • --gain : gain for adaptive thresholding in blurry-image detection, default=25.
  • --offset : offset for adaptive thresholding in blurry-image detection, default=1.
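
The keyframe and anchor-frame thresholds can be read as follows. In a hypothetical sketch (the actual scoring used in this repository may differ), each incoming frame is scored by how much it overlaps with the current anchor frame, and that score is compared against thres_key and thres_anchor:

def classify_frame(overlap_score, thres_key=0.1, thres_anchor=0.65):
    """Hypothetical decision rule for keyframe-group extraction.

    Low overlap with the current anchor frame starts a new keyframe group,
    moderate overlap promotes the frame to a new anchor frame, and high
    overlap marks the frame as redundant so it can be skipped.
    """
    if overlap_score < thres_key:
        return 'new_keyframe_group'
    elif overlap_score < thres_anchor:
        return 'new_anchor_frame'
    return 'redundant'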

Visualization Settings:

  • --pause_time : pause interval (sec) after every detection, default=1.
  • --plot_graph : plot the 3D scene graph if true.
  • --visualize : enable visualization if true.
  • --format : resulting image format, pdf or png, default='png'.
  • --draw_color : draw color nodes in the 3D scene graph if true.
  • --save_image : save detection result images if true.
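
For reference, the options listed above could be declared with argparse roughly as follows. This is an illustrative subset using the documented defaults, not the repository's actual option parser:

import argparse

parser = argparse.ArgumentParser(description='3D scene graph tuning (illustrative subset)')
parser.add_argument('--dataset', type=str, default='scannet')
parser.add_argument('--scannet_path', type=str, default='./data/scene0507/')
parser.add_argument('--frame_start', type=int, default=0)
parser.add_argument('--frame_end', type=int, default=5000)
parser.add_argument('--obj_thres', type=float, default=0.25)
parser.add_argument('--triplet_thres', type=float, default=0.08)
parser.add_argument('--thres_key', type=float, default=0.1)
parser.add_argument('--thres_anchor', type=float, default=0.65)
parser.add_argument('--plot_graph', action='store_true')
parser.add_argument('--visualize', action='store_true')
args = parser.parse_args()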

Result

[Result figures]

Demo Video

[Demo video]

Citations

Please consider citing this project in your publications if it helps your research. The following is a BibTeX reference.

@article{kim2019graph3d,
  title={3D-Scene-Graph: A Sparse and Semantic Representation of Physical Environments for Intelligent Agents},
  author={Kim, Ue-Hwan and Park, Jin-Man and Song, Taek-Jin and Kim, Jong-Hwan},
  journal={IEEE Transactions on Cybernetics},
  year={2019}
}

Acknowledgement

This work was supported by the ICT R&D program of MSIP/IITP. [2016-0-00563, Research on Adaptive Machine Learning Technology Development for Intelligent Autonomous Digital Companion]

3-d-scene-graph's People

Contributors

taekjinsong, uehwan, yikang-li


3-d-scene-graph's Issues

Cannot generate scene graph for visualization

When I run scene_graph_tuning.py with the args shown in the README, the scene graph does not appear in the '3D scene graph' window. Is there any code or argument I need to modify?
Also, what does the arg detect_cnt_thres (default 2) mean? I find that node_feature.ix[node_num]['detection_cnt'] is always 1, so the following line in vis_tuning.py always executes:
if node_feature.ix[node_num]['detection_cnt'] < cnt_thres: continue

Update?

Would you please:

  • port the whole thing to Python 3, and
  • rewrite it using the current PyTorch (1.0.1 at the time of this post)?

Bug reports and remedies

Thanks for your great work. I would like to report some bugs here, along with the fixes I applied.

  1. In the lines
     obj, sub = rel['object']['name'], rel['subject']['name']
     and
     pred, obj, sub = rel['predicate'], rel['object']['name'], rel['subject']['name']
     the key 'name' may not appear in rel['subject'] or rel['object']. I think this is caused by the updated raw relationships.json. You can replace them with the following code:

if 'name' in rel['object']:
    obj = rel['object']['name']
else:
    obj = rel['object']['names'][0]

if 'name' in rel['subject']:
    sub = rel['subject']['name']
else:
    sub = rel['subject']['names'][0]
  2. In the line
     self.intrinsic_depth = np.array(intrinsic_depth[:3,:3])
     intrinsic_depth is a list containing strings rather than float numbers. You can add the following code beforehand:

tmp_intrinsic_depth = []
for row in intrinsic_depth:
    row = [float(r) for r in row]  # convert each entry from string to float
    tmp_intrinsic_depth.append(row)
intrinsic_depth = tmp_intrinsic_depth
  3. In the line
     img_obj_detected = tools_for_visualizing.vis_object_detection(image_scene.copy(), test_set, obj_cls[:, 0], obj_boxes, obj_scores[:, 0])
     you should first create a class instance:

toolforvis = tools_for_visualizing()
img_obj_detected = toolforvis.vis_object_detection(image_scene.copy(), test_set, obj_cls[:, 0], obj_boxes, obj_scores[:, 0])

RuntimeError: CUDNN_STATUS_EXECUTION_FAILED

Hi there,
While trying to run the model, when it reaches line 145 in scene_graph_tuning.py it returns the error below. Do you know what the reason could be and how I can solve it?

Thanks

Traceback (most recent call last):
File "scene_graph_tuning.py", line 153, in
object_result, predicate_result = model.forward_eval(im_data, im_info, )
File "/work/home/akhaghighat/3D-Scene-Graph/model/SGGenModel.py", line 92, in forward_eval
features, object_rois = self.rpn(im_data, im_info)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "./FactorizableNet/models/RPN/RPN.py", line 96, in forward
features = self.features(im_data)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/container.py", line 67, in forward
input = module(input)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/modules/conv.py", line 282, in forward
self.padding, self.dilation, self.groups)
File "/usr/local/lib/python2.7/dist-packages/torch/nn/functional.py", line 90, in conv2d
return f(input, weight, bias)
RuntimeError: CUDNN_STATUS_EXECUTION_FAILED

vis_tuning.py

  1. Need to group relevant functions into classes
  2. Remove the leftover "codes for tests" comments
