
hggd's Introduction

Efficient Heatmap-Guided 6-Dof Grasp Detection in Cluttered Scenes

RA-L 2023

Official code for the paper "Efficient Heatmap-Guided 6-Dof Grasp Detection in Cluttered Scenes"

Framework

(framework overview figure)

Requirements

  • Python >= 3.8
  • PyTorch >= 1.10
  • pytorch3d
  • numpy==1.23.5
  • pandas
  • cupoch
  • numba
  • grasp_nms
  • matplotlib
  • open3d
  • opencv-python
  • scikit-image
  • tensorboardX
  • torchsummary
  • tqdm
  • transforms3d
  • trimesh
  • autolab_core
  • cvxopt

Installation

This code has been tested on Ubuntu 20.04 with CUDA 11.1/11.3/11.6, Python 3.8/3.9, and PyTorch 1.11.0/1.12.0.

Get the code.

git clone https://github.com/THU-VCLab/HGGD.git

Create and activate a new Conda environment.

conda create -n hggd python=3.8
conda activate hggd
cd HGGD

Please install PyTorch and pytorch3d manually; the versions must match your CUDA toolkit.

# pytorch-1.11.0
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
# pytorch3d (the wheel index below is built for py38 + cu113 + pytorch 1.11.0 and must match the versions installed above)
pip install fvcore
pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py38_cu113_pyt1110/download.html

Install other packages via Pip.

pip install -r requirements.txt
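
Optionally, you can sanity-check the environment before moving on. The one-liner below simply prints the installed PyTorch, CUDA, and pytorch3d versions; it is only a quick check and not part of the official setup.

# prints torch version, CUDA version used to build torch, and pytorch3d version
python -c "import torch, pytorch3d; print(torch.__version__, torch.version.cuda, pytorch3d.__version__)"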

Usage

Checkpoint

Checkpoints (realsense/kinect) can be downloaded from Tsinghua Cloud

Preprocessed Dataset

Preprocessed datasets (realsense.7z/kinect.7z) can be downloaded from Tsinghua Cloud

They contain converted and refined grasp poses for each image in the GraspNet dataset.

Train

Training code has been released; please refer to the training script.

Typical hyperparameters:

batch-size # batch size, default: 4
step-cnt # step number for gradient accumulation, actual_batch_size = batch_size * step_cnt, default: 2
lr # learning rate, default: 1e-2
anchor-num # spatial rotation anchor number, default: 7
anchor-k # in-plane rotation anchor number, default: 6
anchor-w # grasp width anchor size, default: 50
anchor-z # grasp depth anchor size, default: 20
all-points-num # point cloud downsample number, default: 25600
group-num # number of points sampled per local region (farthest point sampling), default: 512
center-num # sampled local center/region number, default: 128
noise # point cloud noise scale, default: 0
ratio # grasp attributes prediction downsample ratio, default: 8
grid-size # grid size for our grid-based center sampling, default: 8
scene-l & scene-r # scene range, train: 0~100, seen: 100~130, similar: 130~160, novel: 160~190
input-w & input-h # downsampled input image size, should be 640x360
loc-a & reg-b & cls-c & offset-d # loss multipliers, default: 1, 5, 1, 1
epochs # training epoch number, default: 15
num-workers # dataloader worker number, default: 4
save-freq # checkpoint saving frequency, default: 1
optim # optimizer, default: 'adamw'
dataset-path # our preprocessed dataset path (read grasp poses)
scene-path  # original graspnet dataset path (read images)
joint-trainning # whether to jointly train the two parts of the network (note: the flag name keeps the "trainning" typo as it appears in the code)
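
For reference, a training run with the defaults above might be launched as sketched below. The entry-point name (train_graspnet.py) and the exact flag spellings are assumptions inferred from the flag list; the released training script is the authoritative command.

# NOTE: script name and flag spellings assumed; see the repository's training script for the actual command
# effective batch size = batch-size * step-cnt = 4 * 2 = 8
python train_graspnet.py \
    --batch-size 4 --step-cnt 2 --lr 1e-2 --optim adamw --epochs 15 \
    --anchor-num 7 --anchor-k 6 --anchor-w 50 --anchor-z 20 \
    --all-points-num 25600 --center-num 128 --group-num 512 \
    --grid-size 8 --ratio 8 --scene-l 0 --scene-r 100 \
    --dataset-path /path/to/preprocessed_dataset \
    --scene-path /path/to/graspnet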

Test

Download and unzip our preprocessed datasets (for convenience). Alternatively, you can remove the unnecessary parts of our test code and read images directly from the original graspnet dataset API.

Run the test code (reads RGB and depth images from the GraspNet dataset and evaluates grasps).

bash test_graspnet.sh

Attention: if you want to change the camera, remember to also change the camera setting in config.py.

Typical hyperparameters:

center-num # sampled local center/region number; a higher number yields more regions and grasps but slower speed, default: 48
grid-size # grid size for our grid-based center sampling, higher number means sparser centers, default: 8
ratio # grasp attributes prediction downsample ratio, default: 8
anchor-k # classification anchor number for grasp in-plane rotation, default: 6
anchor-w # regression anchor size for grasp width, default: 50
anchor-z # regression anchor size for grasp depth, default: 20
all-points-num # downsampled point cloud point number, default: 25600
group-num # local region point cloud point number, default: 512
local-k # grasp detection number in each local region, default: 10
scene-l & scene-r # scene range, train: 0~100, seen: 100~130, similar: 130~160, novel: 160~190
input-h & input-w # downsampled input image size, should be 640x360 (width x height)
local-thres & heatmap-thres # heatmap and grasp score filter threshold, set to 0.01 in our settings
dataset-path # our preprocessed dataset path (read grasp poses)
scene-path # original graspnet dataset path (read images)
num-workers # eval worker number
dump-dir # detected grasp poses dumped path (used in later evaluation)
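
As a concrete example, evaluating the 30 seen scenes with the defaults above might look like the sketch below. The flag spellings are assumed from the list; test_graspnet.sh in the repository is the authoritative command, and the checkpoint argument is omitted here because its flag is not listed above.

# NOTE: flag spellings assumed from the list above; see test_graspnet.sh for the actual arguments
python test_graspnet.py \
    --center-num 48 --grid-size 8 --ratio 8 \
    --anchor-k 6 --anchor-w 50 --anchor-z 20 \
    --all-points-num 25600 --group-num 512 --local-k 10 \
    --heatmap-thres 0.01 --local-thres 0.01 \
    --scene-l 100 --scene-r 130 \
    --dataset-path /path/to/preprocessed_dataset \
    --scene-path /path/to/graspnet \
    --dump-dir ./pred_grasps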

Demo

Run the demo code (reads RGB and depth images from file and predicts grasps).

bash demo.sh

Typical hyperparameters:

center-num # sampled local center/region number; a higher number yields more regions and grasps but slower speed, default: 48
grid-size # grid size for our grid-based center sampling, higher number means sparser centers, default: 8
all-points-num # downsampled point cloud point number, default: 25600
group-num # local region point cloud point number, default: 512
local-k # grasp detection number in each local region, default: 10
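
For example, a demo run with the defaults above might look like the sketch below. The flag spellings are assumed from the list; demo.sh shows the actual arguments, and the checkpoint and RGB/depth image paths also need to be supplied as done there.

# NOTE: flag spellings assumed; checkpoint and image paths omitted, see demo.sh
python demo.py \
    --center-num 48 --grid-size 8 \
    --all-points-num 25600 --group-num 512 --local-k 10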

Results

Attention: HGGD detects grasps only from heatmap guidance, without any workspace mask (adopted in Graspness) or object/foreground segmentation method (adopted in Scale-balanced Grasp). It may be useful to add some of this prior information to get better results.

Evaluation results on RealSense camera:

          Seen    Similar  Novel
In paper  59.36   51.20    22.17
In repo   64.45   53.59    24.59

Evaluation results on Kinect camera:

          Seen    Similar  Novel
In paper  60.26   48.59    18.43
In repo   61.17   47.02    19.37

Citation

Please cite our paper in your publications if it helps your research:

@article{chen2023efficient,
  title={Efficient Heatmap-Guided 6-Dof Grasp Detection in Cluttered Scenes},
  author={Chen, Siang and Tang, Wei and Xie, Pengwei and Yang, Wenming and Wang, Guijin},
  journal={IEEE Robotics and Automation Letters},
  year={2023},
  publisher={IEEE}
}


hggd's Issues

How to generate grasp labels

Dear author,

I want to train HGGD for another dataset. I want to know how to generate grasp labels from Graspnet-1billion dataset. Would you like to share your code for grasp generation?

Thank you very much.

Best regards,
Shiyu

Modifications to graspnetAPI in Your Repo

Hi,

I'm exploring the version of graspnetAPI in your repo for my project. Could you briefly outline the main changes you've made compared to the original version? Specifically interested in any new features or major alterations.

Thanks for your work and looking forward to your reply.

Best,
Rui

Some objects not detected

Dear authors,

Thanks for your great work. This code shows significant improvement compared with GraspNet-1Billion. However, I am confused about why some objects are not detected. For example, the scissors in the demo are not detected for grasping. Is this because the scissors are not in the training dataset? How can I fix this issue?
Screenshot from 2024-02-24 17-09-30

Thank you very much.

Best regards,
Shiyu Li

question about the dataset

Hello,
Thanks for your work.
I downloaded the dataset from the link https://cloud.tsinghua.edu.cn/d/e3edfc2c8b114513b7eb/, and there are 189 scenes in this zip file. However, I notice that your paper mentions training requires 500 scenes.
My question is: can a model trained on a dataset of only 189 scenes reach the performance reported in the paper?

Problem when running test_graspnet.py

In your code, test_graspnet.py is split into two functions: def inference() and def evaluate().
In the log, I can see the info produced by the inference function:
{test_graspnet.py:140} INFO - Using saved anchors
{test_graspnet.py:382} INFO - Time stats:
{test_graspnet.py:383} INFO - Total: 34.181 ms
{test_graspnet.py:386} INFO - 2d: 7.364 ms data: 16.780 ms 6d: 5.255 ms colli:2.982 ms nms: 1.802 ms

But I never get the info from evaluate():
logging.info(f'Scene: {args.scene_l} ~ {args.scene_r}')
logging.info(f'colli == {colli}')
logging.info(f'ap == {ap}')
logging.info(f'ap0.8 == {aps[3]}')
logging.info(f'ap0.4 == {aps[1]}')

None of this information appears in the log.
Also, while testing scenes 100-130, the process gets stuck at
score list == [ 0.2 0.4 0.4 0.4 0.2 -1. 0.6 0.2 -1. 0.2 0.2 -1. 0.4 0.2 0.4 -1. 0.6 0.2 0.2 -1. 0.4 -1. 0.4 0.4 -1. 0.4 0.2 0.4 0.4 0.4 0.6 0.2 0.4 -1. ]
colli == 0.23529411764705882
Mean Accuracy for scene:0128 ann:0255 = 66.559
here, and then nothing moves forward.
I tried switching 100-130 to 130-160,
but it also gets stuck, at scene:0158 ann:0255.
I don't know where the problem is or why this happens.

some questions

Hello!
First of all, thank you very much for providing the code for us to learn from.
I have a few questions:

If I want to use HGGD to grasp new objects, can I directly use the checkpoint you provide, or do I need to build my own dataset and train on it?
How did you build your own dataset (for example, what tools did you use)?

Inquiry on Real-Time Grasp Prediction Methodology

I recently came across your work, and I must commend you on the groundbreaking results and the innovative approach your team has adopted. The videos showcasing real-time grasp prediction were particularly inspiring.

Would you be willing to share more insights or any advice on replicating such real-time grasp prediction capabilities?

Hyper-params for getting to work with 480x640 images?

Hey, thanks for releasing the code for the project!

I have one question, what do I need to change if I want to work with 480x640 images?

For now, in order to make them compatible with your framework out-of-the-box, I upscale the images to 720x1280, send them to the point-cloud helper to get the view_points and then call the networks, as you do in your demo.

My issue is that the final grasp predictions are going to be expressed in the upscaled pointcloud coordinates. So if I visualize the pointcloud I get from the resized 720x1280 images I get nice and tight grasps
Selection_525
but if I visualize them together with the original resolution pointcloud it is not (obviously).

I tried to manually post-process the translation component of the proposed grasps, e.g. multiplying by 0.5 to fit the resolution change, but it still doesn't work.

Is it possible to make your demo.py run for image resolution (480, 640)? Following your implementation in demo.py, I made the following changes:

  • Changed input_h, input_w in the arguments to half my desired size (240, 320), as from what I understand that's what is needed there (it is currently 360, 640 for 720x1280 images).
  • Changed the PointCloudHelper class to replace the (1280, 720) constants with (640, 480). In particular I changed the line ymap, xmap = np.meshgrid(np.arange(480), np.arange(640)) and the line factor = 640 / self.output_shape[0].

If I run the demo with this setting, I get the following error:

Traceback (most recent call last):
  File "grasp_proposal_test.py", line 353, in <module>
    pred_gg = inference(view_points,
  File "grasp_proposal_test.py", line 149, in inference
    pred_2d, perpoint_features = anchornet(x)
  File "/home/p300488/miniconda3/envs/hggd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/p300488/6d_grasp_ws/src/hggd_grasp_server/src/inference/models/anchornet.py", line 213, in forward
    x = layer(x + xs[self.depth - i])
RuntimeError: The size of tensor a (29) must match the size of tensor b (30) at non-singleton dimension 3

I assume there may be some other hyperparameters that need to be modified, or some other change needed to make it run at my desired resolution? Please let me know if there is an easy fix for this! Thank you!

How to apply this to a UR3e

Hello, how can I apply the predicted results on a UR3e robot? Could you provide some references?

Only grasping pose detection on the edge of the scene

Screenshot from 2024-05-07 16-24-08

Hello, HGGD researchers,

Firstly, I'd like to express my gratitude for providing excellent research and a user-friendly implementation.

When I tried detecting grasp poses in my environment using the provided demo, I encountered an issue where the poses were only detected on the outer edges, as shown in the figure.
I would appreciate any feedback on this matter.

Thank you.

Meaning of grasp labels

Dear author,

I saw there are some grasp annotations in the label files. What is the meaning of translation_d, rotation_d and width_d?
{"numgrasp": 146, "translation_d": 6.7442166274596925e-09, "rotation_d": 5.012725293206674e-05, "width_d": 0.0028016255560493948}

Are they used in the training process?

Best regards,
Shiyu Li

Generate preprocessed grasp poses

Hi, I've seen that you use a different set of grasp labels generated from the original GraspNet-1Billion dataset. You provide the download link, but I didn't find how you originally generated them. Could you tell me how to generate them from scratch?

model training

Hello, I have reviewed your paper and code, and I feel that you have done a great job. Could you please share your training code? I would like to see the details of your model training. Thank you.

Regarding your new article: Rethinking 6-Dof Grasp Detection: A Flexible Framework for High-Quality Grasping

Hello author,
I recently read your latest article, Rethinking 6-Dof Grasp Detection: A Flexible Framework for High-Quality Grasping. I have a question about why the reproduced GS results differ between the two papers: did you enable collision detection processing when running the tests in the new paper? Also, will you open-source the code for your latest paper?
I would be grateful for your reply!
