wangyueft / dgcnn

License: MIT License

Python 99.02% Shell 0.98%

dgcnn's Introduction

Yue Wang's personal website

dgcnn's Issues

What's the evaluation script?

The readme refers to running evaluate.py, but I couldn't find the script. I am guessing I am missing something obvious here? Where should I find the evaluate.py script?

Serious problem. About label smoothing in loss

While examining the provided code,

I found that a label smoothing term is applied to the cross-entropy loss.

However, the paper does not describe it.

I am unsure about the effect of this term. Are there any official results without label smoothing?
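
For readers unfamiliar with the term: label smoothing replaces the one-hot target with a softened distribution before the cross entropy is taken. The following is a minimal NumPy sketch under stated assumptions. The helper name and the smoothing factor eps=0.2 are illustrative choices, not values taken from the repository.

```python
import numpy as np

def smoothed_cross_entropy(logits, labels, eps=0.2):
    """Cross entropy against label-smoothed targets (illustrative helper)."""
    n_class = logits.shape[1]
    one_hot = np.eye(n_class)[labels]
    # The true class keeps 1 - eps; the remaining eps is shared by the other classes.
    target = one_hot * (1.0 - eps) + (1.0 - one_hot) * eps / (n_class - 1)
    # Numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(target * log_prob).sum(axis=1).mean())
```

Setting eps=0.0 recovers the ordinary cross entropy, which gives a simple way to compare runs with and without the term.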

How do you visualize the extracted features?

How do you visualize the extracted features? Can you send me your code for visualizing them? My email is [email protected]. Thank you very much.

Undefined "weight_decay" in tensorflow/sem_seg/model.py

Hi!
In tensorflow/sem_seg/model.py there seems to be an undefined "weight_decay" variable used in the tf_util.conv2d(...) layers while building the model.

For example here:
out1 = tf_util.conv2d(edge_feature, 64, [1,1], padding='VALID', stride=[1,1], bn=True, is_training=is_training, weight_decay=weight_decay, scope='adj_conv1', bn_decay=bn_decay, is_dist=True)

part segmentation

Hi,

I am trying to reproduce the results shown in the paper with your code, but I am not able to.
Would you mind releasing your trained model for shapenet part segmentation task?

Thanks!

Semantic segmentation: model differs from paper

Thanks for your code contribution! I read the paper, and noticed that the EdgeConv architecture (Figure 3) suggested for semantic segmentation is different from the one implemented in the code. In particular, the code uses two parallel pooling operations (max and mean instead of only max). Could you tell which one was used to obtain the results in the paper, and perhaps comment on reasons for the difference? Was using both max and mean aggregation better than only using either one alone?

Any idea of the get_edge_feature function for unknown batch size

Hi, thanks for your helpful code. One painful question, though: I would like to integrate your code into my Keras implementation. The get_edge_feature function uses batch_size, which is 'None' in Keras (dynamic, to allow any batch size). Is it possible to generate point_cloud_neighbors without your approach, in which point_cloud is reshaped to (-1, num_dims)? I guess tf.scan could work, but I am still struggling with it. Many thanks!
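
One way to avoid a hard-coded batch size is to build the batch index from the run-time shape and let broadcasting do the gather. The NumPy sketch below only illustrates the indexing pattern; the function name is hypothetical. In TensorFlow, tf.gather with batch_dims=1 may play a similar role, but that should be checked against the TF version in use.

```python
import numpy as np

def gather_neighbors(points, nn_idx):
    """points: (B, N, C); nn_idx: (B, N, k) integer indices into axis 1.

    Returns (B, N, k, C). B is read from the array at run time.
    """
    # (B, 1, 1) batch index that broadcasts against the (B, N, k) neighbor indices.
    batch = np.arange(points.shape[0])[:, None, None]
    return points[batch, nn_idx]
```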

Point cloud transform block different from paper

Hi!

Thanks for sharing the code. I am reading the paper and found that the implementation of the point cloud transform block (aka T-Net in the original PointNet paper) is different from what is described in the paper.

In the paper, as shown in Fig. 3, the coordinate difference between each point and its k nearest neighbors is concatenated with the coordinates of the point (therefore n x k x (3+3) = n x k x 6).

However, in the code there is one additional max pooling along the number-of-points axis compared with the original PointNet implementation. I do not quite understand why.

If someone can enlighten me or share their thought on this, I would be very grateful.

Edit:
I think I figured out why the input of the T-Net is n x k x 6: the input has already gone through the feature transformation defined here. However, I am still puzzled by the additional max pooling operation compared with the original PointNet implementation.

Why does test.py use point_num=3000 while train_multi_gpu.py uses point_num=2048 for part_seg?

Thank you very much for open-sourcing your code! I have a question, though: why does test.py use point_num=3000 while train_multi_gpu.py uses point_num=2048 for part_seg?

visualization problem

Hi,
Thanks for sharing the work! I have a problem with visualization: how can I visualize results such as Figure 4 in your paper?

Thanks a lot!

Train seg & cls simultaneously

Hello,
I noticed that the architecture of your network has two branches, for seg and cls, which split from the first EdgeConv operation. In your code, I find that segmentation and classification are trained independently.
I wonder if there is a way to train seg and cls together with different losses, just as shown in Figure 3 of your paper?

There is no rotation augmentation in the preprocessing of the PyTorch implementation

Hi, thank you for releasing your code.
I noticed that there is no rotation augmentation in the preprocessing of the PyTorch implementation, which differs from the TF implementation. However, the best accuracy reaches 93.1. Does this mean that rotation augmentation is useless?
I want to build an auto-encoder that uses DGCNN as the backbone of the encoder, so the absence of rotation augmentation in the ModelNet classification really puzzles me. Does it mean that I need not apply rotation augmentation even though DGCNN is used for reconstruction rather than classification?
Thanks a lot.
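
For reference, rotation augmentation in PointNet-style pipelines usually means rotating each cloud by a random angle about the up axis. The sketch below assumes y is the up axis and uses hypothetical helper names; it is not the TF implementation's code.

```python
import numpy as np

def rotate_about_y(points, angle):
    """Rotate an (N, 3) cloud by `angle` radians about the y axis."""
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, 0.0, s],
                    [0.0, 1.0, 0.0],
                    [-s, 0.0, c]])
    return points @ rot

def random_rotation(points, rng):
    """Apply a rotation with an angle drawn uniformly from [0, 2*pi)."""
    return rotate_about_y(points, rng.uniform(0.0, 2.0 * np.pi))
```

Because the transform is a pure rotation, each point keeps its distance from the origin and its y coordinate, so the augmentation changes only the heading of the object.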

accuracy about classification

Hi, when I run the TensorFlow code, I only get an accuracy of 91.2%. I read the paper published in 2018, and this result is the same as the baseline. I would like to know the reason. Thanks!

Questions about transform nets

Hello Yue,
I have a question about transform networks: how should one design a transform net? I noticed that the transform net in your TensorFlow implementation is almost the same as the one in the charlesq/pointnet implementation, while you use EdgeConv instead of 1x1 conv in your model.

the difference between fixed knn graph and dynamic knn graph?

@WangYueFt I see that you compare the result with a baseline in the paper. As you mentioned, the baseline uses a fixed kNN graph rather than a dynamic graph. Could you explain the difference between a fixed kNN graph and a dynamic kNN graph?
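
As the terms are commonly read, a fixed graph computes the k nearest neighbors once from the input coordinates and reuses them in every layer, while a dynamic graph recomputes them from each layer's current features. The toy NumPy sketch below contrasts the two; toy_layer is a stand-in for an EdgeConv layer, and all names are hypothetical.

```python
import numpy as np

def knn_indices(feats, k):
    """feats: (N, C). Returns (N, k) indices of the k nearest rows (self included)."""
    diff = feats[:, None, :] - feats[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    return np.argsort(dist, axis=1)[:, :k]

def toy_layer(feats, idx, weight):
    """Stand-in for EdgeConv: ReLU of a linear map of (x_j - x_i, x_i), max over neighbors."""
    neighbors = feats[idx]                                         # (N, k, C)
    center = np.repeat(feats[:, None, :], idx.shape[1], axis=1)    # (N, k, C)
    edge = np.concatenate([neighbors - center, center], axis=-1)   # (N, k, 2C)
    return np.maximum(edge @ weight, 0.0).max(axis=1)              # (N, C_out)

def forward(points, weights, k, dynamic):
    idx = knn_indices(points, k)          # graph built from the input coordinates
    feats = points
    for w in weights:
        if dynamic:
            idx = knn_indices(feats, k)   # rebuild the graph in the current feature space
        feats = toy_layer(feats, idx, w)
    return feats
```

With dynamic=False every layer aggregates over the same spatial neighbors; with dynamic=True later layers can group points that are close in feature space even if they are far apart in the input.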

reproduce the classification result with pytorch

I ran the PyTorch code with the command
python main.py --exp_name=dgcnn_1024 --model=dgcnn --num_points=1024 --k=20 --use_sgd=True
And I always get results slightly worse than the reported results in the paper.
I used the best test result obtained during training. The gap with the reported numbers is larger for the average accuracy (mean class accuracy) in particular.
Are there any special settings or tricks in running the code?
Thanks in advance.

Why bias=None in conv2d?

Hi! I have some small questions about the code.

I noticed that bias is set to None in both PointNet and dgcnn. I'm curious why we do that? I can't find the explanation in these papers.
Do we need to apply batch normalization on the input (before conv1)?

Thanks a lot if someone could help me!!!
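
One common reason for dropping the bias, offered here as a possible explanation rather than the authors' stated one, is that a per-channel bias added immediately before batch normalization is cancelled by the mean subtraction. A small NumPy check, with a simplified batch_norm that has no learned scale or shift:

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each channel (column) over the batch axis; no learned scale or shift."""
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 6))      # 32 samples, 6 input channels
w = rng.normal(size=(6, 8))       # linear map standing in for a 1x1 convolution
bias = rng.normal(size=(8,))

without_bias = batch_norm(x @ w)
with_bias = batch_norm(x @ w + bias)   # the bias shifts the mean, which is then subtracted
```

The two results agree to floating-point precision, so the bias would add parameters without changing the layer's output.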

"The number of GPUs to use" in sem_seg with train.py

Hello, thank you for sharing this code, it's amazing!
Sorry, I have a question about train.py in the sem_seg folder.
When I run "sh +x train_job.sh",
the terminal shows this:
"Traceback (most recent call last):
File "train.py", line 289, in
train()
File "train.py", line 238, in train
train_one_epoch(sess, ops, train_writer)
File "train.py", line 271, in train_one_epoch
ops['pointclouds_phs'][1]: current_data[start_idx_1:end_idx_1, :, :],
IndexError: list index out of range"

I checked the train.py parameters and found a probable cause in the number of GPUs to use:
parser.add_argument('--num_gpu', type=int, default=1, help='the number of GPUs to use [default: 2]')
I have just one NVIDIA 1050Ti, so I changed default=2 to 1. Does that mean I have to buy more graphics cards to fix this problem?
Thanks a lot!

can not download indoor3d_sem_seg_hdf5_data.zip

I try to run: wget https://shapenet.cs.stanford.edu/media/indoor3d_sem_seg_hdf5_data.zip.
It failed and the website (https://shapenet.cs.stanford.edu/) seems to be closed.
Where else can I get these data? Or did I do something wrong?
Many thanks.

IndexError: list index out of range

Traceback (most recent call last):
File "train.py", line 285, in
train()
File "train.py", line 234, in train
train_one_epoch(sess, ops, train_writer)
File "train.py", line 267, in train_one_epoch
ops['pointclouds_phs'][1]: current_data[start_idx_1:end_idx_1, :, :],
IndexError: list index out of range

Where is the categorical vector?

Hi, thank you for sharing the code. In your paper, there is a categorical vector for semantic segmentation network.

However, I didn't find many descriptions of this vector in the paper or any implementation of this in the code.

dgcnn/tensorflow/sem_seg/model.py

Line 87 in 69b80a3

    
           out_max = tf_util.max_pool2d(out7, [num_point, 1], padding='VALID', scope='maxpool')

Here, after the max pooling, the global feature is directly repeated and concatenated with point features.

What is this categorical vector? Is it necessary?

The number of parameters of pytorch version

Thanks for your time.
I am confused about the number of parameters.
The PyTorch version has 1.8M parameters, but 21M is listed in your paper.
How was that number obtained?
Thanks very much.

Questions About the classification of ModelNet40

It seems that you did not mention concatenating the features of the first and second EdgeConv layers in your classification model. But in your code, models/dgcnn.py, the features output by the lower layers appear to be concatenated together. With which variant was the 92.2% accuracy obtained?
net = tf_util.conv2d(tf.concat([net1, net2, net3, net4], axis=-1), 1024, [1, 1], padding='VALID', stride=[1,1], bn=True, is_training=is_training, scope='agg', bn_decay=bn_decay)
The ModelNet40 dataset comes in two forms: aligned and unaligned. Which form is used in the DGCNN paper?

why do we need net = tf_util.conv2d(tf.concat([net1, net2, net3, net4], axis=-1), 1024, [1, 1], ?

@WangYueFt @syb7573330 Could you explain why we need to perform tf.concat here to combine the previous nets? It does not seem to be mentioned in the paper.

Also, does anyone understand this line? Please help.

dgcnn/models/dgcnn.py

Line 79 in 29948ad

    
           net = tf_util.conv2d(tf.concat([net1, net2, net3, net4], axis=-1), 1024, [1, 1],

No such file or directory: 'pretrained/model.2048.t7'

I tried to run your script under pytorch, but there is no pretrained model for model.2048.t7.

question about test for part segmentation

I made some changes to the network, and the model I trained reported errors during testing, as follows:

Thanks for any help!

About kernel_size

https://github.com/WangYueFt/dgcnn/blob/master/pytorch/model.py#L100
using kernel size =(1,1)
why not use kernel size =(1,self.k)?
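
One possible reason, stated here as a guess rather than the authors' answer: a (1,1) kernel applies the same weights to every edge, so after the max over k the result does not depend on the order in which neighbors are listed, whereas a (1,k) kernel would tie a separate set of weights to each neighbor slot. The NumPy sketch below compares the two on a single point; the sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
edges = rng.normal(size=(5, 6))        # k = 5 neighbors, 6 edge channels
perm = np.array([1, 2, 3, 4, 0])       # the same neighbors listed in another order

# (1, 1) kernel: one weight matrix shared by all neighbor slots, then max over k.
w_shared = rng.normal(size=(6, 4))
shared_a = (edges @ w_shared).max(axis=0)
shared_b = (edges[perm] @ w_shared).max(axis=0)

# (1, k) kernel: a separate weight matrix per neighbor slot, summed over slots.
w_slots = rng.normal(size=(5, 6, 4))
slots_a = np.einsum('kc,kco->o', edges, w_slots)
slots_b = np.einsum('kc,kco->o', edges[perm], w_slots)
```

shared_a and shared_b are identical, while slots_a and slots_b generally differ, which matters because the neighbors of a point have no canonical order.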

Hi, can I train this model on my own dataset?

segmentation experiments

Hi,
I have read your paper and like your work very much. Your work is simple and effective, and I look forward to your pytorch implementation of segmentation experiments.

Thanks.

KeyError: "Unable to open object (object 'data' doesn't exist)"

Thanks for sharing your awesome code.

I ran the train.py code following the README step by step, but when I run python train.py, there is an error: KeyError: "Unable to open object (object 'data' doesn't exist)". Here are the details:

I solved all the dependency problems, but the above error keeps showing up.

I can run PointNet (https://github.com/charlesq34/pointnet) without error; however, I cannot run dgcnn...

Please help me so that I can study dgcnn further.

Sorry for my poor English skills.

An issue about the pairwise_distance

Hi,
Thanks for releasing the code. I have a question about the computation of "pairwise_distance". In some previous works, the k nearest neighbors are found using the Euclidean distance, such as |x - xi|. I'm confused by the implementation of "pairwise_distance", i.e., "-xx - inner - xx.transpose(2, 1)". Could you please explain it?

Best regards
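
The quoted expression is consistent with the expansion ||x_i - x_j||^2 = ||x_i||^2 - 2 x_i.x_j + ||x_j||^2. Assuming inner holds -2 x^T x and xx holds the squared norms (an assumption based on the names in the expression), the result is the negative squared Euclidean distance, so taking the k largest entries per row selects the k nearest neighbors. A NumPy sketch with a hypothetical function name:

```python
import numpy as np

def neg_sq_dist(x):
    """x: (B, C, N). Returns (B, N, N) with entry [b, i, j] = -||x_i - x_j||^2."""
    inner = -2.0 * np.matmul(x.transpose(0, 2, 1), x)   # (B, N, N): -2 * x_i . x_j
    xx = (x ** 2).sum(axis=1, keepdims=True)            # (B, 1, N): ||x_j||^2
    # -||x_j||^2 + 2 x_i . x_j - ||x_i||^2
    return -xx - inner - xx.transpose(0, 2, 1)
```

Computing the distances through one matrix product avoids materializing the (B, N, N, C) tensor of pairwise differences.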

InternalError (see above for traceback): Blas xGEMM launch failed

Hello, thank you for your reply. When I try to run the sem_seg code, I run into this problem. I have one GPU (8 GB of memory). Can you tell me how to solve it? Looking forward to your reply.
InternalError (see above for traceback): Blas xGEMM launch failed : a.shape=[1,4096,3], b.shape=[1,3,4096], m=4096, n=4096, k=3
[[Node: tower_0/MatMul = BatchMatMul[T=DT_FLOAT, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](tower_0/ExpandDims_1, tower_0/transpose)]]

About spatial transform in pytorch implementation

Hi,
I found that there is no spatial transform module in pytorch implementation. Is it useless for your model?
Thanks

Questions about the Input point cloud data dimension of the EdgeConv

Thank you very much for sharing the code!

I am trying to use the ModelNet40 dataset (XYZ and normals), which has 6 dimensions. I changed input_dim from 3 to 6, but it doesn't work.
Are the input point cloud dimensions of DGCNN/EdgeConv fixed at 3 in the code and paper?
How can I use ModelNet40 (XYZ and normals) with 6 input dimensions in DGCNN/EdgeConv?

Can you help me solve this problem?
Looking forward to your reply!

Do we need multiple GPUs?

For ModelNet-40 classification task, how long do you guys need to run one epoch? On my PC, it took about 5 hours for 15 epochs with a Tesla k40c card. Do I need to use 2 or 3 GPUs to speed it up?

Question about seg in S3DIS

@WangYueFt Thanks for your excellent work. While reading the sem_seg code, I ran into a problem: " ops['pointclouds_phs'][1]: current_data[start_idx_1:end_idx_1, :, :],
IndexError: list index out of range"
I am wondering about the ops: why is it necessary to feed the data twice, and how can I solve this by changing the code?
feed_dict = {ops['pointclouds_phs'][0]: current_data[start_idx_0:end_idx_0, :, :],
ops['pointclouds_phs'][1]: current_data[start_idx_1:end_idx_1, :, :],
ops['labels_phs'][0]: current_label[start_idx_0:end_idx_0],
ops['labels_phs'][1]: current_label[start_idx_1:end_idx_1],
ops['is_training_phs'][0]: is_training,
ops['is_training_phs'][1]: is_training}

Looking forward to your reply!
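
The traceback suggests that the feed dict indexes a second placeholder that does not exist when only one tower is built. One possible workaround is to build the feed dict in a loop over however many placeholders ops actually contains. The sketch below uses a hypothetical helper name and assumes each tower takes a consecutive slice of batch_size samples, which may differ from how train.py computes its start and end indices.

```python
import numpy as np

def build_feed_dict(ops, data, labels, start, batch_size, is_training):
    """Split one global batch evenly across however many towers exist in ops."""
    feed = {}
    for i, pc_ph in enumerate(ops['pointclouds_phs']):
        lo = start + i * batch_size    # assumed: towers take consecutive slices
        hi = lo + batch_size
        feed[pc_ph] = data[lo:hi, :, :]
        feed[ops['labels_phs'][i]] = labels[lo:hi]
        feed[ops['is_training_phs'][i]] = is_training
    return feed
```

With one placeholder per list this feeds a single slice, and with two it reproduces the two-slice pattern quoted above, so the same training loop can run on one GPU or several.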

The code is running super slow?

@WangYueFt @syb7573330 I could run the code successfully, but the code is running super slow. The speed is about 10 epochs/day. Do you have any idea about this problem or it is the normal speed for this code?

Training with my own data

Hi there,

Could you kindly provide the code for generating the pointcloud for training and testing?

I would like to train with my own data.

Thanks!

PyTorch implementation differs from the original paper.

Hi, thanks for sharing. I have a question about the get_graph_feature function in your PyTorch implementation.

def get_graph_feature(x, k=20, idx=None):
    batch_size = x.size(0)
    num_points = x.size(2)
    x = x.view(batch_size, -1, num_points)
    if idx is None:
        idx = knn(x, k=k)   # (batch_size, num_points, k)
    device = torch.device('cuda')

    idx_base = torch.arange(0, batch_size, device=device).view(-1, 1, 1)*num_points

    idx = idx + idx_base

    idx = idx.view(-1)
 
    _, num_dims, _ = x.size()

    x = x.transpose(2, 1).contiguous()   # (batch_size, num_points, num_dims)  -> (batch_size*num_points, num_dims) #   batch_size * num_points * k + range(0, batch_size*num_points)
    feature = x.view(batch_size*num_points, -1)[idx, :]
    feature = feature.view(batch_size, num_points, k, num_dims) 
    x = x.view(batch_size, num_points, 1, num_dims).repeat(1, 1, k, 1)
    
    feature = torch.cat((feature, x), dim=3).permute(0, 3, 1, 2)
  
    return feature

I think the way you compute the variable feature differs from the description in the paper. Here, I think feature is only h_j, while your paper uses h_j - h_i, so perhaps you should do feature = feature - x before the concatenation?
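
To make the difference concrete, here is a minimal NumPy sketch of the edge feature as the issue describes it from the paper, i.e. the concatenation of (h_j - h_i) and h_i. The function name and array layout are illustrative and do not match the tensor layout of the PyTorch code above.

```python
import numpy as np

def edge_features(x, idx):
    """x: (N, C) point features; idx: (N, k) neighbor indices.

    Returns (N, k, 2C): [x_j - x_i, x_i] for every neighbor j of point i.
    """
    neighbors = x[idx]                                          # (N, k, C): x_j
    center = np.broadcast_to(x[:, None, :], neighbors.shape)    # (N, k, C): x_i
    return np.concatenate([neighbors - center, center], axis=-1)
```

Dropping the subtraction would leave [x_j, x_i], which carries the same information but no longer separates the local offset from the absolute position.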

Do the kNN neighbours xj include the center point xi?

In your paper:

Do the kNN neighbours xj include the center point xi, in the paper and in the code?
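
As a general observation rather than a statement about the repository: a plain top-k over pairwise distances does not exclude the query point, because the distance from a point to itself is zero. A small NumPy sketch with a hypothetical knn helper:

```python
import numpy as np

def knn(points, k):
    """points: (N, C). Indices of the k nearest points, sorted by distance."""
    diff = points[:, None, :] - points[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    return np.argsort(dist, axis=1, kind="stable")[:, :k]

points = np.array([[0.0, 0.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 3.0, 0.0],
                   [5.0, 5.0, 5.0]])
idx = knn(points, k=2)   # the first column is each point's own index
```

If the center should be excluded, one option is to request k + 1 neighbors and drop the first column, assuming the cloud has no duplicate points.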

Cuda memory

Hello,

I am trying to run the PyTorch example on a 6 GB CUDA card and I get the following message:

RuntimeError: CUDA out of memory. Tried to allocate 640.00 MiB (GPU 0; 5.94 GiB total capacity; 4.54 GiB already allocated; 415.44 MiB free; 143.32 MiB cached)

How can we run the examples on 6GB cards?

Thanks
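
A rough way to see where the memory goes: an edge-feature tensor has shape (batch, channels, num_points, k), so its size scales with the product of all four. The helper below is a back-of-the-envelope sketch; the settings (batch size 32, 256 channels, 1024 points, k=20, float32) are assumptions not stated in the issue, although they happen to give the 640 MiB figure in the error message.

```python
def tensor_mib(batch_size, channels, num_points, k, bytes_per_value=4):
    """Size in MiB of one (batch, channels, num_points, k) tensor."""
    return batch_size * channels * num_points * k * bytes_per_value / 2**20

full = tensor_mib(32, 256, 1024, 20)   # assumed settings, float32
half = tensor_mib(16, 256, 1024, 20)   # halving the batch halves the tensor
```

Under these assumptions, lowering the batch size, the number of points, or k reduces each such tensor proportionally, which is usually the first thing to try on a smaller card.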

preparing .h5 format data

Hi @WangYueFt @syb7573330

The answers to the earlier issues I wrote helped me a lot.

Now I am trying to apply your model to my own data.

I have (x, y, z) coordinates for the point cloud, and I also have mesh data.

I understand that, to use PointNet or DGCNN, the input must be an .h5 file of points.

I searched a lot for how to create an .h5 file, but I failed.

What I am trying to do is:

1. Prepare point data (such as .obj, .off, .pts).
2. Convert the data to .ply.
3. Convert .ply to .h5.

Is that right?

Could you point me to some example code for the steps above?

The code I tried:
https://github.com/charlesq34/pointnet/blob/master/utils/data_prep_util.py
https://github.com/IsaacGuan/PointNet-Plane-Detection/tree/master/data

Sorry for my poor English skills.
This issue may seem very stupid, but I would appreciate any help.
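
A possible shortcut is to skip the .ply step and write the sampled points straight to HDF5. The sketch below assumes the loader expects datasets named 'data' and 'label' (the KeyError reported in another issue on this page mentions a 'data' object) and that labels are stored as an (M, 1) uint8 array; both the layout and the helper names are assumptions to check against the repository's data loader. It requires the third-party h5py package for the writing step.

```python
import numpy as np

def normalize_and_sample(points, num_points, seed=0):
    """Center a cloud, scale it into the unit sphere, and sample num_points rows."""
    rng = np.random.default_rng(seed)
    points = points - points.mean(axis=0)
    points = points / np.linalg.norm(points, axis=1).max()
    replace = len(points) < num_points          # pad by resampling if the cloud is small
    choice = rng.choice(len(points), num_points, replace=replace)
    return points[choice].astype(np.float32)

def write_h5(path, clouds, labels):
    """clouds: (M, num_points, 3) float32; labels: (M,) integer class ids."""
    import h5py  # third-party; imported lazily so the helper above works without it
    with h5py.File(path, "w") as f:
        f.create_dataset("data", data=clouds)
        # Assumed layout: one uint8 class id per cloud, stored as a column.
        f.create_dataset("label", data=np.asarray(labels, dtype=np.uint8).reshape(-1, 1))
```

Vertices read from .obj, .off, or .pts files can be passed to normalize_and_sample as an (N, 3) array, stacked with np.stack, and then handed to write_h5.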

Categoryless Training/Testing for part segmentation

Hi,

I would like to ask whether the DGCNN model can be trained/tested without explicitly stating the category (e.g. plane, vase, mug) of the object/data. I've looked through the source code and found that you have to explicitly indicate the category of the object before it can be segmented.

Also, can it achieve segmentation into N parts? The current implementation seems to hard-code the number of parts for each category. I'm wondering if it can achieve something similar to this https://www.youtube.com/watch?v=6JQdNzsw7jA

accuracy of ModelNet40

Hi,
I read the version of your paper uploaded to arXiv a long time ago, in which the accuracy on ModelNet40 was reported as 92.2%. However, the accuracy becomes 92.9% in your ACM Transactions paper. So I read your most recent code again and compared it with the code I downloaded before, but I could not figure out what has changed. Could you clarify what contributes to the 0.7% improvement?

Thanks.

The order of convolution and max operation.

Hi, thank you for sharing. I have tried the PyTorch version of DGCNN and have some questions about your implementation:

In model.py, why do you first apply the convolution to x (the return value of get_graph_feature, with dimension [batch_size, feature_dim * 2, num_points, k]) and then take the max over the last dimension, instead of first computing the max feature and then applying a 1D convolution? Is there any insight into why this order is better?
In my understanding, get_graph_feature could simply return torch.max(feature, dim=-1, keepdim=False)[0]; then there would be no need for max pooling in forward, and we would use nn.Conv1d and nn.BatchNorm1d.

        x = get_graph_feature(x, k=self.k)
        x = self.conv1(x)
        x1 = x
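
The two orders are not equivalent in general. Transforming each edge and then pooling lets every output channel pick the neighbor that maximizes it, whereas pooling first commits to per-input-channel maxima before any mixing happens. A two-neighbor NumPy example, with hand-picked values chosen only to show the difference:

```python
import numpy as np

# Two neighbors of one point, each with a 2-channel edge feature.
edges = np.array([[1.0, 0.0],
                  [0.0, 1.0]])          # (k, C_in)
weight = np.array([[1.0], [-1.0]])      # (C_in, C_out): one output channel

conv_then_max = (edges @ weight).max(axis=0)   # transform each edge, then pool
max_then_conv = edges.max(axis=0) @ weight     # pool first, then transform
```

Here conv_then_max is [1.] and max_then_conv is [0.], so swapping the order changes the function the layer computes and is not just a cheaper implementation of it.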

concatenate features in classification model

It seems that you did not mention concatenating the features of the first and second EdgeConv layers in your classification model. But in your code, the features output by the lower layers appear to be concatenated together.

 net = tf_util.conv2d(tf.concat([net1, net2, net3, net4], axis=-1), 1024, [1, 1], 
                       padding='VALID', stride=[1,1],
                       bn=True, is_training=is_training,
                       scope='agg', bn_decay=bn_decay)

I would like to know whether this is crucial for the classification results.