dgcnn's Introduction

Yue Wang's personal website

dgcnn's People

Contributors

hqucms, liuziwei7, lukereichold, syb7573330, wangyueft

dgcnn's Issues

reproduce the classification result with pytorch

I ran the PyTorch code with the command
python main.py --exp_name=dgcnn_1024 --model=dgcnn --num_points=1024 --k=20 --use_sgd=True
and I always get results slightly worse than those reported in the paper.
I used the best test result seen during training. The gap to the reported numbers is larger for the average accuracy (mean class accuracy).
Are there any special settings or tricks for running the code?
Thanks in advance.
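
For readers comparing the two metrics mentioned above, here is a minimal NumPy sketch of how overall accuracy and mean class accuracy are commonly computed. The function name and the toy labels are illustrative assumptions, not code from the repository; the point is only that a rare class with poor recall lowers the mean class accuracy much more than the overall accuracy.

```python
import numpy as np

def overall_and_mean_class_accuracy(y_true, y_pred):
    """Return (overall accuracy, mean of per-class accuracies)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    overall = float(np.mean(y_true == y_pred))
    # Accuracy restricted to the samples of each ground-truth class.
    per_class = [
        float(np.mean(y_pred[y_true == c] == c))
        for c in np.unique(y_true)
    ]
    return overall, float(np.mean(per_class))

# Hypothetical toy labels: class 0 has 8 samples, class 1 has 2.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
overall, mean_class = overall_and_mean_class_accuracy(y_true, y_pred)
print(overall, mean_class)
```

A single mistake on the small class costs 10 points of overall accuracy but 25 points of mean class accuracy in this toy case, so a larger gap on the second metric is not surprising by itself.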

IndexError: list index out of range

Traceback (most recent call last):
  File "train.py", line 285, in <module>
    train()
  File "train.py", line 234, in train
    train_one_epoch(sess, ops, train_writer)
  File "train.py", line 267, in train_one_epoch
    ops['pointclouds_phs'][1]: current_data[start_idx_1:end_idx_1, :, :],
IndexError: list index out of range

concatenate features in classification model

It seems that you did not mention that you concatenate the features of the first and second EdgeConv layers in your classification model. But in your code, the features output by the lower layers appear to be concatenated together.

 net = tf_util.conv2d(tf.concat([net1, net2, net3, net4], axis=-1), 1024, [1, 1], 
                       padding='VALID', stride=[1,1],
                       bn=True, is_training=is_training,
                       scope='agg', bn_decay=bn_decay)

I would like to know whether this is crucial for the classification results.

Questions about the Input point cloud data dimension of the EdgeConv

Thank you very much for sharing the code!

I am trying to use the ModelNet40 dataset with XYZ coordinates and normals, which gives 6 dimensions per point. I changed input_dim from 3 to 6, but it does not work.
Is the input point cloud dimension of DGCNN/EdgeConv fixed to 3 in the code and paper?
How can I use ModelNet40 (XYZ and normals) with 6 input dimensions in DGCNN/EdgeConv?

Can you help me solve this problem?
I look forward to your reply!

Where is the categorical vector?

Hi, thank you for sharing the code. In your paper, there is a categorical vector for the semantic segmentation network.

However, I didn't find many descriptions of this vector in the paper or any implementation of this in the code.

out_max = tf_util.max_pool2d(out7, [num_point, 1], padding='VALID', scope='maxpool')

Here, after the max pooling, the global feature is directly repeated and concatenated with point features.

What is this categorical vector? Is it necessary?

No rotation augmentation in the preprocessing of the PyTorch implementation

Hi, thank you for releasing your code.
I noticed that the preprocessing in the PyTorch implementation has no rotation augmentation, unlike the TF implementation. However, the best accuracy reaches 93.1. Does this mean that rotation augmentation is useless?
I want to build an auto-encoder that uses DGCNN as the backbone of the encoder, so the absence of rotation augmentation in ModelNet classification puzzles me. Does it mean that I do not need rotation augmentation even when DGCNN is used for reconstruction rather than classification?
Thanks a lot.
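
For anyone who wants to try rotation augmentation themselves, here is a minimal NumPy sketch. It assumes a random rotation about a single "up" axis (y by default), which is a common choice for ModelNet-style data; the function name and the axis choice are assumptions, and this is not the augmentation code from either implementation.

```python
import numpy as np

def rotate_about_up_axis(points, angle, up_axis=1):
    """Rotate an (N, 3) point cloud by `angle` radians about one coordinate axis."""
    c, s = np.cos(angle), np.sin(angle)
    # The two axes that span the plane being rotated.
    a, b = [i for i in range(3) if i != up_axis]
    rot = np.eye(3)
    rot[a, a], rot[a, b] = c, -s
    rot[b, a], rot[b, b] = s, c
    # Points are rows, so multiply by the transpose of the rotation matrix.
    return points @ rot.T

rng = np.random.default_rng(0)
cloud = rng.standard_normal((1024, 3))      # stand-in for one sampled shape
rotated = rotate_about_up_axis(cloud, rng.uniform(0.0, 2.0 * np.pi))
```

Whether this helps depends on whether the shapes are already aligned in the dataset; the sketch only shows the mechanics.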

Train seg & cls simultaneously

Hello,
I noticed that your network architecture has two branches, for seg and cls, which split from the first EdgeConv operation. In your code, segmentation and classification are trained independently.
Is there a way to train seg and cls together with different losses, as shown in Figure 3 of your paper?

Undefined "weight_decay" in tensorflow/sem_seg/model.py

Hi!
In tensorflow/sem_seg/model.py there seems to be an undefined "weight_decay" variable used in the tf_util.conv2d(...) layers when building the model.

For example here:
out1 = tf_util.conv2d(edge_feature, 64, [1,1], padding='VALID', stride=[1,1], bn=True, is_training=is_training, weight_decay=weight_decay, scope='adj_conv1', bn_decay=bn_decay, is_dist=True)

The number of parameters of pytorch version

Thanks for your time.
I am confused about the number of parameters.
The PyTorch version has 1.8M parameters, but the paper lists 21M.
How was that number obtained?
Thanks very much.

Do we need multiple GPUs?

For the ModelNet40 classification task, how long does one epoch take for you? On my PC, 15 epochs took about 5 hours on a Tesla K40c card. Do I need to use 2 or 3 GPUs to speed it up?

Any idea of the get_edge_feature function for unknown batch size

Hi, thanks for your helpful code. I have one painful question: I would like to integrate your code into my Keras implementation. The get_edge_feature function uses batch_size, which is None in Keras (the batch size is dynamic). Is it possible to generate point_cloud_neighbors without your approach, in which point_cloud is reshaped to (-1, num_dims)? I guess tf.scan could work, but I am still struggling with it. Many thanks!

PyTorch implementation differs from the original paper

Hi, thanks for sharing. I have a question about the get_graph_feature function in your PyTorch implementation.

def get_graph_feature(x, k=20, idx=None):
    batch_size = x.size(0)
    num_points = x.size(2)
    x = x.view(batch_size, -1, num_points)
    if idx is None:
        idx = knn(x, k=k)   # (batch_size, num_points, k)
    device = torch.device('cuda')

    idx_base = torch.arange(0, batch_size, device=device).view(-1, 1, 1)*num_points

    idx = idx + idx_base

    idx = idx.view(-1)
 
    _, num_dims, _ = x.size()

    x = x.transpose(2, 1).contiguous()   # (batch_size, num_points, num_dims)  -> (batch_size*num_points, num_dims) #   batch_size * num_points * k + range(0, batch_size*num_points)
    feature = x.view(batch_size*num_points, -1)[idx, :]
    feature = feature.view(batch_size, num_points, k, num_dims) 
    x = x.view(batch_size, num_points, 1, num_dims).repeat(1, 1, k, 1)
    
    feature = torch.cat((feature, x), dim=3).permute(0, 3, 1, 2)
  
    return feature

I think the way you compute the variable feature differs from the description in the paper. Here feature contains only h_j, while the paper uses h_j - h_i, so perhaps you should compute feature = feature - x before the concatenation?
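
To make the proposed change concrete, here is a small NumPy sketch of an edge feature that concatenates the difference h_j - h_i with the center feature h_i. It works on a single point cloud with a precomputed neighbor index array; the function name and the toy data are assumptions, and the code is not a drop-in replacement for get_graph_feature.

```python
import numpy as np

def edge_features(x, idx):
    """x: (N, C) point features; idx: (N, k) neighbor indices.

    Returns an (N, k, 2C) array whose last axis is [x_j - x_i, x_i].
    """
    neighbors = x[idx]                                        # (N, k, C)
    center = np.repeat(x[:, None, :], idx.shape[1], axis=1)   # (N, k, C)
    return np.concatenate([neighbors - center, center], axis=-1)

# Hypothetical toy input: 3 points, each with 2 neighbors.
x = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 2.0, 0.0]])
idx = np.array([[1, 2],
                [0, 2],
                [0, 1]])
feat = edge_features(x, idx)
```

In the quoted PyTorch function the corresponding one-line change would be the subtraction suggested above, applied to feature before torch.cat.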

part segmentation

Hi,

I am trying to reproduce the results shown in your paper with your code, but I have not been able to.
Would you mind releasing your trained model for the ShapeNet part segmentation task?

Thanks!

accuracy issue

Hi,
Has anyone achieved a result of about 92%? Why do I only get 89%?

Many thanks.

Best Regards
Frank

accuracy about classification

Hi, when I run the TensorFlow code, I only get an accuracy of 91.2%. According to the paper published in 2018, this is the same as the baseline result. I would like to know the reason. Thanks!

The order of convolution and max operation.

Hi, thank you for sharing. I have tried the PyTorch version of DGCNN and have some questions about your implementation:

  1. In model.py, why do you first apply the convolution to x (the output of get_graph_feature, with dimensions [batch_size, feature_dim * 2, num_points, k]) and then take the max over the last dimension, instead of first computing the max feature and then applying a 1D convolution? Is there any insight into why this order is better?
    In my understanding, get_graph_feature could simply return torch.max(feature, dim=-1, keepdim=False)[0]; then forward would need no max pooling and should use nn.Conv1d and nn.BatchNorm1d.
        x = get_graph_feature(x, k=self.k)
        x = self.conv1(x)
        x1 = x
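
One thing that can be checked independently of accuracy is that the two orders are not equivalent. The NumPy sketch below uses a hypothetical 1x2 weight matrix and two hand-picked edge features to show that "apply the weights per edge, then max over neighbors" and "max over neighbors, then apply the weights" produce different values. It says nothing about which order trains better.

```python
import numpy as np

# Hypothetical weights: one output channel, two input channels.
W = np.array([[1.0, -1.0]])

# Edge features of one point with k = 2 neighbors, shape (k, C).
edge_feats = np.array([[1.0, 0.0],
                       [0.0, 1.0]])

# Order used in the quoted code: per-edge linear map, then max over neighbors.
conv_then_max = (edge_feats @ W.T).max(axis=0)

# Proposed order: max over neighbors per channel, then the linear map.
max_then_conv = edge_feats.max(axis=0) @ W.T

print(conv_then_max, max_then_conv)
```

The channel-wise max discards which neighbor each value came from, so the second order cannot represent functions of an individual edge.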

An issue about the pairwise_distance

Hi,
Thanks for releasing the code. I have a question about the computation of "pairwise_distance". In some previous works, the k nearest neighbors are found using the Euclidean distance, such as |x - xi|. I am confused by the implementation of "pairwise_distance", i.e., "-xx - inner - xx.transpose(2, 1)". Could you please explain it?

Best regards
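
A short NumPy check may help here. It assumes that inner stands for -2 * x^T x and xx for the per-point squared norms; those definitions are not shown in this issue, so treat them as assumptions. Under them, the quoted expression expands to -(|x_i|^2 - 2 x_i.x_j + |x_j|^2) = -|x_i - x_j|^2, the negative squared Euclidean distance, so taking the k largest entries per row selects the k nearest neighbors.

```python
import numpy as np

def negative_squared_distances(x):
    """x: (C, N) array holding N points. Returns (N, N) matrix of -|x_i - x_j|^2."""
    inner = -2.0 * (x.T @ x)                       # assumed definition of `inner`
    xx = np.sum(x ** 2, axis=0, keepdims=True)     # (1, N) squared norms
    return -xx - inner - xx.T

def knn_indices(x, k):
    """Indices of the k nearest points per point (each point's own index comes first)."""
    neg_d2 = negative_squared_distances(x)
    # Largest negative squared distance means smallest distance.
    return np.argsort(-neg_d2, axis=1)[:, :k]

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 16))   # 16 random 3D points, channels first
neg_d2 = negative_squared_distances(x)
nearest = knn_indices(x, k=4)
```

Because squaring is monotonic for non-negative distances, ranking by squared distance gives the same neighbors as ranking by |x - xi|.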

Categoryless Training/Testing for part segmentation

Hi,

I would like to ask whether the DGCNN model can be trained and tested without explicitly stating the category (e.g. plane, vase, mug) of the object. I have looked through the source code and found that the category of the object has to be given explicitly before it can be segmented.

Also, can it segment an object into N parts? The current implementation seems to hard-code the number of parts for each category. I am wondering whether it can achieve something similar to this: https://www.youtube.com/watch?v=6JQdNzsw7jA

preparing .h5 format data

Hi @WangYueFt @syb7573330

The answers to my earlier issues helped me a lot.

Now I am trying to apply your model to my own data.

I have (x,y,z) point clouds and I also have mesh data.

I understand that PointNet and DGCNN take an .h5 file of points as input.

I searched a lot for how to make an .h5 file, but I failed.

What I am trying to do is,

  1. prepare the point data (such as .obj, .off, .pts)
  2. convert the data to .ply
  3. .ply -> .h5
    Is this right?

Could you point me to some example code for the steps above?


the code I tried:
https://github.com/charlesq34/pointnet/blob/master/utils/data_prep_util.py
https://github.com/IsaacGuan/PointNet-Plane-Detection/tree/master/data


Sorry for my poor English skills.
This issue may seem very stupid, but I would appreciate your help.

Questions About the classification of ModelNet40

  1. It seems that you did not mention that you concatenate the features of the first and second EdgeConv layers in your classification model. But in your code models/dgcnn.py, the features output by the lower layers appear to be concatenated together. Which variant was used to obtain the 92.2% accuracy?
    net = tf_util.conv2d(tf.concat([net1, net2, net3, net4], axis=-1), 1024, [1, 1], padding='VALID', stride=[1,1], bn=True, is_training=is_training, scope='agg', bn_decay=bn_decay)

  2. The ModelNet40 dataset comes in two forms: aligned and unaligned. Which form is used in the DGCNN paper?

Point cloud transform block different from paper

Hi!

Thanks for sharing the code. I am reading the paper and found that the implementation of the point cloud transform block (a.k.a. T-Net in the original PointNet paper) in the code is different from what is described in the paper.

In the paper, as shown in Fig. 3, the coordinate difference between each of the k nearest neighbors and the point is concatenated with the coordinates of the point (therefore n x k x (3+3) = n x k x 6).

However, in the code there is one additional max pooling along the number-of-points axis compared to the original PointNet implementation. I do not quite understand why.

If someone can enlighten me or share their thought on this, I would be very grateful.

Edit:
I think I figured out why the input of the T-Net is n x k x 6: the input has already gone through the feature transformation defined here. However, I am still puzzled by the additional max pooling operation compared to the original PointNet implementation.

"The number of GPUs to use" in sem_seg with train.py

Hello, thank you for sharing this code, it's amazing!
Sorry, I have a question about train.py in the sem_seg folder.
When I run "sh +x train_job.sh", the terminal shows this error:
"Traceback (most recent call last):
  File "train.py", line 289, in <module>
    train()
  File "train.py", line 238, in train
    train_one_epoch(sess, ops, train_writer)
  File "train.py", line 271, in train_one_epoch
    ops['pointclouds_phs'][1]: current_data[start_idx_1:end_idx_1, :, :],
IndexError: list index out of range"

I checked the train.py parameters and found a probable cause in the number of GPUs to use:
parser.add_argument('--num_gpu', type=int, default=1, help='the number of GPUs to use [default: 2]')
I have just one NVIDIA 1050Ti, so I changed default=2 to 1. Does that mean I have to buy more graphics cards to fix this problem?
THANKS a lot!

KeyError: "Unable to open object (object 'data' doesn't exist)"

Thanks for sharing your awesome code.

I ran train.py following the README step by step, but when I run python train.py, I get an error: KeyError: "Unable to open object (object 'data' doesn't exist)".

I resolved all the dependency problems, but the above error keeps showing.

I can run PointNet (https://github.com/charlesq34/pointnet) without errors; however, I cannot run DGCNN...

Please help me, so I can study DGCNN further.

Sorry for my poor English skills.

Question about seg in S3DIS

@WangYueFt Thanks for your excellent work. While reading the sem_seg code, I ran into the problem " ops['pointclouds_phs'][1]: current_data[start_idx_1:end_idx_1, :, :],
IndexError: list index out of range".
I am wondering about ops: why is it necessary to feed the data twice, and how can I solve this by changing the code?
feed_dict = {ops['pointclouds_phs'][0]: current_data[start_idx_0:end_idx_0, :, :],
ops['pointclouds_phs'][1]: current_data[start_idx_1:end_idx_1, :, :],
ops['labels_phs'][0]: current_label[start_idx_0:end_idx_0],
ops['labels_phs'][1]: current_label[start_idx_1:end_idx_1],
ops['is_training_phs'][0]: is_training,
ops['is_training_phs'][1]: is_training}

Looking forward to your reply!
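
The quoted feed_dict indexes the placeholder lists at [0] and [1], which fails when only one entry exists. As a rough sketch of one possible workaround, the dictionary can be built in a loop over however many placeholders are present. The helper name build_feed_dict, the contiguous slicing scheme, and the string stand-ins for placeholders are all assumptions for illustration; other parts of train.py may also assume two towers and would need matching changes.

```python
import numpy as np

def build_feed_dict(ops, data, labels, start, per_tower, is_training):
    """Build a feed dict with one batch slice per tower (hypothetical helper)."""
    feed = {}
    for i, data_ph in enumerate(ops['pointclouds_phs']):
        lo = start + i * per_tower      # assumed: towers take contiguous slices
        hi = lo + per_tower
        feed[data_ph] = data[lo:hi, :, :]
        feed[ops['labels_phs'][i]] = labels[lo:hi]
        feed[ops['is_training_phs'][i]] = is_training
    return feed

# Stand-ins: strings replace TensorFlow placeholders, zeros replace real data.
ops = {
    'pointclouds_phs': ['pc0'],
    'labels_phs': ['lab0'],
    'is_training_phs': ['train0'],
}
data = np.zeros((8, 16, 9))
labels = np.zeros((8, 16), dtype=np.int64)
feed = build_feed_dict(ops, data, labels, start=0, per_tower=4, is_training=True)
```

With a single entry in each list, the loop runs once and never touches index 1.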

Semantic segmentation: model differs from paper

Thanks for your code contribution! I read the paper, and noticed that the EdgeConv architecture (Figure 3) suggested for semantic segmentation is different from the one implemented in the code. In particular, the code uses two parallel pooling operations (max and mean instead of only max). Could you tell which one was used to obtain the results in the paper, and perhaps comment on reasons for the difference? Was using both max and mean aggregation better than only using either one alone?

InternalError (see above for traceback): Blas xGEMM launch failed

Hello, thank you for your reply. When I try to run the sem_seg code, I get the error below. I have one GPU (8 GB of memory). Can you tell me how to solve this problem? Looking forward to your reply.
InternalError (see above for traceback): Blas xGEMM launch failed : a.shape=[1,4096,3], b.shape=[1,3,4096], m=4096, n=4096, k=3
[[Node: tower_0/MatMul = BatchMatMul[T=DT_FLOAT, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](tower_0/ExpandDims_1, tower_0/transpose)]]

Questions about transform nets

Hello Yue,
I have a question about transform networks: how should one design a transform net? I noticed that the transform net in your TensorFlow implementation is almost the same as the one in the charlesq34/pointnet implementation, while you use EdgeConv instead of 1x1 conv in your model.

Why bias=None in conv2d?

Hi! I have some small questions about the code.

  1. I noticed that bias is set to None in both PointNet and dgcnn. I'm curious why we do that? I can't find the explanation in these papers.
  2. Do we need to apply batch normalization on the input (before conv1)?

Thanks a lot if someone could help me!!!
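
On question 1, one general reason for dropping the bias is that a per-channel bias added right before batch normalization is removed again by the mean subtraction. Whether this is the authors' motivation is not stated anywhere in this thread, so take it as an assumption. The NumPy sketch below uses a simplified batch_norm without learned scale and shift to show the cancellation.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each channel (column) over the batch axis; no learned scale/shift."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
pre_activation = rng.standard_normal((32, 4))   # stand-in for a conv output without bias
bias = np.array([0.5, -1.0, 3.0, 10.0])         # hypothetical per-channel bias

without_bias = batch_norm(pre_activation)
with_bias = batch_norm(pre_activation + bias)
print(np.abs(without_bias - with_bias).max())
```

Adding a constant shifts the mean by the same constant and leaves the variance unchanged, so the normalized outputs match up to rounding; the learned shift of the batch norm layer can play the role of the bias.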

Serious problem. About label smoothing in loss

While examining the provided code, I found that a label smoothing term is applied to the cross-entropy loss.

However, the paper does not describe it.

I am unsure about the effect of this term. Are there any official results without the label smoothing term?
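
For reference, here is a minimal NumPy sketch of one common form of label-smoothed cross entropy, in which the true class gets weight 1 - eps and the remaining eps is spread evenly over the other classes. The function name, the default eps = 0.2, and the toy logits are assumptions for illustration and are not taken from the repository.

```python
import numpy as np

def smoothed_cross_entropy(logits, labels, eps=0.2):
    """Mean cross entropy against smoothed one-hot targets.

    logits: (B, n_class) array; labels: (B,) integer class indices.
    """
    n_class = logits.shape[1]
    one_hot = np.eye(n_class)[labels]
    target = one_hot * (1.0 - eps) + (1.0 - one_hot) * eps / (n_class - 1)
    # Numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(target * log_prob).sum(axis=1).mean())

confident = np.array([[10.0, 0.0, 0.0, 0.0]])
label = np.array([0])
plain_loss = smoothed_cross_entropy(confident, label, eps=0.0)
smoothed_loss = smoothed_cross_entropy(confident, label, eps=0.2)
uniform_loss = smoothed_cross_entropy(np.zeros((2, 4)), np.array([0, 3]))
```

With eps = 0 the function reduces to ordinary cross entropy, which makes it easy to run the same training script with and without smoothing and compare.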

Cuda memory

Hello,

I am trying to run the PyTorch example on a 6 GB CUDA card and I get the following message:

RuntimeError: CUDA out of memory. Tried to allocate 640.00 MiB (GPU 0; 5.94 GiB total capacity; 4.54 GiB already allocated; 415.44 MiB free; 143.32 MiB cached)

How can we run the examples on 6GB cards?

Thanks

visualization problem

Hi,
Thanks for sharing the work! I have a problem with visualization: how can I visualize results like those in Figure 4 of your paper?
Thanks a lot!

accuracy of ModelNet40

Hi,
I read your paper on arXiv a long time ago, in which the accuracy on ModelNet40 was reported as 92.2%. However, the accuracy is 92.9% in your ACM Transactions paper. I read your most recent code again and compared it with the code I downloaded before, but I could not figure out what has changed. Could you clarify what contributes to the 0.7% improvement?

Thanks.

What's the evaluation script?

The readme refers to running evaluate.py, but I couldn't find the script. I am guessing I am missing something obvious here? Where should I find the evaluate.py script?

Training with my own data

Hi there,

Could you kindly provide the code for generating the pointcloud for training and testing?

I would like to train with my own data.

Thanks!

segmentation experiments

Hi,
I have read your paper and like your work very much. Your work is simple and effective, and I look forward to your pytorch implementation of segmentation experiments.

Thanks.
