Yue Wang's personal website
wangyueft / dgcnn
License: MIT License
https://github.com/WangYueFt/dgcnn/blob/master/pytorch/model.py#L100
This line uses kernel size = (1, 1). Why not use kernel size = (1, self.k)?
I ran the PyTorch code with the following command:
python main.py --exp_name=dgcnn_1024 --model=dgcnn --num_points=1024 --k=20 --use_sgd=True
I always get results slightly worse than those reported in the paper, even though I take the best test result seen during training. The gap is larger for average accuracy (mean class accuracy).
Are there any special settings or tricks for running the code?
Thanks in advance.
Traceback (most recent call last):
  File "train.py", line 285, in <module>
    train()
  File "train.py", line 234, in train
    train_one_epoch(sess, ops, train_writer)
  File "train.py", line 267, in train_one_epoch
    ops['pointclouds_phs'][1]: current_data[start_idx_1:end_idx_1, :, :],
IndexError: list index out of range
The paper does not seem to mention that you concatenate the features of the first and second EdgeConv layers in your classification model. But in your code, the features output by the lower layers appear to be concatenated together:
net = tf_util.conv2d(tf.concat([net1, net2, net3, net4], axis=-1), 1024, [1, 1],
padding='VALID', stride=[1,1],
bn=True, is_training=is_training,
scope='agg', bn_decay=bn_decay)
Is this crucial for the classification results?
Thank you very much for sharing the code!
I am trying to use the ModelNet40 dataset with XYZ and normals, which has 6 dimensions. I changed input_dim from 3 to 6, but it doesn't work.
Are the input point cloud dimensions of DGCNN/EdgeConv fixed to 3 in the code and the paper?
How can I use ModelNet40 (XYZ and normals) with 6 input dimensions on DGCNN/EdgeConv?
Can you help me solve this problem?
I look forward to your reply!
I tried to run your script under pytorch, but there is no pretrained model for model.2048.t7.
Hi, thank you for sharing the code. In your paper, there is a categorical vector for the semantic segmentation network.
However, I could not find much description of this vector in the paper, or any implementation of it in the code.
dgcnn/tensorflow/sem_seg/model.py
Line 87 in 69b80a3
What is this categorical vector? Is it necessary?
Hi, thank you for releasing your code.
I noticed that there is no rotation augmentation in the preprocessing of the PyTorch implementation, which differs from the TF implementation. However, the best accuracy reaches 93.1. Does this mean that rotation augmentation is useless?
I want to build an auto-encoder that uses DGCNN as the backbone of the encoder, so the absence of rotation augmentation in the ModelNet classification really puzzles me. Does it mean that I do not need rotation augmentation even when DGCNN is used for reconstruction rather than classification?
Thanks a lot.
@WangYueFt I see that you compare your results with a baseline in the paper. As you mention, the baseline uses a fixed kNN graph rather than a dynamic graph. Could you explain the difference between a fixed kNN graph and a dynamic kNN graph?
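One way to read the distinction asked about here: a fixed graph computes the k nearest neighbours once from the input coordinates and reuses them in every layer, while a dynamic graph recomputes them from the current features before each layer. The sketch below illustrates that reading in NumPy. The names knn_indices and toy_layer, the layer widths, and the random weights are assumptions made for illustration and are not taken from the repository.

```python
import numpy as np

def knn_indices(feats, k):
    """Indices of the k nearest rows for each row of feats, shape (n, k)."""
    diff = feats[:, None, :] - feats[None, :, :]
    dist = np.sum(diff ** 2, axis=-1)            # squared Euclidean distances
    # Each point is its own nearest neighbour (distance 0), so it is included.
    return np.argsort(dist, axis=1)[:, :k]

def toy_layer(feats, idx, weight):
    """Hypothetical stand-in for one EdgeConv layer (no batch norm, no bias)."""
    neighbours = feats[idx]                      # (n, k, d)
    centre = np.broadcast_to(feats[:, None, :], neighbours.shape)
    edge = np.concatenate([neighbours - centre, centre], axis=-1)  # (n, k, 2d)
    return np.maximum(edge @ weight, 0.0).max(axis=1)              # (n, out)

rng = np.random.default_rng(0)
points = rng.normal(size=(16, 3))                # toy point cloud
w1 = rng.normal(size=(6, 8))
w2 = rng.normal(size=(16, 8))
k = 4

# Fixed graph: neighbours come from the input coordinates and are reused.
fixed_idx = knn_indices(points, k)
h1 = toy_layer(points, fixed_idx, w1)
fixed_out = toy_layer(h1, fixed_idx, w2)

# Dynamic graph: neighbours are recomputed in feature space for the next layer.
dyn_idx = knn_indices(h1, k)
dynamic_out = toy_layer(h1, dyn_idx, w2)
```

Both variants share the first layer here, because the first graph can only be built from the input coordinates; they differ in which neighbours the second layer sees.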
Hi,
Is there anyone who knows how to visualize the results of part segmentation? Many thanks for your help.
Best Regards
Frank
Hello,
I noticed that the architecture of your network has two branches, for seg and cls, which split after the first EdgeConv operation. In your code, I find that segmentation and classification are trained independently.
I wonder if there is a way to train seg and cls together with different losses, as shown in Figure 3 of your paper?
Hi!
In tensorflow/sem_seg/model.py there seems to be an undefined "weight_decay" variable used in the tf_util.conv2d(...) layers while building the model.
For example here:
out1 = tf_util.conv2d(edge_feature, 64, [1,1], padding='VALID', stride=[1,1], bn=True, is_training=is_training, weight_decay=weight_decay, scope='adj_conv1', bn_decay=bn_decay, is_dist=True)
Thanks for your time.
I am confused about the number of parameters.
The PyTorch version has 1.8M parameters, but the paper lists 21M.
How was that number obtained?
Thanks very much.
For ModelNet-40 classification task, how long do you guys need to run one epoch? On my PC, it took about 5 hours for 15 epochs with a Tesla k40c card. Do I need to use 2 or 3 GPUs to speed it up?
Hi, thanks for your helpful code. Just one painful question: I would like to integrate your code into my Keras implementation. The function get_edge_feature uses batch_size, which is 'None' in Keras (dynamic, for any batch size). Is it possible to generate point_cloud_neighbors without using your solution (where point_cloud is reshaped to (-1, num_dims))? I guess tf.scan could work, but I am still struggling with that. Many thanks!
Hi, thanks for sharing. I have a question about the get_graph_feature function of your PyTorch implementation.
def get_graph_feature(x, k=20, idx=None):
    batch_size = x.size(0)
    num_points = x.size(2)
    x = x.view(batch_size, -1, num_points)
    if idx is None:
        idx = knn(x, k=k)   # (batch_size, num_points, k)
    device = torch.device('cuda')
    idx_base = torch.arange(0, batch_size, device=device).view(-1, 1, 1)*num_points
    idx = idx + idx_base
    idx = idx.view(-1)
    _, num_dims, _ = x.size()
    # (batch_size, num_points, num_dims) -> (batch_size*num_points, num_dims)
    # batch_size * num_points * k + range(0, batch_size*num_points)
    x = x.transpose(2, 1).contiguous()
    feature = x.view(batch_size*num_points, -1)[idx, :]
    feature = feature.view(batch_size, num_points, k, num_dims)
    x = x.view(batch_size, num_points, 1, num_dims).repeat(1, 1, k, 1)
    feature = torch.cat((feature, x), dim=3).permute(0, 3, 1, 2)
    return feature
I think the way you compute the variable feature differs from the description in the paper. Here feature is only h_j, while the paper uses h_j - h_i, so perhaps you should do feature = feature - x before the concatenation?
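To make the difference concrete, here is a small NumPy sketch of the two variants discussed in this issue: concatenating the raw neighbour feature h_j with h_i, versus concatenating the difference h_j - h_i with h_i. The function graph_feature, its subtract_centre flag, and the (batch, num_points, num_dims) input layout are assumptions for illustration. This is not the repository's get_graph_feature.

```python
import numpy as np

def graph_feature(x, idx, subtract_centre=True):
    """Build edge features from points and neighbour indices.

    x:   (batch, num_points, num_dims) point features h_i
    idx: (batch, num_points, k) neighbour indices within each cloud
    Returns an array of shape (batch, 2 * num_dims, num_points, k).
    """
    batch = x.shape[0]
    batch_index = np.arange(batch)[:, None, None]
    neighbours = x[batch_index, idx]                       # h_j, (b, n, k, d)
    centre = np.broadcast_to(x[:, :, None, :], neighbours.shape)  # h_i
    first = neighbours - centre if subtract_centre else neighbours
    feature = np.concatenate([first, centre], axis=-1)     # (b, n, k, 2d)
    return feature.transpose(0, 3, 1, 2)

# Toy usage: every point takes point 0 of its own cloud as its only neighbours.
x = np.arange(24, dtype=float).reshape(2, 4, 3)
idx = np.zeros((2, 4, 2), dtype=int)
with_difference = graph_feature(x, idx)                     # [h_j - h_i, h_i]
without_difference = graph_feature(x, idx, subtract_centre=False)  # [h_j, h_i]
```

With subtract_centre=True the first half of the channels holds h_j - h_i, which matches the form the issue quotes from the paper.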
Hi,
I am trying to reproduce the results shown in the paper with your code, but I have not been able to.
Would you mind releasing your trained model for shapenet part segmentation task?
Thanks!
Hi,
Has anyone achieved a result of about 92%? Why did I only get 89%?
Many thanks.
Best Regards
Frank
Hi, when I run the TensorFlow code, I only get an accuracy of 91.2%. I read the paper published in 2018, and this result is the same as the baseline. I would like to know the reason. Thanks!
Hi, thank you for sharing. I have tried the PyTorch version of DGCNN and have some questions about your implementation in model.py:
Why do you first apply the convolution to x (the return value of get_graph_feature, with dimension [batch_size, feature_dim * 2, num_points, k]) and then take the max over the last dimension, instead of first computing the max feature and then applying a 1D convolution? Is there any insight into why this order is better?
In get_graph_feature, we could just return torch.max(feature, dim=-1, keepdim=False)[0]; then forward would need no max pooling and would use nn.Conv1d and nn.BatchNorm1d:
x = get_graph_feature(x, k=self.k)
x = self.conv1(x)
x1 = x
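The two orders are not equivalent, which a small NumPy experiment can show. In the sketch below a random matrix stands in for the shared 1x1 convolution and a ReLU stands in for the activation; the shapes and names are assumptions for illustration only. It does not show which order trains better, only that they compute different functions.

```python
import numpy as np

rng = np.random.default_rng(0)
num_points, k, in_dim, out_dim = 5, 4, 6, 3

# One feature vector per (point, neighbour) pair, as an edge-feature tensor.
edge_features = rng.normal(size=(num_points, k, in_dim))
# Shared weights applied to every edge, like a convolution with kernel (1, 1).
weight = rng.normal(size=(in_dim, out_dim))

def relu(v):
    return np.maximum(v, 0.0)

# Order 1: transform every edge, then take the max over the k neighbours.
conv_then_max = relu(edge_features @ weight).max(axis=1)

# Order 2: take the max over the neighbours first, then transform once per point.
max_then_conv = relu(edge_features.max(axis=1) @ weight)

print(np.abs(conv_then_max - max_then_conv).max())
```

Taking the max first collapses the neighbourhood channel by channel before any learned weights see the individual edges, so the learned function can no longer depend on each pair (h_i, h_j) separately. Convolving first costs k times more multiplications per layer.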
How can I measure the forward-pass time? Can you give a usage example?
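A common pattern for this kind of measurement is to run a few warm-up calls and then average over several timed calls. The helper below is a generic sketch using only the standard library; time_forward and its parameters are hypothetical names, and dummy_forward stands in for a real model call.

```python
import time

def time_forward(forward, inputs, warmup=3, iters=10, sync=None):
    """Return the mean wall-clock seconds per call of forward(inputs).

    sync is an optional zero-argument callable that waits for pending
    device work, so that asynchronous GPU kernels are included in the timing.
    """
    for _ in range(warmup):          # warm-up calls are not timed
        forward(inputs)
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(iters):
        forward(inputs)
    if sync is not None:
        sync()
    return (time.perf_counter() - start) / iters

def dummy_forward(values):
    # Stand-in for model(values); replace with the real forward pass.
    return sum(v * v for v in values)

seconds = time_forward(dummy_forward, list(range(1000)))
print(f"{seconds * 1e3:.3f} ms per forward pass")
```

With a PyTorch model on a GPU, one would typically call model.eval(), wrap the timing in torch.no_grad(), and pass sync=torch.cuda.synchronize, because CUDA kernels are launched asynchronously.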
I want to train the model with .stl data instead of .h5. How can I implement this?
Hi,
Thanks for releasing the code. I have a question about the computation of "pairwise_distance". In some previous works, the k nearest neighbors are found using the Euclidean distance, such as |x - xi|. I'm confused about the implementation of "pairwise_distance", i.e., "-xx - inner - xx.transpose(2, 1)". Could you please explain it?
Best regards
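The quoted expression is consistent with the expansion ||x_i - x_j||^2 = ||x_i||^2 - 2 x_i . x_j + ||x_j||^2, negated so that larger values mean closer points. The NumPy sketch below checks this numerically, assuming that inner holds -2 times the matrix of dot products and xx holds the squared norms. The issue does not show those definitions, so they are assumptions, as is the function name negative_squared_distance.

```python
import numpy as np

def negative_squared_distance(x):
    """x: (batch, num_dims, num_points) -> (batch, num_points, num_points)."""
    inner = -2.0 * np.matmul(x.transpose(0, 2, 1), x)   # -2 * x_i . x_j
    xx = np.sum(x ** 2, axis=1, keepdims=True)          # ||x_j||^2, (b, 1, n)
    # -(||x_j||^2 - 2 x_i . x_j + ||x_i||^2) = -||x_i - x_j||^2
    return -xx - inner - xx.transpose(0, 2, 1)

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 3, 10))                         # toy batch of clouds
fast = negative_squared_distance(x)

# Brute-force reference: explicit differences between every pair of points.
pts = x.transpose(0, 2, 1)                              # (b, n, d)
brute = -np.sum((pts[:, :, None, :] - pts[:, None, :, :]) ** 2, axis=-1)
print(np.allclose(fast, brute))

# The k largest entries in each row are the k nearest neighbours.
k = 4
idx = np.argsort(-fast, axis=-1)[:, :, :k]
```

Because the values are negated squared distances, a top-k selection of the largest entries returns the same neighbours as picking the k smallest Euclidean distances; the square root is monotonic and can be skipped.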
Thank you very much for open-sourcing your code! I have a question, though: for part_seg, why does test.py use point_num=3000 while train_multi_gpu.py uses point_num=2048?
Hi,
I would like to ask whether the DGCNN model can train and test without the category of the object (e.g. plane, vase, mug) being stated explicitly. I've looked through the source code and found that you have to indicate the category of the object explicitly before it can be segmented.
Also, can it segment an object into N parts? The current implementation seems to hard-code the number of parts for each category. I'm wondering whether it can achieve something similar to this: https://www.youtube.com/watch?v=6JQdNzsw7jA
Answers for former issues I wrote helped me a lot.
Now, I am trying to apply your model using my data.
I have (x,y,z) for point cloud and also I have mesh data.
I understand that to use PointNet or DGCNN, the input must be an .h5 file of points.
I searched a lot for how to make an .h5 file, but I failed.
What I am trying to do is,
Could you point me to some example code for the above case?
the code I tried:
https://github.com/charlesq34/pointnet/blob/master/utils/data_prep_util.py
https://github.com/IsaacGuan/PointNet-Plane-Detection/tree/master/data
sorry for my poor english skills
This issue may seem very stupid, but I would appreciate it if you could help it.
The paper does not seem to mention that you concatenate the features of the first and second EdgeConv layers in your classification model. But in your code, models/dgcnn.py, the features output by the lower layers appear to be concatenated together. Which variant was used to obtain the 92.2% accuracy?
net = tf_util.conv2d(tf.concat([net1, net2, net3, net4], axis=-1), 1024, [1, 1], padding='VALID', stride=[1,1], bn=True, is_training=is_training, scope='agg', bn_decay=bn_decay)
The ModelNet40 dataset comes in two forms: aligned and unaligned. Which form is used in the DGCNN paper?
Hi!
Thanks for sharing the code. I am reading the paper and found that the point cloud transform block (aka T-Net in the original PointNet paper) described there differs from what is implemented in the code.
In the paper, as shown in Fig. 3, the coordinate differences to the k nearest neighbors and the coordinates of the point are concatenated (therefore n x k x (3+3) = n x k x 6).
In the code, however, there is one additional max pooling along the number-of-points axis compared to the original PointNet implementation. I do not quite understand why.
If someone can enlighten me or share their thought on this, I would be very grateful.
Edit:
I think I figured out why the input of the T-Net is n x k x 6: the input has already gone through the feature transformation defined here. However, I am still puzzled by the additional max pooling operation compared to the original PointNet implementation.
Hello, Thank you for sharing this code, it's amazing!
Sorry, I have a question about train.py in the sem_seg folder.
When I run "sh +x train_job.sh", the console shows this error:
Traceback (most recent call last):
  File "train.py", line 289, in <module>
    train()
  File "train.py", line 238, in train
    train_one_epoch(sess, ops, train_writer)
  File "train.py", line 271, in train_one_epoch
    ops['pointclouds_phs'][1]: current_data[start_idx_1:end_idx_1, :, :],
IndexError: list index out of range
I checked the train.py parameters and found a probable cause, the number of GPUs used:
parser.add_argument('--num_gpu', type=int, default=1, help='the number of GPUs to use [default: 2]')
I have just one NVIDIA 1050Ti, so I changed default=2 to 1. Does that mean I have to buy more graphics cards to fix this problem?
THANKS a lot!
Thanks for sharing your awesome code.
I ran train.py following the README step by step, but when I run python train.py, I get an error: KeyError: "Unable to open object (object 'data' doesn't exist)". Here are the details:
I resolved all the dependency problems, but the above error keeps showing.
I can run PointNet (https://github.com/charlesq34/pointnet) without error, but I cannot run DGCNN...
Please help me, so I can study DGCNN further.
sorry for my poor english skills
@WangYueFt Thanks for your excellent work. While reading the sem_seg code, I ran into this problem: " ops['pointclouds_phs'][1]: current_data[start_idx_1:end_idx_1, :, :],
IndexError: list index out of range"
I am wondering about ops: why is it necessary to feed the data twice, and how can I solve this by changing the code?
feed_dict = {ops['pointclouds_phs'][0]: current_data[start_idx_0:end_idx_0, :, :],
ops['pointclouds_phs'][1]: current_data[start_idx_1:end_idx_1, :, :],
ops['labels_phs'][0]: current_label[start_idx_0:end_idx_0],
ops['labels_phs'][1]: current_label[start_idx_1:end_idx_1],
ops['is_training_phs'][0]: is_training,
ops['is_training_phs'][1]: is_training}
Looking forward to your reply!
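The hard-coded indices [0] and [1] in the feed_dict above assume two sets of placeholders, one per GPU tower, which would explain the IndexError when only one set exists. A minimal sketch of a feed dict built for however many towers there are is given below. The function build_feed_dict, the batch_idx argument, and the consecutive slicing are assumptions for illustration, since the issue does not show how start_idx_0 and start_idx_1 are computed. Plain strings stand in for the TensorFlow placeholders.

```python
def build_feed_dict(ops, data, labels, is_training, batch_idx, batch_size):
    """Build a feed dict with one consecutive slice per tower.

    ops maps 'pointclouds_phs', 'labels_phs' and 'is_training_phs' to lists
    of placeholders, one entry per tower (the key names follow the issue).
    """
    num_towers = len(ops['pointclouds_phs'])
    feed_dict = {}
    for tower in range(num_towers):
        # Assumed layout: towers read back-to-back chunks of batch_size items.
        start = (batch_idx * num_towers + tower) * batch_size
        end = start + batch_size
        feed_dict[ops['pointclouds_phs'][tower]] = data[start:end]
        feed_dict[ops['labels_phs'][tower]] = labels[start:end]
        feed_dict[ops['is_training_phs'][tower]] = is_training
    return feed_dict

# Toy usage with a single tower; strings stand in for placeholders.
single_gpu_ops = {
    'pointclouds_phs': ['pointclouds_0'],
    'labels_phs': ['labels_0'],
    'is_training_phs': ['is_training_0'],
}
feed = build_feed_dict(single_gpu_ops, list(range(8)), list(range(8)),
                       True, batch_idx=1, batch_size=4)
```

The rest of the training script would need the same treatment wherever it assumes exactly two towers, for example when computing the number of batches per epoch.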
Thanks for your code contribution! I read the paper, and noticed that the EdgeConv architecture (Figure 3) suggested for semantic segmentation is different from the one implemented in the code. In particular, the code uses two parallel pooling operations (max and mean instead of only max). Could you tell which one was used to obtain the results in the paper, and perhaps comment on reasons for the difference? Was using both max and mean aggregation better than only using either one alone?
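For readers unfamiliar with the two aggregation schemes being compared, the sketch below shows max pooling and mean pooling over the neighbour axis, concatenated along the channel axis as the issue describes. It is a NumPy illustration with an assumed (batch, num_points, k, channels) layout and a hypothetical function name, and it does not settle which variant produced the paper's results.

```python
import numpy as np

def max_mean_pool(edge_features):
    """Pool over the neighbour axis with both max and mean.

    edge_features: (batch, num_points, k, channels)
    Returns an array of shape (batch, num_points, 1, 2 * channels).
    """
    pooled_max = edge_features.max(axis=-2, keepdims=True)
    pooled_mean = edge_features.mean(axis=-2, keepdims=True)
    return np.concatenate([pooled_max, pooled_mean], axis=-1)

rng = np.random.default_rng(0)
edges = rng.normal(size=(2, 5, 4, 3))
pooled = max_mean_pool(edges)
max_only = edges.max(axis=-2, keepdims=True)    # the single-pool variant
```

Using both doubles the channel count fed to the next layer compared with max pooling alone.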
@WangYueFt @syb7573330 I could run the code successfully, but the code is running super slow. The speed is about 10 epochs/day. Do you have any idea about this problem or it is the normal speed for this code?
Hello, thank you for your reply. When I try to run the sem_seg code, I hit the problem below. I have one GPU (8 GB memory). Can you tell me how to solve this? Looking forward to your reply.
InternalError (see above for traceback): Blas xGEMM launch failed : a.shape=[1,4096,3], b.shape=[1,3,4096], m=4096, n=4096, k=3
[[Node: tower_0/MatMul = BatchMatMul[T=DT_FLOAT, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](tower_0/ExpandDims_1, tower_0/transpose)]]
I try to run: wget https://shapenet.cs.stanford.edu/media/indoor3d_sem_seg_hdf5_data.zip.
It failed and the website (https://shapenet.cs.stanford.edu/) seems to be closed.
Where else can I get these data? Or did I do something wrong?
Many thanks.
Hello Yue,
I have a question about transform networks: how should one design a transform net? I noticed that the transform net in your TensorFlow implementation is almost the same as in the charlesq/pointnet implementation, except that you use EdgeConv instead of 1x1 conv in your model.
Hi! I have some small questions about the code.
Thanks a lot if someone could help me!!!
Hi,
I found that there is no spatial transform module in pytorch implementation. Is it useless for your model?
Thanks
While examining the provided code,
I found that a label smoothing term is applied to the cross-entropy loss.
However, the paper does not describe this.
I am unsure about the effect of this term. Are there any official results without label smoothing?
@WangYueFt @syb7573330 Could you explain why we need to perform tf.concat here to combine the previous nets? It does not seem to be mentioned in the paper.
Also, does anyone understand this line? Please help.
Line 79 in 29948ad
Hello,
I am trying to run the PyTorch example on a 6 GB CUDA card and I get the following message:
RuntimeError: CUDA out of memory. Tried to allocate 640.00 MiB (GPU 0; 5.94 GiB total capacity; 4.54 GiB already allocated; 415.44 MiB free; 143.32 MiB cached)
How can we run the examples on 6GB cards?
Thanks
How do you visualize the extracted features? Can you send me your code for visualizing them? My email is [email protected]. Thank you very much.
Hi,
I read the paper you uploaded to arXiv a long time ago, in which the accuracy on ModelNet40 was reported as 92.2%. However, the accuracy is 92.9% in your ACM Transactions paper. So I read your most recent code again and compared it with the code I downloaded before, but I could not figure out what has changed. Could you clarify what contributed to the 0.7% improvement?
Thanks.
The readme refers to running evaluate.py, but I couldn't find the script. I am guessing I am missing something obvious here? Where should I find the evaluate.py script?
Hi there,
Could you kindly provide the code for generating the pointcloud for training and testing?
I would like to train with my own data.
Thanks!
Hi,
I have read your paper and like your work very much. Your work is simple and effective, and I look forward to your pytorch implementation of segmentation experiments.
Thanks.