✧ Homepage: https://yanx27.github.io/
✧ Google Scholar: https://scholar.google.com.hk/citations?hl=zh-CN&user=TK4Ty0gAAAAJ
PointNet and PointNet++ implemented in PyTorch (pure Python) and applied to ModelNet, ShapeNet and S3DIS.
License: MIT License
I get the following error when trying to train the classification model. I tried reducing the batch size and the number of points, but the issue persists.
Traceback (most recent call last):
File "train_cls.py", line 209, in <module>
main(args)
File "train_cls.py", line 171, in main
loss.backward()
File "...site-packages\torch\tensor.py", line 195, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "...site-packages\torch\autograd\__init__.py", line 99, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.
My conda environment is:
blas 1.0 mkl
ca-certificates 2019.11.27 0 anaconda
certifi 2019.11.28 py37_0 anaconda
cffi 1.13.2 py37h7a1dbc1_0
cudatoolkit 10.1.243 h74a9793_0
cudnn 7.6.5 cuda10.1_0 anaconda
freetype 2.9.1 ha9979f8_1
icc_rt 2019.0.0 h0cc432a_1
intel-openmp 2019.4 245
jpeg 9b vc14h4d7706e_1 [vc14] anaconda
libpng 1.6.37 h2a8f88b_0
libtiff 4.1.0 h56a325e_0
mkl 2019.4 245
mkl-service 2.3.0 py37hb782905_0
mkl_fft 1.0.15 py37h14836fe_0
mkl_random 1.1.0 py37h675688f_0
ninja 1.9.0 py37h74a9793_0
numpy 1.18.1 py37h93ca92e_0
numpy-base 1.18.1 py37hc3f5095_1
olefile 0.46 py37_0
openssl 1.1.1 he774522_0 anaconda
pillow 5.2.0 py37h08bbbbd_0
pip 20.0.2 py37_0
pycparser 2.19 py37_0
python 3.7.6 h60c2a47_2
pytorch 1.4.0 py3.7_cuda101_cudnn7_0 pytorch
setuptools 45.1.0 py37_0
six 1.14.0 py37_0
sqlite 3.30.1 he774522_0
tk 8.6.7 vc14hb68737d_1 [vc14] anaconda
torchvision 0.5.0 py37_cu101 pytorch
tqdm 4.42.0 py_0
vc 14.1 h0510ff6_4
vs2015_runtime 14.16.27012 hf0eaf9b_1
wheel 0.33.6 py37_0
wincertstore 0.2 py37_0
xz 5.2.4 h2fa13f4_4
zlib 1.2.11 vc14h1cdd9ab_1 [vc14] anaconda
zstd 1.3.7 h508b16e_0
Any idea?
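The error message itself points at non-contiguous inputs. As a minimal sketch of what that means (not a confirmed fix for this traceback), the snippet below shows that transposing a PyTorch tensor yields a non-contiguous view and that `.contiguous()` produces a densely packed copy. The tensor shape is an illustrative assumption and is not taken from train_cls.py.

```python
import torch

# Illustrative batch: 2 clouds, 1024 points, xyz (shape assumed for the example).
points = torch.randn(2, 1024, 3)

# Swapping the last two axes returns a view with permuted strides.
transposed = points.transpose(2, 1)          # shape [2, 3, 1024]
print(transposed.is_contiguous())            # False

# .contiguous() copies the data into a fresh, densely packed tensor.
compact = transposed.contiguous()
print(compact.is_contiguous())               # True
print(torch.equal(compact, transposed))      # True: same values, new layout
```

If a non-contiguous tensor turns out to be the trigger, calling `.contiguous()` on the input before the forward pass is one thing to try.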
Hi @yanx27, first of all thank you for this helpful work. I was first following the official TensorFlow repository by the original authors, but it didn't work well for me. I later moved to your repository (PyTorch implementation) and was able to run the training process without issues. I just have one question about the visualization. I assumed that the show3d_balls.py file is supposed to help us visualize the output of segmentation the way you have shown in the README.md file, but I find that the visualization code is not complete yet. Please correct me if I am wrong. What did you do to visualize the results of training the way you have shown in the README.md file? Did you use any external tools?
Thank you for your good work!
But when training, I get an error:
Traceback (most recent call last):
File "train_clf.py", line 14, in <module>
from data_utils.ModelNetDataLoader import ModelNetDataLoader, load_data
ImportError: No module named data_utils.ModelNetDataLoader
I don't know what the problem is. Could you kindly tell me how to solve it?
Hi,
I found that no validation data is used in your code. Is it fair to compare the accuracy with the original paper without using validation data?
Thanks!
With the model Pointnet_cls_msg.py, it seems that the CUDA memory is not enough for the dataset.
Did you use just one GPU?
I ran the code on ModelNet40 with Ubuntu 16.04, PyTorch 1.1 and a GTX 1070, but I can't reach the score that you report; I only get 88.5. Is it a dataset problem? I used the dataset that is automatically downloaded by the official PointNet code, which is 416M, instead of the dataset you provided, which is 1.9G. Or is it the effect of the input transform or feature transform? By the way, my batch size is 24. I really don't know where the problem is.
Thank you for your good work!
I am interested in the configuration used to get 0.52 mIoU on the S3DIS dataset: besides batch size and learning rate, the number of GPUs and whether multi-scale or single-scale grouping was used. Also, would it be convenient for you to provide the checkpoint.pth for the 0.52 mIoU model?
In the classification task, the function 'farthest_point_sample()' is defined to uniformly resample the point clouds, but why is the flag 'uniform' set to False?
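For readers unfamiliar with the technique, farthest point sampling can be sketched in NumPy as follows. This is a simplified single-cloud illustration, not the repository's `farthest_point_sample()`; the function name and the fixed `start` index are assumptions made for the example.

```python
import numpy as np

def farthest_point_sample_np(points, npoint, start=0):
    """Greedy farthest point sampling on one cloud.

    points: [N, 3] array; returns the indices of npoint selected points.
    """
    n = points.shape[0]
    selected = np.zeros(npoint, dtype=np.int64)
    # Squared distance from every point to its nearest already-selected point.
    min_dist = np.full(n, np.inf)
    farthest = start
    for i in range(npoint):
        selected[i] = farthest
        d = ((points - points[farthest]) ** 2).sum(axis=1)
        min_dist = np.minimum(min_dist, d)
        # Next pick: the point farthest from everything selected so far.
        farthest = int(np.argmax(min_dist))
    return selected

# Four points on a line; the sampler spreads its picks along it.
points = np.array([[0.0, 0, 0], [1.0, 0, 0], [2.0, 0, 0], [10.0, 0, 0]])
picked = farthest_point_sample_np(points, 3)
print(picked.tolist())  # [0, 3, 2]
```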
Hi, @yanx27 ,
Which version of s3dis dataset should be used to test your package (Pointnet_Pointnet2_pytorch)?
Stanford3dDataset_v1.2.zip or Stanford3dDataset_v1.2_Aligned_Version.zip?
According to indoor3d_util.py, Stanford3dDataset_v1.2_Aligned_Version is mentioned. However, the guide mentions Stanford3dDataset_v1.2.
Please clarify which version of the S3DIS dataset should be used.
Thanks!
Hi,
I noticed that the input to the semantic segmentation network has 9-dimensional features. Apart from the xyz coordinates and normals, what do the remaining 3 dimensions represent? Is it RGB?
Thank you
Cheers
When I run train_clf.py, I get:
KeyError: "Unable to open object (object 'data' doesn't exist)"
How can I solve this problem?
Thank you
Have you tested the speed gap between your implementation (I notice that it is all pure Python) and the CUDA implementations in other source code?
I am just curious about it. I hope you can give me a reply. Thanks!
Hi, @yanx27 ,
I found that the minimum coordinates are subtracted from the XYZ coordinates for coordinate normalization in indoor3d_util.py. Why not use the mean coordinates for normalization instead? Using the mean seems more reasonable.
THX!
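The two options differ only in the offset that is subtracted. A small NumPy sketch of both, using made-up coordinates (it does not reproduce indoor3d_util.py and says nothing about which choice trains better):

```python
import numpy as np

# Toy room: four points with illustrative coordinates.
xyz = np.array([[2.0, 4.0, 0.0],
                [4.0, 4.0, 1.0],
                [2.0, 8.0, 3.0],
                [4.0, 8.0, 0.0]])

# Min-shift: every coordinate becomes non-negative, origin at the room corner.
xyz_min_shift = xyz - xyz.min(axis=0)

# Mean-shift: coordinates are centred on the centroid and can be negative.
xyz_mean_shift = xyz - xyz.mean(axis=0)

print(xyz_min_shift.min(axis=0))    # [0. 0. 0.]
print(xyz_mean_shift.mean(axis=0))  # [0. 0. 0.]
```

Both are pure translations, so distances between points are unchanged; only the origin moves.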
When I tried the point2_sem_seg model on another dataset, the following error occurred:
device-side assert from cuda
From searching Google, this problem is probably caused by an index out of bounds. After I ran the model in CPU mode, it clearly pointed out the line where the error occurs:
==========index_points=========
points.shape: torch.Size([8, 512, 3])
idx.shape: torch.Size([8, 1024])
view_shape: [8, 1]
repeat_shape: [1, 1024]
==========index_points=========
points.shape: torch.Size([8, 512, 3])
idx.shape: torch.Size([8, 1024, 32])
view_shape: [8, 1, 1]
repeat_shape: [1, 1024, 32]
Traceback (most recent call last):
File "train_semseg.py", line 277, in <module>
main(args)
File "train_semseg.py", line 180, in main
seg_pred, trans_feat = classifier(points)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/idriver/work/wt/nuRadarScenes/models/pointnet2_sem_seg.py", line 26, in forward
l1_xyz, l1_points = self.sa1(l0_xyz, l0_points)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/idriver/work/wt/nuRadarScenes/models/pointnet_util.py", line 202, in forward
new_xyz, new_points = sample_and_group(self.npoint, self.radius, self.nsample, xyz, points)
File "/home/idriver/work/wt/nuRadarScenes/models/pointnet_util.py", line 135, in sample_and_group
grouped_xyz = index_points(xyz, idx) # [B, npoint, nsample, C]
File "/home/idriver/work/wt/nuRadarScenes/models/pointnet_util.py", line 64, in index_points
new_points = points[batch_indices, idx, :]
RuntimeError: index 512 is out of bounds for dim with size 512
So idx is out of bounds in the function index_points, where idx is produced by the function below:
def query_ball_point(radius, nsample, xyz, new_xyz):
    """
    Input:
        radius: local region radius
        nsample: max sample number in local region
        xyz: all points, [B, N, 3]
        new_xyz: query points, [B, S, 3]
    Return:
        group_idx: grouped points index, [B, S, nsample]
    """
    device = xyz.device
    B, N, C = xyz.shape
    _, S, _ = new_xyz.shape
    group_idx = torch.arange(N, dtype=torch.long).to(device).view(1, 1, N).repeat([B, S, 1])
    sqrdists = square_distance(new_xyz, xyz)
    group_idx[sqrdists > radius ** 2] = N
    group_idx = group_idx.sort(dim=-1)[0][:, :, :nsample]
    group_first = group_idx[:, :, 0].view(B, S, 1).repeat([1, 1, nsample])
    mask = group_idx == N
    group_idx[mask] = group_first[mask]
    return group_idx
When I change the two lines to group_idx[sqrdists > radius ** 2] = N-1 and mask = group_idx == N-1, it works!
So is that right?
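One way to see when the sentinel value N can survive into the returned indices is to replay the same steps on a tiny example. The sketch below is a single-batch NumPy re-creation of the quoted logic (the function name and toy coordinates are assumptions), not the repository code.

```python
import numpy as np

def query_ball_point_np(radius, nsample, xyz, new_xyz):
    """Single-batch NumPy version of the quoted ball query.

    xyz: [N, 3], new_xyz: [S, 3] -> group_idx: [S, nsample]
    """
    N = xyz.shape[0]
    S = new_xyz.shape[0]
    group_idx = np.tile(np.arange(N), (S, 1))
    sqrdists = ((new_xyz[:, None, :] - xyz[None, :, :]) ** 2).sum(-1)
    group_idx[sqrdists > radius ** 2] = N          # N marks "outside the ball"
    group_idx = np.sort(group_idx, axis=-1)[:, :nsample]
    group_first = np.repeat(group_idx[:, :1], nsample, axis=1)
    mask = group_idx == N
    group_idx[mask] = group_first[mask]            # pad with the first neighbour
    return group_idx

xyz = np.array([[0.0, 0, 0], [0.1, 0, 0], [0.2, 0, 0], [5.0, 5, 5]])

# Query with two neighbours inside the ball: padding works as intended.
idx_near = query_ball_point_np(0.15, 3, xyz, np.array([[0.0, 0, 0]]))
print(idx_near.tolist())  # [[0, 1, 0]]

# Query with no neighbour at all: the first entry is N, so N is padded with N.
idx_far = query_ball_point_np(0.15, 3, xyz, np.array([[100.0, 100, 100]]))
print(idx_far.tolist())   # [[4, 4, 4]]  (N = 4 is out of bounds for xyz)
```

In this sketch the out-of-range index only appears when a query point has no neighbour inside the radius. Using N-1 as the sentinel avoids the crash, but such a query then silently falls back to the last point, so it may be worth checking why some neighbourhoods are empty in the new dataset.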
Dear author, thank you for sharing the code.
How can I visualize the semantic segmentation results on the S3DIS dataset?
Could you give me some suggestions?
I visualized some shapes (point cloud files) and found that most of them are aligned.
I have trained PointNet++ using the command below.
python train_partseg.py --model_name pointnet2
I see it also reports the test accuracy after each epoch and the final best test accuracy.
Now, if I only want to run the test and measure the accuracy again, how can I do it?
Hi, @yanx27 ,
I followed the steps for training on the S3DIS (semantic segmentation) dataset and then ran testing. However, the test result is only 0.329263 mIoU (using the pointnet2_sem_seg model), which is much lower than the reported 0.532 mIoU. I also found that the trained model segments the classes ceiling, floor and wall well, while its accuracy on the classes beam, column and window is almost zero. Is there something wrong in the dataset preparation?
Here are the logs on my training and testing: pointnet2_sem_seg.txt, eval.txt
THX!
@yanx27 Thanks for this very useful repo.
I'm trying to use the test code for semantic segmentation in test_semseg.py.
For the following block:
I am getting the error below:
Traceback (most recent call last):
File "test_semseg.py", line 204, in <module>
main(args)
File "test_semseg.py", line 133, in main
batch_data[0:real_batch_size, ...] = scene_data[start_idx:end_idx, ...]
TypeError: can't assign a numpy.ndarray to a Variable[CUDAType]
I think this happens because batch_data is converted to a CUDA tensor in the very first iteration of the for loop, and in later iterations that same CUDA tensor is assigned a new set of block data via scene_data[start_idx:end_idx, ...], which is still a NumPy array. Could you please confirm whether my understanding of the error is correct? I can rewrite the loop accordingly.
TIA
Hello,
Thanks a lot for the accessible code.
Can you please share with us the pretrained model for the classification test on ModelNet?
Thanks a lot
Hello.
I have the questions mentioned in the title.
How was the 'num_vote' option used when reporting your results?
Is the implementation of 'num_vote' appropriate? Is the randomness inside the 'classifier' instance?
For comparison, the original PointNet++ implementation by the authors introduces randomness when loading the test data.
An answer would make the reported results more convincing to others.
thank you.
Two questions about the visualization part.
pc_utils.py cannot run properly. Based on the code, the reason is that the load_data function is not defined in ShapeNetDataLoader.py.
What is the functionality of the show3d_balls.py file? According to the code, is it correct that the showpoints function shows the point cloud with different colors according to the points' gt and pred labels? Could you give an illustrated example with gt and pred labels?
Thanks!
Hello, I appreciate this repository very much. I found that you use 'Variable' in your code. Which version of PyTorch do you use? Thank you very much!
I want to use your code to test the PointNet2 network. I feed in a point cloud file with shape [1,3,2048], and the following error appears: "\model\pointnet_util.py", line 79, in farthest_point_sample centroid = xyz[batch_indices, farthest, :].view( B, 1, 3) RuntimeError: shape '[1, 1, 3]' is invalid for input of size 2047.
Thanks for your work!
The data cache for the ModelNet40 dataloader is not working, see https://discuss.pytorch.org/t/dataloader-re-initialize-dataset-after-each-iteration/32658/4
I'd recommend adding the supported or suggested environment configuration for the beginners who want to give the code a try.
Pointnet_Pointnet2_pytorch/model/pointnet.py
Line 242 in 31deedb
nll_loss is used, but the input labels_pred from here is not a log_softmax output, causing a negative label loss value.
Pointnet_Pointnet2_pytorch/model/pointnet.py
Line 209 in 31deedb
Either applying log_softmax before nll_loss or changing nll_loss to cross_entropy can solve it.
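The sign problem can be reproduced with plain arithmetic. The sketch below re-implements the two steps in NumPy (the helper names and the toy logits are assumptions, and no PyTorch call is involved) to show that a negative log-likelihood applied to raw scores can go negative, while the same loss applied to log-softmax output cannot.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the class axis."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def nll(inputs, labels):
    """Mean of -inputs[i, labels[i]], which is what an NLL loss computes."""
    return -inputs[np.arange(len(labels)), labels].mean()

logits = np.array([[4.0, 1.0, 0.0],
                   [0.5, 3.0, 0.2]])
labels = np.array([0, 1])

raw = nll(logits, labels)                  # raw scores fed directly to NLL
fixed = nll(log_softmax(logits), labels)   # log-probabilities fed to NLL

print(raw)         # -3.5
print(fixed >= 0)  # True
```

Log-probabilities are never positive, so their negated mean is never negative; raw scores carry no such guarantee.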
Hi @yanx27 ! Thank you for your work!
There are two classes related to segmentation. Can they both be used for the segmentation task?
I find that you only use the class PointNet2PartSeg in train_seg.py.
Hi,thanks for your code, it's really nice.
I have one question about the evaluation method for part segmentation. I noticed that at line 111 of test_partseg.py, you calculate the accuracy with the following code:
cur_pred_val[i, :] = np.argmax(logits[:, seg_classes[cat]], 1) + seg_classes[cat][0]
This assumes that, at test time, we already know which parts each category contains. I think this may make the test easier. Is this implementation the same as the official one?
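The effect of restricting the argmax to a category's own parts can be illustrated with a toy example. The part ranges and logits below are made up for illustration and are not the script's real data; only the indexing expression mirrors the quoted line.

```python
import numpy as np

# Illustrative mapping from category to its contiguous part label ids.
seg_classes = {"Airplane": [0, 1, 2, 3], "Bag": [4, 5]}

# Per-point scores over all 6 parts for one "Bag" shape with 3 points.
logits = np.array([[0.1, 0.9, 0.0, 0.0, 0.3, 0.2],
                   [0.0, 0.0, 0.0, 0.0, 0.1, 0.8],
                   [0.5, 0.0, 0.0, 0.0, 0.2, 0.4]])
cat = "Bag"

# Restricted prediction: argmax only over the parts of the known category,
# then shifted back to global part ids (same pattern as the quoted line).
pred_restricted = np.argmax(logits[:, seg_classes[cat]], 1) + seg_classes[cat][0]

# Unrestricted prediction: argmax over every part of every category.
pred_global = np.argmax(logits, 1)

print(pred_restricted.tolist())  # [4, 5, 5]
print(pred_global.tolist())      # [1, 5, 0]
```

With the restriction, a point can never be assigned a part from another category, which is why the two predictions differ for the first and third points.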
Hi, when I run python train_semseg.py --gpu 1 --log_dir pointnet_sem_seg, I get the following error:
RuntimeError: Given groups=1, weight of size 64 6 1, expected input[16, 9, 4096] to have 6 channels, but got 9 channels instead
Could you tell me how to solve it? Thanks a lot!
Hi,
I noticed, that the data path for semantic segmentation is expected to be
data/s3dis/Stanford3dDataset_v1.2_Aligned_Version/
(DATA_PATH = os.path.join(ROOT_DIR, 'data','s3dis', 'Stanford3dDataset_v1.2_Aligned_Version'))
unlike the one mentioned in the Readme, which is
data/Stanford3dDataset_v1.2_Aligned_Version/
Best regards,
Philipp
Thank you for your good work!
I want to view the ShapeNet dataset, but when I run the code I get the following:
from data_utils.ShapeNetDataLoader import load_data
ImportError: cannot import name 'load_data'
Hi,
Running train_semseg.py as instructed in the README gives me the following errors:
python train_semseg.py --model pointnet2_sem_seg --test_area 5 --log_dir pointnet2_sem_seg
PARAMETER ...
Namespace(batch_size=16, decay_rate=0.0001, epoch=128, gpu='0', learning_rate=0.001, log_dir='pointnet2_sem_seg', lr_decay=0.7, model='pointnet2_sem_seg', npoint=4096, optimizer='Adam', step_size=10, test_area=5)
start loading training data ...
[1.124833 1.1816078 1. 2.2412012 2.340336 2.343587 1.7070498
2.0335796 1.8852289 3.8252103 1.7948895 2.7857335 1.3452303]
Totally 47623 samples in train set.
start loading test data ...
[1.1381457 1.2059734 1. 9.996554 2.5299199 2.0086675 2.1162353
1.9657742 2.4815738 4.727607 1.4018297 2.8840992 1.4809785]
Totally 18923 samples in test set.
The number of training data is: 47623
The number of test data is: 18923
No existing model, starting training from scratch...
**** Epoch 1 (1/128) ****
Learning rate:0.001000
BN momentum updated to: 0.100000
Traceback (most recent call last):
File "train_semseg.py", line 274, in <module>
main(args)
File "train_semseg.py", line 168, in main
for i, data in tqdm(enumerate(trainDataLoader), total=len(trainDataLoader), smoothing=0.9):
File "C:\Users\API\AppData\Local\Programs\Python\Python36\lib\site-packages\torch\utils\data\dataloader.py", line 278, in __iter__
return _MultiProcessingDataLoaderIter(self)
File "C:\Users\API\AppData\Local\Programs\Python\Python36\lib\site-packages\torch\utils\data\dataloader.py", line 682, in __init__
w.start()
File "C:\Users\API\AppData\Local\Programs\Python\Python36\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Users\API\AppData\Local\Programs\Python\Python36\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Users\API\AppData\Local\Programs\Python\Python36\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Users\API\AppData\Local\Programs\Python\Python36\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "C:\Users\API\AppData\Local\Programs\Python\Python36\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'main..'
Traceback (most recent call last):
File "", line 1, in
File "C:\Users\API\AppData\Local\Programs\Python\Python36\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Users\API\AppData\Local\Programs\Python\Python36\lib\multiprocessing\spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
Any ideas? Thank you.
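The `Can't pickle local object` message is typical of Windows, where DataLoader worker processes are started with spawn and everything handed to them must be picklable; a lambda or nested function defined inside `main` is not. Whether that is what train_semseg.py passes to its DataLoader is an assumption here. A standard-library sketch of the difference (all function names are hypothetical):

```python
import pickle

def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

def seed_worker(worker_id):
    """Module-level function: pickle stores it by its importable name."""
    return worker_id

def main():
    # A lambda created inside a function is a "local object" and has no
    # importable name, so pickle refuses it.
    local_fn = lambda worker_id: worker_id
    return can_pickle(local_fn)

local_is_picklable = main()
print(local_is_picklable)  # False
```

Moving such a callable to module level (like `seed_worker` above) or running the loader with `num_workers=0` are two common ways around this kind of error.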
Getting the following error in the feature transform regularizer for train_sem_seg.py.
RuntimeError: Could not run 'aten::conj.out' with arguments from the 'CUDATensorId' backend. 'aten::conj.out' is only available for these backends: [CPUTensorId, VariableTensorId].
Any suggestions for how to fix?
When I ran train_partseg.py with the command:
python train_partseg.py --multi_gpu="1, 2" --model_name='pointnet2' --batchsize=16 --epoch=130 --step_size=30 --optimizer='Adam'
the program stopped at the first iteration of the progress bar, and I couldn't even kill the process.
When I trained on a single GPU with:
python train_partseg.py --gpu="2" --model_name='pointnet2' --batchsize=16 --epoch=130 --step_size=30 --optimizer='Adam'
it ran successfully.
I don't know what the problem is. Could you kindly tell me how to solve it?
Hello,
Thanks for your great work!
I have a small question: why do you switch the 2nd and 3rd channels of the input points during training? (e.g. line 161 in train_cls.py)
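If the line in question is `points = points.transpose(2, 1)` (an assumption; the line itself is not quoted in the issue), it swaps tensor axes rather than coordinate channels: a batch stored as [B, N, C] becomes [B, C, N], the channels-first layout that 1D convolutions expect. A NumPy sketch of the same axis swap, with illustrative sizes:

```python
import numpy as np

# Illustrative batch: 24 clouds, 1024 points, 3 coordinates per point.
batch = np.arange(24 * 1024 * 3, dtype=np.float32).reshape(24, 1024, 3)

# Swap axes 1 and 2: [B, N, C] -> [B, C, N].
channels_first = np.transpose(batch, (0, 2, 1))
print(channels_first.shape)  # (24, 3, 1024)

# The xyz order of each individual point is untouched by the swap.
print(np.array_equal(channels_first[0, :, 5], batch[0, 5, :]))  # True
```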
@yanx27 thanks for your work!
Are there any differences between tf_utils (such as FPS and ball query) in the PyTorch version and in the CUDA version?
The mean class accuracy is not reported in the paper; may I ask why?
It is difficult to reach the accuracy of 90.7 when reproducing the experiment, so I would like to ask about the mean class accuracy.
Thanks!
Dear author, thank you for sharing the code.
When I run train_semseg.py --model pointnet2_sem_seg --test_area 5 --log_dir pointnet2_sem_seg,
there is an error: Given groups=1, weight of size 32 9 1 1, expected input[16, 12, 32, 1024] to have 9 channels, but got 12 channels instead.
I used the Stanford3dDataset_v1.2_Aligned_Version dataset.
I don't know how to solve the problem. Could you kindly tell me how to solve it?
thank you
I have solved the problem. Thank you.
Hello, has anyone obtained the PointNet++ part segmentation results noted in README.md?
I only get around 79 / 84 for avg-class mIoU / avg-instance mIoU.
The avg-instance mIoU is acceptable, but I don't know why the avg-class mIoU is worse.
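The two averages weight the shapes differently, which is why they can drift apart. A standard-library sketch with made-up per-shape IoU values (the numbers and category sizes are illustrative assumptions, not results from this repository):

```python
from statistics import mean

# Hypothetical per-shape IoUs, grouped by category.
shape_ious = {
    "Airplane": [0.80, 0.90, 0.85, 0.85],  # large category
    "Cap": [0.50],                         # small, harder category
}

# Instance average: every shape counts once, so large categories dominate.
instance_miou = mean(iou for ious in shape_ious.values() for iou in ious)

# Class average: every category counts once, however few shapes it has.
class_miou = mean(mean(ious) for ious in shape_ious.values())

print(round(instance_miou, 3))  # 0.78
print(round(class_miou, 3))     # 0.675
```

A few small categories with low IoU therefore pull the class average down much more than the instance average.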
`def forward(self, xyz, cls_label):
    # Set Abstraction layers
    B,C,N = xyz.shape
    if self.normal_channel:
        l0_points = xyz
        l0_xyz = xyz[:,:3,:]
    else:
        l0_points = xyz
        l0_xyz = xyz
`
Here, why does l0_points include the position vector as well? Shouldn't it have 3 channels when normals are used and be None otherwise?