pytorchemd's Issues

EMD is large and remains unchanged between epochs

Hi! Thank you for your great work!
I have some questions and need your help.

  1. I got a large EMD (from 100~1000) when training the model, and it stayed essentially constant throughout training (changes smaller than 1). I divided the EMD loss by the number of points (1200~1600; the clouds are centered but not normalized). Do I need to normalize the point coordinates to the range 0~1 before passing them to EMD?
  2. In test_emd_loss.py, the loss is computed as 'loss = d[0] / 2 + d[1] * 2 + d[2] / 3'. Why do the coefficients differ across the elements of d?

Could you please give some advice?
Thanks!
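
On the first question, one thing worth trying is to center each cloud and scale it into the unit sphere before computing the loss, so that the magnitude of the distance no longer depends on the units of the data. Below is a minimal sketch in plain Python, so it runs without a GPU. `normalize_unit_sphere` is a hypothetical helper and not part of PyTorchEMD, and whether normalization resolves the plateau described above is an assumption.

```python
import math

def normalize_unit_sphere(points):
    """Center a point cloud and scale it so the farthest point has norm 1.

    `points` is a list of (x, y, z) tuples. Hypothetical helper for
    illustration; not part of PyTorchEMD.
    """
    n = len(points)
    # Centroid of the cloud.
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    centered = [(p[0] - cx, p[1] - cy, p[2] - cz) for p in points]
    # Largest distance from the centroid; a degenerate cloud is returned as-is.
    radius = max(math.sqrt(x * x + y * y + z * z) for x, y, z in centered)
    if radius == 0.0:
        return centered
    return [(x / radius, y / radius, z / radius) for x, y, z in centered]

cloud = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 20.0, 0.0), (0.0, 0.0, 30.0)]
normalized = normalize_unit_sphere(cloud)
largest = max(math.sqrt(x * x + y * y + z * z) for x, y, z in normalized)
print(round(largest, 6))
```

For a batched tensor of shape (B, N, 3), the same two steps would be subtracting `x.mean(dim=1, keepdim=True)` and dividing by the largest per-cloud norm.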

Any updates for CUDA?

Is there any way to install this with CUDA 11.8 and PyTorch 2.2? I've followed the tips in #6 and was able to install but when running I get the following error:

    match = emd_cuda.approxmatch_forward(xyz1, xyz2)

     RuntimeError: Unknown layout

EMD runtime error: Please let me know what is the error as the same code works perfectly in a different project on the same PC

match = emd_cuda.approxmatch_forward(xyz1, xyz2)
IndexError: Dimension out of range (expected to be in range of [-2, 1], but got 2)

This is the error I get with the following input:

EMD = earth_mover_distance().cuda()
np1 = np.asarray(TSS.points)
np2 = np.asarray(RSS.points)
T1 = torch.from_numpy(np1).type(torch.float32).cuda()
T2 = torch.from_numpy(np2).type(torch.float32).cuda()
l1 = EMD(T2, T2)
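
The `IndexError` says that a tensor with two dimensions was indexed at dimension 2. One plausible reading, which is an assumption rather than a confirmed diagnosis, is that the extension expects batched input of shape (B, N, 3), while `np.asarray(...)` on a single point set yields (N, 3). A minimal sketch of adding the batch axis before converting to a tensor; `as_batch` is a hypothetical helper:

```python
import numpy as np

def as_batch(points):
    """Return points as a float32 array of shape (B, N, 3).

    Accepts an (N, 3) or (B, N, 3) array-like. Hypothetical helper,
    not part of PyTorchEMD.
    """
    arr = np.asarray(points, dtype=np.float32)
    if arr.ndim == 2:
        arr = arr[None, ...]  # add the leading batch axis
    if arr.ndim != 3 or arr.shape[-1] != 3:
        raise ValueError(f"expected (N, 3) or (B, N, 3), got {arr.shape}")
    return arr

cloud = np.random.rand(2048, 3)  # stand-in for np.asarray(TSS.points)
batched = as_batch(cloud)
print(batched.shape)  # (1, 2048, 3)
```

`torch.from_numpy(batched)` keeps this shape, so the rest of the snippet above would not need to change.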

RuntimeError: CUDA error: an illegal memory access was encountered

hi,
Thanks for your excellent work!
However, when the number of points increased, I encountered a problem:

'''
p1 = torch.rand(1,50000, 3).cuda()
p2 = torch.rand(1,50000, 3).cuda()

d = earth_mover_distance(p1, p2, transpose=False)
print(d)

'''
RuntimeError: CUDA error: an illegal memory access was encountered

Can you provide any suggestions?

Any alternatives for Mac

Hey, I'm using a Mac, and since this repository depends on CUDA, I'm unable to set up the project. It would really help if someone knows of an alternative for Mac. Thanks.

What is the range value of EMD distance?

Hello, @daerduoCarey

I read and tested the EMD code. It seems the distance is a sum over n*m terms, so if n and m are large (say >= 10000), the EMD is also very large.

Should we normalize it by dividing by n*m when using EMD as a loss? Thanks!
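
To see how the value scales with the number of points, here is a small reference computation. It assumes equal-sized clouds and a one-to-one matching, in which case there are n matched pairs and dividing the sum by n gives the mean per-point distance. PyTorchEMD itself computes an approximate matching on the GPU, so its numbers will differ. `exact_emd_sum` is a hypothetical brute-force helper that is only usable for a handful of points.

```python
import itertools
import math

def exact_emd_sum(a, b):
    """Minimum total Euclidean distance over all one-to-one matchings of a to b.

    Brute force over permutations, so only for tiny clouds. Hypothetical
    reference helper, not the approximate algorithm used by PyTorchEMD.
    """
    assert len(a) == len(b), "this sketch assumes equal-sized clouds"
    best = math.inf
    for perm in itertools.permutations(range(len(b))):
        total = sum(math.dist(a[i], b[j]) for i, j in enumerate(perm))
        best = min(best, total)
    return best

# Every point in b is the corresponding point in a shifted by 0.5 along x.
a = [(float(i), 0.0, 0.0) for i in range(6)]
b = [(x + 0.5, y, z) for x, y, z in a]

total = exact_emd_sum(a, b)   # grows with the number of points
per_point = total / len(a)    # stays at the per-point displacement
print(total, per_point)
```

With a constant per-point shift, the summed value grows linearly with the number of points while the per-point average does not, which is why an unnormalized sum looks very large next to a Chamfer distance that is averaged over points.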

Updated CUDA and PyTorch versions

Hello, I am interested in this code but I have CUDA 10.1 version and PyTorch 1.5. Is there a way to modify the setup accordingly ?

Thank you in advance

cuda runtime error (98) : invalid device function

THCudaCheck FAIL file=/home/user/hdd/github/PyTorchEMD/cuda/emd_kernel.cu line=190 error=98 : invalid device function
Traceback (most recent call last):
  File "test_emd_loss.py", line 35, in <module>
    d = earth_mover_distance(p1, p2, transpose=False)
  File "/home/user/hdd/github/PyTorchEMD/emd.py", line 44, in earth_mover_distance
    cost = EarthMoverDistanceFunction.apply(xyz1, xyz2)
  File "/home/user/hdd/github/PyTorchEMD/emd.py", line 11, in forward
    match = emd_cuda.approxmatch_forward(xyz1, xyz2)
RuntimeError: cuda runtime error (98) : invalid device function at /home/user/hdd/github/PyTorchEMD/cuda/emd_kernel.cu:190
Segmentation fault (core dumped)

I hit this error when running test_emd_loss.py after compiling successfully. The environment is PyTorch 1.7, CUDA 10.2, Ubuntu 18.04.
After searching for a while, I realized that the CUDA versions were inconsistent between my conda env (cudatoolkit=10.2) and the local CUDA (/usr/local/cuda-10.0).
After installing CUDA 10.2 on the local machine, it works.

Just recording it in case it helps anyone.
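
The mismatch described above can be checked before building. The sketch below compares the release reported by `nvcc --version` with the CUDA version PyTorch was built against (`torch.version.cuda`). The helper names are hypothetical, and the sample string only mimics the format of nvcc's output so that the snippet runs without a GPU or toolkit installed.

```python
import re

def nvcc_release(nvcc_output):
    """Extract 'major.minor' from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no CUDA release found in nvcc output")
    return f"{match.group(1)}.{match.group(2)}"

def same_cuda(nvcc_output, torch_cuda):
    """True when the local toolkit and the PyTorch build agree on major.minor."""
    return nvcc_release(nvcc_output) == ".".join(torch_cuda.split(".")[:2])

# Sample text in the format nvcc prints. On a real machine, pass the output of
# `nvcc --version` and the value of `torch.version.cuda` instead.
sample = "Cuda compilation tools, release 10.0, V10.0.130"
print(same_cuda(sample, "10.2"))  # False: the situation reported above
print(same_cuda(sample, "10.0"))  # True
```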

installation problem

Thank you for your nice code! However, I ran into an installation problem and cannot use it. Could you help me figure out how to fix it?

I use Ubuntu 16.04, PyTorch 1.1, GCC 5.5, CUDA 9.0. Should I change the GCC version?
Here's the installation log:

running install
running bdist_egg
running egg_info
writing emd_ext.egg-info/PKG-INFO
writing dependency_links to emd_ext.egg-info/dependency_links.txt
writing top-level names to emd_ext.egg-info/top_level.txt
reading manifest file 'emd_ext.egg-info/SOURCES.txt'
writing manifest file 'emd_ext.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_ext
building 'emd_cuda' extension
creating build
creating build/temp.linux-x86_64-3.6
creating build/temp.linux-x86_64-3.6/cuda
gcc -pthread -B /home/pm/anaconda3/envs/PyTorchEMD/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/pm/anaconda3/envs/PyTorchEMD/lib/python3.6/site-packages/torch/include -I/home/pm/anaconda3/envs/PyTorchEMD/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/home/pm/anaconda3/envs/PyTorchEMD/lib/python3.6/site-packages/torch/include/TH -I/home/pm/anaconda3/envs/PyTorchEMD/lib/python3.6/site-packages/torch/include/THC -I/usr/local/cuda-9.0/include -I/home/pm/anaconda3/envs/PyTorchEMD/include/python3.6m -c cuda/emd.cpp -o build/temp.linux-x86_64-3.6/cuda/emd.o -g -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=emd_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/local/cuda-9.0/bin/nvcc -I/home/pm/anaconda3/envs/PyTorchEMD/lib/python3.6/site-packages/torch/include -I/home/pm/anaconda3/envs/PyTorchEMD/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/home/pm/anaconda3/envs/PyTorchEMD/lib/python3.6/site-packages/torch/include/TH -I/home/pm/anaconda3/envs/PyTorchEMD/lib/python3.6/site-packages/torch/include/THC -I/usr/local/cuda-9.0/include -I/home/pm/anaconda3/envs/PyTorchEMD/include/python3.6m -c cuda/emd_kernel.cu -o build/temp.linux-x86_64-3.6/cuda/emd_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --compiler-options '-fPIC' -O2 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=emd_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9220): error: argument of type "const void *" is incompatible with parameter of type "const float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9231): error: argument of type "const void *" is incompatible with parameter of type "const float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9244): error: argument of type "const void *" is incompatible with parameter of type "const double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9255): error: argument of type "const void *" is incompatible with parameter of type "const double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9268): error: argument of type "const void *" is incompatible with parameter of type "const float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9279): error: argument of type "const void *" is incompatible with parameter of type "const float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9292): error: argument of type "const void *" is incompatible with parameter of type "const double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9303): error: argument of type "const void *" is incompatible with parameter of type "const double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9316): error: argument of type "const void *" is incompatible with parameter of type "const int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9327): error: argument of type "const void *" is incompatible with parameter of type "const int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9340): error: argument of type "const void *" is incompatible with parameter of type "const long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9352): error: argument of type "const void *" is incompatible with parameter of type "const long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9365): error: argument of type "const void *" is incompatible with parameter of type "const int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9376): error: argument of type "const void *" is incompatible with parameter of type "const int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9389): error: argument of type "const void *" is incompatible with parameter of type "const long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9401): error: argument of type "const void *" is incompatible with parameter of type "const long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9410): error: argument of type "void *" is incompatible with parameter of type "float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9419): error: argument of type "void *" is incompatible with parameter of type "float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9428): error: argument of type "void *" is incompatible with parameter of type "double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9437): error: argument of type "void *" is incompatible with parameter of type "double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9445): error: argument of type "void *" is incompatible with parameter of type "float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9454): error: argument of type "void *" is incompatible with parameter of type "float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9463): error: argument of type "void *" is incompatible with parameter of type "double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9472): error: argument of type "void *" is incompatible with parameter of type "double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9481): error: argument of type "void *" is incompatible with parameter of type "int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9490): error: argument of type "void *" is incompatible with parameter of type "int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9499): error: argument of type "void *" is incompatible with parameter of type "long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9508): error: argument of type "void *" is incompatible with parameter of type "long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9517): error: argument of type "void *" is incompatible with parameter of type "int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9526): error: argument of type "void *" is incompatible with parameter of type "int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9535): error: argument of type "void *" is incompatible with parameter of type "long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9544): error: argument of type "void *" is incompatible with parameter of type "long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512pfintrin.h(55): error: argument of type "const void *" is incompatible with parameter of type "const long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512pfintrin.h(63): error: argument of type "const void *" is incompatible with parameter of type "const int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512pfintrin.h(73): error: argument of type "const void *" is incompatible with parameter of type "const long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512pfintrin.h(81): error: argument of type "const void *" is incompatible with parameter of type "const int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512pfintrin.h(91): error: argument of type "void *" is incompatible with parameter of type "const long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512pfintrin.h(100): error: argument of type "void *" is incompatible with parameter of type "const int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512pfintrin.h(109): error: argument of type "void *" is incompatible with parameter of type "const long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512pfintrin.h(117): error: argument of type "void *" is incompatible with parameter of type "const int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512pfintrin.h(127): error: argument of type "void *" is incompatible with parameter of type "const long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512pfintrin.h(136): error: argument of type "void *" is incompatible with parameter of type "const int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512pfintrin.h(145): error: argument of type "void *" is incompatible with parameter of type "const long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512pfintrin.h(153): error: argument of type "void *" is incompatible with parameter of type "const int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(10799): error: argument of type "const void *" is incompatible with parameter of type "const float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(10811): error: argument of type "const void *" is incompatible with parameter of type "const float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(10823): error: argument of type "const void *" is incompatible with parameter of type "const double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(10835): error: argument of type "const void *" is incompatible with parameter of type "const double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(10847): error: argument of type "const void *" is incompatible with parameter of type "const float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(10859): error: argument of type "const void *" is incompatible with parameter of type "const float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(10871): error: argument of type "const void *" is incompatible with parameter of type "const double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(10883): error: argument of type "const void *" is incompatible with parameter of type "const double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(10895): error: argument of type "const void *" is incompatible with parameter of type "const int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(10907): error: argument of type "const void *" is incompatible with parameter of type "const int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(10919): error: argument of type "const void *" is incompatible with parameter of type "const long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(10931): error: argument of type "const void *" is incompatible with parameter of type "const long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(10943): error: argument of type "const void *" is incompatible with parameter of type "const int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(10955): error: argument of type "const void *" is incompatible with parameter of type "const int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(10967): error: argument of type "const void *" is incompatible with parameter of type "const long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(10979): error: argument of type "const void *" is incompatible with parameter of type "const long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(10989): error: argument of type "void *" is incompatible with parameter of type "float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11000): error: argument of type "void *" is incompatible with parameter of type "float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11009): error: argument of type "void *" is incompatible with parameter of type "float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11020): error: argument of type "void *" is incompatible with parameter of type "float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11029): error: argument of type "void *" is incompatible with parameter of type "double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11040): error: argument of type "void *" is incompatible with parameter of type "double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11049): error: argument of type "void *" is incompatible with parameter of type "double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11060): error: argument of type "void *" is incompatible with parameter of type "double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11069): error: argument of type "void *" is incompatible with parameter of type "float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11080): error: argument of type "void *" is incompatible with parameter of type "float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11089): error: argument of type "void *" is incompatible with parameter of type "float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11100): error: argument of type "void *" is incompatible with parameter of type "float *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11109): error: argument of type "void *" is incompatible with parameter of type "double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11120): error: argument of type "void *" is incompatible with parameter of type "double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11129): error: argument of type "void *" is incompatible with parameter of type "double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11140): error: argument of type "void *" is incompatible with parameter of type "double *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11149): error: argument of type "void *" is incompatible with parameter of type "int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11160): error: argument of type "void *" is incompatible with parameter of type "int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11169): error: argument of type "void *" is incompatible with parameter of type "int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11180): error: argument of type "void *" is incompatible with parameter of type "int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11189): error: argument of type "void *" is incompatible with parameter of type "long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11200): error: argument of type "void *" is incompatible with parameter of type "long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11209): error: argument of type "void *" is incompatible with parameter of type "long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11220): error: argument of type "void *" is incompatible with parameter of type "long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11229): error: argument of type "void *" is incompatible with parameter of type "int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11240): error: argument of type "void *" is incompatible with parameter of type "int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11249): error: argument of type "void *" is incompatible with parameter of type "int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11260): error: argument of type "void *" is incompatible with parameter of type "int *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11269): error: argument of type "void *" is incompatible with parameter of type "long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11280): error: argument of type "void *" is incompatible with parameter of type "long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11289): error: argument of type "void *" is incompatible with parameter of type "long long *"

/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512vlintrin.h(11300): error: argument of type "void *" is incompatible with parameter of type "long long *"

/home/pm/anaconda3/envs/PyTorchEMD/lib/python3.6/site-packages/torch/include/ATen/cuda/NumericLimits.cuh(83): warning: calling a constexpr host function("from_bits") from a host device function("lowest") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/pm/anaconda3/envs/PyTorchEMD/lib/python3.6/site-packages/torch/include/ATen/cuda/NumericLimits.cuh(84): warning: calling a constexpr host function("from_bits") from a host device function("max") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/pm/anaconda3/envs/PyTorchEMD/lib/python3.6/site-packages/torch/include/ATen/cuda/NumericLimits.cuh(85): warning: calling a constexpr host function("from_bits") from a host device function("lower_bound") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/pm/anaconda3/envs/PyTorchEMD/lib/python3.6/site-packages/torch/include/ATen/cuda/NumericLimits.cuh(86): warning: calling a constexpr host function("from_bits") from a host device function("upper_bound") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

92 errors detected in the compilation of "/tmp/tmpxft_0000464e_00000000-6_emd_kernel.cpp1.ii".
error: command '/usr/local/cuda-9.0/bin/nvcc' failed with exit status 1

EMD still very large and increases throughout training

Hi,
Similar to some of the other issues posted, I'm getting a very large EMD. I divided by the number of points but ended up with an EMD of around 26 for a Chamfer Distance of around .1. I'm working with n = 22000 points.
In addition, if I use EMD as my loss and backpropagate, the loss ends up increasing, whereas it went down with Chamfer Distance. Any advice?
Thanks!

Fail when installing

Sorry to bother you, but the installation failed.
These are the error messages; could you please help me find what is wrong?
Ubuntu 16.04, PyTorch 1.6.1, CUDA 9.0

Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
1.10.1
g++ -pthread -shared -B /home/gy/anaconda3/envs/pvc/compiler_compat -L/home/gy/anaconda3/envs/pvc/lib -Wl,-rpath=/home/gy/anaconda3/envs/pvc/lib -Wl,--no-as-needed -Wl,--sysroot=/ /home/gy/pvc0922/PyTorchEMD/build/temp.linux-x86_64-3.8/cuda/emd.o /home/gy/pvc0922/PyTorchEMD/build/temp.linux-x86_64-3.8/cuda/emd_kernel.o -L/home/gy/anaconda3/envs/pvc/lib/python3.8/site-packages/torch/lib -L//usr/local/cuda-9.0/lib64 -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda -o build/lib.linux-x86_64-3.8/emd_cuda.cpython-38-x86_64-linux-gnu.so
g++: error: /home/gy/pvc0922/PyTorchEMD/build/temp.linux-x86_64-3.8/cuda/emd.o: No such file or directory
g++: error: /home/gy/pvc0922/PyTorchEMD/build/temp.linux-x86_64-3.8/cuda/emd_kernel.o: No such file or directory
error: command 'g++' failed with exit status 1

EMD value extremely large

Hi,

I wanted to run the EMD; however, the value is extremely large. For example, the Chamfer distance is about 0.07, whereas the EMD from this code is around 60. Is there any normalization we should take care of?

Thanks!

Grad check failed

Your implementation failed grad check.

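import torch as T
from torch.autograd import gradcheck
from emd import earth_mover_distance  # emd.py from this repository

# gradcheck compares analytic and numerical gradients in double precision
x = T.rand(2, 4, 3).cuda().double().requires_grad_(True)
y = T.rand(2, 5, 3).cuda().double()
print(gradcheck(earth_mover_distance, (x, y)))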

One bug is perhaps here and here. Probably you want to cast a value to scalar_t before the division.
I am not familiar with CUDA, so I couldn't get any further. Do you have any idea how to solve this?

install failed

PyTorchEMD/cuda/emd_kernel.cu(181): error: identifier "AT_CHECK" is undefined

in running setup
(base) server1@server1:/data/PythonCodes/XiaoYuan/PyTorchEMD$ python setup.py install
running install
running bdist_egg
running egg_info
creating emd_ext.egg-info
writing emd_ext.egg-info/PKG-INFO
writing dependency_links to emd_ext.egg-info/dependency_links.txt
writing top-level names to emd_ext.egg-info/top_level.txt
writing manifest file 'emd_ext.egg-info/SOURCES.txt'
reading manifest file 'emd_ext.egg-info/SOURCES.txt'
writing manifest file 'emd_ext.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_ext
/home/server1/anaconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py:782: UserWarning: The detected CUDA version (11.1) has a minor version mismatch with the version that was used to compile PyTorch (11.3). Most likely this shouldn't be a problem.
warnings.warn(CUDA_MISMATCH_WARN.format(cuda_str_version, torch.version.cuda))
building 'emd_cuda' extension
creating /data/PythonCodes/XiaoYuan/PyTorchEMD/build
creating /data/PythonCodes/XiaoYuan/PyTorchEMD/build/temp.linux-x86_64-3.7
creating /data/PythonCodes/XiaoYuan/PyTorchEMD/build/temp.linux-x86_64-3.7/cuda
Emitting ninja build file /data/PythonCodes/XiaoYuan/PyTorchEMD/build/temp.linux-x86_64-3.7/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] /home/server1/home/cuda11.1/bin/nvcc -I/home/server1/anaconda3/lib/python3.7/site-packages/torch/include -I/home/server1/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/server1/anaconda3/lib/python3.7/site-packages/torch/include/TH -I/home/server1/anaconda3/lib/python3.7/site-packages/torch/include/THC -I/home/server1/home/cuda11.1/include -I/home/server1/anaconda3/include/python3.7m -c -c /data/PythonCodes/XiaoYuan/PyTorchEMD/cuda/emd_kernel.cu -o /data/PythonCodes/XiaoYuan/PyTorchEMD/build/temp.linux-x86_64-3.7/cuda/emd_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=emd_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++14
FAILED: /data/PythonCodes/XiaoYuan/PyTorchEMD/build/temp.linux-x86_64-3.7/cuda/emd_kernel.o
/home/server1/home/cuda11.1/bin/nvcc -I/home/server1/anaconda3/lib/python3.7/site-packages/torch/include -I/home/server1/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/server1/anaconda3/lib/python3.7/site-packages/torch/include/TH -I/home/server1/anaconda3/lib/python3.7/site-packages/torch/include/THC -I/home/server1/home/cuda11.1/include -I/home/server1/anaconda3/include/python3.7m -c -c /data/PythonCodes/XiaoYuan/PyTorchEMD/cuda/emd_kernel.cu -o /data/PythonCodes/XiaoYuan/PyTorchEMD/build/temp.linux-x86_64-3.7/cuda/emd_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=emd_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++14
/data/PythonCodes/XiaoYuan/PyTorchEMD/cuda/emd_kernel.cu(181): error: identifier "AT_CHECK" is undefined

/data/PythonCodes/XiaoYuan/PyTorchEMD/cuda/emd_kernel.cu(268): error: identifier "AT_CHECK" is undefined

/data/PythonCodes/XiaoYuan/PyTorchEMD/cuda/emd_kernel.cu(385): error: identifier "AT_CHECK" is undefined

3 errors detected in the compilation of "/data/PythonCodes/XiaoYuan/PyTorchEMD/cuda/emd_kernel.cu".
[2/2] c++ -MMD -MF /data/PythonCodes/XiaoYuan/PyTorchEMD/build/temp.linux-x86_64-3.7/cuda/emd.o.d -pthread -B /home/server1/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/server1/anaconda3/lib/python3.7/site-packages/torch/include -I/home/server1/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/server1/anaconda3/lib/python3.7/site-packages/torch/include/TH -I/home/server1/anaconda3/lib/python3.7/site-packages/torch/include/THC -I/home/server1/home/cuda11.1/include -I/home/server1/anaconda3/include/python3.7m -c -c /data/PythonCodes/XiaoYuan/PyTorchEMD/cuda/emd.cpp -o /data/PythonCodes/XiaoYuan/PyTorchEMD/build/temp.linux-x86_64-3.7/cuda/emd.o -g -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=emd_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/home/server1/anaconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1723, in _run_ninja_build
env=env)
File "/home/server1/anaconda3/lib/python3.7/subprocess.py", line 512, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "setup.py", line 26, in <module>
'build_ext': BuildExtension
File "/home/server1/anaconda3/lib/python3.7/site-packages/setuptools/__init__.py", line 153, in setup
return distutils.core.setup(**attrs)
File "/home/server1/anaconda3/lib/python3.7/distutils/core.py", line 148, in setup
dist.run_commands()
File "/home/server1/anaconda3/lib/python3.7/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/home/server1/anaconda3/lib/python3.7/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/home/server1/anaconda3/lib/python3.7/site-packages/setuptools/command/install.py", line 67, in run
self.do_egg_install()
File "/home/server1/anaconda3/lib/python3.7/site-packages/setuptools/command/install.py", line 109, in do_egg_install
self.run_command('bdist_egg')
File "/home/server1/anaconda3/lib/python3.7/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/home/server1/anaconda3/lib/python3.7/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/home/server1/anaconda3/lib/python3.7/site-packages/setuptools/command/bdist_egg.py", line 164, in run
cmd = self.call_command('install_lib', warn_dir=0)
File "/home/server1/anaconda3/lib/python3.7/site-packages/setuptools/command/bdist_egg.py", line 150, in call_command
self.run_command(cmdname)
File "/home/server1/anaconda3/lib/python3.7/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/home/server1/anaconda3/lib/python3.7/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/home/server1/anaconda3/lib/python3.7/site-packages/setuptools/command/install_lib.py", line 11, in run
self.build()
File "/home/server1/anaconda3/lib/python3.7/distutils/command/install_lib.py", line 107, in build
self.run_command('build_ext')
File "/home/server1/anaconda3/lib/python3.7/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/home/server1/anaconda3/lib/python3.7/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/home/server1/anaconda3/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 79, in run
_build_ext.run(self)
File "/home/server1/anaconda3/lib/python3.7/site-packages/Cython/Distutils/old_build_ext.py", line 186, in run
_build_ext.build_ext.run(self)
File "/home/server1/anaconda3/lib/python3.7/distutils/command/build_ext.py", line 340, in run
self.build_extensions()
File "/home/server1/anaconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 735, in build_extensions
build_ext.build_extensions(self)
File "/home/server1/anaconda3/lib/python3.7/site-packages/Cython/Distutils/old_build_ext.py", line 195, in build_extensions
_build_ext.build_ext.build_extensions(self)
File "/home/server1/anaconda3/lib/python3.7/distutils/command/build_ext.py", line 449, in build_extensions
self._build_extensions_serial()
File "/home/server1/anaconda3/lib/python3.7/distutils/command/build_ext.py", line 474, in _build_extensions_serial
self.build_extension(ext)
File "/home/server1/anaconda3/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 202, in build_extension
_build_ext.build_extension(self, ext)
File "/home/server1/anaconda3/lib/python3.7/distutils/command/build_ext.py", line 534, in build_extension
depends=ext.depends)
File "/home/server1/anaconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 565, in unix_wrap_ninja_compile
with_cuda=with_cuda)
File "/home/server1/anaconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1404, in _write_ninja_file_and_compile_objects
error_prefix='Error compiling objects for extension')
File "/home/server1/anaconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1733, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension
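
The `AT_CHECK` macro was deprecated in favor of `TORCH_CHECK` and later removed from PyTorch, which is consistent with the "identifier is undefined" errors in the log above. A commonly used workaround is to rename the macro in `cuda/emd_kernel.cu` before building. Below is a minimal sketch of that rename; `replace_at_check` is a hypothetical helper, the sample line is illustrative rather than copied from the repository, and it is an assumption that no other source changes are needed for this PyTorch version.

```python
import re

def replace_at_check(source):
    """Rename whole-word AT_CHECK to TORCH_CHECK in C++/CUDA source text."""
    return re.sub(r"\bAT_CHECK\b", "TORCH_CHECK", source)

# Illustrative line only, not taken from emd_kernel.cu.
line = 'AT_CHECK(x.is_cuda(), "x must be a CUDA tensor");'
patched = replace_at_check(line)
print(patched)
```

To apply it to the file, read the source with `pathlib.Path("cuda/emd_kernel.cu").read_text()`, pass it through the function, write the result back with `write_text`, and rerun `python setup.py install`.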
