Several samples (e.g. gPointCloud and gFluidSurface) failed to compile on Linux using

Samples failed to build on Linux and GCC, about nvidia/gvdb-voxels

Comments (29)

icoderaven commented on July 26, 2024 2

@nathanchrs @ramakarl
In the process of fixing issues on my new GPU and updating cudpp to support it (replacing the deprecated ballot and shfl instructions with their *_sync versions, as per upstream), I have mostly converted the project into a modern CMake project that builds targets correctly and exports them. shared_cudpp is now a submodule that points to upstream cub and moderngpu with those fixes; it pulls them in automatically and builds them as part of the gvdb library build. sample_utils is a build option, and individual examples can be chosen to be built (but I haven't bothered to fix all their builds yet; I have only fixed it for gInteractiveGL).

So basically, on a Linux machine, all you need to do is run
cmake -DCUDA_SDK_ROOT_DIR=/usr/local/cuda-9.2/samples -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=../install -DGVDB_BUILD_SAMPLES=ON -DBUILD_gInteractiveGL=ON ../source/
and then a
make -j8 install
and you should be good to go! No more annoying multiple builds/configs/etc.

I've tested this on Ubuntu 18.04 with a Turing GPU with CUDA 10.1 and 9.2, and a Pascal GPU with CUDA 9.2. I haven't tested on OSX/Windows, but I haven't really changed any of the older functionality (except for re-enabling CUDA compute autodetect).

This Branch

Feel free to test this if you need to and provide feedback if it works for you. I can create a pull request (although I don't think I can test for OSX/Windows compatibility)

from gvdb-voxels.

alvingitlord commented on July 26, 2024 1

Hi @icoderaven, I’m running it on Windows with CUDA 9.1 and VS2015, the recommended default setup. There are various issues relating to OpenGL, OptiX and other stuff. I have been looking into the source code for a few months, and I have to say GVDB right now needs a community effort to improve readability and compatibility. Right now I’m trying to extract the bare minimum functionality that I need from the GVDB source code and use it as a header library.

from gvdb-voxels.

oursnoir commented on July 26, 2024 1

@icoderaven I can confirm... I tried running gPointCloud just now (CUDA 9.2, GT 940M, Linux). My installed ptx files do not have the d_* prefixes, but the program did quit with a memory-related error:
...
Validating OptiX.
Update GVDB Volume.
Rebuild Optix.. Done.
GVDB CUDA ERROR:
  Launch status: invalid argument
  Kernel status: no error
  Caller: Allocator::CreateMemLinear
  Call:   cuMemAlloc
  Args: 

Hello,
I think it's not related to any debug mode or equivalent.
It seems more likely related to:
string ( SUBSTRING ${_ptxbase} 27 -1 _ptxname )
found at line 150 in gvdb-voxels/source/gvdb_library/cmake/Helpers.cmake

Given how the ptx files are generated, the prefix has more characters than expected.
string ( SUBSTRING ${_ptxbase} 29 -1 _ptxname ) works better.

Best,

from gvdb-voxels.

icoderaven commented on July 26, 2024 1

@ramakarl So, I finally managed to get time to look at all these issues, and after pulling the latest merge, I have at least one sample (gDepthMap) building and running correctly using CUDA 9.2 on a GTX 1080. After diving through the cuDPP implementation, I isolated the problem to the cudaMalloc() calls, and further investigation led me to realise that the issue was caused by incorrect/mismatched arch and compute codes during compilation of the different components (cudpp, then gvdb-voxels, and finally the sample codebase). Making them consistent seems to have fixed the memory leak that led to a SIGKILL while generating cudpp plans, for the multiple samples that have built so far.

I'm going to add all the fixes to the various errors and inconsistencies in the CMakeLists everywhere, and try and add autodetection of CUDA architecture in the build process (for linux peeps) and it would be great if people could check out the branch and test if it works for them, after which I'll send in a pull request. Expect a link pointing to the branch on this thread by tomorrow.
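One way such autodetection could look is sketched below. This is only an assumption about the approach, not necessarily what the branch does: it relies on the `cuda_select_nvcc_arch_flags` helper shipped with CMake's legacy FindCUDA module (CMake 3.7 or later, deprecated in recent releases), and the project name and the `ARCH_FLAGS` variable are hypothetical.

```cmake
cmake_minimum_required(VERSION 3.7)
project(arch_autodetect_sketch CXX)  # hypothetical project name

# Legacy FindCUDA module; assumed here because the thread's build uses cuda_compile_ptx.
find_package(CUDA REQUIRED)

# "Auto" asks FindCUDA to query the GPUs on the build machine and to
# return the matching -gencode flags in ARCH_FLAGS (hypothetical variable name).
cuda_select_nvcc_arch_flags(ARCH_FLAGS Auto)

# Append the detected flags so every cuda_compile_ptx / cuda_add_library call
# in this directory is compiled for the same architecture.
list(APPEND CUDA_NVCC_FLAGS ${ARCH_FLAGS})
message(STATUS "nvcc architecture flags: ${ARCH_FLAGS}")
```

For the mismatch described above to go away, the same flags would have to reach every component (cudpp, the gvdb library, and each sample), for example by doing the detection once in a top-level CMakeLists and reusing the result.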

from gvdb-voxels.

icoderaven commented on July 26, 2024 1

Alright, autodetection of CUDA for cudpp and gvdb_voxels along with updated instructions on building in the readme should be live on this branch:

https://github.com/icoderaven/gvdb-voxels/tree/feature/linux_build_fix

Other people having trouble building on linux, could you please check this out and see if it works for you? If all is good then I'll sanitize a little more and make a pull request to this repo.
@nathanchrs @hubbardp @Planet99 @Taironn @oscarbg @oursnoir @AndreFrelicot @jeremie-lg

There are a bunch of other things that should be fixed (like the ad-hoc install process for cudpp, and the redundant cmake files in multiple locations, and how nice a combined global CMake file to build everything would be), but this should set things up in a consistent manner for now so that we can focus on getting the rest of the samples to build, etc.

from gvdb-voxels.

icoderaven commented on July 26, 2024 1

Thanks for investigating! That looks like it'll work. Sorry for not replying earlier. There's possibly a cleaner strategy than using regexes: creating a custom target and then renaming the output (using add_custom_command() and add_custom_target()). I wanted to try that out, but it has been pretty low on my priority list. There's some other stuff that we'd ideally also do, e.g. making sample_utils an optional component (I did a little bit of that on that branch) and removing the redundancy in the CMake files inside there.

I have a bunch of downstream packages (in ROS) I've written that do that for my generated PTX files and will eventually push some of that nice clean CMake stuff later.

from gvdb-voxels.

nathanchrs commented on July 26, 2024

I have just tried to build gvdb-voxels on Linux, and I must say that the process got quite involved with lots of inconsistencies to work around. I finally got GVDB and some examples (gSprayDeposit and gFluidSurface) to build with the workarounds above, though.

Anyway, thanks for this great library!

from gvdb-voxels.

icoderaven commented on July 26, 2024

Using CUDA 9.2, and on a GTX 1080, although I can get the samples to build (I had to also add a dummy WindowResize method), running the samples gives errors in loading the ptx files.

The installed ptx files had d_* prefixes, whereas the .cu code in the library was reading the non-prefixed ptx files. After renaming the files, the ptx files are indeed loaded, but then the executables end up eating all the memory until killed by the kernel. Possibly related to #30

Thinking that the d_* prefix was due to the generated ptx files somehow being a debug variant, after lots of tweaking the individual CMake packages, I managed to get all the libraries to build in debug mode, but I see the same behaviour.

What gives?

from gvdb-voxels.

nathanchrs commented on July 26, 2024

@icoderaven I can confirm... I tried running gPointCloud just now (CUDA 9.2, GT 940M, Linux). My installed ptx files do not have the d_* prefixes, but the program did quit with a memory-related error:

...
Validating OptiX.
Update GVDB Volume.
Rebuild Optix.. Done.
GVDB CUDA ERROR:
  Launch status: invalid argument
  Kernel status: no error
  Caller: Allocator::CreateMemLinear
  Call:   cuMemAlloc
  Args:

from gvdb-voxels.

icoderaven commented on July 26, 2024

@nathanchrs Do you recall if you were using a different CUDA version for the samples when you built them earlier?

from gvdb-voxels.

nathanchrs commented on July 26, 2024

@icoderaven I'm certain I used CUDA 9.2 (I did have CUDA 9.1 previously, but I have removed it some time before building the samples due to an unrelated problem).

from gvdb-voxels.

icoderaven commented on July 26, 2024

Do the 1.1 release samples even run for anyone on Linux? There seem to be no updates on the dedicated NVIDIA forum either.
@Alvininorge seems to be the only person able to run this. Could you give any insights on how you got those running?

from gvdb-voxels.

ramakarl commented on July 26, 2024

@icoderaven
Thanks, great work!
Have you confirmed that all samples are working? When I uploaded GVDB 1.1, I confirmed that they all worked on Linux (Ubuntu 14.04), so it's unclear why they wouldn't. Your fixes do support CUDA 9, however, so that may be the difference.
If you or others can confirm samples working then I will merge to main.

from gvdb-voxels.

icoderaven commented on July 26, 2024

I have been waiting for other people to pipe in. In the meanwhile I've made some more extensions to the library - one of them was to allow setting the view matrix and the projection matrix of the camera class without having to depend on the Orbit Camera. I've also added the functionality (on a different branch - https://github.com/icoderaven/gvdb-voxels/commits/feature/device_mem_extension) to optionally use device memory instead of texture memory when adding a channel (since for my application I need to make atomic updates to voxel data in my custom ray traversal algorithm). I'd be happy to get feedback if you have ideas on how to do that in other/better ways.

from gvdb-voxels.

icoderaven commented on July 26, 2024

Regarding samples working, I think all the samples do work on my machine now after changing the desired compute in their respective CMakeLists. (Currently, I have gDepthMap gFluidSurface gInteractiveGL gInteractiveOptix gPointCloud gPointFusion gRenderKernel gResample gSprayDeposit built and running). I have also added the ability of the visualization utils to be a separate CMake package that can be imported into a third party CMake project (I use it within ROS).

from gvdb-voxels.

ramakarl commented on July 26, 2024

Allowing the option to use device memory is a pretty big change, and one that we have gotten several requests for.
You are probably thinking of using macros to achieve this. This could work to easily switch tex3d to in[x] or out[x] buffers, but keep in mind you would also need to modify function headers, and the VDBInfo structure since that holds references to textures that would need to change to CUDA buffers. You may find macros get complicated and difficult to read.
The alternative is another run-time pathway that uses a copied set of kernels that operate on CUDA buffers, while keeping the existing kernel functions for textures in place. This would give the ability to have both buffer and texture storage at run-time, and may be valuable because you lose hardware tri-linear interpolation with buffers. This method also has the benefit that the code pathways could diverge, as they might need to in order to support linear filtering in the buffered case.

Either strategy would be complicated. But I would advocate for the run-time solution of having two sets of kernels because of the value of having some GVDB objects with buffers, e.g. used for simulation, and some other GVDB objects used for rendering, e.g. with hardware tri-linear filtering for normals and fast lookup for raytracing. Both could exist at run-time whereas with macros they cannot. The two code paths can be more easily changed w/o having to re-validate the samples under both, and providing a run-time switch gives more flexibility.

from gvdb-voxels.

icoderaven commented on July 26, 2024

Well, I ended up doing something close to the former. I have modified a minimal set of library functions (specifically the ones updating the Atlas) and appended a data structure to VDBInfo to carry a device memory pointer. When adding a channel, an optional flag triggers the alternate resource allocation pathway. Within the struct I've added a bool vector to signify whether a particular channel uses texture memory or not. And for allocating data to the (linear) memory, I use the atlas resolution to assign data. The only kernel I had to modify to use this switch was the apronUpdate, but that was a trivial change.

This works for me, since I don't really need free interpolation that texture memory provides for the channels where I need to make atomic updates. And I get to keep the nice texture memory interop for the channels I like to have it on (and do my ray tracing on). Anyway, I didn't add it to my main branch that I am planning to use for the pull request because it's my personal modification, but if it is useful for others then I'd be glad.

from gvdb-voxels.

ramakarl commented on July 26, 2024

It sounds like you have run-time control over which channel uses buffers, which is great. With your own branch that supports buffers, have you been able to verify that the original texture pathways also work if you turn off the option for those channels? If you are able to verify the original pathways with all samples, then I think it's still worthwhile to integrate into the main repo because it's a valuable approach. The default addition of new channels can be set to the texture pathway, and we could provide an overloaded function that allows for the buffer channel option. The main criterion for merging is that we can validate that the original samples are working as intended, which includes texture-based raytracing and atlas operations, etc.

Since you have a compile-time version of UpdateAtlas for buffers, and some channels are buffers while others are textures, how are you updating the atlas on channels used for raytracing?

from gvdb-voxels.

icoderaven commented on July 26, 2024

I see. I think there's a slight misunderstanding. I don't want to replace all the texture memory usage with device memory usage. I just want the ability to have additional channels that use device memory.

The default is still using texture buffers. So, everything works exactly as normal (I just tested a couple of samples, gSprayDeposit and gResample, and yes they work too. I can do a more rigorous test as well). Only when adding a new channel, there is now an option to use device memory only for that channel. This channel can be read and written to (atomically) from within a kernel just like normal device memory. So, e.g. in a custom kernel, instead of using
surf3Dwrite(v, gvdb->volOut[chan], vox.x * sizeof(float), vox.y, vox.z);
for this particular channel one can do

unsigned long int atlas_id = vox_fixed.z*atlas_res.x*atlas_res.y + vox_fixed.y*atlas_res.x + vox_fixed.x;
float *atlas_mem = (float *) (gvdb->atlas_dev_mem[chan]) + atlas_id;
*atlas_mem = v;

(or just an atomic update instead)

I don't intend to use the device memory channels for ray tracing, really. For instance, in my use case I want to store, per voxel, the cumulative length of all ray segments passing through that voxel. This requires me to atomically update the corresponding data field, which is what I use the channel for. A per-voxel kernel then takes the data from this auxiliary channel and uses it to compute a scalar value that I write to texture memory (on another channel) and use for ray tracing.

from gvdb-voxels.

icoderaven commented on July 26, 2024

icoderaven@4f40da7

from gvdb-voxels.

nathanchrs commented on July 26, 2024

Hi @icoderaven, sorry I was quite busy with other things for the past couple weeks. I have seen your feature/linux_build_fix branch, and it fixes most of my issues on Linux. Thank you!

However, the CMake-generated install script now fails to copy PTX files correctly: it skips the first 2 characters of each PTX file name (e.g. tix_trace_primary.ptx instead of optix_trace_primary.ptx).

I found that this is caused by the changes in source/gvdb_library/cmake/Helpers.cmake (diff) and source/sample_utils/Helpers.cmake (diff) files.
There, the start index of SUBSTRING in string ( SUBSTRING ${_ptxbase} 29 -1 _ptxname ) was changed from 27 to 29.

On my system, the _ptxbase variable contains values in the format cuda_compile_ptx_generated_<PTX_NAME>, which means that 27 is the correct start index for me; however, it might be different for you. Perhaps you can check what value _ptxbase contains at that point on your system?

from gvdb-voxels.

nathanchrs commented on July 26, 2024

Looking at previous issues such as #2, I realized that for some users, the generated PTX filenames are in the format cuda_compile_ptx_1_generated_cuda_gvdb_copydata (with an extra _1 in the middle), which explains why string ( SUBSTRING ${_ptxbase} 29 -1 _ptxname ) works for some.

I checked the FindCUDA module in the CMake repo (https://github.com/Kitware/CMake/blob/v3.9.6/Modules/FindCUDA.cmake) as it is that script which is assigning the PTX file names. It appears that the script adds a counter to the generated filename (line 1828). This part of the code was added in CMake 3.7.0, while I am using CMake 3.5.1.

From checking the repository, I found that the formats for the generated PTX filenames are:

cuda_compile_ptx_generated_<PTX_NAME> for CMake < 3.7.0
cuda_compile_ptx_<COUNTER>_generated_<PTX_NAME> for CMake >= 3.7.0

I think we need to either change the string ( SUBSTRING ${_ptxbase} 29 -1 _ptxname ) part to automatically detect which format it is supposed to use or update the minimum CMake version to 3.7.0 in the documentation.
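The two start indices follow directly from the prefix lengths. A minimal script-mode sketch that computes them (the file name is hypothetical, and a single-digit counter is assumed for the newer format):

```cmake
# ptx_prefix_len.cmake (hypothetical file name); run with: cmake -P ptx_prefix_len.cmake

# Prefix produced by FindCUDA before CMake 3.7.0 (no counter).
string(LENGTH "cuda_compile_ptx_generated_" _len_old)

# Prefix produced from CMake 3.7.0 on, assuming a single-digit counter such as 1.
string(LENGTH "cuda_compile_ptx_1_generated_" _len_new)

message(STATUS "prefix length without counter: ${_len_old}")  # 27
message(STATUS "prefix length with counter:    ${_len_new}")  # 29
```

A two-digit counter would push the prefix to 30 characters, so any fixed start index can break again; detecting the prefix at configure time avoids that.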

from gvdb-voxels.

nathanchrs commented on July 26, 2024

Possible fix:
In source/gvdb_library/cmake/Helpers.cmake and source/sample_utils/Helpers.cmake, replace

string ( SUBSTRING ${_ptxbase} 29 -1 _ptxname )

with

# CMake regexes have no \d class, so the optional counter is matched with [0-9]+.
string ( REGEX MATCH "^cuda_compile_ptx(_[0-9]+)?_generated_" _ptxname_prefix "${_ptxbase}" )
string ( LENGTH "${_ptxname_prefix}" _ptxname_prefix_length )
string ( SUBSTRING "${_ptxbase}" "${_ptxname_prefix_length}" -1 _ptxname )
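To check prefix detection against both filename formats without running a full build, the logic can be tried in script mode. The sketch below is a variant that strips the prefix in one step with REGEX REPLACE; `strip_ptx_prefix` and the file name are hypothetical and not part of GVDB, and the counter is matched with `[0-9]+` because CMake's regex syntax has no `\d` class.

```cmake
# strip_ptx_prefix.cmake (hypothetical file name); run with: cmake -P strip_ptx_prefix.cmake

# Hypothetical helper: removes the FindCUDA-generated prefix, with or without a counter.
function(strip_ptx_prefix _in _out_var)
  string(REGEX REPLACE "^cuda_compile_ptx(_[0-9]+)?_generated_" "" _stripped "${_in}")
  set(${_out_var} "${_stripped}" PARENT_SCOPE)
endfunction()

# Sample names in the pre-3.7.0 format and in the counter format (one and two digits).
foreach(_name
    cuda_compile_ptx_generated_optix_trace_primary
    cuda_compile_ptx_1_generated_optix_trace_primary
    cuda_compile_ptx_12_generated_optix_trace_primary)
  strip_ptx_prefix("${_name}" _ptxname)
  message(STATUS "${_name} -> ${_ptxname}")
endforeach()
```

All three inputs should reduce to optix_trace_primary, which is the name the install step is expected to produce.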

from gvdb-voxels.

nathanchrs commented on July 26, 2024

BTW @icoderaven, I think it would be nice if you could PR your linux_build_fix branch as it is for the time being, to save time for other Linux users. Thanks!

from gvdb-voxels.

icoderaven commented on July 26, 2024

@nathanchrs I fixed the two Helpers.cmake files properly, using custom targets. Can you check? Then I'll send in a pull request for the branch as-is.

from gvdb-voxels.

nathanchrs commented on July 26, 2024

@icoderaven it works for me 👍

from gvdb-voxels.

icoderaven commented on July 26, 2024

Done. Pull request created.

from gvdb-voxels.

ramakarl commented on July 26, 2024

Great work!
The build fixes for Linux and CUDA 10 for Turing are especially valuable. I will take a closer look at both the Win and Linux build when I have a better opportunity.

from gvdb-voxels.

NBickford-NV commented on July 26, 2024

Closing since it looks like icoderaven fixed this issue (and also the new build system removes CUDPP and changes how PTX files are generated). Please feel free to reopen this issue if there are still problems - thanks again!

from gvdb-voxels.
