Comments (7)
The next thing to try is enabling fault handling (https://docs.python.org/3/library/faulthandler.html): set the environment variable PYTHONFAULTHANDLER=1 on the command line and see if we get a stack trace when the fault happens. If this doesn't work, it's a matter of tracing the code somehow to see which line is the last to execute before the segfault. That's painful, but we can possibly find a tool to do that cleanly.
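If changing the command line is awkward (for example inside a notebook runner), the same handler can be switched on from Python. This is a minimal sketch, not something taken from the tutorial: the helper name `enable_fault_log` and the log path are hypothetical, and it assumes writing the trace to a file is acceptable.

```python
import faulthandler


def enable_fault_log(path="fault_trace.log"):
    """Enable faulthandler and send crash tracebacks to `path`.

    `enable_fault_log` and the default path are illustrative names only.
    The returned file object must stay open for as long as the handler
    is enabled, because faulthandler writes to its file descriptor.
    """
    log_file = open(path, "w")
    # Dump the Python traceback of every thread on fatal signals
    # such as SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL.
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling `enable_fault_log()` in the first cell of the notebook, before any transform runs, should leave a traceback in the log file if the interpreter later dies on a fatal signal.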
from tutorials.
I traced the error back to the convolution in the Gaussian filter, which is used by the Rand3DElasticd transform. I suspect this issue is related to a previous bug I encountered. My guess is that the PyTorch 24.03 container does not include the commit mentioned in the ticket, since the code runs successfully under PyTorch's nightly build. I will confirm it today.
from tutorials.
It looks to have died on the second-to-last cell:
deformed_data_dict = rand_elastic(data_dict)
print(f"image shape: {deformed_data_dict['image'].shape}")
image, label = deformed_data_dict["image"][0], deformed_data_dict["label"][0]
plt.figure("visualise", (8, 4))
plt.subplot(1, 2, 1)
plt.title("image")
plt.imshow(image[:, :, 5], cmap="gray")
plt.subplot(1, 2, 2)
plt.title("label")
plt.imshow(label[:, :, 5])
plt.show()
I can't tell from that whether it's the elastic deformation or the matplotlib calls that cause the crash. It could be either, if the environment isn't right and some compiled code crashed.
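One way to separate the two suspects is to run each step in its own child interpreter and look at how the child exits. The following is a rough sketch under stated assumptions: `run_isolated` and `describe` are hypothetical helper names, and the snippets passed in would have to be written by hand from the notebook cell (one ending after the `rand_elastic` call, one containing only the plotting calls).

```python
import signal
import subprocess
import sys


def run_isolated(snippet, timeout=120):
    """Run a Python snippet in a fresh interpreter with faulthandler on.

    Returns (returncode, stderr). A crash in the child cannot take
    down the calling process.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


def describe(returncode):
    """Turn a subprocess return code into a short human-readable status."""
    if returncode < 0:
        # On POSIX a negative return code means the child was killed
        # by the signal with that number.
        try:
            return f"killed by {signal.Signals(-returncode).name}"
        except ValueError:
            return f"killed by signal {-returncode}"
    return "ok" if returncode == 0 else f"exit code {returncode}"
```

If the snippet containing only the transform reports a fatal signal while the plotting-only snippet exits cleanly, the matplotlib calls can be ruled out, and the captured stderr should contain the faulthandler traceback.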
from tutorials.
It would be worth running with CUDA_LAUNCH_BLOCKING="1" if the test runs pre-processing on the GPU.
from tutorials.
I don't think the transforms are run on the GPU. You can see this in the tutorial:
https://github.com/Project-MONAI/tutorials/blob/main/modules/3d_image_transforms.ipynb
from tutorials.
There may be some interaction with the CUDA components in PyTorch anyway, so it's worth trying. Are there any other environment variables we can set to enhance debug output? We don't have much else to go on since we can't replicate the issue locally.
from tutorials.
I tried with CUDA_LAUNCH_BLOCKING=1; the same error occurred.
Steps to reproduce:
docker pull nvcr.io/nvidia/pytorch:24.03-py3
docker run ...
# install monai
git clone https://github.com/Project-MONAI/MONAI.git
python -m pip install --upgrade pip wheel
python -m pip install -r requirements-dev.txt
BUILD_MONAI=0 python setup.py develop
# install tutorial
git clone https://github.com/Project-MONAI/tutorials.git
python -m pip install -r requirements.txt; python -m pip list
# run notebook
CUDA_VISIBLE_DEVICES=0 CUDA_LAUNCH_BLOCKING=1 ./runner.sh -t modules/3d_image_transforms.ipynb
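The final run step could also be combined with the fault-handler suggestion from earlier in the thread. This is only a sketch: `debug_env` and `run_notebook` are hypothetical helpers, and it assumes `runner.sh` is executable in the tutorials checkout and passes its environment through to the Python process that executes the notebook.

```python
import os
import subprocess


def debug_env(base=None):
    """Return a copy of the environment with the debug variables set.

    Only variables already mentioned in this thread are added.
    """
    env = dict(os.environ if base is None else base)
    env.update(
        {
            "PYTHONFAULTHANDLER": "1",   # dump a Python traceback on fatal signals
            "CUDA_LAUNCH_BLOCKING": "1",  # make CUDA kernel launches synchronous
            "CUDA_VISIBLE_DEVICES": "0",  # same device selection as the run above
        }
    )
    return env


def run_notebook(notebook="modules/3d_image_transforms.ipynb", cwd="."):
    """Run the tutorial runner on one notebook and return its exit code.

    `cwd` is assumed to be the root of the tutorials checkout.
    """
    result = subprocess.run(["./runner.sh", "-t", notebook], env=debug_env(), cwd=cwd)
    return result.returncode
```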
from tutorials.