When rendering with a high spp, you can split the rendering process into multiple passes using the samples_per_pass property of the integrator:
<integrator type="path">
<integer name="samples_per_pass" value="64"/>
</integrator>
<sensor type="perspective">
...
<sampler type="independent">
<integer name="sample_count" value="256"/>
</sampler>
...
</sensor>
Note that sample_count should be a multiple of samples_per_pass. Here the renderer should perform 4 passes with 64 spp each and accumulate the results.
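For instance, with the values above the integrator performs sample_count / samples_per_pass passes. A minimal sketch of that arithmetic (the helper name is hypothetical, not part of the Mitsuba API):

```python
def number_of_passes(sample_count, samples_per_pass):
    # sample_count must divide evenly, as noted above.
    if sample_count % samples_per_pass != 0:
        raise ValueError("sample_count must be a multiple of samples_per_pass")
    return sample_count // samples_per_pass

print(number_of_passes(256, 64))  # 4 passes of 64 spp each
```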
I have to say that this feature hasn't been thoroughly tested so please let me know if it works for you!
from mitsuba2.
Here the autodiff computation graph is maintained between the passes, so it keeps growing until you eventually run out of memory on your GPU.
One solution would be to "perform autodiff on the image" within the loop for the passes. You can then average the gradients you get in the different passes.
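This works because differentiation is linear: averaging the per-pass gradients equals the gradient of the averaged image, so each pass's graph can be freed as soon as its gradient has been extracted. A toy illustration in plain Python (no Mitsuba; render_pass is a hypothetical stand-in for one rendering pass, with finite differences standing in for autodiff):

```python
def render_pass(p, seed):
    # Hypothetical one-pixel "render" depending on a scene parameter p;
    # the weight mimics per-pass randomness.
    return (1.0 + 0.1 * seed) * p

p, passes, eps = 2.0, 4, 1e-6

# Gradient of the averaged image (what keeping one big graph computes):
average = lambda q: sum(render_pass(q, s) for s in range(passes)) / passes
grad_of_average = (average(p + eps) - average(p - eps)) / (2 * eps)

# Average of the per-pass gradients (one small graph per pass):
pass_grad = lambda s: (render_pass(p + eps, s) - render_pass(p - eps, s)) / (2 * eps)
avg_of_gradients = sum(pass_grad(s) for s in range(passes)) / passes

assert abs(grad_of_average - avg_of_gradients) < 1e-6
```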
@Speierers You have a typo in samples_per_pass in the XML snippet 😄, which I copied and got an XML parsing error.
Now fixed.
Randomly, when trying some of the above, it would also crash my console when trying to develop the HDRFilm. This seemed to be random to me, as when I repeated the same rendering (after crashing the terminal), it worked.
If this is reproducible, could you open another issue for this with a full log? Thanks!
What is the spp for the cbox scene?
Maybe try using a smaller one.
On my PC with an NVIDIA 1080 Ti (same memory size as the 2080 Ti), the spp can only go up to 128 for the cbox scene.
The same error occurs if the spp is more than 128 here.
That solves my problem. I have to keep it below 256x256 @ 128 spp. I understand that the gpu_autodiff backend can use a lot of VRAM, but it looks like gpu_rgb uses no less. Are we not supposed to render with large spp using the OptiX backends?
The gpu_rgb backend should use less GPU memory than gpu_autodiff (no autodiff graph to store), but a large wavefront is still created and stored in memory. It's possible to split rendering into several passes so that each wavefront fits in memory.
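A rough back-of-the-envelope sketch of why splitting helps (the per-entry byte count below is an illustrative assumption, not a figure from the Mitsuba source; the real per-sample cost depends on the variant and scene): the wavefront holds one entry per sample, so it grows with resolution × spp and shrinks proportionally with samples_per_pass.

```python
def wavefront_bytes(width, height, spp, bytes_per_entry=512):
    # One wavefront entry per sample; bytes_per_entry is a made-up
    # illustrative figure covering ray state, throughput, sampler
    # state, and so on.
    return width * height * spp * bytes_per_entry

# 256x256 @ 256 spp in a single pass vs. four passes of 64 spp each:
single = wavefront_bytes(256, 256, 256)
per_pass = wavefront_bytes(256, 256, 64)
print(f"single pass: {single / 2**30:.0f} GiB, per pass: {per_pass / 2**30:.0f} GiB")
```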
@merlinND Just a side remark: the gpu_autodiff and gpu_rgb backends should use the same amount of memory if nothing is being differentiated. @philcn Can this issue be closed?
Right, I meant in case something is being differentiated.
Thanks for raising this issue! I am facing the same problem. My scene is 512x384 and composed of simple objects, and I am not able to increase my spp beyond 50.
When compared to scalar_ mode, the gpu_ variant is about 3x faster for my scene rendered with 50 spp. However, 50 spp does not yield a satisfactory output, and I need to increase the spp to reduce the noise. I am able to run mitsuba2 in scalar mode with very large spp (>1024, which is what I need).
I was hoping to reap the benefits of GPU acceleration for such time-intensive rendering tasks (with high spp). Is there any hope of extending the gpu_ variants to handle something like this? If not, what is the best way to render scenes with low rendering noise in Mitsuba2 (any hack other than simply increasing the spp)?
PS: Will using .ply or .serialized files for loading objects reduce this memory footprint compared to .obj files?
I also had this issue when rendering the example cbox.xml scene using gpu_autodiff_spectral at 256x256 @ 256 samples, which gives me the error mentioned by @philcn:
2020-04-02 14:38:36 INFO main [mitsuba.cpp:194] Mitsuba version 2.0.0 (master[008cb4df], Windows, 64bit, 12 threads, 8-wide SIMD)
2020-04-02 14:38:36 INFO main [mitsuba.cpp:195] Copyright 2019, Realistic Graphics Lab, EPFL
2020-04-02 14:38:36 INFO main [mitsuba.cpp:196] Enabled processor features: cuda avx2 avx fma f16c sse4.2 x86_64
2020-04-02 14:38:36 INFO main [xml.cpp:1129] Loading XML file "resources\data\scenes\cbox\cbox.xml" ..
2020-04-02 14:38:36 INFO main [xml.cpp:1130] Using variant "gpu_autodiff_spectral"
2020-04-02 14:38:36 INFO main [PluginManager] Loading plugin "plugins\regular.dll" ..
2020-04-02 14:38:36 INFO main [PluginManager] Loading plugin "plugins\path.dll" ..
2020-04-02 14:38:36 INFO main [PluginManager] Loading plugin "plugins\independent.dll" ..
2020-04-02 14:38:36 INFO main [PluginManager] Loading plugin "plugins\gaussian.dll" ..
2020-04-02 14:38:36 INFO main [PluginManager] Loading plugin "plugins\hdrfilm.dll" ..
2020-04-02 14:38:36 INFO main [PluginManager] Loading plugin "plugins\perspective.dll" ..
2020-04-02 14:38:36 INFO main [PluginManager] Loading plugin "plugins\diffuse.dll" ..
2020-04-02 14:38:36 INFO main [PluginManager] Loading plugin "plugins\area.dll" ..
2020-04-02 14:38:36 INFO main [PluginManager] Loading plugin "plugins\d65.dll" ..
2020-04-02 14:38:36 INFO main [PluginManager] Loading plugin "plugins\obj.dll" ..
2020-04-02 14:38:36 INFO main [Scene] Validating and building scene in OptiX.
Caught a critical exception: cuda_malloc(): out of memory!
(here, I don't need to restart my console)
Rendering the same as the above, but now with 128 samples, gives me a different error:
2020-04-02 14:07:12 INFO main [mitsuba.cpp:194] Mitsuba version 2.0.0 (master[008cb4df], Windows, 64bit, 12 threads, 8-wide SIMD)
2020-04-02 14:07:12 INFO main [mitsuba.cpp:195] Copyright 2019, Realistic Graphics Lab, EPFL
2020-04-02 14:07:12 INFO main [mitsuba.cpp:196] Enabled processor features: cuda avx2 avx fma f16c sse4.2 x86_64
2020-04-02 14:07:12 INFO main [xml.cpp:1129] Loading XML file "resources\data\scenes\cbox\cbox.xml" ..
2020-04-02 14:07:12 INFO main [xml.cpp:1130] Using variant "gpu_autodiff_spectral"
2020-04-02 14:07:12 INFO main [PluginManager] Loading plugin "plugins\regular.dll" ..
2020-04-02 14:07:12 INFO main [PluginManager] Loading plugin "plugins\path.dll" ..
2020-04-02 14:07:12 INFO main [PluginManager] Loading plugin "plugins\independent.dll" ..
2020-04-02 14:07:12 INFO main [PluginManager] Loading plugin "plugins\gaussian.dll" ..
2020-04-02 14:07:12 INFO main [PluginManager] Loading plugin "plugins\hdrfilm.dll" ..
2020-04-02 14:07:12 INFO main [PluginManager] Loading plugin "plugins\perspective.dll" ..
2020-04-02 14:07:12 INFO main [PluginManager] Loading plugin "plugins\diffuse.dll" ..
2020-04-02 14:07:12 INFO main [PluginManager] Loading plugin "plugins\area.dll" ..
2020-04-02 14:07:12 INFO main [PluginManager] Loading plugin "plugins\d65.dll" ..
2020-04-02 14:07:12 INFO main [PluginManager] Loading plugin "plugins\obj.dll" ..
2020-04-02 14:07:12 INFO main [Scene] Validating and building scene in OptiX.
cuda_check(): runtime API error = 0002 "cudaErrorMemoryAllocation" in C:/Users/u0120819/mitsuba2/ext/enoki/src/cuda/horiz.cu:59.
(I need to restart my console after getting the above error)
However, the following fixed it (when using 128 samples):
When rendering with high spp, you can split the rendering process in multiple passes by using the samples_per_pass property of the integrator:
<integrator type="path">
<integer name="samples_per_pass" value="64"/>
</integrator>
Thank you!
I ran some quick tests by adjusting the samples per pass with the same settings for 256x256 @ 128 samples:
max depth = 6:
64 samples per pass: took 2.606s to render
32 samples per pass: took 2.666s to render
16 samples per pass: took 2.811s to render
max depth = -1:
64 samples per pass: took 5.469s to render
32 samples per pass: took 6.385s to render
16 samples per pass: took 7.293s to render
I then took this a step further by adjusting the samples per pass, but now with 256x256 @ 256 samples:
max depth = -1:
128 samples per pass: gave the same error as above: cuda_check(): runtime API error = 0002 "cudaErrorMemoryAllocation" in C:/Users/u0120819/mitsuba2/ext/enoki/src/cuda/horiz.cu:59.
64 samples per pass: took 10.613s
32 samples per pass: took 12.502s
Randomly, when trying some of the above, it would also crash my console when trying to develop the HDRFilm. This seemed to be random to me, as when I repeated the same rendering (after crashing the terminal), it worked.
Thanks @merlinND and @wjakob for explaining the memory usage of gpu backends, and @Speierers for the multipass example! My problem is solved.
Is samples_per_pass relevant for the pathreparam integrator too?
Yes, it should be.
Is samples_per_pass also helpful for gpu_autodiff_* variants when doing autodiff?
I am using the following script to render and autodiff, following the suggestions in this thread (if my understanding is correct). I found that passes is limited to a maximum value, beyond which it causes the "out of memory" error; this is confirmed by checking the GPU memory usage while the program is running.
# create scene, optimizer, ...
# render the image using multiple passes as suggested in this thread
passes = 10        # increasing this beyond 10 causes out-of-memory
spp_per_pass = 12
image = render(scene, optimizer=opt, spp=spp_per_pass)
image[image != image] = 0  # zero out NaN samples
for i in range(passes - 1):
    img_i = render(scene, optimizer=opt, spp=spp_per_pass)
    img_i[img_i != img_i] = 0  # zero out NaN samples
    image += img_i
    del img_i  # release this pass's buffer
image = image / passes
# perform autodiff on image ...
Any suggestions?