
Comments (6)

tobias-kirschstein commented on July 25, 2024

Hi Lee,

I did some experiments to see which model configurations you can fit on an RTX 3090.
The most promising one I found uses only 8 instead of 32 hash encodings and slightly restricts the number of samples that are processed simultaneously:
--n_hash_encodings 8 --latent_dim_time 8 --max_n_samples_per_batch 19
This should still give you reasonable results, but they will be noticeably worse than the full model's when the observed movements are very complex.
In the paper, we already experimented with using 16 hash encodings, which only marginally impaired the results. Going further down to 8 will have a similar effect. The extreme case would be to use only a single hash encoding, which is equivalent to the NGP + Def. ablation in Table 3 of the paper. Performance suffers in that case, but it was still on par with DyNeRF in our experiments.
So, playing around with the number of hash encodings is a good way to address GPU memory concerns and will still give reasonable results.
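
For reference, these flags are just appended to the usual training call from the README (check the training section there for the real entry point). The script path and the ID/sequence arguments in the line below are placeholders, not the exact command:

python scripts/train.py <participant_id> <sequence_name> --n_hash_encodings 8 --latent_dim_time 8 --max_n_samples_per_batch 19  # script path is a placeholder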

So far, I haven't tried running the full model in a distributed manner. But the first thing I would try here is to distribute the hash encodings to different GPUs. The starting point would be the hash ensemble implementation:

for h, hash_encoding in enumerate(self.hash_encodings):

where we loop over the hash encodings and collect the spatial features. I guess it shouldn't be too hard to have the hash grids reside on separate GPUs and exchange the 3D positions as well as the queried spatial features with a dedicated main GPU.
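
A rough sketch of how that could look with PyTorch is below. The class and argument names are made up for illustration; this is not the actual NeRSemble code, just the general idea, assuming each hash encoding is a regular nn.Module:

import torch

# Sketch only: spread the hash grids over all visible GPUs and gather the
# queried features back on a main GPU. All names here are placeholders.
class MultiGPUHashEnsemble(torch.nn.Module):
    def __init__(self, hash_encodings, main_device="cuda:0"):
        super().__init__()
        n_gpus = torch.cuda.device_count()
        self.main_device = main_device
        self.devices = [f"cuda:{i % n_gpus}" for i in range(len(hash_encodings))]
        # Each hash grid lives on its assigned GPU
        self.hash_encodings = torch.nn.ModuleList(
            [enc.to(dev) for enc, dev in zip(hash_encodings, self.devices)])

    def forward(self, positions):
        # positions: (N, 3) query points residing on the main GPU
        features = []
        for hash_encoding, device in zip(self.hash_encodings, self.devices):
            # Send the 3D positions to the GPU holding this hash grid ...
            feats = hash_encoding(positions.to(device, non_blocking=True))
            # ... and collect the queried spatial features back on the main GPU
            features.append(feats.to(self.main_device, non_blocking=True))
        # How the collected features are combined afterwards stays exactly
        # as in the existing hash ensemble code; stacking is a placeholder.
        return torch.stack(features, dim=0)

Note that this still queries the grids one after another, so the point is fitting the grids into memory rather than gaining speed; overlapping the per-GPU queries (e.g., with CUDA streams) would be an additional step.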

Hope this helps


LeeHanmin commented on July 25, 2024

Thanks a lot!


LeeHanmin commented on July 25, 2024

Hi, I have successfully trained NeRSemble and it is awesome. Can I get the videos taken by each camera for one of the IDs?


tobias-kirschstein commented on July 25, 2024

Glad you like it!
Not exactly sure what you mean by "get the video taken by each camera with one of the IDs".
I assume you are talking about rendering the trained model from each camera?
You can get the predictions from the evaluation cameras by running the evaluation script (see section 3.2. in the README).
Use the flags
--skip_timesteps 3 --max_eval_timesteps -1
to tell the evaluation script that you want to render every 3rd timestep (=24.3fps).
The rendered images will be put into some subfolder in ${NERSEMBLE_MODELS_PATH}/NERS-XXX-${name}/evaluation.
From there, it should be straightforward to pack the rendered images into a video.
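
For example, a small script along these lines should do the packing (it assumes imageio with the imageio-ffmpeg backend is installed; the glob pattern is a placeholder you need to point at the actual image folder written by the evaluation script):

import glob
import imageio.v2 as imageio

# Placeholder pattern: point this at the rendered images of one viewpoint
# inside ${NERSEMBLE_MODELS_PATH}/NERS-XXX-${name}/evaluation
frame_paths = sorted(glob.glob("path/to/evaluation/<camera_folder>/*.png"))

writer = imageio.get_writer("rendering.mp4", fps=24)  # roughly matches rendering every 3rd timestep
for path in frame_paths:
    writer.append_data(imageio.imread(path))
writer.close()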


LeeHanmin commented on July 25, 2024

Sorry, I wasn't clear enough. What I mean is that I want the videos from the 16 monocular cameras for a certain ID, such as 124, from the first frame to the last frame. Could you please provide them to me?


tobias-kirschstein commented on July 25, 2024

Sorry, I still don't quite understand your request.
What exactly do you need?
Do you need the 16 videos of a person from the dataset to train NeRSemble? In that case, section 2 of the README describes how to obtain them.
But since you wrote "I have successfully trained NeRSemble" above, I assumed you just want to render a trained model from the 12 training and 4 evaluation viewpoints, and my last comment describes how to get those renderings.
I am not sure what other "video of 16 monocular cameras" you are referring to. Do you maybe mean the circular renderings as in the teaser image in the README?
