GearboxAssy.glb has node and semantic about gltf-sample-models HOT 7 CLOSED

khronosgroup commented on July 30, 2024

GearboxAssy.glb has node and semantic

from gltf-sample-models.

Comments (7)

javagl commented on July 30, 2024

This should be valid. The goal is to have the option to pass the transform matrix of an arbitrary node (multiplied with the view matrix) to the shader. In fact, this is explicitly used as the example in the specification of the semantics: https://github.com/KhronosGroup/glTF/tree/master/specification/1.0#semantics

(The Duck model and several others are also using this)

mre4ce commented on July 30, 2024

I was afraid that was the case. That's such a horrible thing from a performance perspective.

javagl commented on July 30, 2024

Can you elaborate on that? In particular:

To what extent does it affect performance?
What would be the alternative for passing the global transform of an arbitrary node to the renderer?

Or is the point exactly that: The global transform has to be re-computed each time, unless you implement some sophisticated "caching"/notification to detect changes in the transform of parent nodes. AFAIK, Cesium does this, but probably, most other viewers don't. When you have a deep chain of nodes...

node0 
    - node1 
        - node2
            ...
            - node1000
                - Sunlight__Front-Upper-Left-node   <- There it is....

then this may indeed be a considerable overhead. But I think that it is unlikely to have a depth of more than ~5 for such nodes.
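As an illustration of the "caching"/notification idea mentioned above, here is a minimal sketch, not taken from any particular viewer. The `Node` class, its method names, and the row-major 4x4 list representation are all assumptions made for this example. A node's global transform is cached, and changing a local transform invalidates the cache of that node and its descendants.

```python
from typing import List, Optional

Mat4 = List[List[float]]


def identity() -> Mat4:
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]


def translation(x: float, y: float, z: float) -> Mat4:
    m = identity()
    m[0][3], m[1][3], m[2][3] = x, y, z
    return m


def matmul(a: Mat4, b: Mat4) -> Mat4:
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


class Node:
    """Hypothetical scene-graph node with a cached global transform."""

    multiplications = 0  # counts matrix products, to make the caching visible

    def __init__(self, name: str, local: Optional[Mat4] = None,
                 parent: Optional["Node"] = None):
        self.name = name
        self.parent = parent
        self.children: List["Node"] = []
        self._local = local if local is not None else identity()
        self._global: Optional[Mat4] = None  # None means "needs recomputation"
        if parent is not None:
            parent.children.append(self)

    def set_local(self, local: Mat4) -> None:
        self._local = local
        # Invalidate this node and all descendants (iteratively, so that
        # very deep chains do not hit the recursion limit).
        stack = [self]
        while stack:
            node = stack.pop()
            node._global = None
            stack.extend(node.children)

    def global_transform(self) -> Mat4:
        # Walk up to the nearest ancestor that still has a valid cache.
        chain = []
        node: Optional[Node] = self
        while node is not None and node._global is None:
            chain.append(node)
            node = node.parent
        current = node._global if node is not None else identity()
        # Walk back down, filling the caches along the way.
        for n in reversed(chain):
            current = matmul(current, n._local)
            Node.multiplications += 1
            n._global = current
        return current


# Build the deep chain from the example: node0 -> node1 -> ... -> node1000.
root = Node("node0")
leaf = root
for i in range(1, 1001):
    leaf = Node(f"node{i}", translation(1.0, 0.0, 0.0), leaf)

leaf.global_transform()   # walks the whole chain once
leaf.global_transform()   # served from the cache, no further products
```

Without the cache, every query for the leaf would repeat one matrix product per ancestor. With it, the cost is paid only after a transform in the chain actually changes.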

mre4ce commented on July 30, 2024

This is a general problem with glTF. There are too many ways to pass compound matrices to a shader, and they each take up four uniform vectors. I'm not too concerned about the calculation of a compound matrix. The sheer number of matrices that you can possibly pass down, and the frequency at which they would be updated (per mesh/primitive), is the concern.
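To put rough numbers on the uniform-vector cost, here is a small sketch. It assumes the matrix semantics and types as listed in the glTF 1.0 semantics table (check them against the specification). It also assumes the usual accounting of four uniform vectors per `mat4` and three per `mat3`. The helper name `uniform_vectors` is made up for this example.

```python
# Uniform vectors (vec4 slots) consumed per uniform type.
UNIFORM_VECTORS_PER_TYPE = {"FLOAT_MAT4": 4, "FLOAT_MAT3": 3}

# Matrix semantics and their types, assumed from the glTF 1.0 semantics table.
MATRIX_SEMANTICS = {
    "LOCAL": "FLOAT_MAT4",
    "MODEL": "FLOAT_MAT4",
    "VIEW": "FLOAT_MAT4",
    "PROJECTION": "FLOAT_MAT4",
    "MODELVIEW": "FLOAT_MAT4",
    "MODELVIEWPROJECTION": "FLOAT_MAT4",
    "MODELINVERSE": "FLOAT_MAT4",
    "VIEWINVERSE": "FLOAT_MAT4",
    "PROJECTIONINVERSE": "FLOAT_MAT4",
    "MODELVIEWINVERSE": "FLOAT_MAT4",
    "MODELVIEWPROJECTIONINVERSE": "FLOAT_MAT4",
    "MODELINVERSETRANSPOSE": "FLOAT_MAT3",
    "MODELVIEWINVERSETRANSPOSE": "FLOAT_MAT3",
}


def uniform_vectors(semantics):
    """Total uniform vectors needed for the given matrix semantics."""
    return sum(UNIFORM_VECTORS_PER_TYPE[MATRIX_SEMANTICS[s]] for s in semantics)


# A typical lit material: model-view, projection, and a normal matrix.
typical = uniform_vectors(
    ["MODELVIEW", "PROJECTION", "MODELVIEWINVERSETRANSPOSE"])

# Worst case: a technique that requests every matrix semantic once.
worst_case = uniform_vectors(MATRIX_SEMANTICS)
```

Under these assumptions the typical case needs 11 uniform vectors and the worst case 50, all of which may have to be updated per drawn primitive.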

Have a look at the following extensions to see where I am coming from:

https://github.com/mre4ce/glTF/blob/KHR_glsl_view_projection_buffer/extensions/Khronos/KHR_glsl_view_projection_buffer/README.md

https://github.com/mre4ce/glTF/blob/KHR_glsl_multi_view/extensions/Khronos/KHR_glsl_multi_view/README.md

javagl commented on July 30, 2024

I had already seen these extensions but, admittedly, have not yet studied them in detail. I only skimmed them back when you opened them, and have now only roughly tried to understand what they are aiming at.

This is intended as a disclaimer: I might have misunderstood something here. Additionally, I'm not so deeply familiar with the goals and underlying technologies. Therefore, here are only a few unsorted thoughts. If they sound like garbage to you, they likely are, and you can ignore them.

Performance

I cannot say anything profound about the performance implications that you referred to. But at least I can say that I was confused:

"Note that on modern scalar GPUs, using transpose( mat3( u_modelInverse ) ) is no more computationally expensive in a shader than using an explicit MODELINVERSETRANSPOSE uniform. Using an explicit uniform only introduces overhead to pass the uniform value to the shader."

I know that all sorts of state changes are expensive, but thought that setting uniforms was one of the cheapest. And I know that GPUs are ridiculously fast, but thought that still one of the main goals of having all the different semantics was to move as many computations out of the shader as possible: Doing computations for millions of vertices (compared to doing them once on host side) doesn't come for free.
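For reference, the two variants in the quoted note compute the same matrix. For an affine model matrix (bottom row 0 0 0 1), the upper-left 3x3 block of the inverse equals the inverse of the upper-left 3x3 block. The sketch below checks this with exact rational arithmetic. The helper names and the example matrix are made up for illustration, and the sketch says nothing about the relative cost of the two variants.

```python
from fractions import Fraction


def inverse(m):
    """Gauss-Jordan inverse of a square matrix, using exact Fractions."""
    n = len(m)
    # Augment the matrix with the identity: [m | I].
    aug = [[Fraction(v) for v in row] +
           [Fraction(int(r == c)) for c in range(n)]
           for r, row in enumerate(m)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    # The right half now holds the inverse.
    return [row[n:] for row in aug]


def mat3(m):
    """Upper-left 3x3 block, like GLSL's mat3(mat4)."""
    return [row[:3] for row in m[:3]]


def transpose(m):
    return [list(col) for col in zip(*m)]


# Example model matrix (row-major): scale, shear, and translation.
model = [
    [2, 1, 0, 5],
    [0, 4, 0, -3],
    [0, 0, 1, 7],
    [0, 0, 0, 1],
]

# Variant 1: what the shader would do with a MODELINVERSE uniform.
in_shader = transpose(mat3(inverse(model)))

# Variant 2: what the host would upload as MODELINVERSETRANSPOSE.
on_host = transpose(inverse(mat3(model)))

assert in_shader == on_host
```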

But of course, I assume that you proposed all this for a reason, and that there are benchmarks that support your implicit claims. I wonder how this translates to Vulkan, or to GL 4.x with glProgramUniform, but maybe the performance characteristics are similar there.

(A side note: I'm pretty sure that many glTF viewer implementations (including my own) are simple and straightforward in this regard: They just set the values of all uniforms, for each drawn primitive, regardless of whether the values have changed or not. I'd really be curious to see https://github.com/cx20/gltf-test extended with a basic performance test (maybe just a frame counter) and applied to "complex" models. Optimizations (even the most "basic" ones, like material sorting) are probably still beyond what most renderers implement, except for sophisticated ones like Cesium.)
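One of the "basic" optimizations alluded to here is skipping redundant uniform updates. A minimal sketch, assuming one cache per shader program and a hypothetical `upload` callback that stands in for the actual graphics API call (for example, a wrapper around `glUniformMatrix4fv`):

```python
from typing import Any, Callable, Dict


class UniformCache:
    """Remembers the last value sent per uniform and skips repeats.

    `upload` is a hypothetical callback standing in for the real API call.
    Values should be immutable (e.g. tuples) so that the stored copy cannot
    change behind the cache's back.
    """

    def __init__(self, upload: Callable[[str, Any], None]):
        self._upload = upload
        self._last: Dict[str, Any] = {}

    def set(self, name: str, value: Any) -> bool:
        """Upload `value` unless it equals the last value sent for `name`.

        Returns True if an upload happened, False if it was skipped.
        """
        if name in self._last and self._last[name] == value:
            return False
        self._last[name] = value
        self._upload(name, value)
        return True


# Usage: record uploads in a list instead of talking to a GPU.
uploads = []
cache = UniformCache(lambda name, value: uploads.append((name, value)))

view = (1.0, 0.0, 0.0, 1.0)   # shortened stand-in for a matrix
for _primitive in range(3):   # three primitives sharing the same view
    cache.set("u_view", view)

# Only the first of the three calls reached the upload callback.
assert len(uploads) == 1
```

The comparison itself costs CPU time, so whether this is a net win depends on the driver and on how often values actually repeat.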

Shader version information

When I first saw the extensions, I noticed that many of the notes about shaders and versions seemed to be the same in both. One reason why I did not read both of them in detail back then was that I would have had to pick out the parts that make up the actual extension. The extension proposals could probably be written much more compactly if the shader versioning part was explicitly factored out, maybe into a dedicated document that both can link to (although I'm not sure whether this is considered good practice for such proposals).

The changes in GLSL that are mentioned there (and the additional ones in newer GLSL versions that are not (yet) mentioned there) are important, and might become pressingly important when glTF moves forward. Although the explicit linkage to GLSL is about to be removed in glTF 2.0, knowing which features are supported (and how they may be addressed) may be valuable, e.g. that WebGL2 will support UBOs.

Redundancy is redundant

The most important point, and the boldest one (I'm really going out on a limb here): Aren't the proposed extensions nearly the same? I noticed that, at some point, I had them open in two browser tabs, and switched back and forth between these tabs to spot the differences. From my naive point of view, KHR_glsl_view_projection_buffer is basically included in KHR_glsl_multi_view, namely as the case of NUM_VIEWS==1. Couldn't the multi-view approach be some sort of "default", as the more generic of the two (which can emulate the other)? Again, I only read them quickly and won't claim to have understood them in full depth, but they looked very similar to me.

(EDIT: I also noticed the claims about the precision when multiplying matrices in the shader, but haven't read the entire paper yet)

mre4ce commented on July 30, 2024

Setting uniforms is not cheap by any means. Moving as much as possible to the CPU is something we did 10 years ago. Also keep in mind that there is usually a pretty large difference between vertex and fragment counts, so vertex cost is usually completely swamped by fragment cost. Note that the extensions I propose are all aimed at vertex uniforms. For performance testing, compare against the Vulkan-Samples glTF implementation and measure the CPU overhead using OpenGL.

https://github.com/KhronosGroup/Vulkan-Samples/blob/master/samples/apps/atw/scenes/scene_gltf.h

In particular, test using Qualcomm, ARM or AMD drivers. The NVIDIA drivers "hide" the cost by efficiently moving things to a separate thread. However, that still means the cost is there in terms of power draw and heat.

The KHR_glsl_view_projection_buffer and the KHR_glsl_multi_view extensions are deliberately not the same because not all platforms support GL_OVR_multiview2. Implementing KHR_glsl_view_projection_buffer as KHR_glsl_multi_view with NUM_VIEWS=1 would actually come at a performance cost.

emackey commented on July 30, 2024

No further work is being done on glTF 1.0 samples.
