
Comments (4)

james7132 commented on July 28, 2024

> How did you generate the codegen in the diffs above? With the unchecked change, I think those copies should have been vectorized.

It's honestly sort of jank. I compile with cargo, using Rust flags to make rustc output the *.asm files, then run them through rustfilt to pretty them up a bit, then use git to track the diff in a nice interface. I'd use something like Godbolt, but setting up my own server with the Rust crates I want tends to be a lot more effort.

> Regarding const fns not being inlined, that's odd to me, I thought all const fns would be evaluated at compile time and their result inlined. If you could open a PR that adds the #[inline] attribute on all const fns, that would be great!

If the entire expression is constant (inputs, outputs, static dependencies), the result will be computed at compile time. If the function is used in a non-const context, it will be treated as a normal function, which includes not being inlined if the function is considered big enough.
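
To make the distinction concrete, here is a minimal sketch. The `Metadata` type and its `stride` accessor are hypothetical stand-ins, not encase's actual `ArrayMetadata` API. The same `const fn` is called once in a const context, where the compiler must evaluate it, and once in a runtime context, where it is an ordinary call that the optimizer may or may not inline.

```rust
/// Hypothetical stand-in for a metadata type; not taken from encase.
pub struct Metadata {
    stride: u64,
}

impl Metadata {
    pub const fn new(stride: u64) -> Self {
        Self { stride }
    }

    // `const fn` only *permits* compile-time evaluation; it says nothing
    // about inlining. `#[inline]` is a separate hint to the optimizer.
    #[inline]
    pub const fn stride(&self) -> u64 {
        self.stride
    }
}

// Const context: both calls are evaluated by the compiler.
pub const META: Metadata = Metadata::new(16);
pub const STRIDE: u64 = META.stride();

// Non-const context: `stride` is called like any other function here,
// and whether it gets inlined is up to the optimizer.
pub fn stride_at_runtime(meta: &Metadata) -> u64 {
    meta.stride()
}
```

Whether `#[inline]` changes the generated code in a given build is something to confirm by diffing the assembly, as described above.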


james7132 commented on July 28, 2024

This was discussed on the Bevy Discord (link). A summary of the codegen analysis:

  • encase already does a capacity check only once per write:
    pub fn new<T: ShaderType>(data: &T, buffer: B, offset: usize) -> Result<Self> {
  • ArrayMetadata's accessors are not being inlined, even on the most aggressive compilation settings.
    • Experimental change: Commit
    • Change in codegen: diff
    • Observed result: Metadata not being accessed causes unsupported use cases (i.e. certain structs in uniform buffers) to collapse into the panic, resulting in a lot more codegen than necessary. This is probably not actively impacting hot-path performance, though.
    • Additional note: this also probably applies to the other metadata types as well, even if they're all const.
  • SliceExt::array and SliceExt::array_mut both use checked conversions into &mut [T] on the target subslice. This results in a lot of extra branching. I attempted to replace this with an unchecked conversion instead.
    • Experimental change: commit
    • Change in codegen: diff
    • Observed result: All of the branches disappeared. The copy is not vectorized, though it should be, but there aren't any unnecessary branches anymore.
    • Additional note: This implementation is unsound as there is no capacity check when converting to the array. Either the unsafe needs to be lifted out and added as an invariant (basically treating the slice as a raw pointer), or we need to find another way to avoid the branch.
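
For illustration, the checked and unchecked conversions being compared might look roughly like the sketch below. These free functions and their names are assumptions made for this example; they are not the actual `SliceExt` implementations from encase.

```rust
// Needed on editions before 2021; already in the prelude on 2021 and later.
use std::convert::TryFrom;

/// Checked conversion: the range index bounds-checks the subslice, and
/// `try_from` checks the length again. Both checks can panic.
pub fn array_mut_checked<const N: usize>(data: &mut [u8], offset: usize) -> &mut [u8; N] {
    let sub = &mut data[offset..offset + N];
    <&mut [u8; N]>::try_from(sub).unwrap()
}

/// Unchecked conversion: no branches in release builds, but the bounds
/// requirement becomes an invariant the caller has to uphold.
///
/// # Safety
/// The caller must guarantee that `offset + N <= data.len()`.
pub unsafe fn array_mut_unchecked<const N: usize>(data: &mut [u8], offset: usize) -> &mut [u8; N] {
    debug_assert!(offset + N <= data.len());
    // SAFETY: the caller guarantees the range is in bounds, and `[u8; N]`
    // has the same layout as N consecutive `u8`s.
    unsafe { &mut *(data.as_mut_ptr().add(offset) as *mut [u8; N]) }
}
```

Marking the second function `unsafe` is what "lifting the unsafe out" amounts to: the capacity check moves out of the function and into its documented contract.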

TODO: Actually benchmark the changes here to see if the gains are significant enough to warrant these kinds of changes.


james7132 commented on July 28, 2024

Another potential middle ground is to change the vector and matrix implementations to copy their bytes directly instead of relying on the underlying components' implementations. This would eliminate the vast majority of the branches being produced, potentially allow for vectorized copies, and avoid the need for infectious use of unsafe.
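
As a rough sketch of the idea, compare writing a four-component vector one component at a time with assembling its bytes first and copying them in one go. Both function names are hypothetical, the `[f32; 4]` representation is an assumption, and little-endian output is assumed for the target buffer; none of this is taken from encase.

```rust
/// Per-component write: every component slices into the buffer separately,
/// so each one carries its own bounds check.
pub fn write_vec4_per_component(buffer: &mut [u8], offset: usize, v: &[f32; 4]) {
    for (i, c) in v.iter().enumerate() {
        let start = offset + i * 4;
        buffer[start..start + 4].copy_from_slice(&c.to_le_bytes());
    }
}

/// Direct write: the vector's bytes are assembled in a fixed-size local
/// array, then copied with a single bounds-checked slice operation.
pub fn write_vec4_direct(buffer: &mut [u8], offset: usize, v: &[f32; 4]) {
    let mut bytes = [0u8; 16];
    for (chunk, c) in bytes.chunks_exact_mut(4).zip(v.iter()) {
        chunk.copy_from_slice(&c.to_le_bytes());
    }
    buffer[offset..offset + 16].copy_from_slice(&bytes);
}
```

The direct version stays entirely in safe code. Whether the compiler actually turns it into a vectorized copy would still need to be checked in the generated assembly.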


teoxoy commented on July 28, 2024

This is great stuff, thanks for looking into it!

I think the most promising optimization here would be using unchecked versions of SliceExt::array and SliceExt::array_mut, making all read and write methods unsafe, and making sure we always check the bounds in the buffer wrappers, which is the API most users will interact with anyway. A PR doing this would be welcome, but I'm curious how much perf we will get from this; we should benchmark it.
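
One way to picture that layering is the sketch below. `RawWriter`, `CheckedBuffer`, and `OutOfBounds` are hypothetical names invented for this example and do not correspond to encase's actual types. The low-level write is `unsafe` and branch-free, and the safe wrapper performs the single bounds check before calling it.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub struct OutOfBounds;

/// Hypothetical low-level writer whose methods skip bounds checks.
pub struct RawWriter<'a> {
    bytes: &'a mut [u8],
}

impl<'a> RawWriter<'a> {
    /// # Safety
    /// The caller must guarantee that `offset + N <= self.bytes.len()`.
    pub unsafe fn write_unchecked<const N: usize>(&mut self, offset: usize, value: &[u8; N]) {
        // SAFETY: the destination range is in bounds per the caller's
        // guarantee, and `value` cannot overlap the exclusively borrowed buffer.
        unsafe {
            std::ptr::copy_nonoverlapping(value.as_ptr(), self.bytes.as_mut_ptr().add(offset), N);
        }
    }
}

/// Hypothetical safe wrapper: the API most users would interact with.
pub struct CheckedBuffer {
    data: Vec<u8>,
}

impl CheckedBuffer {
    pub fn new(len: usize) -> Self {
        Self { data: vec![0; len] }
    }

    /// Checks the bounds once, then delegates to the unchecked writer.
    pub fn write<const N: usize>(&mut self, offset: usize, value: &[u8; N]) -> Result<(), OutOfBounds> {
        let end = offset.checked_add(N).ok_or(OutOfBounds)?;
        if end > self.data.len() {
            return Err(OutOfBounds);
        }
        let mut raw = RawWriter { bytes: &mut self.data };
        // SAFETY: `offset + N <= self.data.len()` was verified just above.
        unsafe { raw.write_unchecked(offset, value) };
        Ok(())
    }

    pub fn as_slice(&self) -> &[u8] {
        &self.data
    }
}
```

With this shape, soundness depends only on the wrapper's check being correct, so the `unsafe` stays contained instead of leaking into user code.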

How did you generate the codegen in the diffs above? With the unchecked change, I think those copies should have been vectorized.

Regarding const fns not being inlined, that's odd to me, I thought all const fns would be evaluated at compile time and their result inlined. If you could open a PR that adds the #[inline] attribute on all const fns, that would be great!

