teoxoy / encase
Provides a mechanism to lay out data into GPU buffers according to WGSL's memory layout rules
Home Page: https://crates.io/crates/encase
License: MIT No Attribution
Since the arrayvec::ArrayVec/tinyvec::ArrayVec types have a hard cap on the number of items they can contain, the capacity can be used on the shader side as the fixed-size array length. Besides enabling these Vec-like data structures, this would also allow this trick to work (we should, however, note somewhere that reading beyond the actual length of items that were written will effectively return garbage data).
This newtype would work in the same way a runtime-sized array works within storage buffers. It enables use cases where shaders are generated at runtime and the array size is injected into the shader. We should require a newtype for this since you have to be careful to inject the correct array size into the shader (the same size as the Vec's length).
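A minimal sketch of the idea (names and shapes are illustrative, not proposed API): a fixed-capacity vector whose capacity doubles as the shader-side array length, where slots past `len` hold stale data and would read back as garbage in the shader:

```rust
// Illustrative stand-in for arrayvec::ArrayVec / tinyvec::ArrayVec:
// CAP is known at compile time, so it can be used as the fixed-size
// array length in the generated shader code.
struct FixedVec<T, const CAP: usize> {
    items: [T; CAP],
    len: usize,
}

impl<T: Default + Copy, const CAP: usize> FixedVec<T, CAP> {
    fn new() -> Self {
        Self { items: [T::default(); CAP], len: 0 }
    }
    fn push(&mut self, v: T) {
        self.items[self.len] = v;
        self.len += 1;
    }
    fn len(&self) -> usize {
        self.len
    }
    // The capacity, usable on the shader side as the array length.
    const fn shader_array_len() -> usize {
        CAP
    }
}
```

Only the first `len` slots hold written data; the remaining `CAP - len` slots would still be uploaded and read back as garbage if indexed.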
Hey hi!
I am looking for a lib that helps me set my buffers using std140 easily. I am using crevice right now, but I am having trouble with shaders that require sized arrays or dynamically sized types, like:
struct PointLight {
    a: Vec3,
    b: Vec3,
    c: Vec3,
    ...
}
struct Material {
    attr: f32,
    attr2: Vec3,
    lights: Vec<PointLight>, // or [PointLight; 4]
}
So my question is: is opengl/std140 supported by encase? I see that the readme talks about WGPU only.
If so, is this kind of struct supported? This is exactly the code that I am trying to make work with crevice: https://github.com/Nazariglez/LearnOpenGL-Notan/blob/main/src/_2_lighting/_6_1_multiple_lights.rs#L264
There, #[uniform] just adds AsStd140 and Uniform to control the input type when setting the buffer's data.
If encase supports this, it would be great.
I'm hitting the issue where I want to implement ShaderType for a struct that has a [u16; 2] field, which isn't supported by Encase. AFAIU, seeing that WGSL doesn't support u16, that makes sense. The way I currently make this work is by representing the [u16; 2] as a u32 in WGSL and then unpacking it into a vec2<u32>.
Is this in scope for Encase?
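For context, the packing described above can be sketched like this (the WGSL-side unpack would be something like `vec2<u32>(v & 0xffffu, v >> 16u)`; the function name is illustrative):

```rust
// Pack two u16 values into a single u32, low half first,
// matching a WGSL unpack of `vec2<u32>(v & 0xffffu, v >> 16u)`.
fn pack_u16x2(a: u16, b: u16) -> u32 {
    (a as u32) | ((b as u32) << 16)
}
```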
Today I was trying to write this struct to a uniform buffer:
#[derive(ShaderType)]
pub struct MortonUniforms {
    lut: [u32; 4608],
    size_lut: [u32; 1 << (extended_morton_coder::SIZE_LUT_NUMBER_OF_BITS + 1)],
    morton_index_scale: f32,
    offset: Vec3,
    size_multiplier: f32,
    multiplier: Vec3,
}
This doesn't work because of lut and size_lut, with this error:
thread 'main' panicked at 'array stride must be a multiple of 16 (current stride: 4)', C:\Users\andre\.cargo\registry\src\index.crates.io-6f17d22bba15001f\encase-0.6.1\src\types\array.rs:47:17
I saw issue #40, and that the solution is to "create your own newtype around an array and [...] derive them" using the impl_vector macro, but I was confused about how to use that macro, and whether it even makes sense in my situation considering I have a lot more than 4 elements.
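One workaround sketch (an illustration, not the only option): regroup the flat u32 LUT into 16-byte chunks so each array element naturally has a 16-byte stride; the shader side would then declare `array<vec4<u32>, 1152>` and index with `lut[i / 4u][i % 4u]`. The function name is hypothetical:

```rust
// Regroup a flat u32 slice into [u32; 4] chunks, so a uniform-buffer
// array of these elements satisfies the 16-byte stride requirement.
// Assumes the input length is a multiple of 4 (4608 and 1 << N both are).
fn regroup(flat: &[u32]) -> Vec<[u32; 4]> {
    flat.chunks_exact(4)
        .map(|c| [c[0], c[1], c[2], c[3]])
        .collect()
}
```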
I used to think it did, because it worked really well. I have some rendering logic based on bevy, and while updating to support bevy 0.14.0 the program suddenly started crashing frequently. I encounter this error:
Caused by:
In Device::create_compute_pipeline
Error matching shader requirements against the pipeline
Shader global ResourceBinding { group: 1, binding: 0 } is not available in the pipeline layout
Buffer structure size 48, added to one element of an unbound array, if it's the last field, ended up greater than the given `min_binding_size`
With the following struct:
#[cfg_attr(feature = "bevy_wgpu", derive(ShaderType))]
#[derive(Default, Clone, Copy, Debug, PartialEq)]
pub struct Albedo {
    pub r: f32,
    pub g: f32,
    pub b: f32,
    pub a: f32,
}
#[derive(Clone, ShaderType)]
pub(crate) struct Voxelement {
    pub(crate) albedo: Albedo,
    pub(crate) content: u32
}
#[derive(Resource, Clone, AsBindGroup, TypePath, ExtractResource)]
#[type_path = "shocovox::gpu::ShocoVoxRenderData"]
pub struct ShocoVoxRenderData {
    ...
    #[storage(3, visibility(compute))]
    pub(crate) voxels: Vec<Voxelement>,
}
In shader:
struct Voxelement {
    albedo : vec4f,
    content: u32,
}
...
@group(1) @binding(3)
var<storage, read_write> voxels: array<Voxelement>;
I am not sure anymore, because if I add a _padding field to Voxelement as _padding: Vec3, the evaluations pass, but the program crashes...
Are vectors of structs supported, or should I try to manually arrange the structure, e.g. with #[repr(C)]?
Thank you for the help in advance!
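A hedged guess at the mismatch (an assumption, not a confirmed diagnosis): on the shader side, vec4f forces 16-byte alignment, so an array element of { vec4f, u32 } has stride roundUp(16 + 4, 16) = 32, while on the Rust side a derived struct of four f32 scalars plus a u32 may only carry 4-byte alignment, giving size 20. That would explain why adding padding changes the evaluation. The arithmetic:

```rust
// WGSL rounds a struct's size (and an array's element stride) up to
// a multiple of the struct's alignment.
fn element_stride(field_bytes: u64, align: u64) -> u64 {
    field_bytes.next_multiple_of(align)
}
```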
Hi! I noticed that encase_derive_impl (and possibly other parts of this repository) depend on syn@1. Perhaps you could consider updating to syn@2 wherever possible?
Some things worth noting for anyone who intends to take on this suggestion:
- syn@2's MSRV is Rust 1.56, up from syn@1's Rust 1.31. I am unsure if this may be a problem - the repository's main Cargo.toml specifies an MSRV of Rust 1.63, but encase_derive and encase_derive_impl do not specify an MSRV.
- Updating to syn@2 may not be as simple as swapping the version number - there are a number of breaking changes in syn@2, listed at https://github.com/dtolnay/syn/releases/tag/2.0.0.
If you do decide to transition to syn@2, you have my thanks! :)
I was trying to run coverage tests on something that depends on this crate and noticed that it fails with:
error[E0554]: `#![feature]` may not be used on the stable release channel
--> C:\Users\Eugene\.cargo\registry\src\github.com-1ecc6299db9ec823\encase-0.4.0\src\lib.rs:2:23
|
2 | #![cfg_attr(coverage, feature(no_coverage))]
|
According to https://github.com/taiki-e/cargo-llvm-cov#exclude-function-from-coverage this could be solved by a command line flag or by checking for coverage_nightly instead of coverage. Would a PR to switch to coverage_nightly be acceptable, or is it preferable to use the command line flag for llvm-cov?
I'm getting this error from rust-analyzer when using #[derive(ShaderType)] in VSCode:
no method `to_panicvals` on type `StdWrapper<&&str>`
Does anyone know how to fix this?
Hello,
Is there a way to add proper padding for dynamically sized arrays containing structures? Something like Vec<MyStruct>, so not structures containing dynamically sized arrays as can be seen in the example code. Here's my wgpu 0.13.1 compute shader program, which compiles and runs, but the resulting calculations are wrong. Only the first few instances are calculated properly and the rest come out zero, although they should not be zero since that's not how the buffer is initialized to begin with.
Ideally I would like to directly pass an array of cgmath Vector3 values to my shader program, with a type signature like Vec<Vector3<f32>>, but even having a wrapper struct that derives ShaderType would be good enough for now.
WGSL file:
struct Vec3 {
    x: f32,
    y: f32,
    z: f32
};
@group(0)
@binding(0)
var<storage, read> test_arr: array<Vec3>;
@group(0)
@binding(1)
var<storage, read_write> output_buf: array<Vec3>;
@compute
@workgroup_size(1)
fn main(@builtin(global_invocation_id) global_id: vec3<u32>) {
    var gg = test_arr[global_id.x];
    gg.x = 7.2;
    gg.y = 2.2;
    gg.z = 1.2;
    output_buf[global_id.x] = gg;
}
Rust program:
#[derive(ShaderType)]
pub struct Vec3Wrap {
    pub x: f32,
    pub y: f32,
    pub z: f32
}
let shader = include_wgsl!("./compute_calc_vis.wgsl");
let shader = engine.device.create_shader_module(shader);
let data = (0..28).map(|_| Vec3Wrap {
    x: 0.0,
    y: 5.0,
    z: 0.0,
}).collect::<Vec<_>>();
let mut buf = encase::StorageBuffer::new(Vec::new());
buf.write(&data).unwrap();
let byte_buffer = buf.into_inner();
let input_buffer = engine.device.create_buffer_init(&wgpu::util::BufferInitDescriptor {
    label: Some("Input Buffer"),
    contents: bytemuck::cast_slice(byte_buffer.as_slice()),
    usage: wgpu::BufferUsages::STORAGE | wgpu::BufferUsages::COPY_DST
});
let output_gpu_buffer = engine.device.create_buffer(&wgpu::BufferDescriptor {
    label: Some("Output Buffer"),
    size: byte_buffer.len() as _,
    usage: wgpu::BufferUsages::STORAGE | wgpu::BufferUsages::COPY_SRC,
    mapped_at_creation: false,
});
let mapping_buffer = engine.device.create_buffer(&wgpu::BufferDescriptor {
    label: Some("Mapping Buffer"),
    size: byte_buffer.len() as _,
    usage: wgpu::BufferUsages::COPY_DST | wgpu::BufferUsages::MAP_READ,
    mapped_at_creation: false,
});
let compute_pipeline = engine.device.create_compute_pipeline(&wgpu::ComputePipelineDescriptor {
    label: None,
    // layout: Some(&pipeline_layout),
    layout: None,
    module: &shader,
    entry_point: "main",
});
let bind_group_layout = compute_pipeline.get_bind_group_layout(0);
let pipeline_layout = engine.device.create_pipeline_layout(&wgpu::PipelineLayoutDescriptor {
    label: None,
    bind_group_layouts: &[&bind_group_layout],
    push_constant_ranges: &[],
});
let bind_group = engine.device.create_bind_group(&wgpu::BindGroupDescriptor {
    label: None,
    layout: &bind_group_layout,
    entries: &[
        wgpu::BindGroupEntry {
            binding: 0,
            resource: input_buffer.as_entire_binding(),
        },
        wgpu::BindGroupEntry {
            binding: 1,
            resource: output_gpu_buffer.as_entire_binding(),
        },
    ],
});
let mut encoder = engine.device.create_command_encoder(&wgpu::CommandEncoderDescriptor::default());
{
    let mut cpass = encoder.begin_compute_pass(&wgpu::ComputePassDescriptor::default());
    cpass.set_pipeline(&compute_pipeline);
    cpass.set_bind_group(0, &bind_group, &[]);
    cpass.dispatch_workgroups(data.len() as u32, 1, 1);
}
encoder.copy_buffer_to_buffer(&output_gpu_buffer, 0, &mapping_buffer, 0, data.len() as _);
engine.queue.submit(core::iter::once(encoder.finish()));
let output_slice = mapping_buffer.slice(..);
output_slice.map_async(wgpu::MapMode::Read, |_| {});
engine.device.poll(wgpu::Maintain::Wait);
let output = output_slice.get_mapped_range().to_vec();
mapping_buffer.unmap();
let ob = StorageBuffer::new(output);
let out_val: Vec<Vec3Wrap> = ob.create().unwrap();
info!("compute values:");
for x in out_val.iter() {
    info!("x: {}, y: {}, z: {}", x.x, x.y, x.z);
}
INFO [..] x: 7.2, y: 2.2, z: 1.2
INFO [..] x: 7.2, y: 2.2, z: 1.2
INFO [..] x: 7.2, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
INFO [..] x: 0, y: 0, z: 0
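One thing worth noting about the snippet above (an observation, not part of the original post): copy_buffer_to_buffer is passed data.len() (the element count, 28) as the byte count, while each three-f32 element occupies 12 bytes here. That would line up with the output showing two complete elements, a partial third, and zeros after that. A small sketch of that arithmetic:

```rust
// How many complete elements, plus leftover bytes, a partial copy covers.
fn full_elements_copied(copy_bytes: usize, elem_size: usize) -> (usize, usize) {
    (copy_bytes / elem_size, copy_bytes % elem_size)
}
```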
It's currently not possible to check whether a type T: ShaderType is runtime-sized or not. This makes it difficult, in generic code, to tell whether T::min_size can be used to accurately estimate how large a preallocated buffer needs to be to fit N elements of T, which mandates the use of dynamic allocation.
The motivation here is to support use cases that directly use wgpu::Queue::write_buffer_with to avoid extra copies, but that requires a preallocated, non-expandable buffer. To preallocate, a maximum capacity for the buffer must be known, and this requires summing the size of all elements (padding for alignment), or computing the capacity assuming the size is fixed. This is doable in non-generic contexts where we know a type is not runtime-sized, but not possible in generic contexts without T::is_runtime_sized or something similar.
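A sketch of the requested capability, with hypothetical names modeled loosely on ShaderType (this is not existing encase API):

```rust
// Hypothetical extension: expose runtime-sizedness as an associated const.
trait ShaderTypeExt {
    const IS_RUNTIME_SIZED: bool;
    fn min_size() -> u64;
}

// Generic preallocation is only possible when the element type is
// statically sized; otherwise the caller must fall back to a growable Vec.
fn prealloc_capacity<T: ShaderTypeExt>(n: u64) -> Option<u64> {
    if T::IS_RUNTIME_SIZED {
        None
    } else {
        Some(T::min_size() * n)
    }
}

// Example: a fixed-size 16-byte element.
struct FixedElem;
impl ShaderTypeExt for FixedElem {
    const IS_RUNTIME_SIZED: bool = false;
    fn min_size() -> u64 { 16 }
}
```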
Can you do a new crates.io release? I'm interested in having support for nalgebra 0.32
u64 and i64 have been supported since wgpu 0.20 with the addition of Features::SHADER_INT64, and should implement ShaderType.
#[derive(ShaderType)]
pub struct GpuSolariMaterial {
    pub base_color: [f32; 4],
    pub emissive: [f32; 4],
    pub base_color_texture_id: u32,
    pub normal_map_texture_id: u32,
    pub emissive_texture_id: u32,
}
GpuSolariMaterial::min_size() should be 48, but is 44.
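For reference, a sketch of the arithmetic behind the expected 48: the field bytes sum to 16 + 16 + 4 + 4 + 4 = 44, the vec4-sized fields give the struct 16-byte alignment, and WGSL rounds a struct's size up to a multiple of its alignment:

```rust
// Round n up to the next multiple of align (align must be a power of
// two or at least nonzero); this is the struct-size rounding rule.
fn round_up(n: u64, align: u64) -> u64 {
    (n + align - 1) / align * align
}
```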
Considering the following WGSL code:
struct Material {
    color: vec4<f32>
}
@group(0)
@binding(0)
var<uniform> material: Material;
This is what wgsl_to_wgpu generates:
#[repr(C)]
#[derive(Debug, Copy, Clone, PartialEq, encase::ShaderType)]
pub struct Material {
    pub color: nalgebra::SVector<f32, 4>,
}
Which causes the following build error:
error[E0277]: the trait bound `Matrix<f32, Const<4>, Const<1>, ArrayStorage<f32, 4, 1>>: ShaderSize` is not satisfied
--> src\shaders\shader.rs:4:41
|
4 | #[derive(Debug, Copy, Clone, PartialEq, encase::ShaderType)]
| ^^^^^^^^^^^^^^^^^^ the trait `ShaderSize` is not implemented for `Matrix<f32, Const<4>, Const<1>, ArrayStorage<f32, 4, 1>>`
|
= help: the following other types implement trait `ShaderSize`:
&T
&mut T
Arc<T>
ArrayLength
AtomicI32
AtomicU32
Box<T>
Cow<'_, T>
and 47 others
= help: see issue #48214
= note: this error originates in the derive macro `encase::ShaderType` (in Nightly builds, run with -Z macro-backtrace for more info)
Which is unexpected.
Please advise.
Pretty new to this so bear with me
WGSL:
struct Light {
    position: vec4<f32>,
    color: vec4<f32>,
}
Rust:
#[repr(C)]
#[derive(Debug, Copy, Clone, PartialEq, encase::ShaderType)]
pub struct Light {
    pub position: [f32; 4],
    pub color: [f32; 4],
}
When I try to use Light in a uniform buffer, I get this failed assertion from Light::assert_uniform_compat():
panicked at 'array stride must be a multiple of 16 (current stride: 4)
[f32; 4] is 16 bytes, yeah? It looks to me that, instead of treating each struct field as 16-byte-aligned data, it looks into the arrays and applies a stride per f32 instead of striding each [f32; 4] as a whole.
Would it be possible to make Light (as written, with [f32; 4]s) pass assert_uniform_compat?
I looked in the docs and found there is an impl<T> ShaderType for [T]. Maybe this precludes special treatment of [f32; 4]? Is the only solution to pick one of the matrix/vector libraries (like mint or nalgebra) and use that instead of raw [f32; 4]? If so, maybe we can add another example to ShaderType::assert_uniform_compat() to shed light on this scenario?
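For context, the rule the assertion enforces can be sketched as a one-liner: in WGSL's uniform address space an array's element stride must be a multiple of 16. [f32; 4] maps to array<f32, 4> with stride 4, which fails, while a vec4<f32> is a single 16-byte value with no array stride at all, which is why vector types from mint/nalgebra pass:

```rust
// The uniform address space constraint on array strides in WGSL.
fn uniform_array_stride_ok(stride: u64) -> bool {
    stride % 16 == 0
}
```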
Requested by @cwfitzgerald in #15.
Running the test suite in this repository through MIRI reports that <Vec<u8> as ByteVecExt>::try_extend_zeroed results in Undefined Behaviour.
I believe this is because the end of as_mut_ptr_end is open and therefore is not pointing to anything borrowed. I'll open a pull request in a minute with a fix, which both adds MIRI to CI and passes all tests under MIRI.
Currently, glam is at 0.21.
I tried updating encase, and there are no compile errors when upgrading directly, except that "std" has to be added to the "features".
I am currently trying to use encase with wgpu, but got a bit confused when I tried to use it for vertex buffers.
As far as I could tell, the only difference between StorageBuffer and UniformBuffer is that UniformBuffer calls assert_uniform_compat.
So if I need to make a vertex buffer, am I supposed to just use StorageBuffer?
(It has worked fine so far, by the way.)
Could you reexport crates like nalgebra to make sure consumers of encase are using the exact types for which ShaderType is implemented?
Right now, I think the only way to handle this is to manually check the version of nalgebra being used by encase and add that as a direct dependency of the dependent project.
Also, thank you for the amazing work =)
Right now, encase recursively calls write_into when serializing types into GPU memory. This typically follows the flow: copy field/value -> advance the writer by its padding -> repeat until done.
This unfortunately breaks vectorization when copying the data. Larger structs with heavily nested types like 4x4 matrices end up with tens or hundreds of these steps when they could just be directly memcpy-ed into the target buffer.
All types that have a fixed size at runtime and no additional padding are trivially memcpy-able into and out of GPU buffers. Similarly, arrays, slices, and Vecs of these types can also be batch memcpy-ed where applicable.
This information is statically available at compile time in a type's METADATA. If-statements on constant expressions will optimize out the unused branch, so this should be doable even without compiler support for specialization.
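A minimal sketch of that constant-branch idea (trait and names here are hypothetical stand-ins, not encase's actual METADATA machinery):

```rust
// Compile-time dispatch between a bulk memcpy and the recursive
// per-field path. IS_POD stands in for "fixed size, no internal padding".
trait FastWrite: Sized + Copy {
    const IS_POD: bool;

    fn write_batch(data: &[Self], out: &mut Vec<u8>) {
        if Self::IS_POD {
            // One bulk copy: element bytes are contiguous with no padding,
            // so the whole slice can be reinterpreted as raw bytes.
            let bytes = unsafe {
                core::slice::from_raw_parts(
                    data.as_ptr() as *const u8,
                    core::mem::size_of_val(data),
                )
            };
            out.extend_from_slice(bytes);
        } else {
            // Fall back to the recursive per-field path (elided here).
            unimplemented!("per-field write_into path")
        }
    }
}

impl FastWrite for f32 {
    const IS_POD: bool = true;
}
```

Since IS_POD is a constant, the branch not taken is optimized out, giving the memcpy fast path without specialization.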
I have a generic vector implementation:
#[repr(C)]
struct MyVec<T> {
    pub x: T,
    pub y: T,
    pub z: T,
}
pub type MyVecf = MyVec<f32>;
but it has usize variants, so I can not use the ShaderSize derive macro (as far as I know).
I've tried implementing the ShaderSize trait manually, but I just can't understand how it operates. I've also dug around the repo and the supported types for clues, and found the following, which seems to work:
use encase::{impl_vector, vector::AsMutVectorParts, vector::AsRefVectorParts};
impl_vector!(3, MyVecf, f32; using From);
impl AsRefVectorParts<f32, 3> for MyVecf {
    fn as_ref_parts(&self) -> &[f32; 3] {
        unsafe { &*(self as *const MyVecf as *const [f32; 3]) }
    }
}
impl AsMutVectorParts<f32, 3> for MyVecf {
    fn as_mut_parts(&mut self) -> &mut [f32; 3] {
        unsafe { &mut *(self as *mut MyVecf as *mut [f32; 3]) }
    }
}
but I would really like to avoid using unsafe code; is there any possibility of that?
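One way to avoid the unsafe casts, sketched under the assumption that you can change MyVec's internal representation: store the components as an array, so the parts view is a plain field borrow. Names here are illustrative:

```rust
// Same layout as { x, y, z }, but the array storage makes
// the &[T; 3] views trivial field borrows, with no pointer casts.
#[repr(transparent)]
struct MyVec3<T>([T; 3]);

impl<T: Copy> MyVec3<T> {
    fn new(x: T, y: T, z: T) -> Self {
        Self([x, y, z])
    }
    fn x(&self) -> T {
        self.0[0]
    }
    // The AsRefVectorParts / AsMutVectorParts bodies then reduce to these:
    fn as_ref_parts(&self) -> &[T; 3] {
        &self.0
    }
    fn as_mut_parts(&mut self) -> &mut [T; 3] {
        &mut self.0
    }
}
```

The trade-off is that field access goes through methods (or Deref) instead of named x/y/z fields.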
When writing out from an encase buffer via into_inner/as_ref and getting e.g. a Vec<u8>, that inner structure has an unaligned length (there might be padding missing at the end of it).
This makes it impossible to append anything to a GPU-side buffer after having written encase data to it once, because how would one know the next properly aligned offset?
It would be great to have a get_(next_)offset function alongside the already existing set_offset on encase's types.
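A sketch of what such a helper could compute (the name and signature are hypothetical, not encase API): the next write offset after `len` bytes, rounded up to whatever alignment the following data requires.

```rust
// Next properly aligned offset after `len` bytes have been written;
// `align` would come from the alignment of the data being appended.
fn next_offset(len: u64, align: u64) -> u64 {
    len.div_ceil(align) * align
}
```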
In rend3's vertex pulling rewrite I need to deal with sparse updates of buffers a lot. Because of that, I'm not actually writing out the entire buffer all at once; I end up uploading the individual items that changed at the byte offsets in the buffer where they are supposed to be.
I would love to have a definition like this:
#[derive(ShaderType)]
pub struct DirectionalLightShader {
    header: DirectionalLightHeader,
    #[runtime]
    lights: Vec<DirectionalLight>,
}
But because I can't upload the whole thing at once, I can't actually just use WriteInto here. What I would love is an API that would give me the offset into the structure for a given member. Maybe something like:
/// Raw value, because it's just a plain member
let offset: u64 = DirectionalLightShader::Offsets::HEADER;
/// Vectors require you to give the index to get the full offset
let offset: u64 = DirectionalLightShader::Offsets::LIGHTS::index(2);
This would let me use encase's layout even though I need my own custom, weird layout system. I'm not at all attached to the syntax; it's just what it would let me accomplish.
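A sketch of the offset math such an API could encapsulate; all the sizes, alignments, and names below are illustrative assumptions, not encase internals:

```rust
// Offset of element `index` of a trailing runtime-sized array that
// follows a fixed-size header: round the header size up to the array's
// element alignment, then step by the element stride.
fn runtime_array_offset(
    header_size: u64,
    elem_align: u64,
    elem_stride: u64,
    index: u64,
) -> u64 {
    let base = (header_size + elem_align - 1) / elem_align * elem_align;
    base + index * elem_stride
}
```

A member like `Offsets::HEADER` would simply be 0 here, while `Offsets::LIGHTS::index(i)` would be this computation with the header's size and the element type's layout filled in.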
This flamegraph seems to show that many of the potentially hot function calls are not being inlined (e.g. <f32 as WriteInto>::write_into). This is potentially very costly in terms of performance.
This also seems to carry over to internal types like Cursor and many of the BufferRef/BufferMut implementations.
Most likely after bitflags v2, since it will release with a new BitFlags trait.