Rework Command Buffer Submission

Remove CrCommandQueue and create command buffers from the render device. Also submit
Rework command buffer submission
Allow for signals to be inserted in "command buffers", this internally will flush the hardware command buffer and send to the queue. For example, commandBuffer->SignalFence(Graphics) would chop the command buffer in two

Rework command queue so that committing a command buffer doesn't schedule it right there, but keeps them in a list, such that we can call flush on it, which actually dispatches those command buffers (and potentially signals) so they may be optimally scheduled on the GPU.

D3D12 makes this observation that might be useful:

Calling ExecuteCommandLists twice in succession (from the same thread, or different threads) guarantees that the first workload (A) finishes before the second workload (B). Calling ExecuteCommandLists with two command lists allows the driver to merge the two command lists such that the second command list (D) may begin executing work before all work from the first (C) has finished. Specifically, your application is allowed to insert a fence signal or wait between A and B, and the driver has no visibility into this, so the driver must ensure that everything in A is complete before the fence operation. There is no such opportunity in a single call to the API, so the driver is able to optimize that scenario.

I think the best improvement we can do here is remove the queue from the abstraction, and instead put the queue inside the render device, and index them via an enum that points to the appropriate queue, such as

CrQueueType
{
    Graphics,
    AsyncCompute,
    Copy
}

and call renderDevice->SubmitCommandBuffer(commandBuffer) or something like that. Eventually we call FlushCommandBuffers, etc.

We also create a command buffer that belongs to the queue through the render device directly, instead of through the main command queue.

renderDevice->CreateCommandBuffer(CrQueueType::Graphics)

All the details about queues, etc aren't really necessary, just the ability to determine where a command buffer comes from and where it's going.

Compiling Built-in Shaders

Compiling built-in shaders is tricky, but here are the requirements we want to fulfill.

They need to be in sync with the build. We don't want a separate manual process. They will be considered same as code
The build process is able to determine which are out of date and compile those
The build process creates a code-accesible database of shaders. Initializing pipeline states and shaders becomes easy
Process verified at compile time, i.e. it shouldn't be possible to instantiate a built-in shader that WAS removed or doesn't exist
The compilation process calls CrShaderCompiler
We need to be careful if we add debug information as the compile times can greatly increase
It should be easy to recompile the shaders live, including on consoles
It should be easy to monitor the files for automatic recompilation, even on consoles
We need to be careful about compile times, especially on PC where both D3D12 and Vulkan will be available
Compilation must happen as parallel as it is for normal source files

Tasks

Custom tool gets all .shaders files and distributes compilation

Calls CrShaderCompiler for every shader
Cross-platform
Must do shader distribution by itself, using threads or equivalent
Dependencies must be tracked manually but can use cache
We can do a single metadata header and cpp with all the shaders. This is a big advantage

Unify index buffers

Every render mesh has an index buffer, but it is unique. We can do better by having index buffers for every imported render model (and store an offset in the mesh). That way we can render meshes in one go if their material is the same, or split them if it isn't. We'll need to change the sort key logic for that too.

This would benefit a theoretical prepass or visibility buffer pass

Unit Tests

Some of the core classes could do with good unit tests:

CrPath - make sure to get all the examples from https://en.cppreference.com/w/cpp/filesystem/path/extension
CrModelInstance - create a many model instances, destroy them, make sure we retain the same information as before (they gets shuffled around for optimal traversal)

Implement CrTimer class

This will allow us to make simple measurements across the codebase. First use intended to measure shader compile times, add FPS counter, etc

Rework GPU Buffers

GPU Buffers

Current Model

There is a hardware GPU buffer. This is the same as a buffer created through the graphics APIs. It has a fixed size and properties
There are GPU buffers, which are a mixture between a buffer (when they are owning buffers) and a view (when they don't own the buffer)
We have separate vertex and index buffer classes so they can store additional properties such as the vertex descriptor or the index type. They derive from GPUBuffer

New Model

There is a hardware GPU buffer, same as in the old model
There are vertex and index buffers, but they're just a container for a hardware buffer, plus some extra data. We cannot inherit from the hardware GPU buffer because we don't have access to the platform-specific implementation of the hardware buffer unless we want to put those inside the platform-specific parts, which we can do
GPU Buffers now own hardware buffers, and cannot be treated as views
There are GPUBufferViews now, which we can create from existing buffers only, to ensure they are compatible

D3D12 Backend

There are several fundamental tasks missing

Requirements for descriptor heaps

Non-shader-visible descriptor heap that I can return and reuse descriptors. A pool
Shader-visible stream of descriptors. Command buffers reserve chunk of descriptors as they need them (this is the only sync point) and use those throughout the frame. Those chunks are reserved for the duration of the frame but reused later

Review GPU Selection code

At the moment we're overriding any OS decision so we can test specific GPUs, but it might make sense to add a command line parameter so specify what we favor, and otherwise let the OS do it for us.

Fix Validation Layer Errors

D3D12:

Fix resource transition validation errors. This is fixed by transitioning resources to the default state at the end of the frame

Vulkan:

Fix resource destruction validation layers. We are destroying resources that are still in use in some cases

Improve camera

The main camera needs a bit of work, some stuff is bugfixing, some is extra functionality

Fix the right vector rotation around which we rotate the camera. There is currently some drift
Fix the relative movement. We want to record an absolute value that we have moved so that we always "come back" to the origin of the movement
Add an orbiting camera
Add zoom

Implement Render Graph

This task is going to be a little tricky. The idea behind implementing resource transitions is to take it at a really high level. Vulkan and D3D12 are quite different in this regard so we need to implement this in a "usage" sort of way, i.e. tell it what we want to do with it instead of giving it a source and destination state.

Another thing is that the API is provided as if things were going to be manual (i.e. there is no implicit state tracking), but the idea is that a RenderGraph system would sit on top and only that system would perform actual transition calls, unless some very specific usage called for manual intervention.

Resource transitions are intrinsically linked to render passes, so we may want to extend render passes to actually specify more than for example VkRenderpass needs, such as UAVs, or even create a compute pass, such that these dependencies are solved inside the Create*Pass functions. This means there is no need to transition textures manually other than e.g. inside texture loading where we have to perform those transitions to move them out of copy states, etc.

Fix resource states. We need to decide whether resource states are common between textures and buffers, and if not, ensure that they reflect the functionality that we need accurately. States aren't necessarily states internally, rather they are usage clues so that we can react appropriately in the platform-specific code
Create a compute pass. BeginComputePass(computePassDescriptor), EndComputePass(), etc. This can also inform how to best do the high level render passes later
How are resources bound? Does a user bind manually inside the execution of the render pass or are they bound internally during the BeginPass()? Especially textures, where some have to be bound globally and transitioned appropriately. It definitely could be a combination.
If resources are bound during the BeginPass, it means we need to include the binding point at the point of the render pass.
Implement some form of commandBuffer->TextureBarrier and BufferBarrier functions
Take into account that render passes in Vulkan perform implicit transitions, so they don't need to explicitly do them, whereas D3D12 does need to do them manually
Create a good set of usages for textures, and another for buffers. Keep them completely separate, even though these APIs like to put them together. They are often incompatible and even contradictory combinations
Add initial state for a texture, that we may be able to have a guarantee of initial states

Update:

The final state for this task ended up as:

Implemented Render Graph
Kept BeginRenderPass/EndRenderPass and added a pass type (Graphics, Compute)
Removed TextureBarrier and BufferBarrier APIs, they are handled in the passes
Separated texture and buffer usages, it makes no sense to share them (e.g. a buffer as a render target or a texture as an indirect buffer)

Documentation

https://github.com/KhronosGroup/Vulkan-Docs/wiki/Synchronization-Examples : Very good examples and use cases
https://github.com/Tobski/simple_vulkan_synchronization/blob/master/thsvs_simpler_vulkan_synchronization.h : short library that has enums that map to a combination of image layout, access, and pipeline stage, that we can get inspiration from

Add RenderDoc support

Currently, Vulkan can invoke RenderDoc through an extension. However there's no way to programmatically trigger a capture or take multiple consecutive captures of a region. Further, D3D12 does not have such a mechanism and it would be very useful to have one.

Hook up renderdoc right before the creation of the device
Create a platform-independent API for triggering captures, a count, etc
Make sure this API works with things that aren't RenderDoc necessarily, such as PIX, Razor, etc. In the future we could add a command line called -pix, -razor, etc and they all need to work

Add GLTF importer

Integrate DXC in shader compiler

Integrate dxc to compile HLSL to SPIR-V for the Vulkan backend

Add folder facilities

There needs to be a central place where we can access general paths like:

Temp directory
Shader temp directory
Built directories

etc

Render Graph Improvements

Rework render graph to accept other types of passes, of which passes with lambdas are just one of them. Make a parent class with pure virtual functions Setup() and Execute(), and if we pass in lambdas just create an instance of a class whoe execution of those lambdas goes through the virtual. We need to be able to isolate things outside their execution point if need be
Validation for resources being used that weren't declared in the pass
Validation for passes that produce resources that aren't consumed by any other pass
Validation for passes that read and write to the same resource (bound as texture and render target for example)
Make sure when we bind textures we provide the bind handle (if bindless provide handle + offset)

Add FBX importer

Using the standard FBX importer library

Create Texture Upload and Download

Model

Texture uploading and downloading needs to happen explicitly for DX12-class APIs.

Having an auxiliary command buffer that lives on the render device and does uploads and downloads allows us to defer the upload phase until the frame is about to start, or we can send the command buffer early and get another one, etc It can also deal with using the copy queue. Perhaps a texture uploader class would be useful, in that the code would be platform-independent and would deal with scheduling and other complexities. This upload should handle buffers as well.

There would be two APIs, one scheduled on the render device, another on a command buffer directly. They have different purposes.

The render device one is meant to be used outside the flow of the frame, and can be more disruptive. It's meant mainly for tooling and non performance-critical bits of code. It can be immediate or deferred (waiting on a lambda, etc)
The command buffer one is more straightforward, as it follows the GPU timeline directly. It cannot be immediate or it will disrupt the natural flow of the frame. We can e.g. copy a texture to a buffer, add to a queue and pick up the result later in a lambda (such as save the texture to disk)

Resource States

One of the most complicated issues of explicit APIs is having to track the state of resources. The render graph can track those resources, but outside of that we need to make certain guarantees.

During the execution and flow of the render graph, every resource state is known
If the resource is not tracked by the graph, or we are outside the execution of the graph, the resource is assumed to be in a default state
All operations that change the state of a resource must restore it to the default state after they finish. This needs to be a strong guarantee or we'll get validation errors
For example, the render graph will always put a tracked resource in its default state at the end

Requirements

Need to guarantee texture isn't in use by GPU
Need an immediate mode (e.g. downloading texture to immediately save it)

Consoles

Direct access to memory. No need to copy, it's just there

PC

Need to download/upload the texture to the CPU via PCIe. Possible latency

Implementation Problems

State of texture is not known (for barriers)
Need to return memory, then dispose of it or keep it around - fixed by returning a buffer
Need to have option of waiting or not

Scenarios

1. Download pixel memory immediately

Console: Read immediately
PC: Queue download, wait for idle, read immediately

// Download texture and read immediately
renderDevice->DownloadTextureImmediate(ICrTexture* texture, [](void* memory, uint64_t size)
{
    // Copy or do behavior here
});

// Creates a CPU-visible buffer
// Puts command in render device command buffer to copy texture into buffer
// Waits for device to finish
// Returns buffer
CrGPUBufferHandle buffer = renderDevice->DownloadTextureImmediate(texture);

2. Download texture and read whenever it's ready (e.g. continuous read)

Both: Insert a callback that executes when memory is ready

renderDevice->DownloadTextureQueue(ICrTexture* texture, [](void* memory, uint64_t size)
{
    // Copy or do behavior here
});

// Creates a buffer
// Puts command in render device command buffer to copy texture into buffer
// Puts a request in the transfer queue
// Loops through the transfer queue every frame, and if finished, executes callback
renderDevice->DownloadTextureQueue(texture, [](ICrHardwareBuffer* buffer)
{
    // Copy or do behavior here
});

// TODO An alternative is to have a BeginDownload() and a wait for download.

3. Copy pixel memory onto a texture

// Transitions texture to dst
// Copies buffer into texture
// Transitions texture into original state
// Optionally submits and waits for device to finish
renderDevice->UploadTextureImmediate(buffer, texture);

Uploading textures to GPU from CPU. Has to have option to stream it directly
On consoles it would just copy it directly
On PC it would queue it at the end of the function so it gets uploaded/swizzled
This API is unable to properly issue barriers
renderDevice->UploadTexture(ICrTexture* texture, [](uint8* stagingBufferMemory)
{
// Serialize texture here into stagingBufferMemory
});

// Begin texture upload
// This creates a temporary buffer on NUMA, maps the texture directly on UMA
// Returns a pointer, so that we can populate the buffer as fast as we see fit
uint8_t* textureData = renderDevice->BeginTextureUpload(texture);

// Populate textureData
for(int w = 0; w < width; ++w) ...

renderDevice->EndTextureUpload(texture);

Fix Vulkan binding slot issues

In Vulkan there is no per-stage concept of slot for resources like in previous APIs. For example, a shader that uses a constant buffer in slot 0 in the vertex buffer is expecting the pixel shader to not have anything bound to the same slot. However since we don't compile shaders together until runtime, it is not possible to offset them appropriately unless we hardcode them, but that's not really the most efficient way. Solutions to this could be:

Offset them by a fixed amount per stage. This imposes a fixed table for all shaders but makes inefficient use of the API and memory
Assign a descriptor set per stage. The limitation here is that implementation only guarantee 4 descriptor sets, which is typically a limitation on mobile
On the fly change the bindings once we have all parts of the shader. In theory it should be possible, given Vulkan is simple to modify. Monotonically increase the bindings between stages. Since reflection will give us that information later we don't really mind where things go

Resources:
https://zeux.io/2020/02/27/writing-an-efficient-vulkan-renderer/
http://kylehalladay.com/blog/tutorial/2017/11/27/Vulkan-Material-System.html
http://kylehalladay.com/blog/tutorial/vulkan/2017/08/13/Vulkan-Uniform-Buffers.html

Fix Singleton pattern

The current singleton usage relies on the order of initialization being defined, to some extent. For example, the ImGuiRenderer creates CrTextures that need deleting before the render device is destroyed. However, singletons are defined as either static unique pointers or simple objects.

We probably want a singleton system where on creation they register themselves manually to a list, and on destruction they deinitialize in the reverse order they were initialized in. Therefore any dependencies they may have are implicitly taken care of, instead of manual deallocation of each system.

Create Material Compiler

The material compiler is an ubershader management system. It allows us to compile ubershaders, set defines, features, etc. It is intimately linked to the material, and ultimately will allow us to create materials by adding or removing features. The intention is to also extend it further down the line to a material creation system.

The tasks for this Material Compiler are:

Hashing mechanism: store shaders in a disk cache (could be networked in the future) to prevent duplicated compilations
Preprocessor: preprocess ubershader to create a global hash. Also remove whitespace, comments, etc
Caching mechanism: linked to the hashing, we need a place to store the optimized binaries
Shader sources: a place to collect the shader code that we can use to compile
Multithreaded shader requests: any given material requests multiple shaders (variants, platforms), service them quickly
Recompilation mechanism: ability to recompile material shaders and recompile them easily

Rework CrPath

std::filesystem::path is very convenient but we cannot really modify it and converting to string incurs in constant construction of strings. It could be beneficial to have a limited functionality path with a fixed length and normalization routines, etc.

Object Selection

We want to start implementing a very basic object selection pipeline to be able to start moving entities around. The requirements for this system are:

Select an object
Highlight selected object
Make a translation gizmo
Be able to select translation gizmo
Move entity

For that there are certain things we need to make sure we are able to do

Trigger selection code on input
Create a new render pass with objects that pass the selection candidate test (small region in screen)
These objects should have the debug shader enabled
The debug shader should have a special 'unique id' mode assigned to them
The unique id is rendered into a tiny id buffer
This id buffer is analyzed and entity information is extracted from it
For this we need to implement texture/buffer downloading
Once we have the entity, we trigger the highlight shader which creates an outline

Create a Cycle Timer

In the same spirit as CrTimer, create a CrCycleTImer, to measure cycles and compare performance. HLSL++ already uses _rdstc for it, it's just a question of getting a nice interface for it

Improve Present and Swapchain

To conform with D3D12 and potentially other APIs

Move AcquireNextImage to Present. Present calls potentially block
Fence waiting is currently very explicit. Can we hide the implementation detail by making the swapchain do that behavior?
Creating a swapchain can take a previous swapchain. We can force destroy the incoming handle once we extract the HWND or other necessary information so that we don't have to keep it around. Vulkan can actually take advantage of the previous swapchain when recreating it

Add debugging information to shaders

Basically, we need to output PDBs so it's easier to inspect and modify shaders. We need a good, centralized system to do it. It can either be in the shader compiler or passed in externally somehow, but they all need to go into a single directory (per platform) and it needs to be deleted easily (so inside temp or something similar where people might look for them)

They are not final objects, we just want them for debugging. Both D3D12 and Vulkan have different mechanisms to do it. In terms of embedding the information for the PDB, we probably need to add something in the shader reflection header.

Render World

RenderWorld contains all instances of objects that participate in the rendering in one way or another. For example, model instances, lights, cameras, etc. There are more creative uses such as proxies, etc. All render world objects have a transform.

The render world should be a flat list of objects, so we need to make sure every model instance has a model instance id, this has a problem because we want a local id (iterate over model instances) and a global id (iterate over visibility). We could have a list of visibility ids that one can retrieve

Creating and removing an instance incurs in a search to find an empty space, but access to them is linear and can also be split by transform, etc.

Create a Render World
Create a Model Instance
CreateModelInstance() creates a model instance that uniquely belongs to that render world, so it must have an id assigned to it
DestroyModelInstance() destroys the model instance, and frees up the ids
Getting the different properties of a model instance goes through the render world
The render world must somehow ensure that all its entities are destroyed before it. It should probably hold a reference to them and be able to destroy whichever are left
The render world interacts with a visibility system. The visibility system is able to iterate through the properties of the render world and creates its own list of visible objects based on the current camera
The render world is then able to iterate through the visible entities to render them

Make CrShaderCompiler use dxc

Glslang is still good to create the metadata so that part of the pipeline should not disappear. However, glslang + optimizer is a very slow and buggy pipeline and we're better off using dxc which is fast and production oriented, and is also faster

Improve CrProcess API

The API for launching processes is actually not very good at the moment. Several things need to change:

Make CrProcess an object that has functions
Add Wait() (waits for completion)
Add Read() to read the std out
Destructor will close all handles, etc so we can do nothing more with it. That means in this model there

It this API is good enough we can consider modularizing and making it a standalone github project

Improve Vertex Inputs

Vertex inputs should be well defined, be in a single place, and not spread out in multiple places

Create a collection of common vertex inputs. SimplVertex should probably be removed
Ability to bind multiple vertex streams to command buffer
Shader creates an input layout from metadata
Vertex buffers own a local view of their data via a CrVertexDescriptor
Meshes have a global view via another CrVertexDescriptor. This is what we use to match with the pipeline state
Meshes also have multiple vertex buffers

Create Material model

The material model governs the way geometry passes work. Getting this right is going to prove tricky so here's a summary.

CONCEPTS

· Vertex Buffer: A source of geometry information
· Shader: The full pipeline (i.e. vertex + pixel shader, etc)
· Render Mesh: A single draw unit. Typically composed of multiple vertex buffers
· Material: A collection of shaders (prepass, gbuffer, etc)
· RenderModel: A mesh and a material together. The render model produces pipeline objects that are directly usable for rendering. They can be compiled up front, in parallel, etc.

CREATION

· A vertex buffer is created by loading a mesh. In the future we'll have an import process, but for now an fbx/gltf loads a vertex buffer
· A mesh is created by concatenating one or more vertex buffers
· A shader is creating by calling CrShaderCompiler and loading the returned binary object. A shader hasn't been compiled until it's a pipeline
· A material is loaded by loading an fbx/gltf at the moment (in the future the import process would create a default material for it). We'll call the Material Compiler with a descriptor which creates all the necessary shaders for it
· A render model is the final product after all those processes above. It has almost all the information needed to render, except for instance-specific data such as transform, skinning, and engine-provided data (e.g. refraction buffer)

LOADING

Loading is probably the hardest part to define as it is context-defined.

Meshes: These are easier because there are two sources of geometry: source and serialized. In the first case we load from an FBX and create the mesh once the geometry is processed, in the second the geometry has been stored in a native format and just needs to be deserialized.

Materials: Material shaders are more complicated because there is a compilation process involved, there are multiple shaders per material, and there are multiple ways to obtain these shaders. A material typically contains all the shaders necessary to render an object in any given context (e.g. prepass, gbuffer, transparency). It has one shader for each of those contexts and knows all the render state that is necessary (render target formats, depth state, etc). Materials are built through a material descriptor that kicks off a process to obtain the shaders for it. There are two ways to obtain shaders.

a) Source: If the shader binary isn't available anywhere, load the source file that contains the entry point, set up defines as necessary and call CrShaderCompiler. This in turn calls the platform-specific compiler and produces an output where the application is expecting it. After that the application loads the binary and passes it to the API. The shader manager can have a cache in temp and a runtime cache to accelerate the lookups. The key is the entire command line and it needs to compare the date.

b) Compiled: Load the binary objects. These binaries are ready to use by the API (bar some runtime patching e.g. vulkan bind points) These binaries live in the data and be part of some binary blob, etc. This would be a second step when importing and creating materials is more defined.

Pipelines: Each render model points to its meshes, its materials (that meshes use) and it creates pipelines for the combinations. These pipelines are created when the render model is instantiated or in a deferred manner, in any case they need to be ready before any rendering happens to avoid stalls. The pipeline manager already has a cache to avoid creating the same pipelines repeatedly, and has the offline cache as well.

Fine-tune Texture Resource States

Texture resource states don't fully describe all usages and some of them don't translate well to some APIs. In particular,

PreInitialied is a Vulkan concept, it shouldn't be handled in the platform-independent layer and also isn't really relevant as it only matters for textures that aren't going to change state anyway
Undefined is also a conflicting one. While both Vulkan and D3D12 have a concept of being able to up-promote from some states to anything, their behavior is different and we don't want to rely on that
Introduce the idea that resources are in a well-defined state when they live outside of the render graph. That way we can reliably download, copy or perform other operations outside the render graph without having to track the states (and restoring them properly)
Depth stencil buffers are a complex feature. Depth is able to be in multiple states, to be able to be updated but also to be read from a pixel shader while at the same time being used for depth-stencil testing.
- Depth-stencil write
- Depth-stencil read-only
- Depth-stencil read-only with shader read (as SRV)
One other thing that would be useful is to use the Agility SDK. The newer D3D12 has different resource states that better match the Vulkan model and make things more explicit

User Interface

Just putting some notes here on what is desirable from a UI point of view, frameworks, etc. It's not urgent so we can take as much time as we need. The first thing we want to discuss is requirements.

Windows must react in a "native" way, in terms of minimizing, maximizing, dragging to the top, etc
It must be possible to create tabs
It must be possible to dock windows
It must be possible to customize everything with images and icons
It must be possible to skin the tools
It must be fast

The main question is whether it's possible to rework Imgui to make it behave the way we want. As an experiment, on the Imgui demo we must manage to do:

Windows to react to Window Key + Arrows. For this we need to make windows have the native title bar, which is probably what we'd like anyway. This means being able to customize the title bar is a must. This is doable when removing the window decorations
Make sure we can dock properly by dragging the native title bar. I haven't been able to do this
Make sure we can resize dynamically through the native window resize (which isn't possible at the moment as it causes a black screen). This is related to swapchain recreation

Add FPS/ms statistics to the UI

Graphics API abstraction

There are still things pending from a complete graphics API abstraction. Outstanding things are

Advances Features

Ubershaders

Ubershaders are almost ready in terms of basic functionality, but several things need addressing:

Make sure cached shaders that are out of date are discarded. Whether it's nuking the cache or storing the engine hash in the shader, we need to be able to determine whether something is out of date
Have a mechanism for creating defines out of a CrMaterialShaderDescriptor. The material shader descriptor is where we get our hashes from, so we need to be careful to ensure that a property change in the descriptor causes a change in the defines. We could perhaps associate a property to a define, that way it is guaranteed. Anything that doesn't produce a define or a change in the shader shouldn't go in the descriptor

Model Instance Manipulator

We want to create a manipulator so we can move model instances around. This will exercise several aspects of the engine, such as builtin shaders that are compatible with our geometry pipeline, and the instance picking system (as we'll rely on it for selecting the different axes)

Create a manipulator when we click on any model instance
It should stay on as long as the model instance is selected
It should disappear when no entity is selected
It should have a constant size on screen, regardless of how far the camera is from the entity. See this link for things to take into account. https://computergraphics.stackexchange.com/questions/9907/how-to-keep-an-object-constant-in-screen-space#:~:text=Scale%20the%20object%20proportional%20to,regardless%20of%20the%20camera%20zoom.
Clicking on the manipulator does not count as selecting something else
Clicking and holding on the manipulator makes it highlight, otherwise it goes back to its normal color
Dragging the manipulator should move it around, and with it the entity. The movement should be restricted to the axis/es the manipulator part refers to
The manipulator should render on top of everything, but should still be depth tested itself

It needs to be accompanied with multiple other quality of navigation fixes, into a coherent solution

When we select an entity, we should be able to focus on it via the F key. It needs to look at the pivot point, at a distance of some multiple of the bounding sphere radius
We need to be able to orbit around it via the Alt + Left Click + Move Mouse
We want the navigation to be smooth. No snapping, teleporting, etc
We need to be able to move in the direction of the camera via Alt + Right Click + Move Mouse

Memory leak in renderdoc

Booting the game up with renderdoc seems to leak memory. I think it's related to how we are creating the vkRenderPasses and vkFramebuffer in local memory and reusing, whereas renderdoc just creates the objects.

Shader error system

We need a way to properly log errors when things go wrong and the shader compiler cannot crash or exit unexpectedly if things don't work. A good error message and nice debug information will help

Ideally we pipe error messages to the calling executable so things can be inspected after the fact

Fix SHADER_PATH and IN_SRC_PATH

Either put them in a common header with paths, read them from a file, something that is not a macro and is configurable. Also perhaps configurable via the command line?

Integrate fmtlib in codebase

There are currently issues with the way logging works where passing variadic parameters to functions doesn't really work and there's some weirdness involved in fixing it. Instead of spending time we should really just integrate fmtlib and rework the logging system to use it.

https://github.com/fmtlib/fmt

Rendering Resource Lifetimes

Currently rendering resources are deleted right when the destructor is called. We need to have a way to add them to a deletion queue, which then can check that the semaphore the frame signals to say it's finished has been signaled and therefore the resources are safe to delete.

Previously there were issues with the way the swapchain was working that meant we were waiting too much, and exposed issues in the destruction of GPU Resources.

Create a GPU Deletion Queue
Each list is associated with a primitive that is able to determine whether the GPU has completed the work up to that point. For now, a CrGPUFence
Every frame, the Deletion Queue will push a fence to the end of the command buffers, and will assume that anything deleted up to that point needs to wait for that fence
If that fence is signaled, the deletion queue can proceed

Input improvements

Input currently works but cannot handle multiple windows, focus, etc

Tasks

Make each window have an input handler, that routes input to the main InputManager
Relative mouse movement outside the window (window loses focus) can be solved via SDL_SetRelativeMouseMode
Make sure things like Imgui are supported (Imgui should be able to have exclusive input)
Concepts such as game window, editor window, property windows, etc should be expressable
Probably use ICrOSWindow in all this

wm_move
raw input
restrict mouse coordinates to window
on mouse exit (the screen)
keyboard input difficult
stylus

InputManager
{
// external interface
void SetMousePosition
void SetMouseButtonPressed

Add Platform Utilities

For Renderdoc, we've had to query registry keys, and we'll probably want to do a bit more of it. However I don't know the use cases yet to embark on such a task. There are however things that one can only do in certain platforms or operating systems (such as the registry) and it would be good to keep them separate and not compilable.

For example, a class that is CrWindowsPlatformUtilities that has registry-specific functions that ease querying of regitry keys (which is definitely a hassle)

We musn't go too far with this however, as many things are OS-specific but that doesn't mean they cannot be abstracted away behind a good interface that can benefit more platforms.

Create Vertex Descriptors From Shader Vertex Inputs

We want a way to create vertex inputs from declarations in shaders, that way a shader knows which declaration to use, which format belongs to it, we can validate mismatches, etc.

Embedded Shaders

Embedded shaders are those shaders that are independent of materials and come together with the game build. There are several requirements that we would like to address.

Embedded shaders are compiled during the normal build. That means binary files (.spv, .bin, etc) are ready by the time the game launches and runs. They do not get embedded in the exe, instead they live in "code data"
They are easily accessible inside code. That means we get access to some sort of metadata that allows us to load shaders with very little programming overhead
One big issue with embedded shaders is that we need to create pipeline objects with them. Creating a convenient pipeline object for e.g. a copy shader can be quite annoying, because it needs a combination of texture formats which makes the interface pretty inconvenient. Perhaps a way of creating variants of pipelines is necessary, and we make cumbersome code for frequently-used variations.
Load binary first instead of hlsl. This means that shaders load a lot faster, especially in debug
Live recompilation should be a first-class citizen. The system should be designed so that live recompilation on consoles or other embedded devices is possible with minimal issues
File watchers. As soon as you save the hlsl file or shaders file, shaders get recompiled
Alternatively a key to recompile shaders
A way to filter which shaders get recompiled

Add resolution to command line

Should look like -resolution 1280x720

redorav / corsair-engine Goto Github PK

corsair-engine's People

Contributors

Stargazers

Watchers

corsair-engine's Issues