streamline's Introduction

Streamline (SL) - Version 2.4.11

Streamline is an open-sourced cross-IHV solution that simplifies integration of the latest NVIDIA and other independent hardware vendors’ super resolution technologies into applications and games. This framework allows developers to easily implement one single integration and enable multiple super-resolution technologies and other graphics effects supported by the hardware vendor.

This repo contains the SDK for integrating Streamline into your application.

For a high level overview, see the NVIDIA Developer Streamline page

IMPORTANT: For important changes and bug fixes included in the current release, please see the Release Notes

As of SL 2.0.0, it is now possible to recompile all of SL from source, with the exception of the DLSS-G plugin. The DLSS-G plugin is provided as prebuilt DLLs only. We also provide prebuilt DLLs (signed) for all other plugins that have source. Most application developers will never need to rebuild SL themselves, especially for shipping; all of the pieces needed to develop, debug and ship an application that integrates SL and its features are provided in the pre-existing directories bin/, include/, and lib/. Compiling from source is purely optional and likely of interest to a subset of SL application developers. For developers wishing to build SL from source, see the following sections.


Prerequisites

Hardware

  • GPU supporting DirectX 11 and Vulkan 1.2 or higher

Windows

  • Win10 20H1 (version 2004 - 10.0.19041) or newer
  • Install latest graphics driver (if using NVIDIA GPU it MUST be 512.15 or newer)
  • Install VS Code or VS2017/VS2019 with SDK 10.0.19041+
  • Install "git".
  • Clone your fork to a local hard drive, make sure to use an NTFS drive on Windows (SL uses symbolic links)

BUILDING SL FROM SOURCE


As mentioned in the lead section of this document, SL now ships with most of its source code available, which allows developers who wish to build most of SL from source to do so locally. The sole exception is the DLSS-G plugin, which is only available precompiled.

IMPORTANT: Only use production builds when releasing your software. Also, use either the original NVIDIA-signed SL DLLs or implement your own signing system (and check for that signature in SL), otherwise SL plugins could be replaced with potentially malicious modules.

Configuring and Building a Tree

All of SL's projects and build information (and those of all SL based apps) are controlled through a single platform independent build script called premake5.lua. This is located in the root of the SL tree and uses the premake project creation toolchain. Any new projects or changes to existing projects must be listed in this file. For most projects, all source and header files found within a project's directory will be automatically added to that project.

To configure a new SL tree to build, there is a script called setup.bat. Note that on Windows this must be run from either a Windows command prompt window or a PowerShell window.

Running the setup.bat script will cause two things to be done:

  1. Use the NVIDIA tool packman to pull all build dependencies to the local machine and cache them in a shared directory. Links are created from external in the SL tree to this shared cache for external build dependencies.
  2. Run premake to generate the project build files in _project\vs2017 (for Windows)

To build the project, simply open _project\vs2017\streamline.sln in Visual Studio, select the desired build configuration and build, or else use the provided build script:

./build.bat with -{debug|develop|production} (debug is default) or use VS IDE and load solution from the _project directory

The default setting is to target x86_64 CPU architecture.

NOTE: Minimal configuration is needed to build the project. Any version of Windows 10 will do. Then run the setup and build scripts as described above. That's it. The specific versions of Windows, the NVIDIA driver, and Vulkan are all runtime dependencies, not compile/link time dependencies. This allows SL to build on stock virtual machines that require zero configuration. This is a beautiful thing, help us keep it that way.

Changing an Existing Project

Do not edit the MSVC project files (or Makefiles on other platforms) directly! Always modify the premake5.lua described above.

When changing an existing project's settings or contents (e.g., adding a new source file, changing a compiler setting, linking to a new library, etc.), it is necessary to run setup.bat again for those changes to take effect, and the MSVC project files and/or solution will need to be reloaded in the IDE.

NVIDIA does not recommend making changes to the headers in include, as these can affect the API itself and can make developer-built components incompatible with NVIDIA-supplied components.

Using the results of local builds

Once the project is built for a configuration, the built, unsigned DLLs may be found in _artifacts\sl.*\<Config>\. These DLLs can be copied as desired into the bin\x64 directory, or packaged for use in the application itself.

Obviously, sl.dlss_g.dll cannot be built from source and thus the prebuilt copy must be used.

(Optional) Compiling Shaders

If you would like to recompile the shaders for the NIS plugin, you will need to have Python 3 installed and in the path.

SDK Packaging

  • Execute ./package.bat with -{debug|develop|production} (production is default)

The packaged SDK can be found in the generated _sdk folder.

Debugging

Streamline offers several ways to debug and troubleshoot issues. Please see the following pages for more information.

  • Using SL ImGui: [Debugging - SL ImGUI (Realtime Data Inspection).md](docs/Debugging - SL ImGUI %28Realtime Data Inspection%29.md)
  • Using JSON configuration files: [Debugging - JSON Configs (Plugin Configs).md](docs/Debugging - JSON Configs %28Plugin Configs%29.md)
  • Using NRD's validation layer: [Debugging - NRD.md](docs/Debugging - NRD.md)

General Programming Guide

Please read ProgrammingGuide.md to learn about the integration in games.

Advanced Programming Guide - Manual Hooking With Lowest Overhead

Please read ProgrammingGuideManualHooking.md to learn about advanced SL integration in games.

Programming Guides Per Feature:

Sample Plugin Source Code

A sample Streamline plugin source code is located here

Sample App and Source

A sample application using Streamline may be found in this git repo

streamline's People

Contributors

anpurohit, larsbishop, liam-middlebrook, nvthayes

streamline's Issues

[Vulkan] VK_ERROR_OUT_OF_POOL_MEMORY in Vulkan::processDescriptors

Hi :)
I tried to integrate Streamline in our Vulkan backend, and when I try to EvaluateFeature(kFeatureNIS), I get a VK_ERROR_OUT_OF_POOL_MEMORY error inside the Vulkan::processDescriptors function.

From what I understand, the issue is that poolSizes only specifies the number of descriptors for a single descriptor set, but the code tries to create 64 descriptor sets, thus the second call of m_ddt.AllocateDescriptorSets fails.
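To illustrate the suspected mismatch: in Vulkan, the `descriptorCount` of each pool size is the total available across every set allocated from the pool, so a pool meant to back N sets needs the per-set counts multiplied by N. The sketch below is not Streamline's code. `PoolSize` and `scalePoolSizes` are hypothetical names, with `PoolSize` standing in for `VkDescriptorPoolSize` so the example compiles without the Vulkan SDK.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for VkDescriptorPoolSize (descriptor type + count).
struct PoolSize {
    int      type;
    uint32_t descriptorCount;
};

// Scale per-set descriptor counts so a single pool can back `maxSets` sets.
std::vector<PoolSize> scalePoolSizes(const std::vector<PoolSize>& perSet, uint32_t maxSets) {
    assert(maxSets > 0);
    std::vector<PoolSize> total;
    total.reserve(perSet.size());
    for (const PoolSize& p : perSet) {
        total.push_back({p.type, p.descriptorCount * maxSets});
    }
    return total;
}
```

With 64 sets, a layout that needs one descriptor of a given type per set would request 64 descriptors of that type from the pool.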

[Vulkan] Validation Error: Not all objects have been destroyed at shutdown

Hi :) I'm trying to integrate Streamline into our Vulkan renderer (currently only using NIS), and when shutting down the rendering engine, I get a lot of validation errors, saying that various objects have not been destroyed at the time the device itself was destroyed:

VUID-vkDestroyDevice-device-00378(ERROR / SPEC): msgNum: 1901072314 - Validation Error: [ VUID-vkDestroyDevice-device-00378 ] Object 0: handle = 0x20386c803b0, type = VK_OBJECT_TYPE_DEVICE; Object 1: handle = 0x8a50150000001582, name = const buffer, type = VK_OBJECT_TYPE_BUFFER; | MessageID = 0x71500fba | OBJ ERROR : For VkDevice 0x20386c803b0[], VkBuffer 0x8a50150000001582[const buffer] has not been destroyed. The Vulkan spec states: All child objects created on device must have been destroyed prior to destroying device (https://vulkan.lunarg.com/doc/view/1.3.211.0/windows/1.3-extensions/vkspec.html#VUID-vkDestroyDevice-device-00378)
Objects: 2
[0] 0x20386c803b0, type: 3, name: NULL
[1] 0x8a50150000001582, type: 9, name: const buffer

I can trace these constant buffers back to Streamline (thanks to the name). There are also many messages about other objects (without any name), which I assume also come from Streamline, but I can't verify that for sure. This is not really a blocking issue for me, but it would be nice to fix it.

Recommended DXGI configuration for Reflex?

For example, Unreal out of the box sets 3 backbuffers and a maximum frame latency of 3. If I want minimum latency, would it help to configure 2 and 1 respectively? How about using the waitable swap chain feature?

[Vulkan] Error: descriptor not initialised

Hi :) I'm trying to integrate Streamline into our Vulkan rendering engine, and when calling EvaluateFeature(kFeatureNIS), I get a vulkan validation error (or just a device lost, if the validation layer is not enabled), saying that the descriptor 0/0 (the NIS constant buffer, from what I can see?) has not been initialised with vkUpdateDescriptorSets.
This doesn't seem to happen every time I call EvaluateFeature, but it happens often.

I managed to locally fix the issue by setting needsUpdate to true in Vulkan::processDescriptors, which effectively causes vkUpdateDescriptorSets to be called with every dispatch. Obviously, that's a bit wasteful, but I'm not familiar enough with the Streamline code base to figure out the proper fix, or under which condition the regular needsUpdate check fails.

Exception on slShutdown, when using sl::kFeatureImGUI

Hello, I was updating Streamline from 2.2.1 to 2.4.11 and ran into an exception on slShutdown. As far as I can see, it's caused by sl.imgui. There were no issues on 2.2.1. I also managed to reproduce it in StreamlineSample. It only reproduces when the DLSS-G swapchain is created.

Exception occurs here or on previous line. GImGui is null.

io.Fonts->TexID = NULL; // We copied g_pFontTextureView to io.Fonts->TexID so let's clear that as well.

sl.consts: Misleading Description for `jitterOffset`

From sl_consts.h:

//! Specifies clip space jitter offset
float2 jitterOffset;

From ProgrammingGuideDLSS.md troubleshooting section:

Make sure that jitter offset values are in pixel space

Indeed, we had trouble with DLSS artifacts until we realized that jitter offset is supposed to be in pixel space, not clip space.
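For engines that store jitter as a clip-space (NDC) offset, a conversion along these lines may help. It is only a sketch under stated assumptions: NDC spans 2 units across the render target, so one NDC unit equals half the render dimension in pixels. `Float2` and `clipToPixelJitter` are hypothetical names (`Float2` stands in for `sl::float2`), and whether Y needs to be flipped depends on the engine's conventions, so it is left as a parameter.

```cpp
#include <cassert>

// Hypothetical stand-in for sl::float2.
struct Float2 { float x, y; };

// Convert a clip-space (NDC) jitter offset into pixel units.
// Assumption: NDC covers [-1, 1], i.e. 2 units span the full render dimension.
Float2 clipToPixelJitter(Float2 clipJitter, float renderWidth, float renderHeight, bool flipY) {
    assert(renderWidth > 0.0f && renderHeight > 0.0f);
    Float2 px;
    px.x = clipJitter.x * 0.5f * renderWidth;
    px.y = clipJitter.y * 0.5f * renderHeight * (flipY ? -1.0f : 1.0f);
    return px;
}
```

A typical sub-pixel jitter sequence should land within about [-0.5, 0.5] after this conversion, which gives a quick sanity check on the values being passed in.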

Enabling DLSSG in D3D11 throws an error when calling NvAPI_D3D12_SetAsyncFrameMarker

It's even happening with the Streamline Sample.

ERROR: [10.04.2023 15-32-03][streamline][error]d3d12.cpp:2554[setAsyncFrameMarker] NvAPI_D3D12_SetAsyncFrameMarker((ID3D12CommandQueue*)queue, &params) failed error -1

I found that it's with this function in sl.compute project's d3d12.cpp

ComputeStatus D3D12::setAsyncFrameMarker(CommandQueue queue, ReflexMarker marker, uint64_t frameId)
{
    NV_LATENCY_MARKER_PARAMS_V1 params = { 0 };
    params.version = NV_LATENCY_MARKER_PARAMS_VER1;
    params.frameID = frameId;
    params.markerType = (NV_LATENCY_MARKER_TYPE)marker;

    NVAPI_CHECK(NvAPI_D3D12_SetAsyncFrameMarker((ID3D12CommandQueue*)queue, &params));
    return ComputeStatus::eOk;
}

And reflex isn't working when DLSSG is enabled or even loaded.
I think it has something to do with this error, or maybe not, but it certainly only stopped working when I loaded DLSSG.

You can reproduce this by simply running the Streamline Sample in d3d11 mode.

EDIT: By reflex not working I meant there was no latency report available and the reflex FPS cap wasn't working. I couldn't really tell if reflex itself is working though.

Streamline destroys NGX context, while it's in use on gpu. DLSS

Hi,
Shouldn't Streamline manage the NGX context lifetime, since it owns the context?
When switching the DLSS quality mode or output resolution, Streamline destroys the previous NGX context and creates a new one without validating that its resources are no longer in use on the GPU.
I don't see any mention in the programming guide that the user has to check whether the quality mode or output resolution changed and flush the GPU before calling slEvaluateFeature(sl::kFeatureDLSS, ..);

Am I missing something? Should we really manage the NGX handle lifetime without even having proper access to it?
I would appreciate it if someone could clarify this for me.

Thanks, Daniel.

SL2.0 - NRD - Incorrect root parameter setup

Hi,
I'm trying to use NRD with SL 2.0.
First, the doc is not up to date, which does not help with setting up the lib.
After some struggles I got to the point where I can call slEvaluateFeature(sl::kFeatureNRD,...) but I get these error messages from SL:

[18.04.2023 19-52-00][streamline][error]d3d12.h:193[validate] Incorrect root parameter setup!
[18.04.2023 19-52-00][streamline][info]d3d12.cpp:1502[dispatch] Created root signature 0x1cd7e09d190 with hash 14346176348325673696
[18.04.2023 19-52-00][streamline][error]d3d12.cpp:1523[dispatch] Failed to create CS pipeline state
[18.04.2023 19-52-00][streamline][error]nrdentry.cpp:735[nrdEndEvent] ctx.compute->dispatch(grid[0], grid[1], grid[2]) failed
...
[18.04.2023 19-52-00][streamline][info]d3d12.cpp:1502[dispatch] Created root signature 0x1cd7e09c9b0 with hash 10906806363737679482
[18.04.2023 19-52-00][streamline][info]d3d12.cpp:1526[dispatch] Created pipeline state 0x1cd7f2420e0 with hash 5412566692928581985
...
[18.04.2023 19-52-00][streamline][error]nrdentry.cpp:960[nrdEndEvent] ctx.compute->dispatch(dispatch.gridWidth, dispatch.gridHeight, 1) failed
[18.04.2023 19-52-00][streamline][error]d3d12.cpp:1538[dispatch] Failed to create root signature or pso for kernel nrd_prep.cs:main
[18.04.2023 19-52-00][streamline][error]d3d12.cpp:1538[dispatch] Failed to create root signature or pso for kernel nrd_pack.cs:main
[18.04.2023 19-52-00][streamline][error]d3d12.cpp:1538[dispatch] Failed to create root signature or pso for kernel REBLUR_Diffuse_SplitScreen.cs:main
...

And the D3D12 debug layer gives me this:
D3D12 ERROR: ID3D12Device::CreateComputePipelineState: Root Signature doesn't match Compute Shader: Shader CBV descriptor range (BaseShaderRegister=0, NumDescriptors=1, RegisterSpace=0) is not fully bound in root signature
[ STATE_CREATION ERROR #882: CREATECOMPUTEPIPELINESTATE_CS_ROOT_SIGNATURE_MISMATCH]

SL is integrated via sl.interposer.dll and NRD seems to be properly initialized. I can't find more data on the error or how I could fix it.
It is a DX12 renderer on Windows 11 with an RTX 4080.

What is going wrong?

[NIS] Setting options through slEvaluateFeature doesn't work.

The docs state

NIS options must be set so that the NIS plugin can track any changes made by the user. This can be done explicitly using the slNISSetOptions or implicitly by adding options as part of the slEvaluateFeature call

However, adding the options as part of the slEvaluateFeature call doesn't seem to work at all: no option changes are applied, and if slNISSetOptions is never called, slEvaluateFeature fails with eErrorMissingConstants.

Repro:
Simply pass the desired NIS options through the inputs of the slEvaluateFeature call instead of calling slNISSetOptions, and remove any call to slNISSetOptions, then try to use NIS in the program.

Linux support planned?

Hi,

Streamline 2.0 has been released and seems to bring Vulkan support for the first time.
At least, the 1.1 documentation mentioned DX-only support.
But Streamline 2.0, even with Vulkan support, still seems to be Windows-only, which is sad since DLSS also supports Linux.
It should be possible for people outside NVIDIA to attempt a Linux port, since almost all of the library has source code available, except the new DLSS-G plugin, which is provided in binary form only, if I understood correctly.
So I hope NVIDIA is interested in providing Linux support (with DLSS-G support too).
I also hope that if people add Streamline support for FSR 2.x (and 3.0), they add Linux support as well (FSR 2.x supports Vulkan, so it is Linux friendly).
Finally, I have no need to ship a Linux product, but it would be nice to be able to integrate DLSS-G on Linux via Streamline in some of my toy projects.

Edit: this would also be useful, for potential Linux support, for the new NV Path Tracing SDK, which plans to integrate Streamline 2.0 for DLSS-G frame generation support.

Thanks..

Exception on slSetTag(), when cloning resource when 'D3D12_RESOURCE_DESC::Alignment == 4096' and 'D3D12_RESOURCE_DESC::Format == DXGI_FORMAT_R16G16_FLOAT'

Hello, I got exception on slSetTag(). After looking into source code, I see that it happens in ResourcePool::allocate(), when you call m_compute->getResourceState(res->state, initialState);
https://github.com/NVIDIAGameWorks/Streamline/blob/7ac42e47c7dd55b5b6dd6176c0228048636541b2/source/platforms/sl.chi/generic.cpp#L172C21-L172C21
The possible problem is that you don't test the return value of m_compute->cloneResource(source, res, debugName, initialState); which might not initialize sl::chi::Resource res variable, which is pointer to sl::Resource, therefore in res->state you access with nullptr.
I also enabled D3D12 debug layer validation to investigate why m_compute->cloneResource(source, res, debugName, initialState); doesn't initialize res, and got this D3D12 error:

D3D12 ERROR: ID3D12Device::CreateCommittedResource: D3D12_RESOURCE_DESC::Alignment is invalid. The value is 4096. Resources with D3D12_RESOURCE_DESC::Flags with either D3D12_RESOURCE_FLAG_ALLOW_RENDER_TARGET or D3D12_RESOURCE_FLAG_ALLOW_DEPTH_STENCIL must set Alignment equal to 65536 (aka. D3D12_DEFAULT_RESOURCE_PLACEMENT_ALIGNMENT), or 0. [ STATE_CREATION ERROR #721: CREATERESOURCE_INVALIDALIGNMENT]
D3D12: BREAK enabled for the previous message, which was: [ ERROR STATE_CREATION #721: CREATERESOURCE_INVALIDALIGNMENT ]

It looks like you add the D3D12_RESOURCE_FLAG_ALLOW_RENDER_TARGET flag without changing D3D12_RESOURCE_DESC::Alignment. I guess you did not expect the alignment to be 4096.
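A possible shape for a fix, sketched with hypothetical stand-in types so it compiles without the Windows SDK: when the render-target flag is added to a cloned description, any alignment other than 0 or 64 KB is reset to 0, which per the debug-layer message above is one of the two accepted values. `ResourceDesc` and `addRenderTargetFlag` are illustrative names, not Streamline's or D3D12's.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins mirroring the D3D12 constants named in the error message.
constexpr uint64_t kDefaultPlacementAlignment = 65536;  // D3D12_DEFAULT_RESOURCE_PLACEMENT_ALIGNMENT
constexpr uint32_t kFlagAllowRenderTarget     = 0x1;    // D3D12_RESOURCE_FLAG_ALLOW_RENDER_TARGET

// Hypothetical, reduced version of D3D12_RESOURCE_DESC.
struct ResourceDesc {
    uint64_t alignment;
    uint32_t flags;
};

// Add the render-target flag and make the alignment legal for that flag.
ResourceDesc addRenderTargetFlag(ResourceDesc desc) {
    desc.flags |= kFlagAllowRenderTarget;
    if (desc.alignment != 0 && desc.alignment != kDefaultPlacementAlignment) {
        desc.alignment = 0;  // 0 asks the runtime to choose the default alignment
    }
    assert(desc.alignment == 0 || desc.alignment == kDefaultPlacementAlignment);
    return desc;
}
```

Independently of the alignment, checking the result of the clone call before dereferencing `res` would turn the crash into a reportable error.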

https://github.com/NVIDIAGameWorks/Streamline/blob/7ac42e47c7dd55b5b6dd6176c0228048636541b2/source/platforms/sl.chi/d3d12.cpp#L2159C39-L2159C39

Thanks for your work and hope these issues will be fixed soon.

ResourcePool::allocate() may starve other threads, causing hitches or stutters with dlssg enabled

When ResourcePool::allocate() reaches a limit (i.e. VRAM budget or maximum queue depth) it'll spin in a busy loop hoping another thread comes around and frees existing allocations via ResourcePool::recycle(). If said busy loop exceeds its time limit, it'll fall back to a brand new allocation instead. I'm assuming this loop doesn't execute under normal circumstances because Streamline doesn't spawn threads and games rarely parallelize slEvaluateFeature() calls.

However, once DLSS-G is enabled there's suddenly 3 threads competing with each other: a game (present) thread, a sl.pacer thread, and a sl.dlssg thread. The game and sl.pacer threads are often contended on ResourcePool's mutex, leading to a problem where ::recycle() is unable to progress after ::allocate() enters its busy loop. Streamline tries to mitigate this deadlock with the following code:

float resourcePoolWaitUs = bytesAvailable > footprint.totalBytes && allocated.second.size() < m_maxQueueSize ? 500.0f : 100000.0f;
// Use more precise timer
extra::AverageValueMeter meter;
meter.begin();
// Prevent deadlocks, time out after a reasonable wait period.
// See comments above about the wait time and VRAM consumption.
while (items.second.empty() && meter.getElapsedTimeUs() < resourcePoolWaitUs)
{
    lock.unlock();
    // Better than sleep for modern CPUs with hyper-threading
    YieldProcessor();
    lock.lock();
    meter.end();
}

There's an oversight on line 144 as std::mutex does not guarantee fairness. YieldProcessor() is one instruction and makes no real difference. Unlocking and relocking might wake other threads but ::allocate() can reacquire the lock before anybody else gets a chance. This is often the case on my machine.
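One alternative to spinning is to block on a condition variable with a timeout: the waiting thread releases the mutex while it sleeps, so a recycling thread can acquire it, and the waiter is woken when an item is returned. The following is a minimal sketch of that idea, not Streamline's implementation. `BlockingPool` and its `int` items are hypothetical placeholders for `ResourcePool` and its resources.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <optional>

// Hypothetical pool illustrating a timed, blocking wait instead of a busy loop.
class BlockingPool {
public:
    // Return an item to the pool and wake one waiting allocate() call.
    void recycle(int item) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_items.push_back(item);
        }
        m_cv.notify_one();
    }

    // Returns a recycled item, or std::nullopt on timeout so the caller
    // can fall back to a brand new allocation.
    std::optional<int> allocate(std::chrono::microseconds timeout) {
        assert(timeout.count() >= 0);
        std::unique_lock<std::mutex> lock(m_mutex);
        // wait_for releases the mutex while blocked and reacquires it on wake-up.
        if (!m_cv.wait_for(lock, timeout, [this] { return !m_items.empty(); })) {
            return std::nullopt;
        }
        int item = m_items.front();
        m_items.pop_front();
        return item;
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cv;
    std::deque<int> m_items;
};
```

The timeout keeps the existing fallback behavior, while the wait itself no longer competes for the mutex with the thread that is trying to recycle.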

Based on my not-so-scientific testing I usually hit that 100000us pause in games every 1-2s which results in a stuttery mess. This only occurs with vertical sync enabled through Nvidia Control Panel. Games are smooth with vertical sync off.

I annotated an Nsight trace while trying to understand what's happening. Possibly useful for someone:

Memory leak from slSetD3DDevice() with D3D11 device.

Hello!

My name is Daniel Isheden, and I'm a rendering engineer at CCP Games. We are currently working on implementing a wide variety of upscaling techniques into EVE Online. This includes DLSS 3 through Streamline, both for DX11 and DX12. While DLSS is working fine on DX12, (including with frame generation), we are seeing a significant memory leak on DX11 when we recreate the D3D11 device, which we believe is caused by Streamline.

The memory leak issue

Due to the vastly different requirements of different upscaling techniques (for example FSR3 and DLSS3 requiring control over the devices and swapchains), we completely destroy and reinitialize a new device whenever we need to change upscaling methods or upscaling settings. The memory leak seems to be caused by Streamline incorrectly holding on to the D3D11 device which we pass in, preventing it from being destroyed. This causes us to leak the entire device, along with a large amount of VRAM, around 200-500 MBs depending on screen resolution and settings. We can see that when we call Release() on the device, the ref count is significantly higher after we have passed the device to Streamline using slSetD3DDevice().

Here are the relevant commands that we do to initialize Streamline.

  1. Initialize Streamline using slInit().
  2. Create the D3D11 device using D3D11CreateDeviceAndSwapChain().
  3. Attach the device to Streamline using slSetD3DDevice().
  4. When we're done, we call slShutdown().

As soon as we have called slSetD3DDevice() with the D3D11 device, the device will not be correctly released by Streamline, even if we never actually do anything with Streamline beyond calling slShutdown() after that. There doesn't seem to be a way to get Streamline to correctly release the device. Again, this problem only occurs for DX11. This is not a problem in DX12.

Apart from the memory leak, DLSS itself works correctly on DX11 for us. Also, this memory leak goes away completely if we just skip calling slSetD3DDevice(), but this unfortunately breaks DLSS (see below).

Do we need to call slSetD3DDevice()?

The Streamline Programming Guide says:

"IMPORTANT When using d3d11 it is important to note that calling D3D11CreateDeviceAndSwapChain will result in that device being automatically assigned to SL."

We are indeed using D3D11CreateDeviceAndSwapChain() to create our device and swapchain, so according to this spec, we shouldn't need to call slSetD3DDevice() at all. Skipping the call to slSetD3DDevice() does indeed fix the memory leak, which seems to imply that the problem is that we're essentially setting the device "twice", once implicitly with D3D11CreateDeviceAndSwapChain() and once explicitly with slSetD3DDevice(). Then when we call slShutdown(), the device is only released once, causing the reference count to never reach 0.

Unfortunately, there seems to be a separate bug here which prevents DLSS from functioning without calling slSetD3DDevice() manually. While we are able to turn on DLSS using slSetFeatureLoaded(), when we actually try to call slDLSSGetOptimalSettings(), we get an eErrorMissingOrInvalidAPI error, along with the following error in the console:

"[17-24-37][streamline][error][tid:52156][14s:954ms:054us]dlssentry.cpp:815[slGetData] NGX context is missing, please make sure DLSSContext feature is enabled and supported on the platform"

This error goes away and DLSS works correctly if we call slSetD3DDevice() with our device, but again, according to the documentation we shouldn't need to. It makes no sense that setting the device "twice" both fixes this error AND causes a memory leak.

We have also tried to do slSetD3DDevice(nullptr) in an attempt to clear the implicitly set device, then set it explicitly again, but slSetD3DDevice(nullptr) always causes an access violation inside of Streamline.

I am working to put together a repro program for this issue, which I will provide through a different channel early next week! In the meanwhile, I wanted to provide as much information as I could through this ticket, as I believe this issue should be relatively easy to reproduce.

DLSS-G - Missing production dll?

Both nvngx_dlssg.dll in bin/x64 and bin/x64/development are non-production builds, where do we obtain the library for release?

image

For reference, the same version but production build found in the files of Cyberpunk 2077:

image

Source code required for Interposer 1.3.3.0

Please release the code

image

Special K requires recompiling the interposer DLL so that it does not load and hook dxgi.dll from System32.

There are numerous shipping games now that require 1.2 and now even 1.3 versions of the interposer in order for stuff like DLSS Frame Generation to work.

Our users are understandably upset that they have to either give up our software or lose features on their very expensive 40 series GPUs.

We're kind of upset that release builds of the Streamline Interposer do not read a config file, there has got to be a long-term solution to make Streamline stop loading and hooking the system DLL and then exploding spectacularly :) Re-compiling the interposer DLL for every release is our only solution right now and we cannot do this until the source code is released.

Using DLAA mode for implementing anti-aliasing in custom engine, the rendered image flickers significantly.

sl-dlaa.mp4

Version and Platform:

  • streamline v2.4.10
  • GPU: RTX4090
  • Driver: 555.99

I referred to the documentation and the implementation of the Streamline_Sample. Below, I will describe the specific implementation process.

1. Initialize

I use manual hooking to enable sl feature and call slSetVulkanInfo to hook vulkan API. Then I know that kFeatureDLSS is supported by calling slIsFeatureSupported.

2. Setup DLSSOptions

sl::DLSSOptions dlss_options{};
dlss_options.mode = sl::DLSSMode::eDLAA;
dlss_options.outputWidth = ;
dlss_options.outputHeight =;
dlss_options.colorBuffersHDR = sl::Boolean::eTrue;
dlss_options.useAutoExposure = sl::Boolean::eFalse;

3. Setup Constants

3.1 Camera

I used the default column-major matrices and the left-handed coordinate system of OpenGL. When configuring the camera-related properties, I converted the matrices to row-major order.

3.2 Motion Vector

The motion vector is computed in UV space as in the following code, thus it is already in [-1, 1].

vec2 velocity = vec2(vertex.clip_position.xy / vertex.clip_position.w - last_clip_pos.xy) * 0.5;
  • constants.jitterOffset
    I make jitterOffset be within [-0.5, 0.5]
  • constants.mvecScale: [-1, 1]
    Because the motion vector in my custom engine points from the previous frame to the current frame, which is opposite to the description in the DLSS documentation, I set mvecScale to [-1, -1].
    In addition, the Y-axis of the screen coordinate system in my custom engine is opposite to that in the DLSS documentation. Therefore, I need to set mvecScale to [-1, 1].
  • constants.motionVectorsJittered = sl::Boolean::eTrue
    Motion vector contains jitter offset.

I also compared the motion vector setup with and without the correction described below.

if (correct_y) {
    constants.jitterOffset = { jitter.x, -jitter.y }; // Negate Y-axis
    constants.mvecScale = { -1.f, 1.f };  // Negate the direction of motion vector and negate Y-axis
} else {
    constants.jitterOffset = { jitter.x, jitter.y };
    constants.mvecScale = { 1.f, 1.f };
}
  • no correct y
no-correct-y.mp4
  • correct y
correct-y.mp4

I didn't find any difference.

Proxy swap chain back buffer counts are inconsistent with dlssg loaded

FWIW, I'm not entirely sure of the extent to which these are bugs or intentional behavior. I didn't see anything mentioned in the documentation. Each code block assumes factory, commandQueue, and swapChain interfaces were created through sl.interposer proxy methods.

  1. IDXGISwapChain1::GetDesc() and IDXGISwapChain1::GetDesc1() return incorrect BufferCount values. ::GetBuffer() doesn't match.
dlssgOptions.mode = sl::DLSSGMode::eOff;
Assert(slDLSSGSetOptions(0, dlssgOptions) == sl::Result::eOk);                        // dlssg explicitly off, SL feature still enabled

swapChainDesc.BufferCount = 3;
factory->CreateSwapChainForHwnd(commandQueue.Get(), hwnd, &swapChainDesc, nullptr, nullptr, &swapChain);

for (uint32_t i = 0; i < swapChainDesc.BufferCount; i++)                              // swapChainDesc.BufferCount is 3. GetBuffer() succeeds 3 times.
{
    CComPtr<ID3D12Resource> buffer;
    Assert(swapChain->GetBuffer(i, IID_PPV_ARGS(&buffer)) == S_OK);
}

DXGI_SWAP_CHAIN_DESC t;
Assert(swapChain->GetDesc(&t) == S_OK && t.BufferCount == swapChainDesc.BufferCount); // Assertion fails. t.BufferCount is 2. 2 != 3.
  2. Calling IDXGISwapChain1::ResizeBuffers() suddenly forces ::GetBuffer() to adhere to BufferCount limits from above.
dlssgOptions.mode = sl::DLSSGMode::eOff;
Assert(slDLSSGSetOptions(0, dlssgOptions) == sl::Result::eOk);                        // dlssg explicitly off, SL feature still enabled

swapChainDesc.BufferCount = 3;
factory->CreateSwapChainForHwnd(commandQueue.Get(), hwnd, &swapChainDesc, nullptr, nullptr, &swapChain);

Assert(swapChain->ResizeBuffers(0, 0, 0, DXGI_FORMAT_UNKNOWN, 0) == S_OK);

for (uint32_t i = 0; i < swapChainDesc.BufferCount; i++)                              // swapChainDesc.BufferCount is 3.
{
    CComPtr<ID3D12Resource> buffer;
    Assert(swapChain->GetBuffer(i, IID_PPV_ARGS(&buffer)) == S_OK);                   // Assertion failure when i == 2.
}
  3. Calling IDXGISwapChain1::ResizeBuffers() causes presentation to fail entirely...? Passing 0s should preserve original values.
dlssgOptions.mode = sl::DLSSGMode::eOff;
Assert(slDLSSGSetOptions(0, dlssgOptions) == sl::Result::eOk);                        // dlssg explicitly off, SL feature still enabled

factory->CreateSwapChainForHwnd(commandQueue.Get(), hwnd, &swapChainDesc, nullptr, nullptr, &swapChain);

Assert(swapChain->ResizeBuffers(0, 0, 0, DXGI_FORMAT_UNKNOWN, 0) == S_OK);

while (true)
    Assert(swapChain->Present(0, 0) == S_OK);                                         // Succeeds, yet the screen never updates. Resource creation errors are printed to SL's log.

I ended up treating values from GetDesc() as the "ground truth" and that's enough for 1 & 2. 3 is slightly more annoying because ResizeBuffers() parameters don't seem to be validated. Manually specifying count, width, and height is a workaround. Personally I'd consider these bugs since they manifest even while dlssg is off.

sl.interposer: IDXGIFactory2_CreateSwapChainForHwnd behavior differs from D3D11

dxgi.cpp:195

if (*ppSwapChain)
{
    // Handled by one of the plugins
    hr = S_OK;
}

results in a major difference from the native implementation of this function. You can't assume that, just because the output ptr is non-null, you already handled creation of the swap chain. Use your own internal pointer and initialize it to zero, because previously, it was entirely valid to call CreateSwapChainForHwnd without bothering to initialize our output ptr to null. Now, doing so causes a silent failure to create a swap chain if the sl interposer is loaded.

In other words,

IDXGISwapChain1* swapChain;
auto result = factory->CreateSwapChainForHwnd(..., &swapChain);

works fine with standard D3D11, causes silent, non-deterministic failure (which results in later crashes) with SL interposer. For SL we have to ensure that swapChain is initialized to nullptr to get it to work.
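The suggested fix (an internal pointer initialized to null) can be sketched as follows. Everything here is a hypothetical mock so the example compiles without DXGI: `SwapChain` stands in for `IDXGISwapChain1`, `PluginHook` for whatever mechanism lets a plugin create the swap chain, and `createSwapChain` for the interposer's wrapper. It is not the actual sl.interposer code.

```cpp
#include <cassert>

// Hypothetical stand-in for IDXGISwapChain1.
struct SwapChain { int id; };

// Hypothetical plugin hook: may or may not produce a swap chain.
using PluginHook = void (*)(SwapChain** out);

SwapChain g_native{1};  // what the native factory would create
SwapChain g_proxy{2};   // what a plugin would create

// Decide who created the swap chain using an internal pointer, so the
// caller's possibly uninitialized output value is never read.
int createSwapChain(PluginHook hook, SwapChain** ppSwapChain) {
    assert(ppSwapChain != nullptr);
    SwapChain* fromPlugin = nullptr;  // always starts out null
    if (hook) {
        hook(&fromPlugin);
    }
    *ppSwapChain = fromPlugin ? fromPlugin : &g_native;
    return 0;  // mirrors S_OK
}
```

With this shape, a caller that leaves its output pointer uninitialized gets the same result as one that sets it to nullptr first.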

Typo in PrecisionInfo ctor

include\sl.h(369,40): error : base class 'sl::BaseStructure' is uninitialized when used here to access 'sl::BaseStructure::structType' [-Werror,-Wuninitialized]

You probably meant to pass PrecisionInfo::s_structType instead of BaseStructure::structType to the BaseStructure ctor.
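The pattern behind the warning can be shown with simplified, hypothetical types (these are not the real sl.h definitions, and the constant value is made up). Passing the base class's own member to the base constructor reads it before the base is constructed, while passing the derived type's static constant is well defined.

```cpp
#include <cassert>
#include <cstdint>

// Simplified, hypothetical version of the base structure.
struct BaseStructure {
    uint32_t structType;
    explicit BaseStructure(uint32_t type) : structType(type) {}
};

struct PrecisionInfo : BaseStructure {
    static constexpr uint32_t s_structType = 0x1234;  // made-up value for illustration

    // Buggy pattern: `BaseStructure(BaseStructure::structType)` would read the
    // base member before the base is constructed (flagged by -Wuninitialized).
    // Fixed pattern: pass the static constant instead.
    PrecisionInfo() : BaseStructure(s_structType) {}
};
```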
