Giter Site home page Giter Site logo

Crash with AMD GPU about wluma HOT 24 CLOSED

maximbaz avatar maximbaz commented on August 18, 2024
Crash with AMD GPU

from wluma.

Comments (24)

xenrox avatar xenrox commented on August 18, 2024 1

To be sure, can you rollback your config to the one attach in the beginning of the issue and give us the few first lines of RUST_LOG=debug wluma ?

[2022-01-16T20:50:42Z DEBUG wluma] Using config: Config { frame: Frame { capturer: Wlroots, processor: Vulkan }, als: Time { thresholds_string_key: {}, thresholds: {9: "dim", 13: "bright", 7: "dark", 0: "night", 20: "night", 11: "normal", 16: "normal", 18: "dark"} }, output_by_type: OutputByType { backlight: [], ddcutil: [] }, keyboard: None, output: [DdcUtil(DdcUtilOutput { name: "LG HDR 4K" })] }
Continue adjusting brightness and wluma will learn your preference over time.
[2022-01-16T20:50:42Z DEBUG wluma::frame::capturer::wlroots] Using output 'Goldstar Company Ltd LG HDR 4K 0x00004E1E (DP-3)' for config 'LG HDR 4K'
thread 'predictor-LG HDR 4K' panicked at 'Unable to compute luma percent: ERROR_MEMORY_MAP_FAILED', src/frame/capturer/wlroots.rs:124:26
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

from wluma.

xenrox avatar xenrox commented on August 18, 2024 1

@maximbaz
Sure I would love to help with debugging the vulkan crash.

I have applied #28 and here is the new output if that helps:

config
env RUST_LOG=debug ./target/debug/wluma
[2022-01-16T20:55:29Z DEBUG wluma] Using config: Config { frame: Frame { capturer: None, processor: Vulkan }, als: Time { thresholds_string_key: {}, thresholds: {9: "dim", 0: "night", 16: "normal", 7: "dark", 11: "normal", 18: "dark", 13: "bright", 20: "night"} }, output_by_type: OutputByType { backlight: [], ddcutil: [] }, keyboard: None, output: [DdcUtil(DdcUtilOutput { name: "LG HDR 4K" })] }
Continue adjusting brightness and wluma will learn your preference over time.
[2022-01-16T20:55:30Z DEBUG wluma::brightness::ddcutil] display found: DisplayInfo { backend: I2cDevice, id: "22794", manufacturer_id: Some("GSM"), model_id: Some(30471), version: Some((1, 4)), serial: Some(544286), manufacture_year: Some(31), manufacture_week: Some(2), model_name: Some("LG HDR 4K"), serial_number: None, edid_data: Some([0, 255, 255, 255, 255, 255, 255, 0, 30, 109, 7, 119, 30, 78, 8, 0, 2, 31, 1, 4, 181, 60, 34, 120, 158, 62, 49, 174, 80, 71, 172, 39, 12, 80, 84, 33, 8, 0, 113, 64, 129, 128, 129, 192, 169, 192, 209, 192, 129, 0, 1, 1, 1, 1, 77, 208, 0, 160, 240, 112, 62, 128, 48, 32, 101, 12, 88, 84, 33, 0, 0, 26, 40, 104, 0, 160, 240, 112, 62, 128, 8, 144, 101, 12, 88, 84, 33, 0, 0, 26, 0, 0, 0, 253, 0, 56, 61, 30, 135, 56, 0, 10, 32, 32, 32, 32, 32, 32, 0, 0, 0, 252, 0, 76, 71, 32, 72, 68, 82, 32, 52, 75, 10, 32, 32, 32, 1, 247, 2, 3, 28, 113, 68, 144, 4, 3, 1, 35, 9, 7, 7, 131, 1, 0, 0, 227, 5, 192, 0, 230, 6, 5, 1, 82, 72, 93, 2, 58, 128, 24, 113, 56, 45, 64, 88, 44, 69, 0, 88, 84, 33, 0, 0, 30, 86, 94, 0, 160, 160, 160, 41, 80, 48, 32, 53, 0, 88, 84, 33, 0, 0, 26, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 44]), mccs_version: None, mccs_database: Database { entries: {} } }
[2022-01-16T20:55:30Z DEBUG wluma::brightness::ddcutil] display found: DisplayInfo { backend: I2cDevice, id: "22791", manufacturer_id: Some("BNQ"), model_id: Some(32561), version: Some((1, 3)), serial: Some(21573), manufacture_year: Some(26), manufacture_week: Some(8), model_name: Some("BenQ XL2411Z"), serial_number: Some("D2G05085SL0"), edid_data: Some([0, 255, 255, 255, 255, 255, 255, 0, 9, 209, 49, 127, 69, 84, 0, 0, 8, 26, 1, 3, 128, 53, 30, 120, 46, 157, 225, 166, 84, 84, 159, 38, 13, 80, 84, 165, 107, 128, 209, 192, 69, 124, 97, 124, 129, 128, 129, 188, 149, 60, 49, 124, 1, 1, 2, 58, 128, 24, 113, 56, 45, 64, 88, 44, 69, 0, 19, 42, 33, 0, 0, 30, 0, 0, 0, 255, 0, 68, 50, 71, 48, 53, 48, 56, 53, 83, 76, 48, 10, 32, 0, 0, 0, 253, 0, 56, 144, 30, 160, 33, 0, 10, 32, 32, 32, 32, 32, 32, 0, 0, 0, 252, 0, 66, 101, 110, 81, 32, 88, 76, 50, 52, 49, 49, 90, 10, 1, 222, 2, 1, 4, 0, 254, 91, 128, 160, 112, 56, 53, 64, 48, 32, 53, 0, 19, 42, 33, 0, 0, 26, 134, 111, 128, 160, 112, 56, 64, 64, 48, 32, 53, 0, 19, 42, 33, 0, 0, 26, 252, 126, 128, 136, 112, 56, 18, 64, 24, 32, 53, 0, 19, 42, 33, 0, 0, 30, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 199]), mccs_version: None, mccs_database: Database { entries: {} } }
[2022-01-16T20:55:30Z WARN  wluma] Skipping output 'LG HDR 4K' as it might be disconnected: Unable to find display
thread 'predictor-LG HDR 4K' panicked at 'Did not receive initial brightness value in time', src/predictor/controller.rs:69:25
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Edit: This is with capturer none.

from wluma.

xenrox avatar xenrox commented on August 18, 2024 1

Made some progress:
One problem is, that my display has no serial. Because of that serial is not Some but None and the iteration does not even reach the display I actually want to select here:

.and_then(|merged| merged.contains(name).then(|| ()))

After fixing that (substituted serial_number with model_id) it can actually match my display (no more warning) but still fails at getting the intial brightness

env RUST_LOG=debug ./target/debug/wluma
[2022-01-16T23:07:41Z DEBUG wluma] Using config: Config { frame: Frame { capturer: None, processor: Vulkan }, als: Time { thresholds_string_key: {}, thresholds: {16: "normal", 13: "bright", 7: "dark", 0: "night", 18: "dark", 20: "night", 9: "dim", 11: "normal"} }, output_by_type: OutputByType { backlight: [], ddcutil: [] }, keyboard: None, output: [DdcUtil(DdcUtilOutput { name: "LG HDR 4K" })] }
Continue adjusting brightness and wluma will learn your preference over time.
thread 'predictor-LG HDR 4K' panicked at 'Did not receive initial brightness value in time', src/predictor/controller.rs:69:25
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrac

from wluma.

maximbaz avatar maximbaz commented on August 18, 2024 1

With 4044541 the missing serial number should be fixed, together with having more debug logging to identify why displays are not being matched. Untested by hopefully will work 😛

DDC/I2C errors like the above is something @cyrinux experienced as well (I dont use external monitors so we've been testing on his setup), we concluded that it's just transient errors that we can't do anything about - although for him the errors only happened if he was messing with screen using ddcutil cli in parallel. So for you the errors occur even just by running wluma and nothing else?

To clarify, as long as the errors are transient, the app should just work, we log and ignore those errors.


Let me summarize to see if I remember it all correctly, the state with the latest main code:

  • On desktop with external GPU, only capturer=none now works
  • On laptop, capturer=wlroots now works with both only internal screen and with external display plugged in.

Everything correct?

from wluma.

xenrox avatar xenrox commented on August 18, 2024 1

Yes 5 is enough, it only failed one or two times with 4.

from wluma.

xenrox avatar xenrox commented on August 18, 2024 1

If you want to make a head start, here's what you can do:

* install Vulkan validation layers (are you on Arch? `extra/vulkan-validation-layers`)

* `export VK_INSTANCE_LAYERS=VK_LAYER_KHRONOS_validation`

* `cargo run`

This is an extremely helpful tool, it will log all wrongdoings related to Vulkan. On my machine it produces no issues, but I'm willing to bet that on your desktop we will get some nice errors to fix.

Yes, I am on Arch.

VUID-vkMapMemory-memory-00682(ERROR / SPEC): msgNum: -330527817 - Validation Error: [ VUID-vkMapMemory-memory-00682 ] Object 0: handle = 0xfa21a40000000003, type = VK_OBJECT_TYPE_DEVICE_MEMORY; | MessageID = 0xec4c8bb7 | Mapping Memory without VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT set: VkDeviceMemory 0xfa21a40000000003[]. The Vulkan spec states: memory must have been created with a memory type that reports VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT (https://www.khronos.org/registry/vulkan/specs/1.2-extensions/html/vkspec.html#VUID-vkMapMemory-memory-00682)
    Objects: 1
        [0] 0xfa21a40000000003, type: 8, name: NULL
thread 'predictor-LG HDR 4K' panicked at 'Unable to compute luma percent: ERROR_MEMORY_MAP_FAILED', src/frame/capturer/wlroots.rs:124:26

from wluma.

maximbaz avatar maximbaz commented on August 18, 2024 1

Perfect, at least we will not be walking in the dark, but we have precise description of what to fix, with helpful links 😁

Will share some hints in case you want to try to play around, if not - no big deal, we will get to this eventually 😉

It crashes on mapping the memory here:

let buffer_pointer = self.device.map_memory(

It tried to map the memory to the following allocated buffer:

let buffer_memory = unsafe { device.allocate_memory(&allocate_info, None)? };

Which was allocated in some memory that fits these requirements:

let allocate_info = vk::MemoryAllocateInfo {

Specifically, I'm nearly positive that the bug is here:

memory_type_index: 0,

Instead of picking the index of the right memory, we just pass 0, which on our laptop is the only available memory, so the choice is always right 😁

The fix could be as simple as trying to hardcode there 1 and seeing if it works, but I dont know if indices are actually incremental - but for sure there is an API call where we can get the right index for the memory that reports VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT (as per the error message), and then pass the right index there.

If you do look into it, please keep us posted how it goes!

from wluma.

xenrox avatar xenrox commented on August 18, 2024 1

Yes it seems like I am all out of bugs.
Thanks a lot for your help @maximbaz , @cyrinux!

from wluma.

cyrinux avatar cyrinux commented on August 18, 2024

Hi,
Thanks for your bug report.
Can you attach the output of your ddcutil detect and try to use the serial number in the "name" field please?

from wluma.

xenrox avatar xenrox commented on August 18, 2024

Sorry, seems like my display has no serial number 😆

Display 1
I2C bus: /dev/i2c-10
EDID synopsis:
Mfg id: GSM
Model: LG HDR 4K
Product code: 30471
Serial number:
Binary serial number: 544286 (0x00084e1e)
Manufacture year: 2021, Week: 2
VCP version: 2.1

And using some other identifier like GSM results in Unable to find display.

from wluma.

cyrinux avatar cyrinux commented on August 18, 2024

Ok, LG HDR 4K is a good name value. What happen if you change capturer to none please?
Do you confirm also you run a wayland session (and not X11) ?

from wluma.

xenrox avatar xenrox commented on August 18, 2024

Yes, I use sway.
If I change capturer to none, it can no longer identify the correct display:

[2022-01-16T19:05:21Z DEBUG wluma] Using config: Config { frame: Frame { capturer: None, processor: Vulkan }, als: Time { thresholds_string_key: {}, thresholds: {11: "normal", 18: "dark", 13: "bright", 9: "dim", 7: "dark", 0: "night", 16: "normal", 20: "night"} }, output_by_type: OutputByType { backlight: [], ddcutil: [] }, keyboard: None, output: [DdcUtil(DdcUtilOutput { name: "LG HDR 4K" })] }
Continue adjusting brightness and wluma will learn your preference over time.
[2022-01-16T19:05:21Z WARN  wluma] Skipping output 'LG HDR 4K' as it might be disconnected: Unable to find display
thread 'predictor-LG HDR 4K' panicked at 'Did not receive initial brightness value in time', src/predictor/controller.rs:69:25
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

from wluma.

cyrinux avatar cyrinux commented on August 18, 2024

Ok thank you, we will back too you asap 👍🏻

from wluma.

cyrinux avatar cyrinux commented on August 18, 2024

To be sure, can you rollback your config to the one attach in the beginning of the issue and give us the few first lines of RUST_LOG=debug wluma ?

from wluma.

maximbaz avatar maximbaz commented on August 18, 2024

Thanks for reporting @xenrox, and thanks to both of you for helping investigate the problem!

There seems to be two issues at hand, one about matching to a proper ddcutil when capturer is none (capturer should not really affect this, so the cause might be actually somewhere else), another about vulkan crash.

We'll try to get to the bottom of both issues, starting from us providing a dev build with extra debugging info, which should help us narrow down the problem. We'll ping you soon 🙂


Vulkan issue can be tough to fix without having a way to locally reproduce, so just wanted to ask if you would in theory feel adventurous and willing to try helping us fix it? 🙂

I'm almost sure that the issue is due to the fact that I'm using Vulkan API incorrectly, I must have hardcoded some assumptions, and it happens to work on all of our laptops simply because integrated GPU shares memory with CPU and/or RAM, so even "improper" params still allow code to work, while on your desktop you have a real independent GPU, which complains on allocating things that were not supposed to be allocated. Will do my best to help guiding you if you want to help with this, although I'm clearly no Vulkan expert myself, I just hack it until it works 😁

The only temporary alternative would be using capturer "none" on your desktop until this is properly solved, where wluma would simply use ALS (hour in your case), but not the actual screen contents.

from wluma.

xenrox avatar xenrox commented on August 18, 2024

Which in turn gets fixed by increasing INITIAL_TIMEOUT_SECS (I set it to 20), so I guess my display is just slow?

[2022-01-16T23:09:43Z DEBUG wluma] Using config: Config { frame: Frame { capturer: None, processor: Vulkan }, als: Time { thresholds_string_key: {}, thresholds: {9: "dim", 7: "dark", 13: "bright", 18: "dark", 20: "night", 0: "night", 11: "normal", 16: "normal"} }, output_by_type: OutputByType { backlight: [], ddcutil: [] }, keyboard: None, output: [DdcUtil(DdcUtilOutput { name: "LG HDR 4K" })] }
Continue adjusting brightness and wluma will learn your preference over time.
[2022-01-16T23:10:03Z DEBUG wluma::predictor::controller] Learning Entry { lux: "night", luma: 0, brightness: 85 }

from wluma.

maximbaz avatar maximbaz commented on August 18, 2024

Very interesting, thanks for sharing! Will make the necessary adjustment tomorrow for the serial number, you are right we indeed didn't anticipate that it might be None in a valid scenario.

And as for timeout, interesting..... Could also be (at least partially) a debug build to blame, but still, never happened to us so far... Out of curiosity you can try to experiment by adding some logging in ddcutil.rs in get() method to see how long it executes - just to get an idea if your guess was correct.

from wluma.

xenrox avatar xenrox commented on August 18, 2024

Yes on a non debug build it works with default settings, on a debug build it takes slightly less than 4 seconds.
Now I have run into a new error, but thats for tomorrow to investigate.

[2022-01-16T23:39:49Z ERROR wluma::brightness::controller] Unable to get brightness value: DDC/CI I2C error: Remote I/O error (os error 121)
[2022-01-16T23:39:49Z ERROR wluma::brightness::controller] Unable to get brightness value: DDC/CI error: Expected DDC/CI length bit

from wluma.

xenrox avatar xenrox commented on August 18, 2024

With 4044541 the missing serial number should be fixed, together with having more debug logging to identify why displays are not being matched. Untested by hopefully will work 😛

Yeah matching works fine now, thanks a lot!

DDC/I2C errors like the above is something @cyrinux experienced as well (I dont use external monitors so we've been testing on his setup), we concluded that it's just transient errors that we can't do anything about - although for him the errors only happened if he was messing with screen using ddcutil cli in parallel. So for you the errors occur even just by running wluma and nothing else?

I was using ddcutil in parallel to train wluma. Thanks for the explanation.

To clarify, as long as the errors are transient, the app should just work, we log and ignore those errors.

Let me summarize to see if I remember it all correctly, the state with the latest main code:

* On desktop with external GPU, only `capturer=none` now works

* On laptop, `capturer=wlroots` now works with both only internal screen and with external display plugged in.

Everything correct?

Correct! (Just had to slightly increase INITIAL_TIMEOUT_SECS again)

from wluma.

maximbaz avatar maximbaz commented on August 18, 2024

Can you tell me by how much are you increasing INITIAL_TIMEOUT_SECS to make it work? Would 5 be enough? I wouldn't want it to be too long, like waiting for 10 seconds only to learn that there is some bug or misconfiguration would seem excessive.... But of course I also dont want to have such a small value that it causes random failures like you experience!

from wluma.

maximbaz avatar maximbaz commented on August 18, 2024

I was using ddcutil in parallel to train wluma.

As long a you can use it to train wluma, yes just disregard the errors - maybe they should have been warnings really, but I wanted to see if someone experiences those issues in other scenarios.

We want to add CLI to wluma, so that it will be possible to train it natively, without any potential conflicts between different tools who all want exclusive access to the same resource. But until that happens, if ddcutil works for the purpose, just use it 🙂

I increased the timeout, of course just let me know if you still manage to see issues with 5 and we will increase more, after all this constant is not really based on anything else but the feedback like yours.


It seems there is just one outstanding issue left, proper support for independent GPU for capturer=wlroots. There are a few small things @cyrinux and I want to deal with first (that are already in progress), and then we'll get to this.

If you want to make a head start, here's what you can do:

  • install Vulkan validation layers (are you on Arch? extra/vulkan-validation-layers)
  • export VK_INSTANCE_LAYERS=VK_LAYER_KHRONOS_validation
  • cargo run

This is an extremely helpful tool, it will log all wrongdoings related to Vulkan. On my machine it produces no issues, but I'm willing to bet that on your desktop we will get some nice errors to fix.

from wluma.

cyrinux avatar cyrinux commented on August 18, 2024

Guys, I reproduce the same bug and can confirm that putting 1 doesnt crash.

from wluma.

maximbaz avatar maximbaz commented on August 18, 2024

@cyrinux has fixed the last issue, we'll release v4 soon, but if you can confirm @xenrox it would be appreciated 😉

from wluma.

maximbaz avatar maximbaz commented on August 18, 2024

Thanks to you too 🙂
Will do some sanity checks and release v4 tomorrow, FYI the format of config.toml is slightly changed now 😉

from wluma.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.