Comments (5)

vmx commented on September 9, 2024

@chuck-r I was able to reproduce the issue. I won't have much time next week, but I wanted to let you know that I'm looking into it and making progress.

from neptune.

vmx commented on September 9, 2024

In case anyone wants to join the debugging fun, it can even be reproduced directly with Clang. Run it with the attached kernel:

> clang++ -cl-std=CL2.0 -x cl --target=amdgcn-amd-amdhsa -nogpulib -Xclang -finclude-default-header kernel.txt

kernel.txt:678:25: error: unsupported initializer for address space
DEVICE state_2_standard apply_matrix_2_standard (CONSTANT Fr matrix[3][3], state_2_standard s) {
                        ^
kernel.txt:678:25: error: unsupported initializer for address space
fatal error: error in backend: Cannot select: 0x2b4cbf8: i32 = GlobalAddress<[4 x i64] addrspace(5)* @constinit.10> 0
In function: apply_matrix_2_standard
clang: error: clang frontend command failed with exit code 70 (use -v to see invocation)
Debian clang version 11.1.0-4
Target: amdgcn-amd-amdhsa
Thread model: posix
InstalledDir: /usr/bin

vmx commented on September 9, 2024

@chuck-r there was a new release of ec-gpu-gen. Can you please try it again? You might need to run cargo update to get the newest version.

I'll re-open the issue and close it once it is confirmed that it is really fixed.

chuck-r commented on September 9, 2024

That particular issue is fixed, but I'm still getting an "Illegal instruction" crash later on.

$ benchy
    Finished release [optimized] target(s) in 0.09s
     Running `target/release/benchy window-post --size 32GiB --cache /media/FileCoin/window-post-32GiB-dir --skip-precommit-phase1 --skip-commit-phase1 --skip-commit-phase2`
2021-11-25T20:45:58.562 INFO benchy::window_post > Benchy Window PoSt: sector-size=34359738368, api_version=1.0.0, preserve_cache=false, skip_precommit_phase1=true, skip_precommit_phase2=false, skip_commit_phase1=true, skip_commit_phase2=true, test_resume=false
2021-11-25T20:45:58.562 INFO benchy::window_post > Using cache directory "/media/FileCoin/window-post-32GiB-dir"
2021-11-25T20:45:58.562 INFO benchy::window_post > *** Restoring precommit phase1 output file
2021-11-25T20:45:58.562 INFO filecoin_proofs::api > validate_cache_for_precommit_phase2:start
2021-11-25T20:45:58.562 INFO filecoin_proofs::api > validate_cache_for_precommit_phase2:finish
2021-11-25T20:45:58.562 INFO filecoin_proofs::api::seal > seal_pre_commit_phase2:start
2021-11-25T20:45:58.563 TRACE filecoin_proofs::api::seal > seal phase 2: base tree size 2147483647, base tree leafs 1073741824, rows to discard 7
2021-11-25T20:45:58.563 INFO storage_proofs_porep::stacked::vanilla::proof > replicate_phase2
2021-11-25T20:45:58.563 TRACE storage_proofs_porep::stacked::vanilla::proof > transform_and_replicate_layers
2021-11-25T20:45:58.563 TRACE storage_proofs_porep::stacked::vanilla::proof > nodes count 1073741824, data len 34359738368
2021-11-25T20:45:58.563 TRACE storage_proofs_porep::stacked::vanilla::proof > is_merkle_tree_size_valid(134217728, BINARY_ARITY) = true
2021-11-25T20:45:58.563 TRACE storage_proofs_porep::stacked::vanilla::proof > is_merkle_tree_size_valid(134217728, 8) = true
2021-11-25T20:45:58.563 TRACE storage_proofs_porep::stacked::vanilla::proof > tree_r_last using rows_to_discard=2
2021-11-25T20:45:58.563 INFO storage_proofs_porep::stacked::vanilla::proof > Building trees [524288 descriptors max available]
2021-11-25T20:45:58.563 INFO storage_proofs_porep::stacked::vanilla::proof > generating tree c using the GPU
2021-11-25T20:45:58.563 INFO storage_proofs_porep::stacked::vanilla::proof > Building column hashes
2021-11-25T20:45:58.564 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 1/8 with column nodes 400000
2021-11-25T20:45:58.618 DEBUG rust_gpu_tools::opencl::utils > loaded devices: [Device { vendor: Amd, name: "gfx1030", memory: 17163091968, pci_id: PciId(3072), uuid: None, device: Device { id: 140033517789120 } }]
2021-11-25T20:45:58.618 DEBUG rust_gpu_tools::device > loaded devices: [Device { vendor: Amd, name: "gfx1030", memory: 17163091968, pci_id: PciId(3072), uuid: None, opencl: Some(Device { vendor: Amd, name: "gfx1030", memory: 17163091968, pci_id: PciId(3072), uuid: None, device: Device { id: 140033517789120 } }) }]
2021-11-25T20:45:58.632 INFO neptune::proteus::program > Using kernel on OpenCL.
2021-11-25T20:45:58.664 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 400000/400000/134217728
2021-11-25T20:45:58.872 INFO neptune::proteus::program > Using kernel on OpenCL.
2021-11-25T20:46:00.353 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 1/8 with column nodes 400000
2021-11-25T20:46:00.423 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 800000/400000/134217728
2021-11-25T20:46:00.592 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 1/8 with column nodes 400000
2021-11-25T20:46:00.672 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 1200000/400000/134217728
2021-11-25T20:46:00.833 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 1/8 with column nodes 400000
2021-11-25T20:46:00.904 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 1600000/400000/134217728
...
2021-11-25T20:47:19.543 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 1/8 with column nodes 400000
2021-11-25T20:47:19.598 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 133200000/400000/134217728
2021-11-25T20:47:19.783 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 1/8 with column nodes 400000
2021-11-25T20:47:19.838 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 133600000/400000/134217728
2021-11-25T20:47:20.022 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 1/8 with column nodes 400000
2021-11-25T20:47:20.078 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 134000000/400000/134217728
2021-11-25T20:47:20.261 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 1/8 with column nodes 217728
2021-11-25T20:47:20.289 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 134217728/217728/134217728
2021-11-25T20:47:20.507 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:20.570 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 400000/400000/134217728
2021-11-25T20:47:31.068 TRACE storage_proofs_porep::stacked::vanilla::proof > base data len 134217728, tree data len 19173961
2021-11-25T20:47:31.068 INFO storage_proofs_porep::stacked::vanilla::proof > persisting base tree_c 1/8 of length 153391689
2021-11-25T20:47:31.068 TRACE storage_proofs_porep::stacked::vanilla::proof > tree_c store path "/media/FileCoin/window-post-32GiB-dir/sc-02-data-tree-c-0.dat" -- exists? true
2021-11-25T20:47:31.073 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:31.140 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 800000/400000/134217728
2021-11-25T20:47:31.301 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:31.358 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 1200000/400000/134217728
2021-11-25T20:47:31.379 TRACE storage_proofs_porep::stacked::vanilla::proof > flattening tree_c base data of 134217728 nodes using batch size 262144
2021-11-25T20:47:31.539 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:33.318 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 1600000/400000/134217728
2021-11-25T20:47:33.318 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:33.377 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 2000000/400000/134217728
2021-11-25T20:47:33.390 TRACE storage_proofs_porep::stacked::vanilla::proof > done flattening tree_c base data
2021-11-25T20:47:33.390 TRACE storage_proofs_porep::stacked::vanilla::proof > flattening tree_c tree data of 19173961 nodes using batch size 262144 and base offset 134217728
2021-11-25T20:47:33.548 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:33.606 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 2400000/400000/134217728
2021-11-25T20:47:33.634 TRACE storage_proofs_porep::stacked::vanilla::proof > done flattening tree_c tree data
2021-11-25T20:47:33.634 TRACE storage_proofs_porep::stacked::vanilla::proof > writing tree_c store data
2021-11-25T20:47:33.783 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:33.836 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 2800000/400000/134217728
2021-11-25T20:47:34.015 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:34.072 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 3200000/400000/134217728
2021-11-25T20:47:34.252 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:34.306 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 3600000/400000/134217728
2021-11-25T20:47:34.489 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:34.543 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 4000000/400000/134217728
2021-11-25T20:47:34.725 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:34.786 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 4400000/400000/134217728
2021-11-25T20:47:34.958 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:35.018 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 4800000/400000/134217728
2021-11-25T20:47:35.192 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:35.251 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 5200000/400000/134217728
2021-11-25T20:47:35.434 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:35.486 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 5600000/400000/134217728
2021-11-25T20:47:35.673 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:35.726 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 6000000/400000/134217728
2021-11-25T20:47:35.910 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:35.968 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 6400000/400000/134217728
2021-11-25T20:47:36.151 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:36.205 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 6800000/400000/134217728
2021-11-25T20:47:36.390 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:36.445 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 7200000/400000/134217728
2021-11-25T20:47:36.627 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:36.685 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 7600000/400000/134217728
2021-11-25T20:47:36.864 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:36.922 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 8000000/400000/134217728
2021-11-25T20:47:37.103 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:37.160 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 8400000/400000/134217728
2021-11-25T20:47:37.343 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
2021-11-25T20:47:37.401 TRACE storage_proofs_porep::stacked::vanilla::proof > node index 8800000/400000/134217728
2021-11-25T20:47:37.581 TRACE storage_proofs_porep::stacked::vanilla::proof > processing config 2/8 with column nodes 400000
thread '<unnamed>' panicked at 'Could not create Fr from bytes.: Bytes could not be converted to Fr', /home/chuck/git/rust-fil-proofs/storage-proofs-porep/src/stacked/vanilla/proof.rs:573:46
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'worker-thread-4' panicked at 'failed to recv columns: RecvError', /home/chuck/git/rust-fil-proofs/storage-proofs-porep/src/stacked/vanilla/proof.rs:622:51
2021-11-25T20:47:48.921 TRACE storage_proofs_porep::stacked::vanilla::proof > done writing tree_c store data
thread 'main' panicked at 'failed to receive base_data, tree_data for tree_c: RecvError', /home/chuck/git/rust-fil-proofs/storage-proofs-porep/src/stacked/vanilla/proof.rs:661:26
thread 'main' panicked at 'Worker Pool was poisoned', /home/chuck/.cargo/registry/src/github.com-1ecc6299db9ec823/yastl-0.1.2/src/wait.rs:50:13
stack backtrace:
   0:     0x55cb4acb7120 - std::backtrace_rs::backtrace::libunwind::trace::h34055254b57d8e79
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/../../backtrace/src/backtrace/libunwind.rs:90:5
   1:     0x55cb4acb7120 - std::backtrace_rs::backtrace::trace_unsynchronized::h8f1e3fbd9afff6ec
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x55cb4acb7120 - std::sys_common::backtrace::_print_fmt::h3a99a796b770c360
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/sys_common/backtrace.rs:67:5
   3:     0x55cb4acb7120 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h32d1f94a80615d18
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/sys_common/backtrace.rs:46:22
   4:     0x55cb4acdb6cc - core::fmt::write::h306731c068f7162c
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/fmt/mod.rs:1110:17
   5:     0x55cb4acb3ac5 - std::io::Write::write_fmt::hd2fa90334eee2a21
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/io/mod.rs:1588:15
   6:     0x55cb4acb97eb - std::sys_common::backtrace::_print::h5abaa2601a852287
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/sys_common/backtrace.rs:49:5
   7:     0x55cb4acb97eb - std::sys_common::backtrace::print::h8d81445442bb638f
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/sys_common/backtrace.rs:36:9
   8:     0x55cb4acb97eb - std::panicking::default_hook::{{closure}}::hcfe804496a9fa747
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panicking.rs:208:50
   9:     0x55cb4acb92c1 - std::panicking::default_hook::hbea8e3ccf2ba8901
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panicking.rs:225:9
  10:     0x55cb4acb9eb4 - std::panicking::rust_panic_with_hook::h7ee9e1a2d0f8975a
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panicking.rs:622:17
  11:     0x55cb4ac096c4 - std::panicking::begin_panic::{{closure}}::h1aeb5a805a015d3b
  12:     0x55cb4ac0968c - std::sys_common::backtrace::__rust_end_short_backtrace::h0c19a3f2edf276b3
  13:     0x55cb4a31a85c - std::panicking::begin_panic::ha45b90973055b434
  14:     0x55cb4ac0b5cd - yastl::wait::WaitGroup::join::h384027ae7daeb207
  15:     0x55cb4a79644d - yastl::scope::Scope::zoom::h7009b675f4993532
  16:     0x55cb4a5dcb58 - yastl::Pool::scoped::h2b96289e89712371
  17:     0x55cb4a3af0b6 - storage_proofs_core::measurements::measure_op::h996e5e296c8c0f58
  18:     0x55cb4a6db64d - storage_proofs_porep::stacked::vanilla::proof::StackedDrg<Tree,G>::replicate_phase2::h487b00da4079463b
  19:     0x55cb4a54ab9d - filecoin_proofs::api::seal::seal_pre_commit_phase2::h0109ae6da6162dd5
  20:     0x55cb4a696185 - fil_proofs_tooling::measure::measure::ha37703436f6c4807
  21:     0x55cb4a7e6cce - benchy::window_post::run_pre_commit_phases::h48b5dae3428ad3ad
  22:     0x55cb4a7ed53b - benchy::window_post::run_window_post_bench::h233611cd176e49cc
  23:     0x55cb4a7f6431 - benchy::window_post::run::h8ec3e493dad0f387
  24:     0x55cb4a3966c1 - benchy::main::hd678c96767c9cc6c
  25:     0x55cb4a7d80f3 - std::sys_common::backtrace::__rust_begin_short_backtrace::he89a9c19e7ea7357
  26:     0x55cb4a77f6dd - std::rt::lang_start::{{closure}}::hcc91664a2cefe980
  27:     0x55cb4acba4b9 - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::h2aabc384aab89b7b
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/ops/function.rs:259:13
  28:     0x55cb4acba4b9 - std::panicking::try::do_call::hc5fcacb7a85fc7b1
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panicking.rs:401:40
  29:     0x55cb4acba4b9 - std::panicking::try::hb5d9603af3abbe3a
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panicking.rs:365:19
  30:     0x55cb4acba4b9 - std::panic::catch_unwind::h98fe6ac3925e64b4
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panic.rs:434:14
  31:     0x55cb4acba4b9 - std::rt::lang_start_internal::h22ac7383c516f93e
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/rt.rs:34:21
  32:     0x55cb4a397a62 - main
  33:     0x7f64308fab25 - __libc_start_main
  34:     0x55cb4a3212be - _start
  35:                0x0 - <unknown>
thread panicked while panicking. aborting.
Illegal instruction (core dumped)

It doesn't seem to be related, but I can't discern where it's coming from. If it's unrelated, go ahead and close it. @vmx I'll send you a core dump in Slack as well if you want to look.
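
For anyone reading the log: the first panic ("Could not create Fr from bytes") looks like the root failure, and the RecvError panics after it are most likely knock-on effects. In Rust, recv on a channel returns RecvError once every sender has been dropped, which is what happens when the sending thread unwinds. The final "Illegal instruction" matches how the Rust runtime of that era aborted after printing "thread panicked while panicking". Below is a minimal sketch of the RecvError part only. It uses std::sync::mpsc from the standard library; whether rust-fil-proofs uses this exact channel type is an assumption, and the function name is hypothetical.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a consumer waiting on data from a producer
/// thread that fails before it sends anything.
fn recv_after_producer_failure() -> Result<u32, mpsc::RecvError> {
    let (tx, rx) = mpsc::channel::<u32>();

    let producer = thread::spawn(move || {
        // The sender is owned by this thread, so it is dropped when the
        // thread unwinds from the panic below.
        let _tx = tx;
        panic!("simulated failure: bytes could not be converted");
    });

    // join() returns Err because the producer panicked; ignored here on
    // purpose, since the point is what the receiver observes afterwards.
    let _ = producer.join();

    // No sender is left and nothing was ever sent, so this is an error.
    rx.recv()
}

fn main() {
    match recv_after_producer_failure() {
        Ok(value) => println!("received {}", value),
        Err(err) => println!("recv failed: {:?}", err),
    }
}
```

If this reading is right, the line to chase is the first panic at proof.rs:573, not the later RecvError ones.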

from neptune.

porcuquine commented on September 9, 2024

Closing on the assumption this is unrelated. Please reopen if needed.
