lurk-lab / neptune
Rust Poseidon implementation.
License: Other
Wasn't sure how to title this bug report, so I hope it's good enough.
My configuration:
Manjaro Linux
Mesa Drivers with AMDGPU-PRO OpenCL library (this configuration works in other OpenCL applications/benchmarks such as Luxmark)
CPU: Ryzen 5950X
GPU: Radeon RX 6800 XT
Ran into this error when running benchy
from rust-fil-proofs. @vmx advised me to file a bug report against neptune.
Finished test [unoptimized + debuginfo] target(s) in 20.54s
Running target/debug/deps/neptune-b1005e4d0551b26d
running 33 tests
test circuit::tests::test_poseidon_hash ... ok
test circuit::tests::test_scalar_product ... ok
test circuit::tests::test_scalar_product_with_add ... ok
test circuit::tests::test_square_sum ... ok
test column_tree_builder::tests::test_column_tree_builder ... LLVM ERROR: Cannot select: 0x55bca5ac5090: i32 = GlobalAddress<[4 x i64] addrspace(5)* @constinit.10> 0
In function: hash_2_standard
error: test failed, to rerun pass '--lib'
Caused by:
process didn't exit successfully: `/home/chuck/git/neptune/target/debug/deps/neptune-b1005e4d0551b26d --test-threads=1` (signal: 6, SIGABRT: process abort signal)
Interestingly, when I installed ROCm from the AUR, I got a different but probably related error (technically, ROCm isn't supported on RDNA cards, but I tried it after seeing a Phoronix article):
$ cargo test --features opencl,arity2,arity4,arity8,arity11,arity16,arity24,arity36 -- --test-threads=1
Finished test [unoptimized + debuginfo] target(s) in 0.16s
Running target/debug/deps/neptune-b1005e4d0551b26d
running 33 tests
test circuit::tests::test_poseidon_hash ... ok
test circuit::tests::test_scalar_product ... ok
test circuit::tests::test_scalar_product_with_add ... ok
test circuit::tests::test_square_sum ... ok
test column_tree_builder::tests::test_column_tree_builder ... LLVM ERROR: Cannot select: 0x5650f3109988: i32 = GlobalAddress<[4 x i64] addrspace(5)* @constinit.10> 0
In function: apply_round_matrix_2_standard
error: test failed, to rerun pass '--lib'
Caused by:
process didn't exit successfully: `/home/chuck/git/neptune/target/debug/deps/neptune-b1005e4d0551b26d --test-threads=1` (signal: 6, SIGABRT: process abort signal)
We introduced basic CI as part of #177 to compensate for the change in CI infrastructure. We should introduce a Docker runner able to test Neptune CI with OpenCL and CUDA capabilities, and then run it in the form of self-hosted runners.
To achieve this, we need to integrate the following resources:
Existing CI: We currently have a CI pipeline set up with GitHub Actions.
Self-Hosted Runner Documentation: The official GitHub documentation for self-hosted runners can be found here: https://docs.github.com/en/actions/hosting-your-own-runners/about-self-hosted-runners and https://docs.github.com/en/actions/hosting-your-own-runners/using-self-hosted-runners-in-a-workflow
NVIDIA Container Toolkit: In order to use the GPU within our Docker container, we can integrate the NVIDIA Container Toolkit (https://github.com/NVIDIA/nvidia-docker) into our setup.
Dockerfile Example: A sample Dockerfile that demonstrates basic Rust installation: https://gist.github.com/huitseeker/0c58ee69f63c5e81d6ea64f0dc5153f7. This example can serve as a starting point for our own Dockerfile configuration.
In the new CID/CAR format, multihash is supported, and Poseidon has officially become a hash option.
poseidon-bls12_381-a2-fc1 | multihash | 0xb401 | permanent | Poseidon using BLS12-381 and arity of 2 with Filecoin parameters |
However, it has not yet been implemented by many multihash implementations: go-multihash does not support Poseidon, and neither does rust-multihash.
https://github.com/multiformats/go-multihash/tree/master/register
https://github.com/multiformats/rust-multihash#supported-hash-types
Is neptune finalized? Is there a plan to implement Poseidon in multiple programming languages (FFI does not seem to be a good idea)?
cd gbench
cargo run
Compiling gbench v0.5.4 (/home/peware/neptune/gbench)
Finished dev [unoptimized + debuginfo] target(s) in 2m 43s
Running target/debug/gbench
[2020-07-13T00:50:57Z INFO gbench] KiB: 4194304
[2020-07-13T00:50:57Z INFO gbench] leaves: 134217728
[2020-07-13T00:50:57Z INFO gbench] max column batch size: 400000
[2020-07-13T00:50:57Z INFO gbench] max tree batch size: 700000
--> Run 0
[2020-07-13T00:50:57Z INFO gbench] Creating ColumnTreeBuilder
(some Futhark code): Could not find acceptable OpenCL device.
sudo lshw -C display
*-display
description: VGA compatible controller
product: Ellesmere [Radeon RX 470/480/570/570X/580/580X]
vendor: Advanced Micro Devices, Inc. [AMD/ATI]
physical id: 0
bus info: pci@0000:05:00.0
version: ef
width: 64 bits
clock: 33MHz
capabilities: pm pciexpress msi vga_controller bus_master cap_list rom
configuration: driver=amdgpu latency=0
resources: irq:61 memory:d0000000-dfffffff memory:cfe00000-cfffffff ioport:5000(size=256) memory:fdec0000-fdefffff memory:fde00000-fde1ffff
*-display UNCLAIMED
description: VGA compatible controller
product: ES1000
vendor: Advanced Micro Devices, Inc. [AMD/ATI]
physical id: 3
bus info: pci@0000:01:03.0
version: 02
width: 32 bits
clock: 33MHz
capabilities: pm vga_controller bus_master cap_list
configuration: latency=64 mingnt=8
resources: memory:c0000000-c7ffffff ioport:2000(size=256) memory:ed9f0000-ed9fffff memory:c0000-dffff
Hi, fn generate_constants() https://github.com/filecoin-project/neptune/blob/2b11f0ce69f52aa9594f250baa658bfe2d349ac3/src/round_constants.rs#L26
references https://extgit.iaik.tugraz.at/krypto/hadeshash/blob/master/code/scripts/create_rcs_grain.sage
That file does not exist. An updated script exists in that repo with a notice of some fixed bugs.
Are there no security implications in not following the updated reference impl?
I was trying to reproduce the Poseidon constants which circomlib uses (they use the more recent script generate_parameters_grain.sage) and was unable to.
cd neptune-master/gbench
export NEPTUNE_GBENCH_GPUS=99
RUST_LOG=info cargo run -- --max-tree-batch-size 700000 --max-column-batch-size 400000
Finished dev [unoptimized + debuginfo] target(s) in 0.36s
Running `/public/home/cf/neptune-master/target/debug/gbench --max-tree-batch-size 700000 --max-column-batch-size 400000`
[2021-03-10T09:41:39Z INFO gbench] KiB: 4194304
[2021-03-10T09:41:39Z INFO gbench] leaves: 134217728
[2021-03-10T09:41:39Z INFO gbench] max column batch size: 400000
[2021-03-10T09:41:39Z INFO gbench] max tree batch size: 700000
[2021-03-10T09:41:39Z INFO gbench] GPU[Selector: BatcherType::CustomGPU(BusId(99))] --> Run 0
[2021-03-10T09:41:39Z INFO gbench] GPU[Selector: BatcherType::CustomGPU(BusId(99))]: Creating ColumnTreeBuilder
[2021-03-10T09:41:39Z INFO neptune::triton::cl] getting context for ~BusId(99)
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: ClError(BusIdNotAvailable)', gbench/src/main.rs:31:6
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Any', gbench/src/main.rs:163:23
BUSID info
$ rocm-smi --showhw
========================ROCm System Management Interface========================
GPU DID GFX RAS SDMA RAS UMC RAS VBIOS BUS
1 66a1 DISABLED ENABLED ENABLED 113-D1631900-064 0000:04:00.0
2 66a1 DISABLED ENABLED ENABLED 113-D1631900-064 0000:26:00.0
3 66a1 DISABLED ENABLED ENABLED 113-D1631900-064 0000:43:00.0
4 66a1 DISABLED ENABLED ENABLED 113-D1631900-064 0000:63:00.0
==============================End of ROCm SMI Log ==============================
clinfo
[cf@c07r1n01 gbench]$ clinfo
Number of platforms: 1
Platform Profile: FULL_PROFILE
Platform Version: OpenCL 2.1 AMD-APP (2982.0)
Platform Name: AMD Accelerated Parallel Processing
Platform Vendor: Advanced Micro Devices, Inc.
Platform Extensions: cl_khr_icd cl_amd_event_callback cl_amd_offline_devices
Platform Name: AMD Accelerated Parallel Processing
Number of devices: 4
Device Type: CL_DEVICE_TYPE_GPU
Vendor ID: 1002h
Board name: Device 66a1
Device Topology: PCI[ B#4, D#0, F#0 ]
Max compute units: 60
Max work items dimensions: 3
Max work items[0]: 1024
Max work items[1]: 1024
Max work items[2]: 1024
Max work group size: 256
Preferred vector width char: 4
Preferred vector width short: 2
Preferred vector width int: 1
Preferred vector width long: 1
Preferred vector width float: 1
Preferred vector width double: 1
Native vector width char: 4
Native vector width short: 2
Native vector width int: 1
Native vector width long: 1
Native vector width float: 1
Native vector width double: 1
Max clock frequency: 1600Mhz
Address bits: 64
Max memory allocation: 14588628172
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 8
Max image 2D width: 16384
Max image 2D height: 16384
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 26273
Max size of kernel argument: 1024
Alignment (bits) of base address: 1024
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: Yes
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: Read/Write
Cache line size: 64
Cache size: 16384
Global memory size: 17163091968
Constant buffer size: 14588628172
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 65536
Max pipe arguments: 16
Max pipe active reservations: 16
Max pipe packet size: 1703726284
Max global variable size: 14588628172
Max global variable preferred total size: 17163091968
Max read/write image args: 64
Max on device events: 1024
Queue on device max size: 8388608
Max on device queues: 1
Queue on device preferred size: 262144
SVM capabilities:
Coarse grain buffer: Yes
Fine grain buffer: Yes
Fine grain system: No
Atomics: No
Preferred platform atomic alignment: 0
Preferred global atomic alignment: 0
Preferred local atomic alignment: 0
Kernel Preferred work group size multiple: 64
Error correction support: 0
Unified memory for Host and Device: 0
Profiling timer resolution: 1
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue on Host properties:
Out-of-Order: No
Profiling : Yes
Queue on Device properties:
Out-of-Order: Yes
Profiling : Yes
Platform ID: 0x2b835a7f4d30
Name: gfx906+sram-ecc
Vendor: Advanced Micro Devices, Inc.
Device OpenCL C version: OpenCL C 2.0
Driver version: 2982.0 (HSA1.1,LC)
Profile: FULL_PROFILE
Version: OpenCL 2.0
Extensions: cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program
Device Type: CL_DEVICE_TYPE_GPU
Vendor ID: 1002h
Board name: Device 66a1
Device Topology: PCI[ B#38, D#0, F#0 ]
Max compute units: 60
Max work items dimensions: 3
Max work items[0]: 1024
Max work items[1]: 1024
Max work items[2]: 1024
Max work group size: 256
Preferred vector width char: 4
Preferred vector width short: 2
Preferred vector width int: 1
Preferred vector width long: 1
Preferred vector width float: 1
Preferred vector width double: 1
Native vector width char: 4
Native vector width short: 2
Native vector width int: 1
Native vector width long: 1
Native vector width float: 1
Native vector width double: 1
Max clock frequency: 1600Mhz
Address bits: 64
Max memory allocation: 14588628172
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 8
Max image 2D width: 16384
Max image 2D height: 16384
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 26273
Max size of kernel argument: 1024
Alignment (bits) of base address: 1024
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: Yes
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: Read/Write
Cache line size: 64
Cache size: 16384
Global memory size: 17163091968
Constant buffer size: 14588628172
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 65536
Max pipe arguments: 16
Max pipe active reservations: 16
Max pipe packet size: 1703726284
Max global variable size: 14588628172
Max global variable preferred total size: 17163091968
Max read/write image args: 64
Max on device events: 1024
Queue on device max size: 8388608
Max on device queues: 1
Queue on device preferred size: 262144
SVM capabilities:
Coarse grain buffer: Yes
Fine grain buffer: Yes
Fine grain system: No
Atomics: No
Preferred platform atomic alignment: 0
Preferred global atomic alignment: 0
Preferred local atomic alignment: 0
Kernel Preferred work group size multiple: 64
Error correction support: 0
Unified memory for Host and Device: 0
Profiling timer resolution: 1
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue on Host properties:
Out-of-Order: No
Profiling : Yes
Queue on Device properties:
Out-of-Order: Yes
Profiling : Yes
Platform ID: 0x2b835a7f4d30
Name: gfx906+sram-ecc
Vendor: Advanced Micro Devices, Inc.
Device OpenCL C version: OpenCL C 2.0
Driver version: 2982.0 (HSA1.1,LC)
Profile: FULL_PROFILE
Version: OpenCL 2.0
Extensions: cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program
Device Type: CL_DEVICE_TYPE_GPU
Vendor ID: 1002h
Board name: Device 66a1
Device Topology: PCI[ B#67, D#0, F#0 ]
Max compute units: 60
Max work items dimensions: 3
Max work items[0]: 1024
Max work items[1]: 1024
Max work items[2]: 1024
Max work group size: 256
Preferred vector width char: 4
Preferred vector width short: 2
Preferred vector width int: 1
Preferred vector width long: 1
Preferred vector width float: 1
Preferred vector width double: 1
Native vector width char: 4
Native vector width short: 2
Native vector width int: 1
Native vector width long: 1
Native vector width float: 1
Native vector width double: 1
Max clock frequency: 1600Mhz
Address bits: 64
Max memory allocation: 14588628172
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 8
Max image 2D width: 16384
Max image 2D height: 16384
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 26273
Max size of kernel argument: 1024
Alignment (bits) of base address: 1024
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: Yes
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: Read/Write
Cache line size: 64
Cache size: 16384
Global memory size: 17163091968
Constant buffer size: 14588628172
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 65536
Max pipe arguments: 16
Max pipe active reservations: 16
Max pipe packet size: 1703726284
Max global variable size: 14588628172
Max global variable preferred total size: 17163091968
Max read/write image args: 64
Max on device events: 1024
Queue on device max size: 8388608
Max on device queues: 1
Queue on device preferred size: 262144
SVM capabilities:
Coarse grain buffer: Yes
Fine grain buffer: Yes
Fine grain system: No
Atomics: No
Preferred platform atomic alignment: 0
Preferred global atomic alignment: 0
Preferred local atomic alignment: 0
Kernel Preferred work group size multiple: 64
Error correction support: 0
Unified memory for Host and Device: 0
Profiling timer resolution: 1
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue on Host properties:
Out-of-Order: No
Profiling : Yes
Queue on Device properties:
Out-of-Order: Yes
Profiling : Yes
Platform ID: 0x2b835a7f4d30
Name: gfx906+sram-ecc
Vendor: Advanced Micro Devices, Inc.
Device OpenCL C version: OpenCL C 2.0
Driver version: 2982.0 (HSA1.1,LC)
Profile: FULL_PROFILE
Version: OpenCL 2.0
Extensions: cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program
Device Type: CL_DEVICE_TYPE_GPU
Vendor ID: 1002h
Board name: Device 66a1
Device Topology: PCI[ B#99, D#0, F#0 ]
Max compute units: 60
Max work items dimensions: 3
Max work items[0]: 1024
Max work items[1]: 1024
Max work items[2]: 1024
Max work group size: 256
Preferred vector width char: 4
Preferred vector width short: 2
Preferred vector width int: 1
Preferred vector width long: 1
Preferred vector width float: 1
Preferred vector width double: 1
Native vector width char: 4
Native vector width short: 2
Native vector width int: 1
Native vector width long: 1
Native vector width float: 1
Native vector width double: 1
Max clock frequency: 1600Mhz
Address bits: 64
Max memory allocation: 14588628172
Image support: Yes
Max number of images read arguments: 128
Max number of images write arguments: 8
Max image 2D width: 16384
Max image 2D height: 16384
Max image 3D width: 2048
Max image 3D height: 2048
Max image 3D depth: 2048
Max samplers within kernel: 26273
Max size of kernel argument: 1024
Alignment (bits) of base address: 1024
Minimum alignment (bytes) for any datatype: 128
Single precision floating point capability
Denorms: Yes
Quiet NaNs: Yes
Round to nearest even: Yes
Round to zero: Yes
Round to +ve and infinity: Yes
IEEE754-2008 fused multiply-add: Yes
Cache type: Read/Write
Cache line size: 64
Cache size: 16384
Global memory size: 17163091968
Constant buffer size: 14588628172
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 65536
Max pipe arguments: 16
Max pipe active reservations: 16
Max pipe packet size: 1703726284
Max global variable size: 14588628172
Max global variable preferred total size: 17163091968
Max read/write image args: 64
Max on device events: 1024
Queue on device max size: 8388608
Max on device queues: 1
Queue on device preferred size: 262144
SVM capabilities:
Coarse grain buffer: Yes
Fine grain buffer: Yes
Fine grain system: No
Atomics: No
Preferred platform atomic alignment: 0
Preferred global atomic alignment: 0
Preferred local atomic alignment: 0
Kernel Preferred work group size multiple: 64
Error correction support: 0
Unified memory for Host and Device: 0
Profiling timer resolution: 1
Device endianess: Little
Available: Yes
Compiler available: Yes
Execution capabilities:
Execute OpenCL kernels: Yes
Execute native function: No
Queue on Host properties:
Out-of-Order: No
Profiling : Yes
Queue on Device properties:
Out-of-Order: Yes
Profiling : Yes
Platform ID: 0x2b835a7f4d30
Name: gfx906+sram-ecc
Vendor: Advanced Micro Devices, Inc.
Device OpenCL C version: OpenCL C 2.0
Driver version: 2982.0 (HSA1.1,LC)
Profile: FULL_PROFILE
Version: OpenCL 2.0
Extensions: cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_depth_images cl_amd_copy_buffer_p2p cl_amd_assembly_program
The repo currently says: "Neptune is specialized to the BLS12-381 curve. Although the API allows for type specialization to other fields, the round numbers, constants, and s-box selection may not be correct. Do not do this."
Can we verify if the same set of constants would work for both curves in the Pasta cycle?
I recently used a 3090 to power my P2, but got this error:
Graphics SM warp Exception on GPc 1, TPc 0,
Graphics Exception Out of Range Addr
Then some of my P2 tasks fail with an invalid vanilla proof or a CID mismatch.
The tree_c and tree_r_last are the same!
I use Ubuntu 18.04 + GPU driver 460.92.
This is a tracking issue for making Neptune work with pasta_curves.
This issue is meant as a reminder and not as an immediate action item. There are new releases of ff and group. Upgrading to those is a breaking change, as they contain traits and you cannot really have the traits of two different versions in your dependency tree.
I propose postponing the upgrade until a new breaking release is needed for other reasons; the upgrade could then be combined with such a release.
The upgrade is not straightforward, as all other dependencies using those traits, e.g. bellperson, would also need to be updated.
The upgrade would enable an upgrade of pasta_curves as well. The most recent release, v0.5.0, should contain everything our current fork contains; once upgraded, we won't need to rely on a fork anymore.
There's a GitHub action that checks whether a rebase is needed. It posts a comment every hour.
I'm currently watching this repo, and it would be great if there weren't so many events when nothing really changes. I propose using something like https://github.com/peter-evans/find-comment to check whether the comment already exists and, if it does, skip posting another one.
Thanks for the wonderful work!
It seems there are a few (potential) mismatches between round_numbers.rs and the Poseidon paper. Is there any reason for this mismatch? More specifically:
let rf_interp = 0.43 * m + t.log2() - rp;
In the Poseidon paper, Eq. (3) requires log_5(t). For BLS12-381 with M = 128, we should use something like log_5(t) instead of t.log2().
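As a quick numeric sanity check of the suggested log base 5 term, here is a sketch (the function name and values are illustrative, not neptune's actual code; it shows only the reporter's proposed substitution):

```rust
// Sketch of the suggested correction: use log base 5 of t rather than log2(t).
// `m` is the security level M in bits, `t` the state width, `rp` the number of
// partial rounds. Names mirror the rf_interp snippet above; values are illustrative.
fn rf_interp_suggested(m: f64, t: f64, rp: f64) -> f64 {
    0.43 * m + t.log(5.0) - rp
}

fn main() {
    // For M = 128 and t = 5, log_5(t) = 1, so the bound is roughly 0.43 * 128 + 1.
    let v = rf_interp_suggested(128.0, 5.0, 0.0);
    println!("rf_interp with log_5: {v}");
}
```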
The documentation says that w is the first column of m_hat, or the second column of m without the first row.
Actually, it is the first column of m without the first row.
The implementation is ok: https://github.com/filecoin-project/neptune/blob/bafd77a5014e3b6a6b40359097835c3eb1dd533f/src/mds.rs#L197
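The corrected statement can be illustrated with a toy helper (hypothetical function; real MDS matrices hold field elements, not u64):

```rust
// Toy sketch: w = the first column of m, with the first row dropped.
// Hypothetical helper for illustration only.
fn w_from_m(m: &[Vec<u64>]) -> Vec<u64> {
    m.iter().skip(1).map(|row| row[0]).collect()
}

fn main() {
    let m = vec![vec![1, 2, 3], vec![4, 5, 6], vec![7, 8, 9]];
    // First column is [1, 4, 7]; dropping the first row leaves [4, 7].
    assert_eq!(w_from_m(&m), vec![4, 7]);
}
```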
Hi, could you please suggest the correct way to build this lib for wasm? I tried
cargo build --target wasm32-unknown-unknown
and also with
cargo build --target wasm32-unknown-unknown --features "wasm"
but was getting compile errors:
MmapInner::map(self.get_len(file)?, file, self.offset).map(|inner| Mmap { inner: inner })
| ^^^^^^^^^ use of undeclared type MmapInner
Thanks.
Hi - Nova's public parameters reference Neptune's Poseidon hash constants. We would like to serialize/deserialize Nova's public parameters using serde. What do you all think about adding a serde derive as an optional feature so that we can serialize/deserialize? It gets a bit tricky when it comes to serializing/deserializing UInt from typenum, so I was just wondering whether anyone had any thoughts on that. Thanks.
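One possible shape for such an optional feature, as a sketch (the Cargo.toml lines and the struct are illustrative, not neptune's actual definitions): the derives only apply when the feature is enabled, so default builds are unchanged.

```rust
// Sketch: serde derives gated behind an optional Cargo feature.
// Hypothetical Cargo.toml additions (illustrative only):
//   [dependencies]
//   serde = { version = "1", features = ["derive"], optional = true }
//   [features]
//   serde = ["dep:serde"]

#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
#[derive(Debug, Clone, PartialEq)]
pub struct ConstantsSketch {
    // Placeholder fields; the typenum UInt arity parameter is the tricky part
    // and would likely need a custom (de)serialize impl or be erased here.
    pub full_rounds: usize,
    pub partial_rounds: usize,
}

fn main() {
    let c = ConstantsSketch { full_rounds: 8, partial_rounds: 55 };
    assert_eq!(c.full_rounds, 8);
}
```

With the feature disabled, the cfg_attr is a no-op and the crate compiles without a serde dependency.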
This allows Neptune to be cited properly in publications, see:
https://citation-file-format.github.io/
The Rust version specified in rust-toolchain.toml (1.75.0) is out of date with the latest stable (1.76.0).
Check the rust version check workflow for details.
This issue was raised by the workflow at https://github.com/lurk-lab/neptune/actions/runs/7879730700/workflow.
Sorry, I want to know how to use it. I have 2 GPUs.
I don't understand Rust and Cargo, so I want to know the execution command.
Some dependencies specified in Cargo.toml are not needed.
Check the unused dependencies sanity check workflow for details.
This issue was raised by the workflow at https://github.com/lurk-lab/ci-workflows/tree/main/.github/workflows/unused-deps.yml.
Note: if this is a false positive, please refer to the cargo-udeps docs on how to ignore the dependencies.
I would like to propose the implementation of the Poseidon2 hash function in the Neptune repository. This recent advancement enhances the efficiency of the Poseidon hash function, specifically tailored for zero-knowledge applications.
Referencing the research paper and the explanatory note provided by the authors, Poseidon2 enhances performance by focusing on its linear layers and round constant addition. This new design requires only a short chain of additions for computation, significantly reducing the number of multiplications and reductions.
Given these improvements, Poseidon2 can offer a performance boost of up to a factor of 4 compared to the original Poseidon, without any increase in the number of rounds or other disadvantages. The reference implementation provided by HorizenLabs may be useful for this implementation.
Considering the focus of the Neptune repository on the Poseidon hash function, I believe that including Poseidon2 would greatly enhance its performance and efficiency.
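To illustrate the addition-chain flavor of the design, here is a hedged sketch of the 4x4 external matrix M_4 from the Poseidon2 paper, applied over plain integers rather than field elements (so no modular reduction is shown):

```rust
// Sketch of Poseidon2's external 4x4 matrix M_4 applied via the short
// addition chain from the paper; u64 stands in for field elements, so no
// modular reduction is performed.
fn m4_mul(x: [u64; 4]) -> [u64; 4] {
    let t0 = x[0] + x[1];
    let t1 = x[2] + x[3];
    let t2 = 2 * x[1] + t1;
    let t3 = 2 * x[3] + t0;
    let t4 = 4 * t1 + t3;
    let t5 = 4 * t0 + t2;
    let t6 = t3 + t5;
    let t7 = t2 + t4;
    [t6, t5, t7, t4]
}

fn main() {
    // Matches the matrix [[5,7,1,3],[4,6,1,1],[1,3,5,7],[1,1,4,6]] times x.
    assert_eq!(m4_mul([1, 2, 3, 4]), [34, 23, 50, 39]);
}
```

Each output needs only a handful of additions and doublings, which is the source of the claimed speedup over a dense MDS multiplication.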
Neptune currently has an implementation called Triton which is implemented in Futhark. There is now an OpenCL/CUDA implementation called Proteus, with better performance. I propose removing Triton to lower the maintenance cost of this library.
The Rust version specified in rust-toolchain.toml (1.76.0) is out of date with the latest stable (1.81.0).
Check the rust version check workflow for details.
This issue was raised by the workflow at https://github.com/argumentcomputer/neptune/actions/runs/10746951111/workflow.
Neptune should also support running on CUDA.
There are broken dependencies in neptune that are causing build issues in https://github.com/filecoin-project/rust-filecoin-proofs-api
From neptune:
$ cargo update
Updating crates.io index
error: failed to select a version for the requirement `rustc_version = "^0.1"`
candidate versions found which didn't match: 0.3.3, 0.3.2, 0.3.1, ...
location searched: crates.io index
required by package `fil-ocl-core v0.11.3`
... which is depended on by `fil-ocl v0.19.4`
... which is depended on by `rust-gpu-tools v0.3.0`
... which is depended on by `gbench v0.5.4 (...../neptune/gbench)`
From rust-filecoin-proofs-api:
$ cargo update
Updating crates.io index
error: failed to select a version for the requirement `rustc_version = "^0.1"`
candidate versions found which didn't match: 0.3.3, 0.3.2, 0.3.1, ...
location searched: crates.io index
required by package `fil-ocl-core v0.11.3`
... which is depended on by `fil-ocl v0.19.4`
... which is depended on by `rust-gpu-tools v0.2.0`
... which is depended on by `bellperson v0.12.3`
... which is depended on by `filecoin-proofs-api v6.0.0 (...../rust-filecoin-proofs-api)`
Our current CI triggers on push to dev as well as merge groups:
https://github.com/lurk-lab/neptune/blob/9a6c931d158ebbfeb2a301f45d637642d65f0779/.github/workflows/check-downstream-compiles.yml
https://github.com/lurk-lab/neptune/blob/d8b4eeadd8acc9d9e8d9d510605c954f1410aa60/.github/workflows/rust.yml#L4-L9
The check-downstream-compiles job is meant only as a warning, and so is useless on merge_group or push.
Currently, neptune expects that the number of full rounds R_F is an even number, as evidenced by the number of full rounds in the first and second halves being the same: R_f = floor(R_F / 2).
All three Poseidon implementations (static, correct, and dynamic) use R_f
as the number of first and second half full rounds, which is correct only when R_F
is even (currently R_F = 8
for all Filecoin applications). However, when R_F
is odd, the number of second half full rounds should be R_F - R_f
(so you don't lose the last full round of the second half).
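The even/odd split described above can be sketched as follows (hypothetical helper, not neptune's actual code; names loosely follow the issue):

```rust
// Sketch: splitting R_F full rounds into first/second halves so that an odd
// R_F does not lose a round. Hypothetical helper for illustration.
fn split_full_rounds(full_rounds: usize) -> (usize, usize) {
    let first_half = full_rounds / 2;            // R_f = floor(R_F / 2)
    let second_half = full_rounds - first_half;  // R_F - R_f, larger when R_F is odd
    (first_half, second_half)
}

fn main() {
    assert_eq!(split_full_rounds(8), (4, 4)); // even: halves match
    assert_eq!(split_full_rounds(9), (4, 5)); // odd: second half keeps the extra round
}
```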
self.constants.full_rounds - self.constants.half_full_rounds.
unimplemented! panic.
The formula is a bit wrong, I think: it should be identifier * 2^40 + strength * 2^32.
Originally posted by @Kubuxu in #116 (comment)
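A minimal sketch of the corrected formula (hypothetical function name; the 2^40 and 2^32 weights are taken from the comment above):

```rust
// Sketch of the corrected tag formula: identifier * 2^40 + strength * 2^32.
// Hypothetical helper name, for illustration only.
fn domain_tag(identifier: u64, strength: u64) -> u64 {
    identifier * (1u64 << 40) + strength * (1u64 << 32)
}

fn main() {
    assert_eq!(domain_tag(1, 0), 1 << 40);
    assert_eq!(domain_tag(0, 1), 1 << 32);
    assert_eq!(domain_tag(1, 1), (1u64 << 40) + (1u64 << 32));
}
```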
Issue for tracking benchmarks over time.
We currently have two tree builders, TreeBuilder and ColumnTreeBuilder, each with its own trait. AFAICT, those are the only implementations of those traits, hence I wonder what the purpose of the ColumnTreeBuilderTrait and TreeBuilderTrait traits is.
I propose removing those traits and moving the implementation directly into the structs. Benefits:
Downsides:
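The proposal amounts to replacing the trait indirection with inherent methods, roughly like this (a toy sketch, not the real API):

```rust
// Before (toy version): a trait with exactly one implementor.
trait TreeBuilderLike {
    fn add_leaves(&mut self, leaves: &[u64]);
}

struct TreeBuilder {
    leaves: Vec<u64>,
}

impl TreeBuilderLike for TreeBuilder {
    fn add_leaves(&mut self, leaves: &[u64]) {
        self.leaves.extend_from_slice(leaves);
    }
}

// After: the same method as an inherent impl; callers no longer import a trait.
impl TreeBuilder {
    fn add_leaves_inherent(&mut self, leaves: &[u64]) {
        self.leaves.extend_from_slice(leaves);
    }
}

fn main() {
    let mut b = TreeBuilder { leaves: Vec::new() };
    TreeBuilderLike::add_leaves(&mut b, &[1, 2]);
    b.add_leaves_inherent(&[3]);
    assert_eq!(b.leaves, vec![1, 2, 3]);
}
```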
I have a machine with an NVIDIA and an AMD GPU in it and I need a flag to tell Neptune which of them to use.
The recent update of the Poseidon article adds additional requirements on MDS matrix security (see p. 7). Any idea whether a randomly sampled Cauchy matrix over a large field is still safe?
Some dependencies specified in Cargo.toml are not needed.
Check the unused dependencies sanity check workflow for details.
This issue was raised by the workflow at https://github.com/argumentcomputer/ci-workflows/tree/main/.github/workflows/unused-deps.yml.
Note: if this is a false positive, please refer to the cargo-udeps docs on how to ignore the dependencies.