dimforge / simba

Set of mathematical traits to facilitate the use of SIMD-based AoSoA (Array of Struct of Array) storage pattern.

License: Apache License 2.0

Rust 100.00%

simba's Issues

Subtract with overflow for integers

There is currently no support for core::num::Wrapping. If an integer operation overflows while debug assertions are enabled (#[cfg(debug_assertions)]), the program panics. It would be nice if this library supported all of its arithmetic for AutoSimd<[Wrapping<T>]> for all T = i8/i16/i32/i64/usize/isize/u8/u16/u32/u64, just as it does for T itself. Wrapping seems to be the only overflow mode in the proposal for std::simd::Simd. It would be very nice to have it in simba!

let x = AutoSimd([1, 2]);
println!("{}", x * x); // works
let x = AutoSimd([Wrapping(1), Wrapping(2)]);
println!("{}", x * x); // doesn't compile yet
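
To make the requested semantics concrete, here is a minimal std-only sketch of lane-wise wrapping multiplication on a plain array. The helper name wrapping_mul_lanes is hypothetical and not part of simba; a plain array stands in for AutoSimd.

```rust
use std::num::Wrapping;

/// Multiplies two fixed-size arrays lane by lane with wrapping semantics.
/// `wrapping_mul_lanes` is a hypothetical helper, not part of simba.
fn wrapping_mul_lanes<const N: usize>(
    a: [Wrapping<i32>; N],
    b: [Wrapping<i32>; N],
) -> [Wrapping<i32>; N] {
    // `Wrapping<i32>` implements `Mul`, so an overflowing product wraps
    // around instead of panicking, even with overflow checks enabled.
    std::array::from_fn(|i| a[i] * b[i])
}
```

Calling it with [Wrapping(i32::MAX), Wrapping(2)] for both arguments wraps the first lane instead of panicking, which is the behaviour the issue asks AutoSimd to expose.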

Update Crates.io version

I'm opening this issue to let the maintainers know that the version on GitHub does not match the version on crates.io.

Hopefully this is just a quick update and can be fixed quite simply.

I found and referenced this #274 as it is causing problems in a different area.

Add serde support for fixed-point numbers

Implement serde's Serialize and Deserialize for fixed-point numbers.

Inconsistency between f32 and f64 in convertibility to RealField

The following function compiles:

fn my_convert<T: simba::scalar::RealField>(value: f64) -> T {
    T::from_subset(&value)
}

However, the following function does not compile (replacing f64 with f32):

fn my_convert<T: simba::scalar::RealField>(value: f32) -> T {
    T::from_subset(&value)
}

The non-compiling function results in error[E0308]: mismatched types with expected `&f64`, found `&f32` as the error message.

When looking for a root cause, I noticed that complex.rs's trait definition for ComplexField has SupersetOf<f64> as one of its trait bounds but not SupersetOf<f32>. If I modify this code myself, I am able to get the f32 version of my_convert to work.

I was able to reproduce this issue in the master branch as of 2024-02-24 (commit 9d95d6d)
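
Until the bound changes, one possible workaround is to widen the f32 to f64 first, since that conversion is lossless. The sketch below uses local stand-in traits named after simba's SupersetOf and RealField (reduced to what the example needs, so they are assumptions rather than the real definitions) in order to stay self-contained.

```rust
/// Stand-in for simba's `SupersetOf`, reduced to the one method used here.
trait SupersetOf<T>: Sized {
    fn from_subset(element: &T) -> Self;
}

/// Stand-in for `RealField`, bounded by `SupersetOf<f64>` only,
/// mirroring the bound described in the issue.
trait RealField: SupersetOf<f64> {}

impl SupersetOf<f64> for f64 {
    fn from_subset(element: &f64) -> Self {
        *element
    }
}

impl SupersetOf<f64> for f32 {
    fn from_subset(element: &f64) -> Self {
        *element as f32
    }
}

impl RealField for f64 {}
impl RealField for f32 {}

/// Converts an `f32` by widening it to `f64` first, so only the
/// `SupersetOf<f64>` bound is required.
fn my_convert_f32<T: RealField>(value: f32) -> T {
    T::from_subset(&f64::from(value))
}
```

With the real crate, the body T::from_subset(&f64::from(value)) should type-check for the same reason the f64 version of my_convert does.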

Add more methods from `num_traits::Float`

There are a couple of occasions where I regularly have to add an additional num_traits::Float bound.

Methods I need from num_traits::Float include the following:

min_positive_value
is_nan
nan
infinity and neg_infinity
max and min with the NaN semantics
copysign

Is RealField suitable for these methods, or is it intended to be more abstract, i.e., not necessarily representing these floating-point semantics?

Build fails with any sanitizer

When building any project that depends on nalgebra with a nightly toolchain and a sanitizer enabled, you get a huge number of build errors.

Command I used to build

export RUSTFLAGS=-Zsanitizer=address RUSTDOCFLAGS=-Zsanitizer=address
cargo +nightly build

The same thing happens with -Zsanitizer=thread.

Toolchain version

rustc 1.66.0-nightly (4a1467723 2022-09-23)

OS/Hardware

Kubuntu 22.04
16 x Intel Core i7-10870 CPU @ 2.20GHz
32 GB RAM
Dual GPU Intel UHD Graphics + NVIDIA RTX 3060 Laptop GPU

Log

The log is too large to include in full, so I only added the first 168 lines; the remaining lines are very similar.

Compiling simba v0.7.0
error: /home/yuri/Projects/threaded-bilateral-filter/target/debug/deps/libpaste-4d6a3c29c1b84e5f.so: undefined symbol: __asan_option_detect_stack_use_after_return
  --> /home/yuri/.cargo/registry/src/github.com-1ecc6299db9ec823/simba-0.7.0/src/scalar/complex.rs:23:9
   |
23 |         paste::item! {
   |         ^^^^^

error: cannot determine resolution for the macro `paste::item`
  --> /home/yuri/.cargo/registry/src/github.com-1ecc6299db9ec823/simba-0.7.0/src/scalar/complex.rs:23:9
   |
23 |         paste::item! {
   |         ^^^^^^^^^^^
   |
  ::: /home/yuri/.cargo/registry/src/github.com-1ecc6299db9ec823/simba-0.7.0/src/simd/simd_complex.rs:33:5
   |
33 |     complex_trait_methods!(SimdRealField, simd_);
   |     -------------------------------------------- in this macro invocation
   |
   = note: import resolution is stuck, try simplifying macro imports
   = note: this error originates in the macro `complex_trait_methods` (in Nightly builds, run with -Z macro-backtrace for more info)

error: cannot determine resolution for the macro `paste::item`
   --> /home/yuri/.cargo/registry/src/github.com-1ecc6299db9ec823/simba-0.7.0/src/scalar/complex.rs:23:9
    |
23  |         paste::item! {
    |         ^^^^^^^^^^^
...
186 |     complex_trait_methods!(RealField);
    |     --------------------------------- in this macro invocation
    |
    = note: import resolution is stuck, try simplifying macro imports
    = note: this error originates in the macro `complex_trait_methods` (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0407]: method `from_real` is not a member of trait `ComplexField`
   --> /home/yuri/.cargo/registry/src/github.com-1ecc6299db9ec823/simba-0.7.0/src/scalar/complex.rs:199:13
    |
199 | /             fn from_real(re: Self::RealField) -> Self {
200 | |                 re
201 | |             }
    | |_____________^ not a member of trait `ComplexField`
...
491 | / impl_complex!(
492 | |     f32,f32,f32;
493 | |     f64,f64,f64
494 | | );
    | |_- in this macro invocation
    |
    = note: this error originates in the macro `impl_complex` (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0407]: method `real` is not a member of trait `ComplexField`
   --> /home/yuri/.cargo/registry/src/github.com-1ecc6299db9ec823/simba-0.7.0/src/scalar/complex.rs:204:13
    |
204 | /             fn real(self) -> Self::RealField {
205 | |                 self
206 | |             }
    | |_____________^ not a member of trait `ComplexField`
...
491 | / impl_complex!(
492 | |     f32,f32,f32;
493 | |     f64,f64,f64
494 | | );
    | |_- in this macro invocation
    |
    = note: this error originates in the macro `impl_complex` (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0407]: method `imaginary` is not a member of trait `ComplexField`
   --> /home/yuri/.cargo/registry/src/github.com-1ecc6299db9ec823/simba-0.7.0/src/scalar/complex.rs:209:13
    |
209 | /             fn imaginary(self) -> Self::RealField {
210 | |                 Self::zero()
211 | |             }
    | |_____________^ not a member of trait `ComplexField`
...
491 | / impl_complex!(
492 | |     f32,f32,f32;
493 | |     f64,f64,f64
494 | | );
    | |_- in this macro invocation
    |
    = note: this error originates in the macro `impl_complex` (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0407]: method `norm1` is not a member of trait `ComplexField`
   --> /home/yuri/.cargo/registry/src/github.com-1ecc6299db9ec823/simba-0.7.0/src/scalar/complex.rs:214:13
    |
214 | /             fn norm1(self) -> Self::RealField {
215 | |                 $libm::abs(self)
216 | |             }
    | |_____________^ not a member of trait `ComplexField`
...
491 | / impl_complex!(
492 | |     f32,f32,f32;
493 | |     f64,f64,f64
494 | | );
    | |_- in this macro invocation
    |
    = note: this error originates in the macro `impl_complex` (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0407]: method `modulus` is not a member of trait `ComplexField`
   --> /home/yuri/.cargo/registry/src/github.com-1ecc6299db9ec823/simba-0.7.0/src/scalar/complex.rs:219:13
    |
219 | /             fn modulus(self) -> Self::RealField {
220 | |                 $libm::abs(self)
221 | |             }
    | |_____________^ not a member of trait `ComplexField`
...
491 | / impl_complex!(
492 | |     f32,f32,f32;
493 | |     f64,f64,f64
494 | | );
    | |_- in this macro invocation
    |
    = note: this error originates in the macro `impl_complex` (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0407]: method `modulus_squared` is not a member of trait `ComplexField`
   --> /home/yuri/.cargo/registry/src/github.com-1ecc6299db9ec823/simba-0.7.0/src/scalar/complex.rs:224:13
    |
224 | /             fn modulus_squared(self) -> Self::RealField {
225 | |                 self * self
226 | |             }
    | |_____________^ not a member of trait `ComplexField`
...
491 | / impl_complex!(
492 | |     f32,f32,f32;
493 | |     f64,f64,f64
494 | | );
    | |_- in this macro invocation
    |
    = note: this error originates in the macro `impl_complex` (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0407]: method `argument` is not a member of trait `ComplexField`
   --> /home/yuri/.cargo/registry/src/github.com-1ecc6299db9ec823/simba-0.7.0/src/scalar/complex.rs:229:13
    |
229 | /             fn argument(self) -> Self::RealField {
230 | |                 if self >= Self::zero() {
231 | |                     Self::zero()
232 | |                 } else {
233 | |                     Self::pi()
234 | |                 }
235 | |             }
    | |_____________^ not a member of trait `ComplexField`
...
491 | / impl_complex!(
492 | |     f32,f32,f32;
493 | |     f64,f64,f64
494 | | );
    | |_- in this macro invocation
    |
    = note: this error originates in the macro `impl_complex` (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0407]: method `to_exp` is not a member of trait `ComplexField`
   --> /home/yuri/.cargo/registry/src/github.com-1ecc6299db9ec823/simba-0.7.0/src/scalar/complex.rs:238:13
    |
238 | /             fn to_exp(self) -> (Self, Self) {
239 | |                 if self >= Self::zero() {
240 | |                     (self, Self::one())
241 | |                 } else {
242 | |                     (-self, -Self::one())
243 | |                 }
244 | |             }
    | |_____________^ not a member of trait `ComplexField`
...
491 | / impl_complex!(
492 | |     f32,f32,f32;
493 | |     f64,f64,f64
494 | | );
    | |_- in this macro invocation
    |
    = note: this error originates in the macro `impl_complex` (in Nightly builds, run with -Z macro-backtrace for more info)

Inconsistent behaviour of `SimdPartialOrd::simd_min` and `simd_max` when values are not comparable

When values are not comparable, these methods always return the second value.

An example of this is calling simd_min on a valid f32 and f32::NAN:

let number = 1.0f32;
let nan = f32::NAN;
println!("{}", number.simd_min(nan)); // Prints "NaN"
println!("{}", nan.simd_min(number)); // Prints "1"

This happens because the current implementation of simd_min is the following:

fn simd_min(self, other: Self) -> Self {
    if self <= other {
        self
    } else {
        other
    }
}

and a <= (or any comparison operator in general) between two instances of a PartialOrd type for which a.partial_cmp(&b) returns None evaluates to false.
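
The order dependence can be reproduced without simba. The sketch below copies the <=-based logic quoted above into a free function and contrasts it with f32::min from the standard library, which returns the non-NaN operand regardless of argument order. The function names are illustrative only, and f32::min is shown as one possible semantics, not as what simba should necessarily adopt.

```rust
/// Mirrors the `<=`-based implementation quoted above.
fn naive_min(a: f32, b: f32) -> f32 {
    // Any comparison involving NaN is false, so the second operand wins.
    if a <= b {
        a
    } else {
        b
    }
}

/// Order-independent alternative: `f32::min` returns the other operand
/// when exactly one of the two is NaN.
fn nan_ignoring_min(a: f32, b: f32) -> f32 {
    a.min(b)
}
```

Here naive_min(1.0, NAN) is NaN while naive_min(NAN, 1.0) is 1.0, whereas nan_ignoring_min gives 1.0 in both orders.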

Implement `Distribution` from the `rand` crate for more types

Currently there is a feature for the optional dependency rand in the simba crate. However, only the scalar types implement the traits from rand.

I don't see a reason why the Distribution trait can't be implemented with the SIMD types for Standard, such as AutoSimd. The implementation would simply generate a random value for each of the lanes of the SIMD vector. This is pretty easy to implement since impls already exist for every type used as a SIMD element.

Implementing this would allow better integration with the rand crate which is already an optional dependency.
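
A rough sketch of the per-lane idea, kept std-only: Lanes is a hypothetical stand-in for a SIMD wrapper such as AutoSimd, and XorShift32 is a toy generator standing in for a rand RNG. An actual implementation would go through rand's Distribution trait instead of a closure.

```rust
/// Hypothetical stand-in for a SIMD wrapper such as `AutoSimd<[T; N]>`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Lanes<T, const N: usize>([T; N]);

/// Toy xorshift generator standing in for a `rand` RNG (seed must be non-zero).
struct XorShift32(u32);

impl XorShift32 {
    fn next_u32(&mut self) -> u32 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        self.0 = x;
        x
    }

    /// Uniform `f32` in [0, 1), built from the top 24 bits.
    fn next_f32(&mut self) -> f32 {
        (self.next_u32() >> 8) as f32 / (1u32 << 24) as f32
    }
}

/// Fills every lane with an independent sample from the scalar sampler.
fn sample_lanes<T, const N: usize>(mut sample_scalar: impl FnMut() -> T) -> Lanes<T, N> {
    Lanes(std::array::from_fn(|_| sample_scalar()))
}
```

For example, sample_lanes::<f32, 4>(|| rng.next_f32()) draws four scalar samples and packs them into one value, which is all the proposed impl would need to do per SIMD type.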

Cannot build crate (version 0.7.1, feature libm)

I'm currently unable to build nalgebra on an embedded target (thumbv6m-none-eabi) due to simba failing to build.

Dependencies in Cargo.toml:

[dependencies]
simba = { version = "0.7.1", features = ["libm"], default-features = false }

Error:

   Updating crates.io index
  Downloaded approx v0.5.1
  Downloaded nb v1.0.0
  Downloaded void v1.0.2
  Downloaded simba v0.7.1
  Downloaded paste v1.0.7
  Downloaded autocfg v1.1.0
  Downloaded num-complex v0.4.1
  Downloaded num-traits v0.2.15
  Downloaded libm v0.2.2
  Downloaded nb v0.1.3
  Downloaded embedded-hal v0.2.7
  Downloaded 11 crates (342.2 KB) in 0.35s
   Compiling libm v0.2.2
   Compiling autocfg v1.1.0
    Checking nb v1.0.0
    Checking void v1.0.2
   Compiling paste v1.0.7
    Checking nb v0.1.3
    Checking embedded-hal v0.2.7
   Compiling num-traits v0.2.15
    Checking approx v0.5.1
    Checking num-complex v0.4.1
    Checking simba v0.7.1
error[E0034]: multiple applicable items in scope
   --> /home/axel/.cargo/registry/src/github.com-1ecc6299db9ec823/simba-0.7.1/src/scalar/real.rs:82:30
    |
82  |                 $cpysgn_mod::copysign(self, sign)
    |                              ^^^^^^^^ multiple `copysign` found
...
225 | impl_real!(f32, f32, f32, Float; f64, f64, f64, Float);
    | ------------------------------------------------------ in this macro invocation
    |
    = note: candidate #1 is defined in an impl of the trait `Float` for the type `f32`
note: candidate #2 is defined in an impl of the trait `RealField` for the type `f32`
   --> /home/axel/.cargo/registry/src/github.com-1ecc6299db9ec823/simba-0.7.1/src/scalar/real.rs:81:13
    |
81  |             fn copysign(self, sign: Self) -> Self {
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...
225 | impl_real!(f32, f32, f32, Float; f64, f64, f64, Float);
    | ------------------------------------------------------ in this macro invocation
    = note: this error originates in the macro `impl_real` (in Nightly builds, run with -Z macro-backtrace for more info)
help: disambiguate the associated function for candidate #1
    |
82  |                 <f32 as Float>::copysign(self, sign)
    |                 ~~~~~~~~~~~~~~~~
help: disambiguate the associated function for candidate #2
    |
82  |                 <f32 as RealField>::copysign(self, sign)
    |                 ~~~~~~~~~~~~~~~~~~~~

error[E0034]: multiple applicable items in scope
   --> /home/axel/.cargo/registry/src/github.com-1ecc6299db9ec823/simba-0.7.1/src/scalar/real.rs:82:30
    |
82  |                 $cpysgn_mod::copysign(self, sign)
    |                              ^^^^^^^^ multiple `copysign` found
...
225 | impl_real!(f32, f32, f32, Float; f64, f64, f64, Float);
    | ------------------------------------------------------ in this macro invocation
    |
    = note: candidate #1 is defined in an impl of the trait `Float` for the type `f64`
note: candidate #2 is defined in an impl of the trait `RealField` for the type `f64`
   --> /home/axel/.cargo/registry/src/github.com-1ecc6299db9ec823/simba-0.7.1/src/scalar/real.rs:81:13
    |
81  |             fn copysign(self, sign: Self) -> Self {
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...
225 | impl_real!(f32, f32, f32, Float; f64, f64, f64, Float);
    | ------------------------------------------------------ in this macro invocation
    = note: this error originates in the macro `impl_real` (in Nightly builds, run with -Z macro-backtrace for more info)
help: disambiguate the associated function for candidate #1
    |
82  |                 <f64 as Float>::copysign(self, sign)
    |                 ~~~~~~~~~~~~~~~~
help: disambiguate the associated function for candidate #2
    |
82  |                 <f64 as RealField>::copysign(self, sign)
    |                 ~~~~~~~~~~~~~~~~~~~~

For more information about this error, try `rustc --explain E0034`.
error: could not compile `simba` due to 2 previous errors

I can get the crate to build by modifying the impl_real! macro call to use Float as $cpysgn_mod
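
The E0034 ambiguity and its fix can be shown in isolation. In this sketch, Float and RealField are local stand-ins reduced to a single method (not the real num_traits or simba traits), and Scalar is a hypothetical wrapper used so that no inherent copysign method hides the ambiguity.

```rust
/// Stand-in for `num_traits::Float`, reduced to one method.
trait Float {
    fn copysign(self, sign: Self) -> Self;
}

/// Stand-in for simba's `RealField`, reduced to one method.
trait RealField {
    fn copysign(self, sign: Self) -> Self;
}

/// Local scalar type without an inherent `copysign`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Scalar(f32);

impl Float for Scalar {
    fn copysign(self, sign: Self) -> Self {
        Scalar(self.0.copysign(sign.0))
    }
}

impl RealField for Scalar {
    fn copysign(self, sign: Self) -> Self {
        // Writing `Scalar::copysign(self, sign)` here would be rejected with
        // E0034, because both traits are in scope. Naming the trait
        // explicitly selects one candidate.
        <Scalar as Float>::copysign(self, sign)
    }
}
```

This is the same disambiguation the compiler suggests in the help notes above: the call inside the RealField impl names the Float trait explicitly.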

Your user guide and documentation links are wrong

Your documentation and user guide links point to simba.org, which I'm quite sure is not intended, as you can see if you follow the links.

Missed optimization opportunity in Simd::new

I'm trying to initialize a Simd<f32x4> value from memory, but the safe way of doing so compiles to a function call.

On simba 0.2.0, using the following Rust code:

#[inline(never)]
pub fn exact_chonker_1(xs: &[f32]) -> simd::f32x4 {
    let mut acc = simd::f32x4::splat(1.0);
    let mut chunk_iter = xs.chunks_exact(4);

    for chunk in &mut chunk_iter {
        acc *= simd::f32x4::new(chunk[0], chunk[1], chunk[2], chunk[3]);
    }

    let mut r = simd::f32x4::splat(9999.0);
    let mut mask = simd::m32x4::splat(false);

    for (ix, item) in chunk_iter.remainder().iter().enumerate() {
        r.replace(ix, *item);
        mask.replace(ix, true);
    }

    acc * r.select(mask, simd::f32x4::splat(0.0))
}

#[inline(never)]
pub fn exact_chonker(xs: &[f32]) -> simd::f32x4 {
    let mut acc = simd::f32x4::splat(1.0);
    let mut chunk_iter = xs.chunks_exact(4);

    for chunk in &mut chunk_iter {
        acc *= simd::f32x4::from(*unsafe { std::mem::transmute::<*const f32, &[f32; 4]>(chunk.as_ptr()) });
    }

    let mut r = simd::f32x4::splat(9999.0);
    let mut mask = simd::m32x4::splat(false);

    for (ix, item) in chunk_iter.remainder().iter().enumerate() {
        r.replace(ix, *item);
        mask.replace(ix, true);
    }

    acc * r.select(mask, simd::f32x4::splat(0.0))
}

Analyzing the assembly, I'm seeing the initialization for f32x4::new() work like this:

 mov     r14, qword ptr [rip + _ZN5simba4simd16packed_simd_impl61Simd$LT$packed_simd..Simd$LT$$u5b$f32$u3b$$u20$4$u5d$$GT$$GT$3new17h20fdaa67639b0938E@GOTPCREL]
 xor     ebp, ebp
 lea     r12, [rsp + 64]
.LBB10_4:
 vmovaps xmmword ptr [rsp + 48], xmm3
 vmovd   xmm0, dword ptr [r13 + 4*rbp]
 vmovd   xmm1, dword ptr [r13 + 4*rbp + 4]
 vmovss  xmm2, dword ptr [r13 + 4*rbp + 8]
 vmovss  xmm3, dword ptr [r13 + 4*rbp + 12]
 mov     rdi, r12
 call    r14
 vmovaps xmm3, xmmword ptr [rsp + 48]

So the compiler seems to really want to call this Simd::new method, despite not using the results (xmm0-2 are not used as source operands later).

With the 2nd variant of the code, as far as I can tell the first loop just translates to:

.LBB11_4:
 vmulps  xmm0, xmm0, xmmword ptr [rsi + 4*rcx]
 add     rcx, 4
 cmp     rdx, rcx
 jne     .LBB11_4

Which is so much cleaner.

Is there anything you can think of that is inhibiting this optimization for ::new()? Also, is there a better way to initialize from packed memory in safe Rust?
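
On the last question, one safe option is to convert each chunk into a fixed-size array with TryFrom instead of transmuting the pointer. The sketch below is std-only and uses a plain [f32; 4] in place of simd::f32x4; the function name is illustrative, and whether this removes the call in the generated assembly has not been measured here.

```rust
use std::convert::TryFrom;

/// Multiplies all complete 4-wide chunks of `xs` together, lane by lane.
/// A plain `[f32; 4]` stands in for `simd::f32x4`.
fn product_of_chunks(xs: &[f32]) -> [f32; 4] {
    let mut acc = [1.0f32; 4];
    for chunk in xs.chunks_exact(4) {
        // Safe replacement for the transmute: `chunks_exact(4)` guarantees
        // the length, so this conversion cannot fail.
        let lanes = <[f32; 4]>::try_from(chunk).expect("chunk has exactly 4 elements");
        for (a, l) in acc.iter_mut().zip(lanes) {
            *a *= l;
        }
    }
    acc
}
```

With simba, the resulting array could then be passed to simd::f32x4::from, the same conversion the second variant above already uses, without any unsafe block.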

regression: "no method named `floor` found for type `f32` in the current scope"

Hi, when upgrading from 0.2.0 to 0.2.1 (via "cargo update", for example), the new auto_simd_impl module fails to compile. simba is a transitive dependency of adskalman-rs. The simba requirement is brought in through nalgebra 0.22 with default features disabled and the libm feature enabled.

Here is the relevant part of Cargo.toml:

[dependencies]
nalgebra = {version="0.22", default-features=false, features=["libm"]}

You can look at an example build failure here https://github.com/strawlab/adskalman-rs/runs/1273243793 . Here is the relevant excerpt:

2020-10-19T04:58:42.2753561Z ##[group]Run cargo build
2020-10-19T04:58:42.2754019Z cargo build
2020-10-19T04:58:42.2794948Z shell: /bin/bash -e {0}
2020-10-19T04:58:42.2795288Z ##[endgroup]
2020-10-19T04:58:49.3160390Z     Updating crates.io index
2020-10-19T04:59:04.5300477Z     Updating git repository `https://github.com/strawlab/nalgebra-rand-mvn`
2020-10-19T04:59:04.7407493Z  Downloading crates ...
2020-10-19T04:59:04.8873833Z   Downloaded libm v0.2.1
2020-10-19T04:59:04.8991012Z   Downloaded paste-impl v0.1.18
2020-10-19T04:59:04.9004006Z   Downloaded num-traits v0.2.12
2020-10-19T04:59:04.9031778Z   Downloaded typenum v1.12.0
2020-10-19T04:59:04.9058033Z   Downloaded num-integer v0.1.43
2020-10-19T04:59:04.9076730Z   Downloaded simba v0.2.1
2020-10-19T04:59:04.9110998Z   Downloaded rand_chacha v0.2.2
2020-10-19T04:59:04.9122747Z   Downloaded rand_core v0.5.1
2020-10-19T04:59:04.9140039Z   Downloaded rand v0.7.3
2020-10-19T04:59:04.9195341Z   Downloaded autocfg v1.0.1
2020-10-19T04:59:04.9209759Z   Downloaded approx v0.3.2
2020-10-19T04:59:04.9225894Z   Downloaded proc-macro-hack v0.5.18
2020-10-19T04:59:04.9243806Z   Downloaded ppv-lite86 v0.2.9
2020-10-19T04:59:04.9256085Z   Downloaded paste v0.1.18
2020-10-19T04:59:04.9281813Z   Downloaded itertools v0.9.0
2020-10-19T04:59:04.9341314Z   Downloaded num-complex v0.2.4
2020-10-19T04:59:04.9356049Z   Downloaded log v0.4.11
2020-10-19T04:59:04.9382372Z   Downloaded either v1.6.1
2020-10-19T04:59:04.9394714Z   Downloaded cfg-if v0.1.10
2020-10-19T04:59:04.9405834Z   Downloaded num-rational v0.2.4
2020-10-19T04:59:04.9421917Z   Downloaded nalgebra v0.22.1
2020-10-19T04:59:04.9605885Z   Downloaded generic-array v0.13.2
2020-10-19T04:59:04.9660030Z    Compiling autocfg v1.0.1
2020-10-19T04:59:04.9662083Z    Compiling libm v0.2.1
2020-10-19T04:59:13.0282185Z    Compiling proc-macro-hack v0.5.18
2020-10-19T04:59:14.3810280Z    Compiling typenum v1.12.0
2020-10-19T04:59:14.9738980Z    Compiling rand_core v0.5.1
2020-10-19T04:59:15.2548227Z    Compiling ppv-lite86 v0.2.9
2020-10-19T04:59:15.4731143Z    Compiling log v0.4.11
2020-10-19T04:59:15.7339117Z    Compiling either v1.6.1
2020-10-19T04:59:15.8889314Z    Compiling cfg-if v0.1.10
2020-10-19T04:59:15.9216382Z    Compiling num-traits v0.2.12
2020-10-19T04:59:15.9986437Z    Compiling num-complex v0.2.4
2020-10-19T04:59:16.1825564Z    Compiling num-integer v0.1.43
2020-10-19T04:59:16.2599663Z    Compiling num-rational v0.2.4
2020-10-19T04:59:16.4607012Z    Compiling paste-impl v0.1.18
2020-10-19T04:59:16.6719174Z    Compiling itertools v0.9.0
2020-10-19T04:59:17.4802590Z    Compiling rand_chacha v0.2.2
2020-10-19T04:59:18.7762096Z    Compiling paste v0.1.18
2020-10-19T04:59:18.8147552Z    Compiling rand v0.7.3
2020-10-19T04:59:19.6569617Z    Compiling generic-array v0.13.2
2020-10-19T04:59:20.0426133Z    Compiling approx v0.3.2
2020-10-19T04:59:20.6779437Z    Compiling simba v0.2.1
2020-10-19T04:59:24.0738517Z error[E0599]: no method named `floor` found for type `f32` in the current scope
2020-10-19T04:59:24.0741232Z     --> /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/simba-0.2.1/src/simd/auto_simd_impl.rs:822:32
2020-10-19T04:59:24.0742499Z      |
2020-10-19T04:59:24.0743108Z 822  |                   self.map(|e| e.floor())
2020-10-19T04:59:24.0743834Z      |                                  ^^^^^ method not found in `f32`
2020-10-19T04:59:24.0744443Z ...
2020-10-19T04:59:24.0745150Z 1435 | / impl_float_simd!(
2020-10-19T04:59:24.0745882Z 1436 | |     [f32; 2], f32, 2, [i32; 2], AutoBoolx2, _0, _1;
2020-10-19T04:59:24.0746550Z 1437 | |     [f32; 4], f32, 4, [i32; 4], AutoBoolx4, _0, _1, _2, _3;
2020-10-19T04:59:24.0747254Z 1438 | |     [f32; 8], f32, 8, [i32; 8], AutoBoolx8, _0, _1, _2, _3, _4, _5, _6, _7;
2020-10-19T04:59:24.0747816Z ...    |
2020-10-19T04:59:24.0748368Z 1442 | |     [f64; 8], f64, 8, [i64; 8], AutoBoolx8, _0, _1, _2, _3, _4, _5, _6, _7;
2020-10-19T04:59:24.0748927Z 1443 | | );
2020-10-19T04:59:24.0749795Z      | |__- in this macro invocation
2020-10-19T04:59:24.0750326Z      |
2020-10-19T04:59:24.0751320Z      = help: items from traits can only be used if the trait is in scope
2020-10-19T04:59:24.0752509Z      = note: the following traits are implemented but not in scope; perhaps add a `use` for one of them:
2020-10-19T04:59:24.0754922Z              candidate #1: `use crate::scalar::complex::ComplexField;`
2020-10-19T04:59:24.0756063Z              candidate #2: `use crate::num::float::FloatCore;`
2020-10-19T04:59:24.0756902Z              candidate #3: `use crate::num::Float;`
2020-10-19T04:59:24.0757680Z              candidate #4: `use crate::num::real::Real;`
2020-10-19T04:59:24.0759342Z      = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)
2020-10-19T04:59:24.0760231Z 
2020-10-19T04:59:24.0783529Z error[E0599]: no method named `floor` found for type `f64` in the current scope
2020-10-19T04:59:24.0784858Z     --> /home/runner/.cargo/registry/src/github.com-1ecc6299db9ec823/simba-0.2.1/src/simd/auto_simd_impl.rs:822:32
2020-10-19T04:59:24.0785458Z      |
2020-10-19T04:59:24.0785700Z 822  |                   self.map(|e| e.floor())
2020-10-19T04:59:24.0785994Z      |                                  ^^^^^ method not found in `f64`
2020-10-19T04:59:24.0786234Z ...
2020-10-19T04:59:24.0786449Z 1435 | / impl_float_simd!(
2020-10-19T04:59:24.0786725Z 1436 | |     [f32; 2], f32, 2, [i32; 2], AutoBoolx2, _0, _1;
2020-10-19T04:59:24.0787048Z 1437 | |     [f32; 4], f32, 4, [i32; 4], AutoBoolx4, _0, _1, _2, _3;
2020-10-19T04:59:24.0787374Z 1438 | |     [f32; 8], f32, 8, [i32; 8], AutoBoolx8, _0, _1, _2, _3, _4, _5, _6, _7;
2020-10-19T04:59:24.0787632Z ...    |
2020-10-19T04:59:24.0787887Z 1442 | |     [f64; 8], f64, 8, [i64; 8], AutoBoolx8, _0, _1, _2, _3, _4, _5, _6, _7;
2020-10-19T04:59:24.0788142Z 1443 | | );
2020-10-19T04:59:24.0788524Z      | |__- in this macro invocation
2020-10-19T04:59:24.0788767Z      |
2020-10-19T04:59:24.0789069Z      = help: items from traits can only be used if the trait is in scope
2020-10-19T04:59:24.0789609Z      = note: the following traits are implemented but not in scope; perhaps add a `use` for one of them:
2020-10-19T04:59:24.0790313Z              candidate #1: `use crate::scalar::complex::ComplexField;`
2020-10-19T04:59:24.0790775Z              candidate #2: `use crate::num::float::FloatCore;`
2020-10-19T04:59:24.0791159Z              candidate #3: `use crate::num::Float;`
2020-10-19T04:59:24.0791514Z              candidate #4: `use crate::num::real::Real;`
2020-10-19T04:59:24.0792240Z      = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)
2020-10-19T04:59:24.0792634Z 
2020-10-19T04:59:24.3187290Z error: aborting due to 2 previous errors
2020-10-19T04:59:24.3187861Z 
2020-10-19T04:59:24.3189441Z For more information about this error, try `rustc --explain E0599`.
2020-10-19T04:59:24.3316594Z error: could not compile `simba`.
2020-10-19T04:59:24.3316921Z 
2020-10-19T04:59:24.3317696Z To learn more, run the command again with --verbose.

0.7.1 docs.rs fails to build

The rustdoc for 0.7.1 failed to build: https://docs.rs/crate/simba/0.7.1

Remove Bounded requirement from RealField

While most types have minimum and maximum values, arbitrary-precision types can be useful to drop into an existing function when they're needed. Arbitrary-precision types lack defined minimum and maximum values, which prevents them from implementing RealField. Since ComplexField also requires declaring the wrapped RealField, ComplexField isn't implementable either, even though it doesn't directly depend on Bounded.

The rug crate has had to close multiple issues requesting the implementation of Float from num-traits for exactly this reason.

Tips for working generically with complex numbers?

I am trying to write some numeric code generically over complex numbers, with explicit SIMD. The idea is to use a SoA representation, with Complex<AutoSimd<[R; WIDTH]>>, where R can be f32 or f64, and WIDTH will be 4 or 8 or similar. I want to keep it generic over R, so the executable can pick whether to use single or double precision.

I found this to be somewhat unergonomic. First of all, there are no built-in methods for directly constructing these complex SIMD-arrays. I had to create helper functions like:

use std::array;
use num_complex::Complex;
use simba::simd::{AutoSimd, SimdValue};
fn complex_arr_to_simd<const WIDTH: usize, R: Copy>(
    arr: &[Complex<R>; WIDTH],
) -> Complex<AutoSimd<[R; WIDTH]>> {
    let re = array::from_fn(|s| arr[s].re);
    let im = array::from_fn(|s| arr[s].im);
    Complex::new(AutoSimd(re), AutoSimd(im))
}

fn complex_simd_to_arr<const WIDTH: usize, R: Copy>(
    simd: Complex<AutoSimd<[R; WIDTH]>>,
) -> [Complex<R>; WIDTH] {
    array::from_fn(|s| Complex::new(simd.re.0[s], simd.im.0[s]))
}

fn complex_splat<const WIDTH: usize, R>(c: Complex<R>) -> Complex<AutoSimd<[R; WIDTH]>>
where
    AutoSimd<[R; WIDTH]>: SimdValue<Element = R>,
{
    Complex::new(AutoSimd::splat(c.re), AutoSimd::splat(c.im))
}

Note the bound on the last function: The traits don't generically guarantee that SimdValue::Element is the obvious scalar type, so I had to add it as a bound.

Now, when using this in a function, I'll need bounds like

where
    AutoSimd<[R; WIDTH]>: SimdValue<Element = R>,
    AutoSimd<[R; WIDTH]>: RealField,

which might not be that bad. But when trying to call a function with such a bound, the trait solver gets stuck somehow:

error[E0284]: type annotations needed: cannot satisfy `<AutoSimd<[R; WIDTH]> as SimdValue>::Element == R`
   --> src\lib.rs:262:17
    |
262 |                 complex_splat::<WIDTH, R>(base.powu(WIDTH as u32));
    |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cannot satisfy `<AutoSimd<[R; WIDTH]> as SimdValue>::Element == R`

Is there a better way of working with generic complex numbers with SIMD?

Migrate from packed_simd to stdlib simd (aka portable simd)

Packed simd stopped compiling a few days ago: rust-lang/packed_simd#343. It should be replaced with portable simd: https://doc.rust-lang.org/std/simd/index.html

lanes associated const for SimdValue

Is it possible to have an associated const LANES for the wide types? Or, for example, in SimdValue:

trait SimdValue {
  const LANES: usize;
}

This would be very useful in generic contexts and when working with arrays.

For example, if we have:

type WIDE = simba::simd::WideF32x4;
type WIDET = f32;

for slice in data.chunks_exact(WIDE::LANES) {
  let arr: [WIDET; WIDE::LANES] = slice.try_into().unwrap();
  let wide = WIDE::from(arr);
}

we could then easily switch the wide type, use it in generics, or change the lane count to experiment with performance. Today, simba::simd::WideF32x4::lanes() is a trait fn and cannot be const.

If so, I will attempt to make a PR. Thank you.
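
A std-only sketch of how such an associated const could be used. SimdLanes and Wide4 are hypothetical stand-ins for the proposed trait item and for a wide type such as WideF32x4; neither is simba's current API.

```rust
use std::convert::TryFrom;

/// Hypothetical trait carrying the lane count as an associated const.
trait SimdLanes {
    const LANES: usize;
}

/// Stand-in for a wide type such as `WideF32x4`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Wide4([f32; 4]);

impl SimdLanes for Wide4 {
    const LANES: usize = 4;
}

impl From<[f32; 4]> for Wide4 {
    fn from(arr: [f32; 4]) -> Self {
        Wide4(arr)
    }
}

// Switching the wide type only requires changing this alias.
type Wide = Wide4;

/// Packs every complete chunk of `data` into one wide value.
fn load_all(data: &[f32]) -> Vec<Wide> {
    data.chunks_exact(Wide::LANES)
        .map(|slice| {
            // The array length comes from the associated const.
            let arr = <[f32; Wide::LANES]>::try_from(slice).expect("exact chunk");
            Wide::from(arr)
        })
        .collect()
}
```

Note that this works because Wide is a concrete type. Using [T; W::LANES] with a generic W would, as far as I know, still need the unstable generic_const_exprs feature.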

Blanket impl of SimdPartialOrd prevents any Wrapper<T> of T: SimdValue defining vectorised field impls

SimdPartialOrd is blanket implemented for any T: SimdValue<Element = T, SimdBool = bool> + PartialOrd.

I am working on the num_dual crate, trying to make its dual number algebra work like a scalar does for nalgebra purposes. Here's a simplified illustration of the problem (playground).

The illustration has a DualNumber<T> type, which I would like to eventually support DualNumber<f32x4> etc. The objective is to put one of those inside a nalgebra vector, i.e. Vector3<DualNumber<f32x4>>, and have that type produce vectorized code. So DualNumber<Simd<[f32; N]>> has to implement everything that f32 does, including SimdRealField/SimdComplexField.

All of the SimdRealField/SimdComplexField traits require SimdPartialOrd. There is a blanket implementation of SimdPartialOrd, and the existence of that blanket impl prevents any type like DualNumber<T> from implementing SimdPartialOrd manually, generically over T's SimdValue implementation and choice of SimdBool.

A type like this needs to have type SimdBool = T::SimdBool; in its SimdValue, and then use the vectorised SimdBool in its implementations of SimdRealField/SimdComplexField. Since I can't implement SimdPartialOrd and am forced to rely on the blanket impl, I can't really generate those vectorised bools. And moreover, since I cannot implement SimdPartialOrd, I have to constrain the SimdRealField/SimdComplexField implementations to T: SimdValue<Element = T, SimdBool = bool>, which means I cannot ever use those traits on DualNumber<f32x4>.

Essentially, the blanket implementation on SimdPartialOrd has foreclosed the possibility of vectorising any new algebraic field implementations in such a way that nalgebra can use them. The resulting DualNumber can only be used with f32 and f64 directly. It only has the un-vectorised code in RealField and ComplexField.

There is one workaround, which is unacceptable: do not implement PartialOrd on DualNumber. Then the trait solver will reject that blanket impl and you can make your own. But the whole point of this exercise was to have operator overloading work with nalgebra, so I'm not giving up PartialOrd.

My recommendation is to add a marker trait like the one in the illustration, reproduced here:

trait OptIn {}
impl OptIn for f32 {}
impl OptIn for f64 {}

impl<T> SimdPartialOrd for T
where
      T: SimdValue<Element = T, SimdBool = bool> + PartialOrd,
      // and crucially, add this as a constraint
      T: OptIn,
{}

That way, because DualNumber<T> chooses not to implement OptIn, it is unaffected by the blanket impl of SimdPartialOrd. It can then:

impl SimdValue for DualNumber<T> where T: SimdValue, with Element = DualNumber<T::Element> and SimdBool = T::SimdBool
impl SimdPartialOrd for DualNumber<T> where T: SimdPartialOrd, as a very simple forwarding impl (comparing only the real parts)
impl SimdRealField for DualNumber<T> where T: SimdRealField, using T's SIMD operations to implement dual numbers
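
The following self-contained sketch illustrates the marker-trait pattern end to end. All traits here are reduced local stand-ins that only borrow simba's names; F32x2 and Dual are hypothetical types playing the roles of a SIMD float and DualNumber.

```rust
/// Reduced stand-in for simba's `SimdValue` (only `SimdBool`).
trait SimdValue {
    type SimdBool;
}

/// Reduced stand-in for simba's `SimdPartialOrd`.
trait SimdPartialOrd: SimdValue {
    fn simd_lt(self, other: Self) -> Self::SimdBool;
}

/// Marker trait gating the blanket impl, as proposed above.
trait OptIn {}

impl SimdValue for f32 {
    type SimdBool = bool;
}
impl OptIn for f32 {}

/// Stand-in for a 2-lane SIMD float with a 2-lane mask.
#[derive(Clone, Copy)]
struct F32x2([f32; 2]);

impl SimdValue for F32x2 {
    type SimdBool = [bool; 2];
}

impl SimdPartialOrd for F32x2 {
    fn simd_lt(self, other: Self) -> Self::SimdBool {
        [self.0[0] < other.0[0], self.0[1] < other.0[1]]
    }
}

// Blanket impl, now restricted to types that opt in.
impl<T> SimdPartialOrd for T
where
    T: SimdValue<SimdBool = bool> + PartialOrd + OptIn,
{
    fn simd_lt(self, other: Self) -> Self::SimdBool {
        self < other
    }
}

/// Stand-in for `DualNumber<T>`; it keeps `PartialOrd` but never opts in.
#[derive(Clone, Copy, PartialEq, PartialOrd)]
struct Dual<T> {
    re: T,
    eps: T,
}

impl<T: SimdValue> SimdValue for Dual<T> {
    type SimdBool = T::SimdBool;
}

// Accepted by coherence because `Dual<T>` does not implement `OptIn`,
// so the blanket impl above can never apply to it.
impl<T: SimdPartialOrd> SimdPartialOrd for Dual<T> {
    fn simd_lt(self, other: Self) -> Self::SimdBool {
        // Forwarding impl: compare only the real parts.
        self.re.simd_lt(other.re)
    }
}
```

With this layout, Dual<f32> yields a plain bool through the blanket impl on f32, while Dual<F32x2> yields a per-lane mask. Removing the OptIn bound from the blanket impl makes the two impls overlap again.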