I want quickcheck to have a compiler plugin that automatically generates Arbitrary imp

Big research dump ahead! <a href="https://www.reddit.com/r/haskellqu

Feature request: Add auto-derivators for structs and enums about quickcheck HOT 18 CLOSED

burntsushi commented on August 11, 2024 2

Feature request: Add auto-derivators for structs and enums

from quickcheck.

Comments (18)

emk commented on August 11, 2024 9

Big research dump ahead!

This discussion has links to some prior art and research. There's also an interesting discussion of the tradeoffs involved in default instance distribution, although this mostly seems to affect the size distribution of recursive data structures such as lists (that are considerably rarer in Rust than Haskell). The two papers they mention can be found here and here. While these papers look interesting, I'm not sure if they're directly relevant.

So there are some reasons why #[derive(Arbitrary)] seems slightly sketchy in Haskell, but these reasons don't seem to be especially relevant to Rust.

Another question is the relationship between #[derive(Arbitrary)] and instance invariants. For example, my Rect type should never have left + width large enough to overflow usize, regardless of whether it was created by Arbitrary or by a new method—or by updating it with a &mut self method. This could be represented by a trait Invariant with a method that checks whether the invariant holds. (There are sketches of related ideas here (design by contract) and here (a wrapper type which enforces restrictions on a wrapped type), but neither is mature enough for what we need here.)

So there are a bunch of ways to tackle this, but it's hard to proceed without more examples. So let's dig a bit:

RegexLikeString. This has a bias towards regex characters.
SmallAscii and BiasAscii. More biasing.
Iter. Could be automatically derived.
NonEmptyAscii. More biasing.
PerspectiveBase. This has a very nice helper::reject function which is used to filter out individual field values that fail constraints. Here's another example. (There are also quite a few simpler impls that don't need constraints.)
Small. Reduces the average size by 2.
Graph. Complex and customized, like you'd expect. No way to automate this.
Word. Another trivial case with no contraints.

Here's a code snippet from nalgebra demonstrating reject:

#[cfg(feature="arbitrary")]
impl<N, S> Arbitrary for PerspectiveBase<N, S>
    where N: Real + Arbitrary,
          S: OwnedStorage<N, U4, U4> + Send,
          S::Alloc: OwnedAllocator<N, U4, U4, S> {
    fn arbitrary<G: Gen>(g: &mut G) -> Self {
        let znear  = Arbitrary::arbitrary(g);
        let zfar   = helper::reject(g, |&x: &N| !(x - znear).is_zero());
        let aspect = helper::reject(g, |&x: &N| !x.is_zero());

        Self::new(aspect, Arbitrary::arbitrary(g), znear, zfar)
    }
}

So custom Arbitrary impls seem to fall into several categories:

A biased distribution of an underlying type, typically implemented as a wrapper.
Trivial impls that can be derived with no customization.
Implementations that use per-field filtering using reject to meet various constraints.

I think it would be both desirable and possible to automate (1) somehow, but that's a separate discussion. I'm interested in (2) and (3). I could imagine a syntax like:

#[derive(Arbitrary, Clone)]
pub struct Rect {
    left: usize,
    top: usize,
    #[arbitrary(constraint = "left.checked_add(width).is_some()")]
    width: usize,
    #[arbitrary(constraint = "top.checked_add(height).is_some()")]
    height: usize,
}

A naive implementation might write:

let left = g.gen();
let width = g.gen_constrained(|width| left.checked_add(width).is_some());
let top = g.gen();
let height = g.gen_constrained(|height| top.checked_add(height).is_some());
Rect { left: left, width: width, top: top, height: height }

...but this would tend to fail if left is picked too close to 2^64-1, unless width generates 0 values fairly often. So we might want (not actually valid code; it needs to continue somewhere):

for _ in 0..MAX_TRIES {
    let left = g.gen();
    let width = g.gen_constrained(|width| left.checked_add(width).is_some());
    let top = g.gen();
    let height = g.gen_constrained(|height| top.checked_add(height).is_some());
    return Rect { left: left, width: width, top: top, height: height };
}
panic!("could not generate instance of Rect; check your constraints");

Anyway, this is just a very rough sketch. But it would work for the examples of cases (2) and (3) that I found, and it wouldn't be especially hard to implement on stable Rust. What are some possible ways we could tweak this to make it better?

from quickcheck.

BurntSushi commented on August 11, 2024 4

We're not changing the API of the Arbitrary trait to accommodate a niche corner case like void types. Panicing seems fine to me, since if that panic occurs, there's a bug somewhere, no?

@remexre Not sure what to do about Shrinkable because I don't really understand the problems you're facing. I don't have time to dive into your code, so if you could describe the problem here I might be able to give feedback. :-)

from quickcheck.

remexre commented on August 11, 2024 2

https://github.com/remexre/quickcheck_derive seems to be working? If people want to try it (and rate my first proc-macro), feel free. I'll probably move it into this repo's workspace and PR once I've implemented constraints and shrinking.

edit: Use https://github.com/remexre/quickcheck/tree/derive instead, I'm planning to PR soon

from quickcheck.

8573 commented on August 11, 2024 2

I note that there now is a third-party quickcheck_derive crate, by @panicbit and @llogiq.

from quickcheck.

BurntSushi commented on August 11, 2024

I've often thought of having something like this, but in practice, almost every Arbitrary impl I write needs to encode some key invariants around the values being generated or shrunk. This would defeat the use of a plugin to do any automatic derivation.

Popping up a level, I'm a little down on plugins since their stabilization future is probably still a far way off. As such, I personally probably won't work on something like this any time soon.

from quickcheck.

aldanor commented on August 11, 2024

Bumping this since proc-macros are getting stabilised in a week or two, and a derive-macro is very easy to implement.

This would be potentially very useful for generating Arbitrary implementations for simple structs without complex invariants.

from quickcheck.

burdges commented on August 11, 2024

If automatic derivation is problematic, then maybe a generator that wrote a proposed Arbitrary implementation, which the developer could paste into their code and customize?

from quickcheck.

emk commented on August 11, 2024

Macros 1.1 now allows custom derive in stable (as of 1.15). This would involve a separate library named something like quickcheck_derive. The main quickcheck library could remain compatible with an older version like 1.10, and you'd only need 1.15 if you wanted to use #[derive(Arbitrary)].

I think the key puzzle is how to handle additional constraints, such as the ones I described in #163. Maybe we should try brainstorming what these constraints look like in typical cases. Here's my example:

#[derive(Arbitrary, Clone)]
pub struct Rect {
    left: usize,
    top: usize,
    width: usize,
    height: usize,
}

Constraints: left + width and top + bottom both fit in a usize.

We can actually add attributes like #[arbitrary = "..."] or #[constraint = "...code..."] if we wish, and use them to control the code we generate (much like serde does). But I think it would work better to collect more examples of constraints before trying to design the actual syntax.

from quickcheck.

remexre commented on August 11, 2024

I'm gonna take a swing at this without constraints; I'll post here if I get it working.

from quickcheck.

remexre commented on August 11, 2024

What should the behavior be for trying to derive Arbitrary on an empty enum? It doesn't really make sense to do, so maybe it should be disallowed? But I might want to have e.g. an Arbitrary Either<A, Void> or another generic enum that contains a Void.

Right now, it results in a run-time panic, which is probably the least optimal solution.

from quickcheck.

remexre commented on August 11, 2024

Also, I've done a bit of work on constraints, and it's not been great. I'd actually like to see Shrinkable as a separate trait, and valid(&Self) -> bool (with a default impl of always-true) on the Arbitrary trait; I don't think there's a reasonable/sane way to allow the user to specify one function of a derive for shrinking, and the checking for constraints is essentially orthogonal to generating the values, and the logic gets thorny when I try to put them in one function.

I'm going to push what I've got (as https://github.com/remexre/quickcheck_derive/tree/constraints-dead-end), then work on a the above approach.

If anyone has better ideas, certainly share them.

from quickcheck.

vi commented on August 11, 2024

Maybe to handle empty enums properly, Arbitrary API

fn arbitrary<G: Gen>(g: &mut G) -> Self;

should be changed to return Option<Self> or Vec<Self>?

from quickcheck.

remexre commented on August 11, 2024

I don't see trying to test an enum containing a Void (possibly because of generic parameters) as a bug in the user's code per se, but I agree that it probably doesn't merit an API change. The problem is that the code gets a lot more complex when I'm trying to figure out whether an enum branch is possible or not as I'm doing codegen. I'm leaning towards it being a compile-time error for now; the panic is from rand, not quickcheck, so it's a bit bad UX.

The problem with Shrinkable is that I can forsee users wanting custom shrink() implementations, but still wanting to derive arbitrary(). For example, num::Complex and num::rational::Ratio probably want different shrink() implementations, despite having essentially the same internal layout (a struct of two T's). Working around this with an attribute #[arbitrary(shrink_with = "my_custom_function")] probably would be possible, but a bit ugly, I suppose.

Splitting Shrinkable into a different trait, though, would let users impl Shrinkable separately from the derive, and a separate derive for Shrinkable could be added for those who want the default behavior.

from quickcheck.

BurntSushi commented on August 11, 2024

@remexre RE shrinkable: That's interesting. I grant that the use case is valid. I wonder how often that's required though, and when it is, I imagine writing out the Arbitrary impl wouldn't be that onerous. Generating values tends to be much easier than shrinking in my experience. That is, as a user, I probably want to derive Arbitrary to get automatic shrinking rather than automatic generation.

But, I am open to splitting the trait, but I'm kind of leaning towards "there's not enough justification" presently...

from quickcheck.

remexre commented on August 11, 2024

Okay, for now I'll write it so I could add #[arbitrary(shrink_with = "my_custom_function")] later if need be. I also totally forgot that you could have nested functions, so I can use that for valid(&self) -> bool. I should have a basic PR ready by next Sunday, although that depends how my work schedule ends up.

from quickcheck.

remexre commented on August 11, 2024

This can probably be closed when #175 gets merged.

from quickcheck.

panicbit commented on August 11, 2024

@8573 The implementation in #175 looks more feature rich

from quickcheck.

BurntSushi commented on August 11, 2024

#222 (comment)

from quickcheck.

Feature request: Add auto-derivators for structs and enums about quickcheck HOT 18 CLOSED

Comments (18)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent