Comments (18)
Big research dump ahead!
This discussion has links to some prior art and research. There's also an interesting discussion of the tradeoffs involved in default instance distribution, although this mostly seems to affect the size distribution of recursive data structures such as lists (that are considerably rarer in Rust than Haskell). The two papers they mention can be found here and here. While these papers look interesting, I'm not sure if they're directly relevant.
So there are some reasons why #[derive(Arbitrary)] seems slightly sketchy in Haskell, but these reasons don't seem to be especially relevant to Rust.
Another question is the relationship between #[derive(Arbitrary)] and instance invariants. For example, my Rect type should never have left + width large enough to overflow usize, regardless of whether it was created by Arbitrary, by a new method, or by updating it with a &mut self method. This could be represented by a trait Invariant with a method that checks whether the invariant holds. (There are sketches of related ideas here (design by contract) and here (a wrapper type which enforces restrictions on a wrapped type), but neither is mature enough for what we need here.)
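A minimal sketch of what such an Invariant trait could look like for Rect. The trait name comes from the paragraph above; the method name invariant_holds and the checked constructor are assumptions made up for illustration, not an existing API:

```rust
/// Hypothetical trait: a type that can check its own invariant.
pub trait Invariant {
    fn invariant_holds(&self) -> bool;
}

#[derive(Clone, Debug, PartialEq)]
pub struct Rect {
    pub left: usize,
    pub top: usize,
    pub width: usize,
    pub height: usize,
}

impl Invariant for Rect {
    fn invariant_holds(&self) -> bool {
        // Neither the right edge nor the bottom edge may overflow `usize`.
        self.left.checked_add(self.width).is_some()
            && self.top.checked_add(self.height).is_some()
    }
}

impl Rect {
    /// Checked constructor: `None` when the invariant would be violated.
    pub fn new(left: usize, top: usize, width: usize, height: usize) -> Option<Rect> {
        let rect = Rect { left, top, width, height };
        if rect.invariant_holds() {
            Some(rect)
        } else {
            None
        }
    }
}
```

The same check could then be reused by a generated Arbitrary impl, by constructors, and by debug assertions in &mut self methods.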
So there are a bunch of ways to tackle this, but it's hard to proceed without more examples. So let's dig a bit:
- RegexLikeString. This has a bias towards regex characters.
- SmallAscii and BiasAscii. More biasing.
- Iter. Could be automatically derived.
- NonEmptyAscii. More biasing.
- PerspectiveBase. This has a very nice helper::reject function which is used to filter out individual field values that fail constraints. Here's another example. (There are also quite a few simpler impls that don't need constraints.)
- Small. Reduces the average size by 2.
- Graph. Complex and customized, like you'd expect. No way to automate this.
- Word. Another trivial case with no constraints.
Here's a code snippet from nalgebra demonstrating reject:
#[cfg(feature="arbitrary")]
impl<N, S> Arbitrary for PerspectiveBase<N, S>
    where N: Real + Arbitrary,
          S: OwnedStorage<N, U4, U4> + Send,
          S::Alloc: OwnedAllocator<N, U4, U4, S> {
    fn arbitrary<G: Gen>(g: &mut G) -> Self {
        let znear = Arbitrary::arbitrary(g);
        let zfar = helper::reject(g, |&x: &N| !(x - znear).is_zero());
        let aspect = helper::reject(g, |&x: &N| !x.is_zero());
        Self::new(aspect, Arbitrary::arbitrary(g), znear, zfar)
    }
}
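For readers who haven't seen the pattern, here is a rough sketch of what a reject-style helper does (rejection sampling). This is not nalgebra's actual code: TinyGen is a made-up stand-in for Gen, and the explicit draw argument is an assumption so that the example needs no external crates:

```rust
/// Hypothetical stand-in for a random source; not quickcheck's `Gen`.
pub struct TinyGen(u64);

impl TinyGen {
    pub fn new(seed: u64) -> Self {
        // xorshift must not start from zero.
        TinyGen(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        // One xorshift64 step.
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Draw candidates with `draw` until `accept` returns true.
/// Loops forever if `accept` can never succeed.
pub fn reject<T, D, F>(g: &mut TinyGen, mut draw: D, mut accept: F) -> T
where
    D: FnMut(&mut TinyGen) -> T,
    F: FnMut(&T) -> bool,
{
    loop {
        let candidate = draw(g);
        if accept(&candidate) {
            return candidate;
        }
    }
}

/// Small signed integer in -3..=3, so collisions and zeros are frequent.
pub fn small_i64(g: &mut TinyGen) -> i64 {
    (g.next_u64() % 7) as i64 - 3
}
```

With these pieces, the znear/zfar constraint from the snippet above would read reject(&mut g, small_i64, |&x| x != znear).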
So custom Arbitrary impls seem to fall into several categories:
1. A biased distribution of an underlying type, typically implemented as a wrapper.
2. Trivial impls that can be derived with no customization.
3. Implementations that use per-field filtering using reject to meet various constraints.
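To make the first category concrete, here is a small sketch of a biased wrapper in the spirit of SmallAscii. The length cap of 7, the printable-ASCII range, and the TinyGen random source are all assumptions for illustration; the real impls I looked at differ in detail:

```rust
/// Hypothetical stand-in for a random source; not quickcheck's `Gen`.
pub struct TinyGen(u64);

impl TinyGen {
    pub fn new(seed: u64) -> Self {
        // xorshift must not start from zero.
        TinyGen(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        // One xorshift64 step.
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Wrapper whose generator only yields short, printable-ASCII strings.
#[derive(Clone, Debug)]
pub struct SmallAscii(pub String);

impl SmallAscii {
    pub fn generate(g: &mut TinyGen) -> SmallAscii {
        // Bias 1: at most 7 characters.
        let len = (g.next_u64() % 8) as usize;
        // Bias 2: only the 95 printable ASCII characters 0x20..=0x7E.
        let s: String = (0..len)
            .map(|_| (0x20 + (g.next_u64() % 95) as u8) as char)
            .collect();
        SmallAscii(s)
    }
}
```

The wrapper carries no invariant of its own; it only changes the distribution, which is why this category looks automatable.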
I think it would be both desirable and possible to automate (1) somehow, but that's a separate discussion. I'm interested in (2) and (3). I could imagine a syntax like:
#[derive(Arbitrary, Clone)]
pub struct Rect {
    left: usize,
    top: usize,
    #[arbitrary(constraint = "left.checked_add(width).is_some()")]
    width: usize,
    #[arbitrary(constraint = "top.checked_add(height).is_some()")]
    height: usize,
}
A naive implementation might write:
let left = g.gen();
let width = g.gen_constrained(|width| left.checked_add(width).is_some());
let top = g.gen();
let height = g.gen_constrained(|height| top.checked_add(height).is_some());
Rect { left: left, width: width, top: top, height: height }
...but this would tend to fail if left is picked too close to 2^64-1, unless width generates 0 values fairly often. So we might want (not actually valid code; it needs to continue somewhere):
for _ in 0..MAX_TRIES {
    let left = g.gen();
    let width = g.gen_constrained(|width| left.checked_add(width).is_some());
    let top = g.gen();
    let height = g.gen_constrained(|height| top.checked_add(height).is_some());
    return Rect { left: left, width: width, top: top, height: height };
}
panic!("could not generate instance of Rect; check your constraints");
Anyway, this is just a very rough sketch. But it would work for the examples of cases (2) and (3) that I found, and it wouldn't be especially hard to implement on stable Rust. What are some possible ways we could tweak this to make it better?
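Here is one way the retry loop could be made to compile, with the missing continue filled in. It assumes gen_constrained gives up after a bounded number of tries and returns an Option; that signature, FIELD_TRIES, and the TinyGen random source are inventions for this sketch, not quickcheck API:

```rust
/// Hypothetical stand-in for a random source; not quickcheck's `Gen`.
pub struct TinyGen(u64);

const MAX_TRIES: usize = 100; // attempts at a whole Rect
const FIELD_TRIES: usize = 10; // attempts at a single constrained field

impl TinyGen {
    pub fn new(seed: u64) -> Self {
        TinyGen(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn gen_usize(&mut self) -> usize {
        // One xorshift64 step, truncated to `usize`.
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x as usize
    }

    /// Assumed shape of `gen_constrained`: `None` after FIELD_TRIES failures.
    pub fn gen_constrained<F: Fn(usize) -> bool>(&mut self, ok: F) -> Option<usize> {
        for _ in 0..FIELD_TRIES {
            let value = self.gen_usize();
            if ok(value) {
                return Some(value);
            }
        }
        None
    }
}

#[derive(Clone, Debug)]
pub struct Rect {
    pub left: usize,
    pub top: usize,
    pub width: usize,
    pub height: usize,
}

pub fn arbitrary_rect(g: &mut TinyGen) -> Rect {
    for _ in 0..MAX_TRIES {
        let left = g.gen_usize();
        let width = match g.gen_constrained(|w| left.checked_add(w).is_some()) {
            Some(w) => w,
            None => continue, // unlucky `left`: start over with a fresh one
        };
        let top = g.gen_usize();
        let height = match g.gen_constrained(|h| top.checked_add(h).is_some()) {
            Some(h) => h,
            None => continue,
        };
        return Rect { left, top, width, height };
    }
    panic!("could not generate instance of Rect; check your constraints")
}
```

Restarting from the first field means a bad early choice can't doom the whole attempt, at the cost of throwing away the fields that were already fine.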
from quickcheck.
We're not changing the API of the Arbitrary trait to accommodate a niche corner case like void types. Panicking seems fine to me, since if that panic occurs, there's a bug somewhere, no?
@remexre Not sure what to do about Shrinkable because I don't really understand the problems you're facing. I don't have time to dive into your code, so if you could describe the problem here I might be able to give feedback. :-)
from quickcheck.
https://github.com/remexre/quickcheck_derive seems to be working? If people want to try it (and rate my first proc-macro), feel free. I'll probably move it into this repo's workspace and PR once I've implemented constraints and shrinking.
edit: Use https://github.com/remexre/quickcheck/tree/derive instead, I'm planning to PR soon
from quickcheck.
I note that there is now a third-party quickcheck_derive crate, by @panicbit and @llogiq.
from quickcheck.
I've often thought of having something like this, but in practice, almost every Arbitrary impl I write needs to encode some key invariants around the values being generated or shrunk. This would defeat the use of a plugin to do any automatic derivation.
Popping up a level, I'm a little down on plugins since their stabilization is probably still a long way off. As such, I personally probably won't work on something like this any time soon.
from quickcheck.
Bumping this since proc-macros are getting stabilised in a week or two, and a derive-macro is very easy to implement.
This would potentially be very useful for generating Arbitrary implementations for simple structs without complex invariants.
from quickcheck.
If automatic derivation is problematic, then maybe a generator that wrote a proposed Arbitrary implementation, which the developer could paste into their code and customize?
from quickcheck.
Macros 1.1 now allows custom derive in stable (as of 1.15). This would involve a separate library named something like quickcheck_derive. The main quickcheck library could remain compatible with an older version like 1.10, and you'd only need 1.15 if you wanted to use #[derive(Arbitrary)].
I think the key puzzle is how to handle additional constraints, such as the ones I described in #163. Maybe we should try brainstorming what these constraints look like in typical cases. Here's my example:
#[derive(Arbitrary, Clone)]
pub struct Rect {
    left: usize,
    top: usize,
    width: usize,
    height: usize,
}
Constraints: left + width and top + height both fit in a usize.
We can actually add attributes like #[arbitrary = "..."] or #[constraint = "...code..."] if we wish, and use them to control the code we generate (much like serde does). But I think it would work better to collect more examples of constraints before trying to design the actual syntax.
from quickcheck.
I'm gonna take a swing at this without constraints; I'll post here if I get it working.
from quickcheck.
What should the behavior be for trying to derive Arbitrary on an empty enum? It doesn't really make sense to do, so maybe it should be disallowed? But I might want to have e.g. an Arbitrary Either<A, Void> or another generic enum that contains a Void.
Right now, it results in a run-time panic, which is probably the least optimal solution.
from quickcheck.
Also, I've done a bit of work on constraints, and it's not been great. I'd actually like to see Shrinkable as a separate trait, and valid(&Self) -> bool (with a default impl of always-true) on the Arbitrary trait; I don't think there's a reasonable/sane way to allow the user to specify one function of a derive for shrinking, and the checking for constraints is essentially orthogonal to generating the values, and the logic gets thorny when I try to put them in one function.
I'm going to push what I've got (as https://github.com/remexre/quickcheck_derive/tree/constraints-dead-end), then work on the above approach.
If anyone has better ideas, certainly share them.
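To make the proposal concrete, a rough sketch of the split: a generation trait with a default always-true valid, and a separate Shrinkable. The trait and method names other than valid and Shrinkable, the u64 seed in place of Gen, and the Even example type are all assumptions for illustration, not quickcheck's API:

```rust
/// Tiny LCG step standing in for a real random source.
fn next(seed: &mut u64) -> u64 {
    *seed = seed
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    *seed >> 33
}

/// Hypothetical generation half of the trait.
pub trait Generate: Sized {
    fn generate(seed: &mut u64) -> Self;

    /// Constraint check; accepts everything unless overridden.
    fn valid(_candidate: &Self) -> bool {
        true
    }

    /// Regenerate until `valid` accepts (unbounded, for brevity).
    fn generate_valid(seed: &mut u64) -> Self {
        loop {
            let candidate = Self::generate(seed);
            if Self::valid(&candidate) {
                return candidate;
            }
        }
    }
}

/// Hypothetical shrinking half, implemented independently.
pub trait Shrinkable: Sized {
    fn shrink(&self) -> Vec<Self>;
}

#[derive(Clone, Debug, PartialEq)]
pub struct Even(pub u32);

impl Generate for Even {
    fn generate(seed: &mut u64) -> Self {
        Even(next(seed) as u32)
    }

    fn valid(candidate: &Self) -> bool {
        candidate.0 % 2 == 0
    }
}

impl Shrinkable for Even {
    fn shrink(&self) -> Vec<Self> {
        // Propose smaller values, keeping only those that pass `valid`.
        let mut out: Vec<Even> = [0, self.0 / 2, self.0.saturating_sub(2)]
            .iter()
            .map(|&n| Even(n))
            .filter(|c| c.0 < self.0 && Even::valid(c))
            .collect();
        out.dedup();
        out
    }
}
```

Keeping valid separate from generation means shrinking can reuse the same check instead of duplicating the constraint logic.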
from quickcheck.
Maybe, to handle empty enums properly, the Arbitrary API
fn arbitrary<G: Gen>(g: &mut G) -> Self;
should be changed to return Option<Self> or Vec<Self>?
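A sketch of how the Option-returning variant might play out for Either<A, Void>. TryArbitrary, the u64 seed in place of Gen, and the fallback strategy are assumptions invented for this example; nothing like this exists in quickcheck:

```rust
/// Uninhabited type: it has no values at all.
pub enum Void {}

pub enum Either<A, B> {
    Left(A),
    Right(B),
}

/// Tiny LCG step standing in for a real random source.
fn next(seed: &mut u64) -> u64 {
    *seed = seed
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    *seed >> 33
}

/// Hypothetical trait: `None` means "this type has no values".
pub trait TryArbitrary: Sized {
    fn try_arbitrary(seed: &mut u64) -> Option<Self>;
}

impl TryArbitrary for Void {
    fn try_arbitrary(_seed: &mut u64) -> Option<Self> {
        None
    }
}

impl TryArbitrary for u8 {
    fn try_arbitrary(seed: &mut u64) -> Option<Self> {
        Some(next(seed) as u8)
    }
}

impl<A: TryArbitrary, B: TryArbitrary> TryArbitrary for Either<A, B> {
    fn try_arbitrary(seed: &mut u64) -> Option<Self> {
        // Pick a variant at random; if it turns out to be uninhabited,
        // fall back to the other one.
        if next(seed) % 2 == 0 {
            A::try_arbitrary(seed)
                .map(Either::Left)
                .or_else(|| B::try_arbitrary(seed).map(Either::Right))
        } else {
            B::try_arbitrary(seed)
                .map(Either::Right)
                .or_else(|| A::try_arbitrary(seed).map(Either::Left))
        }
    }
}
```

The cost is that every caller now has to handle None, even for the overwhelmingly common case of inhabited types.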
from quickcheck.
I don't see trying to test an enum containing a Void (possibly because of generic parameters) as a bug in the user's code per se, but I agree that it probably doesn't merit an API change. The problem is that the code gets a lot more complex when I'm trying to figure out whether an enum branch is possible or not as I'm doing codegen. I'm leaning towards it being a compile-time error for now; the panic is from rand, not quickcheck, so it's a bit bad UX.
The problem with Shrinkable is that I can foresee users wanting custom shrink() implementations, but still wanting to derive arbitrary(). For example, num::Complex and num::rational::Ratio probably want different shrink() implementations, despite having essentially the same internal layout (a struct of two T's). Working around this with an attribute #[arbitrary(shrink_with = "my_custom_function")] probably would be possible, but a bit ugly, I suppose.
Splitting Shrinkable into a different trait, though, would let users impl Shrinkable separately from the derive, and a separate derive for Shrinkable could be added for those who want the default behavior.
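As an illustration of why the two types want different shrinkers, here is a sketch with simplified i32-only stand-ins for Complex and Ratio (not the real num types) and a hypothetical standalone Shrinkable trait:

```rust
/// Hypothetical standalone shrinking trait.
pub trait Shrinkable: Sized {
    fn shrink(&self) -> Vec<Self>;
}

/// Simplified stand-in for num::Complex.
#[derive(Clone, Debug, PartialEq)]
pub struct Complex {
    pub re: i32,
    pub im: i32,
}

/// Simplified stand-in for num::rational::Ratio.
#[derive(Clone, Debug, PartialEq)]
pub struct Ratio {
    pub numer: i32,
    pub denom: i32,
}

impl Shrinkable for Complex {
    fn shrink(&self) -> Vec<Self> {
        // Both components may shrink towards zero independently.
        let mut out = Vec::new();
        if self.re != 0 {
            out.push(Complex { re: self.re / 2, im: self.im });
        }
        if self.im != 0 {
            out.push(Complex { re: self.re, im: self.im / 2 });
        }
        out
    }
}

impl Shrinkable for Ratio {
    fn shrink(&self) -> Vec<Self> {
        // The denominator must never reach zero, so it stops at magnitude 1.
        let mut out = Vec::new();
        if self.numer != 0 {
            out.push(Ratio { numer: self.numer / 2, denom: self.denom });
        }
        if self.denom > 1 || self.denom < -1 {
            out.push(Ratio { numer: self.numer, denom: self.denom / 2 });
        }
        out
    }
}
```

A field-by-field derived shrinker would be fine for Complex but could hand back a Ratio with a zero denominator.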
from quickcheck.
@remexre RE shrinkable: That's interesting. I grant that the use case is valid. I wonder how often that's required though, and when it is, I imagine writing out the Arbitrary impl wouldn't be that onerous. Generating values tends to be much easier than shrinking in my experience. That is, as a user, I probably want to derive Arbitrary to get automatic shrinking rather than automatic generation.
I am open to splitting the trait, but I'm kind of leaning towards "there's not enough justification" presently...
from quickcheck.
Okay, for now I'll write it so I could add #[arbitrary(shrink_with = "my_custom_function")] later if need be. I also totally forgot that you could have nested functions, so I can use that for valid(&self) -> bool. I should have a basic PR ready by next Sunday, although that depends how my work schedule ends up.
from quickcheck.
This can probably be closed when #175 gets merged.
from quickcheck.
@8573 The implementation in #175 looks more feature rich
from quickcheck.