quickchick / quickchick Goto Github PK

View Code? Open in Web Editor NEW

239.0 17.0 44.0 28.07 MB

Randomized Property-Based Testing Plugin for Coq

License: Other

Emacs Lisp 0.03% Coq 64.82% OCaml 31.21% Makefile 0.92% Shell 0.11% C 2.88% Perl 0.04%

testing coq

quickchick's People

Contributors

Stargazers

Watchers

Forkers

jcarlson23 palmskog lastland maximedenes lysxia nomeata jasongross ejgallego liyishuai chobbes ppedrot jashug skyskimmer maysis175 gares proux01 ruidtlima llelf herbelin andreacolo97 simonboulier zimmi48 dgfug mattam82 msoegtropimc anton-trunov 4ever2 mrhaandi xiving durbatuluk1701 rybla zeta1999 olaure01 dijkstracula hanxic andres-erbsen jeprinz lasseblaauwbroek rtjoa trilis villetaneuse

quickchick's Issues

One more case study on IFC language encodings

My first serious contact with PBT was actually before getting to know John; it was part of the "All Your IFCException" work, where we had 5 encodings between 4 languages or so, and the proofs turned out horrible, and I just wanted to test something to be able to conjecture the encodability in the paper. The code is here https://github.com/mgree/navdifc and it's already written in Coq. Just the testing infrastructure around it it is in Haskell (Testing.hs), but I don't think it would be that hard to port all of this to QuickChick, use some mutations, fix some bugs, and verify some things about testing, and maybe even prove some things for real. Anyway, such an additional case study would be nice long term.

Support for Coq 8.5

Maxime already started working on this.

About mutation testing

I intend to add the following features:

Apply multiple mutants in one test by labelling them into groups, which is useful when we mutate multiple files.
Specify which mutant(s) to apply in command line, for testing different aspects with the same specification. This is partly implemented in #60.

Any suggestions?

Add strong abstraction barriers

The way we write generators is very principled, but this is currently just convention and it is not enforced by strong abstraction barriers. For instance the MkGen constructor should really only be available to the concrete implementation of the primitive generators, but not to derived ones. Also the combinators in higher layers in the cake should depend only on the existence and specification of the combinators in the lower layers.

Strong abstraction might also be the key for using parametricity in #8.

I remember discussing about this with Zoe in the past, will need to look at the notes and try to summarize that.

Unconstrained instance resolution loops

Here's a minimal example.

Require Import QuickChick.
Check arbitrary.

One workaround is to always pass arguments explicitly, but that doesn't take advantage of type classes, and it is prone to forgetful mistakes.

I suppose Coq is repeatedly picking some blanket instances lying around. Wouldn't it be better to have them not in scope by default, or to reduce their priority to avoid these situations?
What is the usual way out when Coq gets stuck like that? Currently in Proof General, I'm killing it (C-x k *coq*) but it's a chore to jump back in the previous location.
Is Coq's instance resolution algorithm just a big backtracking search? Are there ways to customize it beyond priority numbers?

Put opam package in coq/released

Hi, I'm wondering why you didn't put the opam package in the coq opam archive yet? I guess there's not much to do here since you already have an opam file. Note that we also have an extra-dev repository for development versions if things are not quite ready for prime-time yet.

https://coq.inria.fr/packages
https://coq.inria.fr/opam/www/packaging.html

Unprovable facts about checker combinators

There are some proofs that are commented out in file EndToEnd.v. Those proofs do not seem provable in the new setting, although they were provable in the previous coarse approach. The main problem is that the spec of bind is weaker than we initially thought. Some of the unprovable statements are key facts like the specs of forAll and forAllShrink combinators.

More systematic tracking of sizes ... or not

We considered a more systematic tracking of sizes at some point (using type classes and stuff, see ProperSizes.v). Should have a look at that again in light of the new definitions and see if it makes more sense now. If it still doesn't then let's just remove ProperSizes.v.

Hook <--> up to the generalized rewriting framework

We should hook <--> up to the generalized rewriting framework.
I was trying to do this for semReturn and there are still some scars left in the code:

   Lemma semReturn : forall A (x : A),
                       semGen (returnGen x) <--> eq x.
   Proof. (* this kind of proof should be "trivial by rewriting",
          but this doesn't quite work as easy at it should *)
     move => A x. rewrite /semGen.
     (* was hoping to do the rest by rewriting: setoid_rewrite semReturnSize. *)
     pose proof (semReturnSize x) as G. unfold set_eq in G.
     (* setoid_rewrite G. -- failed constraints *)
     move => a /=. split.
     - move => [size H]. (* or rewrite -> G in H. exact H. *)
       rewrite <- G. exact H.
     - move => H. exists 0. by rewrite G.
   Qed.

Growing number of extraction warnings

Could this be since Leo's latest code addition?

Warning: The extraction is currently set to bypass opacity,
the following opaque constant bodies have been accessed :
  BinOpT_beqP EqDec_block Mem.EqDec_block Z_eqbP add_rect
  StringOT.AsciiOT.compare StringOT.StringOT.compare eqtype.eqP ssrnat.eqnP
  seq.eqseqP equal_rect find_rect fold_rect ssrbool.idP ssrbool.iffP
  instr_eqbP label_eqP mem_rect pc_eqP remove_rect val_eqP.

Warning: The following logical axiom was encountered:
  fstepP_admitted.
Having invalid logical axiom in the environment when extracting
 may lead to incorrect or non-terminating ML terms.

File "/tmp/QuickChickc52860.ml", line 5239, characters 16-18:
Warning 20: this argument will not be used by the function.
File "/tmp/QuickChickc52860.ml", line 5242, characters 19-21:
Warning 20: this argument will not be used by the function.
File "/tmp/QuickChickc52860.ml", line 5242, characters 22-23:
Warning 20: this argument will not be used by the function.
File "/tmp/QuickChickc52860.ml", line 5242, characters 24-25:
Warning 20: this argument will not be used by the function.
File "/tmp/QuickChickc52860.ml", line 5242, characters 26-28:
Warning 20: this argument will not be used by the function.
File "/tmp/QuickChickc52860.ml", line 5243, characters 58-62:
Warning 20: this argument will not be used by the function.

Also prove things about shrinking

For instance, we could add a proof obligation in the Arbitrary class that then ensures the termination of shrinking. Many shrinking bugs have in fact to do with termination.

Do notation for monads

Mutation test parse error with two consecutive mutations

It seems that the syntax for mutation tests does not work well when there are two consecutive mutations? For example:

            put (c', Response (*!*)c(*! (c+1) *) (*!*) (print_res rs) (*! nonsense *):: s')

QuickChick would report:

Parse error when reading file: ./xxx.v
Fatal error: exception Failure("Error in line 42, position 69")

(line 42 is the code line I have shown above, the 69th character on that line is the space right after (print_res rs).)

Our verification based on "axiomatic semantics"

While we only have 5 axioms, and they are not too hard to believe, they are still 5 axioms too much. A fully verified QuickChick would be a very nice achievement, but it would require significantly more work.

Why does defNumTests affect mutation testing?

Setting defNumTests to a large number causes mutation testing to take really long, even if it takes really little with a small number. What's happening?

Scrap the boilerplate when invoking plugin

Even the simplest example of using the plugin looks like this:

Definition test0 :=
  showResult (quickCheck removeP).
QuickCheck test0.

Ideally, I would like to write only

QuickCheck removeP.

Build Error: Unbound module CAst

Hi, I'm trying to build the latest QuickChick on master branch on my own machine, but I got:

File "src/quickChick.ml4", line 19, characters 15-24: 
Error: Unbound module CAst

Did I miss anything?

PS: I have Coq 8.6.1, OCaml 4.05.0, and ssreflect 1.6.1.

Monolithic extraction

Every time the QuickCheck command is invoked the whole code of our framework gets extracted, although if nothing changed in between it's the same code each time.

Our specs are polluted by default value cases

For instance:

    forall {A} (l: list A) (def : A),
      (semGen (elements def l)) <--> (fun e => List.In e l \/ (l = nil /\ e = def)).

These default arguments weren't needed in Haskell, where functions can just fail in such cases. In Coq we had the choice between making Gen contain option (and that would have been tedious, right?) and taking default arguments. One way to get rid of these defaults would be to use dependent types, but in Coq that means computing with proofs, which is horrible (F* gets this right, but Coq doesn't!).

One suggestion by Leo would be to turn this lemma into two:

    forall {A} (l: list A) (def : A),
      l <> nil ->
      (semGen (elements def l)) <--> (fun e => List.In e l).

and

    forall {A} (def : A),
      (semGen (elements def nil)) <--> eq def.

While this seems indeed like an improvement, I still wonder what's the problem with incorporating the error monad into Gen. We could use partial correctness specs that only talk about the Some case, so at the spec level I think we would get the cleanest thing without dependent types.

Installation

Please add an INSTALL file with more precise installation instructions than in README, with opam but also on an existing installation of Coq without opam.

Parallel compilation is broken

make -j fails, it seems that some dependencies are missing.

By the way, I couldn't find any call to ocamldep in the Makefile. Am I missing something?

Remove (now) useless type annotations

also in IFC

Too much duplication with modules

[Zoe might have discovered a nicer way?]

there is this Declare keyword, it might be helpful
You can do something like that

Module Type M1.
 Parameter X : Type.
 Parameter x : X.
End M1.

Module Type M2.
 Parameter X : Type.
 Parameter y : X.
 Declare Module M3 : M1.
End M2.

https://coq.inria.fr/refman/Reference-Manual004.html
paragraph 2.5.7

choose with invalid interval

I noticed that when calling choose with an empty interval, such as choose (1, 0), QuickChick fails with the message Error: Exited with status 2. I suppose an ideal solution would be to rule out such calls statically (potentially using typeclass magic?), however that may be tricky to implement. This is somewhat analogous to a version of elements without a default, however, this would make choose very cumbersome to use.

In any case, I think choose should produce a more helpful error message in such cases.

Time of failure

I hope to see the time of failing test (so as to locate it in the runtime logs). Is there an instant way of doing so?

More complete generators for IFC case study

Prove that the semantics of bind and return satisfies the monad laws

Keep testing mutants after failure

For now, quickChick stops immediately after an error occurs. It would be helpful to continue on testing other mutants after one case fails.

Error while building stlc example

I have

File "./verif.v", line 403, characters 11-32:
Error: The reference NPeano.Nat.max_le_iff was not found in the current
environment.

when doing make inside examples/stlc, master branch. Coq 8.7.

Clean up ifc-basic?

It might be nice to have ifc-basic as a simple example (especially if it's much simpler than https://github.com/QuickChick/IFC), but this means we need to clean it up first.

Solve issues with forAllShrink and promote

Promote has a bit of a funny semantics:(https://github.com/QuickChick/QuickChick/blob/master/src/SetOfOutcomes.v#L50
so Zoe didn't include it in her thesis. We should see if that's fixable.

Code duplication in GenHigh

Why are the notations defined twice?

Implement (and eventually formalize) splittable randomness

Currently we are cheating in the extraction and using state in OCaml to keep track of the random seed. It would be more interesting verification-wise to get rid of this (very effective) hack and instead implement splittable randomness in Coq and prove things about it. Probably a long term thing ...

Splittable Pseudorandom Number Generators using Cryptographic Hashing
http://publications.lib.chalmers.se/records/fulltext/183348/local_183348.pdf

Fast Splittable Pseudorandom Number Generators
http://2014.splashcon.org/event/oopsla2014-fast-splittable-pseudorandom-number-generators

Random missing bits: only one left semSuchThatMaybe_complete

Making a list of TODOs in the code we might want to fix (or at least should have a fairly good idea that we could fix):
ModuleGen.v:

   (* TODO: We need completeness as well - this is not exact *)
   Hypothesis semSuchThatMaybe_sound:
     forall A (g : G A) (f : A -> bool),
       semGen (suchThatMaybe g f) --->
       (fun o => o = None \/
                 (exists y, o = Some y /\ semGen g y /\ f y)).

Unfinished proofs in GenHigh.v

There are some proofs to be done about non-primitive in GenHigh.v and they are currently commented out in the module signature. I do not expect them to be very hard. Some of them are stated using semGen, but in order to prove them we might need to change this to semSize.

Should use better terminology

QuickChick uses the same terminology as QuickCheck, but in the context of Coq this leads to all sorts of confusion (e.g. property in QuickChick has nothing to do with Prop). For the intro of Zoe's thesis and her defense we devised a much better terminology that should we should now apply to QuickChick. It's a pity we didn't apply it already to the rest of Zoe's thesis, since this makes things super confusing.

Audit Axioms

Audit Axioms, try to reduce their number.

[hritcu@detained plugin]$ grep "Axiom\b" src/*.v
grep "Axiom\b" src/*.v
src/Random.v:Axiom RandomSeed : Type.
src/Random.v:Axiom randomSeedInhabitant : RandomSeed.
src/Random.v:Axiom randomNext     : RandomSeed -> Z * RandomSeed.
src/Random.v:Axiom randomGenRange : RandomSeed -> Z * Z.
src/Random.v:Axiom mkRandomSeed   : Z          -> RandomSeed.
src/Random.v:Axiom newRandomSeed  : RandomSeed.
src/Random.v:Axiom randomSplit    : RandomSeed   -> RandomSeed * RandomSeed.
src/Random.v:Axiom randomSplitAssumption :
src/Random.v:Axiom randomRBool : bool * bool -> RandomSeed -> bool * RandomSeed.
src/Random.v:Axiom randomRBoolCorrect :
src/Random.v:Axiom randomRNat  : nat  * nat  -> RandomSeed -> nat * RandomSeed.
src/Random.v:Axiom ramdomRNatCorrect:
src/Random.v:Axiom randomRInt  : Z * Z    -> RandomSeed -> Z * RandomSeed.
src/Random.v:Axiom ramdomRIntCorrect:
src/Show.v:Axiom show_nat  : nat -> string.
src/Show.v:Axiom show_bool : bool -> string.
src/Show.v:Axiom show_int  : Z -> string.
src/Test.v:Axiom defNumTests    : nat.
src/Test.v:Axiom defNumDiscards : nat.
src/Test.v:Axiom defNumShrinks  : nat.
src/Test.v:Axiom defSize        : nat.

Path issue in the Makefile on trunk

The following command makes too many assumptions about how QuickChick is being installed:

$(V)cp src/quickChickTool $(shell echo $(PATH) | tr ':' "\n" | grep opam | uniq)/quickChick

In particular, it breaks the Nix build. This of course can be patched after download, but I think I'd mention it here, because when the command used above fails, it tries to install into the root path /quickChick.

More configurable printing

There should be a way to configure how much information is produced by quickChickArgs. Currently the chatty argument seems completely ignored. Also, it might also be nice to have a way to display discard information by just setting an argument. There is some code for getting discards in examples/RedBlack/tests.v:

Definition showDiscards (r : Result) :=
  match r with
  | Success ns nd _ _ => "Success: number of successes " ++ show (ns-1) ++ newline ++
                         "         number of discards "  ++ show nd ++ newline
  | _ => showResult r
  end.

But at the moment there is no obvious way to combine that with the QuickChickArgs top-level command. Maybe some sort of configurable printing callback that's instantiated with showResult by default, but that can be changed?

Use lift instead of bind in examples

at ICFP point-free style is appreciated

Type error installing QuickChick with Coq 8.5

While installing QuickChick, downloading the coq8.5 branch and running make, I get the following type error:

File "src/derive.ml4", line 505, characters 14-16:
Error: This expression has type unit but an expression was expected of type string
Makefile.coq:424: recipe for target 'src/derive.cmo' failed
make[1]: *** [src/derive.cmo] Error 2
make[1]: Leaving directory '/home/caleb/git/QuickChick'
Makefile:15: recipe for target 'plugin' failed
make: *** [plugin] Error 2

I have coq version 8.5pl3, ocaml 4.02.3, opam 1.2.2, and math-comp ssreflect 1.6:

$ coqc --version
The Coq Proof Assistant, version 8.5pl3 (November 2016)
compiled on Nov 28 2016 12:27:29 with OCaml 4.02.3
$ ocaml -version
The OCaml toplevel, version 4.02.3
$ opam --version
1.2.2
$ sudo opam list
[...]
coq-mathcomp-ssreflect    1.6  Small Scale Reflection

Any idea what is causing the type error here, and how this could be fixed?

Thanks!
Caleb

Bring back proofs of IFC case study

We broke these proofs in the JFP submission craze (commit 2f9d25043cf), when we were adding and removing instructions. Moreover the changes on the semGen branch will also affect some of these proofs.

Serious performance issue for naive red-black tree generator

STLC case study

Apply the new size-free methodology to the STLC case study.

Fix show for strings

show : string -> string only escapes quotes, but the same should be done for newlines, tabs and other weird or unprintable characters.

Certificate producing meta-program constructing generators for basic datatypes

I think it would make our story stronger if we could claim that our framework is also useful for certificate producing meta-programs, that is, commands written in OCaml that produce artifacts like generators together with their correctness proofs in our framework.

The canonical example I have in mind is an OCaml command for obtaining a default generator for any basic inductive type (basically building generators for various kinds of trees); this command would also produce a proof that the produced generator is complete for the given type. This would work very much like induction principles in Coq: if your datatype is simple enough you can get one for free.

Formally justifying our sets of values abstraction: porting guarantees to concrete instantiation

Once we proved a specification for the set of outcomes semantics of a generator, what exactly can we deduce about the concrete semantics?

Update/complete case studies

Update the case studies in order to use the new "axiom-free" setting. Files/folders that need to be updated:

Tests.v --> Completed
RedBlack/ --> Completed. (No sacrifice on the strength of the statements we prove)
ifc-basic/ (not finished)

No generation of functions (verification interesting)

The code

Require Import Gen.
Definition tryGenFun : Gen (nat -> nat) := arbitrary.

yields the following error

Toplevel input, characters 43-52:
Error: Cannot infer the implicit parameter Arbitrary of
arbitrary.
Could not find an instance for "Arbitrary (nat -> nat)".

This is particularly interesting from a verification perspective, because probably the generators for function types will not be complete (or will need an interesting spec with respect to which they can be complete).

The original QuickCheck paper (Section 3.3) explains how to generate functions: http://www.eecs.northwestern.edu/~robby/courses/395-495-2009-fall/quick.pdf
Is that what the current QuickCheck implementation still does?

Universe inconsistency in examples/RedBlack/testing.v

Trying Leo's new changes it seems that make tests no longer works because of an Universe inconsistency in examples/RedBlack/testing.v:

[hritcu@detained RedBlack]$ make
make -f Makefile.coq
make[1]: Entering directory '/home/hritcu/Projects/quick-chick/pub/plugin/examples/RedBlack'
"coqdep" -c -Q "." QuickChick.RedBlack "verif.v" > "verif.v.d" || ( RV=$?; rm -f "verif.v.d"; exit ${RV} )
"coqdep" -c -Q "." QuickChick.RedBlack "testing.v" > "testing.v.d" || ( RV=$?; rm -f "testing.v.d"; exit ${RV} )
"coqdep" -c -Q "." QuickChick.RedBlack "redblack.v" > "redblack.v.d" || ( RV=$?; rm -f "redblack.v.d"; exit ${RV} )
"coqc"  -q  -Q "." QuickChick.RedBlack   redblack.v
"coqc"  -q  -Q "." QuickChick.RedBlack   testing.v
*** Gave up! Passed only 2403 tests
Discarded: 20000


real    0m5.520s
user    0m5.513s
sys 0m0.003s
File "./testing.v", line 174, characters 0-4:
Error: Universe inconsistency.
make[1]: *** [Makefile.coq:264: testing.vo] Error 1
make[1]: Leaving directory '/home/hritcu/Projects/quick-chick/pub/plugin/examples/RedBlack'
make: *** [Makefile:5: all] Error 2

The rest of make tests seems to work, and I'm not sure what's happening here. Tried starting this in interactive mode but got an even earlier error (on line 6 in examples/RedBlack/testing.v):

Error: Cannot find a physical path bound to logical path matching suffix
<> and prefix QuickChick.RedBlack.

I'm using coq 8.5pl2 and latest proof general.