Giter Site home page Giter Site logo

efftester's Introduction

Effect-Driven Compiler Tester:

This is a prototype implementation of a compiler testing approach described in the paper:

"Effect-Driven QuickChecking of Compilers", Jan Midtgaard, Mathias Nygaard Justesen, Patrick Kasting, Flemming Nielson, Hanne Riis Nielson, ICFP'17, http://janmidtgaard.dk/papers/Midtgaard-al:ICFP17-full.pdf (full version)

We suggest to generate programs following a dedicated type and effect system that determines whether a program has evaluation-order dependence. By reading the type-and-effect-system relation bottom up, in a goal-directed manner we can generate programs that are free of evaluation-order dependence. We can thus run such programs with both the bytecode and native code backends of OCaml, log their observable behaviour to a file, and diff the two behaviours.

Building and running:

The tester requires OCaml, the QCheck library, and a *nix-like platform with diff and /dev/null. The below output is obtained with QCheck version 0.5.2. If compiled as is with a later QCheck version (known to produce more output) the resulting program may produce a somewhat garbled output.

To build simply run make.

Now run effmain.native -v (the native code build is recommended for speed reasons). This should produce output along these lines:

$ ./effmain.native -v
random seed: 228021772
...........................................................................x...x...x...x...x...x...x...x...x...x....x...x..x....x....x.....x..x...x..x......x.....x.....x.....x.....x.....x.....x.....x.....x.....x.....x.....x.....x.....x.....
law bytecode/native backends agree: 76 relevant cases (76 total)

  test `bytecode/native backends agree`
  failed on ≥ 1 cases:
  Some (let o = (let j = List.hd [] in fun w -> int_of_string) "" in 0) (after 32 shrink steps)
  
failure (1 tests failed, ran 1 tests)

Interestingly, the produced counterexamples such as let o = (let j = List.hd [] in fun w -> int_of_string) "" in 0 above illustrate observable differences in the executables produced by the two backend, aka, bugs. This particular one is a variant of MPR#7531 listed below.

You can provide a known seed with the flag -s: ./effmain.native -v -s 228021772 to reproduce an earlier run.

When invoked without a command-line seed QCheck is seeded differently for each run and will therefore (most likely) produce different results for each of them:

  • each run may reveal different errors or
  • no bugs are found in the given run.

The reproducability with respect to a given seed should be taken with a large grain of salt. A seed may still yield different results depending on the underlying architecture, OS, bit-width, OCaml-version, and QCheck-version. Furthermore it is (all too) easy to invalidate previous seeds by any adjustments to

  • the generator -- as the adjusted version may utilize/interpret the old seed differently,
  • the generator's initial environment -- as choices herein may come out differently for a smaller/bigger/different environment,
  • the shrinker -- as the adjusted version may shrink a bigger problematic counterexample to a different minimal example.

The generated programs are written to testdir/test.ml and the compiled versions are named testdir/native and testdir/byte. These are overwritten on each iteration. As such, the last version remaining rarely represents a counterexample. Instead it represents the shrinker's last (failed) attempt to shrink the minimal counterexample one step further.

The current version tests OCaml version 4.04.0's native-code compiler with flag -O3 against the bytecode compiler. It was originally tested with version 4.02.3 which doesn't support the -O3 flag. So far the tester has found the same bugs (below) related to erroneous optimization in 4.02.3 as in 4.04.0 with the -O3 flag.

If you find more errors using this approach please let me know.

Bugs found:

MPR#7531 https://caml.inria.fr/mantis/view.php?id=7531 Delayed effects in partially applied functions

MPR#7533 https://caml.inria.fr/mantis/view.php?id=7533 Wrong code generation of e / e' and e mod e' with effectful e

MPR#7201 https://caml.inria.fr/mantis/view.php?id=7201 Wrong optimization of 0 / e (reported by way of Jean-Christophe Filliatre)

#8864 ocaml/ocaml#8864 Left-shift of 0 yields negative number (found by Murilo Giacometti Rocha)

A quick port to js_of_ocaml revealed the following 2 bugs (reported together):

ocsigen/js_of_ocaml#584 function difference+equality exception removed

A quick port to BuckleScript revealed the following 8 bugs:

rescript-lang/rescript-compiler#1667 failure: internal compiler Failure

rescript-lang/rescript-compiler#1692 BuckleScript produces incorrect JavaScript

rescript-lang/rescript-compiler#1759 comparison+difference of units yields different results

rescript-lang/rescript-compiler#1760 div+mod: Division_by_zero removed

rescript-lang/rescript-compiler#1761 comp: function comparison succeeds

rescript-lang/rescript-compiler#1762 order: bool_of_string erroneously optimized

Known bugs recreated:

GPR#954 ocaml/ocaml#954 Wrong optimisation of 0 mod e (fixes both the division and mod cases)

GPR#956 ocaml/ocaml#956 Keep possibly-effectful expressions when optimizing multiplication by zero

Observing the generated programs:

To invoke the generator directly, first make eff and then start ocaml from within this directory (this will load modules suitably via .ocamlinit).

To generate an arbitrary program you can now invoke Gen.generate1 term_gen.gen (wrapped with a suitable string coersion to make the generator's output more understandable). The output changes per each invocation:

$ ocaml
        OCaml version 4.04.0

[some lines about loaded libs omitted]

# Print.option (toOCaml ~typeannot:false) (Gen.generate1 term_gen.gen);;
- : string = "Some (exit ((fun n -> (-98051756132636271)) string_of_bool))"

# Print.option (toOCaml ~typeannot:false) (Gen.generate1 term_gen.gen);;
- : string =
"Some ((mod) (pred (let h = if true then exit ((/) (if false then 1830787755246062127
 else (-2895157864674163253)) (lnot ((asr) (-3089269914618536456) 93))) else 
 let w = print_endline (let i = lnot 996022529208063915 in string_of_int 25) in 
 string_of_bool (exit (abs (-427726557501168681))) in ( * ) (pred (lnot (List.hd 
 (List.hd (exit 991))))) (lnot 5))) ((lsr) 225 755))"

To observe the distribution of the effect-driven program generator you can build a different target with make stat which results in a effstat.native executable. This program runs a constant true test over generated terms while logging the size of each term.

A bashscript runstat.sh will then run ./effstat.native -v, log the output, process it with sed, and pass the output to the program 'ministat'.

runstat.sh itself requires a *nix platform with bash and 'ministat' installed.

When run, runstat.sh should produce output along these lines (you can adjust the 50 generated terms in src/effstat.ml):

$ ./runstat.sh 
x stat_size.txt
+-------------------------------------------------------------------------------------------------------------+
|             x                                                                                               |
|            xx                                                                                               |
|            xx                                                                                               |
|           xxx                                                                                               |
|          xxxx                                                                                               |
|          xxxx                                                                                               |
|          xxxx                                                                                               |
|          xxxx                                                                                               |
|          xxxx    x x              xx                                                                        |
|          xxxxx   xxx    x x      xxx                 x                            x x     x                x|
||____________M_________A_____________________|                                                               |
+-------------------------------------------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  50             1           639            16         80.82     148.41691

The very latest version of the QCheck library can provide similar output and thereby removes the need for the above bashscript.

efftester's People

Contributors

jmid avatar olleolleolle avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

efftester's Issues

Add the paper

Would you be willing to add the pdf of the paper to this project?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.