stedolan / ocaml-afl-persistent



persistent-mode afl-fuzz for ocaml

License: MIT License

OCaml 74.76% Shell 25.24%


ocaml-afl-persistent's Issues

build fails when bash is not in `/bin`

Because of the #! :

ocaml-afl-persistent/build.sh

Line 1 in 723b19c

#!/bin/bash

"ocamlopt: not found" with OCaml 5 on several platforms

It seems that afl-persistent can't be installed on platforms without ocamlopt. With OCaml 5, this includes arm32, ppc64, s390x and x86_32. This in turn means that CI fails for these platforms, even when AFL isn't being used:

#=== ERROR while compiling afl-persistent.1.3 =================================#
# context     2.1.4 | linux/arm32 | ocaml-base-compiler.5.0.0 | file:///home/opam/opam-repository
# path        ~/.opam/5.0/.opam-switch/build/afl-persistent.1.3
# command     ~/.opam/5.0/.opam-switch/build/afl-persistent.1.3/./build.sh
# exit-code   127
# env-file    ~/.opam/log/afl-persistent-7-f6ccda.env
# output-file ~/.opam/log/afl-persistent-7-f6ccda.out
### output ###
# + dirname /home/opam/.opam/5.0/.opam-switch/build/afl-persistent.1.3/./build.sh
# + cd /home/opam/.opam/5.0/.opam-switch/build/afl-persistent.1.3/.
# + rm -rf _build/
# + rm -f afl-persistent.config
# + mkdir _build
# + cd _build
# + ocamlc=ocamlc -g -bin-annot
# + ocamlopt=ocamlopt -g -bin-annot
# + echo print_string "hello"
# + ocamlopt -dcmm -c afl_check.ml
# + grep -q caml_afl
# + afl_always=false
# + ocamlopt -afl-instrument afl_check.ml -o test
# + [  = hello ]
# + ocamlopt -version
# /home/opam/.opam/5.0/.opam-switch/build/afl-persistent.1.3/./build.sh: 23: ocamlopt: not found
# + [  = 4.04.0+afl ]
# + afl_available=false
# + cat
# + cp ../aflPersistent.mli .
# + [ false = true ]
# + cp ../aflPersistent-stub.ml aflPersistent.ml
# + ocamlc -g -bin-annot -c aflPersistent.mli
# + ocamlc -g -bin-annot -c aflPersistent.ml
# + ocamlc -g -bin-annot -a aflPersistent.cmo -o afl-persistent.cma
# + ocamlopt -g -bin-annot -c aflPersistent.ml
# /home/opam/.opam/5.0/.opam-switch/build/afl-persistent.1.3/./build.sh: 50: ocamlopt: not found

Which version is it?

Both the opam file and the META file claim 1.2, but CHANGES.md and the github tag say 1.3.

Port to Dune

As discussed offline, it would be useful to port this package to use dune (e.g. in order to be able to pack it into a duniverse).

CC @avsm.

a release would be great

Hi Stephen,

I wanted to install afl-persistent via opam right now, but on my FreeBSD system this fails, since the latest release (1.2) uses #!/bin/bash in build.sh -- a binary which does not exist on my system. Since #4 this issue is fixed in master, but not in the released opam package. It would be great to get a new release out. :)

Thanks!

config.sh issue with process sandboxing on macos

In an opam-repository PR, we observed the following error:

#=== ERROR while compiling afl-persistent.1.4 =================================#
# context              2.2.0~alpha2 | macos/x86_64 | ocaml-base-compiler.4.14.1 | file:///Users/mac1000/opam-repository
# path                 ~/.opam/4.14.1/.opam-switch/build/afl-persistent.1.4
# command              ~/.opam/opam-init/hooks/sandbox.sh build ./config.sh
# exit-code            1
# env-file             ~/.opam/log/afl-persistent-61874-2bd2b7.env
# output-file          ~/.opam/log/afl-persistent-61874-2bd2b7.out
### output ###
# ./config.sh: line 17: cannot create temp file for here document: Operation not permitted

As far as we can tell, the issue is that macOS ships an old version of bash which creates a temporary file for the heredoc (the part between <<EOF and EOF), and creating that file is not allowed by the sandbox in /.

Suggested fix:
Replace cd / by cd .. on line 14 in config.sh

Soliciting feedback on a (currently failing) attempt to improve "stability" by tweaking the compiler

I noticed that afl-persistent and crowbar have run into issues with
"stability" that are to be handled at the level of the compiler code
generation strategy:

ocaml/ocaml#1345
(disabling class caching in afl-instrument mode)

and

stedolan/crowbar#14
(yet-unfixed stability issue with `Lazy.force`)

are instances of this.

Energized by my use of Crowbar during the excellent Mirage retreat
this year, I have wondered if it was something I could help with, and
tried to implement pieces of a solution at the level of the OCaml
compiler. Specifically I would like to make it easy to say, in the
compiler, "don't instrument this branch" or something like that, and
then this opt-out mechanism could be used to solve those stability
issues. This is the dream; what I have right now is a branch that does
what I wanted it to do, but does not solve stability issues in any way
that I was able to measure (so maybe I wanted the wrong thing). So,
very Work In Progress. I'm posting here to solicit feedback and
comments (cc @stedolan and @yomimono); in particular, I'm curious if
someone thinks that this approach has a chance of ever working.

The branch is afl on my personal fork:

https://github.com/gasche/ocaml/tree/afl

there are commits that describe what it does

https://github.com/gasche/ocaml/commits/afl
https://github.com/gasche/ocaml/compare/trunk...gasche:afl?expand=1

Understanding of the problem

The stability issues, if I understand correctly (but then I don't
actually understand what "stability" means in afl-fuzz parlance), come
from code that takes different control paths when it is executed
several times on the same input, because the mechanism that changes
the control flow is outside the view of the tool. (In the classes case, it
is caching; in the Lazy.force case, if @stedolan's analysis of it is
correct, it is the non-deterministic effect of a GC removing a Forward
tag before or after the user tries to force the already-evaluated
thunk for a second time.)
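
To make the caching case concrete, here is a toy model of what I understand by "unstable". It is only a sketch: the `trace` list stands in for afl's coverage map, and `get_table`, `hit` and `run_once` are names made up for the illustration, not anything from the compiler or from afl-persistent.

```ocaml
(* Toy model: record which "branches" a run takes, then compare two runs
   of the same function. [trace] is a stand-in for the coverage map. *)
let trace : string list ref = ref []
let hit label = trace := label :: !trace

(* A cached computation, loosely in the spirit of a method table:
   the first call takes the "compute" branch, later calls "cached". *)
let table : int array option ref = ref None

let get_table () =
  match !table with
  | Some t -> hit "cached"; t
  | None ->
    hit "compute";
    let t = Array.init 4 (fun i -> i * i) in
    table := Some t;
    t

(* Run [f] once and return the branches it took, in order. *)
let run_once f =
  trace := [];
  ignore (f ());
  List.rev !trace

let first = run_once get_table
let second = run_once get_table

(* The two runs had identical input but took different paths. *)
let stable = (first = second)
let () = Printf.printf "stable: %b\n" stable
```

In persistent mode the process is not restarted between runs, so the cache survives from one run to the next, which is how the same input can produce two different traces.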

So if we don't want to change how the compiler produces code (I would
rather not), we have to find a way to selectively disable the
instrumentation points that are involved in these unstable
conditionals. For class caching, I had the impression that if we could
make sure that neither the "has the method table already been
computed" test nor any of the method-table-computation code was
instrumented, we would recover stability. For the lazy forcing, I had
the impression that if the only instrumentation points to run were the
ones involved in forcing the thunk, and one when the final value is
returned, then that would be correct (I'm not convinced anymore).

Design of a solution attempt

It seems impossible or at least extremely fragile to try to annotate
individual control-flow expressions at the level of OCaml or Lambda
code, and hope to robustly control instrumentation with that, because
instrumentation happens at the Cmm level and further control-flow
constructs may be inserted by the compiler in the middle.

So I think one would need a way to disable all instrumentation within
one portion (defined as static code zone or dynamic fragment of
execution) of the program. (It would also be useful to be able to
share instrumentation points across two program locations, but that
sounded more difficult to implement so I haven't worked on it.)

My solution attempt is to introduce two primitives that would look
essentially as follows:

val suspend_afl : unit -> unit
val restore_afl : unit -> unit

Between a call to suspend_afl and the next call to restore_afl, no
instrumentation happens.
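
As a sketch of how user code might pair the two calls safely, assuming the primitives existed: the `suspend_afl` and `restore_afl` below are stand-ins that only flip a flag (the real ones would be compiler primitives), and `with_afl_suspended` is a hypothetical helper, not part of my branch. It relies on `Fun.protect`, so it needs OCaml 4.08 or later.

```ocaml
(* Stand-ins for the proposed primitives: here they only flip a flag,
   so that the bracketing behaviour can be observed. *)
let suspended = ref false
let suspend_afl () = suspended := true
let restore_afl () = suspended := false

(* Hypothetical helper: run [f] with instrumentation suspended and
   restore it afterwards, even if [f] raises. *)
let with_afl_suspended f =
  suspend_afl ();
  Fun.protect ~finally:restore_afl f

(* Inside the bracket the flag is set ... *)
let inside = with_afl_suspended (fun () -> !suspended)

(* ... and it is cleared again afterwards. *)
let after = !suspended

(* An exception escaping from [f] still restores instrumentation. *)
let after_exn =
  (try with_afl_suspended (fun () -> failwith "boom")
   with Failure _ -> ());
  !suspended
```

A wrapper like this would avoid the easy mistake of leaving instrumentation off on an exceptional path.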

I implement these two primitives in the commits

- "add {suspend,restore}afl primitives to selectively silence path collection"
  (this propagates the primitives from the frontend to cmm,
  where I expect that they will be given a semantics in afl_instrument.ml)
- "selectgen: interpret {suspend,restore}afl as no-ops"
  (this ignores them in instruction selection, that is after cmm)

(I'm using commit names instead of hashes as I expect the hashes to
change as I keep rebasing the PR.)

Attempt 1: disabling instrumentation dynamically

The dumbest possible implementation for this is to have a piece of
global state, a boolean, to indicate whether instrumentation is
currently active or suspended:

- the {suspend,restore} operations are compiled into writes that
  change that global state
- the instrumentation code is changed to first check this global flag
  before performing any action

I implemented this approach in

"semantics of {suspend,restore}afl"

(I realize as I'm writing this report that this implementation does
not bracket well: calling "suspend" twice and then "restore" only once
re-enables instrumentation. Incrementing and decrementing an integer
would work better. Easy to implement.)
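
The difference between the boolean and the integer can be sketched in plain OCaml. This is a model only: the real thing would be Cmm-level reads and writes of a global, and the module names `Flag` and `Depth` are invented for the illustration.

```ocaml
(* Boolean state: "restore" always re-enables instrumentation,
   no matter how many "suspend" calls came before it. *)
module Flag = struct
  let active = ref true
  let suspend () = active := false
  let restore () = active := true
  let is_active () = !active
end

(* Integer state: instrumentation is active only when every
   "suspend" has been matched by a "restore". *)
module Depth = struct
  let depth = ref 0
  let suspend () = incr depth
  let restore () = if !depth > 0 then decr depth
  let is_active () = !depth = 0
end

(* suspend; suspend; restore: one suspension is still outstanding. *)
let flag_result =
  Flag.suspend (); Flag.suspend (); Flag.restore ();
  Flag.is_active ()

let depth_result =
  Depth.suspend (); Depth.suspend (); Depth.restore ();
  Depth.is_active ()

let () =
  Printf.printf "flag active: %b, depth active: %b\n"
    flag_result depth_result
```

With the flag, instrumentation comes back on while one suspension is still pending; with the counter it stays off until the second restore.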

Attempt 2: disabling instrumentation statically

When I started thinking about {suspend,restore}_afl, I had in mind
something closer to static annotations that would direct the
non-emission of afl instrumentation code. Basically the idea is that
after you encounter suspend_afl, you stop emitting instrumentation
code, and you start again on the next restore_afl.

The naive implementation of this assumes that function calls always
start with instrumentation enabled, so that instrumentation always
restarts during inner function calls (so it is not equivalent to the
dynamic semantics above). This is not good for the class caching
scenario, where we want the no-instrumentation zone to cross function
calls. But it would be easy to add a blacklist of specific functions
from CamlinternalOO to start in suspended mode, or to just add
well-placed (suspendafl) instructions at the beginning of those
functions (as a ppx extension or what not).
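
The static policy can be sketched on a toy instruction list. This is a strong simplification: real Cmm is an expression tree, not a flat list, and the `instr` type and `strip_suspended` function are invented for the illustration rather than taken from the branch.

```ocaml
(* Toy flat IR standing in for a function body. *)
type instr =
  | Suspend              (* stands for a suspend_afl marker *)
  | Restore              (* stands for a restore_afl marker *)
  | Instrument of int    (* an instrumentation point, with its id *)
  | Op of string         (* any other instruction *)

(* Drop every instrumentation point that sits between a Suspend and the
   next Restore. The markers themselves are removed, since they are
   no-ops after this pass. Each function body starts enabled, which is
   the "naive" assumption described above. *)
let strip_suspended (body : instr list) : instr list =
  let rec go enabled = function
    | [] -> []
    | Suspend :: rest -> go false rest
    | Restore :: rest -> go true rest
    | Instrument id :: rest ->
      if enabled then Instrument id :: go enabled rest
      else go enabled rest
    | Op s :: rest -> Op s :: go enabled rest
  in
  go true body

let example =
  strip_suspended
    [ Instrument 1; Op "a"; Suspend; Instrument 2; Op "b";
      Restore; Instrument 3 ]
```

Because the pass restarts in the enabled state for every function body, a callee invoked from inside a suspended zone is still instrumented, which is exactly the limitation for class caching.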

The way this "static" policy is implemented may qualify as a "cool
hack", as in: I think it's cool, but it is also a horrible
hack. Anyway, this was just for experiment, and I checked the output
with -dcmm and it seems to be doing what I want:

"afl: make the afl_status a purely static property"

Negative results

I haven't tried to measure stability of object method table
caching. I tried to measure stability of thunk forcing, and the
results are terrible: none of what I have done improves stability, in
fact it decreases it.

I made two attempts at artfully placing {suspend,restore}_afl
instructions in the Lazy.force generated code:

"try to use {suspend,restore}afl primitives to stabilize Lazy forcing"
"{suspend,restore}afl in lazy_force: try to better bracket the instructions"

The code emitted by the compiler for Lazy.force looks as follows:

match Obj.tag obj with
| Lazy_tag -> force_the_thunk_of obj
| Forward_tag -> obj.(0)
| _ -> obj

and I tried to change it as follows (in the first commit):

suspend_afl ();
match Obj.tag obj with
| Lazy_tag -> restore_afl (); force_the_thunk_of obj
| Forward_tag -> obj.(0)
| _ -> obj

or as follows (in the second commit)

suspend_afl ();
let result =
  match Obj.tag obj with
  | Lazy_tag -> restore_afl (); force_the_thunk_of obj
  | Forward_tag -> obj.(0)
  | _ -> obj
in
restore_afl ();
result

Finally, I decided that what I really wanted was to have both the
Forward_tag and the _ branches assigned the same instrumentation
point, so I wrote a different version of the lazy-forcing code
generation strategy that looks like this:

if Obj.tag obj = Lazy_tag
then begin
  force_the_thunk_of obj
end else begin
  suspend_afl ();
  let result =
    if Obj.tag obj = Forward_tag then obj.(0) else obj
  in
  restore_afl ();
  result
end

None of that seems to work: without my changes, my reproduction code
(below) sports a stability of 57.14%, and some of what I've done brings
stability down to 0%, or 33%, or at best 52.94%.

It was not easy for me to find code to reproduce the stability issue
for laziness; below is my repro-case. Is it sensible?

let test = lazy (print_endline "foo")

let f () =
  ignore (List.init 10 (fun _ -> Lazy.force test))

let _ = AflPersistent.run f
