hbgit / map2check Goto Github PK

View Code? Open in Web Editor NEW

27.0 7.0 9.0 1.04 GB

Map2Check: Finding Software Vulnerabilities

Home Page: https://map2check.github.io

License: GNU General Public License v2.0

Python 2.69% C 12.50% Shell 1.32% CMake 2.66% C++ 24.09% Dockerfile 0.21% Roff 2.60% SWIG 53.74% Assembly 0.19%

c software-verification pointer-analysis reachability-analysis overflow

map2check's Introduction

Map2Check

Finding Software Vulnerabilities using Symbolic Execution and Fuzzing

Map2Check is a bug hunting tool that automatically generating and checking safety properties in C programs. It tracks memory pointers and variable assignments to check user-specified assertions, overflow, and pointer safety. The generation of the tests cases are based on assertions (safety properties) from the code instructions, adopting the LLVM framework version 6.0, LibFuzzer, KLEE to generate input values to the test cases generated by Map2Check. Additionally, we have adopted Crab-LLVM tool to computes inductive invariants for LLVM-based languages.

Extra documentation is available at https://map2check.github.io

Requirements for using the tool

To use the Map2Check tool is necessary a Linux 64-bit OS system. In the linux OS, you should install the requirements, typing the commands:

$ sudo apt install python-minimal
$ sudo apt install libc6-dev

Our binary tool is avaliable at https://github.com/hbgit/Map2Check/releases

Install Map2Check

In order to install Map2Check on your PC, you should download and save the map2check zip file on your disk from https://github.com/hbgit/Map2Check/releases. After that, you should type the following command:

$ unzip map2check-rc-v7.3-svcomp20.zip
$ cd map2check-rc-v7.3-svcomp20

Running the tool

Map2Check can be invoked through a standard command-line interface. Map2Check should be called in the installation directory as follows:

$ ./map2check --memtrack svcomp_960521-1_false-valid-free.c

In this case, --memtrack is the option to check for memory errors. For help and others tool options:

$ ./map2check --help

When you use a LLVM bytecode as input for the tool, be sure to add -g flag when generating the file, it is not required, but map2check will provide better info (like line numbers).

How to build Map2Check?

You can build Map2Check using our Dockerfile by clone our repository:

$ git clone https://github.com/hbgit/Map2Check.git
$ cd Map2Check
$ git submodule update --init --recursive
# Build docker image to compile Map2Check
$ docker build -t hrocha/mapdevel --no-cache -f Dockerfile .
$ docker run --rm -v $(pwd):/home/map2check/devel_tool/mygitclone:Z --user $(id -u):$(id -g) hrocha/mapdevel /bin/bash -c "cd /home/map2check/devel_tool/mygitclone; ./make-release.sh; ./make-unit-test.sh"

More details at https://map2check.github.io/docker.html

How to run the regression testing

You should build the Benchexec docker image that is available in our repository in the submodule:

$ cd Map2Check
$ docker build -t hrocha/benchexecrun --no-cache -f utils/benchexecrun/Dockerfile utils/benchexecrun/
# Running the regression testing
$ ./make-regression-test.sh t

Instructions for SV-COMP'20

Use the map2check-wrapper.py script in the Map2Check binary directory to verify each single test-case.

Usage:

$ ./map2check-wrapper.py -p propertyFile.prp file.i

Map2Check accepts the property file and the verification task and provides as verification result: TRUE + Witness, [FALSE|FALSE(p)] + Witness, or UNKNOWN. FALSE(p), with p in {valid-free, valid-deref, valid-memtrack, valid-memcleanup}, means that the (partial) property p is violated. For each verification result the witness file (called witness.graphml) is generated Map2Check root-path folder. There is timeout of 897 seconds set by this script, using "timeout" tool that is part of coreutils on debian. If these constraints are violated, it should be treated as UNKNOWN result.

Authors

Maintainers:

Herbert O. Rocha (since 2014), Federal University of Roraima, Brazil - https://hbgit.github.io/
Rafael Menezes (since 2016), Federal University of Roraima, Brazil - https://rafaelsa94.github.io/

Questions and bug reports:
E-mail: [email protected]

      .-.          
      /v\
     // \\    > L I N U X - GPL <
    /(   )\
     ^^-^^

map2check's People

Contributors

Stargazers

Watchers

Forkers

lucasccordeiro loongwalker victordeluca gitter-badger 85lx c0de3 fatimahkj pombredanne guilhermebn198

map2check's Issues

Fix incorrect result in ReachSafety-ControlFlow

Verification tasks from SV-COMP

One possible solution for the incorrect true results is to adopt a redundant check.

Add program invariant result in ERROR
libfuzzer out: fuzzing was not performed, you have only executed the target code on a fixed set of inputs.
../../sv-benchmarks/c/ntdrivers-simplified/cdaudio_simpl1.cil-1.c
- Checking KLEE options
- Bug in process KLEE result "KLEE: WARNING: error opening file"
../../sv-benchmarks/c/ntdrivers-simplified/floppy_simpl4.cil-1.c
- Bug in process KLEE result "KLEE: WARNING: error opening file"
../../sv-benchmarks/c/ntdrivers-simplified/floppy_simpl4.cil-1.c
- Bug in process KLEE result "KLEE: WARNING: error opening file"

KLEE was not able to find the bug:

Improving counterexample generation

Identify failures
Fix bugs on counterexample generation
Propose a method to simplify the counterexample size
Coding a method to simplify the counterexample size

LLVM Optimizations with KLEE

Is your feature request related to a problem? Please describe.
Not related to a problem. Currently, we default to -O3 optimizations and let KLEE decides which solver to be used. The -O3 may not be the best for Symbolic Execution and the SMT formula generated by KLEE may not be the best for the selected solver.

Describe the solution you'd like
We should select which solver and the best optimization for Symbolic Execution.

Describe alternatives you've considered
Extend Map2Check to have a class to select which optimizations will be done.

Additional context
The paper Studying the influence of standard compiler optimizations on symbolic execution contains details on how LLVM optimizations affects KLEE and the solvers

Add option to use a specific generator

Is your feature request related to a problem? Please describe.
I'd like to run only KLEE or only libFuzzer (or any future generator)

Describe the solution you'd like
Add the --generator option with klee and libfuzerr as valid values

Describe alternatives you've considered
None

Additional context
It should be simple, the major change will be at the wrapper of the interaction between klee and libFuzzer will have to be refactored.

Add a argument to specify the name of the target function

Is your feature request related to a problem? Please describe.
FuseBMC is trying to use map2check to create coverage for C programs. However, its targets consists in functions named GOAL_N (where N is a number).

Describe the solution you'd like
Add the option --target-function-name which will set the target function name for map2check. This will require patches over the TargetPass and over the opt calls.

Describe alternatives you've considered
The tool that is using map2check could instead rename the current target into __VERIFIER_error()

Additional context
Some reference:

CommandLine: http://llvm.org/docs/CommandLine.html
This answer: https://stackoverflow.com/a/13668358

Skip call nondet generator

Skipping to call nondet generator

I would like to suggest to avoid call fuzzer or symbolic execution when a program does not have a nondet function.

Line 422 map2check.ccp. I guess is possible to apply a static analysis to identify nondet functions call

Adopt a code slicer

Adopt a code slicer
Aiming to improve the code exploration and avoiding states in the program unnecessary to prove a given property, you could adopt a code slicer.

Checking out some slicers:

Frama-C: https://github.com/Frama-C/Frama-C-snapshot
DG: https://github.com/mchalupa/dg
GIRI: https://github.com/liuml07/giri

Clean up branches

Checkout the following branches, aiming to update or remove from the project:

Fix incorrect results in ReachSafety-Arrays

Bug in reachability analysis

Fix build crab-llvm

Update fork
Port sea-dsa fork to LLVM 6.0
Frozen build adopting commits
Update crab-llvm fork from hbgit user on github
Fix Crabllvm build broken

Support to check floating-point arithmetic

KLEE-Float
Check out https://srg.doc.ic.ac.uk/projects/klee-float/

That is an extension project to the KLEE symbolic execution engine that supports reasoning about floating-point arithmetic. For more details see:

Floating-Point Symbolic Execution: A Case Study in N-version Programming. Daniel Liew, Daniel Schemmel, Cristian Cadar, Alastair Donaldson, Rafael Zähl, Klaus Wehrle. IEEE/ACM International Conference on Automated Software Engineering (ASE 2017)

Fix incorrect results in NoOverflows-Other

Verification tasks in SV-COMP

../../sv-benchmarks/c/bitvector/byte_add-2.i

Fix check for shift operation r3 << 24U, see ISO/IEC 9899:2011 6.5.7#4, in that case r3<<24U will exceed INT_MAX
../../sv-benchmarks/c/bitvector/byte_add_1-2.i

Not identified bug root, but we have modeling issues with shift operation
../../sv-benchmarks/c/bitvector/byte_add_2-1.i

Not identified bug root, but we have modeling issues with shift operation

Updating front-end

Review tool input options
Improve how the tool modes are executed
Review the counterexample generation

Support the reach_error() function

Add support for new reach function
Checking out sosy-lab/sv-benchmarks@28369c9

Fix incorrect results in NoOverflows-BitVectors

Verification tasks from SV-COMP

../../sv-benchmarks/c/signedintegeroverflow-regression/AdditionIntMax.i

It hasn't nondet, thus you could improve the analysis turn off libfuzzer and klee for nondet
the sum result (2147483647 + 1) is not allowed in int type of sizeof(int) = 4
../../sv-benchmarks/c/signedintegeroverflow-regression/Multiplication-2.i

It hasn't nondet, thus you could improve the analysis turn off libfuzzer and klee for nondet
the mult result (65536 * 32768) is not allowed in int type of sizeof(int) = 4
../../sv-benchmarks/c/signedintegeroverflow-regression/NoConversion.i

It hasn't nondet, thus you could improve the analysis turn off libfuzzer and klee for nondet
the sum result long x = 2147483647 + 1; is not allowed in long type of sizeof(long or long int) = 4

Update LLVM to 8.0

Update to LLVM 8.0

Aiming to reduce the docker image and apply new futures from LLVM 8

KLEE
Docker image dev-llvm_6.0 @ c290835
Docker image base_image_map2check @ b1040dd
CrabLLVM - CLAM : https://github.com/seahorn/crab-llvm/tree/llvm-8.0
make-release.sh

Improving testing tool

Create a new flow to test tool
Adopting benchexec to test regression
Setting up Travis to run the tests
Create a docker to run benchexec for regression test
Create testing for production
Create testing for deploy

CRABLLVM generates UNKNOWN result

Check out this sample: https://raw.githubusercontent.com/sosy-lab/sv-benchmarks/master/c/ntdrivers-simplified/cdaudio_simpl1.cil-1.c

I would suggest to you add a checker result when CrabLLVM was adopted in the verification.

Improving the Dockerfile documentation

Add the dockerfile base image to build the tool
Improve the dockerfile documentation to build the tool
Update the docker image on docker hub

Can symbolic execution tool KLEE run in parallel?

In the current experiments, KLEE adopt only one core from CPU. I propose to adopt the following solution:

This is mainly because when we try to check program correctness.

Fix incorrect results in MemSafety-MemCleanup

Verification tasks from SV-COMP

../../sv-benchmarks/c/list-ext3-properties/dll_nullified-1.i

You need to improve memory leak check, bug root new_head->data_2 = __VERIFIER_nondet_int() == len? 1 : 0;
expected_verdict: false but the result is TRUE

Caller Refactoring

We should use the Clang/LLVM API instead of calling external binaries. For example, the older version used this

Places where refactoring is needed:

Also, some other fixes for caller (we should open an issue for Caller class only):

Originally posted by @RafaelSa94 in #8 (comment)

Update strategy to apply smt solver

Checkout https://smt-comp.github.io/2019/results

Is it possible to run different smt solver in parallel?

Is it possible to update the default smt solver?
Try Yices 2.6.2 (http://yices.csl.sri.com/) or Boolector (https://boolector.github.io/)

Change code on:
https://github.com/hbgit/Map2Check/blob/develop/modules/frontend/map2check.cpp#L153
https://github.com/hbgit/Map2Check/blob/develop/modules/frontend/map2check.cpp#L208
https://github.com/hbgit/Map2Check/blob/develop/modules/frontend/caller.cpp#L309

Add new subcategories from SV-COMP

New subcategories from SV-COMP not supported yet

Apply minor changes to support:

ReachSafety-ECA
ReachSafety-Floats
ReachSafety-ProductLines
ReachSafety-Sequentialized
SoftwareSystems-AWS-C-Common-ReachSafety
SoftwareSystems-BusyBox-MemSafety
SoftwareSystems-BusyBox-NoOverflows
SoftwareSystems-DeviceDriversLinux64-ReachSafety
SoftwareSystems-OpenBSD-MemSafety

Add support to Termination Analysis

Termination Analysis

Add support to verify the category Termination-MainControlFlow on SV-COMP that contains programs for which termination should be decided.

I suggest you check out the follows paper:

T2: Temporal Property Verification
Termination-Checking for LLVM Peephole Optimizations
Termination Analysis of C Programs Using Compiler Intermediate Languages

Also suggest you to check SeaHorn tool.

Fix incorrect results in MemSafety-Other

Verification tasks from SV-COMP

../../sv-benchmarks/c/memsafety-ext3/derefInLoop1.c

Expected FALSE, but result is TRUE
Incorrect mapping = BUG

../../sv-benchmarks/c/memsafety-ext3/scopes1.c

LOC 16, not checked ptr scope

../../sv-benchmarks/c/memsafety-ext3/scopes3.c

Incorrect mapping = ptr not initialized

../../sv-benchmarks/c/memsafety-ext3/scopes5.c

Incorrect mapping = incorrect initialized

../../sv-benchmarks/c/pthread-memsafety/list1.i

pthread not modelling, I suggest you to create a new issue

../../sv-benchmarks/c/memsafety-bftpd/bftpd_2.i

Incorrect modelling struct and void

KLEE crash incorrect results

Check KLEE output after the map2check verification

KLEE crashes and incorrect TRUE results for Map2Check mode (e.g., target generate prt error).

I would like to suggest you check out the KLEE output before to show TRUE answer for the verification task.

Improving the code style

Add code style scripts
Add documentation to apply code style
Update build script to validate code style

Update KLEE

I would like to suggest you update KLEE and uClibc, check out:

KLEE: https://github.com/klee/klee/releases/tag/v2.1
uClibc: https://github.com/klee/klee-uclibc/tree/klee_uclibc_v1.2

CII Best Practices Badge Program

Add CII Best Practices Badge Program

Check out the info at https://bestpractices.coreinfrastructure.org

More information on the CII Best Practices Badging program, including background and criteria, is available on GitHub. Project statistics and criteria statistics are available.

Adding robots to check require tests and other checks to improve the quality of the code

Set up required status checks.
Semantic Versioning specification
Automating our team's conventions surrounding code review

Program invariant generation

Inferring program invariant adopting templates
Adopting InvGen flow

Fix incorrect CRAB call to fuzzer

Fix incorrect call of CRABLLVM

I would like to suggest you skip the CRABLLVM call to run fuzzing.
Start at:

Map2Check::NonDetGenerator::LibFuzzer >> map2check.cpp line 421
caller->compileToCrabLlvm(); >> incorrect call at map2check.cpp line 202

Crucible support as a Symbolic Engine

Is your feature request related to a problem? Please describe.
Currently, we only have support for KLEE as a symbolic engine, the issue is that it contains a limitation for floating numbers. Crucible supports enconding for floats, has support for SV-COMP notation and it's API contains all that is needed for a Generator.

Describe the solution you'd like
Implement a NonDetGeneratorCrucible and compare it with KLEE specially for floats.

Fix incorrect results in ReachSafety-Heap

Verification results from SV-COMP

../../sv-benchmarks/c/ldv-regression/sizeofparameters_test.i

Incorrect long int a = __VERIFIER_nondet_long(); representation
../../sv-benchmarks/c/ldv-regression/test27-2.c

Expected FALSE, but the result is TRUE
Check:
libfuzzer -> ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address
KLEE -> memory error: out of bound pointer
OR -> Struct modeling error
../../sv-benchmarks/c/ldv-regression/test28-1.c
>> Same ERROR as the ldv-regression/test27-2.c

Add support to verify concurrent programs

Add support for concurrency programs, e.g., the category ConcurrencySafety.

One possible solution is to adopt Lazy-Cseq* (https://www.southampton.ac.uk/~gp1y10/cseq/) tool to generate the C code, then apply Map2Check tool with its features to improve de verification. Note that Lazy-Cseq generates nondet calls for KLEE.

*CSeq is a framework that facilitates the development of code-to-code translations for concurrent C programs with POSIX threads based on sequentialization.

Add Coverage report

Could you adopt https://coveralls.io/repos

Add SPDX-License-Identifier

Aiming to improve Map2Check to handle different types of licenses. I would like to suggest to add SPDX-License-Identifier. Check out https://spdx.org/ids

hbgit / map2check Goto Github PK

map2check's Introduction

Map2Check

Finding Software Vulnerabilities using Symbolic Execution and Fuzzing

Requirements for using the tool

Install Map2Check

Running the tool

How to build Map2Check?

How to run the regression testing

Instructions for SV-COMP'20

Authors

map2check's People

Contributors

Stargazers

Watchers

Forkers

map2check's Issues

Recommend Projects

Recommend Topics

Recommend Org