luis-sousa-pinto / defects4j-multifault

This project is forked from coinse/defects4j-multifault

Artifact of "Searching for Multi-Fault Programs in Defects4J", SSBSE 2021

Home Page: https://link.springer.com/book/10.1007/978-3-030-88106-1

License: MIT License

Shell 9.85% Python 20.87% Java 4.81% Jupyter Notebook 61.24% Dockerfile 2.99% Vim Script 0.25%

defects4j-multifault's Introduction

Multiple faults dataset for Java programs (Extension of Defects4J)

This replication package implements a technique for finding multi-fault versions in the Defects4J dataset. It accompanies the paper "Searching for Multi-Fault Programs in Defects4J", Gabin An, Juyeon Yoon, Shin Yoo, SSBSE 2021 [preprint][slides].

If you use this dataset, please cite our SSBSE paper:

@inproceedings{an2021searching,
  title={Searching for Multi-fault Programs in Defects4J},
  author={An, Gabin and Yoon, Juyeon and Yoo, Shin},
  booktitle={International Symposium on Search Based Software Engineering},
  pages={153--158},
  year={2021},
  organization={Springer}
}

Raw search results are available!

You can simply extract the search results to fault_data/multi/ using the following command.

cd fault_data; sh extract.sh multi.tar.bz2

Note that you can reproduce these results by following Steps 1 and 2 in the replication guide.
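Where `sh` is not available, the archive can also be unpacked with Python's standard `tarfile` module. The following is a minimal sketch that assumes `extract.sh` does nothing more than unpack the archive into `fault_data/`; the script's contents are not shown here, so treat that as an assumption. The function name `extract_archive` is hypothetical and not part of this repository.

```python
import tarfile
from pathlib import Path


def extract_archive(archive: Path, dest: Path) -> list[str]:
    """Unpack a .tar.bz2 archive into dest and return the member names."""
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, mode="r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(path=dest)
    return names


if __name__ == "__main__":
    # Assumed layout: the script is run from the repository root.
    archive = Path("fault_data/multi.tar.bz2")
    if archive.exists():
        members = extract_archive(archive, Path("fault_data"))
        print(f"extracted {len(members)} entries")
```

Only unpack archives from a source you trust, since `extractall` writes whatever paths the archive contains.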

Description of result files

  • ./multiple_faults: A dataset of multiple faults. A <project>.json file includes the bugs contained in each buggy version in <project>.
  • ./fault_data/multi/: Raw search results (about 3GB).
    • The following files are written in <project>-<N>-<M>:
      • .*_error: these files are created when an error occurs during test transplantation, test compilation, or execution.
      • .overlapping: this file is created when the fault-revealing test cases of bugs N and M overlap.
      • tests.trigger.*:
        • expected: the original set of fault revealing test cases of N
        • actual: the set of tests that fail on <project>-<M>b containing the fault-revealing test cases of N
        • actual.fixed: the set of tests that fail on <project>-<M>f containing the fault-revealing test cases of N (generated only when .overlapping exists)
      • failing_tests.*:
        • expected: the original error messages of fault revealing test cases of N
        • actual: the error messages of failing tests on <project>-<M>b containing the fault-revealing test cases of N
        • actual.fixed: the error messages of failing tests on <project>-<M>f containing the fault-revealing test cases of N (generated only when .overlapping exists)
      • <project>-<N>-<M>.patch: the git patch that transplants the fault-revealing test cases of <project>-<N> to <project>-<M>b
      • test_copy.log: the log produced during test transplantation. Similar to <project>-<N>-<M>.patch
      • *.xml: the coverage of fault-revealing test cases of N on <project>-<M>b.
      • .not_exist: if this file exists, N is not revealed in <project>-<M>b. See the file contents for more details.
  • ./analysis.ipynb: The data analysis script used to draw the figures in the paper
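
The marker files described above lend themselves to a quick triage of the raw results. The following is a minimal sketch and not part of the replication package. It assumes that each `<project>-<N>-<M>` entry is a directory under `fault_data/multi/` holding the files listed above, and that `.*_error` is a glob-style pattern for hidden files ending in `_error`. The names `classify_pair` and `summarize`, the status labels, and the order in which markers are checked are all hypothetical choices.

```python
from collections import Counter
from pathlib import Path


def classify_pair(pair_dir: Path) -> str:
    """Label one <project>-<N>-<M> directory using its marker files."""
    if any(pair_dir.glob(".*_error")):
        return "error"         # transplantation, compilation, or execution failed
    if (pair_dir / ".not_exist").exists():
        return "not_revealed"  # N is not revealed in <project>-<M>b
    if (pair_dir / ".overlapping").exists():
        return "overlapping"   # revealing tests of N and M overlap
    return "no_marker"         # none of the marker files is present


def summarize(results_root: Path) -> Counter:
    """Count the labels of all pair directories under results_root."""
    return Counter(
        classify_pair(p) for p in sorted(results_root.iterdir()) if p.is_dir()
    )


if __name__ == "__main__":
    root = Path("fault_data/multi")  # assumed location after extraction
    if root.is_dir():
        for status, count in summarize(root).most_common():
            print(status, count)
```

A directory labeled `no_marker` only means that none of the three markers was found; interpreting it further requires comparing `tests.trigger.expected` with `tests.trigger.actual`.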

A Guide to Replication

Prerequisite

  • Docker
  • (Host machine) Python version 3.9.1

Step 1. Preparing the docker environment

Build the docker image & execute the docker container

cd docker/
# build a docker image
docker build --tag mf:latest .
# create a docker container in the background
docker run -dt --name mf -v $(pwd)/resources/workspace:/root/workspace -v $(pwd)/../fault_data:/root/fault_data mf:latest
# execute an interactive bash shell on the container
docker exec -it mf bash

Step 2. Running the search

To replicate the whole search in the paper, run:

# on the docker container
cd /root/workspace
python3.6 search.py <projects> --savedir <dirname>

The search results are then saved to the /root/fault_data/<dirname>/ directory in the container (which is ./fault_data/<dirname> on the host machine).

For example,

python3.6 search.py Lang,Chart,Time,Math,Closure --savedir multi_replicated

will save the search results to /root/fault_data/multi_replicated/.
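
When scripting several runs, it can help to build the command line and the matching result paths in one place. The sketch below only assembles strings. It assumes the comma-separated `<projects>` argument and the `--savedir` option shown above, plus the volume mapping from Step 1. The helper names `build_search_command` and `result_dirs` are hypothetical.

```python
import shlex
from pathlib import PurePosixPath

CONTAINER_ROOT = PurePosixPath("/root/fault_data")  # mount target from Step 1
HOST_ROOT = PurePosixPath("fault_data")             # relative to the repository root


def build_search_command(projects: list[str], savedir: str) -> list[str]:
    """Return the argv for search.py, to be run inside the container."""
    return ["python3.6", "search.py", ",".join(projects), "--savedir", savedir]


def result_dirs(savedir: str) -> tuple[PurePosixPath, PurePosixPath]:
    """Return the (container, host) directories that receive the results."""
    return CONTAINER_ROOT / savedir, HOST_ROOT / savedir


if __name__ == "__main__":
    cmd = build_search_command(["Lang", "Chart"], "multi_replicated")
    print(shlex.join(cmd))
    container_dir, host_dir = result_dirs("multi_replicated")
    print(f"container: {container_dir}  host: {host_dir}")
```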

Want to test for a single pair of N and M?

Use the following command to check whether a specific bug <project>-<N> exists in the buggy version <project>-<M>b.

# on the docker container
sh check_exist.sh <project> <N> <M> <savepath>
# ex) sh check_exist.sh Math 5 6 ./Math-5-6

The existence check result will be saved into <savepath>.
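
To check a handful of bugs against each other, the single-pair command can be generated for every ordered pair (N, M). This is a sketch under the assumption that both directions are of interest; it is not a description of how `search.py` enumerates pairs. The function name `pair_commands` and the save-path layout `<save_root>/<project>-<N>-<M>` are hypothetical, with the layout modeled on the `./Math-5-6` example above.

```python
from itertools import permutations
from typing import Iterable, Iterator


def pair_commands(project: str, bug_ids: Iterable[int], save_root: str = ".") -> Iterator[str]:
    """Yield one check_exist.sh command line per ordered pair (N, M), N != M."""
    for n, m in permutations(bug_ids, 2):
        yield f"sh check_exist.sh {project} {n} {m} {save_root}/{project}-{n}-{m}"


if __name__ == "__main__":
    # Print the commands; run them on the docker container.
    for line in pair_commands("Math", [5, 6, 7]):
        print(line)
```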

Step 3. Check out the multi-fault version (Optional)

Note that we do not alter any source code to artificially inject faults. We only add bug-revealing test cases that reveal multiple faults in the original code version.

After Step 2, you may want to check out a multi-fault version, in which the source code remains the same but additional bug-revealing test cases are transplanted.

On the docker container, use the following command to check out the source code:

# on the docker container
python3.6 checkout.py Lang-26-27-31 -w /tmp/Lang-26-27-31
cd /tmp/Lang-26-27-31
git diff
cat tests.trigger.*
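
The identifier passed to `checkout.py` appears to be the project name followed by hyphen-separated bug IDs. The sketch below splits such an identifier. The format is inferred only from the `Lang-26-27-31` example, so treat it as an assumption, and the function name `parse_version_id` is hypothetical.

```python
def parse_version_id(version_id: str) -> tuple[str, list[int]]:
    """Split an identifier such as 'Lang-26-27-31' into ('Lang', [26, 27, 31])."""
    project, *ids = version_id.split("-")
    if not project or not ids:
        raise ValueError(f"expected <project>-<id>[-<id>...], got {version_id!r}")
    # int() raises ValueError if any part after the project name is not a number.
    return project, [int(bug_id) for bug_id in ids]


if __name__ == "__main__":
    project, bug_ids = parse_version_id("Lang-26-27-31")
    print(project, bug_ids)
```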

Step 4. Analysing the search results and creating the dataset of multiple faults

# on the host machine (repository root)
python create_dataset.py <project> --savepath <savepath>

This command creates two result files, <savepath> and <savepath>.pairs.csv.

For example, executing the following command

python create_dataset.py Lang --savepath multiple_faults_replicated/Lang.json

will create multiple_faults_replicated/Lang.json and multiple_faults_replicated/Lang.json.pairs.csv. Please refer to multiple_faults/README.md for more details about the data format.
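
The two output files can be loaded with the standard library for a first look. This sketch deliberately assumes nothing about the JSON schema or the CSV columns, which are documented in multiple_faults/README.md. The only assumption is the `<savepath>.pairs.csv` naming stated above, and the function name `load_dataset` is hypothetical.

```python
import csv
import json
from pathlib import Path


def load_dataset(savepath: Path):
    """Load <savepath> as JSON and <savepath>.pairs.csv as a list of rows."""
    with open(savepath, encoding="utf-8") as f:
        data = json.load(f)
    pairs_path = Path(str(savepath) + ".pairs.csv")
    with open(pairs_path, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    return data, rows


if __name__ == "__main__":
    savepath = Path("multiple_faults_replicated/Lang.json")
    if savepath.exists():
        data, rows = load_dataset(savepath)
        print(f"JSON top-level type: {type(data).__name__}, entries: {len(data)}")
        print(f"CSV rows: {len(rows)}")
```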
