Giter Site home page Giter Site logo

icse23repair's Introduction

ICSE2023Repair

A PyTorch Implementation of "Tare: Type-Aware Neural Program Repair".

Introduction

Automated program repair (APR) aims to reduce the effort for software development. With the development of deep learning, lots of DL-based APR approaches have been proposed using an encoder-decoder architecture. Despite the promising performance, these models share one same limitation: generating lots of untypable patches. The main reason for this phenomenon is that the existing models do not consider the constraints of code captured by a set of typing rules.

In this paper, we propose, Tare, a type-aware model for neural program repair to learn the typing rules. To encode an individual typing rule, we introduces three novel components: (1) a novel type of grammars, T-Grammar that integrates the type information into a standard grammar, (2) a novel representation of code, T-Graph that integrates the key information needed for type checking in an AST, and (3) a novel type-aware neural program repair approach, Tare that encodes the T-Graph and generates the patches guided by T-Grammar.

The experiment was conducted on three benchmarks, 393 bugs from Defects4J v1.2, 444 additional bugs from Defects4J v2.0, and 40 bugs from QuixBugs. Our results show that Tare repairs 64, 32, and 27 bugs on these benchmarks respectively and outperforms the existing APR approaches on all benchmarks. The further analysis also shows that Tare tends to generate more compilable patches than the existing DL-based APR approaches with the typing rule information.

The Main File Tree

.
├── generation: code for patch generation
│   ├── bugs-QuixBugs
│ 	├── location
│ 	├── location2
│ 	└── tesetDefect4j.py
├── model: code for neural model
│   ├── data
│   ├── Model.py
│   └── relAttention.py
├── train: code for training the model
│   ├── run.py
│   ├── Dataset.py
│   └── SearchNode.py
├── validation: code for patch validation
│   ├── patches-all: plausible patches
│   ├── Dataset.py
│   └── SearchNode.py

Dataset

Train set

The raw data https://drive.google.com/drive/folders/1ECNX98qj9FMdRT2MXOUY6aQ6-sNT0b_a?usp=sharing from Recoder.

Test set

Updated Result

❗❗❗ We test Tare with the perfect fault localization on Defects4J 2.0.

The left of table below is copied the original Table V in the paper, showing the results using Ochiai for fault localization, where "x/y" indicates the approach correctly repairs x bugs and produces plausible patches for y bugs (a plausible patch passes all the tests). The last column is newly added, showing the number of bugs Tare correctly repairs with perfect fault localization. Only the first plausible patch is considered in the experiment.

Project Bugs TBar SimFix Recoder RewardRepair Tare Tare+PerfectLocation
Original Result Original Result Original Result Original Result Original Result Original Result Updated Result
Cli 39 1/7 0/4 3/3 2/- 5/13 8
Closure 43 0/5 1/5 0/7 1/- 0/5 1
JacksonDatabind 112 0/0 0/0 0/0 3/- 0/4 0
Codec 18 2/6 0/2 2/2 3/- 3/7 5
Collections 4 0/1 0/1 0/0 0/- 0/0 0
Compress 47 1/13 0/6 3/9 0/- 4/13 4
Csv 16 0/2 1/5 4/4 2/- 5/7 5
JacksonCore 26 0/6 0/0 0/4 1/- 2/7 14
Jsoup 93 3/7 1/5 7/13 4/- 10/16 14
JxPath 22 0/0 0/0 0/4 3/- 2/10 3
Gson 18 0/0 0/0 0/0 1/- 1/1 1
JacksonXml 6 0/0 0/0 0/0 0/- 0/1 0
Total 444 8/50 2/25 19/46 20/- 32/84 55

Usage

Train a New Model

CUDA_VISIBLE_DEVICES=0,1 python3 train/run.py train

The saved model is checkModel/best_model.cpkt.

Test the Model

Generate Patches for Defects4J v1.2 with Ochiai by

CUDA_VISIBLE_DEVICES=0 python3 generation/testDefect4j.py bugid

The generated patches are in folder generation/patch-all/patch/ in json.

Generate Patches for Defects4J v1.2 with groundtruth by

CUDA_VISIBLE_DEVICES=0 python3 generation/testDefect4j1.py bugid

The generated patches are in folder generation/patch-all/patchground/ in json.

Generate Patches for Defects4J v2.0 by

CUDA_VISIBLE_DEVICES=0 python3 generation/testDefects4j2.py bugid

The generated patches are in folder generation/patch-all/patch2 in json.

Generate Patches for QuixBugs by

CUDA_VISIBLE_DEVICES=0 python3 generation/testQuixbug.py bugid

The generated patches are in folder generation/patch-all/patchQuix in json.

Validate Patches for Defects4J v1.2 with Ochiai

python3 validation/repair.py bugid

The results are in folder validation/patch-all/patches/ in json.

Validate Patches for Defects4J v1.2 with groundtruth

python3 validation/repair1.py bugid

The results are in folder validation/patch-all/patchesground/ in json.

Validate Patches for Defects4J v2.0 with Ochiai

python3 validation/repair2.py bugid

The results are in folder validation/patch-all/patches2/ in json.

Validate Patches for QuixBugs with Ochiai

python3 validation/repairqfix.py bugid

The results are in folder validation/patch-all/patchesqfix/ in json.

Gnerated Patches

The generated patches are in the folder patches-all.

Dependency

  • Python 3.7
  • PyTorch 1.3
  • Defects4J
  • Java 8
  • docker
  • nvidia-docker

icse23repair's People

Contributors

pkuzqh avatar grammart5 avatar xiongyingfei avatar

Stargazers

 avatar Kai Gao avatar Sergey Mechtaev avatar Sᴜᴘᴇʀ Lᴇᴇ avatar Fengjie Li avatar  avatar  avatar zhongqiang pan avatar  avatar Wenkang Zhong avatar  avatar

Watchers

 avatar  avatar

Forkers

leonyang95 kwz219

icse23repair's Issues

Can you provide the list of plausible patches?

Thanks for your excellent work, it is very impressive.
I'm developing an apr tool and want to set Tare as one of my baselines. But I need filter some bugs out to compare the precision value.

Can you provide the list of bugids that Tare will generate plausible patches?

Thank you for considering my request. I am looking forward to your reply.

Train dataset

Hello authors, may I ask for the codes to process the data from raw data (data0.pkl you mentioned in readme) into the mode input (data.pkl)? Thanks so much for your time.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.