Cooper: Testing the Binding Code of Scripting Languages with Cooperative Mutation

Cooper uses cooperative mutation to test the binding code of scripting languages and find memory-safety issues. Cooperative mutation simultaneously modifies the script code and the related document objects to explore various code paths of the binding code. To support cooperative mutation, we infer the relationship between script code and document objects to guide the two-dimensional mutation. We applied Cooper to three popular commercial applications: Adobe Acrobat, Foxit Reader, and Microsoft Word. Cooper detected 134 previously unknown bugs, which resulted in 33 CVE entries and 22K in bug bounties. Cooper has three components:

  • Object Clustering: First, Cooper parses the given sample documents to extract native objects. To reduce the object search space, Cooper categorizes objects into different classes based on their attributes.
  • Relationship Inference: Next, Cooper infers the relationship between object classes and API groups. Specifically, it produces a large number of documents by combining different object classes and API groups, and records the execution results of the embedded scripts. Based on the success rate of the script execution and the distribution of object classes, Cooper infers the relationship between API groups and object classes.
  • Relationship-Guided Mutation: Finally, Cooper leverages the inferred relationship to guide object selection, script generation, and object mutation. We also design several cooperative mutation strategies.
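As a rough illustration of the Relationship Inference step, the sketch below aggregates script execution results per (API group, object class) pair and keeps the pairs whose success rate reaches a threshold. It is a simplified sketch, not Cooper's actual implementation: the record format, the function name `infer_relationships`, and the `min_rate` threshold are assumptions made for this example, and it considers only the success rate, not the distribution of object classes.

```python
from collections import defaultdict


def infer_relationships(records, min_rate=0.5):
    """Map each API group to the object classes its scripts succeed with.

    records: iterable of (object_class, api_group, succeeded) tuples, one per
    generated test document (hypothetical format, assumed for this sketch).
    min_rate: assumed success-rate threshold for calling a pair "related".
    """
    totals = defaultdict(int)
    successes = defaultdict(int)
    for object_class, api_group, succeeded in records:
        key = (api_group, object_class)
        totals[key] += 1
        if succeeded:
            successes[key] += 1

    related = defaultdict(list)
    for key, total in totals.items():
        rate = successes[key] / float(total)
        if rate >= min_rate:
            api_group, object_class = key
            related[api_group].append((object_class, rate))

    # Highest success rate first; ties are broken by class name.
    for api_group in related:
        related[api_group].sort(key=lambda pair: (-pair[1], pair[0]))
    return dict(related)
```

The resulting mapping could then be consulted during mutation to pick object classes that are likely to exercise a given API group.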

The overview of Cooper is illustrated by the diagram below.

[Figure: The overview of Cooper]

For more details, please check our paper published in the 29th Annual Network and Distributed System Security Symposium (NDSS 2022).

Installation & Run

Platform

  • Windows 10 (64-bit)
  • Python 2

Prerequisites

Collecting PDF/Word samples

You need to prepare some PDF (Word) samples and place them in a folder, which will be used as sample_dir. The number of PDF (Word) samples should be between 10,000 and 20,000. We have prepared 200 PDF samples and 214 Word samples, both hosted on Dropbox. You can test with these small sample sets, but for better results, use more samples.
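Before running Cooper, it may help to check how many samples the folder actually contains. The helper below is a minimal sketch and is not part of Cooper: the function name `count_samples`, the extension list, and the choice to search subfolders recursively are all assumptions.

```python
import os


def count_samples(sample_dir, extensions=(".pdf",)):
    """Count files under sample_dir whose names end with one of `extensions`.

    Assumption: samples are identified by file extension and may sit in
    subfolders. `extensions` must be a tuple of lowercase suffixes.
    """
    count = 0
    for _root, _dirs, files in os.walk(sample_dir):
        for name in files:
            if name.lower().endswith(extensions):
                count += 1
    return count
```

For Word samples, a call such as `count_samples(sample_dir, (".doc", ".docx"))` could be used instead; the result can then be compared against the recommended range of 10,000 to 20,000.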

Usage

$ python PdfSolution.py <sample_dir> <data_dir> <output_dir> <generate_cnt>
$ python WordSolution.py <sample_dir> <data_dir> <output_dir> <generate_cnt>
  
  <sample_dir>:   the absolute path of the folder containing the raw PDF (Word) samples
  <data_dir>:     the absolute path of the folder for intermediate data
  <output_dir>:   the absolute path of the folder for storing generated samples
  <generate_cnt>: the number of inputs Cooper will generate
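Because all three folder arguments are expected to be absolute paths, a small wrapper can catch mistakes before a long run starts. The sketch below is not part of Cooper; the function name `build_command` and its checks are assumptions, and it only assembles the command line described above.

```python
import os


def build_command(script, sample_dir, data_dir, output_dir, generate_cnt):
    """Return the argument list for a Cooper run, validating the inputs first.

    script is expected to be "PdfSolution.py" or "WordSolution.py".
    """
    folders = (("sample_dir", sample_dir),
               ("data_dir", data_dir),
               ("output_dir", output_dir))
    for label, path in folders:
        if not os.path.isabs(path):
            raise ValueError("%s must be an absolute path: %r" % (label, path))
    count = int(generate_cnt)
    if count <= 0:
        raise ValueError("generate_cnt must be positive: %r" % (generate_cnt,))
    # "python" is assumed to resolve to a Python 2 interpreter on PATH.
    return ["python", script, sample_dir, data_dir, output_dir, str(count)]
```

The returned list can be passed to `subprocess.check_call` to start the run.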

Authors

Publications

Cooper: Testing the Binding Code of Scripting Languages with Cooperative Mutation

@inproceedings{xu:cooper,
  title        = {{Cooper: Testing the Binding Code of Scripting Languages with Cooperative Mutation (To Appear)}},
  author       = {Peng Xu and Yanhao Wang and Hong Hu and Purui Su},
  booktitle    = {Proceedings of the 29th Annual Network and Distributed System Security Symposium (NDSS 2022)},
  month        = {apr},
  year         = {2022},
  address      = {San Diego, CA},
}
