grafe-sim

Code similarity via graph-based features.

Project structure

/GraphGrepSX

This folder contains a modified version of GraphGrepSX. The changes add an XML output of the input graphs' index (as a prefix tree).

/llvmGrammars

This folder contains the BNF grammars, in G4 format, of the LLVM-IR 7.0.0 language. The original grammar description can be found on the LLVM mailing list at this link.

/py

This folder contains all the source files for the execution pipeline, packaged as the ccpt Python package. The source code is located in the /src directory and the unit tests in the /tests directory.

/analysis

This folder contains workflow examples for the clustering and classification problems, provided as Jupyter notebooks. It also contains a /data folder with a sample of the Codenet dataset source files.

/maps

This folder contains a set of .csv files representing predefined maps that can be used for A-CFG creation.

Setup

LLVM Clang and Opt (v7.0.0)

For the preprocessing steps of ccpt (.c → .dot CFG) to work, clang-7 and opt-7 must be installed; version 7.0.0 is required.

  • Build specific version of llvm project from source:

    mkdir <llvm-7-dir>
    cd <llvm-7-dir>
    git clone https://github.com/llvm/llvm-project.git
    cd llvm-project
    git checkout tags/llvmorg-7.0.0
    cmake -S llvm -B build -G "Unix Makefiles" -DCMAKE_BUILD_TYPE="Release" -DLLVM_ENABLE_PROJECTS=clang
    cmake --build build -j 8
  • Then create the symlinks (writing to /usr/bin typically requires root privileges):

    ln -s /full/path/to/<llvm-7-dir>/llvm-project/build/bin/clang /usr/bin/clang-7
    ln -s /full/path/to/<llvm-7-dir>/llvm-project/build/bin/clang++ /usr/bin/clang++-7
    ln -s /full/path/to/<llvm-7-dir>/llvm-project/build/bin/opt /usr/bin/opt-7
    ln -s /full/path/to/<llvm-7-dir>/llvm-project/build/bin/llvm-cxxfilt /usr/bin/llvm-cxxfilt-7
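The four symlinks above follow one pattern, so they can also be created in a loop. The following is a minimal sketch, not part of the repository: the function name link_llvm7 and its two arguments (the LLVM build's bin directory and a destination directory that is on your PATH) are assumptions made for illustration.

```shell
# link_llvm7: hypothetical helper (not part of grafe-sim).
# Creates <dest_dir>/<tool>-7 symlinks pointing at the tools in <bin_dir>.
# Stops with a non-zero status if an expected tool is not found.
link_llvm7() {
    bin_dir=$1
    dest_dir=$2
    for tool in clang clang++ opt llvm-cxxfilt; do
        if [ ! -x "$bin_dir/$tool" ]; then
            echo "not found: $bin_dir/$tool" >&2
            return 1
        fi
        # -f replaces an existing link of the same name
        ln -sf "$bin_dir/$tool" "$dest_dir/${tool}-7"
    done
}
```

For example, link_llvm7 /full/path/to/<llvm-7-dir>/llvm-project/build/bin /usr/bin reproduces the commands above (run it with sufficient privileges for the destination directory).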
    

The only requirement is that Clang and Opt 7.0.0 can be invoked from the command line as clang-7 and opt-7. To check that they work:

clang-7 --version
clang++-7 --version
opt-7 --version
llvm-cxxfilt-7 --version
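The checks above can be wrapped in a small helper that reports every tool missing from the PATH in one pass. This is a sketch under stated assumptions rather than something shipped with the project: the function name check_tools is hypothetical, and it only tests that each command exists, not that it is version 7.0.0.

```shell
# check_tools: hypothetical helper (not part of grafe-sim).
# Prints "missing: <name>" for every command not found on PATH and
# returns 1 if at least one is missing, 0 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

A possible call is check_tools clang-7 clang++-7 opt-7 llvm-cxxfilt-7 && echo "LLVM 7 tools found". The --version commands above are still needed to confirm the version.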

Graphviz

To install graphviz on Ubuntu:

sudo apt install graphviz

Modified GraphGrepSX (ggsx)

To compile GraphGrepSX on your own machine, run the following from the repository root:

cd GraphGrepSX
make clean
make -B
cp ggsx ../py/src/ccpt/cfg_to_trie

ccpt Python package

1. Python requirements

To install the Python requirements, it is recommended to create a new virtual environment:

python3 -m venv venv

To activate the virtual environment venv:

  • Linux/MacOS

    source venv/bin/activate
  • Windows

    venv\Scripts\activate

To install requirements for the project in the virtual environment venv:

(venv)$ pip install -r py/requirements.txt

To also install the additional requirements for development:

(venv)$ pip install -r py/requirements_dev.txt

To deactivate venv:

(venv)$ deactivate

2. ccpt package in editable mode

From virtual environment:

(venv)$ pip install -e py/

Usage

Test if ccpt is installed:

pip show ccpt

For description and usage of ccpt:

Launch jupyter notebook server:

python -m jupyter notebook

Analysis

The analysis phase is performed using Jupyter notebooks. The analyses in these notebooks are run on a set of precomputed features extracted from the A-CFGs generated from the source code.

Features generation

Execute these commands to generate the features starting from the source code.

Create a folder for the generated files inside /analysis

cd analysis
mkdir featuresData

Choose a file from the data folder and generate the features

python3 gen-features-callgraph.py data/p00000/[somefile] -od featuresData
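To process a whole folder instead of a single file, the command above can be repeated in a loop. The following is a minimal sketch, assuming the script accepts one source file per invocation exactly as shown above; the function name gen_all_features and the DRY_RUN variable are hypothetical and not part of the repository.

```shell
# gen_all_features: hypothetical helper (not part of grafe-sim).
# Runs gen-features-callgraph.py once per regular file in <src_dir>,
# writing the results to <out_dir>. Set DRY_RUN=1 to print the commands
# instead of executing them.
gen_all_features() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir"
    for src in "$src_dir"/*; do
        [ -f "$src" ] || continue   # skip subfolders and unmatched globs
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "python3 gen-features-callgraph.py $src -od $out_dir"
        else
            python3 gen-features-callgraph.py "$src" -od "$out_dir"
        fi
    done
}
```

From inside the analysis folder, setting DRY_RUN=1 and calling gen_all_features data/p00000 featuresData lists the commands that would be run; without DRY_RUN they are executed.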

Necessary packages

To run the examples in the Jupyter notebooks, the following packages must be installed:

  • sklearn
  • yellowbrick

Installation

After activating the virtual environment created above, execute the following commands:

sklearn

pip3 install -U scikit-learn

yellowbrick

pip install yellowbrick

Running the example

After activating the virtual environment, start the jupyter server from the 'analysis' folder:

jupyter notebook

This command opens a new tab in the default browser, listing the files and folders in the directory from which the command was executed (in this case, analysis).

Select the desired notebook and run its cells.
