This project forked from fraunhofer-aisec/cpg

A library to extract Code Property Graphs from C/C++, Java, Golang and Python.

Home Page: https://fraunhofer-aisec.github.io/cpg/

License: Apache License 2.0

Code Property Graph

A simple library to extract a code property graph out of source code. It has support for multiple passes that can extend the analysis after the graph is constructed. It currently supports C/C++ (C17), Java (Java 13) and has experimental support for Golang, Python and TypeScript.

What is this?

A code property graph (CPG) is a representation of source code in the form of a labelled directed multi-graph. Think of it as a directed graph where each node and edge is assigned a (possibly empty) set of key-value pairs (properties). This representation is supported by a range of graph databases such as Neptune, Cosmos, Neo4j, Titan, and Apache TinkerGraph and can be used to store the source code of a program in a searchable data structure. Thus, the code property graph makes it possible to use existing graph query languages such as Cypher, NQL, SQL, or Gremlin to either manually navigate through interesting parts of the source code or automatically find "interesting" patterns.
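
To make the data model concrete, here is a minimal sketch of a labelled directed multi-graph with properties on nodes and edges. It is not the library's graph model: the class PropertyGraphSketch, its methods, and the labels "CallExpression", "Literal", "AST" and "ARGUMENTS" are hypothetical names chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration of a property graph; not part of the cpg library.
public class PropertyGraphSketch {

    /** A node with a label and a (possibly empty) set of key-value properties. */
    static final class Node {
        final String label;
        final Map<String, Object> properties = new LinkedHashMap<>();

        Node(String label) {
            this.label = label;
        }
    }

    /** A labelled, directed edge that carries its own properties. */
    static final class Edge {
        final Node from;
        final Node to;
        final String label;
        final Map<String, Object> properties = new LinkedHashMap<>();

        Edge(Node from, Node to, String label) {
            this.from = from;
            this.to = to;
            this.label = label;
        }
    }

    final List<Node> nodes = new ArrayList<>();
    // A list rather than a set keyed by (from, to): parallel edges are allowed.
    final List<Edge> edges = new ArrayList<>();

    Node addNode(String label) {
        Node node = new Node(label);
        nodes.add(node);
        return node;
    }

    Edge addEdge(Node from, Node to, String label) {
        Edge edge = new Edge(from, to, label);
        edges.add(edge);
        return edge;
    }

    /** Returns all edges that start at the given node and carry the given label. */
    List<Edge> outgoing(Node node, String label) {
        return edges.stream()
                .filter(e -> e.from == node && e.label.equals(label))
                .collect(Collectors.toList());
    }

    /** Builds a tiny graph for a call with one argument (labels are made up). */
    static PropertyGraphSketch example() {
        PropertyGraphSketch graph = new PropertyGraphSketch();
        Node call = graph.addNode("CallExpression");
        call.properties.put("name", "printf");
        Node argument = graph.addNode("Literal");
        argument.properties.put("value", "hello");
        // Two differently labelled edges between the same pair of nodes.
        graph.addEdge(call, argument, "AST").properties.put("index", 0);
        graph.addEdge(call, argument, "ARGUMENTS");
        return graph;
    }

    public static void main(String[] args) {
        PropertyGraphSketch graph = example();
        for (Edge edge : graph.edges) {
            System.out.println(edge.from.label + " -[" + edge.label + "]-> " + edge.to.label
                    + " " + edge.properties);
        }
    }
}
```

A graph database stores the same three ingredients (labels, direction, properties), which is why a query language such as Cypher can match on any of them.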

This library uses Eclipse CDT for parsing C/C++ source code and JavaParser for parsing Java. In contrast to compiler AST generators, both are "forgiving" parsers that can cope with incomplete or even syntactically incorrect source code. That makes it possible to analyze source code even without being able to compile it (due to missing dependencies or minor syntax errors).

Usage

For Visualization Purposes

To get familiar with the graph itself, you can use the subproject cpg-neo4j. It uses this library to generate the CPG for a set of user-provided code files. The graph is then persisted to a Neo4j graph database. The advantage for the user is that Neo4j's visualization software, Neo4j Browser, can be used to look at the CPG nodes and edges graphically instead of at their Java representations.

As Library

The most recent version is published to Maven Central and can be used as a simple dependency, with either Maven or Gradle. Since Eclipse CDT is not published on Maven Central, it is necessary to add a repository with a custom layout to find the released CDT files. For example, using Gradle's Kotlin syntax:

repositories {
    ivy {
        setUrl("https://download.eclipse.org/tools/cdt/releases/10.2/cdt-10.2.0/plugins")
        metadataSources {
            artifact()
        }
        patternLayout {
            artifact("/[organisation].[module]_[revision].[ext]")
        }
    }
}

dependencies {
    api("de.fraunhofer.aisec", "cpg", "3.5.1")
}

Development Builds

A published artifact of every commit can be requested through JitPack. This is especially useful if your external project makes use of a specific feature that is not yet merged or not yet published as a version. Please follow the instructions on the JitPack page. Please be aware that, as with release builds, the CDT repository needs to be added as well (see above).

On Command Line

The library can be used on the command line with jshell, the Java shell, to try out some basic queries.

First, a jar containing all the necessary dependencies should be created with ./gradlew shadowJar. Afterwards, the shell can be launched using jshell --class-path cpg-library/build/libs/cpg-library-all.jar.

The following snippet creates a basic TranslationManager with default settings to analyze a sample file in src/test/resources/openssl/client.cpp:

import de.fraunhofer.aisec.cpg.TranslationConfiguration;
import de.fraunhofer.aisec.cpg.TranslationManager;
import de.fraunhofer.aisec.cpg.graph.declarations.FunctionDeclaration;

var path = Paths.get("src/test/resources/openssl/client.cpp");
var config = TranslationConfiguration.builder().sourceLocations(path.toFile()).defaultPasses().defaultLanguages().debugParser(true).build();
var analyzer = TranslationManager.builder().config(config).build();
var result = analyzer.analyze().get();
var tu = result.getTranslationUnits().get(0);

Afterwards, a list of function declarations can be obtained like this:

var functions = tu.getDeclarations().stream().filter(decl -> decl instanceof FunctionDeclaration).map(FunctionDeclaration.class::cast).collect(Collectors.toList());

Information about specific functions can be obtained using the property getters:

var func = functions.get(0);
func.getName();
func.getSignature();
func.getParameters();
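
The filter-and-cast idiom used above to collect function declarations can be factored into a small generic helper. The following is a self-contained sketch of that idiom only: Declaration, FunctionDecl and VariableDecl are hypothetical stand-in classes, not the library's types, so that the example runs without the cpg jar on the class path.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in types; the real declarations live in the cpg library.
public class DeclarationFilterSketch {

    static class Declaration {
        final String name;

        Declaration(String name) {
            this.name = name;
        }
    }

    static class FunctionDecl extends Declaration {
        FunctionDecl(String name) {
            super(name);
        }
    }

    static class VariableDecl extends Declaration {
        VariableDecl(String name) {
            super(name);
        }
    }

    /** Keeps only the elements that are instances of the given type, cast to it. */
    static <T> List<T> ofType(List<?> items, Class<T> type) {
        return items.stream()
                .filter(type::isInstance)
                .map(type::cast)
                .collect(Collectors.toList());
    }

    /** A made-up list of declarations, standing in for tu.getDeclarations(). */
    static List<Declaration> sample() {
        return List.of(
                new FunctionDecl("main"),
                new VariableDecl("counter"),
                new FunctionDecl("helper"));
    }

    public static void main(String[] args) {
        for (FunctionDecl function : ofType(sample(), FunctionDecl.class)) {
            System.out.println(function.name);
        }
    }
}
```

With the real library on the class path, the same helper could presumably be called with tu.getDeclarations() and FunctionDeclaration.class.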

Usage of Experimental Languages

Some languages, such as Golang, are marked as experimental and depend on other native libraries. These are NOT YET bundled in the release jars (with the exception of TypeScript), so you need to build them manually using the property -Pexperimental when using tasks such as build or test. For TypeScript, please use -PexperimentalTypeScript.

Golang

In the case of Golang, the necessary native code can be found in the src/main/golang folder. Gradle should automatically find the JNI headers and store the finished library in the src/main/golang folder. This currently only works on Linux and macOS. To use it in an external project, the resulting library needs to be placed in one of the directories listed in java.library.path.
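
When the native library cannot be loaded, it can help to check which directories the JVM actually searches. The sketch below splits java.library.path into its entries and looks for a platform-specific library file in each of them. The class LibraryPathCheck and the default library name "example" are hypothetical; this README does not state the name of the Golang library, so pass the real base name as an argument.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical diagnostic helper; not part of the cpg library.
public class LibraryPathCheck {

    /** Splits a java.library.path value into its non-empty directory entries. */
    static List<Path> searchDirs(String libraryPath) {
        List<Path> dirs = new ArrayList<>();
        for (String entry : libraryPath.split(File.pathSeparator)) {
            if (!entry.isEmpty()) {
                dirs.add(Paths.get(entry));
            }
        }
        return dirs;
    }

    /** Returns the first file on the path that matches the platform's library name. */
    static Optional<Path> locate(String libraryPath, String libName) {
        // Maps e.g. "foo" to "libfoo.so" on Linux or "libfoo.dylib" on macOS.
        String fileName = System.mapLibraryName(libName);
        for (Path dir : searchDirs(libraryPath)) {
            Path candidate = dir.resolve(fileName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // "example" is a placeholder; supply the actual library base name.
        String libName = args.length > 0 ? args[0] : "example";
        String libraryPath = System.getProperty("java.library.path", "");
        searchDirs(libraryPath).forEach(dir -> System.out.println("searching " + dir));
        System.out.println(locate(libraryPath, libName)
                .map(path -> "found " + path)
                .orElse("not found: " + System.mapLibraryName(libName)));
    }
}
```

The search path itself can be extended when starting the JVM with -Djava.library.path=<directory>.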

Python

You need to install jep, either system wide or in a virtual environment. Furthermore, the Python sources, which are located in src/main/python, need to be present in a directory with that name relative to where you execute or use CPG. We are working on extracting this into an actual Python module, similar to jep. Currently, only Python 3.9 is supported.

Through the JepSingleton, the CPG library will look for well-known paths on Linux and OS X. JepSingleton will prefer a virtualenv with the name cpg; this can be adjusted with the environment variable CPG_PYTHON_VIRTUALENV.
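
The naming rule described above (use the virtualenv called cpg unless CPG_PYTHON_VIRTUALENV says otherwise) can be sketched as follows. This is not the JepSingleton implementation: the class VirtualEnvLookup is hypothetical, and the ~/.virtualenvs base directory is an assumption taken from the virtual environment example in this README rather than from the library's actual list of well-known paths.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

// Hypothetical sketch of the naming rule; the real lookup lives in JepSingleton.
public class VirtualEnvLookup {

    static final String ENV_VAR = "CPG_PYTHON_VIRTUALENV";
    static final String DEFAULT_NAME = "cpg";

    /** Returns the configured virtualenv name, or "cpg" if none is set. */
    static String virtualEnvName(Map<String, String> env) {
        String name = env.get(ENV_VAR);
        return (name == null || name.isBlank()) ? DEFAULT_NAME : name;
    }

    /** Assumes virtualenvs live below <home>/.virtualenvs (an assumption). */
    static Path virtualEnvDir(String home, Map<String, String> env) {
        return Paths.get(home, ".virtualenvs", virtualEnvName(env));
    }

    public static void main(String[] args) {
        Path dir = virtualEnvDir(System.getProperty("user.home"), System.getenv());
        System.out.println("expected virtualenv directory: " + dir);
    }
}
```

Passing the environment in as a map, instead of reading System.getenv() inside the helper, keeps the rule easy to test with different settings.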

System Wide

Follow the instructions at https://github.com/ninia/jep/wiki/Getting-Started#installing-jep.

Virtual Env
  • python3 -m venv ~/.virtualenvs/cpg
  • source ~/.virtualenvs/cpg/bin/activate
  • pip3 install jep

TypeScript

For parsing TypeScript, the necessary NodeJS-based code can be found in the src/main/nodejs directory of the cpg-library folder. Gradle should build the script automatically, provided NodeJS (>=16) is installed. The bundled script will be placed inside the jar's resources and should work out of the box.

Development Setup

Code Style

We use Google Java Style for formatting. Please install the appropriate plugin for your IDE, such as the google-java-format IntelliJ plugin or the google-java-format Eclipse plugin.

Integration into IntelliJ

Integration is straightforward; however, three things are recommended:

  • Enable gradle "auto-import"
  • Enable google-java-format
  • Hook gradle spotlessApply into "before build" (might be obsolete with IDEA 2019.1)

Git Hooks

You can use the hook in style/pre-commit to check for formatting errors:

cp style/pre-commit .git/hooks

How to build

This project requires Java 11. If Java 11 is not your default Java version, make sure to configure Gradle to use it by setting the org.gradle.java.home property:

./gradlew -Dorg.gradle.java.home="/usr/lib/jvm/java-11-openjdk-amd64/" build

Contributors

The following authors have contributed to this project (in alphabetical order):

Further reading

A preliminary version of this cpg has been used to analyze ARM binaries of iOS apps:

[1] Julian Schütte, Dennis Titze. liOS: Lifting iOS Apps for Fun and Profit. Proceedings of the ESORICS International Workshop on Secure Internet of Things (SIoT), Luxembourg, 2019

An initial publication on the concept of using code property graphs for static analysis:

[2] Yamaguchi et al. - Modeling and Discovering Vulnerabilities with Code Property Graphs https://www.sec.cs.tu-bs.de/pubs/2014-ieeesp.pdf

[3] is an unrelated, yet similar project by the authors of the above publication. It is used by the open source software Joern [4] for analysing C/C++ code. While [3] is a specification and implementation of the data structure, this project includes various language frontends (currently C/C++ and Java, with Python to come) and allows creating custom graphs by configuring passes, which extend the graph as necessary for a specific analysis:

[3] https://github.com/ShiftLeftSecurity/codepropertygraph

[4] https://github.com/ShiftLeftSecurity/joern/
