joernCpgExport

Export a ShiftLeft Code Property Graph (CPG) from OverflowDB to JSON and CSV. The CSV files can be imported into the Neo4j graph database, where the graph can be visualized and queried with Cypher.

Please note: The OverflowDB file format is not compatible between different versions. Be sure to use the same OverflowDB version in joern and joernCpgExport.

Reference:

Build

$ sbt stage

Usage

$ ./target/universal/stage/bin/joerncpgexport --help
Usage: joernCpgExport [options]

  -d, --no-default-overlays  do not apply default overlays
  -o, --no-oss-dataflow      do not apply oss dataflow overlay
  -c, --cpg <cpg.bin>        load cpg from OverflowDB
  -j, --json <cpg.json>      export cpg as json file
  --help                     prints this usage text

Example

# use fuzzyc2cpg from joern to parse source code and create OverflowDB
$ fuzzyc2cpg.sh --output cpg.bin src
# convert OverflowDB into json
$ joernCpgExport --cpg cpg.bin --json cpg.json

cpg.json:

{
  "nodes" : [ {
    "ID" : 1,
    "OVERLAYS" : [ "semanticcpg", "dataflowOss" ],
    "VERSION" : "0.1",
    "LANGUAGE" : "C",
    "TYPE" : "META_DATA"
  }, {
    "ID" : 2,
    "FULL_NAME" : "<global>",
    "ORDER" : 1,
    "TYPE" : "NAMESPACE_BLOCK",
    "NAME" : "<global>",
    "FILENAME" : "<unknown>"
  }, {
    "ID" : 100,
    "TYPE_DECL_FULL_NAME" : "",
    "NAME" : "",
    "FULL_NAME" : "",
    "TYPE" : "TYPE"
  }, {
    "ID" : 101,
    "TYPE_DECL_FULL_NAME" : " [ 1 ]",
    "NAME" : " [ 1 ]",
    "FULL_NAME" : " [ 1 ]",
    "TYPE" : "TYPE"
  },
  [...]
  ]
}
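As a quick sanity check on the export, the nodes can be tallied by their TYPE property. The following is a minimal sketch, assuming only the top-level "nodes" list and the "TYPE" key visible in the sample above; the helper names `load_cpg` and `count_node_types` are made up for illustration and are not part of joernCpgExport.

```python
import json
from collections import Counter


def load_cpg(path):
    """Parse an exported cpg.json file into a Python dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def count_node_types(cpg):
    """Count nodes per TYPE in a parsed cpg.json document."""
    # Assumption: nodes live in a top-level "nodes" list, as in the sample.
    return Counter(node.get("TYPE", "<missing>") for node in cpg.get("nodes", []))


if __name__ == "__main__":
    # Inline stand-in for load_cpg("cpg.json"), trimmed from the sample above.
    sample = {
        "nodes": [
            {"ID": 1, "TYPE": "META_DATA", "LANGUAGE": "C"},
            {"ID": 2, "TYPE": "NAMESPACE_BLOCK", "NAME": "<global>"},
            {"ID": 100, "TYPE": "TYPE", "NAME": ""},
            {"ID": 101, "TYPE": "TYPE", "NAME": " [ 1 ]"},
        ]
    }
    for node_type, count in sorted(count_node_types(sample).items()):
        print(node_type, count)
```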

Convert json to csv

The JSON file can be converted into CSV files, grouped by node type, which can then be loaded into Neo4j via bulk import. The CSV files are written to a csv subfolder in the directory that contains the provided cpg.json. Existing files are overwritten.

# pandas is required as a dependency
$ pip3 install --user pandas
$ scripts/joern-json-csv.py <cpg.json>
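The grouping step can be pictured roughly as follows. This is an illustrative sketch, not the code of scripts/joern-json-csv.py: it uses the standard csv module instead of pandas, the function name `write_vertex_csvs` is hypothetical, and the `vertex_<TYPE>.csv` naming is inferred from the import command further down. It writes raw property names as column headers and does not reproduce any Neo4j-specific header fields the real script may emit.

```python
import csv
import os
from collections import defaultdict


def write_vertex_csvs(nodes, out_dir):
    """Write one vertex_<TYPE>.csv per node type and return the created paths."""
    groups = defaultdict(list)
    for node in nodes:
        groups[node["TYPE"]].append(node)

    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for node_type, members in sorted(groups.items()):
        # Union of all property names in the group, so every row has the same columns.
        columns = sorted({key for node in members for key in node})
        path = os.path.join(out_dir, f"vertex_{node_type}.csv")
        # Opening with "w" overwrites existing files, matching the behavior described above.
        with open(path, "w", newline="", encoding="utf-8") as handle:
            # Missing properties become empty cells; list values are written via str().
            writer = csv.DictWriter(handle, fieldnames=columns, restval="")
            writer.writeheader()
            writer.writerows(members)
        paths.append(path)
    return paths
```

Called as `write_vertex_csvs(cpg["nodes"], "csv")` on a parsed cpg.json, this would leave one file per node type in the csv folder.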

csv/edge.csv:

:END_ID,:START_ID,:TYPE,VARIABLE
1024809,1313,REF,
1154373,1314,REF,
1024810,1315,REF,
1154374,1316,REF,
1030528,1317,REF,
1154375,1318,REF,
1024811,1319,REF,
1154376,1320,REF,
1154377,1321,REF,
1154378,1322,REF,
1154379,1323,REF,
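Before importing, it can help to check which relationship types the edge file contains. A small sketch, assuming the header row shown above (`:END_ID,:START_ID,:TYPE,VARIABLE`); the function name `count_edge_types` is hypothetical.

```python
import csv
import io
from collections import Counter


def count_edge_types(handle):
    """Tally relationships per :TYPE from an open edge.csv file object."""
    return Counter(row[":TYPE"] for row in csv.DictReader(handle))


if __name__ == "__main__":
    # Inline stand-in for open("csv/edge.csv", newline=""), taken from the rows above.
    sample = io.StringIO(
        ":END_ID,:START_ID,:TYPE,VARIABLE\n"
        "1024809,1313,REF,\n"
        "1154373,1314,REF,\n"
    )
    print(count_edge_types(sample))
```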

Import into Neo4j

Neo4j offers two methods to import CSV data: (1) the Cypher LOAD CSV command and (2) the neo4j-admin command. For (1), you must first create the graph model, and the data types of the CSV columns must be converted explicitly during loading. For (2), the database must be empty (deleting all nodes and relationships does not seem to be sufficient); data types are detected automatically or taken from the column headers. See also: https://neo4j.com/developer/guide-import-csv/

Neo4j commands:

# stop Neo4j service
$ neo4j stop
# delete database
$ rm -rf data
# set Neo4j admin password
$ neo4j-admin set-initial-password <password>
# batch import
$ neo4j-admin import --multiline-fields=true --quote=\" --relationships=csv/edge.csv --nodes=csv/vertex_BLOCK.csv --nodes=csv/vertex_CALL.csv ...
# start Neo4j service
$ neo4j start
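Because the import command needs one --nodes option per vertex file, the argument list can be generated rather than typed by hand. The sketch below only assembles the command from the flags shown above; it assumes the files follow the `vertex_*.csv` and `edge.csv` naming seen in this README, and `build_import_command` is a hypothetical helper name.

```python
import glob
import os
import shlex


def build_import_command(csv_dir):
    """Assemble the neo4j-admin import argument list for all CSV files in csv_dir."""
    command = [
        "neo4j-admin",
        "import",
        "--multiline-fields=true",
        '--quote="',  # same value the shell passes for --quote=\"
        "--relationships=" + os.path.join(csv_dir, "edge.csv"),
    ]
    # One --nodes option per vertex file, in a stable order.
    for path in sorted(glob.glob(os.path.join(csv_dir, "vertex_*.csv"))):
        command.append("--nodes=" + path)
    return command


if __name__ == "__main__":
    # Print a shell-quoted command line for review; nothing is executed here.
    print(shlex.join(build_import_command("csv")))
```

Printing the command instead of running it keeps the destructive steps (stopping the service and deleting the database) under manual control.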

[Figure: CPG viewed via the Neo4j web UI]
