Giter Site home page Giter Site logo

graph-xll's Introduction

Graph-XLL

A Graph Library for Extra Large Graph Analytics on a Single Machine with WebGraph BVGraph format


Input

The input graphs are in WebGraph format.

There are three files in this format:

basename.graph basename.properties basename.offsets

Available datasets in this format can be found in: http://law.di.unimi.it/datasets.php

The first two files are for the forward (regular) graph. The other two are for the transposed graph. What's missing is the "offsets" file. This can be easily created by running:

java -cp "lib/*" it.unimi.dsi.webgraph.BVGraph -o -O -L cnr-2000

Converting Edgelist Format to WebGraph Format

This section is for the case when your graph is given a text file of edges (known as edgelist). If your graph is already in WebGraph format, skip to the next section.

It is very easy to convert an edgelist file into WebGraph format. I am making the folloiwng assumptions:

  1. The graph is unlabeled and the vertices are given by consecutive numbers, 0,1,2,... (If there are some vertices "missing", e.g. you don't have a vertex 0 in your file, it's not a problem. WebGraph will create dummy vertices, e.g. 0, that does not have any neighbor.)
  2. The edgelist file is TAB separated (not comma separated).

Now, to convert the edgelist file to WebGraph format execute the following steps:

Sort the file, then remove any duplicate edges:

sort -nk 1 edgelistfile | uniq > edgelistsortedfile

(If you are on Windows, download sort.exe and uniq.exe from http://gnuwin32.sourceforge.net/packages/coreutils.htm)

Run:

java -cp "lib/*" it.unimi.dsi.webgraph.BVGraph -g ArcListASCIIGraph edgelistsortedfile  dummyBasename

For example:

java -cp "lib/*" it.unimi.dsi.webgraph.BVGraph -g ArcListASCIIGraph car-2000.txt  cnr-2000

Undirected Graphs in WebGraph

Undirected graphs are used in some algorithms, e.g., k-truss decomposition. After creating the offset files, remove self-loops in the graphs by running:

java -cp "lib/*:bin" SelfLoopRemover cnr-2000 cnr-2000-noself
java -cp "lib/*:bin" SelfLoopRemover cnr-2000-t cnr-2000-noself-t

Symmetrize by taking union:

java -cp "lib/*" it.unimi.dsi.webgraph.Transform union cnr-2000-noself cnr-2000-noself-t cnr-2000-noself-sym

Now we obtain the undirected simple graph in WebGraph as the input:

cnr-2000-noself-sym.graph cnr-2000-noself-sym.properties cnr-2000-noself-sym.offsets

Compiling

The programs are already compiled with javac 1.8 (Java 8). Please use Java 8 or newer version.

javac -cp "lib/*" -d bin src/*

Running

java -cp "lib/*:bin" BV2Ascii

For larger graphs, allocate more memory to avoid OutOfMemoryError by appending -Xmx4g or -Xmx8g or -Xmx16g etc. to the above line.

Acknowledgements

We would like to thank:

graph-xll's People

Contributors

graph-xll avatar jianwu10 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.