Giter Site home page Giter Site logo

graph_project_start's Introduction


Software requirement

gcc 4.4.7 or 4.8.5

OS: Linux, MacOS and Windows


Graph format basics

Alt text

Given a graph G = (V, E, W), as shown in (a), mainstream storage formats are (b) edge (tuple) list format and (c) Compressed sparse row (CSR) format. Note, V, E and W stand for the sets of vertices, edges and weights of the graph G, respectively.

CSR format: is firstly introduced to store sparse matrices. When it comes to graph, we assume each source vertex to be one row of the matrix, and all the destination vertices (originated from the same source vertex) as column indices for the non-zero (nnz) entries of this row. The edge weight resembles the content of the corresponding nnz entry. CSR only stores the destination vertices and edge weights in Adjacency list and Weight list. Further, the Begin position array is used to describe the ranges of destination vertices (also edge weights) for every row. Because the begin index of vertex n+1 is the end index for vertex n. Consequently, a dedicated end index array is unnecessary, except one extra entry at the end of the begin position array to indicate the end index of the destination vertices for the last source vertex, as shown in (c).

In summary, CSR space consumptions are:

  • Begin position: |V| + 1 entries.
  • Adjacency list: |E| entries.
  • Weight list: |E| entries.

Code specification

The overall code structure of this project is:

  • tuple_text_to_binary_csr_mmap: converting tuple text list into binary CSR with the following feature.

Directly back the CSR and weight arrays with files residing on disk, which has to synchronize udpates of CSR and weight to files on disk. Thus this design suffers from very slow processing speed in network-based file system (e.g., LUSTRE). Fortunately, this design is memory efficient comparing to tuple_text_to_binary_csr. This source code generate **symmetric weights**, that is, weight of a->b = weight of b->a for undirected graph.

  • tuple_text_to_binary_csr_extreme_graph: converting tuple text list into binary CSR with the following feature.

Split tuple list file to enhance parallelism, fast and more complex. There is a README.md file inside of this folder details the use. This source code CANNOT generate weights.

  • tuple_text_to_binary_csr_mem: Converting arbitrary text format edge tuple list file into Compressed Sparse Row (CSR) format binary file.

Allocating CSR and weight arrays in memory. Once all updates to these two arrays are done, we dump them to disk.

  • graph_reader: Reading the binary format CSR file into memory for graph computing.

Converter: edge tuples to CSR

  • Compile: make
  • To execute: type "./text_to_bin.bin", it will show you what is needed
  • One example is provided in toy_graph folder. The user can use ./text_to_bin.bin ./toy_graph/toy.dat 1 2 to run the converter.

Real example:

  • Download a graph file, e.g., wget https://snap.stanford.edu/data/bigdata/communities/com-orkut.ungraph.txt.gz file.
  • Decompress the file, e.g., gzip -d /path/to/com-orkut.ungraph.txt.gz.
  • Convert the edge list into binary CSR, e.g., /path/to/text_to_bin.bin /path/to/com-orkut.ungraph.txt 1 0. Note, the first number '1' means we want to reverse the edge, the second number '0' means we will skip 0 lines from /path/to/com-orkut.ungraph.txt. Eventually, this commandline will yield two files: /path/to/com-orkut.ungraph.txt_beg_pos.bin and /path/to/com-orkut.ungraph.txt_csr.bin.
  • Then you can use these two files to run enterprise.

Graph reader

  • Compile: make
  • To execute: type "./graph_loader.bin", it will show you what is needed
  • You can use the converter converted graph to run this command.

Toy example from converter:

  • ./graph_loader.bin ../tuple_text_to_binary_csr/toy_graph/toy.dat_beg_pos.bin ../tuple_text_to_binary_csr/toy_graph/toy.dat_csr.bin ../tuple_text_to_binary_csr/toy_graph/toy.dat_weight.bin

Further Development

This repo serves as the purpose of helping developers to, instead of distracted by coding tools to convert graph, immediately focusing on developing graph algorithms, such as, BFS, SSSP, PageRank, and etc.


Acknowledgement

The toy graph is based on the example of Figure 1 in the following paper:

[SC '15] Enterprise: Breadth-First Graph Traversal on GPUs [PDF] [Slides] [Blog]

graph_project_start's People

Contributors

anil-gaihre avatar anna18-uml avatar asherliu avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.