GraphMini

Artifacts for "GraphMini: Accelerating Graph Pattern Matching Using Auxiliary Graphs"

Github link: https://github.com/ds-umass/GraphMini/

Hardware Requirements

128GB of free RAM to preprocess the graph Friendster correctly.
180GB of free disk space to store preprocessed graphs.

Software Requirements

Operating System

Ubuntu 22.04 (tested)

Libraries

CMake (Version >= 3.20)
Make
MPI
GCC compiler (>= 7)
Python (>= 3.8)
bc
clang-format (optional)

Install libraries

sudo apt install bc cmake mpich clang-format -y

Tested Graph Data

Wiki
YouTube
Patents
LiveJournal
Orkut
Friendster

All graphs are treated as unlabelled and undirected graphs. We also provide scripts to download and preprocess these data automatically. See details below.

Project Structure (including two baseline systems: GraphPi and Dryadic)

- Datasets
    Subfolders:
    - Dryadic (preprocessed graph data for Dryadic)
    - GraphPi (preprocessed graph data for GraphPi)
    - GraphMini (preprocessed graph data for GraphMini)
    - TXT (downloaded graph data from SNAP)
    Scripts: the following scripts are used to download and preprocess datasets used in the experiments
    - download.sh (downloads data from SNAP)
    - dryadic_prep.sh scripts for preprocessing datasets into format can be handled by Dryadic
    - graphpi.sh scripts for preprocessing datasets into format can be handled by GraphPi
    - graphmini_prep.sh scripts for preprocessing datasets into format can be handled by GraphMini
    - prep.sh run all the scripts above
- Dryadic (source code/binaries of Dryadic)
- GraphPi (source code/binaries of GraphPi)
- GraphMini (source code/binaries of GraphMini)
- Experiments 
    Subfolders:
    - Dryadic (experiment results for Dryadic)
    - GraphMini (experiment results for GraphMini)
    - GraphPi (experiment results for GraphPi)
    Scripts: the following scripts are used to automate running the benchmark experiments that appeared in the paper
    - queries.sh (defines what queries and graphs to run in the experiment)
    - run_dryadic_edge.sh (test dryadic on edge-induced pattern defined in queries.sh)
    - run_dryadic_vertex.sh (test dryadic on vertex-induced pattern defined in queries.sh)
    - run_graphpi_edge_iep.sh (test graphpi on edge-induced pattern defined in queries.sh, with inclusion-exclusion optimization)
    - run_graphpi_edge.sh (test graphpi on edge-induced pattern defined in queries.sh, without inclusion-exclusion optimization)
    - run_graphmini_vertex.sh (test GraphMini on vertex-induced pattern defined in queries.sh, without inclusion-exclusion optimization)
    - run_graphmini_edge.sh (test GraphMini on edge-induced pattern defined in queries.sh, without inclusion-exclusion optimization)
    - run_graphmini_edge_iep.sh (test GraphMini on edge-induced pattern defined in queries.sh, with inclusion-exclusion optimization)
    - run_base_vertex.sh (test Base on vertex-induced pattern defined in queries.sh, without inclusion-exclusion optimization)
    - run_base_edge.sh (test Base on edge-induced pattern defined in queries.sh, without inclusion-exclusion optimization)
    - run_base_edge_iep.sh (test Base on edge-induced pattern defined in queries.sh, with inclusion-exclusion optimization)

How to reproduce the experiment results

Build the source code for Dryadic, GraphPi and GraphMini:

bash ./build.sh

Download the dataset from SNAP and preprocess the datasets for Dryadic, GraphPi, and GraphMini:

bash ./Datasets/prep.sh

Define the queries to run and graphs to test by modifying the following file

./Experiments/queries.sh

The Queries variable defines the queries to test. It takes inputs in (undirected) adjacency matrix format.

The QuerySizes variable defines the number of vertex in each of the query patterns.

The GraphNames defines the data graphs to run the experiments. Examples are given in the script. Notice that the default setting will run experiments on all the graphs, which can take a significant amount of time, you can modify the GraphNames to run experiments on graphs of interest only.

The TIMEOUT defines the maximum amount of time before terminating a query. Notice that some queries can run for a very long time (>24h). You can modify the "TIMEOUT" variable to change the upper bound of query execution time. GraphMini can finish most queries in less than 12 hours on a 32-core system.

Reproduce the benchmark experiments:

nohup bash ./Experiments/run.sh &

This script will run all the tested systems (GraphPi, Dryadic, GraphMini, Base) by invoking all the tested scripts in the directory (e.g. run_graphmini_vertex.sh). Each script can take more than 3 days to run. So if you want to get results quickly, you can use multiple machines and run them separately by invoking each script one by one.

The script will generate a log file for each tested query. The path to the log file is named as:

./Experiments/{System}_{Graph}_{Query}_{QueryType}/run.log

Generating Tables

To generate CSV tables corresponding to "Fig 5: GraphMini vs State-of-The-Art"

Fig 5a:

cd Experiments && python3 log2csv.py --baseline=Dryadic --target=GraphMini --adjtype=VertexInduced

Fig 5b:

cd Experiments && python3 log2csv.py --baseline=Dryadic --target=GraphMini --adjtype=EdgeInduced

Fig 5c:

cd Experiments && python3 log2csv.py --baseline=GraphPi --target=GraphMini --adjtype=EdgeInduced

Fig 5d:

cd Experiments && python3 log2csv.py --baseline=GraphPi --target=GraphMini --adjtype=EdgeInducedIEP

To generate tables corresponding to "Fig 6: GraphMini vs Base" Fig 6a:

cd Experiments && python3 log2csv.py --baseline=Base --target=GraphMini --adjtype=EdgeInduced

Fig 6b:

cd Experiments && python3 log2csv.py --baseline=Base --target=GraphMini --adjtype=VertexInduced

To learn more about GraphMini

See more detail in GraphMini/README.md

ds-umass / graphmini Goto Github PK

graphmini's Introduction

GraphMini

Hardware Requirements

Software Requirements

Operating System

Libraries

Install libraries

Tested Graph Data

Project Structure (including two baseline systems: GraphPi and Dryadic)

How to reproduce the experiment results

To learn more about GraphMini

graphmini's People

Contributors

Stargazers

Watchers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent