Giter Site home page Giter Site logo

gspan's Introduction

gSpan

gSpan is an algorithm for mining frequent subgraphs.

Library

Header-only in include directory

Example command

$ ./example/gspan --help
Usage: gspan [options]
Graph-based substructure pattern mining.
Depending on the graph count in input, there are two modes:
  1. input contains one graph. Mined patterns belong to this one;
       in this case only --mincount=NUM option is used
  2. input contains many graphs. Mined patterns belong to some graph in input;
       in this case --minsupp=NUM option is used, as more useful.
Options:
  -i, --input FILE        file to read, default stdin
  -o, --output FILE       file to write, default stdout
  -c, --mincount NUM      minimal count, integer value, default 1
  -s, --minsupp NUM       minimal support, 0..1
  -l, --legacy            use tgf format for input and output (slower!)
  -e, --embeddings [opts] none, autgrp, all. default is none
  -h, --help              this help

Test

Performace Test

Data format

two formats supported:

First format (recomodated)

Input data is a one or many input graphs. Output data is one or many patterns with mappings to input graph. # is comment line

#
# begin section with input graph (or transaction)
#
t <graph_id>
v <vertex_id> <value>
...[repeat v]...
e <edge_id> <vertex_id> <vertex_id> <value>
...[repeat e]...

...[repeat t]..


#
# begin section with mined pattern, it is also graph
#
p <pattern_id> # occurence: <num>
v <vertex_id> <value>
...[repeat v]...
e <edge_id> <vertex_id> <vertex_id> <value>
...[repeat e]...

#
# each pattern has multiple mappings to original input graph
#
m <mapping_id>
v <vertex_id> <graph_id> <vertex_id> # map pattern vertex to original vertex
...[repeat v]..
e <edge_id> <graph_id> <edge_id>     # map pattern edge to original edge
...[repeat e]...

...[repeat m]...

...[repeat p]...

Legacy format


#
# input
#
t # <graph_id>
v <vertex_id> <value>
e <vertex_id> <vertex_id> <value>

#
# output
#
t # <pattern_id> * <support>
parent : <pattern_id>
v <vertex_id> <value>
e <vertex_id> <vertex_id> <value>
x: <graph_id> ...

Reference

gSpan: Graph-Based Substructure Pattern Mining, by X. Yan and J. Han. Proc. 2002 of Int. Conf. on Data Mining (ICDM'02).

gspan's People

Contributors

stvdedal avatar

Watchers

Fan avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.