Giter Site home page Giter Site logo

dzhi1993 / fastsc Goto Github PK

View Code? Open in Web Editor NEW

This project forked from yuj-umd/fastsc

0.0 0.0 0.0 6.84 MB

A fast implementation of spectral clustering on GPU-CPU Platform

License: Apache License 2.0

MATLAB 10.04% Makefile 2.02% C++ 54.01% Cuda 33.93%

fastsc's Introduction

A Fast Implementation of Spectral Clustering on GPU-CPU Platforms

Introduction

This software package provides a fast implementation of spectral clustering on GPU and CPU platforms.

This work is published on IPDPS 2016 workshop titled as "A high performance implementation of spectral clustering on CPU-GPU platforms" authored by Yu Jin and Joseph F. JaJa.

If you use the software in your applications, please cite the paper as

@inproceedings{jin2016,
title={A high performance implementation of spectral clustering on cpu-gpu platforms},
author={Jin, Yu and JaJa, Joseph F},
booktitle={Parallel and Distributed Processing Symposium Workshops, 2016 IEEE International},
pages={825--834},
year={2016},
organization={IEEE}
}

Spectral clustering is one of the most popular clustering algorithms for finding structural communities in graphs. However, the running time of the algorithm is usually quite long as it involves very expensive numerical operations, i.e. finding the smallest few eigenvectors of a real symmetric matrix.

In this package, we provides a fast implementation of the spectral clustering algorithm which is significantly faster than using other CPU-based software packages such as Matlab and Python. As far as we know, our implementation is also the fastest implementation available in the open source community.

The implementation contains three parts:

  • Normalize the edge weights by the inverse of the corresponding node degrees.
  • Computet the first k eigenvectors of the normalized Laplacian matrix based on arpackpp package and CUDA libraries.
  • Apply k-means algorithm on rows of the k eigenvectors. The implementation of k-means algorithm is originally developed by Bran Catanzaro at https://github.com/bryancatanzaro/kmeans

Each part can be easily divided for individual functional usage.

If you have trouble working with the software package, please contact Yu Jin (yuj AT umd.edu).

Installation

CUDA Environment Setup

CUDA libraries, such as CUSPARSE, CUBLAS and Thrust are pre-installed.

arpackpp installation

Check out ARPACK++ package from https://github.com/yuj-umd/arpackpp

$ git clone https://github.com/yuj-umd/arpackpp.git
$ cd arpackpp

Install the libraries

$ ./install-openblas.sh
$ ./install-arpack-ng.sh
$ ./install-superlu.sh
$ ./install-suitesparse.sh

Change the directory and library path in Makefile.inc, as instructed in Makefile_example.inc.

fastsc installation

Check out the code from https://github.com/yuj-umd/fastsc

$ git clone https://github.com/yuj-umd/fastsc.git
$ cd fastsc

Modify the arpack library path and cuda architecture in Makefile.

Compile and run the program

$ make
$ ./spectral_clustering input_file_name n k output_file_name

Usage

The program format is

$ ./spectral_clustering input_file_name n k output_file_name

The input file contains the graph information represented as edge list. By default, the program supports unweighted graphs where each row contains two node indices. It is easy to adapt the code for weighted graphs and other graph representations.

n is the total number of nodes and k is the desired number of clusters. The graph nodes are indexed from 0 to n-1 and there are NO isolated nodes.

output file will contain the node ID and the corresponding label.

Two input examples are contained in Dataset folder.

Benchmarks

The Benchmark folder contains Matlab code with the same function. Our implementation is faster than the naive Matlab implementation especially for large-scale problems.

Reference

Jin, Yu, and Joseph F. JaJa. "A high performance implementation of spectral clustering on cpu-gpu platforms." Parallel and Distributed Processing Symposium Workshops, 2016 IEEE International. IEEE, 2016.

fastsc's People

Contributors

yuj-umd avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.