Giter Site home page Giter Site logo

cuvite's Introduction

*************
-------------
 COMPILATION
-------------
*************

Pass -DDEBUG_PRINTF if detailed diagonostics is required along
program run. This program requires OpenMP and C++11 support,
so pass -fopenmp (for g++)/-qopenmp (for icpc) and -std=c++11/
-std=c++0x. For Cray systems, use CC.

Pass -DUSE_32_BIT_GRAPH if number of nodes in the graph are 
within 32-bit range (2 x 10^9), else 64-bit range is assumed.

Pass -DDONT_CREATE_DIAG_FILES if you dont want to create 2 files
per process with detail diagonostics.

Upon building, the program will generate a binary file named 
`graphClustering`.


Cmake build option:

$ mkdir build install
$ cd build
$ cmake ../ -DCMAKE_INSTALL_PREFIX=~/vite-gpu/install -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx
$ make -j4
$ make install

*******************
-------------------
 GRAPH CONVERSION
-------------------
*******************

The file format of cuVite is compatible with Vite, so please see Vite 
README for information on binary file conversion from the various native
formats.
https://github.com/Exa-Graph/vite/blob/master/README

Also, see above for comparing communities with a ground-truth.

***********************
-----------------------
 EXECUTING THE PROGRAM
-----------------------
***********************

0. Input file conversion to Vite binary format from the respective native 
file formats: convert the input file to binary format using fileConvert.

1. Set the desired number of processes and threads.  

2. Run distributed parallel Louvain algorithm (graphClustering binary) 
using the binary file created in Step #0:

mpiexec -n 2 bin/./graphClustering -c -i -f karate.bin

Possible options (can be combined):

1. -c <ncolors>  : Enable coloring, adjacent vertices are processed
                   in different order. Synchronization happens once
                   per <ncolors>.
2. -d <ncolors>  : Enable coloring, adjacent vertices are processes
                   in different order. No synchronization.
3. -i            : Threshold cycling, threshold changes dynamically
                   in every phase of Louvain algorithm.
4. -t 1          : If a vertex is in the same community for past 3
                   iterations, then consider it inactive.
5. -t 3          : If a vertex is in the same community for past 3
                   iterations, then consider it inactive. Also,
                   adds a communication step to gather inactive 
                   vertices and terminate Louvain if >= 90% vertices
                   at an iteration are inactive.
6. -t 2 -a <0-1> : Early termination* using probability alpha.
7. -t 4 -a <0-1> : Early termination* using probability alpha. Also,
                   adds a communication step to gather inactive 
                   vertices and terminate Louvain if >= 90% vertices
                   at an iteration are inactive.
8. -b            : Only valid for real-world inputs. Attempts to 
                   distribute approximately equal number of edges among 
                   processes. Irregular number of vertices owned by a 
                   particular process. Increases the distributed graph 
                   creation time due to serial overheads, but may 
                   significantly improve the overall execution time.
9. -o            : Output communities into a file. This option will result 
                   in Vite dumping the communities (community-per-vertex in 
                   each line, total number of lines == number of vertices) 
                   in a text file named <input-binary-file>.communities in 
                   the same path as the input binary file.
10. -r <nranks>  : This is used to control the number of aggregators in MPI 
                   I/O and is meaningful when an input binary graph file is 
                   passed with option "-f".
                   naggr := (nranks > 1) ? (nprocs/nranks) : nranks;
11. -g <gfile>   : Pass a ground truth file for community comparison. We 
                   expect the ground truth file to contain N lines (equal to 
                   the total #vertices in the graph), while each line containing 
                   a distinct vertex ID and associated community ID, separated by 
                   a space or tab. Ground truth community comparison is performed
                   in a single node, and it uses OpenMP to parallelize. It may take
                   a substantial amount of time for large files.
12. -z           : Only applicable if "-g <gfile>" option is passed. This tells us
                   that the passed ground truth file is 1-based. If this option is
                   not passed, we assume the ground truth to the 0-based.

*Note: Option to update inactive vertices percentage is defined as
macro ET_CUTOFF in louvain.hpp, and the default is 2%.

*Note: Only the default version has been ported on GPUs, most of the 
heuristics mentioned above will only work for the CPU-only version.
See Vite: https://github.com/Exa-Graph/vite/blob/master/README

***************
---------------
 OUTPUT RESULT
---------------
***************

If -DDONT_CREATE_DIAG_FILES is passed during compilation,
then output is send to stdout.
Otherwise, the output result is dumped per process on files 
named as dat.out.<process-id>. Check dat.out.0 to review 
program diagonostics. 
Output files are cleared with: `make clean`.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.