Giter Site home page Giter Site logo

parallel_pagerank's Introduction

parallel_pageRank

Faculty of Electrical and Computer Engineering AUTH 4th assignment on Parallel and Distributed Systems (7th semester)

Two versions of the PageRank program provided:

  • Serial version implementing the iterative method (pagerank_iter.c)
  • Parallel version using OpenMP and implementing the Gauss-Seidel method (pagerank_gs_omp.c)

In order to compile the programs run the following commands

  • For pagerank_iter.c run : gcc pagerank_iter.c -o pagerank_iter -O3 -lm
  • For pagerank_gs_omp.c run : gcc pagerank_gs_omp.c -o pagerank_gs_omp -O3 -lm -fopenmp
  • (or you can run the provided compile.sh to compile both)

The programs accept the following arguments:

  1. (both versions): The path that contains the data on which the algorithm will be applied. The data needed are:
    • sparseMatrix.bin: A binary file containing an adjacency matrix of size (NxN) in sparse representation.So the binary file actually contains an integer array of 2 columns and rows as many as the non-zero elements of the full adjacency matrix.The first column contains the row-number of the non-zero element on the full adjacency matrix and the second column contains the column-number of the element.The matrix represents a directed graph with N nodes and edges defined by the elements A[i][j] (directed edge from node i to node j).
    • labels.txt: A text file containing a 1d array of size (N) and type (string).The array contains the label of each node.
    • matResults.bin: A binary file containing a 1d array of size (N) and type (double).The array contains the PageRanks that have been calculated on matlab for the same dataset.It is used to verify that the results from this algorithm are correct.
  2. (both versions): The number of the nodes (N) of the given dataset.
  3. (parallel version only): The number of threads to be used for the parallelism.

In order to execute the programs you can run the provided run.sh.
You can also edit the run.sh in order to change the dataset(and it's size) to be used and the number of threads to be created.

Both of the programs calculate the PageRank vector of the nodes and print:

  • The time taken for the algorithm to converge.
  • The convergence error.
  • The iterations taken for the algorithm to converge.
  • The divergence of the calculated Pagerank vector from the matlab one.

References:

parallel_pagerank's People

Contributors

polizois avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.