graphalytics-platforms-powergraph's Introduction

Graphalytics PowerGraph platform driver

Getting started

This is a Graphalytics benchmark driver for PowerGraph. Please refer to the documentation of Graphalytics core for an introduction to using Graphalytics.

  • Make sure that you have installed Graphalytics.
  • Download the source code from this repository.
  • Execute mvn clean package in the root directory (see details in Software Build).
  • Extract the distribution from graphalytics-{graphalytics-version}-powergraph-{platform-version}.tar.gz.
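
The archive name in the last step follows a fixed pattern, so it can be derived from the two version numbers. The following is a minimal sketch, not part of the driver: the helper names distribution_name and extract_distribution are hypothetical, and the versions 0.3 and 0.1 are only example values.

```python
import tarfile
from pathlib import Path


def distribution_name(graphalytics_version, platform_version):
    """Build the archive name following the pattern given in the step above."""
    return f"graphalytics-{graphalytics_version}-powergraph-{platform_version}.tar.gz"


def extract_distribution(archive, dest="."):
    """Unpack a gzip-compressed tar archive into dest (hypothetical helper)."""
    target = Path(dest)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(target)
    return target


# Example values only; substitute the versions of your own build.
print(distribution_name("0.3", "0.1"))
```

Calling extract_distribution with the path of the archive produced by the Maven build unpacks it; where Maven places the archive is not stated here, so the path is left to the caller.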

The following dependencies are required for this platform extension:

Software           Version (tested)  Usage       Description                Provided
C Compiler         gcc 5.2.1         Build       Building PowerGraph code   -
PowerGraph         2.2               Platform    PowerGraph implementation  -
CMake              3.2.2             Build       Building PowerGraph code   -
GNU Make           4.0               Build       Building PowerGraph code   -
OpenMPI or MPICH2  1.10.3            Deployment  Job deployment             -

Download PowerGraph, unpack it into any directory, patch the CMakeLists.txt file using the diff provided in bin/utils/, and build it in full following the instructions given by the authors. Note that Graphalytics does not support HDFS as a data source for PowerGraph, so it is recommended to compile with the --no_jvm flag.

Alternatively, you can use the build-distribution.sh script, which automates the steps described above.

Finally, refer to the documentation of the Graphalytics core on how to build and run this platform repository.

PowerGraph-implementation-specific configuration

Edit config/powergraph.properties to change the following settings:

  • platform.powergraph.home: Set to the root directory where PowerGraph has been installed.
  • platform.powergraph.num-threads: Set the number of threads PowerGraph should use.
  • platform.powergraph.nodes: Set the names of the computation nodes, e.g., 10.149.0.55\,10.149.0.56 (note: the addresses are separated by \, with no spaces in between).
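
The escaped, space-free format of platform.powergraph.nodes is easy to get wrong by hand. The following is a minimal sketch of generating the value from a list of hosts; the helper name format_nodes_property is hypothetical and not part of the driver. The backslash before each comma follows the example above, presumably so that the properties loader does not split the value at the commas.

```python
def format_nodes_property(hosts):
    """Join host names or IPs with an escaped comma and no spaces."""
    cleaned = [h.strip() for h in hosts if h.strip()]
    for host in cleaned:
        # A space or comma inside a single entry would break the format.
        if " " in host or "," in host:
            raise ValueError(f"invalid host entry: {host!r}")
    return "\\,".join(cleaned)


nodes = format_nodes_property(["10.149.0.55", "10.149.0.56"])
print("platform.powergraph.nodes = " + nodes)
```

The printed line can be pasted into config/powergraph.properties.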

Known Issues

  • PowerGraph does not support machines with more than 64 threads. A workaround has been proposed in this issue.
  • The PowerGraph installation process is somewhat outdated and has a few broken links to dependencies. Patching the CMakeLists.txt file with our diff fixes these broken URIs.

Running the benchmark

To execute a Graphalytics benchmark on PowerGraph (using this driver), follow the steps in the Graphalytics tutorial on Running Benchmark.

graphalytics-platforms-powergraph's People

Contributors

alexandru-uta, amusaafir, clemaire98, maninthegithub, stijnh, thegeman, wlngai

graphalytics-platforms-powergraph's Issues

Disable integration test.

Some integration tests have been added as unit tests, and they will not succeed unless the proper environment is set up. These tests need to be declared as integration tests.

b<nworkers for any value of threads

Hello. I am running on a 72-core machine. I saw from a previous issue for GraphLab that there may be concerns running on more than 64 threads, but when setting the powergraph.properties.num-threads to any value (in this case 4) the program still won't run. Here is the error.

Forgive me if this is a misconfiguration from PowerGraph.

ERROR:    fiber_control.cpp(launch:270): Check failed: b<nworkers  [4 < 4]

17:41:30.436 [ERROR] Algorithm "Local clustering coefficient" on graph "dota-league failed to complete:
nl.tudelft.graphalytics.PlatformExecutionException: failed to execute command
	at nl.tudelft.graphalytics.powergraph.PowerGraphPlatform.executeAlgorithmOnGraph(PowerGraphPlatform.java:122) ~[graphalytics-platforms-powergraph-std-0.1.jar:?]
	at nl.tudelft.graphalytics.BenchmarkSuiteRunner.execute(BenchmarkSuiteRunner.java:133) [graphalytics-platforms-powergraph-std-0.1.jar:?]
	at nl.tudelft.graphalytics.Graphalytics.main(Graphalytics.java:48) [graphalytics-platforms-powergraph-std-0.1.jar:?]
Caused by: java.io.IOException: unexpected error code
	at nl.tudelft.graphalytics.powergraph.PowerGraphJob.run(PowerGraphJob.java:80) ~[graphalytics-platforms-powergraph-std-0.1.jar:?]
	at nl.tudelft.graphalytics.powergraph.PowerGraphPlatform.executeAlgorithmOnGraph(PowerGraphPlatform.java:120) ~[graphalytics-platforms-powergraph-std-0.1.jar:?]
	... 2 more

Support more configurations?

hi,

I have run the driver successfully in distributed environments : )

To conduct more experiments, I'm wondering whether the driver supports more configurations of powergraph than these already shown in platform.properties (namely platform.powergraph.nodes, platform.powergraph.num-threads). If it can support more than these two, can you list all the supported configuration options? Or shall I look into the code to identify?

Looking forward to your reply.

mvn package can't find configuration file when running tests

When doing the usual business after installing PowerGraph, one gets

-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running nl.tudelft.graphalytics.powergraph.algorithms.bfs.BreadthFirstSearchJobTest
02:17:48.866 [WARN ] failed to load powergraph.properties
org.apache.commons.configuration.ConfigurationException: Cannot locate configuration source powergraph.properties
	at org.apache.commons.configuration.AbstractFileConfiguration.load(AbstractFileConfiguration.java:259) ~[commons-configuration-1.10.jar:1.10]
	at org.apache.commons.configuration.AbstractFileConfiguration.load(AbstractFileConfiguration.java:238) ~[commons-configuration-1.10.jar:1.10]
	at org.apache.commons.configuration.AbstractFileConfiguration.<init>(AbstractFileConfiguration.java:158) ~[commons-configuration-1.10.jar:1.10]
	at org.apache.commons.configuration.PropertiesConfiguration.<init>(PropertiesConfiguration.java:252) ~[commons-configuration-1.10.jar:1.10]
	at nl.tudelft.graphalytics.powergraph.Utils.loadConfiguration(Utils.java:69) [test-classes/:?]
	at nl.tudelft.graphalytics.powergraph.algorithms.bfs.BreadthFirstSearchJobTest.execute(BreadthFirstSearchJobTest.java:55) [test-classes/:?]
	at nl.tudelft.graphalytics.powergraph.algorithms.bfs.BreadthFirstSearchJobTest.executeUndirectedBreadthFirstSearch(BreadthFirstSearchJobTest.java:42) [test-classes/:?]
	at nl.tudelft.graphalytics.validation.algorithms.bfs.BreadthFirstSearchValidationTest.testUndirectedBreadthFirstSearchOnValidationGraph(BreadthFirstSearchValidationTest.java:91) [graphalytics-validation-0.3.jar:?]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_91]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_91]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_91]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_91]
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) [junit-4.11.jar:?]
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) [junit-4.11.jar:?]
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) [junit-4.11.jar:?]
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) [junit-4.11.jar:?]
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) [junit-4.11.jar:?]
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) [junit-4.11.jar:?]
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) [junit-4.11.jar:?]
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) [junit-4.11.jar:?]
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) [junit-4.11.jar:?]
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) [junit-4.11.jar:?]
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) [junit-4.11.jar:?]
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) [junit-4.11.jar:?]
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309) [junit-4.11.jar:?]
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264) [surefire-junit4-2.17.jar:2.17]
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153) [surefire-junit4-2.17.jar:2.17]
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124) [surefire-junit4-2.17.jar:2.17]
	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200) [surefire-booter-2.17.jar:2.17]
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153) [surefire-booter-2.17.jar:2.17]
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103) [surefire-booter-2.17.jar:2.17]

Tests fail due to missing PowerGraph binaries

Currently none of the validation tests pass on Jenkins because the algorithms are not compiled during the build process. Due to the size of the main PowerGraph repository, pulling and building the PowerGraph dependency for every build is not feasible. A possible solution is compiling PowerGraph once as a shared library on the build server and linking against that library in future builds of this project (to be discussed with the maintainers of our Jenkins server).

run-benchmark.sh unable to link files correctly

After packaging with mvn package -DskipTests, running ./run-benchmark.sh yields

[ 50%] Linking CXX executable main
/home/users/spollard/graphalytics/PowerGraph/release/src/graphlab/libgraphlab.a(hdfs.cpp.o): In function `graphlab::hdfs::hdfs(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned short)':
/home/users/spollard/graphalytics/PowerGraph/src/graphlab/util/hdfs.hpp:110: undefined reference to `hdfsConnect'
/home/users/spollard/graphalytics/PowerGraph/release/src/graphlab/libgraphlab.a(hdfs.cpp.o): In function `graphlab::hdfs::~hdfs()':
/home/users/spollard/graphalytics/PowerGraph/src/graphlab/util/hdfs.hpp:115: undefined reference to `hdfsDisconnect'
collect2: error: ld returned 1 exit status
CMakeFiles/main.dir/build.make:96: recipe for target 'main' failed
make[2]: *** [main] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/main.dir/all' failed
make[1]: *** [CMakeFiles/main.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

This can be (temporarily) patched by editing bin/standard/CMakeFiles/main.dir/link.txt to include the -lhdfs flag. However, running the benchmark again yields

$ ./run-benchmark.sh 
grep: /disks/large/home/users/spollard/graphalytics/graphalytics-platforms-powergraph/graphalytics-0.3-powergraph-0.1/config//granula.properties: No such file or directory
-- Configuring done
-- Generating done
-- Build files have been written to: /home/users/spollard/graphalytics/graphalytics-platforms-powergraph/graphalytics-0.3-powergraph-0.1/bin/standard
[ 50%] Linking CXX executable main
/home/users/spollard/graphalytics/PowerGraph/deps/local/lib/libhdfs.a(hdfsJniHelper.o): In function `getJNIEnv':
/home/users/spollard/graphalytics/PowerGraph/deps/hadoop/src/hadoop/src/c++/libhdfs/hdfsJniHelper.c:404: undefined reference to `JNI_GetCreatedJavaVMs'
/home/users/spollard/graphalytics/PowerGraph/deps/hadoop/src/hadoop/src/c++/libhdfs/hdfsJniHelper.c:458: undefined reference to `JNI_CreateJavaVM'
collect2: error: ld returned 1 exit status
CMakeFiles/main.dir/build.make:96: recipe for target 'main' failed
make[2]: *** [main] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/main.dir/all' failed
make[1]: *** [CMakeFiles/main.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2
spollard@arya:~/graphalytics/graphalytics-platforms-powergraph/graphalytics-0.3-powergraph-0.1🍺 vim bin/standard/CMakeFiles/main.dir/link.txt
spollard@arya:~/graphalytics/graphalytics-platforms-powergraph/graphalytics-0.3-powergraph-0.1🍺 ./run-benchmark.sh
grep: /disks/large/home/users/spollard/graphalytics/graphalytics-platforms-powergraph/graphalytics-0.3-powergraph-0.1/config//granula.properties: No such file or directory
-- Configuring done
-- Generating done
-- Build files have been written to: /home/users/spollard/graphalytics/graphalytics-platforms-powergraph/graphalytics-0.3-powergraph-0.1/bin/standard
[ 50%] Linking CXX executable main
/home/users/spollard/graphalytics/PowerGraph/deps/local/lib/libhdfs.a(hdfsJniHelper.o): In function `getJNIEnv':
/home/users/spollard/graphalytics/PowerGraph/deps/hadoop/src/hadoop/src/c++/libhdfs/hdfsJniHelper.c:404: undefined reference to `JNI_GetCreatedJavaVMs'
/home/users/spollard/graphalytics/PowerGraph/deps/hadoop/src/hadoop/src/c++/libhdfs/hdfsJniHelper.c:458: undefined reference to `JNI_CreateJavaVM'
collect2: error: ld returned 1 exit status
CMakeFiles/main.dir/build.make:96: recipe for target 'main' failed
make[2]: *** [main] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/main.dir/all' failed
make[1]: *** [CMakeFiles/main.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

Notice that this is with my LD_LIBRARY_PATH set to
/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/
