magpie's Introduction

Magpie

Magpie contains a number of scripts for running Big Data software in HPC environments. Thus far, Hadoop, Spark, Hbase, Storm, Pig, Phoenix, Kafka, Zeppelin, Zookeeper, and Alluxio are supported. It currently supports running over the parallel file system Lustre and over any generic network filesystem. There is scheduler/resource manager support for Slurm, Moab, Torque, LSF, and Flux.

Some of the features presently supported:

  • Run jobs interactively or via scripts.
  • Run against a number of filesystem options, such as HDFS, HDFS over Lustre, HDFS over a generic network filesystem, Lustre directly, or a generic network filesystem.
  • Take advantage of SSDs/NVRAM for local caching if available.
  • Make decent optimizations for your hardware.

Experimental support for several distributed machine learning frameworks has also been added. Presently TensorFlow, TensorFlow with Horovod, and Ray are supported.

Basic Idea

The basic idea behind these scripts is to:

  1. Submit a Magpie batch script to allocate nodes on a cluster using your HPC scheduler/resource manager. Slurm, Slurm+mpirun, Moab+Slurm, Moab+Torque, LSF+mpirun, and Flux are currently supported. (A minimal submission sketch follows this list.)

  2. The batch script will create configuration files for all appropriate projects (Hadoop, Spark, etc.). The configuration files will be set up so the rank 0 node is the "master". All compute nodes will have configuration files created that point to the node designated as the master server.

    The configuration files will be populated with values for your filesystem choice and the hardware that exists in your cluster. Reasonable attempts are made to determine optimal values for your system and hardware (they are almost certainly better than the default values). A number of options exist in the batch scripts to adjust these values for individual jobs.

  3. Launch daemons on all nodes. The rank 0 node will run master daemons, such as the Hadoop Namenode. All remaining nodes will run appropriate worker daemons, such as the Hadoop Datanodes.

  4. Now you have a mini big data cluster at your disposal. You can log into the master node and interact with it directly, or you can have Magpie run a script that executes your big data calculation for you.

  5. When your job completes or your allocation time has run out, Magpie will clean up your job by tearing down daemons. When appropriate, Magpie may also do some additional cleanup work to make re-execution on later runs cleaner and faster.
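
For example, on a Slurm system the submission in step 1 might look roughly like the following. This is a hedged sketch: the template name and the variable names in the comments are illustrative and vary between Magpie versions, so treat doc/README as authoritative.

  # Copy one of Magpie's submission templates and edit it for your site and job
  cp magpie.sbatch-hadoop my-hadoop-job.sbatch

  # Inside the script, point Magpie at your installation and filesystem choice,
  # e.g. (variable names illustrative):
  #   export HADOOP_VERSION="3.3.4"
  #   export HADOOP_FILESYSTEM_MODE="hdfsoverlustre"
  #   export HADOOP_HDFSOVERLUSTRE_PATH="/lustre/${USER}/hdfsoverlustre"

  # Submit it like any other batch job; Magpie performs steps 2-5 inside the job
  sbatch my-hadoop-job.sbatch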

Supported Packages & Versions

For a complete list of supported package versions and dependencies, please see doc/README. The following can be considered a summary of support.

Hadoop - 2.2.0, 2.3.0, 2.4.X, 2.5.X, 2.6.X, 2.7.X, 2.8.X, 2.9.X, 3.0.X, 3.1.X, 3.2.X, 3.3.X

Spark - 1.1.X, 1.2.X, 1.3.X, 1.4.X, 1.5.X, 1.6.X, 2.0.X, 2.1.X, 2.2.X, 2.3.X, 2.4.X, 3.0.X, 3.1.X, 3.2.X, 3.3.X, 3.4.X, 3.5.X

Hbase - 1.0.X, 1.1.X, 1.2.X, 1.3.X, 1.4.X, 1.5.X, 1.6.X

Hive - 2.3.0

Pig - 0.13.0, 0.14.0, 0.15.0, 0.16.0, 0.17.0

Zookeeper - 3.4.X

Storm - 0.9.X, 0.10.X, 1.0.X, 1.1.X, 1.2.X

Phoenix - 4.5.X, 4.6.0, 4.7.0, 4.8.X, 4.9.0, 4.10.1, 4.11.0, 4.12.0, 4.13.X, 4.14.0

Kafka - 2.11-0.9.0.0

Zeppelin - 0.6.X, 0.7.X, 0.8.X

Alluxio - 2.3.0

TensorFlow - 1.9, 1.12

Ray - 0.7.0

Older Supported Packages & Features

Some packages and features were dropped due to lack of interest, because the software became old or deprecated, or because they were only ever experimental additions to Magpie. If you are interested in them, please look at older versions of Magpie for supported versions and documentation. If you would like support for one of them restored in current versions of Magpie beyond an experimental nature, please submit a support request and we can reconsider adding it back in.

Removed in Magpie 2.0

  • Hadoop 1.X support
  • Tachyon
  • UDA/uda-plugin for Hadoop
  • HDFS Federation in Hadoop
  • IntelLustre option for a Hadoop Filesystem
  • MagpieNetworkFS option for a Hadoop Filesystem

Removed in Magpie 3.0

  • Spark 0.9.X support
  • Hbase 0.98.X and 0.99.X support
  • Mahout

Documentation

All documentation is in the 'doc' subdirectory. Please see the doc/README file as a starting point. It provides general instructions as well as pointers to documentation for each project, setup requirements, how to do local configurations, tips & tricks, and more.

Release

Magpie is released under a GPL license. For more information, see the COPYING file.

LLNL-CODE-644248

magpie's People

Contributors

akantak, chu11, cmd-ntrf, gijshendriksen, hoeze, ianlee1521, joshuata, milinda, nealepetrillo, sammuli, xunpan


magpie's Issues

Teardown Script

I think it would be nice to move all the teardown logic to a file like magpie-teardown. This would allow one to quickly and safely shut down the cluster while in interactive mode. I'm not sure of all the logistics of how to get all the variables automatically, but I think it wouldn't be too difficult.

I wanted to run it by you and see if you had any thoughts on it before I attempted to do it.
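
For illustration only, a standalone helper along those lines might boil down to something like the sketch below (the script name is hypothetical, and the per-project stop scripts shown are the stock Hadoop/Spark ones, not Magpie's actual teardown code):

  #!/bin/bash
  # magpie-teardown (hypothetical): stop daemons started for this allocation.
  # Assumes the job environment (HADOOP_HOME, SPARK_HOME, ...) is still loaded.

  if [ -n "${HADOOP_HOME}" ]; then
      ${HADOOP_HOME}/sbin/stop-yarn.sh
      ${HADOOP_HOME}/sbin/stop-dfs.sh
  fi

  if [ -n "${SPARK_HOME}" ]; then
      ${SPARK_HOME}/sbin/stop-all.sh
  fi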

latex-ize README

The README is getting gigantic and ridiculous. Need to latex-ize it, generate HTML, PDFs, etc.

hbase quorum

Is there a variable that sets hbase.zookeeper.quorum? When I run interactively, even when I set the conf dir for hbase, this doesn't seem to get set internally. I must set it manually like so:

conf = {"hbase.zookeeper.quorum": host,  # list of nodes
"hbase.mapred.outputtable": table,
"mapreduce.outputformat.class": "org.apache.hadoop.hbase.mapreduce.TableOutputFormat",
"mapreduce.job.output.key.class": "org.apache.hadoop.hbase.io.ImmutableBytesWritable",
"mapreduce.job.output.value.class": "org.apache.hadoop.io.Writable"}

I was hoping I didn't need to set anything, because it requires a lookup.

I tried setting the classpath with --driver-class-path and at first thought it was working. However it does not seem to be the case.
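
As an interim workaround for the quorum lookup, the value can usually be scraped out of the hbase-site.xml that Magpie generates rather than hardcoded. A sketch, assuming HBASE_CONF_DIR points at the job's generated configuration directory and the file uses the usual one-tag-per-line layout:

  # Pull hbase.zookeeper.quorum out of the generated hbase-site.xml
  quorum=$(grep -A1 '<name>hbase.zookeeper.quorum</name>' \
               "${HBASE_CONF_DIR}/hbase-site.xml" \
           | grep '<value>' \
           | sed -e 's/.*<value>//' -e 's|</value>.*||')
  echo "hbase.zookeeper.quorum=${quorum}"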

Documentation?

Hi,

Since I cannot find how to ask this question anywhere else, I will do it here. Feel free to delete it if this is not the right place for it.

I am looking into the code and there is no documentation showing how users should use magpie. I do it like this: in the script-templates folder I use the Makefile for my cluster customizations. Then with magpie run I try to run sparkpi with the resulting configs. I am sure I am doing it wrong, since I am changing so many of the files myself that I have now ended up with scripts of my own.

Will there be any documentation for running and customizing Spark, Hadoop, and Yarn? I would like to use it so that I can have it as a "module add" for other users.

Thanks

Srun error on Cray system

I am trying to run magpie on a Cray XC40 under Slurm 14.03.7.
I did the installation using the misc/magpie-apache-download-and-setup.sh script.
Then I customized the magpie.sbatch-hadoop submission script. But when trying to sbatch -k customized.sh, the job ends with the following error:

srun: error: Unable to create job step: Requested node configuration is not available

There is no way to make the error disappear by modifying the #SBATCH special comments in the script.

The wisdom here is that you should never use srun on Cray and should instead use aprun, because the Cray version of Slurm is not well integrated with their machines. Unfortunately these commands are not equivalent (for example, changing the first srun --no-kill -W 0 to aprun -B makes the first script, magpie-check-inputs, complain about missing SLURM_* variables).

Before trying to hack magpie scripts, I would ask:

  1. Is this a known error? If yes, is there any workaround or debugging tip (besides adding -v four times to srun)?
  2. Have you run magpie on any Cray machine?
  3. On which slurm version have you tested magpie?
  4. If it is not much trouble, could I have a working submission script for slurm and the log generated adding -v -v -v -v to the first srun command?

Thanks for helping!
mario

Ing. Mario Valle
Swiss National Supercomputing Centre (CSCS) | http://mariovalle.name/
v. Trevano 131, 6900 Lugano, Switzerland | Tel: +41 (91) 610.82.60

Support some sort of Monitoring Software

It would be great if we added something like Ambari so that we could check the status of our nodes. In one case, all of my HBase region servers went down and my Spark job just hung around waiting for them to come back up (which they never did, due to an unrelated issue). I had no idea that this had occurred. It would be nice to be able to see what is happening all in one place.

I'm not sure what else is out there other than Ambari at this point.

develop mechanism to access HDFS over Lustre/networkfs w/o launching a job

I believe that through a series of scripting tricks this would be doable. Hypothetically, launch X hdfs datanode daemons as processes and launch a namenode process on the same node. Configure them to use appropriate paths in Lustre/networkfs as their "local drive". Make sure each has separate ports so they can communicate with each other on the same node.
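
A rough sketch of that idea, assuming Hadoop 3.x (hdfs --daemon start), one configuration directory per daemon instance, and illustrative paths. Each datanode conf dir would set distinct dfs.datanode.address/http.address/ipc.address ports, its own dfs.datanode.data.dir under Lustre, and its own HADOOP_PID_DIR/HADOOP_LOG_DIR:

  # Namenode whose dfs.namenode.name.dir points at Lustre
  HADOOP_CONF_DIR=/lustre/${USER}/standalone-conf/nn hdfs --daemon start namenode

  # Several datanode processes on the same node, one conf dir each
  for i in 0 1 2 3; do
      HADOOP_CONF_DIR=/lustre/${USER}/standalone-conf/dn${i} hdfs --daemon start datanode
  done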

Add script/mechanism for downloading packages and patching them

A script/mechanism to download the latest supported versions of Hadoop, Hbase, Zookeeper, etc. and apply patches would be useful for users setting up Magpie for the first time.

In addition, after the download, scripts could be pre-seeded with paths to appropriate locations for the projects.
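
A minimal sketch of what such a helper might do for a single package (the version, URL, and patch path are placeholders):

  #!/bin/bash
  # Download, unpack, and patch one package (illustrative values throughout)
  PACKAGE="hadoop-3.3.4"
  URL="https://archive.apache.org/dist/hadoop/common/${PACKAGE}/${PACKAGE}.tar.gz"
  INSTALL_DIR="${HOME}/bigdata"

  mkdir -p "${INSTALL_DIR}" && cd "${INSTALL_DIR}"
  wget -q "${URL}"
  tar -xzf "${PACKAGE}.tar.gz"

  # Apply any Magpie-provided patches for this version (path is a placeholder)
  (cd "${PACKAGE}" && patch -p1 < /path/to/magpie/patches/hadoop/${PACKAGE}.patch)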

Enhancement proposal: making timeout behavior more dynamic

While testing, I found that when the setup time was shorter than expected, the job was unable to use all the walltime available because of the shutdown timeout.

Quickly, I see two ways to fix this.
A - Make MAGPIE_STARTUP_TIME dynamic instead of a user-set fixed variable. For Moab, we could use the walltime reported by checkjob as the startup time in the Magpie_wait_script function. Something along these lines:

  # Parse HH:MM:SS from checkjob and convert to whole minutes, rounding up
  walltime=$(checkjob ${MOAB_JOBID} | grep -Po '(?<=WallTime:  \s).*' | cut -d' ' -f1)
  startuptime=$((10#${walltime:0:2}*60 + 10#${walltime:3:2} + (10#${walltime:6:2} > 0)))
  scriptsleepamounttemp=`expr ${MAGPIE_TIMELIMIT_MINUTES} - ${startuptime}`

B - Replace Magpie_wait_script with a signal-catching mechanism. Moab can be told to send a pre-termination signal a desired amount of time before the job's wall clock limit expires, for example:

-l signal=SIGHUP@5:00

This signal could be caught with the bash trap command and the script terminated cleanly.

Both solutions are specific to Moab for now, but I think the mechanisms used are provided by most schedulers.

If I had to choose, I would opt for solution B, which I find more elegant as it removes the need for a timeout stopwatch in Magpie.
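
A minimal sketch of what solution B could look like inside the batch script, assuming the -l signal=SIGHUP@5:00 request above and a placeholder function for the actual work:

  cleanup() {
      echo "Caught pre-termination signal; tearing down daemons"
      # ... invoke Magpie's existing teardown logic here ...
      exit 0
  }
  trap cleanup SIGHUP

  # Run the main work in the background and wait on it, so the trap can fire
  # as soon as the scheduler delivers the signal
  run_big_data_job &    # placeholder for the real job
  wait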

Output clearer error message on hdfs over lustre issues

A minimally easy idea: store the Hadoop version in the path, and on subsequent runs give the user a better error message about what they have to do (upgrade HDFS, etc.). Is it possible to read the HDFS version from files within the path? Maybe, or get the version via hadoop commands and store it.

Similarly, store the number of datanodes in the path, and on subsequent runs give a better error message if the user runs with a lower number of nodes.
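
One possible shape for the "store it in the path" idea, as a sketch; the marker filename and node-count variable are made up, and the other variables assume an HDFS-over-Lustre style setup:

  # On first setup, record what this path was formatted with
  marker="${HADOOP_HDFSOVERLUSTRE_PATH}/magpie.hdfs-setup-info"
  if [ ! -f "${marker}" ]; then
      echo "hadoop_version=${HADOOP_VERSION}" > "${marker}"
      echo "datanode_count=${node_count}" >> "${marker}"
  else
      # On later runs, compare and give a clear error instead of a cryptic failure
      grep -q "hadoop_version=${HADOOP_VERSION}" "${marker}" \
          || echo "Error: HDFS here was set up with a different Hadoop version;" \
                  "upgrade HDFS or point the path elsewhere" >&2
  fi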

MAGPIE_PRE_JOB_RUN environment

What would be the best way to source the PRE_JOB_RUN script? It appears it runs within its own environment from magpie-pre-run. For instance, in my pre-run script I need to source a few scripts that set up my environment, including MPI.

I would like to just change that line to source ${MAGPIE_PRE_JOB_RUN}, but I feel like that is not the best way to do this.
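
For reference, the distinction matters because exports and module loads only persist in the caller when the script is sourced rather than executed as a child process. A sketch with an illustrative pre-job script:

  # my-pre-job.sh (illustrative) contains lines like:
  #   module load gcc openmpi
  #   export MPI_HOME=/path/to/mpi

  # Executed as a child process, its exports vanish when it exits:
  ${MAGPIE_PRE_JOB_RUN}

  # Sourced, its exports and module changes remain in the current shell:
  source ${MAGPIE_PRE_JOB_RUN}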

Set java.io.tmpdir appropriately

so scratch data is sent to the LOCAL_DIR directories and not /tmp by default

This especially affects the no-local-dir case, because various things assume they can dump into /tmp.
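
A sketch of the kind of settings involved; the local scratch path shown is illustrative, not Magpie's actual variable:

  # Point JVM temporary files at the job's local scratch instead of /tmp
  export HADOOP_OPTS="-Djava.io.tmpdir=/path/to/local/scratch/tmp ${HADOOP_OPTS}"

  # Spark equivalent via extra Java options in spark-defaults.conf:
  #   spark.driver.extraJavaOptions   -Djava.io.tmpdir=/path/to/local/scratch/tmp
  #   spark.executor.extraJavaOptions -Djava.io.tmpdir=/path/to/local/scratch/tmp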

option to run daemons on separate nodes

Consider an option to run daemons on separate nodes?

i.e. hbase daemons on 4 nodes, spark daemons on 4 different nodes

Would we want to split masters onto different nodes too?
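
Under Slurm, the split itself is straightforward to compute; a sketch using the 4/4 example above (scontrol show hostnames expands the allocation's node list):

  # Split the allocation into two groups of 4 nodes (example sizes)
  nodes=( $(scontrol show hostnames ${SLURM_JOB_NODELIST}) )
  hbase_nodes=( "${nodes[@]:0:4}" )
  spark_nodes=( "${nodes[@]:4:4}" )
  echo "hbase daemons on: ${hbase_nodes[*]}"
  echo "spark daemons on: ${spark_nodes[*]}"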

hbase tar.gz path has been archived

My workaround was to apply the patch below; however, I think the best idea would be to use a newer hbase, so I didn't submit the patch as a pull request.

diff --git a/scripts/misc/magpie-apache-download-and-setup.sh b/scripts/misc/magpie-apache-download-and-setup.sh
index 2b4ec9a..aa6a5a1 100755
--- a/scripts/misc/magpie-apache-download-and-setup.sh
+++ b/scripts/misc/magpie-apache-download-and-setup.sh
@@ -109,9 +109,13 @@ fi

 if [ "${HBASE_DOWNLOAD}" == "Y" ]
 then
-    APACHE_DOWNLOAD_HBASE="${APACHE_DOWNLOAD_BASE}/${HBASE_PACKAGE}"
-
-    HBASE_DOWNLOAD_URL=`wget -q -O - ${APACHE_DOWNLOAD_HBASE} | grep "${HBASE_PACKAGE}" | head -n 1 | grep -o '<a href=['"'"'"][^"'"'"']*['"'"'"]' | sed -e 's/^<a href=["'"'"']//' -e 's/["'"'"']$//'`
+         # Package has been archived
+   if [ "${HBASE_PACKAGE}" == "hbase/hbase-0.98.9/hbase-0.98.9-hadoop2-bin.tar.gz" ]; then
+       HBASE_DOWNLOAD_URL="http://archive.apache.org/dist/${HBASE_PACKAGE}"
+   else
+       APACHE_DOWNLOAD_HBASE="${APACHE_DOWNLOAD_BASE}/${HBASE_PACKAGE}"
+       HBASE_DOWNLOAD_URL=`wget -q -O - ${APACHE_DOWNLOAD_HBASE} | grep "${HBASE_PACKAGE}" | head -n 1 | grep -o '<a href=['"'"'"][^"'"'"']*['"'"'"]' | sed -e 's/^<a href=["'"'"']//' -e 's/["'"'"']$//'`
+    fi

     echo "Downloading from ${HBASE_DOWNLOAD_URL}"
