swimprojectucb / swim
Statistical Workload Injector for MapReduce - Project at UC Berkeley AMP Lab
Home Page: https://github.com/SWIMProjectUCB/SWIM/wiki
Excuse me, I would like to know where the Facebook history logs are. Where is FB-2009_LogRepository? Thank you for your help!
Hello everyone,
I used SWIM to test my MapReduce cluster, but when the execution finished and I tried to use "parse-hadoop-jobhistory.pl" to analyse the job logs, I found that it doesn't work.
I tried it on CentOS 6 and Ubuntu 12.04, and it doesn't work on either.
I followed the guide "Step 1. Parse historical Hadoop logs".
I hope you can check it again.
Thanks
ZHANG Bo
Hi,
While compiling WorkGen.java, I encountered two "cannot find symbol" compilation errors at lines 267 and 268. I think these lines need to be changed
From:
System.out.println("shuffleInputRatio = " + Double.parseDouble(job.getRaw("workGen.ratios.shuffleInputRatio")));
System.out.println("outputShuffleRatio = " + Double.parseDouble(job.getRaw("workGen.ratios.outputShuffleRatio")));
To:
System.out.println("shuffleInputRatio = " + Double.parseDouble(jobConf.getRaw("workGen.ratios.shuffleInputRatio")));
System.out.println("outputShuffleRatio = " + Double.parseDouble(jobConf.getRaw("workGen.ratios.outputShuffleRatio")));
Regards,
Saurav
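The fix above works because the configuration object in scope at that point in WorkGen is named jobConf, not job, and the ratio values are read as raw strings and then parsed. A minimal stand-in sketch of that read-and-parse step, using java.util.Properties in place of Hadoop's JobConf so it runs without Hadoop on the classpath (Properties.getProperty stands in for JobConf.getRaw; the 0.5 values are placeholders, not SWIM defaults):

```java
import java.util.Properties;

public class WorkGenRatios {
    public static void main(String[] args) {
        // Stand-in for the JobConf object named jobConf in WorkGen.java.
        Properties jobConf = new Properties();
        // In SWIM these values come from the workload configuration file;
        // the numbers here are placeholders for illustration only.
        jobConf.setProperty("workGen.ratios.shuffleInputRatio", "0.5");
        jobConf.setProperty("workGen.ratios.outputShuffleRatio", "0.5");

        // Read the raw string and parse it, as the corrected lines do.
        double shuffleInputRatio =
            Double.parseDouble(jobConf.getProperty("workGen.ratios.shuffleInputRatio"));
        double outputShuffleRatio =
            Double.parseDouble(jobConf.getProperty("workGen.ratios.outputShuffleRatio"));

        System.out.println("shuffleInputRatio = " + shuffleInputRatio);
        System.out.println("outputShuffleRatio = " + outputShuffleRatio);
    }
}
```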
FB-2009 comes from historical Hadoop traces on a 600-machine cluster at Facebook. The original trace spans 6 months from May 2009 to October 2009, and contains roughly 1 million jobs.
FB-2010 comes from historical Hadoop traces on the same cluster at Facebook, now grown to 3000 machines. The original trace spans 1.5 months from October 2010 to November 2010, and also contains roughly 1 million jobs.
I am wondering how can I access this data? ... Thanks!
I am following the instructions at https://github.com/SWIMProjectUCB/SWIM/wiki/Performance-measurement-by-executing-synthetic-or-historical-workloads.
I was able to successfully complete steps 1 through 4 for the synthetic workload provided as part of SWIM, specifically FB-2009_samples_24_times_1hr_0_first50jobs.tsv.
At step 5, when I run hadoop jar HDFSWrite.jar, it gives an error:
14/06/16 15:43:00 WARN mapred.JobClient: No job jar file set. User classes may not be
After this the map tasks start to fail and the job fails. As a result, the input data is not created in HDFS and I am unable to move on to the next step.
Can someone help me figure out what I am missing?
Thanks,
Madhura
More details from the console:
[hdfs@hadoop1 workloadSuite]$ hadoop jar HDFSWrite.jar org.apache.hadoop.examples.HDFSWrite -conf conf/randomwriter_conf.xsl workGenInput
client.getClusterStatus().getMaxMapTasks() gives 72
client.getClusterStatus().getMaxReduceTasks() gives 36
Running on 6 nodes with 60 maps,
writing 64424509440 bytes with 1073741824 bytes per map.
Job started: Mon Jun 16 15:43:00 PDT 2014
14/06/16 15:43:00 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.examples.HDFSWrite$RandomInputFormat not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1806)
at org.apache.hadoop.mapred.JobConf.getInputFormat(JobConf.java:620)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:394)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.examples.HDFSWrite$RandomInputFormat not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1774)
at org.apache.hadoop.conf.Configuration.getClass(Co
14/06/16 15:22:33 INFO mapred.JobClient: Task Id : attempt_201406161518_0001_m_000046_1, Status : FAILED
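The "No job jar file set" warning followed by ClassNotFoundException on the task nodes usually means the submitted JobConf was never told which jar contains the user classes (typically fixed with JobConf's setJarByClass or setJar), or that the jar passed to hadoop jar does not actually contain org.apache.hadoop.examples.HDFSWrite. As a generic diagnostic (not SWIM code; the class name FindJar is hypothetical), the sketch below locates the jar or directory a class was loaded from, which is essentially what setJarByClass does internally:

```java
// Prints where a class was loaded from. If this reports a directory instead
// of a .jar path (or throws ClassNotFoundException), Hadoop cannot ship the
// class to the task nodes, which matches the warning and stack trace above.
public class FindJar {
    public static void main(String[] args) throws Exception {
        // Default to a JDK class so the sketch runs anywhere.
        String className = args.length > 0 ? args[0] : "java.lang.String";
        Class<?> cls = Class.forName(className);
        java.security.CodeSource src = cls.getProtectionDomain().getCodeSource();
        System.out.println(className + " loaded from: "
            + (src == null ? "bootstrap classpath (no jar)" : src.getLocation()));
    }
}
```

Running it as `java FindJar org.apache.hadoop.examples.HDFSWrite` with HDFSWrite.jar on the classpath would show whether the class is really packaged where you expect.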
I am a new user of Hadoop and trace compilation. I completed Step 2 of the guide at "https://github.com/SWIMProjectUCB/SWIM/wiki/Performance-measurement-by-executing-synthetic-or-historical-workloads". Now I don't know where I can find the files HDFSWrite.java and WorkGen.java to complete the other steps.
Regards
Zubair