Create a Hadoop MapReduce application that finds the maximum temperature for each day of the years 1901 and 1902. The application should read its input from HDFS and write its output back to HDFS. When the job completes, merge all the result files into a single file and store it on the local file system.
public static class MaxTempMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {

    // Sentinel value used in NCDC records when no reading is available.
    private static final int MISSING = 9999;

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        // Columns 15-22 hold the observation date (YYYYMMDD).
        String date = line.substring(15, 23);
        // The air temperature (in tenths of a degree Celsius) starts at
        // column 87 with an explicit sign; Integer.parseInt accepts a
        // leading '-' but not a leading '+', so the '+' must be skipped.
        int temp;
        if (line.charAt(87) == '+') {
            temp = Integer.parseInt(line.substring(88, 92));
        } else {
            temp = Integer.parseInt(line.substring(87, 92));
        }
        // Column 92 is the quality code; emit only valid, non-missing readings.
        String quality = line.substring(92, 93);
        if (temp != MISSING && quality.matches("[01459]")) {
            context.write(new Text(date), new IntWritable(temp));
        }
    }
}
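The fixed-width parsing in the mapper can be exercised outside Hadoop with plain Java. The record below is a synthetic line (every value is made up for illustration) laid out at the same column offsets the mapper reads:

```java
import java.util.Arrays;

public class NcdcParseDemo {
    private static final int MISSING = 9999;

    public static void main(String[] args) {
        // Build a synthetic 93-character record at the mapper's offsets.
        char[] rec = new char[93];
        Arrays.fill(rec, '0');
        "19010515".getChars(0, 8, rec, 15);  // date at columns 15-22
        rec[87] = '+';                       // sign of the temperature
        "0123".getChars(0, 4, rec, 88);      // 123 tenths = 12.3 degrees C
        rec[92] = '1';                       // quality code
        String line = new String(rec);

        // Same logic as the mapper.
        String date = line.substring(15, 23);
        int temp = line.charAt(87) == '+'
                ? Integer.parseInt(line.substring(88, 92))
                : Integer.parseInt(line.substring(87, 92));
        String quality = line.substring(92, 93);

        if (temp != MISSING && quality.matches("[01459]")) {
            System.out.println(date + "\t" + temp);  // prints: 19010515	123
        }
    }
}
```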
2. Copy of your reducer.py code (or equivalent in another programming language) (25% of total grade)
public static class MaxTempReducer
        extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Take the maximum over all temperatures recorded for this date.
        int maxValue = Integer.MIN_VALUE;
        for (IntWritable value : values) {
            maxValue = Math.max(maxValue, value.get());
        }
        context.write(key, new IntWritable(maxValue));
    }
}
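The mapper and reducer above still need a driver to wire them into a job and to point the job at the HDFS input and output paths. A minimal sketch, assuming the two classes above are nested in a `MaxTempDriver` class (the class name and argument order are assumptions, not from the assignment):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MaxTempDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "daily max temperature");
        job.setJarByClass(MaxTempDriver.class);
        job.setMapperClass(MaxTempMapper.class);
        // Max is associative, so the reducer is also safe to use as a combiner.
        job.setCombinerClass(MaxTempReducer.class);
        job.setReducerClass(MaxTempReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input dir
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output dir
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

After the job finishes, the per-reducer part files can be merged into one local file with the standard HDFS shell command `hdfs dfs -getmerge <output-dir> <local-file>` (the two paths are placeholders).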