Zihao Guo 931278
Yan Yan 1320588
The assignment is to search the large Twitter data set (bigTwitter.json) and, using the language of each tweet and its location (lat/long), count the number of tweets made in each language within each grid cell. The final result is a score for each cell in a given format. The application should allow a given number of nodes and cores to be utilised. Specifically, the application should be run once against bigTwitter.json on each of the following resources:
• 1 node and 1 core;
• 1 node and 8 cores;
• 2 nodes and 8 cores (with 4 cores per node).
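The per-cell, per-language counting described above can be sketched in plain Python. This is a minimal sketch only: the cell bounding-box layout, the tweet field names, and the `assign_cell` helper are assumptions, since the actual sydGrid.json schema and tweet format are not reproduced here.

```python
from collections import defaultdict, Counter

def assign_cell(lon, lat, cells):
    """Return the id of the grid cell containing (lon, lat), or None.

    `cells` maps a cell id to a bounding box (xmin, xmax, ymin, ymax);
    this layout is an assumption about how sydGrid.json might be parsed.
    """
    for cell_id, (xmin, xmax, ymin, ymax) in cells.items():
        if xmin <= lon <= xmax and ymin <= lat <= ymax:
            return cell_id
    return None

def count_languages(tweets, cells):
    """Count tweets per language within each grid cell."""
    counts = defaultdict(Counter)
    for tweet in tweets:
        lon, lat = tweet["coordinates"]   # assumed field layout
        cell_id = assign_cell(lon, lat, cells)
        if cell_id is not None:
            counts[cell_id][tweet["lang"]] += 1
    return counts

# Toy usage: a 1x2 grid and three fake tweets
cells = {"A1": (0, 1, 0, 1), "A2": (1, 2, 0, 1)}
tweets = [
    {"coordinates": (0.5, 0.5), "lang": "en"},
    {"coordinates": (0.6, 0.2), "lang": "zh"},
    {"coordinates": (1.5, 0.5), "lang": "en"},
]
result = count_languages(tweets, cells)
print(dict(result["A1"]))  # per-language counts for cell A1
```

In the real application each MPI process would build such counts over its share of the tweets, and the partial counters would then be merged on one rank before printing the per-cell scores.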
Files in this submission:
• multi.py -- class definitions used by the main script
• newmain.py -- main script that loads the JSON file and processes the data
• 1Node1Cores.slurm, 1nodes8cores.slurm, 2Node8Cores.slurm -- Slurm scripts for the three resource configurations
• 1Node1Cores.out, 1nodes8cores.out, 2Node8Cores.out -- corresponding output files
• Assignment 1 report.pdf
To run the program on Spartan, three Slurm files are provided, one for each resource configuration: 1 node 1 core, 1 node 8 cores, and 2 nodes 8 cores. Submit them with:
sbatch 1Node1Cores.slurm
sbatch 1Node8Cores.slurm
sbatch 2Node8Cores.slurm
Each command submits a job to the Slurm queue on Spartan and prints a job ID. To track progress, run squeue -u <username>. Once the job no longer appears in the queue it has finished, and a file named slurm-<jobID>.out will appear in the directory. Run more slurm-<jobID>.out to check the performance of each job.
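For reference, a Slurm script for the 2-node, 8-core run could look roughly like the following. This is a sketch only: the partition name, time limit, module names, and batch size are assumptions, and the scripts actually submitted may differ.

```shell
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks=8
#SBATCH --ntasks-per-node=4    # 4 cores per node, as in the 2n8c configuration
#SBATCH --time=01:00:00        # assumed time limit
#SBATCH --partition=physical   # assumed partition name

# Module names are assumptions; check `module avail` on Spartan
module load python/3.8.6
module load mpi4py

time mpiexec -n 8 python3 newmain.py sydGrid.json bigTwitter.json 1000
```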
Alternatively, run the program on a local computer with:
mpiexec -n <num_cores> python3 newmain.py sydGrid.json <anyTwitter.json> <batch_size>
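When run under MPI, each process should read only its share of the input file. One common approach, sketched here in plain Python with integers standing in for mpi4py's rank and size (an assumption about how newmain.py divides the work), is round-robin line assignment:

```python
from collections import Counter

def lines_for_rank(lines, rank, size):
    """Round-robin split: rank r processes lines r, r+size, r+2*size, ..."""
    return [line for i, line in enumerate(lines) if i % size == rank]

# Simulate 4 MPI processes over 10 "tweet" lines
lines = [f"tweet-{i}" for i in range(10)]
size = 4
shards = [lines_for_rank(lines, rank, size) for rank in range(size)]

# Every line is processed by exactly one rank
merged = Counter(line for shard in shards for line in shard)
print(shards[0])  # rank 0 gets lines 0, 4, 8
```

Under real MPI each process would obtain `rank` and `size` from the communicator and read the file in batches (the `<batch_size>` argument above) instead of holding all lines in memory.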