Comments (5)
I think this belongs either in the instructor notes (https://hpc-carpentry.github.io/hpc-intro/guide/) or in callout boxes in "why use a cluster" (https://hpc-carpentry.github.io/hpc-intro/00-hpc-intro/), depending on whether we want instructors or students to see it. We already have a short version in the "why use a cluster" session, so maybe there?
from hpc-intro.
At CarpentryCon, people came up with these 3 profiles:
X is a molecular biologist who will soon receive a large DNA sequence dataset.
Her adviser has told her she will need to use the university's HPC cluster to analyze it.
A colleague who was supposed to teach her how has left her a set of written instructions, but is on parental leave and thus cannot help her at the moment.
An email from the sequencing centre arrives with a set of download links for X to obtain the datasets.
X's adviser asks her to quickly investigate the quality of the data using appropriate software available on the cluster,
so that a decision can be made on whether to pay for the cost of data generation.
The workshop will teach X how to transfer the data to an appropriate place on the cluster, and how to access the cluster and schedule and run jobs.
Afterwards, X will understand enough of the written instructions to be able to start the correct analyses for her data.
(By Lex Nederbragt)
------
Y is an environmental biologist who uses DNA signatures obtained from soils to study species diversity in the environment.
She needs to compare DNA sequences to large databases. So far, she has been able to use web-based tools for her limited datasets.
Recently, Y has started working with much larger datasets, and discovered that the online tool she uses has a limit of 50 entries on the online server.
She has heard it should be possible to run the same tool through the command line, and managed to install it on her local laptop.
Now, however, it takes several days for each analysis to finish.
The workshop will teach Y to move her data to and from the university's computer cluster, and submit jobs using pre-installed software on the cluster.
Afterwards, Y will be able to analyze her own data with the pre-installed command-line version of the tool,
spreading the analysis over several dozen cores so it finishes in a few hours.
(By Lex Nederbragt)
-------
Z is a skilled bioinformatician running analysis pipelines on the computer cluster.
A new tool has been published and Z has decided she wants to see whether it could replace an existing tool.
She submits a request to the HPC IT support staff to have the new tool installed in the cluster's environment module system.
The response is that support is swamped with requests and for now has to prioritize issues with already installed tools.
Support suggests that Z installs the tool herself.
Z has never installed software before and quickly realizes the instructions given are way above her head.
Meanwhile, more and more researchers report on Twitter how well the new tool works relative to the one she uses now.
This workshop will enable Z to make a local installation of the new tool and test it out in her pipeline.
(By Lex Nederbragt)
--------
A new PhD student is given the task of selecting parameters for their simulation. They need to run a set of calculations on several thousand
combinations of parameters. One calculation takes several minutes. They set up the problem on their laptop but quickly realise
that it would take more than one month to complete the task.
They are told to use the local HPC cluster, but they are not sure how this would help them.
Should I make them individual issues so we can discuss them and bring them into the material?
from hpc-intro.
My feeling is that learners should be exposed to this without major obstacles. My hope is that this helps them identify better with the main character, and so with the material.
from hpc-intro.
The difference between what's currently (Jan. 2021) in the lesson and what's suggested here is mainly the stress on the fact that the users are HPC cluster novices. This helps learners "see themselves", but it isn't really the answer to "Why use a cluster?"; it's closer to "What will someone like me get out of this lesson?". Retitle the lesson? Or the subsection with the profiles?
from hpc-intro.
As @reid-a points out, the current lesson includes three user profiles based on @psteinb's proposed blurbs. If more refinement would be useful, a more specific issue should be filed.
from hpc-intro.