Comments (16)
It may be worth a footnote that this is common, but it does depend on each site's scheduling policies and is not universal.
At our facility, queue policies are set up to encourage and favor large jobs. While the smallest jobs often run quickly as backfill, there is a middle ground that can lose out to larger jobs, depending on a variety of factors.
from hpc-intro.
OK, I take that positive response as encouragement and will make an SVG (easier to modify, and version-control friendly). I think the restaurant idea is a closer fit, since we have tables with a fixed number of seats. When you mentioned the host/hostess, I guess you were thinking of arriving at the door and being shown in once a table is empty.
@bernhold I agree about the footnote, our site is the same.
I have given the following intro-to-scheduling talk; there are many potential diagrams in that presentation that could be simplified to illustrate these concepts. I can also adapt them into specific examples for addition to HPC Carpentry.
Scheduler: in our current training material we depict the scheduler as a "bouncer" managing the queue for a crowded club (slide 17 of https://www.uio.no/english/services/it/research/events/2018b/abel_intro_march2018.pdf). If this makes sense, I can create a diagram (we do not have a citation for the current one) licensed CC-BY.
I love it! I've definitely compared a scheduler to the host/hostess at a restaurant, which is the same idea.
@Sabryr yes, that's what I meant. I also really like that analogy because (at least on our systems), jobs that request fewer resources will start sooner, just like smaller parties get seated faster at a busy restaurant. ;)
Yes, site-specific configurations and SLURM configuration options for fair usage are important. When users know about these, they have a better understanding of, for example, "why I had to wait longer today". While I support the footnote idea, I suggest elaborating on this in an "optional section" or similar (though I do not want to complicate things at this stage).
@Sabryr and @ChristinaLK, I like the analogy of the host/head waiter/maître d' leading you to an appropriately sized table, once one becomes available.
@bernhold, I think the analogy holds: your facility would be like a restaurant with several very large tables, and few small ones. The medium-sized jobs just have to wait until a suitable table opens up, or until the maître d' can find a complementary group to add so that the composite fills a large table.
Cross-posted from #84
The metaphor seems to break down the further it stretches. In a restaurant, raw material is converted to finished results by the back-of-house staff, usually hidden in the kitchen: this is the parallel workforce. The front-of-house staff carry the results from the workers to the clients, more like an interconnect or intranet linking the HPC facility to the campus or Internet.
Perhaps a better analogy is a shared office space, where the workers are the professionals occupying each office. Reservations and access are managed through the front desk (workload manager). Different offices serve different purposes (architectures/accelerators): accounting jobs go to the accountant, legal to the lawyer, et cetera. A conference room (interconnect) permits efficient collaboration by temporary associations (communicators) of different professionals (nodes).
A linear workflow can be crafted...
- A client comes through the door, carrying with them their notes and reference materials (SSH into the login node).
- The client requests an appointment at the front desk (submit the job to the workload manager).
- The front desk staff reads through the client's portfolio to assess the workload requirements.
- If none have been specified, or the request exceeds this office's resources, then the job is rejected.
- If the job is manageable, the front desk determines the earliest available time and sets the appointment (returns the job ID).
- Appointments are best-estimates. The actual start time may drift relative to the reported "start time."
- If the task requires only one professional, then at the appointed time, the front desk hands off the client's portfolio and time limit and the worker gets to work.
- Once finished, or out of time, the front desk retrieves the updated portfolio.
- If the task requires more than one professional, then at the appointed time, the front desk calls the necessary workers into the conference room and delivers the portfolio and time limit. The pool of workers get to work, communicating as necessary, until the task is complete or the clock runs out.
- Once finished, or out of time, the front desk retrieves the updated portfolio.
- The client may ask at the front desk (check job status), or the front desk may contact the client when the job changes state (start, finish, error).
- Once the job is complete, the client may retrieve the portfolio from the front desk (SSH into or RSYNC from the login node).
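For concreteness, the appointment workflow above can be sketched as a toy Python model. To be clear, nothing here is a real scheduler API; every name is hypothetical, chosen only to mirror the front-desk steps:

```python
from itertools import count

# Toy model of the "front desk" workflow above; every name here is
# hypothetical, just an illustration of the states a job moves through.
_job_ids = count(1)

class Job:
    def __init__(self, script, nodes, time_limit):
        self.script = script          # the client's portfolio
        self.nodes = nodes            # how many professionals are needed
        self.time_limit = time_limit  # how long the appointment may run
        self.state = "PENDING"        # appointment made, waiting for a slot

def submit(job, max_nodes=4):
    """Front desk reads the portfolio; reject requests the office cannot serve."""
    if job.nodes > max_nodes:
        raise ValueError("request exceeds this office's resources")
    return next(_job_ids)             # the appointment slip (job ID)

job = Job("analysis.sh", nodes=2, time_limit="01:00:00")
job_id = submit(job)
print(job_id, job.state)  # 1 PENDING
```

A real workload manager would then move the job through `RUNNING` to `COMPLETED` (or `TIMEOUT`), which is the state clients poll when they "ask at the front desk".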
All that being said, explaining this extended metaphor in detail would be tantamount to describing the real HPC system in detail. I doubt this abstraction helps the learner: it would take a couple of walk-throughs in class to get the facts straight, and it doesn't help anyone actually understand and use an HPC resource. The time would be better spent, in my opinion, on describing increasingly complex computational frameworks:
- You launch a program on the computer at your desk.
- You ask a colleague to run the program on their beefier computer. They let you know when it finishes.
- You modify the program to use all available cores on your colleague's computer.
- ...
- You submit your job to the queuing system, and it runs in no time on the HPC resource.
@tkphd I still think it's useful to present a metaphor (maybe more than one!)
It sounds like, to be helpful, we should keep it fairly simple, to avoid pushing it to the point where it breaks down.
@ChristinaLK, sure, I don't disagree. My argument is that the restaurant metaphor is best suited for explaining the scheduler as the head waiter, only. It has the added benefit that most people are familiar with the concept of a restaurant, so an illustration is not strictly necessary.
Finding additional, better-suited metaphors for workers and resources would be great.
I'm shocked to see #84 closed, which means I've failed to communicate constructively. @Sabryr, please accept my sincere apology for turning discussion of your work to a discouraging or hostile direction. My goal was to encourage further discussion, and eventually to have an adjusted version of your illustration for reference. I hope that you will consider re-engaging, and re-opening your pull request. I will certainly take this exchange as an opportunity to revise my tone and try harder to foster collaboration on this developing curriculum.
I had a couple of fruitful discussions with @guyer and @reid-a about the restaurant metaphor. While it's not the best fit for describing an entire HPC ecosystem, @guyer in particular came up with some useful features of a workload/queue manager that could be discussed:
- Small parties can wait in line for a table with full service, or jump straight to the bar if they're in a hurry. This would be a quick intro to "fast" queues with decreased runtimes or constrained hardware.
- There will be different wait times for parties of 2-3 vs. 6-8, which is also true of queuing systems, if table size is an analog for the number of nodes.
- The restaurant owner decides how many tables there are of each size class. This parallels HPC partitions for small, medium, and large job sizes, with different numbers of nodes assigned to each.
- Overall, the task of the head waiter can be understood and used to outline the purpose and constraints of a queuing system: because there is a reasonable correlation between party size and dining time, reasonable estimates can be made, and dinner rarely takes more than 3 hours. In HPC, however, the queuing system must accommodate "diners" lasting anywhere from a few minutes to a few weeks, which makes it very important to estimate your job's runtime accurately.
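The backfill behavior these points describe (small parties jumping ahead while a large one waits) can be illustrated with a short, purely hypothetical Python sketch; `seat_order` is an invented helper, not any real scheduler's algorithm:

```python
def seat_order(queue, free_seats):
    """FIFO with backfill: seat the first waiting party that fits.
    Parties that don't fit keep waiting, but smaller parties behind
    them in the queue may jump ahead, like backfilled jobs."""
    seated, waiting = [], list(queue)
    progress = True
    while waiting and progress:
        progress = False
        for party, size in waiting:
            if size <= free_seats:
                free_seats -= size
                seated.append(party)
                waiting.remove((party, size))
                progress = True
                break
    return seated

# The big party of 6 must wait for a large table, but the two
# smaller parties backfill the 5 free seats in the meantime.
order = seat_order([("big", 6), ("pair", 2), ("trio", 3)], free_seats=5)
print(order)  # ['pair', 'trio']
```

A real scheduler adds the time dimension (a backfilled job must also finish before the big job's reserved start), which is exactly why accurate runtime estimates matter.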
Again, @Sabryr and @ChristinaLK, thanks for engaging in this discussion, and please accept my humble apology for derailing it. I was wrong.
@tkphd apology not required; the pull request was closed in order to submit a new one. The diff was too large to continue with.
That's a relief, @Sabryr, and I look forward to seeing the new PR.
I still stand by the apology, though, since I need to work on effectively communicating and dialing back dismissive comments. In particular, I fall into the common expertise trap of assuming things are obvious when they are, in fact, very much not.
I still don't see the need for an apology; thank you for the reviews. I will try to reopen the same pull request, to keep the discussions intact.