
Comments (18)

symulation commented on July 26, 2024

Possibly related to #8 ?

from hpc-intro.

psteinb commented on July 26, 2024

I like the idea, though I'd be careful in doing so. For me, (GNU) parallel is a way to replace a shell for-loop. From that perspective, it would be a nice addition to hpc-shell (and a way to differentiate it from swc-shell, if that is needed).

However, it's debatable which HPC user behavior teaching parallel would foster. From an admin/dev perspective, it could encourage writing even more difficult shell scripts, compared to doing the same thing in a programming language (Python multi-threading/-processing or any other shared-memory parallelisation technique). In this regard, I'd prefer to give people a more thorough approach to applying parallelisation (profiling, hot spot search, speed-up estimation).

I know there is interest in this topic from some people (mostly from a cloud perspective AFAIK). So my vote would be on potentially adding it as extra material.
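As a rough illustration of the "replace a for-loop" idea in a programming language rather than in the shell, here is a minimal Python sketch using the standard library's `concurrent.futures`. The `count_words` task and the sample inputs are hypothetical stand-ins, not anything taken from the lessons.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_words(text):
    """Hypothetical stand-in for an expensive, independent per-item task."""
    return len(text.split())


def run_serial(items):
    # The plain for-loop: items are processed one after another.
    results = []
    for item in items:
        results.append(count_words(item))
    return results


def run_parallel(items, executor_cls=ProcessPoolExecutor, workers=2):
    # The same work handed to a pool of workers.
    # Executor.map yields results in the order of the inputs.
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(count_words, items))
```

Both functions return the same list for the same inputs; only the dispatch differs. When using the default process pool from a script, the call to `run_parallel` should sit under an `if __name__ == "__main__":` guard. Passing `executor_cls=ThreadPoolExecutor` switches to threads, which in CPython mainly helps I/O-bound tasks.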

symulation commented on July 26, 2024

Yes, having parallel as an extra, a callout or side note of sorts, is what would make the most sense.

I agree that there is a danger in walking too far down the path toward difficult shell scripts. However, it does have the potential to provide an introduction to thinking about processing in parallel on a single line and that's the core reason that I find it attractive. Anything else and we'd typically be in a mess this early in the arc of a workshop.

Regardless, as I think we also agree, we'd need the right use case.

Likely something to just mull over for the time being and return to once we've got more essential changes looked after.

pdoehle commented on July 26, 2024

Here is a parallel example we put together using the Nelle Nemo story line for a group of undergraduate students. Maybe something in there will provide some helpful ideas. As @psteinb mentioned, we used it as a replacement for a for-loop.

tkphd commented on July 26, 2024

TBH, I like @pdoehle's parallel Nelle Nemo exercise more than our current "build fastqc" demo. This is nice work that meshes well with (due to building upon) another Carpentries lesson, which helps to reinforce and gently expand on previously covered knowledge.

psteinb commented on July 26, 2024

Yep, I agree. I think, though, that we should have a larger discussion about where such parallel paradigms/tools should go. I like the Nelle Nemo example too, to be frank.

To illustrate my point: one could argue that a similar example could very well tie into the snakemake intro given in hpc-python, see https://github.com/hpc-carpentry/hpc-python/blob/gh-pages/_episodes/11-snakemake-intro.md. So I'd encourage a more conceptual discussion about where an introduction to any parallel paradigm can and should go within hpc-carpentry.

colinsauze commented on July 26, 2024

I have an example of GNU parallel (https://github.com/SupercomputingWales/SCW-tutorial/blob/gh-pages/_episodes/07-optimising-for-parallel-processing.md) that I use, based on Nelle's pipeline from the Software Carpentry Unix Shell Novice. We expanded that example to have 6000 files to process instead of the original 17. There's also a section on a more complex multi-argument example. I'm happy to integrate this into HPC Carpentry if there's interest in reusing this material.

psteinb commented on July 26, 2024

bkmgit commented on July 26, 2024

There is a lesson for this already developed:
https://deapsecure.gitlab.io/deapsecure-lesson01-hpc/

bkmgit commented on July 26, 2024

It may be helpful to move the MPI section from HPC-intro to another 4-hour block such as HPC-novice or HPC-programming as discussed here. GNU parallel, or some other embarrassingly parallel task or throughput application, may be good to have in the intro, as it would build on existing knowledge more smoothly.

tkphd commented on July 26, 2024

GNU parallel is a useful tool, but IMO, it reflects a high throughput computing workflow to a much greater extent than a high performance computing paradigm, discretizing at the task rather than the data structure. This should be mentioned and perhaps covered "somewhere," but not here, at least for now, while we focus on MPI using C, Python, etc.
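To make the task-versus-data distinction concrete, here is a small Python sketch. It is only an analogy: threads stand in for the workers, and the function names and inputs are hypothetical. The first function treats each dataset as an independent task, which is the GNU parallel style. The second splits a single dataset into chunks and has to combine the partial results afterwards, which is closer in spirit to what an MPI reduction does.

```python
from concurrent.futures import ThreadPoolExecutor


def total_per_dataset(datasets):
    # Task-level split: every dataset is its own independent task,
    # and no result depends on any other.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(sum, datasets))


def total_chunked(data, n_chunks=4):
    # Data-level split: ONE dataset is cut into chunks, each worker
    # sums its chunk, and the partial sums are combined at the end.
    size = max(1, -(-len(data) // n_chunks))  # ceiling division, at least 1
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(sum, chunks))
    return sum(partials)
```

The final `sum(partials)` is the step that has no counterpart in the task-level version: the workers' results must be brought together before there is an answer.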

bkmgit commented on July 26, 2024

MPI is very rushed in the last section of the current 4.5-hour hpc-intro. HPC in a day is at least 6.5 hours of material. Most Carpentry workshops are about 16 hours. Moving MPI and other parallel programming models to a separate section of 3 to 4 hours would give a better learning experience. An embarrassingly parallel job script is probably a reasonable ending point for HPC-intro. Should one assume sw-shell as a prerequisite?

bkmgit commented on July 26, 2024

As explained by Dursi here, simply focusing on MPI will do a disservice to the many ways in which HPC clusters are used. Introducing a variety of programming models in a more structured way would be beneficial.

tkphd commented on July 26, 2024

A good and valid point, @bkmgit, thank you. I'm coming at HPC from the realm of PDE solvers, where MPI, OpenMP, and CUDA rule, but the umbrella is much broader than my experience.

GNU parallel is essentially a tool for dispatching jobs on the local resource, which is exactly the role of a queuing system on the cluster. Since we spend a bunch of time introducing queuing systems, and not much time at all using them, launching a bunch of jobs from a reconfigurable script, or by creating a job array, would be a great way to demonstrate the core tool and conclude the lesson.
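A minimal sketch of the job-array idea, assuming the scheduler is Slurm (which sets the `SLURM_ARRAY_TASK_ID` environment variable for each task of an array job). The `pick_input` helper and the file names are hypothetical; each array task would run the same script and use its index to select one input.

```python
import os


def pick_input(files, env=None):
    """Return the input file that this array task should process.

    Assumes Slurm, which exports SLURM_ARRAY_TASK_ID to each array task.
    The file list is sorted so that every task sees the same ordering.
    """
    env = os.environ if env is None else env
    task_id = int(env["SLURM_ARRAY_TASK_ID"])
    return sorted(files)[task_id]
```

With three input files, submitting the job script with `sbatch --array=0-2` would start three tasks with indices 0, 1, and 2, and each would pick a different file. The `env` parameter exists only so the function can be tried outside the scheduler by passing a plain dictionary.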

tkphd commented on July 26, 2024

Comments on this issue since August all share a theme: it's a great idea, but hpc-intro is not the right lesson in which to incorporate GNU parallel. I recommend closing this issue.

bkmgit commented on July 26, 2024

I am OK with closing this issue and creating a new one for a reconfigurable script example or job array for https://carpentries-incubator.github.io/hpc-intro/16-parallel/index.html

bkmgit commented on July 26, 2024

The assumption is that a typical introduction will have two modules: hpc-intro and hpc-novice/hpc-parallel.

tkphd commented on July 26, 2024

Completely agreed! This issue is superseded by #244.
