spack-c2sm's Issues

Should a dsl dycore be built separately or as part of cosmo/icon

The point of having a dsl dycore be part of the model repo (cosmo/icon) is so that both components will always use the same commit id when they are being built.

However, spack will always do a separate git clone when installing packages. So having separate packages for the dsl dycore and model components defeats the purpose of having them in the same git repo, since spack might use completely different versions.

Instead, it might be a better idea to build the dsl dycore as part of the model's package by default. This way spack will git clone only once for both, ensuring that the same commit ID is used for both components.

Additionally, the devbuildcosmo command shouldn't be necessary anymore and the dev-build command should automatically do what we want instead. (I believe dev-build will still have issues, since it doesn't overwrite by default, but this can be seen as a separate issue)

If needed, we could still add a +external_dycore variant which allows a user to link the model (cosmo/icon) with an already built dsl dycore. However, this would be turned off by default, so that building with different dsl dycore/model versions is opt-in.
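
A minimal sketch of how this could look in the model's package; the variant name external_dycore comes from this issue, but the URL, the in-tree dycore/ directory and the build logic are illustrative assumptions, not the actual recipe:

from spack import *


class Cosmo(MakefilePackage):
    """Sketch: build the dsl dycore in-tree by default, so both components
    come from the same git clone and therefore the same commit."""

    git = 'https://github.com/COSMO-ORG/cosmo.git'  # illustrative URL
    version('master', branch='master')

    # Opt-in linking against a separately built dycore; off by default.
    variant('external_dycore', default=False,
            description='Link against an already installed cosmo-dycore')
    depends_on('cosmo-dycore', when='+external_dycore')

    def build(self, spec, prefix):
        if spec.satisfies('~external_dycore'):
            # Build the dycore from this very clone, so the commit IDs of
            # model and dycore cannot diverge.
            with working_dir('dycore'):
                make()
        make()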

Create issue labels

To help with the organization of the spack repo, we should create some useful labels for issues.

Improve dependencies selection

  • How to set the versions of our dependencies, e.g. eccodes. We want a default version, which is the known tested version. This may differ between int2lm and cosmo. It does not mean that other versions are incompatible, just that they are untested (see the sketch after this list).
  • Where to specify dependencies: proposition -> in the repo, see Carlos' branch and PR.
  • The issue of cosmo-dycore (which should always be the same version as cosmo) may be addressed differently than other dependencies such as eccodes and claw.
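
For the first point, one way to express a known-tested default without forbidding other versions is a version preference in the dependency's own recipe; a minimal sketch, with illustrative version numbers and placeholder checksums. Note that such a preference is global, so per-depender defaults (different for int2lm and cosmo) are exactly what plain Spack does not offer here:

from spack import *


class Eccodes(CMakePackage):
    """Sketch: prefer a known tested version without pinning it."""

    # preferred=True makes the concretizer pick this version by default,
    # while any other listed version stays installable on request.
    version('2.19.1', sha256='<checksum>', preferred=True)  # placeholder checksum
    version('2.18.0', sha256='<checksum>')                  # placeholder checksum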

Fix cosmo plan with spack

The plan fails if there is no instance, for example after spack_monthly erased it. We need to add a check or ignore the error.

COSMO tag missing in release output

The cosmo tag information is missing from stdout when installing with spack.
This seems to only be an issue when using "spack install"; the version is correctly set when using "spack devbuild".

spack installcosmo --test=root rebuilds cosmo

When executing:

spack installcosmo cosmo%pgi
spack installcosmo --test=root cosmo%pgi

the second command will rebuild cosmo again.
Instead, we would like to use the already installed version and only run the tests.

Upgrade spack version

As of this writing we are using spack version 0.15.4, whereas the newest spack version is 0.16.1. There are some interesting features in spack's newest version that we could benefit from.

In this issue we should gather blockers for the upgrade. I believe there was an issue with access permissions for an upstream folder?

Explicit import in spack packages

Currently we always import everything from Spack:
from spack import *

This makes it hard to debug or to find documentation about a specific function that is used in the packages.
@leclairm uses explicit imports for OASIS:

from spack.build_systems.makefile import MakefilePackage
from spack.directives import depends_on, version
from llnl.util.filesystem import working_dir, FileFilter, install_tree

These explicit imports are much clearer about which module/function is used in the package later on.
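
For illustration, a minimal package written with explicit imports only; this is a hypothetical sketch, not the actual OASIS recipe:

from spack.build_systems.makefile import MakefilePackage
from spack.directives import depends_on, version
from llnl.util.filesystem import FileFilter


class Oasis(MakefilePackage):
    """Sketch: every helper used below is traceable to one import."""

    git = 'https://example.com/oasis.git'  # illustrative URL
    version('master', branch='master')
    depends_on('mpi')

    def edit(self, spec, prefix):
        # FileFilter unambiguously comes from llnl.util.filesystem
        makefile = FileFilter('Makefile')
        makefile.filter(r'^CC\s*=.*', 'CC = ' + spec['mpi'].mpicc)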

Use spack's yaml library

Currently we require a user of config.py to provide a yaml library. However, spack already ships with ruamel, so we could just use that instead. This means fewer dependencies to provide for a user of config.py.
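
A minimal sketch of what config.py could do instead, assuming it runs with spack's libraries on sys.path (the exact import path of the vendored yaml support may differ between spack versions):

# spack.util.spack_yaml wraps the vendored ruamel, so no user-provided
# yaml library is needed.
import spack.util.spack_yaml as syaml

with open('config.yaml') as f:
    data = syaml.load(f)
print(data)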

spack installcosmo --only dependencies returns 0 on failure

spack installcosmo cosmo
The command properly returns an error when it fails, but when building only the dependencies, it does not return a failure.

To reproduce it:

spack installcosmo -j 20 --only dependencies [email protected]%[email protected] +set_version  real_type=float cosmo_target=gpu +production +claw +verbose +pollen +eccodes

if any of the packages fails, the command still returns 0.
We should make it return an error.

Figure out recipe to pre-build (big) dependencies

Example:

dawn depends on llvm. Additionally, dawn depends on [email protected], which means that spack will install llvm ^[email protected] for dawn.
Because building llvm takes a long time, it's desirable to pre-build it in the upstream instance. However, spack will concretize llvm to llvm ^[email protected], which it can't use for dawn, since spack thinks that dawn requires llvm ^[email protected].

This problem also occurs with cosmo-dycore and cosmo, where spack will build cosmo-dycore with a different MPI provider/version than cosmo.

We should try to figure out a recipe for how to deal with this issue in general.
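
A small diagnostic sketch (run via spack python) that makes the mismatch visible; the package names are the ones from the example above, everything else is assumed:

from spack.spec import Spec

# Concretize llvm on its own and as dawn's dependency; the two python
# nodes differ, which is why the pre-built llvm is not reused for dawn.
standalone_llvm = Spec('llvm').concretized()
llvm_for_dawn = Spec('dawn').concretized()['llvm']

print(standalone_llvm['python'].version)  # the default python
print(llvm_for_dawn['python'].version)    # the python that dawn pins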

Avoid rebuilding dependencies

This is in particular an issue on daint; for example, serialbox is rebuilt with icon although it is in the upstream with the same spec.

Default gcc version incompatible with nvcc on Daint

On Daint, the default gcc version selected is [email protected], which is incompatible with nvcc 10.2 from cudatoolkit/10.2.89_3.28-7.0.2.1_2.17__g52c0314.
For example, spack install icondusk-e2e or spack install icondusk-e2e%gcc fail on Daint with this error:

error: #error -- unsupported GNU version! gcc versions later than 8 are not supported!
      138 | #error -- unsupported GNU version! gcc versions later than 8 are not supported!
          |  ^~~~~

I think this is the relevant configuration:
https://github.com/MeteoSwiss-APN/spack-mch/blob/ca5beaf2c3455d62918e38da24d0b5975d739de0/sysconfigs/daint/packages.yaml#L7
Maybe changing the order of that compiler list could fix it?

Discuss whether we should contribute to Spack

Sometimes we need to customize Spack to fit our needs. Two examples:

  • Expressing dependencies in the repository of the depender package,
  • Expressing some dependencies in a non-strict way (don't care about which version of cmake, python, etc. is used to build a dependency).

I propose to consider contributing to Spack, i.e. having a fork and opening pull requests on the main repo. In this way we will also get feedback from the Spack developers about our proposed changes.

multiple modules for compiler

Some modules are hidden behind other modules, i.e. they can only be loaded when some other module is preloaded.
For example, it makes sense that gcc/8.3.0 can only be loaded if PrgEnv-gnu is already loaded.
PrgEnv-gnu is a metamodule that just adds the path of all the gnu modules to MODULEPATHS.

One could argue that we could directly add the full path of gcc/8.3.0, like
/apps/arolla/UES/jenkins/RH7.7/MCH-PE20.08/generic/easybuild/modules/all/gcc/8.3.0, which would not depend on loading PrgEnv-gnu first.

However, that does not work in general. gcc/8.3.0 will try to load other modules as well:

conflict gcc
module load gcccore/.8.3.0
module load binutils/.2.32-gcccore-8.3.0

which are only visible again if PrgEnv-gnu is preloaded.

If I add the two modules to compilers.yaml in the right order:

- compiler:
    environment: {}
    extra_rpaths: []
    flags: {}
    modules:
    - PrgEnv-gnu
    - gcc/8.3.0

I get the stack trace posted at the end.
The problem is here:
https://github.com/spack/spack/blob/764cafc1ceeda30724c368edef3fe5a1d82275c1/lib/spack/spack/build_environment.py#L645
out of the list of modules, it assumes that the second one in the list (modules[1]), here gcc/8.3.0, is the module associated with the compiler.
Whatever that means 😒, it tries to get the rpath from modules[1], which does not happen if there is only one module in the list.
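
Paraphrased, the linked code (v0.15.4) does roughly the following; this is a simplified sketch of spack's logic, not a verbatim copy:

# In spack.build_environment.get_rpaths (simplified): with more than one
# compiler module, spack takes modules[1] as "the" compiler module and
# derives an rpath from it; path_from_modules can return None, and that
# None later reaches os.path.normpath, giving the TypeError below.
if pkg.compiler.modules and len(pkg.compiler.modules) > 1:
    rpaths.append(path_from_modules([pkg.compiler.modules[1]]))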

What follows is that spack tries to find the root path of the module,
https://github.com/spack/spack/blob/764cafc1ceeda30724c368edef3fe5a1d82275c1/lib/spack/spack/util/module_cmd.py#L152
by searching for variables defined like _DIR, _ROOT, MANPATH, PATH, ...

https://github.com/spack/spack/blob/764cafc1ceeda30724c368edef3fe5a1d82275c1/lib/spack/spack/util/module_cmd.py#L221

And what happens is that the tsa module for gcc/8.3.0 does not define any of those variables, so an exception is raised.

Why does this pattern of using multiple modules work on daint? I guess that, contrary to tsa, some of these variables are found in modules[1] (i.e. the second one), so that the assumed rpath is found.

However, even if it does not crash in that case, I have my doubts that what we expected, i.e. that one can specify a list of modules which will all be loaded, is what spack actually does.

All this is for version v0.15.4.

The situation might be solved (or not) with the latest release:
spack/spack#3817

File "/scratch-shared/meteoswiss/scratch/cosuna/cosmo/cosmo/ACC/jenkins/spack-mch/spack/spack/lib/spack/spack/build_environment.py", line 845, in child_process
    setup_package(pkg, dirty=dirty)
  File "/scratch-shared/meteoswiss/scratch/cosuna/cosmo/cosmo/ACC/jenkins/spack-mch/spack/spack/lib/spack/spack/build_environment.py", line 733, in setup_package
    set_module_variables_for_package(pkg)
  File "/scratch-shared/meteoswiss/scratch/cosuna/cosmo/cosmo/ACC/jenkins/spack-mch/spack/spack/lib/spack/spack/build_environment.py", line 534, in set_module_variables_for_package
    _set_variables_for_single_module(pkg, mod)
  File "/scratch-shared/meteoswiss/scratch/cosuna/cosmo/cosmo/ACC/jenkins/spack-mch/spack/spack/lib/spack/spack/build_environment.py", line 477, in _set_variables_for_single_module
    m.std_cmake_args = spack.build_systems.cmake.CMakePackage._std_args(pkg)
  File "/scratch-shared/meteoswiss/scratch/cosuna/cosmo/cosmo/ACC/jenkins/spack-mch/spack/spack/lib/spack/spack/build_systems/cmake.py", line 170, in _std_args
    spack.build_environment.get_rpaths(pkg)),
  File "/scratch-shared/meteoswiss/scratch/cosuna/cosmo/cosmo/ACC/jenkins/spack-mch/spack/spack/lib/spack/spack/build_environment.py", line 646, in get_rpaths
    return list(dedupe(filter_system_paths(rpaths)))
  File "/scratch-shared/meteoswiss/scratch/cosuna/cosmo/cosmo/ACC/jenkins/spack-mch/spack/spack/lib/spack/spack/util/environment.py", line 61, in filter_system_paths
    return [p for p in paths if not is_system_path(p)]
  File "/scratch-shared/meteoswiss/scratch/cosuna/cosmo/cosmo/ACC/jenkins/spack-mch/spack/spack/lib/spack/spack/util/environment.py", line 61, in <listcomp>
    return [p for p in paths if not is_system_path(p)]
  File "/scratch-shared/meteoswiss/scratch/cosuna/cosmo/cosmo/ACC/jenkins/spack-mch/spack/spack/lib/spack/spack/util/environment.py", line 56, in is_system_path
    return os.path.normpath(path) in system_dirs
  File "/scratch-shared/meteoswiss/scratch/cosuna/cosmo/.venv/lib/python3.7/posixpath.py", line 340, in normpath
    path = os.fspath(path)
TypeError: expected str, bytes or os.PathLike object, not NoneType

==> Error: TypeError: expected str, bytes or os.PathLike object, not NoneType

/scratch-shared/meteoswiss/scratch/cosuna/cosmo/cosmo/ACC/jenkins/spack-mch/spack/spack/lib/spack/spack/build_environment.py:863, in child_process:
        860
        862            # build up some context from the offending package so we can
  >>    863            # show that, too.
        864            package_context = get_package_context(tb)
        865
        866            build_log = None

spack load runtime error with cosmo

For info, spack load SPEC for cosmo leads, for some installations, to a runtime error (before start):

/../libstdc++.so.6: version `GLIBCXX_3.4.20' not found

This does not affect the environment generated by the extract_env.py script. Both my user and the jenkins user seem affected. Not all installs have the issue; master is ok, but

spack load -r [email protected]%[email protected] real_type=float cosmo_target=gpu 

would lead to the error at runtime.
I tried to adapt extract_env.py to only write the spack load into a file; in that case, sourcing the file seems to solve the problem. There seems to be a difference between directly calling spack load and calling spack load from within the python script.

This is an issue for using the cosmo benchmark jenkins plan with installations other than master.

Update spack v0.xx

Two issues with v0.16.0:

  • users get an error due to write permissions on a new cache file
  • an error due to a long path
  • the PR needed to use v0.16.x is spack/spack#22500. It was decided, though, not to cherry-pick it, because one never knows if it could break operations, since it is not tested.
  • one solution would be to ask the Spack Team to integrate this PR into the bug fixes of v0.16.4, or to wait for v0.17.0, which is coming soon.

Remove unused packages for Daint

Currently there are a lot more packages contained in the config for Daint than for Tsa.

I'd be in favor of removing unused packages in order to have a better overview of what is relevant for our Daint users.

Remove CSCS_APPS_PATH

Remove it from both compilers.yaml in spack and from the ICON config script (the whole use module statement).

Install cosmo build env

The implementation from #176 causes an error in the case of dev-build, as package.env_path is incorrect because it looks for the file in the stages.

This fix removes the installation of the build environment: #195

The new implementation (without installation of the build env) is likely an issue for the install release.

We should either not install the build env in the case of dev-build, or find it at the correct place (local folder).

disambiguate the spec in the extract_env

In the deployment script extract_env.py, in order to figure out the installation path of a given spec, we simply use:

package_spec = Spec(args.spec).concretized()
print(package_spec.format('{prefix}'))

This will give back a path with a hash irrespective of whether the spec was installed or not.
Typically an incomplete spec will lead to a wrong path.

The behaviour of spack location -i is actually very different.
Say you installed the spec:
[email protected]%[email protected] real_type=float cosmo_target=gpu +pollen +eccodes +production +claw +verbose +cppdycore ~debug +set_version

but now you are executing the script with an incomplete spec
[email protected]%[email protected] real_type=float cosmo_target=gpu +pollen +eccodes +production +claw +verbose +cppdycore ~debug

spack location does not consider the defaults of variants. So, as long as ~set_version was not installed, it will disambiguate the spec, find the one installed with +set_version, and give you that one back.

I think we should emulate this behaviour of spack location
(https://github.com/spack/spack/blob/8ccdcf2e843ec02a1af3d4cf72c5c68236f61e36/lib/spack/spack/cmd/location.py#L101)
in extract_env.py, e.g. along the lines of the sketch below.
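
A minimal sketch of that emulation, assuming the script keeps running with spack's python as today; args.spec is the same argument extract_env.py already parses, and disambiguate_spec is the helper the location command uses:

import spack.cmd

# Match the (possibly incomplete) input spec against the *installed*
# specs instead of concretizing it, mirroring spack location -i.
spec = spack.cmd.parse_specs(args.spec)[0]
matching_spec = spack.cmd.disambiguate_spec(spec, None)
print(matching_spec.prefix)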

Nitpick: Consistent naming scheme of C2SM versions

To install the recommended branch of c2sm for COSMO I do:
spack installcosmo cosmo@c2sm-master

Whereas for int2lm we do:
spack install int2lm@c2sm_master

Would it be possible to use either a minus or an underscore for both names? I always pick the wrong one at first, like with USB-A plugs :-D

use jenkins file for spack PR

so that we can parametrize it and maintain it better.
The idea is that you can specify a SPEC parameter, which will trigger a specific build of spack.
If not, the default behaviour would be to build a set of predefined tests (cosmo, icon, int2lm, etc.)
for users who are not sure what should be tested to validate a spack PR.
