filesystem-layer's Issues

Document procedure for adding a Stratum 1

There's already documentation for deploying a Stratum 1 server, but we should also document the procedure for actually adding a Stratum 1 to the project (a quick sanity check for a newly added Stratum 1 is sketched after this list), i.e. something like:

  • add the URL to the config file(s) (group_vars/all.yml)
  • send PR
  • generate new client configuration/packages
  • add DNS entry?
  • update the clients (how do we do/announce this?)
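
As a quick sanity check after a new Stratum 1 has been added (the hostname below is a placeholder, not a real server), one could verify that it serves the repository manifest:

# Hypothetical hostname of the newly added Stratum 1
NEW_S1="http://new-stratum1.example.org"
# The repository manifest should be reachable (HTTP 200)
curl --silent --head "${NEW_S1}/cvmfs/pilot.eessi-hpc.org/.cvmfspublished" | head -n 1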

single directory recommendation for CVMFS repos

According to @rptaylor there's a recommendation for CVMFS repos to use a single directory at the top level, for example /cvmfs/pilot.eessi-hpc.org/repo/, and only start using multiple subdirectories below that.

To quote @rptaylor:

It provides some options for aliasing (or possible future migrations), e.g. with symlinks. It is analogous
to putting a service into production using a DNS CNAME instead of the real server name, so that you can
change the backend system without disrupting users (or at least it gives some options and flexibility).
There might be other reasons as well that I am not aware of. All the major CERN repos are set up this way.
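
A minimal sketch of the aliasing idea (directory names are made up for illustration): inside a transaction on the Stratum 0, the single top-level directory is just a symlink that can later be repointed without changing user-facing paths.

cvmfs_server transaction pilot.eessi-hpc.org
# 'repo' is the stable, user-facing path; 'versions/2021.06' is the actual tree
ln -s versions/2021.06 /cvmfs/pilot.eessi-hpc.org/repo
cvmfs_server publish pilot.eessi-hpc.org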

Using the stack in a container on a parallel FS

As part of #37 I was testing using that setup to run GROMACS. Things work fine if I don't use too many MPI tasks per node, but once I go above 4 I'm getting errors:

[ocais1@juwels03 test]$  OMP_NUM_THREADS=6 srun --time=00:05:00 --nodes=1 --ntasks-per-node=6 --cpus-per-task=6 singularity exec --fusemount "$EESSI_CONFIG" --fusemount "$EESSI_PILOT" /p/project/cecam/singularity/cecam/ocais1/client-pilot_centos7-2020.08.sif /cvmfs/pilot.eessi-hpc.org/2020.08/software/x86_64/intel/haswell/software/GROMACS/2020.1-foss-2020a-Python-3.8.2/bin/gmx_mpi mdrun -s ion_channel.tpr -maxh 0.50 -resethway -noconfout -nsteps 1000 -g logfile
srun: job 2622253 queued and waiting for resources
srun: job 2622253 has been allocated resources
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
Failed to initialize loader socket
Failed to initialize loader socket
Failed to initialize loader socket
FATAL:   stat /cvmfs/pilot.eessi-hpc.org/2020.08/software/x86_64/intel/haswell/software/GROMACS/2020.1-foss-2020a-Python-3.8.2/bin/gmx_mpi: transport endpoint is not connected
FATAL:   stat /cvmfs/pilot.eessi-hpc.org/2020.08/software/x86_64/intel/haswell/software/GROMACS/2020.1-foss-2020a-Python-3.8.2/bin/gmx_mpi: transport endpoint is not connected
FATAL:   stat /cvmfs/pilot.eessi-hpc.org/2020.08/software/x86_64/intel/haswell/software/GROMACS/2020.1-foss-2020a-Python-3.8.2/bin/gmx_mpi: transport endpoint is not connected
FATAL:   stat /cvmfs/pilot.eessi-hpc.org/2020.08/software/x86_64/intel/haswell/software/GROMACS/2020.1-foss-2020a-Python-3.8.2/bin/gmx_mpi: transport endpoint is not connected
FATAL:   stat /cvmfs/pilot.eessi-hpc.org/2020.08/software/x86_64/intel/haswell/software/GROMACS/2020.1-foss-2020a-Python-3.8.2/bin/gmx_mpi: transport endpoint is not connected
FATAL:   stat /cvmfs/pilot.eessi-hpc.org/2020.08/software/x86_64/intel/haswell/software/GROMACS/2020.1-foss-2020a-Python-3.8.2/bin/gmx_mpi: transport endpoint is not connected
CernVM-FS: loading Fuse module... CernVM-FS: loading Fuse module... CernVM-FS: loading Fuse module... CernVM-FS: loading Fuse module... CernVM-FS: loading Fuse module... srun: error: jwc04n178: tasks 0-5: Exited with exit code 255

I suspect the alien cache alone is not enough for this use case, and that we also need a local cache on the node.
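
One possible direction, assuming we can adjust the CVMFS client configuration used in the container: combine the shared alien cache with a node-local cache via a tiered cache, roughly like this (these are the same parameters used in the alien cache script further down this page):

CVMFS_CACHE_PRIMARY=hpc
CVMFS_CACHE_hpc_TYPE=tiered
CVMFS_CACHE_hpc_UPPER=local              # node-local cache absorbs the FUSE traffic
CVMFS_CACHE_hpc_LOWER=alien              # shared alien cache on the parallel FS
CVMFS_CACHE_hpc_LOWER_READONLY=yes
CVMFS_CACHE_local_TYPE=posix
CVMFS_CACHE_local_ALIEN=/local_alien     # e.g. bind-mounted from /tmp or /dev/shm
CVMFS_CACHE_alien_TYPE=posix
CVMFS_CACHE_alien_ALIEN=/shared_alien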

Symlinks for "latest" and "preview" stacks

It would be nice to have a latest (-> 2020.10 right now) symlink in /cvmfs/pilot.eessi-hpc.org/ so that we don't need to continuously update scripts. This will also facilitate integration in other tools like Magic Castle.

Having a preview symlink for the next iteration of the stack would also be useful.
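
A minimal sketch of how these symlinks could be created on the Stratum 0 (the version numbers are only examples):

cvmfs_server transaction pilot.eessi-hpc.org
ln -s 2020.10 /cvmfs/pilot.eessi-hpc.org/latest    # points to the current stack
ln -s 2021.03 /cvmfs/pilot.eessi-hpc.org/preview   # hypothetical next iteration
cvmfs_server publish pilot.eessi-hpc.org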

Config repo not added as replica on stratum1

The config repository does not seem to be added as a replica on the Stratum 1, even though the clients are configured to use the Stratum 1 as its server. We should add it automatically.
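
For reference, a rough sketch of what adding the config repository as a replica on a Stratum 1 could look like (the Stratum 0 URL and key directory are taken from the alien cache script below and should be double-checked):

# On the Stratum 1: register the config repo as a replica and pull a first snapshot
cvmfs_server add-replica -o $USER \
    http://cvmfs-s0.eessi-hpc.org/cvmfs/cvmfs-config.eessi-hpc.org \
    /etc/cvmfs/keys/eessi-hpc.org
cvmfs_server snapshot cvmfs-config.eessi-hpc.org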

Directory structure

We already have an open issue to discuss the naming scheme for the repositories (see #4), but inside the prod/test repositories we need to come up with / decide on a directory structure.

This is the concept we showed during the EESSI & CVMFS meeting:

.
└── cvmfs/
    ├── cvmfs-config.eessi-hpc.org/
    │   └── etc/
    │       └── cvmfs/
    ├── test.eessi-hpc.org
    └── prod.eessi-hpc.org/
        └── 2020.06/
            ├── compat/
            │   ├── aarch64
            │   ├── ppc64le
            │   └── x86_64/
            │       ├── bin
            │       ├── etc
            │       ├── lib64
            │       └── usr
            └── software/
                ├── aarch64
                ├── ppc64le
                └── x86_64/
                    ├── amd/
                    │   └── zen2
                    └── intel/
                        ├── haswell
                        └── skylake/
                            ├── modules
                            └── software/
                                ├── GROMACS/
                                │   ├── 2019.3-fosscuda-2019b
                                │   └── 2020.1-foss-2020a-Python-3.8.2
                                └── TensorFlow/
                                    └── 2.2.0-fosscuda-2019b-Python-3.7.4

With variant symlinks we can point a fixed directory to the right tree, based on the client's architecture.
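
As an illustration of that mechanism (the variable name EESSI_SOFTWARE_SUBDIR and the paths are hypothetical), the repository would contain a symlink whose target references a variable with a default, and each client can override it in its local CVMFS configuration:

# In the repository, inside a transaction: the target is resolved per client, with a fallback
ln -s '$(EESSI_SOFTWARE_SUBDIR:-x86_64/intel/haswell)' latest_arch

# On a client, e.g. in /etc/cvmfs/default.local, point the symlink at the right tree:
echo 'EESSI_SOFTWARE_SUBDIR=x86_64/amd/zen2' >> /etc/cvmfs/default.local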

Script to configure and populate an alien cache

This is a (basic) script to create and populate a CVMFS alien cache for use on systems that do not have access to the internet (but you do require internet access to perform the initial run of the script).

# Set group (as required, useful if you would like to share the cache with others)
MYGROUP=$GROUPS

# Set user
MYUSER=$USER

# Set path to shared space
SHAREDSPACE="/path/to/shared/space"

# Set path to (node) local space to store a local alien cache (e.g., /tmp or /dev/shm)
# WARNING: This directory needs to exist on the nodes where you will mount or you will
#          get a binding error from Singularity!
LOCALSPACE="/tmp"

# Choose the Singularity image to use
STACK="2020.12"
SINGULARITY_REMOTE="client-pilot:centos7-$(uname -m)"

#########################################################################
# Variables below this point can be changed (but they don't need to be) #
#########################################################################

SINGULARITY_IMAGE="$SHAREDSPACE/$MYGROUP/$MYUSER/${SINGULARITY_REMOTE/:/_}.sif"

# Set text colours for info on commands being run
YELLOW='\033[0;33m'
NC='\033[0m' # No Color

# Make the directory structures
SINGULARITY_CVMFS_ALIEN="$SHAREDSPACE/$MYGROUP/alien_$STACK"
mkdir -p $SINGULARITY_CVMFS_ALIEN

SINGULARITY_HOMEDIR="$SHAREDSPACE/$MYGROUP/$MYUSER/home"
mkdir -p $SINGULARITY_HOMEDIR

##################################################
# No more variable definitions beyond this point #
##################################################

# Pull the container
if [ ! -f $SINGULARITY_IMAGE ]; then
    echo -e "${YELLOW}\nPulling singularity image\n${NC}"
    singularity pull $SINGULARITY_IMAGE docker://eessi/$SINGULARITY_REMOTE
fi

# Create a default.local file in the users home
# We use a tiered cache, with a shared alien cache and a local alien cache.
# We populate the shared alien cache and that is used to fill the local
# alien cache (which is usually in a space that gets cleaned up like /tmp or /dev/shm)
if [ ! -f $SINGULARITY_HOMEDIR/default.local ]; then
    echo -e "${YELLOW}\nCreating CVMFS configuration for shared and local alien caches\n${NC}"
    echo "# Custom settings" > $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_WORKSPACE=/var/lib/cvmfs" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_PRIMARY=hpc" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_hpc_TYPE=tiered" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_hpc_UPPER=local" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_hpc_LOWER=alien" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_hpc_LOWER_READONLY=yes" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_local_TYPE=posix" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_local_SHARED=no" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_local_QUOTA_LIMIT=-1" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_local_ALIEN=\"/local_alien\"" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_alien_TYPE=posix" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_alien_SHARED=no" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_alien_QUOTA_LIMIT=-1" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_alien_ALIEN=\"/shared_alien\"" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_HTTP_PROXY=\"INVALID-PROXY\"" >> $SINGULARITY_HOMEDIR/default.local
fi


# Environment variables
export EESSI_CONFIG="container:cvmfs2 cvmfs-config.eessi-hpc.org /cvmfs/cvmfs-config.eessi-hpc.org"
export EESSI_PILOT="container:cvmfs2 pilot.eessi-hpc.org /cvmfs/pilot.eessi-hpc.org"
export SINGULARITY_HOME="$SINGULARITY_HOMEDIR:/home/$MYUSER"
export SINGULARITY_SCRATCH="/var/lib/cvmfs,/var/run/cvmfs"
export SINGULARITY_BIND="$SINGULARITY_CVMFS_ALIEN:/shared_alien,$LOCALSPACE:/local_alien"

# Create a dirTab file so we only cache the stack we are interested in using
if [ ! -f $SINGULARITY_HOMEDIR/dirTab.$STACK ]; then
    # We will only use this workspace until we have built our dirTab file
    # (this is required because the Singularity scratch dirs are just 16MB,
    #  i.e., not enough to cache what we need to run the python script below.
    #  Once we have an alien cache this is no longer a concern)
    export SINGULARITY_WORKDIR=$(mktemp  -d)

    platform=$(uname -m)
    if [[ $(uname -s) == 'Linux' ]]; then
        os_type='linux'
    else
        os_type='macos'
    fi

    # Find out which software directory we should be using (grep used to filter warnings)
    arch_dir=$(singularity exec --fusemount "$EESSI_CONFIG" --fusemount "$EESSI_PILOT" $SINGULARITY_IMAGE /cvmfs/pilot.eessi-hpc.org/${STACK}/compat/${os_type}/${platform}/usr/bin/python3 /cvmfs/pilot.eessi-hpc.org/${STACK}/init/eessi_software_subdir_for_host.py /cvmfs/pilot.eessi-hpc.org/${STACK} | grep ${platform})

    # Construct our dirTab so that the alien cache is only populated with the software we require
    echo -e "${YELLOW}\nCreating CVMFS dirTab for $STACK alien cache\n${NC}"
    echo "/$STACK/init" > $SINGULARITY_HOMEDIR/dirTab.$STACK
    echo "/$STACK/tests" >> $SINGULARITY_HOMEDIR/dirTab.$STACK
    echo "/$STACK/compat/${os_type}/${platform}" >> $SINGULARITY_HOMEDIR/dirTab.$STACK
    echo "/$STACK/software/${arch_dir}" >> $SINGULARITY_HOMEDIR/dirTab.$STACK

    # Now clean up the workspace
    rm -r $SINGULARITY_WORKDIR
    unset SINGULARITY_WORKDIR   
fi

# Download the script for populating the alien cache
if [ ! -f $SINGULARITY_HOMEDIR/cvmfs_preload ]; then
    echo -e "${YELLOW}\nGetting CVMFS preload script\n${NC}"
    singularity exec $SINGULARITY_IMAGE curl https://cvmrepo.web.cern.ch/cvmrepo/preload/cvmfs_preload -o /home/$MYUSER/cvmfs_preload
fi

# Get the public keys for our repos
if [ ! -f $SINGULARITY_HOMEDIR/pilot.eessi-hpc.org.pub ]; then
    echo -e "${YELLOW}\nGetting CVMFS repositories public keys\n${NC}"

    export SINGULARITY_WORKDIR=$(mktemp  -d)
    singularity exec --fusemount "$EESSI_CONFIG" --fusemount "$EESSI_PILOT" $SINGULARITY_IMAGE cp /cvmfs/cvmfs-config.eessi-hpc.org/etc/cvmfs/keys/eessi-hpc.org/pilot.eessi-hpc.org.pub /home/$MYUSER/
    singularity exec --fusemount "$EESSI_CONFIG" --fusemount "$EESSI_PILOT" $SINGULARITY_IMAGE cp /etc/cvmfs/keys/eessi-hpc.org/cvmfs-config.eessi-hpc.org.pub /home/$MYUSER/

    # Now clean up the workspace
    rm -r $SINGULARITY_WORKDIR
    unset SINGULARITY_WORKDIR   
fi

# Populate the alien cache (the connections to these can fail and may need to be restarted)
#  (A note here: this is an expensive operation and puts a heavy load on the Stratum 0. From the developers:
#   "With the -u <url> preload parameter, you can switch between stratum 0 and stratum 1 as necessary. I'd not
#    necessarily use the stratum 1 for the initial snapshot though because the replication thrashes the stratum
#    1 cache. Instead, for preloading I'd recommend to establish a dedicated URL. This URL can initially be simply
#    an alias to the stratum 0. As long as there are only a handful of preload destinations, that should work fine. If
#    more sites preload, this URL can turn into a dedicated stratum 1 or a large cache in front of the stratum 0.
#   ")
echo -e "${YELLOW}\nPopulating CVMFS alien cache\n${NC}"
singularity exec $SINGULARITY_IMAGE sh /home/$MYUSER/cvmfs_preload -u http://cvmfs-s0.eessi-hpc.org/cvmfs/cvmfs-config.eessi-hpc.org  -r /shared_alien -k /home/$MYUSER/cvmfs-config.eessi-hpc.org.pub 
# We use the dirTab file for the software repo to limit what we pull in
singularity exec $SINGULARITY_IMAGE sh /home/$MYUSER/cvmfs_preload -u http://cvmfs-s0.eessi-hpc.org/cvmfs/pilot.eessi-hpc.org  -r /shared_alien -k /home/$MYUSER/pilot.eessi-hpc.org.pub -d /home/$MYUSER/dirTab.$STACK 

# Now that we have a populated alien cache we can use it
export SINGULARITY_BIND="$SINGULARITY_CVMFS_ALIEN:/shared_alien,$LOCALSPACE:/local_alien,$SINGULARITY_HOMEDIR/default.local:/etc/cvmfs/default.local"

# Get a shell
echo -e "${YELLOW}\nTo get a shell inside a singularity container (for example), use:\n${NC}"
echo -e "  export EESSI_CONFIG=\"$EESSI_CONFIG\""
echo -e "  export EESSI_PILOT=\"$EESSI_PILOT\""
echo -e "  export SINGULARITY_HOME=\"$SINGULARITY_HOME\""
echo -e "  export SINGULARITY_BIND=\"$SINGULARITY_BIND\""
echo -e "  export SINGULARITY_SCRATCH=\"/var/lib/cvmfs,/var/run/cvmfs\""
echo -e "  singularity shell --fusemount \"\$EESSI_CONFIG\" --fusemount \"\$EESSI_PILOT\" $SINGULARITY_IMAGE"

Automatically push files to CVMFS repositories

There are multiple use cases in basically all layers where we would need to push new/modified files to /cvmfs. For instance, in case of updates of configuration files in the cvmfs-config repo, EESSI init scripts, files in the Prefix installation, etc. This is now being done manually, but we need to come up with a procedure/mechanism to push files into the repository automatically.

A suggestion from @boegel is to clone the Git repositories into a hidden directory, and create symlinks in the CVMFS repositories that point to the corresponding files in that hidden directory.
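
A rough sketch of that suggestion (repository paths and file names are made up for illustration):

cvmfs_server transaction pilot.eessi-hpc.org
# Keep a hidden clone of the Git repository inside the CVMFS repo...
git -C /cvmfs/pilot.eessi-hpc.org/.git-repos/filesystem-layer pull 2>/dev/null || \
    git clone https://github.com/EESSI/filesystem-layer.git \
        /cvmfs/pilot.eessi-hpc.org/.git-repos/filesystem-layer
# ...and expose its files through symlinks, so a pull + publish updates them
ln -sf ../.git-repos/filesystem-layer/init/some_init_script \
    /cvmfs/pilot.eessi-hpc.org/init/some_init_script
cvmfs_server publish pilot.eessi-hpc.org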

Test procedure for ingesting tarballs with package updates

If we need to patch some package (e.g. Python/glibc) in our compatibility (or software) layer, we need to figure out how to capture the changes on a build node and apply them on a publisher node. Modifying and adding files is simple, but the tricky part is removing files.

Ultimately, we can perhaps tar the entire new compatibility layer directory, and ingest that with the -d option to remove the old directory. We need to think about this and test what does (not) work.
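
A possible test along those lines (hedging on the exact cvmfs_server ingest options, which should be verified against the CVMFS documentation; the paths are examples):

# On the build node: tar up the complete new compatibility layer directory
tar -cf compat-x86_64.tar -C /path/to/new/compat/linux/x86_64 .

# On the publisher node: drop the old directory and ingest the new tarball in its place
cvmfs_server ingest \
    --delete 2021.03/compat/linux/x86_64 \
    --tar_file compat-x86_64.tar \
    --base_dir 2021.03/compat/linux/x86_64 \
    pilot.eessi-hpc.org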

Broken CI for building packages

This suddenly stopped working in December, due to an issue with building .deb packages:

/usr/local/bundle/gems/fpm-1.11.0/lib/fpm/package/deb.rb:495:in `block in output': uninitialized constant FPM::Package::Deb::Zlib (NameError)
    from <internal:kernel>:90:in `tap'
    from /usr/local/bundle/gems/fpm-1.11.0/lib/fpm/package/deb.rb:494:in `output'
    from /usr/local/bundle/gems/fpm-1.11.0/lib/fpm/command.rb:487:in `execute'
    from /usr/local/bundle/gems/clamp-1.0.1/lib/clamp/command.rb:68:in `run'
    from /usr/local/bundle/gems/fpm-1.11.0/lib/fpm/command.rb:574:in `run'
    from /usr/local/bundle/gems/clamp-1.0.1/lib/clamp/command.rb:133:in `run'
    from /usr/local/bundle/gems/fpm-1.11.0/bin/fpm:7:in `<top (required)>'
    from /usr/local/bundle/bin/fpm:23:in `load'
    from /usr/local/bundle/bin/fpm:23:in `<main>'

We're using this Github Action:
https://github.com/bpicode/github-action-fpm
This wasn't changed, and it's still using the same fpm version. However, the base layer with Ruby was changed in December because of a new Ruby release (3.0.0), so I suspect this is related. Strangely, I now seem to have the issue too when doing some manual tests with an older Ruby container...

The same issue is discussed here:
jordansissel/fpm#1739

And there's a PR for fixing the issue in fpm:
jordansissel/fpm#1740
But that's still waiting to be merged...

(Re)configure garbage collection / automatic tagging

Our playbook currently enables garbage collection (CVMFS_GARBAGE_COLLECTION=true + cronjobs), but also automatic tagging (CVMFS_AUTO_TAG=true). However, those tags never got cleaned up. I manually changed this for now by setting a retention period using CVMFS_AUTO_TAG_TIMESPAN="2 weeks ago".

Especially for our production repo we should think about what kind of garbage collection (there's also a CVMFS_AUTO_GC) and automatic tagging (can also be disabled completely) we want.
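
For reference, these are the relevant server-side parameters (in /etc/cvmfs/repositories.d/<repo>/server.conf); the values shown are just one possible policy, not a decision:

CVMFS_GARBAGE_COLLECTION=true          # repository supports garbage collection
CVMFS_AUTO_GC=true                     # run GC automatically instead of only via cron jobs
CVMFS_AUTO_TAG=true                    # keep creating automatic tags...
CVMFS_AUTO_TAG_TIMESPAN="2 weeks ago"  # ...but clean them up after a retention period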

Variant symlink for host_injections

Just like the latest symlink (see #51), we need some kind of automation/configuration management to make/update the (variant) symlink for /cvmfs/pilot.eessi-hpc.org/host_injections. I've created it manually for now using:

ln -s '$(EESSI_HOST_INJECTIONS:-/opt/eessi)' /cvmfs/pilot.eessi-hpc.org/host_injections

Naming scheme for repositories

We have to come up with / agree on a naming scheme that we will use for our repository layout. Currently we only have

  • cvmfs-config.eessi-hpc.org -> contains the config of all repositories and our eessi-hpc.org CVMFS "domain"
  • pilot.eessi-hpc.org -> pilot repository for playing around with CVMFS, Easybuild, etc

In the future we will need at least repositories for production and development/testing, so it could be something like (as suggested by @boegel):

  • prod.eessi-hpc.org (or perhaps soft(ware).eessi-hpc.org / repo.eessi-hpc.org)
  • dev.eessi-hpc.org

More ideas are welcome!

CentOS fallout

With people now moving to CentOS alternatives, config managers are slow to catch up.

Fresh example: Rocky 8.4 only offers centos-release-ansible-29, and this version of Ansible doesn't yet recognize Rocky as a Red Hat family distro. I'm told that Ansible 2.11 or newer is required for this, which is worth a sentence or two in our current docs until this works out of the box.

Set up yum/apt repositories for client packages

This makes it much easier to get the latest version in, for instance, scripts / container definition files; now we have to change the URL every time we update the packages. Also, updating clients will be easier (by using their package managers).

Readme: Stratum1 verification is outdated

README.md mentions this curl to verify operation of stratum1:
curl --proxy http://url-to-your-proxy:3128 --head http://url-to-your-stratum1/cvmfs/cvmfs-config.eessi-hpc.org/.cvmfspublished
Since cvmfs-config is not in use anymore, this returns a 404 and causes some initial confusion as to whether things are working correctly or not.
Please update this line to point to one of the current EESSI CVMFS repositories (ci or pilot).

Stratum1 data location

While deploying a stratum1 system I learned that its data actually ends up in /srv.
roles/galaxyproject.cvmfs/defaults/main.yml does offer cvmfs_srv_device and cvmfs_srv_mount to handle this, but they are commented out by default. It would be nice if a sentence in the docs pointed to those settings, so one can properly set them up before the first Ansible run fills up the root partition ;)

Issue starting the EESSI container

When trying to run the EESSI container with

mkdir -p $TMPDIR/{var-lib-cvmfs,var-run-cvmfs,home}

export SINGULARITY_BIND="$TMPDIR/var-run-cvmfs:/var/run/cvmfs,$TMPDIR/var-lib-cvmfs:/var/lib/cvmfs"
export SINGULARITY_HOME="$TMPDIR/home:/home/$USER"

export EESSI_CONFIG="container:cvmfs2 cvmfs-config.eessi-hpc.org /cvmfs/cvmfs-config.eessi-hpc.org"
export EESSI_PILOT="container:cvmfs2 pilot.eessi-hpc.org /cvmfs/pilot.eessi-hpc.org"

singularity shell --fusemount "$EESSI_CONFIG" --fusemount "$EESSI_PILOT" docker://eessi/client-pilot:centos7-$(uname -m)-2020.10

I ran into this:

CernVM-FS: loading Fuse module... Failed to initialize shared lru cache (12 - quota init failure)

It seems to suggest a quota issue, but my quota seems fine... After clearing out my $TMPDIR, the container starts fine. I think some leftovers of var-lib-cvmfs, var-run-cvmfs, and home were left in $TMPDIR from a previous mount.

I figured I'd just share the issue - and solution - here in case others run into it. We could consider adding a line in the instructions that clears those dirs from $TMPDIR before creating new ones...
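
Something like this before creating the directories would do (a simple sketch):

# Clear out any leftover CVMFS state from a previous run before recreating the dirs
rm -rf $TMPDIR/{var-lib-cvmfs,var-run-cvmfs,home}
mkdir -p $TMPDIR/{var-lib-cvmfs,var-run-cvmfs,home}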

setuid binaries in cvmfs

Hi,

I am opening this issue as requested by @boegel, just to keep track of this question for others and maybe for the EESSI docs.

I was not sure how setuid binaries are handled by cvmfs and checking the cvmfs docs I found this:

CVMFS_SUID If set to yes, enable suid magic on the mounted repository. Requires mounting as root.

According to the official docs, setuid binaries in CVMFS are disabled by default. To enable setuid, you need to set CVMFS_SUID=yes and mount the CVMFS file system as root.

Stratum 1 on CentOS 8 installation messages

Running the playbook without specifying a user:

Error:
[root@stratum filesystem-layer]# ansible-playbook -i hosts -b -K stratum1.yml
BECOME password:

PLAY [CVMFS Stratum 1] *************************************************************************************************************************

TASK [Gathering Facts] *************************************************************************************************************************
fatal: [stratum.home.local]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: [email protected]: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).", "unreachable": true}

PLAY RECAP *************************************************************************************************************************************
stratum.home.local : ok=0 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0

Running the playbook with -u and --ask-pass:
[root@stratum filesystem-layer]# ansible-playbook -i hosts -b -K stratum1.yml -u dell --ask-pass
SSH password:
BECOME password[defaults to SSH password]:

PLAY [CVMFS Stratum 1] *************************************************************************************************************************

TASK [Gathering Facts] *************************************************************************************************************************
ok: [stratum.home.local]

TASK [geerlingguy.repo-epel : Check if EPEL repo is already configured.] ***********************************************************************
ok: [stratum.home.local]

TASK [geerlingguy.repo-epel : Install EPEL repo.] **********************************************************************************************
skipping: [stratum.home.local]

TASK [geerlingguy.repo-epel : Import EPEL GPG key.] ********************************************************************************************
skipping: [stratum.home.local]

TASK [geerlingguy.repo-epel : Disable Main EPEL repo.] *****************************************************************************************
skipping: [stratum.home.local]

TASK [cvmfs : Set OS-specific variables] *******************************************************************************************************
ok: [stratum.home.local]

TASK [cvmfs : Set facts for Galaxy CVMFS config repository, if enabled] ************************************************************************
ok: [stratum.home.local]

TASK [cvmfs : Set facts for Galaxy CVMFS static repositories, if enabled] **********************************************************************
skipping: [stratum.home.local]

TASK [cvmfs : Set facts for CVMFS config repository, if enabled] *******************************************************************************
ok: [stratum.home.local]

TASK [cvmfs : include_tasks] *******************************************************************************************************************
skipping: [stratum.home.local]

TASK [cvmfs : include_tasks] *******************************************************************************************************************
included: /root/Downloads/filesystem-layer/roles/cvmfs/tasks/stratum1.yml for stratum.home.local

TASK [cvmfs : Include initial OS-specific tasks] ***********************************************************************************************
included: /root/Downloads/filesystem-layer/roles/cvmfs/tasks/init_redhat.yml for stratum.home.local

TASK [cvmfs : Configure CernVM yum repositories] ***********************************************************************************************
ok: [stratum.home.local] => (item={'name': 'cernvm', 'description': 'CernVM packages', 'baseurl': 'http://cvmrepo.web.cern.ch/cvmrepo/yum/cvmfs/EL/$releasever/$basearch/'})
ok: [stratum.home.local] => (item={'name': 'cernvm-config', 'description': 'CernVM-FS extra config packages', 'baseurl': 'http://cvmrepo.web.cern.ch/cvmrepo/yum/cvmfs-config/EL/$releasever/$basearch/'})

TASK [cvmfs : Install CernVM yum key] **********************************************************************************************************
ok: [stratum.home.local]

TASK [cvmfs : Install CernVM-FS packages and dependencies (yum)] *******************************************************************************
fatal: [stratum.home.local]: FAILED! => {"changed": false, "failures": ["No package mod_wsgi available."], "msg": "Failed to install some of the specified packages", "rc": 1, "results": []}

PLAY RECAP *************************************************************************************************************************************
stratum.home.local : ok=9 changed=0 unreachable=0 failed=1 skipped=5 rescued=0 ignored=0

Running the playbook with -vvv gives more info about the module arguments, but which module do we need to have installed before we run the playbook?

fatal: [stratum.home.local]: FAILED! => {
"changed": false,
"failures": [
"No package mod_wsgi available."
],
"invocation": {
"module_args": {
"allow_downgrade": false,
"autoremove": false,
"bugfix": false,
"conf_file": null,
"disable_excludes": null,
"disable_gpg_check": false,
"disable_plugin": [],
"disablerepo": [],
"download_dir": null,
"download_only": false,
"enable_plugin": [],
"enablerepo": [],
"exclude": [],
"install_repoquery": true,
"install_weak_deps": true,
"installroot": "/",
"list": null,
"lock_timeout": 30,
"name": [
"httpd",
"mod_wsgi",
"squid",
"cvmfs-server",
"cvmfs-config-default"
],
"releasever": null,
"security": false,
"skip_broken": false,
"state": "present",
"update_cache": false,
"update_only": false,
"validate_certs": true
}
},
"msg": "Failed to install some of the specified packages",
"rc": 1,
"results": []
}

Any info is welcome.

The test script for WSGI works fine; which module do we need here?

Add support for testing different CVMFS setups in the docker images for the pilot.

For testing CVMFS setups in the pilot, it may be a good idea for the docker images to offer three modes for setting up CVMFS:

  1. Bind to stratum 0 (if that fails, we have issues)
  2. Bind to a specific stratum 1 to test it, and
  3. Go “live” and use Geo-location to pick the closest one.

Selecting one of these modes via an environment variable or similar would allow us to test our CVMFS setup with (relative) ease.
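
A sketch of how such a switch could look in the container startup (the EESSI_CVMFS_MODE variable and the Stratum 1 hostnames are hypothetical; the CVMFS parameters go into the client configuration):

case "${EESSI_CVMFS_MODE:-geoapi}" in
  stratum0)
    echo 'CVMFS_SERVER_URL="http://cvmfs-s0.eessi-hpc.org/cvmfs/@fqrn@"' >> /etc/cvmfs/default.local ;;
  stratum1)
    # Bind to one specific Stratum 1 to test it (placeholder hostname)
    echo 'CVMFS_SERVER_URL="http://stratum1.example.org/cvmfs/@fqrn@"' >> /etc/cvmfs/default.local ;;
  *)
    # "Live" mode: list all Stratum 1s and let the Geo API pick the closest one
    echo 'CVMFS_SERVER_URL="http://s1-a.example.org/cvmfs/@fqrn@;http://s1-b.example.org/cvmfs/@fqrn@"' >> /etc/cvmfs/default.local
    echo 'CVMFS_USE_GEOAPI=yes' >> /etc/cvmfs/default.local ;;
esac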

Switch to https

The Ansible role already seems to support this, so we should try/test it.

Share masterkey between EESSI repos and make config package for the entire domain

During the monthly CVMFS coordination meeting it was mentioned that you can make a client package (without using a cvmfs-config repo) that still allows you to easily add more repos, as long as they are under the same domain and share the masterkey.
The documentation (see https://cvmfs.readthedocs.io/en/stable/cpt-repo.html#master-keys) also says something about this:

Each cvmfs repository uses two sets of keys, one for the individual repository and another called
the “masterkey” which signs the repository key. The pub key that corresponds to the masterkey
is what needs to be distributed to clients to verify the authenticity of the repository.
It is usually most convenient to share the masterkey between all repositories in a domain
so new repositories can be added without updating the client configurations.

We should change this for our repos, as they now have their own masterkeys.

Setting up cvmfs on MacOS.

  1. mkdir -p /Users/Shared/cvmfs/{cvmfs-config,pilot}.eessi-hpc.org
  2. Install https://github.com/osxfuse/osxfuse/releases/download/osxfuse-3.11.2/osxfuse-3.11.2.dmg. Nothing newer.
  3. curl -o ~/Downloads/cvmfs-2.7.5.pkg https://ecsft.cern.ch/dist/cvmfs/cvmfs-2.7.5/cvmfs-2.7.5.pkg ; installer -pkg ~/Downloads/cvmfs-2.7.5.pkg -target /

Note, installing the CVMFS package creates /cvmfs as a symlink to /Users/Shared/cvmfs by utilising /etc/synthetic.conf. If you want to have a MacOS build node and don't want to install CVMFS, you can do this yourself: https://derflounder.wordpress.com/2020/01/18/creating-root-level-directories-and-symbolic-links-on-macos-catalina/

But, please note that after editing /etc/synthetic.conf you have to restart for the changes to take effect, so...

  1. Reboot
  2. curl -o ~/Downloads/cvmfs-config-eessi-0.2.3.pkg https://github.com/EESSI/filesystem-layer/releases/download/v0.2.3/cvmfs-config-eessi-0.2.3.pkg ; installer -pkg ~/Downloads/cvmfs-config-eessi-0.2.3.pkg -target /
  3. sudo mount -t cvmfs cvmfs-config.eessi-hpc.org /Users/Shared/cvmfs/cvmfs-config.eessi-hpc.org
  4. sudo mount -t cvmfs pilot.eessi-hpc.org /Users/Shared/cvmfs/pilot.eessi-hpc.org

EESSI Clients Repository

Can we create an EESSI client repository for client installations on Linux, macOS, and WSL, so people can test WSL or macOS (which are not part of the current pilot)?

Document the CVMFS config.

The contents of our cvmfs-config-eessi packages are good to have, but some people (e.g. on Mac) may want to create the config themselves. Or do we want to offer a pkg?

Filesystem requirements

What are the requirements for installing the Stratum 0/1?

Is it only Ansible?
We also need an OpenSSH server and client on the server.
Do we need more?

I got some SSH errors when I was running "ansible-playbook -i hosts -b -K stratum1.yml".

CI for testing the client packages

We already have CI for building the client packages (deb and rpm), but we also need something that actually tests them, to prevent silly mistakes such as #47.

GEO API required?

If cvmfs_geo_license_key is not set, the add-replica command on the stratum1 server fails, because CVMFS_GEO_LICENSE_KEY is not set. This shouldn't be a required option, though.

Monitor CVMFS infrastructure

We had an issue with one of our Stratum 1 servers this week, which caused it to serve an older tag of the repository. This made me realize again that we should think about setting up some monitoring dashboard that gives an overview and statistics of our infrastructure, sends out alerts when something is wrong, etc. One way to easily grab some information about Stratum 1s is by reading out the .cvmfspublished file (and maybe the .cvmfs_last_snapshot too); the structure of that file is explained here in the docs.
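
A very small sketch of what such a check could report per Stratum 1 (hostnames are placeholders; the 'S' and 'T' fields of the manifest hold the revision and timestamp according to the docs):

# Compare the published revision and timestamp on each Stratum 1
for S1 in http://s1-a.example.org http://s1-b.example.org; do
  curl --silent "${S1}/cvmfs/pilot.eessi-hpc.org/.cvmfspublished" | \
    awk -v s1="$S1" '/^--/ {exit}                  # stop before the binary signature block
                     /^S/  {rev = substr($0, 2)}   # S<revision>
                     /^T/  {ts  = substr($0, 2)}   # T<unix timestamp>
                     END   {print s1 ": revision " rev ", timestamp " ts}'
done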

BECOME password - running stratum1.yml

When I run $ ansible-playbook -i hosts -b -K stratum1.yml, it asks me to enter the password for BECOME:

BECOME password:

I don't see any information about this account; did I miss something?

WSL with Ubuntu 16.04- running playbook gives an Error

Running WSL on a Windows 10 client, the following error pops up:

ERROR! no action detected in task

The error appears to have been in '/home/dell/filesystem-layer/roles/cvmfs/tasks/main.yml': line 25, column 3, but may
be elsewhere in the file depending on the exact syntax problem.
The offending line appears to be:

- include_tasks: client.yml
  ^ here

Problem: Ansible on WSL with Ubuntu 16.04 has version 2.0.0.2 installed. Upgrading Ansible to 2.9 solves the problem.

Is it possible to display a message if we find the wrong version of ansible is installed on the system?
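
One simple option, until the playbook checks this itself, is a small shell guard before running it (a sketch; 2.9 is used as the minimum based on the report above):

# Refuse to run the playbook with an Ansible that is too old (Ubuntu 16.04 ships 2.0.0.2)
REQUIRED="2.9"
CURRENT=$(ansible --version | head -n 1 | awk '{print $2}')
if [ "$(printf '%s\n' "$REQUIRED" "$CURRENT" | sort -V | head -n 1)" != "$REQUIRED" ]; then
    echo "Ansible >= $REQUIRED is required, found: $CURRENT" >&2
    exit 1
fi
ansible-playbook -i hosts -b -K stratum1.yml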

Container build workflow not triggered by release

The workflow does have:

  release:
    types: [published]

But this doesn't seem to work, probably because of this limitation in GitHub Actions:

When you use the repository's GITHUB_TOKEN to perform tasks on behalf of the GitHub Actions app, events triggered by the GITHUB_TOKEN will not create a new workflow run. This prevents you from accidentally creating recursive workflow runs.

We need to find some other way to automatically build new containers after the release (and, hence, the new client packages) has been published.
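
One possible workaround is to fire an explicit repository_dispatch event with a personal access token (instead of GITHUB_TOKEN), which does start workflow runs; the event type name below is made up, and the container build workflow would need a matching repository_dispatch trigger:

# Trigger the container build explicitly after publishing a release
curl -X POST \
  -H "Authorization: token ${GITHUB_PAT}" \
  -H "Accept: application/vnd.github.v3+json" \
  https://api.github.com/repos/EESSI/filesystem-layer/dispatches \
  -d '{"event_type": "client-packages-released"}'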
