eessi / filesystem-layer
Filesystem layer of the EESSI project
Home Page: https://eessi.github.io/docs/filesystem_layer
License: GNU General Public License v2.0
Now that we have a 2nd Stratum 1, we should deploy updated configuration files to the cvmfs-config repo, and build new client packages (by making a new release).
The `Provides` will ensure that our packages get recognized as a cvmfs-config package (which is a requirement for the client), and `Conflicts` can prevent them from being installed at the same time as another cvmfs-config package.
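For illustration, the package metadata would look roughly like this (a hypothetical excerpt; the actual fields live in our packaging scripts, and the conflicting package names are examples):

```
# Hypothetical RPM spec excerpt for the cvmfs-config-eessi package
Provides:  cvmfs-config
Conflicts: cvmfs-config-default
Conflicts: cvmfs-config-none
```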
There's already documentation for deploying a Stratum 1 server, but we should add the procedure for actually adding a Stratum 1 to the project, i.e. something like:
According to @rptaylor there's a recommendation for CVMFS repos to use a single directory at the top level, for example `/cvmfs/pilot.eessi-hpc.org/repo/`, and only start using multiple subdirectories below that.
To quote @rptaylor:
It provides some options for aliasing (or possible future migrations), e.g. with symlinks. It is analogous
to putting a service into production using a DNS CNAME instead of the real server name, so that you can
change the backend system without disrupting users (or at least it gives some options and flexibility).
There might be other reasons as well that I am not aware of. All the major CERN repos are set up this way.
This also has the funny effect of adding `sss` as a default configuration option, which failed to work on SAGA in Norway. The solution was to disable `sss` in `/etc/nsswitch.conf`. Also, on a system using systemd, automount may be deprecated for `.mount` files: https://www.freedesktop.org/software/systemd/man/systemd.mount.html
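As a sketch, the workaround amounts to dropping `sss` from the lookup lines in `/etc/nsswitch.conf` (illustrative lines only; the real file has more entries and lookup sources):

```
# /etc/nsswitch.conf (sketch): "sss" removed from the lookup order
passwd: files
group:  files
shadow: files
```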
As part of #37 I was testing using that setup to run GROMACS. Things work fine if I don't use too many MPI tasks per node, but once I go above 4 I'm getting errors:
[ocais1@juwels03 test]$ OMP_NUM_THREADS=6 srun --time=00:05:00 --nodes=1 --ntasks-per-node=6 --cpus-per-task=6 singularity exec --fusemount "$EESSI_CONFIG" --fusemount "$EESSI_PILOT" /p/project/cecam/singularity/cecam/ocais1/client-pilot_centos7-2020.08.sif /cvmfs/pilot.eessi-hpc.org/2020.08/software/x86_64/intel/haswell/software/GROMACS/2020.1-foss-2020a-Python-3.8.2/bin/gmx_mpi mdrun -s ion_channel.tpr -maxh 0.50 -resethway -noconfout -nsteps 1000 -g logfile
srun: job 2622253 queued and waiting for resources
srun: job 2622253 has been allocated resources
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
CernVM-FS: pre-mounted on file descriptor 3
Failed to initialize loader socket
Failed to initialize loader socket
Failed to initialize loader socket
FATAL: stat /cvmfs/pilot.eessi-hpc.org/2020.08/software/x86_64/intel/haswell/software/GROMACS/2020.1-foss-2020a-Python-3.8.2/bin/gmx_mpi: transport endpoint is not connected
FATAL: stat /cvmfs/pilot.eessi-hpc.org/2020.08/software/x86_64/intel/haswell/software/GROMACS/2020.1-foss-2020a-Python-3.8.2/bin/gmx_mpi: transport endpoint is not connected
FATAL: stat /cvmfs/pilot.eessi-hpc.org/2020.08/software/x86_64/intel/haswell/software/GROMACS/2020.1-foss-2020a-Python-3.8.2/bin/gmx_mpi: transport endpoint is not connected
FATAL: stat /cvmfs/pilot.eessi-hpc.org/2020.08/software/x86_64/intel/haswell/software/GROMACS/2020.1-foss-2020a-Python-3.8.2/bin/gmx_mpi: transport endpoint is not connected
FATAL: stat /cvmfs/pilot.eessi-hpc.org/2020.08/software/x86_64/intel/haswell/software/GROMACS/2020.1-foss-2020a-Python-3.8.2/bin/gmx_mpi: transport endpoint is not connected
FATAL: stat /cvmfs/pilot.eessi-hpc.org/2020.08/software/x86_64/intel/haswell/software/GROMACS/2020.1-foss-2020a-Python-3.8.2/bin/gmx_mpi: transport endpoint is not connected
CernVM-FS: loading Fuse module... CernVM-FS: loading Fuse module... CernVM-FS: loading Fuse module... CernVM-FS: loading Fuse module... CernVM-FS: loading Fuse module... srun: error: jwc04n178: tasks 0-5: Exited with exit code 255
I suspect the alien cache alone is not enough for this use case, and we also need a local cache on the node.
It would be nice to have a `latest` (-> `2020.10` right now) symlink in `/cvmfs/pilot.eessi-hpc.org/` so that we don't need to continuously update scripts. This will also facilitate integration in other tools like Magic Castle.
Having a `preview` symlink for the next iteration of the stack would also be useful.
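On the Stratum 0 this would boil down to (re)pointing a relative symlink inside a `cvmfs_server transaction`; the sketch below only demonstrates the symlink step itself, in a scratch directory so it is safe to run anywhere:

```shell
# Stand-in for /cvmfs/pilot.eessi-hpc.org; on the real Stratum 0 this
# would be wrapped in `cvmfs_server transaction` / `cvmfs_server publish`.
repo=$(mktemp -d)
mkdir -p "$repo/2020.10"
ln -sfn 2020.10 "$repo/latest"   # -sfn: create or atomically repoint
readlink "$repo/latest"
```

Using a relative target means the symlink also resolves correctly on clients that mount the repository elsewhere.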
The config repository does not seem to be added as a replica to the stratum 1, though the clients are configured to use the stratum 1 as server. We should add them automatically.
We already have an open issue to discuss the naming scheme for the repositories (see #4), but inside the prod/test repositories we need to come up with / decide on a directory structure.
This is the concept we showed during the EESSI & CVMFS meeting:
.
└── cvmfs/
├── cvmfs-config.eessi-hpc.org/
│ └── etc/
│ └── cvmfs/
├── test.eessi-hpc.org
└── prod.eessi-hpc.org/
└── 2020.06/
├── compat/
│ ├── aarch64
│ ├── ppc64le
│ └── x86_64/
│ ├── bin
│ ├── etc
│ ├── lib64
│ └── usr
└── software/
├── aarch64
├── ppc64le
└── x86_64/
├── amd/
│ └── zen2
└── intel/
├── haswell
└── skylake/
├── modules
└── software/
├── GROMACS/
│ ├── 2019.3-fosscuda-2019b
│ └── 2020.1-foss-2020a-Python-3.8.2
└── TensorFlow/
└── 2.2.0-fosscuda-2019b-Python-3.7.4
With variant symlinks we can point a fixed directory to the right tree, based on the client's architecture.
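For reference, a variant symlink is just a symlink whose target contains a `$(VARIABLE)` placeholder that each client expands from its local CVMFS configuration; the `EESSI_ARCH` variable name below is made up for illustration, and the symlink is created in a scratch directory rather than on a real Stratum 0:

```shell
# A client with EESSI_ARCH=aarch64 in its local CVMFS config would see
# "host" resolve to aarch64/; clients without it get the x86_64 fallback.
repo=$(mktemp -d)
ln -s '$(EESSI_ARCH:-x86_64)' "$repo/host"
readlink "$repo/host"
```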
The Ansible CVMFS role that we use (https://github.com/galaxyproject/ansible-cvmfs) doesn't seem to support this.
This is a (basic) script to create and populate a CVMFS alien cache for use on systems that do not have access to the internet (but you do require internet access to perform the initial run of the script).
# Set group (as required, useful if you would like to share the cache with others)
MYGROUP=$GROUPS
# Set user
MYUSER=$USER
# Set path to shared space
SHAREDSPACE="/path/to/shared/space"
# Set path to (node) local space to store a local alien cache (e.g., /tmp or /dev/shm)
# WARNING: This directory needs to exist on the nodes where you will mount or you will
# get a binding error from Singularity!
LOCALSPACE="/tmp"
# Choose the Singularity image to use
STACK="2020.12"
SINGULARITY_REMOTE="client-pilot:centos7-$(uname -m)"
#########################################################################
# Variables below this point can be changed (but they don't need to be) #
#########################################################################
SINGULARITY_IMAGE="$SHAREDSPACE/$MYGROUP/$MYUSER/${SINGULARITY_REMOTE/:/_}.sif"
# Set text colours for info on commands being run
YELLOW='\033[0;33m'
NC='\033[0m' # No Color
# Make the directory structures
SINGULARITY_CVMFS_ALIEN="$SHAREDSPACE/$MYGROUP/alien_$STACK"
mkdir -p $SINGULARITY_CVMFS_ALIEN
SINGULARITY_HOMEDIR="$SHAREDSPACE/$MYGROUP/$MYUSER/home"
mkdir -p $SINGULARITY_HOMEDIR
##################################################
# No more variable definitions beyond this point #
##################################################
# Pull the container
if [ ! -f $SINGULARITY_IMAGE ]; then
echo -e "${YELLOW}\nPulling singularity image\n${NC}"
singularity pull $SINGULARITY_IMAGE docker://eessi/$SINGULARITY_REMOTE
fi
# Create a default.local file in the users home
# We use a tiered cache, with a shared alien cache and a local alien cache.
# We populate the shared alien cache and that is used to fill the local
# alien cache (which is usually in a space that gets cleaned up like /tmp or /dev/shm)
if [ ! -f $SINGULARITY_HOMEDIR/default.local ]; then
echo -e "${YELLOW}\nCreating CVMFS configuration for shared and local alien caches\n${NC}"
echo "# Custom settings" > $SINGULARITY_HOMEDIR/default.local
echo "CVMFS_WORKSPACE=/var/lib/cvmfs" >> $SINGULARITY_HOMEDIR/default.local
echo "CVMFS_CACHE_PRIMARY=hpc" >> $SINGULARITY_HOMEDIR/default.local
echo "CVMFS_CACHE_hpc_TYPE=tiered" >> $SINGULARITY_HOMEDIR/default.local
echo "CVMFS_CACHE_hpc_UPPER=local" >> $SINGULARITY_HOMEDIR/default.local
echo "CVMFS_CACHE_hpc_LOWER=alien" >> $SINGULARITY_HOMEDIR/default.local
echo "CVMFS_CACHE_hpc_LOWER_READONLY=yes" >> $SINGULARITY_HOMEDIR/default.local
echo "CVMFS_CACHE_local_TYPE=posix" >> $SINGULARITY_HOMEDIR/default.local
echo "CVMFS_CACHE_local_SHARED=no" >> $SINGULARITY_HOMEDIR/default.local
echo "CVMFS_CACHE_local_QUOTA_LIMIT=-1" >> $SINGULARITY_HOMEDIR/default.local
echo "CVMFS_CACHE_local_ALIEN=\"/local_alien\"" >> $SINGULARITY_HOMEDIR/default.local
echo "CVMFS_CACHE_alien_TYPE=posix" >> $SINGULARITY_HOMEDIR/default.local
echo "CVMFS_CACHE_alien_SHARED=no" >> $SINGULARITY_HOMEDIR/default.local
echo "CVMFS_CACHE_alien_QUOTA_LIMIT=-1" >> $SINGULARITY_HOMEDIR/default.local
echo "CVMFS_CACHE_alien_ALIEN=\"/shared_alien\"" >> $SINGULARITY_HOMEDIR/default.local
echo "CVMFS_HTTP_PROXY=\"INVALID-PROXY\"" >> $SINGULARITY_HOMEDIR/default.local
fi
# Environment variables
export EESSI_CONFIG="container:cvmfs2 cvmfs-config.eessi-hpc.org /cvmfs/cvmfs-config.eessi-hpc.org"
export EESSI_PILOT="container:cvmfs2 pilot.eessi-hpc.org /cvmfs/pilot.eessi-hpc.org"
export SINGULARITY_HOME="$SINGULARITY_HOMEDIR:/home/$MYUSER"
export SINGULARITY_SCRATCH="/var/lib/cvmfs,/var/run/cvmfs"
export SINGULARITY_BIND="$SINGULARITY_CVMFS_ALIEN:/shared_alien,$LOCALSPACE:/local_alien"
# Create a dirTab file so we only cache the stack we are interested in using
if [ ! -f $SINGULARITY_HOMEDIR/dirTab.$STACK ]; then
# We will only use this workspace until we have built our dirTab file
# (this is required because the Singularity scratch dirs are just 16MB,
# i.e., not enough to cache what we need to run the python script below.
# Once we have an alien cache this is no longer a concern)
export SINGULARITY_WORKDIR=$(mktemp -d)
platform=$(uname -m)
if [[ $(uname -s) == 'Linux' ]]; then
os_type='linux'
else
os_type='macos'
fi
# Find out which software directory we should be using (grep used to filter warnings)
arch_dir=$(singularity exec --fusemount "$EESSI_CONFIG" --fusemount "$EESSI_PILOT" $SINGULARITY_IMAGE /cvmfs/pilot.eessi-hpc.org/${STACK}/compat/${os_type}/${platform}/usr/bin/python3 /cvmfs/pilot.eessi-hpc.org/${STACK}/init/eessi_software_subdir_for_host.py /cvmfs/pilot.eessi-hpc.org/${STACK} | grep ${platform})
# Construct our dirTab so the alien cache is populated with only the software we require
echo -e "${YELLOW}\nCreating CVMFS dirTab for $STACK alien cache\n${NC}"
echo "/$STACK/init" > $SINGULARITY_HOMEDIR/dirTab.$STACK
echo "/$STACK/tests" >> $SINGULARITY_HOMEDIR/dirTab.$STACK
echo "/$STACK/compat/${os_type}/${platform}" >> $SINGULARITY_HOMEDIR/dirTab.$STACK
echo "/$STACK/software/${arch_dir}" >> $SINGULARITY_HOMEDIR/dirTab.$STACK
# Now clean up the workspace
rm -r $SINGULARITY_WORKDIR
unset SINGULARITY_WORKDIR
fi
# Download the script for populating the alien cache
if [ ! -f $SINGULARITY_HOMEDIR/cvmfs_preload ]; then
echo -e "${YELLOW}\nGetting CVMFS preload script\n${NC}"
singularity exec $SINGULARITY_IMAGE curl https://cvmrepo.web.cern.ch/cvmrepo/preload/cvmfs_preload -o /home/$MYUSER/cvmfs_preload
fi
# Get the public keys for our repos
if [ ! -f $SINGULARITY_HOMEDIR/pilot.eessi-hpc.org.pub ]; then
echo -e "${YELLOW}\nGetting CVMFS repositories public keys\n${NC}"
export SINGULARITY_WORKDIR=$(mktemp -d)
singularity exec --fusemount "$EESSI_CONFIG" --fusemount "$EESSI_PILOT" $SINGULARITY_IMAGE cp /cvmfs/cvmfs-config.eessi-hpc.org/etc/cvmfs/keys/eessi-hpc.org/pilot.eessi-hpc.org.pub /home/$MYUSER/
singularity exec --fusemount "$EESSI_CONFIG" --fusemount "$EESSI_PILOT" $SINGULARITY_IMAGE cp /etc/cvmfs/keys/eessi-hpc.org/cvmfs-config.eessi-hpc.org.pub /home/$MYUSER/
# Now clean up the workspace
rm -r $SINGULARITY_WORKDIR
unset SINGULARITY_WORKDIR
fi
# Populate the alien cache (the connections to these can fail and may need to be restarted)
# (A note here: this is an expensive operation and puts a heavy load on the Stratum 0. From the developers:
# "With the -u <url> preload parameter, you can switch between stratum 0 and stratum 1 as necessary. I'd not
# necessarily use the stratum 1 for the initial snapshot though because the replication thrashes the stratum
# 1 cache. Instead, for preloading I'd recommend to establish a dedicated URL. This URL can initially be simply
# an alias to the stratum 0. As long as there are only a handful of preload destinations, that should work fine. If
# more sites preload, this URL can turn into a dedicated stratum 1 or a large cache in front of the stratum 0.
# ")
echo -e "${YELLOW}\nPopulating CVMFS alien cache\n${NC}"
singularity exec $SINGULARITY_IMAGE sh /home/$MYUSER/cvmfs_preload -u http://cvmfs-s0.eessi-hpc.org/cvmfs/cvmfs-config.eessi-hpc.org -r /shared_alien -k /home/$MYUSER/cvmfs-config.eessi-hpc.org.pub
# We use the dirTab file for the software repo to limit what we pull in
singularity exec $SINGULARITY_IMAGE sh /home/$MYUSER/cvmfs_preload -u http://cvmfs-s0.eessi-hpc.org/cvmfs/pilot.eessi-hpc.org -r /shared_alien -k /home/$MYUSER/pilot.eessi-hpc.org.pub -d /home/$MYUSER/dirTab.$STACK
# Now that we have a populated alien cache we can use it
export SINGULARITY_BIND="$SINGULARITY_CVMFS_ALIEN:/shared_alien,$LOCALSPACE:/local_alien,$SINGULARITY_HOMEDIR/default.local:/etc/cvmfs/default.local"
# Get a shell
echo -e "${YELLOW}\nTo get a shell inside a singularity container (for example), use:\n${NC}"
echo -e " export EESSI_CONFIG=\"$EESSI_CONFIG\""
echo -e " export EESSI_PILOT=\"$EESSI_PILOT\""
echo -e " export SINGULARITY_HOME=\"$SINGULARITY_HOME\""
echo -e " export SINGULARITY_BIND=\"$SINGULARITY_BIND\""
echo -e " export SINGULARITY_SCRATCH=\"/var/lib/cvmfs,/var/run/cvmfs\""
echo -e " singularity shell --fusemount \"\$EESSI_CONFIG\" --fusemount \"\$EESSI_PILOT\" $SINGULARITY_IMAGE"
There are multiple use cases in basically all layers where we would need to push new/modified files to /cvmfs. For instance, in case of updates of configuration files in the cvmfs-config repo, EESSI init scripts, files in the Prefix installation, etc. This is now being done manually, but we need to come up with a procedure/mechanism to push files into the repository automatically.
A suggestion from @boegel is to clone the Git repositories in a hidden directory, and make symlinks in the repositories that point to the corresponding files in that hidden dir.
If we need to patch some package (e.g. Python/glibc) in our compatibility (or software) layer, we need to figure out how to catch the changes on a build node and do the actual change on a publisher node. Modifying and adding files is simple, but the tricky part is removing files.
Ultimately, we can perhaps tar the entire new compatibility layer directory, and ingest that with the `-d` option to remove the old directory. We need to think about this and test what does (not) work.
This suddenly stopped working in December, due to an issue with building .deb packages:
/usr/local/bundle/gems/fpm-1.11.0/lib/fpm/package/deb.rb:495:in `block in output': uninitialized constant FPM::Package::Deb::Zlib (NameError)
from <internal:kernel>:90:in `tap'
from /usr/local/bundle/gems/fpm-1.11.0/lib/fpm/package/deb.rb:494:in `output'
from /usr/local/bundle/gems/fpm-1.11.0/lib/fpm/command.rb:487:in `execute'
from /usr/local/bundle/gems/clamp-1.0.1/lib/clamp/command.rb:68:in `run'
from /usr/local/bundle/gems/fpm-1.11.0/lib/fpm/command.rb:574:in `run'
from /usr/local/bundle/gems/clamp-1.0.1/lib/clamp/command.rb:133:in `run'
from /usr/local/bundle/gems/fpm-1.11.0/bin/fpm:7:in `<top (required)>'
from /usr/local/bundle/bin/fpm:23:in `load'
from /usr/local/bundle/bin/fpm:23:in `<main>'
We're using this Github Action:
https://github.com/bpicode/github-action-fpm
This wasn't changed, and it's still using the same fpm version. However, the base layer with Ruby was changed in December because of a new Ruby release (3.0.0), so I suspect this is related. Strangely, I now seem to have the issue too when doing some manual tests with an older Ruby container...
The same issue is discussed here:
jordansissel/fpm#1739
And there's a PR for fixing the issue in fpm:
jordansissel/fpm#1740
But that's still waiting to be merged...
Our playbook currently enables garbage collection (`CVMFS_GARBAGE_COLLECTION=true` + cronjobs), but also automatic tagging (`CVMFS_AUTO_TAG=true`). However, those tags never got cleaned up. I manually changed this for now by setting a retention period using `CVMFS_AUTO_TAG_TIMESPAN="2 weeks ago"`.
Especially for our production repo we should think about what kind of garbage collection (there's also a `CVMFS_AUTO_GC`) and automatic tagging (can also be disabled completely) we want.
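As a starting point for that discussion, the relevant server-side knobs look roughly like this (values are illustrative, not a decision):

```
CVMFS_GARBAGE_COLLECTION=true
CVMFS_AUTO_GC=true
CVMFS_AUTO_TAG=true
CVMFS_AUTO_TAG_TIMESPAN="2 weeks ago"
```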
For easy configuration of clients, see e.g.:
https://ecsft.cern.ch/dist/cvmfs/cvmfs-config/
Just like the `latest` symlink (see #51), we need some kind of automation/configuration management to make/update the (variant) symlink for `/cvmfs/pilot.eessi-hpc.org/host_injections`. I've created it manually for now using:
ln -s '$(EESSI_HOST_INJECTIONS:-/opt/eessi)' /cvmfs/pilot.eessi-hpc.org/host_injections
We have to come up with / agree on a naming scheme that we will use for our repository layout. Currently we only have:
cvmfs-config.eessi-hpc.org -> contains the config of all repositories and our eessi-hpc.org CVMFS "domain"
pilot.eessi-hpc.org -> pilot repository for playing around with CVMFS, EasyBuild, etc.
In the future we will need at least repositories for production and development/testing, so it could be something like (as suggested by @boegel):
prod.eessi-hpc.org (or perhaps soft(ware).eessi-hpc.org / repo.eessi-hpc.org)
dev.eessi-hpc.org
More ideas are welcome!
OS: CentOS 8.2.2004
Proxy port is only bound to IPv6.
netstat -nlp | grep 3128
tcp6 0 0 :::3128 :::* LISTEN 1603/(squid-1)
Can we bind to both IPv4 and IPv6?
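If the wildcard listen is the culprit, one option (a sketch; check the Squid documentation for the version in use) is to bind an explicit IPv4 address in `/etc/squid/squid.conf`:

```
# Listen explicitly on IPv4 (in addition to, or instead of, the default)
http_port 0.0.0.0:3128
```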
With people now moving to CentOS alternatives, config managers are slow to catch up.
Fresh example: Rocky 8.4 only offers centos-release-ansible-29, and this version of Ansible doesn't yet recognize Rocky as a Red Hat family distro. I'm told that Ansible 2.11 or newer is required for this, which is worth a sentence or two in our current docs until this works out of the box.
This makes it much easier to get the latest version in, for instance, scripts / container definition files; now we have to change the URL every time we update the packages. Also, updating clients will be easier (by using their package managers).
The current setup only allows you to specify proxy groups with one entry per group, by using semicolon separated entries for the proxy list. By using pipes, you can also define multiple proxies per group; this will then act as a load-balance group, see:
https://cvmfs.readthedocs.io/en/2.2/cpt-configure.html#proxy-lists
Do we want/need this? Only seems to make sense if you have more than two proxies per site, e.g. two clusters with two proxies per cluster.
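For example, with hypothetical proxy names, two proxies in one load-balance group plus a fallback group would look like:

```
# proxy1/proxy2 are load-balanced; backup-proxy is only tried if both fail
CVMFS_HTTP_PROXY="http://proxy1.example.org:3128|http://proxy2.example.org:3128;http://backup-proxy.example.org:3128"
```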
README.md mentions this curl to verify operation of stratum1:
curl --proxy http://url-to-your-proxy:3128 --head http://url-to-your-stratum1/cvmfs/cvmfs-config.eessi-hpc.org/.cvmfspublished
Since cvmfs-config is not in use anymore, this returns a 404 and causes some initial confusion as to whether things are working correctly or not.
Please update this line to point to one of the current EESSI CVMFS repositories (ci or pilot).
While deploying a Stratum 1 system I learned that its data actually ends up in /srv.
`roles/galaxyproject.cvmfs/defaults/main.yml` actually offers to handle `cvmfs_srv_device` and `cvmfs_srv_mount`, but it's commented out by default. It would be nice if a sentence in the docs points to that setting, so one can properly set it up before the first Ansible run fills up the root partition ;)
When trying to run the EESSI container with
mkdir -p $TMPDIR/{var-lib-cvmfs,var-run-cvmfs,home}
export SINGULARITY_BIND="$TMPDIR/var-run-cvmfs:/var/run/cvmfs,$TMPDIR/var-lib-cvmfs:/var/lib/cvmfs"
export SINGULARITY_HOME="$TMPDIR/home:/home/$USER"
export EESSI_CONFIG="container:cvmfs2 cvmfs-config.eessi-hpc.org /cvmfs/cvmfs-config.eessi-hpc.org"
export EESSI_PILOT="container:cvmfs2 pilot.eessi-hpc.org /cvmfs/pilot.eessi-hpc.org"
singularity shell --fusemount "$EESSI_CONFIG" --fusemount "$EESSI_PILOT" docker://eessi/client-pilot:centos7-$(uname -m)-2020.10
I ran into this:
CernVM-FS: loading Fuse module... Failed to initialize shared lru cache (12 - quota init failure)
It seems to suggest a quota issue, but my quota seems fine... After clearing out my `$TMPDIR`, the container starts fine. I think some previous versions of `var-lib-cvmfs`, `var-run-cvmfs`, and `home` were left in the `$TMPDIR` from a previous mount.
I figured I'd just share the issue - and solution - here in case others run into it. We could consider adding a line in the instructions that clears those dirs from `$TMPDIR` before creating new ones...
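Such a line in the instructions could look like this (sketched against a scratch `TMPDIR` so the example is safe to run as-is; in the real instructions it would be the user's `$TMPDIR`):

```shell
TMPDIR=$(mktemp -d)   # stand-in for the user's actual TMPDIR
# Clear any stale state left by a previous mount, then recreate the dirs
rm -rf "$TMPDIR/var-lib-cvmfs" "$TMPDIR/var-run-cvmfs" "$TMPDIR/home"
mkdir -p "$TMPDIR/var-lib-cvmfs" "$TMPDIR/var-run-cvmfs" "$TMPDIR/home"
```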
Hi,
I am opening this issue as requested by @boegel just to keep track of this doubt for others and maybe for the EESSI docs.
I was not sure how setuid binaries are handled by cvmfs and checking the cvmfs docs I found this:
CVMFS_SUID If set to yes, enable suid magic on the mounted repository. Requires mounting as root.
According to the official docs, setuid binaries in cvmfs are disabled by default. To enable setuid you need to use `CVMFS_SUID=yes` and mount the cvmfs file system as root.
This can be done by using the following "trick" (thanks @rptaylor!) in the local configuration:
CVMFS_SERVER_URL="http://your-local-s1;$CVMFS_SERVER_URL"
Running the playbook without specifying a user:
Error:
[root@stratum filesystem-layer]# ansible-playbook -i hosts -b -K stratum1.yml
BECOME password:
PLAY [CVMFS Stratum 1] *************************************************************************************************************************
TASK [Gathering Facts] *************************************************************************************************************************
fatal: [stratum.home.local]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: [email protected]: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).", "unreachable": true}
PLAY RECAP *************************************************************************************************************************************
stratum.home.local : ok=0 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0
Running the playbook with `-u`:
[root@stratum filesystem-layer]# ansible-playbook -i hosts -b -K stratum1.yml -u dell --ask-pass
SSH password:
BECOME password[defaults to SSH password]:
PLAY [CVMFS Stratum 1] *************************************************************************************************************************
TASK [Gathering Facts] *************************************************************************************************************************
ok: [stratum.home.local]
TASK [geerlingguy.repo-epel : Check if EPEL repo is already configured.] ***********************************************************************
ok: [stratum.home.local]
TASK [geerlingguy.repo-epel : Install EPEL repo.] **********************************************************************************************
skipping: [stratum.home.local]
TASK [geerlingguy.repo-epel : Import EPEL GPG key.] ********************************************************************************************
skipping: [stratum.home.local]
TASK [geerlingguy.repo-epel : Disable Main EPEL repo.] *****************************************************************************************
skipping: [stratum.home.local]
TASK [cvmfs : Set OS-specific variables] *******************************************************************************************************
ok: [stratum.home.local]
TASK [cvmfs : Set facts for Galaxy CVMFS config repository, if enabled] ************************************************************************
ok: [stratum.home.local]
TASK [cvmfs : Set facts for Galaxy CVMFS static repositories, if enabled] **********************************************************************
skipping: [stratum.home.local]
TASK [cvmfs : Set facts for CVMFS config repository, if enabled] *******************************************************************************
ok: [stratum.home.local]
TASK [cvmfs : include_tasks] *******************************************************************************************************************
skipping: [stratum.home.local]
TASK [cvmfs : include_tasks] *******************************************************************************************************************
included: /root/Downloads/filesystem-layer/roles/cvmfs/tasks/stratum1.yml for stratum.home.local
TASK [cvmfs : Include initial OS-specific tasks] ***********************************************************************************************
included: /root/Downloads/filesystem-layer/roles/cvmfs/tasks/init_redhat.yml for stratum.home.local
TASK [cvmfs : Configure CernVM yum repositories] ***********************************************************************************************
ok: [stratum.home.local] => (item={'name': 'cernvm', 'description': 'CernVM packages', 'baseurl': 'http://cvmrepo.web.cern.ch/cvmrepo/yum/cvmfs/EL/$releasever/$basearch/'})
ok: [stratum.home.local] => (item={'name': 'cernvm-config', 'description': 'CernVM-FS extra config packages', 'baseurl': 'http://cvmrepo.web.cern.ch/cvmrepo/yum/cvmfs-config/EL/$releasever/$basearch/'})
TASK [cvmfs : Install CernVM yum key] **********************************************************************************************************
ok: [stratum.home.local]
TASK [cvmfs : Install CernVM-FS packages and dependencies (yum)] *******************************************************************************
fatal: [stratum.home.local]: FAILED! => {"changed": false, "failures": ["No package mod_wsgi available."], "msg": "Failed to install some of the specified packages", "rc": 1, "results": []}
PLAY RECAP *************************************************************************************************************************************
stratum.home.local : ok=9 changed=0 unreachable=0 failed=1 skipped=5 rescued=0 ignored=0
Running the playbook with `-vvv` gives more info around the modules, but which module do we need installed before we run the playbook?
fatal: [stratum.home.local]: FAILED! => {
"changed": false,
"failures": [
"No package mod_wsgi available."
],
"invocation": {
"module_args": {
"allow_downgrade": false,
"autoremove": false,
"bugfix": false,
"conf_file": null,
"disable_excludes": null,
"disable_gpg_check": false,
"disable_plugin": [],
"disablerepo": [],
"download_dir": null,
"download_only": false,
"enable_plugin": [],
"enablerepo": [],
"exclude": [],
"install_repoquery": true,
"install_weak_deps": true,
"installroot": "/",
"list": null,
"lock_timeout": 30,
"name": [
"httpd",
"mod_wsgi",
"squid",
"cvmfs-server",
"cvmfs-config-default"
],
"releasever": null,
"security": false,
"skip_broken": false,
"state": "present",
"update_cache": false,
"update_only": false,
"validate_certs": true
}
},
"msg": "Failed to install some of the specified packages",
"rc": 1,
"results": []
}
Any info is welcome
The test script for WSGI works fine; which module do we need here?
For testing CVMFS setups in the pilot, it may be a good idea for the docker images to offer three modes for setting up CVMFS:
Selecting one of these modes via an environment variable or similar would allow us to test our CVMFS setup with (relative) ease.
We should optimize our settings at some points. One obvious thing to start with, is making the catalog files:
https://cvmfs.readthedocs.io/en/stable/cpt-repo.html#maintaining-a-cernvm-fs-repository
The Ansible role already seems to support this, so we should try/test it.
During the monthly CVMFS coordination meeting it was mentioned that you can make a client package (without using a cvmfs-config repo) that still allows you to easily add more repos, as long as they are under the same domain and share the masterkey.
The documentation (see https://cvmfs.readthedocs.io/en/stable/cpt-repo.html#master-keys) also says something about this:
Each cvmfs repository uses two sets of keys, one for the individual repository and another called
the “masterkey” which signs the repository key. The pub key that corresponds to the masterkey
is what needs to be distributed to clients to verify the authenticity of the repository.
It is usually most convenient to share the masterkey between all repositories in a domain
so new repositories can be added without updating the client configurations.
We should change this for our repos, as they now have their own masterkeys.
mkdir -p /Users/Shared/cvmfs/{cvmfs-config,pilot}.eessi-hpc.org
curl -o ~/Downloads/cvmfs-2.7.5.pkg https://ecsft.cern.ch/dist/cvmfs/cvmfs-2.7.5/cvmfs-2.7.5.pkg ; installer -pkg ~/Downloads/cvmfs-2.7.5.pkg -target /
Note, installing the CVMFS package creates `/cvmfs` as a symlink to `/Users/Shared/cvmfs` by utilising `/etc/synthetic.conf`. If you want to have a macOS build node and don't want to install CVMFS, you can do this yourself: https://derflounder.wordpress.com/2020/01/18/creating-root-level-directories-and-symbolic-links-on-macos-catalina/
But, please note that after editing `/etc/synthetic.conf` you have to restart for the changes to take effect, so...
curl -o ~/Downloads/cvmfs-config-eessi-0.2.3.pkg https://github.com/EESSI/filesystem-layer/releases/download/v0.2.3/cvmfs-config-eessi-0.2.3.pkg ; installer -pkg ~/Downloads/cvmfs-config-eessi-0.2.3.pkg -target /
sudo mount -t cvmfs cvmfs-config.eessi-hpc.org /Users/Shared/cvmfs/cvmfs-config.eessi-hpc.org
sudo mount -t cvmfs pilot.eessi-hpc.org /Users/Shared/cvmfs/pilot.eessi-hpc.org
Can we create an EESSI client repository for client installations for Linux, macOS, and WSL? So people can test WSL or macOS (which are not part of the current pilot).
We should consider adding the necessary EESSI configuration files to https://github.com/cvmfs-contrib/config-repo (as discussed during today's CernVM-FS coordination meeting).
They also provide a yum repository, see https://cvmfs-contrib.github.io/
cc @bedroge
The contents of our `cvmfs-config-eessi-` packages are good to have, but some people (like on Mac) may want to create the config themselves. Or do we want to offer a pkg?
What are the requirements for installing the Stratum 0/1? Is it only Ansible? We also need an OpenSSH server and client on the server. Do we need more?
I got some ssh errors when I was running "ansible-playbook -i hosts -b -K stratum1.yml".
We already have CI for building the client packages (deb and rpm), but we also need something that actually tests them, to prevent silly mistakes such as #47.
If cvmfs_geo_license_key is not set, the add-replica command on the stratum1 server fails, because CVMFS_GEO_LICENSE_KEY is not set. This shouldn't be a required option, though.
By using the following command, I got a Permission denied message:
$ git clone --recursive [email protected]:EESSI/cvmfs-layer.git
gives the following message:
Permission denied (publickey).
fatal: Could not read from remote repository.
$ git clone https://github.com/EESSI/cvmfs-layer.git
Works fine
The CVMFS documentation recommends having a DNS cache on the Stratum 1, using `dnsmasq` or `bind`:
https://cvmfs.readthedocs.io/en/stable/cpt-replica.html#recommended-setup
The Ansible role that we have doesn't include this at the moment.
We had an issue with one of our Stratum 1 servers this week, which caused it to serve an older tag of the repository. This made me realize again that we should think about setting up some monitoring dashboard that gives an overview and statistics of our infrastructure, sends out alerts when something is wrong, etc. One way to easily grab some information about Stratum 1s is by reading out the `.cvmfspublished` file (and maybe the `.cvmfs_last_snapshot` too); the structure of that file is explained here in the docs.
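A minimal sketch of what such monitoring could do: take the header of `.cvmfspublished` (one-letter-prefixed lines, terminated by a `--` line before the signature) and pull out a few fields. Field letters are per the CVMFS docs (N = repository name, S = revision, T = publish timestamp); the sample data below is made up, and in practice the file would come from something like `curl -s http://<stratum1>/cvmfs/pilot.eessi-hpc.org/.cvmfspublished`:

```shell
# Made-up sample of the header portion of a .cvmfspublished file
published="Cabc123
Npilot.eessi-hpc.org
S42
T1609459200
--
<signature>"
# Keep only the lines before the "--" separator, then extract fields
header=$(printf '%s\n' "$published" | sed '/^--$/q' | sed '$d')
name=$(printf '%s\n' "$header" | sed -n 's/^N//p')
revision=$(printf '%s\n' "$header" | sed -n 's/^S//p')
echo "$name revision $revision"
```

Comparing the revision (and publish timestamp) across Stratum 1s against the Stratum 0 would catch exactly the stale-tag situation described above.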
When I run $ ansible-playbook -i hosts -b -K stratum1.yml it asks me to enter the password for BECOME:
BECOME password:
I don't see any information about this account, or did I miss something?
Running on a WSL client on Windows 10, the following error pops up:
ERROR! no action detected in task
The error appears to have been in '/home/dell/filesystem-layer/roles/cvmfs/tasks/main.yml': line 25, column 3, but may
be elsewhere in the file depending on the exact syntax problem.
The offending line appears to be:
Problem: Ansible on WSL with Ubuntu 16.04 has version 2.0.0.2 installed. Upgrading Ansible to 2.9 solves the problem.
Is it possible to display a message if we find the wrong version of ansible is installed on the system?
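One way to do that (a sketch using Ansible's built-in `assert` module; the minimum version shown matches this issue) would be an early task in the playbook:

```yaml
- name: Fail early on Ansible versions that cannot handle our roles
  assert:
    that: ansible_version.full is version('2.9', '>=')
    fail_msg: "This playbook requires Ansible >= 2.9 (found {{ ansible_version.full }})"
```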
The workflow does have:
release:
types: [published]
But this doesn't seem to work. Probably caused by:
When you use the repository's GITHUB_TOKEN to perform tasks on behalf of the GitHub Actions app, events triggered by the GITHUB_TOKEN will not create a new workflow run. This prevents you from accidentally creating recursive workflow runs.
We need to find some other way to automatically build new containers after the release (and, hence, the new client packages) has been published.
Explain how to use the new client package, e.g. the `CVMFS_CLIENT_PROFILE` parameter (see https://cvmfs.readthedocs.io/en/stable/cpt-quickstart.html#create-default-local).