
databio / bulker


Manager for multi-container computing environments

Home Page: https://bulker.io

License: BSD 2-Clause "Simplified" License

Python 34.56% Shell 3.27% Jupyter Notebook 62.17%
containers docker environments reproducible-research

bulker's People

Contributors

donaldcampbelljr, jpsmith5, mr-c, nsheff, stolarczyk

Forkers

mr-c, lwaldron

bulker's Issues

bulker shell

I want a bulker shell command that starts a shell in the container that is used to execute a given command.

pull access denied for databio/bioconductor

After fixing #59 the first bioconductor image (tag RELEASE_3_11) pull succeeded, but the second one did not (tag RELEASE_3_10):

Error response from daemon: pull access denied for databio/bioconductor, repository does not exist or may require 'docker login': denied: requested access to the resource is denied

Is that a misconfiguration on my end?

Full log:

[mstolarczyk@MichalsMBP bulker](master): bulker load databio/lab -r -b
Using default config. No config found in env var: BULKERCFG
Bulker config: /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/bulker/templates/bulker_config.yaml
Building images with template: /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/bulker/templates/docker_build.jinja2
Removing all executables in: /Users/mstolarczyk/bulker_crates/databio/lab/default
Importing crate 'bulker/biobase:default' from '/Users/mstolarczyk/bulker_crates/bulker/biobase/default'.
RELEASE_3_11: Pulling from bioconductor/bioconductor_docker
Digest: sha256:ed70db2dd7746fd3e3edcc5d36e0c24860da0fb7f1270d7f7174259380ef7bcc
Status: Image is up to date for bioconductor/bioconductor_docker:RELEASE_3_11
docker.io/bioconductor/bioconductor_docker:RELEASE_3_11
RELEASE_3_11: Pulling from bioconductor/bioconductor_docker
Digest: sha256:ed70db2dd7746fd3e3edcc5d36e0c24860da0fb7f1270d7f7174259380ef7bcc
Status: Image is up to date for bioconductor/bioconductor_docker:RELEASE_3_11
docker.io/bioconductor/bioconductor_docker:RELEASE_3_11
devel: Pulling from bioconductor/bioconductor_docker
d7c3167c320d: Pull complete 
131f805ec7fd: Pull complete 
322ed380e680: Pull complete 
6ac240b13098: Pull complete 
eb2c19c1373e: Pull complete 
7042df304f43: Pull complete 
0583bf9246de: Pull complete 
86dd0569b405: Pull complete 
edd2edc46676: Pull complete 
c02ecc6e905e: Pull complete 
882bc200c80a: Pull complete 
e8fe4cdbc8da: Pull complete 
ec39410c0fa9: Pull complete 
c7b4ee562600: Pull complete 
c9d215645fe9: Pull complete 
1dfdea28a902: Pull complete 
c141dbed6571: Pull complete 
396bb545830a: Pull complete 
ce090f3318e2: Pull complete 
3f5a1eecc27f: Pull complete 
972a3aacd627: Pull complete 
Digest: sha256:7fe07b0d37c88e084d0be659b0d6004785af28ab94945b53ec23a08ee5ed727a
Status: Downloaded newer image for bioconductor/bioconductor_docker:devel
docker.io/bioconductor/bioconductor_docker:devel
devel: Pulling from bioconductor/bioconductor_docker
Digest: sha256:7fe07b0d37c88e084d0be659b0d6004785af28ab94945b53ec23a08ee5ed727a
Status: Image is up to date for bioconductor/bioconductor_docker:devel
docker.io/bioconductor/bioconductor_docker:devel
RELEASE_3_10: Pulling from bioconductor/bioconductor_docker
7e2b2a5af8f6: Pull complete 
59c89b5f9b0c: Pull complete 
4017849f9f85: Pull complete 
c8b29d62979a: Pull complete 
12004028a6a7: Pull complete 
3f09b9a53dfb: Pull complete 
03ed58116b0c: Pull complete 
4ad402035056: Pull complete 
73305bb1cbbd: Pull complete 
4e3e3edc9ebe: Pull complete 
744a3c8e4286: Pull complete 
d23a28eea9b1: Pull complete 
a98e3f22b3ee: Pull complete 
5e31bac06346: Pull complete 
9c51ede5bd62: Pull complete 
db3b81badc0d: Pull complete 
2083fa2f9393: Pull complete 
Digest: sha256:dbe121d95a31eaa88a493ae66aa4035a236b2e37c69341ae74acb405e126f0c6
Status: Downloaded newer image for bioconductor/bioconductor_docker:RELEASE_3_10
docker.io/bioconductor/bioconductor_docker:RELEASE_3_10
RELEASE_3_10: Pulling from bioconductor/bioconductor_docker
Digest: sha256:dbe121d95a31eaa88a493ae66aa4035a236b2e37c69341ae74acb405e126f0c6
Status: Image is up to date for bioconductor/bioconductor_docker:RELEASE_3_10
docker.io/bioconductor/bioconductor_docker:RELEASE_3_10
Using default tag: latest
Error response from daemon: pull access denied for databio/bioconductor, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
------ Error building. Build script used: ------
#!/bin/sh

docker pull databio/bioconductor
------------------------------------------------
Loading manifest: 'databio/lab:default'. Activate with 'bulker activate databio/lab:default'.
Commands available: R, Rscript, Rd, Rscriptd, R3.6, Rscript3.6, RC

display commands in alphabetical order

bulker should display commands in alphabetical order

databio/lab|~$ bulker inspect
Bulker config: /project/shefflab/rivanna_config/bulker.yaml
Bulker manifest: databio/lab
Crate path: /project/shefflab/bulker/bulker_crates/databio/lab/default
Available commands: ['bedIntersect', 'timeout', 'pigz', 'split', 'ln', 'hisat2', 'expand', 'cufflinks', 'bigWigCat', 'seqtk', 'tr', 'dircolors', 'mkfifo', 'wc', 'rm', 'logname', 'cellranger', 'curl', 'expr', 'yes', 'hostname', 'chmod', 'sambamba', 'comm', 'sha512sum', 'macs2', 'Rd', 'install', 'md5sum', 'od', 'Rscript', 'cut', 'nice', 'mv', 'ascp', 'sed', 'tsort', 'prefetch', 'aws', 'sha224sum', 'id', 'pathchk', 'mashmap', 'paste', 'false', 'bash', 'which', 'rmdir', 'whoami', 'tac', 'picard', 'tail', 'singularity', 'csplit', 'bedItemOverlapCount', 'base32', 'R', 'sum', 'khmer', 'bedGraphToBigWig', 'dirname', 'sha256sum', 'pr', 'mkdir', 'dd', 'unexpand', 'awk', 'wigToBigWig', 'join', 'realpath', 'gt', 'seq', 'trimmomatic', 'R3.6', 'ptx', 'vep', 'Rscriptd', 'uptime', 'vdb-config', 'sh', 'base64', 'Rscript3.6', 'who', 'tee', 'refgenie', 'shred', 'rg', 'head', 'true', 'tabix', 'cp', 'stty', 'du', 'cat', 'fasterq-dump', 'cksum', 'bowtie2', 'bedClip', 'chown', 'repeatmasker', 'sync', 'bedtools', 'salmon', 'bedCommonRegions', 'bismark', 'faSplit', 'bedPileUps', 'cutadapt', 'liftOver', 'arch', 'bedToBigBed', 'blast', 'sort', 'bowtie', 'dir', 'gatk', 'date', 'env', 'fastqc', 'mknod', 'sleep', 'bigWigAverageOverBed', 'chgrp', 'factor', 'pwd', 'sha1sum', 'stat', 'ls', 'test', 'fmt', 'link', 'vdir', 'echo', 'basename', 'numfmt', 'shuf', 'nl', 'b2sum', 'hostid', 'touch', 'trim_galore', 'uniq', 'homer', 'samblaster', 'STAR', 'tty', 'samtools', 'nproc', 'bedops', 'unlink', 'groups', 'seqkit', 'bissnp', 'grep', 'nohup', 'bsmap', 'printenv', 'skewer', 'bamtools', 'printf', 'bigWigSummary', 'kallisto', 'runcon', 'stdbuf', 'bwa', 'sha384sum', 'fold', 'fastq-dump', 'df', 'uname', 'RC', 'pinky', 'truncate', 'segway', 'users']
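
A minimal sketch of the requested ordering, assuming the command names are available as a plain list of strings. The helper name sort_commands is hypothetical and is not part of bulker; the sort ignores case so that names like R and STAR do not all land before the lowercase ones.

```python
def sort_commands(commands):
    """Return command names sorted alphabetically, ignoring case."""
    # The original spelling is used as a tie-breaker so the order is
    # deterministic when two names differ only by case.
    return sorted(commands, key=lambda name: (name.lower(), name))


if __name__ == "__main__":
    # A few names taken from the inspect output above.
    commands = ["bedIntersect", "timeout", "pigz", "STAR", "R", "Rd", "awk"]
    print("Available commands: " + ", ".join(sort_commands(commands)))
```

Joining the sorted names with commas, rather than printing the Python list, would also match the format that bulker load already uses for its "Commands available" line.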

bulker overrides default shell

For example, zsh is the default shell in the newest OSX update, so you a) get the following message, and b) lose your zsh settings such as custom prompt in favor of bulker-3.2$. I also noticed the loss of custom shell settings before the zsh migration.

10:37:58 wallabe:~/git > echo $SHELL
/bin/zsh
10:37:59 wallabe:~/git > bulker activate demo   
Bulker config: /Users/lwaldron/bulker_config.yaml
Activating bulker crate: demo


The default interactive shell is now zsh.
To update your account to use zsh, please run `chsh -s /bin/zsh`.
For more details, please visit https://support.apple.com/kb/HT208050.
bulker-3.2$ 

container-specific exclusions

Right now you can list variables and mount points to pass to all containers.

You can also use tool-specific args to pass a variable or mount point to a single container. (#7)

But what I need to do is pass to all but one -- so I need tool-specific exclusions.

Use case: on rivanna, you have to mount /ext in order to use nameservers, like -B /ext:/ext with singularity. Otherwise, for example, prefetch doesn't work. So I put /ext into the global mounts in my bulker config. But this breaks the bioconductor R image, which for some reason cannot have the host /ext mounted. So I'd like /ext to be mounted always, except for the R image.

Right now I just load it for all, but after every bulker load I have to manually delete that line from the R containerized executable script, which is a pain.
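
One way the requested exclusion could work, sketched under assumptions: the keys exclude_volumes and volumes inside a per-tool argument block are hypothetical and do not exist in bulker's config as far as this issue shows. The function merges the global mount list with one tool's additions and removes that tool's exclusions.

```python
def volumes_for_tool(global_volumes, tool_args=None):
    """Combine global mounts with one tool's additions and exclusions.

    'exclude_volumes' and 'volumes' are hypothetical per-tool keys used
    only for this illustration.
    """
    tool_args = tool_args or {}
    excluded = set(tool_args.get("exclude_volumes", []))
    # Keep every global mount that this tool has not opted out of.
    kept = [v for v in global_volumes if v not in excluded]
    # Append tool-specific mounts, skipping duplicates.
    for volume in tool_args.get("volumes", []):
        if volume not in kept:
            kept.append(volume)
    return kept


if __name__ == "__main__":
    global_volumes = ["/ext", "/home"]
    # The R image opts out of /ext; every other tool keeps it.
    print(volumes_for_tool(global_volumes, {"exclude_volumes": ["/ext"]}))
    print(volumes_for_tool(global_volumes))
```

With something like this applied when the executable scripts are rendered, the manual edit after every bulker load would no longer be needed.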

Bulker inspect currently active crate

It would be nice to have a way to notify the user of what crate is currently active. This could be done by loading an environment variable upon activation, and reading that environment variable upon check.

Also, a way to show which commands are provided by the currently active crate.

What if multiple crates are active? So the environment variable should be an ordered list of activated crates.
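
A rough sketch of the environment-variable idea, assuming a comma-separated, ordered list of crate names. The variable name BULKER_ACTIVE_CRATES and both function names are made up for illustration; bulker may use a different mechanism.

```python
import os

ENV_VAR = "BULKER_ACTIVE_CRATES"  # hypothetical variable name


def active_crates(environ=os.environ):
    """Return the activated crates in activation order (oldest first)."""
    value = environ.get(ENV_VAR, "")
    return [crate for crate in value.split(",") if crate]


def activate(crate, environ=os.environ):
    """Record a newly activated crate at the end of the ordered list."""
    crates = active_crates(environ)
    crates.append(crate)
    environ[ENV_VAR] = ",".join(crates)


if __name__ == "__main__":
    env = {}  # a plain dict stands in for the real environment here
    activate("bulker/demo", env)
    activate("databio/lab", env)
    print("Active crates:", active_crates(env))
```

An inspect-style command could then read the same variable and list the commands found in each crate folder, in that order.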

import domain for cascading manifests

I want to create a manifest that loads 2 other manifests.

I know I can already run bulker activate manifest1,manifest2, but cascading manifests would make distribution easier.

maybe add to manifest config:

manifest:
  from: namespace/crate
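
To make the cascading idea concrete, here is a small sketch of how a from entry might be resolved. It assumes each manifest is a dict with a manifest key holding an optional from entry (a string or a list) and a commands list of dicts with a command field; the loader callable and the function name resolve_commands are hypothetical. Imported manifests are applied first, so the importing manifest can override a command.

```python
def resolve_commands(name, load_manifest, _stack=()):
    """Collect commands for a manifest, applying its 'from' imports first."""
    if name in _stack:
        raise ValueError("circular import: " + " -> ".join(_stack + (name,)))
    manifest = load_manifest(name)["manifest"]
    parents = manifest.get("from", [])
    if isinstance(parents, str):
        parents = [parents]  # allow a single 'namespace/crate' string
    commands = {}
    for parent in parents:
        commands.update(resolve_commands(parent, load_manifest, _stack + (name,)))
    # Commands defined locally override imported ones of the same name.
    for entry in manifest.get("commands", []):
        commands[entry["command"]] = entry
    return commands


if __name__ == "__main__":
    # Two made-up manifests held in memory instead of YAML files.
    manifests = {
        "ns/base": {"manifest": {"commands": [
            {"command": "samtools", "docker_image": "example/samtools:1"},
            {"command": "pigz", "docker_image": "example/pigz:1"}]}},
        "ns/top": {"manifest": {"from": "ns/base", "commands": [
            {"command": "samtools", "docker_image": "example/samtools:2"}]}},
    }
    resolved = resolve_commands("ns/top", manifests.__getitem__)
    for command in sorted(resolved):
        print(command, resolved[command]["docker_image"])
```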

--workdir="`pwd`" is ignored when invoking rstudio-server

When invoking rstudio-server in waldronlab/bioconductor.yaml, the rstudio session starts in the directory /home/rstudio, meaning that --workdir="`pwd`" is ignored. This is inconvenient because the /home/rstudio directory is empty, and things like the host ${HOME}/.Renviron are ignored, and you have to change directories manually e.g. to /Users/lwaldron.

Since rstudio-server is not a command-line operation, one solution is for me to hard-code dockerargs: "-v $HOME:/home/rstudio" into the waldronlab/bioconductor.yaml manifest entry for rstudio-server. This works, even though my home directory appears under /home/rstudio. This setting is specific to the rstudio-server command, so it doesn't seem appropriate to put in bulker_config.yaml. What do you think @nsheff?

KeyError: 'singularity_fullpath'

might be related to #53

[mstolarczyk@MichalsMBP ~]: bulker load -r -b databio/lab
Using default config. No config found in env var: BULKERCFG
Bulker config: /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/bulker/templates/bulker_config.yaml
Building images with template: /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/bulker/templates/docker_build.jinja2
Removing all executables in: /Users/mstolarczyk/bulker_crates/databio/lab/default
Importing crate 'bulker/biobase:default' from '/Users/mstolarczyk/bulker_crates/bulker/biobase/default'.
RELEASE_3_11: Pulling from bioconductor/bioconductor_docker
23884877105a: Already exists 
bc38caa0f5b9: Already exists 
2910811b6c42: Already exists 
36505266dcc6: Already exists 
d21d194790d1: Pull complete 
3821ac91f817: Pull complete 
3334418e8171: Pull complete 
63380217dd8a: Pull complete 
1319b4ef3651: Pull complete 
94b33093acf7: Pull complete 
4d08753fb4b6: Pull complete 
ee9dbe765945: Pull complete 
4ed5d6c602cd: Pull complete 
7abf5c9d7995: Pull complete 
9a594bb377a3: Pull complete 
49dc1b5126c8: Pull complete 
ef45bfc7da05: Pull complete 
57e43e3bc013: Pull complete 
ff6b3c64cfd9: Pull complete 
Digest: sha256:ed70db2dd7746fd3e3edcc5d36e0c24860da0fb7f1270d7f7174259380ef7bcc
Status: Downloaded newer image for bioconductor/bioconductor_docker:RELEASE_3_11
docker.io/bioconductor/bioconductor_docker:RELEASE_3_11
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/attmap/ordattmap.py", line 45, in __getitem__
    return super(OrdAttMap, self).__getitem__(item)
KeyError: 'singularity_fullpath'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/bin/bulker", line 10, in <module>
    sys.exit(main())
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/bulker/bulker.py", line 870, in main
    force=args.force)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/bulker/bulker.py", line 426, in bulker_load
    _LOGGER.info("Container available at: {cmd}".format(cmd=pkg["singularity_fullpath"]))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/attmap/pathex_attmap.py", line 51, in __getitem__
    v = super(PathExAttMap, self).__getitem__(item)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/attmap/ordattmap.py", line 47, in __getitem__
    return AttMap.__getitem__(self, item)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/attmap/attmap.py", line 32, in __getitem__
    return self.__dict__[item]
KeyError: 'singularity_fullpath'

I just installed bulker v0.5.0 and tried to load a crate using the config that shipped with bulker. Commenting out the line that shows up in the traceback helped:

_LOGGER.info("Container available at: {cmd}".format(cmd=pkg["singularity_fullpath"]))
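
Rather than commenting the line out, the message could be guarded. The sketch below is only an illustration and not the actual fix in bulker: it assumes pkg behaves like a mapping and that the engine name is available as a string, and the helper name container_location_message is hypothetical.

```python
def container_location_message(pkg, container_engine):
    """Return the log message for a pulled image, or None if not applicable."""
    # Only singularity builds produce an image file path to report; with
    # docker the key is absent, which is what raised the KeyError above.
    if container_engine == "singularity" and "singularity_fullpath" in pkg:
        return "Container available at: {cmd}".format(
            cmd=pkg["singularity_fullpath"])
    return None


if __name__ == "__main__":
    docker_pkg = {"docker_image": "bioconductor/bioconductor_docker"}
    print(container_location_message(docker_pkg, "docker"))
    sing_pkg = {"singularity_fullpath": "/home/user/simages/databio/rpipe"}
    print(container_location_message(sing_pkg, "singularity"))
```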

Running git in bulker

Can you tell me why git isn't working within my waldronlab/levi manifest? First example below is with native git, then with container git. Using the new Bulker 0.3.0.

Levis-Air:hub.bulker.io lw391$ bulker --version
bulker 0.3.0
Levis-Air:hub.bulker.io lw391$ git pull
Already up-to-date.
Levis-Air:hub.bulker.io lw391$ ls .git
COMMIT_EDITMSG	HEAD		config		hooks		info		objects		refs
FETCH_HEAD	ORIG_HEAD	description	index		logs		packed-refs
Levis-Air:hub.bulker.io lw391$ bulker activate levi
Bulker config: /Users/lw391/bulker_config.yaml
Activating bulker crate: levi

bulker-3.2$ git pull
fatal: Not a git repository (or any of the parent directories): .git
bulker-3.2$ ls .git
COMMIT_EDITMSG	HEAD		config		hooks		info		objects		refs
FETCH_HEAD	ORIG_HEAD	description	index		logs		packed-refs
bulker-3.2$ cat ~/bulker_config.yaml
bulker:
  volumes: ['/home']
  envvars: ['DISPLAY']
  default_crate_folder: ${HOME}/bulker_crates
  singularity_image_folder: ${HOME}/simages
  container_engine: docker
  default_namespace: bulker
  executable_template: /Users/lw391/Library/Python/2.7/lib/python/site-packages/bulker/templates/docker_executable.jinja2
  build_template: /Users/lw391/Library/Python/2.7/lib/python/site-packages/bulker/templates/docker_build.jinja2
  crates:
    bulker:
      demo:
        default: /Users/lw391/bulker_crates/bulker/demo/default
      bioconductor:
        default: /Users/lw391/bulker_crates/bulker/bioconductor/default
        docker_args: --volume=${HOME}/.local/lib/R:/usr/local/lib/R/host-site-library
      devel:
        docker_args: --volume=${HOME}/.local/lib/Rdev:/usr/local/lib/R/host-site-library
      levi:
        default: /Users/lw391/bulker_crates/bulker/levi/default

Loading/activating PEPATAC crate_Errors

Hi Nathan,
I'm working on Sherlock (the Stanford cluster).
I'm trying to load the PEPATAC crate using the "bulker load databio/pepatac -b -r" command. During the load, I got this output:
'INFO: Creating SIF file...
FATAL: While making image from oci registry: while building SIF from layers: while creating squashfs: create command failed: signal: killed:
mv: cannot stat ‘rpipe’: No such file or directory
------ Error building. Build script used: ------
#!/bin/sh

if [ ! -f "/home/users/karim90/simages/databio/rpipe" ]; then
singularity pull rpipe docker://databio/rpipe
mv rpipe /home/users/karim90/simages/databio/rpipe
fi

Container available at: /home/users/karim90/simages/databio/rpipe
Container available at: /home/users/karim90/simages/quay.io/biocontainers/samtools:1.9--h91753b0_8
Container available at: /home/users/karim90/simages/quay.io/biocontainers/ucsc-wigtobigwig:357--h35c10e6_3
Populating host commands
Loading manifest: 'databio/pepatac:default'. Activate with 'bulker activate databio/pepatac:default'.
Commands available: bedGraphToBigWig, bedToBigBed, bedtools, bigWigCat, bowtie2, fastqc, java, macs2, pigz, preseq, samblaster, skewer, R, Rscript, samtools, wigToBigWig
Host commands available: python3, Perl"

When I tried to activate the crate using the "bulker activate databio/pepatac" command on Sherlock, I got this error:

[karim90@sh02-ln03 login ]$ bulker activate databio/pepatac
Bulker config: /home/users/karim90/bulker_config.yaml
Activating bulker crate: databio/pepatac
Error for command "pull": unknown shorthand flag: 'n' in -n
"Run 'singularity pull --help' for more detailed usage information.
mv: cannot stat ‘alpine-coreutils’: No such file or directory
FATAL: could not open image /home/users/karim90/simages/databio/alpine-coreutils: failed to retrieve path for /home/users/karim90/simages/databio/alpine-coreutils: lstat /home/users/karim90/simages/databio/alpine-coreutils: no such file or directory
databio/pepatac|
$ exit"

Thank you for your help

start travis

It is probably worth supporting python 2.7/3.4 since:

  • bulker could be usefully deployed on old systems for a while?
  • it's easy

Multiple registries

Would be nice if registry_url could accept a priority list of registries.
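
A sketch of what a priority list could look like in practice, assuming registry_url may be either a single string or a list. The function name fetch_manifest and the injected fetch callable are hypothetical; a real fetch could wrap urllib.request.urlopen, whose URLError is a subclass of OSError.

```python
def fetch_manifest(registry_urls, manifest_path, fetch):
    """Try each registry in priority order; return (url, content) of the first hit."""
    if isinstance(registry_urls, str):
        registry_urls = [registry_urls]  # keep single-URL configs working
    errors = []
    for base in registry_urls:
        url = base.rstrip("/") + "/" + manifest_path
        try:
            return url, fetch(url)
        except OSError as err:
            # Remember why this registry failed and move on to the next one.
            errors.append("{}: {}".format(url, err))
    raise LookupError("manifest not found in any registry:\n" + "\n".join(errors))


if __name__ == "__main__":
    # A fake fetch function stands in for a real HTTP request.
    available = {"http://hub.bulker.io/bulker/demo.yaml": "manifest: ..."}

    def fake_fetch(url):
        if url not in available:
            raise OSError("not found")
        return available[url]

    registries = ["http://registry.example/", "http://hub.bulker.io/"]
    print(fetch_manifest(registries, "bulker/demo.yaml", fake_fetch)[0])
```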

Under some circumstances, "No config found in env var: BULKERCFG"

While trying to set up my office (OSX) computer, I found bulker ignoring my BULKERCFG environment variable. I found a solution that worked for reasons I don't understand, below. Here's what happened originally:

wallabe:~ lwaldron$ pip uninstall bulker
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
Uninstalling bulker-0.3.0:
  Would remove:
    /Users/lwaldron/Library/Python/2.7/bin/bulker
    /Users/lwaldron/Library/Python/2.7/lib/python/site-packages/bulker-0.3.0-py2.7.egg-info
    /Users/lwaldron/Library/Python/2.7/lib/python/site-packages/bulker/*
Proceed (y/n)? y
  Successfully uninstalled bulker-0.3.0
wallabe:~ lwaldron$ ls Library/Python/2.7/lib/python/site-packages/
Jinja2-2.10.3.dist-info/       jinja2/                        oyaml.py                       yacman-0.6.4-py2.7.egg-info/
MarkupSafe-1.1.1.dist-info/    logmuse/                       oyaml.pyc                      yaml/
PyYAML-5.1.2-py2.7.egg-info/   logmuse-0.2.5-py2.7.egg-info/  ubiquerg/                      
attmap/                        markupsafe/                    ubiquerg-0.5.0-py2.7.egg-info/ 
attmap-0.12.11-py2.7.egg-info/ oyaml-0.9.dist-info/           yacman/                        
wallabe:~ lwaldron$ bulker load demo
Using default config. No config found in env var: BULKERCFG
Bulker config: /Users/lwaldron/Library/Python/2.7/lib/python/site-packages/bulker/templates/bulker_config.yaml
Got URL: http://hub.bulker.io/bulker/demo.yaml
That manifest has already been loaded. Overwrite? [y/N] y
Removing all executables in: /Users/lwaldron/bulker_crates/bulker/demo/default
Loading manifest: 'bulker/demo:default'. Activate with 'bulker activate bulker/demo:default'.
Commands available: cowsay, fortune
wallabe:~ lwaldron$ echo $BULKERCFG
/Users/lwaldron/bulker_config.yaml
wallabe:~ lwaldron$ cat `echo $BULKERCFG`
bulker:
  volumes: ['$HOME']
  envvars: ['DISPLAY']
  registry_url: http://hub.bulker.io/
  default_crate_folder: ${HOME}/bulker_crates
  singularity_image_folder: ${HOME}/simages
  container_engine: docker
  default_namespace: bulker
  executable_template: templates/docker_executable.jinja2
  shell_template: templates/docker_shell.jinja2
  build_template: templates/docker_build.jinja2
  crates:
    bulker:
      demo:
        default: ${HOME}/bulker_crates/bulker/demo/default
    waldronlab:
      bioconductor:
        default: ${HOME}/bulker_crates/waldronlab/bioconductor/default
      levi:
        default: ${HOME}/bulker_crates/waldronlab/levi/default
    databio:
      nsheff:
        default: ${HOME}/bulker_crates/databio/nsheff/default
  tool_args:
    bioconductor:
      bioconductor_full:
        default:
          docker_args: --volume=${HOME}/R/bioc-release:/usr/local/lib/R/host-site-library -e PASSWORD=rstudiopassword -p 8787:8787
        devel:
          docker_args: --volume=${HOME}/R/bioc-devel:/usr/local/lib/R/host-site-library -e PASSWORD=rstudiopassword -p 8788:8787
wallabe:~ lwaldron$ bulker activate demo
Using default config. No config found in env var: BULKERCFG
Bulker config: /Users/lwaldron/Library/Python/2.7/lib/python/site-packages/bulker/templates/bulker_config.yaml
Activating bulker crate: demo

bulker-3.2$ echo $BULKERCFG

bulker-3.2$

Then here's what I did that solved it.

wallabe:~ lwaldron$ bulker init -c ~/bulker_config.yaml 
Guessing container engine is docker.
Exists. Overwrite? [y/N] y
/Users/lwaldron/Library/Python/2.7/lib/python/site-packages/yacman/yacman.py:144: UserWarning: Writing to a non-locked, existing file. Beware of collisions.
  warnings.warn("Writing to a non-locked, existing file. Beware of collisions.", UserWarning)
Wrote new configuration file: /Users/lwaldron/bulker_config.yaml
wallabe:~ lwaldron$ rm bulker_config.yaml 
wallabe:~ lwaldron$ pip uninstall bulker
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
Uninstalling bulker-0.3.0:
  Would remove:
    /Users/lwaldron/Library/Python/2.7/bin/bulker
    /Users/lwaldron/Library/Python/2.7/lib/python/site-packages/bulker-0.3.0-py2.7.egg-info
    /Users/lwaldron/Library/Python/2.7/lib/python/site-packages/bulker/*
Proceed (y/n)? y
  Successfully uninstalled bulker-0.3.0
wallabe:~ lwaldron$ pip install --user bulker
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support
Collecting bulker
  Using cached https://files.pythonhosted.org/packages/98/36/4808bd1172e39c197cc092093fedb7cc9f1168685446c076c87e3f987364/bulker-0.3.0.tar.gz
Requirement already satisfied: yacman>=0.6.0 in ./Library/Python/2.7/lib/python/site-packages (from bulker) (0.6.4)
Requirement already satisfied: pyyaml>=5.1 in ./Library/Python/2.7/lib/python/site-packages (from bulker) (5.1.2)
Requirement already satisfied: logmuse>=0.2.0 in ./Library/Python/2.7/lib/python/site-packages (from bulker) (0.2.5)
Requirement already satisfied: jinja2 in ./Library/Python/2.7/lib/python/site-packages (from bulker) (2.10.3)
Requirement already satisfied: ubiquerg>=0.4.9 in ./Library/Python/2.7/lib/python/site-packages (from bulker) (0.5.0)
Requirement already satisfied: attmap>=0.12.9 in ./Library/Python/2.7/lib/python/site-packages (from yacman>=0.6.0->bulker) (0.12.11)
Requirement already satisfied: oyaml in ./Library/Python/2.7/lib/python/site-packages (from yacman>=0.6.0->bulker) (0.9)
Requirement already satisfied: MarkupSafe>=0.23 in ./Library/Python/2.7/lib/python/site-packages (from jinja2->bulker) (1.1.1)
Installing collected packages: bulker
    Running setup.py install for bulker ... done
Successfully installed bulker-0.3.0
wallabe:~ lwaldron$ bulker init -c ~/bulker_config.yaml 
Guessing container engine is docker.
Wrote new configuration file: /Users/lwaldron/bulker_config.yaml
wallabe:~ lwaldron$ export BULKERCFG=/Users/lwaldron/bulker_config.yaml
wallabe:~ lwaldron$ bulker load demo
Bulker config: /Users/lwaldron/bulker_config.yaml
Got URL: http://hub.bulker.io/bulker/demo.yaml
Loading manifest: 'bulker/demo:default'. Activate with 'bulker activate bulker/demo:default'.
Commands available: cowsay, fortune
wallabe:~ lwaldron$ bulker activate demo
Bulker config: /Users/lwaldron/bulker_config.yaml
Activating bulker crate: demo

bulker-3.2$ echo $BULKERCFG
/Users/lwaldron/bulker_config.yaml

Not sure this report will be helpful, but if you know what is happening in the above situation, it would help to add some more advice to the message about BULKERCFG not pointing to a config.
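
As an illustration of what more specific advice could look like, the sketch below separates "variable not visible to the process" from "variable set but file missing". The function name select_config and the message wording are hypothetical. One possible explanation for the transcript above, which is an assumption and not confirmed in the report, is that BULKERCFG was set in the shell but not exported: echo would still print it, while child processes such as bulker would not see it.

```python
import os


def select_config(environ, default_path):
    """Pick the config file to use and explain the choice in a message."""
    value = environ.get("BULKERCFG")
    if not value:
        return default_path, (
            "Using default config. BULKERCFG is not visible to this process. "
            "If 'echo $BULKERCFG' prints a path, the variable may be set but "
            "not exported; try 'export BULKERCFG=/path/to/bulker_config.yaml'."
        )
    path = os.path.expanduser(value)
    if not os.path.isfile(path):
        return default_path, (
            "Using default config. BULKERCFG points to '{}', but no file "
            "exists there.".format(path)
        )
    return path, "Bulker config: {}".format(path)


if __name__ == "__main__":
    path, message = select_config(dict(os.environ), "templates/bulker_config.yaml")
    print(message)
```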

make 'bulker load' printout more concise

After running bulker load, a lot of output is logged to the screen. Can this be more concise?
What is more, some of the information does not make sense to a bulker newbie, such as these lines after each entry:

None
None found.

entire printout:

[mstolarczyk@MichalsMBP test_genomes]: bulker load refgenie -f /Users/mstolarczyk/Uczelnia/UVA/code/refgenie/refgenie_manifest.yaml
Using default config. No config found in env var: BULKERCFG
Bulker config: /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/bulker/templates/bulker_config.yaml
Executable template: /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/bulker/templates/docker_executable.jinja2
That manifest has already been loaded. Overwrite? [y/N] y
Removing all executables in: /Users/mstolarczyk/bulker_crates/bulker/refgenie/default
command: samtools
dockerargs: -i
docker_image: quay.io/biocontainers/samtools:1.9--h91753b0_8
volumes: ['/tmp']
envvars: ['DISPLAY']
default_crate_folder: /Users/mstolarczyk/bulker_crates
singularity_image_folder: /Users/mstolarczyk/simages
container_engine: docker
default_namespace: bulker
executable_template: docker_executable.jinja2
shell_template: docker_shell.jinja2
build_template: docker_build.jinja2
crates:
  bulker:
    refgenie:
      default: /Users/mstolarczyk/bulker_crates/bulker/refgenie/default
None
None found.
command: bowtie2-build
docker_image: quay.io/biocontainers/bowtie2:2.3.5--py37he860b03_0
volumes: ['/tmp']
envvars: ['DISPLAY']
default_crate_folder: /Users/mstolarczyk/bulker_crates
singularity_image_folder: /Users/mstolarczyk/simages
container_engine: docker
default_namespace: bulker
executable_template: docker_executable.jinja2
shell_template: docker_shell.jinja2
build_template: docker_build.jinja2
crates:
  bulker:
    refgenie:
      default: /Users/mstolarczyk/bulker_crates/bulker/refgenie/default
None
None found.
command: bwa
docker_image: quay.io/biocontainers/bwa:0.7.17--pl5.22.0_2
volumes: ['/tmp']
envvars: ['DISPLAY']
default_crate_folder: /Users/mstolarczyk/bulker_crates
singularity_image_folder: /Users/mstolarczyk/simages
container_engine: docker
default_namespace: bulker
executable_template: docker_executable.jinja2
shell_template: docker_shell.jinja2
build_template: docker_build.jinja2
crates:
  bulker:
    refgenie:
      default: /Users/mstolarczyk/bulker_crates/bulker/refgenie/default
None
None found.
command: hisat2-build
docker_image: quay.io/biocontainers/hisat2:2.0.4--py35_0
volumes: ['/tmp']
envvars: ['DISPLAY']
default_crate_folder: /Users/mstolarczyk/bulker_crates
singularity_image_folder: /Users/mstolarczyk/simages
container_engine: docker
default_namespace: bulker
executable_template: docker_executable.jinja2
shell_template: docker_shell.jinja2
build_template: docker_build.jinja2
crates:
  bulker:
    refgenie:
      default: /Users/mstolarczyk/bulker_crates/bulker/refgenie/default
None
None found.
command: bismark_genome_preparation
docker_image: quay.io/biocontainers/bismark:0.18.1--pl5.22.0
volumes: ['/tmp']
envvars: ['DISPLAY']
default_crate_folder: /Users/mstolarczyk/bulker_crates
singularity_image_folder: /Users/mstolarczyk/simages
container_engine: docker
default_namespace: bulker
executable_template: docker_executable.jinja2
shell_template: docker_shell.jinja2
build_template: docker_build.jinja2
crates:
  bulker:
    refgenie:
      default: /Users/mstolarczyk/bulker_crates/bulker/refgenie/default
None
None found.
command: tabix
docker_image: quay.io/biocontainers/htslib:1.6--0
volumes: ['/tmp']
envvars: ['DISPLAY']
default_crate_folder: /Users/mstolarczyk/bulker_crates
singularity_image_folder: /Users/mstolarczyk/simages
container_engine: docker
default_namespace: bulker
executable_template: docker_executable.jinja2
shell_template: docker_shell.jinja2
build_template: docker_build.jinja2
crates:
  bulker:
    refgenie:
      default: /Users/mstolarczyk/bulker_crates/bulker/refgenie/default
None
None found.
command: STAR
docker_image: biocontainers/rna-star:v2.5.2bdfsg-1-deb_cv1
volumes: ['/tmp']
envvars: ['DISPLAY']
default_crate_folder: /Users/mstolarczyk/bulker_crates
singularity_image_folder: /Users/mstolarczyk/simages
container_engine: docker
default_namespace: bulker
executable_template: docker_executable.jinja2
shell_template: docker_shell.jinja2
build_template: docker_build.jinja2
crates:
  bulker:
    refgenie:
      default: /Users/mstolarczyk/bulker_crates/bulker/refgenie/default
{'protocol': None, 'namespace': 'biocontainers', 'image': 'rna-star', 'subimage': None, 'tag': 'v2.5.2bdfsg-1-deb_cv1'}
None found.
command: salmon
docker_image: quay.io/biocontainers/salmon:0.11.3--h86b036
volumes: ['/tmp']
envvars: ['DISPLAY']
default_crate_folder: /Users/mstolarczyk/bulker_crates
singularity_image_folder: /Users/mstolarczyk/simages
container_engine: docker
default_namespace: bulker
executable_template: docker_executable.jinja2
shell_template: docker_shell.jinja2
build_template: docker_build.jinja2
crates:
  bulker:
    refgenie:
      default: /Users/mstolarczyk/bulker_crates/bulker/refgenie/default
None
None found.
command: kallisto
docker_image: quay.io/biocontainers/kallisto:0.42.4--2
volumes: ['/tmp']
envvars: ['DISPLAY']
default_crate_folder: /Users/mstolarczyk/bulker_crates
singularity_image_folder: /Users/mstolarczyk/simages
container_engine: docker
default_namespace: bulker
executable_template: docker_executable.jinja2
shell_template: docker_shell.jinja2
build_template: docker_build.jinja2
crates:
  bulker:
    refgenie:
      default: /Users/mstolarczyk/bulker_crates/bulker/refgenie/default
None
None found.
Loading manifest: 'bulker/refgenie:default'. Activate with 'bulker activate bulker/refgenie:default'.
Commands available: samtools, bowtie2-build, bwa, hisat2-build, bismark_genome_preparation, tabix, STAR, salmon, kallisto

/etc/sudoers.d is not shared from OS X and is not known to Docker.

After a recent Docker upgrade I found bulker broken, e.g. this command from waldronlab/bioconductor but the same for all bulker commands:

$ Rdev
WARNING: Published ports are discarded when using host network mode
docker: Error response from daemon: Mounts denied: 
The path /etc/sudoers.d
is not shared from OS X and is not known to Docker.
You can configure shared paths from Docker -> Preferences... -> File Sharing.
See https://docs.docker.com/docker-for-mac/osxfs/#namespaces for more info.
.
ERRO[0000] error waiting for container: context canceled 

To show the bulker script:

$ cat `which Rdev`
#!/bin/sh

docker run --rm --init \
  -it --volume=/Users/lwaldron/R/bioc-devel:/usr/local/lib/R/host-site-library -e DISABLE_AUTH=true -p 8788:8787 -v /Users/lwaldron:/home/rstudio \
  --user=$(id -u):$(id -g) \
  --network="host" \
  --env "DISPLAY" \
  --volume "$HOME:$HOME" \
  --volume="/etc/group:/etc/group:ro" \
  --volume="/Users/lwaldron/templates/mac_passwd:/etc/passwd:ro" \
  --volume="/etc/shadow:/etc/shadow:ro"  \
  --volume="/etc/sudoers.d:/etc/sudoers.d:ro" \
  --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" \
  --workdir="`pwd`" \

Note, I can see /etc/sudoers.d from the command line:

$ ls -al /etc/sudoers.d
total 0
drwxr-xr-x    2 root  wheel    64 Aug 18  2018 .
drwxr-xr-x  125 root  wheel  4000 May  4 11:32 ..

But I can't find it from the Docker client "File Sharing" graphical directory selection, so I didn't find a fix there. However, by removing the line:

  --volume="/etc/sudoers.d:/etc/sudoers.d:ro" \

from which Rdev, the problem went away. Doing the same to which _Rdev I see I don't have sudo access, but otherwise everything seems fine:

$ _Rdev
Starting interactive docker shell for image 'waldronlab/bioconductor:devel' and command 'Rdev'
WARNING: Published ports are discarded when using host network mode
lwaldron@docker-desktop:~$ whoami
lwaldron
lwaldron@docker-desktop:~$ sudo ls
sudo: unknown user: root
sudo: unable to initialize policy plugin
lwaldron@docker-desktop:~$

So perhaps the mac-fix script should just remove the sudoers line?
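A minimal sketch of what such a fix could do, assuming the shim is a plain shell script with one mount option per line as in the `Rdev` listing above. The function name `drop_mount` is hypothetical and not part of bulker:

```python
def drop_mount(script_text: str, host_path: str = "/etc/sudoers.d") -> str:
    """Return script_text without any volume line that mounts host_path."""
    kept = []
    for line in script_text.splitlines():
        stripped = line.strip()
        is_volume = stripped.startswith(("--volume", "-v"))
        # Match 'host_path:' so that only mounts whose source is host_path go.
        if is_volume and host_path + ":" in stripped:
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"
```

Because every option line ends in a backslash continuation, dropping a whole line leaves the remaining `docker run` command intact.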

Bulker + CWL synergy and enhancement

I've been working on some updates to interface Bulker with CWL. Here are some notes and brainstorming about it.

Motivation

Bulker is very nice for making an interactive environment where a user runs tools as if they are native, but they actually run in containers. This makes a nice, portable and sharable environment that is reproducible across computing hardware, and also across container engines (docker and singularity). It's really useful for interactivity, and I'm unaware of any other tool that does something like this.

I envisioned that these interactive environments could also be useful to containerize a workflow. For example, a bulker environment can make a native workflow immediately containerized, which simplifies the process of containerization for a workflow author. The workflow just needs to be written using native tools, but then run inside an active -- and ideally strict -- bulker environment, and you get containerization and reproducibility for free.

But existing tools like CWL also have built in container management that people are using to containerize workflows. In CWL, individual tool files can specify the images they use to run. These images are then used within the CWL engine to make the workflow containerized, which is sort of fulfilling the same role that the bulker environments could fulfill. In the CWL approach, a tool definition is tightly coupled to its container. In the bulker approach, they are separated.

Some advantages of the CWL tight coupling approach are:

  • There may be useful information in the connection from the task to the container, which bulker intentionally severs. You could perhaps reconstruct this information from a bulker environment, but with CWL it is hard-coded and immediately there.
  • You could use two separate versions of a tool in one workflow. With bulker, you could do this, but you'd have to name the command something different, since it maps commands to images, rather than each specific invocation of a tool.
  • there's already an infrastructure built up around this model, so there's inertia behind the idea of coupling a task tightly to its image
  • if you hand the workflow off to someone else, you have some guarantee that it will work, because they can't change the tools. With bulker, you could just give them both the workflow and the environment, but because they could change the environment, they could introduce their own environment that would break the workflow.
  • others?

Some advantages of the decoupling are:

  • the environment is independent of the workflow; therefore, it can be re-used across multiple workflows
  • environments can be used interactively, for debugging, development, or just for everyday computing -- for example, this can supplant the need for environment modules systems, syncing environments between remote and local compute. So, the environments transcend use in workflows only
  • it's easier to update/change environments; for instance, if I have a workflow and I want to upgrade all the tools, I can just use an updated bulker environment. Probably I'm version-controlling the bulker manifest anyway, so this sort of happens automatically, reducing long-term maintenance of the workflow
  • workflow authors don't need to care or even be aware of containers or how they work; it's completely outsourced so they can focus on the workflow itself.
  • others?

Synergy

One way to promote a connection and possibly get benefits of both methods is to make it easy to convert back-and-forth. To do that we'd need to enable two directions:

From a CWL to a bulker environment

If you could take a CWL and get an interactive environment, that would be useful. So, I've now implemented this in the cwl2man (CWL -> Bulker manifest) command in bulker. Given a list of CWL tool descriptions, bulker can create a manifest that can then be used interactively.

I cloned bio-cwl-tools and built a bulker manifest from it, so we can create an interactive environment representing that repository. It works like this:

bulker cwl2man -f bio-cwl-manifest.yaml -c `ls */*.cwl`
bulker load cwl/test -f bio-cwl-manifest.yaml
bulker activate cwl/test
samtools

It was pretty simple on the surface. Will likely run into some details that need to be solved, but for now, it worked for some basic stuff.
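To illustrate the idea (this is not the actual cwl2man code), here is a rough sketch that takes CWL tool descriptions already parsed into dicts and collects one command-to-image entry per tool. It assumes each tool declares its image through a `DockerRequirement` with `dockerPull`, and that manifest entries use `command` and `docker_image` keys. The function names are made up for the example:

```python
def docker_image(tool: dict):
    """Find dockerPull in a parsed CWL tool's requirements or hints."""
    for section in ("requirements", "hints"):
        entries = tool.get(section) or {}
        if isinstance(entries, list):
            # List form: [{"class": "DockerRequirement", "dockerPull": ...}]
            entries = {e.get("class"): e for e in entries if isinstance(e, dict)}
        req = entries.get("DockerRequirement")
        if req and "dockerPull" in req:
            return req["dockerPull"]
    return None


def cwl_to_manifest(tools: list, name: str) -> dict:
    """Build a manifest-like dict (assumed layout) from parsed CWL tools."""
    commands = []
    for tool in tools:
        base = tool.get("baseCommand")
        image = docker_image(tool)
        if not base or not image:
            continue  # nothing to map without both a command and an image
        command = base if isinstance(base, str) else base[0]
        commands.append({"command": command, "docker_image": image})
    return {"manifest": {"name": name, "commands": commands}}
```

Tools without a base command or without an image are skipped, which is one of the details a real implementation would need to decide on.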

From a bulker environment to a CWL workflow

Given a bulker environment, I could take a CWL workflow and update the containers to match the bulker environment. This would make it pretty simple to write a non-container-aware CWL workflow, and then just immediately containerize it. Haven't implemented this, but you'd do something like:

bulker containerize cwl/test -w workflow.cwl
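Since this direction is not implemented, the following is only a sketch of what the core step might look like: given a parsed CWL tool and a command-to-image mapping taken from a bulker environment, set the tool's `DockerRequirement` hint. The name `containerize` and the mapping format are assumptions for illustration:

```python
import copy


def containerize(tool: dict, images: dict) -> dict:
    """Return a copy of a parsed CWL tool whose DockerRequirement hint
    is set from a command -> image mapping; the input is left unchanged."""
    base = tool.get("baseCommand")
    command = base if isinstance(base, str) else (base or [None])[0]
    result = copy.deepcopy(tool)
    if command in images:
        hints = result.get("hints") or {}
        if isinstance(hints, list):
            # List form: replace any existing DockerRequirement entry.
            hints = [h for h in hints if h.get("class") != "DockerRequirement"]
            hints.append({"class": "DockerRequirement",
                          "dockerPull": images[command]})
        else:
            hints["DockerRequirement"] = {"dockerPull": images[command]}
        result["hints"] = hints
    return result
```

Tools whose command is not in the environment are returned unchanged, so a workflow could be containerized incrementally.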

A set of common tool descriptions

A useful thing for a CWL developer is to have a set of ready-made tool descriptions, and this is the goal behind the bio-cwl-tools repository: "to collect and collaboratively maintain CWL CommandLineTool descriptions of any biology/life-sciences related applications."

This is in fact not too different from a bulker manifest, really -- with the difference that the manifest is only about images, not about interfaces, whereas the CWL descriptions do both; and the manifests are version controlled as a collection, and hosted via bulker hub. But perhaps these two ideas can synergize into one: A centrally located collection of bioinformatics tools that is version controlled as a collection, and available as both a bulker manifest and as CWL tool descriptions. This way, someone could use such a set interactively with bulker, or as a tool description resource for building CWL workflows, which would then be tied to specific bulker environment versions.

trouble running on a cluster with singularity

It's annoying not being able to provide a reproducible example, but do you have any idea what is going on?

[levi.waldron@karle ~]$ bulker -V
bulker 0.5.0
[levi.waldron@karle ~]$ python --version
Python 3.6.2 :: Continuum Analytics, Inc.
[levi.waldron@karle ~]$ pip --version
pip 19.1.1 from /share/usr/compilers/python/miniconda3/lib/python3.6/site-packages/pip (python 3.6)
[levi.waldron@karle ~]$ bulker -V
bulker 0.5.0
[levi.waldron@karle ~]$ bulker load demo
Traceback (most recent call last):
  File "/scratch/levi.waldron/.local/bin/bulker", line 10, in <module>
    sys.exit(main())
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/bulker/bulker.py", line 750, in main
    bulker_config = yacman.YacAttMap(filepath=bulkercfg, writable=False)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yacman/yacman.py", line 84, in __init__
    file_contents = load_yaml(filepath)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yacman/yacman.py", line 389, in load_yaml
    return read_yaml_file(filepath)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yacman/yacman.py", line 366, in read_yaml_file
    data = yaml.safe_load(f)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/__init__.py", line 162, in safe_load
    return load(stream, SafeLoader)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/__init__.py", line 114, in load
    return loader.get_single_data()
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/constructor.py", line 49, in get_single_data
    node = self.get_single_node()
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 36, in get_single_node
    document = self.compose_document()
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 55, in compose_document
    node = self.compose_node(None, None)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 133, in compose_mapping_node
    item_value = self.compose_node(node, item_key)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 133, in compose_mapping_node
    item_value = self.compose_node(node, item_key)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 133, in compose_mapping_node
    item_value = self.compose_node(node, item_key)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 133, in compose_mapping_node
    item_value = self.compose_node(node, item_key)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 133, in compose_mapping_node
    item_value = self.compose_node(node, item_key)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/composer.py", line 127, in compose_mapping_node
    while not self.check_event(MappingEndEvent):
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/parser.py", line 98, in check_event
    self.current_event = self.state()
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/parser.py", line 428, in parse_block_mapping_key
    if self.check_token(KeyToken):
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/scanner.py", line 115, in check_token
    while self.need_more_tokens():
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/scanner.py", line 152, in need_more_tokens
    self.stale_possible_simple_keys()
  File "/scratch/levi.waldron/.local/lib/python3.6/site-packages/yaml/scanner.py", line 292, in stale_possible_simple_keys
    "could not find expected ':'", self.get_mark())
yaml.scanner.ScannerError: while scanning a simple key
  in "/scratch/levi.waldron/bulker_config.yaml", line 32, column 1
could not find expected ':'
  in "/scratch/levi.waldron/bulker_config.yaml", line 33, column 14
[levi.waldron@karle ~]$ 

AttributeError: default_namespace

not sure what's causing this error:

bulker activate databio/lab
Bulker config: /project/shefflab/bulker/bulker_config_rivanna.yaml
Traceback (most recent call last):
  File "/home/ns5bc/.local/lib/python3.6/site-packages/attmap/pathex_attmap.py", line 31, in __getattr__
    v = super(PathExAttMap, self).__getattribute__(item)
AttributeError: 'YacAttMap' object has no attribute 'default_namespace'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ns5bc/.local/lib/python3.6/site-packages/attmap/ordattmap.py", line 45, in __getitem__
    return super(OrdAttMap, self).__getitem__(item)
KeyError: 'default_namespace'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ns5bc/.local/lib/python3.6/site-packages/attmap/pathex_attmap.py", line 34, in __getattr__
    return self.__getitem__(item, expand)
  File "/home/ns5bc/.local/lib/python3.6/site-packages/attmap/pathex_attmap.py", line 51, in __getitem__
    v = super(PathExAttMap, self).__getitem__(item)
  File "/home/ns5bc/.local/lib/python3.6/site-packages/attmap/ordattmap.py", line 47, in __getitem__
    return AttMap.__getitem__(self, item)
  File "/home/ns5bc/.local/lib/python3.6/site-packages/attmap/attmap.py", line 32, in __getitem__
    return self.__dict__[item]
KeyError: 'default_namespace'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ns5bc/.local/bin/bulker", line 10, in <module>
    sys.exit(main())
  File "/home/ns5bc/.local/lib/python3.6/site-packages/bulker/bulker.py", line 521, in main
    cratelist = parse_registry_paths(args.crate_registry_paths, bulker_config.bulker.default_namespace)
  File "/home/ns5bc/.local/lib/python3.6/site-packages/attmap/pathex_attmap.py", line 38, in __getattr__
    raise AttributeError(item)
AttributeError: default_namespace

bulker --version
bulker 0.2.3

yacman is 0.6.0

NameError: global name 'FileNotFoundError' is not defined

Should I upgrade to Python 3?

$ bulker load demo
Bulker config: /Users/lwaldron/bulker_config.yaml
Got URL: http://hub.bulker.io/bulker/demo.yaml
That manifest has already been loaded. Overwrite? [y/N] y
Removing all executables in: /Users/lwaldron/bulker_crates/bulker/demo/default
Traceback (most recent call last):
  File "/usr/local/bin/bulker", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/site-packages/bulker/bulker.py", line 742, in main
    force=args.force)
  File "/usr/local/lib/python2.7/site-packages/bulker/bulker.py", line 304, in bulker_load
    except FileNotFoundError:
NameError: global name 'FileNotFoundError' is not defined
Levis-MBP:_pages lwaldron$ bulker -h
version: 0.4.0
usage: bulker [-h] [-V] [--commands] [--verbosity V] [--silent] [--logdev]
              {load,run,init,activate,list} ...

bulker - manage containerized executables

positional arguments:
  {load,run,init,activate,list}
    load                Load a crate from a manifest
    run                 Run a command in a crate
    init                Initialize a new bulker config file
    activate            Activate a crate by adding it to PATH
    list                List available bulker crates

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit
  --commands            show program's version number and exit
  --verbosity V         Set logging level (1-5 or logging module level name)
  --silent              Silence logging. Overrides verbosity.
  --logdev              Expand content of logging message format.

https://bulker.databio.org
$ 
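For context, `FileNotFoundError` is a Python 3 builtin that does not exist in Python 2.7, which is why the `except` clause in the traceback raises `NameError`. A version-agnostic way to express the same check is to catch `OSError` and inspect `errno`. This is a generic sketch, not bulker's code, and `remove_if_present` is a hypothetical helper:

```python
import errno
import os


def remove_if_present(path):
    """Delete path; return False instead of raising when it is already gone."""
    try:
        os.remove(path)
        return True
    except OSError as e:
        # ENOENT means "no such file or directory" on both Python 2 and 3.
        if e.errno == errno.ENOENT:
            return False
        raise
```

On Python 3, `FileNotFoundError` is a subclass of `OSError`, so this handler covers it as well.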

user id not working in container on MacOS

I have a different problem now:

Levis-Air:~ lw391$ bulker activate waldronlab/levi
Bulker config: /Users/lw391/bulker_config.yaml
Activating bulker crate: waldronlab/levi

bulker-3.2$ cd git/hub.bulker.io/
bulker-3.2$ git pull
No user exists for uid 501
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
bulker-3.2$ _git
Starting interactive docker shell for image 'samueldebruyn/debian-git' and command 'git'
I have no name!@docker-desktop:/Users/lw391/git/hub.bulker.io$ ls .git
COMMIT_EDITMSG	FETCH_HEAD  HEAD  ORIG_HEAD  config  description  hooks  index	info  logs  objects  packed-refs  refs
I have no name!@docker-desktop:/Users/lw391/git/hub.bulker.io$ whoami
whoami: cannot find name for user ID 501
I have no name!@docker-desktop:/Users/lw391/git/hub.bulker.io$ 

Here is my bulker_config.yaml:

bulker:
  volumes: ['/tmp', '$HOME']
  envvars: ['DISPLAY']
  registry_url: http://hub.bulker.io/
  default_crate_folder: ${HOME}/bulker_crates
  singularity_image_folder: ${HOME}/simages
  container_engine: docker
  default_namespace: bulker
  executable_template: templates/docker_executable.jinja2
  shell_template: templates/docker_shell.jinja2
  build_template: templates/docker_build.jinja2
  crates:
    bulker:
      demo:
        default: /Users/lw391/bulker_crates/bulker/demo/default
    bioconductor:
      bioconductor_full:
        default:
          docker_args: --volume=${HOME}/.local/lib/R:/usr/local/lib/R/host-site-library
        devel:
          docker_args: --volume=${HOME}/.local/lib/Rdev:/usr/local/lib/R/host-site-library
      levi:
        default: /Users/lw391/bulker_crates/bulker/levi/default
    waldronlab:
      bioconductor:
        default: /Users/lw391/bulker_crates/waldronlab/bioconductor/default
      levi:
        default: /Users/lw391/bulker_crates/waldronlab/levi/default

Originally posted by @lwaldron in #28 (comment)
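The "No user exists for uid 501" message suggests the container's /etc/passwd has no entry for the host uid. One workaround seen elsewhere in these issues is to mount a custom passwd file (the `mac_passwd` template in the `Rdev` script). Here is a small sketch of how such a file could be produced. The helper names and the gid in the usage note are assumptions:

```python
def passwd_entry(user: str, uid: int, gid: int, home: str,
                 shell: str = "/bin/sh") -> str:
    """Format one /etc/passwd line: name:password:UID:GID:GECOS:home:shell."""
    return f"{user}:x:{uid}:{gid}::{home}:{shell}"


def with_user(passwd_text: str, entry: str) -> str:
    """Append entry to passwd_text unless its uid is already present."""
    lines = [line for line in passwd_text.splitlines() if line.strip()]
    uid = entry.split(":")[2]
    if not any(line.split(":")[2:3] == [uid] for line in lines):
        lines.append(entry)
    return "\n".join(lines) + "\n"
```

For the session above, something like `passwd_entry("lw391", 501, 20, "/Users/lw391")` would give the container a name for uid 501; the gid 20 is a guess and should come from `id -g` on the host.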

bulker init is broken

rm bulker_config.yaml 
nsheff@puma:~$ bulker init -c $BULKERCFG
Guessing container engine is docker.
Waiting for file lock: /home/nsheff/.local/lib/python3.5/site-packages/bulker/templates/lock.bulker_config.yaml ....Traceback (most recent call last):
  File "/home/nsheff/.local/bin/bulker", line 10, in <module>
    sys.exit(main())
  File "/home/nsheff/.local/lib/python3.5/site-packages/bulker/bulker.py", line 496, in main
    bulker_init(bulkercfg, DEFAULT_CONFIG_FILEPATH, args.engine)
  File "/home/nsheff/.local/lib/python3.5/site-packages/bulker/bulker.py", line 228, in bulker_init
    bulker_config = yacman.YacAttMap(filepath=template_config_path, writable=True)
  File "/home/nsheff/.local/lib/python3.5/site-packages/yacman/yacman.py", line 84, in __init__
    _make_rw(filepath, wait_max)
  File "/home/nsheff/.local/lib/python3.5/site-packages/yacman/yacman.py", line 265, in _make_rw
    _wait_for_lock(lock_path, wait_max)
  File "/home/nsheff/.local/lib/python3.5/site-packages/yacman/yacman.py", line 227, in _wait_for_lock
    raise RuntimeError("The maximum wait time has been reached and the lock file still exists.")
RuntimeError: The maximum wait time has been reached and the lock file still exists.

custom bulker prompt

It would be helpful to have some sort of indicator that I have activated a bulker manifest and am currently residing in that space. Something along the lines of a python virtual environment would do the trick.
e.g.

$ bulker activate databio/peppro
Bulker config: /mnt/storage/bulker_cfg.yaml
Activating bulker crate: databio/peppro

(databio/peppro) $ 

network error for rstudio-server on waldronlab/bioconductor

I'm using the waldronlab/bioconductor manifest on hub.bulker.io, the config file at https://github.com/waldronlab/config/blob/master/bulker_config.yaml, and have updated my templates/docker_executable.jinja2 and templates/docker_shell.jinja2 to the ones on the dev branch (https://github.com/databio/bulker/tree/dev/bulker/templates). Now when I launch rstudio-dev, I don't get any errors:

Levis-MBP:~ lwaldron$ bulker activate waldronlab/bioconductor
Bulker config: /Users/lwaldron/bulker_config.yaml
Activating bulker crate: waldronlab/bioconductor

bulker-3.2$ cat `which rstudio-server`
#!/bin/sh

docker run --rm --init \
  --volume=/Users/lwaldron/R/bioc-release:/usr/local/lib/R/host-site-library -e PASSWORD=rstudiopassword -p 8787:8787 \
  --env "DISPLAY" \
  --volume "/tmp:/tmp" \
  --volume "$HOME:$HOME" \
  --workdir="`pwd`" \
  bioconductor/bioconductor_full:release   "$@"
bulker-3.2$ rstudio-server
[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] add: executing... 
Nothing additional to add
[cont-init.d] add: exited 0.
[cont-init.d] userconf: executing... 
[cont-init.d] userconf: exited 0.
[cont-init.d] done.
[services.d] starting services
[services.d] done.

However, when I try to use the rstudio server in my web browser at http://localhost:8787, I get an error "RStudio initialization error": "Error occurred during transmission". I noted that when I instead browsed to http://0.0.0.0:8787, rstudio did give me a login page, but after authenticating I got the same error. I couldn't reproduce that behavior, however: on subsequent launches, even after shutting down the rstudio process and restarting bulker, I got the "Error occurred during transmission" right away when browsing to either URL.

Running bulker activate before load throws error

Attempting to activate before loading throws error without helpful messaging. Only occurs after an init. Once something has been loaded, error disappears.

$ bulker activate databio/refgenie:0.7.0
Using default config. No config found in env var: BULKERCFG
Bulker config: /home/user/.local/lib/python3.7/site-packages/bulker/templates/bulker_config.yaml
Activating bulker crate: databio/refgenie:0.7.0

Traceback (most recent call last):
  File "/home/user/.local/bin/bulker", line 8, in <module>
    sys.exit(main())
  File "/home/user/.local/lib/python3.7/site-packages/bulker/bulker.py", line 616, in main
    bulker_activate(bulker_config, cratelist, echo=args.echo, strict=args.strict)
  File "/home/user/.local/lib/python3.7/site-packages/bulker/bulker.py", line 438, in bulker_activate
    newpath = get_new_PATH(bulker_config, cratelist, strict)
  File "/home/user/.local/lib/python3.7/site-packages/bulker/bulker.py", line 468, in get_new_PATH
    cratepaths += get_local_path(bulker_config, cratevars) + os.pathsep
  File "/home/user/.local/lib/python3.7/site-packages/bulker/bulker.py", line 455, in get_local_path
    _LOGGER.debug(bulker_config.bulker.crates[cratevars["namespace"]][cratevars["crate"]].to_dict())
TypeError: 'NoneType' object is not subscriptable
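The traceback indicates that `crates` is still `None` in a freshly initialized config, so the nested lookup fails. A guarded lookup could turn that into a helpful message. This is only a sketch; `lookup_crate_path` and `MissingCrateError` are hypothetical names and the nested dict layout is assumed from the config files shown in other issues:

```python
class MissingCrateError(Exception):
    """Raised when a requested crate has not been loaded."""


def lookup_crate_path(crates, namespace, crate, tag="default"):
    """Look up a loaded crate's folder, failing with a readable message."""
    try:
        # 'crates or {}' covers a config where nothing has been loaded yet.
        return (crates or {})[namespace][crate][tag]
    except (KeyError, TypeError):
        raise MissingCrateError(
            f"Crate '{namespace}/{crate}:{tag}' is not loaded; "
            f"try 'bulker load {namespace}/{crate}:{tag}' first."
        ) from None
```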

python console scripts are problematic

If you put python into bulker, then console scripts are getting a bad shebang... the shebang corresponds to the path within the container, of course, but since we actually want to run the script from outside the container (in the bulker environment), that shebang is incorrect. It needs instead to be the path to the bulkerized python interpreter.

One way to solve this ad hoc is to replace the shebang after-the-fact, like this would do for the console script for looper:

py3=`which python3`
sed -i "s|/usr/local/bin/python3|$py3|" `which looper`

Another possibility is to not rely on console scripts, but instead alias python3 -m looper (assuming the packages are set up correctly with __main__.py files)

Neither of these solutions is really satisfactory. I think it's just a fundamental problem with the way python console scripts work, which tightly couple the python path installing the package and the script itself. In this case, I want to break that coupling because the install is happening in a container, while the running will initiate outside the container (though it eventually ends in the container as well). This is not possible with the current python system.
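For what it's worth, the ad hoc shebang replacement can also be written without sed. This is a sketch of the same workaround, not a real fix, and `rewrite_shebang` is a made-up helper name:

```python
def rewrite_shebang(script_text: str, interpreter: str) -> str:
    """Replace the first line if it is a shebang; otherwise return unchanged."""
    first, sep, rest = script_text.partition("\n")
    if not first.startswith("#!"):
        return script_text
    return f"#!{interpreter}{sep}{rest}"
```

Passing `/usr/bin/env python3` as the interpreter would defer to whichever `python3` comes first on PATH, which in an activated crate would presumably be the bulkerized one.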

See also:

https://stackoverflow.com/questions/50557963/why-does-pip-install-seem-to-change-the-interpreter-line-on-some-machines

pepkit/looper#224

container location api

would be nice to have an API where I can retrieve a container location given the type of container I want (singularity, docker, etc), and the tool name and version. So, provide:

/biocontainers/singularity/samtools/1.9
/biocontainers/docker/samtools/1.9

Error running demo

I have installed bulker for the first time. Below, I confirm that the installation was successful

$ pip install --user bulker
Requirement already satisfied: bulker in ./.local/lib/python3.7/site-packages (0.2.1)
Requirement already satisfied: jinja2 in ./.local/lib/python3.7/site-packages (from bulker) (2.10.1)
Requirement already satisfied: pyyaml>=5.1 in /usr/lib/python3.7/site-packages (from bulker) (5.1.2)
Requirement already satisfied: yacman>=0.5.2 in ./.local/lib/python3.7/site-packages (from bulker) (0.5.2)
Requirement already satisfied: logmuse>=0.2.0 in ./.local/lib/python3.7/site-packages (from bulker) (0.2.4)
Requirement already satisfied: ubiquerg>=0.4.8 in ./.local/lib/python3.7/site-packages (from bulker) (0.4.9)
Requirement already satisfied: MarkupSafe>=0.23 in ./.local/lib/python3.7/site-packages (from jinja2->bulker) (1.1.1)
Requirement already satisfied: attmap>=0.12.9 in ./.local/lib/python3.7/site-packages (from yacman>=0.5.2->bulker) (0.12.9)
Requirement already satisfied: oyaml in ./.local/lib/python3.7/site-packages (from yacman>=0.5.2->bulker) (0.9)
$ bulker -h
version: 0.2.1
usage: bulker [-h] [-V] [--silent] [--verbosity V] [--logdev]
              {init,list,load,activate,run} ...

bulker - manage containerized executables

positional arguments:
  {init,list,load,activate,run}
    init                Initialize a new bulker config file
    list                List available bulker crates
    load                Load a crate from a manifest
    activate            Activate a crate by adding it to PATH
    run                 Run a command in a crate

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit
  --silent              Silence logging. Overrides verbosity.
  --verbosity V         Set logging level (1-5 or logging module level name)
  --logdev              Expand content of logging message format.

https://bulker.databio.org

However, when I try to run the demo:

$ bulker load demo
Using default config. No config found in env var: BULKERCFG
Bulker config: /home/lgatto/.local/lib/python3.7/site-packages/bulker/templates/bulker_config.yaml
Got URL: http://hub.bulker.io/bulker/demo.yaml
Traceback (most recent call last):
  File "/home/lgatto/.local/bin/bulker", line 10, in <module>
    sys.exit(main())
  File "/home/lgatto/.local/lib/python3.7/site-packages/bulker/bulker.py", line 540, in main
    exe_template = mkabs(bulker_config.bulker.executable_template, os.path.dirname(bulker_config._file_path))
  File "/home/lgatto/.local/lib/python3.7/site-packages/bulker/bulker.py", line 459, in mkabs
    if os.path.isabs(xpand(path)):
  File "/home/lgatto/.local/lib/python3.7/site-packages/bulker/bulker.py", line 457, in xpand
    return os.path.expandvars(os.path.expanduser(path))
  File "/usr/lib/python3.7/posixpath.py", line 235, in expanduser
    path = os.fspath(path)
TypeError: expected str, bytes or os.PathLike object, not NoneType

Any help would be appreciated.
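Reading the traceback, `executable_template` appears to be `None` in the config being used, and `os.path.expanduser` cannot handle that. A guarded variant of the path helper might look like the sketch below. `xpand` mirrors the line shown in the traceback; `mkabs_checked` and its `setting` parameter are hypothetical:

```python
import os


def xpand(path):
    """Expand '~' and environment variables in a path."""
    return os.path.expandvars(os.path.expanduser(path))


def mkabs_checked(path, reldir, setting="executable_template"):
    """Make path absolute relative to reldir, with a clear error for None."""
    if path is None:
        raise ValueError(f"Config setting '{setting}' is missing or empty.")
    expanded = xpand(path)
    if os.path.isabs(expanded):
        return expanded
    return os.path.join(xpand(reldir), expanded)
```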

Issues with bulker when submitting jobs on SGE

I have installed pepatac and used the bulker container, which works well on my cluster when used interactively on a development node; however, when I submit a job on SGE, pepatac can no longer find the software from the container. This was not a problem when I was running pepatac with a singularity container in an older version.

Are there environment variables I need to pass to help bulker play well with SGE?
Thanks for any advice!

Recursive reloading

Right now if you change bulker config and then re-load a manifest, it only re-creates the executables for the direct manifest; it does not recursively reload any imported manifests.

usually I want to reload everything when I make a change.

maybe add:

  • a load option that reloads all imported manifests, recursively?
  • a 'reload all' that reloads all loaded manifests?
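Either option comes down to walking the import graph. A rough sketch, assuming each manifest is available as a dict with an optional `imports` list of crate names (the function name and data layout are illustrative only):

```python
def reload_order(manifests: dict, root: str) -> list:
    """Return crates to reload, imports first, each listed once."""
    order, seen = [], set()

    def visit(name):
        if name in seen:
            return  # already scheduled; also guards against import cycles
        seen.add(name)
        for imported in manifests.get(name, {}).get("imports") or []:
            visit(imported)
        order.append(name)

    visit(root)
    return order
```

A 'reload all' would then call this for every loaded manifest and share the `seen` set so nothing is rebuilt twice.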

bulker activate fails with default config

Raised by @afrendeiro

bulker activate fails with default config

unset BULKERCFG
bulker load demo
Using default config. No config found in env var: BULKERCFG
Bulker config: /home/nsheff/.local/lib/python3.7/site-packages/bulker/templates/bulker_config.yaml
Loading manifest: 'bulker/demo:default'. Activate with 'bulker activate bulker/demo:default'.
Commands available: cowsay, fortune
nsheff@zither:/code/bulker$ bulker activate demo
Using default config. No config found in env var: BULKERCFG
Bulker config: /home/nsheff/.local/lib/python3.7/site-packages/bulker/templates/bulker_config.yaml
Activating bulker crate: demo
nsheff@zither:/code/bulker$ cowsay

Command 'cowsay' not found, but can be installed with:

inspect requires crate name

based on the message below, crate inspecting should be possible in an active crate with no arg provided

databio/peppro|~$ bulker -V
bulker 0.5.0
databio/peppro|~$ bulker inspect
Bulker config: /Users/mstolarczyk/bulker_config.yaml
No active create. Inspect requires a provided crate, or a currently active create.

NameError: name 'create_folder' is not defined

on dev branch

[mstolarczyk@MichalsMBP test_genomes]: export BULKERCFG="~/bulker_config.yaml"
[mstolarczyk@MichalsMBP test_genomes]: bulker init -c $BULKERCFG
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/bin/bulker", line 10, in <module>
    sys.exit(main())
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/bulker/bulker.py", line 611, in main
    _is_writable(os.path.dirname(bulkercfg), check_exist=False)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/bulker/bulker.py", line 196, in _is_writable
    elif create_folder:
NameError: name 'create_folder' is not defined
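The traceback points at a variable that is used in `_is_writable` but never defined. A sketch of how such a check could look with `create_folder` as an explicit parameter follows. This is a guess at the intent, not the actual fix, and the name `is_writable` is hypothetical:

```python
import os


def is_writable(folder, check_exist=False, create_folder=False):
    """Report whether folder, or its nearest existing parent, is writable."""
    # A quoted "~" (as in the export above) is not expanded by the shell.
    folder = os.path.expanduser(folder or ".")
    if os.path.exists(folder):
        return os.access(folder, os.W_OK) and os.access(folder, os.X_OK)
    if check_exist:
        raise OSError(f"Folder not found: {folder}")
    if create_folder:
        os.makedirs(folder)
        return True
    # Fall back to checking the nearest existing parent directory.
    parent = os.path.dirname(os.path.abspath(folder))
    return is_writable(parent, check_exist=False, create_folder=False)
```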

Bulker sources bashrc, updating PATH when it likely should not

If a user's .bashrc file prepends or appends to the PATH envvar, the change is carried over to bulker's path, even in --strict mode. If --strict is truly strict, bulker likely should not source .bashrc, or it should overwrite PATH only after sourcing .bashrc.
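To make the intended difference concrete, here is a sketch of how the two modes might compose PATH. The name `compose_path` is hypothetical and this only covers the string itself, not when it gets applied:

```python
import os


def compose_path(crate_dirs, current_path, strict=False):
    """Crate folders first; the existing PATH is kept only when not strict."""
    parts = list(crate_dirs)
    if not strict:
        parts += [p for p in current_path.split(os.pathsep) if p]
    return os.pathsep.join(parts)
```

For strict mode to hold, this value would have to be applied after any shell startup files have run, otherwise .bashrc can still add entries.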

container-specific volumes not mounting

I'm having trouble getting container-specific volume mounts in bulker 3.0:

bulker-3.2$ ls .local/lib/R  #I made these directories in advance, is this necessary?
bulker-3.2$ ls .local/lib/Rdev   #I made these directories in advance, is this necessary?
Levis-Air:~ lw391$ bulker activate waldronlab/bioconductor
Bulker config: /Users/lw391/bulker_config.yaml
Activating bulker crate: waldronlab/bioconductor

bulker-3.2$ _R
Starting interactive docker shell for image 'bioconductor/bioconductor_full:release' and command 'R'
I have no name!@docker-desktop:/Users/lw391$ ls /usr/local/lib/R/host-site-library
ls: cannot access '/usr/local/lib/R/host-site-library': No such file or directory
I have no name!@docker-desktop:/Users/lw391$ exit
bulker-3.2$ exit
Levis-Air:~ lw391$ ls .local/lib/R
Levis-Air:~ lw391$ ls .local/lib/Rdev
ls: .local/lib/Rdev: No such file or directory
Levis-Air:~ lw391$ mkdir .local/lib/Rdev
Levis-Air:~ lw391$ ls .local/lib/R
Levis-Air:~ lw391$ ls .local/lib/Rdev
Levis-Air:~ lw391$ bulker activate waldronlab/bioconductor
Bulker config: /Users/lw391/bulker_config.yaml
Activating bulker crate: waldronlab/bioconductor

bulker-3.2$ _R
Starting interactive docker shell for image 'bioconductor/bioconductor_full:release' and command 'R'
I have no name!@docker-desktop:/Users/lw391$ ls /usr/local/lib/R/host-site-library
ls: cannot access '/usr/local/lib/R/host-site-library': No such file or directory
I have no name!@docker-desktop:/Users/lw391$ exit
bulker-3.2$ _Rdev
Starting interactive docker shell for image 'bioconductor/bioconductor_full:devel' and command 'Rdev'
I have no name!@docker-desktop:/Users/lw391$ ls /usr/local/lib/R/host-site-library
ls: cannot access '/usr/local/lib/R/host-site-library': No such file or directory
I have no name!@docker-desktop:/Users/lw391$ exit
bulker-3.2$ 

Here is my bulker_config.yaml:

bulker:
  volumes: ['/tmp', '$HOME']
  envvars: ['DISPLAY']
  registry_url: http://hub.bulker.io/
  default_crate_folder: ${HOME}/bulker_crates
  singularity_image_folder: ${HOME}/simages
  container_engine: docker
  default_namespace: bulker
  executable_template: templates/docker_executable.jinja2
  shell_template: templates/docker_shell.jinja2
  build_template: templates/docker_build.jinja2
  crates:
    bulker:
      demo:
        default: /Users/lw391/bulker_crates/bulker/demo/default
    bioconductor:
      bioconductor_full:
        default:
          docker_args: --volume=${HOME}/.local/lib/R:/usr/local/lib/R/host-site-library
        devel:
          docker_args: --volume=${HOME}/.local/lib/Rdev:/usr/local/lib/R/host-site-library
      levi:
        default: /Users/lw391/bulker_crates/bulker/levi/default
    waldronlab:
      bioconductor:
        default: /Users/lw391/bulker_crates/waldronlab/bioconductor/default
      levi:
        default: /Users/lw391/bulker_crates/waldronlab/levi/default

Some ideas

How should templates be handled? We have build templates and run templates. Should each be individually modifiable, or should we use 'mode: docker' and 'mode: singularity'?

bulker init uses 'is_command_callable' to determine whether docker or singularity is present, and picks the appropriate mode (which can, of course, be changed afterwards).
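
The detection idea above could look roughly like this. It is a minimal sketch: the `is_command_callable` defined here is a stand-in built on `shutil.which`, not necessarily the helper bulker uses, and `detect_container_engine` is a hypothetical name.

```python
import shutil


def is_command_callable(command):
    """Stand-in check: True if `command` resolves to an executable on PATH."""
    return shutil.which(command) is not None


def detect_container_engine(candidates=("docker", "singularity"),
                            is_callable=is_command_callable):
    """Return the first callable engine from `candidates`, or None.

    The order of `candidates` sets the preference; the result is only a
    default and can still be overridden in the config afterwards.
    """
    for engine in candidates:
        if is_callable(engine):
            return engine
    return None


if __name__ == "__main__":
    print("Detected container engine:", detect_container_engine())
```

Passing `is_callable` as a parameter keeps the selection logic testable on machines that have neither engine installed.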

bulker inspect fails on tagged crates

 bulker activate databio/refgenie:0.7.0
Bulker config: /home/nsheff/pCloudSync/env/bulker_config/zither.yaml
Activating bulker crate: databio/refgenie:0.7.0
databio/refgenie|~$ bulker inspect
Bulker config: /home/nsheff/pCloudSync/env/bulker_config/zither.yaml
The requested remote manifest 'http://hub.bulker.io/databio/refgenie.yaml' is not found.
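
The requested URL in the log contains no tag, which suggests the tag is dropped somewhere between activation and the manifest lookup. As an illustration only, here is a sketch of splitting a registry path so the tag survives. `parse_crate_path` is a hypothetical helper, and the fallback values ("bulker" namespace, "default" tag) are assumptions taken from the config shown earlier in this thread, not from bulker's source.

```python
def parse_crate_path(path, default_namespace="bulker", default_tag="default"):
    """Split 'namespace/crate:tag' into its parts (hypothetical helper).

    A missing namespace or tag falls back to the given defaults.
    """
    name, _, tag = path.partition(":")
    namespace, _, crate = name.rpartition("/")
    return {
        "namespace": namespace or default_namespace,
        "crate": crate,
        "tag": tag or default_tag,
    }


if __name__ == "__main__":
    # The tag is kept as a separate field instead of being discarded.
    print(parse_crate_path("databio/refgenie:0.7.0"))
```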

Don't use $HOME, hard-code it

CWL rewrites $HOME for its runs, so it doesn't work with bulker shims, which by default mount $HOME through the environment variable.

I see no real value in keeping the variable in the bulker config. When the config is initialized, we should probably resolve the environment variable at config build time by default, instead of carrying it all the way through to the containerized executable scripts.
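
A minimal sketch of resolving the variables at config build time, assuming the volumes are stored as a plain list of strings as in the config above. `resolve_volumes` is a hypothetical name and `string.Template` is just one way to do the substitution.

```python
import os
from string import Template


def resolve_volumes(volumes, environ=None):
    """Expand $VAR and ${VAR} references in each volume string.

    Variables that are not defined are left untouched (safe_substitute),
    so a missing variable does not raise at config build time.
    """
    env = os.environ if environ is None else environ
    return [Template(volume).safe_substitute(env) for volume in volumes]


if __name__ == "__main__":
    # With HOME=/home/user this prints ['/tmp', '/home/user'], so the
    # generated executables no longer depend on $HOME at run time.
    print(resolve_volumes(["/tmp", "$HOME"]))
```

Because the paths are fixed when the config is written, a tool such as CWL that later rewrites $HOME would no longer change what gets mounted.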

images are not pulled; rivanna/singularity

I get the following error on Rivanna:

[mjs5kd@udc-ba36-36 bulker](dev): bwa
ERROR  : Image path /home/mjs5kd/simages/quay.io/biocontainers/bwa:0.7.17--pl5.22.0_2 doesn't exist: No such file or directory
ABORT  : Retval = 255

It worked on my laptop.

I'm using the dev version of bulker and just initialized the configuration.

It looks like it does not proceed to image pulling in this case.

document dealing with images with entrypoints

Raised by @lwaldron

If a docker image specifies an ENTRYPOINT, we don't want to put a command in the containerized executable, because it would be interpreted as an argument to the entrypoint.

It's as simple as setting:

docker_command: ' '

but we need to make sure this is documented and tested well.
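
To make the effect of a blank command concrete, here is a sketch of how a `docker run` argument list might be assembled. `build_docker_run` is a hypothetical helper for illustration and is not bulker's executable template.

```python
import shlex


def build_docker_run(image, docker_command, user_args):
    """Build a `docker run` argv (hypothetical helper).

    A blank docker_command (for example ' ') contributes nothing, so the
    user's arguments go straight to the image's ENTRYPOINT.
    """
    argv = ["docker", "run", "--rm", image]
    argv.extend(shlex.split(docker_command))  # [] when the command is blank
    argv.extend(user_args)
    return argv


if __name__ == "__main__":
    image = "bioconductor/bioconductor_full:release"
    print(build_docker_run(image, "R", ["--version"]))
    print(build_docker_run(image, " ", ["--version"]))
```

With a non-blank command and an ENTRYPOINT image, the command would arrive as the first argument to the entrypoint, which is the situation the blank setting avoids.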

bulker overwrites PS1

If I activate a crate, bulker creates a new prompt, which is awesome. But it also overwrites my previously set prompt configuration in PS1. Is it possible to prepend or append the bulker prompt to whatever I have already set?

[mstolarczyk@MichalsMBP peppro](master): bulker-activate databio/peppro
Bulker config: /Users/mstolarczyk/bulker_config.yaml
Activating bulker crate: databio/peppro
databio/peppro|~/Uczelnia/UVA/code/peppro$

Settings that are both tool-specific and host-specific

right now you can do env-specific volumes, which is awesome.
and you can do tool-specific args, which is even awesomer.
but what about tool-specific args that vary by environment? that would be awesomest.

use case: I want to put redis in my crate. I want to attach a local path for the redis db. I don't need that attached to any other tool, so it's tool-specific. I also want a different location attached on a different env, so it's env-specific too. thus, it fits in neither the bulker config nor the manifest.

it seems to fit better with env than with tool. I propose a new section in the bulker config for this:

tool_args:
  redis-server:
    dockerargs: ""
    volumes: ['/local/data:/data']

not sure this is the best though...
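
A sketch of how the proposed section could be consumed, assuming the config has already been loaded into a dict. `volumes_for_tool` is a hypothetical name, and the `tool_args` layout simply mirrors the proposal above.

```python
def volumes_for_tool(config, tool):
    """Combine the env-wide volumes with any tool-specific ones.

    Tools without an entry under `tool_args` get only the global volumes.
    """
    volumes = list(config.get("volumes", []))
    tool_cfg = config.get("tool_args", {}).get(tool, {})
    for volume in tool_cfg.get("volumes", []):
        if volume not in volumes:  # avoid mounting the same path twice
            volumes.append(volume)
    return volumes


if __name__ == "__main__":
    config = {
        "volumes": ["/tmp", "$HOME"],
        "tool_args": {"redis-server": {"volumes": ["/local/data:/data"]}},
    }
    print(volumes_for_tool(config, "redis-server"))
    print(volumes_for_tool(config, "bwa"))
```

Since the section lives in the bulker config rather than the manifest, each environment can point redis-server at a different local path without touching the crate.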
