jupyterhub / batchspawner

Custom Spawner for JupyterHub to start servers on batch-scheduled systems

License: BSD 3-Clause "New" or "Revised" License

Topics: batch-scheduler, hpc, jupyter, jupyterhub, spawner, supercomputer

batchspawner's Introduction

Technical Overview | Installation | Configuration | Docker | Contributing | License | Help and Resources



With JupyterHub you can create a multi-user Hub that spawns, manages, and proxies multiple instances of the single-user Jupyter notebook server.

Project Jupyter created JupyterHub to support many users. The Hub can offer notebook servers to a class of students, a corporate data science workgroup, a scientific research project, or a high-performance computing group.

Technical overview

Three main actors make up JupyterHub:

  • multi-user Hub (tornado process)
  • configurable http proxy (node-http-proxy)
  • multiple single-user Jupyter notebook servers (Python/Jupyter/tornado)

Basic principles for operation are:

  • The Hub launches a proxy.
  • The proxy forwards all requests to the Hub by default.
  • The Hub handles login and spawns single-user servers on demand.
  • The Hub configures the proxy to forward URL prefixes to the single-user notebook servers.

JupyterHub also provides a REST API for administration of the Hub and its users.
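For example, listing the Hub's users through the REST API might look like the following Python sketch; the API address and token are placeholders for a real deployment:

import requests

# Placeholders: point at your Hub's API and use a real admin API token.
api_url = "http://localhost:8081/hub/api"
token = "<admin-api-token>"

# GET /users returns the users known to the Hub.
r = requests.get(f"{api_url}/users", headers={"Authorization": f"token {token}"})
r.raise_for_status()
for user in r.json():
    print(user["name"])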

Installation

Check prerequisites

  • A Linux/Unix based system

  • Python 3.8 or greater

  • nodejs/npm

    • If you are using conda, the nodejs and npm dependencies will be installed for you by conda.

    • If you are using pip, install a recent version (at least 12.0) of nodejs/npm.

  • If using the default PAM Authenticator, a pluggable authentication module (PAM).

  • TLS certificate and key for HTTPS communication

  • Domain name

Install packages

Using conda

To install JupyterHub along with its dependencies including nodejs/npm:

conda install -c conda-forge jupyterhub

If you plan to run notebook servers locally, install JupyterLab or Jupyter notebook:

conda install jupyterlab
conda install notebook

Using pip

JupyterHub can be installed with pip, and the proxy with npm:

npm install -g configurable-http-proxy
python3 -m pip install jupyterhub

If you plan to run notebook servers locally, you will need to install JupyterLab or Jupyter notebook:

python3 -m pip install --upgrade jupyterlab
python3 -m pip install --upgrade notebook

Run the Hub server

To start the Hub server, run the command:

jupyterhub

Visit http://localhost:8000 in your browser, and sign in with your system username and password.

Note: To allow multiple users to sign in to the server, you will need to run the jupyterhub command as a privileged user, such as root. The wiki describes how to run the server as a less privileged user, which requires more configuration of the system.

Configuration

The Getting Started section of the documentation explains the common steps in setting up JupyterHub.

The JupyterHub tutorial provides an in-depth video and sample configurations of JupyterHub.

Create a configuration file

To generate a default config file with settings and descriptions:

jupyterhub --generate-config

Start the Hub

To start the Hub on a specific URL and port, for example 10.0.1.2:443 with HTTPS:

jupyterhub --ip 10.0.1.2 --port 443 --ssl-key my_ssl.key --ssl-cert my_ssl.cert
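The same settings can live in jupyterhub_config.py; a sketch, with the address and certificate paths as placeholders:

c.JupyterHub.ip = '10.0.1.2'
c.JupyterHub.port = 443
c.JupyterHub.ssl_key = 'my_ssl.key'
c.JupyterHub.ssl_cert = 'my_ssl.cert'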

Authenticators

Authenticator          Description
PAMAuthenticator       Default, built-in authenticator
OAuthenticator         OAuth + JupyterHub Authenticator = OAuthenticator
ldapauthenticator      Simple LDAP Authenticator Plugin for JupyterHub
kerberosauthenticator  Kerberos Authenticator Plugin for JupyterHub
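A non-default authenticator is selected in jupyterhub_config.py. A sketch using the LDAP plugin, where the server address is a placeholder:

c.JupyterHub.authenticator_class = 'ldapauthenticator.LDAPAuthenticator'
c.LDAPAuthenticator.server_address = 'ldap.example.com'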

Spawners

Spawner              Description
LocalProcessSpawner  Default, built-in spawner; starts single-user servers as local processes
dockerspawner        Spawn single-user servers in Docker containers
kubespawner          Kubernetes spawner for JupyterHub
sudospawner          Spawn single-user servers without being root
systemdspawner       Spawn single-user notebook servers using systemd
batchspawner         Designed for clusters using batch scheduling software
yarnspawner          Spawn single-user notebook servers distributed on a Hadoop cluster
wrapspawner          WrapSpawner and ProfilesSpawner enabling runtime configuration of spawners
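Spawners are selected the same way. A sketch using this repository's SlurmSpawner, with illustrative resource requests (the same options appear in the issues below):

c.JupyterHub.spawner_class = 'batchspawner.SlurmSpawner'
c.SlurmSpawner.req_nprocs = '2'
c.SlurmSpawner.req_runtime = '12:00:00'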

Docker

A starter docker image for JupyterHub gives a baseline deployment of JupyterHub using Docker.

Important: This quay.io/jupyterhub/jupyterhub image contains only the Hub itself, with no configuration. In general, one needs to make a derivative image, with at least a jupyterhub_config.py setting up an Authenticator and/or a Spawner. To run the single-user servers, which may be on the same system as the Hub or not, Jupyter Notebook version 4 or greater must be installed.

The JupyterHub docker image can be started with the following command:

docker run -p 8000:8000 -d --name jupyterhub quay.io/jupyterhub/jupyterhub jupyterhub

This command will create a container named jupyterhub that you can stop and resume with docker stop/start.

The Hub service will be listening on all interfaces at port 8000, which makes this a good choice for testing JupyterHub on your desktop or laptop.

If you want to run Docker on a computer that has a public IP, you should (as in MUST) secure it with SSL, either by adding SSL options to your Docker configuration or by using an SSL-enabled proxy.

Mounting volumes lets you store data outside the Docker container (on the host system) so that it persists even when you start a new container.

The command docker exec -it jupyterhub bash will spawn a root shell in your docker container. You can use the root shell to create system users in the container. These accounts will be used for authentication in JupyterHub's default configuration.

Contributing

If you would like to contribute to the project, please read our contributor documentation and the CONTRIBUTING.md. The CONTRIBUTING.md file explains how to set up a development installation, how to run the test suite, and how to contribute to documentation.

For a high-level view of the vision and next directions of the project, see the JupyterHub community roadmap.

A note about platform support

JupyterHub is supported on Linux/Unix based systems.

JupyterHub officially does not support Windows. You may be able to use JupyterHub on Windows if you use a Spawner and Authenticator that work on Windows, but the JupyterHub defaults will not. Bugs reported on Windows will not be accepted, and the test suite will not run on Windows. Small patches that fix minor Windows compatibility issues (such as basic installation) may be accepted, however. For Windows-based systems, we would recommend running JupyterHub in a docker container or Linux VM.

Additional Reference: Tornado's documentation on Windows platform support

License

We use a shared copyright model that enables all contributors to maintain the copyright on their contributions.

All code is licensed under the terms of the revised BSD license.

Help and resources

We encourage you to ask questions and share ideas on the Jupyter community forum. You can also talk with us on our JupyterHub Gitter channel.

JupyterHub follows the Jupyter Community Guides.


Technical Overview | Installation | Configuration | Docker | Contributing | License | Help and Resources

batchspawner's People

Contributors

bollwyvl, carreau, chiroptical, cmd-ntrf, consideratio, dcbradley, ddemidov, deephorizons, dependabot[bot], gmfricke, hoeze, kinow, mark-tomich, mbmilligan, mcburton, minrk, mkgilbert, olifre, petraea, pre-commit-ci[bot], rcthomas, rkdarst, ryanlovett, t20100, willfurnass, willingc, yarikoptic, yfoo, yuvipanda, zonca


batchspawner's Issues

Make it easier to disable exporting of the environment with sudo

BatchSpawner fails on one of our Grid Engine clusters due to sudo being unable to export the environment (enabled by its -E argument). We only allow certain environment variables to be exported; from our sudoers:

Defaults:jupyter env_keep+="SGE_ROOT SGE_CELL SGE_EXECD_PORT SGE_QMASTER_PORT SGE_CLUSTER_NAME LANG JPY_API_TOKEN"
jupyter        host1, host2 = (ALL) NOPASSWD: /usr/local/sge/live/bin/lx-amd64/qsub
jupyter        host1, host2 = (ALL) NOPASSWD: /usr/local/sge/live/bin/lx-amd64/qstat
jupyter        host1, host2 = (ALL) NOPASSWD: /usr/local/sge/live/bin/lx-amd64/qdel

To work around this issue I've added the following to our jupyterhub_config.py:

# Need to tailor use of sudo to ShARC as exporting of entire environment         
# using sudo's -E argument is not permitted here                                 
sudo_prefix = 'sudo -u {username} '                                              
c.GridengineSpawner.batch_submit_cmd = sudo_prefix + 'qsub'                      
c.GridengineSpawner.batch_query_cmd = sudo_prefix + 'qstat -xml'                 
c.GridengineSpawner.batch_cancel_cmd = sudo_prefix + 'qdel {job_id}' 

but do others think it would be nicer to have a neater way of setting this, e.g. a can_sudo_export_env boolean trait?
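A hypothetical sketch of what such a trait could look like; only the name can_sudo_export_env comes from this issue, and the mixin around it is purely illustrative:

from traitlets import Bool
from traitlets.config import Configurable

class SudoPrefixMixin(Configurable):
    # Illustrative only; not part of batchspawner.
    can_sudo_export_env = Bool(True,
        help="If False, omit sudo's -E flag so the environment is not exported"
        ).tag(config=True)

    @property
    def sudo_prefix(self):
        # Prefix shared by batch_submit_cmd, batch_query_cmd and batch_cancel_cmd.
        flags = '-E ' if self.can_sudo_export_env else ''
        return 'sudo ' + flags + '-u {username} '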

Zero to JupyterHub, batchspawner style

There are many batch jobqueue systems out there that are currently considering using JupyterHub. Currently the advice given to them is to talk to groups that have successfully done it in the past. This may not be accessible to smaller centers or clusters that don't feel quite as comfortable reaching out. The Zero-to-JupyterHub documentation for Kubernetes has been a success in getting novice groups and users (like myself) to build productive systems. I wonder if we've learned enough about how BatchSpawner gets used in practice to provide best-practice guidelines.

cc @minrk @choldgraf

key_error runtime_dir

Hi,

I'm running jupyterhub on a cluster submit node dedicated for it and use the batchspawner plus profilespawner spinning up notebooks for students. I am running into some issues with jupyterhub 0.7.2 and batchspawner 02c0b40. We use local accounts. Here is my jupyterhub_config.py on the head node:

c.JupyterHub.hub_ip = '0.0.0.0'
c.JupyterHub.cookie_max_age_days = 8
c = get_config()
c.JupyterHub.spawner_class = 'batchspawner.SlurmSpawner'
c.Spawner.http_timeout = 120
c.SlurmSpawner.req_nprocs = '2'
c.SlurmSpawner.req_runtime = '12:00:00'
c.SlurmSpawner.batch_script = '''#!/bin/bash
#SBATCH --partition={partition}
#SBATCH --time={runtime}
#SBATCH --output={homedir}/jupyterhub_slurmspawner_%j.log
#SBATCH --job-name=spawner-jupyterhub
#SBATCH --workdir={homedir}
#SBATCH --mem={memory}
#SBATCH --export={keepvars}
#SBATCH --uid={username}
#SBATCH --get-user-env=L
#SBATCH {options}


module load courses/env

which jupyterhub-singleuser
{cmd}
'''
c.JupyterHub.spawner_class = 'wrapspawner.ProfilesSpawner'
c.Spawner.http_timeout = 120
c.ProfilesSpawner.profiles = [
   ( "Local server", 'local', 'jupyterhub.spawner.LocalProcessSpawner', {'ip':'0.0.0.0'} ),
   ('cluster Interactive GPU - 6 cores + Nvidia GTX 1080,  16 GB, 8 hours',
    'interactive-gpu',
    'batchspawner.SlurmSpawner',
      dict(req_nprocs='2', req_partition='gpu', req_runtime='8:00:00')
   )
   ]

c.Spawner.env_keep = ['PATH', 'LD_LIBRARY_PATH', 'PYTHONPATH', 'CONDA_ROOT', 'CONDA_DEFAULT_ENV', 'VIRTUAL_ENV', 'LANG', 'LC_ALL']

The problem is that I'm getting an error message in the Slurm job's log that I don't understand:

/home/courses/sw/apps/jupyterhub/0.7.2/bin/jupyterhub-singleuser
Traceback (most recent call last):
  File "/sw/apps/python/3.5.1/lib/python3.5/site-packages/traitlets/traitlets.py", line 528, in get
    value = obj._trait_values[self.name]
KeyError: 'runtime_dir'

During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/courses/sw/apps/jupyterhub/0.7.2/bin/jupyterhub-singleuser", line 6, in <module>
    main()
  File "/home/courses/sw/apps/jupyterhub/0.7.2/lib/python3.5/site-packages/jupyterhub/singleuser.py", line 322, in main
    return SingleUserNotebookApp.launch_instance(argv)
  File "/home/courses/sw/lib/python3.5/site-packages/jupyter_core/application.py", line 267, in launch_instance
    return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
  File "/sw/apps/python/3.5.1/lib/python3.5/site-packages/traitlets/config/application.py", line 657, in launch_instance
    app.initialize(argv)
  File "<decorator-gen-7>", line 2, in initialize
  File "/sw/apps/python/3.5.1/lib/python3.5/site-packages/traitlets/config/application.py", line 87, in catch_config_error
    return method(app, *args, **kwargs)
  File "/home/courses/sw/lib/python3.5/site-packages/notebook/notebookapp.py", line 1294, in initialize
    self.init_configurables()
  File "/home/courses/sw/lib/python3.5/site-packages/notebook/notebookapp.py", line 1033, in init_configurables
    connection_dir=self.runtime_dir,
  File "/sw/apps/python/3.5.1/lib/python3.5/site-packages/traitlets/traitlets.py", line 556, in __get__
    return self.get(obj, cls)
  File "/sw/apps/python/3.5.1/lib/python3.5/site-packages/traitlets/traitlets.py", line 535, in get
    value = self._validate(obj, dynamic_default())
  File "/home/courses/sw/lib/python3.5/site-packages/jupyter_core/application.py", line 99, in _runtime_dir_default
    ensure_dir_exists(rd, mode=0o700)
  File "/sw/apps/python/3.5.1/lib/python3.5/site-packages/ipython_genutils/path.py", line 167, in ensure_dir_exists
    os.makedirs(path, mode=mode)
  File "/sw/apps/python/3.5.1/lib/python3.5/os.py", line 231, in makedirs
    makedirs(head, mode, exist_ok)
  File "/sw/apps/python/3.5.1/lib/python3.5/os.py", line 241, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/run/user/8900'

I am not sure where runtime_dir is coming from. I checked in the code bases of jupyterhub, batchspawner and wrapspawner.

Any ideas would be appreciated.

PS. It would be nice if there could be a tested jupyterhub_config.py as part of this repo.
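The traceback shows jupyter_core failing to create its runtime directory under /run/user. A workaround that comes up in other issues here is to point the runtime directory somewhere writable from inside the batch script; a sketch against the batch_script template above, where the export line is the only addition:

c.SlurmSpawner.batch_script = '''#!/bin/bash
#SBATCH ...                       # same directives as above
export XDG_RUNTIME_DIR=""         # or: export JUPYTER_RUNTIME_DIR=$HOME
{cmd}
'''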

Why is the implementation of `make_preexec_fn` not different from that in `Spawner`?

The (concrete) implementation of make_preexec_fn in the BatchSpawnerBase class is, as far as I can tell, exactly the same as that of make_preexec_fn in the Spawner class, from which BatchSpawnerBase inherits.

Why is the method "overwritten" in BatchSpawnerBase if no concrete changes are made to the implementation? Is this for historical reasons (e.g. an earlier version of the Spawner superclass did not have that method)?

The docstrings between the two methods are different -- is that the reason for "overwriting" the version from the parent Spawner class?

example configurations

I would like to add the setup I use on Comet at SDSC as an example of interfacing with SLURM remotely via gsissh/ssh.

What is the best place to put them?

some ideas:

  • wiki
  • folder in the repository
  • standalone repository linked in the wiki
  • gist

Moreover, my setup is also integrated with an OAuth service via oauthenticator. Would it be confusing if I added this to the mix as well, or is it better to just share the batchspawner configuration?

how to include a "extra launch script"

Hi, we've previously been using the slurmspawner here: https://github.com/mkgilbert/slurmspawner . We are getting close to having our config converted, yet one thing remains. What is the equivalent of this:

c.SlurmSpawner.extra_launch_script = pjoin('/etc/jupyterhub/extra_launch_script')

Was thinking maybe this was:

c.BatchSpawnerBase.req_prologue = pjoin('/etc/jupyterhub/extra_launch_script')

But in the logs I see:

[W 2018-06-20 16:19:50.830 JupyterHub configurable:168] Config option req_prologue not recognized by SlurmSpawner. Did you mean one of: req_cluster, req_nprocs, req_runtime?

We use the extra launch script to prep the environment for the jupyter job.
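For reference, later batchspawner releases do recognize req_prologue, which is inserted into the batch script just before {cmd}. A sketch assuming such a version, reusing the poster's script path:

c.SlurmSpawner.req_prologue = 'source /etc/jupyterhub/extra_launch_script'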

Support for selecting Grid Engine parallel environments

The req_nslots trait could be more useful if the parallel environment can be selected. Think something like the following could be added to GridengineSpawner:

req_par_env = Unicode('', 
    help="Select Grid Engine Parallel Environment"
    ).tag(config=True) 

Job resource request validation step

Could be useful to quickly provide feedback in cases where a set of resource requests could never be satisfied, even with an empty cluster.

E.g. for Grid Engine:

  1. run `qsub -w v` for a dry-run validation of the resource requests, and if the return status is okay then
  2. run `qsub`

NB we've found that such validation mechanisms can sometimes return false negatives.

Message when not enough resources are available

We have had some problems when users tried to spawn a session and Slurm didn't have enough resources to allocate them.

Would it be possible to show a message stating that this is the case? Ideally, the user would receive the message and the job would not be kept waiting in the queue.

Do you think this makes sense to be included?

Permission denied when spawning job using slurm

Attempting to use the Slurm batch spawner results in the following error. I have tried unsetting XDG_RUNTIME_DIR, as some have suggested, but this doesn't appear to work. It is odd because the user that is executing the Slurm command via sudo has permissions on the directory listed below. This happens for all users.

Traceback (most recent call last):
  File "/var/web_services/galaxy/jupyter_conda/lib/python3.6/site-packages/traitlets/traitlets.py", line 528, in get
    value = obj._trait_values[self.name]
KeyError: 'runtime_dir'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/web_services/galaxy/jupyter_conda/bin/jupyterhub-singleuser", line 6, in <module>
    main()
  File "/var/web_services/galaxy/jupyter_conda/lib/python3.6/site-packages/jupyterhub/singleuser.py", line 455, in main
    return SingleUserNotebookApp.launch_instance(argv)
  File "/var/web_services/galaxy/jupyter_conda/lib/python3.6/site-packages/jupyter_core/application.py", line 266, in launch_instance
    return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
  File "/var/web_services/galaxy/jupyter_conda/lib/python3.6/site-packages/traitlets/config/application.py", line 657, in launch_instance
    app.initialize(argv)
  File "<decorator-gen-7>", line 2, in initialize
  File "/var/web_services/galaxy/jupyter_conda/lib/python3.6/site-packages/traitlets/config/application.py", line 87, in catch_config_error
    return method(app, *args, **kwargs)
  File "/var/web_services/galaxy/jupyter_conda/lib/python3.6/site-packages/notebook/notebookapp.py", line 1505, in initialize
    self.init_configurables()
  File "/var/web_services/galaxy/jupyter_conda/lib/python3.6/site-packages/notebook/notebookapp.py", line 1209, in init_configurables
    connection_dir=self.runtime_dir,
  File "/var/web_services/galaxy/jupyter_conda/lib/python3.6/site-packages/traitlets/traitlets.py", line 556, in __get__
    return self.get(obj, cls)
  File "/var/web_services/galaxy/jupyter_conda/lib/python3.6/site-packages/traitlets/traitlets.py", line 535, in get
    value = self._validate(obj, dynamic_default())
  File "/var/web_services/galaxy/jupyter_conda/lib/python3.6/site-packages/jupyter_core/application.py", line 99, in _runtime_dir_default
    ensure_dir_exists(rd, mode=0o700)
  File "/var/web_services/galaxy/jupyter_conda/lib/python3.6/site-packages/jupyter_core/utils/__init__.py", line 13, in ensure_dir_exists
    os.makedirs(path, mode=mode)
  File "/var/web_services/galaxy/jupyter_conda/lib/python3.6/os.py", line 210, in makedirs
    makedirs(head, mode, exist_ok)
  File "/var/web_services/galaxy/jupyter_conda/lib/python3.6/os.py", line 220, in makedirs
    mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/run/user/1261'

Batchspawner and PBS Pro 14.1.0

Hi,

Does anybody know if the TORQUE spawner is compatible with PBS Pro 14.1.0?
The basic syntax is pretty similar (if not identical).

Has anybody had success with it?

Thanks

Using sudo but preserving needed environment variables

Hello sir --

Hope I'm doing this correctly. Thanks for putting this together!

I'm building a proof of concept using your batchspawner to allow JupyterHub to hand off kernels and user sessions to compute nodes via our Univa Grid Engine 8.2.1 grid. I've got it working well, but I had to make the following edits to do so:

In your batchspawner/batchspawner.py file:

[root@system ~]# diff batchspawner.orig.py batchspawner.py
454c454
<     batch_submit_cmd = Unicode('sudo -E -u {username} qsub', config=True)
---
>     batch_submit_cmd = Unicode('sudo -i -E -u {username} qsub', config=True)
456,457c456,457
<     batch_query_cmd = Unicode('sudo -E -u {username} qstat -xml', config=True)
<     batch_cancel_cmd = Unicode('sudo -E -u {username} qdel {job_id}', config=True)
---
>     batch_query_cmd = Unicode('sudo -i -E -u {username} qstat -xml', config=True)
>     batch_cancel_cmd = Unicode('sudo -i -E -u {username} qdel {job_id}', config=True)

This is because all user sessions spun off via sudo require certain essential Grid Engine environment variables. For us, these are:

[user@grid-engine-system ~]$ env | grep SGE
SGE_ROOT=/gridware/uge
SGE_CELL=default
SGE_CLUSTER_NAME=grid-engine-system

(Univa now maintains Sun Grid Engine, so Univa Grid Engine still uses SGE_* everywhere for env variables and the like)

Passing the -i flag to sudo allows the standard user login scripts (either /etc/profile or /etc/csh.cshrc, depending on shell) to fire on job creation, which in turn each source the SGE config script necessary to define SGE variables (in /gridware/uge/default/common/settings.[c]sh).

I'm sure this is not required for all deployments, but it was necessary for ours. I just wanted to pass this on to you in the event that you have other users with similar configs.

Happy to answer questions or assist in testing. Thanks again for this awesome module!

EDIT: diff output and markdown syntax do not get along.
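The same change can also be made without patching batchspawner.py, by overriding the command traits in jupyterhub_config.py, following the pattern from the sudoers issue above:

sudo_prefix = 'sudo -i -E -u {username} '
c.GridengineSpawner.batch_submit_cmd = sudo_prefix + 'qsub'
c.GridengineSpawner.batch_query_cmd = sudo_prefix + 'qstat -xml'
c.GridengineSpawner.batch_cancel_cmd = sudo_prefix + 'qdel {job_id}'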

jupyterhub install needed on nodes?

Just a question, because this is not explicitly mentioned:

When using SlurmSpawner to run notebooks on the compute nodes of our cluster, does jupyterhub need to be installed on each compute node? The command jupyterhub-singleuser, which is run in all the example job scripts, seems to imply this.

If this is indeed the case, is there any way around this? The notebook is installed on the compute nodes, but currently I only have Python 2.7 there, and jupyterhub requires 3.x ...

How to terminate notebook job once a user logs out?

Hello,

I'm running into a problem where single-user notebooks stay on the compute nodes even after the user logs out, tying up resources. What is the proper way to clean these sessions up? Is reducing the walltime my only option?

Thanks!

Release on PyPI?

I think it would be convenient for some users.

We'll also need to release wrapspawner.

Contributing

Hey Guys,

First of all thank you for all your work on this project. This is awesome!

I'm extending this project to work with LSF (IBM's workload manager). I'm not familiar with travis-ci, so if you guys could give me a hand getting set up or show me what tool you use to run all the tests locally, that would be awesome.

Thanks!

Any way to not have public notebook servers?

Hello, I'm looking to use Jupyter on our cluster. My one concern is that of public notebooks. From looking at the code, it appears the hub submits a job to a node machine. This starts a notebook that must be accessible from the hub. The hub then gets the hostname and uses that as the IP to proxy the user to their server. This means that Jupyter binds to the public IP of the node machine.

In our case, our node machines are internet accessible, meaning that the notebook servers would be accessible outside the hub. This raises 2 concerns:

  1. Unauthorized access
  2. Encryption between hub and node machine

I know the new jupyter token authentication introduced in 4.3.0 solves 1, but there is still no encryption between the hub and the node machines.

Is there any way to not have public notebooks? Somehow have it bind to localhost and tunnel the connection to the hub? While not a deal breaker, this would be nice.

Specifying queue via config file

Hello --

A second, unrelated issue to my previous ticket. I have not been able to get batchspawner to respect the queue I've specified in jupyterhub_config.py. Here is what I've tried:

c.BatchSpawnerBase.req_queue = 'jupyterhub.q'

and also:

c.GridengineSpawner.batch_script = """#!/bin/bash
  # This wrapper script auto-generated by JupyterHub's BatchSpawner
  #$ -q jupyterhub.q
  #$ -j yes
  #$ -N spawner-jupyterhub
  #$ -v {keepvars}
  #$ {options}
  {cmd}
"""

I got around this by hardcoding the queue in batchspawner/batchspawner.py like so:

[root@wrds-cloud-dev-h batchspawner]# diff batchspawner.orig.py batchspawner.py
454c454
<     batch_submit_cmd = Unicode('sudo -E -u {username} qsub', config=True)
---
>     batch_submit_cmd = Unicode('sudo -i -E -u {username} qsub -q jupyterhub.q', config=True)

(ignore -i flag difference, addressed in my other issue, #9 )

The modification works, but neither config option does. How are you passing these parameters on to Grid Engine? Or am I making an egregious syntax error?

We are running Univa Grid Engine 8.2.1 on Redhat 6, with Jupyterhub 0.7.0-dev

Thanks!
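batchspawner fills req_* traits into the batch_script template by name (req_queue fills {queue}), so a configuration-only route is possible. A sketch, assuming the {queue} placeholder is added to the script:

c.GridengineSpawner.req_queue = 'jupyterhub.q'
c.GridengineSpawner.batch_script = """#!/bin/bash
#$ -q {queue}
#$ -j yes
#$ -N spawner-jupyterhub
#$ -v {keepvars}
#$ {options}
{cmd}
"""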

"spawner.ip" is not cleaned up when the SingleUserServer stops

Behavior

After a user starts and stops the server multiple times using the batchspawner, the IP and port of the prior spawner are fed into the new spawner's arguments. It is shown in the logs that:

[I 2018-04-29 12:58:18.960 JupyterHub batchspawner:186] Spawner submitted script:
...
jupyter-labhub --ip="<OLD_HOSTNAME>" --port=<OLD_PORT> --notebook-dir="~/" --NotebookApp.default_url="/lab" --disable-user-config
...

If the batch system (HTCondor in my case) doesn't schedule the job to the same host as the last one, the single-user server will crash because it is told to bind to the wrong hostname/IP.

Reason

The BatchSpawnerBase.stop() method doesn't clean up the Spawner object, which is NOT destroyed after the single-user server stops.

def stop(self, now=False):

Proposed solution

  • Clean up the spawner at the beginning of BatchSpawnerBase.start() or in the Spawner.pre_spawn_hook (sketched below).
  • Do not reuse the spawner object.
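A rough sketch of the first option; the mixin and method names are illustrative, not the committed fix:

class CleanStartSpawnerMixin:
    """Illustrative only: forget stale connection info before resubmitting."""

    def reset_connection_state(self):
        # Clear the previous job's bind host and port so a fresh
        # hostname/port pair is recorded for the new batch job.
        self.ip = ''
        self.port = 0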

Single-user notebooks are not shut down properly with jupyterhub-singleuser.

(The following is valid for SlurmSpawner, probably also for other spawners.)

Expected behaviour: the usual Jupyter notebook and LocalProcessSpawner send the notebook's kernels an interrupt signal, after which the shutdown_kernel method is called. Kernels have time to clean up resources, save notebooks, etc.

Actual behaviour: kernels are just immediately killed (scancel just brutally terminates the job, for example).

Why this is important (for me): I have written a KernelManager which launches Jupyter kernels as SLURM jobs in a configurable manner. It seems to work nicely in a standalone notebook and with LocalProcessSpawner, but when it is used with BatchSpawner, notebook jobs stay in the queue after jupyterhub-singleuser terminates, because shutdown_kernel is not called.

Additional notes: I have made an attempt to fix this in this branch, but it does not seem to work out. The complication is that I do not have a normal testing setup for SLURM+JupyterHub (I am not root on a cluster and have so far failed to get SLURM running locally).

Separate wrapspawner to other repository

It is currently very difficult for people to know about WrapSpawner and ProfileSpawner because they are a bit hidden inside this repo.

I know there is significant demand for ProfileSpawner; it is really a very general and useful tool.

What about separating it out to another repository?

Failed to connect to hub api

I am trying to setup a jupyterhub server in a slurm cluster.

It has the following nodes:

  • nodo00: the one that runs the jupyterhub instance, and the only one that is exposed to the internet

  • nodo01, nodo02 and nodo03 the actual computing nodes.

I have tried to set it up with the batchspawner but when I try to start a new server, I get the following error message in the web browser:

Failed to connect to Hub API at 'http://127.0.0.1:8081/hub/api'.  Is the Hub accessible at this URL (from host: nodo03)?  Make sure to set c.JupyterHub.hub_ip to an IP accessible to single-user servers if the servers are not on the same host as the Hub.

The log of jupyterhub says the following:

[I 2018-04-07 18:03:43.003 JupyterHub app:834] Loading cookie_secret from /root/jupyterhub_cookie_secret
[W 2018-04-07 18:03:43.017 JupyterHub app:955] No admin users, admin interface will be unavailable.
[W 2018-04-07 18:03:43.017 JupyterHub app:956] Add any administrative users to `c.Authenticator.admin_users` in config.
[I 2018-04-07 18:03:43.017 JupyterHub app:983] Not using whitelist. Any authenticated user will be allowed.
[I 2018-04-07 18:03:43.046 JupyterHub app:1528] Hub API listening on http://127.0.0.1:8081/hub/
[W 2018-04-07 18:03:43.047 JupyterHub proxy:415] 
    Generating CONFIGPROXY_AUTH_TOKEN. Restarting the Hub will require restarting the proxy.
    Set CONFIGPROXY_AUTH_TOKEN env or JupyterHub.proxy_auth_token config to avoid this message.
    
[I 2018-04-07 18:03:43.047 JupyterHub proxy:458] Starting proxy @ https://*:443/
18:03:43.151 - info: [ConfigProxy] Proxying https://*:443 to (no default)
18:03:43.153 - info: [ConfigProxy] Proxy API at http://127.0.0.1:8001/api/routes
18:03:43.222 - info: [ConfigProxy] 200 GET /api/routes 
[W 2018-04-07 18:03:43.222 JupyterHub proxy:304] Adding missing default route
[I 2018-04-07 18:03:43.222 JupyterHub proxy:370] Adding default route for Hub: / => http://127.0.0.1:8081
18:03:43.228 - info: [ConfigProxy] Adding route / -> http://127.0.0.1:8081
18:03:43.229 - info: [ConfigProxy] 201 POST /api/routes/ 
[I 2018-04-07 18:03:43.230 JupyterHub app:1581] JupyterHub is now running at https://:443/
[I 2018-04-07 18:04:02.063 JupyterHub log:122] 302 GET / → /hub (@::ffff:188.79.184.88) 1.82ms
[I 2018-04-07 18:04:02.304 JupyterHub log:122] 302 GET /hub → /hub/login (@::ffff:188.79.184.88) 0.54ms
[I 2018-04-07 18:04:02.527 JupyterHub log:122] 200 GET /hub/login (@::ffff:188.79.184.88) 24.44ms
[I 2018-04-07 18:04:11.561 JupyterHub base:346] User logged in: mmarco
[I 2018-04-07 18:04:11.562 JupyterHub log:122] 302 POST /hub/login?next= → /hub/ (@::ffff:188.79.184.88) 24.12ms
[I 2018-04-07 18:04:11.834 JupyterHub log:122] 302 GET /hub/ → /hub/home (mmarco@::ffff:188.79.184.88) 6.23ms
[I 2018-04-07 18:04:12.097 JupyterHub log:122] 200 GET /hub/home (mmarco@::ffff:188.79.184.88) 9.89ms
[I 2018-04-07 18:04:13.936 JupyterHub log:122] 200 GET /hub/spawn (mmarco@::ffff:188.79.184.88) 10.31ms
[I 2018-04-07 18:04:16.200 JupyterHub batchspawner:177] Spawner submitting job using sudo -E -u mmarco sbatch --parsable
[I 2018-04-07 18:04:16.200 JupyterHub batchspawner:178] Spawner submitted script:
    #!/bin/bash
    #SBATCH --partition=ALL
    #SBATCH --time=8:00:00
    #SBATCH --output=/home/mmarco/jupyterhub_slurmspawner_%j.log
    #SBATCH --job-name=spawner-jupyterhub
    #SBATCH --workdir=/home/mmarco
    #SBATCH --mem=4gb
    #SBATCH --export=PATH,JUPYTERHUB_BASE_URL,LANG,JUPYTERHUB_API_URL,SHELL,USER,JUPYTERHUB_OAUTH_CALLBACK_URL,JUPYTERHUB_CLIENT_ID,JUPYTERHUB_SERVICE_PREFIX,JUPYTERHUB_API_TOKEN,JPY_API_TOKEN,JUPYTERHUB_USER,HOME,JUPYTERHUB_HOST
    #SBATCH --uid=mmarco
    #SBATCH --get-user-env=L
    #SBATCH 
    
    which jupyterhub-singleuser
    jupyterhub-singleuser --ip="0.0.0.0" --port=48132
    
[I 2018-04-07 18:04:16.212 JupyterHub batchspawner:181] Job submitted. cmd: sudo -E -u mmarco sbatch --parsable output: 383
[I 2018-04-07 18:04:16.746 JupyterHub batchspawner:319] Notebook server job 383 started at nodo03:48132
[I 2018-04-07 18:04:17.423 JupyterHub base:447] User mmarco took 1.526 seconds to start
[I 2018-04-07 18:04:17.426 JupyterHub proxy:231] Adding user mmarco to proxy /user/mmarco/ => http://nodo03:48132
18:04:17.428 - info: [ConfigProxy] Adding route /user/mmarco -> http://nodo03:48132
18:04:17.429 - info: [ConfigProxy] 201 POST /api/routes/user/mmarco 
[I 2018-04-07 18:04:17.434 JupyterHub log:122] 302 POST /hub/spawn → /user/mmarco/ (mmarco@::ffff:188.79.184.88) 1541.85ms
[I 2018-04-07 18:04:18.302 JupyterHub log:122] 302 GET /hub/api/oauth2/authorize?redirect_uri=%2Fuser%2Fmmarco%2Foauth_callback&client_id=user-mmarco&response_type=code&state=eyJuZXh0X3VybCI6ICIvdXNlci9tbWFyY28vdHJlZT8iLCAidXVpZCI6ICJkZjcyOTQ1ZWI2MTM0YzVmODcwMmExZDJjNzI4ZjQ1YiJ9 → /user/mmarco/oauth_callback?code=dcb7d6a3-06ee-4223-a20f-cfa7d17c2226&state=eyJuZXh0X3VybCI6ICIvdXNlci9tbWFyY28vdHJlZT8iLCAidXVpZCI6ICJkZjcyOTQ1ZWI2MTM0YzVmODcwMmExZDJjNzI4ZjQ1YiJ9 (mmarco@::ffff:188.79.184.88) 81.81ms
[W 2018-04-07 18:05:16.770 JupyterHub base:517] User mmarco server stopped, with exit code: 1
[I 2018-04-07 18:05:16.771 JupyterHub proxy:254] Removing user mmarco from proxy (/user/mmarco/)
18:05:16.774 - info: [ConfigProxy] Removing route /user/mmarco
18:05:16.775 - info: [ConfigProxy] 204 DELETE /api/routes/user/mmarco 
18:08:43.236 - info: [ConfigProxy] 200 GET /api/routes 

Finally, the log file of the job says:

jupyterhub_slurmspawner_380.log  jupyterhub_slurmspawner_382.log  
jupyterhub_slurmspawner_381.log  jupyterhub_slurmspawner_383.log  
[root@nodo00 ~]# more /home/mmarco/jupyterhub_slurmspawner_383.log 
/bin/jupyterhub-singleuser
[I 2018-04-07 18:04:17.292 SingleUserNotebookApp extension:53] JupyterLab beta preview extension loaded from /usr/lib/python3.4/site-packages/jupyterlab
[I 2018-04-07 18:04:17.292 SingleUserNotebookApp extension:54] JupyterLab application directory is /usr/share/jupyter/lab
[I 2018-04-07 18:04:17.297 SingleUserNotebookApp singleuser:365] Starting jupyterhub-singleuser server version 0.8.1
[E 2018-04-07 18:04:17.300 SingleUserNotebookApp singleuser:354] Failed to connect to my Hub at http://127.0.0.1:8081/hub/api (attempt 1/5). Is it running?
    Traceback (most recent call last):
      File "/usr/lib/python3.4/site-packages/jupyterhub/singleuser.py", line 351, in check_hub_version
        resp = yield client.fetch(self.hub_api_url)
      File "/usr/lib64/python3.4/site-packages/tornado/gen.py", line 1099, in run
        value = future.result()
      File "/usr/lib64/python3.4/asyncio/futures.py", line 274, in result
        raise self._exception
    ConnectionRefusedError: [Errno 111] Connection refused
[I 2018-04-07 18:04:17.419 SingleUserNotebookApp log:122] 302 GET /user/mmarco/ → /user/mmarco/tree? (@192.168.2.10) 1.35ms
[I 2018-04-07 18:04:17.699 SingleUserNotebookApp log:122] 302 GET /user/mmarco/ → /user/mmarco/tree? (@::ffff:188.79.184.88) 1.03ms
[I 2018-04-07 18:04:17.959 SingleUserNotebookApp log:122] 302 GET /user/mmarco/tree? → /hub/api/oauth2/authorize?redirect_uri=%2Fuser%2Fmmarco%2Foauth_callback&client_id=user-mmarco&response_type=code&state=eyJuZXh0X3VybCI6ICIvdXNlci9tbWFyY28vdHJlZT8iLCAidXVpZCI6ICJkZjcyOTQ1ZWI2MTM0YzVmODcwMmExZDJjNzI4ZjQ1YiJ9 (@::ffff:188.79.184.88) 5.45ms
[E 2018-04-07 18:04:18.556 SingleUserNotebookApp auth:249] Error connecting to http://127.0.0.1:8081/hub/api: HTTPConnectionPool(host='127.0.0.1', port=8081): Max retries exceeded with url: /hub/api/oauth2/token (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x2b6e9f622908>: Failed to establish a new connection: [Errno 111] Connection refused',))
[W 2018-04-07 18:04:18.556 SingleUserNotebookApp web:1618] 500 GET /user/mmarco/oauth_callback?code=dcb7d6a3-06ee-4223-a20f-cfa7d17c2226&state=eyJuZXh0X3VybCI6ICIvdXNlci9tbWFyY28vdHJlZT8iLCAidXVpZCI6ICJkZjcyOTQ1ZWI2MTM0YzVmODcwMmExZDJjNzI4ZjQ1YiJ9 (::ffff:188.79.184.88): Failed to connect to Hub API at 'http://127.0.0.1:8081/hub/api'.  Is the Hub accessible at this URL (from host: nodo03)?  Make sure to set c.JupyterHub.hub_ip to an IP accessible to single-user servers if the servers are not on the same host as the Hub.
[E 2018-04-07 18:04:18.615 SingleUserNotebookApp log:114] {
      "Cookie": "user-mmarco-oauth-state=\"2|1:0|10:1523117057|23:user-mmarco-oauth-state|140:ZXlKdVpYaDBYM1Z5YkNJNklDSXZkWE5sY2k5dGJXRnlZMjh2ZEhKbFpUOGlMQ0FpZFhWcFpDSTZJQ0prWmpjeU9UUTFaV0kyTVRNMFl6Vm1PRGN3TW1FeFpESmpOekk0WmpRMVlpSjk=|8beec7a37a57e5139f5f8898ab06ef030ce43d131d77f56155be158495af89a3\"",
      "Dnt": "1",
      "X-Forwarded-Proto": "https",
      "Accept-Language": "es-ES,es;q=0.9",
      "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.148 Safari/537.36",
      "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
      "Referer": "https://gauss.unizar.es/hub/spawn",
      "Host": "gauss.unizar.es",
      "Accept-Encoding": "gzip, deflate, br",
      "Connection": "close",
      "X-Forwarded-Host": "gauss.unizar.es",
      "Cache-Control": "max-age=0",
      "Upgrade-Insecure-Requests": "1",
      "X-Forwarded-For": "::ffff:188.79.184.88",
      "X-Forwarded-Port": "443"
    }
[E 2018-04-07 18:04:18.615 SingleUserNotebookApp log:122] 500 GET /user/mmarco/oauth_callback?code=dcb7d6a3-06ee-4223-a20f-cfa7d17c2226&state=eyJuZXh0X3VybCI6ICIvdXNlci9tbWFyY28vdHJlZT8iLCAidXVpZCI6ICJkZjcyOTQ1ZWI2MTM0YzVmODcwMmExZDJjNzI4ZjQ1YiJ9 (@::ffff:188.79.184.88) 67.53ms
[E 2018-04-07 18:04:19.305 SingleUserNotebookApp singleuser:354] Failed to connect to my Hub at http://127.0.0.1:8081/hub/api (attempt 2/5). Is it running?
    Traceback (most recent call last):
      File "/usr/lib/python3.4/site-packages/jupyterhub/singleuser.py", line 351, in check_hub_version
        resp = yield client.fetch(self.hub_api_url)
      File "/usr/lib64/python3.4/site-packages/tornado/gen.py", line 1099, in run
        value = future.result()
      File "/usr/lib64/python3.4/asyncio/futures.py", line 274, in result
        raise self._exception
    ConnectionRefusedError: [Errno 111] Connection refused
[E 2018-04-07 18:04:23.312 SingleUserNotebookApp singleuser:354] Failed to connect to my Hub at http://127.0.0.1:8081/hub/api (attempt 3/5). Is it running?
    Traceback (most recent call last):
      File "/usr/lib/python3.4/site-packages/jupyterhub/singleuser.py", line 351, in check_hub_version
        resp = yield client.fetch(self.hub_api_url)
      File "/usr/lib64/python3.4/site-packages/tornado/gen.py", line 1099, in run
        value = future.result()
      File "/usr/lib64/python3.4/asyncio/futures.py", line 274, in result
        raise self._exception
    ConnectionRefusedError: [Errno 111] Connection refused
[E 2018-04-07 18:04:31.325 SingleUserNotebookApp singleuser:354] Failed to connect to my Hub at http://127.0.0.1:8081/hub/api (attempt 4/5). Is it running?
    Traceback (most recent call last):
      File "/usr/lib/python3.4/site-packages/jupyterhub/singleuser.py", line 351, in check_hub_version
        resp = yield client.fetch(self.hub_api_url)
      File "/usr/lib64/python3.4/site-packages/tornado/gen.py", line 1099, in run
        value = future.result()
      File "/usr/lib64/python3.4/asyncio/futures.py", line 274, in result
        raise self._exception
    ConnectionRefusedError: [Errno 111] Connection refused
[E 2018-04-07 18:04:47.345 SingleUserNotebookApp singleuser:354] Failed to connect to my Hub at http://127.0.0.1:8081/hub/api (attempt 5/5). Is it running?
    Traceback (most recent call last):
      File "/usr/lib/python3.4/site-packages/jupyterhub/singleuser.py", line 351, in check_hub_version
        resp = yield client.fetch(self.hub_api_url)
      File "/usr/lib64/python3.4/site-packages/tornado/gen.py", line 1099, in run
        value = future.result()
      File "/usr/lib64/python3.4/asyncio/futures.py", line 274, in result
        raise self._exception
    ConnectionRefusedError: [Errno 111] Connection refused

It seems that somehow there is no connection with the process that is running on the computing node. Any clue about how to proceed?
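The error text quoted above already names the likely fix: the single-user server on nodo03 is trying to reach the Hub API at 127.0.0.1, so the Hub must listen on an address the compute nodes can route to. A sketch for jupyterhub_config.py, assuming nodo00 is resolvable from the nodes:

# Bind the Hub API to an interface reachable from the compute nodes
# instead of the default 127.0.0.1.
c.JupyterHub.hub_ip = 'nodo00'   # or '0.0.0.0' to listen on all interfaces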

Create a new release

It's been a long while since we've had a batchspawner release. With the release of JupyterHub 0.8, we should create a more up-to-date release of batchspawner.

  • Review 3 pending PRs
  • Update changelog
  • Create beta release

I know folks are very busy. If you are willing and able to help review the open PRs and test the beta release, please +1 this issue (@minrk, @zonca, @mbmilligan, @willfurnass, @yuvipanda, and others that may be interested). Thanks.

cc @DeepHorizons

Add docs if/when needed

Congrats @mbmilligan on the move over here. If you wish to have docs on ReadTheDocs about batchspawner (now or in the future), please let me know. Also, great presentation today.

Overriding Notebook Server Command Line Arguments

Hi,

When batchspawner launches my slurm job to start a notebook server, I get this in my logs:

[I 2017-08-04 13:08:02.590 JupyterHub batchspawner:167] Spawner submitted script:
#!/bin/bash
#SBATCH --partition=redacted
#SBATCH --output=redacted
#SBATCH --job-name=my_jupyterhub
#SBATCH --workdir=redacted
#SBATCH --mem=8G
#SBATCH --export=SHELL,JPY_API_TOKEN,PATH,HOME,JUPYTERHUB_API_TOKEN,LANG,USER
#SBATCH --uid=redacted
#SBATCH --get-user-env=L
#SBATCH

JUPYTER_RUNTIME_DIR=$HOME jupyterhub-singleuser --user="redacted" --cookie-name="jupyter-hub-token-redacted" --base-url="/user/redacted" --hub-host="" --hub-prefix="/hub/" --hub-api-url="http://redacted:8081/hub/api" --ip="0.0.0.0" --port=58384

How can I override things like "--hub-api-url" and "--port"?

I've tried adding this to my jupyterhub_config.py:

c.Spawner.args = ['--hub-api-url="http://HOSTNAME:PORT/hub/api"', '--port=8888']

but this just appends the arguments to the end of the {cmd} instead of replacing those arguments.

Any ideas?

Thanks!

restricting port range for spawned jupyterhub singleuser servers

Hello,
I needed to set up JupyterHub on a cluster where, due to data security requirements and policy, incoming connections from the submission host to the compute resources in the private IP space have to be restricted to a limited number of predefined ports on which the compute nodes can be reached.

I did not seem to be able to control that feature within the existing code for the batchspawner.
I solved the problem with the following change in a fork of the repo, pontiggi@50428e8, in which I added an additional option, configurable in jupyterhub_config.py, to set the port range for the spawned single-user servers. Something like, for example:
c.BatchSpawnerBase.ports = '40000-41000'

I was wondering if there is any other way already existing in the original code base to achieve the same goal, with some option I might have missed.
Or, if there is no such option, whether you are interested in incorporating this in the upstream code. In which case I'll be happy to clean up the code as needed according to your guidelines and open a pull request.

Thanks for any advice or feedback.
Best
Francesco

jsonschema==3.0.0a1 is breaking tests

As you may have seen, lately all tests are failing on Travis. I finally looked inside and kept seeing something about jsonschema. I see there was a new release recently, so I tried downgrading to jsonschema==2.6.0 and it works. The fix is in #105.

This issue is opened separately because there is still some root problem that needs solving. Since the problem seems to occur when simply importing jsonschema, maybe we just wait for the next jsonschema release and remove the condition. They put an alpha on PyPI.

To find the error, look at any of our recently failing builds on travis. Example traceback:

Traceback (most recent call last):
  File "/home/travis/virtualenv/python3.6.3/lib/python3.6/site-packages/_pytest/config.py", line 327, in _getconftestmodules
    return self._path2confmods[path]
KeyError: local('/home/travis/build/jupyterhub/batchspawner/batchspawner/tests')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/travis/virtualenv/python3.6.3/lib/python3.6/site-packages/_pytest/config.py", line 358, in _importconftest
    return self._conftestpath2mod[conftestpath]
KeyError: local('/home/travis/build/jupyterhub/batchspawner/batchspawner/tests/conftest.py')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/travis/virtualenv/python3.6.3/lib/python3.6/site-packages/_pytest/config.py", line 364, in _importconftest
    mod = conftestpath.pyimport()
  File "/home/travis/virtualenv/python3.6.3/lib/python3.6/site-packages/py/_path/local.py", line 668, in pyimport
    __import__(modname)
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 656, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 626, in _load_backward_compatible
  File "/home/travis/virtualenv/python3.6.3/lib/python3.6/site-packages/_pytest/assertion/rewrite.py", line 212, in load_module
    py.builtin.exec_(co, mod.__dict__)
  File "/home/travis/build/jupyterhub/batchspawner/batchspawner/tests/conftest.py", line 3, in <module>
    from jupyterhub.tests.conftest import *
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 656, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 626, in _load_backward_compatible
  File "/home/travis/virtualenv/python3.6.3/lib/python3.6/site-packages/_pytest/assertion/rewrite.py", line 212, in load_module
    py.builtin.exec_(co, mod.__dict__)
  File "/home/travis/build/jupyterhub/batchspawner/jupyterhub/jupyterhub/tests/conftest.py", line 17, in <module>
    from . import mocking
  File "/home/travis/build/jupyterhub/batchspawner/jupyterhub/jupyterhub/tests/mocking.py", line 21, in <module>
    from ..singleuser import SingleUserNotebookApp
  File "/home/travis/build/jupyterhub/batchspawner/jupyterhub/jupyterhub/singleuser.py", line 34, in <module>
    from notebook.notebookapp import (
  File "/home/travis/virtualenv/python3.6.3/lib/python3.6/site-packages/notebook/notebookapp.py", line 82, in <module>
    from .services.contents.manager import ContentsManager
  File "/home/travis/virtualenv/python3.6.3/lib/python3.6/site-packages/notebook/services/contents/manager.py", line 17, in <module>
    from nbformat import sign, validate as validate_nb, ValidationError
  File "/home/travis/virtualenv/python3.6.3/lib/python3.6/site-packages/nbformat/__init__.py", line 33, in <module>
    from .validator import validate, ValidationError
  File "/home/travis/virtualenv/python3.6.3/lib/python3.6/site-packages/nbformat/validator.py", line 12, in <module>
    from jsonschema import ValidationError
  File "/home/travis/virtualenv/python3.6.3/lib/python3.6/site-packages/jsonschema/__init__.py", line 21, in <module>
    from jsonschema._types import TypeChecker
  File "/home/travis/virtualenv/python3.6.3/lib/python3.6/site-packages/jsonschema/_types.py", line 49, in <module>
    class TypeChecker(object):
  File "/home/travis/virtualenv/python3.6.3/lib/python3.6/site-packages/jsonschema/_types.py", line 64, in TypeChecker
    _type_checkers = attr.ib(default=pmap(), converter=pmap)
TypeError: attrib() got an unexpected keyword argument 'converter'
ERROR: could not load /home/travis/build/jupyterhub/batchspawner/batchspawner/tests/conftest.py

add account name parameter to base class

Suggested name: req_account

Would be usable with Torque via #PBS -A {account} and with LSF via #BSUB -P {account}; others probably have similar options.

Also noticed that LSF has a partition option too, so maybe req_partition should move into the base class. (A sketch of the suggested trait follows.)
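Following the req_par_env example from the Grid Engine issue above, a sketch of the suggested trait; illustrative only, not committed code:

from traitlets import Unicode

req_account = Unicode('',
    help="Account/project name to submit the job under (e.g. qsub -A, bsub -P)"
    ).tag(config=True)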

License file

Would be nice to have a license file. Is this BSD 3-Clause? Also would be nice to include it in a MANIFEST.in to make releasing easier. ( #20 )

Consider security of environment

JupyterHub makes a big deal about not allowing users to alter their own environment for their single-user server:
https://jupyterhub.readthedocs.io/en/latest/reference/websecurity.html

First, I was wondering if I was missing something, because altering their own single-user server would only serve HTML to themselves, so it should be safe, right? (Besides, users of the same cluster could be considered "semi-trusted", but we should at least be aware of the issues.)

But, assuming we need to be worried about this, we can evaluate the different spawners. For example, while setting up SlurmSpawner, I saw that by default, if --export=VAR,... is used, Slurm will simulate a login shell and get the user's clean environment (as in, what you might get on a fresh login) and then use that as the base, which directly allows people to set arbitrary env vars. Many spawners probably start a shell, which could evaluate config files. Basically, there are many possibilities here.

I had some mitigations:

  • Use full path of jupyterhub-singleuser and start Python with -E -s

We can't fix every spawner and every cluster's own environment. We can provide documentation for batchsystem maintainers to consider when writing their scripts. We could make some tests for others to know if they are affected.

What do you think?

Using Spawner options values for Slurm

Hello, I am testing JupyterHub with Slurm and having a problem using the values from the Spawner options page. I followed Andrea's instructions at https://github.com/jupyterhub/jupyterhub-deploy-hpc/tree/master/batchspawner-xsedeoauth-sshtunnel-sdsccomet
The Spawner options page and code seem ok to me, but when I check the log, the actual value in c.SlurmSpawner.batch_script in jupyterhub_config.py doesn't change. Could you help me with this? I appreciate all your help.

comet_spawner.py

from batchspawner import SlurmSpawner

class SpawnerOptions(SlurmSpawner):
    # ...

    def options_from_form(self, formdata):
        options = {}
        options['queue'] = formdata.get('queue', [''])[0].strip()
        options['runtime'] = formdata.get('runtime', [''])[0].strip()
        return options

jupyterhub_config.py

import os
import sys
from comet_spawner import CometSpawner

c.JupyterHub.spawner_class = CometSpawner
# ...
c.SlurmSpawner.req_queue = 'standard'
c.SlurmSpawner.req_runtime = '12:00:00'
c.SlurmSpawner.batch_script = '''#!/bin/bash
#SBATCH --partition={queue}
#SBATCH --time={runtime}
# ...

Implement testing

This project needs testing of some kind.

The best option would be CI with Travis-CI or similar.

Given that this project is inherently interested in communication among multiple nodes in a resource managed system, it has also been suggested to use vagrant or similar to create recipes for test VMs.

However, setting up resource managers is not trivial, and solving that problem for a test environment may consume more time than it's worth. It might be more productive to capture representative outputs from the targeted resource managers and test against scripts that use those outputs to simulate the target environments.
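A minimal sketch of that captured-output approach; the fixture strings and function names are illustrative, not batchspawner's actual test suite:

# Output captured once from a real scheduler, replayed in tests (pytest style).
CAPTURED_QSTAT_RUNNING = "job_state = R"
CAPTURED_QSTAT_PENDING = "job_state = Q"

def job_is_running(qstat_output):
    # Stand-in for a spawner's state_isrunning()-style check.
    return "job_state = R" in qstat_output

def test_detects_running_job():
    assert job_is_running(CAPTURED_QSTAT_RUNNING)

def test_detects_pending_job():
    assert not job_is_running(CAPTURED_QSTAT_PENDING)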

oauth_client_id error

I'm using batchspawner as described by @zonca, but I am having trouble making it work.
The first issue is with the JUPYTERHUB_API_TOKEN environment variable, which is not set. I solved it in the same way @zonca did in zonca@a7449e3 (as described in jupyterhub/jupyterhub#1081).

However, it is now failing with:

Traceback (most recent call last):
  File "/glade/work/ddvento/miniconda3/lib/python3.6/site-packages/traitlets/traitlets.py", line 528, in get
    value = obj._trait_values[self.name]
KeyError: 'oauth_client_id'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/glade/work/ddvento/miniconda3/bin/jupyterhub-singleuser", line 6, in <module>
    main()
  File "/glade/work/ddvento/miniconda3/lib/python3.6/site-packages/jupyterhub/singleuser.py", line 495, in main
    return SingleUserNotebookApp.launch_instance(argv)
  File "/glade/work/ddvento/miniconda3/lib/python3.6/site-packages/jupyter_core/application.py", line 266, in launch_instance
    return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
  File "/glade/work/ddvento/miniconda3/lib/python3.6/site-packages/traitlets/config/application.py", line 657, in launch_instance
    app.initialize(argv)
  File "/glade/work/ddvento/miniconda3/lib/python3.6/site-packages/jupyterhub/singleuser.py", line 402, in initialize
    return super().initialize(argv)
  File "<decorator-gen-7>", line 2, in initialize
  File "/glade/work/ddvento/miniconda3/lib/python3.6/site-packages/traitlets/config/application.py", line 87, in catch_config_error
    return method(app, *args, **kwargs)
  File "/glade/work/ddvento/miniconda3/lib/python3.6/site-packages/notebook/notebookapp.py", line 1602, in initialize
    self.init_webapp()
  File "/glade/work/ddvento/miniconda3/lib/python3.6/site-packages/jupyterhub/singleuser.py", line 433, in init_webapp
    self.init_hub_auth()
  File "/glade/work/ddvento/miniconda3/lib/python3.6/site-packages/jupyterhub/singleuser.py", line 428, in init_hub_auth
    if not self.hub_auth.oauth_client_id:
  File "/glade/work/ddvento/miniconda3/lib/python3.6/site-packages/traitlets/traitlets.py", line 556, in __get__
    return self.get(obj, cls)
  File "/glade/work/ddvento/miniconda3/lib/python3.6/site-packages/traitlets/traitlets.py", line 535, in get
    value = self._validate(obj, dynamic_default())
  File "/glade/work/ddvento/miniconda3/lib/python3.6/site-packages/traitlets/traitlets.py", line 593, in _validate
    value = self._cross_validate(obj, value)
  File "/glade/work/ddvento/miniconda3/lib/python3.6/site-packages/traitlets/traitlets.py", line 599, in _cross_validate
    value = obj._trait_validators[self.name](obj, proposal)
  File "/glade/work/ddvento/miniconda3/lib/python3.6/site-packages/traitlets/traitlets.py", line 907, in __call__
    return self.func(*args, **kwargs)
  File "/glade/work/ddvento/miniconda3/lib/python3.6/site-packages/jupyterhub/services/auth.py", line 484, in _ensure_not_empty
    raise ValueError("%s cannot be empty." % proposal.trait.name)
ValueError: oauth_client_id cannot be empty.

All Google knows about that error is either:

  • do not run jupyterhub-singleuser, run jupyterhub instead (doh!), which of course is not the case in this context
  • make sure the jupyterhub versions of the two systems connecting to each other are the same (which they are in my case; both are running v0.9.0)

Any ideas on what may be wrong here?

Failure and inconsistent state when job takes too long to start

I think that batchspawner may get into an inconsistent state if the .start() method never returns. I don't have a full analysis yet, but I'm pasting logs here and will try to interpret tomorrow. I've noticed something like this from months ago, when I first set up jupyterhub, too, but didn't go so far as to determine the cause and I'm not sure it's exactly the same thing.

My first hypothesis is that .start() times out on the jupyterhub side, leaving things in an inconsistent state. That would be this:

[E 2018-05-24 00:03:44.158 JupyterHub gen:940] Exception in Future <Task finished coro=<BaseHandler.spawn_single_user.<locals>.finish_user_spawn() done, defined at /share/apps/jupyterhub/live/jupyterhub/jupyterhub/handlers/base.py:617> exception=TimeoutError('Timeout',)> after timeout
Traceback (most recent call last):
  File "/share/apps/jupyterhub/live/miniconda/lib/python3.6/site-packages/tornado/gen.py", line 936, in error_callback
    future.result()
  File "/share/apps/jupyterhub/live/jupyterhub/jupyterhub/handlers/base.py", line 624, in finish_user_spawn
    await spawn_future
  File "/share/apps/jupyterhub/live/jupyterhub/jupyterhub/user.py", line 484, in spawn
    raise e
  File "/share/apps/jupyterhub/live/jupyterhub/jupyterhub/user.py", line 404, in spawn
    url = await gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
tornado.util.TimeoutError: Timeout
[E 2018-05-24 00:03:44.172 JupyterHub gen:940] Exception in Future <Task finished coro=<BaseHandler.spawn_single_user.<locals>.finish_user_spawn() done, defined at /share/apps/jupyterhub/live/jupyterhub/jupyterhub/handlers/base.py:617> exception=TimeoutError('Timeout',)> after timeout
Traceback (most recent call last):
  File "/share/apps/jupyterhub/live/miniconda/lib/python3.6/site-packages/tornado/gen.py", line 936, in error_callback
    future.result()
  File "/share/apps/jupyterhub/live/miniconda/lib/python3.6/site-packages/tornado/gen.py", line 936, in error_callback
    future.result()
  File "/share/apps/jupyterhub/live/jupyterhub/jupyterhub/handlers/base.py", line 624, in finish_user_spawn
    await spawn_future
  File "/share/apps/jupyterhub/live/jupyterhub/jupyterhub/user.py", line 484, in spawn
    raise e
  File "/share/apps/jupyterhub/live/jupyterhub/jupyterhub/user.py", line 404, in spawn
    url = await gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
tornado.util.TimeoutError: Timeout
tornado.util.TimeoutError: Timeout
[E 2018-05-24 00:03:44.173 JupyterHub gen:940] Exception in Future <Task finished coro=<BaseHandler.spawn_single_user.<locals>.finish_user_spawn() done, defined at /share/apps/jupyterhub/live/jupyterhub/jupyterhub/handlers/base.py:617> exception=TimeoutError('Timeout',)> after timeout
Traceback (most recent call last):
  File "/share/apps/jupyterhub/live/miniconda/lib/python3.6/site-packages/tornado/gen.py", line 936, in error_callback
    future.result()
  File "/share/apps/jupyterhub/live/miniconda/lib/python3.6/site-packages/tornado/gen.py", line 936, in error_callback
    future.result()
  File "/share/apps/jupyterhub/live/miniconda/lib/python3.6/site-packages/tornado/gen.py", line 936, in error_callback
    future.result()
  File "/share/apps/jupyterhub/live/jupyterhub/jupyterhub/handlers/base.py", line 624, in finish_user_spawn
    await spawn_future
  File "/share/apps/jupyterhub/live/jupyterhub/jupyterhub/user.py", line 484, in spawn
    raise e
  File "/share/apps/jupyterhub/live/jupyterhub/jupyterhub/user.py", line 404, in spawn
    url = await gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
tornado.util.TimeoutError: Timeout
[W 2018-05-24 00:03:44.175 JupyterHub users:439] Stream closed while handling /hub/api/users/username/server/progress
[W 2018-05-24 00:03:44.175 JupyterHub users:439] Stream closed while handling /hub/api/users/username/server/progress
[W 2018-05-24 00:03:44.175 JupyterHub users:439] Stream closed while handling /hub/api/users/username/server/progress
[I 2018-05-24 00:03:44.177 JupyterHub log:158] 200 GET /hub/api/users/username/server/progress (username@::ffff:127.0.0.1) 159088.94ms
[I 2018-05-24 00:03:44.178 JupyterHub log:158] 200 GET /hub/api/users/username/server/progress (username@::ffff:127.0.0.1) 125638.94ms
[I 2018-05-24 00:03:44.178 JupyterHub log:158] 200 GET /hub/api/users/username/server/progress (username@::ffff:127.0.0.1) 93877.60ms
[I 2018-05-24 00:03:44.180 JupyterHub log:158] 200 GET /hub/api/users/username/server/progress (username@::ffff:127.0.0.1) 28713.99ms
[W 2018-05-24 00:03:44.411 JupyterHub batchspawner:357] Job  neither pending nor running.
[E 2018-05-24 00:03:44.412 JupyterHub gen:940] Exception in Future <Future finished exception=AssertionError()> after timeout
    Traceback (most recent call last):
      File "/share/apps/jupyterhub/live/miniconda/lib/python3.6/site-packages/tornado/gen.py", line 936, in error_callback
        future.result()
      File "/share/apps/jupyterhub/live/batchspawner/batchspawner/batchspawner.py", line 358, in start
        assert self.state_ispending()
    AssertionError

Then, when you try to go to hub/user/username/ in the future, you get a 500 server error, which I presume is this:

[E 2018-05-24 00:07:39.081 JupyterHub base:939] Preventing implicit spawn for username because last spawn failed: Timeout
[E 2018-05-24 00:07:39.081 JupyterHub web:1621] Uncaught exception GET /hub/user/username/ (::ffff:127.0.0.1)
    HTTPServerRequest(protocol='http', host='xxx', method='GET', uri='/hub/user/username/', version='HTTP/1.1', remote_ip='::ffff:127.0.0.1')
    Traceback (most recent call last):
      File "/share/apps/jupyterhub/live/miniconda/lib/python3.6/site-packages/tornado/web.py", line 1543, in _execute
        result = yield result
      File "/share/apps/jupyterhub/live/jupyterhub/jupyterhub/handlers/base.py", line 941, in get
        raise copy.copy(exc).with_traceback(exc.__traceback__)
      File "/share/apps/jupyterhub/live/miniconda/lib/python3.6/site-packages/tornado/gen.py", line 936, in error_callback
        future.result()
      File "/share/apps/jupyterhub/live/miniconda/lib/python3.6/site-packages/tornado/gen.py", line 936, in error_callback
        future.result()
      File "/share/apps/jupyterhub/live/miniconda/lib/python3.6/site-packages/tornado/gen.py", line 936, in error_callback
        future.result()
      File "/share/apps/jupyterhub/live/jupyterhub/jupyterhub/handlers/base.py", line 624, in finish_user_spawn
        await spawn_future
      File "/share/apps/jupyterhub/live/jupyterhub/jupyterhub/user.py", line 484, in spawn
        raise e
      File "/share/apps/jupyterhub/live/jupyterhub/jupyterhub/user.py", line 404, in spawn
        url = await gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
    tornado.util.TimeoutError: Timeout
[E 2018-05-24 00:07:39.089 JupyterHub log:150] {
      "X-Forwarded-Proto": "http",
      "X-Forwarded-Port": "80",
      "Connection": "close",
      "X-Forwarded-Server": "xxx",
      "X-Forwarded-Host": "xxx",
      "X-Forwarded-For": "...,::ffff:127.0.0.1",
      "Upgrade-Insecure-Requests": "1",
      "Dnt": "1",
      "Cookie": "...",
      "Accept-Encoding": "gzip, deflate, br",
      "Accept-Language": "en-US,en;q=0.5",
      "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
      "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0",
      "Host": "xxx"
    }
[E 2018-05-24 00:07:39.089 JupyterHub log:158] 500 GET /hub/user/username/ (username@::ffff:127.0.0.1) 16.39ms

Note that /hub/spawn does work and I can spawn a server again.

I will try to understand more tomorrow...
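
For reference, the knob governing this timeout is the standard spawner trait below; raising it is only a mitigation sketch for slow batch queues, not a fix for the inconsistent state:

    # jupyterhub_config.py -- sketch; give slow batch queues more time
    # before the Hub declares the spawn dead (the value is illustrative)
    c.Spawner.start_timeout = 600  # seconds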

SlurmSpawner unable to parse job ID from text

Looking at the documentation and at the source code, it looks to me like batchspawner expects the output of the submission command to be just the job ID. Unfortunately, on the system I am working on it is a message such as Submitted batch job 481417, and batchspawner therefore fails with something like:

[E 2018-05-15 11:17:50.165 JupyterHub batchspawner:509] SlurmSpawner unable to parse job ID from text: Submitted batch job 481417

Am I missing something, or should I implement the awk'ing myself?
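
For the record, a minimal sketch of one way around this (not a shipped fix): subclass SlurmSpawner and relax its parse_job_id() hook to pull the trailing integer out of sbatch's "Submitted batch job NNNN" banner.

    import re

    from batchspawner import SlurmSpawner

    class VerboseSbatchSpawner(SlurmSpawner):
        def parse_job_id(self, output):
            # Accept either a bare job ID or the full sbatch banner.
            match = re.search(r'(\d+)\s*$', output)
            if match is not None:
                return match.group(1)
            return super().parse_job_id(output)  # fall back to the stock path

    # jupyterhub_config.py:
    # c.JupyterHub.spawner_class = VerboseSbatchSpawner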

use new traits api

The traitlets API has changed and we are using deprecated behavior. We need to update (a before/after sketch follows below):

  • use the .tag(config=True) method in place of the config=True kwarg
  • use the @default() decorator to mark trait default value functions
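
A minimal before/after sketch of both changes (the trait here is illustrative, not an actual batchspawner trait):

    from traitlets import Integer, default
    from traitlets.config import Configurable

    class ExampleSpawner(Configurable):
        # Old, deprecated style:
        #   req_nprocs = Integer(1, config=True)
        #   def _req_nprocs_default(self):
        #       return 1

        # New style: tag the trait as configurable...
        req_nprocs = Integer(help="Number of processors to request").tag(config=True)

        # ...and mark its dynamic default explicitly.
        @default('req_nprocs')
        def _req_nprocs_default(self):
            return 1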

LSF integration issue

Hi,

We have an LSF cluster that has been successfully tested with a Python bsub submission. JupyterHub / Anaconda installed on the same server works wonderfully with the out-of-the-box configuration.

I switched to batchspawner to test LSF, which comes as part of a SAS Grid installation.

To keep things simple, jupyterhub is running under my own account to avoid any spawning issues. I have replaced my username with myusername:
[I 2018-09-05 11:42:27.775 JupyterHub batchspawner:189] Spawner submitted script:
#!/bin/sh
#BSUB -R "select[type==any]" # Allow spawning on non-uniform hardware
#BSUB -R "span[hosts=1]" # Only spawn job on one server
#BSUB -q
#BSUB -J spawner-jupyterhub
#BSUB -o /home/myusername/.jupyterhub.lsf.out
#BSUB -e /home/myusername/.jupyterhub.lsf.err

    jupyterhub-singleuser --ip="0.0.0.0" --port=33546

[D 2018-09-05 11:42:27.780 JupyterHub base:427] 0/100 concurrent spawns
[D 2018-09-05 11:42:27.781 JupyterHub base:430] 0 active servers

...

[E 2018-09-05 11:27:08.847 JupyterHub user:427] Unhandled error starting myusername's server: /opt/sas94/thirdparty/platform/lsf/9.1/linux2.6-glibc2.3-x86_64/etc/eauth: read conf error!
Failed in an LSF library call: External authentication failed. Job not submitted.
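
One other thing that stands out in the pasted script: the bare #BSUB -q line suggests the queue trait was never set, which some bsub versions reject on their own, independent of the eauth failure. A hedged configuration sketch using the standard batchspawner trait (the queue name is a placeholder):

    # jupyterhub_config.py -- sketch; 'normal' is a placeholder queue name
    c.JupyterHub.spawner_class = 'batchspawner.LSFSpawner'
    c.LSFSpawner.req_queue = 'normal'  # fills the "#BSUB -q {queue}" line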

Number of CPUs does not work on Slurm

Trying to use this on a Slurm cluster, I always get 1 processor allocated, no matter what I put in the req_nprocs configuration setting.

It seems that the corresponding option is not included in the Slurm script template. I tried to edit it, to no avail. (A possible workaround is sketched below.)
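
A hedged workaround sketch: override the script template so the {nprocs} value actually reaches sbatch. The exact directive is an assumption here (--cpus-per-task; check your site's Slurm setup):

    # jupyterhub_config.py -- sketch; adds the CPU directive the stock
    # template of this era lacked
    c.SlurmSpawner.req_nprocs = '4'
    c.SlurmSpawner.batch_script = """#!/bin/bash
    #SBATCH --output={homedir}/jupyterhub_slurmspawner_%j.log
    #SBATCH --cpus-per-task={nprocs}
    {cmd}
    """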

Time for a new Batchspawner release

I think we have enough changes queued up to merit a new release in the near future.

Consider this space a placeholder - we will fill it in with a list of issues we want to resolve/PRs we want to pull for this release.

I will go through the list of these items and choose some, but in the meantime if folks want to nominate items for the release please leave them in the comments on this issue. Thanks!

@rkdarst @cmd-ntrf @willingc

Define resources/scheduler options using ipywidgets

If resources and scheduler options were defined using / linked to ipywidgets objects, then this set of widgets could be used to automate the generation of a Spawner.options_form.

This would give the user more flexibility when making resource requests than an enumerated list of option sets presented by wrapspawner.ProfileSpawner.

However, if the user has the ability to select values per resource independently, then there is a possibility that they will make an unsatisfiable request.

Allowing resources to be individually selected using widgets would therefore benefit from being paired with a dry-run job validation mechanism (suggested in Issue #36). A rough sketch of the form-generation idea follows below.

This is something I'd like to work on; suggestions on how best to implement this would be appreciated.
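
As a rough illustration of the direction (plain HTML here rather than real ipywidgets, since options_form is ultimately served as HTML; all field names and bounds are invented):

    # Sketch: derive Spawner.options_form from a declarative resource spec.
    RESOURCES = {
        'nprocs': {'label': 'CPUs',        'min': 1, 'max': 16, 'default': 1},
        'memory': {'label': 'Memory (GB)', 'min': 1, 'max': 64, 'default': 4},
    }

    def build_options_form(spec=RESOURCES):
        rows = []
        for name, opts in spec.items():
            rows.append(
                '<label for="{n}">{label}</label> '
                '<input type="number" name="{n}" min="{min}" max="{max}" '
                'value="{default}"/><br/>'.format(n=name, **opts)
            )
        return '\n'.join(rows)

    # jupyterhub_config.py:
    # c.Spawner.options_form = build_options_form()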

Issue timeout warning in UI

Maybe this doesn't belong in batchspawner, but I'll ask here anyway:

We're using batchspawner to have JupyterHub start notebook instances as SLURM jobs. Naturally, these have a limited runtime, and SLURM will simply terminate the process once the requested runtime is over. This can be confusing (and frustrating) for users.

Obviously, users need to be properly educated.

But maybe there could be a mechanism to warn users (via a browser notification) about an approaching batch job timeout? How should this be done?

One could implement it as an extension to Jupyter Notebook / JupyterLab, but then how would the extension know about the batch job? (One possible answer to that question is sketched below.)

Any help is greatly appreciated :)
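
On the "how would the extension know" question, one hedged possibility under SLURM: the job environment carries SLURM_JOB_ID, so a helper inside the notebook server could ask the scheduler for the expected end time. This is a sketch of the lookup, not an existing extension (format codes per squeue(1)):

    import os
    import subprocess

    def slurm_job_end_time():
        """Return the scheduler's expected end time for this job, or None."""
        job_id = os.environ.get('SLURM_JOB_ID')
        if not job_id:
            return None  # not running inside a SLURM job
        result = subprocess.run(
            ['squeue', '-h', '-j', job_id, '-o', '%e'],  # %e = expected end time
            capture_output=True, text=True,
        )
        return result.stdout.strip() or None

    # A notebook/lab front-end piece could then poll this value and raise
    # a browser notification as the deadline approaches.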

Command 'None' returned non-zero exit status 1

Hello everyone,

I'm having trouble figuring out what could be causing this issue, but hopefully this is something you have seen before, or you can point me towards some rabbit holes I can go jump in.

Right before the command is submitted I can see in the output Spawner submitting job using sudo -E -u igomez bsub, which means cmd is set to sudo -E -u igomez bsub. However, for some reason I'm getting the error subprocess.CalledProcessError: Command 'None' returned non-zero exit status 1.

What makes this particularly interesting is that this was working perfectly fine, and if I kill jupyterhub and launch it again it works. But eventually it gets stuck in this state and I can no longer spawn single-user notebooks.

My setup is:

  • OS: RHEL 7
  • jupyterhub --version 0.6.1
  • batchspawner forked at git ref: bec4cdf installed with pip -e
[I 2016-10-10 09:56:55.635 JupyterHub batchspawner:160] Spawner submitting job using sudo -E -u igomez bsub
[I 2016-10-10 09:56:55.636 JupyterHub batchspawner:161] Spawner submitted script:
    #!/bin/bash
    #BSUB -J Jupyterhub-Spawner
    #BSUB -q normal
    #BSUB -e /temp_test/jupyter-hub.err
    #BSUB -o /temp_test/jupyter-hub.out

    export PATH=/gpfs/grid/anaconda/python3/bin:$PATH
    jupyterhub-singleuser --user=igomez --port=57267 --cookie-name=jupyter-hub-token-igomez --base-url=/user/igomez --hub-host= --hub-prefix=/hub/ --hub-api-url=http://10.102.10.162:8081/hub/api --ip=0.0.0.0 --notebook-dir=~

[E 2016-10-10 09:56:55.655 JupyterHub user:237] Unhandled error starting igomez's server: Command 'None' returned non-zero exit status 1
[E 2016-10-10 09:56:55.669 JupyterHub web:1524] Uncaught exception POST /hub/login?next=%2Fhub%2Fuser%2Figomez (10.102.10.162)
    HTTPServerRequest(protocol='http', host='gridnode3:8000', method='POST', uri='/hub/login?next=%2Fhub%2Fuser%2Figomez', version='HTTP/1.1', remote_ip='10.102.10.162', headers={'Accept-Language': 'en-US,en;q=0.8', 'Content-Type': 'application/x-www-form-urlencoded', 'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36', 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8', 'Referer': 'http://gridnode3:8000/hub/login?next=%2Fhub%2Fuser%2Figomez', 'Origin': 'http://gridnode3:8000', 'Upgrade-Insecure-Requests': '1', 'Accept-Encoding': 'gzip, deflate', 'Connection': 'close', 'Host': 'gridnode3:8000', 'Content-Length': '36', 'Cache-Control': 'max-age=0'})
    Traceback (most recent call last):
      File "/gpfs/grid/anaconda/python3/lib/python3.5/site-packages/tornado/web.py", line 1445, in _execute
        result = yield result
      File "/gpfs/grid/anaconda/python3/lib/python3.5/site-packages/jupyterhub/handlers/login.py", line 79, in post
        yield self.spawn_single_user(user)
      File "/gpfs/grid/anaconda/python3/lib/python3.5/site-packages/jupyterhub/handlers/base.py", line 312, in spawn_single_user
        yield gen.with_timeout(timedelta(seconds=self.slow_spawn_timeout), f)
      File "/gpfs/grid/anaconda/python3/lib/python3.5/site-packages/jupyterhub/user.py", line 247, in spawn
        raise e
      File "/gpfs/grid/anaconda/python3/lib/python3.5/site-packages/jupyterhub/user.py", line 228, in spawn
        yield gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
      File "/gpfs/grid/anaconda/python3/batchspawner/batchspawner/batchspawner.py", line 271, in start
        job = yield self.submit_batch_script()
      File "/gpfs/grid/anaconda/python3/batchspawner/batchspawner/batchspawner.py", line 162, in submit_batch_script
        out = yield run_command(cmd, input=script, env=self.get_env())
      File "/gpfs/grid/anaconda/python3/batchspawner/batchspawner/batchspawner.py", line 52, in run_command
        err = yield proc.wait_for_exit()
    subprocess.CalledProcessError: Command 'None' returned non-zero exit status 1

Make it easier to fall back to scheduler defaults for resources

If the batch_template trait contains e.g.

#$ -l rmem={memory}

and we define a value for the req_memory trait, then our resource request is valid. However, what if we want to just use the default value decided by the scheduler, without having to explicitly probe the scheduler config to figure out what that value is (and then use it to set our BatchSpawner state)? If we just leave req_memory as the default empty string, then our scheduler will (or at least may) barf when trying to parse

#$ -l rmem=

What may be useful here is a templating mechanism where we can omit a line like the above if a condition is met, e.g. using f-strings (py 3.6 only :() or jinja2 templates. NB jinja2 is guaranteed to be installed as it's needed by JupyterHub itself. Sound reasonable? A sketch follows below.
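
A sketch of what the jinja2 variant could look like (assuming templates were rendered with jinja2 rather than str.format, which is the proposal here, not current batchspawner behavior):

    from jinja2 import Template

    # The rmem line disappears entirely when memory is unset or empty,
    # letting the scheduler fall back to its own default.
    BATCH_TEMPLATE = Template("""#!/bin/bash
    {% if memory %}#$ -l rmem={{ memory }}
    {% endif %}#$ -N spawner-jupyterhub
    {{ cmd }}
    """)

    print(BATCH_TEMPLATE.render(memory='', cmd='jupyterhub-singleuser'))
    # -> no "#$ -l rmem=" line is emitted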

Question: How to keep Bash Function in env?

Hello, I am using the c.Spawner.env_keep option to preserve a user's environment.
It works well; however, I am not sure how to preserve bash functions in the environment.

We have bash functions for the module command:

    BASH_FUNC_module()=() { eval $($LMOD_CMD bash "$@"); [ $? = 0 ] && eval $(${LMOD_SETTARG_CMD:-:} -s sh) }
    BASH_FUNC_ml()=() { eval $($LMOD_DIR/ml_cmd "$@") }

Without these bash functions, I get "command not found" errors.
I would appreciate it if you could guide me in resolving this issue.

Thank you for your work and support.
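
In case it helps, one hedged workaround is to skip exporting the functions altogether and instead re-source the Lmod init script at the top of the batch job via the standard req_prologue trait. The init path below is site-specific, and the spawner base-class key is just one way to set it:

    # jupyterhub_config.py -- sketch; adjust the Lmod profile path to your site
    c.BatchSpawnerBase.req_prologue = """
    source /etc/profile.d/lmod.sh  # redefines module()/ml() in the job shell
    """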
