pjbriggs / nebulizer
Command line utilities to help manage users, tools and data libraries in a Galaxy instance via the API
License: Other
It would be nice to be able to create and update Galaxy instance references and API keys in ~/.nebulizer more automatically. For example, could I do:
nebulizer add main http://usegalaxy.org/ [email protected]
and then have it prompt me for my password, retrieve my API key and add this to the .nebulizer file?
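A minimal sketch of the "add this to the .nebulizer file" step. It assumes the file holds one tab-delimited line per instance (alias, URL, API key); that layout and the function name store_credentials are assumptions for illustration, not Nebulizer's actual code. Prompting for the password and fetching the key would be separate steps.

```python
from pathlib import Path


def store_credentials(path, alias, url, api_key):
    """Add or update the entry for `alias` in a tab-delimited credentials file.

    Assumed layout (hypothetical): one "alias<TAB>url<TAB>api_key" per line.
    Returns True if an existing entry was replaced, False if a new one was added.
    """
    path = Path(path)
    lines = path.read_text().splitlines() if path.exists() else []
    new_line = "\t".join((alias, url, api_key))
    updated = False
    for i, line in enumerate(lines):
        # Leave comments and blank lines untouched
        if line.startswith("#") or not line.strip():
            continue
        if line.split("\t")[0] == alias:
            lines[i] = new_line
            updated = True
    if not updated:
        lines.append(new_line)
    path.write_text("\n".join(lines) + "\n")
    return updated
```

Since the file stores API keys, it would make sense to keep it readable only by its owner.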
Attempting to update iuc/trinity_analyze_diff_expr using Nebulizer 0.6.0 raises an uncaught JSONDecodeError exception which crashes Nebulizer (the operation still completes in Galaxy).
The problem occurs when the connection for the initial tool install attempt is broken. Presumably all that is required is an extension to the exception handling in the install_tool function:
...
trinity_analyze_diff_expr: requesting installation
WARNING:nebulizer.tools:Got error from Galaxy API on attempted install (ignored)
WARNING:nebulizer.tools:Status code: 504
Traceback (most recent call last):
File "/home/pjb/virtual-envs/nebulizer/bin/nebulizer", line 8, in <module>
sys.exit(nebulizer())
File "/home/pjb/virtual-envs/nebulizer/lib/python3.6/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/home/pjb/virtual-envs/nebulizer/lib/python3.6/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/home/pjb/virtual-envs/nebulizer/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/pjb/virtual-envs/nebulizer/lib/python3.6/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/pjb/virtual-envs/nebulizer/lib/python3.6/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/home/pjb/virtual-envs/nebulizer/lib/python3.6/site-packages/click/decorators.py", line 64, in new_func
return ctx.invoke(f, obj, *args[1:], **kwargs)
File "/home/pjb/virtual-envs/nebulizer/lib/python3.6/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/home/pjb/virtual-envs/nebulizer/lib/python3.6/site-packages/nebulizer/cli.py", line 887, in update_tool
(install_resolver_dependencies== 'yes')))
File "/home/pjb/virtual-envs/nebulizer/lib/python3.6/site-packages/nebulizer/tools.py", line 1322, in update_tool
no_wait=no_wait)
File "/home/pjb/virtual-envs/nebulizer/lib/python3.6/site-packages/nebulizer/tools.py", line 1189, in install_tool
json.loads(connection_error.body)["err_msg"])
File "/usr/lib/python3.6/json/__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.6/json/decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.6/json/decoder.py", line 357, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
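The traceback shows json.loads being applied to an error body that isn't JSON (a 504 from a proxy typically comes back as HTML or plain text). One possible way to extend the exception handling is sketched below; error_message_from_body is a hypothetical helper, not a function that exists in Nebulizer.

```python
import json


def error_message_from_body(body):
    """Return a readable message from an error response body.

    Tries to read "err_msg" from a JSON body, and falls back to the raw
    text when the body is empty, is not JSON, or has no "err_msg" key.
    """
    try:
        return json.loads(body)["err_msg"]
    except (TypeError, ValueError, KeyError):
        # json.JSONDecodeError is a subclass of ValueError
        if isinstance(body, str) and body.strip():
            return body.strip()
        return "<no error message in response>"
```

With something like this, the warning could still be logged and the install could carry on to the polling stage instead of crashing.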
Currently there are four utilities: nebulizer, manage_users, manage_tools and manage_libraries.
It would be more consistent with other packages such as planemo to provide a single utility nebulizer which offered all the functionality from the other utilities as subcommands, for example:
manage_users list -> nebulizer list_users
manage_users create -> nebulizer create_user
and so on.
The Bioblend galaxy.users.UserClient(gi).create_user_apikey(user_id) function can only be invoked by an admin user, so it won't work for fetching a new API key on Galaxy instances where the user is not an admin.
As an alternative it looks like the galaxyclient.GalaxyClient class (which GalaxyInstance subclasses) implements a key property which returns the API key for the current user.
Tool installation requires the full URL for the target toolshed, including the leading protocol (e.g. https).
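One way to relax this would be to add the protocol when it is missing. A rough sketch follows; the function name and the choice of https as the default are assumptions (a local toolshed may only be served over http, so the default might need to be configurable).

```python
def normalise_toolshed_url(toolshed, default_scheme="https"):
    """Return `toolshed` with a protocol prefix, adding one if it is missing."""
    toolshed = toolshed.strip()
    if "://" not in toolshed:
        # No protocol given: assume the default scheme
        toolshed = "%s://%s" % (default_scheme, toolshed)
    return toolshed
```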
For tool repositories with large numbers of dependencies, it appears that the proxy server can time out before Galaxy has finished processing the dependency tree and returned a response.
In these cases Nebulizer receives a 502 error from Bioblend, which is dumped to stdout; it would be nicer if this could be handled gracefully with a suitable message instead.
Currently manage_tools install... always handles repository and tool dependencies, by setting the install_tool_dependencies and install_repository_dependencies arguments of install_repository_revision to True (see http://bioblend.readthedocs.io/en/latest/api_docs/galaxy/all.html#bioblend.galaxy.toolshed.ToolShedClient.install_repository_revision).
However these settings could be exposed via the CLI, to allow the user to control how these are handled.
Proposal to drop support for Python 2.7, and only support Python 3.
Proposal to add an option --export to the list_users command which will output the user list in a tab-delimited format suitable for input into the create_users_from_file command.
(The main issue is that passwords can't be dumped from Galaxy?)
When using e.g. install_tool on Galaxy 17.05, tools which don't define any explicit toolshed dependencies don't have any associated conda dependencies installed.
For example:
nebulizer install_tool local toolshed.g2.bx.psu.edu devteam bowtie2 8ccbdbe9a695
If the tool is installed via Galaxy's admin interface then the conda dependencies are installed okay.
(Note btw that it's not obvious how to force installation of the missing dependencies afterwards.)
Request to add a ping command which could be used to test whether a Galaxy server is alive and responding to requests, e.g.
$ nebulizer ping main
PING main (https://usegalaxy.org)
64 bytes from main (https://usegalaxy.org): time = 0.049 ms
...
^C
Looking at the UNIX ping program the following options might be useful to implement:
-c COUNT: sends COUNT requests
-i INTERVAL: frequency to send requests
-W TIMEOUT: how long to wait for a response from the server
Add support for accessing Galaxy using the username and password rather than via the API key; this should be possible when creating the bioblend GalaxyInstance, see:
http://bioblend.readthedocs.org/en/latest/api_docs/galaxy/all.html#galaxyinstance
It might be useful to implement a whoami command to recover the user name from the API key.
The functionality might also be useful to check if the connection can be used for operations which require an associated user (e.g. data library uploads, see issue #25).
Proposal to extend the update_tool command to allow it to operate on multiple repositories in a single invocation.
This could be in one or more of the following forms:
nebulizer update_tool galaxy pjbriggs/trimmomatic devteam/fastqc
nebulizer update_tool galaxy iuc/trinity_*
nebulizer update_tool galaxy --all
or nebulizer update_tool galaxy */*
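A possible way to expand such patterns against the list of installed repositories, using glob-style matching from the standard library. This is only a sketch: select_repositories is a hypothetical helper, and the "owner/name" strings are assumed to have been built from whatever Galaxy reports as installed.

```python
from fnmatch import fnmatchcase


def select_repositories(installed, patterns):
    """Return the "owner/name" entries matching any of the glob patterns.

    The order of `installed` is preserved and each entry appears once.
    """
    selected = []
    for repo in installed:
        if any(fnmatchcase(repo, pattern) for pattern in patterns):
            selected.append(repo)
    return selected


installed = [
    "pjbriggs/trimmomatic",
    "devteam/fastqc",
    "iuc/trinity_align",
    "iuc/trinity_analyze_diff_expr",
]
print(select_repositories(installed, ["iuc/trinity_*"]))
```

Note that on the command line the patterns would need quoting (e.g. 'iuc/trinity_*') to stop the shell from expanding them first.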
The ping command will complain if the stored API key for the specified Galaxy instance isn't valid; however this is a bug, since an API key is not required for ping.
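For illustration, a key-free ping could be as simple as timing plain HTTP requests against the server. The sketch below uses only the standard library; the function signature is hypothetical, and which URL to request (for example the instance's base URL) is left to the caller.

```python
import time
import urllib.request


def ping(url, count=1, interval=1.0, timeout=5.0, fetch=None, sleep=time.sleep):
    """Send `count` unauthenticated requests to `url`.

    Returns a list of (ok, seconds) tuples, one per request. `fetch` and
    `sleep` can be replaced (e.g. in tests) to avoid real network access.
    """
    if fetch is None:
        def fetch(target, wait):
            # Plain GET with no API key attached
            with urllib.request.urlopen(target, timeout=wait) as response:
                response.read()
    results = []
    for i in range(count):
        start = time.monotonic()
        try:
            fetch(url, timeout)
            ok = True
        except OSError:
            # URLError and socket timeouts are both subclasses of OSError
            ok = False
        results.append((ok, time.monotonic() - start))
        if i < count - 1:
            sleep(interval)
    return results
```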
The --name option fails to return any matches when using a short "single character" wildcard pattern (e.g. p* or even just *).
Using two leading non-wildcard characters (e.g. pe*) seems to work as expected.
The delete_user command should include an option for the user email and name to be permanently removed or obfuscated, to enable compliance with data protection regulations.
The manage_tools installed ... --list-tools doesn't correctly associate the tools with their parent repos, for example:
* weeder2 toolshed.g2.bx.psu.edu pjbriggs 2:3c5f10f7dd40 Installed
- Weeder2 2.0.0 Motif discovery in sequences from coregulated genes of a single species
- Weeder2 2.0.1 Motif discovery in sequences from coregulated genes of a single species
^ weeder2 toolshed.g2.bx.psu.edu pjbriggs 1:571cb77ab9e7 Installed
- Weeder2 2.0.0 Motif discovery in sequences from coregulated genes of a single species
- Weeder2 2.0.1 Motif discovery in sequences from coregulated genes of a single species
total 2
Here the 2.0.1 version is actually from the first repo/version, and the 2.0.0 version is from the second - the tools should only be listed once, and with the correct repo/version.
It's not clear that the behaviour of install_tool is entirely consistent, so some standardisation would be useful:
--file option, however could be extended to allow install_tool to install multiple tools in a single invocation).
In the tools.install_tool function, tools with status New are not counted as currently installing, but tools can often wait in this state for some time while their dependencies are being installed.
To handle this, the function should be updated to also count New as an installing status.
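Something along these lines might do it. Apart from New and Installed, the status strings listed here are assumptions about what Galaxy reports and should be checked against the target Galaxy version; the names INSTALLING_STATUSES, is_installing and count_installing are hypothetical.

```python
# Assumed set of "still installing" statuses; "New" is included so that
# repositories waiting on their dependencies are counted too
INSTALLING_STATUSES = frozenset((
    "New",
    "Cloning",
    "Setting tool versions",
    "Installing repository dependencies",
    "Installing tool dependencies",
    "Loading proprietary datatypes",
))


def is_installing(status):
    """Return True if `status` indicates an installation still in progress."""
    return status in INSTALLING_STATUSES


def count_installing(statuses):
    """Count how many of the given status strings are installing statuses."""
    return sum(1 for status in statuses if is_installing(status))
```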
Implement some form of test framework that can be used with unittest, to allow testing of nebulizer components. Then use it to implement some tests.
What I imagine is something that will:
galaxy.ini/tool_shed.ini files suitable for testing with
Bonus points for:
Bioblend's ConnectionError class has two attributes:
status_code: the status code returned from the server (e.g. 502)
body: the data returned by the server (e.g. nginx "service not available")
(See http://bioblend.readthedocs.io/en/latest/_modules/bioblend.html#ConnectionError)
It would be useful in most cases to handle these errors more gracefully by checking the status code and taking action accordingly.
A list of HTTP status codes (with explanations) can be found at e.g. https://en.wikipedia.org/wiki/List_of_HTTP_status_codes
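As a rough idea of what "taking action accordingly" could look like, the sketch below turns a status code into a short message. It uses only the standard library; the grouping of codes and the wording of the hints are my own assumptions, and describe_connection_error is a hypothetical name.

```python
from http import HTTPStatus


def describe_connection_error(status_code):
    """Return a one-line description for an HTTP status code."""
    try:
        reason = HTTPStatus(status_code).phrase
    except ValueError:
        # Not a status code known to the standard library
        reason = "Unknown status"
    if status_code in (502, 503, 504):
        hint = "the server or proxy may be busy or have timed out"
    elif status_code in (401, 403):
        hint = "check the API key and the permissions of the associated user"
    elif status_code == 404:
        hint = "check the Galaxy URL"
    else:
        hint = "see the response body for details"
    return "%s %s: %s" % (status_code, reason, hint)
```

A caller would pass in the status_code attribute of the caught ConnectionError and log the result, rather than dumping the raw error.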
The ordering of tools printed by the manage_tools list command could be improved to group together multiple versions of the same tool/shed/owner combination.
Here's an example of what can happen at the moment:
pal_finder 0.02.04.1 Microsatellite Analysis toolshed.g2.bx.psu.edu/pjbriggs/pal_finder
pal_finder 0.02.04.3 Developmental Tools XXXXXXXXX/toolshed/pjbriggs/pal_finder
pal_finder 0.02.04.2 Microsatellite Analysis toolshed.g2.bx.psu.edu/pjbriggs/pal_finder
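The grouping could be done by sorting on the tool name, then the repository, then the version. A sketch, assuming each tool is available as a (name, version, section, repository) tuple as in the listing above; the function names are hypothetical.

```python
def version_key(version):
    """Split a dotted version into parts that sort numerically where possible."""
    parts = []
    for part in version.split("."):
        # Tag each part so integers and strings are never compared directly
        parts.append((0, int(part)) if part.isdigit() else (1, part))
    return tuple(parts)


def sort_tools(tools):
    """Sort (name, version, section, repository) tuples.

    Tools are ordered by name, then repository, then version, so multiple
    versions from the same tool/shed/owner combination end up adjacent.
    """
    return sorted(tools, key=lambda t: (t[0].lower(), t[3], version_key(t[1])))
```

With the three pal_finder entries above, the two toolshed.g2.bx.psu.edu versions would then be listed next to each other, in version order.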
bioblend has an uninstall_repository_revision method (on the ToolShedClient class) which could be used to implement tool repository deletion functionality.
Proposal to enable update_tool to also remove the previously installed version of the updated repository, if the user specifies an appropriate option.
Bioblend 0.16.0 is now available:
https://github.com/galaxyproject/bioblend/releases/tag/v0.16.0
Currently nebulizer is using the root logger for all its logging calls, but this should be switched to module-level loggers - see e.g. https://docs.python.org/2/howto/logging.html#logging-advanced-tutorial
This should be fairly straightforward to do - for each module add
logger = logging.getLogger(__name__)
then update method calls on logging to use logger instead, e.g.
logger.debug("Debug info")
It would be nice to give users utilities that allowed them to pull a dataset or set of datasets down from Galaxy in order to work on it using the command line/cluster/whatever, and also give them an easy way to upload files without using the web interface or FTP server.
The API has options for this which are exposed in bioblend:
Proposal to add a prompt in the install_tool command for the user to confirm that they want to proceed with the installation (similar to the prompt already implemented for the update_tool and delete_tool commands), with a -y option to override this on the command line.
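The logic is simple enough to sketch without reference to how the existing prompts are implemented (which I haven't reproduced here). The confirm function below is a hypothetical standalone helper; click also provides click.confirm, which could be used instead.

```python
def confirm(question, assume_yes=False, ask=input):
    """Ask a yes/no question and return True if the user agrees.

    `assume_yes` corresponds to a -y option: when set, no prompt is shown.
    `ask` defaults to the built-in input() and can be replaced in tests.
    """
    if assume_yes:
        return True
    answer = ask("%s [y/N]: " % question).strip().lower()
    return answer in ("y", "yes")
```

Defaulting to "no" on an empty answer seems the safer choice for an operation that can't easily be undone.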
The tool/repository management commands are a bit of a mixed bag at the moment; there are several different commands for slightly different situations:
list_tools: lists tools that appear in the tool panel (both native and toolshed-installed)
list_installed_tools: lists toolshed-installed repositories in user-friendly style
list_repositories: lists toolshed-installed repositories in a format that can be fed into install_repositories
install_tool: installs a single repository from a toolshed
install_repositories: installs all the repositories from a file produced by list_repositories
There is also update_tool which updates a single toolshed-installed repository to the latest version.
It is suggested that these could be consolidated into a smaller set of commands (inspired by conda's install and list commands), e.g.:
install_tool -> install_tool
install_repositories -> install_tool --file FILE
list_installed_tools -> list_tools
list_repositories -> list_tools --export
Possibly also drop the functionality of the current list_tools command?
(The names of the new commands might have to be different, to best reflect the terminology used within Galaxy itself.)
Currently if a non-existent path (library and folder combination) is supplied to the manage_libraries list command - for example:
manage_libraries list http://127.0.0.1:8080/ -k XXXX Data/missing-folder
then it reports Total 0, instead of indicating that the path doesn't exist.
Would be nice to have an interactive "shell" mode, e.g.
$ nebulizer localhost
list_users
...
which would preserve context when performing a number of related actions.
bioblend includes a set of methods for manipulating quotas in a Galaxy instance, see
https://bioblend.readthedocs.io/en/latest/api_docs/galaxy/all.html#module-bioblend.galaxy.quotas
Specifically it is possible to create, update and remove quotas.
It might be useful to add an option to the uninstall_tool command to remove older versions of tools, i.e. "pruning" rather than completely uninstalling all revisions. It should be possible to specify the maximum number of older revisions to keep (e.g. last three).
For example something like:
nebulizer uninstall_tool GALAXY pjbriggs/trimmomatic --prune --keep=3
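A sketch of how the revisions to prune might be chosen. It assumes revisions are given as "NUMBER:CHANGESET" strings (as in the tool listings), that a higher number means a newer revision, and that --keep counts the older revisions kept in addition to the newest one; all three are my reading of the proposal, and revisions_to_prune is a hypothetical name.

```python
def revisions_to_prune(revisions, keep=3):
    """Return the revisions to uninstall, newest first.

    The newest revision is always kept, along with the `keep` most recent
    older revisions; anything older than that is returned for removal.
    """
    if keep < 0:
        raise ValueError("keep must be zero or greater")
    # Sort on the numeric part before the colon, newest first
    ordered = sorted(revisions, key=lambda r: int(r.split(":")[0]), reverse=True)
    return ordered[keep + 1:]
```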
Update nebulizer to work with Bioblend 0.15.0, see https://github.com/galaxyproject/bioblend/releases
Proposal to update the list_users command to flag user accounts where the disk usage is negative, e.g.:
[email protected] a-user -1.1 GB 50.0 GB -2% active
Proposals for the list_users command:
15 GB)
When more extensive user information is displayed, it appears that the is_admin and percent_quota values that are shown are all the same.
Proposal to add functionality which would estimate the total disk usage for the Galaxy instance, along with the "usage commitment" (i.e. the total disk usage if all users were using all their allocated quotas).
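The arithmetic might look something like the sketch below, given each user's current usage and quota in bytes. How users without a quota should count towards the commitment isn't specified; here they are counted at their current usage, and negative usages are treated as zero. Both choices are assumptions, as is the function name.

```python
def summarise_disk_usage(users):
    """Summarise disk usage for an iterable of (usage, quota) pairs in bytes.

    `quota` is None for users without a quota. Returns a dict with the
    total usage, the usage commitment and the number of unlimited users.
    """
    total_usage = 0
    commitment = 0
    unlimited = 0
    for usage, quota in users:
        # Treat negative usage values as zero
        usage = max(usage, 0)
        total_usage += usage
        if quota is None:
            # No quota: count the user at their current usage (assumption)
            unlimited += 1
            commitment += usage
        else:
            commitment += quota
    return {"usage": total_usage, "commitment": commitment, "unlimited": unlimited}
```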
This issue covers two situations when attempting to install or update a tool:
nebulizer install_tool will do nothing)
nebulizer install_tool will still attempt to install it, and nebulizer update_tool will attempt to update to it.
In the first case, it would be useful to be able to tell Nebulizer to install the most recent installable version of a tool instead in the event that the requested version is not available.
In the second case, it would make sense to treat deprecated tools as uninstallable and follow the fallback procedure above (i.e. go to the most recent installable revision), or else enable an option to allow installation of the deprecated tools (i.e. a switch to say that deprecated tools are installable).
In the list_installed_tools command, the --updateable option currently includes repositories without any status (e.g. because no information could be fetched from the source toolshed for the tool).
For example you might see:
U tabular_to_fastq toolshed.g2.bx.psu.edu devteam 1:92034dcbb40a Installed
U varscan_version_2 toolshed.g2.bx.psu.edu devteam 1:44d514f3df8f Installed
weeder2 xxxxxxxxx.itservices.manchester.ac.uk/toolshed pjbriggs 1:51d666bf37b1 Installed
In this case weeder2 shouldn't be listed in the results.
bioblend provides methods to update and delete users:
https://bioblend.readthedocs.io/en/latest/api_docs/galaxy/all.html#module-bioblend.galaxy.users
It would also be useful to be able to remove user data via nebulizer (i.e. datasets and histories); there are methods provided for handling histories which might be used here:
https://bioblend.readthedocs.io/en/latest/api_docs/galaxy/all.html#module-bioblend.galaxy.histories
Previously it seems that attempting to upload datasets to a data library using the API will fail if the API key being used isn't associated with a real user account (the case when the master API key is being used to access Galaxy).
In these cases Nebulizer should refuse to attempt the upload operation.
NB it's not clear if this limitation extends to the creation of libraries or folders themselves.
When cancelling a tool installation request (e.g. by doing ^C on nebulizer install_tool...), it is possible that Galaxy has already begun actioning the request. In this case, although nebulizer has stopped, the tool installation will still be proceeding on the server, and there is no way to stop it through either the API or the Galaxy interface.
So: it would be useful if nebulizer could check and report the tool installation status on abort, so that the user at least is aware if the installation process is still running.
Changes in the latest version of the click package (7.x) mean that nebulizer subcommands which had underscores now automatically have these converted to dashes.
From https://click.palletsprojects.com/en/7.x/changelog/
Subcommands that are named by the function now automatically have the underscore replaced with
a dash. If you register a function named my_command it becomes my-command in the command
line interface.
Essentially this breaks nebulizer's CLI compared to the documentation.
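One option would be to register each command with an explicit name (click's command decorator accepts a name argument). Another would be to accept both spellings, which is sketched below independently of click; the helper names are hypothetical.

```python
def normalise_command_name(name):
    """Return `name` with dashes replaced by underscores."""
    return name.replace("-", "_")


def find_command(name, commands):
    """Look up `name` in `commands`, treating dashes and underscores alike.

    Returns the registered command name, or None if nothing matches.
    """
    wanted = normalise_command_name(name)
    for registered in commands:
        if normalise_command_name(registered) == wanted:
            return registered
    return None
```

That way list_users (as documented) and list-users (as registered under click 7.x) would both resolve to the same command.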
Bioblend exposes a number of API functions for manipulating groups in a Galaxy instance, which would be useful to add in nebulizer, see http://bioblend.readthedocs.io/en/latest/api_docs/galaxy/all.html#module-bioblend.galaxy.groups
Probably most useful for me would be:
(There are also functions for manipulating roles but I don't use these so much however they could be useful too.)
Update: additionally it would be good to have functionality for flagging up "empty" groups etc.