globus / globus-cli
A command line interface to Globus
Home Page: https://docs.globus.org/cli
License: Apache License 2.0
The --help documentation for async-transfer has already started to run up against the limits of simple inline help.
The helpdoc is really long, and it's hard to document the --batch behavior in there.
I didn't expect to come up against this so soon, but I think we already need man pages.
Likely this will be Sphinx autodoc fed back into the CLI as a set of commands.
My personal tendency at present is to add --man as a global option that behaves much like the aws help subcommands.
This is a discussion that we've had on and off pretty much since day one.
When we originally started work on this project, I imitated the awscli's decision to use required options without giving much consideration to whether or not that's correct.
awscli is a good model for us to follow because (1) Amazon has many services and we are looking at supporting several services of our own, (2) Amazon employs at least as many Smart People as we do, probably more, and (3) we have experience using this CLI in administrative contexts and can vouch for the cohesion in its design.
However, this particular design decision may not sit well with our CLI, so we should evaluate it within our own context. I have my own bias, but will try to enumerate all of the arguments that I'm aware of.
Favoring Positional Arguments:
- a usage like async-transfer SourceID SourcePath DestID DestPath maintains readability
- precedent: git, the traditional UNIX toolchain (no other known modern tools)

I don't mean to denigrate that first point. It's a matter of good interactive "ergonomics", as we're fond of calling it, and it is important.
Favoring Required Options:
- aws --region ... and the region config opt
- precedent: awscli, knife-ec2, no other known tools

There is no CLI command for creating a shared endpoint.
I don't even quite know where this should go in the CLI command structure.
globus transfer endpoint share? Or globus transfer endpoint create --shared?
If it's a behavior of endpoint create, that makes argument handling there much more complex, but I think it might have better-feeling semantics once you figure out what command you want.
If there's a reasonable way to do it, we could pull the latest published version from pypi, otherwise we could go by GitHub releases.
Example:
jasons-mbp:src jtw$ vagrant version
Installed Version: 1.8.6
Latest Version: 1.8.6
You're running an up-to-date version of Vagrant!
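If we go the PyPI route, the check could be sketched like this. PyPI's JSON API endpoint is real; the function names and messaging are illustrative, not actual globus-cli code:

```python
import json
from urllib.request import urlopen

# PyPI's per-project JSON API; the project name here is just the obvious guess.
PYPI_URL = "https://pypi.org/pypi/globus-cli/json"

def parse_version(version):
    """Split a dotted version string like "1.8.6" into a comparable int tuple."""
    return tuple(int(part) for part in version.split("."))

def latest_pypi_version(url=PYPI_URL):
    """Fetch the latest published version string from PyPI's JSON API."""
    with urlopen(url) as resp:
        return json.load(resp)["info"]["version"]

def version_message(installed, latest):
    """Mimic the vagrant-style up-to-date messaging shown above."""
    if parse_version(installed) >= parse_version(latest):
        return "You're running an up-to-date version!"
    return "A newer version ({}) is available.".format(latest)
```

Comparing parsed tuples rather than raw strings matters here: "1.10.0" sorts before "1.9.9" as a string but after it as a version.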
The textual output from the task submission commands doesn't include the Task ID -- just the "message" field from the API.
This is a usability problem: immediately after submission, starting something like a globus transfer task wait requires doing a task listing first.
The JSON output includes all of the fields, so this isn't a problem for scripting, but rather for interactive use.
The text output format should probably be something like this:
$ globus transfer async-transfer ...
Message: The transfer has been accepted and a task has been created and queued for execution
Task ID: aaaaa-bbbbb-ccccc-ddddd
That's just for consistency with the various other parts of the CLI that output multi-field text info in a non-columnar format.
The alternative is to ditch that format and go with
$ globus transfer async-transfer ...
The transfer has been accepted and a task has been created and queued for execution
Task ID: aaaaa-bbbbb-ccccc-ddddd
To me, those two are indistinguishable for a user, so we should just go with the one that matches other existing output.
--heartbeat behavior is what is expected... if we are still directory scanning. Not sure if the REST API exposes that info (nr_directory_expansions_pending).
(... returns 1, existing CLI returns 0)

ACLs are secondary resources on endpoints (S3 and Shared), so they should be namespaced inside of globus transfer endpoint acl.
Karl suggested this, and I pushed back slightly, thinking they should be in globus transfer share acl.
When he informed me that S3 eps can have ACLs, that convinced me.
No one else seems to have a strong opinion.
details -t to show a task's subtasks.

I think this might just be something that we hook into GLOBUS_CLI_DEBUGMODE.
We should suppress warnings during normal operations -- they're chatty and confusing for end users.
However, being able to see warnings is a useful feature for developers working on the project, so being able to expose them seems good.
The globus login command writes values to the config file, but doesn't set a umask. On a lot of systems, we'll be protected by ordinary home-directory permissions, but the file will be world-readable.
Because the file contains refresh tokens, it should start with perms of 0600.
There are a number of commands which I took shortcuts on before I had complete and working tooling for table output.
To get things working without fiddling with output too much, I used the JSON output for both output formats.
In some cases, this may ultimately be the most sensible approach (probably not).
Regardless, we should list all of these as items to address before considering the CLI "done".
server add
and server delete
are two management commands for physical endpoints supported by the current CLI, but missing here.
Additionally, we should have server show
, for completeness, even though the old CLI doesn't include it.
Additionally, promote server-list
from a command into a command group server
with a subcommand list
, a sibling of these other new commands.
Provide a way for a user to see what identity-context they're operating under. Something like globus whoami
that shows the primary/effective identity that the current set of tokens was issued to.
See also #38 for more "CLI session lifecycle" material.
Right now, all of execution is wrapped in a handler for broken pipes designed to handle the possibility that python IO is being truncated by a consumer like head.
This is somewhat dangerous, and it would be better to wrap the IO sites with this handling.
This can be a context manager, but a special-purpose globus_cli.safeio.write(..., file=sys.stdout) might be desirable.
Needs a bit more thought, and some investigation of click.echo -- maybe that does what we want, and just needs to be wrapped in our own package so that we can slot something there later if we replace Click.
Current behavior shows all files by default. Desired behavior is to not show dotfiles by default but provide a flag (e.g. -a).
Allowing sub-second polling may be desirable -- a strong case can be made for writing a script which does
# start transfer
transfer_out="$(globus transfer async-transfer --format JSON ...)"
# parse & extract task ID
task_id="$(echo "$transfer_out" | jq ...)"
# wait for task to complete
echo "waiting on task $task_id"
globus transfer task wait --task-id "$task_id" --polling-interval 0.1 --heartbeat
# do more stuff...
...
I don't think that forcing someone running this kind of script to wait up to 1 second for a task to complete is entirely fair. If they're running many small tasks in succession, that cost will escalate to significant delays.
However, there is really no case to justify
globus transfer task wait --task-id "$task_id" --polling-interval 0.001 --heartbeat
If you make a call like this, you've grossly misunderstood the performance characteristics of the CLI and are unnecessarily hammering the API.
I think that making 0.1 the minimum allowed would be reasonable -- it means that going into hundredths-digit precision is a clear-cut sign that you're probably doing it wrong, without being too limiting.
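That floor could be enforced as a simple input check. This is a sketch with illustrative names, not real globus-cli code; in a Click-based CLI the same check could live in the option's type or callback instead:

```python
MIN_POLLING_INTERVAL = 0.1  # the proposed floor

def validate_polling_interval(value):
    """Reject sub-0.1s polling intervals at parse time (sketch)."""
    interval = float(value)
    if interval < MIN_POLLING_INTERVAL:
        raise ValueError(
            "--polling-interval must be at least {}".format(MIN_POLLING_INTERVAL))
    return interval
```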
The driving use case is this:
A user with a large number of shares wants to change the local user associated with those shares.
In order to do so, the user needs to create new shares under the new local user and then migrate a large number of ACLs from the old shares to the new ones.
This changes the share IDs, but their attributes can be replicated with a little bit of scripting, so search-based lookups will continue to work for users of the shares.
The key to making this use case work smoothly is making globus transfer endpoint acl add-rule (or a new, similar command) capable of consuming the JSON output of globus transfer endpoint acl list.
I think we can change the behavior of add-rule
based on a flag, so we get this:
$ oldshare="aabb"
$ newshare="xxyy" # not 48, XXYY Syndrome, just a fake ID
$ globus transfer endpoint acl list --endpoint-id "$oldshare" --format JSON | \
globus transfer endpoint acl add-rule --from-list --endpoint-id "$newshare"
This is highly desirable for a very small number of users.
Potentially offering a solution purely through the SDK is more desirable, but this seems like a clear-cut case for "added value" on top of core SDK features. We shouldn't pollute the SDK with rarefied usage like this.
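A sketch of what --from-list consumption might do internally, assuming the standard Transfer access-document fields (principal_type, principal, permissions, path) and the "DATA" container key shown in the JSON examples elsewhere in this document:

```python
import json

def rules_from_acl_list(acl_list_json):
    """Turn `acl list --format JSON` output into add-rule payloads (sketch).

    Each item in the list's "DATA" array is copied into a fresh access
    document suitable for POSTing against the new share, dropping the old
    rule IDs so the service assigns new ones.
    """
    doc = json.loads(acl_list_json)
    payloads = []
    for rule in doc["DATA"]:
        payloads.append({
            "DATA_TYPE": "access",
            "principal_type": rule["principal_type"],
            "principal": rule["principal"],
            "permissions": rule["permissions"],
            "path": rule["path"],
        })
    return payloads
```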
The case we're concerned about is the pathological case of hundreds of nested directories expanding to huge numbers of files (millions or billions).
Per notes from Karl, we'd need to switch traversal to be breadth-first and unbuffered output to handle that case at all -- makes perfect sense, though unbuffered JSON output will require careful handling and some testing to ensure that the results are valid JSON.
To resolve, we have three paths forward (there may be more?):
- remove recursive ls from the CLI
- limit recursive ls to some "small" but reasonable number -- e.g. 50
- add a recursive option to the SDK ls

This is only worth discussing as an SDK addition (and not purely as a CLI feature) because we've made it very easy to build your own version of recursive ls.
Unless we make it easier to use the "correct" version than to build your own, we haven't actually solved the problem.
Removing it from the CLI is viable but may be worse than adding a recursive=True option to the SDK ls -- I just don't trust that people won't go building this themselves. It therefore seems a poor way of protecting the service.
Given the choice between a hard limit and a sleep, the sleep has much better semantics.
It doesn't mess with error modes at all -- an SDK LimitExceeded error will be confusing to people, and is likely to break scripts unexpectedly as their inputs scale up.
The sleep behavior also sets the tone correctly for a future universe in which that kind of rate limiting is enforced service side in addition to being done client-side.
There's already been some discussion of these, but I'm moving it into issues to better track status and decisions.
I'd appreciate input from @karlito-go and @ranantha on this one.
This was a feature request I got from Steve -- he seemed to feel pretty strongly about it being a necessary intro point.
My instinct is that the semantics should be somewhere between what knife configure and git config offer.
So, globus config init (knife-like) should be prompt-oriented and allow users to enter info line by line.
For example,
$ globus config init
Please enter your auth token: XXXXX
Please enter your transfer token: XXXXX
...
$ globus config init --auth-token XXXXX
Please enter your transfer token: XXXXX
However, it might also be useful to have git-like setter/getter functionality. Maybe something more like this:
$ globus config show general.auth_token # echoes the value
...
$ globus config add general.auth_token XXXXX
Of course, anything without a dot should be treated as part of the general config section, so these are equivalent to
$ globus config show auth_token
...
$ globus config add auth_token XXXXX
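The dotted-key rule described above can be sketched as a small helper (the name is illustrative):

```python
def split_config_key(key, default_section="general"):
    """Resolve a config key into (section, option).

    "general.auth_token" -> ("general", "auth_token"); a bare "auth_token"
    falls back to the general section, matching the equivalence above.
    """
    section, sep, option = key.partition(".")
    if not sep:
        return default_section, key
    return section, option
```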
and of course we'd want globus config remove.
Questions I still have:
- A --system flag, like git-config, to operate on /etc/globus.cfg?
- A globus config edit which does $EDITOR ~/.globus.cfg?
- Should the config command know that certain values (like tokens) are secret, and show redacted versions of them?
- An --overwrite flag?

@bd4 I'm concerned that this needs direct access to the SDK's GlobusConfigParser. It's a very unusual internal use-case, and I don't think we want to support it for external groups.
I'm happy to from globus_sdk.config import GlobusConfigParser, but I'll need a new function attached to flush its state to a file (the internal SafeConfigParser should support this for us), and I'll need to update the GlobusConfigParser to support reading only one file. Otherwise, it would load system and default config, and flush all of that back out when it's done, which would be bad.
There are some alternatives:
- A 0.3.0 of the SDK in which the config parser insists that there is a [general] section heading, like a typical ConfigParser? Then this would not need the specialized GlobusConfigParser at all. That breaks compatibility with existing config, which I don't like. We'd need clear messaging for anyone who has started using it.

It can be based loosely on the SDK guide: https://github.com/globus/globus-sdk-python/blob/master/CONTRIBUTING.md
However, it should have a different emphasis, particularly for reporting bugs that may look arcane to non-pythonistas.
globus transfer async-transfer and globus transfer async-delete currently hide away the submission ID generation. That means that the CLI doesn't support reliable, safe-to-retry task submission.
To resolve, we should expose this behavior with a new command and a couple of flags.
Namely, we want globus transfer task generate-submission-id to produce the submission ID in raw text, and globus transfer async-transfer --submission-id to consume it.
The same option should exist for async-delete, of course.
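The reason submission IDs make retries safe can be sketched like this. Here submit is a stand-in for the actual API call, and uuid4 stands in for generate-submission-id; the service side deduplicates on the ID, so re-sending after a network failure cannot create a duplicate task:

```python
import uuid

def submit_with_retry(submit, max_attempts=3):
    """Retry a task submission safely by reusing one submission ID (sketch)."""
    submission_id = str(uuid.uuid4())  # stand-in for generate-submission-id
    last_err = None
    for _ in range(max_attempts):
        try:
            return submit(submission_id)
        except ConnectionError as err:
            last_err = err  # safe to retry: same submission_id is re-sent
    raise last_err
```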
Per globus/globus-sdk-python#51, we're trying to get off of the dependency on six because OSX makes things difficult.
This is largely a matter of supporting the naive developers who don't realize that there's a strong reason to use virtualenv.
At present, we're using:
- six.string_types (easy to sub with some version dispatch)
- six.reraise

The reraise functionality is hard to imitate, so we'll need to take another look at where and how that's happening. I don't think we can do the current exception handling logic without a reraise, but maybe we don't need to preserve context (I think that's just paranoia about how click might behave).
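For reference, a Python 3 stand-in for six.reraise looks roughly like this (a sketch following the general shape of six's own implementation; the Python 2 side needs the three-argument raise statement, which is a syntax error on Python 3, hence the need for a shim at all):

```python
import sys

def reraise(exc_type, exc_value, traceback):
    """Re-raise an exception with an explicit traceback (Python 3 flavor)."""
    if exc_value is None:
        exc_value = exc_type()
    if exc_value.__traceback__ is not traceback:
        raise exc_value.with_traceback(traceback)
    raise exc_value
```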
Right now, the error output is not prescriptive, and just notes that an endpoint needs to be activated.
Example from Steve:
$ globus transfer ls --endpoint-id 9d6d994a-6d04-11e5-ba46-22000b92c6ec
Globus CLI Error: A Transfer API Error Occurred.
HTTP status: 400
request_id: 9sHmaySoX
code: ClientError.ActivationRequired
message: The endpoint '9d6d994a-6d04-11e5-ba46-22000b92c6ec' is not activated
This is actually the ls call returning the ActivationRequired error, not a processed activation requirements doc.
What we really need to do is this: autoactivate any endpoint that we see, and capture any activation requirements document that we get back.
If the response has a success code, proceed as usual.
However, if we get NotActivated, we need to parse and present the ActivationRequirements.
I think in that case, we should exit with status 1 -- the command failed, after all.
As for the error presentation, it depends a bit on the required activation type.
For web activation, we just need to show the activation link, that's pretty easy.
For CLI activation, we need some text that clarifies that we mean the hosted CLI.
Possibly something like this:
$ globus transfer ls --endpoint-id 9d6d994a-6d04-11e5-ba46-22000b92c6ec
Globus CLI Error: A Globus Endpoint Requires Activation
code: ClientError.ActivationRequired
message: The endpoint '9d6d994a-6d04-11e5-ba46-22000b92c6ec' is not activated
To activate 9d6d994a-6d04-11e5-ba46-22000b92c6ec , you must use the Hosted Globus CLI.
Follow this guide to connect <link>
Then run
endpoint-activate 9d6d994a-6d04-11e5-ba46-22000b92c6ec
Note that I omit some of the HTTP detail (it's a success there, after all), but still want the ClientError.ActivationRequired code noted.
This may have some unforeseen impact on #21, so we need to think about that.
Ideally, the SDK would handle parsing the activation requirements document a bit, and we could inspect an activation_type property on it to see CLI vs. Web activation.
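A sketch of dispatching on that hypothetical activation_type property; the message text follows the mockup above and is illustrative, not final:

```python
def activation_error_message(endpoint_id, activation_type, web_link=None):
    """Build the error text for an unactivated endpoint (sketch).

    activation_type is the hypothetical SDK property discussed above,
    assumed here to be "web" or "cli".
    """
    lines = [
        "Globus CLI Error: A Globus Endpoint Requires Activation",
        "code:    ClientError.ActivationRequired",
        "message: The endpoint '{}' is not activated".format(endpoint_id),
        "",
    ]
    if activation_type == "web":
        lines.append("To activate {}, visit: {}".format(endpoint_id, web_link))
    else:
        lines.append(
            "To activate {}, you must use the Hosted Globus CLI.".format(endpoint_id))
        lines.append("Then run: endpoint-activate {}".format(endpoint_id))
    return "\n".join(lines)
```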
For some reason, an early version of the CLI used sys.excepthook
to do global error handling.
The goal was to hide any exceptions which occur from the end user, and present a clean "minimal" error message that wouldn't be too scary.
This is all well and good, but it should just be a try-except block in the main command method.
Any exceptions thrown and handled by parsing are presented nicely already anyway.
This also has implications for #17.
When I started using GLOBUS_CLI_DEBUGMODE, I didn't have the HiddenOption class written yet, and there was no CommandState location to store that value anyway.
This should be fairly easy to add to the common_options option set.
globus transfer async-delete should have a --batch mode, just like globus transfer async-transfer.
An option on ls mimicking ls -R behavior would be helpful. Maximum-depth options and a global policy are probably prudent. In JSON output mode, we'd need behavior such that the path (at least relative to the root path given on the command line) is displayed.
subscription_id, network_use, location, and disable-verify.
Karl suggested this a while back, and I've had it in my queue for a while.
I definitely want this before we consider the CLI done, but I'm not sure when we'll get to it.
Several commands have a set of non-200 HTTP responses that might still be considered part of normal operations. For example, a command like globus transfer endpoint delete ... could be considered successful on a 200 or a 404.
It may be desirable for us to hand that information -- the 404 response vs. a 403 or other error -- to the caller, but doing so with some fixed mapping is burdensome for scripting (you need to know our statuses table), and not very extensible if we omit codes that we start using later.
The generic solution is to just let the invoking script specify what it would like us to translate various HTTP statuses into.
That lets them get meaningful status codes, avoids a magical lookup table, and makes those scripts more readable because they specify what the various error codes that they're dispatching on mean / come from.
So, the basic form would be something like this:
globus transfer endpoint delete --map-http-status "404=54" ...
and we can even allow crazy things like this check for nonexistence:
globus transfer endpoint show --map-http-status "404=0" --map-http-status "200=1"
In the original version of this proposal the idea was to restrict the caller to using the range from 50 to 99 (reserving other exit codes for our own purposes into the future).
However, I think there's a case for allowing clients to say "allow 404s" by mapping to 0.
Whether or not we allow mapping to 1 is a little bit more tenuous, but I think still acceptable, since it just means "generic error".
There's also a question of whether or not it should be
globus transfer endpoint delete --map-http-status "404=54" --map-http-status "403=53"
or
globus transfer endpoint delete --map-http-status "404=54,403=53"
That second is a little bit nicer, but also a little harder to implement.
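A sketch of parsing both forms, with the proposed exit-code restrictions (0, 1, and the reserved 50-99 range); names are illustrative:

```python
def parse_status_map(values):
    """Parse repeated --map-http-status values into {http_status: exit_code}.

    Supports both "404=54" and the comma-delimited "404=54,403=53" form.
    Exit codes outside 0, 1, and 50-99 are rejected, per the proposal above.
    """
    mapping = {}
    for value in values:
        for pair in value.split(","):
            status_str, _, exit_str = pair.partition("=")
            status, exit_code = int(status_str), int(exit_str)
            if exit_code not in (0, 1) and not 50 <= exit_code <= 99:
                raise ValueError(
                    "exit code {} not in 0, 1, or 50-99".format(exit_code))
            mapping[status] = exit_code
    return mapping
```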
The partial commands:
globus transfer task
globus transfer
globus
all give:
AttributeError: 'Namespace' object has no attribute 'fun'
Instead of having the globus transfer ls command default to /~/, don't send any path to the Transfer service and let the Transfer service figure out what the default path is supposed to be.
This fixes S3 endpoints (which don't support /~/), shared endpoints (which should usually default to /), and any kind of endpoint that has had a default_directory configured.
Dan mentioned this to me.
Specifically, trying to autoactivate an activated endpoint which does not support autoactivate is confusing:
$ globus transfer endpoint autoactivate --endpoint-id danielpowers#prod01
{
"code": "AutoActivationFailed",
"resource": "/endpoint/danielpowers%23prod01/autoactivate",
...
"message": "The endpoint could not be auto activated, fill in the returned activation_requirements and POST them back to /activate to perform manual activation.",
...
"oauth_server": null,
"subject": null
}
Really, the autoactivate call in this case is unhelpful / unimportant, as the activation requirements document only applies if the expires_in field you get back is unacceptable.
Autoactivate calls on autoactivate-capable endpoints work fine and have acceptable (but noisy) output.
Additionally, this call is generally unnecessary because we automatically attempt to autoactivate any endpoint which we see, so trying an ls or similar operation will result in an implicit and less scary-looking autoactivation.
If we keep autoactivate as an explicit operation, we need to figure out better (textmode) output for it.
If we make it only magically autoactivate, this will feed into #30 as well.
An --all option that will cancel all ACTIVE and INACTIVE tasks.

@bd4, @corpulentcoffee: I'm particularly interested in your opinions on this, as we've had a number of discussions about the CLI design/layout.
Thoughts and opinions from everyone and anyone welcome, of course.
There's a potential feature that I'd like to think about and ask questions about.
For now, I'm calling it --grep because I think it's the most intuitive name for it, but I'm not necessarily thinking of doing full regex matching.
--grep is probably a bad name for this if I add it, since it promises too much, but I don't have a good name yet. Maybe --cli-filter?
Right now, a lot of commands have grep-friendly text output.
I'm going to use bookmark list for a bunch of examples because it has very simple output.
$ globus transfer bookmark list -Ftext
Name | Endpoint ID | Bookmark ID | Path
-------------------------------- | ------------------------------------ | ------------------------------------ | ----
Crummy Bookmark | d1763b75-6d04-11e5-ba46-22000b92c6ec | 6c0a7d14-f796-11e5-a6f9-22000bf2d559 | /abc/123/
EP1 GoData | ddb59aef-6d04-11e5-ba46-22000b92c6ec | ec207bf4-eb91-11e5-9829-22000b9da45e | /share/godata/
Test Bookmark Creation | d1763b75-6d04-11e5-ba46-22000b92c6ec | 46e4d1b8-f78e-11e5-a6f9-22000bf2d559 | /abc/123/
In the case of commands like this, it's natural to want to be able to filter on columns in some way. My beloved awk can do great things here with | as its delimiter, and plain grep without column awareness is pretty useful too.
$ globus transfer bookmark list -Ftext | grep 'ec207bf4-eb91-11e5-9829-22000b9da45e'
EP1 GoData | ddb59aef-6d04-11e5-ba46-22000b92c6ec | ec207bf4-eb91-11e5-9829-22000b9da45e | /share/godata/
is a decent example of what we'll see people doing.
This really doesn't work well on the JSON output, however, for obvious reasons:
$ globus transfer bookmark list -Fjson | grep 'ec207bf4-eb91-11e5-9829-22000b9da45e'
"id": "ec207bf4-eb91-11e5-9829-22000b9da45e"
Now, what if that filtering were done inside of the CLI code itself, so instead of globus ... | grep <expr>, we had globus ... --grep <expr>?
We could apply that expression to text and JSON output uniformly -- quick and easy filtering for your results.
$ globus transfer bookmark list -Ftext --grep 'EP1'
EP1 GoData | ddb59aef-6d04-11e5-ba46-22000b92c6ec | ec207bf4-eb91-11e5-9829-22000b9da45e | /share/godata/
and importantly
$ globus transfer bookmark list -Fjson --grep 'EP1'
{
"DATA": [
{
"name": "EP1 GoData",
"path": "/share/godata/",
"endpoint_id": "ddb59aef-6d04-11e5-ba46-22000b92c6ec",
"DATA_TYPE": "bookmark",
"id": "ec207bf4-eb91-11e5-9829-22000b9da45e"
}
]
}
Bookmark list is not a very compelling case for this because it's small.
Endpoint Search obviously has searching and filtering options available in the API.
The three really strong cases for this feature, I think, are Task List, Event List, and ACL List.
Lots of people will want to do something like this:
$ globus transfer task list | grep 'Globus Tutorial Endpoint 1'
c44056b6-f075-11e5-9833-22000b9da45e | SUCCEEDED | DELETE | Globus Tutorial Endpoint 1 | None | globus-cli delete
b8a53d98-ed29-11e5-982b-22000b9da45e | SUCCEEDED | TRANSFER | Globus Tutorial Endpoint 1 | Globus Tutorial Endpoint 1 | None
or this
$ globus transfer acl list --endpoint-id d1763b75-6d04-11e5-ba46-22000b92c6ec | grep 'identity'
02ed6f8c-d2aa-11e5-9759-22000b9da45e | identity | c8aad43e-d274-11e5-bf98-8b02896cf782 | rw | /otherusers/globusteam/
02ed6f85-d2aa-11e5-9759-22000b9da45e | identity | c8aad43e-d274-11e5-bf98-8b02896cf782 | r | /
01bbeadd-d2aa-11e5-9759-22000b9da45e | identity | ae2f7f60-d274-11e5-b879-afc598dd59d4 | rw | /otherusers/ballen/
None | identity | c8aad43e-d274-11e5-bf98-8b02896cf782 | rw | /
The easy way to start on this is a raw substring match on the SDK's str(GlobusResponse).
Then we can maybe do regex, maybe match only on fields inside of a JSON structure, and maybe apply only to visible columns in text output.
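The cheap starting point could be sketched like this: filter the parsed response before output formatting, so text and JSON rendering see the same filtered data. The "DATA" container key follows the JSON examples above, and substring matching on each serialized item stands in for matching on str(GlobusResponse):

```python
import json

def grep_response(doc, expr):
    """Keep items in the response's "DATA" array whose serialized form
    contains `expr` as a raw substring (sketch of the proposed --grep)."""
    filtered = [item for item in doc.get("DATA", [])
                if expr in json.dumps(item)]
    return dict(doc, DATA=filtered)
```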
I know that there's a case for saying that "Oh, no! We don't want people using Display Name in scripts!", but I don't think that's the only consideration here.
I recognize that there's a danger in allowing someone to do globus transfer task list --grep <display_name> --format JSON > somefile.json, but I don't think that's particularly hard to do on your own regardless.
The question, in my mind, is whether or not some kind of basic filtering -- insufficient for all needs but useful in small doses -- would be a useful thing to have in the CLI.
If we just do a substring match on str(GlobusResponse), the implementation and maintenance cost is very cheap.
Is substring match not good enough? Should we do full regexes?
Is this entire idea an unnecessary feature?
We can always tell people to use grep or jq. They'll certainly have the former, but not everyone has a command-line JSON filtering tool at the ready.
Although this might be useful for text output, I'm more interested in supporting simple filtering on JSON output. OTOH, maybe just instructing people in the use of jq (my personal favorite) is good enough?
@bd4 I want us to discuss this a little bit.
I think it makes a lot of sense that globus transfer ls --format json --endpoint-id 'badepid' would produce a JSON error message.
However, there are a number of conditions under which we do not receive JSON from the other end of the connection.
For example, the web server may respond with a raw body of 500 Internal Server Error.
In those cases, should we wrap this in a standard JSON container, like {"message": "500 Internal Server Error"}?
To me, the target use case is stuff like nonexistent endpoints, which the caller may want to parse and inspect.
They should be able to, relatively safely, hand stderr to something that consumes JSON.
I think that would be totally acceptable, but I'm not sure I've thought through all of the options.
There are errors that we can't handle so easily in this way, namely Python stacktraces.
I don't want to be JSONifying every exception ever thrown, just to slavishly obey the request for --format JSON.
Where do we draw the line? Do we say "We will convert any error responses from the APIs to JSON."?
If we can't crystallize this into a simple statement, we probably don't have clear enough thinking about it yet.
It would be nice to have something similar to the old CLI wait command.
The idea is to be a blocking CLI command which returns with status 0 once the Task terminates.
Importantly, it should exit with status 0 even if the Task itself fails.
The exit code is necessary to distinguish network failure (likely on a long-running command) from successful termination.
Should let you write something like this:
#!/bin/sh
...
rc=1
while [ $rc -ne 0 ];
do
    # wait 30m at a time
    globus transfer task wait --task-id '...' --timeout 1800
    rc=$?
    echo 'still alive and polling' >&2
done
Some thoughts on what we might want for this, interface-wise:
globus transfer task wait. It's a task-oriented feature, so it makes sense as a subcommand of globus transfer task.
--timeout
This is not strictly necessary, but a waiting process probably wants to heartbeat periodically to show that it's still alive. After N seconds, exit with a nonzero status if the task is still pending, so that callers (like the script above) can loop.
--polling-interval
Check the status every N seconds; defaults to 1.
If you plan to poll many tasks simultaneously, you want to be able to reduce network traffic by lengthening the interval between queries.
If you want to watch a task at very high resolution, you'll still be limited by RTT for a given request.
We could do higher resolution polling by running multiple parallel requests, but that seems excessive.
--fail-on-fault-types
This is the most tenuous potential flag -- seems somewhat fragile.
If there are fault events of a particular type in the Task events, it may be desirable to consider it failed. Particularly, I'm thinking of Permission Denied or nonexistent dest dirs.
This would be an enum of string options, passed as a comma delimited list, defaulting to the empty list.
The awscli has fairly comprehensive completion support, which may serve as a guide:
http://docs.aws.amazon.com/cli/latest/userguide/cli-command-completion.html
It should be possible, given the structure of click commands, to do a completer as a python function or a hidden option to the CLI.
If we want completers to be pure shell functions, we can add a hidden option to generate a shell script as output, and then ship that result as part of the package data.
This was something found in #32 and fixed there.
However, in case that work is cancelled, delayed, or significantly altered, we need to be sure to fix this issue.
When the custom excepthook was replaced with a high-level dispatch on exception types, I did not correctly add explicit exit(1) calls as I should have.
This is a bit of a design question with fairly broad implications for a bunch of commands.
There's some clunkiness regarding shared endpoint management because they're sorta endpoints and they're sorta their own thing.
That they are valid logical endpoints, good as Task and FS op targets, is good and useful, but only if we strip back the meaning of "endpoint" to be exactly that.
Shared Endpoints don't/can't support the full set of Endpoint operations, which makes managing them as a type of endpoint very weird.
That leads to real weirdness in the API as well -- like all endpoints having a nullable "host_path" field.
At the very least, it would be nice to have endpoint subtypes added, so that we're not deducing the endpoint type from which fields are null.
Ideally, in my mind, non-shared endpoints would be "demoted" to be another specialized type of endpoint, something like "physical endpoints" or "gridftp server endpoints" (which distinguishes them from S3), but that's a bigger issue.
For now, focusing on the CLI.
I started looking at #3 again, and realized that there's actually a bit of a model problem posed here.
Regarding shared endpoint creation in particular, I think we have three options.
1. globus transfer endpoint create --shared --host-endpoint-id ...
2. globus transfer endpoint share --host-endpoint-id ...
3. globus transfer share create --host-endpoint-id ...
If I had to distill down this issue to a TL;DR format, it would probably be a choice between those three options, phrased as
Conceptually, which of the following is a Share?
How we think of Shared Endpoints / Shares and "what they are" will ultimately drive how we present them and manage them in the CLI and potentially other interfaces.
Right now, the API and Web App share the opinion that a Shared Endpoint is a special kind of endpoint. That works really well when you're looking at endpoints that are shared with you, but very badly when you're trying to manage your own shared endpoints.
Should this be divided into two categories?
I would be most inclined towards:
That duality could be confusing if poorly presented, but I think the other split-brained view of the world -- in which managing Shared Endpoints is a confusing special case of managing an endpoint definition -- is more confusing.
Perhaps the Web UI will continue with its current layout and format, but the CLI, SDK, and other developer tools will present things in a different form? I worry that inconsistency will confuse developers by not mapping nicely to the UI, but right now we'll be confusing developers anyway, so it's not exactly a net loss.
I can make whatever decision I want for now, but this needs to be resolved properly before the first "production" version.
We're currently using the Click default behavior when parsing options with `multiple=False`. That parses all instances of an option and returns the last one given. So `globus transfer ls --endpoint-id 'abc123' --endpoint-id 'def456'` is entirely valid, and results in `endpoint_id="def456"`.
This is fine/safe for options where allowing an override doesn't change the service-side semantics of commands, like `--format`, but for IDs and paths, it could be the result of user confusion and not produce the desired action. For example, a `globus transfer async-transfer` with two sources, two destinations, and two paths is inherently ambiguous, and likely means that the user has wholly misunderstood usage of that command.
To correctly implement a change in this behavior, I think we need to add a new class of option. Such opts can internally be specified with `multiple=True` and then assert that `len(value) <= 1`, where `value` is the resulting tuple. I'd suggest implementing this as a custom callback, but I worry about future opts that need both a custom callback and this behavior. A custom option class doesn't add much more complexity, and properly handles that case.
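A minimal sketch of that option class, assuming click's standard `Option.process_value` hook (the name `OneUseOption` is mine, not the CLI's):

```python
import click

class OneUseOption(click.Option):
    """Hypothetical option class: collect values with multiple=True under
    the hood, then reject repeated uses instead of silently keeping the
    last value."""

    def __init__(self, *args, **kwargs):
        kwargs["multiple"] = True
        super().__init__(*args, **kwargs)

    def process_value(self, ctx, value):
        # super() yields the tuple of all occurrences of the option
        value = super().process_value(ctx, value)
        if len(value) > 1:
            raise click.UsageError(
                "--{} can only be given once".format(self.name.replace("_", "-"))
            )
        # unwrap back to a single value (or None) for the command function
        return value[0] if value else None

@click.command()
@click.option("--endpoint-id", cls=OneUseOption)
def ls(endpoint_id):
    click.echo(endpoint_id or "<no endpoint>")
```

Repeating the option then fails at parse time with a usage error, rather than reaching the service with the wrong ID.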
The inverse of `globus login`. This can't touch the consent in Globus Auth, but it should invalidate all of your tokens by revoking them, and scrub them from the config file.
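The config-scrubbing half might look like this sketch, assuming an INI-style config file and hypothetical key names (revoking the tokens against Globus Auth is a separate, prior step, or they remain usable):

```python
import configparser

# hypothetical key names -- the real config layout may differ
TOKEN_OPTIONS = ("transfer_token", "auth_token", "refresh_token")

def scrub_tokens(config_path):
    """Remove any saved token values from an INI-style config file,
    leaving non-credential settings intact."""
    cfg = configparser.ConfigParser()
    cfg.read(config_path)
    for section in cfg.sections():
        for opt in TOKEN_OPTIONS:
            cfg.remove_option(section, opt)
    with open(config_path, "w") as f:
        cfg.write(f)
```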
The login command could be improved with some additional explanation of what happened after the Authorization Code is entered. Below is a sample suggestion.

    globus login
    Please login to Globus here:
    https://auth.globus.org/v2/oauth2/authorize?code_challenge=<snip>
    Enter the resulting Authorization Code here: <snip>
    A credential (use appropriate term) for using the CLI was saved for you in (path). It is valid until <date/time>
    > (back at command prompt)
Right now, the way that we're using `click` is dangerous to the long-term health of this project. It provides a lot of value with the way that it takes care of parsing and dispatch, but unless we wrap it carefully, it will be really difficult to move off of it if we ever need to. We've already gotten into some interesting customization directions -- like the `HiddenOption` type.
And there are inevitably going to be some places where we find the parsing provided by click to be confusing or even downright wrong (at least there's pallets/click#619 ).
I don't think we need to obscure the basic usage behind `globus_cli_command` and `globus_cli_option` wrappers everywhere -- that's unnecessary and doesn't buy us anything -- but collecting all of the interesting customizations in one place will likely make it much easier to transition if we ever need to.
There will be general-purpose support for public clients in Globus Auth in the near future. At that time, we want `globus login` to use a public client registration to start a 3-legged flow with the following basic usage pattern:

1. `globus login` displays a link to copy-paste into a browser window
2. The user authenticates in the browser and receives an authorization code
3. `globus login` (still open) consumes that authorization code as an input and exchanges it for refresh tokens, which are saved

This requires that the SDK change its token handling to be ready for refresh tokens, and that the support in Globus Auth is completed.
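As background for the flow above (not the CLI's actual implementation), a 3-legged flow for a public client typically protects the authorization code with PKCE; a sketch of generating the verifier/challenge pair per RFC 7636's S256 method:

```python
import base64
import hashlib
import secrets

def make_pkce_pair():
    """Generate a (code_verifier, code_challenge) pair per RFC 7636.
    The challenge goes into the authorize URL; the verifier is sent
    later when exchanging the authorization code for tokens."""
    # code_verifier: high-entropy random string (section 4.1)
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    # code_challenge: BASE64URL(SHA256(verifier)), the "S256" method
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge
```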
This is a bit of an open question. Right now, if you attempt an authenticated action like `globus transfer ls ...`, you get a 401 error back. Ideally, for any call that cannot be made unauthenticated, the relevant commands would give you a "no auth" error of some kind and direct you to `globus login`.
The big question here is where and how this gets enforced.
Some calls are available unauthenticated, but I'm not sure that it's worth the extra burden to support that mode of usage.
If all calls which talk to the service are forced to be made with credentials, then the problem is solved neatly and easily.
However, if we do want to support this case, then we need to figure out where credentials are and are not required.
When HTTPS endpoints enter the ecosystem with possible "public read" ACLs, the desire for unauthenticated calls to be supported may (not necessarily will) increase.
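One possible enforcement point is a decorator applied to every command that must talk to the service with credentials; this sketch uses a stubbed-out, hypothetical `load_tokens` lookup:

```python
import functools

import click

def load_tokens():
    """Hypothetical lookup of saved credentials; returns None if the
    user has never run `globus login` (stubbed out here)."""
    return None

def requires_login(func):
    """Sketch: wrap authenticated commands so a missing credential
    produces a friendly 'no auth' error pointing at `globus login`,
    instead of letting the service return a bare 401."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if load_tokens() is None:
            raise click.UsageError(
                "No credentials found. Run `globus login` and retry."
            )
        return func(*args, **kwargs)
    return wrapper
```

Commands that are safe to run unauthenticated would simply not carry the decorator, which keeps the "where are credentials required" decision explicit and in one place.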
Do we want to hide certain fields from the output? Example: `event_link` and `subtask_link`?
For the recent rewrite, I assumed it would be okay for command names to stay under 16 chars. That's pretty good, but we have a violator:
    === globus transfer endpoint ===
    search           Search for Globus Endpoints
    show             Display a detailed Endpoint definition
    deactivate       Deactivate an Endpoint
    create           Create a new Endpoint
    update           Update attributes of an Endpoint
    my-shared-endpoint-listList all Shared Endpoints on an Endpoint by...
    server-list      List all servers belonging to an Endpoint
    autoactivate     Activate an Endpoint via autoactivation
    delete           Delete a given Endpoint
`my-shared-endpoint-list` is a bit of a long name, and probably should be fixed, but that doesn't mean it should break `list-commands` so easily.
Probable solution is to do a line break and indent the short-help, like so:
    === globus transfer endpoint ===
    search           Search for Globus Endpoints
    show             Display a detailed Endpoint definition
    deactivate       Deactivate an Endpoint
    create           Create a new Endpoint
    update           Update attributes of an Endpoint
    my-shared-endpoint-list
                     List all Shared Endpoints on an Endpoint by...
    server-list      List all servers belonging to an Endpoint
    autoactivate     Activate an Endpoint via autoactivation
    delete           Delete a given Endpoint
We can do that for any command that's going to come too close to the right-hand column (within 2 chars?).
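The wrapping rule above can be sketched as a small formatter; the column widths here are illustrative, not the CLI's actual values:

```python
def format_command_list(commands, help_col=18, margin=2):
    """Render (name, short_help) pairs as a two-column listing, but
    break onto an indented continuation line whenever a name lands
    within `margin` chars of the help column."""
    lines = []
    for name, short_help in commands:
        if len(name) > help_col - margin:
            # name too close to (or past) the help column: wrap
            lines.append(name)
            lines.append(" " * help_col + short_help)
        else:
            lines.append(name.ljust(help_col) + short_help)
    return "\n".join(lines)
```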