The cap-client from cernanalysispreservation

Parametrize data in the creation of deposit

cli: get repositories method [2h]

We would like a CLI method that returns the list of repositories attached to the user's analysis.

Example command:
cap-client repositories get --pid <my-pid> --with-snapshots

By default we show only the repository details, not all the webhook snapshots. The user can request the snapshots with the --with-snapshots flag.

We need to make a request (in CapAPI class) like:

self._make_request(
    urljoin('deposits/', pid),
    method='get',
    expected_status_code=200,
    headers={'Accept': 'application/repositories+json'},
)

Then we have to filter the results, depending on whether the user asked for full details (with snapshots) or not.
If the user asked for all the details, return the response directly; otherwise strip the snapshots from each webhook like this:
for x in response['webhooks']: x.pop('snapshots')

You need to register your new get command under the repositories click group, so that you can call it as cap-client repositories get.

Example of how it's done for permissions:

@permissions.command()
@click.option(
    '--pid',
    '-p',
    help='Get permissions of the deposit with given pid',
    default=None,
    required=True,
)
@click.pass_context
def get(ctx, pid):
    """Retrieve analysis user permissions."""
    try:
        response = ctx.obj.cap_api.get_permissions(pid=pid)
        click.echo(json.dumps(response, indent=4))
    except Exception as e:
        logging.error('Unexpected error.')
        logging.debug(str(e))

You need to write something very similar, just with an added --with-snapshots flag. The filtering of results should live inside your cap_api.get_repositories method.

Write tests to check your command in both cases (with and without snapshots) and its error handling.
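
A minimal sketch of the filtering step described above, assuming the response is a dict with a 'webhooks' list as in the snippet. The helper name filter_repositories is hypothetical and not part of cap-client; the real logic would sit inside cap_api.get_repositories.

```python
import copy


def filter_repositories(response, with_snapshots=False):
    """Return the repositories response, optionally without snapshots.

    Assumes `response` is a dict with a 'webhooks' list, as in the
    issue text. The helper name is hypothetical.
    """
    if with_snapshots:
        # Full details requested: hand the response back untouched.
        return response

    # Work on a copy so the caller's data is not mutated.
    filtered = copy.deepcopy(response)
    for webhook in filtered.get('webhooks', []):
        # The default avoids a KeyError when a webhook has no snapshots.
        webhook.pop('snapshots', None)
    return filtered
```

Passing a default to pop() is a small deviation from the one-liner in the issue: it keeps the command from failing on webhooks that have no snapshots key.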

cli: use `by_me` filter in queries instead of exposing user id [1h]

The client currently uses a user id to filter the user's deposits. The server already supports this through the by_me param in the request query.
Needs to be changed here

Implement delete method

Possibility to delete a deposit through the client.

tests: for file management upload / download / remove / list

Add tests for basic client functionality

cli: add all parameter for `get-shared` command [1h]

By default, cap-client get returns only the drafts created by the user; with the all flag it returns every draft the user has access to.
Let's make cap-client get-shared behave the same way.

You need to pass all param here, and then:

if all is False - query /records/?q=&by_me=True
if all is True - query /records

NOTE this method is used also for fetching a specific record (with given PID), so make sure your changes don't break this functionality.

UPDATE Sorry, didn't mention, need to add/update tests to check behaviour with your added parameter.
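
The branching above could be sketched as follows. This is only an illustration: build_records_url is a hypothetical helper, the URLs are written relative to the API base, and the single-record path records/<pid> is an assumption that the issue does not spell out.

```python
from urllib.parse import urlencode


def build_records_url(pid=None, all_=False):
    """Return the relative URL for the get-shared query.

    The single-record path (records/<pid>) is an assumption; the two
    list queries follow the issue text.
    """
    if pid:
        # Fetching one record must keep working regardless of all_.
        return 'records/{}'.format(pid)
    if all_:
        # Every record the user has access to.
        return 'records/'
    # Only records created by the current user.
    return 'records/?' + urlencode({'q': '', 'by_me': True})
```

The parameter is called all_ here only to avoid shadowing the all() builtin; the CLI flag itself can still be named all.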

add command: files get [FILENAME] [PID]

add command to download the file associated with analysis

Add cap-client ping

Add CLI skeleton with first command: check status of connection with analysis-preservation server

cli: use serialized deposits in get all command

cap-client get returns non-serialized results.
Once serialization is implemented on the server side, we need to use it in the client.

Implement update method on client

We need an update method in case someone wants to update an already created deposit.

cli: update get-schema method [2h]

    def get_schema(self, ana_type=None, version='0.0.1'):
        """Retrieve schema according to type of analysis."""
        types = self._get_available_types()

        if ana_type not in types:
            raise UnknownAnalysisType(types)

        response = self._make_request(
            url='schemas/deposits/records/{}-v{}.json'.format(
                ana_type, version))

        schema = {
            k: v
            for k, v in response.get('properties', {}).items()
            if not k.startswith('_')
        }

        return schema

remove the default version parameter from the client side (the server returns the latest version of the given schema if no version is passed)
remove the types check (the new endpoint does the check itself)
hit https://analysispreservation.cern.ch/api/jsonschemas/{schema_type}/{version}?resolve=True ** NOTE ** resolve is an important flag that resolves all the $ref in the schema; check how a schema like cms-analysis looks with resolve=False and with resolve=True
add a flag to return the deposit or the record schema
the result is a dictionary with various fields; depending on the flag above, return either deposit_schema or record_schema to the user
write/update tests for your command to check all the cases
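
A rough sketch of the two pieces the updated method needs, under stated assumptions: the helper names are hypothetical, the URL is relative to the API base, and dropping the version segment when no version is given is a guess at how the endpoint is addressed in that case.

```python
def build_schema_url(schema_type, version=None):
    """Build the relative URL of the jsonschemas endpoint.

    Omitting the version segment when no version is given is an
    assumption; the issue only says the server then returns the latest.
    """
    url = 'jsonschemas/{}'.format(schema_type)
    if version:
        url += '/{}'.format(version)
    # resolve=True asks the server to resolve every $ref in the schema.
    return url + '?resolve=True'


def pick_schema(response, record=False):
    """Return record_schema or deposit_schema from the response dict."""
    key = 'record_schema' if record else 'deposit_schema'
    return response.get(key, {})
```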

Improve --help documentation to match full documentation

In the last user test iteration, we commonly observed analysts using cap-client --help to find a command for a desired task. They fell back on --help even though the full documentation was already loaded in the browser. While the full documentation proved helpful for all tasks, the command overview produced by --help did not. For example, a user who wanted to update an analysis field did not manage to do so with the command line --help: it only lists metadata with the description Metadata managing commands. Since the full documentation is effective, improving the --help overview by reusing the existing descriptions should improve usability.

cli: make possible to pass json from command line

For now, commands like create and update accept JSON only from a file, using the --file option. Two things should be done:

rename this option to --json or --json-file, as it can be confused with the --file option used for commands like update (where you can pass any kind of file)
make it possible to pass JSON directly on the command line, e.g.
cap-client create {} --type lhcb
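
One way to accept both forms is to try the value as inline JSON first and fall back to reading it as a file path. The sketch below assumes that behaviour is acceptable; load_json is a hypothetical helper name.

```python
import json


def load_json(value):
    """Parse `value` as inline JSON, or as a path to a JSON file."""
    try:
        return json.loads(value)
    except ValueError:
        # Not valid JSON text, so treat the value as a file path.
        with open(value) as fp:
            return json.load(fp)
```

A path that does not exist still raises the usual file error, so the CLI layer would need to turn that into a readable message.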

initiate git repository upload via client

Deploy client in PyPi

Introduce deploy configuration on travis

tests: test all the commands [5h]

Install cap-client locally with current master and try all the commands.
Prepare a note with every command you run and the output you got.

Example:
cap-client --help to see all the commands

Usage: cap-client [OPTIONS] COMMAND [ARGS]...

  CAP Client for interacting with CAP Server.

Options:
  -v, --verbose                   Verbose output
  -l, --loglevel [error|debug|info]
                                  Sets log level
  -t, --access_token TEXT         Sets users access token
  --help                          Show this message and exit.

Commands:
  clone         Clone analysis with given pid.
  create        Create an analysis.
  delete        Delete analysis with given pid.
  files         Files managing commands.
  get           Retrieve one or all analyses from a user.
  get-schema    Retrieve analysis schema.
  get-shared    Retrieve one or all shared analyses from a user.
  me            Retrieve user info.
  metadata      Metadata managing commands.
  permissions   Permissions managing commands.
  publish       Publish analysis with given pid.
  repositories  Repositories managing commands.
  types         Retrieve all types of analyses.

cap-client metadata --help pick one group to test and use --help to see all the commands in it

Usage: cap-client metadata [OPTIONS] COMMAND [ARGS]...

  Metadata managing commands.

Options:
  --help  Show this message and exit.

Commands:
  append  Edit analysis field adding a new value to an array.
  get     Retrieve one or more fields in analysis metadata.
  remove  Remove analysis field.
  set     Edit analysis field value.

cap-client metadata get --help to see which parameters the metadata get command accepts

Usage: cap-client metadata get [OPTIONS] [FIELD]

  Retrieve one or more fields in analysis metadata.

Options:
  -p, --pid TEXT  Get metadata of the deposit with given pid  [required]
  --help          Show this message and exit.

So you see it needs a pid option, which is required. That means you have to test three cases:

when an existing PID is passed: cap-client metadata get --pid existing-pid
when a non-existing PID is passed: cap-client metadata get --pid non-existing-pid
when no PID is passed: cap-client metadata get

Write down all the commands you run and the output or error you got.

cli: clean output

By default the client shouldn't display all the debug information prefixed with [INFO] or [DEBUG]; we want to see just the output.
Make it possible to get more verbose output with the --verbose flag.
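
A possible shape for this, using only the standard logging module. The logger name and the choice of WARNING as the quiet default are assumptions made for the sketch.

```python
import logging


def configure_logging(verbose=False):
    """Return a logger that is quiet by default and chatty with verbose."""
    # Hypothetical logger name, used only for this sketch.
    logger = logging.getLogger('cap_client_sketch')
    if not logger.handlers:
        handler = logging.StreamHandler()
        # Plain messages only, without a [LEVEL] prefix.
        handler.setFormatter(logging.Formatter('%(message)s'))
        logger.addHandler(handler)
    # DEBUG and INFO records are dropped unless --verbose was given.
    logger.setLevel(logging.DEBUG if verbose else logging.WARNING)
    return logger
```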

deployment: make cap-client available on lxplus by a source script

Most physics users will expect the cap-client to be usable on LXPLUS, so it would be nice if it could be deployed there (via RPM or a virtualenv that can be sourced).

cli: dir upload to delete leftover tar file

The tar file stays in the system (see fig.), while it should be deleted after the upload.

Description:

When uploading a directory, for example: cap-client -v files upload newdir -p bf6b8501822c4d2ba46028611354df7e

the client creates a temporary tar file, which remains in the system after the upload is over.
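
A sketch of one way to guarantee the cleanup, assuming the upload itself can be passed in as a callable. Both upload_directory and the upload argument are hypothetical names, not the client's actual code.

```python
import os
import tarfile
import tempfile


def upload_directory(directory, upload):
    """Pack `directory` into a temporary tar file, upload it, clean up.

    `upload` is a hypothetical callable that receives the archive path.
    """
    fd, tar_path = tempfile.mkstemp(suffix='.tar.gz')
    os.close(fd)
    try:
        with tarfile.open(tar_path, 'w:gz') as tar:
            # normpath drops a trailing slash so basename is never empty.
            name = os.path.basename(os.path.normpath(directory))
            tar.add(directory, arcname=name)
        return upload(tar_path)
    finally:
        # Remove the archive whether or not the upload succeeded.
        os.remove(tar_path)
```

Putting the removal in a finally block means the tar file is also cleaned up when the request fails.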

Retrieve metadata by index of array

See fig.

Retrieve user's analysis with given PID

REQUEST: cap-client --access-token=USERS_TOKEN get <PID>
RESPONSE: 200 OK analysis

Add option to get only analysis created by user

Currently get returns all the analyses that the user has access to.
Add an option to retrieve only those created by the user.

Use access tokens

Make it possible to authorize users by passing the access token generated within the CERN Analysis Preservation app.
The access token is passed as a CLI parameter or set as an environment variable.
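
The lookup order could be sketched like this. The environment variable name CAP_ACCESS_TOKEN and the helper name are assumptions made for illustration; the issue does not name either.

```python
import os


def resolve_access_token(cli_token=None, environ=None):
    """Return the token from the CLI parameter, else from the environment.

    CAP_ACCESS_TOKEN is a hypothetical variable name.
    """
    environ = os.environ if environ is None else environ
    # The explicit CLI parameter wins over the environment variable.
    token = cli_token or environ.get('CAP_ACCESS_TOKEN')
    if not token:
        raise ValueError('No access token given.')
    return token
```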

publish analysis documentation

cli: pass oauth token in headers [1h]

Currently we pass the auth token in the URL.

Update the CapAPI._make_request method so that, instead of passing access_token in params, it adds it to the headers:
Authorization: OAuth2 <access_token>

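A small sketch of the header construction for this issue, assuming _make_request merges it with any headers the caller passes. The helper name build_auth_headers is hypothetical.

```python
def build_auth_headers(access_token, headers=None):
    """Return a copy of `headers` with the OAuth2 Authorization header."""
    # Copy first so the caller's dict is not modified.
    merged = dict(headers or {})
    merged['Authorization'] = 'OAuth2 {}'.format(access_token)
    return merged
```
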
cli: add get field method

Example:

$ cap-cli metadata get field_name --pid ana_pid
field_value

Nested fields should be accessed with dots, e.g. basic_info.ana_title
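
The dotted lookup could be sketched as below, assuming the metadata is a plain nested dict. get_field is a hypothetical helper name, and a missing key simply raises KeyError here.

```python
def get_field(metadata, field):
    """Return the value at a dotted path such as 'basic_info.ana_title'."""
    value = metadata
    for key in field.split('.'):
        # Each path segment descends one level into the nested dict.
        value = value[key]
    return value
```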

Upload multiple files

Add docs for basic client functionality

add command: files remove [FILENAME] [PID]

add command to remove the file associated with analysis

Option to specify field when uploading file

Proof check all cap-client commands

use new serializers in client requests

we need to use Accept: application/basic+json in request headers

Add ssl certifications for securing connection

Add validators output in response

For now, when using the API, we validate the passed schema, but don't return any information about which fields were incorrect.
We want to return the response from the JSON validators on POST and PUT requests.

Python 3 compatibility

I tried using the client using the Python that ships with my Linux distribution, v3.6.5 but this failed:

$ cap-client --help
Traceback (most recent call last):
  File "/home/apearce/tmp/venv/bin/cap-client", line 11, in <module>
    load_entry_point('cap-client==0.0.2', 'console_scripts', 'cap-client')()
  File "/home/apearce/tmp/venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 480, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/home/apearce/tmp/venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2693, in load_entry_point
    return ep.load()
  File "/home/apearce/tmp/venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2324, in load
    return self.resolve()
  File "/home/apearce/tmp/venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2330, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/home/apearce/tmp/venv/lib/python3.6/site-packages/cap_client/cli/__init__.py", line 33, in <module>
    from cap_client.cap_api import CapAPI
  File "/home/apearce/tmp/venv/lib/python3.6/site-packages/cap_client/cap_api.py", line 30, in <module>
    from urlparse import urljoin
ModuleNotFoundError: No module named 'urlparse'

I suspect there are probably other issues with Python 3 compatibility, but I haven't checked.
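
For the import that fails in the traceback, a common pattern is to try the Python 3 location first and fall back to the Python 2 module. This covers only the urljoin import; other incompatibilities would need their own checks.

```python
try:
    # Python 3: urljoin lives in urllib.parse.
    from urllib.parse import urljoin
except ImportError:
    # Python 2: the module was called urlparse.
    from urlparse import urljoin
```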

Retrieve list of all user's analysis

REQUEST: cap-client --access-token=USERS_TOKEN get
RESPONSE: 200 OK list of analysis

cli: add upload objects method

Add method to upload objects to existing analysis, like:
$ cap-client files upload file_name --file foo.dat --pid analysis_pid

Upload file to the bucket associated with given analysis.

cli: upload repository method [2h]

In CapAPI class, add a method for uploading a repository from URL.

Example of how it looks for publishing:

def publish(self, pid):
    return self._make_request(url='deposits/{}/actions/publish'.format(pid),
                              expected_status_code=202,
                              method='post',
                              headers={'Content-Type': 'application/json',
                                       'Accept': 'application/basic+json'})

For upload repository we need to make a request with:

endpoint: deposits/{pid}/actions/upload
expected status code: 201
method: post
data (needs json.dumps):
- url
- webhook (true|false)
- event_type (release|push)
headers:
{'Content-Type': 'application/json', 'Accept': 'application/basic+json'}

Currently, we don't pass repositories field in basic serializer (that's the one you're asking for in headers) - so no way to validate response JSON - for now just check that status code was 201.
We will decide in a separate thread on creating a serializer for repositories part.

Write tests to check how your method behaves in cases:

400 returned from the server (e.g. wrong URL like http://nongithubhost.com)
just repo upload
repo upload with push webhook
repo upload with release webhook
user doesn't have sufficient permission

Connect the method to the CLI: create a file like cap_client/cli/files_cli.py. It needs to register a click group, like:

@click.group()
def repositories():
    """Repositories managing commands."""

The upload command is then registered under this group, so you can call it like:
cap-client repositories upload my_url --webhook push
We can have separate commands for uploading and for creating a webhook, or a parameter as above (to be decided).

Remember to add a line at the end of the cap_client/cli/__init__.py file:
cli.add_command(repositories)
Without this line, you won't see your repositories command when calling cap-client.

UPDATE let's do this one after cernanalysispreservation/analysispreservation.cern.ch#1547
then, you want to make a request in your method with the header ('Accept', 'application/repositories+json'), like:

    def upload_repository(self, pid, url, event_type=None):
        """Your method."""
        return self._make_request(
            url='deposits/{}/actions/upload'.format(pid),
            data=json.dumps(
                dict(url=url,
                     # Normalise falsy values (e.g. '') to None.
                     event_type=event_type or None,
                     # A webhook is created only when an event type is given.
                     webhook=bool(event_type))),
            expected_status_code=201,
            method='post',
            headers={
                'Content-Type': 'application/json',
                'Accept': 'application/repositories+json'
            })

write your tests according to the repositories serializer format

cli: add set field method

Now we have a patch method that lets users patch JSON by passing operations in JSON Patch format, so to set a field we need to use:

[{ "op": "replace", "path": "/field_name", "value": "field_value" }]

It would be simpler to let users run commands like:

cap-cli metadata set field_name field_value --pid ana_pid
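
Translating the command into the patch shown above could look like this. The sketch assumes nested fields use the same dotted notation as metadata get and maps each segment to a JSON Pointer token; build_set_patch is a hypothetical helper name.

```python
def build_set_patch(field, value):
    """Build a JSON Patch 'replace' operation for a dotted field name."""
    tokens = [
        # JSON Pointer escaping: '~' becomes '~0' and '/' becomes '~1'.
        key.replace('~', '~0').replace('/', '~1')
        for key in field.split('.')
    ]
    return [{'op': 'replace', 'path': '/' + '/'.join(tokens), 'value': value}]
```

Note that a JSON Patch replace operation requires the target to exist already, so setting a field that is not yet present would need an add operation instead.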

    def create(self, json_='', ana_type=None, version='0.0.1'):
        """Create an analysis."""
        types = self._get_available_types()

        if ana_type not in types:
            raise UnknownAnalysisType(types)

        if not json_:
            raise MissingJsonFile()

        try:
            data = json.loads(json_)
        except ValueError:
            with open(json_) as fp:
                data = json.load(fp)

        data['$ana_type'] = ana_type
        json_data = json.dumps(data)

        response = self._make_request(
            url='deposits/',
            method='post',
            data=json_data,
            expected_status_code=201,
            headers={'Content-Type': 'application/json'})

        return self._make_request(url='deposits/{}'.format(
            response.get('metadata', {}).get('_deposit', {}).get('id', '')),
                                  method='put',
                                  data=json.dumps(response.get('metadata',
                                                               {})),
                                  expected_status_code=200,
                                  headers={
                                      'Content-Type': 'application/json',
                                      'Accept': 'application/basic+json'
                                  })

json-file should be a required field (also rename it to json, as it can be either a file or a command-line value)
if ana_type is passed, $schema shouldn't be in the JSON (raise an error)
if $schema is in the JSON, ana_type shouldn't be passed (raise an error)
the version parameter is not supported by the backend, so it should be removed
don't make a call to available types; the user can call it from a different command
fix an issue when create is called without a valid access token, example:
make the put request to url='deposits/{}'.format(response['id'])
write/update tests to check all the cases for your updated method
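
The two validation rules above amount to one mutual-exclusion check, which could be sketched as follows. prepare_create_data is a hypothetical helper and ValueError stands in for whatever exception class the client would actually raise.

```python
def prepare_create_data(data, ana_type=None):
    """Validate the ana_type / $schema combination and return the payload.

    ValueError is a placeholder for the client's own exception class.
    """
    if ana_type and '$schema' in data:
        # The two ways of choosing a schema must not be mixed.
        raise ValueError('Pass either ana_type or $schema, not both.')

    # Copy so the caller's dict is left untouched.
    prepared = dict(data)
    if ana_type:
        prepared['$ana_type'] = ana_type
    return prepared
```

What should happen when neither ana_type nor $schema is given is not covered by the issue, so the sketch leaves that case to the server.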

Add option to get user permissions
Add option for updating permissions
