core's People

Contributors

ambrussimon, andynemzek, cgc, coltonlw, davidfarkas, davidfarkas93, ehlertjd, gsfr, hkethi002, josschne, kevlarkevin, kofalt, larsoner, lmperry, mrdarcymurphy, nagem, rentzso, ryansanford, tomck, wandell

core's Issues

Dev infrastructure

We are looking to create a simple, yet effective way to launch the scitran API for developers. Besides the API itself, the only other launch requirement is mongod. A single entry point, such as ./scitran.sh run and a single config file are desired. Ideally, no further dependencies are introduced. Implementation candidates are bash and python. Options for app launching and live reloading are Reflex and possibly Paste Script.

Below is a starting point for discussion, intended to be refined in place before commencing work.

Requirements:

  • works on
    • Mac
    • Linux
    • Linux in Vagrant on Mac
  • live reloading of the API
  • API's stdout and stderr are printed
  • bootstrapping of users and data from arbitrary locations

Targets:

  • prepare - recent bash may be all we need
  • run - setup + bootstrap + launch
  • setup - install reflex, mongo, venv, etc.
  • launch - launch API and mongo
  • update - update this repo
  • cmd - run arbitrary commands in the application context

Convenience targets (likely wrappers around cmd):

  • bootstrap - bootstrap users or data
  • reset - reset database (via bootstrap.py clean)
  • mongo-log - tail the mongod log
  • python - ipython shell with db variable set
  • lint - run a linter

Directory structure:

  • scitran.sh or scitran.py - entry point
  • api - API source code
  • bin - api.wsgi and bootstrap.py
  • test - unit and integration tests
  • runtime - disposable environment for python dependencies and mongo binaries
  • templates - bootstrap.json, config.toml
  • persistent - data, db
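As a starting point for the discussion, the targets above could be wired into a single Python entry point along these lines. This is only a sketch: the `plan` helper, the `--config` default, and the choice of argparse are assumptions for illustration, and the steps are returned as names rather than executed.

```python
import argparse

# Target names and descriptions are taken from the lists above.
TARGETS = {
    "setup": "install reflex, mongo, venv, etc.",
    "bootstrap": "bootstrap users or data",
    "launch": "launch API and mongo",
    "update": "update this repo",
    "reset": "reset database",
    "lint": "run a linter",
}

# 'run' is described above as setup + bootstrap + launch.
RUN_SEQUENCE = ["setup", "bootstrap", "launch"]


def build_parser():
    parser = argparse.ArgumentParser(prog="scitran.py")
    # Assumption: the single config file defaults to config.toml.
    parser.add_argument("--config", default="config.toml")
    sub = parser.add_subparsers(dest="target", required=True)
    for name, help_text in TARGETS.items():
        sub.add_parser(name, help=help_text)
    sub.add_parser("run", help="setup + bootstrap + launch")
    cmd = sub.add_parser("cmd", help="run a command in the application context")
    cmd.add_argument("argv", nargs="+")
    return parser


def plan(argv):
    """Return the ordered list of steps a command line would trigger."""
    args = build_parser().parse_args(argv)
    if args.target == "run":
        return list(RUN_SEQUENCE)
    if args.target == "cmd":
        return ["cmd " + " ".join(args.argv)]
    return [args.target]
```

For example, `plan(["run"])` yields the three steps in order, while the convenience targets map to a single step each.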

Attaching a file to a "Project" causes rules.create_jobs exception when called with that file

Branch = master

Impact

Was blocking @mrDarcyMurphy, but I disabled the uwsgi cron job as a workaround.

Steps to reproduce:

Attach a file to any Project, then have rules.create_jobs called via uwsgi cron job or similar.

Error

Tue Jan 5 17:04:52 2016 - error managing signal 0 on worker 1
Traceback (most recent call last):
File "bin/api.wsgi", line 88, in job_creation_timer
job_creation(signum)
File "bin/api.wsgi", line 43, in job_creation
rules.create_jobs(config.db, c, c_type, f)
File "./api/rules.py", line 139, in create_jobs
project = get_project_for_container(db, container)
File "./api/rules.py", line 166, in get_project_for_container
raise Exception('Hierarchy walking not implemented for container ' + str(container['_id']))
Exception: Hierarchy walking not implemented for container 568be6a7a104c7a4b4a89567
Tue Jan 5 17:05:22 2016 - error managing signal 0 on worker 1

single file downloads

Add single file download capability to GET /nimsapi, specifying level, id, and a unique file description in a JSON request body.

API throws error from uwsgi cron job when encountering stale "running" job.

Branch = master

Error

uwsgi_1 | [pid: 71|app: 0|req: 1/1] 192.168.99.1 () {34 vars in 399 bytes} [Wed Dec 30 19:37:47 2015] GET /api/jobs/next => generated 864 bytes in 35 msecs (HTTP/1.1 200) 3 headers in 112 bytes (2 switches on core 0)
uwsgi_1 | Traceback (most recent call last):
uwsgi_1 | File "bin/api.wsgi", line 87, in job_creation_timer
uwsgi_1 | job_creation(signum)
uwsgi_1 | File "bin/api.wsgi", line 76, in job_creation
uwsgi_1 | jobs.retry_job(config.db, j)
uwsgi_1 | NameError: global name 'jobs' is not defined
uwsgi_1 | Wed Dec 30 19:39:47 2015 - error managing signal 0 on worker 4

How to reproduce

  1. bootstrap new instance of API
  2. Wait for API to create jobs from bootstrapped data.
  3. Set job to running state
baseUrl="https://localhost:8443/api/jobs"
alias sCurl="curl -sk -H 'User-Agent: SciTran Drone Engine' -H 'X-SciTran-Auth: change-me'"
sCurl $baseUrl/next

  4. Wait 100 seconds.

If this error happens on uwsgi startup rather than cron execution, all subsequent api requests will error with http 500 response codes until uwsgi is restarted enough times to reset all stale "running" jobs.

Odd response to unchanged data

TL;DR: an update request is returning 404 when it maybe should be 304.

When changing the permissions of a user on a project, things work fine the first time. If I send the same request a second time, i.e. trying to change {access:'rw'} to {access:'rw'}, the response is 404 Element not updated....

Request:

  • url: http://localhost:9000/api/projects/abc123/permissions/local/[email protected]
  • body: {"access":"rw"}

1st Response:

  • 200
  • {"modified": 1}

2nd Response:

  • 404
  • {"code":404,"uid":"[email protected]","detail":"Element not updated in list permissions of container projects abc123"}
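One way to tell "nothing matched" apart from "matched but unchanged" is to look at both counts that an update reports. The sketch below is an assumption about how that could look, not the current code: the function name is hypothetical, and the two arguments are modeled on the `matched_count` and `modified_count` attributes of pymongo's `UpdateResult`. Whether 304 is the right answer for the unchanged case is the open question here; 200 with {"modified": 0} would be another option.

```python
def status_for_update(matched_count, modified_count):
    """Map update counts to a response status (hypothetical helper)."""
    if matched_count == 0:
        # The element really does not exist: 404 is appropriate.
        return 404
    if modified_count == 0:
        # The element exists but already holds the requested value.
        return 304
    return 200
```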

Q: module organization

I'm looking through the module, and it looks like the organization has some non-standard conventions, which is making it a little more challenging than usual for me to understand:

  • imports are absolute (import apps) instead of relative (from . import apps) so it's tough to tell what is from the local package vs built-in or external imports
  • scripts like bootstrap.py are in the same location as source modules -- typically these live in a bin directory or so, to make it clear what is part of the module, and what is a callable script

Do these need to be that way? I'm happy to work on reorganizing these aspects of package if it makes sense to do so at some point. I think it would make it easier to maintain and for other devs to understand the layout, and in the meantime would teach me a bit about the module. But if this seems overkill, it doesn't need to be done.

localpaths

Add localpaths option to project, session, and acquisition details.

Mixed plural case in container_type

Until recently, my understanding was that the container_type of a file object in the API did not have an s suffix, at least in the case of an acquisition.

Thus, to generate a download URI (as I do in jobs.py) you add the s suffix:

'/' + i['container_type'] + 's/' + i['container_id'] + '/files/' + i['filename']

Surprisingly, this was not the case, as shown by these download URLs:

/acquisitionss/56a0159b6d7638bbbf9160d3/files/8892_nifti.nii.gz
/acquisitions/56a0159e6d7638bbbf9160f6/files/149_1_1_localizer.zip

It looks like spawning jobs from listhandler post will call rules.create_jobs with acquisitions.

I added code to jobs that throws an exception when called with plurals as a stopgap to limit the problem; here's a trace that hopefully illuminates the issue:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 1535, in __call__
    rv = self.handle_exception(request, response, e)
  File "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 1529, in __call__
    rv = self.router.dispatch(request, response)
  File "./api/api.py", line 120, in dispatcher
    rv = router.default_dispatcher(request, response)
  File "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 1278, in default_dispatcher
    return route.handler_adapter(request, response)
  File "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 1102, in __call__
    return handler.dispatch()
  File "./api/base.py", line 128, in dispatch
    return super(RequestHandler, self).dispatch()
  File "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 572, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 570, in dispatch
    return method(*args, **kwargs)
  File "./api/handlers/listhandler.py", line 401, in post
    rules.create_jobs(config.db, container, cont_name, file_properties)
  File "./api/rules.py", line 140, in create_jobs
    jobs.queue_job(db, alg_name, input)
  File "./api/jobs.py", line 108, in queue_job
    raise Exception('Container type cannot be plural :|')
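Until the callers agree on one form, a helper that accepts either spelling would keep the download URI stable. This is a minimal sketch: the `download_uri` name and the `PLURALS` table are assumptions, and the table only covers the container types mentioned in these issues.

```python
# Assumption: these are the only container types that carry files.
PLURALS = {
    "project": "projects",
    "session": "sessions",
    "acquisition": "acquisitions",
}


def download_uri(container_type, container_id, filename):
    """Build a download URI from a singular or plural container type."""
    if container_type in PLURALS:
        plural = PLURALS[container_type]
    elif container_type in PLURALS.values():
        plural = container_type
    else:
        raise ValueError("unknown container type: " + container_type)
    return "/" + plural + "/" + container_id + "/files/" + filename
```

With this, both `acquisition` and `acquisitions` produce `/acquisitions/...` rather than `/acquisitionss/...`.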

File system cleanup endpoint

Also we need to create a collection with delete candidates with this schema:

{
      hash: "...",
      timestamp: "...",
      _id: "..."
}

The API will check that the files have been deleted more than 24 hours ago and that the files are not used anywhere in the database.
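The two conditions above could be expressed as a single predicate. This sketch assumes that `timestamp` is stored as a `datetime` and that the set of hashes still referenced in the database is computed elsewhere; `is_purgeable` and `GRACE_PERIOD` are hypothetical names.

```python
from datetime import datetime, timedelta

GRACE_PERIOD = timedelta(hours=24)


def is_purgeable(candidate, referenced_hashes, now):
    """True if a delete candidate is old enough and no longer referenced."""
    old_enough = now - candidate["timestamp"] > GRACE_PERIOD
    unused = candidate["hash"] not in referenced_hashes
    return old_enough and unused
```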

Reduce HTTP awareness in non handler layers

As a general rule, classes should fall into two distinct categories:

  • Classes that have functions named after HTTP verbs, and (basically) nothing else
  • Classes that don't know anything about HTTP verbs or protocol details

Rather than self.abort, raise a variety of typed exceptions. This would be a better, drop-in replacement for what self.abort does anyway. At the very top, and/or whichever layers are appropriate, catch typed exceptions and handle a JSON error return there.
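A rough illustration of the split, with all class and function names being assumptions rather than existing code: the lower layer raises typed exceptions, and only the boundary knows which status code each one maps to.

```python
import json


class APIError(Exception):
    """Base class for errors raised by non-handler layers."""


class NotFound(APIError):
    pass


class PermissionDenied(APIError):
    pass


# Only the top layer knows about HTTP status codes.
STATUS_BY_EXCEPTION = {NotFound: 404, PermissionDenied: 403}


def find_session(sessions, session_id):
    """Storage-layer code: no HTTP verbs, no status codes."""
    try:
        return sessions[session_id]
    except KeyError:
        raise NotFound("session {} not found".format(session_id))


def handle(func, *args):
    """Boundary: translate typed exceptions into a JSON error return."""
    try:
        return 200, json.dumps(func(*args))
    except APIError as exc:
        status = STATUS_BY_EXCEPTION.get(type(exc), 500)
        return status, json.dumps({"code": status, "detail": str(exc)})
```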

Error when attempting sessions download

Some of the recent changes appear to have affected file downloads. This was previously working. The issue occurs when attempting to retrieve the ticket for the download.

Endpoint: /api/download
Method: POST
Payload:

{"optional":true,"nodes":[{"level":"session","_id":"5660680b9737e71a69dc8c4b"},{"level":"session","_id":"5660680c9737e71a69dc8c4f"}]}

Response Code: 500
Response:

<html>
 <head>
  <title>500 Internal Server Error</title>
 </head>
 <body>
  <h1>500 Internal Server Error</h1>
  The server has either erred or is incapable of performing the requested operation.<br /><br />



 </body>
</html>

Log entry:


2015-12-08 17:40:49             root               webapp2.py  1552:ERRO 'NoneType' object has no attribute '__getitem__'
Traceback (most recent call last):
  File "/scitran/persistent/venv/local/lib/python2.7/site-packages/webapp2.py", line 1535, in __call__
    rv = self.handle_exception(request, response, e)
  File "/scitran/persistent/venv/local/lib/python2.7/site-packages/webapp2.py", line 1529, in __call__
    rv = self.router.dispatch(request, response)
  File "./api/api.py", line 126, in dispatcher
    rv = router.default_dispatcher(request, response)
  File "/scitran/persistent/venv/local/lib/python2.7/site-packages/webapp2.py", line 1278, in default_dispatcher
    return route.handler_adapter(request, response)
  File "/scitran/persistent/venv/local/lib/python2.7/site-packages/webapp2.py", line 1102, in __call__
    return handler.dispatch()
  File "./api/base.py", line 125, in dispatch
    return super(RequestHandler, self).dispatch()
  File "/scitran/persistent/venv/local/lib/python2.7/site-packages/webapp2.py", line 572, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "/scitran/persistent/venv/local/lib/python2.7/site-packages/webapp2.py", line 570, in dispatch
    return method(*args, **kwargs)
  File "./api/core.py", line 249, in download
    return self._preflight_archivestream(req_spec)
  File "./api/core.py", line 184, in _preflight_archivestream
    prefix = project['group'] + '/' + project['label'] + '/' + session.get('label', 'untitled')
TypeError: 'NoneType' object has no attribute '__getitem__'
[pid: 14428|app: 0|req: 60/137] 10.0.2.2 () {52 vars in 918 bytes} [Tue Dec  8 17:40:49 2015] POST /api/download => generated 228 bytes in 31 msecs (HTTP/1.1 500) 2 headers in 99 bytes (1 switches on core 0)

Better sanitization of file I/O

#86 closed the loop on security vulns, but untrusted input could still perform a variety of annoying tasks:

  • 300,000 character whitespace-not-whitespace unicode filenames
  • Behaviour of system when code points are passed that aren't valid for the filesystem?
  • Different filesystems have different opinions (OSX ain't case sensitive, etc), what's our minimal subset?
  • Let's assume there's an escape for os.path.basename since I didn't validate

Closing this requires at minimum:

  1. Thoughtful sanitization of untrusted input
  2. Handling of post-sanitization collisions
  3. Test cases that confirm failure to escape / cause trouble
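To make the discussion concrete, here is one possible shape for items 1 and 2. It is a sketch under stated assumptions, not a vetted policy: the 255-character limit, the NFKC normalization, the case-insensitive collision check, and both function names are choices made for illustration only.

```python
import os
import unicodedata

# Assumption: 255 is a common per-component limit (counted in characters here).
MAX_LENGTH = 255


def sanitize_filename(untrusted):
    """Reduce untrusted input to a conservative single path component."""
    name = unicodedata.normalize("NFKC", untrusted)
    # Treat both separators as path separators regardless of host OS.
    name = name.replace("\\", "/").split("/")[-1]
    # Drop control, format, and other "C" category code points.
    name = "".join(ch for ch in name if unicodedata.category(ch)[0] != "C")
    # Collapse any run of unicode whitespace into a single space.
    name = " ".join(name.split())
    name = name.strip(". ")
    if not name:
        raise ValueError("filename is empty after sanitization")
    return name[:MAX_LENGTH]


def resolve_collision(name, existing):
    """Pick a name that does not clash, comparing case-insensitively."""
    taken = {n.casefold() for n in existing}
    if name.casefold() not in taken:
        return name
    stem, ext = os.path.splitext(name)
    counter = 1
    while True:
        candidate = "{}-{}{}".format(stem, counter, ext)
        if candidate.casefold() not in taken:
            return candidate
        counter += 1
```

Comparing with `casefold()` targets the case-insensitive filesystem concern raised above; item 3 would still need real escape attempts as test cases.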

Don't silently skip missing files on download

We are currently silently skipping missing or unreadable files on download. As @kofalt points out, this is fundamentally opposed to our data management mission and must be corrected asap.

However, considerable edge cases exist. For example, how do we handle files being legitimately deleted, in parallel to a long-running download operation?

Also, we don't want to inform the user of a (potentially irrelevant) problem 3 hours into a large download and proceed to abort and purge all downloaded data.

We used to check for files during the download preflight, but that was deemed too slow. Let's again explore our options here. This pattern would allow us to mark files as being part of a download and delay physical removal until the download completes.

@kofalt, @ryansanford: Happy to hear your thoughts.

@rentzso: Let's take this on together next week.

begin transition to oauth2

Begin to add support for oauth2.

  • part 1 - nimsapi will accept 'access_token' in request's Authorization header or 'user' as url encoded parameter. If 'access_token' and 'user' are both supplied, 'access_token' will be used.
  • part 2 - nimsapi p2p requests will add authenticated user id to header before dispatching to remote site.
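The precedence rule in part 1 could look roughly like this. The function name and the plain-dict representation of headers and parameters are assumptions; how the token itself is validated is out of scope for this sketch.

```python
def resolve_identity(headers, params):
    """Return (method, value); the header token wins over the 'user' parameter."""
    token = headers.get("Authorization")
    if token:
        return ("access_token", token)
    user = params.get("user")
    if user:
        return ("user", user)
    return (None, None)
```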

Filesystem error during upload of file to session results in file record remaining in session collection.

@gsfr @rentzso Let me know if I can provide any additional detail.

Branch = master

How to reproduce

  1. chmod -R -w "${SCITRAN_PERSISTENT_DATA_PATH}"
  2. upload file
# Headers & settings
baseUrl="https://inv-ryan-03.invenshure.com:8443/api/sessions"
alias sCurl="curl -sk -H 'User-Agent: SciTran Drone Engine' -H 'X-SciTran-Auth: change-me'"

# Get a session as an upload target
SessionId=$(sCurl $baseUrl | jq -r '.[0]._id')

# Generate a useless file
uname -a > example.bin

# Succeeds
sCurl $baseUrl/$SessionId/files -F "[email protected]"

Error

2016-01-04 16:03:04      scitran.api                  base.py   127:DEBU from None None POST /api/sessions/568a8893e460bcea7c3fdd2a/files {}
2016-01-04 16:03:04      scitran.api           liststorage.py    41:DEBU query {'_id': ObjectId('568a8893e460bcea7c3fdd2a')}
2016-01-04 16:03:04      scitran.api            validators.py   125:DEBU {u'key_fields': [u'name'], u'required': [u'name', u'created', u'modified', u'size', u'hash', u'unprocessed'], u'additionalProperties': False, u'$schema': u'http://json-schema.org/draft-04/schema#', u'type': u'object', u'properties': {u'hash': {u'type': u'string'}, u'name': {u'type': u'string'}, u'created': {}, u'unprocessed': {u'type': u'boolean'}, u'measurements': {u'uniqueItems': True, u'items': {u'type': u'string'}, u'type': u'array'}, u'modified': {}, u'instrument': {u'type': u'string'}, u'metadata': {u'type': u'object'}, u'type': {u'type': u'string'}, u'tags': {u'uniqueItems': True, u'items': {u'type': u'string'}, u'type': u'array'}, u'size': {u'type': u'integer'}}}
2016-01-04 16:03:04      scitran.api            validators.py    75:DEBU {'hash': '128a2aa7f6167cb09592541c97073b2bdc6e2a89b4a39a8cfad81f476e43eaa10b844f34811154632000050bd8b782de', 'name': 'example.bin', 'created': datetime.datetime(2016, 1, 4, 16, 3, 4, 667687), 'unprocessed': True, 'modified': datetime.datetime(2016, 1, 4, 16, 3, 4, 667687), 'size': 136}
2016-01-04 16:03:04      scitran.api           liststorage.py    65:DEBU payload {'hash': '128a2aa7f6167cb09592541c97073b2bdc6e2a89b4a39a8cfad81f476e43eaa10b844f34811154632000050bd8b782de', 'name': 'example.bin', 'created': datetime.datetime(2016, 1, 4, 16, 3, 4, 667687), 'unprocessed': True, 'modified': datetime.datetime(2016, 1, 4, 16, 3, 4, 667687), 'size': 136}
2016-01-04 16:03:04      scitran.api           liststorage.py    70:DEBU query {'files': {'$not': {'$elemMatch': {u'name': 'example.bin'}}}, '_id': ObjectId('568a8893e460bcea7c3fdd2a')}
2016-01-04 16:03:04      scitran.api           liststorage.py    71:DEBU update {'$push': {'files': {'hash': '128a2aa7f6167cb09592541c97073b2bdc6e2a89b4a39a8cfad81f476e43eaa10b844f34811154632000050bd8b782de', 'name': 'example.bin', 'created': datetime.datetime(2016, 1, 4, 16, 3, 4, 667687), 'unprocessed': True, 'modified': datetime.datetime(2016, 1, 4, 16, 3, 4, 667687), 'size': 136}}}
2016-01-04 16:03:04             root               webapp2.py  1552:ERRO [Errno 13] Permission denied: '/var/scitran/data/1/2'
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 1535, in __call__
    rv = self.handle_exception(request, response, e)
  File "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 1529, in __call__
    rv = self.router.dispatch(request, response)
  File "./api/api.py", line 118, in dispatcher
    rv = router.default_dispatcher(request, response)
  File "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 1278, in default_dispatcher
    return route.handler_adapter(request, response)
  File "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 1102, in __call__
    return handler.dispatch()
  File "./api/base.py", line 128, in dispatch
    return super(RequestHandler, self).dispatch()
  File "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 572, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 570, in dispatch
    return method(*args, **kwargs)
  File "./api/handlers/listhandler.py", line 400, in post
    file_store.move_file(dest_path)
  File "./api/files.py", line 112, in move_file
    os.makedirs(target_dir)
  File "/usr/lib/python2.7/os.py", line 150, in makedirs
    makedirs(head, mode)
  File "/usr/lib/python2.7/os.py", line 150, in makedirs
    makedirs(head, mode)
  File "/usr/lib/python2.7/os.py", line 150, in makedirs
    makedirs(head, mode)
  File "/usr/lib/python2.7/os.py", line 150, in makedirs
    makedirs(head, mode)
  File "/usr/lib/python2.7/os.py", line 150, in makedirs
    makedirs(head, mode)
  File "/usr/lib/python2.7/os.py", line 150, in makedirs
    makedirs(head, mode)
  File "/usr/lib/python2.7/os.py", line 157, in makedirs
    mkdir(name, mode)
OSError: [Errno 13] Permission denied: '/var/scitran/data/1/2'
[pid: 15|app: 0|req: 29/52] 192.168.3.177 () {40 vars in 680 bytes} [Mon Jan  4 16:03:04 2016] POST /api/sessions/568a8893e460bcea7c3fdd2a/files => generated 228 bytes in 88 msecs (HTTP/1.1 500) 2 headers in 99 bytes (2 switches on core 1)
2016-01-04 16:03:09 scitran.api.jobs                 rules.py    59:WARN file example.bin in container 568a8893e460bcea7c3fdd2a has no type key
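The log shows the `$push` of the file record happening before `move_file` fails. One possible fix is to reverse the order, so the record is only written once the file is in place. The sketch below illustrates that ordering only; `store_upload` and the `add_record` callback are hypothetical stand-ins for the real handler and database update.

```python
import os
import shutil


def store_upload(temp_path, dest_path, record, add_record):
    """Move the file first, then write the record; undo the move on failure."""
    # If either call raises, no record has been written yet.
    os.makedirs(os.path.dirname(dest_path), exist_ok=True)
    shutil.move(temp_path, dest_path)
    try:
        add_record(record)
    except Exception:
        # The record could not be written: do not leave an orphaned file.
        os.remove(dest_path)
        raise
```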

add comments/documentation for all routes

Acceptance Criteria:
Close when all routes from api.py are fully documented in api.raml

EDIT by Colton: Compare api.raml and routes in api.py to find api resources which have not yet been added to the raml spec

Upgrade mongo usage

As mentioned in #25, we're using some pymongo functions which are deprecated; see diff there for more details. Eventually it might be wise to upgrade those.

Uploading data doesn't result in jobs getting created in master branch

Looks like this was caused by renaming some elements of the file document inside the acquisitions collection within mongodb.

dirty->unprocessed
filename->name
filehash->hash
filetype->type

One example of the mismatch
api.wsgi Line 136
for c in application.db[c_type].find({'files.dirty': True}, ['files']):

Add 'now' flag to job entries

Proposed by @ryansanford as a preemptive production support feature.

Add a boolean now flag to job objects.
Toggling this flag will cause the job to run as soon as possible.

Change the job modification (post/put) route to only allow superusers to manipulate the field.

Change the next route to make two queries - first, check for any jobs with a now flag, second, the normal query that exists today. Critically, the now query should sort by the modified timestamp in the opposite order the normal query uses - that is, return a job that has most recently been modified and has a now flag. This contrasts with the normal query which is FIFO.

Should be set on endpoint tracked by #319.
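The two-query selection might be sketched like this, using an in-memory list in place of the jobs collection. The `pending` state name and the assumption that the normal FIFO query orders by ascending `modified` are mine, not confirmed by the current code.

```python
def next_job(jobs):
    """Pick the next job: flagged jobs first (newest), otherwise FIFO."""
    pending = [j for j in jobs if j["state"] == "pending"]
    # First query: jobs with the now flag, most recently modified first.
    urgent = [j for j in pending if j.get("now")]
    if urgent:
        return max(urgent, key=lambda j: j["modified"])
    # Second query: the normal FIFO order.
    if pending:
        return min(pending, key=lambda j: j["modified"])
    return None
```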

Eliminate or harmonize upload digest

SSL already provides integrity; all modern clients will most likely use HMAC-SHA256 at the transport level.

In order of desire:

  1. Eliminate obsolete digest options (MD5)
  2. If we're going to have an upload digest, harmonize with CAS storage
  3. Make upload digest optional (may already be the case), or eliminate entirely

Tagging @ryansanford he probably has the most experience in this area.

Force flag not respected on file uploads

Redundant file uploads are rejected with 409, even if the force flag is passed.
Expected behaviour is to overwrite the existing file and return 200.

To reproduce:

# Headers & settings
baseUrl="https://localhost:9000/api/projects"
alias sCurl="curl -sk -H 'User-Agent: SciTran Drone Engine' -H 'X-SciTran-Auth: change-me'"

# Get a project as an upload target
ProjectId=$(sCurl https://localhost:9000/api/projects | jq -r '.[0]._id')

# Generate a useless file
uname -a > example.bin

# Succeeds
sCurl $baseUrl/$ProjectId/files?force=true -F "[email protected]"

# Fails
sCurl $baseUrl/$ProjectId/files?force=true -F "[email protected]"

Example uses jq, an excellent CLI tool for JSON manipulation.
Obviously, your instance will need at least one project for this to work.

access control for file and attachment uploads

No permissions checking is currently done for attachment uploads. Use Container._get() to do it:

self._get(_id, 'rw')

Also, file uploads currently require admin permission, but that should probably be changed to read-write as well.

Bring back cron-like functionality

At the moment, job creation only runs via uwsgi cron. Without uwsgi, these tasks will not run.
#93 will reduce but not eliminate our need for cron-like functionality. The current best proposal is to use APScheduler, which has several advantages over our old approach and in general looks pretty great.

Improve batch download structure

In a batch download, if the label is missing in a session or an acquisition, in the structure of the extracted tar, we are naming the corresponding folder with the default "untitled".
Instead, we should use the timestamp or the uid of the session/acquisition.
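A possible fallback order is sketched below. The field names `label`, `timestamp`, and `uid`, the timestamp format, and the function name are assumptions for illustration.

```python
from datetime import datetime


def folder_name(container):
    """Prefer the label, then the timestamp, then the uid."""
    label = container.get("label")
    if label:
        return label
    timestamp = container.get("timestamp")
    if timestamp is not None:
        # Assumption: a filesystem-friendly format without separators.
        return timestamp.strftime("%Y%m%d_%H%M%S")
    uid = container.get("uid")
    if uid:
        return str(uid)
    # Last resort: today's behaviour.
    return "untitled"
```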

API hosted by uwsgi sometimes throws errors "No replica set members found yet"

Branch

master

Description

The error is not consistent; it occurs about half the time I start uwsgi. I believe it's related to https://jira.mongodb.org/browse/PYTHON-986, since after adding connect=False to pymongo.MongoClient() in config.py I'm unable to reproduce it.

I'll create a pull request for this change.

Environment

Mongodb is in a standalone replicaset config

Uwsgi config

[uwsgi]
wsgi-file = bin/api.wsgi
chdir=code/api
pythonpath=code/data
master = True
die-on-term = True
socket = [::]:9000
processes = 4
threads = 2

Actual error

2016-01-12T18:38:53.641920306Z [pid: 15|app: 0|req: 4/27] 192.168.3.177 () {44 vars in 809 bytes} [Tue Jan 12 18:38:53 2016] GET /api/sessions => generated 6120 bytes in 47 msecs (HTTP/1.1 200) 3 headers in 113 bytes (2 switches on core 0)
2016-01-12T18:38:53.651092720Z 2016-01-12 18:38:53      scitran.api                config.py    86:INFO Initializing database
2016-01-12T18:38:54.354535093Z 2016-01-12 18:38:54             root               webapp2.py  1552:ERRO No replica set members found yet
2016-01-12T18:38:54.354623330Z Traceback (most recent call last):
2016-01-12T18:38:54.354646766Z   File "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 1535, in __call__
2016-01-12T18:38:54.354665146Z     rv = self.handle_exception(request, response, e)
2016-01-12T18:38:54.354681000Z   File "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 1529, in __call__
2016-01-12T18:38:54.354697060Z     rv = self.router.dispatch(request, response)
2016-01-12T18:38:54.354712743Z   File "./api/api.py", line 120, in dispatcher
2016-01-12T18:38:54.354737566Z     rv = router.default_dispatcher(request, response)
2016-01-12T18:38:54.354793196Z   File "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 1278, in default_dispatcher
2016-01-12T18:38:54.354811500Z     return route.handler_adapter(request, response)
2016-01-12T18:38:54.354826626Z   File "/usr/local/lib/python2.7/dist-packages/webapp2.py", line 1101, in __call__
2016-01-12T18:38:54.354842113Z     handler = self.handler(request, response)
2016-01-12T18:38:54.354860460Z   File "./api/handlers/userhandler.py", line 17, in __init__
2016-01-12T18:38:54.354876060Z     super(UserHandler, self).__init__(request, response)
2016-01-12T18:38:54.354890716Z   File "./api/base.py", line 21, in __init__
2016-01-12T18:38:54.354905476Z     self.debug = config.get_item('core', 'insecure')
2016-01-12T18:38:54.354919783Z   File "./api/config.py", line 133, in get_item
2016-01-12T18:38:54.354934540Z     return get_config()[outer][inner]
2016-01-12T18:38:54.354949000Z   File "./api/config.py", line 110, in get_config
2016-01-12T18:38:54.354963780Z     initialize_db()
2016-01-12T18:38:54.354978013Z   File "./api/config.py", line 87, in initialize_db
2016-01-12T18:38:54.354992570Z     if not db.system.indexes.find_one():
2016-01-12T18:38:54.355007233Z   File "/usr/local/lib/python2.7/dist-packages/pymongo/collection.py", line 798, in find_one
2016-01-12T18:38:54.355038073Z     for result in cursor.limit(-1):
2016-01-12T18:38:54.355052866Z   File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 977, in next
2016-01-12T18:38:54.355068156Z     if len(self.__data) or self._refresh():
2016-01-12T18:38:54.355082543Z   File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 902, in _refresh
2016-01-12T18:38:54.355097533Z     self.__read_preference))
2016-01-12T18:38:54.355111673Z   File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 813, in __send_message
2016-01-12T18:38:54.355126883Z     **kwargs)
2016-01-12T18:38:54.355141630Z   File "/usr/local/lib/python2.7/dist-packages/pymongo/mongo_client.py", line 728, in _send_message_with_response
2016-01-12T18:38:54.355156873Z     server = topology.select_server(selector)
2016-01-12T18:38:54.355171403Z   File "/usr/local/lib/python2.7/dist-packages/pymongo/topology.py", line 121, in select_server
2016-01-12T18:38:54.355186350Z     address))
2016-01-12T18:38:54.355200403Z   File "/usr/local/lib/python2.7/dist-packages/pymongo/topology.py", line 97, in select_servers
2016-01-12T18:38:54.355215306Z     self._error_message(selector))
2016-01-12T18:38:54.355229466Z ServerSelectionTimeoutError: No replica set members found yet

Coalesce oAuth token requests

Checking a new oauth token with the provider is currently unconditional. This results in more upstream provider calls than are necessary if more than one API request occurs in a short interval with the new token.

Example: when our web app receives a token error, we will contact Google for a new token, then re-make any failed requests to the scitran API. In practice, this is perhaps ~5 requests made simultaneously with the new token.

The API could coalesce oAuth token provider checks by sharing context between workers such that only one in-flight request for a given token occurs at a time. Expected result is making the correct & minimal request count on the oAuth provider.
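Within a single process, the coalescing idea can be sketched with a lock and a table of in-flight checks. This is only an illustration: the class name and the `check_with_provider` callable are hypothetical, and because uwsgi workers are separate processes, sharing context between them would need some external store rather than this in-memory table.

```python
import threading
from concurrent.futures import Future


class TokenCheckCoalescer:
    """Allow at most one in-flight provider check per token (per process)."""

    def __init__(self, check_with_provider):
        self._check = check_with_provider
        self._lock = threading.Lock()
        self._in_flight = {}

    def check(self, token):
        with self._lock:
            future = self._in_flight.get(token)
            owner = future is None
            if owner:
                future = Future()
                self._in_flight[token] = future
        if owner:
            # Only the first caller contacts the provider.
            try:
                future.set_result(self._check(token))
            except Exception as exc:
                future.set_exception(exc)
            finally:
                with self._lock:
                    del self._in_flight[token]
        # Everyone, including the owner, reads the shared result.
        return future.result()
```

Results are not cached once the check completes, so a request arriving later triggers a fresh check; only simultaneous requests are merged.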

targeted uploads

Add handling for targeted uploads (user or processor-specified container level and ID) to PUT /nimsdata. Such uploads will come in as multipart/form-data and need to be differentiated from sortable uploads (such as those from the reaper), which carry no additional information.

Script organization

Determine new location/home for /run.sh and /bin/* files. /scripts could be a place to do this.

Error thrown when uploading .txt files

Uploading a text file throws this error:

Traceback (most recent call last):
  File "/scitran/persistent/venv/local/lib/python2.7/site-packages/webapp2.py", line 1535, in __call__
    rv = self.handle_exception(request, response, e)
  File "/scitran/persistent/venv/local/lib/python2.7/site-packages/webapp2.py", line 1529, in __call__
    rv = self.router.dispatch(request, response)
  File "./api/api.py", line 126, in dispatcher
    rv = router.default_dispatcher(request, response)
  File "/scitran/persistent/venv/local/lib/python2.7/site-packages/webapp2.py", line 1278, in default_dispatcher
    return route.handler_adapter(request, response)
  File "/scitran/persistent/venv/local/lib/python2.7/site-packages/webapp2.py", line 1102, in __call__
    return handler.dispatch()
  File "./api/base.py", line 125, in dispatch
    return super(RequestHandler, self).dispatch()
  File "/scitran/persistent/venv/local/lib/python2.7/site-packages/webapp2.py", line 572, in dispatch
    return self.handle_exception(e, self.app.debug)
  File "/scitran/persistent/venv/local/lib/python2.7/site-packages/webapp2.py", line 570, in dispatch
    return method(*args, **kwargs)
  File "./api/handlers/listhandler.py", line 367, in post
    file_store = files.FileStore(self.request, tempdir_path, filename=kwargs.get('name'))
  File "./api/files.py", line 77, in __init__
    self.hash = self.received_file.get_hash()
AttributeError: 'cStringIO.StringO' object has no attribute 'get_hash'

Image (.png) files work just fine.
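The traceback suggests that small uploads arrive as a plain in-memory buffer that lacks the `get_hash` method. One way around this would be to hash any file-like object directly instead of relying on that method. The sketch below assumes SHA-384, which matches the 96-character hex hashes seen in other logs here but is not confirmed; `hash_stream` is a hypothetical helper, and the stream is assumed to be seekable.

```python
import hashlib
import io


def hash_stream(stream, chunk_size=2 ** 20):
    """Hash a binary file-like object in chunks, then rewind it."""
    digest = hashlib.sha384()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    # Rewind so the caller can still read the content afterwards.
    stream.seek(0)
    return digest.hexdigest()
```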
