api-python's People

Contributors

jeffteeters, jonc125, neuromusic, t-b

api-python's Issues

Quickstart example

There are many examples in the examples folder and in the readme, but I think it would be most useful for new users to have a minimal quickstart example that shows how to interact with the API on real data. It would be useful to have a Jupyter notebook explaining how to go from raw numpy objects to an NWB file in the fewest steps possible, and then to build more complexity on top of that.

In addition, I think it would be useful to have an example that pulls a public dataset from the web (e.g. an EEG dataset on a public site), loads its contents into memory, and then converts it to the NWB format.
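
For the first suggestion, a minimal sketch of what the notebook's core could look like is below. It follows my reading of the scripts in examples/create_scripts; the exact keyword names passed to nwb_file.open and the utils.create_identifier helper are recalled from those examples and may need adjusting.

import numpy as np
from nwb import nwb_file
import nwb.nwb_utils as utils

# fake "raw" data: a 1-second voltage trace sampled at 1 kHz
data = np.random.randn(1000)
timestamps = np.arange(1000) / 1000.0

settings = {
    "file_name": "quickstart.nwb",
    "identifier": utils.create_identifier("quickstart example"),
    "mode": "w",
    "start_time": "2017-06-01 12:00:00",
    "description": "Minimal quickstart: one TimeSeries built from numpy arrays",
}
f = nwb_file.open(**settings)

# store the arrays as a generic TimeSeries under /acquisition/timeseries
ts = f.make_group("<TimeSeries>", "example_timeseries",
                  path="/acquisition/timeseries")
ts.set_dataset("data", data, attrs={"unit": "volt"})
ts.set_dataset("timestamps", timestamps)

f.close()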

Renaming repo?

Could this repo be renamed to something that contains 'NWB'?
Cloning this repo currently creates a directory named 'api-python', which is not very descriptive.

Calling py.nwb.display_versions.matlab() results in "Python Error: ImportError: cannot import name _errors"

I am trying to use the MATLAB bridge to construct an NWB file for our electrophysiology recordings. I got stuck with the error below. Any help would be greatly appreciated!

>> py.nwb.display_versions.matlab()

** Versions:
Python: 2.7.12 (v2.7.12:d33e0cf91556, Jun 27 2016, 15:24:40) [MSC v.1500 64 bit (AMD64)]
Python executable: D:\Program Files\R2016a\bin\win64\MATLAB.exe
unable to import hd5f: <type 'exceptions.ImportError'>

Bug in validating files opened for reading with extensions

The validator doesn't seem to recognise that the extensions have been defined.

For instance, running

cd examples/create_scripts
python analysis_e.py

creates the file and says it's valid when closing it. If you then run

python -m nwb.validate ../created_nwb_files/analysis_e.nwb

you get:

Reading 12 groups and 12 datasets

******
Validation messages follow.
** one error.
No Miscellaneous errors. -- Good
No groups missing. -- Good
No datasets missing. -- Good
No attributes missing. -- Good
No Incorrect attribute values. -- Good
1 groups custom missing attribute neurodata_type=Custom:
  1. '/analysis/aibs_spike_times'
No datasets custom missing attribute neurodata_type=Custom. -- Good
No groups defined in extension, but missing attribute schema_id. -- Good
No datasets defined in extension, but missing attribute schema_id. -- Good
** 22 warnings.
No Miscellaneous warnings. -- Good
No groups custom inside custom missing attribute neurodata_type=Custom. -- Good
5 datasets custom inside custom missing attribute neurodata_type=Custom (1 combined):
  1. '/analysis/aibs_spike_times/sweep_# (#=1-5)'
No recommended groups missing. -- Good
5 recommended datasets missing:
  1. '/general/experiment_description'
  2. '/general/experimenter'
  3. '/general/institution'
  4. '/general/lab'
  5. '/general/session_id'
No recommended attributes missing. -- Good
No recommended attributes empty. -- Good
No required_attributes_empty. -- Good
12 added attributes not described by extension (4 combined):
  1. /analysis/aibs_spike_times: (group) comments
  2. /analysis/aibs_spike_times: (group) schema_id
  3. /analysis/aibs_spike_times/sweep_#: (dataset) comments (#=1-5)
  4. /analysis/aibs_spike_times/sweep_#: (dataset) schema_id (#=1-5)
** No additions.
No groups custom and identified by attribute neurodata_type=Custom.
No datasets custom and identified by attribute neurodata_type=Custom.
No groups defined by extension and identified by attribute schema_id.
No datasets defined by extension and identified by attribute schema_id.
No added attributes described by extension.
** Summary
1 errors, 22 warnings, 0 additions
failed validation check (at least one error)
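
For reference, a quick h5py check on the file (my own diagnostic, using the path from the validate command above) prints the attributes that were actually written to the flagged group:

import h5py

# open the file the validator was run on and list the attributes of the
# group reported as "custom missing attribute neurodata_type=Custom"
with h5py.File("../created_nwb_files/analysis_e.nwb", "r") as f:
    grp = f["/analysis/aibs_spike_times"]
    for name, value in grp.attrs.items():
        print("%s = %s" % (name, value))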

Installation instructions don't mention h5py dependency

They are also not explicit about whether Python 2 or 3 should be used, though I realise that's a work in progress. I'm happy to submit a PR for the docs, or indeed update setup.py to require h5py unless there is a reason this wasn't done?
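
For concreteness, the setup.py change I have in mind is roughly the following sketch (the name and version fields are placeholders, not the project's actual metadata):

from setuptools import setup, find_packages

setup(
    name="nwb",        # placeholder -- keep the project's real metadata
    version="0.0.0",   # placeholder
    packages=find_packages(),
    # declare h5py so `pip install` / `python setup.py install` pulls it in
    install_requires=["h5py"],
)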

Modularize folders in the repository

Right now the repository consists of a fair number of modules that seem to do one or two things each. I bet there are opportunities for these to be combined into one script and/or organized into sub-modules to make things easier to follow. Has any thought been put into doing this?

Issue with extension links to extension datasets

I have tried adding a pixel_time_offsets dataset in TwoPhotonSeries linking to another extension dataset, and get the following validation error when I finish creating the NWB file:

78 datasets defined in extension, but missing attribute schema_id (2 combined):
  1. '/acquisition/timeseries/ROI_#_Green/pixel_time_offsets (#=1-39)'
  2. '/acquisition/timeseries/ROI_#_Red/pixel_time_offsets (#=1-39)'

The relevant portion of the extension definition is:

"<TwoPhotonSeries>/": {
        "description": "Extension to add a pixel_time_offsets dataset.",
        "pixel_time_offsets?": {
            "description": ("The offset from the frame timestamp at which each pixel was acquired."
                            " Note that the offset is not time-varying, i.e. it is the same for"
                            " each frame. These offsets are given in the same units as for the"
                            " timestamps array, i.e. seconds."),
            "link": {"target_type": "pixel_time_offsets", "allow_subclasses": False},
            "data_type": "float64!"
        }
    },
    "<roi_name>/*": {
        "pixel_time_offsets": {
            "description": ("The offset from the frame timestamp at which each pixel in this ROI"
                            " was acquired."
                            " Note that the offset is not time-varying, i.e. it is the same for"
                            " each frame. These offsets are given in the same units as for the"
                            " timestamps array, i.e. seconds."),
            "data_type": "float64!",
            "dimensions": [["y"], ["y", "x"]]
        }
    }

I can work around the problem (i.e. stop the validation error from happening) by adding the second line below after calling set_dataset, so it looks like the API is failing to fill in the h5attrs dict when it should in this instance. The schema_id attribute does get created in the NWB file and can be seen in HDFView, even without the extra line.

ts.set_dataset('pixel_time_offsets', 'link:'+roi['pixel_time_offsets'].name)
ts.get_node('pixel_time_offsets').h5attrs['schema_id'] = 'pixeltimes:pixel_time_offsets'

Use numpydoc docstrings?

It would be clearer how this package fits into the Python science ecosystem if the docstrings followed the numpydoc format (here is its explanation). This basically entails writing parameters, return values, etc. in the following form:

def my_function(a, b=None):
    """Do my function.

    This function does blah blah.

    Parameters
    ----------
    a : int | float
        Explanation of a.
    b : string | None
        Explanation of b.

    Returns
    -------
    something : instance of XXX
        Explanation of something
    """
and so on...

I think this would make the documentation more readable, and it would make it possible to build API docs for the web quite easily. What are your thoughts on using this?

Query on using links within ImageSegmentation reference_images

Rather than creating fresh groups within the reference_images group, I'd like to link to existing groups in /acquisition/timeseries. However, my groups there are TwoPhotonSeries groups, and when I do

ref_imgs.make_group('<image_name>', name='Zstack_image',
                                    link='link:/acquisition/timeseries/Zstack_Red_0033')

I get a warning from the validator:

6 Miscellaneous warnings (1 combined):
  1. /processing/Acquired_ROIs/ImageSegmentation/Zstack#/reference_images/Zstack_image
     - type (core:<image_name>/) is linked to a different type (core:<TwoPhotonSeries>/)
     at: /acquisition/timeseries/Zstack_Red_# (#=33,36-37,47,49,52)

I'm guessing this is because the spec defines reference_images using merge rather than link:

            "reference_images/": {
                "description": "Stores image stacks segmentation mask apply to.",
                "<image_name>/+": {
                    "description": ("One or more image stacks that the masks apply "
                        "to (can be one-element stack)"),
                    "merge+": ["<ImageSeries>/",] }}

So my question is, am I doing something wrong here, or should the spec be adapted to allow this use case?
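
Purely to illustrate the second option, and assuming the specification language accepts a "link" entry on a group the way it does on a dataset (which I have not checked), the definition might become something like:

            "reference_images/": {
                "description": "Stores image stacks segmentation mask apply to.",
                "<image_name>/+": {
                    "description": ("One or more image stacks that the masks apply "
                        "to (can be one-element stack)"),
                    # hypothetical: link to an ImageSeries or any of its subclasses
                    "link": {"target_type": "<ImageSeries>/", "allow_subclasses": True} }}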

Spinning off docs into a website

Currently the docs are difficult to follow since they're embedded in one gigantic readme file. Have there been any efforts to create a documentation folder that could be used to generate a website using, e.g., Sphinx? This would make it much easier for users to discover how to use these tools.
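
As a concrete starting point, a minimal docs/conf.py along the lines below (the metadata values are placeholders), together with a sphinx-build -b html docs docs/_build/html call, would be enough to generate a site from the existing docstrings:

# docs/conf.py -- minimal Sphinx configuration sketch
import os
import sys
sys.path.insert(0, os.path.abspath(".."))   # let autodoc import the nwb package

project = "api-python"       # placeholder
author = "NWB developers"    # placeholder

extensions = [
    "sphinx.ext.autodoc",    # pull API docs out of the docstrings
    "sphinx.ext.napoleon",   # parse numpydoc-style docstrings (see the numpydoc issue)
]

master_doc = "index"
html_theme = "alabaster"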

Using API via matlab_bridge

Hello,

Despite several attempts, my installation of the matlab-bridge API wasn't successful. Main points:

  • The Python API seems to be running smoothly ("./test_all.sh" and, for example, f = nwb_file.open(**settings) both run).

  • The issue appears to be in the h5py installation. I have tried both installing h5py beforehand (via conda from conda-forge and anaconda, and via pip; I have also tried multiple versions) and letting the nwb setup.py installer do the job. The latter fails, whereas the former installs successfully but results in failure when calling the module from MATLAB. (I have done setenv('HDF5_DISABLE_VERSION_CHECK', '1').)

Some reproducible error messages:

  1. >> nwbModule = py.importlib.import_module('nwb.h5gate')
    Error using _objects>init h5py._objects
    (D:\Build\h5py\h5py-2.7.0\h5py_objects.c:7748) (line 1)
    Python Error: ImportError: DLL load failed: The specified procedure
    could not be found.
    Error in _objects>init h5py.h5r
    (D:\Build\h5py\h5py-2.7.0\h5py\h5r.c:3222) (line 12)
    Error in h5r>init h5py._conv
    (D:\Build\h5py\h5py-2.7.0\h5py_conv.c:7536) (line 21)
    Error in init> (line 34)
    from ._conv import register_converters as _register_converters
    Error in find_links> (line 6)
    Error in h5gate> (line 17)
    Error in init>import_module (line 37)
    import(name)

  2. >> run_all_tests
    pyversion is;
    version: '2.7'
    executable: 'C:\Users\Martin\Anaconda3\envs\nwb-edit\python.exe'
    library: 'C:\Users\Martin\Anaconda3\envs\nwb-edit\python27.dll'
    home: 'C:\Users\Martin\Anaconda3\envs\nwb-edit'
    isloaded: 1
    Undefined variable "py" or class "py.nwb.h5gate.File".
    Error in h5g8.file (line 23)
    obj.fg_file = py.nwb.h5gate.File(file_name, kwargs);
    Error in nwb_file (line 52)
    f = h5g8.file(file_name, spec_files, arg_vals.default_ns, options);
    Error in t_annotation/create_annotation_series (line 36)
    f = nwb_file(settings{:});
    Error in t_annotation/test_annotation_series (line 19)
    create_annotation_series(fname, name, 'acquisition/timeseries');
    Error in t_annotation (line 78)
    test_annotation_series()
    Error in run_all_tests (line 37)
    feval(name, verbosity);

  3. The previous two cases were with Python 2.7, whereas on Python 3.6 MATLAB crashes with a system error:
    Abnormal termination:
    Access violation
    Register State (from fault):
    [...]
    Stack Trace (from fault):
    [ 0] 0x0000000000000000 +00000000
    [ 1] 0x00007ffd51cd52a9 C:\Users\Martin\Anaconda3\lib\site-packages\h5py_conv.cp36-win_amd64.pyd+00021161 PyInit__conv+00006969
    [ 2] 0x00007ffd51cd5219 C:\Users\Martin\Anaconda3\lib\site-packages\h5py_conv.cp36-win_amd64.pyd+00021017 PyInit__conv+00006825
    [ 3-126] .....
    [127] 0x000000000844a4cc C:\Program Files\MATLAB\R2016b\bin\win64\m_parser.dll+00042188 mps_count_formal_outputs+00002196

  4. And when trying to install a fresh h5py with python setup.py install from the nwb_api-python folder, I get:
    c:\users\martin\appdata\local\temp\easy_install-g72ay1\h5py-2.7.0\h5py\api_compat.h(27) : fatal error C1083: Cannot open include file: 'hdf5.h': No such file or directory
    error: Setup script exited with error: command 'C:\Users\Martin\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Bin\amd64\cl.exe' failed with exit status 2

I would be grateful for any tips on how to solve this problem. Thank you!
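
In case it helps with diagnosis: one check worth running from plain Python (outside MATLAB) is to print the h5py build information and compare the HDF5 version it was built against with the HDF5 version that ships with MATLAB. A mismatch between the two is my best guess for the DLL errors above, though I have not confirmed it.

import h5py

# summary of h5py, HDF5, numpy and Python versions for this interpreter
print(h5py.version.info)
# just the HDF5 library version h5py was built against
print(h5py.version.hdf5_version)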

Why does RoiResponseSeries only allow a single value per ROI per time?

I'm trying to represent data from an AOL microscope in NWB. It supports acquiring data directly from defined ROIs (which may be points, line segments, planes or volumes). I was thus thinking that using RoiResponseSeries within a Fluorescence interface would be most appropriate for storing the recordings, having defined the ROIs within ImageSegmentation with reference to an initial Z stack acquisition. However, RoiResponseSeries only allows a single value per ROI per time point, so it cannot store the full data from a 1D, 2D or 3D ROI, only a summary such as the mean.

Is this deliberate, or was this kind of imaging just not anticipated?

Would I be better off storing the data within /acquisition/timeseries, using a separate TwoPhotonSeries group for each ROI and channel?

Query on interpretation of unit conversion factors

The spec says things like "Scalar to multiply each element in data to convert it to the specified unit" in TimeSeries and its subclasses. This suggests that values stored in, say, millimetres should have a conversion attribute of 1000 and specify the unit as 'metre'. In /general/optophysiology the spec describes conversion as "Multiplier to get from stored values to specified unit (e.g., 1e-3 for millimeters)", where the example gives the opposite sense to the text.

Which is correct?

Why the `nwb_`?

Why are the core modules (file, utils, etc.) named nwb_XXX? It seems unnecessary given that the main package is already called nwb, and in the examples they are renamed on import anyway, with import nwb.nwb_utils as utils. Maybe this is a minor point, but it would be good to streamline the API to follow the conventions of other major Python projects in the numpy ecosystem.
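
For example, the first import below is what the current examples do; with un-prefixed module names (hypothetical, not the current API) it would simply be the second:

# current usage, as in the examples
import nwb.nwb_utils as utils

# hypothetical usage if the nwb_ prefix were dropped (not currently valid)
from nwb import utils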

Python 3 compatibility

I see that the readme says Python 3 compatibility is being worked on, but what is the status of this currently? Most folks (me included) have switched to Python 3 at this point, and new versions of IPython (and probably other tools as well) are going to stop supporting Python 2.
