naplab / naplib-python
Tools and functions for neural data processing and analysis in Python
Home Page: https://naplib-python.readthedocs.io/en/latest/index.html
License: MIT License
It would be great if cuML models were directly compatible with naplib's TRF function.
Probably need to either allow input data to be of type cupy.ndarray and check to see if this is the type, or just assume it is one case or the other.
Currently, we log warnings, errors, info, and debug output using different methods in different parts of the code, mainly print calls and the logging library. We should unify all output so that it can be controlled from one place, ideally via the logging library.
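As a sketch of the unified approach, a single package-level logger would give users one knob for all naplib output (the names and module layout here are hypothetical, not the library's actual code):

```python
import logging

# Hypothetical package-level logger; the actual module layout may differ.
logger = logging.getLogger("naplib")
logger.addHandler(logging.NullHandler())  # stay silent by default, as libraries should

def set_logging(level=logging.INFO):
    """Single place to control all naplib output."""
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter("[%(name)s] %(levelname)s: %(message)s"))
    logger.handlers = [handler]
    logger.setLevel(level)

# Library code would then replace print(...) calls with:
# logger.debug(...), logger.info(...), logger.warning(...), logger.error(...)
```

Each submodule could fetch a child logger via logging.getLogger("naplib.<module>") so filtering works hierarchically.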
This line in process_ieeg
https://github.com/naplab/naplib-python/blob/main/naplib/naplab/process_ieeg.py#L260
and the next line
https://github.com/naplab/naplib-python/blob/main/naplib/naplab/process_ieeg.py#L261
essentially hard-code a befaft period of 1 second on either side. The 1 should be replaced with befaft[0] and befaft[1], respectively.
The first few and last few samples around phase/amplitude extraction get very distorted, so it would be great to add a buffer to befaft which is then removed after phase and amplitude extraction.
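A minimal sketch of the buffering idea, assuming the Hilbert transform is what computes phase and amplitude (the function and parameter names are hypothetical):

```python
import numpy as np
from scipy.signal import hilbert

def extract_phase_amplitude(x, buffer_samples=100):
    """Pad the signal before the analytic-signal computation and trim the
    padding afterwards, so edge distortion lands in the discarded buffer.
    In process_ieeg, buffer_samples would come from the extra befaft
    seconds multiplied by the sampling rate."""
    x = np.asarray(x, dtype=float)
    padded = np.pad(x, buffer_samples, mode="reflect")   # extend both ends
    analytic = hilbert(padded)
    analytic = analytic[buffer_samples:-buffer_samples]  # drop the buffer
    return np.angle(analytic), np.abs(analytic)
```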
If labels are not provided to electrode_lags_fratio(), the default is None but no error is thrown:
lags = nl.segmentation.electrode_lags_fratio(data, field='resp')
Currently, downsampling to the intermediate sampling rate happens immediately after loading the raw data files. If we can move this resampling into (some of) the load functions, it will improve performance substantially. However, this is only useful when loading data one electrode at a time, because resampling has to be done on the full signal, not on chunks of it, to maintain the same precision we have now. Currently, this does not seem to be the case for any of the load functions.
process_ieeg computes spectrograms of more than just what is in StimOrder. It currently computes a spectrogram for every audio file and then selects only the ones in StimOrder. This is inefficient for a task which uses only some of the files.
Having the ability to read in data stored in BIDS format would be helpful for adoption by new users. BIDS is pretty well established across modalities.
add zeros to start and end of stimulus in process_ieeg
Fails if 'out' variable is a struct, rather than a struct array.
Since the TRF by default fits a separate model for each output channel, it would be easy to parallelize this loop.
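The per-channel loop could be parallelized along these lines (least squares stands in for whatever estimator TRF actually uses internally; this is an illustration, not naplib's code):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def fit_trf_per_channel(X, Y, max_workers=4):
    """Fit one independent model per output channel of Y (time x channels),
    in parallel. Threads suffice here because numpy's lstsq releases the
    GIL; joblib's Parallel would work equally well."""
    def _fit_one(y):
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        return coef
    with ThreadPoolExecutor(max_workers=max_workers) as ex:
        return list(ex.map(_fit_one, Y.T))
```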
Single function/class for the entire preprocessing pipeline given a config of parameters.
In shadederrorplot(), the line and corresponding shaded region should be the same color without having to manually specify.
It would be nice if concat_apply took **kwargs or a kwargs_dict parameter which it passed through to the function being applied
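The requested passthrough could look like this sketch (the signature is hypothetical, not concat_apply's actual one):

```python
import numpy as np

def concat_apply(trials, func, **kwargs):
    """Concatenate trials along time, apply func once with the forwarded
    keyword arguments, then split the result back into trials."""
    lengths = [len(t) for t in trials]
    out = func(np.concatenate(trials, axis=0), **kwargs)
    return np.split(out, np.cumsum(lengths)[:-1], axis=0)
```

This would allow calls like concat_apply(trials, np.clip, a_min=0, a_max=1) without wrapping the function in a lambda.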
A general function which takes a graph (in matrix form) defining which electrodes to use when computing the reference for each electrode. It should also take a method, such as 'avg', 'med', or 'pca'.
For example, if 'resp' has 4 electrodes:
rereferenced_data = naplib.preprocessing.rereference(data, field='resp', arr=arr, method='avg')
where arr is a 4x4 matrix:
to define blocks for local rereferencing:
arr = [[1,1,0,0],
[1,1,0,0],
[0,0,1,1],
[0,0,1,1]]
to define blocks for global rereferencing:
arr = [[1,1,1,1],
[1,1,1,1],
[1,1,1,1],
[1,1,1,1]]
to define blocks for a weighted rereferencing:
arr = [[1,.5,.25,0],
[.5,1,.5,.25],
[.25,.5,1,.5],
[0,.25,.5,1]]
For example
arr = naplib.utils.create_block_rereference_arr([1,1,2,2])
# arr = [[1,1,0,0],
# [1,1,0,0],
# [0,0,1,1],
# [0,0,1,1]]
arr = naplib.utils.create_block_rereference_arr([1,1,1,1])
# arr = [[1,1,1,1],
# [1,1,1,1],
# [1,1,1,1],
# [1,1,1,1]]
or to get it from a list/array of channel names:
arr = naplib.utils.create_rereference_arr_from_channelnames(['RTx1', 'RTx2', 'RTs1', 'RTs2'], method='amplifier')
# arr = [[1,1,0,0],
# [1,1,0,0],
# [0,0,1,1],
# [0,0,1,1]]
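A sketch of both pieces under the 'avg' method, with resp as a time-by-electrodes array (the implementation details are assumptions, not naplib's actual code):

```python
import numpy as np

def create_block_rereference_arr(groups):
    """Build the block matrix from group labels, e.g. [1,1,2,2]."""
    groups = np.asarray(groups)
    return (groups[:, None] == groups[None, :]).astype(float)

def rereference(resp, arr, method="avg"):
    """arr[i, j] weights electrode j's contribution to electrode i's
    reference. Only 'avg' is sketched; 'med' and 'pca' would branch here."""
    if method == "avg":
        weights = arr / arr.sum(axis=1, keepdims=True)  # normalize each row
        return resp - resp @ weights.T                  # subtract per-electrode reference
    raise NotImplementedError(method)
```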
https://naplib-python.readthedocs.io/en/latest/references/visualization.html#shaded-error-plot
Documentation for hierarchical cluster plot is missing description of "data" arg.
The imSTRF description should say "colorbar is centered at zero", not "data is centered at zero".
Would be great to have a page on the docs which details how the OutStruct works and how to get started with some simple stuff.
KFold.split() should work when setting shuffle and random_state. Instead, it raises a NameError when calling split after setting a random state.
kfold = nl.model_selection.KFold(6, shuffle=True, random_state=1)
for t1, t2 in kfold.split([i for i in range(29)]):
print((t1))
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
/tmp/ipykernel_638517/928584509.py in <module>
1 kfold = nl.model_selection.KFold(6, shuffle=True, random_state=1)
----> 2 for t1, t2 in kfold.split([i for i in range(29)]):
3 print((t1))
~/naplab/GPT-Encoding/AnalysisCode/naplib/model_selection/model_selection.py in split(self, *args)
82 )
83
---> 84 for train, test in super().split(data[0]):
85 tmp = list(
86 chain.from_iterable(
~/anaconda3/envs/gpt_encoding/lib/python3.7/site-packages/sklearn/model_selection/_split.py in split(self, X, y, groups)
338 )
339
--> 340 for train, test in super().split(X, y, groups):
341 yield train, test
342
~/anaconda3/envs/gpt_encoding/lib/python3.7/site-packages/sklearn/model_selection/_split.py in split(self, X, y, groups)
84 X, y, groups = indexable(X, y, groups)
85 indices = np.arange(_num_samples(X))
---> 86 for test_index in self._iter_test_masks(X, y, groups):
87 train_index = indices[np.logical_not(test_index)]
88 test_index = indices[test_index]
~/anaconda3/envs/gpt_encoding/lib/python3.7/site-packages/sklearn/model_selection/_split.py in _iter_test_masks(self, X, y, groups)
96 By default, delegates to _iter_test_indices(X, y, groups)
97 """
---> 98 for test_index in self._iter_test_indices(X, y, groups):
99 test_mask = np.zeros(_num_samples(X), dtype=bool)
100 test_mask[test_index] = True
~/naplab/GPT-Encoding/AnalysisCode/naplib/model_selection/model_selection.py in _iter_test_indices(self, X, y, groups)
46 indices = np.arange(n_samples)
47 if self.shuffle:
---> 48 check_random_state(self.random_state).shuffle(indices)
49
50 n_splits = self.n_splits
NameError: name 'check_random_state' is not defined
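The traceback points at a missing import: the shuffle branch in model_selection.py calls check_random_state, which exists in sklearn.utils but was never imported there. The likely one-line fix, with a small demonstration of the utility:

```python
# Likely missing import in naplib/model_selection/model_selection.py:
from sklearn.utils import check_random_state

import numpy as np

indices = np.arange(10)
check_random_state(1).shuffle(indices)  # in-place, deterministic for a fixed seed
```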
logging throws error for python<3.9 in process_ieeg
Load raw data from edf files
Instead of a file, allow a list of strings
Given a directory containing all the elec_recon files, minimally extract the following:
If possible, also extract region labels, but there might not be a common format for this
Would be great if you could manually set weights in TRF for easy simulation of STRFs
Load raw data from NWB format
naplib-python/naplib/visualization/plots.py, line 404 in 1cf67c0
The shaded error plot docstring has incorrect formatting, resulting in the Raises section being improperly displayed.
https://naplib-python.readthedocs.io/en/latest/references/visualization.html#shaded-error-plot
Should minimally have the ability to perform the following:
naplib.visualization.hierarchical_cluster_plot should have an axis argument, one of ['x', 'y', 'xy'], that specifies which axis of the data matrix to perform hierarchical clustering on. If 'x', the dendrogram will be on top of the data figure (the current default). If 'y', the dendrogram would be to the left of the data figure. If 'xy', there would be two dendrograms, one above and one to the left of the data.
Would be nice to automatically add legend strings to each line so that if the user calls plt.legend() it works.
publish on conda-forge
Create a function which performs alignment based on a set of sounds/triggers (like a list of numpy arrays which are wav-data) and a single recording of the acoustics from the experiment. Might additionally require Stimulus Order information.
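A minimal sketch of the core idea using plain cross-correlation (a real version would handle resampling, noise, and StimOrder; the names are hypothetical):

```python
import numpy as np

def find_stimulus_onsets(recording, stimuli):
    """Locate each stimulus waveform inside a long recording by
    cross-correlation and return the onset sample index of each."""
    onsets = []
    for stim in stimuli:
        corr = np.correlate(recording, stim, mode="valid")  # slide stim over recording
        onsets.append(int(np.argmax(corr)))                 # best-matching offset
    return onsets
```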
After parsing args, there should be better and more descriptive errors thrown if data is not correct. For example, if you pass a list of length 2 (when it should be an array of length 2) to responsive_ttest, the error comes up later, but it should really happen right after the inputs are parsed:
https://github.com/naplab/naplib-python/blob/main/naplib/stats/responsive_elecs.py#L112
Y = [np.random.rand(400,3) for _ in range(5)]
_, stats = responsive_ttest(resp=Y, befaft=[1,1], sfreq=100, alpha=0.01, random_state=1)
whereas no error happens if you correctly pass befaft as an array:
_, stats = responsive_ttest(resp=Y, befaft=np.array([1,1]), sfreq=100, alpha=0.01, random_state=1)
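The requested early check could be as simple as coercing and validating befaft right after parsing (validate_befaft is a hypothetical helper, not naplib code):

```python
import numpy as np

def validate_befaft(befaft):
    """Coerce list/tuple input and fail immediately with a clear message,
    instead of letting a cryptic error surface later in the computation."""
    befaft = np.asarray(befaft, dtype=float)
    if befaft.shape != (2,):
        raise ValueError(
            f"befaft must contain exactly 2 values (before, after), "
            f"got shape {befaft.shape}"
        )
    return befaft
```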
Would be nice to have a parallelized "apply" method for the naplib.Data class which would allow you to easily parallelize a function over trials of the Data object, since you typically have to loop through trials to apply a function to each.
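A sketch of the idea; here trials stands in for iterating over a naplib.Data object's trials, and the helper name is hypothetical:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def parallel_apply(trials, func, max_workers=4):
    """Map func over trials in parallel, replacing the usual
    `for trial in data: ...` loop with one call."""
    with ThreadPoolExecutor(max_workers=max_workers) as ex:
        return list(ex.map(func, trials))
```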
Docs for the load_edf function correspond to load_tdt instead.
https://naplib-python.readthedocs.io/en/latest/references/io.html#load-edf
naplib-python/naplib/out_struct.py, line 256 in 29201c7
At the moment the load functions (load_edf, load_tdt, ...) support specifying a time range to read only a part of the data, but there is no way to leverage this when calling process_ieeg. This is most useful when the block of interest is a small part of the total recording file and we don't want to read the full file every time we run the pipeline; for example, when multiple experiments are contained in a single recording file.
def process_ieeg(..., time_range: Union[int, Tuple[int, int]]=0, ...):
...
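A sketch of how the passthrough might work, with a stub standing in for the real loader (all names and defaults here are assumptions, not naplib's API):

```python
from typing import Tuple, Union

def load_edf_stub(path, t1=0, t2=-1):
    """Stand-in for a load function that already accepts a time range."""
    return {"path": path, "t1": t1, "t2": t2}

def process_ieeg(path, time_range: Union[int, Tuple[int, int]] = 0):
    """Forward time_range to the loader so only the block of interest is read."""
    if isinstance(time_range, int):
        t1, t2 = time_range, -1  # start offset only; read to the end
    else:
        t1, t2 = time_range
    return load_edf_stub(path, t1=t1, t2=t2)
```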
It would be great if there were tutorials on using mne's visualization tools along with naplib-python
load raw data from TDT machines, including ideally different versions of TDT
Would be great to have a parameter in naplib.stats.responsive_ttest to optionally control whether each segment (e.g. the 1-second clip before and after onset) is averaged before being concatenated. Averaging may produce more consistent results, though perhaps not as good if you only have a few trials.
Internally, TRF performs cross-validation during fitting, but it uses a custom cross-validation splitter, so just convert this to naplib.model_selection.KFold to improve code re-use.
Add an apply method without concatenating for the array_ops module, because sometimes we don't want across-trial edge effects.
The labels parameter within naplib.segmentation.electrode_lags_fratio does not have proper documentation.
Load raw stimuli and return them, as well as meta-data like filenames
Would be nice to be able to get the manner_dict from this function:
https://github.com/naplab/naplib-python/blob/main/naplib/features/alignment_extras.py#L41