sinzlab / neuralpredictors

Machine Learning Utils of Sinzlab

License: MIT License

Dockerfile 0.02% Python 31.30% Shell 0.01% Makefile 0.06% Jupyter Notebook 68.62%

neuralpredictors's Introduction

Neuralpredictors

Sinz Lab Neural System Identification Utilities for PyTorch.

How to run the tests 🧪

Clone this repository and run the following command from within the cloned repository to run all tests:

docker-compose run pytest

How to contribute 🔥

Pull requests (and issues) are always welcome. This section describes some preconditions that pull requests need to fulfill.

Tests

Please make sure your changes pass the tests. Take a look at the test running section for instructions on how to run them. Adding tests for new code is not mandatory but encouraged.

Code Style

black

This project uses the black code formatter. You can check whether your changes comply with its style by running the following command:

docker-compose run black

Furthermore you can pass a path to the service to have black fix any errors in the Python modules it finds in the given path.

isort

isort is used to sort Python imports. You can check the order of imports by running the following command:

docker-compose run isort

The imports can be sorted by passing a path to the service.

Type Hints

We use mypy as a static type checker. Running the following command will check the code for any type errors:

docker-compose run mypy

It is not necessary (but encouraged) to add type hints to new code but please make sure your changes do not produce any mypy errors.

Note that only modules specified in the mypy-files.txt file are checked by mypy. This is done to be able to add type hints gradually without drowning in errors. If you want to add type annotations to a previously unchecked module you have to add its path to mypy-files.txt.

neuralpredictors's People

Contributors

Stargazers

Watchers

Forkers

m00mo eywalker sacadena konstantinwilleke zhiweid fabiansinz christoph-blessing eulerlab arnenx mohammadbashiri kklurz maxfburg shanqma lucabaroni shahdsaf mpicek robertpetrovic kellirestivo kaihcohrs elena-off xup5 ivust pollytur arjunsinghrathore walkerlab mvystrcilova vdobrev1 yongrong-qiu nwcimaszewski bryanlimy suhasshrinivasan wurining lavanya-m-k moslem-tg hkim42 darioliscai ecker-lab oliveira-caio neuronmorph aecker fededagos

neuralpredictors's Issues

Duplicate code in conv3d

In conv3d, some code duplicates conv2d, e.g. https://github.com/ecker-lab/neuralpredictors/blob/29206ece4ed20af57f7c6e5ee614e49da50dd8d5/neuralpredictors/layers/cores/conv3d.py#L228 . Ideally these functions would be shared rather than duplicated. For example, PR #217 improved batch norm handling in conv2d, and conv3d is now missing those features even though it could have had them.

Apply naming conventions throughout for inputs and outputs

Our internal naming conventions are as follows:

inputs
targets
data_key
behavior
pupil_center

These conventions should be followed in all datasets, transforms, etc.
This applies in particular to this module: https://github.com/sinzlab/neuralpredictors/tree/master/neuralpredictors/data

Make StackedCore2d accept a list of kernel sizes

This feature would enable the user to choose what kernel sizes can be used for the hidden layers. At the moment they all have the same size.

Add proper logger and remove `print`s

What's the best library currently for this? Does anyone have any extended experience?

suggestion

        for k, loader, _ in zip(cycle(self.loaders.keys()), cycle(cycles), range(len(self.loaders) * self.max_batches)):

Originally posted by @eywalker in https://github.com/_render_node/MDIzOlB1bGxSZXF1ZXN0UmV2aWV3VGhyZWFkMjIxMzIyNjk1OnYy/pull_request_review_threads/discussion
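
To see what the suggested loop does, here is a minimal sketch with plain lists standing in for data loaders. The names `interleave`, `loaders`, and `max_batches` are illustrative assumptions rather than the library's code, and `cycles` is assumed to hold one endlessly repeating iterator per loader.

```python
from itertools import cycle


def interleave(loaders, max_batches):
    """Yield (key, batch) pairs, visiting the loaders round-robin."""
    # Assumption: one endless iterator per loader, so short loaders restart.
    cycles = [cycle(loader) for loader in loaders.values()]
    total = len(loaders) * max_batches
    # range(total) is the only finite iterable, so it bounds the zip.
    for key, loader, _ in zip(cycle(loaders.keys()), cycle(cycles), range(total)):
        yield key, next(loader)


loaders = {"a": [1, 2, 3], "b": [10, 20]}
batches = list(interleave(loaders, max_batches=3))
```

Each key is visited `max_batches` times, and the shorter loader "b" wraps around to its first batch once it is exhausted.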

Add utility functions for nnfabrik dataloaders.

For the sinzlab-specific use case of nnfabrik, we allow the dataloader to return one of the following:

namedtuple
tuple
dictionary
pytorch default dataloaders

It is the job of the nnfabrik trainer to pass the data from the dataloader to the model. An interface is necessary that unpacks the dataloader and passes the correct arguments to their specified destination.
This interface should be part of mlutils.
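
One possible shape for such an interface is sketched below. The function name `unpack_batch` and the choice to return an `(args, kwargs)` pair are assumptions made for illustration; nothing here is taken from nnfabrik or mlutils.

```python
from collections import namedtuple


def unpack_batch(batch):
    """Split a batch into positional and keyword arguments for a model call."""
    if isinstance(batch, dict):
        return (), dict(batch)
    # namedtuple instances are tuples that also expose _asdict()
    if isinstance(batch, tuple) and hasattr(batch, "_asdict"):
        return (), dict(batch._asdict())
    # plain tuples (or lists, as default collation may produce) stay positional
    if isinstance(batch, (tuple, list)):
        return tuple(batch), {}
    raise TypeError(f"Unsupported batch type: {type(batch).__name__}")


Batch = namedtuple("Batch", ["inputs", "targets"])
args, kwargs = unpack_batch(Batch(inputs=[0.1, 0.2], targets=[1.0]))
```

A trainer could then call `model(*args, **kwargs)` without knowing which container the dataloader produced.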

Bug report - Data Transformation Normalise

This would not work for any other folder name in the TreeDataset, for instance if the folder is named 'videos'.

Here is how this is solved inside neuropredictors for now. I would propose such a modification, also to keep it backward compatible.

def __init__(self, data, stats_source="all", exclude=None, inputs_mean=None, inputs_std=None,
             in_name=None, out_name=None, eye_name=None):

    self.exclude = exclude or []
    if in_name is None:
        in_name = "images" if "images" in data.statistics.keys() else "inputs"
    if out_name is None:
        out_name = "responses" if "responses" in data.statistics.keys() else "targets"
    if eye_name is None:
        eye_name = "pupil_center" if "pupil_center" in data.data_keys else "eye_position"

Questions on shifters

Hello,

I have 2 questions regarding shifters and 1 regarding modulators.

Why is the regularizer for the MLP shifter not really implemented here, while it is defined for StaticAffine2d? Is this a bug, or is there an explanation for it?
What is the conceptual difference between the MLP and StaticAffine2d shifters? From the descriptions, it looks like MLP is just a more general version of StaticAffine2d that can have hidden layers, and with shift_layers=1 the MLP is equivalent to StaticAffine2d. Is that correct, or am I missing something?

Here are intros for MLP and StaticAffine2d:

MLP 
"""
        Multi-layer perceptron shifter
        Args:
            input_features (int): number of input features, defaults to 2.
            hidden_channels (int): number of hidden units.
            shift_layers(int): number of shifter layers (n=1 will correspond to a network without a hidden layer).
            **kwargs:
 """

and

StaticAffine2d
"""
        A simple FC network with bias between input and output channels without a hidden layer.
        Args:
            input_channels (int): number of input channels.
            output_channels (int): number of output channels.
            bias (bool): Adds a bias parameter if True.
 """

Why is there nothing in the modulators block now? I cannot find where the code was moved (if it was).

Thanks in advance!

[documentation] Link readouts to the papers where they have been used

In the Gaussian readouts there are things like UltraSparse that were committed before 2022. It would be nice to link the papers that used them (and to know why they are less used now, etc.).

Inconsistent regularizers

The Laplace regularizers are currently inconsistent between 2d/3d in whether they allow padding or not, what the default option is (padding for 2d, no padding for LaplaceL23d, padding without an option to change it for FlatLaplaceL23d) and which versions are implemented (LaplaceL2norm only for 2d). Would it be possible to handle this in a consistent way? I'd also be interested in adding some variants to make it more complete (i.e. a LaplaceL23dnorm or a 1d variant of FlatLaplaceL23d so that one can deal with space-time separable 3d kernels out of the box).

Move early stopping to pytorch ignite engine

I feel the early stopping gets overloaded with functionality:

keep track of best model
log scores
early stop
learning rate schedule

Since that doesn't even cover all our use cases, we should move to something more flexible. I feel that PyTorch Ignite might do that (https://pytorch.org/ignite/). Their central concept is that of an Engine that loops through data and does things. What it does is controlled by user-provided functions. In addition, steps like learning rate decrease or early stopping can be handled by provided or custom events triggered at the start of an epoch, at every batch, when the training is completed, and so on.

The above cases could be handled as follows:

Best model: After every epoch, the validation score is tested and a best model is stored if it improved compared to before
Log scores: Before every epoch, a number of scorers are evaluated.
Early stop: Early Stopping already has a handler in Ignite.
Learning rate scheduler: Can be handled with batch or epoch events
Adapting model components such as fixing the sampling of the Gaussian readout can be handled by custom events such as FIRST_LR_DECREASE or so.

Opinions? Does anyone want to program a toy example?
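
As a starting point for discussion, here is a toy sketch of the event-handler pattern described above, written in plain Python. It deliberately does not use Ignite: `MiniEngine`, `EarlyStopper`, and the event name `"epoch_completed"` are hypothetical stand-ins, and the scores are made-up numbers chosen only to trigger the stop.

```python
class MiniEngine:
    """Toy stand-in for an event-driven training engine (not Ignite's API)."""

    def __init__(self, step_fn):
        self.step_fn = step_fn
        self.handlers = {"epoch_completed": []}
        self.should_stop = False
        self.epoch = 0

    def on(self, event, handler):
        self.handlers[event].append(handler)

    def run(self, max_epochs):
        for self.epoch in range(1, max_epochs + 1):
            score = self.step_fn(self.epoch)
            # Every registered handler sees the engine and the new score.
            for handler in self.handlers["epoch_completed"]:
                handler(self, score)
            if self.should_stop:
                break


class EarlyStopper:
    """Tracks the best score and requests a stop after `patience` bad epochs."""

    def __init__(self, patience):
        self.patience, self.best, self.bad_epochs = patience, float("-inf"), 0

    def __call__(self, engine, score):
        if score > self.best:
            self.best, self.bad_epochs = score, 0
        else:
            self.bad_epochs += 1
            engine.should_stop = self.bad_epochs >= self.patience


scores = [0.1, 0.3, 0.2, 0.25, 0.2, 0.5]  # illustrative validation scores
engine = MiniEngine(lambda epoch: scores[epoch - 1])
stopper = EarlyStopper(patience=2)
engine.on("epoch_completed", stopper)
engine.run(max_epochs=len(scores))
```

Logging, best-model tracking, and learning rate schedules would each be another handler registered on the same event, which keeps the early stopper itself small.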

Check initialization of RemappedGaussian2d

PR issues

@cblessing24
Dear Christoph,
I am not sure what happened during PR testing. From what I see in the logs, no tests failed, but there are also no tests for these lines and the checks want me to add them. Is this correct / necessary or am I missing something?

PR link

Thanks in advance!

[bug, backward compatibility] Typing for python 3.8 and below

For Python 3.8 and below we need to use
from typing import List, and then List instead of list in annotations, like here and below
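
For illustration, a small function annotated in the style that also works on Python 3.8. Subscripting the built-in `list` in an annotation (as in `list[float]`) is only supported from Python 3.9 onward and raises a `TypeError` at definition time on older versions. The function `scale` is a made-up example, not code from the repository.

```python
from typing import List


def scale(values: List[float], factor: float) -> List[float]:
    """Multiply every value by factor."""
    return [v * factor for v in values]


scaled = scale([1.0, 2.0], 2.0)
```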

Conv2d cores `independent_bn_bias` parameter no longer required

The parameter independent_bn_bias (

neuralpredictors/neuralpredictors/layers/cores/conv2d.py

Line 60 in 29206ec

independent_bn_bias=True,

) in the conv2d cores is no longer required, as the batchnorm scale and bias parameters can now be precisely controlled by passing a list of booleans to batch_norm, batch_norm_scale, and bias.

independent_bn_bias was so far kept for backwards compatibility. Now it is obsolete and could be removed in the next version.

Add a template model forward that follows new data conventions

The forward functions of our models need to behave similarly in order to accept as many dataloaders as possible.

We settled on the following convention for the forward function of a model.

from mei.legacy.utils import varargin

@varargin
def forward(inputs, data_key=None):
    ...

There will be only a single positional argument, along with data_key as an additional keyword argument, which our dataloaders contain within the dict of dicts. All additional keyword arguments that the model needs will be caught by the varargin decorator. Its job is to unpack the kwargs and to print the kwargs that are not used as a warning to the user.
Right now, this decorator resides in the MEI package, but it needs to find a place in neuralpredictors as well.
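
To make the described behaviour concrete, here is a minimal sketch of a decorator that filters out unused keyword arguments and warns about them. It is written from the description above and is not the implementation in `mei.legacy.utils`; the warning text and the example keyword `pupil_center` are assumptions.

```python
import functools
import inspect
import warnings


def varargin(func):
    """Drop keyword arguments that func does not accept and warn about them."""
    accepted = set(inspect.signature(func).parameters)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        unused = sorted(k for k in kwargs if k not in accepted)
        if unused:
            warnings.warn(f"Ignoring unused keyword arguments: {unused}")
        used = {k: v for k, v in kwargs.items() if k in accepted}
        return func(*args, **used)

    return wrapper


@varargin
def forward(inputs, data_key=None):
    # Stand-in for a model forward pass: just echo what was received.
    return inputs, data_key
```

Calling `forward(batch_inputs, data_key="session_1", pupil_center=...)` would then pass `data_key` through and emit one warning naming `pupil_center`.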

Bug in SE2dCore

neuralpredictors/neuralpredictors/layers/cores/conv2d.py

Line 628 in a761c1e

    
           in_channels=self.hidden_channels if not self.skip > 1 else min(self.skip, l) * self.hidden_channels,

This line needs to respect that hidden_channels is turned into a list in super().__init__

Alternatively, you can just keep all the outputs into a list first, and then simply index into it with `self.stack`.


Originally posted by @eywalker in https://github.com/_render_node/MDIzOlB1bGxSZXF1ZXN0UmV2aWV3VGhyZWFkMjIxMzIyMTQwOnYy/pull_request_review_threads/discussion

A couple of issues

EDIT: Wrong Repo

[Bug] - 3D / 3Dfactorised core with padding

@Mvystrcilova
For 3D cores, get_output_shape does not work if padding is True.
This and this would fail if padding is True. Why not use the default neuropredictors function for this?

Thanks in advance:)

as a suggestion for the model creation functions - for me something like this works fine

from operator import itemgetter
from neuralpredictors.utils import get_module_output
from nnfabrik.utility.nn_helpers import get_dims_for_loader_dict


session_shape_dict = get_dims_for_loader_dict(dataloaders['train'])
subselect = itemgetter(0, 2, 3)
in_shapes_dict = {
    k: subselect(tuple(get_module_output(core, v[in_name])[1:]))
    for k, v in session_shape_dict.items()
}

Cleaning Up before going public

rewrite datasets
Clean up Core, Core2d, and Stacked2dCore. Make Core an nn.Module, and get rid of Core2d
find a new name for ml-utils: current runner up: nnkortex

Redesign the device_state to work with individual devices

As of now, the device state

neuralpredictors/neuralpredictors/training.py

Line 87 in 5d06b38

def device_state(model, device):

only works with cuda or cpu as an input. For multi-GPU training, it should be able to handle the actual devices on top of these inputs.

Gaussian MultiReadout assumes shared parameters

In the current neuralpredictors main, the MultipleFullGaussian2d is inheriting from MultiReadoutSharedParametersBase.
This might be a problem in some cases. For example, if I just want a "normal" MultiGauss2d, without sharing anything, I could just invoke it as it is and it works.
However, it does not work with grid_mean_predictor.
Because, for example, whenever a grid_mean_predictor is passed, it expects that there is a source grid.

The easiest solution, and my proposal, is to make it inherit from MultiReadoutBase then, so that the init would look like:

class MultipleFullGaussian2d(MultiReadoutBase):
    _base_readout = FullGaussian2d

So either MultiReadoutSharedParametersBase is expanded, so it does work with and without sharing, or we have two independent instantiations from the respective base classes. I'm leaning towards the latter.

Make Multireadouts consistent.

In layers.readouts, there's a new class, MultiReadouts, which all multireadouts should inherit from. So far this is only implemented for the Gaussian2d; the others have to follow.

Laplace regularizer does not zero-pass

... but it should to avoid high-frequency edge artifacts.

[Bug] - 3D / 3Dfactorised parameter final_nonlin has no effect for cores with more than 1 layer

@Mvystrcilova, @pollytur
For the Basic3dCore and Factorized3dCore in 3dcores.py, the parameter final_nonlin has no effect if the core has more than one layer. The reason is an if statement that is always true in the loop that defines all layers after the first one.

In the for loop defined here, the variable l takes values from 0 to number of layers - 2, and the condition a few lines later (line 183) compares l to self.layers. By construction, this comparison is always True for l. The same holds for the for loop defined here and the if condition here (line 430).
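
The effect can be reproduced in isolation. The exact condition is not quoted in this report, so the sketch below assumes it has the form `l < layers or final_nonlin`; the function names and the proposed comparison against `layers - 2` are likewise illustrative assumptions, not the library's code.

```python
def nonlinearity_flags(layers, final_nonlin):
    """Per subsequent layer: would a nonlinearity be added (assumed current check)?"""
    flags = []
    for l in range(layers - 1):  # l runs from 0 to layers - 2
        # l can never reach `layers`, so final_nonlin is never consulted.
        flags.append(l < layers or final_nonlin)
    return flags


def fixed_flags(layers, final_nonlin):
    """Compare against the index of the last subsequent layer instead."""
    last = layers - 2
    return [l < last or final_nonlin for l in range(layers - 1)]


current = nonlinearity_flags(layers=3, final_nonlin=False)
fixed = fixed_flags(layers=3, final_nonlin=False)
```

With three layers and `final_nonlin=False`, the assumed current check still adds a nonlinearity after the last layer, while the adjusted comparison omits it.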

[Bug report] - regularizers

Got an error RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

Here is the traceback:

File .../neuralpredictors/neuralpredictors/layers/cores/conv2d.py:461, in RotationEquivariant2dCore.laplace(self)
    460 def laplace(self):
--> 461     return self._input_weights_regularizer(
    462         self.features[0].hermite_conv.weights_all_rotations, avg=self.use_avg_reg
    463     )

File ~/anaconda3/envs/sensorium/lib/python3.9/site-packages/torch/nn/modules/module.py:1130, in Module._call_impl(self, *input, **kwargs)
   1126 # If we don't have any hooks, we want to skip the rest of the logic in
   1127 # this function, and just call forward.
   1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130     return forward_call(*input, **kwargs)
   1131 # Do not call functions when jit is used
   1132 full_backward_hooks, non_full_backward_hooks = [], []

File .../neuralpredictors/neuralpredictors/regularizers.py:176, in LaplaceL2norm.forward(self, x, avg)
    173 agg_fn = torch.mean if avg else torch.sum
    175 oc, ic, k1, k2 = x.size()
--> 176 return agg_fn(self.laplace(x.view(oc * ic, 1, k1, k2)).pow(2)) / agg_fn(x.view(oc * ic, 1, k1, k2).pow(2))

RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

Suggestion: replace view with reshape (this worked locally for me). torch.reshape returns a view when possible and copies otherwise, so it also works with non-contiguous tensors, which I guess was the issue here. More details here

Make Cortex Readout Single trial dependent

The cortex readout could learn on a trial-by-trial basis, to also make use of the eye position as used by the shifter network.

Question on strides

I am not sure if it's a typo or if it actually makes sense.

For Stacked2dCore, the stride is not defined in the function add_first_layer (the stride value is not passed to the PyTorch Conv2d layer), while it is passed for all other layers in add_subsequent_layers, and it is passed in add_first_layer for RotationEquivariant2dCore. So, should the stride be passed for layer 0 as well? (I assume it's a typo.)

Best, Polly

Add a regularizer to the Gaussian3D readout

All transforms need to work on batches as well as the whole dataset

The Transforms for the dataset classes in mlutils.data.datasets are usually applied for a single entity. Since the dataset can be called either as a batch, or accessing the whole dataset (data.dataset[:].images for example), the transforms should work in both cases.

This behavior is true for the Normalizer, Sampler, ToTensor, AddBehaviorAsChannels. The other Transforms have to be tested, and all new transforms also have to adhere to this convention.

Bug in conv2d Core

@MaxFBurg I think that this ValueError throws errors even for good configs:

neuralpredictors/neuralpredictors/layers/cores/conv2d.py

Line 124 in a761c1e

raise ValueError(

Just setting bias=False in the config will lead to a ValueError.

`group_sparsity` regularization results in an error with different conv types

In the group_sparsity regularizer of the Stacked2dCore, the convolution layers are referred to as conv, but that does not generalize to other conv layer types, like ds_conv for instance.

neuralpredictors/neuralpredictors/layers/cores/conv2d.py

Lines 247 to 257 in 7f1b303

    
def group_sparsity(self):
    """
    Sparsity regularization on the filters of all the conv2d layers except the first one.
    """
    ret = 0
    for feature in self.features[1:]:
        ret = ret + feature.conv.weight.pow(2).sum(3, keepdim=True).sum(2, keepdim=True).sqrt().mean()
    return ret / ((self.num_layers - 1) if self.num_layers > 1 else 1)

def regularizer(self):
    return self.group_sparsity() * self.gamma_hidden + self.gamma_input * self.laplace()

Implement shifter for factorised readouts

Currently not implemented (link), however, logic from Sinz 2019 could be borrowed to do it (here and here)

Wrong order of dimensions for readout in_shape

When we pass the shape of the core output tensor to the readout, we consistently have the wrong order of height and width.

c, w, h = in_shape

(from

neuralpredictors/neuralpredictors/layers/readouts/gaussian.py

Line 64 in 711ffee

self.in_shape = in_shape

)

It's not a bug per se, because it is consistent internally, but it would be cleaner to change it for all readouts.

This conflicts with skip logic

https://github.com/sinzlab/ml-utils/blob/59949af3673bada3ba0c34d18327c5b028dae207/mlutils/layers/cores.py#L150-L151

[example request] Train loop example for the warm-up scheduling

To create an example for the train loop with the warm-up scheduler combined with the early stopping scheduler

Minor Improvements of FiringRateEncoder

After using the new FiringRateEncoder for the neuralprediction challenge, I noticed that it could use a few minor improvements.

This relates back to the question of how the dataloader args/kwargs should be passed to the encoder.
We agreed that the input always has to be an arg. The rest can be both args/kwargs.

Here's an example use case:

batch = get_batch(dataloader)
model(*batch, **batch._asdict())

This would be the most general way of passing everything to the model, and I think that this is the cleanest way.

In that case, our FiringRateEncoder fails, because it gets the inputs twice: as arg and as kwarg.
PR #152 is an example.

So we could either hand the responsibility of sorting the args/kwargs to the user, or we make our Encoder more forgiving. I'd suggest the latter.
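
The failure mode and one way to make a forward function more forgiving can be shown without the encoder itself. In this sketch, `Batch`, `strict_forward`, and `forgiving_forward` are hypothetical stand-ins, and the rule "the positional argument wins" is an assumed design choice, not what PR #152 necessarily does.

```python
from collections import namedtuple

Batch = namedtuple("Batch", ["inputs", "targets"])


def strict_forward(inputs, targets=None):
    return inputs


def forgiving_forward(*args, **kwargs):
    # The first positional argument wins; a duplicate keyword copy is dropped.
    if args:
        kwargs.pop("inputs", None)
        inputs = args[0]
    else:
        inputs = kwargs.pop("inputs")
    return inputs


batch = Batch(inputs=[1, 2, 3], targets=[0])
try:
    strict_forward(*batch, **batch._asdict())
    error = None
except TypeError as exc:
    # Python rejects receiving the same parameter positionally and by keyword.
    error = str(exc)
result = forgiving_forward(*batch, **batch._asdict())
```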

[bug] Batch norm layers for 3d core

After the hackathon adjustments, add_bn_layer is inherited from ConvCore, but the function there does not take hidden_channels, which is needed for the 3d core (here).

Also, instead of throwing errors here, it would be great to convert the values to lists and emit a warning. Otherwise, most of the public repos are not backward compatible, even though it is easy to make them so.

Mistake in the FullGaussian2d readout

The way we are sampling readout positions in FullGaussian2d readout is different from what is claimed in the paper and also does not guarantee a proper Gaussian distribution (i.e. the learned covariance matrix is neither guaranteed to be symmetric nor have a positive determinant). More details below.

The mistake is two-fold:

1. The learned sigma is not restricted to be a proper covariance matrix.

Firstly, at initialization, we are sampling the entries of the covariance matrix from a uniform distribution, which means that it is possible to have covariances bigger than the variances, and there is no guarantee that the matrix is symmetric. Also, even if the initialization works out by chance (which is very unlikely), there is no restriction during learning.

As a result, we can end up with covariance matrices with negative determinants, which should not happen.

2. The way we are using the sigma in the re-parametrization trick is not correct.

Our goal is to sample from a normal distribution with mean \mu and covariance \sigma. We are utilizing the reparametrization trick for this, but the way we are doing it is not correct. What we do is transform samples from a standard normal distribution as \sigma @ sample_{std_normal} + \mu, but this is NOT equivalent to sampling from the normal distribution with mean \mu and covariance matrix \sigma.

@fabiansinz would you agree with these points? or am I missing something?

If my understanding is correct, my suggestion for a fix is as follows:

learn a rotation angle \theta and two eigenvalues (variances) s per neuron
then \sigma = rot_mat(\theta) @ diag(s) @ rot_mat(\theta).T
sample via rot_mat(\theta) @ diag(sqrt(s)) @ sample_{std_normal} + \mu

This way we would have a proper covariance matrix, the sampling is correct, and instead of 4 parameters per neuron, we would have 3 parameters to learn.
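
The proposal can be checked numerically with a small standard-library sketch. For z drawn from a standard normal, A @ z + \mu has covariance A @ A.T, so multiplying by \sigma itself yields \sigma @ \sigma.T rather than \sigma, whereas A = rot_mat(\theta) @ diag(sqrt(s)) gives exactly rot_mat(\theta) @ diag(s) @ rot_mat(\theta).T. The helper names below are illustrative, and the variances are assumed to be kept positive by some parametrization that is not shown.

```python
import math
import random


def rot_mat(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def covariance(theta, variances):
    """sigma = R diag(s) R^T, symmetric with determinant s[0] * s[1]."""
    r = rot_mat(theta)
    d = [[variances[0], 0.0], [0.0, variances[1]]]
    return matmul(matmul(r, d), transpose(r))


def sample(mu, theta, variances, rng=random):
    """Draw mu + R diag(sqrt(s)) z with z ~ N(0, I)."""
    r = rot_mat(theta)
    z = [rng.gauss(0.0, 1.0) * math.sqrt(v) for v in variances]
    return [mu[i] + r[i][0] * z[0] + r[i][1] * z[1] for i in range(2)]


sigma = covariance(theta=0.3, variances=(0.5, 2.0))
det = sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0]
```

Because a rotation preserves the determinant, `det` equals the product of the two variances, which is positive whenever both variances are.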

Missing shifter option for Encoder3d

The base encoder for video data is missing a shifter option (and documentation)

if `stack` is an int, what's the point in doing `range(self.layers)[stack]`? shouldn't this just be the same as `[stack]`?


Originally posted by @eywalker in https://github.com/_render_node/MDIzOlB1bGxSZXF1ZXN0UmV2aWV3VGhyZWFkMjIxMzIwOTYyOnYy/pull_request_review_threads/discussion

Bug report - stride does not change for RotationEquivariant2dCore

I tried to change the stride for RotationEquivariant2dCore, but it does not update (checked with model.stride).
To reproduce:

model = RotationEquivariant2dCore(..., stride=2)
print(model.stride) # prints 1, although 2 was requested

The issue, I guess, is that stride gets reset to the default value when super.__init__ is called, as stride is not passed there (source code link).

The dirty fix (for someone who needs it urgently) could be

self.num_rotations = num_rotations
self.upsampling = upsampling
self.rot_eq_batch_norm = rot_eq_batch_norm
self.init_std = init_std
super().__init__(*args, **kwargs, input_regularizer=input_regularizer, stride=stride)

or stride could also be passed as part of kwargs in future versions of the lib
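
The mechanism suspected above can be reproduced with two toy classes. `BaseCore`, `BuggyCore`, and `FixedCore` are hypothetical stand-ins for the real cores and only model how the `stride` argument travels through `__init__`.

```python
class BaseCore:
    def __init__(self, stride=1, **kwargs):
        self.stride = stride


class BuggyCore(BaseCore):
    def __init__(self, *args, stride=1, **kwargs):
        self.stride = stride  # set on the subclass first ...
        super().__init__(*args, **kwargs)  # ... then reset to the base default


class FixedCore(BaseCore):
    def __init__(self, *args, stride=1, **kwargs):
        # Forward stride explicitly so the base class keeps the requested value.
        super().__init__(*args, stride=stride, **kwargs)


buggy = BuggyCore(stride=2)
fixed = FixedCore(stride=2)
```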

[Bug fix?] Shift replaced if shifter defined

Here the original shift would be replaced if the shifter is defined. This is probably not the expected behaviour.

[Bug report] - Readouts

If a user would define both feature_reg_weight and gamma_readout (like here) then the feature_reg_weight would be ignored in the resolve function, which is probably not correct as gamma_readout is deprecated, hence, feature_reg_weight should have the priority.
Resolve function

def resolve_deprecated_gamma_readout(self, feature_reg_weight: float, gamma_readout: Optional[float]) -> float:
        if gamma_readout is not None:
            warnings.warn(
                "Use of 'gamma_readout' is deprecated. Please consider using the readout's feature-regularization parameter instead"
            )
            feature_reg_weight = gamma_readout
        return feature_reg_weight
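
One way to give feature_reg_weight priority is sketched below as a standalone function. It assumes feature_reg_weight can default to None so that "not set" is distinguishable from an explicit value; that sentinel, the `default` parameter, and its value of 1.0 are assumptions introduced for the example and are not part of the current signature.

```python
import warnings
from typing import Optional


def resolve_deprecated_gamma_readout(
    feature_reg_weight: Optional[float],
    gamma_readout: Optional[float],
    default: float = 1.0,  # placeholder default, chosen only for this sketch
) -> float:
    if gamma_readout is not None:
        warnings.warn(
            "Use of 'gamma_readout' is deprecated. Please consider using the "
            "readout's feature-regularization parameter instead",
            DeprecationWarning,
        )
        # Fall back to the deprecated value only if the new one was not given.
        if feature_reg_weight is None:
            return gamma_readout
    return default if feature_reg_weight is None else feature_reg_weight
```

With this ordering, passing both arguments still warns about the deprecated one but returns the explicitly set feature_reg_weight.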

hidden_padding argument in Stacked2dCore overwritten

Stacked2dCore has an argument hidden_padding, but internally hidden_padding is overwritten/redefined (only based on the hidden_dilation argument) so that same type padding is always applied.

Add a new argument to StackedCore2d to set hidden-padding (e.g., to 0)

At the moment this is computed in the __init__ function, to keep the output and input width and height the same.
https://github.com/sinzlab/ml-utils/blob/59949af3673bada3ba0c34d18327c5b028dae207/mlutils/layers/cores.py#L126

Final non-linearity argument to stacked core 2D is ignored in most cases

Take a look at the following line:

neuralpredictors/neuralpredictors/layers/cores/conv2d.py

Line 185 in b654145

if (self.num_layers > 1 or self.final_nonlinearity) and not self.linear:

The value of the self.final_nonlinearity variable is completely irrelevant if self.num_layers is greater than one.

Further questions:

Why is no non-linearity added if there is only one layer by default?
Why is there no warning if self.linear and self.final_nonlinearity is True?
Is anyone using final_nonlinearity == False?

Potential fix:

if self.linear:
    return
if self.num_layers == 1 and not self.final_nonlinearity:
    return
if self.i_layer == self._num_layers - 1 and not self.final_nonlinearity:
    return
# add non-linearity
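
The difference between the quoted condition and the potential fix can be tabulated with two small predicates. The function names are illustrative, and `proposed_rule` restates the early returns above as a boolean, folding the single-layer case into the last-layer check since both coincide when there is one layer.

```python
def current_rule(num_layers, final_nonlinearity, linear):
    """Condition quoted in the issue (line 185)."""
    return (num_layers > 1 or final_nonlinearity) and not linear


def proposed_rule(i_layer, num_layers, final_nonlinearity, linear):
    """Early-return logic from the potential fix, expressed as a boolean."""
    if linear:
        return False
    if i_layer == num_layers - 1 and not final_nonlinearity:
        return False
    return True


# Last layer of a three-layer core with final_nonlinearity=False:
adds_now = current_rule(3, False, False)
adds_after_fix = proposed_rule(2, 3, False, False)
```

Under the current condition the last layer still receives a non-linearity, while the proposed logic omits it and leaves the earlier layers unchanged.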

Models need dataloaders to be loaded

The readout requires the dataloaders to infer the shape of the data.
It should be possible to just provide the shape of the data.

[improvement] Factorised readouts - positivity constraint on masks only

This is important to make the code for older papers reproducible in terms of qualitative findings

The masks should be constrained to be positive independently of the feature weights, like here, where the mask constraint is independent of the positivity constraint on the feature weights (which is here). Currently in neuropredictors it is all or nothing (here, both weights and masks are constrained to be positive).

