
neuralpredictors's Introduction

Neuralpredictors


Sinz Lab Neural System Identification Utilities for PyTorch.

How to run the tests 🧪

Clone this repository and run the following command from within the cloned repository to run all tests:

docker-compose run pytest

How to contribute 🔥

Pull requests (and issues) are always welcome. This section describes some preconditions that pull requests need to fulfill.

Tests

Please make sure your changes pass the tests. Take a look at the test running section for instructions on how to run them. Adding tests for new code is not mandatory but encouraged.

Code Style

black

This project uses the black code formatter. You can check whether your changes comply with its style by running the following command:

docker-compose run black

Furthermore you can pass a path to the service to have black fix any errors in the Python modules it finds in the given path.

isort

isort is used to sort Python imports. You can check the order of imports by running the following command:

docker-compose run isort

The imports can be sorted by passing a path to the service.

Type Hints

We use mypy as a static type checker. Running the following command will check the code for any type errors:

docker-compose run mypy

It is not necessary (but encouraged) to add type hints to new code but please make sure your changes do not produce any mypy errors.

Note that only modules specified in the mypy-files.txt file are checked by mypy. This is done to be able to add type hints gradually without drowning in errors. If you want to add type annotations to a previously unchecked module you have to add its path to mypy-files.txt.

neuralpredictors's People

Contributors

akjagadish, arnenx, bryanlimy, christoph-blessing, claudiusgruner, eywalker, fabiansinz, fededagos, ivust, kklurz, konstantinwilleke, m00mo, maxfburg, mohammadbashiri, mvystrcilova, neuronmorph, oliveira-caio, pollytur, ppierzc, sacadena, suhasshrinivasan, synicix, wurining, yongrong-qiu


neuralpredictors's Issues

Add utility functions for nnfabrik dataloaders.

For the sinzlab-specific usecase of nnfabrik, we allow the dataloader to return one of the following:

  • namedtuple
  • tuple
  • dictionary
  • pytorch default dataloaders

It is the job of the nnfabrik trainer to pass the data from the dataloader to the model. An interface is necessary that unpacks the dataloader and passes the correct arguments to their specified destination.
This interface should be part of mlutils.
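
A minimal sketch of what such an unpacking interface could look like. The name `unpack_batch` is hypothetical (it is not an existing mlutils or nnfabrik function), and the sketch assumes that the first field of a batch is always the model input. Batches from PyTorch's default dataloaders usually arrive as plain tuples or lists, so they fall under the last branch.

```python
from collections import namedtuple


def unpack_batch(batch):
    """Split a batch into (args, kwargs) for a model call.

    Hypothetical helper: assumes the first field of the batch is the model input.
    """
    if isinstance(batch, dict):
        names = list(batch)
        return (batch[names[0]],), {k: batch[k] for k in names[1:]}
    if hasattr(batch, "_asdict"):
        # namedtuple: must be checked before the plain tuple case
        fields = batch._asdict()
        names = list(fields)
        return (fields[names[0]],), {k: fields[k] for k in names[1:]}
    if isinstance(batch, (tuple, list)):
        # plain tuple or list: no field names, so only the input is passed on
        return (batch[0],), {}
    raise TypeError(f"Unsupported batch type: {type(batch).__name__}")


Batch = namedtuple("Batch", ["images", "responses", "pupil_center"])
args, kwargs = unpack_batch(Batch(images=1, responses=2, pupil_center=3))
```

A trainer could then call `model(*args, **kwargs)` regardless of which of the container types the dataloader returns.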

Bug report - Data Transformation Normalise

This actually would not work for any other folder name for the TreeDataset. For instance, if the folder is named 'videos'

Here is how this is solved inside neuropredictors for now. I would propose the same modification here, which also keeps it backward compatible.

def __init__(self, data, stats_source="all", exclude=None, inputs_mean=None, inputs_std=None,
             in_name=None, out_name=None, eye_name=None):

    self.exclude = exclude or []
    # fall back to the alternative key names if the default ones are absent
    if in_name is None:
        in_name = "images" if "images" in data.statistics.keys() else "inputs"
    if out_name is None:
        out_name = "responses" if "responses" in data.statistics.keys() else "targets"
    if eye_name is None:
        eye_name = "pupil_center" if "pupil_center" in data.data_keys else "eye_position"

Questions on shifters

Hello,

I have 2 questions regarding shifters and 1 regarding modulators.

  1. Why is the regularizer not really implemented for MLP here, while it is defined for StaticAffine2d? Is it a bug, or is there an explanation for this?
  2. What is the conceptual difference between the MLP and StaticAffine2d shifters? From the descriptions, it looks like MLP is just a more general version of StaticAffine2d that can have hidden layers, but with shift_layers=1 the MLP is equivalent to StaticAffine2d. Is that correct, or am I missing something?

Here are intros for MLP and StaticAffine2d:

MLP 
"""
        Multi-layer perceptron shifter
        Args:
            input_features (int): number of input features, defaults to 2.
            hidden_channels (int): number of hidden units.
            shift_layers(int): number of shifter layers (n=1 will correspond to a network without a hidden layer).
            **kwargs:
 """

and

StaticAffine2d
"""
        A simple FC network with bias between input and output channels without a hidden layer.
        Args:
            input_channels (int): number of input channels.
            output_channels (int): number of output channels.
            bias (bool): Adds a bias parameter if True.
 """
  3. Why is there nothing in the modulators block now? I also cannot find where the code was moved (if it was).

Thanks in advance!

Inconsistent regularizers

The Laplace regularizers are currently inconsistent between 2d/3d in whether they allow padding or not, what the default option is (padding for 2d, no padding for LaplaceL23d, padding without an option to change it for FlatLaplaceL23d) and which versions are implemented (LaplaceL2norm only for 2d). Would it be possible to handle this in a consistent way? I'd also be interested in adding some variants to make it more complete (i.e. a LaplaceL23dnorm or a 1d variant of FlatLaplaceL23d so that one can deal with space-time separable 3d kernels out of the box).

Move early stopping to pytorch ignite engine

I feel the early stopping gets overloaded with functionality:

  • keep track of best model
  • log scores
  • early stop
  • learning rate schedule

Since that doesn't even cover all our use cases, we should move to something more flexible. I feel that PyTorch Ignite might do that (https://pytorch.org/ignite/). Their central concept is that of an Engine that loops through data and does things. What it does is controlled by user-provided functions. In addition, steps like learning rate decrease or early stopping can be handled by provided or custom events triggered at the start of an epoch, at every batch, when the training is completed, and so on.

The above cases could be handled as follows:

  • Best model: After every epoch, the validation score is tested and a best model is stored if it improved compared to before
  • Log scores: Before every epoch, a number of scorers are evaluated.
  • Early stop: Early Stopping already has a handler in Ignite.
  • Learning rate scheduler: Can be handled with batch or epoch events
  • Adapting model components such as fixing the sampling of the Gaussian readout can be handled by custom events such as FIRST_LR_DECREASE or so.

Opinions? Does anyone want to program a toy example?

PR issues

@cblessing24
Dear Christoph,
I am not sure what happened during PR testing. From what I see in the logs, no tests failed, but there are also no tests for these lines and the checks want me to add them. Is that correct / necessary, or am I missing something?

PR link

Thanks in advance!


Conv2d cores `independent_bn_bias` parameter no longer required

The parameter independent_bn_bias in the conv2d cores is no longer required, as batchnorm scale and bias parameters can now be precisely controlled by passing a list of booleans to batch_norm, batch_norm_scale, and bias.

independent_bn_bias was so far kept for backwards compatibility. It is now obsolete and could be removed in the next version.

Add a template model forward that follows new data conventions

The forward functions of our models need to behave similarly in order to accept as many dataloaders as possible.

We settled on the following convention for the forward function of a model.

from mei.legacy.utils import varargin

@varargin
def forward(self, inputs, data_key=None):
    ...

There will be only a single positional argument, along with data_key as an additional keyword argument, which our dataloaders contain within the dict of dicts. All additional keyword arguments that the model needs will be caught by the varargin decorator. Its job is to unpack the kwargs and print the kwargs that are not used as a warning to the user.
Right now, this decorator resides in the MEI package, but it needs to find a place in neuralpredictors as well.
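
To illustrate the idea, here is a minimal sketch of a decorator with this behavior. It is not the MEI implementation of varargin; `ignore_unused_kwargs` and `ToyModel` are hypothetical names, and the sketch assumes that "unused" means "not a named parameter of the decorated function".

```python
import functools
import inspect
import warnings


def ignore_unused_kwargs(func):
    """Drop keyword arguments that `func` does not accept and warn about them."""
    accepted = set(inspect.signature(func).parameters)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        unused = sorted(k for k in kwargs if k not in accepted)
        if unused:
            warnings.warn(f"Ignoring unused keyword arguments: {unused}")
        used = {k: v for k, v in kwargs.items() if k in accepted}
        return func(*args, **used)

    return wrapper


class ToyModel:
    @ignore_unused_kwargs
    def forward(self, inputs, data_key=None):
        return inputs, data_key
```

With this, `ToyModel().forward(x, data_key="session1", pupil_center=p)` runs and warns about `pupil_center` instead of raising a TypeError.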

[Bug] - 3D / 3Dfactorised core with padding

@Mvystrcilova
For 3D cores, get_output_shape does not work if padding is True.
This and this would fail if padding is True. Why not use the default neuropredictors function for this?

Thanks in advance:)

as a suggestion for the model creation functions - for me something like this works fine

from operator import itemgetter
from neuralpredictors.utils import get_module_output
from nnfabrik.utility.nn_helpers import get_dims_for_loader_dict


session_shape_dict = get_dims_for_loader_dict(dataloaders['train'])
subselect = itemgetter(0, 2, 3)
in_shapes_dict = {
    k: subselect(tuple(get_module_output(core, v[in_name])[1:]))
    for k, v in session_shape_dict.items()
}

Cleaning Up before going public

  • rewrite datasets
  • Clean up Core, Core2d, and Stacked2dCore. Make Core an nn.Module, and get rid of Core2d
  • find a new name for ml-utils: current runner up: nnkortex

Gaussian MultiReadout assumes shared parameters

In the current neuralpredictors main, the MultipleFullGaussian2d is inheriting from MultiReadoutSharedParametersBase.
This might be a problem in some cases. For example, if I just want a "normal" MultiGauss2d, without sharing anything, I could just invoke it as it is and it works.
However, it does not work with grid_mean_predictor.
Because, for example, whenever a grid_mean_predictor is passed, it expects that there is a source grid.

The easiest solution, and my proposal, is to make it inherit from MultiReadoutBase then, so that the init would look like:

class MultipleFullGaussian2d(MultiReadoutBase):
    _base_readout = FullGaussian2d

So either MultiReadoutSharedParametersBase is expanded, so it does work with and without sharing, or we have two independent instantiations from the respective base classes. I'm leaning towards the latter.

Make Multireadouts consistent.

In layers.readouts, there's a new class, MultiReadouts, which all multireadouts should inherit from. So far this is only implemented for the Gaussian2d; the others have to follow.

[Bug] - 3D / 3Dfactorised parameter final_nonlin has no effect for cores with more than 1 layer

@Mvystrcilova, @pollytur
For the Basic3dCore and Factorized3dCore in 3dcores.py, the parameter final_nonlin has no effect if the core has more than 1 layer. The reason for this is an if statement that is always true in the loop that defines all layers after the first one.

In the for loop defined here, the variable l takes values from 0 to number of layers - 2, and the condition a few lines after (line 183) compares l to self.layers. The result of this comparison is always True by definition for l. The same holds for the for loop defined here and the if condition here (line 430).
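
The effect can be reproduced in isolation. The sketch below is not the code from 3dcores.py: `nonlin_flags` is a hypothetical helper, and the exact form of the condition is an assumption. Only the comparison of the loop variable against the layer count is taken from the description above.

```python
def nonlin_flags(num_layers, final_nonlin, fixed):
    """Return, for each layer after the first, whether a non-linearity is added.

    Hypothetical stand-in for the loop in the 3D cores.
    """
    flags = []
    for l in range(num_layers - 1):  # l = 0, ..., num_layers - 2
        if fixed:
            is_hidden = l < num_layers - 2  # False only for the last layer
        else:
            is_hidden = l < num_layers  # True for every l the loop can produce
        flags.append(is_hidden or final_nonlin)
    return flags


print(nonlin_flags(3, final_nonlin=False, fixed=False))  # [True, True]
print(nonlin_flags(3, final_nonlin=False, fixed=True))   # [True, False]
```

With the first comparison, final_nonlin never changes the outcome. Comparing against the index of the last layer in the loop makes the flag effective again.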

[Bug report] - regularizers

Got an error RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

Here is the traceback:

File .../neuralpredictors/neuralpredictors/layers/cores/conv2d.py:461, in RotationEquivariant2dCore.laplace(self)
    460 def laplace(self):
--> 461     return self._input_weights_regularizer(
    462         self.features[0].hermite_conv.weights_all_rotations, avg=self.use_avg_reg
    463     )

File ~/anaconda3/envs/sensorium/lib/python3.9/site-packages/torch/nn/modules/module.py:1130, in Module._call_impl(self, *input, **kwargs)
   1126 # If we don't have any hooks, we want to skip the rest of the logic in
   1127 # this function, and just call forward.
   1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130     return forward_call(*input, **kwargs)
   1131 # Do not call functions when jit is used
   1132 full_backward_hooks, non_full_backward_hooks = [], []

File .../neuralpredictors/neuralpredictors/regularizers.py:176, in LaplaceL2norm.forward(self, x, avg)
    173 agg_fn = torch.mean if avg else torch.sum
    175 oc, ic, k1, k2 = x.size()
--> 176 return agg_fn(self.laplace(x.view(oc * ic, 1, k1, k2)).pow(2)) / agg_fn(x.view(oc * ic, 1, k1, k2).pow(2))

RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

Suggestion: replace view with reshape (this worked locally for me). torch.reshape may return a view or a copy of the tensor, but it is able to work with non-contiguous tensors, and I guess that was the issue. More details here
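
A small reproduction of the difference, as a sketch: a transposed tensor stands in for the non-contiguous weights. Whether a transpose is what makes weights_all_rotations non-contiguous in RotationEquivariant2dCore is an assumption and has not been verified here.

```python
import torch

# Transposing swaps the strides without moving data, so x is non-contiguous
x = torch.randn(2, 3, 4, 5).transpose(0, 1)  # shape (3, 2, 4, 5)
oc, ic, k1, k2 = x.size()

try:
    x.view(oc * ic, 1, k1, k2)
    view_failed = False
except RuntimeError:
    # view cannot merge the first two dimensions with these strides
    view_failed = True

# reshape falls back to a copy when a view is not possible
reshaped = x.reshape(oc * ic, 1, k1, k2)
```

An alternative with the same result is `x.contiguous().view(...)`, which makes the copy explicit.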

Question on strides

I am not sure if it's a typo or if it is actually intended.

For Stacked2dCore, the stride is not defined in the function add_first_layer (the stride value is not passed to the PyTorch Conv2d layer), while it is passed for all other layers in add_subsequent_layers, and it is passed in add_first_layer for RotationEquivariant2dCore. So, should the stride be passed for layer0 as well? (I assume it's a typo.)

Best, Polly

All transforms need to work on batches as well as the whole dataset

The Transforms for the dataset classes in mlutils.data.datasets are usually applied for a single entity. Since the dataset can be called either as a batch, or accessing the whole dataset (data.dataset[:].images for example), the transforms should work in both cases.

This behavior is true for the Normalizer, Sampler, ToTensor, AddBehaviorAsChannels. The other Transforms have to be tested, and all new transforms also have to adhere to this convention.

`group_sparsity` regularization results in an error with different conv types

In the group_sparsity regularizer of the Stacked2dCore, the convolution layers are referred to as conv, but that does not generalize to other conv layer types, such as ds_conv.

def group_sparsity(self):
    """
    Sparsity regularization on the filters of all the conv2d layers except the first one.
    """
    ret = 0
    for feature in self.features[1:]:
        ret = ret + feature.conv.weight.pow(2).sum(3, keepdim=True).sum(2, keepdim=True).sqrt().mean()
    return ret / ((self.num_layers - 1) if self.num_layers > 1 else 1)

def regularizer(self):
    return self.group_sparsity() * self.gamma_hidden + self.gamma_input * self.laplace()

Minor Improvements of FiringRateEncoder

After using the new FiringRateEncoder for the neuralprediction challenge, I noticed that it could use a few minor improvements.

This relates back to the question of how the dataloader args/kwargs should be passed to the encoder.
We agreed that the input always has to be an arg. The rest can be both args/kwargs.

Here's an example usecase:

batch = get_batch(dataloader)
model(*batch, **batch._asdict())

This would be the most general way of passing everything to the model, and I think that this is the cleanest way.

In that case, our FiringRateEncoder fails, because it gets the inputs twice: as arg and as kwarg.
#152 here's an example PR.

So we could either hand the responsibility to the users to sort the args/kwargs themselves, or we make our Encoder more forgiving. I'd suggest the latter.
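
The failure and one possible "forgiving" variant can be sketched without the encoder itself. `strict_forward` and `forgiving_forward` are hypothetical stand-ins, not the FiringRateEncoder code, and the field name "inputs" is an assumption about the namedtuple returned by the dataloader.

```python
from collections import namedtuple

Batch = namedtuple("Batch", ["inputs", "targets"])


def strict_forward(inputs, targets=None, **kwargs):
    """Fails when `inputs` arrives both positionally and as a keyword."""
    return inputs


def forgiving_forward(*args, **kwargs):
    """Takes the input positionally and ignores a duplicate passed by keyword."""
    inputs = args[0]
    kwargs.pop("inputs", None)  # assumed field name of the model input
    return inputs


batch = Batch(inputs=[1.0, 2.0], targets=[0.5])

try:
    strict_forward(*batch, **batch._asdict())
    strict_failed = False
except TypeError:
    # "got multiple values for argument 'inputs'"
    strict_failed = True

result = forgiving_forward(*batch, **batch._asdict())
```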

[bug] Batch norm layers for 3d core

After the hackathon adjustments, add_bn_layer is inherited from ConvCore, but the function there does not take hidden_channels, while they are needed for the 3d core (here).

Also, instead of throwing errors, it would be great to convert the arguments to lists here and add a warning. Otherwise, most of the public repos are not backward compatible, even though it would be easy to make them so.

Mistake in the FullGaussian2d readout

The way we are sampling readout positions in FullGaussian2d readout is different from what is claimed in the paper and also does not guarantee a proper Gaussian distribution (i.e. the learned covariance matrix is neither guaranteed to be symmetric nor have a positive determinant). More details below.

The mistake is two-fold:

1. The learned sigma is not restricted to be a proper covariance matrix.

Firstly, at initialization, we are sampling the entries of the covariance matrix from a uniform distribution, which means that it is possible to have covariances bigger than the variances, and there is no guarantee that the matrix is symmetric. Also, even if the initialization works out by chance (which is very unlikely), there is no restriction during learning.

As a result, we can end up with covariance matrices with negative determinants, which should not happen.

2. The way we are using the sigma in the reparametrization trick is not correct.

Our goal is to sample from a normal distribution with mean \mu and covariance \sigma. We use the reparametrization trick for this, but the way we are doing it is not correct. What we do is transform the samples from a standard normal distribution with the covariance matrix itself, i.e. \sigma @ sample_{std_normal} + \mu, but this is NOT equivalent to sampling from the normal distribution with mean \mu and covariance matrix \sigma.

@fabiansinz would you agree with these points? or am I missing something?

If my understanding is correct, my suggestion for a fix is as follows:

  • learn a rotation angle \theta and two eigenvalues (variances) s per neuron
  • then \sigma = rot_mat(\theta) @ diag(s) @ rot_mat(\theta).T
  • sample via rot_mat(\theta) @ diag(sqrt(s)) @ sample_{std_normal}

This way we would have a proper covariance matrix, the sampling is correct, and instead of 4 parameters per neuron, we would have 3 parameters to learn.
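
The reasoning behind the second point and the proposed fix can be written out as a short derivation. Here A denotes whatever matrix is applied to the standard normal sample, R(\theta) stands for rot_mat(\theta), and the eigenvalues s are assumed to be strictly positive; this notation is introduced for the sketch only.

```latex
\begin{aligned}
z &\sim \mathcal{N}(0, I), \qquad x = A z + \mu \\
\operatorname{Cov}(x) &= A \operatorname{Cov}(z) A^\top = A A^\top \\
A = \sigma \;&\Rightarrow\; \operatorname{Cov}(x) = \sigma \sigma^\top \neq \sigma \quad \text{in general} \\
A = R(\theta)\,\operatorname{diag}(\sqrt{s}) \;&\Rightarrow\;
\operatorname{Cov}(x) = R(\theta)\,\operatorname{diag}(s)\,R(\theta)^\top = \sigma
\end{aligned}
```

Since R(\theta) is orthogonal and s > 0, the resulting \sigma is symmetric and positive definite by construction, so its determinant s_1 s_2 is positive.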

Bug report - stride does not change for RotationEquivariant2dCore

I tried to change the stride for RotationEquivariant2dCore, but it does not update (checked with model.stride).
To reproduce:

model = RotationEquivariant2dCore(..., stride=2)
print(model.stride)  # prints 1, although 2 was requested

The issue, I guess, is that stride gets reinitialised with the default value when super().__init__ is called, as stride is not passed there (source code link).

The dirty fix (for someone who needs it urgently) could be

self.num_rotations = num_rotations
self.upsampling = upsampling
self.rot_eq_batch_norm = rot_eq_batch_norm
self.init_std = init_std
super().__init__(*args, **kwargs, input_regularizer=input_regularizer, stride=stride)

or stride could also be passed as a part of kwargs in future versions of the lib
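
The mechanism can be shown with toy classes. `BaseCore`, `BuggyCore` and `FixedCore` are hypothetical and only mimic the inheritance pattern described above; they are not the neuralpredictors classes.

```python
class BaseCore:
    def __init__(self, stride=1):
        self.stride = stride


class BuggyCore(BaseCore):
    def __init__(self, *args, stride=1, **kwargs):
        self.stride = stride
        # stride is not forwarded, so the parent resets it to its default
        super().__init__(*args, **kwargs)


class FixedCore(BaseCore):
    def __init__(self, *args, stride=1, **kwargs):
        # forwarding stride lets the parent store the requested value
        super().__init__(*args, stride=stride, **kwargs)


print(BuggyCore(stride=2).stride)  # 1
print(FixedCore(stride=2).stride)  # 2
```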

[Bug report] - Readouts

If a user would define both feature_reg_weight and gamma_readout (like here) then the feature_reg_weight would be ignored in the resolve function, which is probably not correct as gamma_readout is deprecated, hence, feature_reg_weight should have the priority.
Resolve function

def resolve_deprecated_gamma_readout(self, feature_reg_weight: float, gamma_readout: Optional[float]) -> float:
    if gamma_readout is not None:
        warnings.warn(
            "Use of 'gamma_readout' is deprecated. Please consider using the readout's feature-regularization parameter instead"
        )
        feature_reg_weight = gamma_readout
    return feature_reg_weight
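
One way to give feature_reg_weight priority is sketched below. The function name `resolve_feature_reg_weight`, the `default` parameter and its value are hypothetical. The sketch also assumes that feature_reg_weight can default to None, so that "not set by the user" can be told apart from an explicit value; the current signature types it as a plain float.

```python
import warnings
from typing import Optional


def resolve_feature_reg_weight(
    feature_reg_weight: Optional[float],
    gamma_readout: Optional[float],
    default: float = 1.0,  # hypothetical fallback value
) -> float:
    if gamma_readout is not None:
        warnings.warn(
            "Use of 'gamma_readout' is deprecated. Please consider using the "
            "readout's feature-regularization parameter instead",
            DeprecationWarning,
        )
        # the deprecated argument is only used if the new one was not given
        if feature_reg_weight is None:
            feature_reg_weight = gamma_readout
    if feature_reg_weight is None:
        feature_reg_weight = default
    return feature_reg_weight
```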

Final non-linearity argument to stacked core 2D is ignored in most cases

Take a look at the following line:

if (self.num_layers > 1 or self.final_nonlinearity) and not self.linear:

The value of the self.final_nonlinearity variable is completely irrelevant if self.num_layers is greater than one.

Further questions:

  • Why is no non-linearity added if there is only one layer by default?
  • Why is there no warning if self.linear and self.final_nonlinearity are both True?
  • Is anyone using final_nonlinearity == False?

Potential fix:

if self.linear:
    return
if self.num_layers == 1 and not self.final_nonlinearity:
    return
if self.i_layer == self._num_layers - 1 and not self.final_nonlinearity:
    return
# add non-linearity
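
The difference between the current condition and the potential fix can be checked with two small predicate functions. These are hypothetical free functions that mirror the two snippets above, with the layer index and layer count passed in explicitly instead of being read from self.

```python
def adds_nonlinearity_current(num_layers, final_nonlinearity, linear):
    # mirrors the condition quoted from the source
    return (num_layers > 1 or final_nonlinearity) and not linear


def adds_nonlinearity_proposed(i_layer, num_layers, final_nonlinearity, linear):
    # mirrors the early returns of the potential fix
    if linear:
        return False
    if num_layers == 1 and not final_nonlinearity:
        return False
    if i_layer == num_layers - 1 and not final_nonlinearity:
        return False
    return True


# Last layer of a three-layer core with final_nonlinearity=False
print(adds_nonlinearity_current(3, False, False))      # True
print(adds_nonlinearity_proposed(2, 3, False, False))  # False
```

Earlier layers are unaffected: `adds_nonlinearity_proposed(0, 3, False, False)` still returns True.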

[improvement] Factorised readouts - positivity constraint on masks only

This is important to make the code for older papers reproducible in terms of qualitative findings.

The masks should be restricted to be positive independently of the feature weights, like here, where the mask constraint is independent of the positivity constraint on the feature weights (which is here). Currently, in neuropredictors, it is all or nothing: either everything is positive or nothing is (here, both weights and masks are restricted to be positive).
