
Forward Pass Learning and Inference Library, for neural networks and general intelligence, Signal Propagation (sigprop)

Home Page: https://amassivek.github.io/sigprop

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%
deep-learning forward-forward neural-networks forward-learning forward-pass biological-neural-networks local-learning spiking-networks spiking-neural-networks signal-propagation

signalpropagation's Introduction

Signal Propagation

The Framework for Unifying Learning and Inference in a Forward Pass

A python package for training parameterized models (e.g. neural networks) via the same forward pass used for inference. This package is for PyTorch.

Signal Propagation (sigprop) is a forward-pass learning library for developing and deploying continuous, asynchronous, and parallel (lifelong) learning algorithms on CPUs, GPUs, neuromorphic hardware, and edge devices.

Run the Examples and view the Quick Start for implementation details, experiments, and more.

Documentation for the package is below.

A guide and tutorial on forward learning is available at:
https://amassivek.github.io/sigprop

The paper detailing the framework for forward learning is available at:
https://arxiv.org/abs/2204.01723


TOC

  1. Install
  2. Quick Start
  3. Examples
  4. Documentation
  5. Development

1. Install

1.1. Production

Install

pip install https://github.com/amassivek/signalpropagation/archive/main.tar.gz

1.2. Development

  1. Clone the repo
git clone https://github.com/amassivek/signalpropagation.git
  2. Change into the repo directory
cd signalpropagation
  3. Install in development mode
pip install -e .
  4. Now you may develop on signal propagation.

  5. Then, submit a pull request.

2. Quick Start

2.1. Examples
2.2. Implement a Model

2.1. Use Examples

git clone https://github.com/amassivek/signalpropagation.git
cd signalpropagation
pip install -e .
cd examples
chmod +x examples.sh
./examples.sh

2.2. Implement a Model

This quick start uses the forward learning model from Example 2, but on a simple network:
Example 2: Input Target Max Rand

Concept overview of Signal Propagation:

  • There is a signal, placed at the front of the network.
  • There are propagators, wrapped around each layer.
  1. Install the sigprop package.
git clone https://github.com/amassivek/signalpropagation.git
cd signalpropagation
pip install -e .
cd <your_code_directory>
  2. Go to the file for the network model.

Open the python file with the model you are configuring to use signal propagation.

  3. Add the import statement.
import sigprop
  4. Select the forward learning model.

Pick a signal, a propagator, and a model for using forward learning. Below are good defaults.

This forward learning model trains each layer of the network as soon as the layer receives an input and a target, so training and inference are unified and act together. This forward learning model is the base model, and it conveniently takes care of the inputs and outputs during learning and inference.

sigprop.models.Forward

This signal model learns a projection of the input and the target (i.e. context) to the same dimension.

sigprop.signals.ProjectionContextInput

This propagator model takes in any loss function (i.e. a callable) to train a network layer. Currently, propagators for signals have different routing logic than those for hidden layers, so we use a different propagator class for the signal.

sigprop.propagators.signals.Loss
sigprop.propagators.Loss

We pair it with the following loss function.

sigprop.loss.v9_input_target_max_all

A side note: alternatively, sigprop.propagators.signals.Fixed may be used when the signal is constructed instead of learned, such as a fixed projection or overlay. If we replace Loss with Fixed here, the above signal model is treated as a fixed projection, and we no longer use a loss function.
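
For example, a minimal sketch of this fixed variant, assuming Fixed wraps the signal module directly without an optimizer or loss (the exact constructor may differ); it would replace option 2 in the "Setup the signal" step below:

sp_signal = sigprop.propagators.signals.Fixed(sp_signal)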

  5. Setup a manager (optional).

Setup a manager to configure defaults to help add signal propagation to an existing model. Managers are helper classes.

Managers are particularly helpful when adding propagators to layers of an existing model. Each network layer is wrapped in a propagator, so it may learn on its own. As we will see below, managers make wrapping layers quick.

sp_manager = sigprop.managers.Preset(
    sigprop.models.Forward,
    sigprop.propagators.Loss,
    sigprop.propagators.signals.Loss,
    build_optimizer
)

We wrote a method to build an optimizer for each layer of the network, since each layer learns independently of the other layers. The manager will call this method with a layer to get an optimizer, and then give the optimizer to a propagator to train the layer.

from torch import optim

def build_optimizer(module, lr=0.0004, weight_decay=0.0):
    # layers without trainable parameters get no optimizer
    p = list(module.parameters())
    if len(p) == 0:
        return None
    optimizer = optim.Adam(
        p, lr=lr,
        weight_decay=weight_decay)
    return optimizer
  6. Configure propagators ahead of time (optional).

If we are using a manager, we do this step. Otherwise, we skip this step.

Each network layer is wrapped in a propagator, so the layer may learn on its own. Here, we configure defaults for propagators ahead of time, so we may easily wrap layers without having to specify an individual configuration for each layer. In this case, we set the default loss.

sp_manager.config_propagator(
    loss=sigprop.loss.v9_input_target_max_all
)
  7. Setup the signal.
  • num_classes is the number of classes.
  • hidden_dim is the spatial size of the first hidden layer. Here it is the same as the input dimension.
hidden_dim = input_dim
hidden_ch = 128
input_shape = (num_classes,)
output_shape = (hidden_ch, hidden_dim, hidden_dim)
signal_target_module = nn.Sequential(
    nn.Linear(
        int(shape_numel(input_shape)),
        int(shape_numel(output_shape)),
        bias=False
    ),
    nn.LayerNorm(shape_numel(output_shape)),
    nn.ReLU()
)
signal_input_module = nn.Sequential(
    nn.Conv2d(
        3, output_shape[0], 3, 1, 1,
        bias=False
    ),
    nn.BatchNorm2d(output_shape[0]),
    nn.ReLU()
)

sp_signal = sigprop.signals.ProjectionContextInput(
    signal_target_module, signal_input_module,
    input_shape, output_shape
)
# convert labels to a one-hot vector
sp_signal = nn.Sequential(
    sigprop.signals.LabelNumberToOnehot(
        num_classes
    ),
    sp_signal
)

There are two options for wrapping the signal with a propagator:

Option 1, if we are using a manager

sp_signal = sp_manager.set_signal(
    sp_signal,
    loss=sigprop.loss.v9_input_target_max_all
)

Option 2, if we choose to not use a manager.

sp_signal = sigprop.propagators.signals.Loss(
    sp_signal,
    optimizer=build_optimizer(sp_signal),
    loss=sigprop.loss.v9_input_target_max_all
)

Note, by default we feed in a vector as the context. For labels, this means converting to a one-hot vector of type float. We use a formatter, such as LabelNumberToOnehot, and place it before the signal (refer to Add a New Signal).

  8. Wrap network layers with propagators.

The last layer is trained normally, so we use the identity operation.

Below are the network layers.

layer_1 = nn.Sequential(
    nn.Conv2d(
        hidden_ch, hidden_ch*2, 3, 2, 1,
        bias=False
    ),
    nn.BatchNorm2d(hidden_ch*2),
    nn.ReLU()
)
layer_2 = nn.Sequential(
    nn.Conv2d(
        hidden_ch*2, hidden_ch*4, 3, 2, 1,
        bias=False
    ),
    nn.BatchNorm2d(hidden_ch*4),
    nn.ReLU()
)
layer_output = nn.Sequential(
    # depending on your pipeline, the features may need to be flattened before
    # this layer, and in_features should match the flattened feature size
    nn.Linear(
        int(input_dim//2**2),
        int(num_classes),
        bias=True
    )
)

There are two options to wrap the layers with propagators.

Option 1, if we are using a manager.

layer_1 = sp_manager.add_propagator(layer_1)
layer_2 = sp_manager.add_propagator(layer_2)
layer_output = sp_manager.add_propagator(layer_output, sigprop.propagators.Identity)

Option 2, if we choose to not use a manager.

from sigprop import propagators
layer_1 = propagators.Loss(
    layer_1,
    optimizer=build_optimizer(layer_1),
    loss=sigprop.loss.v9_input_target_max_all
)
layer_2 = propagators.Loss(
    layer_2,
    optimizer=build_optimizer(layer_2),
    loss=sigprop.loss.v9_input_target_max_all
)
layer_output = propagators.Identity(
    layer_output,
    optimizer=build_optimizer(layer_output),
    loss=sigprop.loss.v9_input_target_max_all
)
  9. Create the network.

Below is the network model:

network = nn.Sequential(
    layer_1,
    layer_2,
    layer_output,
)

There are two options to wrap the network model in a sigprop model.

Option 1, if we are using a manager.

network = sp_manager.set_model(network)

Option 2, if we choose to not use a manager.

network = sigprop.models.Forward(network, sp_signal)
  10. Train the network.
import torch.nn.functional as F

network.train()

for batch, (data, target) in enumerate(train_loader):

    # hidden layers train themselves via their propagators during this forward pass
    output = network(data, target)

    # the identity-wrapped output layer is trained with a standard loss;
    # optimizer is assumed to hold that layer's parameters, e.g. build_optimizer(layer_output)
    loss = F.cross_entropy(output, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
  11. Inference from the network.
network.eval()

acc_sum = 0.
count = 0.

for batch, (data, target) in enumerate(test_loader):

    # no target is provided at inference time
    output = network(data, None)

    pred = output.argmax(1)
    acc_mask = pred.eq(target)
    acc_sum += acc_mask.sum()
    count += acc_mask.size(0)

acc = acc_sum / count

3. Examples

Here are a few examples. More will be added.

Example 2: Input Target Max Rand

ex2_input_target_max_rand.py

This example feeds the inputs x (e.g. images) and their respective targets t (e.g. labels) as pairs (or one after the other). Given pair x_i,t_i, this example selects the closest matching pair x_j,t_j to compare with. If there are multiple equivalent matching pairs, it randomly selects one.
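
As an illustration of the selection idea only (a hedged sketch; this is not the library's loss implementation, and the similarity measure and self-exclusion are assumptions):

import torch

def max_rand_match(h, t):
    # h: (N, D) layer outputs; t: (N, D) projected targets/contexts
    sim = h @ t.t()                            # similarity of every input to every target
    sim.fill_diagonal_(float("-inf"))          # assume we compare against a different pair
    max_sim, _ = sim.max(dim=1, keepdim=True)
    ties = (sim == max_sim).float()            # all equally close matches per row
    j = torch.multinomial(ties, 1).squeeze(1)  # randomly pick one index among the ties
    return j                                   # index of the matching pair x_j, t_j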

Example 4: Input Target Top-K

ex4_input_target_topk.py

This example feeds the inputs x (e.g. images) and their respective targets t (e.g. labels) as pairs (or one after the other). Given pair x_i,t_i, this example selects the top k closest matching pairs x_j,t_j to compare with.

This example demonstrates how to add a monitor to each loss and display metrics for each layer wrapped with a propagator.
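
As with Example 2, here is a hedged sketch of just the top-k selection idea (not the library's loss implementation; the similarity measure is an assumption, and k is a hyperparameter):

import torch

def topk_match(h, t, k=6):
    # h: (N, D) layer outputs; t: (N, D) projected targets/contexts
    sim = h @ t.t()                    # similarity of every input to every target
    sim.fill_diagonal_(float("-inf"))  # assume we compare against different pairs
    _, idx = sim.topk(k, dim=1)        # indices of the k closest matching pairs
    return idx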

4. Documentation

Refer to sections 2 and 3 for examples with explanations.

4.1. Signals

sigprop/signals

Signals generate the signal for learning, then forward it to the first layer. The signal generator is taken as the first layer of the network. Or, if a fixed projection of the target is used (i.e. there is no learning), then this takes place before the first layer of the network.

4.2. Propagators

sigprop/propagators

Propagators train each network layer and forward the signal from one layer to the next. Each network layer is wrapped in a propagator, so the layer may learn on its own.

4.3. Models

sigprop/models

Models handle the input and output during forward learning. By default, they are not necessary; only signals and propagators are necessary (i.e. signal propagation). However, models provide convenience functionality for common routines when using signals and propagators.

4.4. Managers

sigprop/managers

Managers allow for upfront configuration (defaults) of signals, propagators, and models. Upfront configuration is helpful in scenarios where signal propagation is wrapping an existing model. For example, we may use a manager to wrap the layers of an existing model. Refer to the examples for a demonstration.

4.5. Functional

sigprop/functional

The functional interface to signal propagation.

4.6. Monitors

sigprop/monitors

Monitors wrap signals, propagators, and modules to record and display metrics. In Example 4: Input Target Top-K, a monitor wraps the loss to display loss and accuracy metrics for each layer wrapped with a propagator; in other words, it displays layer-level metrics.

5. Development

5.1. Add a New Loss

Losses are functions or callables.

Refer to folder sigprop/loss for examples of losses.

Example new implementation:

def new_loss(sp_learn, h1, t1, h0, t0, y_onehot):
    l = ...  # calculate the loss here

    return l

Example new implementation, as a class:

class NewLoss(sigprop.loss.Loss):
    def __init__(self):
        super().__init__()

    def forward(self, sp_learn, h1, t1, h0, t0, y_onehot):
        l = ...  # calculate the loss here

        return l
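
Either form can then be passed as the loss to a propagator, following the quick start pattern (layer_1 and build_optimizer here are the quick start examples):

layer_1 = sigprop.propagators.Loss(
    layer_1,
    optimizer=build_optimizer(layer_1),
    loss=new_loss  # or NewLoss()
)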

5.2. Add a New Signal

There is the signal generator and the optional signal formatter.

Generator

Refer to file sigprop/signals/generators.py for examples of signals.

Note, the signal generators return the original input (h0) and context (t0). This provides flexibility for fixing up the input and context before they are used for learning (e.g. reshaping), for example by applying a formatter.

Example new implementation:

from sigprop import signals

class MySignalGenerator(signals.Generator):
    def __init__(self, module, input_shape, output_shape):
        super().__init__()
        self.input_shape = input_shape
        self.output_shape = output_shape
        self.module = module

    def forward(self, input):
        h0, t0 = input
        t1 = self.module(t0.flatten(1)).view(t0.shape[0:1]+self.output_shape)
        h1 = h0
        return h1, t1, h0, t0
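
A possible usage, mirroring the quick start (shape_numel, build_optimizer, input_shape, and output_shape are the quick start helpers and shapes; adjust them to your setup):

my_signal = MySignalGenerator(
    nn.Linear(
        int(shape_numel(input_shape)),
        int(shape_numel(output_shape)),
        bias=False
    ),
    input_shape, output_shape
)
my_signal = sigprop.propagators.signals.Loss(
    my_signal,
    optimizer=build_optimizer(my_signal),
    loss=sigprop.loss.v9_input_target_max_all
)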

Formatter

The formatter is optional.

Refer to file sigprop/signals/formatter.py for examples of formatters.

Example new implementation:

import torch
import torch.nn.functional as F
from sigprop import signals

class MySignalFormatter(signals.Formatter):
    def forward(self, input):
        h0, t0 = input
        if t0 is not None:
            t0 = F.one_hot(torch.arange(t0.shape[1], device=t0.device),t0.shape[1]).float()
        return h0, t0
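
The formatter is then placed before the signal, just like LabelNumberToOnehot in the quick start:

sp_signal = nn.Sequential(
    MySignalFormatter(),
    sp_signal
)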

5.3. Add a New Propagator

There are two types of propagators: ones that learn and ones that do not.

Learn

Refer to file sigprop/propagators/learn.py for examples of propagators that learn.

Currently, propagators for signals have different routing logic than those for hidden layers, so we use a different propagator class for the signal.

Example new implementation:

from sigprop import propagators

class MyLearnPropagator(propagators.Learn):
    def loss_(self,h1,t1,h0,t0,y_onehot):
        loss = ...  # calculate a loss
        return loss

    def train_(self,h1,t1,h0,t0,y_onehot):
        loss = self.loss_(h1,t1,h0,t0,y_onehot)
        if self.optimizer is not None:
            self.optimizer.zero_grad()
            loss.backward()
            self.optimizer.step()
        return loss.item(), None, None

    def eval_(self,h1,t1,h0,t0,y_onehot):
        loss = self.loss_(h1,t1,h0,t0,y_onehot)
        return loss.item(), None, None

class MyLearnPropagatorForSignals(propagators.signals.Learn):
    def loss_(self,h1,t1,h0,t0,y_onehot):
        loss = ...  # calculate a loss
        return loss

    def train_(self,h1,t1,h0,t0,y_onehot):
        loss = self.loss_(h1,t1,h0,t0,y_onehot)
        if self.optimizer is not None:
            self.optimizer.zero_grad()
            loss.backward()
            self.optimizer.step()
        return loss.item(), None, None

    def eval_(self,h1,t1,h0,t0,y_onehot):
        loss = self.loss_(h1,t1,h0,t0,y_onehot)
        return loss.item(), None, None
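
A possible usage, assuming the Learn propagators take the wrapped module and an optimizer like propagators.Loss does in the quick start (this constructor signature is an assumption; check sigprop/propagators/learn.py):

layer_1 = MyLearnPropagator(
    layer_1,
    optimizer=build_optimizer(layer_1)  # assumed signature, mirroring propagators.Loss
)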

Other

Refer to file sigprop/propagators/other.py for examples of propagators that do not learn.

Example new implementation:

from sigprop import propagators

class MyPropagator(propagators.Propagator):
    def __init__(self, module):
        super().__init__()

        self.module = module

    def forward(self, input):
        h0, t0, y_onehot = input

        h1 = self.module(h0)
        t1 = t0

        return (h1, t1, y_onehot)

5.4. Add a New Model

Refer to file sigprop/models/model.py for examples of forward learning models.

Example new implementation:

from sigprop import models

class MyModel(models.Model):
    def __init__(self, model, signal):
        super().__init__(model, signal)

    def forward(self, input):
        x, y_onehot = input
        h0, t0 = self.signal((x, y_onehot, y_onehot))
        h1, t1, y_onehot = self.model((h0, t0, y_onehot))
        return h1
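
The custom model is then used in place of sigprop.models.Forward from the quick start:

network = MyModel(network, sp_signal)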


signalpropagation's Issues

Issues with the Accuracy output

Describe the bug
I am running the ex4 script on the CUB-200 dataset, but the output seems incorrect. Specifically, all the accuracy values for each layer are 1.0000, and the overall test accuracy is 0.0.

I have tried running the script multiple times, but the output is consistently incorrect. I have also checked the input data to make sure it is correct, and there do not appear to be any issues there. I did not change much in the code; I only wrote a dataset class for CUB-200. I am using the default configurations for the rest of the parameters.

I'm running the script on a Linux machine using Python 3.8.16.

Expected behavior
Although the accuracy might plausibly be 0 at the start of the epoch, since the dataset contains a large number of classes (200), it should at least increase as training proceeds, which it does not seem to be doing.

I have also tried increasing the topk value to 20, but it did not make any difference. Is it safe to assume the accuracy is really 0 in this case?

Screenshots
Here is a screenshot of the first epoch results (screenshot not reproduced here).

Version (please complete the following information):

  • SigProp: commit a5eabe2
  • PyTorch: 1.13.1
  • PyTorch Vision: 0.14.1
  • Device: GPU

having trouble reproducing CIFAR results in the paper

Hi, I am trying to reproduce CIFAR10 and CIFAR100 results listed in the paper (8.34 test error rate for CIFAR10 and 34.30 error rate for CIFAR100).

I used ex2_input_target_max_rand.py, ex4_input_target_topk.py, and examples.py to run

python ex2_input_target_max_rand.py --sigprop --model vgg8 --dataset CIFAR10 --dropout 0.2 --lr 5e-4 --nonlin leakyrelu
python ex4_input_target_topk.py --sigprop --model vgg8 --dataset CIFAR10 --dropout 0.2 --lr 5e-4 --nonlin leakyrelu

Here are some additional configurations I tested:

  • --norm batch_norm or --norm instance_norm

  • I also tested "v9_input_target_max_all" and "v1_input_label_direct" loss by replacing them with input_target_max_rand in the ex2_input_target_max_rand.py file.

  • I also tested different topk values, e.g. 2, (default =6) for input_target_topk loss for CIFAR10

The default values of --lr-decay-milestones and --lr-decay-fact, coupled with MultiStepLR in the HyperParamsWrapper, should handle the lr scheduling described in the paper. However, I was not able to get test accuracy higher than 87% for CIFAR10 or 50% for CIFAR100. Can you please provide the training configuration or environment needed to reproduce the results described in the paper?

Here are some example printout results. The accuracy did not improve after about 250-300 epochs. I ran the script on a Linux machine using Python 3.9.16 and PyTorch 1.13.1.

for CIFAR100

Epoch Start: 399
[Info][Train  Epoch 399/400][Batch 390/391]     [loss 2.0399]   [acc 0.4259]
[Sequential] Acc: 0.4750 (0.4345, 21727/50000)   Loss: 24.8152 (27.2924)
[BlockConv] Acc: 0.3375 (0.3133, 15664/50000)    Loss: 22.0432 (22.7874)
[BlockConv] Acc: 0.3125 (0.2719, 13596/50000)    Loss: 21.8403 (22.1944)
[BlockConv] Acc: 0.2625 (0.2637, 13185/50000)    Loss: 21.3491 (21.5723)
[BlockConv] Acc: 0.2125 (0.2766, 13830/50000)    Loss: 20.7694 (20.7466)
[BlockConv] Acc: 0.2750 (0.2814, 14068/50000)    Loss: 19.9495 (19.8684)
[BlockConv] Acc: 0.2750 (0.2650, 13250/50000)    Loss: 19.6585 (19.0249)
[BlockLinear] Acc: 0.3250 (0.2553, 12764/50000)          Loss: 20.2943 (19.3511)
[Info][Test   Epoch 399/400]                    [loss 1.7940]   [acc 0.4967]
[Sequential] Acc: 0.6875 (0.4439, 4439/10000)    Loss: 17.5677 (27.2676)
[BlockConv] Acc: 0.6250 (0.3671, 3671/10000)     Loss: 20.6346 (21.4852)
[BlockConv] Acc: 0.4375 (0.3414, 3414/10000)     Loss: 22.7599 (20.6480)
[BlockConv] Acc: 0.5000 (0.3426, 3426/10000)     Loss: 23.8240 (20.1075)
[BlockConv] Acc: 0.7500 (0.3659, 3659/10000)     Loss: 25.7010 (20.2997)
[BlockConv] Acc: 0.7500 (0.3792, 3792/10000)     Loss: 30.8968 (20.1886)
[BlockConv] Acc: 0.7500 (0.3566, 3566/10000)     Loss: 29.7236 (18.8794)
[BlockLinear] Acc: 0.6250 (0.3487, 3487/10000)   Loss: 21.7481 (18.2785)

for CIFAR10

Epoch Start: 399
[Info][Train  Epoch 399/400][Batch 390/391]     [loss 0.4211]   [acc 0.8557]
[Sequential] Acc: 0.5125 (0.6852, 34259/50000)   Loss: 3.3270 (3.4904)
[BlockConv] Acc: 0.5875 (0.6892, 34460/50000)    Loss: 3.3551 (3.6682)
[BlockConv] Acc: 0.6500 (0.7443, 37215/50000)    Loss: 3.2299 (3.4566)
[BlockConv] Acc: 0.6500 (0.7846, 39229/50000)    Loss: 3.0880 (3.3314)
[BlockConv] Acc: 0.6750 (0.8164, 40821/50000)    Loss: 2.9348 (3.2015)
[BlockConv] Acc: 0.7750 (0.8463, 42317/50000)    Loss: 2.8138 (3.1072)
[BlockConv] Acc: 0.8000 (0.8634, 43169/50000)    Loss: 2.6970 (3.0334)
[BlockLinear] Acc: 0.7875 (0.8558, 42789/50000)          Loss: 2.6793 (3.0491)
[Info][Test   Epoch 399/400]                    [loss 0.4259]   [acc 0.8633]
[Sequential] Acc: 0.6875 (0.7101, 7101/10000)    Loss: 1.6176 (3.4179)
[BlockConv] Acc: 0.6250 (0.7316, 7316/10000)     Loss: 1.6897 (3.3831)
[BlockConv] Acc: 0.7500 (0.7777, 7777/10000)     Loss: 1.5615 (3.2262)
[BlockConv] Acc: 0.7500 (0.8093, 8093/10000)     Loss: 1.3019 (3.1340)
[BlockConv] Acc: 0.7500 (0.8350, 8350/10000)     Loss: 1.2648 (3.0709)
[BlockConv] Acc: 0.7500 (0.8573, 8573/10000)     Loss: 1.2363 (3.0167)
[BlockConv] Acc: 0.7500 (0.8650, 8650/10000)     Loss: 1.2136 (2.9883)
[BlockLinear] Acc: 0.7500 (0.8627, 8627/10000)   Loss: 1.2342 (2.9913)

I am also having trouble finding the code implementation of equation 10 in the paper. Can you please point me to where it is?

What's the difference between example2 and example4?

Hi, Amassivek, Thanks for your amazing work!

After looking through the repository, I wonder what the difference is between example 2 and example 4.

Example 2: Input Target Max Rand
This example feeds the inputs x (e.g. images) and their respective targets t (e.g. labels) as pairs (or one after the other). Given pair x_i,t_i, this example selects the closest matching pair x_j,t_j to compare with. If there are multiple equivalent matching pairs, it randomly selects one.

Example 4: Input Target Top-K
This example feeds the inputs x (e.g. images) and their respective targets t (e.g. labels) as pairs (or one after the other). Given pair x_i,t_i, this example selects the top k closest matching pair x_j,t_j to compare with.

Do they have different loss functions for every layer? And what does it mean to "select the top k closest matching pairs x_j,t_j to compare with"? From my understanding, every layer updates its weights by comparing the L2 distance between h_i and t_i. When does it use multiple pairs?
