evonet's People

Contributors

dmccloskey

evonet's Issues

ChromatogramSimulator

Description

A class to simulate chromatograms.

Objectives

  • Ability to add in an arbitrary number of peaks
  • Ability to remove overlapping points from overlapping peaks
  • Ability to join peaks

Feature normalization

Description

Feature normalization: between net input and net output, performed across a feature set

Objectives

  • mean NodeIntegration
      • with tests
  • count NodeIntegration
      • with tests
  • pow,b and log,b NodeActivation types
      • with tests
  • addFeatureNormalization method in ModuleBuilder
      • with tests
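
As a rough illustration of how the mean, count, and pow building blocks listed above could combine, the sketch below normalizes one feature set to zero mean and unit variance. This follows the batch normalization idea cited in the references; the function name `normalize_features` and the `eps` parameter are assumptions made for the example and are not part of evonet.

```python
def normalize_features(values, eps=1e-5):
    """Shift a feature set to zero mean and scale it to roughly unit variance."""
    count = len(values)                                   # "count" integration
    mean = sum(values) / count                            # "mean" integration
    var = sum((v - mean) ** 2 for v in values) / count    # pow with exponent 2
    scale = (var + eps) ** -0.5                           # pow with exponent -0.5
    return [(v - mean) * scale for v in values]


normalized = normalize_features([1.0, 2.0, 3.0, 4.0])
```

The small `eps` term only guards against division by zero when every value in the feature set is identical.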

References

  • Ioffe et al, 2015 Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. arXiv:1502.03167v3

Regularization

Regularization techniques

  • Batch normalization
  • Drop connection
  • Drop Node
  • L2 norm

NNFramework

Description

Barebones neural network framework

Objectives

  • SGD with adaptive and random learning rate
  • Cost functions (e.g., 1/2(delta)**2)
  • Mini batch with average update
  • Forward propagation using an adjacency graph
  • Backward propagation using an adjacency graph
  • Node definitions (e.g., ReLU)
  • Integration with Eigen
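
The cost function and mini-batch items above can be sketched for the simplest possible case, a single weight in the model y = weight * x. This is a minimal illustration under that assumption, not the framework's design; all function names are hypothetical.

```python
def half_squared_error(output, expected):
    """Cost 1/2 * delta**2, where delta = output - expected."""
    delta = output - expected
    return 0.5 * delta ** 2


def half_squared_error_grad(output, expected):
    """Derivative of the cost with respect to the output."""
    return output - expected


def minibatch_update(weight, batch, learning_rate):
    """One gradient step for y = weight * x, averaging the gradient over the batch."""
    grads = [half_squared_error_grad(weight * x, y) * x for x, y in batch]
    return weight - learning_rate * sum(grads) / len(grads)


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # samples drawn from y = 2x
w = 0.0
for _ in range(100):
    w = minibatch_update(w, batch, learning_rate=0.1)
```

Averaging, rather than summing, the per-sample gradients keeps the effective step size independent of the batch size.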

Objectives (Bonus)

  • GPU distribution

Test

  • Autoencoder

Notes

VAE example

Description

MNIST VAE example

Objectives

  • Model cache for input and output nodes when adding nodes to the model
  • update checkCompleteModel to use input/output node cache
  • Update modelTrainer to allow for training with multiple lossFunction/output node pairs
  • Update model calculateError function to += error_ instead of assign
  • Update modelTrainer to include getters/setters for loss_functions and output_node pairs
  • Test modelTrainer
  • Update population trainer to remove output nodes and instead use modelTrainer
  • KL divergence loss function
  • addVAESampler to ModelBuilder
  • Gaussian sampler for input to VAE
  • BCEWithLogits
  • Perceptual loss function (?)

Training/Testing

  • during training, set the sampler to sample from a Gaussian
  • during testing, set the sampler to 0
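
The training/testing switch above can be sketched with the usual reparameterized sampler for a single latent dimension. It assumes the encoder outputs a mean and a log-variance, which the issue does not state; the function names are hypothetical.

```python
import math
import random


def sample_latent(mu, log_var, training, rng=random):
    """Return mu + sigma * eps, with eps ~ N(0, 1) in training and eps = 0 in testing."""
    eps = rng.gauss(0.0, 1.0) if training else 0.0
    return mu + math.exp(0.5 * log_var) * eps


def kl_divergence(mu, log_var):
    """KL divergence between N(mu, exp(log_var)) and N(0, 1) for one dimension."""
    return -0.5 * (1.0 + log_var - mu ** 2 - math.exp(log_var))


z_test = sample_latent(0.3, -1.0, training=False)
```

With the sampler set to 0, the latent value collapses to the encoder mean, so test-time outputs are deterministic.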

References (Vanilla VAE)

References (Perceptual loss)

Reference (BCEWithLogits)

Solver

Description

Abstract base and inherited classes for various solvers

Objectives

  • SGD
  • Gradient noise
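
A minimal sketch of the base/derived layout with plain SGD and optional gradient noise follows. The annealed Gaussian noise schedule (variance eta / (1 + t) ** gamma) and every class and parameter name here are assumptions made for illustration; the issue does not specify them.

```python
import random
from abc import ABC, abstractmethod


class Solver(ABC):
    """Abstract base: maps a weight and its gradient to an updated weight."""

    @abstractmethod
    def step(self, weight, gradient):
        ...


class SGDSolver(Solver):
    def __init__(self, learning_rate=0.01, noise_eta=0.0, noise_gamma=0.55, seed=None):
        self.learning_rate = learning_rate
        self.noise_eta = noise_eta        # 0 disables gradient noise
        self.noise_gamma = noise_gamma    # decay rate of the noise variance
        self.iteration = 0
        self.rng = random.Random(seed)

    def step(self, weight, gradient):
        if self.noise_eta > 0.0:
            # Assumed schedule: the noise variance shrinks as training progresses.
            variance = self.noise_eta / (1 + self.iteration) ** self.noise_gamma
            gradient += self.rng.gauss(0.0, variance ** 0.5)
        self.iteration += 1
        return weight - self.learning_rate * gradient


solver = SGDSolver(learning_rate=0.1)
w = 1.0
for _ in range(200):
    w = solver.step(w, 2.0 * w)   # gradient of f(w) = w**2
```

Keeping the iteration counter inside the solver lets other solvers, such as the bonus Adam variants, track their own state behind the same `step` interface.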

Objectives (Bonus)

  • Adam
  • NAdam
  • AMSGrad

References

Network viewer

Description

GUI to view a network

Objectives

  • Node objects that represent nodes in the network (shapes or tooltips for different integration and activation)
  • Connector objects that represent links in the network
  • Weight object that represents the weight associated with a particular link
  • Graph generator using the Model class
  • Graphical depiction of flow through the network

Examples

Node/link/weight modules

Description

A method to define combinations of nodes, links, and weights as a "module" that can be copied, added, and deleted during model evolution

Objective

  • additional Node/Link/Weight attribute module_name
  • ModelReplicator methods for copyModule, addModule, and deleteModule
  • test for Node/Link/Weight attribute
  • tests for ModelReplicator methods

PeakPickerLearn

Description

Peak picking and integration using deep learning

Objectives

  • identification of all peaks
  • correct baseline integration of all peaks
  • smoothing of spurious points, interpolation of saturated peaks, interpolation of cutoff peaks

Validation

  • 99% identification of all peaks in a chromatogram
  • correct area/height estimation of saturated peaks
  • correct area/height estimation of partially cutoff peaks

Tests and checks

  • 100% test accuracy on a single peak
  • 100% test accuracy on multiple peaks

Separate Weight and Solver classes

Description

Split management of weights and weight updates out of the Link class and into its own Weight class. Provide a link between the weight and the link it corresponds to using a "link_id". This will also provide a mechanism for link sharing. In addition, a separate Solver class may be needed to handle weight updates.

Objects

  • Weight.h and Weight.cpp
  • Weight_test

File Export

Description

Write a modified file to disk

Objectives

  • GUI for saving a modified file to disk
  • Writer to write the modified file to disk

Tests

  • ?

ModelBuilder

Description

Class to automate the construction of networks. Would require a separate ModuleBuilder class to define reusable sub units.

Objectives

  • FC layer
      • number of nodes
  • Convolution Layer
      • Depth (# of filters), stride (spacing between filters), spatial extent (width and height), zero padding
  • Pooling layer (max)
      • Spatial extent (width and depth), stride
  • LSTM/GRU layer
      • number of hidden units, architecture
  • SoftMax
      • number of nodes
  • Custom module

EMGModel

Description

A class to calculate the modified EMG formula

Objectives

  • calculation of the EMG PDF
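
For orientation, the sketch below evaluates the standard exponentially modified Gaussian (EMG) density in its usual closed form. The issue refers to a modified EMG formula that it does not spell out, so this is the textbook density rather than the class's intended one; the parameter names `mu`, `sigma`, and `tau` are assumptions.

```python
import math


def emg_pdf(x, mu, sigma, tau):
    """Standard EMG density: a Gaussian(mu, sigma) convolved with an
    exponential decay of time constant tau (rate lam = 1 / tau)."""
    lam = 1.0 / tau
    exponent = 0.5 * lam * (2.0 * mu + lam * sigma ** 2 - 2.0 * x)
    z = (mu + lam * sigma ** 2 - x) / (math.sqrt(2.0) * sigma)
    return 0.5 * lam * math.exp(exponent) * math.erfc(z)


# Evaluate a tailing peak on a regular grid of retention times.
times = [-10.0 + 0.01 * i for i in range(4000)]
profile = [emg_pdf(t, mu=0.0, sigma=1.0, tau=2.0) for t in times]
```

This direct form can overflow in `math.exp` when `tau` is very small relative to `sigma`, which is one reason a modified formula may be preferable in practice.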

LossFunction

Description

Abstract base and inherited classes for various loss functions

Objectives

  • cross entropy
  • Euclidean distance (autoencoder reconstruction error)
  • negative log likelihood
  • modified L2
  • mean squared error (MSE)
  • mean squared logarithmic error (MSLE)
  • Kullback Leibler (KL) Divergence
  • BCE with Logits
  • Softmax + CrossEntropy
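
Two of the losses above are usually fused with their activation for numerical stability. The sketch below shows one common way to write them for a single sample; the function names are hypothetical and the formulas are the standard stable forms, not necessarily the ones used in evonet.

```python
import math


def bce_with_logits(logit, target):
    """Binary cross entropy on a raw logit, avoiding overflow in exp().
    Equivalent to -[t * log(s) + (1 - t) * log(1 - s)] with s = sigmoid(logit)."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def softmax_cross_entropy(logits, target_index):
    """Cross entropy of softmax(logits) against a one-hot target,
    computed with the log-sum-exp trick."""
    largest = max(logits)
    log_sum_exp = largest + math.log(sum(math.exp(v - largest) for v in logits))
    return log_sum_exp - logits[target_index]


loss_binary = bce_with_logits(0.0, 1.0)
loss_multi = softmax_cross_entropy([0.0, 0.0], 0)
```

Both calls evaluate a maximally uncertain prediction, so each loss equals log(2).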

Global thread management

Description

A method to manage or optimize the number of devices and streams (GPUs) and threads (CPU) used based on user specifications and hardware configuration. The method should use a generic device abstraction to allow for multiple types of devices and new devices that may become available in the future.

Details

  • hardware resources can either be defined by the user or identified at run time
  • maximum number of threads and streams are tracked by the manager class
  • memory and host to device transfers are tracked by the DeviceManager
  • device type is abstracted
  • node and weight resources are managed by the associated device

HardwareManager Objectives

  • Track control flow devices and arithmetic devices
  • Discover or assign available hardware resources including num of CPUs/GPUs, number of streams, memory capacity

DeviceManager Objectives

  • abstract the device type
  • initialize the device
  • track memory on the device
  • manage streams on the device
  • copy host to device and device to host

Other changes

  • template for DeviceType in IntegrationFunction, LossFunction, and solver
  • initNodes and initWeights copies tensors to device
  • initModelError, initWeightMatrices, initNodeMatrices copies tensor to device
  • syncNodes and syncWeights copies tensors to host

NodeIntegration

Description

Allow for multiple input integration strategies such as Sum, Product, and Max
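
A small sketch of what the three strategies compute for one node, assuming each input is first multiplied by its link weight. The enum mirrors the NodeIntegration enum class named in the TODOs, but its members and the `integrate` helper are hypothetical.

```python
import math
from enum import Enum


class NodeIntegration(Enum):
    SUM = "Sum"
    PRODUCT = "Product"
    MAX = "Max"


def integrate(inputs, weights, integration):
    """Combine the weighted inputs of one node with the chosen strategy."""
    weighted = [x * w for x, w in zip(inputs, weights)]
    if integration is NodeIntegration.SUM:
        return sum(weighted)
    if integration is NodeIntegration.PRODUCT:
        return math.prod(weighted)
    if integration is NodeIntegration.MAX:
        return max(weighted)
    raise ValueError(f"unsupported integration: {integration}")


inputs, weights = [1.0, 2.0, 3.0], [1.0, 1.0, 2.0]
results = {kind: integrate(inputs, weights, kind) for kind in NodeIntegration}
```

Each strategy also needs its own error and weight-gradient rule during back propagation, which is why the TODOs list FP, BP, and weightUpdate separately.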

TODOs

  • NodeIntegration enum class
  • Tests and update of all other code for NodeIntegration
  • input_ internal parameter with getters/setters
  • Tests for input_
  • update FP with NodeIntegration options
  • Tests for FP with NodeIntegration options
  • update BP with NodeIntegration options
  • Tests for BP with NodeIntegration options
  • update weightUpdate with NodeIntegration options
  • Tests for weightUpdate with NodeIntegration options

Tensor device support

Description

Implement an interface for device specification when using the Eigen::Tensor library.

Objectives

  • GPU support using CUDA
  • Multi-CPU support

std::shared_ptr implementation of NodeIntegration

Description

Similar to solver, weight_init, and activation

Objectives

  • classes for Integration, IntegrationError, and IntegrationWeightGrad
  • tests for all integration classes
  • refactor of Model to use shared_ptr implementation of NodeIntegration
  • refactor of Node to use shared_ptr implementation of NodeIntegration
  • refactor of ModelReplicator to use shared_ptr implementation of NodeIntegration
  • update of all downstream tests and examples

File import

Description

Read a file from disk.

Objectives

  • GUI for selecting files on the user's hard disk
  • parser for .csv files
  • import routine for files

Tests

  • ?

Check for a complete input to output network

Description

Some solutions to sequence problems can be found by disconnecting the output node. Need to implement a test to ensure that input can be propagated to the output.

Implementation

Set all node activations to linear, all biases to 0, all weights to 1, and all inputs to 1; run forward propagation (FP); and check that the output is > 1
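
The check can be sketched on a bare list of (source, sink) links, with weights of 1, biases of 0, and linear sum-integration nodes. The function name and the graph representation are assumptions for illustration. The sketch returns the raw output value so the caller can pick the threshold; note that a single input-to-output path yields exactly 1 under these settings, and a disconnected output yields 0.

```python
def forward_propagate_ones(links, input_nodes, output_node):
    """Propagate an input of 1 through unit-weight, zero-bias, linear nodes
    and return the value that arrives at output_node."""
    input_nodes = set(input_nodes)
    nodes = {n for link in links for n in link} | input_nodes | {output_node}
    values = {n: (1 if n in input_nodes else 0) for n in nodes}
    # One pass per node is enough for a signal to cross any acyclic path.
    for _ in range(len(nodes)):
        updated = dict(values)
        for node in nodes - input_nodes:
            updated[node] = sum(values[src] for src, sink in links if sink == node)
        values = updated
    return values[output_node]


connected = forward_propagate_ones(
    [("i1", "h"), ("i2", "h"), ("h", "o")], {"i1", "i2"}, "o")
disconnected = forward_propagate_ones([("i1", "h")], {"i1"}, "o")
```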

adaptiveScheduler methods

Description

Methods to allow for easy implementation of complex training schedules that involve adapting the population size, # of "mutations", etc., based on the network performance. In addition, it would be necessary to have methods to increase or decrease the difficulty of the task depending upon the network performance.

Objectives

  • model change scheduling mechanism for ModelReplicator (i.e., update of setRandomModifications)
  • model training hyper parameter mechanisms for ModelTrainer (i.e., updates of n_epochs, solver parameters, time-steps during back propagation, etc.)
  • population training hyper parameter mechanisms for PopulationTrainer (i.e., n_top, n_random, and n_replicates_per_model)

PeakSimulator

Description

A class to simulate a peak using an EMG model

Objectives

  • ability to simulate a peak of a given window size and point density
  • ability to add simulated noise to the peak
  • ability to specify different baseline heights for the left and right peak edges

Test coverage of ModelReplicator

Description

Multiple bugs were found in invalid calls to getRandom (i.e., input vector of size 0) and in insufficient handling of errors thrown by getRandom. These occurred during rounds of training using PopulationTrainer.

Objectives

  • improve test coverage of ModelReplicator methods to cover cases of bad input.
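
The kind of bad-input case described above can be illustrated with a stand-in for getRandom. The functions below are hypothetical Python analogues, not the ModelReplicator API: the selector raises on an empty candidate list, and the caller handles that error instead of letting it escape during training.

```python
import random


def get_random(candidates, rng):
    """Pick one candidate at random; an empty list is an error, not a crash."""
    if not candidates:
        raise ValueError("no candidates to select from")
    return rng.choice(candidates)


def select_random_node(node_names, rng):
    """Return a random node name, or None when there is nothing to select."""
    try:
        return get_random(node_names, rng)
    except ValueError:
        return None


rng = random.Random(0)
picked = select_random_node(["node_a", "node_b"], rng)
nothing = select_random_node([], rng)
```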

ActivationFunction

Description

Abstract base and inherited classes for various activation functions

Objectives

  • ReLU
  • ELU
  • Sigmoid
  • tanh
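
For reference, the four activations can be written as scalar functions as follows. This is a plain sketch of the standard definitions; the `alpha` default for ELU and the overflow-safe sigmoid branch are choices made for the example.

```python
import math


def relu(x):
    return max(0.0, x)


def elu(x, alpha=1.0):
    # expm1(x) = exp(x) - 1, accurate for x close to 0.
    return x if x > 0.0 else alpha * math.expm1(x)


def sigmoid(x):
    # Branch on the sign so exp() never receives a large positive argument.
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def tanh(x):
    return math.tanh(x)


outputs = [f(-1.0) for f in (relu, elu, sigmoid, tanh)]
```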

References

Hahnloser, R.; Sarpeshkar, R.; Mahowald, M. A.; Douglas, R. J.; Seung, H. S. (2000). "Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit". Nature. 405. pp. 947–951.
Clevert, Djork-Arné; Unterthiner, Thomas; Hochreiter, Sepp (2015). "Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)". arXiv:1511.07289

Population viewer

Description

Timeline plot to show the different models in the population during training

Objectives

  • Timeline view
  • Overview of the validation accuracy of current and past models
  • Hierarchy to view the ancestry of models and penetration of models into the population

CI

  • Continuous integration using Travis and Appveyor
  • Unit testing using Boost.Test library

Drop Connection/Node

Description

Implementations of drop connection and drop node during training. Each is implemented as a multiplication of the weight (drop connection) or the node output (drop node) by 0.

Implementation

  • probability of weight * 0
  • probability of output * 0
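
Both variants reduce to multiplying by a random 0/1 mask, as the sketch below shows. The helper names and the choice to draw one independent mask entry per weight or node are assumptions; the issue only states that dropped values are multiplied by 0.

```python
import random


def drop_mask(size, drop_probability, rng):
    """Return a list of 0.0/1.0 factors; each entry is 0.0 with the given probability."""
    return [0.0 if rng.random() < drop_probability else 1.0 for _ in range(size)]


def apply_mask(values, mask):
    """Multiply each weight (drop connection) or node output (drop node) by its factor."""
    return [v * m for v, m in zip(values, mask)]


rng = random.Random(0)
weights = [0.5, -1.2, 0.8, 2.0]
dropped_weights = apply_mask(weights, drop_mask(len(weights), 0.5, rng))
```

The mask would be redrawn for each training step and left out entirely (all factors 1.0) during validation.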

Objectives

  • Drop connection
      • with tests
  • Drop Node
      • with tests

Weight initializations

Description

w=np.random.randn(layer_size[l],layer_size[l-1])*np.sqrt(2/layer_size[l-1])

Objectives

  • Xavier initializations for Non-ReLU FFNN (Norm Dist std dev = sqrt(2. / (in + out)))
  • He et al, 2015 initialization for ReLU FFNN (Norm Dist std dev = sqrt(2/in))
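
The two standard deviations above can be computed and applied without NumPy as in the sketch below. The function names are hypothetical; `fan_in` and `fan_out` correspond to "in" and "out" in the formulas, and the weight matrix is laid out as fan_out rows by fan_in columns, matching the shape used in the description.

```python
import math
import random


def he_std(fan_in):
    """Std dev for ReLU layers: sqrt(2 / in)."""
    return math.sqrt(2.0 / fan_in)


def xavier_std(fan_in, fan_out):
    """Std dev for non-ReLU layers: sqrt(2 / (in + out))."""
    return math.sqrt(2.0 / (fan_in + fan_out))


def init_weights(fan_out, fan_in, std, rng):
    """Draw a fan_out x fan_in matrix from a zero-mean normal distribution."""
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]


rng = random.Random(0)
relu_layer = init_weights(3, 8, he_std(8), rng)
tanh_layer = init_weights(4, 4, xavier_std(4, 4), rng)
```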

References

ModelLogger

Description

Methods for logging model diagnostics during training to replace verbose outputs to the console

Features

  • Model loss/error vs epoch curves
  • Train/Val accuracy vs epoch curves
  • Node Output/Expected Output vs epoch
  • Ratio of weight updates (want weights:updates ~1e-3) vs epoch
  • Node output distribution (mean/var) per layer per mini-batch per time-step vs epoch
  • Node error distribution (mean/var) per layer per mini-batch per time-step vs epoch
  • Epoch iteration vs time
  • Weight/Node values per time_step vs. Epoch

Objectives

  • ModelLogger class
  • getters/setters for ModelLogger for Model
  • activation of ModelLogger via ModelTrainer during training or validation
  • method to log each of the features above
  • tests for each of the features above

Optimize build

Build only the components of contrib that are needed
Build only the components of OpenMS that are needed

Improved test coverage and test updates

improve test coverage

ActivationOp and ActivationTensorOp

  • getters and setters
  • default and non-default constructors

ModelBuilder

  • dummy input for each ModelBuilder method
  • test for expected output of each test model

PopulationTrainer

  • Thread tests

ModelTrainer

  • Tests for validateModel

model

  • update tests for MapValuesToNodes to include node input

Model_DCG and DAG

  • sub tests for getNextInactiveLayer... and getNextUncorrectedLayer...

additionProblem

  • tests for AddProb
  • tests for MakeAddProblemTrainingData

MNIST

  • tests for data readers

Boost.Test

Add in support for using Boost.Test unit testing module

ChromatogramSimulator

Description

Simulate a chromatogram with multiple peaks, varying baseline, and stochastic detector noise.

Objectives

  • Add multiple simulated peaks
  • Remove overlapping points
