shar-01 / intro_continual_learning

This project forked from clam004/intro_continual_learning

This is a tutorial to connect the fundamental mathematics to a practical implementation addressing the continual learning problem of artificial intelligence

License: MIT License

Languages: Jupyter Notebook 96.60%, Python 3.40%

intro_continual_learning

This is a tutorial that connects the mathematics and machine learning theory to practical implementations addressing the continual learning problem of artificial intelligence. We will learn this in Python by examining and deconstructing a method called elastic weight consolidation (EWC).

I wish there were more learning tools in this style: ones that directly help the learner connect the math to the code, using a simple but complete end-to-end project. While the average programmer can load an "out-of-the-box" library in 5 minutes and be running the latest model on a common task in 15 minutes, I often hear from engineers that they feel underdeveloped in the math underlying recent academic research in machine learning. I have received criticism from some who believe that tutorials like this give "average" engineers a shortcut to "think" they understand the math behind a flashy new artificial intelligence concept, and that the joy of reading these papers should be reserved for traditionally trained academics who have gone through years of formal coursework. I think there is nothing wrong with using a cool AI concept to motivate learners to pick up more of the fundamental math on their own.

"anyone can cook" - ratatouille

What does elastic weight consolidation do?

The ability to learn tasks sequentially is crucial to the development of artificial intelligence. When an artificial neural network is trained on a new training set, it is generally subject to catastrophic forgetting unless that training set combines all the old tasks with the new one: learning to solve new task B is accompanied by degraded performance on old task A. In contrast, humans can maintain expertise on tasks that they have not experienced for a long time. EWC addresses this problem by selectively slowing down learning on the weights (i.e., parameters, synaptic strengths) that are important for the old tasks.
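To make "selectively slowing down learning" concrete, here is a minimal sketch in plain Python (no PyTorch, to keep it dependency-free). It assumes the quadratic penalty from the EWC paper, (lam / 2) * sum_i F_i * (theta_i - theta_A_i)^2, where theta_A holds the weights learned on task A and F_i is a per-weight importance (in the paper, the diagonal of the Fisher information). The toy task-B loss, the importance values, and all function names are hypothetical and chosen only for illustration; this is not the code used in the notebooks.

```python
def ewc_penalty(theta, theta_a, importance, lam):
    """Return (lam / 2) * sum_i importance_i * (theta_i - theta_a_i) ** 2."""
    return 0.5 * lam * sum(
        f * (t - a) ** 2 for t, a, f in zip(theta, theta_a, importance)
    )


def ewc_penalty_grad(theta, theta_a, importance, lam):
    """Gradient of the penalty with respect to each weight."""
    return [lam * f * (t - a) for t, a, f in zip(theta, theta_a, importance)]


def task_b_grad(theta, target_b):
    """Gradient of a toy task-B loss: 0.5 * sum_i (theta_i - target_b_i) ** 2."""
    return [t - b for t, b in zip(theta, target_b)]


def train_on_task_b(theta_a, importance, target_b, lam=1.0, lr=0.05, steps=500):
    """Gradient descent on (task-B loss + EWC penalty), starting from theta_a."""
    theta = list(theta_a)
    for _ in range(steps):
        g_task = task_b_grad(theta, target_b)
        g_ewc = ewc_penalty_grad(theta, theta_a, importance, lam)
        theta = [t - lr * (gt + ge) for t, gt, ge in zip(theta, g_task, g_ewc)]
    return theta


# Hypothetical setup: weight 0 mattered a lot for task A, weight 1 not at all.
theta_a = [0.0, 0.0]        # weights after training on task A
importance = [10.0, 0.0]    # assumed per-weight importance for task A
target_b = [1.0, 1.0]       # task B alone would pull both weights to 1.0

theta_after_b = train_on_task_b(theta_a, importance, target_b)
print(theta_after_b)
```

In this toy setting the important weight stays close to its task-A value while the unimportant weight moves freely to the task-B optimum, which is the behavior described above. With `lam=0.0` the penalty vanishes and both weights drift to the task-B optimum, the analogue of catastrophic forgetting.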

Setup

  • Ubuntu 18.04.3 LTS (bionic)
  • Python 3.8
  • CUDA 10.1
  • cuDNN 7.6.4
  • PyTorch 1.10.0

These same steps should also work on macOS.

you@you:/path/to/folder$ pip3 install virtualenv

you@you:/path/to/folder$ virtualenv venv --python=python3.8

you@you:/path/to/folder$ source venv/bin/activate

(venv) you@you:/path/to/folder$ pip3 install -r requirements.txt

(venv) you@you:/path/to/folder$ jupyter notebook
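Once the virtual environment is active, a quick check like the following can confirm that the interpreter and PyTorch match the versions listed above. This is only a sketch: the helper name `describe_environment` is hypothetical, and it assumes PyTorch is importable as `torch`. If PyTorch is not installed, it reports that instead of failing.

```python
import importlib.util
import sys


def describe_environment():
    """Collect the Python version and, if installed, PyTorch and CUDA status."""
    info = {"python": "%d.%d" % sys.version_info[:2]}
    if importlib.util.find_spec("torch") is None:
        # PyTorch is missing from this environment; report that rather than raise.
        info["torch"] = None
        return info
    import torch

    info["torch"] = torch.__version__
    info["cuda_available"] = torch.cuda.is_available()
    return info


print(describe_environment())
```

Run it from a notebook cell or with the venv's `python`, so that it inspects the same environment the notebooks will use.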

Credit/References:

  1. James Kirkpatrick et al. "Overcoming catastrophic forgetting in neural networks." 2016. doi:10.1073/pnas.1611835114

  2. shivamsaboo17

  3. moskomule

Contributors

clam004, csinva
