a.i,lxzun

`Study Log`

Reinforcement Learning #1

 1. Mathematical Background 
      - Probability                 - ok
      - Random Variable             - ok
      - Random Process              - ok

  2. Basic RL Algorithm
      - Markov Process              - ok             
      - Markov Reward Process       - ok
      - Markov Decision Process     - ok 
          - Bellman Expectation Eqn 
          - Bellman Optimality  Eqn  

      - Dynamic Programming         - ok
          - Value iteration         - ok
          - Policy iteration        - ok

      - Model-free Approaches
          # MF Prediction                
            - Monte Carlo           - ok
            - Temporal Difference   - ok
            
            (Example : Random Walk)
            
          # MF Control                
            - Sarsa                 - ok
            - Q-Learning            - ok

            (Example : Cliff Walking)
            (Example : Windy Grid)
            (Example : Windy Cliff)

  3. ML-based Reinforcement Learning
          - Function Approximation  - ok 
          - DQN                     - ok

            (Example : Cartpole - DQN)
  
  4. Policy-Based Reinforcement Learning
          - REINFORCE               - ok
          - A2C                     - ok


  # Term Project

     Cartpole A2C          
     Cartpole DQN         
     Cartpole REINFORCE

Reinforcement Learning #2

Week 1 : Dynamic Programming

  - Policy Iteration
  - Value Iteration

  # Proof of Convergence
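
A minimal value-iteration sketch for this week's topics. The 2-state MDP, its rewards, and the names `P`, `GAMMA`, `q_value`, `value_iteration`, and `greedy_policy` are all made up for illustration. The loop stops because the Bellman optimality backup is a γ-contraction in the sup norm, which is the idea behind the convergence proof.

```python
# Hypothetical 2-state, 2-action MDP (all numbers are invented for this sketch).
# P[s][a] is a list of (probability, next_state, reward) tuples.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
GAMMA = 0.9  # discount factor (assumed)


def q_value(V, s, a):
    """One-step lookahead: expected reward plus discounted next-state value."""
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[s][a])


def value_iteration(tol=1e-10):
    """Apply the Bellman optimality backup in place until the largest change is below tol."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(q_value(V, s, a) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


def greedy_policy(V):
    """Pick, in every state, the action with the highest one-step lookahead value."""
    return {s: max(P[s], key=lambda a: q_value(V, s, a)) for s in P}
```

For this toy MDP the fixed point can be checked by hand: staying in state 1 earns 2 per step, so V(1) = 2 / (1 - 0.9) = 20 and V(0) = 1 + 0.9 * 20 = 19.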

Week 2 : Monte Carlo

  - On Policy Monte Carlo  : Batch / Recursive 
  - Off Policy Monte Carlo : Batch / Recursive

  # Law of Large Numbers
  # Empirical Mean 
  # Importance Sampling
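
A small sketch tying the three notes above together, assuming a one-step problem with two actions. The policies `TARGET` and `BEHAVIOR`, the rewards, and all function names are hypothetical. Samples are drawn from the behavior policy, reweighted by the importance ratio π(a)/b(a), and averaged in two ways: as a batch empirical mean and as a recursive (incremental) mean.

```python
import random

TARGET = {0: 0.1, 1: 0.9}    # pi(a): target policy (assumed)
BEHAVIOR = {0: 0.5, 1: 0.5}  # b(a): behavior policy that generates the data (assumed)
REWARD = {0: 0.0, 1: 1.0}    # deterministic reward per action (assumed)


def sample_weighted_returns(n, seed=0):
    """Draw n actions from the behavior policy and return rho * reward for each."""
    rng = random.Random(seed)
    weighted = []
    for _ in range(n):
        a = 0 if rng.random() < BEHAVIOR[0] else 1
        rho = TARGET[a] / BEHAVIOR[a]  # importance sampling ratio
        weighted.append(rho * REWARD[a])
    return weighted


def batch_mean(xs):
    """Empirical mean computed once over the whole batch."""
    return sum(xs) / len(xs)


def recursive_mean(xs):
    """Same mean, updated one sample at a time: m_n = m_{n-1} + (x_n - m_{n-1}) / n."""
    mean = 0.0
    for n, x in enumerate(xs, start=1):
        mean += (x - mean) / n
    return mean
```

The true value under the target policy is 0.1 * 0 + 0.9 * 1 = 0.9. By the law of large numbers both means approach it as the sample count grows, and the two forms agree up to floating-point rounding.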

Week 3 : Temporal Difference

  - Temporal difference(0)
  - Temporal difference(1)
  - Temporal difference(λ)
  - SARSA
  - Q Learning
  - Double Q Learning
  - Deep Q Learning
  - Function Approximation 

  # Robbins-Monro rule
  # Sherman-Morrison formula
  # Projected Bellman Eqn
  # Maximization bias
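
A rough illustration of maximization bias and of why Double Q Learning uses two estimators. This is a sketch under simplified assumptions, not a full agent: every action has true value 0, and each estimate is just a sample mean of unit-variance Gaussian noise. The names `noisy_estimates` and `compare_estimators` and the default sizes are hypothetical.

```python
import random


def noisy_estimates(rng, n_actions, n_samples):
    """Sample-mean estimates of action values whose true value is 0 for every action."""
    return [
        sum(rng.gauss(0.0, 1.0) for _ in range(n_samples)) / n_samples
        for _ in range(n_actions)
    ]


def compare_estimators(trials=2000, n_actions=10, n_samples=10, seed=0):
    """Return (average single-estimator value, average double-estimator value)."""
    rng = random.Random(seed)
    single_total = 0.0
    double_total = 0.0
    for _ in range(trials):
        q_a = noisy_estimates(rng, n_actions, n_samples)
        q_b = noisy_estimates(rng, n_actions, n_samples)
        # Single estimator: select and evaluate with the same noisy table.
        single_total += max(q_a)
        # Double estimator: select with q_a, evaluate with the independent q_b.
        best = max(range(n_actions), key=lambda a: q_a[a])
        double_total += q_b[best]
    return single_total / trials, double_total / trials
```

Taking the max over the same noisy table that is used for evaluation gives an average clearly above the true value 0, while selecting with one table and evaluating with the other stays close to 0.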

Week 4 : Policy Gradient

  - REINFORCE
  - A2C
  - DPG
  - DDPG

  # PG Proof
  # Information Theory 
      - Self Information
      - Shannon-Entropy
      - KL divergence
      - Cross Entropy
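
The four information-theory quantities listed above can be sketched for discrete distributions as follows. The function names are my own, logs are base 2 (bits), and the code assumes `q[i] > 0` wherever `p[i] > 0`.

```python
import math


def self_information(p):
    """Surprise of a single event with probability p, in bits."""
    return -math.log2(p)


def entropy(p):
    """Shannon entropy H(p): expected self-information under p."""
    return sum(pi * self_information(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """H(p, q): expected code length when events follow p but the code is built for q."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """KL(p || q): extra bits paid for using q in place of p."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

These satisfy H(p, q) = H(p) + KL(p || q), so minimizing cross entropy over q with p fixed is the same as minimizing the KL divergence.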

Week 5 : Advanced RL

  - D3QN
  - Double DQN
  - Dueling DQN
  - TD3
  - TRPO
  - PPO

Week 6 : Project
```
  # Solve BiPedal 
```

Machine Learning

1. Linear Regression       - ok
2. Logistic Regression     - ok
3. K-nearest neighbors     - ok
4. K-means clustering      - ok      
5. Naive Bayes             - ok      
6. SVM                     - ok
7. PCA
8. Decision Tree           - ok
9. Perceptron              - ok      
       1. SLP              - ok
       2. MLP              - ok

Deep Learning

 1. Linear Regression      - ok    
 2. Logistic Regression            
      - Logistic Regression(Binary Classification)    - ok
      - Softmax Regression(MultiClass Classification) - ok
     
 3. Auto Encoder            
       - AE                - ok
       - CAE               - ok

 4. Modern CNN
       - LeNet             - ok
       - AlexNet           - ok
       - VGG Nets          - ok
       - GoogLeNet         - ok
       - ResNet            - ok


 5. Semantic Segmentation
       - FCN               - ok
       - DeConvNet         - ok
       - SegNet            - ok     
       - U-Net             - ok
       - DeepLab v1, v2    - ok

 6. Object Detection
       - RCNN
       - Fast RCNN
       - Faster RCNN
       - SPP Net
       - Yolo
       - SSD
       - Attention Net

 7. NLP
       - RNN                       - ok
       - LSTM / GRU                - ok
       - Sequence Prediction       - ok
       - Sequence Classification   - ok
