The cnn_gaze's intro from paulweihan

尝试使用CNN实现视线预测
会逐步从简单的神经网络实现入手，搭起自己的CNN模型。


项目进展参考网站：

http://paulweihan.github.io

实现LeNet5 model00 done
改lenet5 model01-03 done
改googlenet model04，05 done

model00 得到和［ CVPR 2015 ］ Appearance-Based Gaze Estimation in the Wild 近似的结果


主要参考：
    
    1，深度学习教程网站的python代码
        http://deeplearning.net/tutorial/lenet.html#lenet
        http://blog.csdn.net/u012162613/article/details/43225445
        基于python theano实现的经典LeNet结构，第二个链接对第一个链接中给出的代码进行了注释：
        代码来自于深度学习教程：Convolutional Neural Networks (LeNet) ［第一个链接］，这个代码实现的是一个简化了的LeNet5，具体如下：
        没有实现location-specific gain and bias parameters
        用的是maxpooling，而不是average_pooling
        分类器用的是softmax，LeNet5用的是rbf
        LeNet5第二层并不是全连接的，本程序实现的是全连接
        另外，代码里将卷积层和子采用层合在一起，定义为“LeNetConvPoolLayer“（卷积采样层），这好理解，因为它们总是成对出现。但是有个地方需要注意，代码中将卷积后的输出直接作为子采样层的输入，而没有加偏置b再通过sigmoid函数进行映射，即没有了下图中fx后面的bx以及sigmoid映射，也即直接由fx得到Cx。

        最后，代码中第一个卷积层用的卷积核有20个，第二个卷积层用50个，而不是上面那张LeNet5图中所示的6个和16个。


    2，斯坦福的cs231n课程网站：
        http://cs231n.stanford.edu
        http://cs231n.stanford.edu/syllabus.html

 ＊ 3，Hacker's guide to Neural Networks
        理解BP，SVM，NN算法及其实现
        http://karpathy.github.io/neuralnets/

 ＊ 4，Unsupervised Feature Learning and Deep Learning
        http://deeplearning.stanford.edu/wiki/index.php/UFLDL_Tutorial
        注：最新的教程网址是：http://ufldl.stanford.edu/
        吴恩达／Andrew Ng主编
        学之前先看：他的机器学习
        http://openclassroom.stanford.edu/MainFolder/CoursePage.php?course=MachineLearning

    5，代码基于githup的Deep Learning的toolbox
        https://github.com/rasmusbergpalm/DeepLearnToolbox
        http://blog.csdn.net/zouxy09/article/details/9993743/

 ＊ 6，Caffe网站：
        http://caffe.berkeleyvision.org

        Jia Y, Shelhamer E, Donahue J, et al. Caffe: Convolutional Architecture for Fast Feature Embedding[J]. Eprint Arxiv, 2014.

        http://www.csdn.net/article/2015-01-22/2823663

        CUDA：https://developer.nvidia.com/cuda-downloads

    7, 菜鸟从零开始学习Deep learning
        http://blog.csdn.net/yihaizhiyan/article/category/2388401


    主要论文：
    ［ CVPR 2015 ］ Appearance-Based Gaze Estimation in the Wild
    ［ CVPR 2014 ］  Learning-by-Synthesis for Appearance-based 3D Gaze Estimation

    GoogLenet
        Szegedy C, Liu W, Jia Y, et al. Going Deeper with Convolutions[J]. Eprint Arxiv, 2014.

    DeepID-Net:
        http://www.ee.cuhk.edu.hk/~wlouyang/projects/imagenetDeepId/index.html

        Ouyang W, Luo P, Zeng X, et al. DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection[C]. Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on. IEEE, 2015.

        Ouyang W, Luo P, Zeng X, et al. DeepID-Net: multi-stage and deformable deep convolutional neural networks for object detection[J]. Eprint Arxiv, 2014.
.       
        Sun Y, Wang X, Tang X. Deep Learning Face Representation from Predicting 10,000 Classes[C]// Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on. IEEE, 2014:1891-1898.

    RCNN:
        https://github.com/rbgirshick/rcnn
        bibtex
        @inproceedings{girshick14CVPR,
        Author = {Girshick, Ross and Donahue, Jeff and Darrell, Trevor and Malik, Jitendra},
        Title = {Rich feature hierarchies for accurate object detection and semantic segmentation},
        Booktitle = {Computer Vision and Pattern Recognition},
        Year = {2014}
        }

    经典论文：
            Notes on Convolutional Neural Networks
            Gradient-Based Learning Applied to Document Recognition
My notes: paulweihan.github.io
paulweihan / cnn_gaze Goto Github PK

cnn_gaze's Introduction

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent