
cnn_gaze's Introduction

An attempt at gaze prediction with a CNN.
Starting from simple neural network implementations, I will gradually build up my own CNN model.


Project progress is tracked at:

http://paulweihan.github.io

Implement LeNet5: model00, done
Modify LeNet5: model01-03, done
Modify GoogLeNet: model04, 05, done

model00 achieves results close to those of [ CVPR 2015 ] Appearance-Based Gaze Estimation in the Wild


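Main references:

    1, Python code from the Deep Learning Tutorials website
        http://deeplearning.net/tutorial/lenet.html#lenet
        http://blog.csdn.net/u012162613/article/details/43225445
        A classic LeNet architecture implemented in Python with Theano. The second link annotates the code given in the first link.
        The code comes from the deep learning tutorial Convolutional Neural Networks (LeNet) [first link]. It implements a simplified LeNet5 that differs from the original as follows:
        It does not implement location-specific gain and bias parameters.
        It uses max pooling rather than average pooling.
        The classifier is softmax, whereas LeNet5 uses RBF.
        In LeNet5 the second convolutional layer is not fully connected to the preceding feature maps; this program connects it fully.
        In addition, the code merges the convolutional layer and the subsampling layer into a single class, "LeNetConvPoolLayer" (convolution-pooling layer), which makes sense because the two always appear as a pair. One point needs attention, though: the code feeds the convolution output directly into the subsampling layer, without first adding the bias b and applying the sigmoid mapping. In the notation of the LeNet5 diagram, the b_x and sigmoid that follow f_x are absent, so C_x is obtained directly from f_x.

        Finally, the first convolutional layer in the code uses 20 kernels and the second uses 50, rather than the 6 and 16 shown in the LeNet5 diagram.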
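The layer ordering described in item 1 can be sketched in plain Python for a single feature map. This is only an illustration, not the tutorial's Theano code: the helper names (`conv2d_valid`, `max_pool`, `conv_pool_layer`, `output_side`) are hypothetical, the convolution is written as a cross-correlation without a kernel flip, and adding the bias and applying tanh after pooling is my reading of the tutorial code, so treat it as an assumption. The 28x28 input and 5x5 kernel used in the shape check are also assumptions (the MNIST setup), since the notes above only state the kernel counts.

```python
import math


def conv2d_valid(image, kernel):
    """'Valid' 2D cross-correlation of one image with one kernel (lists of lists)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h, out_w = len(fmap) // size, len(fmap[0]) // size
    return [
        [
            max(fmap[i * size + a][j * size + b]
                for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def conv_pool_layer(image, kernel, bias, pool=2):
    """Convolve, pool the raw convolution output, then add the bias and apply tanh."""
    pooled = max_pool(conv2d_valid(image, kernel), pool)
    return [[math.tanh(v + bias) for v in row] for row in pooled]


def output_side(side, kernel=5, pool=2):
    """Side length of a square feature map after one conv-pool stage."""
    return (side - kernel + 1) // pool


if __name__ == "__main__":
    # Assumed 28x28 input with 5x5 kernels and 2x2 pooling: 28 -> 12 -> 4.
    first = output_side(28)
    second = output_side(first)
    # With 50 kernels in the second stage, the flattened feature vector
    # fed to the fully connected layer has 50 * second * second entries.
    print(first, second, 50 * second * second)
```

Under these assumptions each of the 50 second-stage feature maps is 4x4, so the classifier input has 800 values; with different input or kernel sizes only the arguments to `output_side` change.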


    2, Stanford's cs231n course website:
        http://cs231n.stanford.edu
        http://cs231n.stanford.edu/syllabus.html

 * 3, Hacker's guide to Neural Networks
        For understanding the BP, SVM, and NN algorithms and their implementations
        http://karpathy.github.io/neuralnets/

 * 4, Unsupervised Feature Learning and Deep Learning
        http://deeplearning.stanford.edu/wiki/index.php/UFLDL_Tutorial
        Note: the latest tutorial URL is http://ufldl.stanford.edu/
        Edited by Andrew Ng
        Before starting, take his Machine Learning course first:
        http://openclassroom.stanford.edu/MainFolder/CoursePage.php?course=MachineLearning

    5, Code based on the Deep Learning toolbox on GitHub
        https://github.com/rasmusbergpalm/DeepLearnToolbox
        http://blog.csdn.net/zouxy09/article/details/9993743/

 * 6, Caffe website:
        http://caffe.berkeleyvision.org

        Jia Y, Shelhamer E, Donahue J, et al. Caffe: Convolutional Architecture for Fast Feature Embedding[J]. Eprint Arxiv, 2014.

        http://www.csdn.net/article/2015-01-22/2823663

        CUDA: https://developer.nvidia.com/cuda-downloads

    7, A beginner learning deep learning from scratch
        http://blog.csdn.net/yihaizhiyan/article/category/2388401


    Main papers:
    [ CVPR 2015 ] Appearance-Based Gaze Estimation in the Wild
    [ CVPR 2014 ]  Learning-by-Synthesis for Appearance-based 3D Gaze Estimation

    GoogLeNet:
        Szegedy C, Liu W, Jia Y, et al. Going Deeper with Convolutions[J]. Eprint Arxiv, 2014.

    DeepID-Net:
        http://www.ee.cuhk.edu.hk/~wlouyang/projects/imagenetDeepId/index.html

        Ouyang W, Luo P, Zeng X, et al. DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection[C]. Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on. IEEE, 2015.

        Ouyang W, Luo P, Zeng X, et al. DeepID-Net: multi-stage and deformable deep convolutional neural networks for object detection[J]. Eprint Arxiv, 2014.

        Sun Y, Wang X, Tang X. Deep Learning Face Representation from Predicting 10,000 Classes[C]// Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on. IEEE, 2014:1891-1898.

    RCNN:
        https://github.com/rbgirshick/rcnn
        bibtex
        @inproceedings{girshick14CVPR,
        Author = {Girshick, Ross and Donahue, Jeff and Darrell, Trevor and Malik, Jitendra},
        Title = {Rich feature hierarchies for accurate object detection and semantic segmentation},
        Booktitle = {Computer Vision and Pattern Recognition},
        Year = {2014}
        }

    Classic papers:
            Notes on Convolutional Neural Networks
            Gradient-Based Learning Applied to Document Recognition
My notes: paulweihan.github.io
