wwxy261 / tiny_cnn

Convolution, ReLU, and pooling are the basic units of a neural network. This test asks you to implement these three parts in C/C++: forward pass only (backward is a plus); support for the Conv2D and Pooling2D operators; verify the results with test cases and measure the computational efficiency.

CMake 0.03% C++ 99.39% C 0.16% Objective-C 0.01% Python 0.42%

tiny_cnn's Introduction

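tiny_CNN performance test

Convolution, ReLU, and pooling are the basic units of a convolutional neural network. This project implements the forward and backward computation of these three units in C++. It also implements a module that fuses the three units into one to improve performance; the fused module is reported to be about 1.2 times faster. The project uses OpenMP for multi-core parallelism.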

Introduction

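1. Project structure

./src source code

./lib open-source third-party matrix library Eigen (can call MKL as its backend)

./data MNIST dataset

./python NumPy demo implementations of the algorithms, plus a PyTorch network used as the benchmark

./main.cpp performance test code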

2. Build

mkdir build
cd build
cmake ../
make

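3. Algorithm overview

Convolution is implemented with the classic im2col algorithm from Caffe.
In the Convolution and MaxPooling implementations, the outermost n_sample loop is parallelized with parallel for.
Convolution, Relu, and Max_Pooling are fused into a single module, again with parallel for on the outermost loop.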
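
The im2col approach described above can be sketched as follows. This is a minimal illustration, not the code in ./src: the function names im2col and conv2d_im2col, the row-major N x C x H x W layout, and the stride-1, no-padding restriction are assumptions made for the example, and a plain triple loop stands in for the Eigen matrix product the project links against. The pragma on the sample loop is ignored if the compiler is not given an OpenMP flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: unfold one image (C x H x W, row-major) into a
// (C*K*K) x (OH*OW) column matrix. Assumes stride 1 and no padding.
std::vector<float> im2col(const std::vector<float>& img,
                          int C, int H, int W, int K) {
    assert(img.size() == static_cast<std::size_t>(C) * H * W);
    const int OH = H - K + 1, OW = W - K + 1;
    std::vector<float> cols(static_cast<std::size_t>(C) * K * K * OH * OW);
    for (int c = 0; c < C; ++c)
        for (int ky = 0; ky < K; ++ky)
            for (int kx = 0; kx < K; ++kx) {
                const int row = (c * K + ky) * K + kx;  // one row per kernel tap
                for (int oy = 0; oy < OH; ++oy)
                    for (int ox = 0; ox < OW; ++ox)
                        cols[(row * OH + oy) * OW + ox] =
                            img[(c * H + oy + ky) * W + ox + kx];
            }
    return cols;
}

// Convolution as a matrix product: weight (F x C*K*K) times the column
// matrix, plus a per-filter bias. Output layout is N x F x OH x OW.
std::vector<float> conv2d_im2col(const std::vector<float>& input, int N,
                                 int C, int H, int W,
                                 const std::vector<float>& weight,
                                 const std::vector<float>& bias,
                                 int F, int K) {
    const int OH = H - K + 1, OW = W - K + 1;
    const int CKK = C * K * K, P = OH * OW, CHW = C * H * W;
    std::vector<float> out(static_cast<std::size_t>(N) * F * P);
    // Samples are independent, so the outermost loop is the parallel one.
    #pragma omp parallel for
    for (int n = 0; n < N; ++n) {
        const std::vector<float> img(input.begin() + n * CHW,
                                     input.begin() + (n + 1) * CHW);
        const std::vector<float> cols = im2col(img, C, H, W, K);
        for (int f = 0; f < F; ++f)
            for (int p = 0; p < P; ++p) {
                float acc = bias[f];
                for (int r = 0; r < CKK; ++r)
                    acc += weight[f * CKK + r] * cols[r * P + p];
                out[(n * F + f) * P + p] = acc;
            }
    }
    return out;
}
```

Unfolding the input trades extra memory (each pixel is copied up to K*K times) for a single dense matrix product per sample, which is the part a tuned BLAS backend such as MKL can accelerate.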

4. Test procedure

The MNIST test set is used as input, with the following simplified CNN model defined in PyTorch (./python/mnist_cnn_pytorch.py).

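import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # 1 input channel -> 6 feature maps, 5x5 kernel: 28x28 -> 24x24
        self.conv1 = nn.Conv2d(1, 6, 5)
        # 2x2 max pooling with stride 2: 24x24 -> 12x12
        self.pool = nn.MaxPool2d(2, 2)

    def forward(self, x):
        x = x.view(-1, 1, 28, 28)
        x = self.pool(F.relu(self.conv1(x)))
        x = x.view(-1, 6 * 12 * 12)
        return x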
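
For comparison with the PyTorch model, here is one way the same conv, ReLU, max-pool pipeline could be fused into a single pass in C++. It is a sketch under stated assumptions, not the project's fused module: conv_relu_pool is a hypothetical name, the convolution is written as direct loops rather than im2col, stride is 1 with no padding (the nn.Conv2d defaults), and the pooling stride equals the window size P. The point it shows is that the intermediate 24x24 convolution output never has to be stored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fused forward pass: for each pooled output cell, compute the
// P*P convolution values it covers and keep only their maximum.
// Layouts (assumed): input N x C x H x W, weight F x C x K x K, output
// N x F x PH x PW, all row-major.
std::vector<float> conv_relu_pool(const std::vector<float>& input, int N,
                                  int C, int H, int W,
                                  const std::vector<float>& weight,
                                  const std::vector<float>& bias,
                                  int F, int K, int P) {
    assert(input.size() == static_cast<std::size_t>(N) * C * H * W);
    const int OH = H - K + 1, OW = W - K + 1;  // convolution output size
    const int PH = OH / P, PW = OW / P;        // pooled output size
    std::vector<float> out(static_cast<std::size_t>(N) * F * PH * PW);
    #pragma omp parallel for
    for (int n = 0; n < N; ++n)
        for (int f = 0; f < F; ++f)
            for (int py = 0; py < PH; ++py)
                for (int px = 0; px < PW; ++px) {
                    // Starting at 0 folds ReLU into the pooling maximum:
                    // max_i relu(x_i) == max(0, max_i x_i).
                    float best = 0.0f;
                    for (int dy = 0; dy < P; ++dy)
                        for (int dx = 0; dx < P; ++dx) {
                            const int oy = py * P + dy, ox = px * P + dx;
                            float acc = bias[f];
                            for (int c = 0; c < C; ++c)
                                for (int ky = 0; ky < K; ++ky)
                                    for (int kx = 0; kx < K; ++kx)
                                        acc += weight[((f * C + c) * K + ky) * K + kx] *
                                               input[((n * C + c) * H + oy + ky) * W + ox + kx];
                            best = std::max(best, acc);
                        }
                    out[((n * F + f) * PH + py) * PW + px] = best;
                }
    return out;
}
```

Called with N samples of shape 1 x 28 x 28, F = 6, K = 5, and P = 2, this returns 6 * 12 * 12 = 864 values per sample, matching the x.view(-1, 6 * 12 * 12) reshape in the model above.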


Results

1. Test environment

Mac CPU (2 GHz quad-core Intel Core i5)

HPC (2.5 GHz, 24 cores), compiled with ICC optimizations

2. Forward results

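The forward computation from network input to output is called n times and timed to measure program performance. The results are shown in the table below.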

n	Pytorch	serial	fuse	24-core	24-core-fuse	naive-numpy	fuse_numpy
n=1	0.582s	1.042s	0.995s	0.452s	0.431s	332.457s	64.329s
n=10	4.467s	7.357s	7.850s	1.433s	1.254s	-	-
n=100	42.883s	69.868s	77.798s	10.548s	9.938s	-	-
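
A measurement of this kind, calling the forward pass n times and recording the elapsed time, could be taken with a small helper like the one below. This is only a sketch: time_runs is a hypothetical name and the README does not say how main.cpp does its timing. The forward pass is passed in as a callable so the helper does not depend on any particular network code.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: run `forward` n times and return the elapsed
// wall-clock time in seconds.
double time_runs(int n, const std::function<void()>& forward) {
    assert(n >= 0);
    using clock = std::chrono::steady_clock;  // monotonic wall-clock time
    const auto start = clock::now();
    for (int i = 0; i < n; ++i) {
        forward();
    }
    const std::chrono::duration<double> elapsed = clock::now() - start;
    return elapsed.count();
}
```

Wall-clock time is the relevant quantity for the multi-core columns. A CPU-time counter such as std::clock typically adds up the time spent by all threads of the process, which would hide the benefit of running the sample loop in parallel.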

3. Backward results

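The backward computation from network output to input is called n times and timed to measure program performance. The results are shown in the table below.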

n	serial	fuse	24-core	24-core-fuse
n=1	1.082s	0.956s	0.470s	0.363s
n=10	9.861s	9.351s	4.496s	3.478s
n=100	97.247s	90.616s	41.811s	33.760s

