dgschwend / zynqnet

Master Thesis "ZynqNet: An FPGA-Accelerated Embedded Convolutional Neural Network"

License: GNU General Public License v3.0

Makefile 0.05% Tcl 0.74% C 20.70% C++ 28.53% HTML 49.39% Shell 0.01% Objective-C 0.04% MATLAB 0.55%

Contributors

dgschwend

zynqnet's Issues

Zynqnet problem

Dear David,
I created a new project in Vivado HLS and imported the “_HLS_CODE” files as source files. During synthesis, I got the following errors:

Zip/zynqnet-master/_HLS_CODE/unittests.cpp:50:23: error: no member named 'setiosflags' in namespace 'std'
std::cout << std::setiosflags(std::ios::fixed) << std::setprecision(2)
~~~~~^
Zip/zynqnet-master/_HLS_CODE/unittests.cpp:50:60: error: no member named 'setprecision' in namespace 'std'
std::cout << std::setiosflags(std::ios::fixed) << std::setprecision(2)

I would be grateful if you could help me and guide me to a successful synthesis.

Best regards,
Roxana


Roxana Mir
MSc student,
CE department
Science and research branch of IAU
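
A likely cause, offered as an assumption rather than a confirmed diagnosis: `std::setiosflags` and `std::setprecision` are declared in the standard header `<iomanip>`, and a compiler reports this kind of "no member named" error when that header has not been included. The sketch below shows the same formatting expression compiling once `<iomanip>` is present. The helper name `format_fixed2` is hypothetical and not part of the ZynqNet sources.

```cpp
#include <iomanip>   // declares std::setiosflags and std::setprecision
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: formats a value with two decimals, using the
// same manipulators as the failing line in unittests.cpp.
std::string format_fixed2(double value) {
    std::ostringstream out;
    out << std::setiosflags(std::ios::fixed) << std::setprecision(2) << value;
    return out.str();
}
```

If this is indeed the cause, adding `#include <iomanip>` near the top of unittests.cpp should be enough.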

Create the SDK Project

Dear Mr. @dgschwend,
I'm interested in testing your CNN on a Xilinx UltraScale+. I've generated the fpga_top IP with Vivado HLS and used it to create a hardware platform for the ZCU102 part with Vivado.
I've never used Vivado SDK, so I'm stuck right now and don't know how to proceed. Can you give me some advice on how to create an executable with SDK that uses the fpga_top IP? Another question: to create the executable, can I use the same .cpp files of the test bench used in Vivado HLS, to run the same test on the board?

Thanks in advance,
Antonio.

Error when running the Train_Caffenet.sh

Hello,

I am using Caffe and a Tesla K80 GPU to train your model. This is what I get every time I run the training script. Any idea what the issue could be? Do I need more than one GPU to train?

I0529 11:25:57.548988 73168 blocking_queue.cpp:49] Waiting for data
I0529 11:28:21.023334 73174 data_layer.cpp:73] Restarting data prefetching from start.
I0529 11:28:23.880893 73168 solver.cpp:418] Test net output #0: accuracy = 0.000996492
I0529 11:28:23.880945 73168 solver.cpp:418] Test net output #1: accuracy_top5 = 0.00476323
I0529 11:28:23.880959 73168 solver.cpp:418] Test net output #2: loss = 6.93271 (* 1 = 6.93271 loss)
F0529 11:28:24.432736 73168 syncedmem.cpp:71] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
@ 0x2aaaac012e6d (unknown)
@ 0x2aaaac014ced (unknown)
@ 0x2aaaac012a5c (unknown)
@ 0x2aaaac01563e (unknown)
@ 0x2aaaaaf41140 caffe::SyncedMemory::mutable_gpu_data()
@ 0x2aaaaade3382 caffe::Blob<>::mutable_gpu_data()
@ 0x2aaaaaf84ef0 caffe::ConvolutionLayer<>::Forward_gpu()
@ 0x2aaaaaf0dfac caffe::Net<>::ForwardFromTo()
@ 0x2aaaaaf0e387 caffe::Net<>::Forward()
@ 0x2aaaaaf2bc4f caffe::Solver<>::Step()
@ 0x2aaaaaf2c44f caffe::Solver<>::Solve()
@ 0x40a727 train()
@ 0x407ebc main
@ 0x2aaabcc76c05 __libc_start_main
@ 0x408703 (unknown)
./examples/imagenet/train_caffenet.sh: line 5: 73168 Aborted (core dumped) ./build/tools/caffe train --solver=/home-new/aup019/zynqnet/_TRAINED_MODEL/solver.prototxt $@

something about codes

Hello dgschwend,
typedef ap_uint<23> memaddr_t; // must remain <= 23 bits to fit into float
Why must it remain <= 23 bits to fit into a float?
Can I use typedef ap_uint<32> memaddr_t; instead?
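
Some background that may explain the comment: an IEEE 754 single-precision float has a 24-bit significand (23 stored bits plus an implicit leading one), so every integer up to 2^24 is exactly representable, while larger integers in general are not. Assuming the address is carried in a `float` somewhere along the way, which is a guess based only on the code comment, a 32-bit address could be silently rounded. The sketch below checks this in plain C++. `survives_float_roundtrip` is a hypothetical helper, not code from the repository.

```cpp
#include <cstdint>

// Hypothetical helper: returns true if the integer is unchanged
// after being converted to float and back.
bool survives_float_roundtrip(std::uint32_t value) {
    // volatile forces the value through a real 32-bit float
    volatile float as_float = static_cast<float>(value);
    // widen to 64 bits so a float rounded up to 2^32 still converts safely
    return static_cast<std::uint64_t>(as_float) == value;
}
```

With this check, every 23-bit address survives the conversion, while 2^24 + 1 (16777217) does not.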

Minimum Requirements and Details of the board

Hello!
We are trying to implement this project for an image recognition task. Can you please give us the specific details of the boards you used for this implementation, and the minimum FPGA requirements for this project?

Problem with loadbit command

I have a problem with the 'loadbit' instruction from the _RUN.sh script. How do I use this command to program the FPGA?

How to use the log in HLS

Hello, I see that there are many LOG statements in your HLS code. Can I see the log output on the PC when I run the Vivado HLS simulation?

Compressing zynqnet

Hi David,
I am trying to compress your model by removing some of the connections between the neurons so it would fit on my Zynq XC7Z020 FPGA.

I synthesized your model in Vivado HLS and chose my FPGA when selecting the part/board. The screenshot below is a summary of the utilization estimates.
[image: resource utilization]

I would like to get your advice on this. Do you think compressing the model would take care of it? I'm afraid the accuracy would suffer greatly if I reduce the size to that extent.
Any thoughts?

Problem in exporting RTL in VHLS.

Hi, I've passed C simulation and C synthesis, but the C/RTL cosimulation took too long (about five days) to finish. So I dropped the cosimulation and wanted to export the RTL, but now I am stuck at this step. The detailed output from Vivado HLS is below. Could you please help me out? Thank you.

Starting export RTL ...
D:/Xilinx_Vivado_2016.4/Vivado_HLS/2016.4/bin/vivado_hls.bat D:/lishen/lijun/zynqnet-master/zyncnet_ls1/solution1/export.tcl
INFO: [HLS 200-10] Running 'D:/Xilinx_Vivado_2016.4/Vivado_HLS/2016.4/bin/unwrapped/win64.o/vivado_hls.exe'
INFO: [HLS 200-10] For user 'lishen' on host 'amax-pc' (Windows NT_amd64 version 6.1) on Mon Jan 02 20:21:39 +0800 2023
INFO: [HLS 200-10] In directory 'D:/lishen/lijun/zynqnet-master'
INFO: [HLS 200-10] Opening project 'D:/lishen/lijun/zynqnet-master/zyncnet_ls1'.
INFO: [HLS 200-10] Opening solution 'D:/lishen/lijun/zynqnet-master/zyncnet_ls1/solution1'.
INFO: [SYN 201-201] Setting up clock 'default' with a period of 10ns.
INFO: [HLS 200-10] Setting target device to 'xc7z045ffg900-2'
INFO: [IMPL 213-8] Exporting RTL as an IP in IP-XACT.

****** Vivado v2016.4 (64-bit)
**** SW Build 1756540 on Mon Jan 23 19:11:23 MST 2017
**** IP Build 1755317 on Mon Jan 23 20:30:07 MST 2017
** Copyright 1986-2016 Xilinx, Inc. All Rights Reserved.

source run_ippack.tcl -notrace
INFO: [IP_Flow 19-234] Refreshing IP repositories
INFO: [IP_Flow 19-1704] No user IP repositories specified
INFO: [IP_Flow 19-2313] Loaded Vivado IP repository 'D:/Xilinx_Vivado_2016.4/Vivado/2016.4/data/ip'.
WARNING: [IP_Flow 19-4832] The IP name 'fpga_top_ap_fadd_2_full_dsp_32' you have specified is long. The Windows operating system has path length limitations. It is recommended you use shorter names to reduce the likelihood of issues.
INFO: [IP_Flow 19-1686] Generating 'Synthesis' target for IP 'fpga_top_ap_fadd_2_full_dsp_32'...
INFO: [IP_Flow 19-1686] Generating 'Simulation' target for IP 'fpga_top_ap_fadd_2_full_dsp_32'...
WARNING: [IP_Flow 19-4832] The IP name 'fpga_top_ap_fcmp_0_no_dsp_32' you have specified is long. The Windows operating system has path length limitations. It is recommended you use shorter names to reduce the likelihood of issues.
INFO: [IP_Flow 19-1686] Generating 'Synthesis' target for IP 'fpga_top_ap_fcmp_0_no_dsp_32'...
INFO: [IP_Flow 19-1686] Generating 'Simulation' target for IP 'fpga_top_ap_fcmp_0_no_dsp_32'...
WARNING: [IP_Flow 19-4832] The IP name 'fpga_top_ap_fmul_1_max_dsp_32' you have specified is long. The Windows operating system has path length limitations. It is recommended you use shorter names to reduce the likelihood of issues.
INFO: [IP_Flow 19-1686] Generating 'Synthesis' target for IP 'fpga_top_ap_fmul_1_max_dsp_32'...
INFO: [IP_Flow 19-1686] Generating 'Simulation' target for IP 'fpga_top_ap_fmul_1_max_dsp_32'...
bad lexical cast: source type value could not be interpreted as target
while executing
"rdi::set_property core_revision 2301022021 {component component_1}"
invoked from within
"set_property core_revision $Revision $core"
(file "run_ippack.tcl" line 1042)
INFO: [Common 17-206] Exiting Vivado at Mon Jan 02 20:22:12 2023...
ERROR: [IMPL 213-28] Failed to generate IP.
command 'ap_source' returned error code
while executing
"source D:/lishen/lijun/zynqnet-master/zyncnet_ls1/solution1/export.tcl"
invoked from within
"hls::main D:/lishen/lijun/zynqnet-master/zyncnet_ls1/solution1/export.tcl"
("uplevel" body line 1)
invoked from within
"uplevel 1 hls::main {*}$args"
(procedure "hls_proc" line 5)
invoked from within
"hls_proc $argv"
Finished export RTL.

Starting export RTL ...
D:/Xilinx_Vivado_2016.4/Vivado_HLS/2016.4/bin/vivado_hls.bat D:/lishen/lijun/zynqnet-master/zyncnet_ls1/solution1/export.tcl
INFO: [HLS 200-10] Running 'D:/Xilinx_Vivado_2016.4/Vivado_HLS/2016.4/bin/unwrapped/win64.o/vivado_hls.exe'
INFO: [HLS 200-10] For user 'lishen' on host 'amax-pc' (Windows NT_amd64 version 6.1) on Mon Jan 02 20:31:18 +0800 2023
INFO: [HLS 200-10] In directory 'D:/lishen/lijun/zynqnet-master'
INFO: [HLS 200-10] Opening project 'D:/lishen/lijun/zynqnet-master/zyncnet_ls1'.
INFO: [HLS 200-10] Opening solution 'D:/lishen/lijun/zynqnet-master/zyncnet_ls1/solution1'.
INFO: [SYN 201-201] Setting up clock 'default' with a period of 10ns.
INFO: [HLS 200-10] Setting target device to 'xc7z045ffg900-2'
INFO: [IMPL 213-8] Exporting RTL as an IP in IP-XACT.

****** Vivado v2016.4 (64-bit)
**** SW Build 1756540 on Mon Jan 23 19:11:23 MST 2017
**** IP Build 1755317 on Mon Jan 23 20:30:07 MST 2017
** Copyright 1986-2016 Xilinx, Inc. All Rights Reserved.

source run_ippack.tcl -notrace
INFO: [IP_Flow 19-234] Refreshing IP repositories
INFO: [IP_Flow 19-1704] No user IP repositories specified
INFO: [IP_Flow 19-2313] Loaded Vivado IP repository 'D:/Xilinx_Vivado_2016.4/Vivado/2016.4/data/ip'.
WARNING: [IP_Flow 19-4832] The IP name 'fpga_top_ap_fadd_2_full_dsp_32' you have specified is long. The Windows operating system has path length limitations. It is recommended you use shorter names to reduce the likelihood of issues.
INFO: [IP_Flow 19-1686] Generating 'Synthesis' target for IP 'fpga_top_ap_fadd_2_full_dsp_32'...
INFO: [IP_Flow 19-1686] Generating 'Simulation' target for IP 'fpga_top_ap_fadd_2_full_dsp_32'...
WARNING: [IP_Flow 19-4832] The IP name 'fpga_top_ap_fcmp_0_no_dsp_32' you have specified is long. The Windows operating system has path length limitations. It is recommended you use shorter names to reduce the likelihood of issues.
INFO: [IP_Flow 19-1686] Generating 'Synthesis' target for IP 'fpga_top_ap_fcmp_0_no_dsp_32'...
INFO: [IP_Flow 19-1686] Generating 'Simulation' target for IP 'fpga_top_ap_fcmp_0_no_dsp_32'...
WARNING: [IP_Flow 19-4832] The IP name 'fpga_top_ap_fmul_1_max_dsp_32' you have specified is long. The Windows operating system has path length limitations. It is recommended you use shorter names to reduce the likelihood of issues.
INFO: [IP_Flow 19-1686] Generating 'Synthesis' target for IP 'fpga_top_ap_fmul_1_max_dsp_32'...
INFO: [IP_Flow 19-1686] Generating 'Simulation' target for IP 'fpga_top_ap_fmul_1_max_dsp_32'...
bad lexical cast: source type value could not be interpreted as target
while executing
"rdi::set_property core_revision 2301022031 {component component_1}"
invoked from within
"set_property core_revision $Revision $core"
(file "run_ippack.tcl" line 1042)
INFO: [Common 17-206] Exiting Vivado at Mon Jan 02 20:31:32 2023...
ERROR: [IMPL 213-28] Failed to generate IP.
command 'ap_source' returned error code
while executing
"source D:/lishen/lijun/zynqnet-master/zyncnet_ls1/solution1/export.tcl"
invoked from within
"hls::main D:/lishen/lijun/zynqnet-master/zyncnet_ls1/solution1/export.tcl"
("uplevel" body line 1)
invoked from within
"uplevel 1 hls::main {*}$args"
(procedure "hls_proc" line 5)
invoked from within
"hls_proc $argv"
Finished export RTL.
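
One observation about the log, stated as an assumption since the tool's internals are not shown here: the failing value `2301022021` looks like a timestamp in YYMMDDHHMM form (the second run, ten minutes later, produces `2301022031`), and it is larger than 2147483647, the maximum of a signed 32-bit integer. If `core_revision` is parsed as such an integer, a "bad lexical cast" would be expected for any export dated 2022 or later. The sketch below only verifies the arithmetic. `fits_in_int32` is a hypothetical name.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: true if the value can be stored in a signed 32-bit integer.
bool fits_in_int32(std::int64_t value) {
    return value >= std::numeric_limits<std::int32_t>::min() &&
           value <= std::numeric_limits<std::int32_t>::max();
}
```

With this check, 2112312359 (31 Dec 2021, 23:59) still fits, while 2301022021 does not.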

about how to open the project

Hi David.
Thanks for sharing. I am a beginner. I am very interested in your design and I have downloaded the code, but I found that I cannot open this project in Vivado. Could you please share the project file?
Thank you again.
Best wishes,
Shuo Wang

Final caffemodel file

Hi David,
Can I have the final caffemodel of ZynqNet? Your thesis is amazing, but the FPGA version of ZynqNet has not been verified completely. I've run some tests on ZynqNet (FPGA version, running on an OZ745 kit) and there are some interesting results. First, with 5000 images from the ILSVRC2012 validation set, the top-1 accuracy is 54% and top-5 is 77%. Second, I configured my FPGA clock at > 200 MHz (it's easy, just set a system register with the devmem command), but it still takes 1.7 s per image. And last, the results of the FPGA version and the software version (C++ classification, Caffe framework) are not the same for a single image. So I think the caffemodel file you uploaded does not match the file weights.bin.
About the second issue: I've read your code quite carefully, and I think the bottleneck is memory (DRAM) access, so even if you double the clock rate, performance will not double.
Anyway, you've made a great framework, and a lot of people are grateful to you, including me 😄

Testbench with a stream image file in HLS

Hello, I'm currently working on a project built around a ZynqNet-based CNN accelerator, using Vivado HLS. Do you have any advice on creating an IP with a streaming image input, and on running a testbench that simulates the CPU tasks (reading an image and processing the last layer's results) and calls the accelerator IP? Thanks!

configure the clock rate

Hi David, I have read [#2comment], with ug585 as a reference, and configured the clock rate to 200 MHz, but it didn't work. So I tried another way, using the /sys/class/xdevcfg driver to configure the clock rate on the ZC706, based on your file (setup_fpga.txt), which I converted to setup_fpga.sh. When I configure the ZC706, the UART shows this error:
tee: /sys/device/soc0/amba/f8007000.devcfg/fclk_export: I/O erro.
Can you please explain to me how to do that? Thanks a lot.

#!/bin/sh

set -e

write_to () {
    echo $2 | tee $1 > /dev/null
}

set_rate () {
    c=fclk$1
    rate=$2

    d=$(readlink -f "/sys/class/xdevcfg/xdevcfg/device")

    [ -d $d/fclk/$c ] || write_to "$d/fclk_export" $c

    write_to $d/fclk/$c/enable 1
    write_to $d/fclk/$c/set_rate $rate

    echo "Set clock $c to " $(cat $d/fclk/$c/set_rate)
}

# set clock FCLK0 to 125MHz

set_rate 0 200000000

# load Bitstream onto FPGA

loadbit zynq_top.bit

About the Linux Operating System on board

Hi David, I read something about the Linux operating system on the board in your report which I can't understand well.
Quoting: "However, the Zynqbox already runs a fully working and properly tested Linux installation and includes tools to load new bitstreams into the programmable logic. Due to lack of time, we strongly favored reusing this existing installation."
I am wondering how you "load new bitstreams into the PL", because I build a new Linux system with a new boot.bin to reconfigure the PL every time I test a new bitstream after making a change, and that didn't go well. So if you have some way to configure the PL directly from Linux on the board, does that mean any working Linux system can easily test the accelerator with a simple command like "load bitstream"? Can you please explain to me how to do that? Thanks a lot.

Questions about the results of output on FPGA

Hi, I've run ZynqNet on a Xilinx ZC706 successfully. The top-5 result is puzzling me, though.
I've set the AXI HP port to 32 bits. Have you run into this situation before? Thanks a lot.
4

HLS Synthesis Errors & Warnings

Hi,
I am unable to synthesize the accelerator. Synthesis results in the following errors and warnings:

ERROR: [HLS 200-70] Synthesizability check failed.
command 'ap_source' returned error code
while executing
"source /opt/Xilinx/Vivado/2017.4/bin/fpga/solution1/csynth.tcl"
invoked from within
"hls::main /opt/Xilinx/Vivado/2017.4/bin/fpga/solution1/csynth.tcl"
("uplevel" body line 1)
invoked from within
"uplevel 1 hls::main {*}$args"
(procedure "hls_proc" line 5)
invoked from within
"hls_proc $argv" fpga:solution1 Feb 1, 2018 1:28:37 AM

ERROR: [SYNCHK 200-61] ../../../../../home/harry/Downloads/zynqnet-master/_HLS_CODE/memory_controller.cpp:123:
unsupported memory access on variable 'SHARED_DRAM' which is (or contains) an array with unknown size at compile time. fpga:solution1 Feb 1, 2018 1:28:37 AM

Warnings are:
@W[GUI]:"port" is required; "bundle" "input" invalid; "" "" invalid

Please help me solve this.

Underutilization of DSP resources

Hi David,
I tried to reuse your code in some parts of a project. However, I have a problem when I synthesize the code for the FPGA with Vivado HLS. The problem is related to the DSP utilization. I reuse the following code as the dot product in my convolutional engine. However, the DSP utilization is much lower than I expected. Also, I receive this warning message: "WARNING: [SYN 201-303] Cannot apply functional unit assignment of 'FAddSub_fulldsp'".

Could you please help me with that, based on your experience? How can I generally control the resource allocation in Vivado HLS other than through the compiler directives?
Thanks,
Arash

void CAccelerator::vector_dot_product(const data_t in[Tn], const data_t w[Tn], data_t& result)
{
#pragma HLS INLINE

    data_t accumulator = 0.0f;
#pragma HLS RESOURCE variable=accumulator core=FAddSub_fulldsp

    data_t multresult[Tn];
#pragma HLS ARRAY_PARTITION variable = multresult complete dim = 0
#pragma HLS RESOURCE variable=multresult core=FAddSub_fulldsp

    for(int h=0; h<Tn; h++)
#pragma HLS UNROLL factor=Tn
        multresult[h]=in[h] +w[h];

    for(int h=0; h<Tn; h++)
#pragma HLS UNROLL factor=Tn
        accumulator = accumulator + multresult[h];

    result = accumulator;
}
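
For checking results in a C testbench, a plain C++ reference of the dot product can help. Note that the first loop in the posted code adds `in[h]` and `w[h]`, whereas a dot product multiplies the elements before accumulating, so the sketch below uses a multiplication. It assumes `data_t` is `float` and `Tn` is 4, both hypothetical choices made for illustration. It contains no HLS pragmas, so it says nothing about DSP allocation.

```cpp
#include <cstddef>

typedef float data_t;          // assumption: data_t is a 32-bit float
const std::size_t Tn = 4;      // assumption: illustrative vector length

// Reference dot product: multiply element-wise, then accumulate.
data_t reference_dot_product(const data_t in[Tn], const data_t w[Tn]) {
    data_t accumulator = 0.0f;
    for (std::size_t h = 0; h < Tn; h++) {
        accumulator += in[h] * w[h];
    }
    return accumulator;
}
```

Comparing the output of the synthesizable function against such a reference in the testbench would show whether the element-wise operation is the intended one.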

Problem in running C simulation.

Hi, I've created a project in Vivado HLS 2016.4 for zynqnet and started "Run C Simulation". The console output stops at "Offload Conv Layer c10/p1:.....", then it says "Program received signal" and the debugger stops there.

input image for simulation

Hello, I'm currently working on your project and I want to know which kind of image you used for your simulation.
Also, I'm trying to convert a jpg image with the Python code provided, but that doesn't work.
this is the line code " convert_image("‪f:\telecharge.jpg",225,225,"‪C:\Users\Asus\Desktop") "
Is there an error in it?
Thanks
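
One thing worth checking in that call, offered as a guess: in a normal Python string literal, as in a C++ one, `\t` is the escape sequence for a tab character, so the literal `"f:\telecharge.jpg"` does not contain a backslash followed by `t`. The C++ sketch below shows the effect on the same literal. In Python, the usual remedies are a raw string (`r"f:\telecharge.jpg"`), doubled backslashes, or forward slashes. The function names are hypothetical.

```cpp
#include <string>

// The literal as written in the post: "\t" is parsed as one tab character.
std::string path_as_written() {
    return "f:\telecharge.jpg";
}

// The intended path: the backslash is escaped, so it stays a backslash.
std::string path_escaped() {
    return "f:\\telecharge.jpg";
}
```

The first string is one character shorter than the second, and its third character is a tab rather than a backslash.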

HLS malloc function question

Hi David,

Below is the screenshot of the HLS runs. Please check that the files in TestBench and src are placed correctly. When I tried C/RTL cosimulation, I got the error message attached below.
image

Thanks

about _FIRMWARE

Dear @dgschwend,
I used HLS to generate the FPGA_TOP IP core and built the Vivado block design to generate the zynqnet_200M.bit file. But _FIRMWARE confuses me.

  1. If I use SDK (cross-compile), I copy all files (such as fpga_top.cpp ... cpu_top.cpp, except the folders vivado_include and ZynqNet_Accelerator_HW_DEF) to SDK. Is that right? But fpga_top.cpp is written in HLS C++ style. Can SDK compile it, or does SDK understand HLS C++ style?
  2. If I use Linux running on the Zynq XC7Z045 (compiling directly on the Zynq), I copy all the files to that Linux system and then run the Makefile. Is that right?
  3. What are the folders vivado_include and ZynqNet_Accelerator_HW_DEF used for?
    Thank you

"OPMODE Input Warning" Discussion

I have tried to fix the "OPMODE Input Warning". This warning seems to come from "fpga_top_fadd_32ng8j.v" alone, which adds two 32-bit floating-point values using DSP48E. Other files, including fixed-point ADD/MUL and floating-point MUL/CMP, do not trigger this warning. I verified this conclusion by varying the data type and function in the Xilinx tutorial "ug871-design-files/Using_IP_with_Zynq/lab1/hls_macc".

void hls_macc(int a, int b, int *accum, bool accum_clr)
{
    static int acc_reg = 0;  // type was missing in the post; int assumed from the int interface
    if (accum_clr)
        acc_reg = 0;
    acc_reg += a * b;
    *accum = acc_reg;
}

It seems that replacing the floating-point ADD with a fixed-point ADD is the only way to avoid this warning. But that's really inelegant, and I still don't know the reason. Could anyone give me more tips?

DEBUGGING the Code

Hi David

I hope you are fine. I would like to ask whether we can compile and debug your code with a C++ compiler other than Vivado HLS. I really want to debug your code to understand it fully. Thanks. Waiting for your reply.

Khan
Master thesis student at Bosch, Renningen.

Can zynqnet run on xilinx zcu102?

Hi all, our lab only has a Xilinx ZCU102 SoC board and I'm new to FPGA development. I want to verify zynqnet on the ZCU102, so I have to port it from the 7045 to the ZCU102. Are there any docs or other projects to help me run zynqnet on the ZCU102? Thank you all.

Another ZYNQ board

Hi, David. I am interested in your zynqnet. It's really great work. Thanks for sharing. I learnt a lot from your report and code. So I have been trying to test your CNN on my own ZedBoard these days and have run into 2 problems.

  1. I imported your HLS code files into my HLS project according to one of your previous replies to @wswsamao and synthesised successfully. But I got these errors while running C simulation, as shown in this picture:

I am not very familiar with HLS. Could you please help me figure out why this occurred? Thank you. And this is a screenshot of my project, so you can check whether I made some basic mistake as a new learner of HLS.

  2. I checked the "Evaluation and Results" in your report and found that the resource utilization exceeds what my ZedBoard offers. You used 996 Block RAMs for your CNN design, but I only have 500 on my board.
    So it seems I have to simplify your zynqnet in order to test it on my board. I know of course that this will hurt the performance of your CNN, but I really want to make it work on my board. So could you give me some advice on how to shrink zynqnet without hurting the performance too much? Thanks. Looking forward to your reply.

about device and usage

Hi @dgschwend! I am now learning about FPGA-based CNNs, and I was lucky to find your work here. I want to know whether it supports my FPGA. My FPGA is a Xilinx Zynq UltraScale+ MPSoC ZCU102.
If possible, could you give me the steps to implement it on the FPGA?
I am a beginner, so I am afraid I need more details.

Best regards!!

_FIRMWARE problem

Hi dgschwend,
I'm trying to port zynqnet to a ZC706 board.
I've successfully generated the HLS IP using the code under _HLS_CODE, created a BD project and generated a ZC706 bitstream in Vivado. Next I created an SDK project and imported the sources from the _FIRMWARE folder.
But when I add all the source files under the _FIRMWARE folder to Xilinx SDK for embedded ARM development, I get lots of errors. Should source files such as processing_element.cpp, which are meant for generating logic, be added to the SDK project?

Thanks!

Is DMA needed?

Hi, could you please tell me where I can find your block design diagram? Can we just connect the m_axi from the PL to the HP port on the PS? Would that work, or do we need a DMA between them? Thanks in advance!

Conversion to System C

Hi David. I hope you are fine and doing well.

I am a beginner in SystemC. Can you please help me with how to convert your ZynqNet C++ code into SystemC? Can you give me a few tips? I would be very thankful for that. Waiting for your reply. Thanks

Best Regards
Wasim Khan

axi_hp config to 32 bits

Hi Gschwend David,
I tried to execute the command axi_hp_config 0 32 on the Zynqbox, as described in _RUN.sh, but it says that it doesn't know this command. How can I configure the AXI port to be 32 bits? I am not able to run the network using the FPGA, only using the CPU.

Thank you

How to get weights.bin

Hi David,
Thank you so much for your ZynqNet and your thesis! It helps me a lot to learn CNN~^o^~
I'm interested in your framework, and I have implemented your network on an FPGA using the weights you provided (weights.bin). Now I want to try some other networks, e.g. VGG, so how can I convert the original .caffemodel file to the .bin file that your code can read? I assume you did something to transform snapshot_iter_300280.caffemodel into weights.bin.

Thanks,

Input Image format

Submitted by @divyapraneetha:

Hi David,

I am working on a similar project. Modifying the network according to the FPGA resources I have. I would like to know the input image format being followed.

I have gone through converting image files to lmdb sets, but the code you have implemented takes a .bin image.

I would like to know how you defined the input format.

Any help with this is highly appreciated.

Thanks
Divya

Having some questions about files in _HLS_CODE and _FIRMWARE

Hi David,
I am using Vivado HLS for the first time and have some simple questions. Is the cpu_top.cpp file in the _HLS_CODE folder not the testbench? Are the files in the _FIRMWARE folder compiled with SDK? And does the application run on the board with or without Linux? I look forward to your help. Thanks.

How to debug/run the C project “zynqnet_sdk” in Xilinx SDK?

Hi, I've finished all the steps in HLS and Vivado. But when I launch Xilinx SDK and start to debug the C project "zynqnet_sdk", which has a main() in cpu_top.cpp, Xilinx SDK tells me "Launch Failed. Binary not found". Have you seen this problem before? My project explorer info is as follows.

Image Pixel Values Range

Hi David

I want to ask: if I convert the hexadecimal values from the "indata.bin" file into the corresponding floating-point values, what range of values should I expect? At the moment I am getting both negative and positive pixel values. Is there a defined range of pixel values, e.g. (0, 255) or (-255, 255)? Hoping for an early response. Thanks

Best Regards
Khan
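
A quick way to see the actual range is to reinterpret the file contents as 32-bit floats and track the minimum and maximum. The sketch below assumes `indata.bin` is a headerless array of `float` values in the host's byte order, which is an assumption and not confirmed by the repository. `float_range` and `FloatRange` are hypothetical names. Negative values would be expected if a mean value is subtracted during preprocessing, which is common for Caffe models but also unverified here.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <limits>
#include <vector>

struct FloatRange {
    float min;
    float max;
};

// Interprets raw bytes as consecutive floats (host byte order assumed)
// and returns the smallest and largest value found.
FloatRange float_range(const std::vector<unsigned char>& bytes) {
    FloatRange range = {std::numeric_limits<float>::infinity(),
                        -std::numeric_limits<float>::infinity()};
    for (std::size_t i = 0; i + sizeof(float) <= bytes.size(); i += sizeof(float)) {
        float value;
        std::memcpy(&value, &bytes[i], sizeof(float));  // avoids aliasing issues
        range.min = std::min(range.min, value);
        range.max = std::max(range.max, value);
    }
    return range;
}
```

The byte vector can be filled from the file with `std::ifstream` opened in binary mode.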

TestBench Result: FAILURE (on FPGA)

When I run the project, I find that the top-5 result changes from run to run. Why? I also find that timing is not met when the IP core runs at 200 MHz: the slack is -3 ns.

the first:
Result (top-5):

31.17%: class 238 (output  10.26)
26.41%: class 264 (output  10.10)
11.45%: class 156 (output   9.26)
 3.58%: class 232 (output   8.10)
 2.90%: class 218 (output   7.89)

TestBench Result: FAILURE
Actual: 31.17, Expected: 88.38

the second:
Result (top-5):

 0.75%: class 346 (output   2.61)
 0.49%: class 711 (output   2.19)
 0.49%: class   9 (output   2.19)
 0.47%: class 727 (output   2.14)
 0.47%: class  25 (output   2.14)

TestBench Result: FAILURE
Actual: 0.75, Expected: 88.38
the third:
Result (top-5):

15.67%: class 148 (output   6.63)
 9.07%: class 548 (output   6.09)
 4.59%: class 598 (output   5.41)
 2.51%: class 782 (output   4.80)
 2.03%: class 904 (output   4.59)

TestBench Result: FAILURE
Actual: 15.67, Expected: 88.38
root@localhost:~/tmp#

Problems about C simulation and ZYNQ Board

Hi, David. I am interested in your zynqnet. It's really great work. Thanks for sharing. I learnt a lot from your report and code. So I have been trying to test your CNN on my own ZedBoard these days and have run into 2 problems.

  1. I imported your HLS code files into my HLS project according to one of your previous replies to @wswsamao and synthesised successfully. But I got these errors while running C simulation:

FPGA: Computing .....@e Simulation failed: SIGSEGV.
@e [SIM-1] CSim failed with errors.
4
while executing
"source D:/Users/Administrator/zynqnet/solution1/csim.tcl"
invoked from within
"hls::main D:/Users/Administrator/zynqnet/solution1/csim.tcl"
("uplevel" body line 1)
invoked from within
"uplevel 1 hls::main {*}$args"
(procedure "hls_proc" line 5)
invoked from within
"hls_proc $argv"

I am not very familiar with HLS. Could you please help me figure out why this occurred? Thank you.

  2. I checked the "Evaluation and Results" in your report and found that the resource utilization exceeds what my ZedBoard offers. You used 996 Block RAMs for your CNN design, but I only have 500 on my board.
    So it seems I have to simplify your zynqnet in order to test it on my board. I know of course that this will hurt the performance of your CNN, but I really want to make it work on my board. So could you give me some advice on how to shrink zynqnet without hurting the performance too much? Thanks. Looking forward to your reply.

Convert jpeg to binary

Hi :)
I've got some issues. Would you check them for me?
I got a different result from the example you uploaded (indata.bin) when converting puppy-500x350.jpg to test.bin.
After converting puppy-500x350.jpg, I compared test.bin with indata.bin, except for the noise padding parts, and I used half-fill, half-crop by uncommenting your code. Unfortunately, they don't have exactly the same values.
Here are my results.
[Note: black pixels mean different values and white pixels mean equal values]
image

I worry that I would get wrong binary files this way when converting other jpg/jpeg files.
In fact, test.bin gave a lower accuracy of around 75% in CPU simulation, whereas the original file (indata.bin) reaches around 88%.

Below the image shows the result of CPU simulation.
[Note. Left one uses indata.bin Right one uses test.bin]
image

Also, I attached files containing the Python code I rewrote (I just removed comments and imported matplotlib to display images), indata.bin and indata-puppy-rgb.bin (test.bin). The binary files were converted with Python 2.7.
suzinee_issue.zip

Could you tell me why I couldn't get the same result as yours?
And happy new year!

What is the pipelining directive in the reg(T x) template for?

template<class T> T reg(T x) {
#pragma HLS pipeline
#pragma HLS inline self off
#pragma HLS interface ap_ctrl_none register port=return
return x;
}

reg(T x) is used as a register to store the SHARED_DRAM data. Why does it need the pipeline directive? Thx.

How to run the project on FPGA?

Hi dgschwend, thanks for sharing. I am interested in the project and am trying to run it on a Xilinx ZC706.
I have successfully run the HLS synthesis and created the bitstream. Then I ran the Makefile in the folder '_FIRMWARE' and built the test executable. It runs perfectly on PC Linux with ./test CPU indata.bin.
Then I modified the IP core for the ZC706 and created the bitstream successfully.
However, when I copy all the files and my bitstream to Linux on the ZC706 and run ./test FPGA indata.bin, the ZC706 hangs and I have to reboot it.
May I ask how you run the project on the FPGA (SDSoC, SDK or something else)? Any suggestions about the problem?

Latency and power consumption of each function

Hi David,
I'm trying to determine the latency and power consumption of each function, such as ImageCache::preloadPixelFromDRAM, WeightsCache::loadFromDRAM, etc., when it runs on the FPGA. I tried to synthesize these functions in Vivado HLS to determine the latency, but the report shows a latency of 0. I think that isn't correct. Do you have any idea what the problem is? Thank you so much.

Problem in generating bitstream with Vivado SDSOC

Dear David,

I'm a self-driving car researcher at the University of Modena. I'm very interested in your work and I want to test ZynqNet on a Zynq UltraScale.
I have tested the net in Vivado HLS simulation and it works perfectly, but when I try to synthesize fpga_top in hardware with Vivado SDSoC, it returns some DMAnalysis errors, in particular:

ERROR: [DMAnalysis 83-1305] Trying to overwrite port 'layer_i' on CF block 'hwblk_fpga_top'
ERROR: [DMAnalysis 83-1305] portType = stream, expected stream
ERROR: [DMAnalysis 83-1305] direction = in, expected in
ERROR: [DMAnalysis 83-1305] mode = slave, expected master
[The four lines above repeat 12 more times.]
ERROR: [DMAnalysis 83-1369] NULL destination port 0x55fc970
[The line above repeats 12 more times.]
ERROR: [DMAnalysis 83-1378] NULL --> s_axi_axilite offset:0x10
ERROR: [DMAnalysis 83-1332] CF data model: port map [2] error. Block: hwblk_fpga_top, Comp: fpga_top_1
ERROR: [DMAnalysis 83-4447] Failed creating data motion network hardware!

I have tried to remove the slave HLS pragmas under the fpga_top declaration in fpga_top.cpp, but I get this port mapping error:

ERROR: [HSL2XD 83-101] The HLS function 'fpga_top' has an invalid port mapping. When any argument is mapped onto an axilite interface, the return value must be mapped to the same interface, e.g., with #pragma HLS interface s_axilite port=return bundle=s_axi_AXILiteS

Could you give me some help with these errors? I'm new to FPGA programming.

Best Regards,
Gianluca
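
The HSL2XD message above asks for the return value to be mapped onto the same AXI-Lite interface as the arguments. The sketch below only illustrates that pattern on a hypothetical function, top_sketch, which is not the real fpga_top. The bundle name is copied from the error message. Whether this also resolves the DMAnalysis errors is untested. SDSoC normally generates interface directives itself, so manually written pragmas may be part of the conflict.

```cpp
// Hypothetical top-level function, NOT the real fpga_top.
// Every argument and the return value share one AXI-Lite bundle,
// which is what the HSL2XD message asks for.
void top_sketch(int scale, int *mem) {
#pragma HLS interface m_axi port=mem offset=slave depth=16
#pragma HLS interface s_axilite port=scale bundle=s_axi_AXILiteS
#pragma HLS interface s_axilite port=mem bundle=s_axi_AXILiteS
#pragma HLS interface s_axilite port=return bundle=s_axi_AXILiteS
    // Trivial body so the function can be exercised in C simulation.
    mem[0] = mem[0] * scale;
}
```

An ordinary C++ compiler ignores these pragmas, so the function can be checked in software first; the interface mapping itself only takes effect in the HLS flow.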

ZynqNet on Xilinx XC7Z020

Hello David,

I am interested in testing your network on the XC7Z020 SoC, which is smaller than the XC7Z045 you were using. I can see that the main issue is BRAM utilization.
Do you have an idea how this could work? Are there any changes we could try, or do we need to redesign it completely?

Edit: we have already tried reducing the number of processing elements. Even with 1 the BRAM is over-utilized. We can see that most BRAMs are used by WeightsCache.WBRAM.

Thank you!

input_offset and weight offsets

Hi David,

I'm new to both deep learning and HLS, and I'm trying to understand your code. In the following function,

void fpga_top(layer_t layer, data_t *SHARED_DRAM, unsigned int weights_offset, weightaddr_t num_weights, unsigned int input_offset);
what are "weights_offset" and "input_offset", as used in the fpga_top.hpp file?
Thank you in advance.
Yashwant
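
Judging only from the parameter names and the single SHARED_DRAM pointer in the signature, one plausible reading is that weights and input data live in the same memory buffer and the two offsets say where each region starts. The sketch below illustrates that idea under loudly stated assumptions: data_t is taken to be a 32-bit float, the offsets are taken to be byte offsets, and SharedView and make_view are hypothetical names that do not appear in the project.

```cpp
typedef float data_t;  // ASSUMPTION: data_t is a 32-bit float

// Hypothetical view of one shared buffer split into two regions.
struct SharedView {
    const data_t *weights;
    const data_t *input;
};

// Turn offsets into typed pointers.
// ASSUMPTION: offsets are in bytes and are multiples of sizeof(data_t);
// the real code may count elements instead.
SharedView make_view(const data_t *shared_dram,
                     unsigned int weights_offset,
                     unsigned int input_offset) {
    SharedView v;
    v.weights = shared_dram + weights_offset / sizeof(data_t);
    v.input = shared_dram + input_offset / sizeof(data_t);
    return v;
}
```

With an 8-element buffer, make_view(buf, 2 * sizeof(data_t), 5 * sizeof(data_t)) would make weights point at buf[2] and input point at buf[5].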
