stm32ai-perf's Introduction

MLPerf™ Tiny Deep Learning Benchmarks for STM32 devices

The goal of MLPerf™ Tiny is to provide a representative set of deep neural nets and benchmarking code to compare performance between embedded devices. These devices typically run at between 10 MHz and 250 MHz and can perform inference using less than 50 mW of power.

For this benchmark we suggest three boards: NUCLEO-H7A3ZI-Q, NUCLEO-U575ZI-Q and NUCLEO-L4R5ZI.

This guide explains how to generate the benchmark projects for these boards.

Before You Start

You need to download and install the following software:

You also need to download the following pretrained quantized models from the MLCommons™ GitHub repository:

NUCLEO-H7A3ZI-Q Projects

This section explains how to generate the benchmark projects for the NUCLEO-H7A3ZI-Q board. It uses the Image Classification benchmark project as an example; repeat exactly the same steps for the other scenarios (Anomaly Detection, Keyword Spotting and Person Detection).

Generate the project using STM32CubeMX

1. Download the Image Classification model and then save it under

...\stm32ai-perf\STM32_H7A3ZI\image_classification

You should have something like this to start: plot

2. Open the .ioc file and follow the next steps to generate your project template:

Alt Text

Configure and build the project using STM32CubeIDE

After generating the project using STM32CubeMX you should have something like this in your workspace:

plot

1. Open the .cproject file and follow the next steps to configure your project:

Alt Text

2. Modify the main.c file:

2.1 Open main.c located under Project Explorer:

plot

2.2 Under Private includes add the following:

/* Private includes ----------------------------------------------------------*/
/* USER CODE BEGIN Includes */
#include "submitter_implemented.h"
/* USER CODE END Includes */

2.3 Add the following line to USER CODE Section 2:

  /* USER CODE BEGIN 2 */

  ee_benchmark_initialize();

  /* USER CODE END 2 */
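
To make the placement concrete, the following host-side sketch mimics the start of a generated main() and records the call order. The stub bodies and the `startup_sequence` wrapper are assumptions made for illustration only; on the target, the real HAL functions and the project's `ee_benchmark_initialize()` are used instead.

```c
/* Stubs standing in for the CubeMX-generated and benchmark functions so the
 * sketch compiles on a host. They only record the order in which they run. */
static int step = 0;
static int hal_step, clock_step, bench_step;

static void HAL_Init(void)                { hal_step = ++step; }
static void SystemClock_Config(void)      { clock_step = ++step; }
static void ee_benchmark_initialize(void) { bench_step = ++step; }

/* Hypothetical wrapper mirroring the beginning of a generated main():
 * the benchmark hook in USER CODE section 2 runs after HAL and clock setup. */
static void startup_sequence(void)
{
    HAL_Init();
    SystemClock_Config();
    /* ... MX_*_Init() peripheral setup generated by STM32CubeMX ... */

    /* USER CODE BEGIN 2 */
    ee_benchmark_initialize();
    /* USER CODE END 2 */

    /* The generated while (1) loop would follow here. */
}
```

Keeping the call inside the USER CODE markers matters because STM32CubeMX preserves only the text between those markers when the project is regenerated.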

2. Modify the stm32h7xx_it.c file:

2.1 Open stm32h7xx_it.c located under Project Explorer:

plot

2.2 Delete any initial code in the file located after the following lines:

/******************************************************************************/
/* STM32H7xx Peripheral Interrupt Handlers                                    */
/* Add here the Interrupt Handlers for the used peripherals.                  */
/* For the available peripheral interrupt handler names,                      */
/* please refer to the startup file (startup_stm32h7xx.s).                    */
/******************************************************************************/

2.3 Configure the TCM memory on the STM32H7A3:
In the project, open the linker script STM32H7A3ZITXQ_FLASH.ld and make the following changes:
- Specify the memory areas:

/* Specify the memory areas */
MEMORY
{
  ITCMRAM (xrw)  : ORIGIN = 0x00000000, LENGTH = 64K
  FLASH (rx)     : ORIGIN = 0x08000000, LENGTH = 2048K
  DTCMRAM1 (xrw) : ORIGIN = 0x20000000, LENGTH = 128K
  /*DTCMRAM2 (xrw) : ORIGIN = 0x20010000, LENGTH = 64K*/
  RAM (xrw)      : ORIGIN = 0x24000000, LENGTH = 1024K
  RAM_CD (xrw)   : ORIGIN = 0x30000000, LENGTH = 128K
  RAM_SRD (xrw)  : ORIGIN = 0x38000000, LENGTH = 32K
}


- Place the data sections in DTCMRAM1, as shown here for .bss:

  /* Uninitialized data section */
  . = ALIGN(4);
  .bss :
  {
    /* This is used by the startup in order to initialize the .bss section */
    _sbss = .;         /* define a global symbol at bss start */
    __bss_start__ = _sbss;
    *(.bss)
    *(.bss*)
    *(COMMON)

    . = ALIGN(4);
    _ebss = .;         /* define a global symbol at bss end */
    __bss_end__ = _ebss;
  } >DTCMRAM1
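
One way to confirm that the linker change took effect is to check that an address range falls inside the DTCMRAM1 region. The sketch below is a minimal example under that assumption: the origin and length are copied from the MEMORY block above, while the helper name `in_dtcmram1` is hypothetical and not part of the project.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the DTCMRAM1 line of the MEMORY block. */
#define DTCMRAM1_ORIGIN 0x20000000u
#define DTCMRAM1_LENGTH (128u * 1024u)

/* Hypothetical helper: true if [addr, addr + size) lies entirely in DTCMRAM1. */
static bool in_dtcmram1(uintptr_t addr, size_t size)
{
    const uintptr_t start = DTCMRAM1_ORIGIN;
    const uintptr_t end = start + DTCMRAM1_LENGTH; /* one past the last byte */

    if (addr < start || addr > end) {
        return false;
    }
    /* Compare against the remaining space to avoid overflow in addr + size. */
    return size <= (size_t)(end - addr);
}
```

On the target, the `_sbss` and `_ebss` symbols defined in the linker script can be declared as `extern uint32_t` and their addresses passed to the helper; the addresses listed in the generated .map file give the same information without running any code.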
 


3. Build the project in Release mode:

Set the project to Release mode as shown below, then click Build:

plot

Flash the board using STM32CubeProg

Next, we will demonstrate how to flash your NUCLEO-H7A3ZI-Q board with the binary file you generated in the previous step.

1. Verify the board configuration:

Make sure that your board has the following jumpers fitted:

plot

2. Flash the board:

Connect the board to the laptop using a USB cable and use STM32CubeProg to flash the board as follows:

plot

Energy mode setup

For energy mode the EE_CFG_ENERGY_MODE flag should be set to 1.

plot

You need to make sure that the MCU_RST and the MCU_VDD jumpers are not fitted.

The final setup should look like the following:

plot

Make sure to push the Reset Button (the Black Button) on the board before starting the benchmark software.

For more details about the energy benchmark and EEMBC's EnergyRunner™ benchmark framework, please refer to this link.

Performance mode setup

For Performance mode the EE_CFG_ENERGY_MODE flag should be set to 0.

plot

Connect the board to the laptop using a USB cable and make sure that the MCU_RST and the MCU_VDD jumpers are fitted.

Make sure to push the Reset Button (the Black Button) on the board before starting the benchmark software.

For more details about the performance mode and EEMBC's EnergyRunner™ benchmark framework, please refer to this link.
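
As a rough illustration of what a compile-time flag such as EE_CFG_ENERGY_MODE typically selects, the sketch below switches between two configurations. The `bench_mode_t` type, the `bench_mode` function and the specific values (a 9600 baud link with GPIO timestamps in energy mode, a 115200 baud link with timer timestamps in performance mode) are assumptions based on the usual EnergyRunner conventions, not code taken from this project; check them against the project sources.

```c
#include <stdint.h>

/* 1 = energy mode, 0 = performance mode. */
#ifndef EE_CFG_ENERGY_MODE
#define EE_CFG_ENERGY_MODE 0
#endif

/* Hypothetical description of what the flag selects; not the project's types. */
typedef struct {
    uint32_t uart_baud;      /* speed of the serial link to the host */
    int timestamp_via_gpio;  /* 1: pulse a pin for the energy monitor,
                                0: report a timer value over the serial link */
} bench_mode_t;

static bench_mode_t bench_mode(void)
{
#if EE_CFG_ENERGY_MODE == 1
    /* Assumed convention: slow UART, timestamps signalled on a GPIO pin. */
    return (bench_mode_t){ .uart_baud = 9600u, .timestamp_via_gpio = 1 };
#else
    /* Assumed convention: fast virtual COM port, timestamps from a timer. */
    return (bench_mode_t){ .uart_baud = 115200u, .timestamp_via_gpio = 0 };
#endif
}
```

Because the flag is evaluated by the preprocessor, the project has to be rebuilt and the board reflashed each time you switch between the two modes.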

NUCLEO-U575ZI-Q Projects

This section explains how to generate the benchmark projects for the NUCLEO-U575ZI-Q board. It uses the Anomaly Detection benchmark project as an example; repeat exactly the same steps for the other scenarios (Image Classification, Keyword Spotting and Person Detection).

Generate the project using STM32CubeMX

1. Download the Anomaly Detection model and then save it under

...\stm32ai-perf\STM32_U575ZI\anomaly_detection

You should have something like this to start:

plot

2. Open the .ioc file and follow the next steps to generate your project template:

Alt Text

Configure and build the project using STM32CubeIDE

After generating the project using STM32CubeMX you should have something like this in your workspace:

plot

1. Open the .cproject file and follow the next steps to configure your project:

Alt Text

2. Modify the main.c file:

2.1 Open main.c located under Project Explorer:

plot

2.2 Under Private includes add the following:

/* Private includes ----------------------------------------------------------*/
/* USER CODE BEGIN Includes */
#include "submitter_implemented.h"
/* USER CODE END Includes */

2.3 Add the following line to USER CODE Section 2:

  /* USER CODE BEGIN 2 */

  ee_benchmark_initialize();

  /* USER CODE END 2 */

2. Modify the stm32u5xx_it.c file:

2.1 Open stm32u5xx_it.c located under Project Explorer:

plot

2.2 Delete any initial code in the file located after the following lines:

/******************************************************************************/
/* STM32U5xx Peripheral Interrupt Handlers                                    */
/* Add here the Interrupt Handlers for the used peripherals.                  */
/* For the available peripheral interrupt handler names,                      */
/* please refer to the startup file (startup_stm32u5xx.s).                    */
/******************************************************************************/

3. Build the project in Release mode:

Set the project to Release mode as shown below, then click Build:

plot

Flash the board using STM32CubeProg

Next, we will demonstrate how to flash your NUCLEO-U575ZI-Q board with the binary file you generated in the previous step.

1. Verify the board configuration:

Make sure that your board has the following jumpers fitted:

plot

2. Flash the board:

Connect the board to the laptop using a USB cable and use STM32CubeProg to flash the board as follows:

plot

Energy mode setup

For energy mode the EE_CFG_ENERGY_MODE flag should be set to 1.

plot

You need to make sure that the T_NRST and the IDD jumpers are not fitted.

The final setup should look like the following:

plot

Make sure to push the Reset Button (the Black Button) on the board before starting the benchmark software.

For more details about the energy benchmark and EEMBC's EnergyRunner™ benchmark framework, please refer to this link.

Performance mode setup

For Performance mode the EE_CFG_ENERGY_MODE flag should be set to 0.

plot

Connect the board to the laptop using a USB cable and make sure that the T_NRST and the IDD jumpers are fitted.

Make sure to push the Reset Button (the Black Button) on the board before starting the benchmark software.

For more details about the performance mode and EEMBC's EnergyRunner™ benchmark framework, please refer to this link.

NUCLEO-L4R5ZI Projects

This section explains how to generate the benchmark projects for the NUCLEO-L4R5ZI board. It uses the Person Detection benchmark project as an example; repeat exactly the same steps for the other scenarios (Anomaly Detection, Keyword Spotting and Image Classification).

Generate the project using STM32CubeMX

1. Download the Person Detection model and then save it under

...\stm32ai-perf\STM32_L4R5ZI\person_detection

You should have something like this to start: plot

2. Open the .ioc file and follow the next steps to generate your project template:

Alt Text

Configure and build the project using STM32CubeIDE

After generating the project using STM32CubeMX you should have something like this in your workspace:

plot

1. Open the .cproject file and follow the next steps to configure your project:

Alt Text

2. Modify the main.c file:

2.1 Open main.c located under Project Explorer:

plot

2.2 Under Private includes add the following:

/* Private includes ----------------------------------------------------------*/
/* USER CODE BEGIN Includes */
#include "submitter_implemented.h"
/* USER CODE END Includes */

2.3 Add the following line to USER CODE Section 2:

  /* USER CODE BEGIN 2 */

  ee_benchmark_initialize();

  /* USER CODE END 2 */

2. Modify the stm32l4xx_it.c file:

2.1 Open stm32l4xx_it.c located under Project Explorer:

plot

2.2 Delete any initial code in the file located after the following lines:

/******************************************************************************/
/* STM32L4xx Peripheral Interrupt Handlers                                    */
/* Add here the Interrupt Handlers for the used peripherals.                  */
/* For the available peripheral interrupt handler names,                      */
/* please refer to the startup file (startup_stm32l4xx.s).                    */
/******************************************************************************/

3. Build the project in Release mode:

Set the project to Release mode as shown below, then click Build:

plot

Flash the board using STM32CubeProg

Next, we will demonstrate how to flash your NUCLEO-L4R5ZI board with the binary file you generated in the previous step.

1. Verify the board configuration:

Make sure that your board has the following jumpers fitted:

plot

2. Flash the board:

Connect the board to the laptop using a USB cable. Before flashing, make sure that the board is in single-bank mode: in STM32CubeProg, go to Option bytes, then User Configuration, and leave the DBANK bit unchecked. Then flash the board as follows:

plot
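
The single-bank setting can also be checked from firmware by decoding the flash option register. The sketch below is a minimal example under stated assumptions: the DBANK bit is assumed to sit at bit 22 of the option register, which should be verified against the reference manual and the CMSIS device header for your part, and the helper name `flash_is_single_bank` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the DBANK bit in the flash option register (OPTR).
 * Verify this against the reference manual for your device. */
#define OPTR_DBANK_BIT (1u << 22)

/* Hypothetical helper: true when the option register value selects
 * single-bank mode, that is, when the DBANK bit is cleared. */
static bool flash_is_single_bank(uint32_t optr)
{
    return (optr & OPTR_DBANK_BIT) == 0u;
}
```

On the target, the value to pass in would be read from the option register exposed by the device header; the function itself is pure so it can be exercised on a host.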

Energy mode setup

For energy mode the EE_CFG_ENERGY_MODE flag should be set to 1.

plot

You need to make sure that the MCU_RST and the IDD jumpers are not fitted.

The final setup should look like the following:

plot

Make sure to push the Reset Button (the Black Button) on the board before starting the benchmark software.

For more details about the energy benchmark and EEMBC's EnergyRunner™ benchmark framework, please refer to this link.

Performance mode setup

For Performance mode the EE_CFG_ENERGY_MODE flag should be set to 0.

plot

Connect the board to the laptop using a USB cable and make sure that the MCU_RST and the IDD jumpers are fitted.

Make sure to push the Reset Button (the Black Button) on the board before starting the benchmark software.

For more details about the performance mode and EEMBC's EnergyRunner™ benchmark framework, please refer to this link.

stm32ai-perf's People

Contributors

fbagstm, mahdichtourou94, stmicroelectronics-github, tristm

stm32ai-perf's Issues

linker error

I tried to compile the Person Detection project for the STM32L4R5ZI board, but got the following linker errors:

arm-none-eabi-gcc -o "VWW01.elf" @"objects.list"  -l:NetworkRuntime710_CM4_GCC.a -mcpu=cortex-m4 -T"/home/mohamadk/src/stm/STM32CubeIDE/workspace_1.10.1/VWW01/STM32L4R5ZITX_FLASH.ld" --specs=nosys.specs -Wl,-Map="VWW01.map" -Wl,--gc-sections -static -L../Middlewares/ST/AI/Lib --specs=nano.specs -mfpu=fpv4-sp-d16 -mfloat-abi=hard -mthumb -u _printf_float -Wl,--start-group -lc -lm -Wl,--end-group
/opt/st/stm32cubeide_1.10.1_2/plugins/com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.10.3-2021.10.linux64_1.0.0.202111181127/tools/bin/../lib/gcc/arm-none-eabi/10.3.1/../../../../arm-none-eabi/bin/ld: ./tinyml_api/st_port.o:(.bss.htim5+0x0): multiple definition of `htim5'; ./Core/Src/main.o:(.bss.htim5+0x0): first defined here
/opt/st/stm32cubeide_1.10.1_2/plugins/com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.10.3-2021.10.linux64_1.0.0.202111181127/tools/bin/../lib/gcc/arm-none-eabi/10.3.1/../../../../arm-none-eabi/bin/ld: ./tinyml_api/st_port.o:(.bss.huart3+0x0): multiple definition of `huart3'; ./Core/Src/main.o:(.bss.huart3+0x0): first defined here
/opt/st/stm32cubeide_1.10.1_2/plugins/com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.10.3-2021.10.linux64_1.0.0.202111181127/tools/bin/../lib/gcc/arm-none-eabi/10.3.1/../../../../arm-none-eabi/bin/ld: ./tinyml_api/st_port.o:(.bss.hlpuart1+0x0): multiple definition of `hlpuart1'; ./Core/Src/main.o:(.bss.hlpuart1+0x0): first defined here
/opt/st/stm32cubeide_1.10.1_2/plugins/com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.10.3-2021.10.linux64_1.0.0.202111181127/tools/bin/../lib/gcc/arm-none-eabi/10.3.1/../../../../arm-none-eabi/bin/ld: ./tinyml_api/submitter_implemented.o:(.bss.aiInData_int+0x0): multiple definition of `aiInData_int'; ./tinyml_api/st_port.o:(.bss.aiInData_int+0x0): first defined here
/opt/st/stm32cubeide_1.10.1_2/plugins/com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.10.3-2021.10.linux64_1.0.0.202111181127/tools/bin/../lib/gcc/arm-none-eabi/10.3.1/../../../../arm-none-eabi/bin/ld: ./tinyml_api/submitter_implemented.o:(.bss.ai_output+0x0): multiple definition of `ai_output'; ./tinyml_api/st_port.o:(.bss.ai_output+0x0): first defined here
/opt/st/stm32cubeide_1.10.1_2/plugins/com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.10.3-2021.10.linux64_1.0.0.202111181127/tools/bin/../lib/gcc/arm-none-eabi/10.3.1/../../../../arm-none-eabi/bin/ld: ./tinyml_api/submitter_implemented.o:(.bss.ai_input+0x0): multiple definition of `ai_input'; ./tinyml_api/st_port.o:(.bss.ai_input+0x0): first defined here
/opt/st/stm32cubeide_1.10.1_2/plugins/com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.10.3-2021.10.linux64_1.0.0.202111181127/tools/bin/../lib/gcc/arm-none-eabi/10.3.1/../../../../arm-none-eabi/bin/ld: ./tinyml_api/submitter_implemented.o:(.bss.network+0x0): multiple definition of `network'; ./tinyml_api/st_port.o:(.bss.network+0x0): first defined here
/opt/st/stm32cubeide_1.10.1_2/plugins/com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.10.3-2021.10.linux64_1.0.0.202111181127/tools/bin/../lib/gcc/arm-none-eabi/10.3.1/../../../../arm-none-eabi/bin/ld: ./tinyml_api/submitter_implemented.o:(.bss.pool0+0x0): multiple definition of `pool0'; ./tinyml_api/st_port.o:(.bss.pool0+0x0): first defined here
/opt/st/stm32cubeide_1.10.1_2/plugins/com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.10.3-2021.10.linux64_1.0.0.202111181127/tools/bin/../lib/gcc/arm-none-eabi/10.3.1/../../../../arm-none-eabi/bin/ld: ./tinyml_api/submitter_implemented.o:(.bss.aiOutData+0x0): multiple definition of `aiOutData'; ./tinyml_api/st_port.o:(.bss.aiOutData+0x0): first defined here
/opt/st/stm32cubeide_1.10.1_2/plugins/com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.10.3-2021.10.linux64_1.0.0.202111181127/tools/bin/../lib/gcc/arm-none-eabi/10.3.1/../../../../arm-none-eabi/bin/ld: ./tinyml_api/submitter_implemented.o:(.bss.htim5+0x0): multiple definition of `htim5'; ./Core/Src/main.o:(.bss.htim5+0x0): first defined here
/opt/st/stm32cubeide_1.10.1_2/plugins/com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.10.3-2021.10.linux64_1.0.0.202111181127/tools/bin/../lib/gcc/arm-none-eabi/10.3.1/../../../../arm-none-eabi/bin/ld: ./tinyml_api/submitter_implemented.o:(.bss.huart3+0x0): multiple definition of `huart3'; ./Core/Src/main.o:(.bss.huart3+0x0): first defined here
/opt/st/stm32cubeide_1.10.1_2/plugins/com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.10.3-2021.10.linux64_1.0.0.202111181127/tools/bin/../lib/gcc/arm-none-eabi/10.3.1/../../../../arm-none-eabi/bin/ld: ./tinyml_api/submitter_implemented.o:(.bss.hlpuart1+0x0): multiple definition of `hlpuart1'; ./Core/Src/main.o:(.bss.hlpuart1+0x0): first defined here
collect2: error: ld returned 1 exit status
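
The issue does not record a resolution. Errors of this shape commonly appear when a header defines a variable without `extern` and is included from several .c files: GCC 10 and later default to `-fno-common`, so each such definition becomes a separate symbol and the linker reports a clash. Assuming that is the cause here, the usual pattern is one definition plus `extern` declarations, sketched below in a single file. The `uart_handle_t` type and the `uart_instance` function are hypothetical stand-ins, not the project's HAL types.

```c
/* What a shared header should contain: declarations only. */
typedef struct { int instance; } uart_handle_t; /* hypothetical stand-in for a HAL handle type */
extern uart_handle_t huart3;                    /* declaration: no storage is allocated */

/* What exactly one .c file should contain: the definition. */
uart_handle_t huart3 = { .instance = 3 };

/* Any other file that includes the header refers to the same object. */
static int uart_instance(void)
{
    return huart3.instance;
}
```

Passing `-fcommon` to the compiler restores the older behaviour and can serve as a temporary workaround, but moving the definitions into a single source file addresses the underlying problem.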
