xmchen1987 / pti-gpu

This project forked from intel/pti-gpu

Profiling Tools Interfaces for GPU (PTI for GPU) is a set of getting-started documentation and a tools library that make it easy to start performance analysis on Intel(R) Processor Graphics

License: MIT License

Profiling Tools Interfaces for GPU (PTI for GPU)

Overview

This repository describes ways to collect performance data for Intel(R) Processor Graphics and provides a set of samples to help you get started.

License

Samples for Profiling Tools Interfaces for GPU (PTI for GPU) are distributed under the MIT License.

You may obtain a copy of the License at https://opensource.org/licenses/MIT

Supported OS

  • Linux

Windows support is under development

Supported Platforms

  • Intel(R) Processor Graphics Gen9 (formerly Skylake) and newer

Some samples may have higher hardware requirements

Regularly Tested Configurations

  • Ubuntu 20.04 with Intel(R) Iris(R) Plus Graphics 655

Profiling Chapters

  1. Runtime API Tracing
  2. Device Activity Tracing
  3. Binary/Source Correlation
  4. Metrics Collection
  5. Binary Instrumentation
  6. Code Annotation
  7. System Management

Profiling & Debug Tools

  • onetrace - host and device tracing tool for OpenCL(TM) and Level Zero backends with support for DPC++ (both for CPU and GPU) and OpenMP* GPU offload;
  • oneprof - GPU HW metrics collection tool for OpenCL(TM) and Level Zero backends with support for DPC++ and OpenMP* GPU offload;
  • ze_tracer - "Swiss army knife" for Level Zero API call tracing and profiling (formerly ze_intercept);
  • cl_tracer - "Swiss army knife" for OpenCL(TM) API call tracing and profiling;
  • gpuinfo - provides basic information about the GPUs installed in a system and the list of HW metrics that can be collected for each of them;
  • sysmon - a Linux "top"-like utility to monitor the GPUs installed in a system.

Sample Tools & Utilities

  • tools for OpenCL(TM), DPC++ (with OpenCL(TM) backend) and OpenMP* GPU offload (with OpenCL(TM) backend):
    • cl_hot_functions - provides a list of hottest OpenCL(TM) API calls by backend (CPU and GPU);
    • cl_hot_kernels - provides a list of hottest OpenCL(TM) kernels by backend (CPU and GPU);
    • cl_debug_info - prints source and assembly (GEN ISA) for kernels on GPU;
    • cl_gpu_metrics - provides a list of hottest OpenCL(TM) GPU kernels along with the percentage of cycles each was active, stalled and idle (based on continuous metrics collection mode);
    • cl_gpu_query - provides a list of hottest OpenCL(TM) GPU kernels along with the percentage of cycles each was active, stalled and idle (based on query metrics collection mode);
  • tools for Level Zero, DPC++ (with Level Zero backend) and OpenMP* GPU offload (with Level Zero backend):
    • ze_hot_functions - provides a list of hottest Level Zero API calls;
    • ze_hot_kernels - provides a list of hottest Level Zero kernels;
    • ze_debug_info - prints source and assembly (GEN ISA) for kernels on GPU;
    • ze_metric_query - provides a list of hottest Level Zero GPU kernels along with the percentage of cycles each was active, stalled and idle (metrics are collected in query mode);
    • ze_metric_streamer - provides a list of hottest Level Zero GPU kernels along with the percentage of cycles each was active, stalled and idle (metrics are collected in streamer mode);
  • tools for OpenMP*:
    • omp_hot_regions - provides a list of hottest parallel (for CPU) and target (for GPU) OpenMP* regions;
  • tools for binary instrumentation:
    • gpu_inst_count - prints GPU kernel assembly (GEN ISA) annotated with instruction execution counts;
    • gpu_perfmon_read - prints GPU kernel assembly (GEN ISA) annotated with a specific HW metric that is accumulated in the EU PerfMon register;
  • utilities:
    • dpc_info - prints information on available platforms and devices in DPC++;
    • ze_info - prints information on available platforms and devices in Level Zero;
    • ze_metric_info - prints the list of HW metrics one can collect with the help of Level Zero;
    • gpu_perfmon_set - allows choosing the HW metric to collect in the EU PerfMon register;

Prerequisites

More information about what is needed for a particular sample can be found on that sample's description page.

Build and Run

In general, to build a sample one needs to perform the following steps (specific instructions for a particular sample can be found on its description page):

cd <pti_root>/samples/<sample_root>
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make

To point to specific headers and libraries, one may use the -DCMAKE_INCLUDE_PATH and -DCMAKE_LIBRARY_PATH options respectively, e.g.:

cmake -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_INCLUDE_PATH=/tmp/level_zero/include \
  -DCMAKE_LIBRARY_PATH=/tmp/level_zero/lib \
  ..
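The configure step above can also be assembled programmatically, for example when scripting builds of several samples. The following is a minimal sketch rather than part of the repository: the function name `cmake_configure_command` and its parameters are hypothetical, and it uses only the cmake options shown in this section.

```python
import shlex
from typing import List, Optional


def cmake_configure_command(
    build_type: str = "Release",
    include_path: Optional[str] = None,
    library_path: Optional[str] = None,
    source_dir: str = "..",
) -> List[str]:
    """Assemble the cmake configure command as an argument list.

    Hypothetical helper: only the options shown in this README are emitted.
    """
    command = ["cmake", f"-DCMAKE_BUILD_TYPE={build_type}"]
    if include_path:
        # Extra directory to search for headers (e.g. Level Zero headers).
        command.append(f"-DCMAKE_INCLUDE_PATH={include_path}")
    if library_path:
        # Extra directory to search for libraries.
        command.append(f"-DCMAKE_LIBRARY_PATH={library_path}")
    command.append(source_dir)
    return command


# Print the command instead of running it, so it can be reviewed first.
print(shlex.join(cmake_configure_command(
    include_path="/tmp/level_zero/include",
    library_path="/tmp/level_zero/lib",
)))
```

To execute the command rather than print it, the returned list could be passed to `subprocess.run(command, cwd=build_dir, check=True)`, where `build_dir` is the sample's build directory created in the steps above.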

Run instructions may vary significantly from sample to sample, so they are provided on each sample's description page.

Testing

There is a way to build and test all the samples in one command, e.g.:

LD_LIBRARY_PATH=/usr/local/lib python <pti_root>/tests/run.py

If any tests fail, the error output will be available in the stderr.log file.

It's also possible to test an exact sample or a group of samples, e.g.:

python <pti_root>/tests/run.py -s cl_hot_functions # build and test an exact sample "cl_hot_functions"
python <pti_root>/tests/run.py -s ze # build and test all L0 samples

To run testing in debug mode, one may use the -d option, e.g.:

python <pti_root>/tests/run.py -s ze_gemm -d

The script creates a build directory inside each sample folder while testing. To remove all of these folders, use:

python <pti_root>/tests/run.py -c
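The run.py invocations above differ only in their flags, so a wrapper script can build them from a few parameters. The sketch below is illustrative only: `run_py_command` is a hypothetical helper, `/opt/pti-gpu` is a made-up checkout location, and refusing to combine `-c` with other options is an assumption of the sketch (the text only shows `-c` used on its own), not a documented restriction of run.py.

```python
import shlex
from typing import List, Optional


def run_py_command(
    pti_root: str,
    sample: Optional[str] = None,
    debug: bool = False,
    clean: bool = False,
) -> List[str]:
    """Build a tests/run.py command line from the options shown above."""
    command = ["python", f"{pti_root}/tests/run.py"]
    if clean:
        # Assumption: -c is only shown on its own, so the sketch keeps it exclusive.
        if sample or debug:
            raise ValueError("this sketch does not combine -c with -s or -d")
        command.append("-c")
        return command
    if sample:
        # An exact sample name ("cl_hot_functions") or a group prefix ("ze").
        command += ["-s", sample]
    if debug:
        command.append("-d")
    return command


# Hypothetical checkout location, used only for illustration.
print(shlex.join(run_py_command("/opt/pti-gpu", sample="ze_gemm", debug=True)))
```

To run the command with the library path from the first example, one could call `subprocess.run(command, check=True, env={**os.environ, "LD_LIBRARY_PATH": "/usr/local/lib"})`.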

The tested software versions are listed in the SOFTWARE file.

Known Issues

  1. On RHEL, the IGA library may not be found even after installing the Intel(R) Graphics Compute Runtime for oneAPI Level Zero and OpenCL(TM) Driver. To fix this, create a link named libiga64.so that points to libiga64.so.1, e.g.:
    cd /usr/lib64
    sudo ln -s libiga64.so.1 libiga64.so
    cd -
  2. On RHEL, one may need to use a newer compiler. To enable it, one may prepend its directories to the PATH and LD_LIBRARY_PATH variables, e.g.:
    export PATH=/opt/gcc/7.4.0/bin/:$PATH
    export LD_LIBRARY_PATH=/opt/gcc/7.4.0/lib:/opt/gcc/7.4.0/lib64:$LD_LIBRARY_PATH
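Before applying the fix for the first issue, it can help to check whether the link is actually missing. The following is a minimal sketch, not a tool from this repository: `iga_link_command` is a hypothetical helper that looks at a directory listing and returns the `ln` command from the first issue only when libiga64.so.1 is present and libiga64.so is not.

```python
import os
from typing import Iterable, Optional


def iga_link_command(file_names: Iterable[str]) -> Optional[str]:
    """Return the link command from known issue 1 if it is needed, else None."""
    names = set(file_names)
    # Nothing to do if the link already exists or there is no library to link to.
    if "libiga64.so" in names or "libiga64.so.1" not in names:
        return None
    return "sudo ln -s libiga64.so.1 libiga64.so"


lib_dir = "/usr/lib64"  # directory used in the example above
if os.path.isdir(lib_dir):
    command = iga_link_command(os.listdir(lib_dir))
    print(command or f"nothing to do in {lib_dir}")
```

The helper only reports the command; it leaves running it (with sudo, from inside /usr/lib64) to the user.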

(*) Other names and brands may be claimed as property of others

Contributors

anton-v-gorshkov, zma2, vladimir-tsymbal, al42and, eshulankina, ghgmc2, jczaja, kcencele, rdower, vmustya, igorvorobtsov
