This project forked from alibaba/heterogeneity-aware-lowering-and-optimization

heterogeneity-aware-lowering-and-optimization

License: Apache License 2.0

HALO

Heterogeneity-Aware Lowering and Optimization (HALO) is a heterogeneous computing acceleration platform based on compiler technology. It exploits heterogeneous computing power for deep learning through an abstract, extensible interface called the Open Deep Learning API (ODLA). HALO provides a unified ahead-of-time compilation solution, automatically tailored for cloud, edge, and IoT scenarios.

HALO supports multiple compilation modes. In ahead-of-time (AOT) compilation mode, HALO compiles an AI model into C/C++ code written against the ODLA APIs. The compiled model can run on any supported platform with the corresponding ODLA runtime library. In addition, HALO can compile host code and heterogeneous device code simultaneously. The picture below shows the overall compilation flow:

HALO supports the compilation of models from the following frameworks:

  • Caffe
  • ONNX
  • TensorFlow
  • TFLite

More frameworks will be supported soon.

HALO supports Alibaba's first AI-inference chip, the Hanguang-800 NPU, via its HgAI SDK. The Hanguang-800 NPU is designed by T-Head Semiconductor Co., Ltd. (also known as PingTouGe), a business entity of Alibaba Group.

A broad ODLA ecosystem is supported through a set of ODLA runtime libraries targeting various heterogeneous accelerators and runtimes:

We welcome new accelerator platforms to join the ODLA community.

The ODLA API Reference can be found here, and a detailed programming guide is coming soon.

Partners

We appreciate the support of ODLA runtimes from the following partners:

How to Use HALO

To build HALO, please follow the instructions here.

The workflow of deploying models using HALO includes:

  1. Use HALO to compile the model file(s) into an ODLA-based C/C++ source file.
  2. Use a C/C++ compiler to compile the generated C/C++ file into an object file.
  3. Link the object file, the weight binary, and the target-specific ODLA runtime library together.

A Simple Example

Let's start with a simple MNIST example based on the TensorFlow tutorial. The diagram below shows the overall workflow:

Brief explanations:

HALO generates 3 files:

  • mnist.h : the header file used by the application.
  • mnist.cc : the ODLA C++ file that represents the model.
  • mnist.bin : the weights in ELF format.

To the application, inference is simply a call to the function mnist().

Note that, for portability, HALO always exports functions using the C calling convention, even though the generated source file (mnist.cc in this example) is C++.
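The following minimal sketch illustrates what "a function call with C linkage" can look like from the application's side. It is not HALO output: the signature of mnist(), the constants kPixels and kClasses, the helper predict_digit(), and the placeholder function body are all assumptions made for illustration. Refer to the generated mnist.h for the real declaration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Assumed shapes for illustration: a 28x28 grayscale image and 10 digit classes.
constexpr std::size_t kPixels = 28 * 28;
constexpr std::size_t kClasses = 10;

// Hypothetical declaration, in the style a generated header might use.
// extern "C" disables C++ name mangling, so the symbol is plainly "mnist"
// and can be called from C or loaded from other languages.
extern "C" void mnist(const float* input, float* output);

// Placeholder standing in for the generated model code. It is NOT a real
// network: it only sums pixels into buckets so the sketch is self-contained.
extern "C" void mnist(const float* input, float* output) {
  assert(input != nullptr && output != nullptr);
  std::fill(output, output + kClasses, 0.0f);
  for (std::size_t i = 0; i < kPixels; ++i) {
    output[i % kClasses] += input[i];
  }
}

// Application side: inference is a single function call, followed by an
// argmax over the class scores.
int predict_digit(const float* image) {
  float scores[kClasses];
  mnist(image, scores);
  return static_cast<int>(std::max_element(scores, scores + kClasses) - scores);
}
```

Because the function is exported with C linkage, the same object file can in principle be linked into a C program or loaded through a foreign-function interface without dealing with C++ name mangling.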

More detailed explanations can be found here. Example code can be found here.

Please refer to the HALO options list for all command-line options.

More Examples

Contributing

We're always looking for help to improve HALO. See the Contributing Guide for more details. Thank you!

Resources

License

HALO is licensed under the Apache 2.0 License.
