Welcome to Awesome On-device AI

A curated list of awesome projects and papers for AI on Mobile/IoT/Edge devices. The list is continuously updated. Contributions are welcome!

Contents

Papers/Tutorial

1. Learning on Devices

1.1 Memory Efficient Learning

  • [ICML'22] POET: Training Neural Networks on Tiny Devices with Integrated Rematerialization and Paging. by Patil et al. [paper]
  • [NeurIPS'22] On-Device Training Under 256KB Memory. by Ji Lin, Song Han et al. [paper]
  • [MobiSys'22] Melon: breaking the memory wall for resource-efficient on-device machine learning. by Qipeng Wang et al. [paper]
  • [MobiSys'22] Sage: Memory-efficient DNN Training on Mobile Devices. by In Gim et al. [paper]

1.2 Learning Acceleration

  • [MobiCom'22] Mandheling: Mixed-Precision On-Device DNN Training with DSP Offloading. by Daliang Xu et al. [paper]

1.3 Learning on Mobile Cluster

  • [ICPP'22] Eco-FL: Adaptive Federated Learning with Efficient Edge Collaborative Pipeline Training. by Shengyuan Ye et al. [paper] [code]
  • [SEC'21] EDDL: A Distributed Deep Learning System for Resource-limited Edge Computing Environment. by Pengzhan Hao et al. [paper]

1.4 Measurement and Survey

  • [MobiSys'21 Workshop] Towards Ubiquitous Learning: A First Measurement of On-Device Training Performance. by Dongqi Cai, Mengwei Xu et al. [paper]

2. Inference on Devices

2.1 Collaborative Inference

  • [MobiSys'23] NN-Stretch: Automatic Neural Network Branching for Parallel Inference on Heterogeneous Multi-Processors. by USTC & Microsoft. [paper]
  • [MobiSys'22] CoDL: efficient CPU-GPU co-execution for deep learning inference on mobile devices. by Fucheng Jia et al. [paper]
  • [INFOCOM'22] Distributed Inference with Deep Learning Models across Heterogeneous Edge Devices. by Chenghao Hu et al. [paper]
  • [TON'20] CoEdge: Cooperative DNN inference with adaptive workload partitioning over heterogeneous edge devices. by Liekang Zeng et al. [paper]
  • [ICCD'20] A distributed in-situ CNN inference system for IoT applications. by Jiangsu Du et al. [paper]
  • [TPDS'20] Model Parallelism Optimization for Distributed Inference via Decoupled CNN Structure. by Jiangsu Du et al. [paper]
  • [EuroSys'19] μLayer: Low Latency On-Device Inference Using Cooperative Single-Layer Acceleration and Processor-Friendly Quantization. by Youngsok Kim et al. [paper]
  • [TCAD'18] DeepThings: Distributed Adaptive Deep Learning Inference on Resource-Constrained IoT Edge Clusters. by Zhuoran Zhao et al. [paper]
  • [DATE'17] MoDNN: Local distributed mobile computing system for deep neural network. by Jiachen Mao et al. [paper]

2.2 Latency Prediction for Inference

  • [MobiSys'21] nn-Meter: towards accurate latency prediction of deep-learning model inference on diverse edge devices. by Li Lyna Zhang et al. [paper]

2.3 Multi-DNN Serving

  • [MobiSys'22] Band: coordinated multi-DNN inference on heterogeneous mobile processors. by Seoul National University et al. [paper]

2.4 DNN Arch./Op.-level Optimization

  • [MobiSys'23] ConvReLU++: Reference-based Lossless Acceleration of Conv-ReLU Operations on Mobile CPU. by Shanghai Jiao Tong University [paper]

3. Models for Mobile

3.1 Lightweight Model

  • [ACL'20] MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices. by Zhiqing Sun et al. [paper]
  • [ICML'19] EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. by Mingxing Tan et al. [paper]
  • [CVPR'18] ShuffleNet: An extremely efficient convolutional neural network for mobile devices. by Xiangyu Zhang et al. [paper]
  • [CVPR'18] MobileNetV2: Inverted Residuals and Linear Bottlenecks. by Mark Sandler et al. [paper]

4. On-device AI Application

4.1 On-device NLP

  • [UbiComp'18] DeepType: On-Device Deep Learning for Input Personalization Service with Minimal Privacy Concern. by Mengwei Xu et al. [paper]
  • [arXiv 2018] Federated learning for mobile keyboard prediction. by Google [paper]

5. Survey and Tutorial

5.1 Tutorial

  • [CVPR'23 Tutorial] Efficient Neural Networks: From Algorithm Design to Practical Mobile Deployments. by Snap Research [paper]

Open Source Projects

1. DL Framework on Mobile

  • TensorFlow Lite: Deploy machine learning models on mobile and edge devices. by Google. [code]
  • TensorFlow.js: A WebGL accelerated JavaScript library for training and deploying ML models. by Google. [code]
  • MNN: A Universal and Efficient Inference Engine. by Alibaba. [code]

2. Inference Deployment

  • TensorRT: A C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators. by NVIDIA. [code]
  • TVM: Open deep learning compiler stack for CPU, GPU and specialized accelerators. by Tianqi Chen et al. [code]
  • MACE: a deep learning inference framework optimized for mobile heterogeneous computing platforms. by XiaoMi. [code]
  • NCNN: a high-performance neural network inference framework optimized for the mobile platform. by Tencent. [code]

3. Open Source Auto-Parallelism Framework

3.1 Pipeline Parallelism

  • Pipeline Parallelism for PyTorch by PyTorch. [code]
  • A GPipe implementation in PyTorch by Kakaobrain. [code]

Contribute

All contributions to this repository are welcome. Open an issue or send a pull request.

Contributors

dujiangsu, ysyisyourbrother
