Cerebras Model Zoo

Introduction

This repository contains examples of common deep learning models that can be trained on Cerebras hardware. These models demonstrate best practices for writing a model that targets Cerebras hardware, so that you can take full advantage of this powerful new compute engine.

To get started running your models on a Cerebras system, please refer to the Developer Documentation along with this readme.

NOTE: If you are interested in trying out the Cerebras Model Zoo on Cerebras hardware (CS-2 systems), we offer the following options:

  • Academics - Please fill out our Partner Hardware Access Request form here and we will contact you about gaining access to a system from one of our partners.
  • Commercial - Please fill out our Get Demo form here so that our team can provide you with a demo and discuss access to our system.
  • For all others - Please contact us at [email protected].

For a list of all supported models, please check models in this repository.

Supported frameworks

We support models developed in PyTorch and TensorFlow.

Basic workflow

To run your neural network jobs on the Cerebras Wafer-Scale Cluster, follow the quick start guide in the developer docs to compile, validate, and train the models in this Model Zoo in the framework of your choice.

For advanced use cases and for porting your existing code, please refer to the developer docs.

Execution modes

On the Cerebras Wafer-Scale Cluster, you can run neural networks of different sizes. Cerebras Software supports different execution modes to run this variety of models efficiently.

The execution mode refers to how the Cerebras runtime loads your neural network model onto the Cerebras Wafer-Scale Engine (WSE). Two execution modes are supported:

  • Weight streaming: In this mode, one layer of the neural network model is loaded at a time. This layer-by-layer mode is used to run extremely large models (with billions to trillions of parameters).
  • Layer pipelined: In this mode, all the layers of the network are loaded together onto the Cerebras WSE. This mode is selected for small to medium-sized neural network models (with fewer than a billion parameters).

You can find more information in the developer documentation section on Cerebras Execution Modes.
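The size-based rule of thumb described above can be sketched in a few lines of Python. This is only an illustration: `ExecutionMode`, `PIPELINED_PARAM_LIMIT`, and `suggest_execution_mode` are hypothetical names, not part of the Cerebras software, and the one-billion-parameter cut-off is an assumption taken from the wording "fewer than a billion parameters" rather than a documented threshold.

```python
from enum import Enum


class ExecutionMode(Enum):
    """Hypothetical labels for the two execution modes described above."""

    LAYER_PIPELINED = "layer pipelined"
    WEIGHT_STREAMING = "weight streaming"


# Assumed cut-off, based on "fewer than a billion parameters" in the text.
# It is illustrative only and not an official limit.
PIPELINED_PARAM_LIMIT = 1_000_000_000


def suggest_execution_mode(num_parameters: int) -> ExecutionMode:
    """Suggest a mode from the parameter count, following the rule of thumb."""
    if num_parameters < 0:
        raise ValueError("num_parameters must be non-negative")
    if num_parameters < PIPELINED_PARAM_LIMIT:
        # Small to medium models: all layers are loaded onto the WSE together.
        return ExecutionMode.LAYER_PIPELINED
    # Very large models: layers are loaded one at a time.
    return ExecutionMode.WEIGHT_STREAMING


if __name__ == "__main__":
    # Parameter counts below are arbitrary example values.
    for count in (100_000_000, 6_000_000_000):
        print(f"{count:,} parameters: {suggest_execution_mode(count).value}")
```

In practice the choice also depends on which modes a given model implementation supports, so treat the parameter count as a starting point rather than a rule.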

Models in this repository

| Model | Layer Pipeline mode | Weight Streaming mode |
| --- | --- | --- |
| BERT | TensorFlow code, PyTorch code | PyTorch code |
| BERT (fine-tuning) Classifier | TensorFlow code, PyTorch code | - |
| BERT (fine-tuning) Named Entity Recognition | TensorFlow code, PyTorch code | - |
| BERT (fine-tuning) Summarization | TensorFlow code, PyTorch code | - |
| BERT (fine-tuning) Question Answering | TensorFlow code, PyTorch code | - |
| GPT-2 | TensorFlow code, PyTorch code | TensorFlow code, PyTorch code |
| GPT-3 | - | TensorFlow code, PyTorch code |
| GPT-J | - | TensorFlow code, PyTorch code |
| GPT-NeoX | - | TensorFlow code, PyTorch code |
| GPT-J (fine-tuning) Summarization | - | TensorFlow code |
| Linformer | TensorFlow code | - |
| RoBERTa | TensorFlow code, PyTorch code | - |
| T5 | TensorFlow code, PyTorch code | PyTorch code |
| Transformer | TensorFlow code, PyTorch code | - |
| MNIST (fully connected) | TensorFlow code, PyTorch code | - |
| UNet | - | PyTorch code |

License

Apache License 2.0
