
Triton Model Navigator

Model optimization plays a crucial role in unlocking the maximum performance capabilities of the underlying hardware. By applying various transformation techniques, models can be optimized to fully utilize the specific features offered by the hardware architecture, improving inference performance and cost. Furthermore, optimization in many cases allows for serialization of models, separating them from the source code. The serialization process enhances portability, allowing the models to be seamlessly deployed in production environments. The decoupling of models from the source code also facilitates maintenance, updates, and collaboration among developers. However, this process comprises multiple steps and offers various potential paths, making manual execution complicated and time-consuming.

The Triton Model Navigator offers a user-friendly and automated solution for optimizing and deploying machine learning models. It provides a single entry point for all supported frameworks, allowing users to start the search for the best deployment option with a single call to the dedicated optimize function. Model Navigator handles model export, conversion, correctness testing, and profiling to select the optimal model format, and saves the generated artifacts for inference deployment on PyTriton or the Triton Inference Server.

The high-level flowchart below illustrates the process of moving models from source code to deployment optimized formats with the support of the Model Navigator:

(Figure: Overview of the Model Navigator optimization and deployment flow.)

Documentation

The full documentation about optimizing models, using the Navigator Package, and deploying models on PyTriton and/or the Triton Inference Server can be found in the documentation.

Support Matrix

The Model Navigator generates multiple optimized and production-ready models. The table below illustrates the model formats that can be obtained by using the Model Navigator with various frameworks.

Table: Supported conversion target formats for each supported Python framework or file format.

| PyTorch            | TensorFlow 2           | JAX                    | ONNX     |
|--------------------|------------------------|------------------------|----------|
| Torch Compile      | SavedModel             | SavedModel             | TensorRT |
| TorchScript Trace  | TensorRT in TensorFlow | TensorRT in TensorFlow |          |
| TorchScript Script | ONNX                   | ONNX                   |          |
| Torch-TensorRT     | TensorRT               | TensorRT               |          |
| ONNX               |                        |                        |          |
| TensorRT           |                        |                        |          |

Note: The Model Navigator can accept any Python function as input. However, in this particular case, its role is limited to profiling the function, without generating any serialized models.

The Model Navigator stores all artifacts within the navigator_workspace. Additionally, it provides an option to save a portable and transferable Navigator Package - an artifact that includes only the models with minimal latency and maximal throughput. This package also includes base formats that can be used to regenerate the TensorRT plan on the target hardware.

Table: Model formats that can be generated from saved Navigator Package and from model sources.

| From model source  | From Navigator Package |
|--------------------|------------------------|
| SavedModel         | TorchTensorRT          |
| TensorFlowTensorRT | TensorRT in TensorFlow |
| TorchScript Trace  | ONNX                   |
| TorchScript Script | TensorRT               |
| Torch 2 Compile    |                        |
| TorchTensorRT      |                        |
| ONNX               |                        |
| TensorRT           |                        |

Installation

The following prerequisites must be fulfilled to use the Triton Model Navigator:

  • Installed Python 3.8+
  • Installed NVIDIA TensorRT for TensorRT model export.

We recommend using the NGC containers for PyTorch and TensorFlow, which provide all the necessary dependencies.

The package can be installed from pypi.org using an extra index URL:

pip install -U --extra-index-url https://pypi.ngc.nvidia.com triton-model-navigator[<extras,>]

or with nvidia-pyindex:

pip install nvidia-pyindex
pip install -U triton-model-navigator[<extras,>]

To install the Triton Model Navigator from source, use the following pip command:

$ pip install --extra-index-url https://pypi.ngc.nvidia.com .[<extras,>]

Extras:

  • tensorflow - Model Navigator with dependencies for TensorFlow2
  • jax - Model Navigator with dependencies for JAX

No extras are needed when using Model Navigator with PyTorch.
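
For example, to install Model Navigator with the TensorFlow2 dependencies, the extras syntax above expands to:

pip install -U --extra-index-url https://pypi.ngc.nvidia.com triton-model-navigator[tensorflow]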

Quick Start

This section describes the basic steps for optimizing a model for serving inference on PyTriton or the Triton Inference Server, as well as saving a Navigator Package for distribution.

Optimize Model

Optimizing models using Model Navigator is as simple as calling the optimize function. The optimization process requires at least:

  • model - a Python object, callable, or file path with the model to optimize.
  • dataloader - a method or class generating input data. The data is used to determine the maximum and minimum shapes of the model inputs and to create output samples used during the optimization process.

Here is an example of running optimize on Torch Hub ResNet50 model:

import torch
import model_navigator as nav

# Optimize a pretrained ResNet50 from Torch Hub; the dataloader provides
# 10 random input samples used for shape inference and correctness testing.
package = nav.torch.optimize(
    model=torch.hub.load('NVIDIA/DeepLearningExamples:torchhub', 'nvidia_resnet50', pretrained=True).eval(),
    dataloader=[torch.randn(1, 3, 256, 256) for _ in range(10)],
)

Once the model has been optimized, the created artifacts are stored in navigator_workspace and a Package object is returned from the function. Read more about optimize in the documentation.

Deploy model in PyTriton

PyTriton can be used to serve inference for any of the optimized formats. Model Navigator provides a dedicated PyTritonAdapter to retrieve the runner and other information required to bind the model for serving inference. The runner is an abstraction that connects the model checkpoint with its runtime, making the inference process more accessible and straightforward.

Following that, you can initialize the PyTriton server using the adapter information:

import model_navigator as nav
from pytriton.decorators import batch
from pytriton.triton import Triton

# Retrieve the runner for the best-performing format and activate it.
pytriton_adapter = nav.pytriton.PyTritonAdapter(package=package, strategy=nav.MaxThroughputStrategy())
runner = pytriton_adapter.runner

runner.activate()


@batch
def infer_func(**inputs):
    return runner.infer(inputs)


# Bind the model to the PyTriton server using the adapter's configuration.
with Triton() as triton:
    triton.bind(
        model_name="resnet50",
        infer_func=infer_func,
        inputs=pytriton_adapter.inputs,
        outputs=pytriton_adapter.outputs,
        config=pytriton_adapter.config,
    )
    triton.serve()
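
Once triton.serve() is running, clients can send requests over HTTP or gRPC. Below is a minimal client sketch using PyTriton's ModelClient, assuming the server runs locally on the default HTTP port and the model takes a single image batch shaped like the dataloader samples above:

import numpy as np
from pytriton.client import ModelClient

# Connect to the locally running PyTriton server and send a single batch;
# the result is a dictionary mapping output names to numpy arrays.
with ModelClient("localhost:8000", "resnet50") as client:
    input_batch = np.random.randn(1, 3, 256, 256).astype(np.float32)
    result = client.infer_batch(input_batch)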

Read more about deploying models on PyTriton in the documentation.

Deploy model in Triton Inference Server

The optimized model can also be used for serving inference on the Triton Inference Server once the serialized format has been created. Model Navigator provides functionality to generate a model deployment configuration directly inside the Triton model_repository. The following command selects the model format with the highest throughput and creates the Triton deployment in the defined model repository path:

import pathlib

import model_navigator as nav

nav.triton.model_repository.add_model_from_package(
    model_repository_path=pathlib.Path("model_repository"),
    model_name="resnet50",
    package=package,
    strategy=nav.MaxThroughputStrategy(),
)

Once the entry is created, you can simply start the Triton Inference Server, mounting the defined model_repository_path. Read more about deploying models on the Triton Inference Server in the documentation.
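
For example, the server can be started from the NGC Triton container, assuming the model repository was created in the current working directory (replace xx.yy with the desired container release):

docker run --gpus all --rm \
    -p 8000:8000 -p 8001:8001 -p 8002:8002 \
    -v ${PWD}/model_repository:/models \
    nvcr.io/nvidia/tritonserver:xx.yy-py3 \
    tritonserver --model-repository=/models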

Using Navigator Package

The Navigator Package is an artifact that can be produced at the end of the optimization process. The package is a simple zip file that contains the optimization details, model metadata, and serialized model formats. It can be saved using:

nav.package.save(
    package=package,
    path="/path/to/package.nav"
)

The package can be easily loaded on other machines and used to re-run the optimization process or profile the model. Read more about using the package in the documentation.
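
A minimal sketch of re-running the optimization from a saved package on different hardware, assuming the nav.package.load and nav.package.optimize APIs described in the documentation:

import model_navigator as nav

# Load a previously saved package and re-run conversions and profiling
# on the target machine to regenerate hardware-specific formats (e.g. TensorRT plans).
package = nav.package.load(path="/path/to/package.nav")
nav.package.optimize(package)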

Examples

We provide step-by-step examples that demonstrate how to use various features of Model Navigator. For the sake of readability and accessibility, we use a simple torch.nn.Linear model in these examples, which illustrate how to optimize, test, and deploy the model on PyTriton and the Triton Inference Server.

