paddlepaddle / serving

A flexible, high-performance carrier for machine learning models (the PaddlePaddle serving deployment framework)

License: Apache License 2.0

CMake 5.98% C++ 51.67% Shell 7.98% Python 26.93% Go 4.37% Roff 0.13% PHP 0.16% Java 2.40% Cuda 0.38%
paddle-serving rpc-service gpu python docker serving pipeline paddle deep-learning online-service

serving's Introduction

(Simplified Chinese | English)




[Update Notice] In our new open-source project FastDeploy, we have integrated the FastDeploy Runtime (including Paddle Inference, ONNX Runtime, TensorRT, and OpenVINO) on top of Triton Inference Server to support high-performance serving of PaddlePaddle models. Developers who need serving deployment can refer to the documentation below; if you run into any problems, please report them via an issue in the FastDeploy project.

Paddle Serving, built on the PaddlePaddle deep learning framework, aims to provide deep learning developers and enterprises with high-performance, flexible, and easy-to-use industrial-grade online inference services. Paddle Serving supports multiple protocols such as RESTful, gRPC, and bRPC, offers inference solutions for a variety of heterogeneous hardware and operating systems, and ships with many classic pre-trained model examples. Its core features are:

  • Integrates the high-performance server-side inference engine Paddle Inference and the on-device engine Paddle Lite; models from other machine learning platforms (Caffe/TensorFlow/ONNX/PyTorch) can be converted with the x2paddle tool
  • Provides two frameworks: the high-performance C++ Serving and the easy-to-use Python Pipeline. C++ Serving is built on the high-performance bRPC network framework and delivers high-throughput, low-latency inference services that lead comparable products in performance. Python Pipeline is built on the gRPC/gRPC-Gateway network framework and the Python language to offer an easy-to-use, high-throughput inference service framework. For guidance on choosing between them, see the Technical Selection document
  • Supports multiple protocols such as HTTP, gRPC, and bRPC; provides SDKs for C++, Python, and Java
  • Designs and implements an asynchronous, high-performance inference pipeline framework based on a directed acyclic graph (DAG), with features such as multi-model composition, asynchronous scheduling, concurrent inference, dynamic batching, multi-card multi-stream inference, and request caching
  • Supports a wide range of hardware: x86 (Intel) CPU, Arm CPU, NVIDIA GPU, Kunlun XPU, Huawei Ascend 310/910, Hygon DCU, NVIDIA Jetson, and more
  • Integrates acceleration libraries such as Intel MKL-DNN and NVIDIA TensorRT, as well as low-precision quantized inference
  • Provides a model security deployment solution, including encrypted model deployment, authentication, and an HTTPS security gateway, already applied in production projects
  • Supports cloud deployment, with an example of deploying Paddle Serving on a Baidu AI Cloud Kubernetes cluster
  • Provides rich deployment examples for classic models from suites such as PaddleOCR, PaddleClas, PaddleDetection, PaddleSeg, PaddleNLP, and PaddleRec, covering more than 40 high-quality pre-trained models
  • Supports distributed deployment of large-scale sparse-parameter index models, with multi-table, multi-shard, multi-replica, and local high-frequency cache features; can be deployed on a single machine or in the cloud
  • Supports service monitoring, with Prometheus-based performance statistics exposed through a metrics port

Tutorials and Examples

Papers

Documentation

Deployment

This section walks you through installation and deployment. We strongly recommend deploying Paddle Serving with Docker; if you do not use Docker, simply skip the Docker-related steps. On cloud servers, Paddle Serving can be deployed with Kubernetes. To compile or run Paddle Serving on heterogeneous hardware such as Arm CPU or Kunlun XPU, see the documents below. The latest development packages of the develop branch are built daily for developers.

Usage

After installing Paddle Serving, the Quick Start guide walks you through running Serving. The steps are as follows:

Step 1: call the model-saving API to generate the model parameter configuration files (.prototxt) used by both the client and the server;

Step 2: read the configuration and startup parameters, then start the service;

Step 3: write a client request with the SDK according to the API and your use case, and test the inference service; a sketch of all three steps follows below. To learn about more features and usage scenarios, please read the documents listed afterwards.
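Put together, the three steps above look roughly like the minimal sketch below (the uci_housing model, variable names, and port number are illustrative placeholders, and the exact API may differ slightly between Paddle Serving releases):

    # Step 1: export a trained Paddle program in Serving format. save_model writes
    # the server-side model plus the .prototxt configs for client and server.
    import paddle_serving_client.io as serving_io

    serving_io.save_model(
        "uci_housing_model",    # server-side model directory (illustrative name)
        "uci_housing_client",   # client-side config directory (illustrative name)
        {"x": x_var},           # feed variables taken from your training program
        {"price": price_var},   # fetch variables taken from your training program
        main_program)           # the trained Paddle program

    # Step 2: start the service, normally from the shell:
    #   python -m paddle_serving_server.serve --model uci_housing_model --port 9393

    # Step 3: send a request through the Python SDK and check the result.
    from paddle_serving_client import Client

    client = Client()
    client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
    client.connect(["127.0.0.1:9393"])
    fetch_map = client.predict(feed={"x": [0.0] * 13}, fetch=["price"])
    print(fetch_map)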

Developers

For Paddle Serving developers, we provide documentation on writing custom OPs and on variable-length data handling.
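As a rough illustration of what a custom OP looks like in the Python Pipeline framework (the Op/WebService classes and the preprocess/postprocess hooks follow the pattern used in the Pipeline examples, but their exact signatures vary between releases, so treat this as a hedged sketch rather than the canonical interface):

    # Hedged sketch of a custom pipeline OP; hook signatures may differ by release.
    from paddle_serving_server.web_service import WebService, Op

    class MyOp(Op):
        def preprocess(self, input_dicts, data_id, log_id):
            # Turn the upstream output (or the raw request) into a feed dict.
            (_, input_dict), = input_dicts.items()
            feed = {"x": input_dict["x"]}
            return feed, False, None, ""      # feed dict, skip flag, err code, err info

        def postprocess(self, input_dicts, fetch_dict, data_id, log_id):
            # Shape the model output before it is returned / passed downstream.
            return {"price": str(fetch_dict["price"])}, None, ""

    class MyService(WebService):
        def get_pipeline_response(self, read_op):
            # Build the DAG: request -> read_op -> MyOp -> response.
            return MyOp(name="my_op", input_ops=[read_op])

    # service = MyService(name="my_service")
    # service.prepare_pipeline_config("config.yml")
    # service.run_service()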

Model Zoo

Paddle Serving works closely with the Paddle model suites and provides a large number of serving deployment examples, including image classification, object detection, text recognition, Chinese part-of-speech tagging, sentiment analysis, content recommendation, and more, plus end-to-end Paddle pipeline projects, for a total of 46 models.

  Image Classification & Recognition: 14
  NLP: 6
  Recommendation: 3
  Face Recognition: 1
  Object Detection: 10
  Text Recognition (OCR): 8
  Image Segmentation: 2
  Keypoint Detection: 1
  Video Understanding: 1

For more model examples, see the Model Zoo.

Community

Want to talk with developers and other users? You are welcome to join the community through the channels below.

WeChat

  • WeChat users: please scan the QR code

QQ

  • PaddlePaddle inference & deployment discussion group (Group No.: 697765514)

Contributing

If you would like to contribute code to Paddle Serving, please refer to the Contribution Guidelines (English).

  • Thanks to @w5688414 for the NLP ERNIE indexing example
  • Thanks to @loveululu for the Cube Python API
  • Thanks to @EtachGu for updating the Docker usage commands
  • Thanks to @BeyondYourself for the gRPC tutorial, the FAQ updates, and tidying up the file layout
  • Thanks to @mcl-stone for the Faster R-CNN benchmark script
  • Thanks to @cg82616424 for the U-Net benchmark script and for fixing some comment errors
  • Thanks to @cuicheng01 for contributing 11 PaddleClas models
  • Thanks to @Jiaqi Liu for adding prediction support for list[str] inputs
  • Thanks to @Bin Lu for the PP-ShiTu C++ model example

Feedback

For any feedback or bug reports, please file a GitHub Issue.

License

Apache 2.0 License

serving's People

Contributors

badangel, barrierye, beyondyourself, bjjwwang, bohaowu, cg82616424, dyning, felixhjh, gentelyang, gongweibao, guru4elephant, hextostring, intsigstephon, jiangjiajun, ldoublev, liuchiachi, loveululu, mcl-stone, mrxlt, mycaster, paddlepm, shiningzhang, suoych, teslazhao, wangguibao, wangxicoding, wispedia, yangruihaha, zhangjun, zhangyulongg


serving's Issues

Errors when compiling the SDK-CPP part

Environment information
OS: Ubuntu 16.04.5 LTS
gcc/g++: 5.4.0
cmake: 3.5.1
python: 2.7.12

What happened
CMake runs without errors and configuration completes successfully. During make, the early stages build fine, but compiling the SDK-CPP part produces compilation errors. Below are the main error messages (almost all of them are syntax errors):
1 /usr/include/c++/5/backward/hash_fun.h:136:12: error: previous definition of ‘struct __gnu_cxx::hash’
struct hash
2 serving/build/third_party/install/brpc/include/butil/strings/string_piece.h:416:70: error: ‘bool butil::operator!=(const StringPiece16&, const StringPiece16&)’ must have an argument of class or enumerated type
inline bool operator!=(const StringPiece16& x, const StringPiece16& y) {
3 serving/build/third_party/install/brpc/include/butil/strings/string_piece.h:378:53: error: ‘string16’ was not declared in this scope
extern template class BUTIL_EXPORT BasicStringPiece;
4 serving/build/third_party/install/brpc/include/butil/strings/string16.h:180:25: error: ‘basic_string’ is not a class template
class BUTIL_EXPORT std::basic_string<butil::char16, butil::string16_char_traits>;

My thoughts
I suspect this is caused by my own environment, but CMake's automatic configuration reported no errors. Could you give some advice, or share the environment in which you built successfully (gcc version, OS version, etc.)?

git submodule core/general-client/pybind11 seems to be broken

I encountered an error when building PaddleServing in CLIENT_ONLY mode:

[root@my_host_name build_client]# cmake -DCLIENT_ONLY=ON ..
-- Found Paddle host system: centos, version: 6.10
-- Found Paddle host system's CPU: 12 cores
-- CXX compiler: /opt/rh/devtoolset-2/root/usr/bin/c++, version: GNU 4.8.2
-- C compiler: /opt/rh/devtoolset-2/root/usr/bin/cc, version: GNU 4.8.2
-- Do not have AVX2 intrinsics and disabled MKL-DNN
-- BOOST_TAR: boost_1_41_0, BOOST_URL: http://paddlepaddledeps.cdn.bcebos.com/boost_1_41_0.tar.gz
-- Protobuf protoc executable: /home/Serving/build_client/third_party/install/protobuf/bin/protoc
-- Protobuf-lite library: /home/Serving/build_client/third_party/install/protobuf/lib/libprotobuf-lite.a
-- Protobuf library: /home/Serving/build_client/third_party/install/protobuf/lib/libprotobuf.a
-- Protoc library: /home/Serving/build_client/third_party/install/protobuf/lib/libprotoc.a
-- Protobuf version: 3.1
-- ssl:/usr/lib64/libssl.so
-- crypto:/usr/lib64/libcrypto.so
paddle serving source dir: /home/Serving
CMake Error at core/general-client/CMakeLists.txt:2 (add_subdirectory):
  The source directory

    /home/Serving/core/general-client/pybind11

  does not contain a CMakeLists.txt file.


CMake Error at core/general-client/CMakeLists.txt:3 (pybind11_add_module):
  Unknown CMake command "pybind11_add_module".

The error message says that the /home/Serving/core/general-client/pybind11 directory does not contain a CMakeLists.txt file, and it turns out to be an empty directory. pybind11 was changed to a git submodule in commit 546f164, but the submodule is not configured properly, which makes the git submodule update command fail.

The folder icon of the pybind11 submodule on GitHub is also grey, which indicates that the submodule points to an unreachable location.

publish imdb performance benchmark

Please publish a prediction benchmark covering both the predict() and batch_predict() interfaces.
For the IMDB task, the model should run on CPU and the model inference time should be profiled.

#215
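For reference, the requested numbers could come from something as simple as the hedged sketch below (predict() and batch_predict() are the interfaces named above; the IMDB feed/fetch names, paths, and the load_imdb_samples helper are illustrative placeholders):

    # Hedged sketch of a CPU latency comparison between predict() and batch_predict().
    import time
    from paddle_serving_client import Client

    client = Client()
    client.load_client_config("imdb_client_conf/serving_client_conf.prototxt")
    client.connect(["127.0.0.1:9292"])

    samples = load_imdb_samples()      # placeholder: your tokenized IMDB test set

    start = time.time()
    for words in samples:
        client.predict(feed={"words": words}, fetch=["prediction"])
    single = (time.time() - start) / len(samples)

    start = time.time()
    client.batch_predict(feed_batch=[{"words": w} for w in samples], fetch=["prediction"])
    batched = (time.time() - start) / len(samples)

    print("per-sample latency: predict()=%.4fs  batch_predict()=%.4fs" % (single, batched))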

Cube and Serving configuration is overly complex

Usability feedback:

  1. When starting Cube and Serving, the configuration files are complex and hard to get working together.
  2. There is no demo model (sparse + dense) that is loaded automatically on a local machine and can serve predictions out of the box.

Serving binaries should be downloaded automatically

Split the server into two groups of whl packages for the CPU and GPU versions, and have each package download the matching binary, as in the sketch below.
Expected variants: cpu+mkl+avx; cpu+openblas+avx; cpu+openblas+sse; gpu+mkl+avx
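One way the right binary could be chosen automatically is by probing the CPU flags at install or start time; the sketch below only illustrates the selection logic (the variant names mirror the list above, and nothing here describes the actual packaging mechanism):

    # Hedged sketch: pick a serving binary variant from /proc/cpuinfo flags.
    def cpu_flags():
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
        return set()

    def pick_variant(use_gpu=False, prefer_mkl=True):
        flags = cpu_flags()
        if use_gpu:
            return "gpu+mkl+avx"
        if "avx" in flags:
            return "cpu+mkl+avx" if prefer_mkl else "cpu+openblas+avx"
        return "cpu+openblas+sse"

    print("would download serving binary variant:", pick_variant())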

Should have a default gpu id?

File "test_gpu_server.py", line 26, in
server.run_server()
File "/home/users/dongdaxiang/paddle_whls/custom_op/paddle_release_home/python/lib/python2.7/site-packages/paddle_serving_server_gpu/init.py", line 264, in run_server
self.gpuid,
AttributeError: 'Server' object has no attribute 'gpuid'
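A hedged sketch of the kind of default the report asks for (the gpuid attribute name comes from the traceback above; the actual fix in the codebase may look different):

    # Hedged sketch: give Server a default gpu id so run_server() no longer
    # raises AttributeError when set_gpuid() was never called.
    class Server(object):
        def __init__(self):
            self.gpuid = 0                     # default to GPU 0

        def set_gpuid(self, gpuid=0):
            self.gpuid = gpuid

        def run_server(self):
            gpuid = getattr(self, "gpuid", 0)  # extra fallback for older instances
            print("launching serving worker on gpu %d" % gpuid)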

pip installable

Support installing paddle-serving-client via pip.
Support installing the CPU/GPU versions of paddle-serving-server, including different lib versions.

Refine General Infer Op

Currently the output of the general infer op is sent out directly through the response, which makes it hard to do further computation on top of the inference result (for example, the pooling operation in the ERNIE model). The output of the general infer op should be placed in the op's output instead. In addition, add a general_response_op that puts its input into the response and is responsible for answering the RPC.

Build with gcc 5.4.0 error

System: Ubuntu 16.04.6 LTS (Xenial Xerus)
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.11)

Error message:

In file included from /usr/include/c++/5/bits/ios_base.h:41:0,
                 from /usr/include/c++/5/ios:42,
                 from /usr/include/c++/5/ostream:38,
                 from /usr/include/c++/5/iterator:64,
                 from /home/tangzhiyi01/ctr/Serving/build/third_party/boost/src/extern_boost/boost/detail/iterator.hpp:54,
                 from /home/tangzhiyi01/ctr/Serving/build/third_party/boost/src/extern_boost/boost/iterator/iterator_categories.hpp:10,
                 from /home/tangzhiyi01/ctr/Serving/build/third_party/boost/src/extern_boost/boost/unordered/detail/table.hpp:14,
                 from /home/tangzhiyi01/ctr/Serving/build/third_party/boost/src/extern_boost/boost/unordered/detail/equivalent.hpp:10,
                 from /home/tangzhiyi01/ctr/Serving/build/third_party/boost/src/extern_boost/boost/unordered/unordered_map.hpp:19,
                 from /home/tangzhiyi01/ctr/Serving/build/third_party/boost/src/extern_boost/boost/unordered_map.hpp:16,
                 from /home/tangzhiyi01/ctr/Serving/sdk-cpp/include/common.h:26,
                 from /home/tangzhiyi01/ctr/Serving/sdk-cpp/include/config_manager.h:18,
                 from /home/tangzhiyi01/ctr/Serving/sdk-cpp/src/config_manager.cpp:15:
/usr/include/c++/5/bits/locale_classes.h:284:24: error: reference to ‘basic_string’ is ambiguous
       operator()(const basic_string<_Char, _Traits, _Alloc>& __s1,
                        ^
In file included from /home/tangzhiyi01/ctr/Serving/build/third_party/boost/src/extern_boost/boost/functional/hash/extensions.hpp:17:0,
                 from /home/tangzhiyi01/ctr/Serving/build/third_party/boost/src/extern_boost/boost/functional/hash/hash.hpp:477,
                 from /home/tangzhiyi01/ctr/Serving/build/third_party/boost/src/extern_boost/boost/functional/hash.hpp:6,
                 from /home/tangzhiyi01/ctr/Serving/build/third_party/boost/src/extern_boost/boost/unordered/unordered_map.hpp:17,
                 from /home/tangzhiyi01/ctr/Serving/build/third_party/boost/src/extern_boost/boost/unordered_map.hpp:16,
                 from /home/tangzhiyi01/ctr/Serving/sdk-cpp/include/common.h:26,
                 from /home/tangzhiyi01/ctr/Serving/sdk-cpp/include/config_manager.h:18,
                 from /home/tangzhiyi01/ctr/Serving/sdk-cpp/src/config_manager.cpp:15:
/home/tangzhiyi01/ctr/Serving/build/third_party/boost/src/extern_boost/boost/detail/container_fwd.hpp:61:65: note: candidates are: template<class charT, class traits, class Allocator> class std::basic_string
     template <class charT, class traits, class Allocator> class basic_string;
                                                                 ^
In file included from /usr/include/c++/5/string:39:0,
                 from /usr/include/c++/5/stdexcept:39,
                 from /usr/include/c++/5/array:38,
                 from /usr/include/c++/5/tuple:39,
                 from /usr/include/c++/5/bits/stl_map.h:63,
                 from /usr/include/c++/5/map:61,
                 from /home/tangzhiyi01/ctr/Serving/sdk-cpp/include/config_manager.h:16,
                 from /home/tangzhiyi01/ctr/Serving/sdk-cpp/src/config_manager.cpp:15:
/usr/include/c++/5/bits/stringfwd.h:71:11: note:                 template<class _CharT, class _Traits, class _Alloc> class std::__cxx11::basic_string
     class basic_string;
           ^
In file included from /usr/include/c++/5/bits/ios_base.h:41:0,
                 from /usr/include/c++/5/ios:42,
                 from /usr/include/c++/5/ostream:38,
                 from /usr/include/c++/5/iterator:64,
                 from /home/tangzhiyi01/ctr/Serving/build/third_party/boost/src/extern_boost/boost/detail/iterator.hpp:54,
                 from /home/tangzhiyi01/ctr/Serving/build/third_party/boost/src/extern_boost/boost/iterator/iterator_categories.hpp:10,
                 from /home/tangzhiyi01/ctr/Serving/build/third_party/boost/src/extern_boost/boost/unordered/detail/table.hpp:14,
                 from /home/tangzhiyi01/ctr/Serving/build/third_party/boost/src/extern_boost/boost/unordered/detail/equivalent.hpp:10,
                 from /home/tangzhiyi01/ctr/Serving/build/third_party/boost/src/extern_boost/boost/unordered/unordered_map.hpp:19,
                 from /home/tangzhiyi01/ctr/Serving/build/third_party/boost/src/extern_boost/boost/unordered_map.hpp:16,
                 from /home/tangzhiyi01/ctr/Serving/sdk-cpp/include/common.h:26,
                 from /home/tangzhiyi01/ctr/Serving/sdk-cpp/include/config_manager.h:18,
                 from /home/tangzhiyi01/ctr/Serving/sdk-cpp/src/config_manager.cpp:15:
/usr/include/c++/5/bits/locale_classes.h:284:36: error: expected ‘,’ or ‘...’ before ‘<’ token
       operator()(const basic_string<_Char, _Traits, _Alloc>& __s1,

add multi thread prediction wrapper

Currently, the client does not support multi-threaded prediction, but users can do multi-process prediction based on Python subprocesses. We need to wrap this as a simple function.
#156
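A hedged sketch of what such a wrapper could look like, using the standard multiprocessing module rather than raw subprocesses (the Client calls follow the usual paddle_serving_client pattern; everything else is illustrative):

    # Hedged sketch: hide multi-process prediction behind one helper function.
    from multiprocessing import Pool

    def _predict_worker(args):
        conf, endpoints, feed, fetch = args
        from paddle_serving_client import Client   # construct the client inside the worker
        client = Client()
        client.load_client_config(conf)
        client.connect(endpoints)
        return client.predict(feed=feed, fetch=fetch)

    def multi_process_predict(conf, endpoints, feed_list, fetch, nproc=4):
        """Run one predict() per feed dict, spread over nproc worker processes."""
        jobs = [(conf, endpoints, feed, fetch) for feed in feed_list]
        pool = Pool(processes=nproc)
        try:
            return pool.map(_predict_worker, jobs)
        finally:
            pool.close()
            pool.join()

    # results = multi_process_predict("serving_client_conf.prototxt",
    #                                 ["127.0.0.1:9393"], feeds, ["price"])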

cube-agent reads disk paths incorrectly

The init function in cube/cube-agent/src/agent/define.go reads the disk paths, but some device names are so long that df wraps their lines. The output obtained with df -h | grep -E '/home|/ssd' is then incomplete, the array produced by splitting the string is one element short, and indexing into it to get the path causes an out-of-bounds access.
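For what it is worth, the wrapping problem goes away if df is asked for POSIX-format output; the Python sketch below only illustrates the idea (the real code in define.go is Go):

    # Hedged sketch: `df -P` guarantees one record per line even for long device
    # names, so fields never wrap onto a second line and indexing stays in bounds.
    import re
    import subprocess

    def data_disks():
        out = subprocess.check_output(["df", "-P"]).decode("utf-8", "replace")
        disks = []
        for line in out.splitlines()[1:]:       # skip the header row
            fields = line.split()
            if len(fields) < 6:                 # defensive: never index past the row
                continue
            mount_point = fields[5]
            if re.match(r"^/(home|ssd)", mount_point):
                disks.append(mount_point)
        return disks

    print(data_disks())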

kvdb test cases coredump

Running output/bin/db_func or output/bin/db_thread will occasionally core dump:

[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from KVDBTest
[ RUN      ] KVDBTest.AbstractKVDB_Func_Test
[       OK ] KVDBTest.AbstractKVDB_Func_Test (16 ms)
[----------] 1 test from KVDBTest (16 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (16 ms total)
[  PASSED  ] 1 test.
pure virtual method called
terminate called without an active exception
Aborted (core dumped)

failed fetch predictor

E0210 06:46:00.630698 101654 predictor_sdk.h:52] Failed fetch predictor:, ep_name: general_model

Unify Data Transfer Between Server Op

Currently, the data structures are designed manually, which is not very extensible. We want to make ops configurable through the Python API; the data structure passed between ops should be unified, and the dependencies of an op should be obtained from resources rather than hard-coded.
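Purely as an illustration from the Python side, configurable ops with declared dependencies could look something like the sketch below (the names and structure are invented for this example and do not describe the eventual design):

    # Hedged sketch: declare ops and their dependencies as data, then derive the
    # execution order from that declaration instead of hard-coding the chain.
    OP_GRAPH = {
        "general_reader":   {"type": "GeneralReaderOp",   "inputs": []},
        "general_infer":    {"type": "GeneralInferOp",    "inputs": ["general_reader"]},
        "general_response": {"type": "GeneralResponseOp", "inputs": ["general_infer"]},
    }

    def launch_order(graph):
        """Simple topological sort over the declared op dependencies."""
        done, order = set(), []
        def visit(name):
            if name in done:
                return
            for dep in graph[name]["inputs"]:
                visit(dep)
            done.add(name)
            order.append(name)
        for name in graph:
            visit(name)
        return order

    print(launch_order(OP_GRAPH))   # ['general_reader', 'general_infer', 'general_response']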
