Zoltar is a common library for serving TensorFlow, XGBoost and scikit-learn models in production. See Zoltar docs for details.
Copyright 2018 Spotify AB.
Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0
Home Page: https://spotify.github.io/zoltar/
This should not be hardcoded here:
https://github.com/spotify/zoltar/blob/master/examples/apollo-service-example/src/main/java/com/spotify/zoltar/examples/apollo/IrisPrediction.java#L64-L67
It might come from the feature spec, or whatever makes sense.
Right now the TF predict library returns a List of Tensors in the order of the fetch operations. This is error-prone; we should provide a helper that, given a list of fetch ops, returns a map (?) of op name -> Tensor.
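A minimal sketch of such a helper (the `FetchOps`/`namedFetch` names are assumptions, not existing Zoltar API): it zips the fetch-op names with the outputs, which `Session.Runner#run()` returns in fetch order, so callers look results up by name instead of by position.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: zip fetch-op names with the tensors returned by the
// TF runner, which yields outputs in fetch order. The generic parameter
// stands in for org.tensorflow.Tensor to keep the sketch self-contained.
final class FetchOps {
  static <T> Map<String, T> namedFetch(List<String> fetchOps, List<T> outputs) {
    if (fetchOps.size() != outputs.size()) {
      throw new IllegalArgumentException(
          "Expected " + fetchOps.size() + " outputs but got " + outputs.size());
    }
    // LinkedHashMap preserves the original fetch order for callers that care.
    final Map<String, T> byName = new LinkedHashMap<>();
    for (int i = 0; i < fetchOps.size(); i++) {
      byName.put(fetchOps.get(i), outputs.get(i));
    }
    return byName;
  }
}
```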
IrisExtras is needed because during release a javadoc.jar is required for the sub-module zoltar-tests (Sonatype requires this jar). The only way I got it to generate was by adding this dummy class.
1.5.0 did not support Ubuntu 14.04, which is our default; this was supposed to be solved in 1.6.0 - see if we can upgrade.
Apply consequences of spotify/scio#1238:
This is related to the monitoring epic. We want to capture basic service runtime metrics (we should get this part for free from Apollo - needs validation?) and prediction-specific metrics (see the list of metrics in the doc below). The MVP would be a dashboard with basic metrics in alien.
Monitoring RFC: https://docs.google.com/document/d/1gc1vLcwml06S-Xh3sKZK8vh_tiApxz7DxpCdpGObZwY/edit
testing github integration
Sparse vectors are commonly used to extract features, and we should make sure they work seamlessly. Let's at least add tests for both models (and maybe an example).
We're going to implement better documentation by mid-April. Let's use this issue to keep track of topics that should be covered.
In zoltar-tests we reference ml-sketchbook:zoltar.iris. We should move this dataset to the public data-integration-test project.
cc: @regadas
Right now constructing the prediction objects is convoluted; we should provide helper methods to hide this complexity for the most common use cases.
Apparently this causes an exception to be thrown:
java.lang.IllegalArgumentException: Expected scheme-specific part at index 3: gs:
- Vector
- Prediction
- TensorflowModel.Options
Right now the example seems a bit messy. I think we should improve:
Right now there is a multitude of conflicts between dependencies in the common cases of mixing scio, featran, zoltar and apollo. For most of our sbt projects, where we don't run the enforcer, it's less of a "problem" (or should I say we are just postponing the problem); in most Maven projects we do enforce it, though. It might be worth providing a BOM for zoltar so that people can include it in their dependency management and get zoltar with fixed dependencies for combined zoltar+scio+featran+apollo.
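Consuming such a BOM in a downstream pom.xml might look like this (the `zoltar-bom` artifact name is an assumption - no such artifact exists yet):

```xml
<!-- Hypothetical: import a zoltar BOM so transitive scio/featran/apollo
     versions are pinned consistently. "zoltar-bom" is a proposed name. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.spotify</groupId>
      <artifactId>zoltar-bom</artifactId>
      <version>${zoltar.version}</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>
```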
Research how, and whether, we can/want to do hot-swap of models to upgrade/downgrade models without service downtime. There are obviously a lot of corner cases and things to consider with this functionality - should we just keep it simple and delegate this problem to standard backend rollouts? How important is it to have hot-swaps?
Skipton has some of that functionality.
Related: #15
We should add some convenience methods, maybe in TensorFlowExtras, to feed Featran FloatSparseArray and DoubleSparseArray into TensorFlow.
The feeding code might look like this:
runner
.feed("input/raw_indices", Tensors.create(new long[]{0, 5, 9}))
.feed("input/raw_shape", Tensors.create(new long[]{10}))
.feed("input/raw_data", Tensors.create(new double[]{1.0, 2.0, 3.0}));
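A rough sketch of what such a helper could compute from a sparse vector before the feed calls (the `SparseData` holder and `fromSparse` name are hypothetical; Featran's actual accessor names may differ):

```java
// Hypothetical sketch: derive the three arrays that a TF sparse placeholder
// set (indices / shape / data) expects from a sparse vector's indices and
// values. This is not existing Zoltar or Featran API.
final class SparseData {
  final long[] indices;
  final long[] shape;
  final double[] data;

  SparseData(long[] indices, long[] shape, double[] data) {
    this.indices = indices;
    this.shape = shape;
    this.data = data;
  }

  static SparseData fromSparse(int[] indices, double[] values, int size) {
    final long[] idx = new long[indices.length];
    for (int i = 0; i < indices.length; i++) {
      idx[i] = indices[i];
    }
    // A 1-D sparse vector of length `size`.
    return new SparseData(idx, new long[] {size}, values.clone());
  }
}
```

Each array would then be wrapped with Tensors.create(...) and fed as in the snippet above.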
We should:
e.g. given:
/model
  /model.pb
copyDir('model/', '/tmp/dir', true)
you get /tmp/dir/.pb, since FileSystemExtras matches on the last match of the path.
The exception should be encapsulated in the CompletionStage.
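In other words, instead of throwing synchronously, the call should return a failed stage. A generic sketch (not Zoltar's actual code path; Java 8 compatible, so it avoids the Java 9+ `CompletableFuture.failedFuture`):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

// Sketch: wrap synchronous failures into the returned CompletionStage so
// callers can handle them with exceptionally()/handle() instead of try/catch.
final class SafeStages {
  static <T> CompletionStage<T> wrap(Callable<T> call) {
    try {
      return CompletableFuture.completedFuture(call.call());
    } catch (Exception e) {
      final CompletableFuture<T> failed = new CompletableFuture<>();
      failed.completeExceptionally(e);
      return failed;
    }
  }
}
```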
Add a stage/API to allow model setup/bootstrap before we run prediction.
This might be useful, for example, in case there is some data embedded in the graph that we might want to retrieve.
Support for ONNX format https://github.com/onnx/onnx ?
Add support for startup checks; the flow would be:
This would be useful in canary environments to fail on "startup" - in case there is some skew between expected features and model(s).
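A minimal sketch of such a check (the names are hypothetical): run a canned prediction against known input before the service reports healthy, and fail fast if no result comes back.

```java
import java.util.function.Function;

// Hypothetical startup check: run one known-input prediction at startup so
// that skew between expected features and the model surfaces in canary
// instead of on live traffic.
final class StartupCheck {
  static <I, O> O assertPredicts(Function<I, O> predictor, I knownInput) {
    final O out = predictor.apply(knownInput);
    if (out == null) {
      throw new IllegalStateException("Startup check failed: null prediction");
    }
    return out;
  }
}
```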
rav:
Get some feedback on the API for prediction. I tried to use it myself on a completely new example and quite quickly got confused; either we need an easier API or some higher-level API on top of the current building blocks.
We can start with a small set of metrics:
Right now TF prediction is rather cumbersome and error-prone; it includes:
Research if we could provide an easy path from https://github.com/spotify/spotify-tensorflow to package a model/graph together with all the necessary metadata, so that users don't need to worry about low-level TensorFlow constructs like operations, shapes and tensors in Zoltar. This approach could cover 80% of use cases; we should still allow for completely custom prediction.
What and how do we want to expose metadata about a model and feature spec/settings:
Requires talking with the squads/cream and figuring out what we can do here. It might be that we leave this open for some time, until we have a better set of requirements.
Right now, if there are assets in a TF saved model on GCS, they will be downloaded locally to a tmp location - we should make it easy to retrieve assets from a TF model.
Probably requires spotify/featran#86
Consider a single-element ExtractFn interface; the high-level API should still allow for a list/varargs (and maybe a single element). Consider - because we need to see how this will work with Featran.
We want to:
Users should be able to, for a given input, specify multiple Predictors and then map the input over the Predictors to get predictions.
Something like:
val input = List(...)
val predictors = List(Predictor.create(XGBoostModel, ...), Predictor.create(TFModel, ...))
predictors.map(_.predict(input))
We see some internal traction around RankLib.
Since it's a full JVM lib, it seems like low-hanging fruit to add support for it.
We plan to add more helpers to this module, like Predictors. We need a broader name for this module, and zoltar-api is the proposed one.
romain:
Disclaimer: I'm aware the feature below sounds like scope creep, but I think it's important to think about whether it should be part of the lib MVP or not.
The scope of the lib is intentionally quite limited, for very good reasons, so I think it's good to exclude things like versioning.
Yet, today most teams do very simple versioning based on date partitions. For example, the model (& features) is trained daily, and the date in the bucket path is used to resolve the latest model when the service starts, without having to do a config modification.
It seems like it could be fairly simple to offer some util code to resolve the latest model from a path (gs://my_bucket/path_to_my_model/%Y%m%d/my_model.bin). I believe this feature would be really useful, since most teams would want to do this.
Thoughts?
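A hedged sketch of such a util, assuming %Y%m%d date-partitioned paths as above. Listing the bucket is left to the caller, since the filesystem/GCS client in use varies; the class and method names are made up for illustration:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical sketch: pick the latest %Y%m%d partition from a list of
// partition directory names already listed from the model path.
final class LatestModel {
  private static final Pattern DATE_PARTITION = Pattern.compile("\\d{8}");

  static Optional<String> latestPartition(List<String> partitions) {
    return partitions.stream()
        .filter(p -> DATE_PARTITION.matcher(p).matches())
        // %Y%m%d strings sort lexicographically in date order.
        .max(Comparator.naturalOrder());
  }
}
```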
Add an ML Engine model to abstract away serving via ML Engine. Users would push models to ML Engine and still use Zoltar to get predictions; Zoltar would make a call to ML Engine and retrieve the prediction.
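For reference, ML Engine exposes online prediction over REST (the v1 `projects.predict` endpoint); a sketch of building the predict URL, with the HTTP call and auth omitted (the helper class itself is hypothetical):

```java
// Hypothetical sketch: build the ML Engine v1 online-prediction endpoint for
// a given project/model (and optional version). Auth and the actual HTTP
// request are out of scope here.
final class MlEngineUrls {
  static String predictUrl(String project, String model, String version) {
    final String base =
        "https://ml.googleapis.com/v1/projects/" + project + "/models/" + model;
    return (version == null ? base : base + "/versions/" + version) + ":predict";
  }
}
```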