This project is a Machine Learning Model Server based on containers. MLBuffet consists of several modules described in the table below:
| Module | Description |
|---|---|
| Deploy | Contains the files needed to deploy the project with docker-compose, Docker Swarm or Kubernetes. |
| Inferrer | REST API that receives HTTP requests and balances the workload between Modelhosts. |
| Modelhost | Workers for model deployment, inference and model management. There should be multiple instances of this module, each aware of all the stored models. |
| Metrics | Gathers and manages performance metrics from the host system and services. |
| Storage | Performs model version control. |
If you are new to MLBuffet, please take a look at the MLBuffet in a Nutshell document.
The Inferrer and Modelhost modules expose REST APIs built on Flask for intercommunication. The Inferrer handles user requests to the available models (e.g., uploading a model or asking for a prediction) and forwards them as jobs to one of the Modelhosts, which perform them asynchronously in the background.
When a prediction is requested, the Modelhost passes the HTTP request input to the ONNX session running in the background, and the answer is sent back to the user through the Inferrer.
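The flow above can be sketched in plain Python. This is only an illustration, not MLBuffet's actual code: `DummySession` stands in for the ONNX Runtime session, and `handle_prediction` mimics how the JSON body is validated and the answer wrapped in the response envelope shown throughout this document.

```python
import json

class DummySession:
    """Stand-in for an onnxruntime.InferenceSession (hypothetical)."""
    def run(self, values):
        # A real session would run the ONNX graph on `values`;
        # here we just return a fixed class index.
        return [0]

def handle_prediction(raw_body, session):
    """Turn an HTTP JSON body into the response envelope used by MLBuffet."""
    try:
        values = json.loads(raw_body)["values"]
    except (json.JSONDecodeError, KeyError):
        return {"http_status": {"code": 400, "name": "Bad Request",
                                "description": "Expected a JSON body with a 'values' field"}}
    prediction = session.run(values)
    return {"http_status": {"code": 200, "name": "OK",
                            "description": "Prediction successful"},
            "values": prediction}

response = handle_prediction('{"values": [2, 5, 1, 4]}', DummySession())
```

The real Modelhost performs this work asynchronously as a background job dispatched by the Inferrer.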
TCP ports 80, 8001, 8002 and 9091 must be available before deploying with docker-compose or Docker Swarm. For Kubernetes deployments, check the YAML files to configure port forwarding and service exposure.
Images must be built from source with the provided build.sh script.
For orchestrated deployments (Docker Swarm or Kubernetes), all images must be available for every node on the cluster, otherwise nodes will not be able to deploy containers.
For Docker Swarm deployments, a deploy.sh script is provided that automatically deploys and configures your cluster. There are two restrictions:
- The stack name must be set to the reserved name used for container intracommunication.
- The mlbuffet_overlay network is used by the nodes to communicate. The preferred subnet is 10.0.13.0/24.
Reported issue: sometimes after installing docker-compose, the tool is unable to access the Docker socket due to permission issues. To solve this, add your user to the docker group with `sudo usermod -aG docker $USER`. Then give docker-compose permission to access the docker.socket file by running `sudo chmod 666 /var/run/docker.sock`.
For Kubernetes deployments, a deploy.sh script is also provided; however, all the config files are included so that users can deploy with a custom configuration if desired.
To change the image name of the corresponding service, please modify the `deploy/kubernetes/autodeploy/kustomization.yaml` file, replacing each image name with its correct name in the `newName` parameter.

For instance, the default image entry of the Modelhost is:

```yaml
# Modelhost image name
- name: IMAGE_MLBUFFET_MODELHOST
  newName: IMAGE_MLBUFFET_MODELHOST
  newTag: latest
```

If your Modelhost image name is `docker.io/library/modelhost:latest`, the `kustomization.yaml` file should instead read:

```yaml
# Modelhost image name
- name: IMAGE_MLBUFFET_MODELHOST
  newName: docker.io/library/modelhost
  newTag: latest
```
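Since the change is a single-line substitution, it can also be scripted. A hedged sketch in Python (the placeholder and image name follow the example above; the in-place write is left commented so nothing is modified by accident):

```python
from pathlib import Path

# Assumed location, taken from the example above; adjust to your checkout.
KUSTOMIZATION = Path("deploy/kubernetes/autodeploy/kustomization.yaml")

def set_image(text, placeholder, image):
    """Replace a kustomize newName placeholder with the real image name."""
    return text.replace(f"newName: {placeholder}", f"newName: {image}")

default_entry = (
    "- name: IMAGE_MLBUFFET_MODELHOST\n"
    "  newName: IMAGE_MLBUFFET_MODELHOST\n"
    "  newTag: latest\n"
)
patched = set_image(default_entry, "IMAGE_MLBUFFET_MODELHOST",
                    "docker.io/library/modelhost")
# To apply in place:
# KUSTOMIZATION.write_text(set_image(KUSTOMIZATION.read_text(),
#                                    "IMAGE_MLBUFFET_MODELHOST",
#                                    "docker.io/library/modelhost"))
```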
For an easier deployment, a Helm Chart is provided. To install it, deploy the chart under a release name:

```shell
helm install my-release deploy/kubernetes/mlbuffet-chart/
```

Configure the Chart

To configure the Chart, edit the values in `deploy/kubernetes/mlbuffet-chart/Values.yaml`, or set them via the `--set` flag during installation.
The module the user communicates with via HTTP requests is the Inferrer. To test the API in Docker Swarm or standalone deployments, use:

```shell
curl http://<INFERRER-IP>:8000/
```

In Kubernetes, use `kubectl get endpoints inferrer` to check where the API is hosted. The welcome message should be displayed:
```json
{
  "http_status": {
    "code": 200,
    "description": "Greetings from MLBuffet - Inferrer, the Machine Learning model server. For more information, visit /help",
    "name": "OK"
  }
}
```
Or you can try asking for some help:

```shell
curl http://<INFERRER-IP>:8000/help
```
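The welcome check can also be scripted. A minimal sketch in Python (not part of MLBuffet; the helper names are made up, and the sample body is copied from the welcome message above):

```python
import json

def welcome_url(inferrer_ip):
    """Build the URL of the Inferrer welcome endpoint."""
    return f"http://{inferrer_ip}:8000/"

def is_ok(response_body):
    """Inspect the http_status envelope carried by MLBuffet responses."""
    return json.loads(response_body)["http_status"]["code"] == 200

# Sample body copied from the welcome message above.
sample = '{"http_status": {"code": 200, "name": "OK", "description": "Greetings from MLBuffet"}}'
ok = is_ok(sample)

# Live check (requires a running deployment):
# import urllib.request
# with urllib.request.urlopen(welcome_url("<INFERRER-IP>")) as resp:
#     print(is_ok(resp.read().decode()))
```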
Some pre-trained models are already uploaded and can be updated manually through the `probe_models/` directory. However, MLBuffet also supports uploading new models. All Modelhost servers can access the model directory, so they share a common pool of models.
Every model must be associated with a tag, which always appears in the request path. For instance, several versions of the Iris model such as `irisv1.onnx`, `irisv2.onnx` or `iris_final_version.onnx` are all versions of the same model, tagged as `iris_model`.

You can upload new versions of a tagged model and they will be stored in the Storage module of MLBuffet; only the default version of each tag is exposed at the path `modelhost/models/`.
In Kubernetes deployments, the volume for models is mounted at `/tmp`. To keep models from being ephemeral, change this directory in the `pv.yaml` file provided in the deploy directory.
Several methods for model handling can be used from the API:
Get the list and descriptions of available models

```shell
curl -X GET http://<INFERRER-IP>:8000/api/v1/models
```

```json
{
  "http_status": {
    "code": 200,
    "description": "OK",
    "name": "OK"
  },
  "model_list": [
    {
      "description": "",
      "model": "diabetes.onnx"
    },
    {
      "description": "Classification of Iris flower species into setosa, versicolor or virginica depending on sepal and petal measurements",
      "model": "iris.onnx"
    }
  ]
}
```
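A response like the one above is easy to post-process. For instance (a sketch; the field names follow the sample output, but the helper itself is not part of MLBuffet):

```python
import json

# Abridged sample modelled on the response above.
sample_response = """
{
  "http_status": {"code": 200, "description": "OK", "name": "OK"},
  "model_list": [
    {"description": "", "model": "diabetes.onnx"},
    {"description": "Iris species classifier", "model": "iris.onnx"}
  ]
}
"""

def model_names(body):
    """Pull the model file names out of the model_list field."""
    return [entry["model"] for entry in json.loads(body)["model_list"]]

names = model_names(sample_response)
```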
Get model tag information

You can read the relevant information of the models associated with a tag:

```shell
curl -X GET http://<INFERRER-IP>:8000/api/v1/models/iris_model/information
```

```json
{
  "http_status": {
    "code": 200,
    "description": "OK",
    "name": "OK"
  },
  "tag_list_versions": {
    "1": {
      "description": "Not description provided",
      "file": "iris.onnx",
      "folder": "files/iris_model/1",
      "time": "11:05:06 25/03/2022"
    },
    "2": {
      "description": "second version",
      "file": "iris_v2.onnx",
      "folder": "files/iris_model/2",
      "time": "11:21:23 25/03/2022"
    }
  }
}
```
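The `tag_list_versions` object is keyed by version number as a string, so picking the newest version is a small numeric max. A sketch based on the sample output above (illustrative only, not MLBuffet code):

```python
import json

# Abridged sample modelled on the tag information response above.
sample = """
{
  "tag_list_versions": {
    "1": {"file": "iris.onnx", "folder": "files/iris_model/1"},
    "2": {"file": "iris_v2.onnx", "folder": "files/iris_model/2"}
  }
}
"""

def latest_version(body):
    """Return the highest version key of a tag's version listing."""
    versions = json.loads(body)["tag_list_versions"]
    return max(versions, key=int)

newest = latest_version(sample)
```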
Update model information

A model may have incomplete information, wrong information or no information at all. You can update the description of a model using the PUT method:

```shell
curl -X PUT -H "Content-Type: application/json" --data '{"model_description":"This model classifies a 4 element array input between different species of Iris flowers."}' http://<INFERRER-IP>:8000/api/v1/models/iris_model/information
```
Upload a new model

You can upload your own models to MLBuffet using the POST method:

```shell
curl -X POST -F "path=@/path/to/local/model" http://<INFERRER-IP>:8000/api/v1/models/<tag>
```

You can also attach some information about the version changes:

```shell
curl -X POST -F "path=@/path/to/local/model" -F "model_description=version description of the file" http://<INFERRER-IP>:8000/api/v1/models/<tag>
```

Note: you will not be able to change this version description later. When a new version is uploaded, it becomes the default version of the tag and is the one read by the Modelhosts.
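The upload is an ordinary multipart form (`path` for the file, `model_description` for the note). If you prefer Python's standard library over curl, the multipart body has to be built by hand; a minimal sketch, with the sending step left commented since it needs a running deployment:

```python
import uuid

def build_multipart(file_name, file_bytes, description=None):
    """Build a multipart/form-data body using the form names shown above:
    'path' for the model file and, optionally, 'model_description'."""
    boundary = uuid.uuid4().hex
    parts = [
        (f'--{boundary}\r\n'
         f'Content-Disposition: form-data; name="path"; filename="{file_name}"\r\n'
         'Content-Type: application/octet-stream\r\n\r\n').encode()
        + file_bytes + b'\r\n',
    ]
    if description is not None:
        parts.append(
            (f'--{boundary}\r\n'
             'Content-Disposition: form-data; name="model_description"\r\n\r\n'
             f'{description}\r\n').encode()
        )
    parts.append(f'--{boundary}--\r\n'.encode())
    body = b''.join(parts)
    content_type = f'multipart/form-data; boundary={boundary}'
    return body, content_type

body, ctype = build_multipart('iris_v3.onnx', b'<model bytes>', 'third version')
# Send with:
# urllib.request.Request(url, data=body, method='POST',
#                        headers={'Content-Type': ctype})
```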
Download model

You can download any version of any model tag located in the Storage module. The version to fetch can be specified in three ways:

- `<tag>`
- `<tag>:default`
- `<tag>:<version>`

The first two forms download the file set as default; the last one downloads the specified version.

```shell
wget http://<INFERRER-IP>:8000/api/v1/models/<tag>/download --content-disposition
```

You can also download files through your browser with the above URL.
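The `<tag>[:<version>]` reference accepted by the download (and delete) endpoints splits cleanly on the colon. A sketch of the parsing rule as described above (an illustration, not MLBuffet's code):

```python
def parse_model_ref(ref):
    """Split a model reference into (tag, version).
    '<tag>' and '<tag>:default' both mean the default version."""
    tag, _, version = ref.partition(":")
    return tag, (version or "default")

parsed = parse_model_ref("iris_model:2")
```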
Delete a model

You can delete models you no longer need with the DELETE method. The version to remove from the Storage can be specified in three ways:

- `<tag>`
- `<tag>:default`
- `<tag>:<version>`

The first two forms remove the file set as default; the last one removes the specified version.

```shell
curl -X DELETE http://<INFERRER-IP>:8000/api/v1/models/<tag>
```
Set model as default

You can set any version stored in the Storage service as the default version:

```shell
curl -X POST -H "Content-Type: application/json" --data '{"default": <new default version>}' http://<INFERRER-IP>:8000/api/v1/models/<tag>/default
```
Get a prediction!

Once models have been correctly uploaded, the server is ready for inference. Requests must include an input in JSON format. This command sends an HTTP request to the server asking for a prediction from the pre-uploaded Iris model:

```shell
curl -X POST -H "Content-Type: application/json" --data '{"values":[2, 5, 1, 4]}' http://<INFERRER-IP>:8000/api/v1/models/iris_model/prediction
```

```json
{
  "http_status": {
    "code": 200,
    "description": "Prediction successful",
    "name": "OK"
  },
  "values": [
    1
  ]
}
```

The field `"values": [1]` is the prediction for the input flower. You are now ready to upload your own models and make predictions!
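The same request can be made from Python with just the standard library. A sketch (endpoint and payload follow the curl example above; the live call is commented out since it requires a running deployment, and the IP is a placeholder):

```python
import json
import urllib.request

def prediction_request(inferrer_ip, tag, values):
    """Build the POST request for the prediction endpoint."""
    url = f"http://{inferrer_ip}:8000/api/v1/models/{tag}/prediction"
    data = json.dumps({"values": values}).encode()
    return urllib.request.Request(url, data=data, method="POST",
                                  headers={"Content-Type": "application/json"})

req = prediction_request("10.0.13.2", "iris_model", [2, 5, 1, 4])
# With a running deployment:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["values"])
```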
You can also get predictions from more complex models. For now, the server only supports image inputs of this kind, but other types may be allowed in the future. For such predictions, the HTTP request is the following:

```shell
curl -X GET -F "file=@dog_resized.jpeg" http://<INFERRER-IP>:8000/api/v1/models/dog_model/prediction | jq
```

```json
{
  "http_status": {
    "code": 200,
    "description": "Prediction successful",
    "name": "OK"
  },
  "values": [
    [
      8.3947160334219e-09,
      ...
      2.68894262411834e-09
    ]
  ]
}
```
Prepare your training scripts

MLBuffet can train your own models and automatically upload them for inference or model management. For this capability to be available, the Docker daemon on the Docker host machine must be exposed on port 2376; security is enforced by TLS by default. These settings should not be changed to `--unsecure`, as exposing your Docker daemon insecurely is a high-risk practice and can lead to security issues. Your client cert, key and CA certs must be located at `inferrer/flask_app/utils/certs`.
MLBuffet supports training with any Python-based library, but some interaction and configuration is required from the user. You will need three files to perform training in a virtualized container environment:

- your custom `train.py` script;
- a `requirements.txt` with all the libraries and tools to be imported during training;
- a `dataset.csv` or any other data file to be read at runtime by the training script to operate on the dataset and perform the training.

MLBuffet knows which uploaded file corresponds to each of the above via form names: "script" is for the training script, "requirements" for `requirements.txt` and "dataset" for the dataset.
The output models will be written to a local directory via a Docker bind mount volume. You can make this directory coincide with the Modelhost mount point so that trained models can also be used for inference, provided they are trained in an ONNX-supported format.

Example curl request:

```shell
curl -X POST <INFERRER-IP>:8000/api/v1/train/<tag>/<model_name> -F "dataset=@/path/to/dataset.csv" -F "script=@/path/to/train.py" -F "requirements=@/path/to/requirements.txt"
```
Take into account that this code will be executed inside a containerized environment, and the resulting model must be locatable by the Trainer container so it can be auto-uploaded into the system. The tag associated with the model must be provided in `<tag>`, and the model name must match the `<model_name>` you set in the HTTP request.
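To keep the three form names straight, here is a tiny Python sketch (illustrative only; the file names are placeholders) mapping local files to the names the Trainer expects. The actual upload is the curl request shown above.

```python
def training_form_fields(script_path, requirements_path, dataset_path):
    """Map local files to the multipart form names the Trainer expects:
    'script', 'requirements' and 'dataset'."""
    return {
        "script": script_path,              # your custom train.py
        "requirements": requirements_path,  # libraries installed before training
        "dataset": dataset_path,            # data file read by train.py at runtime
    }

fields = training_form_fields("train.py", "requirements.txt", "dataset.csv")
# These fields are sent as multipart form-data to
# http://<INFERRER-IP>:8000/api/v1/train/<tag>/<model_name>
```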