machine-learning-on-kubernetes's Introduction

Machine Learning on Kubernetes

This is the code repository for Machine Learning on Kubernetes, published by Packt.

A practical handbook for building and using a complete open source machine learning platform on Kubernetes

What is this book about?

MLOps is an emerging field that aims to bring repeatability, automation, and standardization of the software engineering domain to data science and machine learning engineering. By implementing MLOps with Kubernetes, data scientists, IT professionals, and data engineers can collaborate and build machine learning solutions that deliver business value for their organization.

This book covers the following exciting features:

  • Understand the different stages of a machine learning project
  • Use open source software to build a machine learning platform on Kubernetes
  • Implement a complete ML project using the machine learning platform presented in this book
  • Improve on your organization's collaborative journey toward machine learning
  • Discover how to use the platform as a data engineer, ML engineer, or data scientist
  • Find out how to apply machine learning to solve real business problems

If you feel this book is for you, get your copy today!

https://www.packtpub.com/

Instructions and Navigations

All of the code is organized into folders.

The code will look like the following:

docker tag scikit-notebook:v1.1.0 quay.io/ml-on-k8s/scikitnotebook:v1.1.0

The following is what you need for this book: This book is for data scientists, data engineers, IT platform owners, AI product owners, and data architects who want to build their own platform for ML development. Although the book starts with the basics, a solid understanding of Python and Kubernetes, along with knowledge of basic data science and data engineering concepts, will help you grasp the topics it covers.

With the following software and hardware list, you can run all the code files present in the book (Chapters 1-11).

Software and Hardware List

Software required: Kubernetes, Python, Spark, MLflow, Seldon, Airflow
OS required: Windows, Mac OS X, and Linux (Any)

Running the platform requires a good amount of compute resources. If you do not have the required number of CPU cores and memory on your desktop or laptop computer, we recommend running a virtual machine on Google Cloud or any other cloud platform.

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. Click here to download it.

Errata and Troubleshooting tips

  • Chapter 10: If you encounter the error "seldon_core.wrapper:handle_invalid_usage:60 - ERROR: {'status': {'status': 1, 'info': 'Invalid request data type", try using the following data.json file. More details about this error can be found in this issue thread, courtesy of our readers @aquynh1682 and @webmakaka.

Get to Know the Authors

Faisal Masood is a principal architect at Red Hat. He has been helping teams to design and build data science and application platforms using OpenShift, Red Hat’s enterprise Kubernetes offering. Faisal has over 20 years of experience in building software and has been building microservices since the pre-Kubernetes era.

Ross Brigoli is an associate principal architect at Red Hat. He has been designing and building software in various industries for over 18 years. He has designed and built data platforms and workflow automation platforms. Before Red Hat, Ross led a data engineering team as an architect in the financial services industry. He currently designs and builds microservices architectures and machine learning solutions on OpenShift.

Download a free PDF

If you have already purchased a print or Kindle version of this book, you can get a DRM-free PDF version at no cost.
Simply click on the link to claim your free PDF.

https://packt.link/free-ebook/9781803241807

machine-learning-on-kubernetes's People

Contributors

davids-packt, fmasoodredhat, masoodfaisal, packt-itservice, rajat-packt, rossbrigoli, utkarsha-packt

machine-learning-on-kubernetes's Issues

Chapter 5. Data Engineering | Error on run manifests/kfdef/ml-platform.yaml

Hello!

After I run

$ kubectl create -f - --namespace ml-workshop manifests/kfdef/ml-platform.yaml

I am getting the errors shown in the attached screenshots (pic1, pic2, pic3).

Full log, if needed:

$ kubectl logs start-spark-cluster.3f3abca9fc804b3e9ade7c47afab0a04 --namespace ml-workshop
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 18688  100 18688    0     0   214k      0 --:--:-- --:--:-- --:--:--  214k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1282  100  1282    0     0  16227      0 --:--:-- --:--:-- --:--:-- 16227
Requirement already satisfied: packaging in /opt/app-root/lib/python3.8/site-packages (21.0)
Requirement already satisfied: pyparsing>=2.0.2 in /opt/app-root/lib/python3.8/site-packages (from packaging) (2.4.7)
WARNING: You are using pip version 21.2.4; however, version 22.2.2 is available.
You should consider upgrading via the '/opt/app-root/bin/python3 -m pip install --upgrade pip' command.
[I 19:47:05.009] 'jhuser-0623-02-spark-0623202840':'start-spark-cluster' - starting operation 
[I 19:47:05.009] 'jhuser-0623-02-spark-0623202840':'start-spark-cluster' - Installing packages 
[I 19:47:05.010] Package not found. Installing ipykernel package with version 5.3.0...
[I 19:47:05.010] Package not found. Installing ipython package with version 7.15.0...
[I 19:47:05.010] Package not found. Installing ipython-genutils package with version 0.2.0...
[I 19:47:05.010] Package not found. Installing jupyter-client package with version 6.1.6...
[I 19:47:05.010] Package not found. Installing jupyter-core package with version 4.6.3...
[I 19:47:05.010] Newer minio package with version 6.0.2 already installed. Skipping...
[I 19:47:05.010] Package not found. Installing nbclient package with version 0.4.1...
[I 19:47:05.010] Package not found. Installing nbconvert package with version 5.6.1...
[I 19:47:05.010] Package not found. Installing nbformat package with version 5.0.7...
[I 19:47:05.010] Package not found. Installing papermill package with version 2.1.2...
[I 19:47:05.010] Package not found. Installing pyzmq package with version 19.0.1...
[I 19:47:05.010] Package not found. Installing prompt-toolkit package with version 3.0.5...
[I 19:47:05.010] Newer requests package with version 2.26.0 already installed. Skipping...
[I 19:47:05.010] Package not found. Installing tornado package with version 6.0.4...
[I 19:47:05.010] Package not found. Installing traitlets package with version 4.3.3...
[I 19:47:05.010] Newer urllib3 package with version 1.26.7 already installed. Skipping...
Collecting ipykernel==5.3.0
  Downloading ipykernel-5.3.0-py3-none-any.whl (119 kB)
Collecting ipython==7.15.0
  Downloading ipython-7.15.0-py3-none-any.whl (783 kB)
Collecting ipython-genutils==0.2.0
  Downloading ipython_genutils-0.2.0-py2.py3-none-any.whl (26 kB)
Collecting jupyter-client==6.1.6
  Downloading jupyter_client-6.1.6-py3-none-any.whl (108 kB)
Collecting jupyter-core==4.6.3
  Downloading jupyter_core-4.6.3-py2.py3-none-any.whl (83 kB)
Collecting nbclient==0.4.1
  Downloading nbclient-0.4.1-py3-none-any.whl (65 kB)
Collecting nbconvert==5.6.1
  Downloading nbconvert-5.6.1-py2.py3-none-any.whl (455 kB)
Collecting nbformat==5.0.7
  Downloading nbformat-5.0.7-py3-none-any.whl (170 kB)
Collecting papermill==2.1.2
  Downloading papermill-2.1.2-py3-none-any.whl (32 kB)
Collecting pyzmq==19.0.1
  Downloading pyzmq-19.0.1-cp38-cp38-manylinux1_x86_64.whl (1.1 MB)
Collecting prompt-toolkit==3.0.5
  Downloading prompt_toolkit-3.0.5-py3-none-any.whl (351 kB)
Collecting tornado==6.0.4
  Downloading tornado-6.0.4.tar.gz (496 kB)
Collecting traitlets==4.3.3
  Downloading traitlets-4.3.3-py2.py3-none-any.whl (75 kB)
Collecting backcall
  Downloading backcall-0.2.0-py2.py3-none-any.whl (11 kB)
Requirement already satisfied: setuptools>=18.5 in /opt/app-root/lib/python3.8/site-packages (from ipython==7.15.0) (41.6.0)
Collecting pygments
  Downloading Pygments-2.13.0-py3-none-any.whl (1.1 MB)
Collecting decorator
  Downloading decorator-5.1.1-py3-none-any.whl (9.1 kB)
Collecting pexpect
  Downloading pexpect-4.8.0-py2.py3-none-any.whl (59 kB)
Collecting pickleshare
  Downloading pickleshare-0.7.5-py2.py3-none-any.whl (6.9 kB)
Collecting jedi>=0.10
  Downloading jedi-0.18.1-py2.py3-none-any.whl (1.6 MB)
Requirement already satisfied: python-dateutil>=2.1 in /opt/app-root/lib/python3.8/site-packages (from jupyter-client==6.1.6) (2.8.2)
Collecting async-generator
  Downloading async_generator-1.10-py3-none-any.whl (18 kB)
Collecting nest-asyncio
  Downloading nest_asyncio-1.5.5-py3-none-any.whl (5.2 kB)
Requirement already satisfied: jinja2>=2.4 in /opt/app-root/lib/python3.8/site-packages (from nbconvert==5.6.1) (3.0.1)
Collecting mistune<2,>=0.8.1
  Downloading mistune-0.8.4-py2.py3-none-any.whl (16 kB)
Requirement already satisfied: entrypoints>=0.2.2 in /opt/app-root/lib/python3.8/site-packages (from nbconvert==5.6.1) (0.3)
Collecting defusedxml
  Downloading defusedxml-0.7.1-py2.py3-none-any.whl (25 kB)
Collecting testpath
  Downloading testpath-0.6.0-py3-none-any.whl (83 kB)
Collecting pandocfilters>=1.4.1
  Downloading pandocfilters-1.5.0-py2.py3-none-any.whl (8.7 kB)
Collecting bleach
  Downloading bleach-5.0.1-py3-none-any.whl (160 kB)
Collecting jsonschema!=2.5.0,>=2.4
  Downloading jsonschema-4.10.0-py3-none-any.whl (80 kB)
Collecting tenacity
  Downloading tenacity-8.0.1-py3-none-any.whl (24 kB)
Collecting ansiwrap
  Downloading ansiwrap-0.8.4-py2.py3-none-any.whl (8.5 kB)
Requirement already satisfied: pyyaml in /opt/app-root/lib/python3.8/site-packages (from papermill==2.1.2) (5.4.1)
Requirement already satisfied: requests in /opt/app-root/lib/python3.8/site-packages (from papermill==2.1.2) (2.26.0)
Collecting tqdm>=4.32.2
  Downloading tqdm-4.64.0-py2.py3-none-any.whl (78 kB)
Requirement already satisfied: click in /opt/app-root/lib/python3.8/site-packages (from papermill==2.1.2) (8.0.1)
Collecting black
  Downloading black-22.6.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.5 MB)
Collecting wcwidth
  Downloading wcwidth-0.2.5-py2.py3-none-any.whl (30 kB)
Requirement already satisfied: six in /opt/app-root/lib/python3.8/site-packages (from traitlets==4.3.3) (1.16.0)
Collecting parso<0.9.0,>=0.8.0
  Downloading parso-0.8.3-py2.py3-none-any.whl (100 kB)
Requirement already satisfied: MarkupSafe>=2.0 in /opt/app-root/lib/python3.8/site-packages (from jinja2>=2.4->nbconvert==5.6.1) (2.0.1)
Collecting attrs>=17.4.0
  Downloading attrs-22.1.0-py2.py3-none-any.whl (58 kB)
Collecting pkgutil-resolve-name>=1.3.10
  Downloading pkgutil_resolve_name-1.3.10-py3-none-any.whl (4.7 kB)
Collecting pyrsistent!=0.17.0,!=0.17.1,!=0.17.2,>=0.14.0
  Downloading pyrsistent-0.18.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (119 kB)
Collecting importlib-resources>=1.4.0
  Downloading importlib_resources-5.9.0-py3-none-any.whl (33 kB)
Requirement already satisfied: zipp>=3.1.0 in /opt/app-root/lib/python3.8/site-packages (from importlib-resources>=1.4.0->jsonschema!=2.5.0,>=2.4->nbformat==5.0.7) (3.5.0)
Collecting textwrap3>=0.9.2
  Downloading textwrap3-0.9.2-py2.py3-none-any.whl (12 kB)
Collecting tomli>=1.1.0
  Downloading tomli-2.0.1-py3-none-any.whl (12 kB)
Collecting mypy-extensions>=0.4.3
  Downloading mypy_extensions-0.4.3-py2.py3-none-any.whl (4.5 kB)
Collecting pathspec>=0.9.0
  Downloading pathspec-0.9.0-py2.py3-none-any.whl (31 kB)
Requirement already satisfied: typing-extensions>=3.10.0.0 in /opt/app-root/lib/python3.8/site-packages (from black->papermill==2.1.2) (3.10.0.2)
Collecting platformdirs>=2
  Downloading platformdirs-2.5.2-py3-none-any.whl (14 kB)
Collecting webencodings
  Downloading webencodings-0.5.1-py2.py3-none-any.whl (11 kB)
Collecting ptyprocess>=0.5
  Downloading ptyprocess-0.7.0-py2.py3-none-any.whl (13 kB)
Requirement already satisfied: certifi>=2017.4.17 in /opt/app-root/lib/python3.8/site-packages (from requests->papermill==2.1.2) (2021.5.30)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /opt/app-root/lib/python3.8/site-packages (from requests->papermill==2.1.2) (1.26.7)
Requirement already satisfied: charset-normalizer~=2.0.0 in /opt/app-root/lib/python3.8/site-packages (from requests->papermill==2.1.2) (2.0.6)
Requirement already satisfied: idna<4,>=2.5 in /opt/app-root/lib/python3.8/site-packages (from requests->papermill==2.1.2) (3.2)
Building wheels for collected packages: tornado
  Building wheel for tornado (setup.py): started
  Building wheel for tornado (setup.py): finished with status 'done'
  Created wheel for tornado: filename=tornado-6.0.4-cp38-cp38-linux_x86_64.whl size=428150 sha256=69de556337567b40733cd63882ecad9117573709e377f5e8dc6defdb4e34fdba
  Stored in directory: /tmp/pip-ephem-wheel-cache-h9e_6z1g/wheels/88/79/e5/598ba17e85eccf2626eab62e4ee8452895636cd542650d450d
Successfully built tornado
Installing collected packages: ipython-genutils, decorator, traitlets, pyrsistent, pkgutil-resolve-name, importlib-resources, attrs, wcwidth, tornado, pyzmq, ptyprocess, parso, jupyter-core, jsonschema, webencodings, tomli, textwrap3, pygments, prompt-toolkit, platformdirs, pickleshare, pexpect, pathspec, nest-asyncio, nbformat, mypy-extensions, jupyter-client, jedi, backcall, async-generator, tqdm, testpath, tenacity, pandocfilters, nbclient, mistune, ipython, defusedxml, bleach, black, ansiwrap, papermill, nbconvert, ipykernel
Successfully installed ansiwrap-0.8.4 async-generator-1.10 attrs-22.1.0 backcall-0.2.0 black-22.6.0 bleach-5.0.1 decorator-5.1.1 defusedxml-0.7.1 importlib-resources-5.9.0 ipykernel-5.3.0 ipython-7.15.0 ipython-genutils-0.2.0 jedi-0.18.1 jsonschema-4.10.0 jupyter-client-6.1.6 jupyter-core-4.6.3 mistune-0.8.4 mypy-extensions-0.4.3 nbclient-0.4.1 nbconvert-5.6.1 nbformat-5.0.7 nest-asyncio-1.5.5 pandocfilters-1.5.0 papermill-2.1.2 parso-0.8.3 pathspec-0.9.0 pexpect-4.8.0 pickleshare-0.7.5 pkgutil-resolve-name-1.3.10 platformdirs-2.5.2 prompt-toolkit-3.0.5 ptyprocess-0.7.0 pygments-2.13.0 pyrsistent-0.18.1 pyzmq-19.0.1 tenacity-8.0.1 testpath-0.6.0 textwrap3-0.9.2 tomli-2.0.1 tornado-6.0.4 tqdm-4.64.0 traitlets-4.3.3 wcwidth-0.2.5 webencodings-0.5.1
WARNING: You are using pip version 21.2.4; however, version 22.2.2 is available.
You should consider upgrading via the '/opt/app-root/bin/python3 -m pip install --upgrade pip' command.
absl-py==0.14.1
alembic==1.4.1
ansiwrap==0.8.4
astunparse==1.6.3
async-generator==1.10
attrs==22.1.0
backcall==0.2.0
bcrypt==3.2.0
black==22.6.0
bleach==5.0.1
boto3==1.18.49
botocore==1.21.49
cachetools==4.2.4
certifi==2021.5.30
cffi==1.14.6
charset-normalizer==2.0.6
click==8.0.1
cloudpickle==2.0.0
configparser==5.0.2
cryptography==3.4.8
databricks-cli==0.15.0
decorator==5.1.1
defusedxml==0.7.1
docker==5.0.2
entrypoints==0.3
Flask==2.0.1
flatbuffers==2.0
gast==0.4.0
gitdb==4.0.7
GitPython==3.1.24
google-auth==2.3.0
google-auth-oauthlib==0.4.6
google-pasta==0.2.0
greenlet==1.1.1
grpcio==1.41.0
gunicorn==20.1.0
h5py==3.4.0
idna==3.2
importlib-metadata==4.8.1
importlib-resources==5.9.0
ipykernel==5.3.0
ipython==7.15.0
ipython-genutils==0.2.0
itsdangerous==2.0.1
jedi==0.18.1
Jinja2==3.0.1
jmespath==0.10.0
joblib==1.0.1
jsonschema==4.10.0
jupyter-client==6.1.6
jupyter-core==4.6.3
keras-nightly==2.7.0.dev2021100607
Keras-Preprocessing==1.1.2
libclang==12.0.0
Mako==1.1.5
Markdown==3.3.4
MarkupSafe==2.0.1
minio==6.0.2
mistune==0.8.4
mlflow==1.20.2
mypy-extensions==0.4.3
nbclient==0.4.1
nbconvert==5.6.1
nbformat==5.0.7
nest-asyncio==1.5.5
numpy==1.21.2
oauthlib==3.1.1
openshift-client==1.0.13
opt-einsum==3.3.0
packaging==21.0
pandas==1.3.3
pandocfilters==1.5.0
papermill==2.1.2
paramiko==2.7.2
parso==0.8.3
pathspec==0.9.0
pexpect==4.8.0
pickleshare==0.7.5
pkgutil_resolve_name==1.3.10
platformdirs==2.5.2
prometheus-client==0.11.0
prometheus-flask-exporter==0.18.2
prompt-toolkit==3.0.5
protobuf==3.18.0
ptyprocess==0.7.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser==2.20
Pygments==2.13.0
PyNaCl==1.4.0
pyparsing==2.4.7
pyrsistent==0.18.1
python-dateutil==2.8.2
python-editor==1.0.4
pytz==2021.1
PyYAML==5.4.1
pyzmq==19.0.1
querystring-parser==1.2.4
requests==2.26.0
requests-oauthlib==1.3.0
rsa==4.7.2
s3transfer==0.5.0
scikit-learn==0.24.2
scipy==1.7.1
six==1.16.0
smmap==4.0.0
SQLAlchemy==1.4.25
sqlparse==0.4.2
tabulate==0.8.9
tb-nightly==2.7.0a20211013
tenacity==8.0.1
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.0
tensorflow-io-gcs-filesystem==0.21.0
termcolor==1.1.0
testpath==0.6.0
textwrap3==0.9.2
tf-estimator-nightly==2.7.0.dev2021092408
tf-nightly==2.8.0.dev20211005
threadpoolctl==2.2.0
tomli==2.0.1
tornado==6.0.4
tqdm==4.64.0
traitlets==4.3.3
typing-extensions==3.10.0.2
urllib3==1.26.7
wcwidth==0.2.5
webencodings==0.5.1
websocket-client==1.2.1
Werkzeug==2.0.1
wrapt==1.13.2
zipp==3.5.0
[I 19:47:15.943] 'jhuser-0623-02-spark-0623202840':'start-spark-cluster' - Packages installed (10.934 secs)
[I 19:47:16.002] 'jhuser-0623-02-spark-0623202840':'start-spark-cluster' - processing dependencies 
Traceback (most recent call last):
  File "bootstrapper.py", line 430, in <module>
    main()
  File "bootstrapper.py", line 421, in main
    file_op.process_dependencies()
  File "bootstrapper.py", line 96, in process_dependencies
    self.get_file_from_object_storage(archive_file)
  File "bootstrapper.py", line 144, in get_file_from_object_storage
    self.cos_client.fget_object(bucket_name=self.cos_bucket,
  File "/opt/app-root/lib64/python3.8/site-packages/minio/api.py", line 787, in fget_object
    stat = self.stat_object(
  File "/opt/app-root/lib64/python3.8/site-packages/minio/api.py", line 1195, in stat_object
    response = self._url_open(
  File "/opt/app-root/lib64/python3.8/site-packages/minio/api.py", line 2226, in _url_open
    raise ResponseError(response,
minio.error.NoSuchKey: NoSuchKey: message: The specified key does not exist.

What is the problem, and how can I fix it? Thanks.

Chapter04 - opendata-hub-operator-controller-manager crashloopbackoff

After applying odh-subscription.yaml, I'm getting some errors in the pod. Could someone help me?

kubectl logs opendatahub-operator-controller-manager-bff48cb67-c6wpq -n operators -f
2023-05-13T16:31:40.886Z	INFO	controller.kfdef-controller	Starting Controller	{"reconciler group": "kfdef.apps.kubeflow.org", "reconciler kind": "KfDef"}
2023-05-13T16:31:40.987Z	INFO	controllers.KfDef	Adding finalizer	{"kfdef-finalizer.kfdef.apps.kubeflow.org": "ml-workshop/opendatahub-ml-workshop"}
2023-05-13T16:31:41.340Z	ERROR	controller-runtime.source	if kind is a CRD, it should be installed before calling Start	{"kind": "DeploymentConfig.apps.openshift.io", "error": "no matches for kind \"DeploymentConfig\" in version \"apps.openshift.io/v1\""}
2023-05-13T16:31:41.340Z	INFO	controller.secret-generator-controller	Starting workers	{"reconciler group": "", "reconciler kind": "Secret", "worker count": 1}
2023-05-13T16:31:41.346Z	INFO	controllers.KfDef	Adding finalizer	{"kfdef-finalizer.kfdef.apps.kubeflow.org": "ml-workshop/opendatahub-ml-workshop"}
I0513 16:31:42.488515       1 request.go:668] Waited for 1.048971907s due to client-side throttling, not priority and fairness, request: GET:https://10.96.0.1:443/apis/node.k8s.io/v1?timeout=32s
2023-05-13T16:31:43.289Z	ERROR	controller-runtime.source	if kind is a CRD, it should be installed before calling Start	{"kind": "BuildConfig.build.openshift.io", "error": "no matches for kind \"BuildConfig\" in version \"build.openshift.io/v1\""}
2023-05-13T16:31:43.289Z	ERROR	controller.kfdef-controller	Could not wait for Cache to sync	{"reconciler group": "kfdef.apps.kubeflow.org", "reconciler kind": "KfDef", "error": "failed to wait for kfdef-controller caches to sync: no matches for kind \"DeploymentConfig\" in version \"apps.openshift.io/v1\""}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start
	/opt/app-root/src/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234
sigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable.func1
	/opt/app-root/src/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/internal.go:696
2023-05-13T16:31:43.289Z	INFO	controller.secret-generator-controller	Shutdown signal received, waiting for all workers to finish	{"reconciler group": "", "reconciler kind": "Secret"}
2023-05-13T16:31:43.289Z	INFO	controller.secret-generator-controller	All workers finished	{"reconciler group": "", "reconciler kind": "Secret"}
2023-05-13T16:31:43.289Z	ERROR	error received after stop sequence was engaged	{"error": "leader election lost"}
runtime.goexit
	/usr/lib/golang/src/runtime/asm_amd64.s:1571
2023-05-13T16:31:43.290Z	ERROR	error received after stop sequence was engaged	{"error": "context canceled"}
runtime.goexit
	/usr/lib/golang/src/runtime/asm_amd64.s:1571
2023-05-13T16:31:43.290Z	ERROR	error received after stop sequence was engaged	{"error": "context canceled"}
runtime.goexit
	/usr/lib/golang/src/runtime/asm_amd64.s:1571
2023-05-13T16:31:43.290Z	ERROR	setup	problem running manager	{"error": "failed to wait for kfdef-controller caches to sync: no matches for kind \"DeploymentConfig\" in version \"apps.openshift.io/v1\""}
runtime.goexit
	/usr/lib/golang/src/runtime/asm_amd64.s:1571
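
The ERROR lines in this log name the kinds DeploymentConfig.apps.openshift.io and BuildConfig.build.openshift.io, which are OpenShift API types; the "no matches for kind" message suggests the cluster does not serve them. As a rough sketch for pulling those kinds out of a saved log, assuming each relevant line ends with the JSON payload shown above (the helper name missing_kinds and the shortened sample lines are illustrative, not part of the repository):

```python
import json
import re

# Each operator log line above ends with a JSON object; capture it.
PAYLOAD = re.compile(r"(\{.*\})\s*$")
MARKER = "it should be installed before calling Start"


def missing_kinds(log_text):
    """Return the sorted, de-duplicated kinds the log reports as not installed."""
    kinds = set()
    for line in log_text.splitlines():
        if MARKER not in line:
            continue
        match = PAYLOAD.search(line)
        if match is None:
            continue
        kinds.add(json.loads(match.group(1))["kind"])
    return sorted(kinds)


# Two lines adapted from the log above (timestamps dropped for brevity).
sample = "\n".join([
    r'ERROR controller-runtime.source if kind is a CRD, it should be installed before calling Start {"kind": "DeploymentConfig.apps.openshift.io", "error": "no matches for kind \"DeploymentConfig\" in version \"apps.openshift.io/v1\""}',
    r'ERROR controller-runtime.source if kind is a CRD, it should be installed before calling Start {"kind": "BuildConfig.build.openshift.io", "error": "no matches for kind \"BuildConfig\" in version \"build.openshift.io/v1\""}',
])

print(missing_kinds(sample))
```

The function only summarizes what the log already says; it does not decide how the missing kinds should be handled on a given cluster.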

Ingress Host IP missing a '.'

The ingress host spark-cluster-mluser192.168.0.37.nip.io is not usable because it is missing a '.' between the username and the IP address.

kubectl get ingress -n ml-workshop
NAME                   CLASS    HOSTS                                     ADDRESS                     PORTS     AGE
jupyterhub             <none>   jupyterhub.192.168.0.37.nip.io            192.168.0.37,192.168.0.63   80, 443   111m
ap-airflow2            <none>   airflow.192.168.0.37.nip.io               192.168.0.37,192.168.0.63   80, 443   111m
minio-ml-workshop-ui   <none>   minio.192.168.0.37.nip.io                 192.168.0.37,192.168.0.63   80, 443   111m
mlflow                 <none>   mlflow.192.168.0.37.nip.io                192.168.0.37,192.168.0.63   80, 443   111m
grafana                <none>   grafana.192.168.0.37.nip.io               192.168.0.37,192.168.0.63   80, 443   111m
spark-cluster-mluser   <none>   spark-cluster-mluser192.168.0.37.nip.io   192.168.0.37,192.168.0.63   80        12m

This makes it impossible to follow along with Chapter 5.
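
The other hosts in the listing follow the pattern <name>.<ip>.nip.io, and only the Spark cluster host lacks the dot before the IP. A small sketch of that pattern and a check for the missing separator (the function names nip_host and has_separator are made up for illustration; how the platform actually builds the host is not shown in this report):

```python
def nip_host(service, ip, domain="nip.io"):
    """Build '<service>.<ip>.<domain>', with a dot between every part."""
    return ".".join([service, ip, domain])


def has_separator(host, ip):
    """True when the IP inside the host name is preceded by a '.'."""
    return f".{ip}." in host


ip = "192.168.0.37"
reported = "spark-cluster-mluser192.168.0.37.nip.io"  # host from the listing above
expected = nip_host("spark-cluster-mluser", ip)

print(has_separator(reported, ip))  # False
print(has_separator(expected, ip))  # True
print(expected)
```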

Chapter 9. Building Your Data Pipeline | Error on execute Chapter09/explore_data.ipynb

Hello!

When I run this block:

import os
from pyspark.sql import SparkSession


os.environ['PYSPARK_SUBMIT_ARGS'] = f"\
--conf spark.hadoop.fs.s3a.endpoint=http://minio-ml-workshop:9000 \
--conf spark.hadoop.fs.s3a.access.key=minio \
--conf spark.hadoop.fs.s3a.secret.key=minio123 \
--conf spark.hadoop.fs.s3a.path.style.access=true \
--conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
--conf spark.hadoop.fs.s3a.multipart.size=104857600 \
--packages org.apache.hadoop:hadoop-aws:3.2.0,org.postgresql:postgresql:42.3.3 \
--master spark://{os.environ['SPARK_CLUSTER']}:7077 pyspark-shell "

# Create the spark application
spark = SparkSession \
    .builder \
    .appName("Python Spark S3 example") \
    .getOrCreate()

dfAirlines = spark.read\
                .options(delimeter=',', inferSchema='True', header='True') \
                .csv("s3a://airport-data/airlines.csv")
dfAirlines.printSchema()

dfAirports = spark.read\
                .options(delimiter=',', inferSchema='True', header='True') \
                .csv("s3a://airport-data/airports.csv")
dfAirports.printSchema()

dfAirports.show(truncate=False)
dfAirlines.show(truncate=False)

print(dfAirports.count())
print(dfAirlines.count())

spark.stop()

I get the following error:

---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
<ipython-input-7-426019de4890> in <module>
     19     .getOrCreate()
     20 
---> 21 dfAirlines = spark.read\
     22                 .options(delimeter=',', inferSchema='True', header='True') \
     23                 .csv("s3a://airport-data/airlines.csv")

/opt/app-root/lib/python3.8/site-packages/pyspark/sql/readwriter.py in csv(self, path, schema, sep, encoding, quote, escape, comment, header, inferSchema, ignoreLeadingWhiteSpace, ignoreTrailingWhiteSpace, nullValue, nanValue, positiveInf, negativeInf, dateFormat, timestampFormat, maxColumns, maxCharsPerColumn, maxMalformedLogPerPartition, mode, columnNameOfCorruptRecord, multiLine, charToEscapeQuoteEscaping, samplingRatio, enforceSchema, emptyValue, locale, lineSep, pathGlobFilter, recursiveFileLookup)
    533             path = [path]
    534         if type(path) == list:
--> 535             return self._df(self._jreader.csv(self._spark._sc._jvm.PythonUtils.toSeq(path)))
    536         elif isinstance(path, RDD):
    537             def func(iterator):

/opt/app-root/lib/python3.8/site-packages/py4j/java_gateway.py in __call__(self, *args)
   1302 
   1303         answer = self.gateway_client.send_command(command)
-> 1304         return_value = get_return_value(
   1305             answer, self.gateway_client, self.target_id, self.name)
   1306 

/opt/app-root/lib/python3.8/site-packages/pyspark/sql/utils.py in deco(*a, **kw)
    126     def deco(*a, **kw):
    127         try:
--> 128             return f(*a, **kw)
    129         except py4j.protocol.Py4JJavaError as e:
    130             converted = convert_exception(e.java_exception)

/opt/app-root/lib/python3.8/site-packages/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    324             value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
    325             if answer[1] == REFERENCE_TYPE:
--> 326                 raise Py4JJavaError(
    327                     "An error occurred while calling {0}{1}{2}.\n".
    328                     format(target_id, ".", name), value)

Py4JJavaError: An error occurred while calling o74.csv.
: java.nio.file.AccessDeniedException: airport-data: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint
	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:187)
	at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:111)
	at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:265)
	at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:322)
	at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:261)
	at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:236)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:375)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:311)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3303)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3352)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3320)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:479)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
	at org.apache.spark.sql.execution.streaming.FileStreamSink$.hasMetadata(FileStreamSink.scala:46)
	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:366)
	at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:297)
	at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:286)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:286)
	at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:723)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint
	at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:159)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1166)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.runBeforeRequestHandlers(AmazonHttpClient.java:762)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:724)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
	at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4368)
	at com.amazonaws.services.s3.AmazonS3Client.getBucketRegionViaHeadRequest(AmazonS3Client.java:5129)
	at com.amazonaws.services.s3.AmazonS3Client.fetchRegionFromCache(AmazonS3Client.java:5103)
	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4352)
	at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4315)
	at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1344)
	at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:1284)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$verifyBucketExists$1(S3AFileSystem.java:376)
	at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)
	... 30 more
Caused by: com.amazonaws.SdkClientException: Unable to load credentials from service endpoint
	at com.amazonaws.auth.EC2CredentialsFetcher.handleError(EC2CredentialsFetcher.java:183)
	at com.amazonaws.auth.EC2CredentialsFetcher.fetchCredentials(EC2CredentialsFetcher.java:162)
	at com.amazonaws.auth.EC2CredentialsFetcher.getCredentials(EC2CredentialsFetcher.java:82)
	at com.amazonaws.auth.InstanceProfileCredentialsProvider.getCredentials(InstanceProfileCredentialsProvider.java:164)
	at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:137)
	... 47 more
Caused by: java.net.NoRouteToHostException: No route to host (Host unreachable)
	at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:399)
	at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:242)
	at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:224)
	at java.base/java.net.Socket.connect(Socket.java:609)
	at java.base/sun.net.NetworkClient.doConnect(NetworkClient.java:177)
	at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:474)
	at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:569)
	at java.base/sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
	at java.base/sun.net.www.http.HttpClient.New(HttpClient.java:341)
	at java.base/sun.net.www.http.HttpClient.New(HttpClient.java:362)
	at java.base/sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1253)
	at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1232)
	at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1081)
	at java.base/sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:1015)
	at com.amazonaws.internal.ConnectionUtils.connectToEndpoint(ConnectionUtils.java:54)
	at com.amazonaws.internal.EC2CredentialsUtils.readResource(EC2CredentialsUtils.java:116)
	at com.amazonaws.internal.EC2CredentialsUtils.readResource(EC2CredentialsUtils.java:87)
	at com.amazonaws.auth.InstanceProfileCredentialsProvider$InstanceMetadataCredentialsEndpointProvider.getCredentialsEndpoint(InstanceProfileCredentialsProvider.java:189)
	at com.amazonaws.auth.EC2CredentialsFetcher.fetchCredentials(EC2CredentialsFetcher.java:122)
	... 50 more

How can I fix this?
Thanks!

The previous steps in this script finished correctly.
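
A note on the trace above: the S3A connector ends up in `InstanceProfileCredentialsProvider` and then fails with `NoRouteToHostException` while contacting the EC2 metadata endpoint. That suggests no earlier provider in the chain supplied credentials. The following is a minimal sketch, not the book's configuration, assuming the bucket lives on an S3-compatible store reachable with static keys. The function name `s3a_spark_conf`, the endpoint URL, and the key values are hypothetical placeholders; the property names are standard Hadoop S3A settings passed through Spark's `spark.hadoop.` prefix.

```python
def s3a_spark_conf(endpoint, access_key, secret_key, use_ssl=False):
    """Build Spark properties that hand static credentials to the S3A connector.

    Hypothetical helper: it only assembles a dict and does not touch Spark.
    """
    if not access_key or not secret_key:
        raise ValueError("access_key and secret_key must be non-empty")
    return {
        # Use static keys instead of falling through to the EC2 metadata service.
        "spark.hadoop.fs.s3a.aws.credentials.provider":
            "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider",
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # Non-AWS stores usually need an explicit endpoint and path-style URLs.
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.path.style.access": "true",
        "spark.hadoop.fs.s3a.connection.ssl.enabled": str(use_ssl).lower(),
    }


# Placeholder values for illustration only; replace with your own.
conf = s3a_spark_conf("http://minio.example.svc:9000", "my-access-key", "my-secret-key")
for name in sorted(conf):
    # Print property names only, so the secret is never echoed to the console.
    print(name)
```

Each key/value pair could then be applied with `SparkSession.builder.config(key, value)` before the session is created. Whether this matches the cause in your cluster is an assumption; check which credentials the notebook environment is expected to provide.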

Chapter10: seldon_core.wrapper:handle_invalid_usage:60 - ERROR: {'status': {'status': 1, 'info': 'Invalid request data type:

Hello everyone, it's me again :)))

I have resolved the previous errors I encountered. In chapter 10, the pipeline build and push now run successfully. After a successful deployment, I tried to call the model (with both curl and Postman), but the request failed. Below are the command I used to call the model and the error log I got.

Calling the model with curl:
curl -vvvvk --header "content-type: application/json" -X POST -d @data.json https://yourdomain.com/api/v1.0/predictions

And the response:

< HTTP/1.1 100 Continue
< HTTP/1.1 400 Bad Request
< Date: Mon, 24 Jul 2023 02:37:56 GMT
< Content-Type: text/plain; charset=utf-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< Access-Control-Allow-Headers: Accept, Accept-Encoding, Authorization, Content-Length, Content-Type, X-CSRF-Token
< Access-Control-Allow-Methods: OPTIONS,POST
< Access-Control-Allow-Origin: *
< Seldon-Puid: 36e3f612-eb06-4f9c-98fe-ff74ceebdcf1
< X-Content-Type-Options: nosniff
< Strict-Transport-Security: max-age=15724800; includeSubDomains
* HTTP error before end of send, stop sending
< 
{"status":{"code":-1,"info":"Invalid request data type: {'annotations': {'list': [{'builtIn': 1, 'datasource': '-- Grafana --', 'enable': True, 'hide': True, 'iconColor': 'rgba(0, 211, 255, 1)', 'name': 'Annotations \u0026 Alerts', 'target': {'limit': 100, 'matchAny': False, 'tags': [], 'type': 'dashboard'}, 'type': 'dashboard'}]}, 'editable': True, 'fiscalYearStartMonth': 0, 'graphTooltip': 0, 'id': 2, 'iteration': 1648959757365, 'links': [], 'liveNow': False, 'panels': [{'collapsed': True, 'gridPos': {'h': 1, 'w': 24, 'x': 0, 'y': 0}, 'id': 40, 'panels': [{'gridPos': {'h': 3, 'w': 24, 'x': 0, 'y': 1}, 'id': 27, 'links': [], 'options': {'content': '\u003cdiv class=\"text-center dashboard-header\"\u003e\\n  \u003cspan\u003eSeldon Core API Dashboard\u003c/span\u003e\\n\u003c/div\u003e', 'mode': 'html'}, 'pluginVersion': '8.3.3', 'type': 'text'}], 'title': 'Heading', 'type': 'row'}, {'collapsed': False, 'gridPos': {'h': 1, 'w': 24, 'x': 0, 'y': 1}, 'id': 41, 'panels': [], 'title': 'Global Counts', 'type': 'row'}, {'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'fieldConfig': {'defaults': {'color': {'fixedColor': 'rgb(31, 120, 193)', 'mode': 'fixed'}, 'mappings': [{'options': {'match': 'null', 'result': {'text': 'N/A'}}, 'type': 'special'}], 'thresholds': {'mode': 'absolute', 'steps': [{'color': 'green', 'value': None}, {'color': 'red', 'value': 80}]}, 'unit': 'ops'}, 'overrides': []}, 'gridPos': {'h': 3, 'w': 6, 'x': 0, 'y': 2}, 'id': 16, 'links': [], 'maxDataPoints': 100, 'options': {'colorMode': 'none', 'graphMode': 'area', 'justifyMode': 'auto', 'orientation': 'horizontal', 'reduceOptions': {'calcs': ['lastNotNull'], 'fields': '', 'values': False}, 'textMode': 'auto'}, 'pluginVersion': '8.3.3', 'targets': [{'expr': 'round(sum(irate(seldon_api_executor_server_requests_seconds_count[1m])),0.001)', 'format': 'time_series', 'instant': False, 'intervalFactor': 2, 'refId': 'A', 'step': 20}], 'title': 'Global Request Rate', 'type': 'stat'}, {'fieldConfig': 
{'defaults': {'color': {'fixedColor': 'rgb(31, 120, 193)', 'mode': 'fixed'}, 'mappings': [{'options': {'match': 'null', 'result': {'text': 'N/A'}}, 'type': 'special'}], 'thresholds': {'mode': 'absolute', 'steps': [{'color': 'green', 'value': None}, {'color': 'red', 'value': 80}]}, 'unit': 'percentunit'}, 'overrides': []}, 'gridPos': {'h': 3, 'w': 6, 'x': 6, 'y': 2}, 'id': 17, 'links': [], 'maxDataPoints': 100, 'options': {'colorMode': 'none', 'graphMode': 'area', 'justifyMode': 'auto', 'orientation': 'horizontal', 'reduceOptions': {'calcs': ['mean'], 'fields': '', 'values': False}, 'textMode': 'auto'}, 'pluginVersion': '8.3.3', 'targets': [{'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'exemplar': True, 'expr': 'sum(rate(seldon_api_executor_server_requests_seconds_count{code!~\"5.*\"}[1m])) / sum(rate(seldon_api_executor_server_requests_seconds_count[1m]))', 'format': 'time_series', 'interval': '', 'intervalFactor': 2, 'legendFormat': '', 'refId': 'A', 'step': 20}], 'title': 'Success', 'type': 'stat'}, {'fieldConfig': {'defaults': {'color': {'fixedColor': 'rgb(31, 120, 193)', 'mode': 'fixed'}, 'mappings': [{'options': {'match': 'null', 'result': {'text': 'N/A'}}, 'type': 'special'}], 'thresholds': {'mode': 'absolute', 'steps': [{'color': 'green', 'value': None}, {'color': 'red', 'value': 80}]}, 'unit': 'ops'}, 'overrides': []}, 'gridPos': {'h': 3, 'w': 6, 'x': 12, 'y': 2}, 'id': 18, 'links': [], 'maxDataPoints': 100, 'options': {'colorMode': 'none', 'graphMode': 'area', 'justifyMode': 'auto', 'orientation': 'horizontal', 'reduceOptions': {'calcs': ['mean'], 'fields': '', 'values': False}, 'textMode': 'auto'}, 'pluginVersion': '8.3.3', 'targets': [{'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'exemplar': True, 'expr': 'sum(irate(seldon_api_executor_server_requests_seconds_count{code=~\"4.*\"}[1m])) ', 'format': 'time_series', 'interval': '', 'intervalFactor': 2, 'legendFormat': '', 'refId': 'A', 'step': 20}], 'title': '4xxs', 'type': 
'stat'}, {'fieldConfig': {'defaults': {'color': {'fixedColor': 'rgb(31, 120, 193)', 'mode': 'fixed'}, 'mappings': [{'options': {'match': 'null', 'result': {'text': 'N/A'}}, 'type': 'special'}], 'thresholds': {'mode': 'absolute', 'steps': [{'color': 'green', 'value': None}, {'color': 'red', 'value': 80}]}, 'unit': 'ops'}, 'overrides': []}, 'gridPos': {'h': 3, 'w': 6, 'x': 18, 'y': 2}, 'id': 19, 'links': [], 'maxDataPoints': 100, 'options': {'colorMode': 'none', 'graphMode': 'area', 'justifyMode': 'auto', 'orientation': 'horizontal', 'reduceOptions': {'calcs': ['mean'], 'fields': '', 'values': False}, 'textMode': 'auto'}, 'pluginVersion': '8.3.3', 'targets': [{'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'exemplar': True, 'expr': 'sum(irate(seldon_api_executor_server_requests_seconds_count{code=~\"5.*\"}[1m])) ', 'format': 'time_series', 'hide': False, 'interval': '', 'intervalFactor': 2, 'legendFormat': '', 'refId': 'A', 'step': 20}], 'title': '5xxs', 'type': 'stat'}, {'collapsed': False, 'gridPos': {'h': 1, 'w': 24, 'x': 0, 'y': 5}, 'id': 42, 'panels': [], 'repeat': 'deployment', 'title': 'Deployment Counts $deployment', 'type': 'row'}, {'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'fieldConfig': {'defaults': {'color': {'fixedColor': 'rgb(31, 120, 193)', 'mode': 'fixed'}, 'mappings': [{'options': {'match': 'null', 'result': {'text': 'N/A'}}, 'type': 'special'}], 'thresholds': {'mode': 'absolute', 'steps': [{'color': 'green', 'value': None}, {'color': 'red', 'value': 80}]}, 'unit': 'ops'}, 'overrides': []}, 'gridPos': {'h': 3, 'w': 6, 'x': 0, 'y': 6}, 'id': 36, 'links': [], 'maxDataPoints': 100, 'options': {'colorMode': 'none', 'graphMode': 'area', 'justifyMode': 'auto', 'orientation': 'horizontal', 'reduceOptions': {'calcs': ['lastNotNull'], 'fields': '', 'values': False}, 'textMode': 'auto'}, 'pluginVersion': '8.3.3', 'targets': [{'expr': 
\"round(sum(irate(seldon_api_executor_client_requests_seconds_count{deployment_name=~'$deployment'}[1m])), 0.001)\", 'format': 'time_series', 'intervalFactor': 2, 'refId': 'A', 'step': 20}], 'title': 'Request Rate ($deployment)', 'type': 'stat'}, {'fieldConfig': {'defaults': {'color': {'fixedColor': 'rgb(31, 120, 193)', 'mode': 'fixed'}, 'mappings': [{'options': {'match': 'null', 'result': {'text': 'N/A'}}, 'type': 'special'}], 'thresholds': {'mode': 'absolute', 'steps': [{'color': 'green', 'value': None}, {'color': 'red', 'value': 80}]}, 'unit': 'percentunit'}, 'overrides': []}, 'gridPos': {'h': 3, 'w': 6, 'x': 6, 'y': 6}, 'id': 37, 'links': [], 'maxDataPoints': 100, 'options': {'colorMode': 'none', 'graphMode': 'area', 'justifyMode': 'auto', 'orientation': 'horizontal', 'reduceOptions': {'calcs': ['mean'], 'fields': '', 'values': False}, 'textMode': 'auto'}, 'pluginVersion': '8.3.3', 'targets': [{'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'exemplar': True, 'expr': 'sum(rate(seldon_api_executor_client_requests_seconds_count{deployment_name=~\"$deployment\",code!~\"5.*\"}[1m])) / sum(rate(seldon_api_executor_client_requests_seconds_count{deployment_name=~\"$deployment\"}[1m]))', 'format': 'time_series', 'interval': '', 'intervalFactor': 2, 'legendFormat': '', 'refId': 'A', 'step': 20}], 'title': 'Success ($deployment)', 'type': 'stat'}, {'fieldConfig': {'defaults': {'color': {'fixedColor': 'rgb(31, 120, 193)', 'mode': 'fixed'}, 'mappings': [{'options': {'match': 'null', 'result': {'text': 'N/A'}}, 'type': 'special'}], 'thresholds': {'mode': 'absolute', 'steps': [{'color': 'green', 'value': None}, {'color': 'red', 'value': 80}]}, 'unit': 'ops'}, 'overrides': []}, 'gridPos': {'h': 3, 'w': 6, 'x': 12, 'y': 6}, 'id': 38, 'links': [], 'maxDataPoints': 100, 'options': {'colorMode': 'none', 'graphMode': 'area', 'justifyMode': 'auto', 'orientation': 'horizontal', 'reduceOptions': {'calcs': ['mean'], 'fields': '', 'values': False}, 'textMode': 'auto'}, 
'pluginVersion': '8.3.3', 'targets': [{'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'exemplar': True, 'expr': 'sum(irate(seldon_api_executor_client_requests_seconds_count{deployment_name=~\"$deployment\",code=~\"4.*\"}[1m]))', 'format': 'time_series', 'interval': '', 'intervalFactor': 2, 'legendFormat': '', 'refId': 'A', 'step': 20}], 'title': '4xxs ($deployment)', 'type': 'stat'}, {'fieldConfig': {'defaults': {'color': {'fixedColor': 'rgb(31, 120, 193)', 'mode': 'fixed'}, 'mappings': [{'options': {'match': 'null', 'result': {'text': 'N/A'}}, 'type': 'special'}], 'thresholds': {'mode': 'absolute', 'steps': [{'color': 'green', 'value': None}, {'color': 'red', 'value': 80}]}, 'unit': 'ops'}, 'overrides': []}, 'gridPos': {'h': 3, 'w': 6, 'x': 18, 'y': 6}, 'id': 39, 'links': [], 'maxDataPoints': 100, 'options': {'colorMode': 'none', 'graphMode': 'area', 'justifyMode': 'auto', 'orientation': 'horizontal', 'reduceOptions': {'calcs': ['mean'], 'fields': '', 'values': False}, 'textMode': 'auto'}, 'pluginVersion': '8.3.3', 'targets': [{'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'exemplar': True, 'expr': 'sum(irate(seldon_api_executor_client_requests_seconds_count{deployment_name=~\"$deployment\",code=~\"5.*\"}[1m]))', 'format': 'time_series', 'hide': False, 'interval': '', 'intervalFactor': 2, 'legendFormat': '', 'refId': 'A', 'step': 20}], 'title': '5xxs ($deployment)', 'type': 'stat'}, {'collapsed': True, 'gridPos': {'h': 1, 'w': 24, 'x': 0, 'y': 9}, 'id': 43, 'panels': [{'gridPos': {'h': 3, 'w': 24, 'x': 0, 'y': 13}, 'id': 8, 'links': [], 'options': {'content': '\u003cdiv class=\"text-center dashboard-header\"\u003e\\n  \u003cspan\u003eModels\u003c/span\u003e\\n\u003c/div\u003e', 'mode': 'html'}, 'pluginVersion': '8.3.3', 'type': 'text'}], 'title': 'Models', 'type': 'row'}, {'aliasColors': {}, 'bars': False, 'dashLength': 10, 'dashes': False, 'fieldConfig': {'defaults': {'links': []}, 'overrides': []}, 'fill': 1, 'fillGradient': 0, 'gridPos': 
{'h': 9, 'w': 12, 'x': 0, 'y': 10}, 'hiddenSeries': False, 'id': 7, 'legend': {'avg': False, 'current': False, 'max': False, 'min': False, 'show': True, 'total': False, 'values': False}, 'lines': True, 'linewidth': 1, 'links': [], 'nullPointMode': 'null', 'options': {'alertThreshold': True}, 'percentage': False, 'pluginVersion': '8.3.3', 'pointradius': 5, 'points': False, 'renderer': 'flot', 'seriesOverrides': [], 'spaceLength': 10, 'stack': False, 'steppedLine': False, 'targets': [{'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'exemplar': True, 'expr': 'sum(rate(seldon_api_executor_client_requests_seconds_count{model_name=~\"$model_name\",model_version=~\"$model_version\",model_image=~\"$model_image\",predictor_name=~\"$predictor\",predictor_version=~\"$version\"}[5s])) by (model_name,predictor_name,predictor_version,model_image,model_version,service)', 'format': 'time_series', 'hide': True, 'interval': '', 'intervalFactor': 2, 'legendFormat': '{{predictor_name}}:{{predictor_version}} ({{model_name}} {{model_image}} : {{model_version}}) {{service}}', 'metric': 'io_seldon_apife_api_rest_RestClientController_home_snapshot_75thPercentile', 'refId': 'A', 'step': 2}, {'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'exemplar': True, 'expr': 'rate(seldon_api_executor_client_requests_seconds_count{model_name=~\"$model_name\",model_version=~\"$model_version\",model_image=~\"$model_image\",predictor_name=~\"$predictor\",predictor_version=~\"$version\"}[1m])', 'hide': False, 'interval': '', 'legendFormat': '', 'refId': 'B'}], 'thresholds': [], 'timeRegions': [], 'title': 'Reqs/sec to $model_image', 'tooltip': {'shared': True, 'sort': 0, 'value_type': 'individual'}, 'type': 'graph', 'xaxis': {'mode': 'time', 'show': True, 'values': []}, 'yaxes': [{'format': 'short', 'logBase': 1, 'min': '0', 'show': True}, {'format': 'short', 'logBase': 1, 'show': True}], 'yaxis': {'align': False}}, {'aliasColors': {}, 'bars': False, 'dashLength': 10, 'dashes': False, 
'fieldConfig': {'defaults': {'links': []}, 'overrides': []}, 'fill': 1, 'fillGradient': 0, 'gridPos': {'h': 9, 'w': 12, 'x': 12, 'y': 10}, 'hiddenSeries': False, 'id': 11, 'legend': {'alignAsTable': True, 'avg': False, 'current': False, 'max': False, 'min': False, 'show': True, 'total': False, 'values': False}, 'lines': True, 'linewidth': 1, 'links': [], 'nullPointMode': 'null', 'options': {'alertThreshold': True}, 'percentage': False, 'pluginVersion': '8.3.3', 'pointradius': 5, 'points': False, 'renderer': 'flot', 'seriesOverrides': [], 'spaceLength': 10, 'stack': False, 'steppedLine': False, 'targets': [{'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'expr': 'histogram_quantile(0.5, sum(rate(seldon_api_executor_client_requests_seconds_bucket{service=~\"/Predict\", model_image=~\"$model_image\",predictor_name=~\"$predictor\",predictor_version=~\"$version\",model_name=~\"$model_name\",model_version=~\"$model_version\"}[20s])) by (predictor_name,predictor_version,model_name,model_image,model_version,le))', 'format': 'time_series', 'hide': True, 'intervalFactor': 2, 'legendFormat': '{{predictor_name}}:{{predictor_version}} {{model_name}} {{model_image}}: {{model_version}} (p50)', 'metric': '', 'refId': 'E', 'step': 2}, {'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'exemplar': True, 'expr': 'histogram_quantile(0.75, sum(rate(seldon_api_executor_client_requests_seconds_bucket{model_image=~\"$model_image\",predictor_name=~\"$predictor\",predictor_version=~\"$version\",model_name=~\"$model_name\",model_version=~\"$model_version\"}[1m])) by (model_image, predictor_name, predictor_name, model_name, model_version, le))', 'format': 'time_series', 'hide': False, 'interval': '', 'intervalFactor': 2, 'legendFormat': '{{predictor_name}}:{{predictor_version}} {{model_name}} {{model_image}}:{{model_version}} {{service}} (p75)', 'metric': '', 'refId': 'B', 'step': 2}, {'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'exemplar': True, 'expr': 
'histogram_quantile(0.9, sum(rate(seldon_api_executor_client_requests_seconds_bucket{model_image=~\"$model_image\",predictor_name=~\"$predictor\",predictor_version=~\"$version\",model_name=~\"$model_name\",model_version=~\"$model_version\"}[1m])) by (model_image, predictor_name, predictor_name, model_name, model_version, le))', 'format': 'time_series', 'hide': False, 'interval': '', 'intervalFactor': 2, 'legendFormat': '{{predictor_name}}:{{predictor_version}} {{model_name}} {{model_image}}:{{model_version}}  {{service}} (p90)', 'metric': '', 'refId': 'A', 'step': 2}, {'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'exemplar': True, 'expr': 'histogram_quantile(0.95, sum(rate(seldon_api_executor_client_requests_seconds_bucket{model_image=~\"$model_image\",predictor_name=~\"$predictor\",predictor_version=~\"$version\",model_name=~\"$model_name\",model_version=~\"$model_version\"}[1m])) by (model_image, predictor_name, predictor_name, model_name, model_version, le))', 'format': 'time_series', 'hide': False, 'interval': '', 'intervalFactor': 2, 'legendFormat': '{{predictor_name}}:{{predictor_version}} {{model_name}} {{model_image}}:{{model_version}} {{service}} (p95)', 'metric': '', 'refId': 'C', 'step': 2}, {'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'exemplar': True, 'expr': 'histogram_quantile(0.99, sum(rate(seldon_api_executor_client_requests_seconds_bucket{model_image=~\"$model_image\",predictor_name=~\"$predictor\",predictor_version=~\"$version\",model_name=~\"$model_name\",model_version=~\"$model_version\"}[1m])) by (model_image, predictor_name, predictor_name, model_name, model_version, le))', 'format': 'time_series', 'hide': False, 'interval': '', 'intervalFactor': 2, 'legendFormat': '{{predictor_name}}:{{predictor_version}} {{model_name}} {{model_image}}:{{model_version}} {{service}} (p99)', 'metric': '', 'refId': 'D', 'step': 2}, {'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'exemplar': True, 'expr': 
'sum(rate(seldon_api_executor_client_requests_seconds_bucket{model_image=~\"$model_image\",predictor_name=~\"$predictor\",predictor_version=~\"$version\",model_name=~\"$model_name\",model_version=~\"$model_version\"}[1m]))', 'hide': True, 'interval': '', 'legendFormat': '', 'refId': 'F'}, {'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'exemplar': True, 'expr': 'code=\"500\",deployment_name=\"model-d4973568c999443a9ab474efc625f496\",method=\"post\",model_image=\"quay.io/ml-on-k8s/container-model\",model_name=\"transformer\",model_version=\"2.0.0\",predictor_name=\"flights-ontime\",predictor_version=\"\",service=\"/transform-input\",le=\"0.005\"}', 'hide': True, 'interval': '', 'legendFormat': '', 'refId': 'G'}], 'thresholds': [], 'timeRegions': [], 'title': '$model_image Latency', 'tooltip': {'shared': False, 'sort': 0, 'value_type': 'individual'}, 'type': 'graph', 'xaxis': {'mode': 'time', 'show': True, 'values': []}, 'yaxes': [{'format': 'short', 'logBase': 1, 'min': '0', 'show': True}, {'format': 'short', 'logBase': 1, 'show': True}], 'yaxis': {'align': False}}, {'collapsed': False, 'gridPos': {'h': 1, 'w': 24, 'x': 0, 'y': 19}, 'id': 44, 'panels': [], 'repeat': 'model_image', 'title': 'Model Metrics $model_image', 'type': 'row'}], 'refresh': '', 'schemaVersion': 34, 'style': 'dark', 'tags': ['seldon'], 'templating': {'list': [{'allValue': '.*', 'current': {'selected': True, 'text': ['All'], 'value': ['$__all']}, 'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'definition': 'label_values(seldon_api_executor_client_requests_seconds_count,deployment_name)', 'hide': 0, 'includeAll': True, 'multi': True, 'name': 'deployment', 'options': [], 'query': {'query': 'label_values(seldon_api_executor_client_requests_seconds_count,deployment_name)', 'refId': 'prometheus-deployment-Variable-Query'}, 'refresh': 1, 'regex': '', 'skipUrlSync': False, 'sort': 0, 'tagValuesQuery': '', 'tagsQuery': '', 'type': 'query', 'useTags': False}, {'allValue': '.*', 
'current': {'selected': True, 'text': ['All'], 'value': ['$__all']}, 'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'definition': 'label_values(seldon_api_executor_client_requests_seconds_count,predictor_name)', 'hide': 0, 'includeAll': True, 'multi': True, 'name': 'predictor', 'options': [], 'query': {'query': 'label_values(seldon_api_executor_client_requests_seconds_count,predictor_name)', 'refId': 'prometheus-predictor-Variable-Query'}, 'refresh': 1, 'regex': '', 'skipUrlSync': False, 'sort': 0, 'tagValuesQuery': '', 'tagsQuery': '', 'type': 'query', 'useTags': False}, {'allValue': '.*', 'current': {'selected': True, 'text': ['All'], 'value': ['$__all']}, 'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'definition': 'label_values(seldon_api_executor_client_requests_seconds_count,predictor_version)', 'hide': 0, 'includeAll': True, 'multi': True, 'name': 'version', 'options': [], 'query': {'query': 'label_values(seldon_api_executor_client_requests_seconds_count,predictor_version)', 'refId': 'prometheus-version-Variable-Query'}, 'refresh': 1, 'regex': '', 'skipUrlSync': False, 'sort': 0, 'tagValuesQuery': '', 'tagsQuery': '', 'type': 'query', 'useTags': False}, {'allValue': '.*', 'current': {'selected': True, 'text': ['All'], 'value': ['$__all']}, 'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'definition': 'label_values(seldon_api_executor_client_requests_seconds_count,model_name)', 'hide': 0, 'includeAll': True, 'multi': True, 'name': 'model_name', 'options': [], 'query': {'query': 'label_values(seldon_api_executor_client_requests_seconds_count,model_name)', 'refId': 'prometheus-model_name-Variable-Query'}, 'refresh': 1, 'regex': '', 'skipUrlSync': False, 'sort': 0, 'tagValuesQuery': '', 'tagsQuery': '', 'type': 'query', 'useTags': False}, {'allValue': '.*', 'current': {'selected': True, 'text': ['All'], 'value': ['$__all']}, 'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'definition': 
'label_values(seldon_api_executor_client_requests_seconds_count,model_image)', 'hide': 0, 'includeAll': True, 'multi': True, 'name': 'model_image', 'options': [], 'query': {'query': 'label_values(seldon_api_executor_client_requests_seconds_count,model_image)', 'refId': 'prometheus-model_image-Variable-Query'}, 'refresh': 1, 'regex': '', 'skipUrlSync': False, 'sort': 0, 'tagValuesQuery': '', 'tagsQuery': '', 'type': 'query', 'useTags': False}, {'allValue': '.*', 'current': {'selected': True, 'text': ['All'], 'value': ['$__all']}, 'datasource': {'type': 'prometheus', 'uid': 'HPonbHsnk'}, 'definition': 'label_values(seldon_api_executor_client_requests_seconds_count,model_version)', 'hide': 0, 'includeAll': True, 'multi': True, 'name': 'model_version', 'options': [], 'query': {'query': 'label_values(seldon_api_executor_client_requests_seconds_count,model_version)', 'refId': 'prometheus-model_version-Variable-Query'}, 'refresh': 1, 'regex': '', 'skipUrlSync': False, 'sort': 0, 'tagValuesQuery': '', 'tagsQuery': '', 'type': 'query', 'useTags': False}]}, 'time': {'from': 'now-30m', 'to': 'now'}, 'timepicker': {'refresh_intervals': ['5s', '10s', '30s', '1m', '5m', '15m', '30m', '1h', '2h', '1d'], 'time_options': ['5m', '15m', '1h', '6h', '12h', '24h', '2d', '7d', '30d']}, 'timezone': 'browser', 'title': 'Fights Prediction Analytics', 'uid': 'U1cSDzyZzx', 'version': 7, 'weekStart': ''}","reason":"MICROSERVICE_BAD_DATA","status":1}}
* Closing connection 0
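
One observation on the response above: the body echoed back after "Invalid request data type" looks like a Grafana dashboard export (top-level keys such as `annotations`, `panels`, `templating`), not a prediction payload, so the `data.json` passed to `-d @data.json` may not be the file that was intended. The sketch below assumes the Seldon Core v1 REST protocol, in which a request carries one of `data`, `jsonData`, `strData`, or `binData` at the top level. The helper names and the sample feature name are hypothetical and only illustrate the check.

```python
import json

# Top-level fields that carry the payload in a Seldon Core v1 request (assumed).
SELDON_PAYLOAD_KEYS = ("data", "jsonData", "strData", "binData")


def find_payload_key(message):
    """Return the first payload field present in the message, or None."""
    if not isinstance(message, dict):
        return None
    for key in SELDON_PAYLOAD_KEYS:
        if key in message:
            return key
    return None


def check_payload_file(path):
    """Load a JSON file and fail early if it has no recognised payload field."""
    with open(path, encoding="utf-8") as handle:
        message = json.load(handle)
    key = find_payload_key(message)
    if key is None:
        found = sorted(message) if isinstance(message, dict) else type(message).__name__
        raise ValueError(f"{path} has no Seldon payload field; found: {found}")
    return key


# A dashboard-shaped body like the one echoed in the error, and a payload-shaped one.
dashboard_like = {"annotations": {"list": []}, "panels": [], "title": "Fights Prediction Analytics"}
prediction_like = {"data": {"names": ["feature_a"], "ndarray": [[1.0]]}}

print(find_payload_key(dashboard_like))   # None
print(find_payload_key(prediction_like))  # data
```

Running `check_payload_file("data.json")` before the curl call would show whether the file on disk is a dashboard export or a prediction request.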

Seldon-core Logs:

[2023-07-24 02:40:21 +0000] [82] [DEBUG] Closing connection. 
[2023-07-24 02:40:21 +0000] [82] [DEBUG] Closing connection. 
[2023-07-24 02:40:22 +0000] [82] [DEBUG] POST /transform-input
2023-07-24 02:40:22,049 - seldon_core.wrapper:TransformInput:106 - DEBUG:  REST Request: <Request 'https://localhost:9000/transform-input' [POST]>
2023-07-24 02:40:22,050 - seldon_core.wrapper:handle_invalid_usage:60 - ERROR:  {'status': {'status': 1, 'info': 'Invalid request data type: {\'annotations\': {\'list\': [{\'builtIn\': 1, \'datasource\': \'-- Grafana --\', \'enable\': True, \'hide\': True, \'iconColor\': \'rgba(0, 211, 255, 1)\', \'name\': \'Annotations & Alerts\', \'target\': {\'limit\': 100, \'matchAny\': False, \'tags\': [], \'type\': \'dashboard\'}, \'type\': \'dashboard\'}]}, \'editable\': True, \'fiscalYearStartMonth\': 0, \'graphTooltip\': 0, \'id\': 2, \'iteration\': 1648959757365, \'links\': [], \'liveNow\': False, \'panels\': [{\'collapsed\': True, \'gridPos\': {\'h\': 1, \'w\': 24, \'x\': 0, \'y\': 0}, \'id\': 40, \'panels\': [{\'gridPos\': {\'h\': 3, \'w\': 24, \'x\': 0, \'y\': 1}, \'id\': 27, \'links\': [], \'options\': {\'content\': \'<div class="text-center dashboard-header">\\n  <span>Seldon Core API Dashboard</span>\\n</div>\', \'mode\': \'html\'}, \'pluginVersion\': \'8.3.3\', \'type\': \'text\'}], \'title\': \'Heading\', \'type\': \'row\'}, {\'collapsed\': False, \'gridPos\': {\'h\': 1, \'w\': 24, \'x\': 0, \'y\': 1}, \'id\': 41, \'panels\': [], \'title\': \'Global Counts\', \'type\': \'row\'}, {\'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'fieldConfig\': {\'defaults\': {\'color\': {\'fixedColor\': \'rgb(31, 120, 193)\', \'mode\': \'fixed\'}, \'mappings\': [{\'options\': {\'match\': \'null\', \'result\': {\'text\': \'N/A\'}}, \'type\': \'special\'}], \'thresholds\': {\'mode\': \'absolute\', \'steps\': [{\'color\': \'green\', \'value\': None}, {\'color\': \'red\', \'value\': 80}]}, \'unit\': \'ops\'}, \'overrides\': []}, \'gridPos\': {\'h\': 3, \'w\': 6, \'x\': 0, \'y\': 2}, \'id\': 16, \'links\': [], \'maxDataPoints\': 100, \'options\': {\'colorMode\': \'none\', \'graphMode\': \'area\', \'justifyMode\': \'auto\', \'orientation\': \'horizontal\', \'reduceOptions\': {\'calcs\': [\'lastNotNull\'], \'fields\': \'\', \'values\': False}, \'textMode\': \'auto\'}, 
\'pluginVersion\': \'8.3.3\', \'targets\': [{\'expr\': \'round(sum(irate(seldon_api_executor_server_requests_seconds_count[1m])),0.001)\', \'format\': \'time_series\', \'instant\': False, \'intervalFactor\': 2, \'refId\': \'A\', \'step\': 20}], \'title\': \'Global Request Rate\', \'type\': \'stat\'}, {\'fieldConfig\': {\'defaults\': {\'color\': {\'fixedColor\': \'rgb(31, 120, 193)\', \'mode\': \'fixed\'}, \'mappings\': [{\'options\': {\'match\': \'null\', \'result\': {\'text\': \'N/A\'}}, \'type\': \'special\'}], \'thresholds\': {\'mode\': \'absolute\', \'steps\': [{\'color\': \'green\', \'value\': None}, {\'color\': \'red\', \'value\': 80}]}, \'unit\': \'percentunit\'}, \'overrides\': []}, \'gridPos\': {\'h\': 3, \'w\': 6, \'x\': 6, \'y\': 2}, \'id\': 17, \'links\': [], \'maxDataPoints\': 100, \'options\': {\'colorMode\': \'none\', \'graphMode\': \'area\', \'justifyMode\': \'auto\', \'orientation\': \'horizontal\', \'reduceOptions\': {\'calcs\': [\'mean\'], \'fields\': \'\', \'values\': False}, \'textMode\': \'auto\'}, \'pluginVersion\': \'8.3.3\', \'targets\': [{\'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'exemplar\': True, \'expr\': \'sum(rate(seldon_api_executor_server_requests_seconds_count{code!~"5.*"}[1m])) / sum(rate(seldon_api_executor_server_requests_seconds_count[1m]))\', \'format\': \'time_series\', \'interval\': \'\', \'intervalFactor\': 2, \'legendFormat\': \'\', \'refId\': \'A\', \'step\': 20}], \'title\': \'Success\', \'type\': \'stat\'}, {\'fieldConfig\': {\'defaults\': {\'color\': {\'fixedColor\': \'rgb(31, 120, 193)\', \'mode\': \'fixed\'}, \'mappings\': [{\'options\': {\'match\': \'null\', \'result\': {\'text\': \'N/A\'}}, \'type\': \'special\'}], \'thresholds\': {\'mode\': \'absolute\', \'steps\': [{\'color\': \'green\', \'value\': None}, {\'color\': \'red\', \'value\': 80}]}, \'unit\': \'ops\'}, \'overrides\': []}, \'gridPos\': {\'h\': 3, \'w\': 6, \'x\': 12, \'y\': 2}, \'id\': 18, \'links\': [], \'maxDataPoints\': 
100, \'options\': {\'colorMode\': \'none\', \'graphMode\': \'area\', \'justifyMode\': \'auto\', \'orientation\': \'horizontal\', \'reduceOptions\': {\'calcs\': [\'mean\'], \'fields\': \'\', \'values\': False}, \'textMode\': \'auto\'}, \'pluginVersion\': \'8.3.3\', \'targets\': [{\'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'exemplar\': True, \'expr\': \'sum(irate(seldon_api_executor_server_requests_seconds_count{code=~"4.*"}[1m])) \', \'format\': \'time_series\', \'interval\': \'\', \'intervalFactor\': 2, \'legendFormat\': \'\', \'refId\': \'A\', \'step\': 20}], \'title\': \'4xxs\', \'type\': \'stat\'}, {\'fieldConfig\': {\'defaults\': {\'color\': {\'fixedColor\': \'rgb(31, 120, 193)\', \'mode\': \'fixed\'}, \'mappings\': [{\'options\': {\'match\': \'null\', \'result\': {\'text\': \'N/A\'}}, \'type\': \'special\'}], \'thresholds\': {\'mode\': \'absolute\', \'steps\': [{\'color\': \'green\', \'value\': None}, {\'color\': \'red\', \'value\': 80}]}, \'unit\': \'ops\'}, \'overrides\': []}, \'gridPos\': {\'h\': 3, \'w\': 6, \'x\': 18, \'y\': 2}, \'id\': 19, \'links\': [], \'maxDataPoints\': 100, \'options\': {\'colorMode\': \'none\', \'graphMode\': \'area\', \'justifyMode\': \'auto\', \'orientation\': \'horizontal\', \'reduceOptions\': {\'calcs\': [\'mean\'], \'fields\': \'\', \'values\': False}, \'textMode\': \'auto\'}, \'pluginVersion\': \'8.3.3\', \'targets\': [{\'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'exemplar\': True, \'expr\': \'sum(irate(seldon_api_executor_server_requests_seconds_count{code=~"5.*"}[1m])) \', \'format\': \'time_series\', \'hide\': False, \'interval\': \'\', \'intervalFactor\': 2, \'legendFormat\': \'\', \'refId\': \'A\', \'step\': 20}], \'title\': \'5xxs\', \'type\': \'stat\'}, {\'collapsed\': False, \'gridPos\': {\'h\': 1, \'w\': 24, \'x\': 0, \'y\': 5}, \'id\': 42, \'panels\': [], \'repeat\': \'deployment\', \'title\': \'Deployment Counts $deployment\', \'type\': \'row\'}, {\'datasource\': 
{\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'fieldConfig\': {\'defaults\': {\'color\': {\'fixedColor\': \'rgb(31, 120, 193)\', \'mode\': \'fixed\'}, \'mappings\': [{\'options\': {\'match\': \'null\', \'result\': {\'text\': \'N/A\'}}, \'type\': \'special\'}], \'thresholds\': {\'mode\': \'absolute\', \'steps\': [{\'color\': \'green\', \'value\': None}, {\'color\': \'red\', \'value\': 80}]}, \'unit\': \'ops\'}, \'overrides\': []}, \'gridPos\': {\'h\': 3, \'w\': 6, \'x\': 0, \'y\': 6}, \'id\': 36, \'links\': [], \'maxDataPoints\': 100, \'options\': {\'colorMode\': \'none\', \'graphMode\': \'area\', \'justifyMode\': \'auto\', \'orientation\': \'horizontal\', \'reduceOptions\': {\'calcs\': [\'lastNotNull\'], \'fields\': \'\', \'values\': False}, \'textMode\': \'auto\'}, \'pluginVersion\': \'8.3.3\', \'targets\': [{\'expr\': "round(sum(irate(seldon_api_executor_client_requests_seconds_count{deployment_name=~\'$deployment\'}[1m])), 0.001)", \'format\': \'time_series\', \'intervalFactor\': 2, \'refId\': \'A\', \'step\': 20}], \'title\': \'Request Rate ($deployment)\', \'type\': \'stat\'}, {\'fieldConfig\': {\'defaults\': {\'color\': {\'fixedColor\': \'rgb(31, 120, 193)\', \'mode\': \'fixed\'}, \'mappings\': [{\'options\': {\'match\': \'null\', \'result\': {\'text\': \'N/A\'}}, \'type\': \'special\'}], \'thresholds\': {\'mode\': \'absolute\', \'steps\': [{\'color\': \'green\', \'value\': None}, {\'color\': \'red\', \'value\': 80}]}, \'unit\': \'percentunit\'}, \'overrides\': []}, \'gridPos\': {\'h\': 3, \'w\': 6, \'x\': 6, \'y\': 6}, \'id\': 37, \'links\': [], \'maxDataPoints\': 100, \'options\': {\'colorMode\': \'none\', \'graphMode\': \'area\', \'justifyMode\': \'auto\', \'orientation\': \'horizontal\', \'reduceOptions\': {\'calcs\': [\'mean\'], \'fields\': \'\', \'values\': False}, \'textMode\': \'auto\'}, \'pluginVersion\': \'8.3.3\', \'targets\': [{\'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'exemplar\': True, \'expr\': 
\'sum(rate(seldon_api_executor_client_requests_seconds_count{deployment_name=~"$deployment",code!~"5.*"}[1m])) / sum(rate(seldon_api_executor_client_requests_seconds_count{deployment_name=~"$deployment"}[1m]))\', \'format\': \'time_series\', \'interval\': \'\', \'intervalFactor\': 2, \'legendFormat\': \'\', \'refId\': \'A\', \'step\': 20}], \'title\': \'Success ($deployment)\', \'type\': \'stat\'}, {\'fieldConfig\': {\'defaults\': {\'color\': {\'fixedColor\': \'rgb(31, 120, 193)\', \'mode\': \'fixed\'}, \'mappings\': [{\'options\': {\'match\': \'null\', \'result\': {\'text\': \'N/A\'}}, \'type\': \'special\'}], \'thresholds\': {\'mode\': \'absolute\', \'steps\': [{\'color\': \'green\', \'value\': None}, {\'color\': \'red\', \'value\': 80}]}, \'unit\': \'ops\'}, \'overrides\': []}, \'gridPos\': {\'h\': 3, \'w\': 6, \'x\': 12, \'y\': 6}, \'id\': 38, \'links\': [], \'maxDataPoints\': 100, \'options\': {\'colorMode\': \'none\', \'graphMode\': \'area\', \'justifyMode\': \'auto\', \'orientation\': \'horizontal\', \'reduceOptions\': {\'calcs\': [\'mean\'], \'fields\': \'\', \'values\': False}, \'textMode\': \'auto\'}, \'pluginVersion\': \'8.3.3\', \'targets\': [{\'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'exemplar\': True, \'expr\': \'sum(irate(seldon_api_executor_client_requests_seconds_count{deployment_name=~"$deployment",code=~"4.*"}[1m]))\', \'format\': \'time_series\', \'interval\': \'\', \'intervalFactor\': 2, \'legendFormat\': \'\', \'refId\': \'A\', \'step\': 20}], \'title\': \'4xxs ($deployment)\', \'type\': \'stat\'}, {\'fieldConfig\': {\'defaults\': {\'color\': {\'fixedColor\': \'rgb(31, 120, 193)\', \'mode\': \'fixed\'}, \'mappings\': [{\'options\': {\'match\': \'null\', \'result\': {\'text\': \'N/A\'}}, \'type\': \'special\'}], \'thresholds\': {\'mode\': \'absolute\', \'steps\': [{\'color\': \'green\', \'value\': None}, {\'color\': \'red\', \'value\': 80}]}, \'unit\': \'ops\'}, \'overrides\': []}, \'gridPos\': {\'h\': 3, \'w\': 6, 
\'x\': 18, \'y\': 6}, \'id\': 39, \'links\': [], \'maxDataPoints\': 100, \'options\': {\'colorMode\': \'none\', \'graphMode\': \'area\', \'justifyMode\': \'auto\', \'orientation\': \'horizontal\', \'reduceOptions\': {\'calcs\': [\'mean\'], \'fields\': \'\', \'values\': False}, \'textMode\': \'auto\'}, \'pluginVersion\': \'8.3.3\', \'targets\': [{\'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'exemplar\': True, \'expr\': \'sum(irate(seldon_api_executor_client_requests_seconds_count{deployment_name=~"$deployment",code=~"5.*"}[1m]))\', \'format\': \'time_series\', \'hide\': False, \'interval\': \'\', \'intervalFactor\': 2, \'legendFormat\': \'\', \'refId\': \'A\', \'step\': 20}], \'title\': \'5xxs ($deployment)\', \'type\': \'stat\'}, {\'collapsed\': True, \'gridPos\': {\'h\': 1, \'w\': 24, \'x\': 0, \'y\': 9}, \'id\': 43, \'panels\': [{\'gridPos\': {\'h\': 3, \'w\': 24, \'x\': 0, \'y\': 13}, \'id\': 8, \'links\': [], \'options\': {\'content\': \'<div class="text-center dashboard-header">\\n  <span>Models</span>\\n</div>\', \'mode\': \'html\'}, \'pluginVersion\': \'8.3.3\', \'type\': \'text\'}], \'title\': \'Models\', \'type\': \'row\'}, {\'aliasColors\': {}, \'bars\': False, \'dashLength\': 10, \'dashes\': False, \'fieldConfig\': {\'defaults\': {\'links\': []}, \'overrides\': []}, \'fill\': 1, \'fillGradient\': 0, \'gridPos\': {\'h\': 9, \'w\': 12, \'x\': 0, \'y\': 10}, \'hiddenSeries\': False, \'id\': 7, \'legend\': {\'avg\': False, \'current\': False, \'max\': False, \'min\': False, \'show\': True, \'total\': False, \'values\': False}, \'lines\': True, \'linewidth\': 1, \'links\': [], \'nullPointMode\': \'null\', \'options\': {\'alertThreshold\': True}, \'percentage\': False, \'pluginVersion\': \'8.3.3\', \'pointradius\': 5, \'points\': False, \'renderer\': \'flot\', \'seriesOverrides\': [], \'spaceLength\': 10, \'stack\': False, \'steppedLine\': False, \'targets\': [{\'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, 
\'exemplar\': True, \'expr\': \'sum(rate(seldon_api_executor_client_requests_seconds_count{model_name=~"$model_name",model_version=~"$model_version",model_image=~"$model_image",predictor_name=~"$predictor",predictor_version=~"$version"}[5s])) by (model_name,predictor_name,predictor_version,model_image,model_version,service)\', \'format\': \'time_series\', \'hide\': True, \'interval\': \'\', \'intervalFactor\': 2, \'legendFormat\': \'{{predictor_name}}:{{predictor_version}} ({{model_name}} {{model_image}} : {{model_version}}) {{service}}\', \'metric\': \'io_seldon_apife_api_rest_RestClientController_home_snapshot_75thPercentile\', \'refId\': \'A\', \'step\': 2}, {\'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'exemplar\': True, \'expr\': \'rate(seldon_api_executor_client_requests_seconds_count{model_name=~"$model_name",model_version=~"$model_version",model_image=~"$model_image",predictor_name=~"$predictor",predictor_version=~"$version"}[1m])\', \'hide\': False, \'interval\': \'\', \'legendFormat\': \'\', \'refId\': \'B\'}], \'thresholds\': [], \'timeRegions\': [], \'title\': \'Reqs/sec to $model_image\', \'tooltip\': {\'shared\': True, \'sort\': 0, \'value_type\': \'individual\'}, \'type\': \'graph\', \'xaxis\': {\'mode\': \'time\', \'show\': True, \'values\': []}, \'yaxes\': [{\'format\': \'short\', \'logBase\': 1, \'min\': \'0\', \'show\': True}, {\'format\': \'short\', \'logBase\': 1, \'show\': True}], \'yaxis\': {\'align\': False}}, {\'aliasColors\': {}, \'bars\': False, \'dashLength\': 10, \'dashes\': False, \'fieldConfig\': {\'defaults\': {\'links\': []}, \'overrides\': []}, \'fill\': 1, \'fillGradient\': 0, \'gridPos\': {\'h\': 9, \'w\': 12, \'x\': 12, \'y\': 10}, \'hiddenSeries\': False, \'id\': 11, \'legend\': {\'alignAsTable\': True, \'avg\': False, \'current\': False, \'max\': False, \'min\': False, \'show\': True, \'total\': False, \'values\': False}, \'lines\': True, \'linewidth\': 1, \'links\': [], \'nullPointMode\': \'null\', 
\'options\': {\'alertThreshold\': True}, \'percentage\': False, \'pluginVersion\': \'8.3.3\', \'pointradius\': 5, \'points\': False, \'renderer\': \'flot\', \'seriesOverrides\': [], \'spaceLength\': 10, \'stack\': False, \'steppedLine\': False, \'targets\': [{\'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'expr\': \'histogram_quantile(0.5, sum(rate(seldon_api_executor_client_requests_seconds_bucket{service=~"/Predict", model_image=~"$model_image",predictor_name=~"$predictor",predictor_version=~"$version",model_name=~"$model_name",model_version=~"$model_version"}[20s])) by (predictor_name,predictor_version,model_name,model_image,model_version,le))\', \'format\': \'time_series\', \'hide\': True, \'intervalFactor\': 2, \'legendFormat\': \'{{predictor_name}}:{{predictor_version}} {{model_name}} {{model_image}}: {{model_version}} (p50)\', \'metric\': \'\', \'refId\': \'E\', \'step\': 2}, {\'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'exemplar\': True, \'expr\': \'histogram_quantile(0.75, sum(rate(seldon_api_executor_client_requests_seconds_bucket{model_image=~"$model_image",predictor_name=~"$predictor",predictor_version=~"$version",model_name=~"$model_name",model_version=~"$model_version"}[1m])) by (model_image, predictor_name, predictor_name, model_name, model_version, le))\', \'format\': \'time_series\', \'hide\': False, \'interval\': \'\', \'intervalFactor\': 2, \'legendFormat\': \'{{predictor_name}}:{{predictor_version}} {{model_name}} {{model_image}}:{{model_version}} {{service}} (p75)\', \'metric\': \'\', \'refId\': \'B\', \'step\': 2}, {\'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'exemplar\': True, \'expr\': \'histogram_quantile(0.9, sum(rate(seldon_api_executor_client_requests_seconds_bucket{model_image=~"$model_image",predictor_name=~"$predictor",predictor_version=~"$version",model_name=~"$model_name",model_version=~"$model_version"}[1m])) by (model_image, predictor_name, predictor_name, 
model_name, model_version, le))\', \'format\': \'time_series\', \'hide\': False, \'interval\': \'\', \'intervalFactor\': 2, \'legendFormat\': \'{{predictor_name}}:{{predictor_version}} {{model_name}} {{model_image}}:{{model_version}}  {{service}} (p90)\', \'metric\': \'\', \'refId\': \'A\', \'step\': 2}, {\'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'exemplar\': True, \'expr\': \'histogram_quantile(0.95, sum(rate(seldon_api_executor_client_requests_seconds_bucket{model_image=~"$model_image",predictor_name=~"$predictor",predictor_version=~"$version",model_name=~"$model_name",model_version=~"$model_version"}[1m])) by (model_image, predictor_name, predictor_name, model_name, model_version, le))\', \'format\': \'time_series\', \'hide\': False, \'interval\': \'\', \'intervalFactor\': 2, \'legendFormat\': \'{{predictor_name}}:{{predictor_version}} {{model_name}} {{model_image}}:{{model_version}} {{service}} (p95)\', \'metric\': \'\', \'refId\': \'C\', \'step\': 2}, {\'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'exemplar\': True, \'expr\': \'histogram_quantile(0.99, sum(rate(seldon_api_executor_client_requests_seconds_bucket{model_image=~"$model_image",predictor_name=~"$predictor",predictor_version=~"$version",model_name=~"$model_name",model_version=~"$model_version"}[1m])) by (model_image, predictor_name, predictor_name, model_name, model_version, le))\', \'format\': \'time_series\', \'hide\': False, \'interval\': \'\', \'intervalFactor\': 2, \'legendFormat\': \'{{predictor_name}}:{{predictor_version}} {{model_name}} {{model_image}}:{{model_version}} {{service}} (p99)\', \'metric\': \'\', \'refId\': \'D\', \'step\': 2}, {\'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'exemplar\': True, \'expr\': 
\'sum(rate(seldon_api_executor_client_requests_seconds_bucket{model_image=~"$model_image",predictor_name=~"$predictor",predictor_version=~"$version",model_name=~"$model_name",model_version=~"$model_version"}[1m]))\', \'hide\': True, \'interval\': \'\', \'legendFormat\': \'\', \'refId\': \'F\'}, {\'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'exemplar\': True, \'expr\': \'code="500",deployment_name="model-d4973568c999443a9ab474efc625f496",method="post",model_image="quay.io/ml-on-k8s/container-model",model_name="transformer",model_version="2.0.0",predictor_name="flights-ontime",predictor_version="",service="/transform-input",le="0.005"}\', \'hide\': True, \'interval\': \'\', \'legendFormat\': \'\', \'refId\': \'G\'}], \'thresholds\': [], \'timeRegions\': [], \'title\': \'$model_image Latency\', \'tooltip\': {\'shared\': False, \'sort\': 0, \'value_type\': \'individual\'}, \'type\': \'graph\', \'xaxis\': {\'mode\': \'time\', \'show\': True, \'values\': []}, \'yaxes\': [{\'format\': \'short\', \'logBase\': 1, \'min\': \'0\', \'show\': True}, {\'format\': \'short\', \'logBase\': 1, \'show\': True}], \'yaxis\': {\'align\': False}}, {\'collapsed\': False, \'gridPos\': {\'h\': 1, \'w\': 24, \'x\': 0, \'y\': 19}, \'id\': 44, \'panels\': [], \'repeat\': \'model_image\', \'title\': \'Model Metrics $model_image\', \'type\': \'row\'}], \'refresh\': \'\', \'schemaVersion\': 34, \'style\': \'dark\', \'tags\': [\'seldon\'], \'templating\': {\'list\': [{\'allValue\': \'.*\', \'current\': {\'selected\': True, \'text\': [\'All\'], \'value\': [\'$__all\']}, \'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'definition\': \'label_values(seldon_api_executor_client_requests_seconds_count,deployment_name)\', \'hide\': 0, \'includeAll\': True, \'multi\': True, \'name\': \'deployment\', \'options\': [], \'query\': {\'query\': \'label_values(seldon_api_executor_client_requests_seconds_count,deployment_name)\', \'refId\': 
\'prometheus-deployment-Variable-Query\'}, \'refresh\': 1, \'regex\': \'\', \'skipUrlSync\': False, \'sort\': 0, \'tagValuesQuery\': \'\', \'tagsQuery\': \'\', \'type\': \'query\', \'useTags\': False}, {\'allValue\': \'.*\', \'current\': {\'selected\': True, \'text\': [\'All\'], \'value\': [\'$__all\']}, \'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'definition\': \'label_values(seldon_api_executor_client_requests_seconds_count,predictor_name)\', \'hide\': 0, \'includeAll\': True, \'multi\': True, \'name\': \'predictor\', \'options\': [], \'query\': {\'query\': \'label_values(seldon_api_executor_client_requests_seconds_count,predictor_name)\', \'refId\': \'prometheus-predictor-Variable-Query\'}, \'refresh\': 1, \'regex\': \'\', \'skipUrlSync\': False, \'sort\': 0, \'tagValuesQuery\': \'\', \'tagsQuery\': \'\', \'type\': \'query\', \'useTags\': False}, {\'allValue\': \'.*\', \'current\': {\'selected\': True, \'text\': [\'All\'], \'value\': [\'$__all\']}, \'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'definition\': \'label_values(seldon_api_executor_client_requests_seconds_count,predictor_version)\', \'hide\': 0, \'includeAll\': True, \'multi\': True, \'name\': \'version\', \'options\': [], \'query\': {\'query\': \'label_values(seldon_api_executor_client_requests_seconds_count,predictor_version)\', \'refId\': \'prometheus-version-Variable-Query\'}, \'refresh\': 1, \'regex\': \'\', \'skipUrlSync\': False, \'sort\': 0, \'tagValuesQuery\': \'\', \'tagsQuery\': \'\', \'type\': \'query\', \'useTags\': False}, {\'allValue\': \'.*\', \'current\': {\'selected\': True, \'text\': [\'All\'], \'value\': [\'$__all\']}, \'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'definition\': \'label_values(seldon_api_executor_client_requests_seconds_count,model_name)\', \'hide\': 0, \'includeAll\': True, \'multi\': True, \'name\': \'model_name\', \'options\': [], \'query\': {\'query\': 
\'label_values(seldon_api_executor_client_requests_seconds_count,model_name)\', \'refId\': \'prometheus-model_name-Variable-Query\'}, \'refresh\': 1, \'regex\': \'\', \'skipUrlSync\': False, \'sort\': 0, \'tagValuesQuery\': \'\', \'tagsQuery\': \'\', \'type\': \'query\', \'useTags\': False}, {\'allValue\': \'.*\', \'current\': {\'selected\': True, \'text\': [\'All\'], \'value\': [\'$__all\']}, \'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'definition\': \'label_values(seldon_api_executor_client_requests_seconds_count,model_image)\', \'hide\': 0, \'includeAll\': True, \'multi\': True, \'name\': \'model_image\', \'options\': [], \'query\': {\'query\': \'label_values(seldon_api_executor_client_requests_seconds_count,model_image)\', \'refId\': \'prometheus-model_image-Variable-Query\'}, \'refresh\': 1, \'regex\': \'\', \'skipUrlSync\': False, \'sort\': 0, \'tagValuesQuery\': \'\', \'tagsQuery\': \'\', \'type\': \'query\', \'useTags\': False}, {\'allValue\': \'.*\', \'current\': {\'selected\': True, \'text\': [\'All\'], \'value\': [\'$__all\']}, \'datasource\': {\'type\': \'prometheus\', \'uid\': \'HPonbHsnk\'}, \'definition\': \'label_values(seldon_api_executor_client_requests_seconds_count,model_version)\', \'hide\': 0, \'includeAll\': True, \'multi\': True, \'name\': \'model_version\', \'options\': [], \'query\': {\'query\': \'label_values(seldon_api_executor_client_requests_seconds_count,model_version)\', \'refId\': \'prometheus-model_version-Variable-Query\'}, \'refresh\': 1, \'regex\': \'\', \'skipUrlSync\': False, \'sort\': 0, \'tagValuesQuery\': \'\', \'tagsQuery\': \'\', \'type\': \'query\', \'useTags\': False}]}, \'time\': {\'from\': \'now-30m\', \'to\': \'now\'}, \'timepicker\': {\'refresh_intervals\': [\'5s\', \'10s\', \'30s\', \'1m\', \'5m\', \'15m\', \'30m\', \'1h\', \'2h\', \'1d\'], \'time_options\': [\'5m\', \'15m\', \'1h\', \'6h\', \'12h\', \'24h\', \'2d\', \'7d\', \'30d\']}, \'timezone\': \'browser\', \'title\': \'Fights 
Prediction Analytics\', \'uid\': \'U1cSDzyZzx\', \'version\': 7, \'weekStart\': \'\'}', 'code': -1, 'reason': 'MICROSERVICE_BAD_DATA'}}

Chapter 4. The Anatomy of a Machine Learning Platform | Issue with starting keycloak | docker-entrypoint.sh: line 165: DB_ADDR: unbound variable

Hello!
I have an issue starting Keycloak in minikube.


$ kubectl create ns keycloak
$ kubectl create -f Chapter04/keycloak.yaml --namespace keycloak

$ kubectl get pods -n keycloak
NAME                       READY   STATUS             RESTARTS      AGE
keycloak-ffb6b445c-dvj4t   0/1     CrashLoopBackOff   4 (43s ago)   2m27s

$ kubectl logs keycloak-ffb6b445c-dvj4t -n keycloak
Added 'admin' to '/opt/jboss/keycloak/standalone/configuration/keycloak-add-user.json', restart server to load user
-b 0.0.0.0
/opt/jboss/tools/docker-entrypoint.sh: line 165: DB_ADDR: unbound variable

Any ideas how to solve it?
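
For context, `unbound variable` is the message bash prints when a script running under `set -u` (nounset) expands a variable that was never set. The entrypoint script itself is not shown here, so the following is only a minimal sketch of that failure mode, not the image's actual code. `DB_ADDR` is the name from the log; the value `postgres.example` in the usage comment is a hypothetical placeholder.

```shell
#!/usr/bin/env bash
# Sketch of the nounset failure mode. Function names are illustrative only.

check_db_addr() {
  # Run in a subshell so `set -u` does not leak into the calling shell.
  (
    set -u
    # Aborts with "DB_ADDR: unbound variable" when DB_ADDR is unset.
    echo "DB_ADDR=${DB_ADDR}"
  )
}

check_db_addr_with_default() {
  (
    set -u
    # ${VAR:-} expands to an empty string instead of aborting when VAR is unset.
    echo "DB_ADDR=${DB_ADDR:-}"
  )
}

# Usage (hypothetical value):
#   unset DB_ADDR;               check_db_addr               # fails
#   unset DB_ADDR;               check_db_addr_with_default  # prints "DB_ADDR="
#   DB_ADDR=postgres.example;    check_db_addr               # prints the value
```

If the same mechanism applies in the container, the question becomes which environment variables the Deployment in Chapter04/keycloak.yaml passes to the image, which can be checked with `kubectl describe pod` on the failing pod.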

======================

Previous steps


$ cd ~/tmp/
$ git clone git@github.com:PacktPublishing/Machine-Learning-on-Kubernetes.git
$ cd Machine-Learning-on-Kubernetes/

$ kubectl create -f Chapter04/catalog-source.yaml
$ kubectl create -f Chapter04/odh-subscription.yaml

$ kubectl get packagemanifests -o wide -n olm | grep -I opendatahub
opendatahub-operator                        Community Operators Red Hat   34m

$ kubectl get pods -n operators
NAME                                    READY   STATUS    RESTARTS   AGE
opendatahub-operator-6496657bf6-hllqr   1/1     Running   0          25s

Chapter05 manifests/kfdef/ml-platform.yaml airflow problems

Platform: minikube version: v1.24.0

tl;dr airflow won't start, logs of everything are listed below

I'm trying to recreate everything and I'm stuck at this part. I've waited for everything defined in ml-platform.yaml to be configured, but app-aflow-airflow-web has been in the CrashLoopBackOff state for an hour now.

I've tried killing and recreating it, but nothing has worked.

Here is the list of pods created by this command:

kubectl apply -f manifests/kfdef/ml-platform.yaml -n ml-workshop 
NAME                                          READY   STATUS             RESTARTS        AGE
app-aflow-airflow-scheduler-f7fc5d4cb-dndwb   2/2     Running            2 (6m30s ago)   14m
app-aflow-airflow-web-54659fb97d-n6lms        1/2     CrashLoopBackOff   4 (29s ago)     2m34s
app-aflow-airflow-web-7c566d79d-4v2wv         1/2     CrashLoopBackOff   4 (19s ago)     2m33s
app-aflow-airflow-worker-0                    1/2     Running            0               2m17s
app-aflow-postgresql-0                        1/1     Running            0               14m
app-aflow-redis-master-0                      1/1     Running            0               14m
grafana-5dc6cf89d-vs8xd                       1/1     Running            0               14m
jupyterhub-7848ccd4b7-jkvpr                   1/1     Running            0               14m
jupyterhub-db-0                               1/1     Running            0               14m
minio-ml-workshop--1-m2bh4                    0/1     Completed          2               14m
minio-ml-workshop-6b84fdc7c4-7nsql            1/1     Running            0               14m
mlflow-d65ccb65d-8wpm6                        2/2     Running            0               14m
mlflow-db-0                                   1/1     Running            0               14m
seldon-controller-manager-7f67f4985b-bs5sq    1/1     Running            0               14m
spark-operator-69cfd96bf4-7h94n               1/1     Running            0               14m
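
The listing above can be narrowed to the pods that need attention. The sketch below is a hypothetical helper (`not_ready_pods` is not part of kubectl or of the book's repository) that reads the default `kubectl get pods` table from stdin and prints the name and status of every pod whose READY count is below its total, ignoring pods in the Completed state. It assumes the default column order NAME, READY, STATUS.

```shell
#!/usr/bin/env bash
# Hypothetical helper: filter `kubectl get pods` output for pods that are not fully ready.

not_ready_pods() {
  # NR > 1 skips the header row; $2 is READY ("1/2"), $3 is STATUS.
  awk 'NR > 1 && $3 != "Completed" {
         split($2, ready, "/")
         if (ready[1] != ready[2]) print $1, $3
       }'
}

# Usage against a live cluster:
#   kubectl get pods -n ml-workshop | not_ready_pods
```

For the listing above, this would print the two app-aflow-airflow-web pods and also app-aflow-airflow-worker-0, which is Running but only 1/2 ready.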

I've changed to minikube ip as mentioned.

Logs from failing container app-aflow-airflow-web-7c566d79d-4v2wv:airflow-web:

airflow 14:25:27.02
airflow 14:25:27.02 Welcome to the Bitnami airflow container
airflow 14:25:27.02 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-airflow
airflow 14:25:27.02 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-airflow/issues
airflow 14:25:27.02
airflow 14:25:27.02 INFO  ==> Enabling non-root system user with nss_wrapper
airflow 14:25:27.03 INFO  ==> ** Starting Airflow setup **
airflow 14:25:27.05 INFO  ==> Initializing Airflow ...
airflow 14:25:27.06 INFO  ==> No injected configuration file found. Creating default config file
airflow 14:25:27.77 INFO  ==> Configuring Airflow webserver authentication
airflow 14:25:27.78 INFO  ==> Configuring Airflow database
airflow 14:25:27.81 INFO  ==> Configuring Celery Executor
airflow 14:25:27.83 INFO  ==> Waiting for PostgreSQL to be available at app-aflow-postgresql:5432...
Stream closed EOF for ml-workshop/app-aflow-airflow-web-7c566d79d-4v2wv (airflow-web)

Describing the pod does not reveal much to me either:

Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  14m                   default-scheduler  Successfully assigned ml-workshop/app-aflow-airflow-web-7c566d79d-4v2wv to minikube
  Normal   Pulling    14m                   kubelet            Pulling image "registry.access.redhat.com/rhscl/postgresql-96-rhel7:latest"
  Normal   Pulled     14m                   kubelet            Successfully pulled image "registry.access.redhat.com/rhscl/postgresql-96-rhel7:latest" in 945.837211ms
  Normal   Created    14m                   kubelet            Created container waifordatabase
  Normal   Started    14m                   kubelet            Started container waifordatabase
  Normal   Pulling    14m                   kubelet            Pulling image "k8s.gcr.io/git-sync/git-sync:v3.2.2"
  Normal   Pulled     14m                   kubelet            Successfully pulled image "k8s.gcr.io/git-sync/git-sync:v3.2.2" in 2.879638022s
  Normal   Created    14m                   kubelet            Created container git-sync
  Normal   Started    14m                   kubelet            Started container git-sync
  Normal   Pulled     14m                   kubelet            Successfully pulled image "quay.io/ml-on-k8s/airflow:2.1.7.web.keycloak" in 1.810590705s
  Normal   Pulled     14m                   kubelet            Successfully pulled image "quay.io/ml-on-k8s/airflow:2.1.7.web.keycloak" in 1.999765805s
  Normal   Pulled     13m                   kubelet            Successfully pulled image "quay.io/ml-on-k8s/airflow:2.1.7.web.keycloak" in 2.210168418s
  Normal   Created    13m (x3 over 14m)     kubelet            Created container airflow-web
  Normal   Started    13m (x3 over 14m)     kubelet            Started container airflow-web
  Normal   Pulling    13m (x4 over 14m)     kubelet            Pulling image "quay.io/ml-on-k8s/airflow:2.1.7.web.keycloak"
  Warning  BackOff    4m14s (x46 over 13m)  kubelet            Back-off restarting failed container

replicaset:

Normal  SuccessfulCreate  15m   replicaset-controller  Created pod: app-aflow-airflow-web-7c566d79d-4v2wv 

kubectl logs:

$ kubectl logs -n ml-workshop app-aflow-airflow-web-7c566d79d-4v2wv
Defaulted container "git-sync" out of: git-sync, airflow-web, waifordatabase (init)
INFO: detected pid 1, running init handler
I1018 14:19:00.618669      12 main.go:430]  "level"=0 "msg"="starting up"  "args"=["/git-sync"] "pid"=12
I1018 14:19:00.618718      12 main.go:694]  "level"=0 "msg"="cloning repo"  "origin"="https://github.com/airflow-dags/dags/" "path"="/tmp/git"
I1018 14:19:14.308794      12 main.go:586]  "level"=0 "msg"="syncing git"  "hash"="8f22697a507c40bb42d4c674edd6b5c49ea0ecbb" "rev"="HEAD"
I1018 14:19:17.552166      12 main.go:607]  "level"=0 "msg"="adding worktree"  "branch"="origin/main" "path"="/tmp/git/rev-8f22697a507c40bb42d4c674edd6b5c49ea0ecbb"
I1018 14:19:17.556761      12 main.go:630]  "level"=0 "msg"="reset worktree to hash"  "hash"="8f22697a507c40bb42d4c674edd6b5c49ea0ecbb" "path"="/tmp/git/rev-8f22697a507c40bb42d4c674edd6b5c49ea0ecbb"
I1018 14:19:17.556781      12 main.go:635]  "level"=0 "msg"="updating submodules" 

previous logs:

$ kubectl logs -n ml-workshop app-aflow-airflow-web-7c566d79d-4v2wv --previous
Defaulted container "git-sync" out of: git-sync, airflow-web, waifordatabase (init)
Error from server (BadRequest): previous terminated container "git-sync" in pod "app-aflow-airflow-web-7c566d79d-4v2wv" not found
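
Both `kubectl logs` calls above defaulted to the `git-sync` container, so they do not show why `airflow-web` keeps restarting. `kubectl logs` accepts `-c` to select a container and `--previous` to read the last terminated instance of it. The sketch below only assembles that command from the names in the output above; `previous_logs_cmd` is a hypothetical helper, not an existing tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the command that reads the previous (crashed)
# instance of one specific container in a pod.

previous_logs_cmd() {
  # $1 = namespace, $2 = pod name, $3 = container name
  printf 'kubectl logs -n %s %s -c %s --previous\n' "$1" "$2" "$3"
}

# Names copied from the output above.
previous_logs_cmd "ml-workshop" "app-aflow-airflow-web-7c566d79d-4v2wv" "airflow-web"
```

Running the printed command against the cluster should return the last output of the crashed `airflow-web` container rather than that of `git-sync`.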

The Service for PostgreSQL exists and the waifordatabase init container completed successfully.

When I deleted this with:

kubectl delete -f manifests/kfdef/ml-platform.yaml -n ml-workshop 

and reapplied it with the same command as above, the airflow2-proxy secret was missing. I added it from manifests/airflow2/base/service-accounts.yaml, and the same error appeared.

Chapter04 Can't access Keycloak server

I'm stuck in the Keycloak server creation part (Installing Keycloak on Kubernetes).
After creating it and checking that it's running, I try to access the server URL, but every time I get a timeout message.
I have already cleaned up all the Kubernetes resources and started from scratch in both WSL and PowerShell, but the same error still occurs.

I'm sure I followed every step up to this point, but I don't know what to do next.

I see nothing wrong in the logs. Here is the log from the Keycloak pod:

kubectl logs keycloak-698d8fb4b-wc5xk -n keycloak


Added 'admin' to '/opt/jboss/keycloak/standalone/configuration/keycloak-add-user.json', restart server to load user
-b 0.0.0.0
=========================================================================

  Using Embedded H2 database

=========================================================================

=========================================================================

  JBoss Bootstrap Environment

  JBOSS_HOME: /opt/jboss/keycloak

  JAVA: java

  JAVA_OPTS:  -server -Xms64m -Xmx512m -XX:MetaspaceSize=96M -XX:MaxMetaspaceSize=256m -Djava.net.preferIPv4Stack=true -Djboss.modules.system.pkgs=org.jboss.byteman -Djava.awt.headless=true   --add-exports=java.base/sun.nio.ch=ALL-UNNAMED --add-exports=jdk.unsupported/sun.misc=ALL-UNNAMED --add-exports=jdk.unsupported/sun.reflect=ALL-UNNAMED

=========================================================================

21:25:31,179 INFO  [org.jboss.modules] (main) JBoss Modules version 1.11.0.Final
21:25:31,438 INFO  [org.jboss.msc] (main) JBoss MSC version 1.4.12.Final
21:25:31,443 INFO  [org.jboss.threads] (main) JBoss Threads version 2.4.0.Final
21:25:31,510 INFO  [org.jboss.as] (MSC service thread 1-2) WFLYSRV0049: Keycloak 15.0.2 (WildFly Core 15.0.1.Final) starting
21:25:31,560 INFO  [org.jboss.vfs] (MSC service thread 1-1) VFS000002: Failed to clean existing content for temp file provider of type temp. Enable DEBUG level log to find what caused this
21:25:31,976 INFO  [org.wildfly.security] (ServerService Thread Pool -- 22) ELY00001: WildFly Elytron version 1.15.3.Final
21:25:32,175 INFO  [org.jboss.as.controller.management-deprecated] (ServerService Thread Pool -- 14) WFLYCTL0033: Extension 'security' is deprecated and may not be supported in future versions
21:25:32,351 INFO  [org.jboss.as.controller.management-deprecated] (Controller Boot Thread) WFLYCTL0028: Attribute 'security-realm' in the resource at address '/core-service=management/management-interface=http-interface' is deprecated, and may be removed in a future version. See the attribute description in the output of the read-resource-description operation to learn more about the deprecation.
21:25:32,383 INFO  [org.jboss.as.controller.management-deprecated] (ServerService Thread Pool -- 24) WFLYCTL0028: Attribute 'security-realm' in the resource at address '/subsystem=undertow/server=default-server/https-listener=https' is deprecated, and may be removed in a future version. See the attribute description in the output of the read-resource-description operation to learn more about the deprecation.
21:25:32,459 INFO  [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0039: Creating http management service using socket-binding (management-http)
21:25:32,477 INFO  [org.xnio] (MSC service thread 1-2) XNIO version 3.8.4.Final
21:25:32,482 INFO  [org.xnio.nio] (MSC service thread 1-2) XNIO NIO Implementation Version 3.8.4.Final
21:25:32,516 INFO  [org.jboss.remoting] (MSC service thread 1-1) JBoss Remoting version 5.0.20.Final
21:25:32,518 INFO  [org.wildfly.extension.health] (ServerService Thread Pool -- 38) WFLYHEALTH0001: Activating Base Health Subsystem
21:25:32,520 WARN  [org.jboss.as.txn] (ServerService Thread Pool -- 55) WFLYTX0013: The node-identifier attribute on the /subsystem=transactions is set to the default value. This is a danger for environments running multiple servers. Please make sure the attribute value is unique.
21:25:32,521 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 39) WFLYCLINF0001: Activating Infinispan subsystem.
21:25:32,523 INFO  [org.jboss.as.security] (ServerService Thread Pool -- 53) WFLYSEC0002: Activating Security Subsystem
21:25:32,527 INFO  [org.jboss.as.clustering.jgroups] (ServerService Thread Pool -- 43) WFLYCLJG0001: Activating JGroups subsystem. JGroups version 4.2.11
21:25:32,525 INFO  [org.jboss.as.naming] (ServerService Thread Pool -- 50) WFLYNAM0001: Activating Naming Subsystem
21:25:32,531 INFO  [org.jboss.as.connector.subsystems.datasources] (ServerService Thread Pool -- 33) WFLYJCA0004: Deploying JDBC-compliant driver class org.h2.Driver (version 1.4)
21:25:32,541 INFO  [org.wildfly.extension.io] (ServerService Thread Pool -- 40) WFLYIO001: Worker 'default' has auto-configured to 2 IO threads with 16 max task threads based on your 1 available processors
21:25:32,547 INFO  [org.wildfly.extension.metrics] (ServerService Thread Pool -- 48) WFLYMETRICS0001: Activating Base Metrics Subsystem
21:25:32,559 INFO  [org.jboss.as.jaxrs] (ServerService Thread Pool -- 41) WFLYRS0016: RESTEasy version 3.15.1.Final
21:25:32,582 INFO  [org.jboss.as.connector] (MSC service thread 1-1) WFLYJCA0009: Starting Jakarta Connectors Subsystem (WildFly/IronJacamar 1.4.27.Final)
21:25:32,589 INFO  [org.jboss.as.security] (MSC service thread 1-1) WFLYSEC0001: Current PicketBox version=5.0.3.Final-redhat-00007
21:25:32,609 WARN  [org.wildfly.clustering.web.undertow] (ServerService Thread Pool -- 56) WFLYCLWEBUT0007: No routing provider found for default-server; using legacy provider based on static configuration
21:25:32,643 INFO  [org.wildfly.extension.undertow] (ServerService Thread Pool -- 56) WFLYUT0014: Creating file handler for path '/opt/jboss/keycloak/welcome-content' with options [directory-listing: 'false', follow-symlink: 'false', case-sensitive: 'true', safe-symlink-paths: '[]']
21:25:32,682 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-2) WFLYUT0003: Undertow 2.2.5.Final starting
21:25:32,712 INFO  [org.jboss.as.ejb3] (MSC service thread 1-2) WFLYEJB0482: Strict pool mdb-strict-max-pool is using a max instance size of 4 (per class), which is derived from the number of CPUs on this host.
21:25:32,713 INFO  [org.jboss.as.ejb3] (MSC service thread 1-2) WFLYEJB0481: Strict pool slsb-strict-max-pool is using a max instance size of 16 (per class), which is derived from thread worker pool sizing.
21:25:32,713 INFO  [org.jboss.as.connector.deployers.jdbc] (MSC service thread 1-2) WFLYJCA0018: Started Driver service with driver-name = h2
21:25:32,785 INFO  [org.jboss.as.naming] (MSC service thread 1-2) WFLYNAM0003: Starting Naming Service
21:25:32,794 INFO  [org.jboss.as.mail.extension] (MSC service thread 1-2) WFLYMAIL0001: Bound mail session [java:jboss/mail/Default]
21:25:32,816 WARN  [org.wildfly.extension.elytron] (MSC service thread 1-2) WFLYELY00023: KeyStore file '/opt/jboss/keycloak/standalone/configuration/application.keystore' does not exist. Used blank.
21:25:32,876 WARN  [org.wildfly.extension.elytron] (MSC service thread 1-2) WFLYELY01084: KeyStore /opt/jboss/keycloak/standalone/configuration/application.keystore not found, it will be auto generated on first use with a self-signed certificate for host localhost
21:25:32,879 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-1) WFLYUT0012: Started server default-server.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.wildfly.extension.elytron.SSLDefinitions (jar:file:/opt/jboss/keycloak/modules/system/layers/base/org/wildfly/extension/elytron/main/wildfly-elytron-integration-15.0.1.Final.jar!/) to method com.sun.net.ssl.internal.ssl.Provider.isFIPS()
WARNING: Please consider reporting this to the maintainers of org.wildfly.extension.elytron.SSLDefinitions
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
←[0m←[0m21:25:32,886 INFO  [org.jboss.as.patching] (MSC service thread 1-2) WFLYPAT0050: Keycloak cumulative patch ID is: base, one-off patches include: none
←[0m←[0m21:25:32,910 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-2) WFLYUT0006: Undertow AJP listener ajp listening on 0.0.0.0:8009
←[0m←[0m21:25:32,911 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-2) Queuing requests.
←[0m←[0m21:25:32,911 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-2) WFLYUT0018: Host default-host starting
←[0m←[0m21:25:32,918 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-2) WFLYUT0006: Undertow HTTP listener default listening on 0.0.0.0:8080
←[0m←[33m21:25:32,928 WARN  [org.jboss.as.domain.management.security] (MSC service thread 1-1) WFLYDM0111: Keystore /opt/jboss/keycloak/standalone/configuration/application.keystore not found, it will be auto generated on first use with a self signed certificate for host localhost
←[0m←[0m21:25:32,929 INFO  [org.jboss.as.server.deployment.scanner] (MSC service thread 1-2) WFLYDS0013: Started FileSystemDeploymentService for directory /opt/jboss/keycloak/standalone/deployments
←[0m←[0m21:25:32,930 INFO  [org.jboss.modcluster] (ServerService Thread Pool -- 58) MODCLUSTER000001: Initializing mod_cluster version 1.4.3.Final
←[0m←[0m21:25:32,933 INFO  [org.jboss.as.server.deployment] (MSC service thread 1-1) WFLYSRV0027: Starting deployment of "keycloak-server.war" (runtime-name: "keycloak-server.war")
←[0m←[0m21:25:32,934 INFO  [org.jboss.modcluster] (ServerService Thread Pool -- 58) MODCLUSTER000032: Listening to proxy advertisements on /224.0.1.105:23364
←[0m←[0m21:25:32,937 INFO  [org.jboss.as.ejb3] (MSC service thread 1-1) WFLYEJB0493: Jakarta Enterprise Beans subsystem suspension complete
←[0m←[0m21:25:32,977 INFO  [org.wildfly.extension.undertow] (MSC service thread 1-2) WFLYUT0006: Undertow HTTPS listener https listening on 0.0.0.0:8443
←[0m←[0m21:25:33,007 INFO  [org.jboss.as.connector.subsystems.datasources] (MSC service thread 1-2) WFLYJCA0001: Bound data source [java:jboss/datasources/KeycloakDS]
←[0m←[0m21:25:33,007 INFO  [org.jboss.as.connector.subsystems.datasources] (MSC service thread 1-2) WFLYJCA0001: Bound data source [java:jboss/datasources/ExampleDS]
←[0m←[33m21:25:33,160 WARN  [org.jgroups.protocols.UDP] (ServerService Thread Pool -- 58) JGRP000015: the send buffer of socket ManagedMulticastSocketBinding was set to 1.00MB, but the OS only allocated 212.99KB
←[0m←[33m21:25:33,160 WARN  [org.jgroups.protocols.UDP] (ServerService Thread Pool -- 58) JGRP000015: the receive buffer of socket ManagedMulticastSocketBinding was set to 20.00MB, but the OS only allocated 212.99KB
←[0m←[33m21:25:33,160 WARN  [org.jgroups.protocols.UDP] (ServerService Thread Pool -- 58) JGRP000015: the send buffer of socket ManagedMulticastSocketBinding was set to 1.00MB, but the OS only allocated 212.99KB
←[0m←[33m21:25:33,161 WARN  [org.jgroups.protocols.UDP] (ServerService Thread Pool -- 58) JGRP000015: the receive buffer of socket ManagedMulticastSocketBinding was set to 25.00MB, but the OS only allocated 212.99KB
←[0m←[0m21:25:36,174 INFO  [org.jgroups.protocols.pbcast.GMS] (ServerService Thread Pool -- 58) keycloak-698d8fb4b-wc5xk: no members discovered after 3002 ms: creating cluster as coordinator
←[0m←[0m21:25:36,451 INFO  [org.infinispan.CONTAINER] (ServerService Thread Pool -- 58) ISPN000128: Infinispan version: Infinispan 'Corona Extra' 11.0.9.Final
←[0m←[0m21:25:36,466 INFO  [org.infinispan.PERSISTENCE] (ServerService Thread Pool -- 59) ISPN000556: Starting user marshaller 'org.wildfly.clustering.infinispan.marshalling.jboss.JBossMarshaller'
←[0m←[0m21:25:36,482 INFO  [org.infinispan.PERSISTENCE] (ServerService Thread Pool -- 58) ISPN000556: Starting user marshaller 'org.wildfly.clustering.infinispan.spi.marshalling.InfinispanProtoStreamMarshaller'
←[0m←[0m21:25:36,482 INFO  [org.infinispan.PERSISTENCE] (ServerService Thread Pool -- 61) ISPN000556: Starting user marshaller 'org.wildfly.clustering.infinispan.marshalling.jboss.JBossMarshaller'
←[0m←[0m21:25:36,482 INFO  [org.infinispan.PERSISTENCE] (ServerService Thread Pool -- 62) ISPN000556: Starting user marshaller 'org.wildfly.clustering.infinispan.spi.marshalling.InfinispanProtoStreamMarshaller'
←[0m←[0m21:25:36,482 INFO  [org.infinispan.PERSISTENCE] (ServerService Thread Pool -- 60) ISPN000556: Starting user marshaller 'org.wildfly.clustering.infinispan.spi.marshalling.InfinispanProtoStreamMarshaller'
←[0m←[0m21:25:36,541 INFO  [org.infinispan.CLUSTER] (ServerService Thread Pool -- 59) ISPN000078: Starting JGroups channel ejb
←[0m←[0m21:25:36,541 INFO  [org.infinispan.CLUSTER] (ServerService Thread Pool -- 62) ISPN000078: Starting JGroups channel ejb
←[0m←[0m21:25:36,541 INFO  [org.infinispan.CLUSTER] (ServerService Thread Pool -- 60) ISPN000078: Starting JGroups channel ejb
←[0m←[0m21:25:36,541 INFO  [org.infinispan.CLUSTER] (ServerService Thread Pool -- 61) ISPN000078: Starting JGroups channel ejb
←[0m←[0m21:25:36,541 INFO  [org.infinispan.CLUSTER] (ServerService Thread Pool -- 58) ISPN000078: Starting JGroups channel ejb
←[0m←[0m21:25:36,544 INFO  [org.infinispan.CLUSTER] (ServerService Thread Pool -- 62) ISPN000094: Received new cluster view for channel ejb: [keycloak-698d8fb4b-wc5xk|0] (1) [keycloak-698d8fb4b-wc5xk]
←[0m←[0m21:25:36,544 INFO  [org.infinispan.CLUSTER] (ServerService Thread Pool -- 59) ISPN000094: Received new cluster view for channel ejb: [keycloak-698d8fb4b-wc5xk|0] (1) [keycloak-698d8fb4b-wc5xk]
←[0m←[0m21:25:36,544 INFO  [org.infinispan.CLUSTER] (ServerService Thread Pool -- 58) ISPN000094: Received new cluster view for channel ejb: [keycloak-698d8fb4b-wc5xk|0] (1) [keycloak-698d8fb4b-wc5xk]
←[0m←[0m21:25:36,544 INFO  [org.infinispan.CLUSTER] (ServerService Thread Pool -- 61) ISPN000094: Received new cluster view for channel ejb: [keycloak-698d8fb4b-wc5xk|0] (1) [keycloak-698d8fb4b-wc5xk]
←[0m←[0m21:25:36,544 INFO  [org.infinispan.CLUSTER] (ServerService Thread Pool -- 60) ISPN000094: Received new cluster view for channel ejb: [keycloak-698d8fb4b-wc5xk|0] (1) [keycloak-698d8fb4b-wc5xk]
←[0m←[0m21:25:36,549 INFO  [org.infinispan.CLUSTER] (ServerService Thread Pool -- 59) ISPN000079: Channel ejb local address is keycloak-698d8fb4b-wc5xk, physical addresses are [172.17.0.12:55200]
←[0m←[0m21:25:36,549 INFO  [org.infinispan.CLUSTER] (ServerService Thread Pool -- 60) ISPN000079: Channel ejb local address is keycloak-698d8fb4b-wc5xk, physical addresses are [172.17.0.12:55200]
←[0m←[0m21:25:36,549 INFO  [org.infinispan.CLUSTER] (ServerService Thread Pool -- 58) ISPN000079: Channel ejb local address is keycloak-698d8fb4b-wc5xk, physical addresses are [172.17.0.12:55200]
←[0m←[0m21:25:36,549 INFO  [org.infinispan.CLUSTER] (ServerService Thread Pool -- 62) ISPN000079: Channel ejb local address is keycloak-698d8fb4b-wc5xk, physical addresses are [172.17.0.12:55200]
←[0m←[0m21:25:36,554 INFO  [org.infinispan.CLUSTER] (ServerService Thread Pool -- 61) ISPN000079: Channel ejb local address is keycloak-698d8fb4b-wc5xk, physical addresses are [172.17.0.12:55200]
←[0m←[0m21:25:36,575 INFO  [org.infinispan.CONFIG] (MSC service thread 1-1) ISPN000152: Passivation configured without an eviction policy being selected. Only manually evicted entities will be passivated.
←[0m←[0m21:25:36,577 INFO  [org.infinispan.CONFIG] (MSC service thread 1-1) ISPN000152: Passivation configured without an eviction policy being selected. Only manually evicted entities will be passivated.
←[0m←[0m21:25:36,805 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 62) WFLYCLINF0002: Started http-remoting-connector cache from ejb container
←[0m←[0m21:25:36,826 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 59) WFLYCLINF0002: Started work cache from keycloak container
←[0m←[0m21:25:36,827 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 64) WFLYCLINF0002: Started offlineSessions cache from keycloak container
←[0m←[0m21:25:36,829 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 65) WFLYCLINF0002: Started clientSessions cache from keycloak container
←[0m←[0m21:25:36,834 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 61) WFLYCLINF0002: Started actionTokens cache from keycloak container
←[0m←[0m21:25:36,835 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 60) WFLYCLINF0002: Started offlineClientSessions cache from keycloak container
←[0m←[0m21:25:36,835 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 69) WFLYCLINF0002: Started authenticationSessions cache from keycloak container
←[0m←[0m21:25:36,837 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 66) WFLYCLINF0002: Started sessions cache from keycloak container
←[0m←[0m21:25:36,837 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 70) WFLYCLINF0002: Started loginFailures cache from keycloak container
←[0m←[0m21:25:36,850 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 58) WFLYCLINF0002: Started users cache from keycloak container
←[0m←[0m21:25:36,850 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 63) WFLYCLINF0002: Started authorization cache from keycloak container
←[0m←[0m21:25:36,850 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 67) WFLYCLINF0002: Started realms cache from keycloak container
←[0m←[0m21:25:36,850 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 68) WFLYCLINF0002: Started keys cache from keycloak container
←[0m←[33m21:25:36,894 WARN  [org.jboss.as.server.deployment] (MSC service thread 1-2) WFLYSRV0273: Excluded subsystem webservices via jboss-deployment-structure.xml does not exist.
←[0m←[0m21:25:37,178 INFO  [org.keycloak.services] (ServerService Thread Pool -- 68) KC-SERVICES0001: Loading config from standalone.xml or domain.xml
←[0m←[0m21:25:37,470 INFO  [org.keycloak.url.DefaultHostnameProviderFactory] (ServerService Thread Pool -- 68) Frontend: <request>, Admin: <frontend>, Backend: <request>
←[0m←[0m21:25:37,484 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 68) WFLYCLINF0002: Started realmRevisions cache from keycloak container
←[0m←[0m21:25:37,486 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 68) WFLYCLINF0002: Started userRevisions cache from keycloak container
←[0m←[0m21:25:37,489 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 68) WFLYCLINF0002: Started authorizationRevisions cache from keycloak container
←[0m←[0m21:25:37,490 INFO  [org.keycloak.connections.infinispan.DefaultInfinispanConnectionProviderFactory] (ServerService Thread Pool -- 68) Node name: keycloak-698d8fb4b-wc5xk, Site name: null
←[0m←[0m21:25:38,142 INFO  [org.keycloak.connections.jpa.DefaultJpaConnectionProviderFactory] (ServerService Thread Pool -- 68) Database info: {databaseUrl=jdbc:h2:/opt/jboss/keycloak/standalone/data/keycloak, databaseUser=SA, databaseProduct=H2 1.4.197 (2018-03-18), databaseDriver=H2 JDBC Driver 1.4.197 (2018-03-18)}
←[0m←[0m21:25:38,852 INFO  [org.hibernate.jpa.internal.util.LogHelper] (ServerService Thread Pool -- 68) HHH000204: Processing PersistenceUnitInfo [
        name: keycloak-default
        ...]
←[0m←[0m21:25:38,883 INFO  [org.hibernate.Version] (ServerService Thread Pool -- 68) HHH000412: Hibernate Core {5.3.20.Final}
←[0m←[0m21:25:38,884 INFO  [org.hibernate.cfg.Environment] (ServerService Thread Pool -- 68) HHH000206: hibernate.properties not found
←[0m←[0m21:25:38,950 INFO  [org.hibernate.annotations.common.Version] (ServerService Thread Pool -- 68) HCANN000001: Hibernate Commons Annotations {5.0.5.Final}
←[0m←[0m21:25:39,033 INFO  [org.hibernate.dialect.Dialect] (ServerService Thread Pool -- 68) HHH000400: Using dialect: org.hibernate.dialect.H2Dialect
←[0m←[0m21:25:39,056 INFO  [org.hibernate.envers.boot.internal.EnversServiceImpl] (ServerService Thread Pool -- 68) Envers integration enabled? : true
←[0m←[0m21:25:39,281 INFO  [org.hibernate.orm.beans] (ServerService Thread Pool -- 68) HHH10005002: No explicit CDI BeanManager reference was passed to Hibernate, but CDI is available on the Hibernate ClassLoader.
←[0m←[0m21:25:39,309 INFO  [org.hibernate.validator.internal.util.Version] (ServerService Thread Pool -- 68) HV000001: Hibernate Validator 6.0.22.Final
←[0m←[0m21:25:39,832 INFO  [org.hibernate.hql.internal.QueryTranslatorFactoryInitiator] (ServerService Thread Pool -- 68) HHH000397: Using ASTQueryTranslatorFactory
←[0m←[0m21:25:40,204 INFO  [org.keycloak.services] (ServerService Thread Pool -- 68) KC-SERVICES0006: Importing users from '/opt/jboss/keycloak/standalone/configuration/keycloak-add-user.json'
←[0m←[33m21:25:40,307 WARN  [org.keycloak.services] (ServerService Thread Pool -- 68) KC-SERVICES0104: Not creating user admin. It already exists.
←[0m←[0m21:25:40,337 INFO  [org.jboss.resteasy.resteasy_jaxrs.i18n] (ServerService Thread Pool -- 68) RESTEASY002225: Deploying javax.ws.rs.core.Application: class org.keycloak.services.resources.KeycloakApplication
←[0m←[0m21:25:40,338 INFO  [org.jboss.resteasy.resteasy_jaxrs.i18n] (ServerService Thread Pool -- 68) RESTEASY002200: Adding class resource org.keycloak.services.resources.ThemeResource from Application class org.keycloak.services.resources.KeycloakApplication
←[0m←[0m21:25:40,339 INFO  [org.jboss.resteasy.resteasy_jaxrs.i18n] (ServerService Thread Pool -- 68) RESTEASY002205: Adding provider class org.keycloak.services.error.KeycloakErrorHandler from Application class org.keycloak.services.resources.KeycloakApplication
←[0m←[0m21:25:40,339 INFO  [org.jboss.resteasy.resteasy_jaxrs.i18n] (ServerService Thread Pool -- 68) RESTEASY002200: Adding class resource org.keycloak.services.resources.JsResource from Application class org.keycloak.services.resources.KeycloakApplication
←[0m←[0m21:25:40,339 INFO  [org.jboss.resteasy.resteasy_jaxrs.i18n] (ServerService Thread Pool -- 68) RESTEASY002205: Adding provider class org.keycloak.services.filters.KeycloakSecurityHeadersFilter from Application class org.keycloak.services.resources.KeycloakApplication
←[0m←[0m21:25:40,339 INFO  [org.jboss.resteasy.resteasy_jaxrs.i18n] (ServerService Thread Pool -- 68) RESTEASY002220: Adding singleton resource org.keycloak.services.resources.WelcomeResource from Application class org.keycloak.services.resources.KeycloakApplication
←[0m←[0m21:25:40,339 INFO  [org.jboss.resteasy.resteasy_jaxrs.i18n] (ServerService Thread Pool -- 68) RESTEASY002210: Adding provider singleton org.keycloak.services.util.ObjectMapperResolver from Application class org.keycloak.services.resources.KeycloakApplication
←[0m←[0m21:25:40,339 INFO  [org.jboss.resteasy.resteasy_jaxrs.i18n] (ServerService Thread Pool -- 68) RESTEASY002220: Adding singleton resource org.keycloak.services.resources.RobotsResource from Application class org.keycloak.services.resources.KeycloakApplication
←[0m←[0m21:25:40,339 INFO  [org.jboss.resteasy.resteasy_jaxrs.i18n] (ServerService Thread Pool -- 68) RESTEASY002220: Adding singleton resource org.keycloak.services.resources.RealmsResource from Application class org.keycloak.services.resources.KeycloakApplication
←[0m←[0m21:25:40,339 INFO  [org.jboss.resteasy.resteasy_jaxrs.i18n] (ServerService Thread Pool -- 68) RESTEASY002220: Adding singleton resource org.keycloak.services.resources.admin.AdminRoot from Application class org.keycloak.services.resources.KeycloakApplication
←[0m←[0m21:25:40,385 INFO  [org.wildfly.extension.undertow] (ServerService Thread Pool -- 68) WFLYUT0021: Registered web context: '/auth' for server 'default-server'
←[0m←[0m21:25:40,443 INFO  [org.jboss.as.server] (ServerService Thread Pool -- 46) WFLYSRV0010: Deployed "keycloak-server.war" (runtime-name : "keycloak-server.war")
←[0m←[0m21:25:40,464 INFO  [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0212: Resuming server
←[0m←[0m21:25:40,466 INFO  [org.jboss.as] (Controller Boot Thread) WFLYSRV0025: Keycloak 15.0.2 (WildFly Core 15.0.1.Final) started in 9485ms - Started 692 of 977 services (686 services are lazy, passive or on-demand)
←[0m←[0m21:25:40,467 INFO  [org.jboss.as] (Controller Boot Thread) WFLYSRV0060: Http management interface listening on http://127.0.0.1:9990/management
←[0m←[0m21:25:40,467 INFO  [org.jboss.as] (Controller Boot Thread) WFLYSRV0051: Admin console listening on http://127.0.0.1:9990

Chapter07: subprocess.CalledProcessError: Command '['python3', 'build_push_image.py']' returned non-zero exit status 1.

Platform: Kubespray version 1.26.6

Good day, everyone. I am following along with your documentation and have reached page 210.

I'm encountering an error while building on Airflow, specifically in the build_push_image step. From what I can see, the step had just finished installing the packages from requirements.txt when it failed with: subprocess.CalledProcessError: Command '['python3', 'build_push_image.py']' returned non-zero exit status 1. (I don't understand why Airflow logs this at INFO level, while the pod logs show it as an error, lol :))). Please help me investigate this error. Below are the full logs and the structure inside the build_push_image.py file. Thank you very much.

Log in pod:

[root@k8s-master airflow-dags]# kubectl logs -f -n ml-workshop build-push-image.2bc94f1f645145b2a24d8577fa5c32de
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 18688 100 18688 0 0 50644 0 --:--:-- --:--:-- --:--:-- 50644
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 1282 100 1282 0 0 11446 0 --:--:-- --:--:-- --:--:-- 11446
Requirement already satisfied: packaging in /usr/local/lib/python3.7/site-packages (21.3)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/local/lib/python3.7/site-packages (from packaging) (3.0.6)
WARNING: You are using pip version 20.3.1; however, version 23.1.2 is available.
You should consider upgrading via the '/usr/local/bin/python3 -m pip install --upgrade pip' command.
[I 01:58:35.771] 'model_deploy-0628015808':'build_push_image' - starting operation
[I 01:58:35.771] 'model_deploy-0628015808':'build_push_image' - Installing packages
[I 01:58:35.771] Package not found. Installing ipykernel package with version 5.3.0...
[I 01:58:35.772] Package not found. Installing ipython package with version 7.15.0...
[I 01:58:35.772] Package not found. Installing ipython-genutils package with version 0.2.0...
[I 01:58:35.772] Package not found. Installing jupyter-client package with version 6.1.6...
[I 01:58:35.772] Package not found. Installing jupyter-core package with version 4.6.3...
[I 01:58:35.772] Newer minio package with version 6.0.2 already installed. Skipping...
[I 01:58:35.772] Package not found. Installing nbclient package with version 0.4.1...
[I 01:58:35.772] Package not found. Installing nbconvert package with version 5.6.1...
[I 01:58:35.772] Package not found. Installing nbformat package with version 5.0.7...
[I 01:58:35.772] Package not found. Installing papermill package with version 2.1.2...
[I 01:58:35.772] Package not found. Installing pyzmq package with version 19.0.1...
[I 01:58:35.772] Package not found. Installing prompt-toolkit package with version 3.0.5...
[I 01:58:35.772] Newer requests package with version 2.25.0 already installed. Skipping...
[I 01:58:35.773] Newer tornado package with version 6.1 already installed. Skipping...
[I 01:58:35.773] Package not found. Installing traitlets package with version 4.3.3...
[I 01:58:35.773] Newer urllib3 package with version 1.26.2 already installed. Skipping...
......
WARNING: You are using pip version 20.3.1; however, version 23.1.2 is available.
You should consider upgrading via the '/usr/local/bin/python3 -m pip install --upgrade pip' command.
....
[I 01:58:52.394] 'model_deploy-0628015808':'build_push_image' - Packages installed (16.623 secs)
[I 01:58:52.494] 'model_deploy-0628015808':'build_push_image' - processing dependencies
[I 01:58:52.779] 'model_deploy-0628015808':'build_push_image' - downloaded build_push_image-30e0375e-66aa-4f9a-994c-77e7814be449.tar.gz from bucket: airflow, object: model_deploy-0628015808/build_push_image-30e0375e-66aa-4f9a-994c-77e7814be449.tar.gz (0.285 secs)
/
Dockerfile
tar: Removing leading `/' from member names
Predictor.py
base_requirements.txt
build_push_image.py
[I 01:58:52.786] 'model_deploy-0628015808':'build_push_image' - dependencies processed (0.292 secs)
[I 01:58:52.786] 'model_deploy-0628015808':'build_push_image' - executing python script using 'python3 build_push_image.py' to 'build_push_image.log'
[E 01:58:57.482] Unexpected error: <class 'subprocess.CalledProcessError'>
[E 01:58:57.483] Error details: Command '['python3', 'build_push_image.py']' returned non-zero exit status 1.
[I 01:58:57.538] 'model_deploy-0628015808':'build_push_image' - uploaded build_push_image.log to bucket: airflow object: model_deploy-0628015808/build_push_image.log (0.056 secs)
Traceback (most recent call last):
  File "bootstrapper.py", line 430, in <module>
    main()
  File "bootstrapper.py", line 423, in main
    file_op.execute()
  File "bootstrapper.py", line 274, in execute
    raise ex
  File "bootstrapper.py", line 261, in execute
    subprocess.run(['python3', python_script], stdout=log_file, stderr=subprocess.STDOUT, check=True)
  File "/usr/local/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['python3', 'build_push_image.py']' returned non-zero exit status 1.
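
A note on reading this traceback: the `subprocess.run` call at line 261 of bootstrapper.py passes `stdout=log_file` and `stderr=subprocess.STDOUT` together with `check=True`. Whatever build_push_image.py printed before exiting therefore goes into build_push_image.log, and the only thing surfaced here is the generic CalledProcessError. Below is a minimal sketch of that behaviour, assuming a hypothetical failing script (`failing_step.py` and its error message are stand-ins, not the contents of the real build_push_image.py):

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_like_bootstrapper(script: Path, log_path: Path) -> None:
    """Run a script with the same redirection as the call shown in the traceback."""
    with open(log_path, "w") as log_file:
        subprocess.run(
            [sys.executable, str(script)],
            stdout=log_file,           # the child's stdout goes to the log file
            stderr=subprocess.STDOUT,  # the child's stderr is merged into the same file
            check=True,                # a non-zero exit raises CalledProcessError
        )


def demo():
    """Return (exception message, captured log text) for a deliberately failing script."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "failing_step.py"  # hypothetical stand-in script
        script.write_text("raise RuntimeError('simulated failure inside the step')\n")
        log_path = Path(tmp) / "failing_step.log"
        try:
            run_like_bootstrapper(script, log_path)
        except subprocess.CalledProcessError as exc:
            return str(exc), log_path.read_text()
    return "", ""


if __name__ == "__main__":
    message, log_text = demo()
    print("exception:", message)   # generic: only the command and the exit status
    print("log file :", log_text)  # the child's own traceback ends up here
```

In this sketch the exception message carries only the command and exit status, while the child's own traceback lands in the log file. By the same reasoning, the real cause here is more likely to be found in build_push_image.log, which the log above reports as uploaded to the airflow bucket, than in this traceback.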

Log in Airflow:

[2023-06-28 01:58:28,165] {taskinstance.py:903} INFO - Dependencies all met for <TaskInstance: model_deploy-0628015808.build_push_image 2023-06-27T00:00:00+00:00 [queued]>
[2023-06-28 01:58:28,407] {taskinstance.py:903} INFO - Dependencies all met for <TaskInstance: model_deploy-0628015808.build_push_image 2023-06-27T00:00:00+00:00 [queued]>

[2023-06-28 01:58:28,407] {taskinstance.py:1094} INFO -

[2023-06-28 01:58:28,407] {taskinstance.py:1095} INFO - Starting attempt 1 of 1
[2023-06-28 01:58:28,407] {taskinstance.py:1096} INFO -

[2023-06-28 01:58:28,839] {taskinstance.py:1114} INFO - Executing <Task(NotebookOp): build_push_image> on 2023-06-27T00:00:00+00:00
[2023-06-28 01:58:28,843] {standard_task_runner.py:52} INFO - Started process 412 to run task
[2023-06-28 01:58:28,849] {standard_task_runner.py:76} INFO - Running: ['airflow', 'tasks', 'run', 'model_deploy-0628015808', 'build_push_image', '2023-06-27T00:00:00+00:00', '--job-id', '27', '--pool', 'default_pool', '--raw', '--subdir', 'DAGS_FOLDER/gitdags/model_deploy-0628015808.py', '--cfg-path', '/tmp/tmpzr3gsi1w', '--error-file', '/tmp/tmpml3n40xm']
[2023-06-28 01:58:28,850] {standard_task_runner.py:77} INFO - Job 27: Subtask build_push_image
[2023-06-28 01:58:30,367] {logging_mixin.py:109} INFO - Running <TaskInstance: model_deploy-0628015808.build_push_image 2023-06-27T00:00:00+00:00 [running]> on host app-aflow-airflow-worker-0.app-aflow-airflow-worker-headless.ml-workshop.svc.cluster.local
[2023-06-28 01:58:32,245] {taskinstance.py:1251} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=airflow
AIRFLOW_CTX_DAG_ID=model_deploy-0628015808
AIRFLOW_CTX_TASK_ID=build_push_image
AIRFLOW_CTX_EXECUTION_DATE=2023-06-27T00:00:00+00:00
AIRFLOW_CTX_DAG_RUN_ID=scheduled__2023-06-27T00:00:00+00:00
[2023-06-28 01:58:32,266] {kubernetes_pod.py:368} INFO - creating pod with labels {'dag_id': 'model_deploy-0628015808', 'task_id': 'build_push_image', 'execution_date': '2023-06-27T0000000000-63dd9c5d6', 'try_number': '1'} and launcher <airflow.providers.cncf.kubernetes.utils.pod_launcher.PodLauncher object at 0x7f4f9cd36e50>
[2023-06-28 01:58:32,302] {pod_launcher.py:198} INFO - Event: build-push-image.2bc94f1f645145b2a24d8577fa5c32de had an event of type Pending
[2023-06-28 01:58:32,302] {pod_launcher.py:128} WARNING - Pod not yet started: build-push-image.2bc94f1f645145b2a24d8577fa5c32de
[2023-06-28 01:58:33,310] {pod_launcher.py:198} INFO - Event: build-push-image.2bc94f1f645145b2a24d8577fa5c32de had an event of type Pending
[2023-06-28 01:58:33,310] {pod_launcher.py:128} WARNING - Pod not yet started: build-push-image.2bc94f1f645145b2a24d8577fa5c32de
[2023-06-28 01:58:34,320] {pod_launcher.py:198} INFO - Event: build-push-image.2bc94f1f645145b2a24d8577fa5c32de had an event of type Running
[2023-06-28 01:58:34,335] {pod_launcher.py:149} INFO - % Total % Received % Xferd Average Speed Time Time Time Current
[2023-06-28 01:58:34,335] {pod_launcher.py:149} INFO - Dload Upload Total Spent Left Speed
[2023-06-28 01:58:34,335] {pod_launcher.py:149} INFO -
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 18688 100 18688 0 0 50644 0 --:--:-- --:--:-- --:--:-- 50644
[2023-06-28 01:58:34,335] {pod_launcher.py:149} INFO - % Total % Received % Xferd Average Speed Time Time Time Current
[2023-06-28 01:58:34,335] {pod_launcher.py:149} INFO - Dload Upload Total Spent Left Speed
[2023-06-28 01:58:34,336] {pod_launcher.py:149} INFO -
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 1282 100 1282 0 0 11446 0 --:--:-- --:--:-- --:--:-- 11446
[2023-06-28 01:58:34,336] {pod_launcher.py:149} INFO - Requirement already satisfied: packaging in /usr/local/lib/python3.7/site-packages (21.3)
[2023-06-28 01:58:34,336] {pod_launcher.py:149} INFO - Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/local/lib/python3.7/site-packages (from packaging) (3.0.6)
[2023-06-28 01:58:35,213] {pod_launcher.py:149} INFO - WARNING: You are using pip version 20.3.1; however, version 23.1.2 is available.
[2023-06-28 01:58:35,214] {pod_launcher.py:149} INFO - You should consider upgrading via the '/usr/local/bin/python3 -m pip install --upgrade pip' command.
[2023-06-28 01:58:35,774] {pod_launcher.py:149} INFO - [I 01:58:35.771] 'model_deploy-0628015808':'build_push_image' - starting operation
[2023-06-28 01:58:35,774] {pod_launcher.py:149} INFO - [I 01:58:35.771] 'model_deploy-0628015808':'build_push_image' - Installing packages
[2023-06-28 01:58:35,774] {pod_launcher.py:149} INFO - [I 01:58:35.771] Package not found. Installing ipykernel package with version 5.3.0...
[2023-06-28 01:58:35,774] {pod_launcher.py:149} INFO - [I 01:58:35.772] Package not found. Installing ipython package with version 7.15.0...
[2023-06-28 01:58:35,774] {pod_launcher.py:149} INFO - [I 01:58:35.772] Package not found. Installing ipython-genutils package with version 0.2.0...
[2023-06-28 01:58:35,774] {pod_launcher.py:149} INFO - [I 01:58:35.772] Package not found. Installing jupyter-client package with version 6.1.6...
[2023-06-28 01:58:35,774] {pod_launcher.py:149} INFO - [I 01:58:35.772] Package not found. Installing jupyter-core package with version 4.6.3...
[2023-06-28 01:58:35,774] {pod_launcher.py:149} INFO - [I 01:58:35.772] Newer minio package with version 6.0.2 already installed. Skipping...
[2023-06-28 01:58:35,774] {pod_launcher.py:149} INFO - [I 01:58:35.772] Package not found. Installing nbclient package with version 0.4.1...
[2023-06-28 01:58:35,774] {pod_launcher.py:149} INFO - [I 01:58:35.772] Package not found. Installing nbconvert package with version 5.6.1...
[2023-06-28 01:58:35,774] {pod_launcher.py:149} INFO - [I 01:58:35.772] Package not found. Installing nbformat package with version 5.0.7...
[2023-06-28 01:58:35,775] {pod_launcher.py:149} INFO - [I 01:58:35.772] Package not found. Installing papermill package with version 2.1.2...
[2023-06-28 01:58:35,775] {pod_launcher.py:149} INFO - [I 01:58:35.772] Package not found. Installing pyzmq package with version 19.0.1...
[2023-06-28 01:58:35,775] {pod_launcher.py:149} INFO - [I 01:58:35.772] Package not found. Installing prompt-toolkit package with version 3.0.5...
[2023-06-28 01:58:35,775] {pod_launcher.py:149} INFO - [I 01:58:35.772] Newer requests package with version 2.25.0 already installed. Skipping...
[2023-06-28 01:58:35,775] {pod_launcher.py:149} INFO - [I 01:58:35.773] Newer tornado package with version 6.1 already installed. Skipping...
[2023-06-28 01:58:35,775] {pod_launcher.py:149} INFO - [I 01:58:35.773] Package not found. Installing traitlets package with version 4.3.3...
[2023-06-28 01:58:35,775] {pod_launcher.py:149} INFO - [I 01:58:35.773] Newer urllib3 package with version 1.26.2 already installed. Skipping...
.......
[2023-06-28 01:58:51,789] {pod_launcher.py:149} INFO - Successfully installed ansiwrap-0.8.4 async-generator-1.10 backcall-0.2.0 black-23.3.0 bleach-6.0.0 click-8.1.3 decorator-5.1.1 defusedxml-0.7.1 ipykernel-5.3.0 ipython-7.15.0 ipython-genutils-0.2.0 jedi-0.18.2 jupyter-client-6.1.6 jupyter-core-4.6.3 mistune-0.8.4 mypy-extensions-1.0.0 nbclient-0.4.1 nbconvert-5.6.1 nbformat-5.0.7 nest-asyncio-1.5.6 packaging-23.1 pandocfilters-1.5.0 papermill-2.1.2 parso-0.8.3 pathspec-0.11.1 pexpect-4.8.0 pickleshare-0.7.5 platformdirs-3.8.0 prompt-toolkit-3.0.5 ptyprocess-0.7.0 pygments-2.15.1 pyzmq-19.0.1 tenacity-8.2.2 testpath-0.6.0 textwrap3-0.9.2 tomli-2.0.1 tqdm-4.65.0 traitlets-4.3.3 typed-ast-1.5.4 typing-extensions-4.6.3 wcwidth-0.2.6 webencodings-0.5.1
[2023-06-28 01:58:51,825] {pod_launcher.py:149} INFO - WARNING: You are using pip version 20.3.1; however, version 23.1.2 is available.
[2023-06-28 01:58:51,826] {pod_launcher.py:149} INFO - You should consider upgrading via the '/usr/local/bin/python3 -m pip install --upgrade pip' command.
....
[2023-06-28 01:58:52,373] {pod_launcher.py:149} INFO - Werkzeug==1.0.1
[2023-06-28 01:58:52,373] {pod_launcher.py:149} INFO - zipp==3.4.0
[2023-06-28 01:58:52,395] {pod_launcher.py:149} INFO - [I 01:58:52.394] 'model_deploy-0628015808':'build_push_image' - Packages installed (16.623 secs)
[2023-06-28 01:58:52,496] {pod_launcher.py:149} INFO - [I 01:58:52.494] 'model_deploy-0628015808':'build_push_image' - processing dependencies
[2023-06-28 01:58:52,780] {pod_launcher.py:149} INFO - [I 01:58:52.779] 'model_deploy-0628015808':'build_push_image' - downloaded build_push_image-30e0375e-66aa-4f9a-994c-77e7814be449.tar.gz from bucket: airflow, object: model_deploy-0628015808/build_push_image-30e0375e-66aa-4f9a-994c-77e7814be449.tar.gz (0.285 secs)
[2023-06-28 01:58:52,786] {pod_launcher.py:149} INFO - /
[2023-06-28 01:58:52,786] {pod_launcher.py:149} INFO - Dockerfile
[2023-06-28 01:58:52,786] {pod_launcher.py:149} INFO - tar: Removing leading `/' from member names
[2023-06-28 01:58:52,786] {pod_launcher.py:149} INFO - Predictor.py
[2023-06-28 01:58:52,787] {pod_launcher.py:149} INFO - base_requirements.txt
[2023-06-28 01:58:52,787] {pod_launcher.py:149} INFO - build_push_image.py
[2023-06-28 01:58:52,787] {pod_launcher.py:149} INFO - [I 01:58:52.786] 'model_deploy-0628015808':'build_push_image' - dependencies processed (0.292 secs)
[2023-06-28 01:58:52,787] {pod_launcher.py:149} INFO - [I 01:58:52.786] 'model_deploy-0628015808':'build_push_image' - executing python script using 'python3 build_push_image.py' to 'build_push_image.log'
[2023-06-28 01:58:57,483] {pod_launcher.py:149} INFO - [E 01:58:57.482] Unexpected error: <class 'subprocess.CalledProcessError'>
[2023-06-28 01:58:57,484] {pod_launcher.py:149} INFO - [E 01:58:57.483] Error details: Command '['python3', 'build_push_image.py']' returned non-zero exit status 1.
[2023-06-28 01:58:57,539] {pod_launcher.py:149} INFO - [I 01:58:57.538] 'model_deploy-0628015808':'build_push_image' - uploaded build_push_image.log to bucket: airflow object: model_deploy-0628015808/build_push_image.log (0.056 secs)
[2023-06-28 01:58:57,547] {pod_launcher.py:149} INFO - Traceback (most recent call last):
[2023-06-28 01:58:57,548] {pod_launcher.py:149} INFO - File "bootstrapper.py", line 430, in <module>
[2023-06-28 01:58:57,548] {pod_launcher.py:149} INFO - main()
[2023-06-28 01:58:57,548] {pod_launcher.py:149} INFO - File "bootstrapper.py", line 423, in main
[2023-06-28 01:58:57,548] {pod_launcher.py:149} INFO - file_op.execute()
[2023-06-28 01:58:57,548] {pod_launcher.py:149} INFO - File "bootstrapper.py", line 274, in execute
[2023-06-28 01:58:57,548] {pod_launcher.py:149} INFO - raise ex
[2023-06-28 01:58:57,548] {pod_launcher.py:149} INFO - File "bootstrapper.py", line 261, in execute
[2023-06-28 01:58:57,548] {pod_launcher.py:149} INFO - subprocess.run(['python3', python_script], stdout=log_file, stderr=subprocess.STDOUT, check=True)
[2023-06-28 01:58:57,548] {pod_launcher.py:149} INFO - File "/usr/local/lib/python3.7/subprocess.py", line 512, in run
[2023-06-28 01:58:57,548] {pod_launcher.py:149} INFO - output=stdout, stderr=stderr)
[2023-06-28 01:58:57,549] {pod_launcher.py:149} INFO - subprocess.CalledProcessError: Command '['python3', 'build_push_image.py']' returned non-zero exit status 1.
[2023-06-28 01:59:01,568] {pod_launcher.py:198} INFO - Event: build-push-image.2bc94f1f645145b2a24d8577fa5c32de had an event of type Running
[2023-06-28 01:59:01,568] {pod_launcher.py:171} INFO - Pod build-push-image.2bc94f1f645145b2a24d8577fa5c32de has state running
[2023-06-28 01:59:03,579] {pod_launcher.py:198} INFO - Event: build-push-image.2bc94f1f645145b2a24d8577fa5c32de had an event of type Failed
[2023-06-28 01:59:03,579] {pod_launcher.py:308} ERROR - Event with job id build-push-image.2bc94f1f645145b2a24d8577fa5c32de Failed
[2023-06-28 01:59:03,587] {pod_launcher.py:198} INFO - Event: build-push-image.2bc94f1f645145b2a24d8577fa5c32de had an event of type Failed
[2023-06-28 01:59:03,587] {pod_launcher.py:308} ERROR - Event with job id build-push-image.2bc94f1f645145b2a24d8577fa5c32de Failed
[2023-06-28 01:59:03,965] {taskinstance.py:1462} ERROR - Task failed with exception
Traceback (most recent call last):
File "/opt/bitnami/airflow/venv/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py", line 371, in execute
raise AirflowException(f'Pod {self.pod.metadata.name} returned a failure: {remote_pod}')
airflow.exceptions.AirflowException: Pod build-push-image.2bc94f1f645145b2a24d8577fa5c32de returned a failure: {'api_version': 'v1',
'kind': 'Pod',
'metadata': {'annotations': {'cni.projectcalico.org/containerID': '6eed1d1530a24956ba1d2ae739fd0d751ffca0ceb08053b23bf5a7f4234487fc',
'cni.projectcalico.org/podIP': '',
'cni.projectcalico.org/podIPs': '',
'kubernetes.io/limit-ranger': 'LimitRanger '
'plugin set: cpu, '
'memory request '
'for container '
'base; cpu, memory '
'limit for '
'container base'},
'cluster_name': None,
'creation_timestamp': datetime.datetime(2023, 6, 28, 1, 58, 32, tzinfo=tzlocal()),
'deletion_grace_period_seconds': None,
'deletion_timestamp': None,
'finalizers': None,
'generate_name': None,
'generation': None,
'initializers': None,
'labels': {'airflow_version': '2.1.3',
'dag_id': 'model_deploy-0628015808',
'execution_date': '2023-06-27T0000000000-63dd9c5d6',
'kubernetes_pod_operator': 'True',
'task_id': 'build_push_image',
'try_number': '1'},
'managed_fields': [{'api_version': 'v1',
'fields': None,
'manager': 'OpenAPI-Generator',
'operation': 'Update',
'time': datetime.datetime(2023, 6, 28, 1, 58, 32, tzinfo=tzlocal())},
{'api_version': 'v1',
'fields': None,
'manager': 'calico',
'operation': 'Update',
'time': datetime.datetime(2023, 6, 28, 1, 59, 1, tzinfo=tzlocal())},
{'api_version': 'v1',
'fields': None,
'manager': 'kubelet',
'operation': 'Update',
'time': datetime.datetime(2023, 6, 28, 1, 59, 2, tzinfo=tzlocal())}],
'name': 'build-push-image.2bc94f1f645145b2a24d8577fa5c32de',
'namespace': 'ml-workshop',
'owner_references': None,
'resource_version': '476630',
'self_link': None,
'uid': '1fa4dfdf-fa59-470c-a580-65d316edd9cf'},
'spec': {'active_deadline_seconds': None,
'affinity': {'node_affinity': None,
'pod_affinity': None,
'pod_anti_affinity': None},
'automount_service_account_token': None,
'containers': [{'args': ['mkdir -p ./jupyter-work-dir/ && cd '
'./jupyter-work-dir/ && curl -H '
"'Cache-Control: no-cache' -L "
'https://raw.githubusercontent.com/elyra-ai/airflow-notebook/v0.0.7/etc/docker-scripts/bootstrapper.py '
'--output bootstrapper.py && curl -H '
"'Cache-Control: no-cache' -L "
'https://raw.githubusercontent.com/elyra-ai/airflow-notebook/v0.0.7/etc/requirements-elyra.txt '
'--output requirements-elyra.txt && python3 '
'-m pip install packaging && python3 -m pip '
'freeze > requirements-current.txt && '
'python3 bootstrapper.py --cos-endpoint '
'http://minio-ml-workshop:9000// '
'--cos-bucket airflow --cos-directory '
"'model_deploy-0628015808' "
'--cos-dependencies-archive '
"'build_push_image-30e0375e-66aa-4f9a-994c-77e7814be449.tar.gz' "
'--file '
"'Machine-Learning-on-Kubernetes/Chapter07/model_deploy_pipeline/model_build_push/build_push_image.py' "],
'command': ['sh', '-c'],
'env': [{'name': 'AWS_ACCESS_KEY_ID',
'value': 'minio',
'value_from': None},
{'name': 'AWS_SECRET_ACCESS_KEY',
'value': 'minio123',
'value_from': None},
{'name': 'ELYRA_ENABLE_PIPELINE_INFO',
'value': 'True',
'value_from': None},
{'name': 'MODEL_NAME',
'value': 'mlflowdemo',
'value_from': None},
{'name': 'MODEL_VERSION',
'value': '1',
'value_from': None},
{'name': 'CONTAINER_REGISTRY',
'value': 'https://quay.io/',
'value_from': None},
{'name': 'CONTAINER_REGISTRY_USER',
'value': '',
'value_from': None},
{'name': 'CONTAINER_REGISTRY_PASSWORD',
'value': '',
'value_from': None},
{'name': 'CONTAINER_DETAILS',
'value': '',
'value_from': None}],
'env_from': None,
'image': 'quay.io/ml-on-k8s/kaniko-container-builder:1.0.0',
'image_pull_policy': 'Never',
'lifecycle': None,
'liveness_probe': None,
'name': 'base',
'ports': None,
'readiness_probe': None,
'resources': {'limits': {'cpu': '1', 'memory': '1Gi'},
'requests': {'cpu': '100m',
'memory': '500Mi'}},
'security_context': None,
'stdin': None,
'stdin_once': None,
'termination_message_path': '/dev/termination-log',
'termination_message_policy': 'File',
'tty': None,
'volume_devices': None,
'volume_mounts': [{'mount_path': '/var/run/secrets/kubernetes.io/serviceaccount',
'mount_propagation': None,
'name': 'kube-api-access-68j6b',
'read_only': True,
'sub_path': None,
'sub_path_expr': None}],
'working_dir': None}],
'dns_config': None,
'dns_policy': 'ClusterFirst',
'enable_service_links': True,
'host_aliases': None,
'host_ipc': None,
'host_network': None,
'host_pid': None,
'hostname': None,
'image_pull_secrets': None,
'init_containers': None,
'node_name': 'k8s-master',
'node_selector': None,
'preemption_policy': 'PreemptLowerPriority',
'priority': 0,
'priority_class_name': None,
'readiness_gates': None,
'restart_policy': 'Never',
'runtime_class_name': None,
'scheduler_name': 'default-scheduler',
'security_context': {'fs_group': None,
'run_as_group': None,
'run_as_non_root': None,
'run_as_user': None,
'se_linux_options': None,
'supplemental_groups': None,
'sysctls': None,
'windows_options': None},
'service_account': 'default',
'service_account_name': 'default',
'share_process_namespace': None,
'subdomain': None,
'termination_grace_period_seconds': 30,
'tolerations': [{'effect': 'NoExecute',
'key': 'node.kubernetes.io/not-ready',
'operator': 'Exists',
'toleration_seconds': 300,
'value': None},
{'effect': 'NoExecute',
'key': 'node.kubernetes.io/unreachable',
'operator': 'Exists',
'toleration_seconds': 300,
'value': None}],
'volumes': [{'aws_elastic_block_store': None,
'azure_disk': None,
'azure_file': None,
'cephfs': None,
'cinder': None,
'config_map': None,
'csi': None,
'downward_api': None,
'empty_dir': None,
'fc': None,
'flex_volume': None,
'flocker': None,
'gce_persistent_disk': None,
'git_repo': None,
'glusterfs': None,
'host_path': None,
'iscsi': None,
'name': 'kube-api-access-68j6b',
'nfs': None,
'persistent_volume_claim': None,
'photon_persistent_disk': None,
'portworx_volume': None,
'projected': {'default_mode': 420,
'sources': [{'config_map': None,
'downward_api': None,
'secret': None,
'service_account_token': {'audience': None,
'expiration_seconds': 3607,
'path': 'token'}},
{'config_map': {'items': [{'key': 'ca.crt',
'mode': None,
'path': 'ca.crt'}],
'name': 'kube-root-ca.crt',
'optional': None},
'downward_api': None,
'secret': None,
'service_account_token': None},
{'config_map': None,
'downward_api': {'items': [{'field_ref': {'api_version': 'v1',
'field_path': 'metadata.namespace'},
'mode': None,
'path': 'namespace',
'resource_field_ref': None}]},
'secret': None,
'service_account_token': None}]},
'quobyte': None,
'rbd': None,
'scale_io': None,
'secret': None,
'storageos': None,
'vsphere_volume': None}]},
'status': {'conditions': [{'last_probe_time': None,
'last_transition_time': datetime.datetime(2023, 6, 28, 1, 58, 32, tzinfo=tzlocal()),
'message': None,
'reason': None,
'status': 'True',
'type': 'Initialized'},
{'last_probe_time': None,
'last_transition_time': datetime.datetime(2023, 6, 28, 1, 59, tzinfo=tzlocal()),
'message': None,
'reason': 'PodFailed',
'status': 'False',
'type': 'Ready'},
{'last_probe_time': None,
'last_transition_time': datetime.datetime(2023, 6, 28, 1, 59, tzinfo=tzlocal()),
'message': None,
'reason': 'PodFailed',
'status': 'False',
'type': 'ContainersReady'},
{'last_probe_time': None,
'last_transition_time': datetime.datetime(2023, 6, 28, 1, 58, 32, tzinfo=tzlocal()),
'message': None,
'reason': None,
'status': 'True',
'type': 'PodScheduled'}],
'container_statuses': [{'container_id': 'containerd://f98598d2b0c14cc945fd8631b81fd17d1310bcd2fd931f143007a286e3db232f',
'image': 'quay.io/ml-on-k8s/kaniko-container-builder:1.0.0',
'image_id': 'quay.io/ml-on-k8s/kaniko-container-builder@sha256:7204881d5ba9c83f8b5b5580ef716c91d374f728c003faa3af9f1bd047e8535e',
'last_state': {'running': None,
'terminated': None,
'waiting': None},
'name': 'base',
'ready': False,
'restart_count': 0,
'state': {'running': None,
'terminated': {'container_id': 'containerd://f98598d2b0c14cc945fd8631b81fd17d1310bcd2fd931f143007a286e3db232f',
'exit_code': 1,
'finished_at': datetime.datetime(2023, 6, 28, 1, 58, 57, tzinfo=tzlocal()),
'message': None,
'reason': 'Error',
'signal': None,
'started_at': datetime.datetime(2023, 6, 28, 1, 58, 33, tzinfo=tzlocal())},
'waiting': None}}],
'host_ip': '10.1.0.4',
'init_container_statuses': None,
'message': None,
'nominated_node_name': None,
'phase': 'Failed',
'pod_ip': '10.233.123.99',
'qos_class': 'Burstable',
'reason': None,
'start_time': datetime.datetime(2023, 6, 28, 1, 58, 32, tzinfo=tzlocal())}}

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/opt/bitnami/airflow/venv/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1164, in _run_raw_task
self._prepare_and_execute_task_with_callbacks(context, task)
File "/opt/bitnami/airflow/venv/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1282, in _prepare_and_execute_task_with_callbacks
result = self._execute_task(context, task_copy)
File "/opt/bitnami/airflow/venv/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1312, in _execute_task
result = task_copy.execute(context=context)
File "/opt/bitnami/airflow/venv/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py", line 376, in execute
raise AirflowException(f'Pod Launching failed: {ex}')
airflow.exceptions.AirflowException: Pod Launching failed: Pod build-push-image.2bc94f1f645145b2a24d8577fa5c32de returned a failure: {'api_version': 'v1',
'kind': 'Pod',
'metadata': {'annotations': {'cni.projectcalico.org/containerID': '6eed1d1530a24956ba1d2ae739fd0d751ffca0ceb08053b23bf5a7f4234487fc',
'cni.projectcalico.org/podIP': '',
'cni.projectcalico.org/podIPs': '',
'kubernetes.io/limit-ranger': 'LimitRanger '
'plugin set: cpu, '
'memory request '
'for container '
'base; cpu, memory '
'limit for '
'container base'},
'cluster_name': None,
'creation_timestamp': datetime.datetime(2023, 6, 28, 1, 58, 32, tzinfo=tzlocal()),
'deletion_grace_period_seconds': None,
'deletion_timestamp': None,
'finalizers': None,
'generate_name': None,
'generation': None,
'initializers': None,
'labels': {'airflow_version': '2.1.3',
'dag_id': 'model_deploy-0628015808',
'execution_date': '2023-06-27T0000000000-63dd9c5d6',
'kubernetes_pod_operator': 'True',
'task_id': 'build_push_image',
'try_number': '1'},
'managed_fields': [{'api_version': 'v1',
'fields': None,
'manager': 'OpenAPI-Generator',
'operation': 'Update',
'time': datetime.datetime(2023, 6, 28, 1, 58, 32, tzinfo=tzlocal())},
{'api_version': 'v1',
'fields': None,
'manager': 'calico',
'operation': 'Update',
'time': datetime.datetime(2023, 6, 28, 1, 59, 1, tzinfo=tzlocal())},
{'api_version': 'v1',
'fields': None,
'manager': 'kubelet',
'operation': 'Update',
'time': datetime.datetime(2023, 6, 28, 1, 59, 2, tzinfo=tzlocal())}],
'name': 'build-push-image.2bc94f1f645145b2a24d8577fa5c32de',
'namespace': 'ml-workshop',
'owner_references': None,
'resource_version': '476630',
'self_link': None,
'uid': '1fa4dfdf-fa59-470c-a580-65d316edd9cf'},
'spec': {'active_deadline_seconds': None,
'affinity': {'node_affinity': None,
'pod_affinity': None,
'pod_anti_affinity': None},
'automount_service_account_token': None,
'containers': [{'args': ['mkdir -p ./jupyter-work-dir/ && cd '
'./jupyter-work-dir/ && curl -H '
"'Cache-Control: no-cache' -L "
'https://raw.githubusercontent.com/elyra-ai/airflow-notebook/v0.0.7/etc/docker-scripts/bootstrapper.py '
'--output bootstrapper.py && curl -H '
"'Cache-Control: no-cache' -L "
'https://raw.githubusercontent.com/elyra-ai/airflow-notebook/v0.0.7/etc/requirements-elyra.txt '
'--output requirements-elyra.txt && python3 '
'-m pip install packaging && python3 -m pip '
'freeze > requirements-current.txt && '
'python3 bootstrapper.py --cos-endpoint '
'http://minio-ml-workshop:9000// '
'--cos-bucket airflow --cos-directory '
"'model_deploy-0628015808' "
'--cos-dependencies-archive '
"'build_push_image-30e0375e-66aa-4f9a-994c-77e7814be449.tar.gz' "
'--file '
"'Machine-Learning-on-Kubernetes/Chapter07/model_deploy_pipeline/model_build_push/build_push_image.py' "],
'command': ['sh', '-c'],
'env': [{'name': 'AWS_ACCESS_KEY_ID',
'value': 'minio',
'value_from': None},
{'name': 'AWS_SECRET_ACCESS_KEY',
'value': 'minio123',
'value_from': None},
{'name': 'ELYRA_ENABLE_PIPELINE_INFO',
'value': 'True',
'value_from': None},
{'name': 'MODEL_NAME',
'value': 'mlflowdemo',
'value_from': None},
{'name': 'MODEL_VERSION',
'value': '1',
'value_from': None},
{'name': 'CONTAINER_REGISTRY',
'value': 'https://quay.io/',
'value_from': None},
{'name': 'CONTAINER_REGISTRY_USER',
'value': '',
'value_from': None},
{'name': 'CONTAINER_REGISTRY_PASSWORD',
'value': '',
'value_from': None},
{'name': 'CONTAINER_DETAILS',
'value': '',
'value_from': None}],
'env_from': None,
'image': 'quay.io/ml-on-k8s/kaniko-container-builder:1.0.0',
'image_pull_policy': 'Never',
'lifecycle': None,
'liveness_probe': None,
'name': 'base',
'ports': None,
'readiness_probe': None,
'resources': {'limits': {'cpu': '1', 'memory': '1Gi'},
'requests': {'cpu': '100m',
'memory': '500Mi'}},
'security_context': None,
'stdin': None,
'stdin_once': None,
'termination_message_path': '/dev/termination-log',
'termination_message_policy': 'File',
'tty': None,
'volume_devices': None,
'volume_mounts': [{'mount_path': '/var/run/secrets/kubernetes.io/serviceaccount',
'mount_propagation': None,
'name': 'kube-api-access-68j6b',
'read_only': True,
'sub_path': None,
'sub_path_expr': None}],
'working_dir': None}],
'dns_config': None,
'dns_policy': 'ClusterFirst',
'enable_service_links': True,
'host_aliases': None,
'host_ipc': None,
'host_network': None,
'host_pid': None,
'hostname': None,
'image_pull_secrets': None,
'init_containers': None,
'node_name': 'k8s-master',
'node_selector': None,
'preemption_policy': 'PreemptLowerPriority',
'priority': 0,
'priority_class_name': None,
'readiness_gates': None,
'restart_policy': 'Never',
'runtime_class_name': None,
'scheduler_name': 'default-scheduler',
'security_context': {'fs_group': None,
'run_as_group': None,
'run_as_non_root': None,
'run_as_user': None,
'se_linux_options': None,
'supplemental_groups': None,
'sysctls': None,
'windows_options': None},
'service_account': 'default',
'service_account_name': 'default',
'share_process_namespace': None,
'subdomain': None,
'termination_grace_period_seconds': 30,
'tolerations': [{'effect': 'NoExecute',
'key': 'node.kubernetes.io/not-ready',
'operator': 'Exists',
'toleration_seconds': 300,
'value': None},
{'effect': 'NoExecute',
'key': 'node.kubernetes.io/unreachable',
'operator': 'Exists',
'toleration_seconds': 300,
'value': None}],
'volumes': [{'aws_elastic_block_store': None,
'azure_disk': None,
'azure_file': None,
'cephfs': None,
'cinder': None,
'config_map': None,
'csi': None,
'downward_api': None,
'empty_dir': None,
'fc': None,
'flex_volume': None,
'flocker': None,
'gce_persistent_disk': None,
'git_repo': None,
'glusterfs': None,
'host_path': None,
'iscsi': None,
'name': 'kube-api-access-68j6b',
'nfs': None,
'persistent_volume_claim': None,
'photon_persistent_disk': None,
'portworx_volume': None,
'projected': {'default_mode': 420,
'sources': [{'config_map': None,
'downward_api': None,
'secret': None,
'service_account_token': {'audience': None,
'expiration_seconds': 3607,
'path': 'token'}},
{'config_map': {'items': [{'key': 'ca.crt',
'mode': None,
'path': 'ca.crt'}],
'name': 'kube-root-ca.crt',
'optional': None},
'downward_api': None,
'secret': None,
'service_account_token': None},
{'config_map': None,
'downward_api': {'items': [{'field_ref': {'api_version': 'v1',
'field_path': 'metadata.namespace'},
'mode': None,
'path': 'namespace',
'resource_field_ref': None}]},
'secret': None,
'service_account_token': None}]},
'quobyte': None,
'rbd': None,
'scale_io': None,
'secret': None,
'storageos': None,
'vsphere_volume': None}]},
'status': {'conditions': [{'last_probe_time': None,
'last_transition_time': datetime.datetime(2023, 6, 28, 1, 58, 32, tzinfo=tzlocal()),
'message': None,
'reason': None,
'status': 'True',
'type': 'Initialized'},
{'last_probe_time': None,
'last_transition_time': datetime.datetime(2023, 6, 28, 1, 59, tzinfo=tzlocal()),
'message': None,
'reason': 'PodFailed',
'status': 'False',
'type': 'Ready'},
{'last_probe_time': None,
'last_transition_time': datetime.datetime(2023, 6, 28, 1, 59, tzinfo=tzlocal()),
'message': None,
'reason': 'PodFailed',
'status': 'False',
'type': 'ContainersReady'},
{'last_probe_time': None,
'last_transition_time': datetime.datetime(2023, 6, 28, 1, 58, 32, tzinfo=tzlocal()),
'message': None,
'reason': None,
'status': 'True',
'type': 'PodScheduled'}],
'container_statuses': [{'container_id': 'containerd://f98598d2b0c14cc945fd8631b81fd17d1310bcd2fd931f143007a286e3db232f',
'image': 'quay.io/ml-on-k8s/kaniko-container-builder:1.0.0',
'image_id': 'quay.io/ml-on-k8s/kaniko-container-builder@sha256:7204881d5ba9c83f8b5b5580ef716c91d374f728c003faa3af9f1bd047e8535e',
'last_state': {'running': None,
'terminated': None,
'waiting': None},
'name': 'base',
'ready': False,
'restart_count': 0,
'state': {'running': None,
'terminated': {'container_id': 'containerd://f98598d2b0c14cc945fd8631b81fd17d1310bcd2fd931f143007a286e3db232f',
'exit_code': 1,
'finished_at': datetime.datetime(2023, 6, 28, 1, 58, 57, tzinfo=tzlocal()),
'message': None,
'reason': 'Error',
'signal': None,
'started_at': datetime.datetime(2023, 6, 28, 1, 58, 33, tzinfo=tzlocal())},
'waiting': None}}],
'host_ip': '10.1.0.4',
'init_container_statuses': None,
'message': None,
'nominated_node_name': None,
'phase': 'Failed',
'pod_ip': '10.233.123.99',
'qos_class': 'Burstable',
'reason': None,
'start_time': datetime.datetime(2023, 6, 28, 1, 58, 32, tzinfo=tzlocal())}}
[2023-06-28 01:59:03,982] {taskinstance.py:1505} INFO - Marking task as FAILED. dag_id=model_deploy-0628015808, task_id=build_push_image, execution_date=20230627T000000, start_date=20230628T015828, end_date=20230628T015903
[2023-06-28 01:59:05,176] {local_task_job.py:151} INFO - Task exited with return code 1
[2023-06-28 01:59:05,994] {local_task_job.py:261} INFO - 0 downstream tasks scheduled from follow-on schedule check
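
The traceback above only says that `build_push_image.py` exited with status 1. The script's own output was redirected to `build_push_image.log`, which the bootstrapper reports uploading to the `airflow` bucket under `model_deploy-0628015808/`, so the real error is most likely in that file. Below is a minimal sketch for reading it back. The helper names `log_object_name` and `read_task_log` are hypothetical, and `client` is assumed to behave like a `minio.Minio` instance.

```python
import os


def log_object_name(cos_directory: str, script_name: str) -> str:
    """Build the object key for a task log: '<cos_directory>/<script stem>.log'."""
    stem, _ = os.path.splitext(os.path.basename(script_name))
    return f"{cos_directory}/{stem}.log"


def read_task_log(client, bucket: str, cos_directory: str, script_name: str) -> str:
    """Download the task log and return it as text.

    `client` is assumed to expose get_object(bucket, object_name) like
    minio.Minio, returning a response with read(), close() and release_conn().
    """
    response = client.get_object(bucket, log_object_name(cos_directory, script_name))
    try:
        # Decode leniently so a stray byte in the build output cannot hide the error.
        return response.read().decode("utf-8", errors="replace")
    finally:
        response.close()
        response.release_conn()
```

With the `minio` package installed, a client built as `Minio('minio-ml-workshop:9000', access_key=..., secret_key=..., secure=False)` can be passed in, for example `read_task_log(client, "airflow", "model_deploy-0628015808", "build_push_image.py")`. This assumes the MinIO service is reachable from where the snippet runs, for instance from a notebook pod in the `ml-workshop` namespace.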

The Python file build_push_image.py:

import string
import subprocess
import os
import base64
import mlflow
from minio import Minio
from mlflow.tracking import MlflowClient


"""
    This script assumes that /kaniko/.docker/config.json has the correct repo and associated credentials mounted.
    It also expects these environment variables to be set:
    CONTAINER_REGISTRY is the registry server, like quay.io
    CONTAINER_DETAILS is the container coordinates, like ml-on-k8s/containermodel:1.0.0
    AWS_SECRET_ACCESS_KEY is the password for the S3 store
    MODEL_NAME is the name of the model in mlflow
    MODEL_VERSION is the version of the model in mlflow
"""

os.environ['MLFLOW_S3_ENDPOINT_URL']='http://minio-ml-workshop:9000'
os.environ['AWS_ACCESS_KEY_ID']='minio'
os.environ['AWS_REGION']='us-east-1'
os.environ['AWS_BUCKET_NAME']='mlflow'

HOST = "http://mlflow:5500"

model_name = os.environ["MODEL_NAME"]
model_version = os.environ["MODEL_VERSION"]
build_name = f"seldon-model-{model_name}-v{model_version}"

auth_encoded = string.Template("$CONTAINER_REGISTRY_USER:$CONTAINER_REGISTRY_PASSWORD").substitute(os.environ)
os.environ["CONTAINER_REGISTRY_CREDS"] = base64.b64encode(auth_encoded.encode("ascii")).decode("ascii")

print(auth_encoded)

docker_auth = string.Template('{"auths":{"$CONTAINER_REGISTRY":{"auth":"$CONTAINER_REGISTRY_CREDS"}}}').substitute(os.environ)
print(docker_auth)
with open("/kaniko/.docker/config.json", "w") as f:
    f.write(docker_auth)

def get_s3_server():
    minioClient = Minio('minio-ml-workshop:9000',
                        access_key=os.environ['AWS_ACCESS_KEY_ID'],
                        secret_key=os.environ["AWS_SECRET_ACCESS_KEY"],
                        secure=False)

    return minioClient


def init():
    mlflow.set_tracking_uri(HOST)


def download_artifacts():
    print("retrieving model metadata from mlflow...")
    # model = mlflow.pyfunc.load_model(
    #     model_uri=f"models:/{model_name}/{model_version}"
    # )
    client = MlflowClient()

    model = client.get_registered_model(model_name)

    print(model)

    run_id = model._latest_version[0].run_id
    source = model._latest_version[0].source
    experiment_id = "1" # to be calculated from the source which is source='s3://mlflow/1/bf721e5641394ed6866baf20131fca20/artifacts/model'

    print("initializing connection to s3 server...")
    minioClient = get_s3_server()

    #     artifact_location = mlflow.get_experiment_by_name('rossdemo').artifact_location
    #     print("downloading artifacts from s3 bucket " + artifact_location)

    data_file_model = minioClient.fget_object("mlflow", f"/{experiment_id}/{run_id}/artifacts/model/model.pkl", "model.pkl")
    data_file_requirements = minioClient.fget_object("mlflow", f"/{experiment_id}/{run_id}/artifacts/model/requirements.txt", "requirements.txt")

    #Using boto3 Download the files from mlflow, the file path is in the model meta
    #write the files to the file system
    print("download successful")

    return run_id

def build_push_image():
    container_location = string.Template("$CONTAINER_REGISTRY/$CONTAINER_DETAILS").substitute(os.environ)
    
    #For docker repo, do not include the registry domain name in container location
    if os.environ["CONTAINER_REGISTRY"].find("docker.io") != -1:
        container_location= os.environ["CONTAINER_DETAILS"]
        
    full_command = "/kaniko/executor --context=" + os.getcwd() + " --dockerfile=Dockerfile --verbosity=debug --cache=true --single-snapshot=true --destination=" + container_location
    print(full_command)
    process = subprocess.run(full_command, shell=True, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    print(process.stdout)
    print(process.stderr)

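    # Alternative invocation without a shell, kept for reference only:
    # running it in addition to the call above would build the image a second time.
    # print(subprocess.check_output(['/kaniko/executor', '--context', '/workspace', '--dockerfile', 'Dockerfile', '--destination', container_location]))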


init()
download_artifacts()
build_push_image()
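
In the pod spec from the failed run, `CONTAINER_REGISTRY` is `https://quay.io/`, while `CONTAINER_REGISTRY_USER`, `CONTAINER_REGISTRY_PASSWORD`, and `CONTAINER_DETAILS` are empty strings. With those values the script would compose the destination `https://quay.io//`. Whether that is what made this run fail is an assumption until `build_push_image.log` is checked, but a fail-fast check makes such gaps visible early. A minimal sketch follows. The function names are hypothetical, and stripping the URL scheme and keying the auth entry by host name are assumptions, not what the script above does.

```python
import base64
import json


def require_env(env: dict, names: list) -> dict:
    """Return the requested variables, raising if any is missing or empty."""
    missing = [n for n in names if not env.get(n)]
    if missing:
        raise ValueError("missing or empty environment variables: " + ", ".join(missing))
    return {n: env[n] for n in names}


def registry_host(registry: str) -> str:
    """Drop an optional scheme and trailing slashes: 'https://quay.io/' -> 'quay.io'."""
    for scheme in ("https://", "http://"):
        if registry.startswith(scheme):
            registry = registry[len(scheme):]
    return registry.rstrip("/")


def image_destination(registry: str, details: str) -> str:
    """Compose the push destination; for docker.io only the image coordinates are used."""
    host = registry_host(registry)
    if "docker.io" in host:
        return details
    return f"{host}/{details}"


def docker_config(registry: str, user: str, password: str) -> str:
    """Serialize a Docker config.json body with a base64 'user:password' auth entry."""
    auth = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    # json.dumps escapes special characters that a plain string template would not.
    return json.dumps({"auths": {registry_host(registry): {"auth": auth}}})
```

Calling `require_env(os.environ, ["CONTAINER_REGISTRY", "CONTAINER_DETAILS", "CONTAINER_REGISTRY_USER", "CONTAINER_REGISTRY_PASSWORD"])` at the top of the script would raise a `ValueError` naming the empty variables for the run shown in the log, instead of failing later inside the Kaniko executor.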

Config Opendatahub components

Hi, I am using Kubernetes on GKE (GCP) instead of minikube.
Instead of the minikube IP, I used the GKE cluster IP, but I couldn't view the Keycloak UI in a web browser (Chapter 4).

I also couldn't see any Open Data Hub pods after applying ml-platform.yaml, which I had changed to use the GKE cluster IP (Chapter 5).

Can you please advise how to address these issues, and how to set up and configure Open Data Hub so that I can use JupyterHub with Spark and the other associated components?

Regards
Arun
