node-problem-detector-operator's Introduction

node-problem-detector-operator

An operator to run Node Problem Detector on OpenShift

To deploy the operator:

oc create -f deploy/crd.yaml
oc create -f deploy/ns.yaml
oc create -f deploy/sa.yaml
oc create -f deploy/rbac.yaml
oc create -f deploy/operator.yaml
oc create -f deploy/cr.yaml
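
The same sequence can be wrapped in a small loop so that it stops at the first failure. This is only a sketch: the `deploy_operator` function and the `OC` variable are hypothetical names introduced here (they are not part of the repository), and `OC` exists only so the loop can be dry-run with `OC=echo`.

```shell
#!/bin/sh
# OC is a hypothetical override; it defaults to the real oc client.
OC="${OC:-oc}"

# Create the manifests in the same order as the steps listed above.
# Stops and returns non-zero as soon as one create fails.
deploy_operator() {
  local name
  for name in crd ns sa rbac operator cr; do
    "$OC" create -f "deploy/${name}.yaml" || return 1
  done
}
```

Calling `deploy_operator` from the repository root runs the six `oc create` commands in order; setting `OC=echo` first prints them instead of running them.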

To uninstall the operator and Node Problem Detector:

oc delete -f deploy/cr.yaml
oc delete -f deploy/operator.yaml
oc delete -f deploy/rbac.yaml
oc adm policy remove-scc-from-user -n openshift-node-problem-detector privileged -z node-problem-detector
oc delete -f deploy/sa.yaml
oc delete -f deploy/ns.yaml
oc delete -f deploy/crd.yaml

node-problem-detector-operator's People

Contributors

alvaroaleman, dobbymoodge, joelsmith, jupierce, openshift-bot, openshift-merge-robot, rphillips, sjenning, smarterclayton, yselkowitz

node-problem-detector-operator's Issues

image not pullable

The latest version of the operator uses an image that cannot be pulled:

Back-off pulling image "registry.svc.ci.openshift.org/openshift/origin-v4.0:node-problem-detector-operator"
docker pull registry.svc.ci.openshift.org/openshift/origin-v4.0:node-problem-detector-operator
Error response from daemon: received unexpected HTTP status: 503 Service Unavailable

missing rbac

After installing operator 0.0.1 it logs:

node-problem-detector-2200398493\" is forbidden: user \"system:serviceaccount:openshift-node-problem-detector:node-problem-detector-operator\" (groups=[\"system:serviceaccounts\" \"system:serviceaccounts:openshift-node-problem-detector\" \"system:authenticated\"]) is attempting to grant RBAC permissions not currently held:\n{APIGroups:[\"events.k8s.io\"], Resources:[\"events\"], Verbs:[\"create\" \"patch\" \"update\"]}"

and it is not able to create a DaemonSet for the actual NPD pods.
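
One way to confirm which permissions the operator's service account is missing is to ask the API server directly with `oc auth can-i`. The sketch below checks the three verbs named in the log message. The service account and namespace are taken from that message; the `check_event_permissions` function and the `OC` override are hypothetical names used only for illustration.

```shell
#!/bin/sh
# OC is a hypothetical override; it defaults to the real oc client.
OC="${OC:-oc}"

# Values copied from the "forbidden" log message above.
NPD_NS="openshift-node-problem-detector"
NPD_SA="system:serviceaccount:${NPD_NS}:node-problem-detector-operator"

# Ask whether the operator's service account holds each verb on
# events in the events.k8s.io API group. oc prints "yes" or "no".
check_event_permissions() {
  local verb
  for verb in create patch update; do
    "$OC" auth can-i "$verb" events.events.k8s.io --as="$NPD_SA" -n "$NPD_NS"
  done
}
```

A "no" for any verb would be consistent with the error, since the operator cannot grant permissions it does not hold itself.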

kubelet-health.sh is not working inside the container

I am unable to get the KubeletProblem status from the API server; the status always shows Unknown.
I ran the script manually on the host and it works fine.

#!/usr/bin/env bash

set -eou pipefail

data=$(curl \
  -s \
  http://127.0.0.1:10250/healthz
)

if [[ "$data" != "ok" ]]; then
  exit 20
fi

exit 0
origin@master1:/home/origin>./kubelet-health.sh
origin@master1:/home/origin>echo $?
20
origin@master1:/home/origin>v

But it does not work inside the container:

origin@bastion1:/home/origin>oc get po -o wide
NAME                                              READY     STATUS    RESTARTS   AGE       IP            NODE       NOMINATED NODE
node-problem-detector-48jdh                       1/1       Running   0          19m       10.128.0.28   master1    <none>
node-problem-detector-9fccv                       1/1       Running   0          20m       10.128.0.34   compute1   <none>
node-problem-detector-9nlgv                       1/1       Running   0          19m       10.128.0.30   master3    <none>
node-problem-detector-c2dvv                       1/1       Running   0          20m       10.128.0.35   compute2   <none>
node-problem-detector-nzgzn                       1/1       Running   0          19m       10.128.0.31   infra1     <none>
node-problem-detector-operator-56df5c5cb4-4rm48   1/1       Running   0          6h        10.1.6.49     compute2   <none>
node-problem-detector-x8v5v                       1/1       Running   0          20m       10.128.0.32   infra2     <none>
node-problem-detector-zhkcx                       1/1       Running   0          20m       10.128.0.29   master2    <none>
node-problem-detector-zr5nf                       1/1       Running   0          20m       10.128.0.33   infra3     <none>
origin@bastion1:/home/origin>

origin@bastion1:/home/origin>oc exec -it node-problem-detector-48jdh -- /bin/bash
root@master1:/# cd /etc/npd-plugins/
root@master1:/etc/npd-plugins# ./kubelet-health.sh
./kubelet-health.sh: line 8: warning: command substitution: ignored null byte in input
root@master1:/etc/npd-plugins#
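
The warning indicates that the response captured by `$(...)` contained null bytes, which the shell cannot store in a variable. A possible variant, offered only as a sketch, discards the body and compares the HTTP status code instead. It does not explain why the endpoint answers differently inside the container. The URL and the exit code 20 are copied from the script above; the function name `kubelet_healthy` and the 5-second timeout are assumptions.

```shell
#!/bin/sh
# Return 0 when the health endpoint answers HTTP 200, otherwise 20
# (the same non-zero code the original script uses).
kubelet_healthy() {
  local url="${1:-http://127.0.0.1:10250/healthz}"
  local code
  # -o /dev/null drops the body, so binary data never reaches the shell;
  # -w '%{http_code}' prints only the numeric status code.
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$url") || return 20
  [ "$code" = "200" ] || return 20
}
```

A plugin script built on this would call `kubelet_healthy` and exit with its return value.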

image pull error when installed from operator hub

I just installed this operator from OperatorHub and I'm getting this error:

Failed to pull image "openshift/ose-node-problem-detector:v4.0": rpc error: code = Unknown desc = Error reading manifest v4.0 in docker.io/openshift/ose-node-problem-detector: errors: denied: requested access to the resource is denied unauthorized: authentication required

on the daemon set image.
