kubernetes-master-class's Issues

Handy Kubernetes One-Liners

Root Privesc

kubectl run r00t --restart=Never -it --rm --image alpine:latest --overrides \
	'{ "spec": { "hostPID": true, "containers": [{
	"name": "1", "image": "alpine", "command": [
	"nsenter", "--mount=/proc/1/ns/mnt", "--", "/bin/bash"
	], "stdin": true, "tty": true, "securityContext": {
	"privileged": true } }]}}'

List etcd members

docker exec -e ETCDCTL_ENDPOINTS=$(docker exec etcd /bin/sh -c "etcdctl member list | cut -d, -f5 | sed -e 's/ //g' | paste -sd ','") etcd etcdctl endpoint status --write-out table

Root Privesc 2

nsenter --mount=/host/proc/1/ns/mnt --net=/host/proc/1/ns/net /bin/bash

Recovering from `rke etcd snapshot-restore` failures

If an error occurs while restoring etcd from a snapshot, failures can persist even after the root cause has been remediated. To avoid this, clean up one or all of the pods created during the rke etcd snapshot-restore process before attempting another restore. The following pods are created during the etcd restore process:

  • etcd-restore
  • etcd-checksum-checker
  • etcd-download-backup
  • etcd-serve-backup
  • etcd-extract-statefile
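
The cleanup described above could be sketched roughly as follows. This is only an illustration, not the official RKE procedure: it assumes the leftovers run as Docker containers on each etcd node (the issue calls them pods), that the names match the list above with the lowercase spelling etcd-serve-backup, and the function name and DRY_RUN switch are made up for this example.

```shell
# Names taken from the list above (assumed to be Docker container names).
RESTORE_CONTAINERS="etcd-restore etcd-checksum-checker etcd-download-backup etcd-serve-backup etcd-extract-statefile"

# Hypothetical helper: force-remove each leftover restore container.
# With DRY_RUN=1 the docker commands are printed instead of executed.
cleanup_restore_containers() {
    for name in $RESTORE_CONTAINERS; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "docker rm -f $name"
        else
            # Ignore errors for containers that do not exist on this node.
            docker rm -f "$name" || true
        fi
    done
}
```

Running it once with DRY_RUN=1 on each etcd node shows what would be removed; calling cleanup_restore_containers without DRY_RUN set performs the removal before the next restore attempt.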

Update k8s troubleshooting

Thank you for doing the class on troubleshooting Kubernetes.

Could you do an updated class on this subject?

Avoid node scaling when performing `rke etcd snapshot-restore`

When performing an rke etcd snapshot-restore, it's strongly recommended that the cluster's node configuration be left unchanged until the restore is complete. Modifying the nodes before the restore completes can cause etcd certificate issues or other unexpected, harmful results that will impede your ability to restore your etcd snapshot.

Longhorn Troubleshooting: Cleaning up deleted nodes

Even though I had deleted the nodes I added from the Rancher UI, those nodes were still present in the Longhorn configuration. As a result, longhorn-manager was not able to schedule again.

Here is the procedure I performed: 

cloud@rebond-oks:~$ kubectl get nodes
NAME                            STATUS   ROLES               AGE    VERSION
winoks-rlk-eu-west-0a-master1   Ready    controlplane,etcd   240d   v1.18.10
winoks-rlk-eu-west-0a-worker1   Ready    worker              240d   v1.18.10
winoks-rlk-eu-west-0b-master1   Ready    controlplane,etcd   240d   v1.18.10
winoks-rlk-eu-west-0b-worker1   Ready    worker              240d   v1.18.10
winoks-rlk-eu-west-0c-master1   Ready    controlplane,etcd   240d   v1.18.10
winoks-rlk-eu-west-0c-worker1   Ready    worker              240d   v1.18.10
cloud@rebond-oks:~$ kubectl -n longhorn-system get nodes.longhorn.io
NAME                            READY   ALLOWSCHEDULING   SCHEDULABLE   AGE
winoks-rlk-eu-west-0a-worker1   True    true              True          240d
winoks-rlk-eu-west-0a-worker2   True    false             True          6h38m
winoks-rlk-eu-west-0b-worker1   True    true              True          240d
winoks-rlk-eu-west-0b-worker2   True    false             True          6h33m
winoks-rlk-eu-west-0c-worker1   True    true              True          240d
winoks-rlk-eu-west-0c-worker2   True    false             True          6h46m

As you can see, the xxx-worker2 nodes are still listed as Longhorn nodes.

Then I deleted the stale Longhorn node entries:

cloud@rebond-oks:~$ kubectl -n longhorn-system delete nodes.longhorn.io winoks-rlk-eu-west-0a-worker2
cloud@rebond-oks:~$ kubectl -n longhorn-system delete nodes.longhorn.io winoks-rlk-eu-west-0b-worker2
cloud@rebond-oks:~$ kubectl -n longhorn-system delete nodes.longhorn.io winoks-rlk-eu-west-0c-worker2

Afterwards, I redeployed longhorn-manager and everything came back up cleanly. Replicas are rebuilding.
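
For anyone hitting the same thing, spotting the stale entries could be scripted along these lines. This is a rough sketch, not something from the Longhorn docs: the function name is hypothetical, and it assumes node names are identical in Kubernetes and Longhorn, as they are in the output above.

```shell
# Hypothetical helper: print every Longhorn node name that has no
# matching Kubernetes node.
#   $1: newline-separated Kubernetes node names
#   $2: newline-separated Longhorn node names
stale_longhorn_nodes() {
    # -F fixed strings, -x whole-line match, -v keep only non-matching lines;
    # each line of $1 is treated as a separate pattern.
    printf '%s\n' "$2" | grep -vxF -e "$1"
}

# Possible usage against a live cluster (not executed here):
#   k8s=$(kubectl get nodes --no-headers -o custom-columns=NAME:.metadata.name)
#   lh=$(kubectl -n longhorn-system get nodes.longhorn.io --no-headers -o custom-columns=NAME:.metadata.name)
#   stale_longhorn_nodes "$k8s" "$lh"
```

Each printed name is a candidate for kubectl -n longhorn-system delete nodes.longhorn.io, but it is worth checking the list by hand before deleting anything.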
