Comments (18)
I think #1513 will address this. Though really I've moved this to doing gets instead of deletes. So if the Gets are being audited there would still be lots of logs.
Another possibility would be to delete only once, either when we know it needs to be deleted or at start, and just assume that nothing else will be creating the tigerastatus.
from operator.
In our case, we look at 4xx and 5xx return codes in general, for any verb. This helps us detect misconfigured services or clients that might be doing something wrong.
This means that we would indeed still report it, but with a different verb.
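For readers who want to reproduce this kind of check, here is a minimal sketch that counts 4xx and 5xx responses in a Kubernetes audit log with one JSON event per line. The field names (`verb`, `user.username`, `objectRef.resource`, `responseStatus.code`) follow the `audit.k8s.io/v1` Event schema. The function name, the sample events and the service account username are made up for illustration and are not taken from the cluster discussed here.

```python
import json
from collections import Counter

# Assumed username for the operator's service account (illustrative only).
OPERATOR_USER = "system:serviceaccount:tigera-operator:tigera-operator"


def count_error_responses(lines):
    """Count audit events with a 4xx/5xx response code.

    Returns a Counter keyed by (username, verb, resource, code).
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Events logged before the response is complete carry no status code.
        code = (event.get("responseStatus") or {}).get("code")
        if code is None or code < 400:
            continue
        key = (
            (event.get("user") or {}).get("username", "unknown"),
            event.get("verb", "unknown"),
            (event.get("objectRef") or {}).get("resource", "unknown"),
            code,
        )
        counts[key] += 1
    return counts


# Made-up sample events; real audit events carry many more fields.
SAMPLE_EVENTS = [
    {"verb": "delete", "user": {"username": OPERATOR_USER},
     "objectRef": {"resource": "tigerastatuses", "name": "apiserver"},
     "responseStatus": {"code": 404}},
    {"verb": "delete", "user": {"username": OPERATOR_USER},
     "objectRef": {"resource": "tigerastatuses", "name": "apiserver"},
     "responseStatus": {"code": 404}},
    {"verb": "get", "user": {"username": OPERATOR_USER},
     "objectRef": {"resource": "installations", "name": "default"},
     "responseStatus": {"code": 200}},
]
SAMPLE_LINES = [json.dumps(event) for event in SAMPLE_EVENTS]

if __name__ == "__main__":
    for key, n in count_error_responses(SAMPLE_LINES).most_common():
        print(n, *key)
```

Grouping by verb as well as by code shows why switching the operator from Delete to Get would still be reported by a check like this, only under a different verb.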
Another possibility would be to delete only once, either when we know it needs to be deleted or at start, and just assume that nothing else will be creating the tigerastatus.
This does seem like a cleaner solution. I haven't looked into the tigerastatus and its lifecycle, but in the case of AKS, I believe they are not creating any.
From looking at the CRD definition, https://docs.tigera.io/manifests/ocp/crds/01-crd-tigerastatus.yaml, should we expect the status to be created after the first install and to stay present to reflect the live status of the operator?
from operator.
Actually, from this comment in your code:
// removeTigeraStatus returns true and removes the status displayed in TigeraStatus if corresponding CR not found
Could there be an issue where the CR should be present on the AKS cluster?
I'm actually lost, since I cannot find the operator running in my cluster, and I only see a single CRD related to the Tigera operator.
❯ k get crd | grep tigera
installations.operator.tigera.io 2021-01-19T01:37:17Z
No Installation CR exists on the cluster.
I'm asking the AKS folks whether they are running the operator on their side, targeting the clusters, because from your code it seems that the deletion should only occur when the statusManager isn't enabled.
from operator.
So I was tired and was looking at a cluster that wasn't updated to 1.20 yet. I do now have a tigera-operator namespace, and the operator logs are shown below.
It seems that the status manager isn't starting, which explains why it keeps trying to delete the status CR.
{"level":"info","ts":1631506159.7930598,"logger":"setup","msg":"Checking type of cluster","provider":""}
{"level":"info","ts":1631506159.7975225,"logger":"setup","msg":"Checking if TSEE controllers are required","required":false}
{"level":"info","ts":1631506160.0142817,"logger":"typha_autoscaler","msg":"Starting typha autoscaler","syncPeriod":10}
{"level":"info","ts":1631506160.0143795,"logger":"setup","msg":"starting manager"}
I0913 04:09:20.015304 1 leaderelection.go:243] attempting to acquire leader lease tigera-operator/operator-lock...
{"level":"info","ts":1631506160.1331236,"logger":"typha_autoscaler","msg":"Updating typha replicas from 3 to 0"}
{"level":"info","ts":1631506170.0432284,"logger":"typha_autoscaler","msg":"Updating typha replicas from 0 to 3"}
I0913 04:09:37.643890 1 leaderelection.go:253] successfully acquired lease tigera-operator/operator-lock
{"level":"info","ts":1631506177.644897,"logger":"controller-runtime.manager.controller.apiserver-controller","msg":"Starting EventSource","source":"kind source: /, Kind="}
{"level":"info","ts":1631506177.6461124,"logger":"controller-runtime.manager.controller.tigera-installation-controller","msg":"Starting EventSource","source":"kind source: /, Kind="}
{"level":"info","ts":1631506177.7463608,"logger":"controller-runtime.manager.controller.apiserver-controller","msg":"Starting EventSource","source":"kind source: /, Kind="}
{"level":"info","ts":1631506177.7476687,"logger":"controller-runtime.manager.controller.apiserver-controller","msg":"Starting EventSource","source":"kind source: /V1, Kind=ConfigMap"}
{"level":"info","ts":1631506177.74758,"logger":"controller-runtime.manager.controller.tigera-installation-controller","msg":"Starting EventSource","source":"kind source: /V1, Kind=Secret"}
{"level":"info","ts":1631506178.953494,"logger":"controller-runtime.manager.controller.apiserver-controller","msg":"Starting EventSource","source":"kind source: /V1, Kind=Secret"}
{"level":"info","ts":1631506180.9547164,"logger":"controller-runtime.manager.controller.tigera-installation-controller","msg":"Starting EventSource","source":"kind source: /V1, Kind=ConfigMap"}
{"level":"info","ts":1631506180.9582217,"logger":"controller-runtime.manager.controller.apiserver-controller","msg":"Starting EventSource","source":"kind source: /V1, Kind=Secret"}
{"level":"info","ts":1631506180.9617813,"logger":"controller-runtime.manager.controller.apiserver-controller","msg":"Starting EventSource","source":"kind source: /, Kind="}
{"level":"info","ts":1631506180.972377,"logger":"controller-runtime.manager.controller.tigera-installation-controller","msg":"Starting EventSource","source":"kind source: /V1, Kind=ConfigMap"}
{"level":"info","ts":1631506180.981175,"logger":"controller-runtime.manager.controller.tigera-installation-controller","msg":"Starting EventSource","source":"kind source: /V1, Kind=ConfigMap"}
{"level":"info","ts":1631506180.9844275,"logger":"controller-runtime.manager.controller.tigera-installation-controller","msg":"Starting EventSource","source":"kind source: /, Kind="}
{"level":"info","ts":1631506180.9886284,"logger":"controller-runtime.manager.controller.tigera-installation-controller","msg":"Starting EventSource","source":"kind source: /, Kind="}
{"level":"info","ts":1631506181.0789032,"logger":"controller-runtime.manager.controller.apiserver-controller","msg":"Starting Controller"}
{"level":"info","ts":1631506181.0797353,"logger":"controller-runtime.manager.controller.apiserver-controller","msg":"Starting workers","worker count":1}
{"level":"info","ts":1631506181.0802593,"logger":"controller_apiserver","msg":"Reconciling APIServer","Request.Namespace":"","Request.Name":"default"}
{"level":"info","ts":1631506181.0815806,"logger":"controller_apiserver","msg":"APIServer config not found","Request.Namespace":"","Request.Name":"default"}
{"level":"info","ts":1631506181.0896528,"logger":"controller-runtime.manager.controller.tigera-installation-controller","msg":"Starting EventSource","source":"kind source: /, Kind="}
{"level":"info","ts":1631506181.1946084,"logger":"controller-runtime.manager.controller.tigera-installation-controller","msg":"Starting EventSource","source":"kind source: /, Kind="}
{"level":"info","ts":1631506181.2977993,"logger":"controller-runtime.manager.controller.tigera-installation-controller","msg":"Starting EventSource","source":"kind source: /, Kind="}
{"level":"info","ts":1631506181.3995752,"logger":"controller-runtime.manager.controller.tigera-installation-controller","msg":"Starting EventSource","source":"kind source: /, Kind="}
{"level":"info","ts":1631506181.5003502,"logger":"controller-runtime.manager.controller.tigera-installation-controller","msg":"Starting EventSource","source":"kind source: /, Kind="}
{"level":"info","ts":1631506181.6088758,"logger":"controller-runtime.manager.controller.tigera-installation-controller","msg":"Starting EventSource","source":"kind source: /, Kind="}
{"level":"info","ts":1631506181.7110636,"logger":"controller-runtime.manager.controller.tigera-installation-controller","msg":"Starting Controller"}
{"level":"info","ts":1631506181.7465715,"logger":"controller-runtime.manager.controller.tigera-installation-controller","msg":"Starting workers","worker count":1}
{"level":"info","ts":1631506185.01433,"logger":"status_manager.calico","msg":"Status manager is not ready to report component statuses."}
{"level":"info","ts":1631506191.3367884,"logger":"status_manager","msg":"Failed to update tigera status","error":"Operation cannot be fulfilled on tigerastatuses.operator.tigera.io \"calico\": the object has been modified; please apply your changes to the latest version and try again"}
{"level":"info","ts":1631537615.0556335,"logger":"status_manager","msg":"Failed to update tigera status","error":"Operation cannot be fulfilled on tigerastatuses.operator.tigera.io \"calico\": the object has been modified; please apply your changes to the latest version and try again"}
{"level":"info","ts":1631552145.0310025,"logger":"status_manager","msg":"Failed to update tigera status","error":"Operation cannot be fulfilled on tigerastatuses.operator.tigera.io \"calico\": the object has been modified; please apply your changes to the latest version and try again"}
{"level":"info","ts":1631552190.0333617,"logger":"status_manager","msg":"Failed to update tigera status","error":"Operation cannot be fulfilled on tigerastatuses.operator.tigera.io \"calico\": the object has been modified; please apply your changes to the latest version and try again"}
{"level":"info","ts":1631552210.0374384,"logger":"status_manager","msg":"Failed to update tigera status","error":"Operation cannot be fulfilled on tigerastatuses.operator.tigera.io \"calico\": the object has been modified; please apply your changes to the latest version and try again"}
{"level":"info","ts":1631553660.0494459,"logger":"status_manager","msg":"Failed to update tigera status","error":"Operation cannot be fulfilled on tigerastatuses.operator.tigera.io \"calico\": the object has been modified; please apply your changes to the latest version and try again"}
{"level":"info","ts":1631576174.2103264,"logger":"controller_apiserver","msg":"Reconciling APIServer","Request.Namespace":"","Request.Name":"default"}
{"level":"info","ts":1631576174.2106676,"logger":"controller_apiserver","msg":"APIServer config not found","Request.Namespace":"","Request.Name":"default"}
{"level":"info","ts":1631611172.4854195,"logger":"controller_apiserver","msg":"Reconciling APIServer","Request.Namespace":"","Request.Name":"default"}
{"level":"info","ts":1631611172.489367,"logger":"controller_apiserver","msg":"APIServer config not found","Request.Namespace":"","Request.Name":"default"}
from operator.
CC: @paulgmiller since you might be interested in this
from operator.
@djsly You could add a CR for the apiserver, which will cause the calico-apiserver to be deployed, and then the operator won't keep trying to remove the tigerastatus for it:
apiVersion: operator.tigera.io/v1
kind: APIServer
metadata:
  name: default
spec: {}
From looking at the CRD definition, https://docs.tigera.io/manifests/ocp/crds/01-crd-tigerastatus.yaml, should we expect the status to be created after the first install and to stay present to reflect the live status of the operator?
There will be a tigerastatus only when the corresponding tigera operator CR is created. That is why you're seeing a delete for the apiserver tigerastatus: no apiserver.operator.tigera.io 'default' has been created, so the operator is ensuring the tigerastatus is cleaned up.
from operator.
Thanks @tmjd, so I guess there are two questions:
- For the AKS team: why aren't they creating the default APIServer CR? Was this by design, or something they weren't aware of?
- For the operator: does it need to constantly try to delete something that was deliberately not created? Could the logic be improved to try once, or maybe to run a GET with a labelSelector? That would yield "No resources found" with a 200 OK:
❯ k get APIServer -l name=tigera-default -v 6
I0914 11:40:11.106136 90335 loader.go:372] Config loaded from file: /Users/sylvain_boily/.kube/config
I0914 11:40:11.416221 90335 round_trippers.go:454] GET https://<apiserver>:443/apis/operator.tigera.io/v1/apiservers?labelSelector=name%3Dtigera-default&limit=500 200 OK in 300 milliseconds
No resources found
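To make the difference between the two calls concrete, here is a toy in-memory model of one resource collection. It is not a Kubernetes client and talks to no cluster. It only mirrors the response codes the API server returns in these cases: deleting a named object that does not exist gives 404, while listing a collection with a label selector that matches nothing gives 200 with an empty list. The class and method names are hypothetical.

```python
class FakeCollection:
    """Toy stand-in for one API resource collection (illustration only)."""

    def __init__(self):
        self._objects = {}  # object name -> labels

    def create(self, name, labels=None):
        self._objects[name] = dict(labels or {})
        return 201

    def delete(self, name):
        # Addressing a specific object that does not exist is a 404.
        if name not in self._objects:
            return 404
        del self._objects[name]
        return 200

    def list_objects(self, label_selector=None):
        # Listing always succeeds; an empty match is still a 200.
        selector = label_selector or {}
        names = [
            name
            for name, labels in self._objects.items()
            if all(labels.get(k) == v for k, v in selector.items())
        ]
        return 200, names


if __name__ == "__main__":
    statuses = FakeCollection()
    print(statuses.delete("apiserver"))                        # 404
    print(statuses.list_objects({"name": "tigera-default"}))   # (200, [])
```

The trade-off is that the list call only helps if the objects actually carry the label being selected on.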
from operator.
- I'm not sure if they were aware or not or if it was a choice they made.
- It doesn't constantly need to try that delete. That's a good idea on the get with a labelSelector as an easy change. We'd need to add a label but that is easy enough and I don't think would cause any issues. Though I do think I'd prefer to adjust how we do the removal, either with a watch to see if it ever got created or delete it perhaps at "start of day" or if we knew it should be removed.
from operator.
So I added the following to my cluster:
❯ cat default-apiserver
apiVersion: operator.tigera.io/v1
kind: APIServer
metadata:
  name: default
spec: {}
and I'm getting 10x more 404s now :)
from operator.
Does that continue forever?
Does kubectl get tigerastatus apiserver -o yaml show that the apiserver is ready? I would expect that once the apiserver was available, those would stop.
from operator.
status:
  conditions:
  - lastTransitionTime: "2021-09-14T17:49:50Z"
    status: "False"
    type: Progressing
  - lastTransitionTime: "2021-09-14T17:49:45Z"
    message: 'Pod calico-apiserver/calico-apiserver-f878b8657-gk2lx failed to pull
      container image for: calico-apiserver'
    reason: Some pods are failing
    status: "True"
    type: Degraded
  - lastTransitionTime: "2021-09-14T17:49:45Z"
    status: "False"
    type: Available
I guess something isn't configured correctly with the image... I'm not seeing where this container is getting created.
Actually, it went into the calico-apiserver namespace. Can I have it run in the tigera-operator namespace?
Events:
  Type     Reason   Age                    From     Message
  ----     ------   ----                   ----     -------
  Warning  Failed   11m (x8568 over 32h)   kubelet  Error: ImagePullBackOff
  Normal   BackOff  108s (x8613 over 32h)  kubelet  Back-off pulling image "mcr.microsoft.com/oss/calico/apiserver:v3.20.0"
I'm not sure if the image was pushed to the mcr.microsoft.com repo...
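The conditions in the tigerastatus output above can also be read mechanically. The sketch below assumes the object has already been loaded into a plain dict (for example from `kubectl get tigerastatus apiserver -o json`) and reduces it to a per-type boolean plus the reasons behind any Degraded condition. The helper names are hypothetical, and the sample data is copied from the status shown in this comment.

```python
def summarize_conditions(tigerastatus):
    """Map each condition type to True/False based on its string status."""
    conditions = tigerastatus.get("status", {}).get("conditions", [])
    return {c["type"]: c.get("status") == "True" for c in conditions}


def degraded_reasons(tigerastatus):
    """Return (reason, message) pairs for conditions reporting Degraded=True."""
    conditions = tigerastatus.get("status", {}).get("conditions", [])
    return [
        (c.get("reason", ""), c.get("message", ""))
        for c in conditions
        if c.get("type") == "Degraded" and c.get("status") == "True"
    ]


# Sample data taken from the status output in this comment.
SAMPLE_STATUS = {
    "status": {
        "conditions": [
            {"lastTransitionTime": "2021-09-14T17:49:50Z",
             "status": "False", "type": "Progressing"},
            {"lastTransitionTime": "2021-09-14T17:49:45Z",
             "message": "Pod calico-apiserver/calico-apiserver-f878b8657-gk2lx "
                        "failed to pull container image for: calico-apiserver",
             "reason": "Some pods are failing",
             "status": "True", "type": "Degraded"},
            {"lastTransitionTime": "2021-09-14T17:49:45Z",
             "status": "False", "type": "Available"},
        ]
    }
}

if __name__ == "__main__":
    print(summarize_conditions(SAMPLE_STATUS))
    for reason, message in degraded_reasons(SAMPLE_STATUS):
        print(reason, "-", message)
```

For the sample above, Available is false and Degraded is true, which matches the image pull failure described here.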
from operator.
It was confirmed that the image was missing. I will retry once it is available on mcr.microsoft.com. They are working on it.
from operator.
@tmjd sorry for the late reply, the image is now present in the Microsoft repo.
The APIServer has started:
❯ k describe apiserver default
Name:         default
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  operator.tigera.io/v1
Kind:         APIServer
Metadata:
  Creation Timestamp:  2021-09-16T13:13:58Z
  Generation:          1
  Managed Fields:
    API Version:  operator.tigera.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
      f:spec:
    Manager:      kubectl-client-side-apply
    Operation:    Update
    Time:         2021-09-16T13:13:58Z
    API Version:  operator.tigera.io/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:state:
    Manager:         operator
    Operation:       Update
    Time:            2021-09-21T20:41:21Z
  Resource Version:  1078163162
  UID:               a2c5f2f4-f6f7-40cc-9663-0de3001aa195
Spec:
Status:
  State:  Ready
Events:   <none>
and the pod is running:
❯ k get pods -n calico-apiserver
NAME READY STATUS RESTARTS AGE
calico-apiserver-64c74c9dc5-5wgp9 1/1 Running 0 5d7h
it also looks like the 404s are gone as well now!
One question: is there a need to have the apiserver pods run in the calico-apiserver namespace? Could we configure it to run in the same tigera-operator namespace?
from operator.
The tigera-operator is opinionated and we think it is best practice to use different namespaces, so the operator does not provide an option to put that in a different namespace.
from operator.
@tmjd / @caseydavenport what was the fix in the end?
from operator.
Sorry, I was just skimming issues and it looked like this one was fixed based on your comment:
it also looks like the 404s are gone as well now!
Is it not fixed, @djsly?
from operator.
I don't think it's worth jumping through hoops to avoid 404s showing up in the API server logs. That's a normal and expected response in a distributed system and shouldn't be used as a metric of whether or not something is functioning.
However, it does make sense to avoid calling Delete() unnecessarily. This PR updates the operator to track whether or not it has created / deleted the CR, so it knows when it needs to delete it and when it does not: #1654
It has the side-effect of not spotting when a user creates / deletes the CR out-of-band, which isn't ideal. Need to think about that more.
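The idea and its side-effect can be sketched with a small in-process tracker. This is not the code in #1654 (the operator itself is written in Go). It is a language-neutral illustration in Python with hypothetical names: `CountingClient` stands in for the API client and records how many Delete calls would reach the API server.

```python
class CountingClient:
    """Hypothetical API client that counts the Delete calls it receives."""

    def __init__(self):
        self.exists = False
        self.delete_calls = 0

    def create(self):
        self.exists = True

    def delete(self):
        self.delete_calls += 1
        found = self.exists
        self.exists = False
        return 200 if found else 404


class StatusTracker:
    """Remembers whether this process believes the status object may exist."""

    def __init__(self, client):
        self._client = client
        # State is unknown at startup, so the first removal goes to the API.
        self._may_exist = True

    def ensure_present(self):
        self._client.create()
        self._may_exist = True

    def ensure_removed(self):
        if not self._may_exist:
            return None  # nothing to do, no API call is made
        code = self._client.delete()
        self._may_exist = False
        return code


if __name__ == "__main__":
    client = CountingClient()
    tracker = StatusTracker(client)
    # Five reconcile loops with no CR present: only the first one calls Delete.
    print([tracker.ensure_removed() for _ in range(5)])  # [404, None, None, None, None]
    print(client.delete_calls)                           # 1
    # Blind spot: something else creates the object out-of-band.
    client.create()
    print(tracker.ensure_removed(), client.exists)       # None True
```

The last two lines show the side-effect: once the tracker believes the object is gone, an object created out-of-band is never cleaned up until the process restarts or the tracker is told otherwise, for example by a watch.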
from operator.