coroot / coroot

Coroot is an open-source APM & Observability tool, a DataDog and New Relic alternative. Powered by eBPF for rapid insights into system performance. Monitor, analyze, and optimize your infrastructure effortlessly for peak reliability at any scale.

Home Page: https://coroot.com

License: Apache License 2.0

Languages: Go 61.07%, JavaScript 1.87%, HTML 0.09%, Vue 35.89%, Dockerfile 0.07%, CSS 0.93%, Makefile 0.07%
Topics: dashboard, database-monitoring, kubernetes, log-analysis, metrics, monitoring, network-monitoring, observability, prometheus, service-map

Introduction


Open-source observability augmented with actionable insights

Collecting metrics, logs, and traces alone doesn't make your applications observable. Coroot turns that data into actionable insights for you!

Features

Zero-instrumentation observability

  • Metrics, logs, traces, and profiles are gathered automatically by using eBPF
  • Coroot provides you with a Service Map that covers 100% of your system with no blind spots
  • Predefined inspections audit each application without any configuration

Application Health Summary

  • Easily understand the status of your services, even when dealing with hundreds of them
  • Gain insight into application logs without the need to manually inspect each one
  • SLOs (Service Level Objectives) tracking

Explore any outlier requests with distributed tracing

  • Investigate any anomaly with just one click
  • Vendor-neutral instrumentation with OpenTelemetry
  • Are you unable to instrument legacy or third-party services? Coroot's eBPF-based instrumentation can capture requests without requiring any code changes.

Grasp insights from logs with just a quick glance

  • Log patterns: out-of-the-box event clustering
  • Seamless logs-to-traces correlation
  • Lightning-fast search based on ClickHouse

Profile any application in 1 click

  • Analyze any unexpected spike in CPU or memory usage down to the precise line of code
  • Don't make assumptions: know exactly what resources were spent on
  • Easily investigate any anomaly by comparing it to the system's baseline behavior

Built-in expertise

  • Coroot can automatically identify over 80% of issues
  • If an app is not meeting its Service Level Objectives (SLOs), Coroot will send a single alert that includes the results of all relevant inspections
  • You can easily adjust any inspection for a particular application or an entire project

Deployment Tracking

  • Coroot discovers and monitors every application rollout in your Kubernetes cluster
  • Requires no integration with your CI/CD pipeline
  • Each release is automatically compared with the previous one, so you'll never miss even the slightest performance degradation
  • With integrated Cost Monitoring, developers can track how each change affects their cloud bill

Cost Monitoring

  • Understand your cloud costs down to the specific application
  • Doesn't require access to your cloud account or any other configuration
  • AWS, GCP, Azure

Installation

You can run Coroot as a Docker container or deploy it into any Kubernetes cluster. Check out the Installation guide.

Documentation

The Coroot documentation is available at coroot.com/docs/coroot-community-edition.

Live demo

A live demo of Coroot is available at community-demo.coroot.com.

Community & Support

Contributing

To start contributing, check out our Contributing Guide.

License

Coroot is licensed under the Apache License, Version 2.0.

Contributors

apetruhin, bck01215, cdodd, def, dependabot[bot], eabykov, jipok, juneezee, split174


Issues

Not getting service maps

Coroot 0.2.2
Coroot-node-agent 1.0.19
k3s v1.24.4+k3s1

Status reports everything is ok:

prometheus: ok
coroot-node-agent: 7 nodes found
kube-state-metrics: 186 applications found

As an example, I took Loki in a distributed setup.

In the app search, I can see the components:
[screenshot]

But if I open the app details, I can't see the inter-communication between the components:
[screenshot]

Wondering if I'm missing something? Maybe specific labels are needed?

Ability to send alerts to webhook

Hello Coroot Team!

Really nice work on the tool. It would be awesome to add webhook support for alert notifications, in order to support more external systems, e.g. Microsoft Teams. A rough sketch of what I mean follows.
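To make the request concrete, here is a rough sketch (in Go, since Coroot is written in Go) of what a generic webhook notifier could look like. The payload shape and function name are just my assumptions, not Coroot's actual API; Microsoft Teams incoming webhooks, for example, accept a JSON body with a "text" field.

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// sendWebhook posts an alert as a JSON payload to an arbitrary webhook URL.
func sendWebhook(url, title, text string) error {
	body, err := json.Marshal(map[string]string{"title": title, "text": text})
	if err != nil {
		return err
	}
	resp, err := http.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("webhook returned %s", resp.Status)
	}
	return nil
}

func main() {
	// hypothetical URL; in practice this would be a Teams/Slack incoming webhook
	_ = sendWebhook("https://example.com/webhook", "SLO violation", "app X is burning its error budget")
}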

Thanks!

Random crashes

Hello!
I got this error; it appears randomly. It doesn't crash the Coroot pod, but the UI stops responding.

2022/10/18 00:32:49 http: panic serving 10.251.3.145:50394: runtime error: integer divide by zero
goroutine 60260 [running]:
net/http.(*conn).serve.func1()
	/usr/local/go/src/net/http/server.go:1825 +0xbf
panic({0xaefee0, 0x1253250})
	/usr/local/go/src/runtime/panic.go:844 +0x258
github.com/coroot/coroot/auditor.(*appAuditor).logs(0xc001afebb8)
	/tmp/src/auditor/logs.go:92 +0x86a
github.com/coroot/coroot/auditor.Audit(0xc0124584d0)
	/tmp/src/auditor/auditor.go:32 +0x159
github.com/coroot/coroot/api/views/overview.Render(0xc0124584d0)

Another one:

2022/10/18 00:35:14 http: panic serving 10.251.6.173:57792: runtime error: integer divide by zero
goroutine 62487 [running]:
net/http.(*conn).serve.func1()
	/usr/local/go/src/net/http/server.go:1825 +0xbf
panic({0xaefee0, 0x1253250})
	/usr/local/go/src/runtime/panic.go:844 +0x258
github.com/coroot/coroot/auditor.(*appAuditor).logs(0xc01cbd6bb8)
	/tmp/src/auditor/logs.go:92 +0x86a
github.com/coroot/coroot/auditor.Audit(0xc01459a000)
	/tmp/src/auditor/auditor.go:32 +0x159
github.com/coroot/coroot/api/views/overview.Render(0xc01459a000)
	/tmp/src/api/views/overview/overview.go:41 +0x9d
github.com/coroot/coroot/api/views.Overview(...)
	/tmp/src/api/views/views.go:20
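For what it's worth, both traces point at a division in auditor/logs.go:92. A minimal sketch of the kind of guard that would avoid the panic (the variable and function names here are my assumptions, not the actual Coroot code):

package main

import "fmt"

// errorRate computes what percentage of log events were errors.
// Checking totalEvents == 0 avoids the "integer divide by zero"
// panic seen in the stack traces above.
func errorRate(errorEvents, totalEvents int) int {
	if totalEvents == 0 {
		return 0 // nothing observed in this window: report 0% instead of panicking
	}
	return errorEvents * 100 / totalEvents
}

func main() {
	fmt.Println(errorRate(5, 0))  // 0, no panic
	fmt.Println(errorRate(5, 50)) // 10
}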

pyroscope-ebpf-agent fail on openshift

After installing on-prem, I got the following error from pyroscope-ebpf-agent:
Error: unknown cri cri-o://1.23.5-7.rhaos4.10.git5cc2f1e.el8

I assume it failed because OpenShift uses CRI-O instead of Docker.
Is there any fix/workaround for that?

False positive on cronjob-backed pods

I found an interesting thing:
I have a cronjob (Renovate) running inside the k8s cluster. Since the pod starts, performs its job, and exits, it seems like Coroot groups all the pod runs (as "renovate-github" in my case), aggregates their memory usage, and reports a memory leak. Screenshot attached.

Looks like it's accumulating memory usage across each cronjob pod run?

[screenshot]

Can't see the costs tab

Hi there,

Thanks for the awesome work you guys are doing with coroot!

I'm running the latest version of Coroot (0.17.0) and the latest node-agent version (1.8.1).

I can see in Prometheus that the node_cloud_info metric exists and is being exported properly by the node-agent.

But I still can't see the costs tab in the GUI:

[screenshot]

I've used the Helm charts for installing it.
What am I doing wrong?

Thanks

Coroot UI doesn't see pg-agent metrics

Hello,
Thank you for this wonderful and convenient instrument!
Unfortunately, I ran into some trouble with pg-agent.

After installation I saw the relevant metrics in Prometheus, but my Coroot UI doesn't see them (though it registered the pg-agent instance):

[screenshots]

I started pg-agent with
docker run -d --name coroot-pg-agent -p <port>:80 --env DSN="postgresql://<user>:<password>@<ip>:5432/postgres?connect_timeout=1&statement_timeout=30000" ghcr.io/coroot/coroot-pg-agent

But got in logs of Coroot UI container:
couldn't find actual instance for "postgres", initial instance is "[email protected]" (map[])
(Yes, I have the postgres_exporter role in PostgreSQL, but I use another role for the Coroot pg-agent.)
What have I done wrong?

Also, I've got none of these metrics: pg_lock_awaiting_queries, pg_wal_receiver_status, pg_wal_replay_paused, pg_wal_receive_lsn, pg_wal_reply_lsn. Maybe this happened because I'm using PostgreSQL 11?

Unstable graph rendering

[screenshots]

Sometimes the graphs vanish and Coroot can't build the full node graph. I'm using the Linkerd service mesh.

Add filter metrics by metric labels

Hi,

The Coroot project is a powerful tool for monitoring and managing containerized environments, but it currently lacks the ability to filter metrics by their labels. This feature would greatly enhance the functionality of the Coroot project by allowing users to easily filter and aggregate metrics based on specific labels.

For example, imagine that you have a number of Prometheus agent instances collecting metrics for different k8s clusters/environments. Prometheus federation (or Thanos) aggregates metrics from the multiple Prometheus instances, adding labels (environment="env_name"). With the current version of Coroot we can't filter metrics down to one environment, so we had three environments in one Coroot project instead of separate projects for, e.g., dev, staging, and prod. With the ability to filter metrics by labels, we could easily aggregate metrics for a specific environment without having to deploy a dedicated Prometheus for Coroot carrying the same metrics we already have in the federated Prometheus.

I propose that a new config option is added to the Coroot project settings tab that allows users to filter metrics by labels per project. The option would take a list of label-value pairs as an argument, and only metrics that match those label-value pairs would be used.

This feature would greatly enhance the functionality of Coroot and make it much easier for users to aggregate and filter metrics in large setups with multiple clusters/environments. Please consider adding this feature in a future release. A sketch of the intended matching semantics follows.
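To illustrate the intended semantics (this is just my sketch, not Coroot code): a metric is kept only if its label set contains every configured label-value pair.

package main

import "fmt"

// matches reports whether a metric's labels contain every required
// label-value pair from the project's filter config.
func matches(metricLabels, required map[string]string) bool {
	for k, v := range required {
		if metricLabels[k] != v {
			return false
		}
	}
	return true
}

func main() {
	filter := map[string]string{"environment": "prod"}
	fmt.Println(matches(map[string]string{"environment": "prod", "job": "node"}, filter)) // true
	fmt.Println(matches(map[string]string{"environment": "dev"}, filter))                 // false
}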

Steps to reproduce:

Start a Coroot project with the new config option and a list of label-value pairs.
Metrics will be filtered by the given label-value pairs, showing only the services in the expected k8s cluster/environment.

Expected results:

Only metrics that match the given label-value pairs are returned.

Actual results:

Metrics are not filtered by label-value pairs, so we have all environments together in the application list and service details.

Notes:

This is not a bug report but a feature request.

Node agent is crashing with `netlink receive: no such file or directory`

I was trying out Coroot on a local minikube setup and I see that the node agent is going into CrashLoopBackOff with this error:

$ kubectl -n coroot logs -f coroot-node-agent-fgwbg
I1206 16:28:32.657690   15296 main.go:76] agent version: 1.4.1
I1206 16:28:32.657822   15296 main.go:82] hostname: minikube
I1206 16:28:32.657831   15296 main.go:83] kernel version: 5.15.0-56-generic
I1206 16:28:32.657916   15296 main.go:69] machine-id:  845c2b4a10104ec4926fec08a1d703fc
I1206 16:28:32.658081   15296 metadata.go:63] cloud provider: 
I1206 16:28:32.658091   15296 collector.go:152] instance metadata: <nil>
F1206 16:28:32.658338   15296 main.go:103] netlink receive: no such file or directory

minikube version:

$ minikube version
minikube version: v1.20.0
commit: c61663e942ec43b20e8e70839dcca52e44cd85ae

kubectl version

$ kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.4", GitCommit:"872a965c6c6526caa949f0c6ac028ef7aff3fb78", GitTreeState:"clean", BuildDate:"2022-11-09T13:36:36Z", GoVersion:"go1.19.3", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2", GitCommit:"faecb196815e248d3ecfb03c680a4507229c2a56", GitTreeState:"clean", BuildDate:"2021-01-13T13:20:00Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.25) and server (1.20) exceeds the supported minor version skew of +/-1

Coroot agent fails to detect and monitor KeyDB but detects Redis

I installed the Coroot agent on my Kubernetes cluster to monitor the performance of my KeyDB instance. However, I noticed that the agent is not able to detect and monitor KeyDB, whereas it is able to detect and monitor Redis on the same Kubernetes cluster.

KeyDB is a fork of Redis; the exporter is the same as for Redis. A sketch of the kind of change I'd expect is right below.
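For illustration only (I'm assuming the agent classifies instances by process name; this is not Coroot's actual detection code), treating keydb-server the same way as redis-server would cover my case:

package main

import (
	"fmt"
	"strings"
)

// applicationType guesses an application type from a process name.
// Hypothetical sketch: keydb-server speaks the Redis protocol, so it
// could be classified as "redis" alongside redis-server.
func applicationType(processName string) string {
	switch {
	case strings.HasPrefix(processName, "redis-server"),
		strings.HasPrefix(processName, "keydb-server"):
		return "redis"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(applicationType("keydb-server")) // redis
}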

Steps to reproduce:

  • Install the coroot agent on the kubernetes cluster.
    
  • Install KeyDB and Redis on the same kubernetes cluster.
    
  • Check the coroot agent dashboard to see if both KeyDB and Redis are being monitored.
    

Expected behavior:
The coroot agent should be able to detect and monitor both KeyDB and Redis.

Actual behavior:
The coroot agent is only able to detect and monitor Redis. It is not able to detect and monitor KeyDB.

Additional information:

  • I am using the latest version of the coroot agent.
    
  • I have checked that KeyDB is running on the kubernetes cluster and is accessible.
    
  • I have checked the coroot agent logs and did not find any errors related to KeyDB.
    

Please let me know if there is anything else I can provide to help resolve this issue.

Application Discovery

Hi, I installed Coroot in our environment and configured the Prometheus URL to discover the apps. In the logs I see "0 nodes, 111 services, 211 applications", but I do not see any applications showing up in the UI.

Nodes: I understand the daemonset did not come up because of a kernel version mismatch.

Any clues?

Thanks

Communication errors

Greetings coroot team! Great work keep it up!

I've been running coroot for the past month, and currently I'm facing an issue that I cannot really explain.

Post http://thanos-query.monitoring.svc.cluster.local:10902/api/v1/query" dial tcp 10.172.3.247:10902: connect: cannot assign requested address.

Maybe the container is opening too many connections and the ephemeral ports are getting exhausted? Have you come across this before? This error popped up after around a week of stable operation.

Add support for different notification channels for Deployments and Incidents

Currently, our notification system uses the same channel for all types of notifications, including deployment notifications and incident notifications. However, we would like to have the ability to use different channels for these two types of notifications.

For example, we would like to use a dedicated Slack channel for deployment notifications so that our development team can quickly see when a new deployment has been made and the impact of it.

On the other hand, we would like to use another channel for incident notifications, as this allows us to keep a more detailed record of incidents and enable alerting on them. A small sketch of the kind of config we mean follows.
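To illustrate what we are asking for, a small sketch of per-event-type routing (the type and field names are just a suggestion, not an existing Coroot config):

package main

import "fmt"

// NotificationConfig sketches per-event-type channel routing:
// deployments and incidents each get their own destination.
type NotificationConfig struct {
	DeploymentsChannel string // e.g. a Slack channel watched by the dev team
	IncidentsChannel   string // e.g. an alerting channel with a detailed incident record
}

func main() {
	cfg := NotificationConfig{
		DeploymentsChannel: "#deployments",
		IncidentsChannel:   "#incidents",
	}
	fmt.Printf("%+v\n", cfg)
}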

Please let us know if there are any questions or concerns regarding this request.

[feature request] allow blacklisting of certain services

Hello! I'm not sure if this should be part of the node-agent or the UI, so I'm putting it here.

I want to be able to blacklist certain services from the UI. As an example, I have iscsid running on the host nodes for Longhorn (it's also used by OpenEBS Jiva, and maybe something else too). It creates useless links on the map (it runs on every node, connected to all instance managers with a local replica, so there are tons of them). Hardcoding them all into the Coroot code seems pointless, so maybe just add a setting to hide those nodes from the UI?

On the other side, blacklisting them in the agent (maybe with an argument, like "--ignore-services=iscsid,haproxy"?) might reduce cardinality in Prometheus, assuming those services are dropped at probe time. A rough sketch of what I mean is below.
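A minimal sketch of the proposed flag (the flag name is my suggestion; this is not existing node-agent code):

package main

import (
	"flag"
	"fmt"
	"strings"
)

var ignoreServices = flag.String("ignore-services", "",
	"comma-separated service names to drop at probe time")

// ignored reports whether a service name was listed in --ignore-services.
func ignored(name, list string) bool {
	for _, s := range strings.Split(list, ",") {
		if strings.TrimSpace(s) == name {
			return true
		}
	}
	return false
}

func main() {
	flag.Parse()
	fmt.Println(ignored("iscsid", *ignoreServices))
}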

"No metrics found. Looks like you didn't install node-agent." after helm install

Because I need Prometheus with Alertmanager (for Robusta), I am using kube-prometheus-stack.
I used:
helm install --namespace coroot --create-namespace coroot --set prometheus.enabled=false coroot/coroot
And now Coroot shows me the error "No metrics found. Looks like you didn't install node-agent."
But the agents are there and working. What did I miss?

k get pod -A | grep agent

coroot          coroot-node-agent-lcd97                                  1/1     Running     0              26m
coroot          coroot-node-agent-zjqjv                                  1/1     Running     0              26m
coroot          coroot-node-agent-zr9z7                                  1/1     Running     0              26m


Containerd memory usage in Cost tab

Hi,

In the Cost tab we can see high memory usage by containerd (or dockerd on previous K8s versions), but to my knowledge this memory is not actually used. It's only cache, so it shouldn't be shown on these usage bars. Knowing that 50-70% of memory is cached doesn't tell us anything useful, because it's standard Linux behavior to use free memory as cache.

In the example below we have really used less than 50% of memory, not ~80%, and containerd (magenta) is not using 38% of memory.

[screenshots]

Valuable tracing info has disappeared

We are running v0.17.0 on a 1.24 AWS EKS cluster. Initially, when we installed it (with Helm), we were able to see the frequency as well as the latency of interactions between services, which was very valuable to us. Since then we have done very little in terms of customizations: we just introduced a set of Categories to better separate services, and enabled basic authentication for the Costs URL. Additionally, we are using an external Postgres database, installed with its own Helm chart. Prometheus is also external; we are using our main Prometheus service.

The problem we have discovered is that, now, the thickness of the lines between services remains fixed (whereas it previously would get thicker if the interactions were more frequent), and there is no label on them with the average latency of each interaction.

What could be wrong? How can we troubleshoot this?

Thanks

label selectors to differentiate apps in namespace

I need multiple groups of applications in a single namespace. I may have 10 copies of a set of applications and want to be able to group them. It looks like grouping is currently done primarily via namespace.

Costs tab can't parse JSON

Hi,

We still have configuration I described in #20.

I already reinstalled our Coroot and node-agents to try the new feature, and I get an error on the Costs tab. I looked in the documentation but couldn't find any information about it.

We have:
[screenshot]
and in the logs of the container I only see:

W0413 11:17:52.309318       1 jvm.go:17] only one JVM per instance is supported so far, will keep only org.elasticsearch.launcher.CliToolLauncher
W0413 11:17:52.309403       1 jvm.go:17] only one JVM per instance is supported so far, will keep only org.elasticsearch.launcher.CliToolLauncher
W0413 11:17:52.309750       1 jvm.go:17] only one JVM per instance is supported so far, will keep only org.elasticsearch.launcher.CliToolLauncher
W0413 11:17:52.309774       1 jvm.go:17] only one JVM per instance is supported so far, will keep only org.elasticsearch.launcher.CliToolLauncher
I0413 11:17:52.322924       1 constructor.go:113] got 8 nodes, 159 services, 205 applications
I0413 11:17:52.322939       1 api.go:606] world loaded in 903.774258ms
E0413 11:17:52.504682       1 json.go:15] failed to encode: json: unsupported value: NaN

Can you help resolve this issue? Am I missing something important?

UPDATE: we have Karpenter; it spawns on-demand and spot nodes in that EKS cluster.

coroot agent

Hi - the Coroot agent is failing to start in our environment because our kernel version is 3.10.0, whereas 4.16+ is required. Is there any workaround?

We are using RHEL 7.

PVs run out of space | data persistence

Hi, the question is about data persistence.

coroot: unknown

/dev/rbd3 50G 13G 37G 26% /data

clickhouse: unknown

/dev/rbd4 49G 49G 936K 100% /bitnami/clickhouse

prometheus: ok

- args:
  - --storage.tsdb.retention.time=1d

pyroscope: ok

retention: 8h
retention-levels:
  "0": 1h
  "1": 4h
  "2": 8h

How can I set data retention periods for Coroot / ClickHouse?

root@coroot-7cc6f5d764-tw2wh:/opt/coroot# df -h
Filesystem            Size  Used Avail Use% Mounted on
overlay               120G   26G   95G  22% /
tmpfs                  64M     0   64M   0% /dev
tmpfs                  32G     0   32G   0% /sys/fs/cgroup
/dev/rbd3              50G   13G   37G  26% /data
/dev/mapper/data-var  120G   26G   95G  22% /etc/hosts
shm                    64M     0   64M   0% /dev/shm
tmpfs                  63G   12K   63G   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs                  32G     0   32G   0% /proc/acpi
tmpfs                  32G     0   32G   0% /proc/scsi
tmpfs                  32G     0   32G   0% /sys/firmware
I have no name!@coroot-clickhouse-shard0-0:/$ df -h
Filesystem            Size  Used Avail Use% Mounted on
overlay               120G   26G   95G  22% /
tmpfs                  64M     0   64M   0% /dev
tmpfs                  32G     0   32G   0% /sys/fs/cgroup
/dev/mapper/data-var  120G   26G   95G  22% /etc/hosts
/dev/rbd4              49G   49G  936K 100% /bitnami/clickhouse
shm                    64M     0   64M   0% /dev/shm
tmpfs                  63G   12K   63G   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs                  32G     0   32G   0% /proc/acpi
tmpfs                  32G     0   32G   0% /proc/scsi
tmpfs                  32G     0   32G   0% /sys/firmware
NAME                             STATUS  VOLUME                                    CAPACITY  ACCESS  STORAGECLASS
coroot-data                      Bound   pvc-40c82276-07b4-4cc3-8fb0-f2e87688e772  50Gi      RWO     39d
coroot-prometheus-server         Bound   pvc-a46d2876-c619-4bdb-aac8-cbf8c3ac91b9  50Gi      RWO     39d
coroot-pyroscope                 Bound   pvc-9387ec5b-4d08-4f0c-b848-af600ee918e0  50Gi      RWO     39d
data-coroot-clickhouse-shard0-0  Bound   pvc-9e46791c-9334-4a81-906e-30e36d4f5b6d  50Gi      RWO     39d

Thanks

Coroot UI fails to see one of the Postgres instances

Hi. Coroot doesn't show one of the replicas in my 3-server Postgres setup as a Postgres node. It also fails to see the replication roles, and doesn't recognize other deployments (GitLab/Nextcloud) as Postgres.

[screenshot]

The relevant metrics are present in Prometheus:

[screenshots]

Could you please advise on how to handle this? I get the feeling that coroot doesn't like low-query situations or my labels.

Support for ARM64 architectures

Hello there,

I am trying Coroot on an in-house Kubernetes cluster deployed on Raspberry Pi 4 machines. I followed the installation guide but I am getting "exec /opt/coroot/coroot: exec format error". This tells me that the arm64 arch is not supported. Considering that cloud providers like AWS, GCP, Oracle, and now Azure all offer arm64 instances, it would be great if we could get arm64 support.

Node agent is configured but Coroot is not identifying it

I have a problem: even with the node-agent configured, Coroot is not collecting node metrics. Below are the YAML files. Basically I deployed it with the defaults; I'm also using kube-prometheus-stack, but I don't use the podSelector configuration.

[screenshot]

As pictured above, the configuration with Prometheus is OK.

[screenshot]

The node-agent daemonset pods are apparently healthy too, but the Coroot web UI always shows "coroot-node-agent: no agent installed".

I've followed the document below to install the node-agent:

https://coroot.com/docs/metric-exporters/node-agent/overview

apiVersion: v1
kind: Namespace
metadata:
  name: monitoring

---

apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: coroot-node-agent
  name: coroot-node-agent
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: coroot-node-agent
  template:
    metadata:
      labels:
        app: coroot-node-agent
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '80'
    spec:
      tolerations:
        - operator: Exists
      hostPID: true
      containers:
        - name: coroot-node-agent
          image: ghcr.io/coroot/coroot-node-agent:latest
          args: ["--cgroupfs-root", "/host/sys/fs/cgroup"]
          ports:
            - containerPort: 80
              name: http
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /host/sys/fs/cgroup
              name: cgroupfs
              readOnly: true
            - mountPath: /sys/kernel/debug
              name: debugfs
              readOnly: false
      volumes:
        - hostPath:
            path: /sys/fs/cgroup
          name: cgroupfs
        - hostPath:
            path: /sys/kernel/debug
          name: debugfs
---
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: coroot-node-agent
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: coroot-node-agent
  podMetricsEndpoints:
    - port: http

k0s v1.26.1+k0s.0 failed to inspect container

I'm having trouble getting service maps working. I installed Coroot into a cluster with 1 master node + 2 worker nodes; all applications show external endpoints, and no CPU/memory data is picked up.

I used Helm, which installed the following Coroot versions:

$ helm install --namespace coroot --create-namespace coroot coroot/coroot

image: ghcr.io/coroot/coroot-node-agent:1.6.4
image: ghcr.io/coroot/coroot:0.13.1
$ k0s version
v1.26.1+k0s.0

$ uname -a
Linux kmaster 5.15.0-60-generic #66-Ubuntu SMP Fri Jan 20 14:29:49 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

$ sudo sysctl -a | grep bpf
kernel.bpf_stats_enabled = 1
kernel.unprivileged_bpf_disabled = 0
net.core.bpf_jit_enable = 1
net.core.bpf_jit_harden = 0
net.core.bpf_jit_kallsyms = 1
net.core.bpf_jit_limit = 264241152

The node agent logs show "failed to get container metadata for pid" and "failed to inspect container" errors:

I0226 11:40:37.256313   13256 registry.go:191] TCP connection from unknown container {connection-open none 14216 10.244.0.218:46266 10.103.101.131:80 25 5635465385867 <nil>}
W0226 11:40:37.256387   13256 registry.go:244] failed to get container metadata for pid 14216 -> /kubepods/burstable/podd785437d-e85d-40f0-b13f-52a66f1dda5d/cc5347fb09a49ab8a1017960f8c70e4e765dedb561cb0e2eb7325196fc4efcf4: failed to interact with dockerd (%!s(<nil>)) or with containerd (%!s(<nil>))

Separately, it seems that I could not push my code and got this error message:
Permission to coroot/coroot-node-agent.git denied to irvanmohamad

I would like to add support for the k0s distribution in the containerd.go file, so that the socket list would look like this:

sockets := []string{"/var/snap/microk8s/common/run/containerd.sock", "/run/k3s/containerd/containerd.sock", "/run/containerd/containerd.sock", "/run/k0s/containerd.sock"}

microk8s failed to inspect container

I'm having trouble getting service maps working. I installed Coroot into a single-node microk8s cluster; all applications show external endpoints, and no CPU/memory data is picked up.

I used Helm, which installed the following Coroot versions:

$ helm install --namespace coroot --create-namespace coroot coroot/coroot

   image: ghcr.io/coroot/coroot-node-agent:1.6.1
   image: ghcr.io/coroot/coroot:0.11.0
$ microk8s version
MicroK8s v1.25.4 revision 4221

$ uname -a
Linux micro.k8s 5.15.0-52-generic #58-Ubuntu SMP Thu Oct 13 08:03:55 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

The node agent logs show "failed to inspect container" errors:

+ coroot-node-agent-cc898 › node-agent
coroot-node-agent-cc898 node-agent W1228 12:01:01.468515 2813613 registry.go:323] failed to inspect container 3525641293dd8d8f2974aa8e1dd605f291f233356a9a5a9a931535c8ebc13df7: Error: No such container: 3525641293dd8d8f2974aa8e1dd605f291f233356a9a5a9a931535c8ebc13df7
coroot-node-agent-cc898 node-agent W1228 12:01:01.468993 2813613 registry.go:332] failed to inspect container 3525641293dd8d8f2974aa8e1dd605f291f233356a9a5a9a931535c8ebc13df7: container "3525641293dd8d8f2974aa8e1dd605f291f233356a9a5a9a931535c8ebc13df7" in namespace "k8s.io": not found
coroot-node-agent-cc898 node-agent W1228 12:01:01.469017 2813613 registry.go:244] failed to get container metadata for pid 2814392 -> /kubepods/burstable/podb2eadec9-fe40-4534-b14f-c4f4fbe695fc/3525641293dd8d8f2974aa8e1dd605f291f233356a9a5a9a931535c8ebc13df7: failed to interact with dockerd (Error: No such container: 3525641293dd8d8f2974aa8e1dd605f291f233356a9a5a9a931535c8ebc13df7) or with containerd (container "3525641293dd8d8f2974aa8e1dd605f291f233356a9a5a9a931535c8ebc13df7" in namespace "k8s.io": not found)
coroot-node-agent-cc898 node-agent W1228 12:01:01.639019 2813613 registry.go:323] failed to inspect container c14e5fc91c78fd05debecb9e5a805403bb7a6d4d2a75750ad4b5ba25a3b249c0: Error: No such container: c14e5fc91c78fd05debecb9e5a805403bb7a6d4d2a75750ad4b5ba25a3b249c0
coroot-node-agent-cc898 node-agent W1228 12:01:01.639488 2813613 registry.go:332] failed to inspect container c14e5fc91c78fd05debecb9e5a805403bb7a6d4d2a75750ad4b5ba25a3b249c0: container "c14e5fc91c78fd05debecb9e5a805403bb7a6d4d2a75750ad4b5ba25a3b249c0" in namespace "k8s.io": not found
coroot-node-agent-cc898 node-agent W1228 12:01:01.639507 2813613 registry.go:244] failed to get container metadata for pid 169108 -> /kubepods/besteffort/pod6cfd3792-d0be-4835-905e-11415fab06bb/c14e5fc91c78fd05debecb9e5a805403bb7a6d4d2a75750ad4b5ba25a3b249c0: failed to interact with dockerd (Error: No such container: c14e5fc91c78fd05debecb9e5a805403bb7a6d4d2a75750ad4b5ba25a3b249c0) or with containerd (container "c14e5fc91c78fd05debecb9e5a805403bb7a6d4d2a75750ad4b5ba25a3b249c0" in namespace "k8s.io": not found)
coroot-node-agent-cc898 node-agent I1228 12:01:01.639514 2813613 registry.go:197] TCP connection error from unknown container {connection-error none 169108 10.1.93.101:33846 192.168.1.197:8126 8 0 <nil>}
coroot-node-agent-cc898 node-agent W1228 12:01:01.639827 2813613 registry.go:323] failed to inspect container c14e5fc91c78fd05debecb9e5a805403bb7a6d4d2a75750ad4b5ba25a3b249c0: Error: No such container: c14e5fc91c78fd05debecb9e5a805403bb7a6d4d2a75750ad4b5ba25a3b249c0
coroot-node-agent-cc898 node-agent W1228 12:01:01.640051 2813613 registry.go:332] failed to inspect container c14e5fc91c78fd05debecb9e5a805403bb7a6d4d2a75750ad4b5ba25a3b249c0: container "c14e5fc91c78fd05debecb9e5a805403bb7a6d4d2a75750ad4b5ba25a3b249c0" in namespace "k8s.io": not found
coroot-node-agent-cc898 node-agent W1228 12:01:01.640063 2813613 registry.go:244] failed to get container metadata for pid 168714 -> /kubepods/besteffort/pod6cfd3792-d0be-4835-905e-11415fab06bb/c14e5fc91c78fd05debecb9e5a805403bb7a6d4d2a75750ad4b5ba25a3b249c0: failed to interact with dockerd (Error: No such container: c14e5fc91c78fd05debecb9e5a805403bb7a6d4d2a75750ad4b5ba25a3b249c0) or with containerd (container "c14e5fc91c78fd05debecb9e5a805403bb7a6d4d2a75750ad4b5ba25a3b249c0" in namespace "k8s.io": not found)
coroot-node-agent-cc898 node-agent I1228 12:01:01.640067 2813613 registry.go:197] TCP connection error from unknown container {connection-error none 168714 10.1.93.101:33858 192.168.1.197:8126 8 0 <nil>}
coroot-node-agent-cc898 node-agent W1228 12:01:01.960432 2813613 registry.go:323] failed to inspect container 9352666a52fae4b94047fd1f07620b4ef23826c079ff36064cc025ac24877f84: Error: No such container: 9352666a52fae4b94047fd1f07620b4ef23826c079ff36064cc025ac24877f84
coroot-node-agent-cc898 node-agent W1228 12:01:01.960823 2813613 registry.go:332] failed to inspect container 9352666a52fae4b94047fd1f07620b4ef23826c079ff36064cc025ac24877f84: container "9352666a52fae4b94047fd1f07620b4ef23826c079ff36064cc025ac24877f84" in namespace "k8s.io": not found
coroot-node-agent-cc898 node-agent W1228 12:01:01.960978 2813613 registry.go:244] failed to get container metadata for pid 3742007 -> /kubepods/burstable/pod3faa3f58-e25e-4dac-9a6e-6a59a20d6cdb/9352666a52fae4b94047fd1f07620b4ef23826c079ff36064cc025ac24877f84: failed to interact with dockerd (Error: No such container: 9352666a52fae4b94047fd1f07620b4ef23826c079ff36064cc025ac24877f84) or with containerd (container "9352666a52fae4b94047fd1f07620b4ef23826c079ff36064cc025ac24877f84" in namespace "k8s.io": not found)
coroot-node-agent-cc898 node-agent I1228 12:01:01.960994 2813613 registry.go:191] TCP connection from unknown container {connection-open none 3742007 127.0.0.1:45680 127.0.0.1:8080 14 1338699945448561 <nil>}
coroot-node-agent-cc898 node-agent W1228 12:01:02.317373 2813613 registry.go:323] failed to inspect container 3525641293dd8d8f2974aa8e1dd605f291f233356a9a5a9a931535c8ebc13df7: Error: No such container: 3525641293dd8d8f2974aa8e1dd605f291f233356a9a5a9a931535c8ebc13df7


OCI OKE: Not getting service maps


coroot-node-agent:1.8.8
coroot:0.17.11


[root@oke-cz7rd4xktfa-n7ojj5z4kja-s52h5y2w2ra-0 /]# uname -a
Linux oke-cz7rd4xktfa-n7ojj5z4kja-s52h5y2w2ra-0 5.15.0-101.103.2.1.el8uek.aarch64 #2 SMP Mon May 1 19:47:28 PDT 2023 aarch64 aarch64 aarch64 GNU/Linux

> node_info

node_info{app="coroot-node-agent", controller_revision_hash="846465d744", hostname="oke-cz7rd4xktfa-n7ojj5z4kja-s52h5y2w2ra-0", instance="20.244.0.80:80", job="kubernetes-pods", kernel_version="5.15.0-101.103.2.1.el8uek.aarch64", kubernetes_namespace="monitoring-stg", kubernetes_pod_name="coroot-node-agent-897l9", machine_id="8b84bad6abdc42a8b283cd1b3e0fa093", pod_template_generation="3"}
	1
node_info{app="coroot-node-agent", controller_revision_hash="846465d744", hostname="oke-cz7rd4xktfa-n7ojj5z4kja-s52h5y2w2ra-1", instance="20.244.0.177:80", job="kubernetes-pods", kernel_version="5.15.0-101.103.2.1.el8uek.aarch64", kubernetes_namespace="monitoring-stg", kubernetes_pod_name="coroot-node-agent-l6l2r", machine_id="a7969b82508a4c529f8fe2bad247f1e6", pod_template_generation="3"}


> node_agent_info

node_agent_info{app="coroot-node-agent", controller_revision_hash="846465d744", instance="20.244.0.177:80", job="kubernetes-pods", kubernetes_namespace="monitoring-stg", kubernetes_pod_name="coroot-node-agent-l6l2r", machine_id="a7969b82508a4c529f8fe2bad247f1e6", pod_template_generation="3", version="1.8.8"}
	1
node_agent_info{app="coroot-node-agent", controller_revision_hash="846465d744", instance="20.244.0.80:80", job="kubernetes-pods", kubernetes_namespace="monitoring-stg", kubernetes_pod_name="coroot-node-agent-897l9", machine_id="8b84bad6abdc42a8b283cd1b3e0fa093", pod_template_generation="3", version="1.8.8"}


> kube_node_info

kube_node_info{app="kube-state-metrics", container_runtime_version="cri-o://1.26.2-142.el8", instance="20.244.0.58:8080", internal_ip="20.0.10.151", job="kubernetes-pods", kernel_version="5.15.0-101.103.2.1.el8uek.aarch64", kubelet_version="v1.26.2", kubeproxy_version="v1.26.2", kubernetes_namespace="monitoring-stg", kubernetes_pod_name="kube-state-metrics-54869c556b-mmxg9", node="20.0.10.151", os_image="Oracle Linux Server 8.7", pod_cidr="20.244.0.128/25", pod_template_hash="54869c556b", provider_id="ocid1.instance.oc1.iad.<REMOVED_SECURITY>", system_uuid="a7969b82-508a-4c52-9f8f-e2bad247f1e6"}
	1
kube_node_info{app="kube-state-metrics", container_runtime_version="cri-o://1.26.2-142.el8", instance="20.244.0.58:8080", internal_ip="20.0.10.77", job="kubernetes-pods", kernel_version="5.15.0-101.103.2.1.el8uek.aarch64", kubelet_version="v1.26.2", kubeproxy_version="v1.26.2", kubernetes_namespace="monitoring-stg", kubernetes_pod_name="kube-state-metrics-54869c556b-mmxg9", node="20.0.10.77", os_image="Oracle Linux Server 8.7", pod_cidr="20.244.0.0/25", pod_template_hash="54869c556b", provider_id="ocid1.instance.oc1.iad..<REMOVED_SECURITY>", system_uuid="8b84bad6-abdc-42a8-b283-cd1b3e0fa093"}



> container_net_tcp_active_connections

container_net_tcp_active_connections{actual_destination="127.0.0.1:10250", app="coroot-node-agent", container_id="/k8s/kube-system/proxymux-client-rxsvt/proxymux-client", controller_revision_hash="846465d744", destination="127.0.0.1:10250", instance="20.244.0.80:80", job="kubernetes-pods", kubernetes_namespace="monitoring-stg", kubernetes_pod_name="coroot-node-agent-897l9", machine_id="8b84bad6abdc42a8b283cd1b3e0fa093", pod_template_generation="3"}
	2
container_net_tcp_active_connections{actual_destination="127.0.0.1:32768", app="coroot-node-agent", container_id="/system.slice/oracle-cloud-agent.service", controller_revision_hash="846465d744", destination="127.0.0.1:32768", instance="20.244.0.80:80", job="kubernetes-pods", kubernetes_namespace="monitoring-stg", kubernetes_pod_name="coroot-node-agent-897l9", machine_id="8b84bad6abdc42a8b283cd1b3e0fa093", pod_template_generation="3"}
	1
container_net_tcp_active_connections{actual_destination="127.0.0.1:32769", app="coroot-node-agent", container_id="/system.slice/oracle-cloud-agent.service", controller_revision_hash="846465d744", destination="127.0.0.1:32769", instance="20.244.0.177:80", job="kubernetes-pods", kubernetes_namespace="monitoring-stg", kubernetes_pod_name="coroot-node-agent-l6l2r", machine_id="a7969b82508a4c529f8fe2bad247f1e6", pod_template_generation="3"}
	1
container_net_tcp_active_connections{actual_destination="127.0.0.1:43389", app="coroot-node-agent", container_id="/system.slice/kubelet.service", controller_revision_hash="846465d744", destination="127.0.0.1:43389", instance="20.244.0.80:80", job="kubernetes-pods", kubernetes_namespace="monitoring-stg", kubernetes_pod_name="coroot-node-agent-897l9", machine_id="8b84bad6abdc42a8b283cd1b3e0fa093", pod_template_generation="3"}


> log

I0718 07:25:56.969044  560034 cilium.go:29] Unable to get object /proc/1/root/sys/fs/bpf/tc/globals/cilium_ct4_global: no such file or directory
I0718 07:25:56.969425  560034 cilium.go:35] Unable to get object /proc/1/root/sys/fs/bpf/tc/globals/cilium_ct6_global: no such file or directory
I0718 07:25:56.969465  560034 cilium.go:42] Unable to get object /proc/1/root/sys/fs/bpf/tc/globals/cilium_lb4_backends_v2: no such file or directory
I0718 07:25:56.969501  560034 cilium.go:42] Unable to get object /proc/1/root/sys/fs/bpf/tc/globals/cilium_lb4_backends_v3: no such file or directory
I0718 07:25:56.969538  560034 cilium.go:51] Unable to get object /proc/1/root/sys/fs/bpf/tc/globals/cilium_lb6_backends_v2: no such file or directory
I0718 07:25:56.969600  560034 cilium.go:51] Unable to get object /proc/1/root/sys/fs/bpf/tc/globals/cilium_lb6_backends_v3: no such file or directory
I0718 07:25:56.970004  560034 main.go:81] agent version: 1.8.8
I0718 07:25:56.970538  560034 main.go:87] hostname: oke-cz7rd4xktfa-n7ojj5z4kja-s52h5y2w2ra-0
I0718 07:25:56.970580  560034 main.go:88] kernel version: 5.15.0-101.103.2.1.el8uek.aarch64
I0718 07:25:56.970936  560034 main.go:71] machine-id:  8b84bad6abdc42a8b283cd1b3e0fa093
I0718 07:25:56.971002  560034 tracing.go:29] no OpenTelemetry collector endpoint configured
I0718 07:25:56.971247  560034 metadata.go:66] cloud provider: 
I0718 07:25:56.971283  560034 collector.go:157] instance metadata: <nil>
W0718 07:25:56.972131  560034 registry.go:66] Cannot connect to the Docker daemon at unix:///proc/1/root/run/docker.sock. Is the docker daemon running?
W0718 07:26:00.996200  560034 registry.go:69] couldn't connect to containerd through the following UNIX sockets [/var/snap/microk8s/common/run/containerd.sock,/run/k0s/containerd.sock,/run/k3s/containerd/containerd.sock,/run/containerd/containerd.sock]: failed to dial "/proc/1/root/run/containerd/containerd.sock": context deadline exceeded
I0718 07:26:00.999763  560034 crio.go:42] cri-o socket: /proc/1/root/var/run/crio/crio.sock
I0718 07:26:01.775727  560034 registry.go:266] calculated container id 1 -> /init.scope -> 
I0718 07:26:01.775856  560034 registry.go:271] "ignoring" cg="/init.scope" pid=1
I0718 07:26:01.776549  560034 registry.go:266] calculated container id 2 -> / -> 
I0718 07:26:01.776614  560034 registry.go:271] "ignoring" cg="/" pid=2
I0718 07:26:01.776731  560034 registry.go:266] calculated container id 3 -> / -> 
I0718 07:26:01.776778  560034 registry.go:271] "ignoring" cg="/" pid=3
I0718 07:26:01.776864  560034 registry.go:266] calculated container id 4 -> / -> 
I0718 07:26:01.776920  560034 registry.go:271] "ignoring" cg="/" pid=4
I0718 07:26:01.777008  560034 registry.go:266] calculated container id 5 -> / -> 
I0718 07:26:01.777045  560034 registry.go:271] "ignoring" cg="/" pid=5
I0718 07:26:01.777105  560034 registry.go:266] calculated container id 6 -> / -> 
I0718 07:26:01.777149  560034 registry.go:271] "ignoring" cg="/" pid=6
I0718 07:26:01.777207  560034 registry.go:266] calculated container id 8 -> / -> 
I0718 07:26:01.777248  560034 registry.go:271] "ignoring" cg="/" pid=8
I0718 07:26:01.777308  560034 registry.go:266] calculated container id 10 -> / -> 
I0718 07:26:01.777352  560034 registry.go:271] "ignoring" cg="/" pid=10
I0718 07:26:01.777410  560034 registry.go:266] calculated container id 11 -> / -> 
I0718 07:26:01.777452  560034 registry.go:271] "ignoring" cg="/" pid=11
I0718 07:26:01.777513  560034 registry.go:266] calculated container id 12 -> / -> 
I0718 07:26:01.777573  560034 registry.go:271] "ignoring" cg="/" pid=12
I0718 07:26:01.777654  560034 registry.go:266] calculated container id 13 -> / -> 
I0718 07:26:01.777688  560034 registry.go:271] "ignoring" cg="/" pid=13
I0718 07:26:01.777765  560034 registry.go:266] calculated container id 14 -> / -> 
I0718 07:26:01.777805  560034 registry.go:271] "ignoring" cg="/" pid=14
I0718 07:26:01.777895  560034 registry.go:266] calculated container id 15 -> / -> 
I0718 07:26:01.777952  560034 registry.go:271] "ignoring" cg="/" pid=15
I0718 07:26:01.778013  560034 registry.go:266] calculated container id 17 -> / -> 
I0718 07:26:01.778057  560034 registry.go:271] "ignoring" cg="/" pid=17
I0718 07:26:01.778128  560034 registry.go:266] calculated container id 18 -> / -> 
I0718 07:26:01.778159  560034 registry.go:271] "ignoring" cg="/" pid=18
I0718 07:26:01.778238  560034 registry.go:266] calculated container id 19 -> / -> 
I0718 07:26:01.778282  560034 registry.go:271] "ignoring" cg="/" pid=19
I0718 07:26:01.778339  560034 registry.go:266] calculated container id 20 -> / -> 
I0718 07:26:01.778380  560034 registry.go:271] "ignoring" cg="/" pid=20
I0718 07:26:01.778489  560034 registry.go:266] calculated container id 21 -> / -> 
I0718 07:26:01.778523  560034 registry.go:271] "ignoring" cg="/" pid=21
I0718 07:26:01.778607  560034 registry.go:266] calculated container id 22 -> / -> 
I0718 07:26:01.778653  560034 registry.go:271] "ignoring" cg="/" pid=22
I0718 07:26:01.778714  560034 registry.go:266] calculated container id 23 -> / -> 
I0718 07:26:01.778756  560034 registry.go:271] "ignoring" cg="/" pid=23
I0718 07:26:01.778849  560034 registry.go:266] calculated container id 24 -> / -> 
I0718 07:26:01.778883  560034 registry.go:271] "ignoring" cg="/" pid=24
I0718 07:26:01.778951  560034 registry.go:266] calculated container id 25 -> / -> 
I0718 07:26:01.778995  560034 registry.go:271] "ignoring" cg="/" pid=25
I0718 07:26:01.779052  560034 registry.go:266] calculated container id 26 -> / -> 
I0718 07:26:01.779092  560034 registry.go:271] "ignoring" cg="/" pid=26
I0718 07:26:01.779170  560034 registry.go:266] calculated container id 82 -> / -> 
I0718 07:26:01.779202  560034 registry.go:271] "ignoring" cg="/" pid=82
I0718 07:26:01.779277  560034 registry.go:266] calculated container id 83 -> / -> 
I0718 07:26:01.779308  560034 registry.go:271] "ignoring" cg="/" pid=83
I0718 07:26:01.779377  560034 registry.go:266] calculated container id 84 -> / -> 
I0718 07:26:01.779408  560034 registry.go:271] "ignoring" cg="/" pid=84
I0718 07:26:01.779479  560034 registry.go:266] calculated container id 85 -> / -> 
I0718 07:26:01.779528  560034 registry.go:271] "ignoring" cg="/" pid=85
I0718 07:26:01.779589  560034 registry.go:266] calculated container id 86 -> / -> 
I0718 07:26:01.779669  560034 registry.go:271] "ignoring" cg="/" pid=86
I0718 07:26:01.779733  560034 registry.go:266] calculated container id 87 -> / -> 
I0718 07:26:01.779775  560034 registry.go:271] "ignoring" cg="/" pid=87
I0718 07:26:01.780254  560034 registry.go:266] calculated container id 88 -> / -> 
I0718 07:26:01.780306  560034 registry.go:271] "ignoring" cg="/" pid=88
I0718 07:26:01.780384  560034 registry.go:266] calculated container id 89 -> / -> 
I0718 07:26:01.780441  560034 registry.go:271] "ignoring" cg="/" pid=89
I0718 07:26:01.780636  560034 registry.go:266] calculated container id 90 -> / -> 
I0718 07:26:01.780704  560034 registry.go:271] "ignoring" cg="/" pid=90
I0718 07:26:01.780781  560034 registry.go:266] calculated container id 92 -> / -> 
I0718 07:26:01.780814  560034 registry.go:271] "ignoring" cg="/" pid=92
I0718 07:26:01.780897  560034 registry.go:266] calculated container id 93 -> / -> 
I0718 07:26:01.780939  560034 registry.go:271] "ignoring" cg="/" pid=93
I0718 07:26:01.781016  560034 registry.go:266] calculated container id 95 -> / -> 
I0718 07:26:01.781047  560034 registry.go:271] "ignoring" cg="/" pid=95
I0718 07:26:01.781120  560034 registry.go:266] calculated container id 96 -> / -> 
I0718 07:26:01.781153  560034 registry.go:271] "ignoring" cg="/" pid=96
I0718 07:26:01.781220  560034 registry.go:266] calculated container id 97 -> / -> 
I0718 07:26:01.781270  560034 registry.go:271] "ignoring" cg="/" pid=97
I0718 07:26:01.781453  560034 registry.go:266] calculated container id 98 -> / -> 
I0718 07:26:01.781492  560034 registry.go:271] "ignoring" cg="/" pid=98
I0718 07:26:01.781565  560034 registry.go:266] calculated container id 99 -> / -> 
I0718 07:26:01.781767  560034 registry.go:271] "ignoring" cg="/" pid=99
I0718 07:26:01.781914  560034 registry.go:266] calculated container id 100 -> / -> 
I0718 07:26:01.781980  560034 registry.go:271] "ignoring" cg="/" pid=100
I0718 07:26:01.782051  560034 registry.go:266] calculated container id 101 -> / -> 
I0718 07:26:01.782208  560034 registry.go:271] "ignoring" cg="/" pid=101
I0718 07:26:01.782276  560034 registry.go:266] calculated container id 102 -> / -> 
I0718 07:26:01.782318  560034 registry.go:271] "ignoring" cg="/" pid=102
I0718 07:26:01.782381  560034 registry.go:266] calculated container id 103 -> / -> 
I0718 07:26:01.782914  560034 registry.go:271] "ignoring" cg="/" pid=103
I0718 07:26:01.783023  560034 registry.go:266] calculated container id 104 -> / -> 
I0718 07:26:01.783059  560034 registry.go:271] "ignoring" cg="/" pid=104
I0718 07:26:01.783144  560034 registry.go:266] calculated container id 105 -> / -> 
I0718 07:26:01.783184  560034 registry.go:271] "ignoring" cg="/" pid=105
I0718 07:26:01.783250  560034 registry.go:266] calculated container id 106 -> / -> 
I0718 07:26:01.783284  560034 registry.go:271] "ignoring" cg="/" pid=106
I0718 07:26:01.783352  560034 registry.go:266] calculated container id 107 -> / -> 
I0718 07:26:01.783386  560034 registry.go:271] "ignoring" cg="/" pid=107
I0718 07:26:01.783447  560034 registry.go:266] calculated container id 108 -> / -> 
I0718 07:26:01.783480  560034 registry.go:271] "ignoring" cg="/" pid=108
I0718 07:26:01.783537  560034 registry.go:266] calculated container id 109 -> / -> 
I0718 07:26:01.783986  560034 registry.go:271] "ignoring" cg="/" pid=109
I0718 07:26:01.784064  560034 registry.go:266] calculated container id 110 -> / -> 
I0718 07:26:01.784097  560034 registry.go:271] "ignoring" cg="/" pid=110
I0718 07:26:01.879247  560034 registry.go:266] calculated container id 549568 -> /kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod2093ea02_83df_4cc6_a873_6f2039a15d2a.slice/crio-921b7e1c75a70b965193cf092c1cdc4e0193a0843df1da555f708190d14da00a.scope -> /k8s/monitoring-stg/prometheus-0/prometheus
I0718 07:26:01.977487  560034 registry.go:187] TCP listen open from unknown container {listen-open none 2801 127.0.0.1:9003 0.0.0.0:0 10 0 <nil>}
I0718 07:26:04.481017  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:43066 169.254.169.254:80 15 53376656589205 <nil>}
I0718 07:26:04.947052  560034 registry.go:266] calculated container id 560190 -> /init.scope -> 
I0718 07:26:04.948888  560034 registry.go:269] "ignoring without persisting" cg="/init.scope" pid=560190
I0718 07:26:05.750873  560034 registry.go:266] calculated container id 560193 -> /system.slice/setroubleshootd.service -> /system.slice/setroubleshootd.service
I0718 07:26:05.751081  560034 container.go:833] "started journald logparser" cg="/system.slice/setroubleshootd.service"
I0718 07:26:05.751100  560034 registry.go:297] "detected a new container" pid=560193 cg="/system.slice/setroubleshootd.service" id=/system.slice/setroubleshootd.service
I0718 07:26:14.484658  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:60972 169.254.169.254:80 16 53386660156442 <nil>}
I0718 07:26:24.488563  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:48202 169.254.169.254:80 7 53396669132946 <nil>}
I0718 07:26:33.138721  560034 registry.go:266] calculated container id 560400 -> / -> 
I0718 07:26:33.139074  560034 registry.go:271] "ignoring" cg="/" pid=560400
I0718 07:26:33.147890  560034 registry.go:266] calculated container id 560401 -> /init.scope -> 
I0718 07:26:33.148245  560034 registry.go:269] "ignoring without persisting" cg="/init.scope" pid=560401
I0718 07:26:34.491688  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:45128 169.254.169.254:80 18 53406673137186 <nil>}
I0718 07:26:34.951248  560034 registry.go:266] calculated container id 560455 -> / -> 
I0718 07:26:34.951276  560034 registry.go:271] "ignoring" cg="/" pid=560455
I0718 07:26:34.991290  560034 registry.go:266] calculated container id 560460 -> / -> 
I0718 07:26:34.991316  560034 registry.go:271] "ignoring" cg="/" pid=560460
I0718 07:26:35.013947  560034 registry.go:266] calculated container id 560463 -> / -> 
I0718 07:26:35.013974  560034 registry.go:271] "ignoring" cg="/" pid=560463
I0718 07:26:44.498085  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:47108 169.254.169.254:80 19 53416679573357 <nil>}
I0718 07:26:48.942916  560034 registry.go:266] calculated container id 560612 -> / -> 
I0718 07:26:48.943272  560034 registry.go:271] "ignoring" cg="/" pid=560612
I0718 07:26:52.303707  560034 registry.go:266] calculated container id 560624 -> /user.slice -> 
I0718 07:26:52.303734  560034 registry.go:271] "ignoring" cg="/user.slice" pid=560624
I0718 07:26:52.333305  560034 registry.go:266] calculated container id 560625 -> / -> 
I0718 07:26:52.333334  560034 registry.go:271] "ignoring" cg="/" pid=560625
I0718 07:26:52.399757  560034 registry.go:266] calculated container id 560628 -> /user.slice -> 
I0718 07:26:52.399784  560034 registry.go:271] "ignoring" cg="/user.slice" pid=560628
I0718 07:26:52.423358  560034 registry.go:266] calculated container id 560629 -> / -> 
I0718 07:26:52.423628  560034 registry.go:271] "ignoring" cg="/" pid=560629
I0718 07:26:54.503940  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:47658 169.254.169.254:80 17 53426684349200 <nil>}
I0718 07:26:56.473279  560034 registry.go:266] calculated container id 560663 -> /user.slice -> 
I0718 07:26:56.473307  560034 registry.go:271] "ignoring" cg="/user.slice" pid=560663
I0718 07:26:56.491511  560034 registry.go:266] calculated container id 560664 -> / -> 
I0718 07:26:56.491541  560034 registry.go:271] "ignoring" cg="/" pid=560664
I0718 07:27:04.508990  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:36382 169.254.169.254:80 20 53436690683931 <nil>}
I0718 07:27:04.900884  560034 registry.go:266] calculated container id 560722 -> /init.scope -> 
I0718 07:27:04.900912  560034 registry.go:269] "ignoring without persisting" cg="/init.scope" pid=560722
I0718 07:27:05.622074  560034 registry.go:266] calculated container id 560754 -> / -> 
I0718 07:27:05.622115  560034 registry.go:271] "ignoring" cg="/" pid=560754
I0718 07:27:05.651856  560034 registry.go:266] calculated container id 560758 -> / -> 
I0718 07:27:05.651883  560034 registry.go:271] "ignoring" cg="/" pid=560758
I0718 07:27:14.524615  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:36238 169.254.169.254:80 15 53446701658926 <nil>}
I0718 07:27:19.828075  560034 registry.go:266] calculated container id 560920 -> / -> 
I0718 07:27:19.828142  560034 registry.go:271] "ignoring" cg="/" pid=560920
I0718 07:27:24.526287  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:59580 169.254.169.254:80 16 53456707945497 <nil>}
I0718 07:27:32.954031  560034 registry.go:266] calculated container id 560989 -> / -> 
I0718 07:27:32.954060  560034 registry.go:271] "ignoring" cg="/" pid=560989
I0718 07:27:34.529740  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:49902 169.254.169.254:80 7 53466711404813 <nil>}
I0718 07:27:44.536045  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:40388 169.254.169.254:80 18 53476717716024 <nil>}
I0718 07:27:48.471510  560034 registry.go:198] TCP connection from unknown container {connection-open none 561189 127.0.0.1:60658 127.0.0.1:10248 3 53480649074530 <nil>}
I0718 07:27:54.539453  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:34994 169.254.169.254:80 19 53486721115740 <nil>}
I0718 07:28:02.608170  560034 registry.go:266] calculated container id 561253 -> / -> 
I0718 07:28:02.608483  560034 registry.go:271] "ignoring" cg="/" pid=561253
I0718 07:28:04.545688  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:34358 169.254.169.254:80 17 53496727345551 <nil>}
I0718 07:28:04.925673  560034 registry.go:266] calculated container id 561284 -> /init.scope -> 
I0718 07:28:04.925710  560034 registry.go:269] "ignoring without persisting" cg="/init.scope" pid=561284
I0718 07:28:06.444272  560034 registry.go:266] calculated container id 561314 -> / -> 
I0718 07:28:06.444740  560034 registry.go:271] "ignoring" cg="/" pid=561314
I0718 07:28:14.548924  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:50342 169.254.169.254:80 20 53506730315104 <nil>}
I0718 07:28:14.800904  560034 registry.go:266] calculated container id 561468 -> / -> 
I0718 07:28:14.801212  560034 registry.go:271] "ignoring" cg="/" pid=561468
I0718 07:28:14.864626  560034 registry.go:266] calculated container id 561489 -> / -> 
I0718 07:28:14.864653  560034 registry.go:271] "ignoring" cg="/" pid=561489
I0718 07:28:14.879313  560034 registry.go:266] calculated container id 561495 -> / -> 
I0718 07:28:14.879340  560034 registry.go:271] "ignoring" cg="/" pid=561495
I0718 07:28:24.571427  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:55678 169.254.169.254:80 15 53516737138558 <nil>}
I0718 07:28:33.111692  560034 registry.go:266] calculated container id 561602 -> / -> 
I0718 07:28:33.111729  560034 registry.go:271] "ignoring" cg="/" pid=561602
I0718 07:28:33.118855  560034 registry.go:266] calculated container id 561603 -> / -> 
I0718 07:28:33.119250  560034 registry.go:271] "ignoring" cg="/" pid=561603
I0718 07:28:34.571405  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:57440 169.254.169.254:80 16 53526744265733 <nil>}
I0718 07:28:37.010016  560034 registry.go:266] calculated container id 561664 -> / -> 
I0718 07:28:37.011749  560034 registry.go:271] "ignoring" cg="/" pid=561664
I0718 07:28:41.483099  560034 registry.go:266] calculated container id 561758 -> / -> 
I0718 07:28:41.483137  560034 registry.go:271] "ignoring" cg="/" pid=561758
I0718 07:28:44.571586  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:33406 169.254.169.254:80 7 53536752971796 <nil>}
I0718 07:28:54.577383  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:47388 169.254.169.254:80 18 53546759177526 <nil>}
I0718 07:29:04.582986  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:34034 169.254.169.254:80 19 53556764723374 <nil>}
I0718 07:29:04.928993  560034 registry.go:266] calculated container id 561900 -> /init.scope -> 
I0718 07:29:04.929381  560034 registry.go:269] "ignoring without persisting" cg="/init.scope" pid=561900
I0718 07:29:14.591146  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:39906 169.254.169.254:80 17 53566772922634 <nil>}
I0718 07:29:24.605884  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:52710 169.254.169.254:80 20 53576780308770 <nil>}
I0718 07:29:32.982824  560034 registry.go:266] calculated container id 562166 -> / -> 
I0718 07:29:32.983172  560034 registry.go:271] "ignoring" cg="/" pid=562166
I0718 07:29:34.605294  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:46424 169.254.169.254:80 15 53586787067904 <nil>}
I0718 07:29:40.876249  560034 registry.go:266] calculated container id 562317 -> / -> 
I0718 07:29:40.876283  560034 registry.go:271] "ignoring" cg="/" pid=562317
I0718 07:29:44.614922  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:51644 169.254.169.254:80 16 53596795331205 <nil>}
I0718 07:29:54.622759  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:60128 169.254.169.254:80 7 53606803104963 <nil>}
I0718 07:30:03.613327  560034 registry.go:266] calculated container id 562455 -> /init.scope -> 
I0718 07:30:03.613647  560034 registry.go:269] "ignoring without persisting" cg="/init.scope" pid=562455
I0718 07:30:03.639416  560034 registry.go:266] calculated container id 562455 -> /system.slice/sysstat-collect.service -> /system.slice/sysstat-collect.service
I0718 07:30:03.639958  560034 container.go:833] "started journald logparser" cg="/system.slice/sysstat-collect.service"
I0718 07:30:03.639989  560034 registry.go:297] "detected a new container" pid=562455 cg="/system.slice/sysstat-collect.service" id=/system.slice/sysstat-collect.service
I0718 07:30:03.647913  560034 registry.go:266] calculated container id 562456 -> / -> 
I0718 07:30:03.647953  560034 registry.go:271] "ignoring" cg="/" pid=562456
I0718 07:30:04.628973  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:40086 169.254.169.254:80 18 53616810609619 <nil>}
I0718 07:30:04.901626  560034 registry.go:266] calculated container id 562467 -> /init.scope -> 
I0718 07:30:04.901657  560034 registry.go:269] "ignoring without persisting" cg="/init.scope" pid=562467
I0718 07:30:09.559880  560034 registry.go:266] calculated container id 562584 -> / -> 
I0718 07:30:09.560728  560034 registry.go:271] "ignoring" cg="/" pid=562584
I0718 07:30:10.196987  560034 registry.go:266] calculated container id 562598 -> /system.slice/crio-conmon-0d7d01dfe6c545fa62b4641e058aec7a606585a611a59a30d166aad826e2f8de.scope -> 
I0718 07:30:10.197025  560034 registry.go:271] "ignoring" cg="/system.slice/crio-conmon-0d7d01dfe6c545fa62b4641e058aec7a606585a611a59a30d166aad826e2f8de.scope" pid=562598
I0718 07:30:10.202773  560034 registry.go:266] calculated container id 562599 -> /system.slice/crio-conmon-0d7d01dfe6c545fa62b4641e058aec7a606585a611a59a30d166aad826e2f8de.scope -> 
I0718 07:30:10.202810  560034 registry.go:271] "ignoring" cg="/system.slice/crio-conmon-0d7d01dfe6c545fa62b4641e058aec7a606585a611a59a30d166aad826e2f8de.scope" pid=562599
I0718 07:30:10.240918  560034 registry.go:266] calculated container id 562604 -> /system.slice/crio-conmon-0d7d01dfe6c545fa62b4641e058aec7a606585a611a59a30d166aad826e2f8de.scope -> 
I0718 07:30:10.240958  560034 registry.go:271] "ignoring" cg="/system.slice/crio-conmon-0d7d01dfe6c545fa62b4641e058aec7a606585a611a59a30d166aad826e2f8de.scope" pid=562604
I0718 07:30:10.292999  560034 registry.go:266] calculated container id 562605 -> /kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pode3f9262b_e1ca_4a5a_97ac_a6096bbae790.slice/crio-0d7d01dfe6c545fa62b4641e058aec7a606585a611a59a30d166aad826e2f8de.scope -> /k8s/kube-system/node-shell-339037c8-0ebd-4b3e-b3d2-7846b9ec0a5a/shell
W0718 07:30:10.293341  560034 registry.go:293] failed to create container pid=562605 cg=/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pode3f9262b_e1ca_4a5a_97ac_a6096bbae790.slice/crio-0d7d01dfe6c545fa62b4641e058aec7a606585a611a59a30d166aad826e2f8de.scope id=/k8s/kube-system/node-shell-339037c8-0ebd-4b3e-b3d2-7846b9ec0a5a/shell: no such file or directory
I0718 07:30:10.294278  560034 registry.go:266] calculated container id 562606 -> /kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pode3f9262b_e1ca_4a5a_97ac_a6096bbae790.slice/crio-0d7d01dfe6c545fa62b4641e058aec7a606585a611a59a30d166aad826e2f8de.scope -> /k8s/kube-system/node-shell-339037c8-0ebd-4b3e-b3d2-7846b9ec0a5a/shell
I0718 07:30:10.294569  560034 registry.go:297] "detected a new container" pid=562606 cg="/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pode3f9262b_e1ca_4a5a_97ac_a6096bbae790.slice/crio-0d7d01dfe6c545fa62b4641e058aec7a606585a611a59a30d166aad826e2f8de.scope" id=/k8s/kube-system/node-shell-339037c8-0ebd-4b3e-b3d2-7846b9ec0a5a/shell
I0718 07:30:14.632309  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:37234 169.254.169.254:80 19 53626814019374 <nil>}
I0718 07:30:14.951968  560034 registry.go:266] calculated container id 562707 -> / -> 
I0718 07:30:14.952285  560034 registry.go:271] "ignoring" cg="/" pid=562707
I0718 07:30:14.969404  560034 registry.go:266] calculated container id 562713 -> / -> 
I0718 07:30:14.969744  560034 registry.go:271] "ignoring" cg="/" pid=562713
I0718 07:30:24.642100  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:37680 169.254.169.254:80 17 53636823153717 <nil>}
I0718 07:30:25.871786  560034 registry.go:266] calculated container id 562802 -> / -> 
I0718 07:30:25.872093  560034 registry.go:271] "ignoring" cg="/" pid=562802
I0718 07:30:29.112245  560034 registry.go:266] calculated container id 562818 -> / -> 
I0718 07:30:29.112532  560034 registry.go:271] "ignoring" cg="/" pid=562818
I0718 07:30:33.092724  560034 registry.go:266] calculated container id 562841 -> / -> 
I0718 07:30:33.093056  560034 registry.go:271] "ignoring" cg="/" pid=562841
I0718 07:30:33.100071  560034 registry.go:266] calculated container id 562842 -> /init.scope -> 
I0718 07:30:33.100097  560034 registry.go:269] "ignoring without persisting" cg="/init.scope" pid=562842
I0718 07:30:34.649608  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:38586 169.254.169.254:80 20 53646831329736 <nil>}
I0718 07:30:39.572300  560034 registry.go:266] calculated container id 562997 -> / -> 
I0718 07:30:39.572705  560034 registry.go:271] "ignoring" cg="/" pid=562997
I0718 07:30:42.316142  560034 registry.go:266] calculated container id 563022 -> / -> 
I0718 07:30:42.316174  560034 registry.go:271] "ignoring" cg="/" pid=563022
I0718 07:30:44.655857  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:48036 169.254.169.254:80 15 53656837586265 <nil>}
I0718 07:30:47.570911  560034 registry.go:266] calculated container id 563060 -> / -> 
I0718 07:30:47.570942  560034 registry.go:271] "ignoring" cg="/" pid=563060
I0718 07:30:54.671656  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:36194 169.254.169.254:80 16 53666845281242 <nil>}
I0718 07:31:04.667505  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:47788 169.254.169.254:80 7 53676849190160 <nil>}
I0718 07:31:04.903637  560034 registry.go:266] calculated container id 563173 -> /init.scope -> 
I0718 07:31:04.903662  560034 registry.go:269] "ignoring without persisting" cg="/init.scope" pid=563173
I0718 07:31:07.796663  560034 registry.go:266] calculated container id 563190 -> / -> 
I0718 07:31:07.800098  560034 registry.go:271] "ignoring" cg="/" pid=563190
I0718 07:31:14.672564  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:60146 169.254.169.254:80 18 53686854325683 <nil>}
I0718 07:31:18.227195  560034 registry.go:266] calculated container id 563354 -> / -> 
I0718 07:31:18.227229  560034 registry.go:271] "ignoring" cg="/" pid=563354
I0718 07:31:24.676521  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:48908 169.254.169.254:80 19 53696858095000 <nil>}
I0718 07:31:28.974156  560034 registry.go:266] calculated container id 563438 -> / -> 
I0718 07:31:28.974186  560034 registry.go:271] "ignoring" cg="/" pid=563438
I0718 07:31:28.974194  560034 registry.go:198] TCP connection from unknown container {connection-open none 563438 127.0.0.1:44028 127.0.0.1:10248 3 53701153597411 <nil>}
I0718 07:31:34.404190  560034 registry.go:266] calculated container id 563475 -> / -> 
I0718 07:31:34.404533  560034 registry.go:271] "ignoring" cg="/" pid=563475
I0718 07:31:34.682938  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:40820 169.254.169.254:80 17 53706864699851 <nil>}
I0718 07:31:40.500742  560034 registry.go:266] calculated container id 563588 -> / -> 
I0718 07:31:40.500768  560034 registry.go:271] "ignoring" cg="/" pid=563588
I0718 07:31:41.709459  560034 registry.go:266] calculated container id 563596 -> / -> 
I0718 07:31:41.709491  560034 registry.go:271] "ignoring" cg="/" pid=563596
I0718 07:31:44.686715  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:43004 169.254.169.254:80 20 53716868426768 <nil>}
I0718 07:31:54.712340  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:34704 169.254.169.254:80 15 53726874919699 <nil>}
I0718 07:32:04.702327  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:47610 169.254.169.254:80 16 53736884281723 <nil>}
I0718 07:32:04.933650  560034 registry.go:266] calculated container id 563752 -> /init.scope -> 
I0718 07:32:04.933677  560034 registry.go:269] "ignoring without persisting" cg="/init.scope" pid=563752
I0718 07:32:14.711035  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:50800 169.254.169.254:80 7 53746892587023 <nil>}
I0718 07:32:18.220276  560034 registry.go:266] calculated container id 563925 -> / -> 
I0718 07:32:18.220305  560034 registry.go:271] "ignoring" cg="/" pid=563925
I0718 07:32:18.834447  560034 registry.go:266] calculated container id 563926 -> /init.scope -> 
I0718 07:32:18.834482  560034 registry.go:269] "ignoring without persisting" cg="/init.scope" pid=563926
I0718 07:32:24.722700  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:53072 169.254.169.254:80 18 53756897155744 <nil>}
I0718 07:32:34.410552  560034 registry.go:266] calculated container id 564050 -> / -> 
I0718 07:32:34.410579  560034 registry.go:271] "ignoring" cg="/" pid=564050
I0718 07:32:34.722241  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:36504 169.254.169.254:80 19 53766903929795 <nil>}
I0718 07:32:36.771758  560034 registry.go:266] calculated container id 564060 -> / -> 
I0718 07:32:36.774083  560034 registry.go:271] "ignoring" cg="/" pid=564060
I0718 07:32:44.730841  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:50208 169.254.169.254:80 17 53776912067894 <nil>}
I0718 07:32:49.193621  560034 registry.go:198] TCP connection from unknown container {connection-open none 564218 127.0.0.1:59318 127.0.0.1:10248 3 53781365834721 <nil>}
I0718 07:32:54.751490  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:50812 169.254.169.254:80 20 53786919630950 <nil>}
I0718 07:33:04.756451  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:53686 169.254.169.254:80 15 53796924874794 <nil>}
I0718 07:33:04.904386  560034 registry.go:266] calculated container id 564316 -> /init.scope -> 
I0718 07:33:04.904414  560034 registry.go:269] "ignoring without persisting" cg="/init.scope" pid=564316
I0718 07:33:14.761707  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:48048 169.254.169.254:80 16 53806928422950 <nil>}
I0718 07:33:24.771667  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:55614 169.254.169.254:80 7 53816934703039 <nil>}
I0718 07:33:32.723975  560034 registry.go:266] calculated container id 564594 -> / -> 
I0718 07:33:32.724013  560034 registry.go:271] "ignoring" cg="/" pid=564594
I0718 07:33:34.782348  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:34660 169.254.169.254:80 18 53826940807408 <nil>}
I0718 07:33:40.491193  560034 registry.go:266] calculated container id 564719 -> / -> 
I0718 07:33:40.491587  560034 registry.go:271] "ignoring" cg="/" pid=564719
I0718 07:33:41.171671  560034 registry.go:266] calculated container id 564735 -> / -> 
I0718 07:33:41.171695  560034 registry.go:271] "ignoring" cg="/" pid=564735
I0718 07:33:41.171754  560034 registry.go:266] calculated container id 564736 -> / -> 
I0718 07:33:41.171762  560034 registry.go:271] "ignoring" cg="/" pid=564736
I0718 07:33:44.771567  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:37522 169.254.169.254:80 19 53836945289088 <nil>}
I0718 07:33:54.771382  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:54692 169.254.169.254:80 17 53846952979825 <nil>}
I0718 07:33:59.356556  560034 registry.go:198] TCP connection from unknown container {connection-open none 564845 127.0.0.1:52514 127.0.0.1:10248 3 53851535472044 <nil>}
I0718 07:34:02.656025  560034 registry.go:266] calculated container id 564860 -> / -> 
I0718 07:34:02.656350  560034 registry.go:271] "ignoring" cg="/" pid=564860
I0718 07:34:04.782096  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:46188 169.254.169.254:80 20 53856963341134 <nil>}
I0718 07:34:04.900701  560034 registry.go:266] calculated container id 564888 -> /init.scope -> 
I0718 07:34:04.901017  560034 registry.go:269] "ignoring without persisting" cg="/init.scope" pid=564888
I0718 07:34:14.790065  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:42904 169.254.169.254:80 15 53866971757514 <nil>}
I0718 07:34:18.437000  560034 registry.go:266] calculated container id 565070 -> / -> 
I0718 07:34:18.437027  560034 registry.go:271] "ignoring" cg="/" pid=565070
I0718 07:34:18.852511  560034 registry.go:266] calculated container id 565071 -> /init.scope -> 
I0718 07:34:18.852548  560034 registry.go:269] "ignoring without persisting" cg="/init.scope" pid=565071
I0718 07:34:24.799959  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:54764 169.254.169.254:80 16 53876978691086 <nil>}
I0718 07:34:34.803969  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:55510 169.254.169.254:80 7 53886985646259 <nil>}
I0718 07:34:34.833397  560034 registry.go:266] calculated container id 565197 -> / -> 
I0718 07:34:34.833907  560034 registry.go:271] "ignoring" cg="/" pid=565197
I0718 07:34:44.811415  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:55998 169.254.169.254:80 18 53896993095794 <nil>}
I0718 07:34:54.820054  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:49372 169.254.169.254:80 19 53907000787290 <nil>}
I0718 07:35:04.825887  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:37186 169.254.169.254:80 17 53917007557382 <nil>}
I0718 07:35:04.901989  560034 registry.go:266] calculated container id 565467 -> /init.scope -> 
I0718 07:35:04.902371  560034 registry.go:269] "ignoring without persisting" cg="/init.scope" pid=565467
I0718 07:35:09.666810  560034 registry.go:266] calculated container id 565502 -> / -> 
I0718 07:35:09.666843  560034 registry.go:271] "ignoring" cg="/" pid=565502
I0718 07:35:14.832908  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:40500 169.254.169.254:80 20 53927014529835 <nil>}
I0718 07:35:18.306852  560034 registry.go:266] calculated container id 565649 -> / -> 
I0718 07:35:18.306881  560034 registry.go:271] "ignoring" cg="/" pid=565649
I0718 07:35:18.314845  560034 registry.go:266] calculated container id 565650 -> / -> 
I0718 07:35:18.314868  560034 registry.go:271] "ignoring" cg="/" pid=565650
I0718 07:35:18.826083  560034 registry.go:266] calculated container id 565651 -> /init.scope -> 
I0718 07:35:18.826111  560034 registry.go:269] "ignoring without persisting" cg="/init.scope" pid=565651
I0718 07:35:24.852683  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:33996 169.254.169.254:80 15 53937021944610 <nil>}
I0718 07:35:29.558772  560034 registry.go:266] calculated container id 565731 -> / -> 
I0718 07:35:29.559116  560034 registry.go:271] "ignoring" cg="/" pid=565731
I0718 07:35:29.559126  560034 registry.go:198] TCP connection from unknown container {connection-open none 565731 127.0.0.1:44322 127.0.0.1:10248 3 53941738634128 <nil>}
I0718 07:35:34.573550  560034 registry.go:266] calculated container id 565771 -> / -> 
I0718 07:35:34.573579  560034 registry.go:271] "ignoring" cg="/" pid=565771
I0718 07:35:34.847551  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:52838 169.254.169.254:80 16 53947028854663 <nil>}
I0718 07:35:44.871483  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:60206 169.254.169.254:80 7 53957037869765 <nil>}
I0718 07:35:49.516353  560034 registry.go:266] calculated container id 565942 -> / -> 
I0718 07:35:49.516401  560034 registry.go:271] "ignoring" cg="/" pid=565942
I0718 07:35:54.863024  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:46842 169.254.169.254:80 18 53967044819058 <nil>}
I0718 07:36:01.628366  560034 container.go:817] "started varlog logparser" cg="/system.slice/crond.service" log="/var/log/uptrack.log"
I0718 07:36:04.871462  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:56870 169.254.169.254:80 19 53977051941112 <nil>}
I0718 07:36:04.900632  560034 registry.go:266] calculated container id 566050 -> /init.scope -> 
I0718 07:36:04.900658  560034 registry.go:269] "ignoring without persisting" cg="/init.scope" pid=566050
I0718 07:36:14.873593  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:41330 169.254.169.254:80 17 53987055309667 <nil>}
I0718 07:36:18.485210  560034 registry.go:266] calculated container id 566233 -> / -> 
I0718 07:36:18.485239  560034 registry.go:271] "ignoring" cg="/" pid=566233
I0718 07:36:23.626143  560034 registry.go:266] calculated container id 566277 -> / -> 
I0718 07:36:23.626560  560034 registry.go:271] "ignoring" cg="/" pid=566277
I0718 07:36:24.879604  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:57170 169.254.169.254:80 20 53997061322755 <nil>}
I0718 07:36:34.657070  560034 registry.go:266] calculated container id 566354 -> / -> 
I0718 07:36:34.657404  560034 registry.go:271] "ignoring" cg="/" pid=566354
I0718 07:36:34.896404  560034 registry.go:198] TCP connection from unknown container {connection-open none 4255 20.0.10.77:54368 169.254.169.254:80 15 54007068056366 <nil>}



Multi Cluster pattern?

We have multiple Kubernetes clusters. Is it possible to use a single Coroot instance across clusters, or is there another way to make Coroot work with multiple clusters?

cpu throttling calculation

Hi guys!

I see in your documentation that you count throttling (screenshot omitted).
I see that Coroot just has the container_resources_cpu_throttled_seconds_total metric, without a time period.

Coroot uses the [container_resources_cpu_throttled_seconds_total](https://coroot.com/docs/metric-exporters/node-agent/metrics#container_resources_cpu_throttled_seconds_total) metric to find out how long each container has been throttled. If this metric for the related containers correlates with the application's SLIs (Service Level Indicators), the lack of CPU time is likely caused by throttling.

I searched your open-source code but couldn't find the formula you use to calculate the throttling percentage for a container. What is it?
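A straightforward reading of the counter yields a percentage directly: since container_resources_cpu_throttled_seconds_total counts the seconds a container has spent throttled, its per-second rate is the fraction of wall-clock time spent throttled, e.g. `rate(container_resources_cpu_throttled_seconds_total[1m]) * 100` in PromQL. Below is a minimal Go sketch of that arithmetic; it is a plausible interpretation of the metric, not a formula confirmed from Coroot's source:

```go
package main

import "fmt"

// sample is one scrape of container_resources_cpu_throttled_seconds_total.
type sample struct {
	ts      float64 // unix timestamp, seconds
	seconds float64 // cumulative seconds the container was throttled
}

// throttledPercent returns the share of wall-clock time spent throttled
// between two samples, as a percentage: the counter's delta divided by
// the elapsed time.
func throttledPercent(prev, cur sample) float64 {
	dt := cur.ts - prev.ts
	if dt <= 0 || cur.seconds < prev.seconds {
		return 0 // bad ordering or counter reset: report nothing for this window
	}
	return (cur.seconds - prev.seconds) / dt * 100
}

func main() {
	prev := sample{ts: 1000, seconds: 12.0}
	cur := sample{ts: 1060, seconds: 18.0} // 6s throttled over a 60s window
	fmt.Printf("throttled: %.1f%%\n", throttledPercent(prev, cur)) // throttled: 10.0%
}
```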

k8s clusters with a large number of IPs per node emit too many metrics

Slack ref: https://coroot-community.slack.com/archives/C0443R3BW2G/p1677599077846269

Since IPVS is used, all Kubernetes service IPs get attached to the kube-ipvs0 interface on each node.

As there are always some services running in the host network (ssh, kubelet, haproxy, etc.), the agent also generates a metric for each IP attached to that interface.

5,000 services = 5,000 series for each host-network service, on each node.

On a cluster with ~30 nodes and ~3.5k pods, this amounts to around 1.5M time series in the container_net_tcp_listen_info metric.
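For scale, the arithmetic is roughly consistent with about ten host-network listeners per node (an assumption; the exact count isn't stated): 5,000 service IPs × ~10 listeners × 30 nodes ≈ 1.5M series. If the metric isn't needed at all, one blunt mitigation is a Prometheus metric_relabel_configs rule that drops container_net_tcp_listen_info by __name__; filtering more selectively would require knowing the metric's exact label set.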

Coroot doesn't see the node-agent

Hi, we still have the same clusters I described before: #20 (comment)

Right now we see an issue between Coroot and the node-agent. We installed the node-agent via its Helm chart. Our Prometheus scrapes the node-agent, but Coroot shows the pod list with an error (screenshots omitted).

Which metric should I check to make sure node-agent metrics are in the Prometheus database, so I can verify that scraping is OK and that the node-agent is visible to Coroot?
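Two checks that should narrow this down, assuming the default job naming from the node-agent Helm chart (the job label here is a guess): verify that `up{job="coroot-node-agent"}` is 1 for every node, and that node-agent metrics such as container_resources_cpu_throttled_seconds_total or container_net_tcp_listen_info return series in Prometheus. If the series exist but Coroot still shows the error, the problem is more likely in Coroot's Prometheus connection settings than in scraping.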

coroot server problem

I have experienced the following.

A goroutine panics.

How can I fix this problem?

W0710 11:32:23.389096 1 jvm.go:17] only one JVM per instance is supported so far, will keep only azkaban.execapp.AzkabanExecutorServer -conf bin/../conf
W0710 11:32:23.389106 1 jvm.go:17] only one JVM per instance is supported so far, will keep only org.apache.zookeeper.ZooKeeperMain -server 127.0.0.1:2181
W0710 11:32:23.389127 1 jvm.go:17] only one JVM per instance is supported so far, will keep only org.apache.zookeeper.server.quorum.QuorumPeerMain /usr/bin/../etc/zookeeper/zoo.cfg
panic: runtime error: index out of range [1] with length 1

goroutine 113 [running]:
github.com/coroot/coroot/constructor.getInstanceAndContainer(0xc000228000, 0xc000181e40, 0x11493de?, {0x0, 0x0})
    /tmp/src/constructor/containers.go:27 +0xa46
github.com/coroot/coroot/constructor.loadContainers(0xc000228000, 0xc002bfc210?, 0xc00a0f11a8?, 0xc00a0f1178?)
    /tmp/src/constructor/containers.go:76 +0x1bba
github.com/coroot/coroot/constructor.(*Constructor).LoadWorld.func10()
    /tmp/src/constructor/constructor.go:109 +0x2c
github.com/coroot/coroot/constructor.(*Profile).stage(0x416345?, {0x114bd28?, 0x0?}, 0x0?)
    /tmp/src/constructor/constructor.go:59 +0x132
github.com/coroot/coroot/constructor.(*Constructor).LoadWorld(0xc002bfc0c0, {0x133fa40, 0xc00013a1e0}, 0x64a60f40, 0x64a61d32, 0x1e, 0x0)
    /tmp/src/constructor/constructor.go:109 +0x7f9
github.com/coroot/coroot/cache.(*recordingRulesProcessor).QueryRange(0xc000488100, {0x133fa40, 0xc00013a1e0}, {0xc000a69980, 0x25}, 0x0?, 0x64a61d32, 0x41fce5?)
    /tmp/src/cache/updater.go:302 +0x299
github.com/coroot/coroot/cache.(*Cache).download(0x100ccc0?, 0x64abec44, {0x133b170, 0xc000488100}, {0xc00106e940, 0x8}, 0x1e, 0xc002233a80)
    /tmp/src/cache/updater.go:183 +0x1c6
github.com/coroot/coroot/cache.(*Cache).processRecordingRules(0xc000257e00, 0xc00221ac60?, 0xc0009ffe10, 0x10?, 0x2?)
    /tmp/src/cache/updater.go:282 +0x2e6
github.com/coroot/coroot/cache.(*Cache).updaterWorker(0xc000257e00, 0xc0006fc000?, {0xc00106e940, 0x8}, 0x1e)
    /tmp/src/cache/updater.go:172 +0x1290
created by github.com/coroot/coroot/cache.(*Cache).updater
    /tmp/src/cache/updater.go:47 +0x38c
Stream closed EOF for coroot/coroot-559549d5b5-xcgcj (coroot)
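The panic itself is an unchecked slice index: constructor/containers.go:27 reads element [1] of a slice that has only one element, presumably after splitting a container ID into segments. A hypothetical Go sketch of the kind of guard that avoids this (the function name and ID format here are illustrative assumptions, not Coroot's actual code):

```go
package main

import (
	"fmt"
	"strings"
)

// splitInstanceAndContainer illustrates the defensive pattern: validate the
// number of segments before indexing, instead of assuming parts[1] exists.
func splitInstanceAndContainer(id string) (instance, container string, err error) {
	parts := strings.Split(strings.Trim(id, "/"), "/")
	if len(parts) < 2 {
		// Fall back with an error instead of panicking with
		// "index out of range [1] with length 1".
		return "", "", fmt.Errorf("unexpected container id format: %q", id)
	}
	return parts[len(parts)-2], parts[len(parts)-1], nil
}

func main() {
	for _, id := range []string{"/k8s/kube-system/node-shell/shell", "/standalone"} {
		instance, container, err := splitInstanceAndContainer(id)
		if err != nil {
			fmt.Println("skipping:", err)
			continue
		}
		fmt.Println(instance, container)
	}
}
```

The same "index out of range" signature shows up in the report below, so both issues likely share this root cause.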

High memory usage and OOM loop

Hi,

we are experiencing high memory usage and an OOM-kill loop while running Coroot in our GKE cluster. The Coroot container tries to allocate up to 14GiB of memory before it is killed. Here is the complete log of the container before it gets killed:

I0126 13:44:42.589518 1 main.go:45] version: 0.12.1, url-base-path: /, read-only: false
I0126 13:44:42.589639 1 db.go:38] using postgres database
I0126 13:44:43.715195 1 cache.go:130] cache loaded from disk in 1.106531315s
I0126 13:44:43.715499 1 compaction.go:81] compaction worker started
I0126 13:44:43.716125 1 main.go:142] listening on :8080
I0126 13:44:44.716784 1 updater.go:54] worker iteration for krxa44eq
I0126 13:44:53.716464 1 compaction.go:92] compaction iteration started

Here is the graph of memory usage (graph omitted).

We set an 8GiB memory limit on the Coroot container.

Before we set the memory limit, the container allocated up to 24GiB of memory.

We tried with both SQLite and PostgreSQL and there were no differences in behavior.

Our GKE cluster version is v1.24.5-gke.600.

We have 22 Nodes, 154 Deployments, 25 DaemonSets, and 12 StatefulSets which in total have 857 Pods.

"Index out of range" problem

In our 0.17.6 installation on Kubernetes v1.24 we occasionally run into this issue in the Coroot pod logs:

coroot.docx

This brings down the entire app.

After today's incident we removed and redeployed the Helm chart, but what is the root cause of this?
