
getsentry / sentry-kubernetes


Kubernetes event reporter for Sentry

License: Apache License 2.0

Languages: Go 97.28%, Makefile 1.87%, Dockerfile 0.59%, Shell 0.26%
crash-reporting kubernetes monitoring sentry tag-non-production

sentry-kubernetes's Introduction

sentry-kubernetes

Kubernetes event reporter for Sentry.


⚠️ Note: this is BETA software that is still in development and may contain bugs. Use it at your own risk in production environments.

⚠️ Note: this is a new Go-based implementation of the agent. If you're looking for the documentation on the legacy Python-based implementation, it was moved here.


Errors and warnings in Kubernetes often go unnoticed by operators. Even when they are checked, they are hard to read and understand in the context of everything else going on in the cluster. sentry-kubernetes is a small container you launch inside your Kubernetes cluster that will send errors and warnings to Sentry, where they will be cleanly presented and intelligently grouped. Typical Sentry features such as notifications can then be used to improve operator and developer visibility.

Configuration

  • SENTRY_DSN - Sentry DSN that will be used by the agent.

  • SENTRY_ENVIRONMENT - Sentry environment that will be used for reported events.

  • SENTRY_K8S_WATCH_NAMESPACES - a comma-separated list of namespaces that will be watched. By default, only the default namespace is watched. To watch all namespaces, set the variable to the value __all__.

  • SENTRY_K8S_WATCH_HISTORICAL - if set to 1, all existing (old) events will also be reported. Default is 0 (old events will not be reported).

  • SENTRY_K8S_CLUSTER_CONFIG_TYPE - the type of the cluster initialization method. Allowed options: auto, in-cluster, out-cluster. Default is auto.

  • SENTRY_K8S_KUBECONFIG_PATH - filesystem path to the kubeconfig configuration that will be used to connect to the cluster. Not used if SENTRY_K8S_CLUSTER_CONFIG_TYPE is set to in-cluster.

  • SENTRY_K8S_LOG_LEVEL - logging level. Can be trace, debug, info, warn, error, disabled. Default is info.

  • SENTRY_K8S_MONITOR_CRONJOBS - if set to 1, enables Sentry Crons integration for CronJob objects. Disabled by default.

  • SENTRY_K8S_CUSTOM_DSNS - if set to 1, allows a custom DSN to be specified in object annotations with the key k8s.sentry.io/dsn, which takes precedence over SENTRY_DSN. Disabled by default.
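
For illustration, a hypothetical Deployment env snippet that sets some of these variables might look like the following (the Secret name and values here are placeholders, not prescribed by the agent):

env:
  - name: SENTRY_DSN
    valueFrom:
      secretKeyRef:
        name: sentry        # hypothetical Secret holding the DSN
        key: dsn
  - name: SENTRY_ENVIRONMENT
    value: production
  - name: SENTRY_K8S_WATCH_NAMESPACES
    value: __all__
  - name: SENTRY_K8S_LOG_LEVEL
    value: info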

Adding custom tags

To add a custom tag to all events produced by the agent, set an environment variable whose name is prefixed with SENTRY_K8S_GLOBAL_TAG_.

Example:

SENTRY_K8S_GLOBAL_TAG_cluster_name=main-cluster will add the cluster_name=main-cluster tag to every outgoing Sentry event.

Integrations

  • SENTRY_K8S_INTEGRATION_GKE_ENABLED - if set to 1, enable the GKE integration. Default is 0 (disabled).

    The GKE integration will attempt to fetch GKE/GCE metadata from the GCP metadata server, such as project name, cluster name, and cluster location.

Client-side Filters

If you don't want to report certain kinds of events to Sentry, you can configure client-side filters.

  • Event Reason: filtering by Event.Reason field.

    SENTRY_K8S_FILTER_OUT_EVENT_REASONS is a comma-separated set of event Reason values. If the event's Reason is in that list, the event will be dropped. By default, the following reasons are filtered out (muted): DockerStart, KubeletStart, NodeSysctlChange, ContainerdStart.

  • Event Source: filtering by Event.Source.Component field.

    SENTRY_K8S_FILTER_OUT_EVENT_SOURCES is a comma-separated set of Source Component values (examples include kubelet, default-scheduler, job-controller, kernel-monitor). If the event's Source Component is in that list, the event will be dropped. By default, no events are filtered out by Source Component.
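
For example, to extend the default Reason list with BackOff and to mute everything coming from kernel-monitor (illustrative values, not defaults):

SENTRY_K8S_FILTER_OUT_EVENT_REASONS=DockerStart,KubeletStart,NodeSysctlChange,ContainerdStart,BackOff
SENTRY_K8S_FILTER_OUT_EVENT_SOURCES=kernel-monitor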

Custom DSN Support

By default, the Sentry project that the agent sends events to is specified by the SENTRY_DSN environment variable. However, if the SENTRY_K8S_CUSTOM_DSNS flag is enabled, a Kubernetes object manifest may specify a custom DSN that takes precedence over the global DSN. To do so, specify the custom DSN in the annotations using the k8s.sentry.io/dsn key as follows:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: cronjob-basic-success
  labels:
    type: test-pod
  annotations:
    k8s.sentry.io/dsn: "<Insert DSN here>"
spec:
  schedule: "* * * * *"
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            type: test-pod
            run: cronjob-basic-success
        spec:
          containers:
            - name: hello
              image: busybox:1.28
              imagePullPolicy: IfNotPresent
              command:
                - /bin/sh
                - -c
                - date; echo Hello!; sleep 3
          restartPolicy: OnFailure

Integration with Sentry Crons

A useful feature offered by Sentry is Crons Monitoring. It can be enabled by setting the SENTRY_K8S_MONITOR_CRONJOBS environment variable to 1. The agent is compatible with Sentry Crons and can automatically upsert Crons monitors for CronJob objects in a Sentry project.

The agent automatically creates a Crons monitor for any detected CronJob, using the CronJob's name as the monitor slug. Additionally, the schedule is taken automatically from the CronJob manifest.

Moreover, events from any resource object (e.g. pod, job, event) that is associated with a CronJob will carry the corresponding monitor slug as metadata. This also allows grouping events by Crons monitor in Issues.

Crons Example

The following manifest defines a CronJob that sometimes fails and completes with variable duration:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: cronjob-late-maybe-error
  labels:
    type: test-pod
spec:
  schedule: "* * * * *"
  jobTemplate:
    spec:
      backoffLimit: 0
      template:
        metadata:
          labels:
            type: test-pod
            run: cronjob-late-maybe-error
        spec:
          containers:
            - name: hello
              image: busybox:1.28
              imagePullPolicy: IfNotPresent
              command:
                - /bin/sh
                - -c
                - |
                  MINWAIT=0
                  MAXWAIT=60
                  sleep $((MINWAIT+RANDOM % (MAXWAIT-MINWAIT)))
                  sleep 3
                  r=$((RANDOM%2))
                  if [ $r -eq 0 ]; then echo Hello!; else exit 1; fi
          restartPolicy: Never
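
To try it, save the manifest to a file and apply it to your cluster (the path is a placeholder):

kubectl apply -f <path-to-manifest>.yaml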

In the Sentry Crons tab of the corresponding project, we may see the following:

ExampleCronsMonitor

Local Development (out of cluster configuration)

  1. Install necessary dependencies to run Kubernetes locally

    a. Install Docker and start the Docker daemon

      https://docs.docker.com/engine/install/

      Docker is a service that manages containers and is used by Kubernetes to create nodes (kind actually creates Kubernetes “nodes” as Docker containers rather than VMs)

    b. Install kind and add it to PATH

      https://kind.sigs.k8s.io/docs/user/quick-start/

      kind is a tool for running local Kubernetes clusters, and we use it here for testing. Its container runtime is containerd, the same runtime now used by Docker.

    c. Install kubectl, the command-line tool we use to interact with Kubernetes clusters run locally by kind

      https://kubernetes.io/docs/tasks/tools/

  2. Run a Kubernetes cluster locally for development purposes

    a. Create a Kubernetes cluster with kind using the command below (the cluster name is “kind” by default)

    kind create cluster

    b. Output information about the created cluster using the following command (replace <cluster name> with kind if you used the default)

    kubectl cluster-info --context kind-<cluster name>

    You should see an output similar to the following:

    Kubernetes control plane is running at https://127.0.0.1:61502
    CoreDNS is running at https://127.0.0.1:61502/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
  3. Run the sentry-kubernetes Go module (which must be performed after the Kubernetes cluster is already running because the module requires the kubeconfig file)

    a. Clone the sentry-kubernetes repository

    git clone https://github.com/getsentry/sentry-kubernetes.git

    b. Set an environment variable named SENTRY_DSN to a valid Sentry DSN (https://docs.sentry.io/product/sentry-basics/concepts/dsn-explainer/), for example as shown below
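
    For example (placeholder values):

    export SENTRY_DSN="https://<key>@<organization-id>.ingest.sentry.io/<project-id>"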

    c. At the root of the repository directory, build the Go module with the command

    make build

    d. Run the module outside of the k8s cluster by executing the command

    go run .

    which starts the process; it automatically detects the cluster configuration and begins watching for events

  4. Add error-producing pods to test event capturing

    a. Create resources (e.g. pods or deployments) using existing manifests meant to produce errors to be captured by sentry-kubernetes. For example, we can apply the manifest for a pod that exhibits crash loop behavior with the command

    kubectl apply -f ./k8s/errors/pod-crashloop.yaml

    b. Check that the pod is created using the command

    kubectl get pods

    which should produce an output similar to the following:

    NAME            READY   STATUS             RESTARTS       AGE
    pod-crashloop   0/1     CrashLoopBackOff   32 (33s ago)   3h10m

    Notice that the Status is CrashLoopBackOff, which is the intended state for our purpose

    c. Check that the sentry-kubernetes process captures this crash loop error by looking for output similar to the following:

    [Sentry] 2023/11/08 12:07:53 Using release from Git: abc123
    12:07PM INF Auto-detecting cluster configuration...
    12:07PM WRN Could not initialize in-cluster config
    12:07PM INF Detected out-of-cluster configuration
    12:07PM INF Running integrations...
    12:07PM INF Watching events starting from: Wed, 08 Nov 2023 12:07:53 -0800 namespace=default watcher=events
    12:07PM INF CronJob monitoring is disabled namespace=default watcher=events
    [Sentry] 2023/11/08 12:09:27 Sending error event [w0dc9c22094d7rg9b27afabc868e32] to o4506191942320128.ingest.sentry.io project: 4506191948087296
    [Sentry] 2023/11/08 12:10:57 Sending error event [4808b623f0eb446eac0eb6c5f0a43681] to o4506191942320128.ingest.sentry.io project: 4506191948087296

    d. Check the Issues tab of the corresponding Sentry project to confirm that the captured events appear, similar to the example below:

    ExampleEvent

Caveats

  • When the same event (for example, a failed readiness check) happens multiple times, Kubernetes might not report each occurrence individually; instead it may combine them and send them with some backoff. The event message in that case is prefixed with the string "(combined from similar events)", which we currently strip. As far as we know, there is no way to disable this batching behaviour.
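
    A sketch of that stripping in Go (illustrative; not necessarily the agent's exact code):

    import "strings"

    // stripCombinedPrefix removes the prefix Kubernetes adds to combined events.
    func stripCombinedPrefix(msg string) string {
            return strings.TrimPrefix(msg, "(combined from similar events) ")
    }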

Potential Improvements

  • For pod-related events: fetch the last log lines and display them as breadcrumbs or a stacktrace (see the sketch below).
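
    A minimal sketch of how this could be done with client-go (hypothetical; not part of the agent today):

    import (
            "bufio"
            "context"
            "fmt"

            corev1 "k8s.io/api/core/v1"
            "k8s.io/client-go/kubernetes"
    )

    // lastLogLines returns up to `tail` recent log lines from a pod's container.
    func lastLogLines(ctx context.Context, clientset *kubernetes.Clientset, namespace, pod, container string, tail int64) ([]string, error) {
            req := clientset.CoreV1().Pods(namespace).GetLogs(pod, &corev1.PodLogOptions{
                    Container: container,
                    TailLines: &tail,
            })
            stream, err := req.Stream(ctx)
            if err != nil {
                    return nil, fmt.Errorf("opening log stream: %w", err)
            }
            defer stream.Close()

            var lines []string
            scanner := bufio.NewScanner(stream)
            for scanner.Scan() {
                    lines = append(lines, scanner.Text())
            }
            return lines, scanner.Err()
    }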

sentry-kubernetes's People

Contributors

bmckerry, bretthoerner, dependabot[bot], derom, electron0zero, gianrubio, hazat, jiahui-zhang-20, kfyhn, mikesplain, mwarkentin, nohant, q3aiml, rhcarvalho, squiddy, tdegiacinto, tonyo, vdboor


sentry-kubernetes's Issues

Can the clickhouse-init job be executed during the helm upgrade phase?

As shown below, it will drop all tables with the _dist suffix and recreate them. Will this permanently delete the existing data in ClickHouse?

        command:
          - /bin/bash
          - -ec
          - |-
            check_readiness() {
              local host="$1"
              local port="$2"
              local count="$3"
              until [ $count -le 0 ] || clickhouse-client --database=default --host=$host --port=$port --query="SELECT 1;" > /dev/null;
              do
                echo "Waiting for clickhouse to be ready..."
                count="$((count-1))"
                sleep 1
              done
            };
            echo "clickhouse-init started";
            check_readiness "sentry-clickhouse.infra" "9000" "5";

            for tbl in discover errors groupassignee groupedmessage outcomes_hourly migrations outcomes_mv_hourly outcomes_raw sentry sessions_hourly sessions_hourly_mv sessions_raw transactions; do
              clickhouse-client --user default --password "" --database=default --host=sentry-clickhouse.infra --port=9000 --query="DROP TABLE IF EXISTS ${tbl}_dist";
              clickhouse-client --user default --password "" --database=default --host=sentry-clickhouse.infra --port=9000 --query="CREATE TABLE ${tbl}_dist AS ${tbl}_local ENGINE = Distributed('test_shard_localhost', 'default', ${tbl}_local, rand())";
            done

            echo "clickhouse-init finished"

Missing events

Hi,

we are running this from the helm charts and it's working fine, except that we only get some of the events in our cluster. For example, this event that I see in our Rancher interface is not reported to Sentry.

FailedMount: MountVolume.SetUp failed for volume "kong-kong-token-g78rj" : secret "kong-kong-token-g78rj" not found

we also get the errors from #32 but the times don't match so I guess it's a different issue.

Timezone offset

I'm experimenting with this on a baremetal cluster whose nodes have a timezone configured. (UTC +1, Europe/Berlin)

The issue I'm seeing is that events get added in Sentry with a delay of that one hour (it realizes the event is an hour old, though).

Is this something that could be fixed by configuration, or is UTC a requirement for the kubernetes nodes?

Gather POD logs output

It seems this approach currently only collects "events", like:
message
Back-off restarting failed container

without the actual pod log output that explains an error. Any thoughts on how this can be achieved?

yaml:

---
# Source: sentry-kubernetes/templates/serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:     
    app: sentry-kubernetes
    heritage: Helm
    release: sentry-kubernetes
    chart: sentry-kubernetes-0.3.2
  name: sentry-kubernetes
---
# Source: sentry-kubernetes/templates/secret.yaml
apiVersion: v1
kind: Secret
metadata:
  labels:     
    app: sentry-kubernetes
    heritage: Helm
    release: sentry-kubernetes
    chart: sentry-kubernetes-0.3.2
  name: sentry-kubernetes
type: Opaque
data:
  sentry.dsn: "..."
---
# Source: sentry-kubernetes/templates/clusterrole.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:     
    app: sentry-kubernetes
    heritage: Helm
    release: sentry-kubernetes
    chart: sentry-kubernetes-0.3.2
  name: sentry-kubernetes
rules:
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - get
      - list
      - watch
---
# Source: sentry-kubernetes/templates/clusterrolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:     
    app: sentry-kubernetes
    heritage: Helm
    release: sentry-kubernetes
    chart: sentry-kubernetes-0.3.2
  name: sentry-kubernetes
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: sentry-kubernetes
subjects:
  - kind: ServiceAccount
    name: sentry-kubernetes
    namespace: default
---
# Source: sentry-kubernetes/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:     
    app: sentry-kubernetes
    heritage: Helm
    release: sentry-kubernetes
    chart: sentry-kubernetes-0.3.2
  name: sentry-kubernetes
spec:
  replicas: 
  selector:
    matchLabels:
      app: sentry-kubernetes
  template:
    metadata:
      annotations:
        checksum/secrets: ...
      labels:
        app: sentry-kubernetes
        release: sentry-kubernetes
    spec:
      containers:
      - name: sentry-kubernetes
        image: "getsentry/sentry-kubernetes:latest"
        imagePullPolicy: Always
        env:
          - name: DSN
            valueFrom:
              secretKeyRef:
                name: sentry-kubernetes
                key: sentry.dsn
          
          
          
        resources:
          {}
      serviceAccountName: sentry-kubernetes

Generated with:
helm template sentry-kubernetes sentry/sentry-kubernetes --set sentry.dsn=https://... > exported-sentry-kubernetes.yaml

Provide kubernetes role for RBAC

As of Kubernetes 1.8+, RBAC is enabled by default. You'll need the following setup to allow the program to access the events:

kubectl create sa sentry-kubernetes
kubectl create clusterrole sentry-kubernetes --verb=get,list,watch --resource=events
kubectl create clusterrolebinding sentry-kubernetes --clusterrole=sentry-kubernetes --user=sentry-kubernetes

kubectl run sentry-kubernetes \
  --image bretthoerner/sentry-kubernetes \
  --serviceaccount=sentry-kubernetes \
  --env="DSN=$YOUR_DSN"

When you add --dry-run -o yaml to all commands, you'll get the .yml definition files.

Please provide an official docker image

I would like to re-package this project in a Helm chart, but I'm missing an official image from Sentry. Is it possible for you to provide this image?

Related to #1

Limit error forwarding to specific Namespaces

Is there a way to limit the forwarding of errors to particular namespace(s)?

i.e.
I have a sandbox namespace that I do not care to send errors into sentry for. Can I filter this namespace out?

PagerDuty not installing

Environment

K8s
What version are you running? Etc.
22.11.0

Steps to Reproduce

Install using https://develop.sentry.dev/integrations/pagerduty/#enable-the-integration-in-sentry
Looking at cron I see the line in /etc/sentry/config.yaml:
pagerduty.app-id: "PQO???"

I created the app according to the documentation.
I added pagerduty.app-id to values.yaml:

config:
  configYml:
    pagerduty.app-id: "PQO???"

I applied the change.

pagerduty shows as not installed.

Looking at the logs for the web pod I see:
00:07:25 [INFO] sentry.access.api: api.access (method='GET' view='sentry.web.frontend.react_page.ReactPageView' response=200 user_id='5' is_app='None' token_type='None' is_frontend_request='True' organization_id='1' auth_id='None' path='/settings/sentry/integrations/pagerduty/' caller_ip='10.4.59.25' user_agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36' rate_limited='False' rate_limit_category='None' request_duration_seconds=0.12535500526428223 rate_limit_type='DNE' concurrent_limit='None' concurrent_requests='None' reset_time='None' group='None' limit='None' remaining='None')
00:07:27 [INFO] sentry.access.api: api.access (method='GET' view='sentry.web.frontend.react_page.ReactPageView' response=200 user_id='5' is_app='None' token_type='None' is_frontend_request='True' organization_id='1' auth_id='None' path='/settings/sentry/integrations/pagerduty/' caller_ip='10.4.59.25' user_agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36' rate_limited='False' rate_limit_category='None' request_duration_seconds=0.0899038314819336 rate_limit_type='DNE' concurrent_limit='None' concurrent_requests='None' reset_time='None' group='None' limit='None' remaining='None')

What might I have missed? What pod should I look at to see errors setting up pagerduty integration?


Option to add custom tags like pod.Labels, Option to set SampleRate and set Event Message

Option to add custom tags like pod.Labels: We make use of labels to identify resources belonging to applications and/or teams. It would be really useful to have those as tags in sentry to filter by them. So I would propose adding labels as tags.

Option to set SampleRate: There should be an option to specify the SampleRate.

Option to set Event Message: This helps in grouping.

To fulfil my requirements, I made changes by forking this repo: master...rajesh-dhakad:sentry-kubernetes:master

But it would be a great help if this could be included in the parent repo.

Support status: looking for community maintainers

Disclaimer: this project is not officially maintained by Sentry.

The code is under the getsentry GitHub organization because it was created by a former employee as an internal Hack Week proof-of-concept.
Expect no support, expect things to be broken in unimaginable ways, use at your own risk.

There are a few alternatives developed by the community:

Extracted from discussion in #7 (comment).

Would you like to contribute bringing this project back to life? We're looking for project maintainers from the Sentry community. Please drop a comment below if you'd like to help.

Missing maintainers on incubator/sentry-kubernetes helm chart

I'm trying to expose the LOG_LEVEL env var introduced in 108ae63 in the incubator/sentry-kubernetes helm chart with my PR helm/charts#12060.

The PR seems to fail CI because there are no maintainers configured for the chart. I'd be happy to help maintain the chart, but my track record in the Sentry community is nonexistent. If you let me know who to add, I'd be happy to prepare a PR that fixes the failing CI caused by the lack of configured maintainers on the chart.

I'm raising this here since I'm assuming the issue won't surface by just creating an issue on the helm charts repo.

Memory leak

sentry-kubernetes on our two clusters leaks memory; at least, memory usage has only gone up during a few weeks of usage:

[screenshot: memory usage graph over several weeks]

We have RBAC enabled and have given the sentry-kubernetes service account the view ClusterRole.

As a workaround, we've now set the following resource requests/limits for the sentry-kubernetes container:

resources:
  requests:
    memory: 75Mi
    cpu: 5m
  limits:
    memory: 100Mi
    cpu: 30m

We expect these limits to restart the container every few days.

add log_level via environment variable

Add support for providing log_level via environment variable in addition to --log-level.
When using the image in the kubernetes helm chart, there is no easy way to do this.
If the docker image recognizes LOG_LEVEL, then it's much easier to propose a PR to set that via helm chart values.

Cut a new image

Cut a new sentry-kubernetes image. The upstream raven client includes DSN-parsing fixes that we need in order to parse our public DSN.

Affected Object Labels as Tags

We make use of labels to identify resources belonging to applications and/or teams. It would be really useful to have those as tags in sentry to filter by them.

So I would propose adding labels as tags (possibly filtered by a whitelist). I'd be happy to make a PR implementing this but I'd like to get some feedback on the idea before I make a PR that gets rejected.

Segmentation fault in cron job checkin

Environment

Currently the latest docker image: ghcr.io/getsentry/sentry-kubernetes:ff2386f6b1176a36ebfcdb96222ae5840dac8cf1
AWS EKS, version: 1.24

Steps to Reproduce

  1. Run the sentry-kubernetes image in a Kubernetes cluster with cron job monitoring enabled
  2. The container goes into CrashLoopBackOff state

I am confident the pod is configured correctly. We are running the same helm chart in other clusters, with exactly the same configuration, and k8s events are uploaded to sentry.

If I turn off cron job monitoring, the pod starts working.

apiVersion: v1
kind: Pod
metadata:
  name: sentry-agent-b789f68c8-c9k9h
spec:
  containers:
  - env:
    - name: SENTRY_DSN
      valueFrom:
        secretKeyRef:
          key: kubernetes
          name: sentry
    - name: SENTRY_K8S_MONITOR_CRONJOBS
      value: "1"
    - name: SENTRY_K8S_WATCH_NAMESPACES
      value: __all__
    - name: SENTRY_ENVIRONMENT
      value: production
    image: ghcr.io/getsentry/sentry-kubernetes:ff2386f6b1176a36ebfcdb96222ae5840dac8cf1
    imagePullPolicy: IfNotPresent
    name: sentry-agent
    resources: {}
    securityContext: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: sentry-agent
  serviceAccountName: sentry-agent
  terminationGracePeriodSeconds: 30

Expected Result

The pod should not crash with a segmentation fault under any circumstances, especially when the service account is allowed to list cron jobs and jobs.

Actual Result

[Sentry] 2024/03/26 16:45:32 Release detection failed: exec: "git": executable file not found in $PATH
[Sentry] 2024/03/26 16:45:32 Some Sentry features will not be available. See https://docs.sentry.io/product/releases/.
[Sentry] 2024/03/26 16:45:32 To stop seeing this message, pass a Release to sentry.Init or set the SENTRY_RELEASE environment variable.
4:45PM INF Auto-detecting cluster configuration...
4:45PM INF Detected in-cluster configuration
4:45PM INF Running integrations...
4:45PM INF Watching events starting from: Tue, 26 Mar 2024 16:45:32 +0000 namespace=__all__ watcher=events
4:45PM INF Add job informer handlers for cronjob monitoring namespace=__all__ watcher=pods
4:45PM INF Add cronjob informer handlers for cronjob monitoring namespace=__all__ watcher=pods
E0326 16:45:32.093041       1 runtime.go:79] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 35 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x1643440?, 0x2582f20})
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:75 +0x99
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xfffffffe?})
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:49 +0x75
panic({0x1643440, 0x2582f20})
	/usr/local/go/src/runtime/panic.go:884 +0x213
main.runSentryCronsCheckin({0x1a629a8, 0xc000385890}, 0xc0007875d0, {0xc00049dd80?, 0xc000397520?})
	/app/crons.go:38 +0xbb
main.createJobInformer.func1({0x181a6c0?, 0xc0007875d0})
	/app/informer_jobs.go:26 +0xcd
k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnAdd(...)
	/go/pkg/mod/k8s.io/[email protected]/tools/cache/controller.go:232
k8s.io/client-go/tools/cache.(*processorListener).run.func1()
	/go/pkg/mod/k8s.io/[email protected]/tools/cache/shared_informer.go:816 +0x134
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x30?)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:157 +0x3e
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc000484f38?, {0x1a4d600, 0xc0003fa3f0}, 0x1, 0xc0003947e0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:158 +0xb6
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0x0?, 0x3b9aca00, 0x0, 0x0?, 0xc000484f88?)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:135 +0x89
k8s.io/apimachinery/pkg/util/wait.Until(...)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:92
k8s.io/client-go/tools/cache.(*processorListener).run(0xc0003d0080)
	/go/pkg/mod/k8s.io/[email protected]/tools/cache/shared_informer.go:810 +0x6b
k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1()
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:75 +0x5a
created by k8s.io/apimachinery/pkg/util/wait.(*Group).Start
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:73 +0x85
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x148fa1b]

goroutine 35 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0xfffffffe?})
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:56 +0xd7
panic({0x1643440, 0x2582f20})
	/usr/local/go/src/runtime/panic.go:884 +0x213
main.runSentryCronsCheckin({0x1a629a8, 0xc000385890}, 0xc0007875d0, {0xc00049dd80?, 0xc000397520?})
	/app/crons.go:38 +0xbb
main.createJobInformer.func1({0x181a6c0?, 0xc0007875d0})
	/app/informer_jobs.go:26 +0xcd
k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnAdd(...)
	/go/pkg/mod/k8s.io/[email protected]/tools/cache/controller.go:232
k8s.io/client-go/tools/cache.(*processorListener).run.func1()
	/go/pkg/mod/k8s.io/[email protected]/tools/cache/shared_informer.go:816 +0x134
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x30?)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:157 +0x3e
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc000484f38?, {0x1a4d600, 0xc0003fa3f0}, 0x1, 0xc0003947e0)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:158 +0xb6
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0x0?, 0x3b9aca00, 0x0, 0x0?, 0xc000484f88?)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:135 +0x89
k8s.io/apimachinery/pkg/util/wait.Until(...)
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:92
k8s.io/client-go/tools/cache.(*processorListener).run(0xc0003d0080)
	/go/pkg/mod/k8s.io/[email protected]/tools/cache/shared_informer.go:810 +0x6b
k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1()
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:75 +0x5a
created by k8s.io/apimachinery/pkg/util/wait.(*Group).Start
	/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:73 +0x85

Helm Chart

Thanks for creating this project!

It would be helpful to have a helm chart to make this easy to install. I've created a first pass at one in the kubernetes/charts repo as an incubator project: helm/charts#2708

Just creating this issue for your visibility.

Sentry pod cannot access needed k8s APIs

Environment

How do you use Sentry?
Sentry SaaS (sentry.io)

Which SDK and version?
Latest node.js

Steps to Reproduce

  1. Ran the script given in the readme to add a pod to my deployment

Expected Result

That it would work. :)

Actual Result

I see these logs over and over in the pod

HTTP response body: b'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"events is forbidden: User \\"system:serviceaccount:default:default\\" cannot watch resource \\"events\\" in API group \\"\\" at the cluster scope","reason":"Forbidden","details":{"kind":"events"},"code":403}\n'
2021-12-21 15:03:52,591 Exception when calling CoreV1Api->list_event_for_all_namespaces: (403)
Reason: Forbidden

This is a cluster running in DO cloud if it helps.

No events being sent. What is the format expected by "DSN"?

We have installed sentry in 3 kubernetes clusters, using

helm install \
    incubator/sentry-kubernetes \
    --name sentry\
    --namespace sys \
    --set rbac.create=false \
    --set sentry.dsn="https://[email protected]/000000"

None of the projects is receiving any data.
Container produces no log to stdout.

Any ideas on how to troubleshoot the setup?
Also, can anyone confirm the DSN format is correct or whether it needs to be one of the alternatives?

Support for replicas

From what I can tell, running multiple instances of this docker image in kubernetes will result in duplicate events (since each pod reads its own stream of events).

Is there a way to have the two pods sync with each other, or to have Sentry skip identical events (if there is a way to determine such a thing)? I am asking because, in case of pod evictions or outages, this pod is a potential "victim", and if it gets evicted, all events happening during that period will be lost. Having multiple pods (with anti-affinity to schedule them on different nodes) should help remedy that issue.

If you have an "official" way of solving this I am all ears.

Multiple security issues in libraries - please rebuild the py image and create a maintenance version

Environment

py1.0.0a on Kubernetes

Steps to Reproduce

  1. What: Just run a security scanner like trivy, and various old and new issues show up. Here are a couple of recent critical ones: CVE-2022-1292 libssl1.1 1.1.0j-1~deb9u1, CVE-2022-22823 libexpat1 2.2.0-2+deb9u1.
    I am getting at least 40 criticals, and many are newer than 2020, the release date of py1.0.0a. 105 critical and high in total.

This currently raises doubts about the sentry brand and the fitness for our purposes.

Expected Result

New maintenance release with current versions of dependencies available.

Actual Result

105 high and criticals in very old available last release.

ValueError: Invalid value for `involved_object`, must not be `None`

Getting this error from a deployed pod:

2019-12-19 13:42:24,350 Unhandled exception occurred.
Traceback (most recent call last):
  File "./sentry-kubernetes.py", line 66, in main
    watch_loop()
  File "./sentry-kubernetes.py", line 107, in watch_loop
    for event in stream:
  File "/usr/local/lib/python3.7/site-packages/kubernetes/watch/watch.py", line 131, in stream
    yield self.unmarshal_event(line, return_type)
  File "/usr/local/lib/python3.7/site-packages/kubernetes/watch/watch.py", line 84, in unmarshal_event
    js['object'] = self._api_client.deserialize(obj, return_type)
  File "/usr/local/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 236, in deserialize
    return self.__deserialize(data, response_type)
  File "/usr/local/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 276, in __deserialize
    return self.__deserialize_model(data, klass)
  File "/usr/local/lib/python3.7/site-packages/kubernetes/client/api_client.py", line 622, in __deserialize_model
    instance = klass(**kwargs)
  File "/usr/local/lib/python3.7/site-packages/kubernetes/client/models/v1_event.py", line 107, in __init__
    self.involved_object = involved_object
  File "/usr/local/lib/python3.7/site-packages/kubernetes/client/models/v1_event.py", line 266, in involved_object
    raise ValueError("Invalid value for `involved_object`, must not be `None`")
ValueError: Invalid value for `involved_object`, must not be `None`

Kubernetes is EKS on AWS.

kubectl version:

Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.1", GitCommit:"4485c6f18cee9a5d3c3b4e523bd27972b1b53892", GitTreeState:"clean", BuildDate:"2019-07-18T14:25:20Z", GoVersion:"go1.12.7", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.12-eks-c500e1", GitCommit:"c500e11584c323151d6ab17526d1ed7461e45b0c", GitTreeState:"clean", BuildDate:"2019-10-22T03:11:52Z", GoVersion:"go1.11.13", Compiler:"gc", Platform:"linux/amd64"}
