Comments (6)

sananguliyev commented on August 29, 2024

My job disappeared from Dkron, too. How can I recover the job and Dkron?

from dkron.

cobolbaby commented on August 29, 2024

I saw this error when I restarted Dkron.

time="2023-09-20T16:45:28+08:00" level=info msg="No valid config found: Applying default values." error="Config File \"dkron\" Not Found in \"[/etc/dkron /root/.dkron /opt/dkron/config]\""

2023/09/20 16:45:31 [Recovery] 2023/09/20 - 16:45:31 panic recovered:
runtime error: invalid memory address or nil pointer dereference
/usr/local/go/src/runtime/panic.go:220 (0x44cb55)
/usr/local/go/src/runtime/signal_unix.go:818 (0x44cb25)
/go/pkg/mod/github.com/hashicorp/[email protected]/api.go:769 (0xa5b65d)
/go/src/github.com/distribworks/dkron/dkron/agent.go:624 (0x1aa160b)
/go/src/github.com/distribworks/dkron/dkron/api.go:445 (0x1aa8345)
/go/pkg/mod/github.com/gin-gonic/[email protected]/context.go:174 (0x1aa5e7e)
/go/src/github.com/distribworks/dkron/dkron/api.go:133 (0x1aa5e65)
/go/pkg/mod/github.com/gin-gonic/[email protected]/context.go:174 (0x1aa5fb2)
/go/src/github.com/distribworks/dkron/dkron/api.go:139 (0x1aa5f24)
/go/pkg/mod/github.com/gin-gonic/[email protected]/context.go:174 (0xbe7d61)
/go/pkg/mod/github.com/gin-gonic/[email protected]/recovery.go:102 (0xbe7d4c)
/go/pkg/mod/github.com/gin-gonic/[email protected]/context.go:174 (0xbe6e46)
/go/pkg/mod/github.com/gin-gonic/[email protected]/logger.go:240 (0xbe6e29)
/go/pkg/mod/github.com/gin-gonic/[email protected]/context.go:174 (0xbe5ed0)
/go/pkg/mod/github.com/gin-gonic/[email protected]/gin.go:620 (0xbe5b38)
/go/pkg/mod/github.com/gin-gonic/[email protected]/gin.go:576 (0xbe567c)
/usr/local/go/src/net/http/server.go:2916 (0x7023ba)
/usr/local/go/src/net/http/server.go:1966 (0x6fd3b6)
/usr/local/go/src/runtime/asm_amd64.s:1571 (0x4675e0)


...

cobolbaby commented on August 29, 2024

My job disappeared from Dkron, too. How can I recover the job and Dkron?

If the Raft data directory is reset when the pod is recreated, the job cannot be recovered unless you have regular backups.
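
As a rough illustration of what a regular backup could look like, here is a minimal Python sketch that saves all job definitions to a JSON file. It assumes the Dkron HTTP API is reachable and serves the job list at `/v1/jobs`; the function names, the `opener` parameter, and the output path are made up for this example and are not part of Dkron.

```python
import json
import urllib.request


def fetch_jobs(base_url, opener=urllib.request.urlopen):
    """Return the job definitions served at {base_url}/v1/jobs as a list."""
    url = base_url.rstrip("/") + "/v1/jobs"
    with opener(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def backup_jobs(base_url, path, opener=urllib.request.urlopen):
    """Write all job definitions to `path` as JSON and return how many were saved."""
    jobs = fetch_jobs(base_url, opener)
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(jobs, fh, indent=2, sort_keys=True)
    return len(jobs)
```

Something like `backup_jobs("http://localhost:8080", "dkron-jobs.json")` could then run on a schedule outside the cluster being backed up. Restoring would presumably mean posting each saved definition back to `/v1/jobs`, but that part is not shown or tested here.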

runtime error: invalid memory address or nil pointer dereference

After reviewing the code, I found that the error happens because the cluster had not elected a leader yet. This can occur while the service is starting up. If a leader cannot be elected for a long time, something is probably wrong with the configuration or the network.
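
To avoid calling the API before a leader exists, a client could poll until one is reported. The sketch below is only an assumption about how that might be done: it assumes the `/v1/leader` endpoint returns a JSON description of the leader once one is elected, and it treats an error, an empty body, or `null` as "no leader yet". `wait_for_leader` and its parameters are hypothetical names for this example.

```python
import json
import time
import urllib.request


def wait_for_leader(base_url, timeout=60.0, interval=2.0,
                    opener=urllib.request.urlopen,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll {base_url}/v1/leader until a leader is reported or `timeout` elapses.

    Returns the decoded JSON describing the leader, or None on timeout.
    """
    url = base_url.rstrip("/") + "/v1/leader"
    deadline = clock() + timeout
    while True:
        try:
            with opener(url) as resp:
                body = resp.read().decode("utf-8").strip()
            leader = json.loads(body) if body else None
            if leader:
                return leader
        except (OSError, ValueError):
            # Connection errors, HTTP errors (e.g. a 500 from the recovered
            # panic) and undecodable bodies all count as "no leader yet".
            pass
        if clock() >= deadline:
            return None
        sleep(interval)
```

If this returns `None` after a generous timeout, that matches the case described above where the configuration or network deserves a closer look.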

sananguliyev commented on August 29, 2024

First of all, thanks for your answer. I am using a persistent volume, and I do not understand why the data is gone or corrupted. Regarding the no-leader problem, the configuration and infrastructure are the same. It always happens when I restart or apply a new version, so I cross my fingers before every restart. I do not know the exact problem: whether it works seems arbitrary, even though the configuration is always the same, and a network issue is very unlikely (at least not that often; otherwise other services would not work on k8s).

cobolbaby commented on August 29, 2024

I am using a persistent volume, and I do not understand why the data is gone or corrupted.

Could you share the Dkron YAML?

sananguliyev commented on August 29, 2024

I am using a persistent volume, and I do not understand why the data is gone or corrupted.

Could you share the Dkron YAML?

Yes, sure. I mount the Dkron config files as a volume from a ConfigMap; you can find them below, too.

---
apiVersion: v1
kind: ServiceAccount
metadata:
    name: dkron
    namespace: dkron

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
    namespace: dkron
    name: dkron-discovery-manager
rules:
    -   apiGroups: [""]
        resources: ["pods"]
        verbs: ["list"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
    name: dkron-cluster-discovery
    namespace: dkron
subjects:
    -   kind: ServiceAccount
        name: dkron
        namespace: dkron
roleRef:
    kind: Role
    name: dkron-discovery-manager
    apiGroup: rbac.authorization.k8s.io

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dkron-agent
  namespace: dkron
  labels:
      app: dkron
      component: agent
spec:
  replicas: 1
  selector:
    matchLabels:
        app: dkron
        component: agent
  template:
    metadata:
        name: dkron-agent
        labels:
          app: dkron
          component: agent
    spec:
      serviceAccountName: dkron
      terminationGracePeriodSeconds: 30
      affinity: 
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node.kubernetes.io/instance-type
                operator: In
                values:
                - cpx21
      containers:
      - name: dkron
        image: "dkron/dkron:3.2.6"
        imagePullPolicy: Always
        envFrom:
          - configMapRef:
              name: dkron-env
        command: 
          - /opt/local/dkron/dkron
        args:
          - "agent"
          - "--config=/etc/dkron/dkron-agent.yml"
          - "--tag=\"type=agent\""
        volumeMounts: 
          - name: config
            mountPath: /etc/dkron/
            readOnly: true
          - name: feedless-config
            mountPath: /app/config/
            readOnly: true
        ports:
          - name: serf
            containerPort: 8946
          - name: grpc
            containerPort: 6868
        resources:
          limits: 
            memory: 1Gi
            cpu: 500m
      volumes:
        - name: config
          configMap:
            name: dkron-config
        - name: feedless-config
          configMap:
            name: feedless-config

---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: dkron-server
  namespace: dkron
  labels:
    app: dkron
    component: server
spec:
  replicas: 3 
  serviceName: dkron-server
  selector:
    matchLabels:
      app: dkron
      component: server
  template:
    metadata:
      labels: 
        app: dkron
        component: server
    spec:
      serviceAccountName: dkron
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node.kubernetes.io/instance-type
                operator: In
                values:
                - cpx21
      containers:
        - name: dkron-server
          image: dkron/dkron:3.2.6
          imagePullPolicy: Always
          envFrom:
            - configMapRef:
                name: dkron-env
          ports:
            - name: http
              containerPort: 8080
            - name: serf
              containerPort: 8946
            - name: grpc
              containerPort: 6868
          command: 
           - /opt/local/dkron/dkron
          args:
              - "agent"
              - "--server"
              - "--config=/etc/dkron/dkron-server.yml"
          volumeMounts: 
            - name: data
              mountPath: /dkron/data
            - name: config
              mountPath: /etc/dkron/
              readOnly: true
            - name: feedless-config
              mountPath: /app/config/
              readOnly: true
          # lifecycle:
          #   preStop:
          #     exec:
          #       command: ["dkron", "leave"]
          startupProbe:
            tcpSocket:
              port: serf
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: "/health"
              port: 8080
            failureThreshold: 5
            periodSeconds: 10
            initialDelaySeconds: 10
            successThreshold: 1
            timeoutSeconds: 5
          livenessProbe:
            httpGet:
              path: "/health"
              port: 8080
            failureThreshold: 2
            periodSeconds: 10
            initialDelaySeconds: 5
            successThreshold: 1
            timeoutSeconds: 5
          resources:
            limits: 
              memory: 1Gi
              cpu: 500m
      volumes:
        - name: config
          configMap:
            name: dkron-config
        - name: feedless-config
          configMap:
            name: feedless-config
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Gi
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: dkron-config
  namespace: dkron
data:
  dkron-agent.yml: |+
    data-dir: /dkron/data
    retry-join: ["provider=k8s namespace=\"dkron\" label_selector=\"app=dkron\""]
    tags:
      component: agent
    log-level: info
    serf-reconnect-timeout: 5s
    disable-usage-stats: true
  dkron-server.yml: |+
    server: true
    bootstrap-expect: 2
    data-dir: /dkron/data
    retry-join: ["provider=k8s namespace=\"dkron\" label_selector=\"app=dkron\""]
    tags:
      component: server
    log-level: info
    serf-reconnect-timeout: 5s
    disable-usage-stats: true
