
Instance down (swarmprom issue, open)

stefanprodan commented on May 21, 2024
Instance down

from swarmprom.

Comments (8)

Dean-Christian-Armada commented on May 21, 2024

@stefanprodan, we need your advice.


stefanprodan commented on May 21, 2024

Node exporter and cAdvisor run on each Swarm node, so you can configure an alert on up{job="node-exporter"}.
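
A minimal sketch of such a rule, written in the Prometheus 1.x rule syntax that appears later in this thread. The alert name, the FOR duration, and the severity label are assumptions for illustration, not something stated in the thread.

```
# Sketch only: fires when a node-exporter target has failed its scrapes
# for one minute. Alert name, duration and severity are hypothetical.
ALERT node_exporter_down
  IF up{job="node-exporter"} == 0
  FOR 1m
  LABELS { severity = "critical" }
  ANNOTATIONS {
    summary = "node-exporter on {{ $labels.instance }} is down",
    description = "Prometheus could not scrape {{ $labels.instance }} for more than 1 minute.",
  }
```

The up metric is set by Prometheus itself for every scrape target, so this rule can only see targets that are still present in service discovery.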


Dean-Christian-Armada commented on May 21, 2024

I don't think that is effective enough, because the 0 value for that node-exporter does not stay present for long. It also shows only the instance IP, not the node_name. I tried grouping it with node_name, but then nothing shows up at all. Please see the screenshots below.

[Screenshot: up with a down node-exporter]

[Screenshot: up grouped with node_meta]


stefanprodan commented on May 21, 2024

You can use IF absent(node_meta) FOR 5m
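
Written out as a complete rule, that suggestion could look like the sketch below. The alert name and the annotation text are assumptions. Only the expression and the FOR duration come from the comment above.

```
# Sketch only: fires when no node_meta series exists at all for 5 minutes.
# The alert name and annotations are hypothetical.
ALERT node_meta_absent
  IF absent(node_meta)
  FOR 5m
  ANNOTATIONS {
    summary = "node_meta is missing",
    description = "No node_meta series has been present for more than 5 minutes.",
  }
```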


Dean-Christian-Armada commented on May 21, 2024

Hi @stefanprodan, what is the expected value of the absent(node_meta) query when just a single node goes down? In my case, "swarm-node-2" went down.

The screenshot below shows what the query returned when I intentionally took swarm-node-2 down.

[Screenshot: result of the absent(node_meta) query with swarm-node-2 down]
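
For context on the expected value: absent() returns an empty result as long as any series matches its selector, and a single series with value 1 when none does. With only swarm-node-2 down, the other nodes still report node_meta, so absent(node_meta) returns nothing. One way to catch a single node is to narrow the selector, as in the sketch below. It assumes that node_meta carries a node_name label with the value "swarm-node-2", which the thread suggests but does not confirm. The alert name is hypothetical.

```
# Sketch only: one rule per node that must be present.
# Assumes node_meta has a node_name label; the alert name is hypothetical.
ALERT swarm_node_2_missing
  IF absent(node_meta{node_name="swarm-node-2"})
  FOR 5m
  ANNOTATIONS {
    summary = "Swarm node {{ $labels.node_name }} is missing",
    description = "No node_meta series for {{ $labels.node_name }} for more than 5 minutes.",
  }
```

Because absent() copies the labels from equality matchers into its result, the node_name label is available in the annotations. The drawback of this design is that every node needs its own rule.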


abhisheks-cuelogic commented on May 21, 2024

@Dean-Christian-Armada, I am facing the same problem. I want to create a rule that fires whenever a node is down.
I should also get an alert if a container is down.


Dean-Christian-Armada commented on May 21, 2024

@abhisheks-cuelogic, by "container down", do you mean that if, say, a Python container goes down, an alert should fire? I don't think that is possible for containers. Prometheus needs node-exporter or a similar scrape target to collect metrics, unless there is an agent that can be installed inside the container to detect that it went down.


abhisheks-cuelogic commented on May 21, 2024

The container itself does not need to raise the alert. Could we use something like this:

ALERT piwik_nginx
IF count(time() - container_last_seen{name=~"^piwik_nginx.*"} < 60)
ANNOTATIONS {
summary = "piwik_nginx container is down",
description = "piwik_nginx is down for more than 1 minute",
}

I tried this rule, but the alert is always active, even when the container is up.
[Screenshot: prometheus-alert]
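
A likely reason the alert is always active: a Prometheus alert fires whenever its expression returns any result, whatever the value. The count(...) in the rule above counts containers that were seen within the last 60 seconds, so it returns a result exactly when the container is up. A sketch of an inverted rule follows. The alert names and the FOR duration are assumptions, and the second rule is an optional complement for the case where the series disappears entirely.

```
# Sketch only: fires when a matching container was last seen more than
# 60 seconds ago. Alert names and durations are hypothetical.
ALERT piwik_nginx_down
  IF time() - container_last_seen{name=~"^piwik_nginx.*"} > 60
  ANNOTATIONS {
    summary = "piwik_nginx container is down",
    description = "piwik_nginx has not been seen for more than 1 minute",
  }

# Complement: fires when no matching series exists at all, for example
# after the exporter has stopped reporting the container.
ALERT piwik_nginx_absent
  IF absent(container_last_seen{name=~"^piwik_nginx.*"})
  FOR 1m
  ANNOTATIONS {
    summary = "piwik_nginx container is missing",
    description = "No container_last_seen series for piwik_nginx for more than 1 minute",
  }
```

Which of the two rules fires first depends on how long the container_last_seen series keeps being exported after the container stops, which this sketch does not assume.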
