
[Feat]: When defining an Alert Silencing Rule I should be able to filter down till the alert instance (chart name) [netdata-cloud, CLOSED]

hugovalente-pm avatar hugovalente-pm commented on July 30, 2024
[Feat]: When defining an Alert Silencing Rule I should be able to filter down till the alert instance (chart name)

from netdata-cloud.

Comments (7)

hugovalente-pm avatar hugovalente-pm commented on July 30, 2024

@car12o two solution proposals came out of what we discussed on the daily; I know that for proposal 2 you had to check something before we know whether it is a way forward.
Could you update this ticket with your findings when you are able to?


car12o avatar car12o commented on July 30, 2024

I can confirm that on alert transitions we don't have chart labels, but we do have them on the alert config, although I don't know if that's what you expect.
Here are some alert config examples:

template                           |chart                                                         |component    |units                |info                                                                                                                          |summary                                        |host_labels      |chart_labels                                           |
-----------------------------------+--------------------------------------------------------------+-------------+---------------------+------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------+-----------------+-------------------------------------------------------+
mdstat_mismatch_cnt                |md.mismatch_cnt                                               |RAID         |unsynchronized blocks|number of unsynchronized blocks for the ${label:device} ${label:raid_level} array                                             |                                               |                 |raid_level=!raid1 !raid10 *                            |
disk_space_usage                   |disk.space                                                    |Disk         |%                    |Total space utilization of disk ${label:mount_point}                                                                          |Disk ${label:mount_point} space usage          |_os=linux freebsd|mount_point=!/dev !/dev/* !/run !/run/* *              |
postgresql_pg_wall_disk_space_usage|disk.space                                                    |PostgreSQL   |%                    |The percentage of Disk Space being used by the pg_wall.                                                                       |Disk ${label:mount_point} (pg_wall) space usage|_os=linux freebsd|mount_point=/media/pgdata_adto                         |
DLE_CAS_sync_instance_lag          |DLE_CAS.sync_instance_lag                                     |sync_instance|seconds              |DLE_CAS Sync instance - high lag of WAL replay. Time stamp of last transaction replayed during recovery exceeds the threshold.|                                               |                 |_collect_module=DLE_CAS _collect_plugin=charts.d.plugin|
disk_space_usage                   |disk.space                                                    |Disk         |%                    |Total space utilization of disk ${label:mount_point}                                                                          |Disk ${label:mount_point} space usage          |_os=linux freebsd|mount_point=!/dev !/dev/* !/run !/run/* *              |
disk_inode_usage                   |disk.inodes                                                   |Disk         |%                    |disk ${label:mount_point} inode utilization                                                                                   |                                               |_os=linux freebsd|mount_point=!/dev !/dev/* !/run !/run/* *              |
disk_space_usage                   |disk.space                                                    |Disk         |%                    |Total space utilization of disk ${label:mount_point}                                                                          |Disk ${label:mount_point} space usage          |_os=linux freebsd|mount_point=!/dev !/dev/* !/run !/run/* *              |
rds_freeable_memory_alert          |prometheus.cloudwatch_exporter.aws_rds_freeable_memory_average|             |MB                   |AWS RDS instance freeable memory                                                                                              |                                               |                 |dbinstance_identifier= !koi-nonprod-infra-mysql *      |
disk_space_usage                   |disk.space                                                    |Disk         |%                    |Total space utilization of disk ${label:mount_point}                                                                          |Disk ${label:mount_point} space usage          |_os=linux freebsd|mount_point=!/dev !/dev/* !/run !/run/* *              |
disk_inode_usage                   |disk.inodes                                                   |Disk         |%                    |Total inode utilization of disk ${label:mount_point}                                                                          |Disk ${label:mount_point} inode usage          |_os=linux freebsd|mount_point=!/dev !/dev/* !/run !/run/* *              |

bear in mind that we do have some configs without any chart labels.

nevertheless, I think it's easier and more straightforward to filter by alert instance (chart_name/chart_id)


hugovalente-pm avatar hugovalente-pm commented on July 30, 2024

> bear in mind that we do have some configs without any chart labels.

I think this is probably because of older agent versions, where we weren't using labels

> nevertheless, I think it's easier and more straightforward to filter by alert instance (chart_name/chart_id)

I agree it is easier for now. The discussion was about whether it would make sense to move towards a more ideal solution that relies on labels, since that is also how we set this on alert definitions.
I'm ok to progress with the alert instance, and we can revisit this later

@kapantzak from your side all good?


car12o avatar car12o commented on July 30, 2024

> I think this is probably because of older agent versions, where we weren't using labels

I don't think that's the case: I sorted by created timestamp and still got some configs with empty chart labels.


kapantzak avatar kapantzak commented on July 30, 2024

@hugovalente-pm using the alert instance seems easier and more straightforward to me too.

However, I'm not sure I have this information at that point. I see that I get contexts, names and roles from this endpoint: api/v2/spaces/{spaceID}/alarms/metas, but how do I get the instance?
@car12o


car12o avatar car12o commented on July 30, 2024

@kapantzak here's how to get the data; let me know if anything is unclear

Get instances from alert name or context

POST /api/v2/spaces/{spaceID}/rooms/{roomID}/alerts
body:

{
  "options": ["instances"],
  "scope": {
    "nodes": ["{nodeID}"], // if you want to filter by node
    "contexts": ["{context}"] // if you want to get instances by context (e.g. disk.space)
  },
  "selectors": {
    "alert": ["{alert_name}"] // if you want to get instances by alert name (e.g. disk_space_usage)
  }
}

all these parameters are optional but, as we discussed, to narrow down the list of possible instances we should always specify either contexts or alert.
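
As a rough illustration of the rule above, here is a minimal Python sketch that assembles the request body and refuses to build one when neither contexts nor alert is given. The helper name `build_alert_instances_body` and its parameter names are hypothetical; only the JSON keys (`options`, `scope.nodes`, `scope.contexts`, `selectors.alert`) come from the example above. It does not send the request, since the base URL and authentication are not covered in this thread.

```python
from __future__ import annotations

import json
from typing import Optional


def build_alert_instances_body(
    node_ids: Optional[list[str]] = None,
    contexts: Optional[list[str]] = None,
    alert_names: Optional[list[str]] = None,
) -> dict:
    """Build the body for POST /api/v2/spaces/{spaceID}/rooms/{roomID}/alerts.

    Hypothetical helper: enforces the convention that at least one of
    contexts or alert names is specified.
    """
    if not contexts and not alert_names:
        raise ValueError("specify at least one of contexts or alert_names")

    body: dict = {"options": ["instances"]}

    # Only include the optional sections that were actually requested.
    scope: dict = {}
    if node_ids:
        scope["nodes"] = list(node_ids)
    if contexts:
        scope["contexts"] = list(contexts)
    if scope:
        body["scope"] = scope

    if alert_names:
        body["selectors"] = {"alert": list(alert_names)}

    return body


if __name__ == "__main__":
    # Example: all instances of the disk_space_usage alert on disk.space charts.
    example = build_alert_instances_body(
        contexts=["disk.space"], alert_names=["disk_space_usage"]
    )
    print(json.dumps(example, indent=2))
```

Rejecting the call up front, rather than sending an unfiltered request, is a design choice of this sketch and not something the API is known to enforce.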

what identifies an alert instance is the chart. The response looks like this:

{
  "api": 2,
  "nodes": [
    // ...
  ],
  "alert_instances": [
    {
      "ni": 2,
      "ati": null,
      "sum": "Disk / space usage",
      "info": "Total space utilization of disk /",
      "nm": "disk_space_usage",
      "ch": "disk_space._", // chart_id - this should be the field used when posting a rule
      "ch_n": "disk_space._", // chart_name - this should be the field used to display on the UI (friendly name)
      "ctx": "disk.space",
      "st": "CLEAR",
      "v": 0,
      "t": 0,
      "tr_i": "b047bf45-f831-49c8-b8de-6d76f5712858",
      "tr_v": 9.579043377550992,
      "tr_t": 1710857916,
      "units": "%",
      "cfg": "13038942-685d-4c69-9431-5d8877db1f80",
      "src": "line=10,file=/usr/lib/netdata/conf.d/health.d/disks.conf",
      "exec": "/usr/libexec/netdata/plugins.d/alarm-notify.sh",
      "tp": "System",
      "cl": "Utilization",
      "cm": "Disk",
      "to": "sysadmin",
      "slc": {
        "state": "NONE"
      }
    }
  ]
}
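
To show how the two chart fields might be consumed, here is a minimal Python sketch that turns a parsed response into selectable options: `ch` (chart_id) as the value to post in a rule and `ch_n` (chart_name) as the label to display. The names `InstanceOption` and `instance_options` are hypothetical. Deduplicating by chart_id and falling back to the chart_id when `ch_n` is missing are assumptions of this sketch, not behaviour confirmed in this thread.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceOption:
    chart_id: str    # from "ch": value to send when posting a rule
    chart_name: str  # from "ch_n": friendly name to display in the UI


def instance_options(response: dict) -> list[InstanceOption]:
    """Extract unique alert instances (charts) from a parsed response."""
    seen: set[str] = set()
    options: list[InstanceOption] = []
    for instance in response.get("alert_instances", []):
        chart_id = instance.get("ch")
        # Skip entries without a chart_id and charts already collected.
        if not chart_id or chart_id in seen:
            continue
        seen.add(chart_id)
        # Assumption: fall back to the chart_id if no friendly name is given.
        options.append(InstanceOption(chart_id, instance.get("ch_n") or chart_id))
    return options


if __name__ == "__main__":
    sample = {
        "api": 2,
        "alert_instances": [
            {"nm": "disk_space_usage", "ch": "disk_space._", "ch_n": "disk_space._"},
        ],
    }
    for option in instance_options(sample):
        print(option.chart_id, option.chart_name)
```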


hugovalente-pm avatar hugovalente-pm commented on July 30, 2024

this is released

