smejdil / zabbix-nvidia-smi-integration

This project forked from richardkav/zabbix-nvidia-smi-integration

The Zabbix LLD template for monitoring nVidia graphics cards.

Home Page: https://open-tech.cz

License: Apache License 2.0

zabbix-nvidia-smi-integration

This repository contains a Zabbix template for monitoring nVidia graphics cards, in particular:

  • GPU Utilization
  • GPU Power Consumption
  • GPU Memory (Used, Free, Total)
  • GPU Temperature
  • GPU Fan Speed

Agent configuration is described below. Import the template on the Zabbix server and install nvidia-smi on each node that is to be monitored.

Add the following parameters to the agent configuration file /etc/zabbix/zabbix_agent2.d/nvidia.conf:

UserParameter=gpu.discovery,/usr/local/bin/gpu-count.sh
UserParameter=gpu.temp[*],nvidia-smi --query-gpu=temperature.gpu --format=csv,noheader,nounits -i $1
UserParameter=gpu.memtotal[*],nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits -i $1
UserParameter=gpu.used[*],nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits -i $1
UserParameter=gpu.free[*],nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits -i $1
UserParameter=gpu.fanspeed[*],nvidia-smi --query-gpu=fan.speed --format=csv,noheader,nounits -i $1
UserParameter=gpu.utilisation[*],nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits -i $1
UserParameter=gpu.power[*],nvidia-smi --query-gpu=power.draw --format=csv,noheader,nounits -i $1
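
Every per-GPU key above runs the same nvidia-smi query with a different field name, and Zabbix substitutes the first key parameter (the GPU index) for $1. Before wiring the keys into the agent, the same queries can be run by hand. The following is only a sketch: the helper names gpu_query and gpu_check_all and the NVIDIA_SMI override variable are hypothetical and not part of this repository.

```shell
#!/bin/sh
# Sketch: run the same queries as the UserParameter lines, outside of Zabbix.
# NVIDIA_SMI is a hypothetical override (useful for dry runs); it defaults
# to the real nvidia-smi binary.

# Query one field ($1) for one GPU index ($2).
gpu_query() {
    "${NVIDIA_SMI:-nvidia-smi}" --query-gpu="$1" --format=csv,noheader,nounits -i "$2"
}

# Print every field used in nvidia.conf for one GPU index (default: 0).
gpu_check_all() {
    idx=${1:-0}
    for field in temperature.gpu memory.total memory.used memory.free \
                 fan.speed utilization.gpu power.draw; do
        printf '%s=%s\n' "$field" "$(gpu_query "$field" "$idx")"
    done
}
```

Calling gpu_check_all 0 on a monitored node should print one value per metric. Once the agent has been restarted with the new configuration, a single key can also be checked through the agent itself, for example with zabbix_agent2 -t 'gpu.temp[0]'.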

The code was developed from this template and further refined so that nvidia-smi output no longer has to be parsed directly with grep and cut.

GPU card discovery uses a script that reads the card list from nvidia-smi and prints it as JSON, which zabbix-server then processes.
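
The contents of /usr/local/bin/gpu-count.sh are not shown here, so the following is only a sketch of what such a discovery script could look like. It assumes the discovery rule accepts a plain JSON array (Zabbix 4.2 and later), and the macro names {#GPUINDEX} and {#GPUNAME} are assumptions that have to match whatever the template actually uses.

```shell
#!/bin/sh
# Sketch of a discovery script in the spirit of gpu-count.sh.
# Assumptions: macro names {#GPUINDEX} and {#GPUNAME} are illustrative, and
# GPU names contain no double quotes or backslashes (no JSON escaping is done).

# Convert "index, name" CSV lines on stdin into Zabbix LLD JSON.
gpu_lld_json() {
    awk -F', ' '
        BEGIN { printf "[" }
        {
            printf "%s{\"{#GPUINDEX}\":\"%s\",\"{#GPUNAME}\":\"%s\"}", sep, $1, $2
            sep = ","
        }
        END { print "]" }
    '
}

# Query the driver only when nvidia-smi is actually installed.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,name --format=csv,noheader | gpu_lld_json
fi
```

Keeping the conversion in a function that reads stdin makes it possible to test the JSON output on a machine without a GPU by piping sample lines into gpu_lld_json.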

Tested on Ubuntu 22.04.4

# nvidia-smi --version
NVIDIA-SMI version  : 550.78
NVML version        : 550.78
DRIVER version      : 550.78
CUDA Version        : 12.4

nVidia GPU test with the gpu-burn tool

git clone https://github.com/wilicc/gpu-burn.git
cd gpu-burn
make
./gpu_burn 60

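or, with Docker (the image has to be built from the gpu-burn checkout first):

docker build -t gpu_burn .
docker run --rm --gpus all gpu_burn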
...
Tested 4 GPUs:
	GPU 0: OK
	GPU 1: OK
	GPU 2: OK
	GPU 3: OK

AI test with ollama

# ollama run llama2:7b-chat-fp16
>>> How many people live in central Europe

Central Europe is a region that includes several countries, and the total population of these countries can vary depending on the 
definition of central Europe used. However, here are some approximate population figures for some of the countries commonly considered to 
be part of central Europe:

1. Austria: around 8.8 million people (2020 estimate)
2. Belgium: around 11.5 million people (2020 estimate)
3. Czech Republic: around 10.6 million people (2020 estimate)
4. Denmark: around 5.8 million people (2020 estimate)
5. Germany: around 83.2 million people (2020 estimate)
6. Hungary: around 9.8 million people (2020 estimate)
7. Luxembourg: around 0.6 million people (2020 estimate)
8. Netherlands: around 17.2 million people (2020 estimate)
9. Poland: around 38.5 million people (2020 estimate)
10. Slovakia: around 5.4 million people (2020 estimate)

Total population of central Europe: around 376.6 million people (based on the above estimates).

It's worth noting that these numbers are approximate and may vary depending on the source and methodology used to determine the population 
of each country. Additionally, there is no one definition of central Europe that is universally accepted, and different sources may define 
the region in slightly different ways.

Images

Four screenshots of the Zabbix nVidia monitoring (images not reproduced here).

To do

  • Ansible deployment
