caas-team / caas-carbon-footprint

Supports Sustainable Computing by providing customers with metrics for the carbon footprint of their workloads

Dockerfile 4.57% Smarty 9.04% Python 86.39%
carbon-emissions container entso-e entsoe entsoe-api kepler kubernetes sustainability

CaaS Carbon Footprint

This repository contains material related to Sustainable Computing.

Abstract

Climate change is an omnipresent issue. We are confronted with it in the news every day. The state should do something, or someone should, preferably someone else. Alongside all the pollution from cars, planes, and ships, data centers are among the large producers of carbon dioxide. The Internet has grown enormously over the last 30 years. If you look at the traffic volume at the German internet exchange DE-CIX, there is a new all-time high almost every day.

(C) https://www.cloudexpoeurope.de/news/de-cix-thomas-king

And this will increase significantly due to the computing needs of AI.

Saving computing power is necessary not only for cost reasons, but also for climate protection.

The project

Measuring the power consumption of electrical devices is relatively easy: you simply connect a measuring device between the socket and the consumer. On physical computers we have the ACPI service. It gets more complicated on virtual machines and on our target service at CaaS: containers and Kubernetes. For a long time, a lot of science and research was required to obtain results for virtual machine power measurements.

Luckily, the CNCF has taken this on with Kepler, the Kubernetes-based Efficient Power Level Exporter. The initial inspiration for CaaS Carbon Footprint was the talk "Using Green Metrics to Monitor your Carbon Footprint" by Ida Fürjesová & Niki Manoledaki at PromCon 2023 in Berlin. You can watch the session on the Prometheus YouTube channel.

The project goal is to make our customers aware of the environmental pollution caused by computing power. First, we make this pollution visible by showing the power consumption per workload on our Kubernetes clusters. Second, we show when workloads can be run in an environmentally friendly way.

We imagine the following possibilities:

  • Shift workloads into time frames when green energy is generated, especially batch jobs, AI model training, and system upgrades
  • Limit workloads when only energy from coal, oil, and gas is produced
  • Use surplus green energy for social computing such as Folding@home
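
As a rough illustration of the first two ideas, a scheduler could compare the current share of green generation (for example, the value of the entsoe_generation_eco metric described further down) with a threshold. This is only a minimal sketch: the function name should_run_batch and the default threshold of 0.5 are assumptions chosen for the example and are not part of this repository.

```python
def should_run_batch(eco_share: float, threshold: float = 0.5) -> bool:
    """Decide whether a deferrable job should start now.

    eco_share is the fraction (0..1) of current generation that is green.
    threshold is a hypothetical policy value, not taken from the project.
    """
    if not 0.0 <= eco_share <= 1.0:
        raise ValueError("eco_share must be between 0 and 1")
    return eco_share >= threshold


if __name__ == "__main__":
    # With a green share of 43% the job is deferred, with 62% it may run.
    print(should_run_batch(0.43))
    print(should_run_batch(0.62))
```

In practice the threshold would likely differ per workload type, since a system upgrade can usually wait longer than a nightly batch job.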

Of course, we work in the enterprise business, and most of our hosted applications must run all the time. But you can start with small steps to support environmental protection. Not all goals can be reached on the first day.

Kepler

Kepler (Kubernetes-based Efficient Power Level Exporter) uses eBPF to probe performance counters and other system stats, uses ML models to estimate workload energy consumption based on these stats, and exports the estimates as Prometheus metrics. With the Kepler dashboard, these stats are visible in Grafana.

We provide a bundled Helm chart containing the original Kepler chart, to be installed at cluster level, and a ServiceMonitor that rewrites the namespace label. With that, customers can collect namespace metrics in CaaS Project Monitoring and make the metrics visible at project level.

Kepler also offers the option to operate your own model server to train an ML model for your own infrastructure. However, this can be a lot of work in a multi-cluster environment with different backends (cloud, VM, bare metal). The energy mix must also be configured manually, based on the power consumption of the underlying data center. Of course, there are marketing aspects: "Our energy consumption is always green", but that's not true. It would mean that on days without green energy generation you must switch off the data center, and nobody will do this. See this blog post, which compares, for example, the green energy share of AWS data centers worldwide.

ENTSO-E

ENTSO-E provides central collection and publication of electricity generation data in the EU region. It has a RESTful API to query power generation in EU regions such as Germany, broken down by generation type such as brown coal, gas, oil, solar, nuclear, and wind.

There is a Python package available to query the API. With this Flask app, the relevant data is collected by generation type and provided as Prometheus metrics. This Dockerfile builds an image that can be scraped via a ServiceMonitor to get the data into Prometheus.

Example output with current metrics:

# HELP entsoe_factor_b01 Factor CO2g/kWh Biomass
# TYPE entsoe_factor_b01 gauge
entsoe_factor_b01 230
# HELP entsoe_factor_b02 Factor CO2g/kWh Brown Coal
# TYPE entsoe_factor_b02 gauge
entsoe_factor_b02 996
# HELP entsoe_factor_b04 Factor CO2g/kWh Gas
# TYPE entsoe_factor_b04 gauge
entsoe_factor_b04 378
# HELP entsoe_factor_b05 Factor CO2g/kWh Hard Coal
# TYPE entsoe_factor_b05 gauge
entsoe_factor_b05 880
# HELP entsoe_factor_b10 Factor CO2g/kWh Hydro Pumped Storage
# TYPE entsoe_factor_b10 gauge
entsoe_factor_b10 23
# HELP entsoe_factor_b11 Factor CO2g/kWh Hydro Run River
# TYPE entsoe_factor_b11 gauge
entsoe_factor_b11 23
# HELP entsoe_factor_b12 Factor CO2g/kWh Hydro Water Reservoir
# TYPE entsoe_factor_b12 gauge
entsoe_factor_b12 23
# HELP entsoe_factor_b14 Factor CO2g/kWh Nuclear
# TYPE entsoe_factor_b14 gauge
entsoe_factor_b14 39
# HELP entsoe_factor_b16 Factor CO2g/kWh Solar
# TYPE entsoe_factor_b16 gauge
entsoe_factor_b16 26
# HELP entsoe_factor_b17 Factor CO2g/kWh Waste
# TYPE entsoe_factor_b17 gauge
entsoe_factor_b17 494
# HELP entsoe_factor_b18 Factor CO2g/kWh Wind Offshore
# TYPE entsoe_factor_b18 gauge
entsoe_factor_b18 4
# HELP entsoe_factor_b19 Factor CO2g/kWh Wind Onshore
# TYPE entsoe_factor_b19 gauge
entsoe_factor_b19 9
# HELP entsoe_generation_b01 Current generation of energy with Biomass in MW
# TYPE entsoe_generation_b01 gauge
entsoe_generation_b01 4515
# HELP entsoe_generation_b02 Current generation of energy with Fossil Brown coal/Lignite in MW
# TYPE entsoe_generation_b02 gauge
entsoe_generation_b02 13202
# HELP entsoe_generation_b04 Current generation of energy with Fossil Gas in MW
# TYPE entsoe_generation_b04 gauge
entsoe_generation_b04 7422
# HELP entsoe_generation_b05 Current generation of energy with Fossil Hard coal in MW
# TYPE entsoe_generation_b05 gauge
entsoe_generation_b05 4485
# HELP entsoe_generation_b09 Current generation of energy with Geothermal in MW
# TYPE entsoe_generation_b09 gauge
entsoe_generation_b09 21
# HELP entsoe_generation_b10 Current generation of energy with Hydro Pumped Storage in MW
# TYPE entsoe_generation_b10 gauge
entsoe_generation_b10 5509
# HELP entsoe_generation_b11 Current generation of energy with Hydro Run-of-river and poundage in MW
# TYPE entsoe_generation_b11 gauge
entsoe_generation_b11 1420
# HELP entsoe_generation_b12 Current generation of energy with Hydro Water Reservoir in MW
# TYPE entsoe_generation_b12 gauge
entsoe_generation_b12 94
# HELP entsoe_generation_b14 Current generation of energy with Nuclear in MW
# TYPE entsoe_generation_b14 gauge
entsoe_generation_b14 0
# HELP entsoe_generation_b16 Current generation of energy with Solar in MW
# TYPE entsoe_generation_b16 gauge
entsoe_generation_b16 0
# HELP entsoe_generation_b17 Current generation of energy with Waste in MW
# TYPE entsoe_generation_b17 gauge
entsoe_generation_b17 814
# HELP entsoe_generation_b18 Current generation of energy with Wind Offshore in MW
# TYPE entsoe_generation_b18 gauge
entsoe_generation_b18 1371
# HELP entsoe_generation_b19 Current generation of energy with Wind Onshore in MW
# TYPE entsoe_generation_b19 gauge
entsoe_generation_b19 5208
# HELP entsoe_generation_sum Current generation of energy summary in MW
# TYPE entsoe_generation_sum gauge
entsoe_generation_sum 44061
# HELP entsoe_generation_eco Current generation of eco energy summary rate
# TYPE entsoe_generation_eco gauge
entsoe_generation_eco 0.4301309548126461
# HELP entsoe_generation_fos Current generation of fossil energy summary rate
# TYPE entsoe_generation_fos gauge
entsoe_generation_fos 0.5698690451873539
# HELP entsoe_generation_co2 Current generation of co2 per watt per second
# TYPE entsoe_generation_co2 gauge
entsoe_generation_co2 0.00598829722222222
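
To make the two rate metrics easier to follow, here is a minimal sketch that recomputes entsoe_generation_eco and entsoe_generation_fos from the generation values in the sample output above. The eco grouping follows the type codes visible in the traceback of the ZeroDivisionError issue below; the fossil grouping (b02, b04, b05) is an assumption that happens to be consistent with the sample, and the function name generation_shares is made up for this example. The sketch assumes a non-zero total.

```python
# Generation in MW per ENTSO-E type code, copied from the sample output.
GENERATION_MW = {
    "b01": 4515, "b02": 13202, "b04": 7422, "b05": 4485, "b09": 21,
    "b10": 5509, "b11": 1420, "b12": 94, "b14": 0, "b16": 0,
    "b17": 814, "b18": 1371, "b19": 5208,
}

# Eco types as summed in the traceback; fossil types are an assumption.
ECO_TYPES = ("b01", "b09", "b10", "b11", "b12", "b16", "b17", "b18", "b19")
FOSSIL_TYPES = ("b02", "b04", "b05")


def generation_shares(generation: dict) -> tuple:
    """Return (eco_share, fossil_share) as fractions of total generation."""
    total = sum(generation.values())
    eco = sum(generation.get(code, 0) for code in ECO_TYPES)
    fossil = sum(generation.get(code, 0) for code in FOSSIL_TYPES)
    return eco / total, fossil / total


if __name__ == "__main__":
    eco_share, fossil_share = generation_shares(GENERATION_MW)
    # Both values should agree with the sample gauges (about 0.43 and 0.57).
    print(eco_share, fossil_share)
```

Because nuclear generation is 0 in this sample, the two shares add up to 1; with non-zero nuclear output they would not, under the grouping assumed here.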

Helm Chart

The Helm chart caas-carbon-footprint puts it all together. Note that some options are intended for installation in a Rancher environment; if you do not use Rancher, you have to disable them. You also need an ENTSO-E API key (see description).

Reference

CO2 emissions factor:

Use Cases

The Example folder contains use cases and ideas.

Presentations

The Presentation folder holds presentations given at conferences and meetups.

Credits

Life is for sharing. If you have an issue with the code or want to improve it, feel free to open an issue or a pull request.

caas-carbon-footprint's Issues

Grafana dashboards aren't populated properly when the scrape config interval is too high

After updating the scrape interval of the ServiceMonitor for Kepler to a higher value, the default dashboards no longer display any data.

This can be addressed by making the Grafana queries less fine-grained: instead of taking rates over 1m, a window of 3-5m should be fine at first.

The affected dashboards are:

  • Pod/Process Power Consumption (W) in Namespace
  • Pod/Process CO2 FOS Emission (C02g/h) in Namespace
  • Total Power Consumption (W) in Namespace
  • Total Power Consumption (PKG+DRAM+OTHER+GPU) by Namespace (kWh per day)

The last dashboard isn't available anymore, because the metric kepler_container_joules_total is no longer exposed and must be calculated separately.

The same must be done for the caas-project-monitoring kepler dashboards.
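
The underlying reason is that rate() needs at least two samples inside its range window, so the window has to grow together with the scrape interval. The sketch below derives a window from the scrape interval. The factor of 4 is a commonly used rule of thumb and an assumption here, and both function names are hypothetical; the metric name is taken from the queries further down.

```python
def min_rate_window_seconds(scrape_interval_s: int, factor: int = 4) -> int:
    """Return a rate() window that spans several scrapes.

    factor=4 is a rule-of-thumb assumption; anything below 2 cannot work,
    because rate() needs at least two samples in the window.
    """
    if scrape_interval_s <= 0:
        raise ValueError("scrape interval must be positive")
    if factor < 2:
        raise ValueError("factor must be at least 2")
    return scrape_interval_s * factor


def format_rate(metric: str, window_s: int) -> str:
    """Build a PromQL rate() expression for the given window in seconds."""
    return f"rate({metric}[{window_s}s])"


if __name__ == "__main__":
    window = min_rate_window_seconds(60)  # 60s scrape interval -> 240s
    print(format_rate("kepler_container_package_joules_total", window))
```

With a 60s scrape interval this gives a 4m window, which falls inside the 3-5m range suggested above.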

Summary power consumption of multiple clusters

As a requirement, we need to know how much power our platform consumes overall, that is, across multiple clusters in multiple environments. If no multi-cluster monitoring is in place, we can collect the information from each cluster:

  1. The current power consumption of container workload in joules. 1 joule = 1 watt-second = 1 VAs. This can be a very large number:
kubectl curl -n cattle-monitoring-system  "http://prometheus-rancher-monitoring-prometheus-0:9090/api/v1/query?query=sum(kepler_container_package_joules_total)" | jq -r '.data.result[]|.value[-1]'
128948308.37400006

Ask the same and convert to something more readable, for example megajoules:

kubectl curl -n cattle-monitoring-system  "http://prometheus-rancher-monitoring-prometheus-0:9090/api/v1/query?query=sum(kepler_container_package_joules_total)%2F1000%2F1000" | jq -r '.data.result[]|.value[-1]'
128.96780355300004
  2. The daily power consumption, in the common unit kWh:
kubectl curl -n cattle-monitoring-system  "http://prometheus-rancher-monitoring-prometheus-0:9090/api/v1/query?query=sum(increase(kepler_container_package_joules_total%5B24h%3A1m%5D))%20*%200.00000027777777777" | jq -r '.data.result[]|.value[-1]'
10.91925820424631

This query is copied from the Kepler Grafana dashboard, which uses the conversion "watt_per_second_to_kWh", a factor of 0.00000027777777777 (1 W·s = 1 J and 1 J = (1/3600000) kWh).
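
The conversion factor can be checked with a few lines of Python. This is only a worked example of the arithmetic; the names JOULES_PER_KWH and joules_to_kwh are made up for illustration.

```python
# 1 kWh = 1000 W * 3600 s = 3,600,000 W*s = 3,600,000 J
JOULES_PER_KWH = 3_600_000


def joules_to_kwh(joules: float) -> float:
    """Convert an energy value from joules (watt-seconds) to kWh."""
    return joules / JOULES_PER_KWH


if __name__ == "__main__":
    print(joules_to_kwh(3_600_000))  # exactly one kWh
    print(joules_to_kwh(1))          # the per-joule factor used in the query
```

Multiplying by the rounded factor from the dashboard and dividing by 3600000 give the same result up to rounding.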

The same query for one hour:

kubectl curl -n cattle-monitoring-system  "http://prometheus-rancher-monitoring-prometheus-0:9090/api/v1/query?query=sum(increase(kepler_container_package_joules_total%5B1h%3A1m%5D))%20*%200.00000027777777777" | jq -r '.data.result[]|.value[-1]'
0.4499151308907774
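
The percent-encoded query strings above are hard to read and easy to get wrong by hand. As a sketch, they can be generated from plain PromQL with the Python standard library. The helper name build_query_url is hypothetical; the host name is the one used in the commands above.

```python
from urllib.parse import quote

# Prometheus address as used in the kubectl curl commands above.
PROMETHEUS_URL = "http://prometheus-rancher-monitoring-prometheus-0:9090"


def build_query_url(promql: str, base_url: str = PROMETHEUS_URL) -> str:
    """Return the instant-query URL for a PromQL expression.

    Parentheses and '*' are left unescaped to match the commands above;
    brackets, colons, slashes and spaces are percent-encoded.
    """
    return f"{base_url}/api/v1/query?query={quote(promql, safe='()*')}"


if __name__ == "__main__":
    daily_kwh = (
        "sum(increase(kepler_container_package_joules_total[24h:1m]))"
        " * 0.00000027777777777"
    )
    print(build_query_url(daily_kwh))
```

Changing the window to 1h in the PromQL string yields the URL of the hourly query.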

Which is the better visualization for a status page or status dashboard? Joules reflect the consumption in real time (per second), but are not a very common unit.

Cc: @y-eight

Hint: the data was collected with the kubectl curl plugin, querying the Prometheus API on the Prometheus pod.

Update Kepler

Values.yml shows 0.6.1 as the Kepler version. 0.7.x works on my machine, 0.6.x does not. 0.7.2 is the current version.

Entsoe crash with ZeroDivisionError

For about a day, Entsoe has been returning error 500 while the Flask app crashes:

10.42.70.250 - - [05/Feb/2024:08:31:45 +0000] "GET /metrics HTTP/1.1" 500 20 "-" "Prometheus/2.46.0"
[2024-02-05 08:35:54,011] ERROR in app: Exception on /metrics [GET]
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1455, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 869, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 867, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 852, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/home/appuser/app.py", line 171, in metrics
    result_eco = (int(result_b01) + int(result_b09) + int(result_b10) + int(result_b11) + int(result_b12) + int(result_b16) + int(result_b17) + int(result_b18) + int(result_b19)) / int(result_sum)
ZeroDivisionError: division by zero
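
The traceback shows that the division fails when result_sum is 0, which presumably happens when the API delivers no generation data for the requested period. One possible guard is sketched below. The helper safe_share is hypothetical and not part of app.py, and returning None so that the caller can skip the gauge is just one design option.

```python
from typing import Optional


def safe_share(part: int, total: int) -> Optional[float]:
    """Return part / total, or None when total is zero.

    None signals "no data" so the caller can skip updating the gauge
    instead of failing the whole /metrics request.
    """
    if total == 0:
        return None
    return part / total


if __name__ == "__main__":
    print(safe_share(18952, 44061))  # normal case, values from the sample output
    print(safe_share(0, 0))          # empty API response: no exception raised
```

Skipping the gauge keeps the last scraped value from being replaced by a misleading 0, although exposing an explicit error metric would be an equally valid choice.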
