cost-manager

cost-manager is a Kubernetes controller manager that manages controllers to automate cost optimisations.

Controllers

Here we provide details of the various controllers supported by cost-manager.

spot-migrator

Spot VMs are unused compute capacity that many cloud providers offer at significantly reduced cost (e.g. on GCP, spot VMs provide a 60-91% discount). Since spot VM availability can fluctuate, it is common to configure workloads to run on spot VMs but to fall back to on-demand VMs when spot VMs are unavailable. However, workloads that are already running on on-demand VMs have no reason to migrate, even if spot VMs are available.

To improve spot VM utilisation, spot-migrator periodically attempts to migrate workloads from on-demand VMs to spot VMs by draining on-demand Nodes to force cluster scale up. This relies on the fact that the cluster autoscaler attempts to expand the least expensive possible node group, taking into account the reduced cost of spot VMs. If an on-demand VM is added to the cluster, spot-migrator assumes that no more spot VMs are currently available and waits for the next migration attempt (currently every hour). If no on-demand VMs were added, spot-migrator continues to drain on-demand VMs until none are left in the cluster and all workloads are running on spot VMs. Node draining respects PodDisruptionBudgets to ensure that workloads are migrated whilst maintaining desired levels of availability.
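
The migration attempt described above can be approximated by hand with kubectl. The following is a minimal sketch, not cost-manager's actual implementation. It assumes that spot Nodes carry the GKE label cloud.google.com/gke-spot. The function names and the SETTLE_SECONDS wait are hypothetical, and the sketch leaves drained Nodes cordoned for the operator to remove.

```shell
# Sketch only: approximates a single migration attempt using kubectl.
# Assumptions (not taken from cost-manager): spot Nodes carry the GKE label
# cloud.google.com/gke-spot, and SETTLE_SECONDS is a hypothetical wait that
# gives the cluster autoscaler time to add replacement Nodes.

# Print the names of all on-demand Nodes (those without the spot label).
list_on_demand_nodes() {
  kubectl get nodes --selector='!cloud.google.com/gke-spot' \
    --output=custom-columns=NAME:.metadata.name --no-headers
}

# Print the lines of $2 that do not appear in $1, i.e. Nodes added since
# the list in $1 was taken. Prints nothing if no Node was added.
added_nodes() {
  printf '%s\n' "$2" | grep -vxF -e "$1" || true
}

# Drain on-demand Nodes one at a time and stop as soon as a new on-demand
# Node appears, since that suggests no spot capacity is available.
migrate_once() {
  for node in $(list_on_demand_nodes); do
    before="$(list_on_demand_nodes)"
    # kubectl drain evicts Pods through the Eviction API, so
    # PodDisruptionBudgets are respected.
    kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data || return 1
    sleep "${SETTLE_SECONDS:-300}"
    after="$(list_on_demand_nodes)"
    added="$(added_nodes "$before" "$after")"
    if [ -n "$added" ]; then
      echo "on-demand Node added: $added"
      return 0
    fi
  done
  echo "no on-demand Node added"
}
```

Calling migrate_once against a test cluster runs one attempt. Repeating it on a schedule (for example hourly) would mimic the periodic behaviour, though the controller itself is the intended way to do this.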

Currently only GKE Standard clusters are supported. To allow spot-migrator to migrate workloads to spot VMs with fallback to on-demand VMs, your cluster must be running at least one on-demand node pool and at least one spot node pool.

apiVersion: cost-manager.io/v1alpha1
kind: CostManagerConfiguration
controllers:
- spot-migrator
cloudProvider:
  name: gcp
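
Before enabling spot-migrator, it may help to confirm that the cluster has both kinds of node pool. The sketch below assumes the gcloud CLI is installed and authenticated, and that the node pool's config.spot field is printed as a boolean. The function name summarise_node_pools is hypothetical. Any extra arguments (such as a location flag) are passed through to gcloud.

```shell
# Sketch only: count spot and on-demand node pools in a GKE cluster.
# Assumption: spot node pools report config.spot as true, and on-demand
# node pools leave the field empty.
summarise_node_pools() {
  cluster="$1"
  shift
  gcloud container node-pools list --cluster="$cluster" "$@" \
    --format='value(name,config.spot)' |
    awk '{ if (tolower($2) == "true") spot++; else on_demand++ }
         END { printf "spot=%d on-demand=%d\n", spot, on_demand }'
}
```

For example, summarise_node_pools my-cluster --location=europe-west2 (both values are placeholders) should report at least one pool of each kind for spot-migrator to be useful.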

pod-safe-to-evict-annotator

Certain types of Pods can prevent the cluster autoscaler from removing a Node (e.g. Pods in the kube-system Namespace that do not have a PodDisruptionBudget), leading to more Nodes in the cluster than necessary. This can be particularly problematic for workloads that cluster operators do not control and that can have a high number of replicas, such as kube-dns or the Konnectivity agent, which are typically installed by cloud providers.

To allow the cluster autoscaler to evict all Pods that have not been explicitly marked as unsafe for eviction, pod-safe-to-evict-annotator adds the cluster-autoscaler.kubernetes.io/safe-to-evict: "true" annotation to all Pods that have not already been annotated; note that PodDisruptionBudgets can still be used to maintain desired levels of availability.

apiVersion: cost-manager.io/v1alpha1
kind: CostManagerConfiguration
controllers:
- pod-safe-to-evict-annotator
podSafeToEvictAnnotator:
  namespaceSelector:
    matchExpressions:
    - key: kubernetes.io/metadata.name
      operator: In
      values:
      - kube-system
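
To see which Pods the annotator would still need to act on, the annotation can be inspected with kubectl. This is a minimal sketch with a hypothetical function name. It relies on kubectl printing <none> in a custom column when the annotation is absent.

```shell
# Sketch only: print the names of Pods in a Namespace that do not yet have
# the cluster-autoscaler.kubernetes.io/safe-to-evict annotation.
list_pods_without_safe_to_evict() {
  kubectl get pods --namespace="$1" --no-headers \
    --output='custom-columns=NAME:.metadata.name,SAFE:.metadata.annotations.cluster-autoscaler\.kubernetes\.io/safe-to-evict' |
    awk '$2 == "<none>" { print $1 }'
}
```

Running list_pods_without_safe_to_evict kube-system once pod-safe-to-evict-annotator has processed that Namespace would be expected to print nothing.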

Installation

You can install cost-manager into a GKE cluster with Workload Identity enabled as follows:

NAMESPACE="cost-manager"
kubectl get namespace "$NAMESPACE" || kubectl create namespace "$NAMESPACE"
LATEST_RELEASE_TAG="$(curl -s https://api.github.com/repos/hsbc/cost-manager/releases/latest | jq -r .tag_name)"
# GCP service account bound to the roles/compute.instanceAdmin role
GCP_SERVICE_ACCOUNT_EMAIL_ADDRESS="<service-account-name>@<project-id>.iam.gserviceaccount.com"
cat <<EOF > values.yaml
image:
  tag: $LATEST_RELEASE_TAG
config:
  apiVersion: cost-manager.io/v1alpha1
  kind: CostManagerConfiguration
  controllers:
  - spot-migrator
  - pod-safe-to-evict-annotator
  cloudProvider:
    name: gcp
  podSafeToEvictAnnotator:
    namespaceSelector:
      matchExpressions:
      - key: kubernetes.io/metadata.name
        operator: In
        values:
        - kube-system
serviceAccount:
  annotations:
    iam.gke.io/gcp-service-account: $GCP_SERVICE_ACCOUNT_EMAIL_ADDRESS
EOF
helm template ./charts/cost-manager -n "$NAMESPACE" -f values.yaml | kubectl apply -f -
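
With Workload Identity, the iam.gke.io/gcp-service-account annotation only takes effect if the Kubernetes ServiceAccount is also allowed to impersonate the GCP service account. The sketch below shows one way to grant that with gcloud. The function names are hypothetical, and the Kubernetes ServiceAccount name must be taken from the installed chart, since it is not stated here.

```shell
# Sketch only: grant a Kubernetes ServiceAccount permission to act as a
# GCP service account via Workload Identity.

# Build the IAM member string for a Kubernetes ServiceAccount.
# Arguments: project ID, Namespace, Kubernetes ServiceAccount name.
workload_identity_member() {
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$1" "$2" "$3"
}

# Bind roles/iam.workloadIdentityUser on the GCP service account.
# Arguments: project ID, Namespace, Kubernetes ServiceAccount name,
# GCP service account email address.
bind_workload_identity() {
  gcloud iam service-accounts add-iam-policy-binding "$4" \
    --role=roles/iam.workloadIdentityUser \
    --member="$(workload_identity_member "$1" "$2" "$3")"
}
```

For example: bind_workload_identity "<project-id>" "$NAMESPACE" "<kubernetes-service-account-name>" "$GCP_SERVICE_ACCOUNT_EMAIL_ADDRESS", where the values in angle brackets are placeholders.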

Testing

Build the Docker image and run the E2E tests using kind:

make image e2e

Roadmap

See ROADMAP.md for details.

Contributing

Contributions are greatly appreciated. The project follows the typical GitHub pull request model. See CONTRIBUTING.md for more details.
