
This project forked from openshift-scale/scale-ci-diagnosis


Tool to help diagnose issues with OpenShift clusters

License: Apache License 2.0


scale-ci-diagnosis

Tool to help diagnose issues with OpenShift clusters. It does so by:

  • Capturing the prometheus DB from the running prometheus pods to the local file system. The metrics can be inspected later by running prometheus locally with the backed-up DB.
  • Capturing OpenShift cluster information, including all the operator-managed components, using https://github.com/openshift/must-gather.

It also supports running conformance or end-to-end tests to make sure the cluster is sane.

Run

Edit env.sh as needed and run the ocp_diagnosis.sh script:

$ source env.sh; ./ocp_diagnosis.sh

options supported:
	OUTPUT_DIR=str,                       str=dir to store the captured prometheus data
	PROMETHEUS_CAPTURE=str,               str=true or false, enables/disables prometheus capture
	PROMETHEUS_CAPTURE_TYPE=str,          str=wal or full, wal captures the write ahead log and full captures the entire prometheus DB
	OPENSHIFT_MUST_GATHER=str,            str=true or false, gathers cluster data including information about all the operator managed components
	STORAGE_MODE=str,                     str=pbench, moves the results to the pbench results dir to be shipped to the pbench server in case the tool is run using pbench
	DATA_SERVER_URL=str                   str=url that points to http server that hosts data
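
As a rough illustration of the accepted values listed above, the following sketch checks the options before the tool is run. The functions check_choice and validate_options are hypothetical helpers, not part of this repository, and the rule that PROMETHEUS_CAPTURE_TYPE only matters when capture is enabled is an assumption.

```shell
# Hypothetical pre-flight check; the allowed values come from the option list.
check_choice() {
    # $1 = option name, $2 = current value, remaining args = allowed values
    _name="$1"
    _value="$2"
    shift 2
    for _allowed in "$@"; do
        if [ "$_value" = "$_allowed" ]; then
            return 0
        fi
    done
    echo "$_name must be one of: $*" >&2
    return 1
}

validate_options() {
    _rc=0
    check_choice PROMETHEUS_CAPTURE "${PROMETHEUS_CAPTURE:-}" true false || _rc=1
    check_choice OPENSHIFT_MUST_GATHER "${OPENSHIFT_MUST_GATHER:-}" true false || _rc=1
    # Assumption: the capture type is only relevant when capture is enabled.
    if [ "${PROMETHEUS_CAPTURE:-}" = "true" ]; then
        check_choice PROMETHEUS_CAPTURE_TYPE "${PROMETHEUS_CAPTURE_TYPE:-}" wal full || _rc=1
    fi
    return "$_rc"
}
```

One possible use is to call validate_options right after sourcing env.sh and stop if it returns a non-zero status.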

Pbench server for storage

Pbench handles both collection and long-term storage well. Besides storing the results on the local file system, the tool currently supports pbench as a storage mode. Support for storing results in Amazon S3 is planned.

To use pbench for storage, run the tool under pbench and set STORAGE_MODE to pbench:

$ source env.sh; pbench-user-benchmark --sysinfo=none -- <path to ocp_diagnosis.sh>
# pbench-move-results --prefix ocp-diagnosis-$(date +"%Y%m%d-%H%M%S")

Snappy data server for storage

Snappy data server is a second option for storage on a filesystem. The easiest option is to deploy the data server on your host. Refer to its setup and usage documentation to deploy it. Once deployed, you can read about its API at its /docs route.

Visualize the captured data locally on prometheus server

$ ./prometheus_view.sh <db_tarball_url>

This installs the prometheus server and loads the DB. The server can then be accessed at https://localhost:9090.
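
For readers who already have prometheus installed and the tarball extracted, the sketch below prints a command that points a local server at the backed-up DB. It is not what prometheus_view.sh does internally; the helper name prometheus_cmd and the example directory are hypothetical. The flags --storage.tsdb.path and --web.listen-address are standard prometheus options, and prometheus still expects a config file (by default prometheus.yml in the working directory).

```shell
# Hypothetical helper: print (not run) a prometheus command for a local DB copy.
prometheus_cmd() {
    # $1 = directory holding the extracted DB, $2 = port (defaults to 9090)
    _db_dir="$1"
    _port="${2:-9090}"
    printf 'prometheus --storage.tsdb.path=%s --web.listen-address=127.0.0.1:%s\n' \
        "$_db_dir" "$_port"
}

# Example: show the command for a DB extracted to /tmp/prometheus-db (assumed path).
prometheus_cmd /tmp/prometheus-db
```

Printing the command rather than running it keeps the sketch side-effect free; paste the output into a shell once the path and port look right.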

Conformance

Conformance runs the end-to-end test suite to check the sanity of the OpenShift cluster.

Assuming podman is installed, the conformance/e2e tests can be run using the following commands:

$ podman run --privileged=true --name=conformance -d -v <path-to-kubeconfig>:/root/.kube/config quay.io/openshift-scale/conformance:latest
$ podman logs -f conformance

Similarly, docker can be used to run the conformance test:

$ docker run --privileged=true --name=conformance -d -v <path-to-kubeconfig>:/root/.kube/config quay.io/openshift-scale/conformance:latest
$ docker logs -f conformance
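
Since the podman and docker invocations above differ only in the binary name, a small wrapper can pick whichever is available. This is a minimal sketch; pick_runtime is a hypothetical helper, not something shipped with this repository.

```shell
# Hypothetical helper: print the first command from the argument list
# that exists on PATH, or fail if none of them do.
pick_runtime() {
    for _candidate in "$@"; do
        if command -v "$_candidate" >/dev/null 2>&1; then
            echo "$_candidate"
            return 0
        fi
    done
    echo "none of these commands were found: $*" >&2
    return 1
}

# Usage (left as comments so that nothing is started by accident):
#   runtime=$(pick_runtime podman docker) || exit 1
#   "$runtime" run --privileged=true --name=conformance -d \
#       -v <path-to-kubeconfig>:/root/.kube/config quay.io/openshift-scale/conformance:latest
#   "$runtime" logs -f conformance
```

Listing podman first mirrors the order used in this section; swap the arguments to prefer docker.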
