DSI: distributed system test infrastructure

This project uses GitHub PRs for changes. If you're new, see the Patch-Testing DSI section below.

Intro

The big picture of system performance testing is as follows:

Evergreen uses a project file to run a task, where each task may represent multiple tests.
As part of executing this task, Evergreen will prepare a host on which DSI will be executed.
When DSI is executed on this host, the DSI node itself will spin up a variety of hosts, depending on the exact task being run.
Some of these hosts will contain mongod instances, while others will act as workload clients. (At the moment only a single workload client is supported.)
A workload client is a node that performs some workload designed to stress the system of mongod instances. After the workload has executed, the cluster set up by the DSI node is shut down and the resulting data is stored.
While many different hosts may be used in a given run of sys-perf, DSI itself is only ever executed on Evergreen hosts. Other operations, such as setting up mongod instances, are performed using SSH on nodes spun up by DSI.

DSI in Evergreen and `system_perf.yml`

System performance testing, or "sys-perf", is an Evergreen project whose goal is to detect cases where the MongoDB server fails to meet certain performance guarantees, and to detect abrupt changes in performance. For background on Evergreen, see the Evergreen wiki, particularly the article describing project files.

This project is controlled by the etc/system_perf.yml file. Each task executes a series of functions, which we will discuss in order. Each function assumes the previous ones have been executed.

The etc/system_perf.yml file has a few high-level functions.

prepare environment
Does everything needed to prepare the DSI node for execution. It downloads all git repositories, then writes a dynamically generated bootstrap.yml containing all the values DSI needs to run correctly. For example, if a task specifies a certain mongodb_setup.*.yml file, that file is named in bootstrap.yml. It also prepares the AWS secret keys. Finally, it executes the bootstrap script in DSI.
deploy cluster
Does everything needed to deploy the cluster of mongodb nodes and workload client. It executes the DSI scripts infrastructure_provisioning, workload_setup, and mongodb_setup.
run test
Actually runs the tests (workloads), now that the cluster has been properly established. This involves executing the test_control script of DSI.
analyze
Detects outliers and checks for regressions against past performance results.

The stages are discussed in order in the Running Locally Doc.
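As a rough illustration of the ordering described above, the following Python sketch lists the four functions and the DSI scripts the text associates with each. The function and script names come from this README; the PIPELINE structure and the scripts_up_to helper are hypothetical and are not part of DSI or system_perf.yml.

```python
# Hypothetical sketch: names are taken from the README text, but this data
# structure and helper are illustrative only and do not exist in DSI.
PIPELINE = [
    ("prepare environment", ["bootstrap"]),
    ("deploy cluster", ["infrastructure_provisioning", "workload_setup", "mongodb_setup"]),
    ("run test", ["test_control"]),
    ("analyze", []),  # the README does not name a script for this step
]


def scripts_up_to(function_name):
    """Return every DSI script that has run once `function_name` finishes.

    Each function assumes the previous ones have been executed, so the
    result accumulates scripts from all earlier functions as well.
    """
    scripts = []
    for name, dsi_scripts in PIPELINE:
        scripts.extend(dsi_scripts)
        if name == function_name:
            return scripts
    raise ValueError(f"unknown function: {function_name!r}")


print(scripts_up_to("run test"))
```

Calling scripts_up_to("deploy cluster") would return the bootstrap script followed by the three deployment scripts, which mirrors the rule that each function builds on the ones before it.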

Running DSI (and Workloads) Locally

Please consult the Running Locally documentation for more information about installing required binaries and dependencies.

Developing on DSI

To get started, run the setup command:

./run-dsi setup

This will create a dsi_venv python virtualenv (using the built-in venv module) for the purposes of local development. You can activate this environment in your shell with the following:

source ./dsi_venv/bin/activate

You may also need to install development dependencies to run some tests:

source ./dsi_venv/bin/activate
python3 -m pip install -r ./requirements-dev.txt

Testing Changes

The repo's tests are all packaged into /testscripts/runtests.sh, which must be run from the repo root. It requires a config.yml file in the DSI repo root (see example_config.yml) with the following:

Evergreen credentials:

Found in your local ~/.evergreen.yml file. (Instructions here if you are missing this file.)

GitHub authentication token:

curl -i -u <USERNAME> \
    -H 'X-GitHub-OTP: <2FA 6-DIGIT CODE>' \
    -d '{"scopes": ["repo"], "note": "get full git hash"}' \
    https://api.github.com/authorizations

(You only need -H 'X-GitHub-OTP: <2FA 6-DIGIT CODE>' if you have 2-factor authentication on.)

If you don't have a config.yml file, you will see failures in tests that use the Evergreen client.
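A quick pre-flight check can catch a missing file before the test run fails. The sketch below is a minimal example, assuming the only prerequisites are the two files named above (config.yml in the repo root and ~/.evergreen.yml); the function name missing_test_prerequisites is hypothetical and is not part of DSI.

```python
from pathlib import Path


def missing_test_prerequisites(repo_root, home=None):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper: it only checks that the files mentioned in this
    README exist, not that their contents are valid.
    """
    repo_root = Path(repo_root)
    home = Path(home) if home is not None else Path.home()
    problems = []
    if not (repo_root / "config.yml").is_file():
        problems.append("config.yml not found in repo root (see example_config.yml)")
    if not (home / ".evergreen.yml").is_file():
        problems.append("~/.evergreen.yml not found (source of Evergreen credentials)")
    return problems


if __name__ == "__main__":
    # Run from the DSI repo root.
    for problem in missing_test_prerequisites("."):
        print("WARNING:", problem)
```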

Unit testing is orchestrated using nose.

Run all the unit tests:

./run-dsi ./testscripts/run-nosetest.sh

Run a specific test:

./run-dsi ./testscripts/run-nosetest.sh ./dsi/tests/test_config.py

Patch-Testing DSI

GitHub will automatically run self-tests on Evergreen when you submit a PR, but it does not run any "real" DSI workloads (yet). To make sure you don't break any workloads, you must schedule a number of patch-builds against various MongoDB performance projects.

cd mongo

# Unless you're changing the mongo server repo, these patches will be "empty".
evergreen patch -p sys-perf
evergreen patch -p sys-perf-4.4
evergreen patch -p sys-perf-4.2
evergreen patch -p sys-perf-4.0

evergreen patch -p performance
evergreen patch -p performance-4.4
evergreen patch -p performance-4.2
evergreen patch -p performance-4.0

# For each of the above patches, set the DSI module
cd /path/to/dsi
evergreen set-module -m dsi -i <your-build-id>

Notes:

Don't just schedule every task and variant. Speak with members of the STM or Perf team if you have questions.
You may not need any branches other than master.
The above list was accurate as of 2020-03-12, but new branches are cut regularly.
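Because the project list changes as new branches are cut, it may be easier to generate the patch commands than to maintain them by hand. The sketch below only builds and prints the evergreen patch -p <project> command lines shown above; it does not run them. The branch suffixes mirror the 2020-03-12 list and are an assumption for any later date, and the helper name patch_commands is hypothetical.

```python
# Branch suffixes from the list above; "" stands for the master project.
# Update this list as new branches are cut.
BRANCH_SUFFIXES = ["", "-4.4", "-4.2", "-4.0"]
BASE_PROJECTS = ["sys-perf", "performance"]


def patch_commands(base_projects=BASE_PROJECTS, suffixes=BRANCH_SUFFIXES):
    """Build one `evergreen patch -p <project>` argv list per project."""
    return [
        ["evergreen", "patch", "-p", base + suffix]
        for base in base_projects
        for suffix in suffixes
    ]


# Print the commands for review instead of executing them.
for argv in patch_commands():
    print(" ".join(argv))
```

Passing suffixes=[""] limits the output to the master projects, which matches the note that you may not need any other branches.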

Patch-Testing DSI Without Compile

It can be painful to iterate on Evergreen YAML files or on DSI itself, because workload tasks in Evergreen depend on a task that compiles the entirety of MongoDB. This can take around 30 minutes. If you don't care about using the latest (tip/master) version of the server, you can force an older, pre-compiled version and skip the compile task.

There are two options.

Repeating a task with a different DSI module. This is the easy way.
Skipping compile entirely. This is the slightly harder way.

The Easy Way

You still have to sit through the compile task, but only once.

Create a patch of sys-perf and do the usual evergreen set-module -m dsi step.

cd mongo
evergreen patch -p sys-perf

cd /path/to/dsi
evergreen set-module -m dsi -i <id>
# ... make changes
evergreen set-module -m dsi -i <id>
# reschedule any tasks you want to run again with updated DSI

Schedule the tasks you want.

You can call evergreen set-module -m dsi multiple times on the same patch-build and re-schedule your tasks. The compile task isn't re-run.

The Slightly Harder Way

This is the fastest way to run real workloads with your DSI changes, but it requires modifying the yaml code that runs DSI.

Use a hard-coded asset path and remove the compile-task dependency

Replace the mongodb_binary_archive line with a static URL, e.g.:

mongodb_binary_archive: "https://s3.amazonaws.com/mciuploads/dsi/5c8685d3850e61268dd41be1/447847d93d6e0a21b018d5df45528e815c7c13d8/linux/mongodb-5c8685d3850e61268dd41be1.tar.gz"

(This is the artifact URL from a previous waterfall run. Update it if tests fail to run because of new server features, etc.)

Remove the depends_on blocks for the build variants you want to run.
Submit this as your patch-build and then do the usual set-module step (per above).

Here, too, you can reuse the same patch-build multiple times, as in the example above.
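The first step above (pinning the binary archive) can be scripted. The sketch below is a minimal text-based rewrite, assuming the key appears on a single line of the form mongodb_binary_archive: <value>; it edits the text with a regular expression instead of a YAML parser so the rest of the file stays byte-for-byte unchanged. The function name pin_binary_archive and the example.com URL are hypothetical, and removing the depends_on blocks is not covered here.

```python
import re

# Matches the whole `mongodb_binary_archive: ...` line, capturing the
# indentation and key so they can be kept as-is.
_ARCHIVE_LINE = re.compile(r"^([ \t]*mongodb_binary_archive:[ \t]*).*$", re.MULTILINE)


def pin_binary_archive(yaml_text, url):
    """Return `yaml_text` with every mongodb_binary_archive value set to `url`."""
    # A function replacement avoids backslash-escape surprises in the URL.
    new_text, count = _ARCHIVE_LINE.subn(lambda m: f'{m.group(1)}"{url}"', yaml_text)
    if count == 0:
        raise ValueError("no mongodb_binary_archive line found")
    return new_text


# Placeholder input and URL, for illustration only.
sample = "vars:\n  mongodb_binary_archive: ${some_expansion}\n  other: 1\n"
print(pin_binary_archive(sample, "https://example.com/mongodb.tar.gz"))
```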
