DSI: distributed system test infrastructure
This project uses GitHub PRs for changes. See the section on patch-testing below if you're new.
Intro
The big picture of system performance is as follows:
- Evergreen uses a project file to run a task, where each task may represent multiple tests.
- As part of executing this task, Evergreen will prepare a host on which DSI will be executed.
- When DSI is executed on this host, the DSI node itself will spin up a variety of hosts, depending on the exact task being run.
- Some of these hosts will contain mongod instances, while others will act as workload clients. (At the moment only a single workload client is supported.)
- A workload client is a node that performs some workload designed to stress the system of mongod instances. After the workload finishes, the cluster set up by the DSI node is shut down and the data is stored.
- While many different hosts may be used in a given run of sys-perf, DSI itself is only ever executed on Evergreen hosts. Other operations, such as setting up mongod instances, are performed using SSH on nodes spun up by DSI.
system_perf.yml
DSI in Evergreen and System performance testing, or "sys-perf", is an Evergreen project whose goal is to detect when the mongodb server fails to live up to certain performance guarantees, or when its performance changes abruptly. For background on Evergreen, check out the Evergreen wiki, particularly the article describing project files.
This project is controlled by the etc/system_perf.yml file. Each task executes a series of functions, which we will discuss in order. Each function assumes the previous ones have been executed.
The etc/system_perf.yml file has a few high-level functions:
- prepare environment
  Does everything needed to prepare the DSI node for execution. It will download all git repositories, then output a bootstrap.yml that is dynamically generated to contain all the values needed for DSI to run correctly. For example, if a task specifies a certain mongodb_setup.*.yml file, then that file is stated in the bootstrap.yml file. It will also prepare the AWS secret keys. Finally, the bootstrap script in DSI is executed.
- deploy cluster
  Does everything needed to deploy the cluster of mongodb nodes and workload client. It executes the DSI scripts infrastructure_provisioning, workload_setup, and mongodb_setup.
- run test
  Actually runs the tests (workloads), now that the cluster has been properly established. This involves executing the test_control script of DSI.
- analyze
  Detects outliers and runs regressions against past performances.
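The mapping from Evergreen functions to DSI scripts described above can be written down as a small lookup. This is only a sketch for orientation: the helper names dsi_stages_for and dsi_stage_order are hypothetical and not part of DSI, and nothing here executes a DSI script. The analyze function is left out because this document does not name a script for it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of DSI): prints the DSI scripts that each
# Evergreen function is described as executing. It runs nothing itself.
dsi_stages_for() {
  case "$1" in
    "prepare environment") echo "bootstrap" ;;
    "deploy cluster")      echo "infrastructure_provisioning workload_setup mongodb_setup" ;;
    "run test")            echo "test_control" ;;
    *) echo "unknown function: $1" >&2; return 1 ;;
  esac
}

# Prints one script name per line, in the order the functions run.
dsi_stage_order() {
  local fn
  for fn in "prepare environment" "deploy cluster" "run test"; do
    dsi_stages_for "$fn"
  done | tr ' ' '\n'
}
```

Calling dsi_stage_order lists the scripts one per line, starting with bootstrap and ending with test_control.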
The stages are discussed in order in the Running Locally Doc.
Running DSI (and Workloads) Locally
Please consult the Running Locally documentation for more information about installing required binaries and dependencies.
Developing on DSI
To get started, run the setup command:
./run-dsi setup
This will create a dsi_venv Python virtualenv (using the built-in venv module) for local development. You can activate this environment in your shell with the following:
source ./dsi_venv/bin/activate
You may also need to install development dependencies to run some tests:
source ./dsi_venv/bin/activate
python3 -m pip install -r ./requirements-dev.txt
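If you are unsure whether the environment is already active, a check along these lines may help. It is a sketch, not part of DSI: the function name in_dsi_venv is hypothetical, and it relies only on the VIRTUAL_ENV variable that the venv activate script exports.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of DSI): succeeds only when the currently
# active virtualenv directory is named dsi_venv.
in_dsi_venv() {
  [ -n "${VIRTUAL_ENV:-}" ] && [ "$(basename "$VIRTUAL_ENV")" = "dsi_venv" ]
}
```

One possible use is `in_dsi_venv || source ./dsi_venv/bin/activate` before installing the development dependencies.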
Testing Changes
The repo's tests are all packaged into /testscripts/runtests.sh, which must be run from the repo root. It requires: a config.yml file in the DSI repo root (see example_config.yml).
- Evergreen credentials:
  Found in your local ~/.evergreen.yml file. (Instructions here if you are missing this file.)
- Github authentication token:
  curl -i -u <USERNAME> \
    -H 'X-GitHub-OTP: <2FA 6-DIGIT CODE>' \
    -d '{"scopes": ["repo"], "note": "get full git hash"}' \
    https://api.github.com/authorizations
  (You only need -H 'X-GitHub-OTP: <2FA 6-DIGIT CODE>' if you have 2-factor authentication on.)
If you don't have a config.yml
file, you will see failures in tests that use the evergreen client.
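Because a missing file only surfaces later as test failures, it can be worth checking for both files first. The following is a minimal sketch under stated assumptions: check_test_prereqs is a hypothetical helper, not something DSI ships, and it only checks that the two files named above exist, not that their contents are valid.

```shell
#!/usr/bin/env bash
# Hypothetical preflight check (not part of DSI).
# Usage: check_test_prereqs [repo_root] [home_dir]
# Returns non-zero if config.yml or .evergreen.yml is missing.
check_test_prereqs() {
  local repo_root="${1:-.}"
  local home_dir="${2:-$HOME}"
  local status=0

  if [ ! -f "$repo_root/config.yml" ]; then
    echo "missing $repo_root/config.yml (see example_config.yml)" >&2
    status=1
  fi
  if [ ! -f "$home_dir/.evergreen.yml" ]; then
    echo "missing $home_dir/.evergreen.yml (Evergreen credentials)" >&2
    status=1
  fi
  return "$status"
}
```

Run from the repo root, `check_test_prereqs && ./testscripts/runtests.sh` would start the tests only when both files are present.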
Unit testing is orchestrated using nose.
Run all the unit tests:
./run-dsi ./testscripts/run-nosetest.sh
Run a specific test:
./run-dsi ./testscripts/run-nosetest.sh ./dsi/tests/test_config.py
Patch-Testing DSI
Github will automatically run self-tests on evergreen when you submit a PR, but it does not run any "real" DSI workloads (yet). To ensure you don't break any workloads, you must schedule a number of patch-builds against various mongodb performance projects.
cd mongo
# Unless you're changing the mongo server repo, these patches will be "empty".
evergreen patch -p sys-perf
evergreen patch -p sys-perf-4.4
evergreen patch -p sys-perf-4.2
evergreen patch -p sys-perf-4.0
evergreen patch -p performance
evergreen patch -p performance-4.4
evergreen patch -p performance-4.2
evergreen patch -p performance-4.0
# For each of the above patches, set the DSI module
cd /path/to/dsi
evergreen set-module -m dsi -i <your-build-id>
Notes:
- Don't just schedule every task and variant. Speak with members of the STM or Perf team if you have questions.
- You may not need any branches other than master.
- The above list was accurate as of 2020-03-12, but new branches are cut regularly.
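Since the list of projects changes as branches are cut, the patch commands above can be driven from an explicit list instead of being copied by hand. This is a sketch only: patch_projects and the DRY_RUN variable are hypothetical, and the only evergreen invocation used is the `evergreen patch -p <project>` form shown above. By default it prints the commands rather than running them.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of DSI or the evergreen CLI).
# Prints one "evergreen patch" command per project; set DRY_RUN=0 to
# actually invoke the evergreen CLI instead of printing.
patch_projects() {
  local project
  for project in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "evergreen patch -p $project"
    else
      evergreen patch -p "$project"
    fi
  done
}
```

For example, from the mongo checkout, `patch_projects sys-perf performance` prints the two commands it would run, and `DRY_RUN=0 patch_projects sys-perf performance` runs them. The set-module step still has to be done for each resulting patch.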
Patch-Testing DSI Without Compile
It can be painful to iterate on evergreen yaml files or DSI itself because workload tasks in Evergreen depend on a task that compiles the entirety of mongodb. This can take around 30 minutes. If you don't care about using the latest (tip/master) version of the server, you can force an older, pre-compiled version and skip the compile task.
There are two options.
- Repeating a task with a different DSI module. This is the easy way.
- Skipping compile entirely. This is the slightly harder way.
The Easy Way
You still have to sit through compile, but only once.
- Create a patch of sys-perf and do the usual evergreen set-module -m dsi step.
  cd mongo
  evergreen patch -p sys-perf
  cd /path/to/dsi
  evergreen set-module -m dsi -i <id>
  # ... make changes
  evergreen set-module -m dsi -i <id>
  # reschedule any tasks you want to run again with updated DSI
- Schedule the tasks you want.
You can call evergreen set-module -m dsi multiple times on the same patch-build and re-schedule your tasks. The compile task isn't re-run.
The Slightly Harder Way
This is the fastest way to run real workloads with your DSI changes, but it requires modifying the yaml code that runs DSI.
Use a hard-coded asset path and remove the compile-task dependency
- Replace this line with a static URL e.g.:
  mongodb_binary_archive: "https://s3.amazonaws.com/mciuploads/dsi/5c8685d3850e61268dd41be1/447847d93d6e0a21b018d5df45528e815c7c13d8/linux/mongodb-5c8685d3850e61268dd41be1.tar.gz"
  (This is the artifact URL from a previous waterfall run. Update it if tests fail to run because of new server features, etc.)
- Remove the depends_on blocks for the build-variants you want to run e.g. remove these lines.
- Submit this as your patch-build and then do the usual set-module dance (per above).
Here too you can use the same patch-build multiple times like the example above.