The Harmony regression tests are a series of self-contained tests that ensure no regressions occur when portions of Harmony are changed.
The regression tests can be run in multiple ways. They can be run locally in Docker against SIT, UAT and Prod. This is the preferred method of verifying that no regressions have occurred when the services have been modified.
Alternatively, each test can be run locally in a browser against SIT, UAT, PROD or localhost (harmony-in-a-box). This is a good choice for test development and for verifying that service changes do not cause regression failures. Typically, only a single service's regression test is run in the browser at a time.
- Docker (to run locally in docker)
Each test suite is run in a separate Docker container using a temporary Docker image you must build before running.
From the ./test directory, make all of the regression images with:
$ make images
make -j images can be used to make the images in parallel (faster), although this may lead to Docker Desktop instabilities.
$ cd test
$ export HARMONY_HOST_URL=<url of Harmony in the target environment>
$ export EDL_PASSWORD=<your EDL password>
$ export EDL_USER=<your EDL username>
$ export AWS_ACCESS_KEY_ID=<key for the target environment>
$ export AWS_SECRET_ACCESS_KEY=<key secret for the target environment>
$ ./run_notebooks.sh
Outputs can be found in the tests/output/<image> directory.
Notes:
- All notebooks require the variables EDL_USER and EDL_PASSWORD to be exported for authentication against Earthdata Login. If you are including the NetCDF-to-Zarr (n2z) tests, AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be set to values for your current test environment to access the created Zarr store.
- It's possible to run a selection of notebooks by providing a list of images to run after the run_notebooks.sh command, e.g. ./run_notebooks.sh hga n2z would run the Harmony GDAL adapter and NetCDF-to-Zarr regression tests.
- HARMONY_HOST_URL is the Harmony base URL for your target environment, e.g. for SIT it would be https://harmony.sit.earthdata.nasa.gov
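The variable requirements in the notes above can be checked before starting a run. The following is a minimal sketch, not part of run_notebooks.sh: the helper name missing_variables is hypothetical, and it assumes that an empty image list means every suite (including n2z) will run.

```python
import os

# Variables exported in the Docker instructions above.
REQUIRED_VARS = ["HARMONY_HOST_URL", "EDL_USER", "EDL_PASSWORD"]
# Extra variables needed only when the NetCDF-to-Zarr (n2z) suite is included.
N2Z_VARS = ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"]


def missing_variables(env, images=()):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; `env` is any mapping such as
    os.environ, and `images` is the list passed to run_notebooks.sh.
    """
    needed = list(REQUIRED_VARS)
    # Assumption: no image list means "run everything", so the n2z
    # variables are required in that case as well.
    if not images or "n2z" in images:
        needed += N2Z_VARS
    return [name for name in needed if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables(os.environ)
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Environment looks complete.")
```

Passing the environment in as a mapping keeps the check easy to exercise without touching the real shell environment.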
To run the tests locally in a browser:
- Create an isolated Python environment for the test you wish to run. You can use the environment.yaml of the test to create the environment with conda, or you can create it with another virtual environment tool; just ensure all of the requirements from the environment.yaml file are installed. The environment files create conda environments named papermill-<image>, and you should delete any existing environment of that name before installing from the environment.yaml.
- Start the Jupyter server: jupyter notebook
- Browse to and open the Jupyter notebook file for the test (<image>_Regression.ipynb).
- Update the harmony_host_url in the notebook.
- Run the tests.
Notebooks and support files should be placed in a subdirectory of the test directory.
For example, in the harmony directory we have:
├── Harmony.ipynb
├── __init__.py
├── environment.yaml
└── util.py
Notebook dependencies should be listed in a file named environment.yaml at the top level of the subdirectory. The name field in the file should be papermill-<IMAGE>. For example:
    name: papermill-<IMAGE>
    channels:
      - conda-forge
      - defaults
    dependencies:
      - python=3.7
      - jupyter
      - requests
      - netcdf4
      - matplotlib
      - papermill
      - pytest
      - ipytest
      - pip:
        - harmony-py
The regression test notebooks follow semantic versioning: major.minor.patch
Every time a regression test suite is updated, the version number in the version.txt file for that suite should be incremented by the appropriate type (major, minor or patch). This will likely occur for one of three reasons:
- Adding, updating or removing tests within the notebook (or associated utility functionality).
- Adding or updating Python dependencies in the environment.yaml file for the test suite.
- Updating the overall Docker image for all test suites, in which case all suites should have their version.txt incremented.
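As an illustration of the three increment types, here is a minimal sketch of a semantic version bump. The function bump_version is hypothetical and not part of this repository; it only shows how each increment changes a major.minor.patch string such as the one stored in version.txt.

```python
def bump_version(version, part):
    """Return `version` (major.minor.patch) with the given part incremented.

    Hypothetical helper for illustration. Incrementing a part resets the
    parts to its right to zero, as in semantic versioning.
    """
    major, minor, patch = (int(piece) for piece in version.strip().split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown version part: {part!r}")


if __name__ == "__main__":
    for part in ("major", "minor", "patch"):
        print(part, bump_version("1.2.3", part))
```

The call to strip() tolerates a trailing newline, which a value read directly from a text file would usually carry.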
The CI/CD pipeline for this repository will release a new Docker image for a test suite to ghcr.io whenever a change in the relevant version.txt file is merged to the main branch.
To use these changes in the overall Harmony CI/CD pipeline in Bamboo, the environment variables for the appropriate regression test deployment environment (SIT, UAT or production) should also be updated.
Note - the manual update step for Bamboo environment variables is brittle, and improvements are being considered to make the choice of regression test image version more automated.
To increase runtime efficiency, the build relies on micromamba. Micromamba and mamba are meant to be drop-in replacements for miniconda and conda. The fast solving allows us to skip creating a conda-lock file, and the dependency management is entirely defined by the environment.yaml file.
Test notebooks should not rely on other forms of dependency management or expect user input. They should utilize the harmony_host_url global variable to communicate with Harmony or to determine the Harmony environment. This variable is set by papermill; see Harmony.ipynb for how to make use of this variable. More information can be found in the papermill documentation on setting parameters.
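To make the parameter mechanism concrete, the sketch below builds a papermill command line that sets harmony_host_url using papermill's -p option. The helper papermill_command and the output path are hypothetical, and this is not necessarily how run_notebooks.sh invokes papermill.

```python
import shlex


def papermill_command(notebook, output, harmony_host_url):
    """Build an argument list that runs `notebook` with papermill.

    Hypothetical helper for illustration. The -p option passes a
    name/value pair that papermill injects into the notebook.
    """
    return [
        "papermill",
        notebook,
        output,
        "-p",
        "harmony_host_url",
        harmony_host_url,
    ]


if __name__ == "__main__":
    command = papermill_command(
        "Harmony.ipynb",
        "output/Harmony.ipynb",  # assumed output location
        "https://harmony.sit.earthdata.nasa.gov",
    )
    # Print a shell-ready form of the command rather than executing it.
    print(shlex.join(command))
```

Building the command as a list avoids quoting problems if it is later passed to subprocess.run.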
New test suites must be added to the Makefile. A new name-image target (where name is the name of the test suite) should be added (see the harmony-image example), and the new image target should be added as a dependency of the images target. The Docker image should have a name like ghcr.io/nasa/regression-tests-<base_name>, where base_name is the name of the test suite.
To build the test images on GitHub, add a new matrix target that includes the image base name and notebook name to the list of targets in the .github/workflows/build-all-images.yml file.
Finally, add the image base name to the all_images array in the run_notebooks.sh file and the all_tests array in the scripts/test-in-bamboo.sh script. For instance, if the new image is named ghcr.io/nasa/regression-tests-foo, then we would add foo to both arrays.
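The relationship between a full image name and its base name can be sketched as follows. The helper image_base_name is hypothetical; it assumes only the ghcr.io/nasa/regression-tests-<base_name> naming pattern described above, plus an optional trailing tag.

```python
IMAGE_PREFIX = "ghcr.io/nasa/regression-tests-"


def image_base_name(image):
    """Return the base name for a regression test image name.

    Hypothetical helper for illustration. An optional tag after a
    colon (for example ":latest") is dropped.
    """
    if not image.startswith(IMAGE_PREFIX):
        raise ValueError(f"Not a regression test image: {image!r}")
    return image[len(IMAGE_PREFIX):].split(":", 1)[0]


if __name__ == "__main__":
    print(image_base_name("ghcr.io/nasa/regression-tests-foo"))
```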
The run_notebooks.sh file can be used as described above to run the test suite. Notebooks are expected to exit with a non-zero exit code on failure when run from papermill.
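The exit code convention is what lets a wrapper decide whether a suite passed. The sketch below shows one way a caller could act on it; run_command and failed_suites are hypothetical helpers, and the stand-in commands take the place of real papermill invocations.

```python
import subprocess
import sys


def run_command(command):
    """Run a command and return its exit code (0 means success)."""
    return subprocess.run(command).returncode


def failed_suites(exit_codes):
    """Return the sorted names of suites that exited with a non-zero code."""
    return sorted(name for name, code in exit_codes.items() if code != 0)


if __name__ == "__main__":
    # Stand-in commands: real usage would invoke papermill on each notebook.
    exit_codes = {
        "passing": run_command([sys.executable, "-c", "raise SystemExit(0)"]),
        "failing": run_command([sys.executable, "-c", "raise SystemExit(1)"]),
    }
    failures = failed_suites(exit_codes)
    print("Failed suites:", failures)
```

A real wrapper would typically finish by exiting with a non-zero code itself whenever the list of failures is not empty, so that CI/CD tooling also sees the failure.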