Authors: Rhys Oxenham
This repository demonstrates the performance difference between AVX512 and AVX2 in a Monte Carlo simulation (AVX512 should yield roughly a 50% performance increase), and shows how to deploy the workloads and view the results in Grafana on Red Hat OpenShift.
NOTE: Your OpenShift cluster must have nodes that support AVX512 and AVX2, otherwise the workloads will fail to schedule!
The deployment script does the following:
- Create a new dedicated OpenShift project "monte-carlo"
- Enable user-workload monitoring in built-in OpenShift monitoring
- Deploy an AVX512-based and an AVX2-based Monte Carlo simulation pod (each with a Prometheus Push Gateway)
- Create ServiceMonitors for the two workload pods so that their gateways are scraped
- Deploy the Grafana Community Operator (v4.7.0) and create an instance
- Create a Grafana service account to have access to Prometheus/Thanos data (built-in)
- Add Grafana Datasource for Prometheus/Thanos data (built-in)
- Deploy a custom Grafana dashboard that compares "time_elapsed" for AVX2 and AVX512
- Show the Grafana Dashboard Route to the user
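As an illustration of the first two steps, here is a minimal sketch of how user-workload monitoring is typically enabled in OpenShift: a ConfigMap named cluster-monitoring-config in the openshift-monitoring namespace with enableUserWorkload set to true. The function name render_user_workload_config is hypothetical, and deploy.sh may do this differently.

```shell
# Sketch only (hypothetical helper, not taken from deploy.sh): print a
# ConfigMap manifest that enables user-workload monitoring.
render_user_workload_config() {
    cat <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true
EOF
}
```

With a logged-in client, the project could then be created with `oc new-project monte-carlo` and the manifest applied with `render_user_workload_config | oc apply -f -`. Note that applying this manifest replaces any settings already present in that ConfigMap, so check for an existing one first.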
NOTE: Make sure that you have deployed the NodeFeatureDiscovery operator and created a corresponding NodeFeatureDiscovery instance, because the workload pods use a nodeSelector to target nodes that support the required features:
spec:
  restartPolicy: Never
  nodeSelector:
    feature.node.kubernetes.io/cpu-cpuid.AVX512BW: "true" # requires NFD
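Before deploying, it can help to confirm that at least one node actually carries this label. The following is a minimal sketch, assuming the `oc` client is already logged in; the helper names nodes_with_label and has_labelled_node are hypothetical and not part of this repository.

```shell
# Hypothetical helper: print the names of nodes matching a label selector,
# one per line (for example "node/worker-0").
nodes_with_label() {
    # $1 is a label selector such as "key=value"
    oc get nodes -l "$1" -o name
}

# Hypothetical helper: succeed only if at least one node matches.
has_labelled_node() {
    [ -n "$(nodes_with_label "$1")" ]
}
```

For example, `has_labelled_node 'feature.node.kubernetes.io/cpu-cpuid.AVX512BW=true' || echo "no AVX512BW nodes found"` reports when no node would satisfy the nodeSelector above. The label that the AVX2 pod selects on is not shown here, so check its manifest for the exact key.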
To deploy the environment, run the following, assuming you have already configured your OpenShift client and are logged in as system:admin:
# oc whoami
system:admin
# ./deploy.sh
(...)
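The login check above can also be scripted so that the deployment only starts under the expected user. This is a minimal sketch; require_user is a hypothetical helper and is not part of this repository.

```shell
# Hypothetical pre-flight check: succeed only if the current OpenShift
# user matches the expected user name passed as $1.
require_user() {
    expected="$1"
    # `oc whoami` prints the current user and fails when not logged in.
    current="$(oc whoami 2>/dev/null)" || {
        echo "not logged in to OpenShift" >&2
        return 1
    }
    if [ "$current" != "$expected" ]; then
        echo "logged in as '$current', expected '$expected'" >&2
        return 1
    fi
}
```

A possible invocation is `require_user system:admin && ./deploy.sh`, which skips the deployment when the check fails.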
To clean up all of the resources, run the following:
# ./cleanup.sh
(...)