
QP solvers benchmark


Benchmark for quadratic programming (QP) solvers available in Python.

The objective is to compare and select the best QP solvers for given use cases. The benchmarking methodology is open to discussion. Standard and community test sets are available: all of them can be processed using the qpbenchmark command-line tool, resulting in standardized reports that evaluate all metrics across all QP solvers available on the test machine.

Test sets

The benchmark comes with standard and community test sets representing different use cases for QP solvers.

New test sets are welcome! The qpbenchmark tool is designed to make it easy to wrap up a new test set without re-implementing the benchmark methodology. Check out creating a new test set.

Solvers

| Solver | Keyword | Algorithm | Matrices | License |
|--------|---------|-----------|----------|---------|
| Clarabel | clarabel | Interior point | Sparse | Apache-2.0 |
| CVXOPT | cvxopt | Interior point | Dense | GPL-3.0 |
| DAQP | daqp | Active set | Dense | MIT |
| ECOS | ecos | Interior point | Sparse | GPL-3.0 |
| Gurobi | gurobi | Interior point | Sparse | Commercial |
| HiGHS | highs | Active set | Sparse | MIT |
| HPIPM | hpipm | Interior point | Dense | BSD-2-Clause |
| MOSEK | mosek | Interior point | Sparse | Commercial |
| NPPro | nppro | Active set | Dense | Commercial |
| OSQP | osqp | Douglas–Rachford | Sparse | Apache-2.0 |
| PIQP | piqp | Proximal interior point | Dense & Sparse | BSD-2-Clause |
| ProxQP | proxqp | Augmented Lagrangian | Dense & Sparse | BSD-2-Clause |
| QPALM | qpalm | Augmented Lagrangian | Sparse | LGPL-3.0 |
| qpOASES | qpoases | Active set | Dense | LGPL-2.1 |
| qpSWIFT | qpswift | Interior point | Sparse | GPL-3.0 |
| quadprog | quadprog | Goldfarb–Idnani | Dense | GPL-2.0 |
| SCS | scs | Douglas–Rachford | Sparse | MIT |

Metrics

We evaluate QP solvers based on the following metrics:

  • Success rate: percentage of problems a solver is able to solve on a given test set.
  • Computation time: time a solver takes to solve a given problem.
  • Optimality conditions: we evaluate all three optimality conditions (see the formulas after this list):
    • Primal residual: maximum error on equality and inequality constraints at the returned solution.
    • Dual residual: maximum error on the dual feasibility condition at the returned solution.
    • Duality gap: value of the duality gap at the returned solution.
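
For reference, writing the QP as minimizing $\frac{1}{2} x^T P x + q^T x$ subject to $G x \leq h$ and $A x = b$, with multipliers $z$ and $y$ on the inequality and equality constraints, these three quantities are commonly defined as follows (this is the standard convention; check the benchmark documentation for its exact definitions):

$$
\begin{aligned}
r_{\mathrm{prim}} &= \max\left( \|(G x - h)_+\|_\infty,\ \|A x - b\|_\infty \right), \\
r_{\mathrm{dual}} &= \|P x + q + G^T z + A^T y\|_\infty, \\
r_{\mathrm{gap}} &= \left| x^T P x + q^T x + h^T z + b^T y \right|.
\end{aligned}
$$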

Shifted geometric mean

Each metric (computation time, primal and dual residuals, duality gap) produces a different ranking of solvers for each problem. To aggregate those rankings into a single figure over the whole test set, we use the shifted geometric mean (shm), a standard way to aggregate computation times in benchmarks for optimization software. This mean has the advantage of being distorted neither by large outliers (as opposed to the arithmetic mean) nor by small outliers (in contrast to the plain geometric mean). Check out the references below for further details.
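
As an illustration, here is a minimal Python sketch of a shifted geometric mean; the shift value of 10 is a common choice in the literature, not necessarily the one used by qpbenchmark:

```python
import numpy as np


def shifted_geometric_mean(values: np.ndarray, shift: float = 10.0) -> float:
    """Shifted geometric mean of a vector of non-negative values.

    Shifting by ``shift`` damps the influence of values close to zero,
    while the geometric mean itself damps the influence of large outliers.
    """
    assert (values >= 0.0).all(), "values should be non-negative"
    return float(np.exp(np.log(values + shift).mean()) - shift)
```

For example, with values [1, 1, 100], the arithmetic mean is 34 while the shifted geometric mean (shift 10) is about 13.7, much less sensitive to the single outlier.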

Here are some intuitive interpretations:

  • A solver with a shifted-geometric-mean runtime of $Y$ is $Y$ times slower than the best solver over the test set.
  • A solver with a shifted-geometric-mean primal residual $R$ is $R$ times less accurate on equality and inequality constraints than the best solver over the test set.

Results

The outcome of running a test set is a standardized report comparing solvers across the different metrics.

You can check out results from a variety of machines, and share the reports produced by running the benchmark on your own machine, in the Results category of the discussions forum of each test set.

Limitations

Check out the issue tracker for known limitations, ongoing work, and future improvements to this benchmark.

Installation

The recommended process is to install the benchmark and all solvers in an isolated environment using conda:

conda env create -f environment.yaml
conda activate qpbenchmark

Alternatively, you can install the benchmarking tool on its own with pip install qpbenchmark. In that case, the benchmark will run on all supported solvers it can import.

Usage

The benchmark works by running qpbenchmark on a Python script describing the test set. For instance:

qpbenchmark my_test_set.py run

The test-set script is followed by a benchmark command, such as "run" here. Optional arguments can restrict the run to a specific solver, problem, or set of solver settings:

qpbenchmark my_test_set.py run --solver proxqp --settings default

Check out qpbenchmark --help for a list of available commands and arguments.
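
For context, the test-set script defines the problems the benchmark runs. Here is a hypothetical sketch only: the class, property, and constructor names below are assumptions for illustration, not the exact qpbenchmark API (see creating a new test set for the real interface):

```python
# Hypothetical sketch: names and signatures are illustrative assumptions.
from typing import Iterator

import numpy as np

import qpbenchmark


class MyTestSet(qpbenchmark.TestSet):
    """Toy test set with a single small dense QP."""

    @property
    def description(self) -> str:
        return "Toy test set with a single problem."

    @property
    def title(self) -> str:
        return "My test set"

    @property
    def sparse_only(self) -> bool:
        return False  # the problem below uses dense matrices

    def __iter__(self) -> Iterator[qpbenchmark.Problem]:
        # min. 1/2 x^T P x + q^T x  s.t.  G x <= h
        P = np.eye(2)
        q = np.array([1.0, -1.0])
        G = np.eye(2)
        h = np.ones(2)
        yield qpbenchmark.Problem(P, q, G, h, None, None, None, None, name="toy_qp")
```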

Plots

The command line ships a plot command to compare solver performance over a test set for a specific metric. For instance, run:

qpbenchmark maros_meszaros_dense.py plot runtime high_accuracy

to generate the corresponding plot:

[Figure: comparison of solver runtimes on the dense Maros-Meszaros test set with the high_accuracy settings]

Contributing

Contributions to improving this benchmark are welcome. You can for instance propose new problems, or share the runtimes you obtain on your machine. Check out the contribution guidelines for details.

Citation

If you use qpbenchmark in your work, please cite all its contributors as follows:

@software{qpbenchmark2024,
  title = {{qpbenchmark: Benchmark for quadratic programming solvers available in Python}},
  author = {Caron, Stéphane and Zaki, Akram and Otta, Pavel and Arnström, Daniel and Carpentier, Justin and Yang, Fengyu and Leziart, Pierre-Alexandre},
  url = {https://github.com/qpsolvers/qpbenchmark},
  license = {Apache-2.0},
  version = {2.2.2},
  year = {2024}
}

See also

References

Other benchmarks


qpbenchmark's Issues

Getting versions generates unnecessary deps

From @bodono:

=> python benchmark.py run maros_meszaros --solver SCS
[2022-11-01 17:00:40,684] [warning] Run ``pip install py-cpuinfo`` for more accurate CPU info (utils.py:34)
[2022-11-01 17:00:41,489] [info] Made 0 QP solver calls (test_set.py:140)
[2022-11-01 17:00:41,490] [info] Test set results written to /usr/local/google/home/bodonoghue/git/qpsolvers_benchmark/results/maros_meszaros.csv (results.py:67)
Traceback (most recent call last):
  File "/usr/local/google/home/bodonoghue/git/qpsolvers_benchmark/benchmark.py", line 114, in <module>
    report.write(os.path.join(results_dir, f"{args.test_set}.md"))
  File "/usr/local/google/home/bodonoghue/git/qpsolvers_benchmark/qpsolvers_benchmark/report.py", line 82, in write
    {self.get_solvers_table()}
  File "/usr/local/google/home/bodonoghue/git/qpsolvers_benchmark/qpsolvers_benchmark/report.py", line 48, in get_solvers_table
    versions = get_solver_versions(self.test_set.solvers)
  File "/usr/local/google/home/bodonoghue/git/qpsolvers_benchmark/qpsolvers_benchmark/utils.py", line 93, in get_solver_version
s
    versions = {
  File "/usr/local/google/home/bodonoghue/git/qpsolvers_benchmark/qpsolvers_benchmark/utils.py", line 94, in <dictcomp>
    solver: metadata.version(package)
  File "/usr/local/google/home/bodonoghue/miniconda3/envs/python39/lib/python3.9/importlib/metadata.py", line 551, in version
    return distribution(distribution_name).version
  File "/usr/local/google/home/bodonoghue/miniconda3/envs/python39/lib/python3.9/importlib/metadata.py", line 524, in distribut
ion
    return Distribution.from_name(distribution_name)
  File "/usr/local/google/home/bodonoghue/miniconda3/envs/python39/lib/python3.9/importlib/metadata.py", line 187, in from_name
    raise PackageNotFoundError(name)
importlib.metadata.PackageNotFoundError: highspy

Plot whiskers

  1. When plotting primal_residual, and similarly for dual_residual, the plots have whiskers (see the end of the orange line on the plot below).
     [Figure: primal-residual plot with a whisker at the end of the orange line]

  2. When typing: python qpsolvers_benchmark\benchmark.py maros_meszaros_dense plot dual_residual high_accuracy --solvers ecos cvxopt scs osqp qpSWIFT
     the qpSWIFT data are missing due to case sensitivity, with no warning. Maybe the plot could be made insensitive to letter case.

Check SCS performance on DUALC1

From bodono/scs-python#63 (comment):

Looking at the instances you posted, I see for instance the first entry is DUALC1, which you have:
| problem | solver | settings | runtime | found | cost_error | primal_error |
|---------|--------|----------|---------|-------|------------|--------------|
| DUALC1 | scs | default | 1.48001 | True | 8.41269e+23 | 2.79628e+11 |

but in the results spreadsheet I have:
| name | solver | status | run_time | iter | obj_val | obj_opt | n | m | N | setup_time | solve_time | scale | scale_updates |
|------|--------|--------|----------|------|---------|---------|---|---|---|------------|------------|-------|---------------|
| DUALC1 | SCS-3.0 | optimal | 0.002122163772583008 | 150 | 6154.912037803983 | 6155.2508 | 9 | 224 | 2025 | 0.330263 | 1.190224 | 0.04653712266416252 | 1 |

And I notice that the true objective value is listed as 6155.2508. My run of SCS finishes in 150 iterations and is close to the optimum with 6154.912037803983, but somehow your run has a cost_error of 8.41269e+23, which seems very high!

(FYI: in my CSV, run_time is in seconds as determined by a separate Python timer, i.e., it doesn't rely on the solver's own timers, while setup_time and solve_time are both in milliseconds and come from the solver's own timer.)

Lowercase solver names automatically

From @ottapav in #52:

When typing: python qpsolvers_benchmark\benchmark.py maros_meszaros_dense plot dual_residual high_accuracy --solvers ecos cvxopt scs osqp qpSWIFT
the qpSWIFT data are missing due to case sensitivity, with no warning. Maybe the plot could be made insensitive to letter case.

Don't count qpsolvers matrix build times in solver times

Currently we time solvers externally, so that the measured time includes both matrix build and solve times. This is done in qpbenchmark/utils.py by, essentially:

from time import perf_counter  # imports added for completeness
import qpsolvers

start_time = perf_counter()
solution = qpsolvers.solve_problem(problem, solver=solver)
runtime = perf_counter() - start_time  # includes matrix build time

Separating build and solve times would allow a fairer comparison. It should be done upstream in qpsolvers: qpsolvers/qpsolvers#255

Check SCS performance on QETAMACR

The conversion of SCS problems may be slightly suboptimal, see #19 (comment)

For instance on QETAMACR:

 - Solving QETAMACR with solver SCS-3.0
 - Optimal objective 86760.37
------------------------------------------------------------------
               SCS v3.2.1 - Splitting Conic Solver
        (c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 688, constraints m: 1089
cones:    z: primal zero / dual free vars: 354
          b: box cone vars: 735
settings: eps_abs: 1.0e-04, eps_rel: 1.0e-04, eps_infeas: 1.0e-18
          alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
          max_iters: 100000, normalize: 1, rho_x: 1.00e-06
          acceleration_lookback: 10, acceleration_interval: 10
lin-sys:  sparse-direct-amd-qdldl
          nnz(A): 3097, nnz(P): 4447
------------------------------------------------------------------
 iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
------------------------------------------------------------------
     0| 9.63e+02  7.80e+02  1.31e+04 -1.03e+04  1.00e-01  5.59e-03
   250| 3.88e+00  2.41e+01  1.46e+03  3.75e+04  1.39e+01  3.98e-02
   500| 1.81e+00  1.24e+00  7.27e+01  4.78e+04  1.39e+01  6.88e-02
   750| 4.44e+03  8.37e+03  4.17e+04  4.78e+04  1.39e+01  9.75e-02
  1000| 1.76e+00  9.52e-01  5.90e+01  5.07e+04  1.39e+01  1.26e-01
  1250| 2.05e+00  1.25e+00  6.91e+01  5.19e+04  1.39e+01  1.55e-01
  1500| 2.38e+00  1.46e+00  8.32e+01  5.32e+04  1.39e+01  1.85e-01
  1750| 2.78e+00  1.62e+00  9.37e+01  5.47e+04  1.39e+01  2.14e-01
  2000| 3.19e+00  1.83e+00  1.07e+02  5.64e+04  1.39e+01  2.43e-01
  2250| 3.28e+00  2.08e+00  1.12e+02  5.82e+04  1.39e+01  2.71e-01
  2500| 3.57e+00  2.20e+00  1.21e+02  6.02e+04  1.39e+01  3.00e-01
  2750| 4.01e+00  2.46e+00  1.32e+02  6.25e+04  1.39e+01  3.29e-01
  3000| 4.56e+00  2.79e+00  1.48e+02  6.53e+04  1.39e+01  3.58e-01
  3250| 5.17e+00  2.97e+00  1.64e+02  6.88e+04  1.39e+01  3.88e-01
  3500| 8.03e+03  1.09e+05  7.06e+04 -5.43e+03  1.39e+01  4.12e-01
  3750| 3.65e+00  3.11e+00  1.01e+02  7.74e+04  1.39e+01  4.51e-01
  4000| 3.02e+00  1.65e+00  7.21e+01  7.92e+04  1.39e+01  4.80e-01
  4250| 2.95e+00  1.48e+00  6.84e+01  8.08e+04  1.39e+01  5.08e-01
  4500| 3.02e+00  1.47e+00  6.86e+01  8.25e+04  1.39e+01  5.38e-01
  4750| 6.91e-01  1.08e+00  1.32e+01  8.37e+04  1.39e+01  5.67e-01
  5000| 3.15e-01  5.24e-01  7.56e+00  8.39e+04  1.39e+01  5.96e-01
  5250| 3.67e-01  3.13e-01  8.63e+00  8.40e+04  1.39e+01  6.24e-01
  5500| 3.81e-01  2.40e-01  8.38e+00  8.41e+04  1.39e+01  6.53e-01
  5750| 3.80e-01  1.99e-01  8.50e+00  8.42e+04  1.39e+01  6.82e-01
  6000| 3.73e-01  2.01e-01  8.38e+00  8.43e+04  1.39e+01  7.12e-01
  6250| 4.66e+03  6.40e+04  5.01e+03  9.08e+04  1.39e+01  7.42e-01
  6500| 3.15e-01  1.95e-01  7.58e+00  8.46e+04  1.39e+01  7.72e-01
  6750| 3.19e-01  1.89e-01  7.43e+00  8.47e+04  1.39e+01  8.00e-01
  7000| 3.30e-01  1.94e-01  7.62e+00  8.48e+04  1.39e+01  8.29e-01
  7250| 3.41e-01  1.99e-01  7.82e+00  8.49e+04  1.39e+01  8.60e-01
  7500| 3.25e-01  2.19e-01  7.48e+00  8.50e+04  1.39e+01  8.90e-01
  7750| 3.18e-01  2.05e-01  7.33e+00  8.51e+04  1.39e+01  9.18e-01
  8000| 3.12e-01  2.06e-01  7.18e+00  8.51e+04  1.39e+01  9.50e-01
  8250| 3.05e-01  1.98e-01  7.05e+00  8.52e+04  1.39e+01  9.78e-01
  8500| 3.02e-01  1.93e-01  6.99e+00  8.53e+04  1.39e+01  1.01e+00
  8750| 3.02e-01  3.85e-01  6.97e+00  8.54e+04  1.39e+01  1.04e+00
  9000| 5.96e+02  8.19e+03  1.18e+03  8.79e+04  1.39e+01  1.06e+00
  9250| 2.57e-01  1.97e-01  5.87e+00  8.56e+04  1.39e+01  1.09e+00
  9500| 2.37e-01  1.72e-01  5.53e+00  8.56e+04  1.39e+01  1.12e+00
  9750| 2.21e-01  1.60e-01  5.25e+00  8.57e+04  1.39e+01  1.15e+00
 10000| 2.08e-01  1.53e-01  5.01e+00  8.58e+04  1.39e+01  1.18e+00
 10250| 1.94e-01  1.42e-01  4.72e+00  8.58e+04  1.39e+01  1.21e+00
 10500| 1.80e-01  1.38e-01  4.33e+00  8.59e+04  1.39e+01  1.24e+00
 10750| 1.71e-01  3.28e-01  4.34e+00  8.59e+04  1.39e+01  1.26e+00
 11000| 1.45e-01  1.26e-01  3.62e+00  8.60e+04  1.39e+01  1.29e+00
 11250| 1.27e-01  1.09e-01  3.35e+00  8.60e+04  1.39e+01  1.32e+00
 11500| 1.13e-01  1.02e-01  3.07e+00  8.60e+04  1.39e+01  1.35e+00
 11750| 8.21e+02  3.67e+04  1.51e+03  9.01e+04  1.39e+01  1.38e+00
 12000| 8.79e-02  1.41e-01  2.65e+00  8.61e+04  1.39e+01  1.41e+00
 12225| 7.25e-02  8.49e-02  2.35e+00  8.61e+04  1.39e+01  1.43e+00
------------------------------------------------------------------
status:  solved
timings: total: 1.43e+00s = setup: 5.27e-03s + solve: 1.43e+00s
         lin-sys: 1.30e+00s, cones: 6.17e-02s, accel: 1.50e-02s
------------------------------------------------------------------
objective = 86139.626049
------------------------------------------------------------------

TypeError: '<' not supported between instances of 'NoneType' and 'float'

Summary

This can happen when running the benchmark with calls to a new solver:

Traceback (most recent call last):
  File "/.../.micromamba/envs/qpbenchmark/bin/qpbenchmark", line 8, in <module>
    sys.exit(main())
  File "/.../.micromamba/envs/qpbenchmark/lib/python3.10/site-packages/qpbenchmark/benchmark.py", line 309, in main
    report(args, results)
  File "/.../.micromamba/envs/qpbenchmark/lib/python3.10/site-packages/qpbenchmark/benchmark.py", line 252, in report
    report.write(md_path)
  File "/.../.micromamba/envs/qpbenchmark/lib/python3.10/site-packages/qpbenchmark/report.py", line 260, in write
    self.__compute_dataframes()
  File "/.../.micromamba/envs/qpbenchmark/lib/python3.10/site-packages/qpbenchmark/report.py", line 241, in __compute_dataframes
    self.__cost_df = self.results.build_shgeom_df(
  File "/.../.micromamba/envs/qpbenchmark/lib/python3.10/site-packages/qpbenchmark/results.py", line 344, in build_shgeom_df
    {
  File "/.../.micromamba/envs/qpbenchmark/lib/python3.10/site-packages/qpbenchmark/results.py", line 345, in <dictcomp>
    settings: self.get_shgeom_for_metric_and_settings(
  File "/.../.micromamba/envs/qpbenchmark/lib/python3.10/site-packages/qpbenchmark/results.py", line 318, in get_shgeom_for_metric_and_settings
    means[solver] = shgeom(column_values, shift)
  File "/.../.micromamba/envs/qpbenchmark/lib/python3.10/site-packages/qpbenchmark/shgeom.py", line 47, in shgeom
    if (v < 0.0).any():
TypeError: '<' not supported between instances of 'NoneType' and 'float'

Reproduction steps

qpbenchmark maros_meszaros_dense_posdef.py run

Technical details

  • Operating system: Ubuntu 22.04
  • Python version: 3.10
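
The root cause is that a metric value can be None when the new solver does not return a solution, and the shifted-geometric-mean code then compares None to a float. Here is a minimal sketch of a possible workaround (an assumption for illustration, not the project's actual fix): map missing values to NaN before aggregating.

```python
import numpy as np


def sanitize_metric_values(column_values: list) -> np.ndarray:
    """Map missing (None) metric values to NaN before aggregation."""
    return np.array(
        [value if value is not None else np.nan for value in column_values],
        dtype=float,
    )
```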

Need to flip the sign of dual solution from Gurobi solver

Hi,

Thanks for the cool implementation!

The dual variables output by Gurobi come with the opposite sign. You need to flip their sign before using them to compute the dual residual, duality gap, etc.

I implemented the dual residual and duality gap myself. In my experiments, I can replicate the results from your code by not flipping the sign, i.e., by plugging the negative dual variables into the dual residual and gap computations.
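
To make the reported check concrete, here is a sketch of a dual-residual computation for a QP minimizing $\frac{1}{2} x^T P x + q^T x$ subject to $G x \leq h$, with inequality multipliers $z$; duals returned with the opposite sign convention (as reported for Gurobi above) must be negated before this check:

```python
import numpy as np


def dual_residual(P, q, G, x, z):
    """Infinity norm of the stationarity residual P x + q + G^T z.

    If the solver returns duals with the opposite sign convention,
    flip the sign (pass -z) before calling this function.
    """
    return np.linalg.norm(P @ x + q + G.T @ z, ord=np.inf)
```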

Report a measure of variance on test set results

Any measure of the central value of the data is misleading when variance is large. We should provide low and high estimates (min and max, or the 5th and 95th percentiles) along with the shifted geometric means, as sketched below.
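
For instance, a minimal sketch reporting the 5th and 95th percentiles of a metric's values using NumPy:

```python
import numpy as np


def low_high_estimates(values: np.ndarray) -> tuple:
    """Return (5th percentile, 95th percentile) of a metric's values."""
    low, high = np.percentile(values, [5.0, 95.0])
    return (low, high)
```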

CPU thermal throttling

Is your proposal related to a problem?

From #87:

Unless very efficiently cooled, a modern CPU cannot run at maximum speed for long. This can considerably skew timings.

Describe the solution you'd like

Make sure the benchmark solves all problems with consistent performance, monitoring and reacting to thermal throttling.

Describe alternatives you've considered

  • Monitoring the throttling state between solver calls (pro: single-threaded; con: throttling may still happen during a solver call); see the sketch after this list.
  • Downscaling to e.g. 50% of total CPU performance to lower the probability that throttling happens.
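
As a sketch of the first alternative, assuming psutil is available, the benchmark could check the CPU frequency between solver calls and wait while it sits below a fraction of the advertised maximum (the 0.9 tolerance below is an arbitrary illustration value):

```python
import time

import psutil


def wait_for_cpu(tolerance: float = 0.9, cooldown: float = 1.0) -> None:
    """Sleep between solver calls while the CPU appears to throttle."""
    freq = psutil.cpu_freq()  # namedtuple with current/min/max in MHz
    while freq is not None and freq.max > 0 and freq.current < tolerance * freq.max:
        time.sleep(cooldown)
        freq = psutil.cpu_freq()
```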

Additional context

See #87.

Choice of default settings

As reported here, it seems that the default settings use the default parameters of each solver.
This is not a fair comparison, as not all solvers reach the same precision with their default parameters.

I would instead encourage using two different denominations provided by the benchmark: low accuracy and high accuracy.
@stephane-caron What do you think?

Cold start only

Is your proposal related to a problem?

No.

Describe the solution you'd like

The benchmark only performs cold-start calls. It would be nice to cater to warm and hot starts as well.

Additional context

Warm and hot starting are commonly used in model predictive control.

Custom plots from a script

Is your proposal related to a problem?

Arose from this discussion: #65

Describe the solution you'd like

Use qpsolvers_benchmark as a library rather than a command-line script so that users can more easily create their own custom plots.

Describe alternatives you've considered

Expanding the main script's CLI. Down that road, we would end up with the whole matplotlib API surfacing in the CLI, which is not code we want to maintain in the benchmark. Enabling users to process benchmark results more conveniently would be preferable.
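
As an illustration of such library-style usage, here is a hypothetical sketch that reads the CSV file written by a benchmark run and plots runtimes; the column names ("solver", "runtime") are assumptions about the results schema:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Load the results CSV written by a benchmark run.
df = pd.read_csv("results/maros_meszaros.csv")

# Hypothetical schema: one row per (problem, solver) pair.
df.boxplot(column="runtime", by="solver", rot=45)
plt.yscale("log")  # runtimes span several orders of magnitude
plt.show()
```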
