
QP solvers benchmark


Benchmark for quadratic programming (QP) solvers available in Python.

The objective is to compare and select the best QP solvers for given use cases. The benchmarking methodology is open to discussion. Standard and community test sets are available: all of them can be processed using the qpbenchmark command-line tool, resulting in standardized reports that evaluate all metrics across all QP solvers available on the test machine.

Test sets

The benchmark comes with standard and community test sets representing different use cases for QP solvers.

New test sets are welcome! The qpbenchmark tool is designed to make it easy to wrap up a new test set without re-implementing the benchmark methodology. Check out creating a new test set.

Solvers

| Solver | Keyword | Algorithm | Matrices | License |
|--------|---------|-----------|----------|---------|
| Clarabel | clarabel | Interior point | Sparse | Apache-2.0 |
| CVXOPT | cvxopt | Interior point | Dense | GPL-3.0 |
| DAQP | daqp | Active set | Dense | MIT |
| ECOS | ecos | Interior point | Sparse | GPL-3.0 |
| Gurobi | gurobi | Interior point | Sparse | Commercial |
| HiGHS | highs | Active set | Sparse | MIT |
| HPIPM | hpipm | Interior point | Dense | BSD-2-Clause |
| MOSEK | mosek | Interior point | Sparse | Commercial |
| NPPro | nppro | Active set | Dense | Commercial |
| OSQP | osqp | Douglas–Rachford | Sparse | Apache-2.0 |
| PIQP | piqp | Proximal interior point | Dense & Sparse | BSD-2-Clause |
| ProxQP | proxqp | Augmented Lagrangian | Dense & Sparse | BSD-2-Clause |
| QPALM | qpalm | Augmented Lagrangian | Sparse | LGPL-3.0 |
| qpOASES | qpoases | Active set | Dense | LGPL-2.1 |
| qpSWIFT | qpswift | Interior point | Sparse | GPL-3.0 |
| quadprog | quadprog | Goldfarb–Idnani | Dense | GPL-2.0 |
| SCS | scs | Douglas–Rachford | Sparse | MIT |

Metrics

We evaluate QP solvers based on the following metrics:

  • Success rate: percentage of problems a solver is able to solve on a given test set.
  • Computation time: time a solver takes to solve a given problem.
  • Optimality conditions: we evaluate all three optimality conditions (see the formulas after this list):
    • Primal residual: maximum error on equality and inequality constraints at the returned solution.
    • Dual residual: maximum error on the dual feasibility condition at the returned solution.
    • Duality gap: value of the duality gap at the returned solution.
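
For reference, writing the QP as minimizing $\frac{1}{2} x^T P x + q^T x$ subject to $G x \leq h$ and $A x = b$, with multipliers $z$ and $y$ on the inequality and equality constraints, these three quantities are commonly defined as follows (this is the standard convention; check the benchmark documentation for its exact definitions):

$$
\begin{aligned}
r_{\mathrm{prim}} &= \max\left( \|(G x - h)_+\|_\infty,\ \|A x - b\|_\infty \right), \\
r_{\mathrm{dual}} &= \|P x + q + G^T z + A^T y\|_\infty, \\
r_{\mathrm{gap}} &= \left| x^T P x + q^T x + h^T z + b^T y \right|.
\end{aligned}
$$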

Shifted geometric mean

Each metric (computation time, primal and dual residuals, duality gap) produces a different ranking of solvers for each problem. To aggregate those rankings into a single figure over the whole test set, we use the shifted geometric mean (shm), a standard way to aggregate computation times in benchmarks for optimization software. This mean has the advantage of being distorted neither by large outliers (as opposed to the arithmetic mean) nor by small outliers (in contrast to the plain geometric mean). Check out the references below for further details.
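
As an illustration, here is a minimal Python sketch of a shifted geometric mean; the shift value of 10 is a common choice in the literature, not necessarily the one used by qpbenchmark:

```python
import numpy as np


def shifted_geometric_mean(values: np.ndarray, shift: float = 10.0) -> float:
    """Shifted geometric mean of a vector of non-negative values.

    Shifting by ``shift`` damps the influence of values close to zero,
    while the geometric mean itself damps the influence of large outliers.
    """
    assert (values >= 0.0).all(), "values should be non-negative"
    return float(np.exp(np.log(values + shift).mean()) - shift)
```

For example, with values [1, 1, 100], the arithmetic mean is 34 while the shifted geometric mean (shift 10) is about 13.7, much less sensitive to the single outlier.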

Here are some intuitive interpretations:

  • A solver with a shifted-geometric-mean runtime of $Y$ is $Y$ times slower than the best solver over the test set.
  • A solver with a shifted-geometric-mean primal residual $R$ is $R$ times less accurate on equality and inequality constraints than the best solver over the test set.

Results

The outcome of running a test set is a standardized report comparing solvers across the different metrics.

You can check out results from a variety of machines, and share the reports produced by running the benchmark on your own machine, in the Results category of the discussions forum of each test set.

Limitations

Check out the issue tracker for known limitations, ongoing work, and future improvements to this benchmark.

Installation

The recommended process is to install the benchmark and all solvers in an isolated environment using conda:

conda env create -f environment.yaml
conda activate qpbenchmark

Alternatively, you can install the benchmarking tool on its own with pip install qpbenchmark. In that case, the benchmark will run on all supported solvers it can import.

Usage

The benchmark works by running qpbenchmark on a Python script describing the test set. For instance:

qpbenchmark my_test_set.py run

The test-set script is followed by a benchmark command, such as "run" here. Optional arguments can restrict the run to a specific solver, problem, or set of solver settings:

qpbenchmark my_test_set.py run --solver proxqp --settings default

Check out qpbenchmark --help for a list of available commands and arguments.
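
For context, the test-set script defines the problems the benchmark runs. Here is a hypothetical sketch only: the class, property, and constructor names below are assumptions for illustration, not the exact qpbenchmark API (see creating a new test set for the real interface):

```python
# Hypothetical sketch: names and signatures are illustrative assumptions.
from typing import Iterator

import numpy as np

import qpbenchmark


class MyTestSet(qpbenchmark.TestSet):
    """Toy test set with a single small dense QP."""

    @property
    def description(self) -> str:
        return "Toy test set with a single problem."

    @property
    def title(self) -> str:
        return "My test set"

    @property
    def sparse_only(self) -> bool:
        return False  # the problem below uses dense matrices

    def __iter__(self) -> Iterator[qpbenchmark.Problem]:
        # min. 1/2 x^T P x + q^T x  s.t.  G x <= h
        P = np.eye(2)
        q = np.array([1.0, -1.0])
        G = np.eye(2)
        h = np.ones(2)
        yield qpbenchmark.Problem(P, q, G, h, None, None, None, None, name="toy_qp")
```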

Plots

The command line ships a plot command to compare solver performance over a test set for a specific metric. For instance, run:

qpbenchmark maros_meszaros_dense.py plot runtime high_accuracy

to generate the corresponding plot:

[Figure: comparison of solver runtimes on the dense Maros-Meszaros test set with the high_accuracy settings]

Contributing

Contributions to improving this benchmark are welcome. You can for instance propose new problems, or share the runtimes you obtain on your machine. Check out the contribution guidelines for details.

Citation

If you use qpbenchmark in your work, please cite all its contributors as follows:

@software{qpbenchmark2024,
  title = {{qpbenchmark: Benchmark for quadratic programming solvers available in Python}},
  author = {Caron, Stéphane and Zaki, Akram and Otta, Pavel and Arnström, Daniel and Carpentier, Justin and Yang, Fengyu and Leziart, Pierre-Alexandre},
  url = {https://github.com/qpsolvers/qpbenchmark},
  license = {Apache-2.0},
  version = {2.2.2},
  year = {2024}
}

See also

References

Other benchmarks


qpbenchmark's Issues

Getting versions generates unnecessary deps

From @bodono:

=> python benchmark.py run maros_meszaros --solver SCS
[2022-11-01 17:00:40,684] [warning] Run ``pip install py-cpuinfo`` for more accurate CPU info (utils.py:34)
[2022-11-01 17:00:41,489] [info] Made 0 QP solver calls (test_set.py:140)
[2022-11-01 17:00:41,490] [info] Test set results written to /usr/local/google/home/bodonoghue/git/qpsolvers_benchmark/results/maros_meszaros.csv (results.py:67)
Traceback (most recent call last):
  File "/usr/local/google/home/bodonoghue/git/qpsolvers_benchmark/benchmark.py", line 114, in <module>
    report.write(os.path.join(results_dir, f"{args.test_set}.md"))
  File "/usr/local/google/home/bodonoghue/git/qpsolvers_benchmark/qpsolvers_benchmark/report.py", line 82, in write
    {self.get_solvers_table()}
  File "/usr/local/google/home/bodonoghue/git/qpsolvers_benchmark/qpsolvers_benchmark/report.py", line 48, in get_solvers_table
    versions = get_solver_versions(self.test_set.solvers)
  File "/usr/local/google/home/bodonoghue/git/qpsolvers_benchmark/qpsolvers_benchmark/utils.py", line 93, in get_solver_version
s
    versions = {
  File "/usr/local/google/home/bodonoghue/git/qpsolvers_benchmark/qpsolvers_benchmark/utils.py", line 94, in <dictcomp>
    solver: metadata.version(package)
  File "/usr/local/google/home/bodonoghue/miniconda3/envs/python39/lib/python3.9/importlib/metadata.py", line 551, in version
    return distribution(distribution_name).version
  File "/usr/local/google/home/bodonoghue/miniconda3/envs/python39/lib/python3.9/importlib/metadata.py", line 524, in distribut
ion
    return Distribution.from_name(distribution_name)
  File "/usr/local/google/home/bodonoghue/miniconda3/envs/python39/lib/python3.9/importlib/metadata.py", line 187, in from_name
    raise PackageNotFoundError(name)
importlib.metadata.PackageNotFoundError: highspy

Plot whiskers

  1. When plotting primal_residual, and similarly for dual_residual, the plots have whiskers (see the end of the orange line on the plot below).
     [Figure: primal-residual plot with a whisker at the end of the orange line]

  2. When typing: python qpsolvers_benchmark\benchmark.py maros_meszaros_dense plot dual_residual high_accuracy --solvers ecos cvxopt scs osqp qpSWIFT
     the qpSWIFT data are missing due to case sensitivity, with no warning. Maybe the plot could be made insensitive to letter case.

Check SCS performance on DUALC1

From bodono/scs-python#63 (comment):

Looking at the instances you posted, I see for instance the first entry is DUALC1, which you have:
| problem | solver | settings | runtime | found | cost_error | primal_error |
|---------|--------|----------|---------|-------|------------|--------------|
| DUALC1 | scs | default | 1.48001 | True | 8.41269e+23 | 2.79628e+11 |

but in the results spreadsheet I have:
| name | solver | status | run_time | iter | obj_val | obj_opt | n | m | N | setup_time | solve_time | scale | scale_updates |
|------|--------|--------|----------|------|---------|---------|---|---|---|------------|------------|-------|---------------|
| DUALC1 | SCS-3.0 | optimal | 0.002122163772583008 | 150 | 6154.912037803983 | 6155.2508 | 9 | 224 | 2025 | 0.330263 | 1.190224 | 0.04653712266416252 | 1 |

And I notice that the true objective value is listed as 6155.2508. My run of SCS finishes in 150 iterations and is close to the optimum with 6154.912037803983, but somehow your run has a cost_error of 8.41269e+23, which seems very high!

(FYI: in my CSV, run_time is in seconds as determined by a separate Python timer, i.e., it doesn't rely on the solver's own timers, while setup_time and solve_time are both in milliseconds and come from the solver's own timer.)

Lowercase solver names automatically

From @ottapav in #52:

When typing: python qpsolvers_benchmark\benchmark.py maros_meszaros_dense plot dual_residual high_accuracy --solvers ecos cvxopt scs osqp qpSWIFT
the qpSWIFT data are missing due to case sensitivity, with no warning. Maybe the plot could be made insensitive to letter case.

Don't count qpsolvers matrix build times in solver times

Currently we time solvers externally, so that the measured time includes both matrix build and solve times. This is done in qpbenchmark/utils.py by, essentially:

from time import perf_counter  # imports added for completeness
import qpsolvers

start_time = perf_counter()
solution = qpsolvers.solve_problem(problem, solver=solver)
runtime = perf_counter() - start_time  # includes matrix build time

Separating build and solve times would allow a fairer comparison. It should be done upstream in qpsolvers: qpsolvers/qpsolvers#255

Check SCS performance on QETAMACR

The conversion of SCS problems may be slightly suboptimal, see #19 (comment)

For instance on QETAMACR:

 - Solving QETAMACR with solver SCS-3.0
 - Optimal objective 86760.37
------------------------------------------------------------------
               SCS v3.2.1 - Splitting Conic Solver
        (c) Brendan O'Donoghue, Stanford University, 2012
------------------------------------------------------------------
problem:  variables n: 688, constraints m: 1089
cones:    z: primal zero / dual free vars: 354
          b: box cone vars: 735
settings: eps_abs: 1.0e-04, eps_rel: 1.0e-04, eps_infeas: 1.0e-18
          alpha: 1.50, scale: 1.00e-01, adaptive_scale: 1
          max_iters: 100000, normalize: 1, rho_x: 1.00e-06
          acceleration_lookback: 10, acceleration_interval: 10
lin-sys:  sparse-direct-amd-qdldl
          nnz(A): 3097, nnz(P): 4447
------------------------------------------------------------------
 iter | pri res | dua res |   gap   |   obj   |  scale  | time (s)
------------------------------------------------------------------
     0| 9.63e+02  7.80e+02  1.31e+04 -1.03e+04  1.00e-01  5.59e-03
   250| 3.88e+00  2.41e+01  1.46e+03  3.75e+04  1.39e+01  3.98e-02
   500| 1.81e+00  1.24e+00  7.27e+01  4.78e+04  1.39e+01  6.88e-02
   750| 4.44e+03  8.37e+03  4.17e+04  4.78e+04  1.39e+01  9.75e-02
  1000| 1.76e+00  9.52e-01  5.90e+01  5.07e+04  1.39e+01  1.26e-01
  1250| 2.05e+00  1.25e+00  6.91e+01  5.19e+04  1.39e+01  1.55e-01
  1500| 2.38e+00  1.46e+00  8.32e+01  5.32e+04  1.39e+01  1.85e-01
  1750| 2.78e+00  1.62e+00  9.37e+01  5.47e+04  1.39e+01  2.14e-01
  2000| 3.19e+00  1.83e+00  1.07e+02  5.64e+04  1.39e+01  2.43e-01
  2250| 3.28e+00  2.08e+00  1.12e+02  5.82e+04  1.39e+01  2.71e-01
  2500| 3.57e+00  2.20e+00  1.21e+02  6.02e+04  1.39e+01  3.00e-01
  2750| 4.01e+00  2.46e+00  1.32e+02  6.25e+04  1.39e+01  3.29e-01
  3000| 4.56e+00  2.79e+00  1.48e+02  6.53e+04  1.39e+01  3.58e-01
  3250| 5.17e+00  2.97e+00  1.64e+02  6.88e+04  1.39e+01  3.88e-01
  3500| 8.03e+03  1.09e+05  7.06e+04 -5.43e+03  1.39e+01  4.12e-01
  3750| 3.65e+00  3.11e+00  1.01e+02  7.74e+04  1.39e+01  4.51e-01
  4000| 3.02e+00  1.65e+00  7.21e+01  7.92e+04  1.39e+01  4.80e-01
  4250| 2.95e+00  1.48e+00  6.84e+01  8.08e+04  1.39e+01  5.08e-01
  4500| 3.02e+00  1.47e+00  6.86e+01  8.25e+04  1.39e+01  5.38e-01
  4750| 6.91e-01  1.08e+00  1.32e+01  8.37e+04  1.39e+01  5.67e-01
  5000| 3.15e-01  5.24e-01  7.56e+00  8.39e+04  1.39e+01  5.96e-01
  5250| 3.67e-01  3.13e-01  8.63e+00  8.40e+04  1.39e+01  6.24e-01
  5500| 3.81e-01  2.40e-01  8.38e+00  8.41e+04  1.39e+01  6.53e-01
  5750| 3.80e-01  1.99e-01  8.50e+00  8.42e+04  1.39e+01  6.82e-01
  6000| 3.73e-01  2.01e-01  8.38e+00  8.43e+04  1.39e+01  7.12e-01
  6250| 4.66e+03  6.40e+04  5.01e+03  9.08e+04  1.39e+01  7.42e-01
  6500| 3.15e-01  1.95e-01  7.58e+00  8.46e+04  1.39e+01  7.72e-01
  6750| 3.19e-01  1.89e-01  7.43e+00  8.47e+04  1.39e+01  8.00e-01
  7000| 3.30e-01  1.94e-01  7.62e+00  8.48e+04  1.39e+01  8.29e-01
  7250| 3.41e-01  1.99e-01  7.82e+00  8.49e+04  1.39e+01  8.60e-01
  7500| 3.25e-01  2.19e-01  7.48e+00  8.50e+04  1.39e+01  8.90e-01
  7750| 3.18e-01  2.05e-01  7.33e+00  8.51e+04  1.39e+01  9.18e-01
  8000| 3.12e-01  2.06e-01  7.18e+00  8.51e+04  1.39e+01  9.50e-01
  8250| 3.05e-01  1.98e-01  7.05e+00  8.52e+04  1.39e+01  9.78e-01
  8500| 3.02e-01  1.93e-01  6.99e+00  8.53e+04  1.39e+01  1.01e+00
  8750| 3.02e-01  3.85e-01  6.97e+00  8.54e+04  1.39e+01  1.04e+00
  9000| 5.96e+02  8.19e+03  1.18e+03  8.79e+04  1.39e+01  1.06e+00
  9250| 2.57e-01  1.97e-01  5.87e+00  8.56e+04  1.39e+01  1.09e+00
  9500| 2.37e-01  1.72e-01  5.53e+00  8.56e+04  1.39e+01  1.12e+00
  9750| 2.21e-01  1.60e-01  5.25e+00  8.57e+04  1.39e+01  1.15e+00
 10000| 2.08e-01  1.53e-01  5.01e+00  8.58e+04  1.39e+01  1.18e+00
 10250| 1.94e-01  1.42e-01  4.72e+00  8.58e+04  1.39e+01  1.21e+00
 10500| 1.80e-01  1.38e-01  4.33e+00  8.59e+04  1.39e+01  1.24e+00
 10750| 1.71e-01  3.28e-01  4.34e+00  8.59e+04  1.39e+01  1.26e+00
 11000| 1.45e-01  1.26e-01  3.62e+00  8.60e+04  1.39e+01  1.29e+00
 11250| 1.27e-01  1.09e-01  3.35e+00  8.60e+04  1.39e+01  1.32e+00
 11500| 1.13e-01  1.02e-01  3.07e+00  8.60e+04  1.39e+01  1.35e+00
 11750| 8.21e+02  3.67e+04  1.51e+03  9.01e+04  1.39e+01  1.38e+00
 12000| 8.79e-02  1.41e-01  2.65e+00  8.61e+04  1.39e+01  1.41e+00
 12225| 7.25e-02  8.49e-02  2.35e+00  8.61e+04  1.39e+01  1.43e+00
------------------------------------------------------------------
status:  solved
timings: total: 1.43e+00s = setup: 5.27e-03s + solve: 1.43e+00s
         lin-sys: 1.30e+00s, cones: 6.17e-02s, accel: 1.50e-02s
------------------------------------------------------------------
objective = 86139.626049
------------------------------------------------------------------

TypeError: '<' not supported between instances of 'NoneType' and 'float'

Summary

This can happen when running the benchmark with calls to a new solver:

Traceback (most recent call last):
  File "/.../.micromamba/envs/qpbenchmark/bin/qpbenchmark", line 8, in <module>
    sys.exit(main())
  File "/.../.micromamba/envs/qpbenchmark/lib/python3.10/site-packages/qpbenchmark/benchmark.py", line 309, in main
    report(args, results)
  File "/.../.micromamba/envs/qpbenchmark/lib/python3.10/site-packages/qpbenchmark/benchmark.py", line 252, in report
    report.write(md_path)
  File "/.../.micromamba/envs/qpbenchmark/lib/python3.10/site-packages/qpbenchmark/report.py", line 260, in write
    self.__compute_dataframes()
  File "/.../.micromamba/envs/qpbenchmark/lib/python3.10/site-packages/qpbenchmark/report.py", line 241, in __compute_dataframes
    self.__cost_df = self.results.build_shgeom_df(
  File "/.../.micromamba/envs/qpbenchmark/lib/python3.10/site-packages/qpbenchmark/results.py", line 344, in build_shgeom_df
    {
  File "/.../.micromamba/envs/qpbenchmark/lib/python3.10/site-packages/qpbenchmark/results.py", line 345, in <dictcomp>
    settings: self.get_shgeom_for_metric_and_settings(
  File "/.../.micromamba/envs/qpbenchmark/lib/python3.10/site-packages/qpbenchmark/results.py", line 318, in get_shgeom_for_metric_and_settings
    means[solver] = shgeom(column_values, shift)
  File "/.../.micromamba/envs/qpbenchmark/lib/python3.10/site-packages/qpbenchmark/shgeom.py", line 47, in shgeom
    if (v < 0.0).any():
TypeError: '<' not supported between instances of 'NoneType' and 'float'

Reproduction steps

qpbenchmark maros_meszaros_dense_posdef.py run

Technical details

  • Operating system: Ubuntu 22.04
  • Python version: 3.10
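
The root cause is that a metric value can be None when the new solver does not return a solution, and the shifted-geometric-mean code then compares None to a float. Here is a minimal sketch of a possible workaround (an assumption for illustration, not the project's actual fix): map missing values to NaN before aggregating.

```python
import numpy as np


def sanitize_metric_values(column_values: list) -> np.ndarray:
    """Map missing (None) metric values to NaN before aggregation."""
    return np.array(
        [value if value is not None else np.nan for value in column_values],
        dtype=float,
    )
```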

Need to flip the sign of dual solution from Gurobi solver

Hi,

Thanks for the cool implementation!

The dual variables output by Gurobi come with the opposite sign. You need to flip their sign before using them to compute the dual residual, duality gap, etc.

I implemented the dual residual and duality gap myself. In my experiments, I can replicate the results from your code by not flipping the sign, i.e., by plugging the negative dual variables into the dual residual and gap computations.
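
To make the reported check concrete, here is a sketch of a dual-residual computation for a QP minimizing $\frac{1}{2} x^T P x + q^T x$ subject to $G x \leq h$, with inequality multipliers $z$; duals returned with the opposite sign convention (as reported for Gurobi above) must be negated before this check:

```python
import numpy as np


def dual_residual(P, q, G, x, z):
    """Infinity norm of the stationarity residual P x + q + G^T z.

    If the solver returns duals with the opposite sign convention,
    flip the sign (pass -z) before calling this function.
    """
    return np.linalg.norm(P @ x + q + G.T @ z, ord=np.inf)
```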

Report a measure of variance on test set results

Any measure of the central value of the data is misleading when variance is large. We should provide low and high estimates (min and max, or the 5th and 95th percentiles) along with the shifted geometric means, as sketched below.
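
For instance, a minimal sketch reporting the 5th and 95th percentiles of a metric's values using NumPy:

```python
import numpy as np


def low_high_estimates(values: np.ndarray) -> tuple:
    """Return (5th percentile, 95th percentile) of a metric's values."""
    low, high = np.percentile(values, [5.0, 95.0])
    return (low, high)
```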

CPU thermal throttling

Is your proposal related to a problem?

From #87:

Unless very efficiently cooled, a modern CPU cannot run at maximum speed for long. This can considerably skew timings.

Describe the solution you'd like

Make sure the benchmark solves all problems with consistent performance, monitoring and reacting to thermal throttling.

Describe alternatives you've considered

  • Monitoring the throttling state between solver calls (pro: single-threaded; con: throttling may still happen during a solver call); see the sketch after this list.
  • Downscaling to e.g. 50% of total CPU performance to lower the probability that throttling happens.
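
As a sketch of the first alternative, assuming psutil is available, the benchmark could check the CPU frequency between solver calls and wait while it sits below a fraction of the advertised maximum (the 0.9 tolerance below is an arbitrary illustration value):

```python
import time

import psutil


def wait_for_cpu(tolerance: float = 0.9, cooldown: float = 1.0) -> None:
    """Sleep between solver calls while the CPU appears to throttle."""
    freq = psutil.cpu_freq()  # namedtuple with current/min/max in MHz
    while freq is not None and freq.max > 0 and freq.current < tolerance * freq.max:
        time.sleep(cooldown)
        freq = psutil.cpu_freq()
```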

Additional context

See #87.

Choice of default settings

As reported here, it seems that the default settings use the default parameters of each solver.
This is not a fair comparison, as not all solvers reach the same precision with their default parameters.

I would instead encourage using two different denominations provided by the benchmark: low accuracy and high accuracy.
@stephane-caron What do you think?

Cold start only

Is your proposal related to a problem?

No.

Describe the solution you'd like

The benchmark only performs cold-start calls. It would be nice to cater to warm and hot starts as well.

Additional context

Warm and hot starting are commonly used in model predictive control.

Custom plots from a script

Is your proposal related to a problem?

Arose from this discussion: #65

Describe the solution you'd like

Use qpsolvers_benchmark as a library rather than a command-line script so that users can more easily create their own custom plots.

Describe alternatives you've considered

Expanding the main script's CLI. Down that road, we would end up with the whole matplotlib API surfacing in the CLI, which is not code we want to maintain in the benchmark. Enabling users to process benchmark results more conveniently would be preferable.
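
As an illustration of such library-style usage, here is a hypothetical sketch that reads the CSV file written by a benchmark run and plots runtimes; the column names ("solver", "runtime") are assumptions about the results schema:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Load the results CSV written by a benchmark run.
df = pd.read_csv("results/maros_meszaros.csv")

# Hypothetical schema: one row per (problem, solver) pair.
df.boxplot(column="runtime", by="solver", rot=45)
plt.yscale("log")  # runtimes span several orders of magnitude
plt.show()
```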
