rust-cv / levenberg-marquardt Goto Github PK

Provides abstractions to run Levenberg-Marquardt optimization

License: MIT License

Rust 91.02% HTML 1.34% Python 7.64%

levenberg-marquardt's Issues

Feature request: Returning covariance matrix and parameter error

Thanks for developing a very nice crate!
Can a feature be added to return a Covariance matrix of parameters (covar) and 1σ uncertainties on params(perror)?
I am working in the scientific field, which requires an error bound to be calculated for the fits, and this feature will help the field so much.

The original MINPACK returns a result of final QR decomposition, which enables the calculation of the covariance matrix.
The pwkit(https://github.com/pkgw/pwkit/blob/master/pwkit/lmmin.py) provides perror and covar as a result.
The lmfit(https://jugit.fz-juelich.de/mlz/lmfit/-/blob/main/lib/lmfit.hpp) also return the parameters as parerr and covar.

Return statistics

It would be helpful to return some statistics about how many iterations were required and what the termination reason was to assist in the parameterization process. This can be a separate function which includes both the improved model and the statistics.

Suggestions for improving the API

Hi,

I have some suggestions to improve the API.

Motivation

The residuals and jacobians closures are likely to share some computational steps. Currently it is hard to do those common computations once for both. They are performed redundantly.
Because jacobians returns an iterator without a lifetime the input given as a reference must be most likely copied (and then moved).
The many arguments to optimize make it hard reading the code.
Failing in residuals or jacobians is not possible.
Additional to #3, the return value should also provide an easy way to decide if the optimization failed.

Suggestion

Merge Config and optimize into one struct LevenbergMarquardt. Simplifies usage: one argument and import less.
Instead of handing in closures we hand in a trait object, for example OptimizationTarget. It has methods for
- Setting the current guess (can perform shared computations)
- Normalize
- Getting the residuals with option return type.
- Getting jacobians, with a lifetime bound:
```
fn jacobians<'a>(&'a self, params: &'a VectorN<..>) -> Box<dyn Iterator<Item=Option<Matrix<...>>> + 'a>;
```
The optimize method is changed to return a Result with an error trait.
The current API could be preserved in a new function, but I would suggest not to.

Shortcomings

The jacobians trait method can not easily return an iterator because of current limitations on impl Trait in rust. Alternatively, but also not perfect, we could give it an index and let it return only one jacobian at a time.
To use the api you would need to implement a trait and potentially define a struct. On the other hand, this could allow to use the same optimization problem for different algorithms in a consistent way.

I am willing to create a pull request for this.

Remarks regarding optimize interface

Hi, I just read the LM doc and function interface you put in the src/lib.rs file.

pub fn optimize<N: Scalar, R: Dim, C: Dim, VS: Storage<N, R, C>, MS: Storage<N, R>>(
    max_iterations: usize,
    initial_lambda: N,
    lambda_scale: N,
    threshold: N,
    init: Vector<N, R, VS>,
    jacobian_and_residuals: impl Fn(Vector<N, R, VS>) -> (Matrix<N, R, C, MS>, Matrix<N, R, C, MS>),
) -> Vector<N, R, VS> { ... }

The set of parameters chosen for the interface looks great for simple usage. I'll give some feedback however regarding advanced control I have already needed in the past. Let me know what you think of those. Maybe there can be two levels of abstraction, a "simple to use" one, and an "advance" one.

Nalgebra types

The types here are tied to nalgebra. This might be fine, but also might not be needed.

Regarding updates of the LM parameter lambda

I have already needed faster backoff of the lambda parameter. For example cases where convergence would divide by 10 but divergence would multiply by 50. Also cases where divergence would reset lambda to a given (startup for example) value.

Regarding convergence criteria

Depending on your modelization, there are multiple things that can make sense to check for convergence, including (but not exhaustive).

max number of iterations as suggested.
mse of residuals as suggested. If your model includes noise for example, a given threshold under which things are interpreted as noise makes sense.
variation of residual mse. That is also useful to increase lambda.
model variation, how much your model changes.
relative model variation. Did it change faster than before, slower?

The last 3 for example are not controllable with the current interface.

Regarding Hessian and Jacobian full computation

Let's note p the number of parameters and N the number of observations. The Jacobian is then of size (Nxp), Hessian is (pxp) and residuals "r" is (Nx1). If there are a lot of observations, i.e. N >> p this might induce high memory usage, and reduced precision depending on the ordering of the residuals. The steepest descent J^T * r and the hessian H = J^T * J can actually be computed incrementally, summing one jacobian observation at a time, which reduces memory usage. This summation can also be done in a smart way with accumulated values, for example every 100 observations accumulated independently and then those accumulated values are summed together.

So asking to provide the full Jacobian matrix prevents this kind of memory optimization.

Regarding non necessary Jacobian evaluation

Once the residuals are computed, it is usually enough to evaluate the current model. In case it is better than the previous one, we will eventually need the Jacobian. But in case it is not, actually computing the Jacobian may be a waste of time because that model will not be used further (we probably back track to use the previous model).

Add a message explaining incorrect Jacobian shape error

In #10, an issue was encountered with a bad error that doesn't describe in what way a Jacobian's dimensions are incorrect, and instead just returns that there was some issue with the Jacobian shape. We should detect precisely which dimension is mismatching and provide that information to the user for better diagnostics.

Convergence criteria

@mpizenberg mentioned some ways (in #1) that we can terminate early/check for convergence that are not currently supported, but should be:

variation of residual mse. That is also useful to increase lambda.
model variation, how much your model changes.
relative model variation. Did it change faster than before, slower?

Configuration questions

Hi there,

First of all, thanks for implementing this algorithm. I look forward to being able to use it correctly.

In my application, I'm currently using a Gauss-Newton (sometimes called Newton-Raphson) algorithm to solve for a single-shooting problem. A good description of the problem is provided at the start of this video: https://www.youtube.com/watch?v=4qFlaqCsnQA . The mathematical specifications of my implementation is available here: https://nyxspace.com/MathSpec/optimization/targeting/ .

I'm trying to switch from Gauss-Newton to Levenberg, using your library. Although in theory Gauss-Newton is not great with a bad initial guess, in my application it works decently well. Levenberg Marquardt is theoretically more resilient to bad initial guesses. However, your LM library algorithm is always taking tiny steps to guess what the correct solution (\vec{x}) could be. So, I suspect I'm incorrectly configuring it.

It converges on the correct solution in 8 evaluations if I provide it with the exact solution of the Gauss Newton algorithm. Otherwise, it'll try incredibly small changes in the solutions and assume that it has minimized the function... but the residuals are just as large as when it started the problem (and nowhere near zero).

I've provided a detailed example below, but my question straightforward:
Is there a way to tell the LM structure to only converge when the residuals are zero and ignore ftol and xtol entirely?

Thanks

Example

Gauss Newton solution

The correct solution that minimizes the problem is the following Vector3<f64>:

  ┌                     ┐
  │  0.6887498158650465 │
  │ -1.9563017351074758 │
  │  -0.560405774185829 │
  └                     ┘

Full log of the solution:

 INFO  nyx_space::md::opti::raphson_finite_diff > Targeter -- CONVERGED in 12 iterations
 INFO  nyx_space::md::opti::raphson_finite_diff >       SMA: achieved = 8100.00000       desired = 8100.00000    scaled error =    0.00000
 INFO  nyx_space::md::opti::raphson_finite_diff >       Eccentricity: achieved =    0.40000      desired =    0.40000    scaled error =   -0.00000
 INFO  nyx_space::md::opti::raphson_finite_diff >       Inclination: achieved =   28.52611       desired =   28.52611    scaled error =    0.00000
FD: Targeter solution correcting ["VelocityX", "VelocityY", "VelocityZ"] (converged in 0.032 seconds, 12 iterations):
        Correction @ 2020-01-01T00:00:00 UTC:
                VelocityX = 0.689 m/s
                VelocityY = -1.956 m/s
                VelocityZ = -0.560 m/s
                |Δv| = 2148.383 m/s
        Achieved @ 2020-01-01T00:59:20.540817260 UTC:
                SMA = 8100.000 (wanted 8100.000 ± 1.0e-3)
                Eccentricity = 0.400 (wanted 0.400 ± 1.0e-5)
                Inclination = 28.526 (wanted 28.526 ± 1.0e-1)
        Corrected state:
                total mass = 100.000000kg  @  [Earth J2000] 2020-01-01T00:00:00 UTC     position = [3835.382907, -7756.921938, -4156.921938] km velocity = [5.345645, 1.118432, -2.001254] km/s
                total mass = 100.000000kg  @  [Earth J2000] 2020-01-01T00:00:00 UTC     sma = 8100.000000 km    ecc = 0.400000  inc = 28.526111 deg     raan = 54.206044 deg    aop = 108.326570 deg    ta = 136.729436 deg
        Achieved state:
                total mass = 100.000000kg  @  [Earth J2000] 2020-01-01T00:59:20.540817260 UTC   position = [7266.068236, 5249.287121, -1534.719097] km  velocity = [-4.534582, 3.021172, 2.959670] km/s
                total mass = 100.000000kg  @  [Earth J2000] 2020-01-01T00:59:20.540817260 UTC   sma = 8100.000000 km    ecc = 0.400000  inc = 28.526111 deg     raan = 54.206044 deg    aop = 108.326570 deg    ta = 230.979694 deg

Initial params at zero

If the initial params is zero, then it'll "converge" claiming that the ftol solution: MinimizationReport { termination: Converged { ftol: true, xtol: false }, number_of_evaluations: 26, objective_function: 5001.106174351325 } .

First attempted control (the "achieved, desired, error" message shows the actual targets of the problem):

 INFO  nyx_space::md::opti::minimize_lm         >       SMA: achieved = 7903.45074       desired = 8100.00000    scaled error =  196.54926
 INFO  nyx_space::md::opti::minimize_lm         >       Eccentricity: achieved =    0.21490      desired =    0.40000    scaled error =    0.18510
 INFO  nyx_space::md::opti::minimize_lm         >       Inclination: achieved =   30.44637       desired =   28.52611    scaled error =   -1.92026
[src/md/opti/minimize_lm.rs:634] converged = false
ctrl = 
  ┌                       ┐
  │ -0.007939256315236743 │
  │  -0.01202376661064193 │
  │  0.025652932101011633 │
  └                       ┘

And 22nd attempt: this shows that we're back at the initial error in residuals.

 INFO  nyx_space::md::opti::minimize_lm         >       SMA: achieved = 8000.00000       desired = 8100.00000    scaled error =  100.00000
 INFO  nyx_space::md::opti::minimize_lm         >       Eccentricity: achieved =    0.20000      desired =    0.40000    scaled error =    0.20000
 INFO  nyx_space::md::opti::minimize_lm         >       Inclination: achieved =   30.00000       desired =   28.52611    scaled error =   -1.47389
[src/md/opti/minimize_lm.rs:634] converged = false
ctrl = 
  ┌                                     ┐
  │ -0.00000000000000003901171501256353 │
  │ -0.00000000000000005908364492440542 │
  │   0.0000000000000001260660433615449 │
  └                                     ┘


 INFO  nyx_space::md::opti::minimize_lm         > Correction VelocityX (element 0): -0.00000000000000003901171501256353
 INFO  nyx_space::md::opti::minimize_lm         > Correction VelocityY (element 1): -0.00000000000000005908364492440542
 INFO  nyx_space::md::opti::minimize_lm         > Correction VelocityZ (element 2): 0.0000000000000001260660433615449

Final attempt:

 INFO  nyx_space::md::opti::minimize_lm         >       SMA: achieved = 8000.00000       desired = 8100.00000    scaled error =  100.00000
 INFO  nyx_space::md::opti::minimize_lm         >       Eccentricity: achieved =    0.20000      desired =    0.40000    scaled error =    0.20000
 INFO  nyx_space::md::opti::minimize_lm         >       Inclination: achieved =   30.00000       desired =   28.52611    scaled error =   -1.47389
[src/md/opti/minimize_lm.rs:634] converged = false
MinimizationReport { termination: Converged { ftol: true, xtol: false }, number_of_evaluations: 26, objective_function: 5001.106174351325 }
Result correction: 
  ┌   ┐
  │ 0 │
  │ 0 │
  │ 0 │
  └   ┘

                0 km/s

Initial params set to some random but wrong solution

24 evaluations, no meaningful progress. Initial parameters set to [1.5, 1.5 ,1.5].

First attempt:

 INFO  nyx_space::md::opti::minimize_lm         >       SMA: achieved = -14499.72096     desired = 8100.00000    scaled error = 22599.72096
 INFO  nyx_space::md::opti::minimize_lm         >       Eccentricity: achieved =    1.57039      desired =    0.40000    scaled error =   -1.17039
 INFO  nyx_space::md::opti::minimize_lm         >       Inclination: achieved =   27.34254       desired =   28.52611    scaled error =    1.18357
[src/md/opti/minimize_lm.rs:634] converged = false
ctrl = 
  ┌                    ┐
  │ 2.1167838430947192 │
  │ 1.9808284453966405 │
  │  4.069698058817354 │
  └                    ┘


 INFO  nyx_space::md::opti::minimize_lm         > Correction VelocityX (element 0): 2.1167838430947192
 INFO  nyx_space::md::opti::minimize_lm         > Correction VelocityY (element 1): 1.9808284453966405
 INFO  nyx_space::md::opti::minimize_lm         > Correction VelocityZ (element 2): 4.069698058817354

Some attempt in the middle:

 INFO  nyx_space::md::opti::minimize_lm         >       SMA: achieved = 16469.24599      desired = 8100.00000    scaled error = -8369.24599
 INFO  nyx_space::md::opti::minimize_lm         >       Eccentricity: achieved =    0.44310      desired =    0.40000    scaled error =   -0.04310
 INFO  nyx_space::md::opti::minimize_lm         >       Inclination: achieved =   25.96224       desired =   28.52611    scaled error =    2.56387
[src/md/opti/minimize_lm.rs:634] converged = false
ctrl = 
  ┌                    ┐
  │ 1.5000000000005078 │
  │ 1.5000000000006835 │
  │  1.500000000052811 │
  └                    ┘


 INFO  nyx_space::md::opti::minimize_lm         > Correction VelocityX (element 0): 1.5000000000005078
 INFO  nyx_space::md::opti::minimize_lm         > Correction VelocityY (element 1): 1.5000000000006835
 INFO  nyx_space::md::opti::minimize_lm         > Correction VelocityZ (element 2): 1.500000000052811

Last attempt:

 INFO  nyx_space::md::opti::minimize_lm         >       SMA: achieved = 16469.24599      desired = 8100.00000    scaled error = -8369.24599
 INFO  nyx_space::md::opti::minimize_lm         >       Eccentricity: achieved =    0.44310      desired =    0.40000    scaled error =   -0.04310
 INFO  nyx_space::md::opti::minimize_lm         >       Inclination: achieved =   25.96224       desired =   28.52611    scaled error =    2.56387
[src/md/opti/minimize_lm.rs:634] converged = false
MinimizationReport { termination: Converged { ftol: false, xtol: true }, number_of_evaluations: 24, objective_function: 35022142.535638005 }
Result correction: 
  ┌     ┐
  │ 1.5 │
  │ 1.5 │
  │ 1.5 │
  └     ┘

                2.598076211353316 km/s

Tracking issue: Porting the MINPACK implementation

My initial implementation has progressed. The code can be found on this branch.

State

All test cases, including the line fitting test case, run in 2 seconds on my laptop.

The new implementation has all the termination criteria from MINPACK.

I will open a pull request once everythings done.

Rectangular Jacobian matrices support

Thanks for the highly underrated library, it's very nice to use.

It would be nice to be able to return a non-square Jacobian matrix from the LeastSquaresProblem::jacobian() function. Whenever I try it I get a termination: WrongDimension("jacobian") as part of the report.

At least a mention of this limitation in the documentation would be nice.

Also, for my understanding: is this a mathematical limitation, a missing implementation, or am I doing something wrong?

Support for nalgebra::Dyn storage

Hi,

That crate looked useful to me as I did not want to reinvent the Least Square wheel, but sadly it seems the LeastSquaresProblem trait can only accept constant (compile-time) nalgebra storages. In my app, both equations (therefore Jacobian) and parameters are runtime-dynamic, so I'd need to impl LeastSquaresProblem<f64, Dyn, Dyn> for Problem.

Trying exactly this yields obscure unsatisfied trait errors, so I assume this crate simply doesn't support Dyn dimensions. Is that the case? Would it be difficult to add support?

Thanks!