damping method and matrix solver about tf-levenberg-marquardt HOT 3 CLOSED

fabiodimarco commented on July 2, 2024

damping method and matrix solver

from tf-levenberg-marquardt.

Comments (3)

fabiodimarco commented on July 2, 2024

Hi Raphael,

Thank you for the tips. I think I can add the damping normalization with max(diag(JJT)) as another option to the DampingAlgorithm class, in addition to the fletcher method already available. I'll try it on the two examples I provided to see if it improves the convergence speed.
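To illustrate the idea, here is a minimal NumPy sketch of the two damping variants. It is not the library's code: the helper name apply_damping and its arguments are hypothetical, and JJ is assumed to be the Gauss-Newton matrix built from the Jacobian.

```python
import numpy as np


def apply_damping(JJ, damping_factor, fletcher=False, adaptive_scaling=False):
    """Return JJ plus a damping term (hypothetical helper, not the library API)."""
    diag = np.diag(JJ)
    # Fletcher-style damping scales each parameter by its own diagonal entry;
    # otherwise a plain identity matrix is used.
    base = np.diag(diag) if fletcher else np.eye(JJ.shape[0])
    # Adaptive scaling normalizes the damping by the largest diagonal entry.
    scaler = diag.max() if adaptive_scaling else 1.0
    return JJ + scaler * damping_factor * base


J = np.array([[2.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
JJ = J.T @ J  # equals [[4, 0], [0, 2]]
scaled = apply_damping(JJ, damping_factor=0.5, adaptive_scaling=True)
fletcher = apply_damping(JJ, damping_factor=0.5, fletcher=True)
```

With the max-diagonal normalization, the same damping factor keeps a comparable effect when the overall scale of the Jacobian changes.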

I agree on using tf.keras.backend.set_floatx('float64') for curve-fitting applications: it can be useful, and I have used it several times in my experiments. In the examples I used float32, as it is the most common format in machine learning applications, where reducing the loss as much as possible is not the main purpose.
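A small NumPy sketch of why the precision matters for curve fitting: near convergence the parameter updates become tiny, and in float32 they can be rounded away entirely. The step size of 1e-8 is an illustrative assumption, not a value from the library.

```python
import numpy as np

step = 1e-8  # a small parameter update (illustrative value)

# float32 carries about 7 significant decimal digits, so the update is lost.
lost_in_float32 = np.float32(1.0) + np.float32(step) == np.float32(1.0)

# float64 carries about 16 significant decimal digits, so the update survives.
kept_in_float64 = np.float64(1.0) + np.float64(step) != np.float64(1.0)

print(lost_in_float32, kept_in_float64)
```

In a TensorFlow model, the set_floatx call mentioned above is what switches the default layer precision to float64.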

The solution you mentioned, increasing the damping when the matrix is singular, is very useful not only for the Cholesky method but also for the QR method. Sometimes I had to increase the damping min_value, which is 1e-10 by default, because I got runtime errors even with the QR method.
I tried to implement it some time ago but could not make it work in TensorFlow. I think the problems I had with try/except were due to the fact that I was using @tf.function on every function in the code. In the current version, the code uses @tf.function only for the Jacobian computation (using it everywhere did not improve performance and was even worse in some cases).
I'll have to run some experiments to see if I can get it to work using try/except, similar to the example you provided.
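The retry idea can be sketched in plain NumPy as follows. This is an assumption-laden illustration, not the code from the thread or from the library: the function name solve_with_retry, the growth factor of 10, and the attempt limit are all hypothetical; only the 1e-10 minimum damping comes from the discussion above.

```python
import numpy as np


def solve_with_retry(JJ, rhs, damping=0.0, min_damping=1e-10,
                     factor=10.0, max_attempts=20):
    """Solve (JJ + damping * I) x = rhs, raising the damping on failure."""
    eye = np.eye(JJ.shape[0])
    for attempt in range(1, max_attempts + 1):
        try:
            # Raises LinAlgError when the matrix is not positive definite.
            L = np.linalg.cholesky(JJ + damping * eye)
        except np.linalg.LinAlgError:
            # Increase the damping, never going below the minimum value.
            damping = max(damping * factor, min_damping)
            continue
        # Two solves with the Cholesky factor: L y = rhs, then L^T x = y.
        y = np.linalg.solve(L, rhs)
        x = np.linalg.solve(L.T, y)
        return x, damping, attempt
    raise RuntimeError("decomposition failed for every damping value tried")


# The second parameter never affects the residuals, so J^T J is singular.
J = np.array([[1.0, 0.0], [2.0, 0.0]])
x, damping, attempts = solve_with_retry(J.T @ J, np.array([1.0, 0.0]))
```

Here the first attempt fails on the undamped singular matrix, and the second succeeds once the minimum damping is applied.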

Best,
Fabio

raphaelvalentin commented on July 2, 2024

Hi Fabio,

Thanks a lot!

Here are some documents that interest me a lot and that depart from common applications of ML:
[1] zhang2017.pdf
[2] Physics-Inspired_Neural_Networks_Pi-NN_for_Effcien.pdf
[3] https://curve.carleton.ca/system/files/etd/3f17140b-feb5-4b8c-bc08-0d7217977145/etd_pdf/c3a0a1172423ed7032a74896b52b411c/sadrossadat-sensitivityanalysisbasedadjointneuralnetwork.pdf
[4] Modeling_of_VLSI_MOSFET_Characteristics_Using_Neur.pdf
[5] NeuroFET_Webcast_Final.pdf

LM-class optimizers can offer clear opportunities in terms of accuracy and speed. In my experiments on a (2-6-6-1) network, your implementation gives very interesting results, as you can see.

Data are from: https://github.com/google/skywater-pdk; https://github.com/google/skywater-pdk-libs-sky130_fd_pr/

Sincerely,
Raphael.

fabiodimarco commented on July 2, 2024

Hi Raphael,

I updated the code to add the changes we discussed.

The DampingAlgorithm now has an adaptive_scaling option (True/False).
The train_step now handles exceptions in case the decomposition / solve fails.

The only "problem" I encountered is that in TensorFlow 2.4.0 the Cholesky decomposition throws an exception for singular matrices (as expected), whereas in TensorFlow 2.9.0 it does not: the output matrix is filled with NaN and this error message is printed:
Cholesky decomposition was not successful. Eigen::LLT failed with error code 1. Filling lower-triangular output with NaNs.

However, the code currently handles both cases correctly, although I believe this can be considered a bug in the new version of TensorFlow.
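One way to treat both failure modes uniformly is to check for an exception and for non-finite output in the same place. The sketch below uses NumPy rather than TensorFlow, and the names safe_cholesky and nan_filling_cholesky are hypothetical; the second function is only a stand-in that mimics a backend returning NaNs instead of raising.

```python
import numpy as np


def safe_cholesky(matrix, cholesky_fn=np.linalg.cholesky):
    """Return (ok, factor); ok is False if the backend raises or yields NaNs."""
    try:
        factor = cholesky_fn(matrix)
    except Exception:
        # Backends that signal failure by raising an exception.
        return False, None
    if not np.all(np.isfinite(factor)):
        # Backends that signal failure by filling the output with NaNs.
        return False, None
    return True, factor


def nan_filling_cholesky(matrix):
    """Stand-in backend that never raises and returns NaNs instead."""
    return np.full_like(matrix, np.nan)


singular = np.array([[5.0, 0.0], [0.0, 0.0]])
ok_raise, _ = safe_cholesky(singular)                      # NumPy raises
ok_nan, _ = safe_cholesky(singular, nan_filling_cholesky)  # NaN output
ok_good, factor = safe_cholesky(np.array([[4.0, 0.0], [0.0, 9.0]]))
```

The caller then only needs to test the returned flag, regardless of how the underlying decomposition reports failure.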

If you have time let me know if it works as you expected.

Best,
Fabio
