Comments (3)

fabiodimarco commented on July 2, 2024

Hi Raphael,

Thank you for the tips. I think I can add damping normalization with max(diag(JJT)) as another option in the DampingAlgorithm class, in addition to the Fletcher method already available. I'll try it on the two examples I provided to see if it improves the convergence speed.
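As a rough illustration of the normalization mentioned above, here is a minimal NumPy sketch. The helper name `scaled_damping_matrix`, the choice of `J.T @ J` for the Gauss-Newton matrix, and the exact form of the update are assumptions made for illustration; they are not taken from the library's DampingAlgorithm class.

```python
import numpy as np

def scaled_damping_matrix(J, damping):
    """Hypothetical helper: return JJ + damping * max(diag(JJ)) * I."""
    JJ = J.T @ J                      # Gauss-Newton approximation of the Hessian
    scale = np.max(np.diag(JJ))       # largest diagonal entry sets the damping scale
    return JJ + damping * scale * np.eye(JJ.shape[0])

# Small made-up Jacobian (3 residuals, 2 parameters).
J = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
A = scaled_damping_matrix(J, damping=1e-3)
print(A)
```

Scaling by the largest diagonal entry makes the damping value relative to the magnitude of the problem, so the same damping factor behaves similarly whether the Jacobian entries are large or small.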

I agree on using tf.keras.backend.set_floatx('float64') for curve fitting applications. It can be useful, and I have used it several times in my experiments. In the examples I used float32 because it is the most common format in machine learning applications, where reducing the loss as much as possible is not the main purpose.
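To make the precision difference concrete, here is a small NumPy sketch (NumPy is used only for illustration; the thread itself concerns TensorFlow). It reports the machine epsilon of each format and checks whether a value of 1e-10, the default minimum damping mentioned in this thread, is absorbed when added to 1.0.

```python
import numpy as np

results = {}
for dt in (np.float32, np.float64):
    eps = float(np.finfo(dt).eps)                   # spacing between 1.0 and the next float
    absorbed = bool(dt(1.0) + dt(1e-10) == dt(1.0))  # True if 1e-10 is lost when added to 1.0
    results[np.dtype(dt).name] = (eps, absorbed)

for name, (eps, absorbed) in results.items():
    print(f"{name}: eps={eps:.3e}, 1e-10 absorbed={absorbed}")
```

In float32 the small term disappears entirely, which is one reason a curve-fitting loss can stall well above what float64 would reach.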

The solution you mentioned, increasing the damping when the matrix is singular, is very useful not only for the Cholesky method but also for the QR method. Sometimes I had to increase the damping min_value, which is 1e-10 by default, because I got runtime errors even with the QR method.
I tried to implement it some time ago but was not able to make it work in TensorFlow. I think the problems I had with try/except were due to using @tf.function on every function in the code. The current version uses @tf.function only for the Jacobian computation, since using it everywhere did not improve performance and in some cases made it worse.
I'll have to do some experiments to see if I can get it to work using try/except, similar to the example you provided.
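The retry idea discussed above can be sketched as follows. This is a NumPy stand-in, not the library's train_step: the function name `solve_with_retry`, the multiplication factor, and the retry limit are all hypothetical. The example uses float32 and an exactly singular matrix so that the initial damping of 1e-10 has no effect and the first attempts fail.

```python
import numpy as np

def solve_with_retry(JJ, rhs, damping, factor=10.0, max_tries=10):
    """Hypothetical helper: solve (JJ + damping*I) x = rhs, raising damping on failure."""
    n = JJ.shape[0]
    eye = np.eye(n, dtype=JJ.dtype)
    for _ in range(max_tries):
        try:
            # Raises LinAlgError if the damped matrix is not positive definite.
            L = np.linalg.cholesky(JJ + damping * eye)
        except np.linalg.LinAlgError:
            damping *= factor          # increase the damping and try again
            continue
        # A general solver is used for the two triangular systems, for brevity.
        y = np.linalg.solve(L, rhs)
        x = np.linalg.solve(L.T, y)
        return x, damping
    raise RuntimeError("decomposition failed for every damping value tried")

JJ = np.array([[1.0, 1.0], [1.0, 1.0]], dtype=np.float32)  # exactly singular
rhs = np.array([1.0, 1.0], dtype=np.float32)
x, used = solve_with_retry(JJ, rhs, damping=1e-10)
print("damping actually used:", used)
print("solution:", x)
```

The returned damping value can be fed back into the damping schedule so that the next iteration starts from a value already known to give a solvable system.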

Best,
Fabio

from tf-levenberg-marquardt.

raphaelvalentin commented on July 2, 2024

Hi Fabio,

Thanks a lot!

Here are some documents that interest me a lot and that depart from the common applications of ML:
[1] zhang2017.pdf
[2] Physics-Inspired_Neural_Networks_Pi-NN_for_Effcien.pdf
[3] https://curve.carleton.ca/system/files/etd/3f17140b-feb5-4b8c-bc08-0d7217977145/etd_pdf/c3a0a1172423ed7032a74896b52b411c/sadrossadat-sensitivityanalysisbasedadjointneuralnetwork.pdf
[4] Modeling_of_VLSI_MOSFET_Characteristics_Using_Neur.pdf
[5] NeuroFET_Webcast_Final.pdf

An LM-class optimizer can offer clear opportunities in terms of accuracy and speed. Based on my experiments on a (2-6-6-1) network, your implementation gives very interesting results, as you can see.

Figure_2

Data are from: https://github.com/google/skywater-pdk; https://github.com/google/skywater-pdk-libs-sky130_fd_pr/

Sincerely,
Raphael.

fabiodimarco commented on July 2, 2024

Hi Raphael,

I updated the code to add the changes we discussed.

  1. The DampingAlgorithm now has an adaptive_scaling option (True/False).
  2. The train_step now handles exceptions in case the decomposition / solve fails.

The only "problem" I encountered is that in TensorFlow 2.4.0 the Cholesky decomposition throws an exception for singular matrices (as expected), whereas in TensorFlow 2.9.0 it does not: the output matrix is filled with NaN and this error message is printed:
Cholesky decomposition was not successful. Eigen::LLT failed with error code 1. Filling lower-triangular output with NaNs.

However, the code currently handles both cases correctly, although I believe this can be considered a bug in the newer version of TensorFlow.
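One way to treat both failure modes described above uniformly is to count a decomposition as failed either when it raises or when its output contains non-finite values. The sketch below uses NumPy stand-ins: `try_decomposition`, `raising_cholesky`, and `nan_filling_cholesky` are hypothetical names that only imitate the two reported behaviours and are not TensorFlow or library code.

```python
import numpy as np

def try_decomposition(decompose, matrix):
    """Hypothetical helper: return (ok, result); ok is False on error or NaN/inf output."""
    try:
        result = decompose(matrix)
    except Exception:
        return False, None             # failure signalled by an exception
    if not np.all(np.isfinite(result)):
        return False, None             # failure signalled by NaN-filled output
    return True, result

def raising_cholesky(a):
    # Imitates the behaviour reported for 2.4.0: an exception on failure.
    return np.linalg.cholesky(a)

def nan_filling_cholesky(a):
    # Imitates the behaviour reported for 2.9.0: NaN output instead of an exception.
    try:
        return np.linalg.cholesky(a)
    except np.linalg.LinAlgError:
        return np.full_like(a, np.nan)

not_pd = np.array([[1.0, 2.0], [2.0, 1.0]])   # symmetric but not positive definite
ok_a, _ = try_decomposition(raising_cholesky, not_pd)
ok_b, _ = try_decomposition(nan_filling_cholesky, not_pd)
ok_c, L = try_decomposition(raising_cholesky, np.eye(2))
print(ok_a, ok_b, ok_c)
```

With a check like this, the caller only needs to look at the boolean flag before deciding whether to increase the damping and retry.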

If you have time let me know if it works as you expected.

Best,
Fabio
