
Comments (5)

conradlee avatar conradlee commented on May 19, 2024

Hmm, maybe part of the problem is that I didn't use the "track_error" option correctly. Perhaps it should not be passed as a member of a dictionary named "options", but rather as any other keyword option. So instead of running:

model = mf.mf(norm_X, seed="nndsvd", rank=5, max_iter=20, initialize_only=True, objective="div", update="divergence", options={"track_error": True})

I should have run

model = mf.mf(norm_X, seed="nndsvd", rank=5, max_iter=20, initialize_only=True, objective="div", update="divergence", track_error=True)

If that's how I should have done it, then maybe the documentation of mf.mf needs to be updated. Right now the part that I find confusing is:

 :param options: Specify:
                            #. runtime specific options;
                            #. algorithm specific options. For details on algorithm specific options see specific algorithm
                               documentation. 

                    The following are runtime specific options.

                     :param track_factor: When :param:`track_factor` is specified, the fitted factorization model is tracked during multiple
                                        runs of the algorithm. This option is taken into account only when multiple runs are executed 
                                        (:param:`n_run` > 1). From each run of the factorization all matrix factors are retained, which 
                                        can be very space consuming. If space is the problem setting the callback function with :param:`callback` 
                                        is advised which is executed after each run. Tracking is useful for performing some quality or 
                                        performance measures (e.g. cophenetic correlation, consensus matrix, dispersion). By default fitted model
                                        is not tracked.
                     :type track_factor: `bool`
                     :param track_error: Tracking the residuals error. Only the residuals from each iteration of the factorization are retained. 
                                        Error tracking is not space consuming. By default residuals are not tracked and only the final residuals
                                        are saved. It can be used for plotting the trajectory of the residuals.
                     :type track_error: `bool`
    :type options: `dict`

The documentation above indicates that the "options" option is a dictionary, and that the track_error option is a member of that dictionary. Is that right?

In any case, it's still not clear to me how I track the error.


marinkaz avatar marinkaz commented on May 19, 2024

Hi Conrad,

I updated the documentation and added an example of usage with error tracking to the main page of the documentation. Specifically, the updated docs of mf_run are here: http://helikoid.si/mf/mf.mf_run.html and an example of usage is available here: http://helikoid.si/mf/#usage (under Example No. 3).

For clarification, we have three types of parameters, as specified at the link above. General parameters specify the factorization method, rank, initialization method, etc. Runtime specific options specify error tracking and fitted factorization model tracking. Algorithm specific parameters correspond to the chosen factorization or initialization method (such as lambda_w or lambda_h in binary matrix factorization).

You can track the error (that is, the objective function value from each ITERATION is stored) or the fitted factorization model (that is, the matrix factors from each RUN are stored, which can be space consuming, so setting a callback function that is called after each run is suggested instead). You can also track the error when multiple runs are performed. In that case you get the list of errors for run n by calling res.fit.tracker.get_error(run = n), where res is the object returned from the factorization. Of course, by default one run is performed, and getting the list of errors simplifies to res.fit.tracker.get_error().
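To make the per-iteration tracking concrete, here is a generic sketch in plain NumPy of what a track_error-style option records: a multiplicative-update NMF (Lee & Seung, Frobenius norm) that appends the residual error after every iteration. This is an illustration only, not nimfa's implementation; the function and variable names are invented for the example.

```python
import numpy as np

def nmf_with_error_tracking(V, rank=2, max_iter=50, seed=0):
    """Plain multiplicative-update NMF (Frobenius objective) that
    records the residual norm after every iteration, mimicking what
    a track_error-style option stores."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank))
    H = rng.random((rank, m))
    eps = 1e-12  # guard against division by zero
    errors = []
    for _ in range(max_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(V - W @ H, "fro"))
    return W, H, errors

V = np.abs(np.random.default_rng(1).random((6, 5)))
W, H, errors = nmf_with_error_tracking(V, rank=2, max_iter=30)
```

The errors list is then available for plotting the residual trajectory, analogous to calling res.fit.tracker.get_error() in the thread above.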

Please see example of usage in above links. I tested it and it works.

Best,

Marinka


conradlee avatar conradlee commented on May 19, 2024

Thanks a lot for showing me how to track the error. Problem solved!


leokury avatar leokury commented on May 19, 2024

I ran the orl_images example with the lsnmf method and track_error set to True, and printed the errors as in Example No. 3:

model = mf.mf(V,
              seed="random_vcol",
              rank=25,
              method="lsnmf",
              max_iter=10,
              initialize_only=True,
              sub_iter=10,
              inner_sub_iter=10,
              beta=0.1,
              track_error=True)
fit = mf.mf_run(model)

print fit.fit.tracker.get_error()

I got the output below:

Reading ORL faces database ...
... Finished.
Preprocessing data matrix ...
... Finished.
Performing LSNMF factorization ...
[621.082839599566, 581.87786816798427, 96.727728418858277, 94.589352703206984, 188.06227063574102, 1017.4987367265663, 627.79675939229605, 685.04997639446071, 330.73776054656361, 212.81975253328733]
... Finished
Stats:
- iterations: 10
- final projected gradients norm: 212.820
- Euclidean distance: 33731.981

Shouldn't the objective function value be non-increasing?


marinkaz avatar marinkaz commented on May 19, 2024

Hi leokury,

First, the MF documentation of the LSNMF method contains an explanation of the algorithm specific parameters and a reference to the article that explains the implemented LSNMF stopping conditions and algorithm in detail; in particular, see the section Stopping Conditions and Equations 19, 20 and 21 in the article (the reference is at the link above).

Second, here is a short summary of the above and an explanation. The LSNMF algorithm is alternating nonnegative least squares matrix factorization that uses a projected gradient method for each subproblem and is characterized by bound constrained optimization. As the author expresses it (in the paper above), in bound constrained optimization a common way to check whether x_k (the solution in the k-th iteration) is close to a stationary point is to check whether the projected gradient norm of x_k is less than or equal to the initial gradient norm multiplied by an epsilon parameter. In LSNMF, the role of this epsilon, that is, the tolerance of the stopping condition, is played by the min_residuals parameter (see the LSNMF docs; in other algorithms the meaning of min_residuals is different, e.g. the minimal required residuals improvement). As you did not provide this parameter, the default value of 1e-5 is used (again, see the LSNMF docs). The projected gradient norm is used as the objective value and in checking whether the stopping condition is satisfied.

Note that the factorization is performed until at least one stopping condition is met. In your case, the stopping condition that was met was the maximum number of iterations (that is, max_iter=10): after each update of the matrix factors, the projected gradient norm remained greater than the initial gradient norm multiplied by the tolerance, as explained in the paragraph above.

So the results you printed with error tracking are the projected gradient norms of the matrix factors, and these are not necessarily non-increasing, especially after performing so small a number of iterations.
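The projected gradient norm mentioned here can be made concrete. For a problem min f(x) subject to x >= 0, the projected gradient keeps each gradient component where x_i > 0, and at the bound x_i = 0 keeps only negative components (a positive component cannot be followed without violating x >= 0); its norm is zero exactly at a stationary point. The following is a generic sketch of that quantity and of the stopping test described above, with invented function names, not nimfa's internal code.

```python
import numpy as np

def projected_gradient_norm(x, grad):
    """Norm of the projected gradient for min f(x) subject to x >= 0.
    Interior coordinates (x_i > 0) contribute their full gradient
    component; coordinates at the bound (x_i == 0) contribute only a
    negative component."""
    pg = np.where(x > 0, grad, np.minimum(grad, 0.0))
    return np.linalg.norm(pg)

def converged(x, grad, initial_grad_norm, tol=1e-5):
    """Stopping test in the style described above: stop when the
    projected gradient norm falls below tol times the initial
    gradient norm (tol plays the role of min_residuals in LSNMF)."""
    return projected_gradient_norm(x, grad) <= tol * initial_grad_norm

x = np.array([0.0, 2.0, 0.0])
grad = np.array([3.0, 0.5, -1.0])
# x[0]=0 with positive gradient: dropped; x[1]>0: kept;
# x[2]=0 with negative gradient: kept.
pg_norm = projected_gradient_norm(x, grad)  # sqrt(0.5**2 + 1**2)
```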

There are numerous possible stopping conditions, and some are algorithm dependent. Check the documentation and try different settings of the algorithm specific parameters. Note that in some applications and with some algorithms, finishing the factorization after the first iteration in which the objective value increased is not necessarily the best choice, as many issues can arise in local optimization.

The projected gradient norm is used in the stopping condition for LSNMF by the author of the LSNMF method as well (reference at the link above).

Best,

Marinka

