Hmm, maybe part of the problem is that I didn't use the "track_error" option correctly. Perhaps it should not be passed as a member of a dictionary named "options", but just as any other keyword argument. So instead of running:
model = mf.mf(norm_X, seed="nndsvd", rank=5, max_iter=20, initialize_only=True, objective="div", update="divergence", options={"track_error": True})
I should have run
model = mf.mf(norm_X, seed="nndsvd", rank=5, max_iter=20, initialize_only=True, objective="div", update="divergence", track_error=True)
If that's how I should have done it, then maybe the documentation of mf.mf needs to be updated. Right now the part that I find confusing is:
:param options: Specify:
#. runtime specific options;
#. algorithm specific options. For details on algorithm specific options see specific algorithm
documentation.
The following are runtime specific options.
:param track_factor: When :param:`track_factor` is specified, the fitted factorization model is tracked during multiple
runs of the algorithm. This option is taken into account only when multiple runs are executed
(:param:`n_run` > 1). From each run of the factorization all matrix factors are retained, which
can be very space consuming. If space is the problem setting the callback function with :param:`callback`
is advised which is executed after each run. Tracking is useful for performing some quality or
performance measures (e.g. cophenetic correlation, consensus matrix, dispersion). By default fitted model
is not tracked.
:type track_factor: `bool`
:param track_error: Tracking the residuals error. Only the residuals from each iteration of the factorization are retained.
Error tracking is not space consuming. By default residuals are not tracked and only the final residuals
are saved. It can be used for plotting the trajectory of the residuals.
:type track_error: `bool`
:type options: `dict`
The documentation above indicates that the "options" option is a dictionary, and that the track_error option is a member of that dictionary. Is that right?
In any case, it's still not clear to me how I track the error.
Hi Conrad,
I updated the documentation and added an example of usage with error tracking to the main page of the documentation. Specifically, the updated docs of mf_run are here: http://helikoid.si/mf/mf.mf_run.html and an example of usage is available here: http://helikoid.si/mf/#usage (under Example No. 3).
For clarification, we have three types of parameters, as specified in the above link. General parameters specify the factorization method, rank, initialization method, etc. Runtime specific options specify error tracking and fitted factorization model tracking. Algorithm specific parameters are the parameters that correspond to the chosen factorization or initialization method (such as lambda_w or lambda_h in binary matrix factorization).
You can track the error (that is, the objective function value from each ITERATION is stored) or the fitted factorization model (that is, the matrix factors from each RUN are stored, which can be space consuming, so setting a callback function which is called after each run is suggested instead). You can also track the error when multiple runs are performed. In that case you get the list of errors for run n by calling res.fit.tracker.get_error(run=n), where res is the object returned from factorization. Of course, by default one run is performed, and getting the list of errors simplifies to res.fit.tracker.get_error().
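To make this concrete, here is a minimal sketch of error tracking across multiple runs. It assumes mf is the library module imported as in the snippets earlier in this thread; the target matrix, the chosen method and the parameter values are placeholders, and whether runs are indexed from 0 or 1 in get_error(run=...) is an assumption on my part, not something stated in the docs quoted above.

import mf  # assumption: the old library module name, as used throughout this thread
import numpy as np

# Placeholder target matrix; any nonnegative matrix will do.
V = np.random.rand(30, 20)

# General parameters (method, rank, seeding) plus the runtime option
# track_error passed as a plain keyword, and n_run > 1 so per-run errors exist.
model = mf.mf(V,
              seed="random_vcol",
              rank=5,
              method="nmf",
              max_iter=20,
              initialize_only=True,
              n_run=3,
              track_error=True)
res = mf.mf_run(model)

# One list of per-iteration objective values per run
# (run indexing from 0 is an assumption here).
for n in range(3):
    errors = res.fit.tracker.get_error(run=n)
    print("run %d: %d tracked values" % (n, len(errors)))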
Please see the example of usage in the above links. I tested it and it works.
Best,
Marinka
Thanks a lot for showing me how to track the error. Problem solved!
I ran the orl_images example with the lsnmf method and track_error set to True, and printed the errors as in example no. 3:
model = mf.mf(V,
seed = "random_vcol",
rank = 25,
method = "lsnmf",
max_iter = 10,
initialize_only = True,
sub_iter = 10,
inner_sub_iter = 10,
beta = 0.1,
track_error = True)
fit = mf.mf_run(model)
print fit.fit.tracker.get_error()
I got the output below:
Reading ORL faces database ...
... Finished.
Preprocessing data matrix ...
... Finished.
Performing LSNMF factorization ...
[621.082839599566, 581.87786816798427, 96.727728418858277, 94.589352703206984, 188.06227063574102, 1017.4987367265663, 627.79675939229605, 685.04997639446071, 330.73776054656361, 212.81975253328733]
... Finished
Stats:
- iterations: 10
- final projected gradients norm: 212.820
- Euclidean distance: 33731.981
Shouldn't the objective function value be non-increasing?
Hi leokury,
First, the MF documentation of the LSNMF method contains an explanation of the algorithm specific parameters and a reference to the article which explains in detail the implemented LSNMF stopping conditions and algorithm; in particular, see the section Stopping Conditions and Equations 19, 20 and 21 in the article (the reference is at the above link).
Second, here is a small summary of the above and an explanation. The LSNMF algorithm is alternating nonnegative least squares matrix factorization using the projected gradient method for each subproblem, and it is characterized by bound constrained optimization. As expressed by the author (in the above paper), in bound constrained optimization a common condition for checking whether x_k (the solution in the k-th iteration) is close to a stationary point is checking whether the projected gradient norm of x_k is less than or equal to the initial gradient norm multiplied by an epsilon parameter. The role of epsilon, that is the tolerance for the stopping condition, is played in the case of LSNMF by the min_residuals parameter (see the LSNMF docs; in other algorithms the meaning of min_residuals is different, e.g. the minimal required residuals improvement). As you did not provide this parameter, the default value of 1e-5 is used (again, see the LSNMF docs). The projected gradient norm is used as the objective value and in checking whether the stopping condition is satisfied.
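In other words, the test described above has the form ||grad^P f(x_k)|| <= eps * ||grad f(x_1)||. A minimal sketch of that check follows; the gradient computation itself is left abstract, and the function and variable names are my own placeholders, not anything taken from the library.

import numpy as np

def stopping_condition_met(proj_grad_k, init_grad, eps=1e-5):
    # proj_grad_k: projected gradient of the current iterate x_k
    # init_grad:   gradient at the initial point
    # eps:         tolerance (the role played by min_residuals in LSNMF)
    return np.linalg.norm(proj_grad_k) <= eps * np.linalg.norm(init_grad)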
Note that the factorization is performed until at least one stopping condition is met. In your case, the stopping condition that was met was the maximum number of iterations (that is, max_iter=10), but after each update of the matrix factors the projected gradient norm was still greater than the initial gradient norm multiplied by the tolerance, as explained in the above paragraph.
So, the results you printed with error tracking are projected gradient norms of the matrix factors, and these are not necessarily non-increasing, especially after performing such a small number of iterations.
There are numerous possible stopping conditions and some are algorithm dependent. Check the documentation and try different settings of the algorithm specific parameters. Again, in some applications and with some algorithms, finishing the factorization after the first iteration in which the objective value has increased is not necessarily the best solution, as many issues can arise in local optimization.
The usage of the projected gradient norm in the stopping condition for LSNMF follows the author of the LSNMF method as well (reference at the above link).
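Following the suggestion above to try different algorithm specific settings, here is a minimal sketch of passing min_residuals explicitly to the LSNMF call from earlier in the thread. The value 1e-3 and the larger max_iter are arbitrary illustrations rather than recommendations, and the remaining parameters simply mirror the snippet posted above.

model = mf.mf(V,
              seed="random_vcol",
              rank=25,
              method="lsnmf",
              max_iter=100,          # allow more iterations so the tolerance can take effect
              initialize_only=True,
              sub_iter=10,
              inner_sub_iter=10,
              beta=0.1,
              min_residuals=1e-3,    # tolerance for the projected-gradient stopping test (default 1e-5)
              track_error=True)
fit = mf.mf_run(model)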
Best,
Marinka