Giter Site home page Giter Site logo

ag2435 / randomly-pivoted-cholesky Goto Github PK

View Code? Open in Web Editor NEW

This project forked from eepperly/randomly-pivoted-cholesky

0.0 0.0 0.0 492 KB

Code for the paper "Randomly pivoted Cholesky: Practical approximation of a kernel matrix with few entry evaluations"

License: MIT License

Shell 0.25% Python 6.80% MATLAB 0.92% Jupyter Notebook 92.03%

randomly-pivoted-cholesky's Introduction

Randomly Pivoted Cholesky

This contains code to reproduce the numerical experiments for Randomly pivoted Cholesky: Practical approximation of a kernel matrix with few entry evaluations by Yifan Chen, Ethan N. Epperly, Joel A. Tropp, and Robert J. Webber.

Algorithm

Randomly pivoted Cholesky (RPCholesky) computes a low-rank approximation to a positive semidefinite matrix $\boldsymbol{A}$. We anticipate the algorithm is most useful for kernel and Gaussian process methods, where the matrix $\boldsymbol{A}$ is defined only implicitly by a kernel function which must be evaluated at great expense to read each entry of $\boldsymbol{A}$. The result of the algorithm is a rank $k$ (column) Nyström approximation $\boldsymbol{\hat{A}} = \boldsymbol{F}\boldsymbol{F}^*$ to the matrix $\boldsymbol{A}$, computed in $\mathcal{O}(k^2N)$ operations and only $(k+1)N$ entry evaluations of the matrix $\boldsymbol{A}$. In our experience, RPCholesky consistently provides approximations of comparable accuracy to other methods with fewer entry evaluations.

Using RPCholesky

While the main purpose of these scripts is for scientific reproducibility, they also may be useful for using RPCholesky in an application. RPCholesky is implemented in the rpcholesky method in rpcholesky.py, and can be called as

nystrom_approximation = rpcholesky(A, num_pivots)

The input matrix A should be an AbstractPSDMatrix object, defined in matrix.py. A psd matrix stored as a numpy array can be made usable by rpcholesky by wrapping it in a PSDMatrix object:

nystrom_approximation = rpcholesky(PSDMatrix(ordinary_numpy_array), num_pivots)

The output of rpcholesky is a PSDLowRank object (defined in lra.py). From this object, one can obtain $\boldsymbol{F}$ (defining the Nyström approximation $\boldsymbol{\hat{A}} = \boldsymbol{FF}^*$), the pivot set $\mathsf{S}$, and the rows defining the Nyström approximation $A(\mathsf{S},:)$:

nystrom_approximation = rpcholesky(A, num_pivots)
F      = nystrom_approximation.F                    # Nystrom approximation is F @ F.T
pivots = nystrom_approximation.idx 
rows   = nystrom.rows                               # rows = A[pivots, :]

Running the experiments

The first step to reproducing the experiments from the manuscript is to run the script

./setup.sh

which sets up the file structure, loads RLS and DPP samplers, and downloads the QM9 dataset for the KRR example. The data from the figures in the paper can produced by running the following scripts in the experiments/ folder, each of which has instructions for its individual use at a comment at the top:

  1. experiments/comparison.py: compares the approximation error for different Nyström methods. Used to produce the left displays in Figure 1.
  2. experiments/chosen.py: outputs the pivots chosen by different Nyström methods. Used to produce the right displays in Figure 1.
  3. experiments/entries.py: outputs the entry evaluations for different Nyström methods. Used to produce Figure 2.
  4. experiments/qm9_krr.py: performs kernel ridge regression on the QM9 dataset. Used to produce Figure 3.
  5. experiments/cluster_biomolecule.py: performs spectral clustering on the alanine dipeptide dataset. Used to produce Figure 4.
  6. experiments/timing.py: compares the timing of different Nyström methods.

Once the relevant Python scripts have been run, the figures from the paper can be generated from the relevant MATLAB scripts in experiments/matlab_plotting/.

Figure 4 in the manuscript was completely changed in revision. Figure 4 from earlier versions of the manuscript can be generated using the scripts experiments/cluster_letters.py and experiments/cluster_letters_plot.py.

randomly-pivoted-cholesky's People

Contributors

eepperly avatar ag2435 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.