GETS: a Genomic Tree based Sparse solver. This package accompanies the article "A genomic tree based sparse solver" by Timothy A. Davis and Srinivas Subramanian, to be published in the forthcoming Birkhäuser-ANHA book "Explorations in the Mathematics of Data Science".

License: GNU Lesser General Public License v2.1


GETS

GETS: a Genomic Tree based Sparse solver.

Written by Srinivas Subramanian and Tim Davis.

Refer to the article "A genomic tree based sparse solver" by Timothy A. Davis and Srinivas Subramanian, accepted for publication in the forthcoming Birkhäuser-ANHA book "Explorations in the Mathematics of Data Science" (see GETS_article.pdf in this directory).

In summary, sparse recovery is performed by exploiting the inherent structure of genomic datasets to efficiently solve optimization problems via sparse matrix computations.

GETS performs nonnegative sparse recovery for a genomics problem: reconstructing the concentrations of bacterial species in an environmental sample.

Nonnegative least squares (NNLS) and nonnegative regularization (NNREG) optimization problems are solved with the GETS solver, an efficient implementation of the Lawson-Hanson algorithm.
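For orientation, the following is a minimal dense sketch of the textbook Lawson-Hanson active-set iteration for NNLS (minimize ||Ax - b||_2 subject to x >= 0). It is not the GETS implementation: it ignores the genomic tree and sparse data structures entirely, solves each subproblem through the normal equations for brevity (a QR factorization is the numerically safer choice), and caps the problem size. The names nnls_lawson_hanson, ls_passive, solve_dense, and NNLS_MAXN are hypothetical and do not appear in GETS.

```c
#include <assert.h>

#define NNLS_MAXN 16 /* sketch-only cap on the number of unknowns */

static double absd(double v) { return v < 0.0 ? -v : v; }

/* Solve G z = r (p-by-p) by Gaussian elimination with partial pivoting.
 * z overwrites r. Returns 0 if G is numerically singular. */
static int solve_dense(int p, double G[NNLS_MAXN][NNLS_MAXN], double *r)
{
    for (int c = 0; c < p; c++) {
        int piv = c;
        for (int i = c + 1; i < p; i++)
            if (absd(G[i][c]) > absd(G[piv][c])) piv = i;
        if (absd(G[piv][c]) < 1e-14) return 0;
        for (int j = 0; j < p; j++) {
            double t = G[c][j]; G[c][j] = G[piv][j]; G[piv][j] = t;
        }
        double t = r[c]; r[c] = r[piv]; r[piv] = t;
        for (int i = c + 1; i < p; i++) {
            double f = G[i][c] / G[c][c];
            for (int j = c; j < p; j++) G[i][j] -= f * G[c][j];
            r[i] -= f * r[c];
        }
    }
    for (int i = p - 1; i >= 0; i--) {
        for (int j = i + 1; j < p; j++) r[i] -= G[i][j] * r[j];
        r[i] /= G[i][i];
    }
    return 1;
}

/* Least squares over the passive columns only (normal equations);
 * entries of s outside the passive set are set to zero. */
static int ls_passive(int m, int n, const double *A, const double *b,
                      const int *passive, double *s)
{
    int idx[NNLS_MAXN], p = 0;
    double G[NNLS_MAXN][NNLS_MAXN], r[NNLS_MAXN];
    for (int j = 0; j < n; j++) { s[j] = 0.0; if (passive[j]) idx[p++] = j; }
    for (int a = 0; a < p; a++) {
        r[a] = 0.0;
        for (int i = 0; i < m; i++) r[a] += A[i * n + idx[a]] * b[i];
        for (int c = 0; c < p; c++) {
            G[a][c] = 0.0;
            for (int i = 0; i < m; i++)
                G[a][c] += A[i * n + idx[a]] * A[i * n + idx[c]];
        }
    }
    if (!solve_dense(p, G, r)) return 0;
    for (int a = 0; a < p; a++) s[idx[a]] = r[a];
    return 1;
}

/* Minimize ||A x - b||_2 subject to x >= 0. A is m-by-n, row-major.
 * Returns 1 on convergence, 0 on failure or iteration limit. */
int nnls_lawson_hanson(int m, int n, const double *A, const double *b,
                       double *x)
{
    const double tol = 1e-10;
    int passive[NNLS_MAXN] = {0};
    double w[NNLS_MAXN], s[NNLS_MAXN];
    assert(m >= 1 && n >= 1 && n <= NNLS_MAXN);
    for (int j = 0; j < n; j++) x[j] = 0.0;
    for (int iter = 0; iter <= 3 * n; iter++) {
        /* w = A'(b - A x), the negative gradient of the objective */
        for (int j = 0; j < n; j++) w[j] = 0.0;
        for (int i = 0; i < m; i++) {
            double res = b[i];
            for (int j = 0; j < n; j++) res -= A[i * n + j] * x[j];
            for (int j = 0; j < n; j++) w[j] += A[i * n + j] * res;
        }
        int best = -1; /* active variable with the largest positive w */
        for (int j = 0; j < n; j++)
            if (!passive[j] && w[j] > tol && (best < 0 || w[j] > w[best]))
                best = j;
        if (best < 0) return 1; /* optimality conditions hold */
        passive[best] = 1;
        int feasible = 0;
        for (int inner = 0; inner <= n; inner++) {
            if (!ls_passive(m, n, A, b, passive, s)) return 0;
            double alpha = 2.0; /* step toward s; stays > 1 if s > 0 */
            for (int j = 0; j < n; j++)
                if (passive[j] && s[j] <= 0.0) {
                    double t = x[j] / (x[j] - s[j]);
                    if (t < alpha) alpha = t;
                }
            if (alpha > 1.0) { feasible = 1; break; }
            for (int j = 0; j < n; j++) {
                x[j] += alpha * (s[j] - x[j]);
                if (passive[j] && x[j] <= tol) { x[j] = 0.0; passive[j] = 0; }
            }
        }
        if (!feasible) return 0;
        for (int j = 0; j < n; j++) x[j] = s[j];
    }
    return 0;
}
```

Calling nnls_lawson_hanson(2, 2, A, b, x) with the row-major matrix A = {4, 1, 0, 1} and b = {0.5, 1} exercises both loops: the first variable enters the passive set, is driven to zero when the second one enters, and the routine finishes with x = (0, 0.75).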

Given a genomic dataset (a k-mer matrix of DNA sequences), GETS exploits the inherent structure of the genomic problem to uncover an evolutionary, family-tree-like relationship between the species. This genomic tree enables a sparse representation of the problem, which is created in the offline stage of GETS as a one-time computation. The sparse representation reduces storage and gives asymptotic speedups via sparse matrix computations every time the solver is used.
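As a rough illustration of what a k-mer matrix holds, the sketch below counts the k-mers of a single DNA sequence. It assumes that each column of the k-mer matrix is the k-mer count (or frequency) vector of one species; the exact layout and normalization used by GETS are described in the article, not here. The function names kmer_counts and base_code are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map a nucleotide to 0..3, or -1 for any other character (e.g. 'N'). */
static int base_code(char c)
{
    switch (c) {
        case 'A': case 'a': return 0;
        case 'C': case 'c': return 1;
        case 'G': case 'g': return 2;
        case 'T': case 't': return 3;
        default: return -1;
    }
}

/* Add the count of every k-mer in seq to counts[0 .. 4^k - 1]; the caller
 * zero-fills counts first. A k-mer is indexed by reading its bases as
 * base-4 digits. Windows containing a non-ACGT character are skipped.
 * Returns the number of k-mers counted. */
size_t kmer_counts(const char *seq, int k, unsigned *counts)
{
    assert(k >= 1 && k <= 15); /* keeps 4^k within a 32-bit unsigned */
    size_t n = strlen(seq), total = 0;
    unsigned mask = (1u << (2 * k)) - 1u; /* keeps only the last k bases */
    unsigned code = 0;
    int valid = 0; /* consecutive valid bases seen so far */
    for (size_t i = 0; i < n; i++) {
        int b = base_code(seq[i]);
        if (b < 0) { valid = 0; code = 0; continue; }
        code = ((code << 2) | (unsigned)b) & mask;
        if (++valid >= k) { counts[code]++; total++; }
    }
    return total;
}
```

For the sequence "ACGTAC" with k = 2, the five 2-mers are AC, CG, GT, TA, and AC, so the entry for AC (index 1) ends up with a count of 2. Dividing each count by the returned total would give a frequency vector.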

GETS is primarily written in C, with an interface for use in MATLAB through its MEX functions.

GETS uses CSparse, a concise sparse matrix package by Tim Davis. This package (CSparse v4.0.0) is already included in the GETS/MATLAB directory and does not need to be separately installed.
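To give a sense of the kind of sparse kernel involved, here is a self-contained sketch of a matrix-vector product in compressed-sparse-column (CSC) form, the storage format CSparse uses. The csc_matrix struct is a simplified stand-in for CSparse's cs struct, and csc_gaxpy mimics the y = A*x + y operation of CSparse's cs_gaxpy; both names are hypothetical and this is not code taken from GETS or CSparse.

```c
#include <assert.h>

/* Compressed-sparse-column matrix (simplified stand-in for a CSparse cs). */
typedef struct {
    int m, n;        /* number of rows and columns */
    const int *p;    /* column pointers, length n + 1 */
    const int *i;    /* row indices, length p[n] */
    const double *x; /* numerical values, length p[n] */
} csc_matrix;

/* y = A*x + y. Only the stored nonzeros are visited, so the cost is
 * proportional to the number of nonzeros rather than to m*n. */
void csc_gaxpy(const csc_matrix *A, const double *x, double *y)
{
    assert(A && x && y);
    for (int j = 0; j < A->n; j++)
        for (int k = A->p[j]; k < A->p[j + 1]; k++)
            y[A->i[k]] += A->x[k] * x[j];
}
```

For the 2-by-3 matrix with rows (1, 0, 2) and (0, 3, 0), the CSC arrays are p = {0, 1, 2, 3}, i = {0, 1, 0}, and x = {1, 3, 2}; multiplying by the vector (1, 2, 3) with y initialized to zero gives y = (7, 6).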

Note: This repository uses Git Large File Storage (Git LFS) because the Data directory is about 500 MB. To get the data, install Git LFS and then clone the repository with git clone. The 'Download ZIP' button on the GitHub web interface includes only the LFS pointers, not the large files themselves. A complete zipped version of GETS, including the large files, can be downloaded directly from: https://drive.google.com/file/d/1U-uuCk20DGlMJ3A-YhubMYTMZ9-Eo1DL/view?usp=share_link


Installation for use in MATLAB

To compile and install GETS (along with CSparse) for use in MATLAB:

  1. Go to the GETS/MATLAB directory

  2. Type "gets_install" in the MATLAB Command Window or, equivalently, run "gets_install.m"

The installation takes about a minute. When it finishes, the messages "CSparse successfully compiled" and "GETS successfully compiled" are displayed.


Run the MATLAB scripts to test GETS or to reproduce the results in the article

  1. First install GETS for use in MATLAB by following the steps shown above
  2. Go to the GETS/MATLAB/Tests directory
  3. Run the MATLAB script "NNREG_Tests_small.m" for the tests on the small dataset
  4. Run the MATLAB script "NNREG_Tests_large.m" for the tests on the large dataset

The tests compare the performance of GETS with MATLAB's lsqnonneg. To display the saved default test results, including the histogram plots, without running the actual tests, run only the bottom section, "Display test results", of each MATLAB script.

To run a single test for a specific right-hand side, run "NNREG_Singletest_small.m" or "NNREG_Singletest_Large.m".

To run tests for NNLS instead of NNREG, follow the above steps with NNLS in place of NNREG.

To perform the offline computations that create the treedata struct, run "Offline.m".


Help for GETS

In the MATLAB Command Window, type "help GETS", "help gets_nnreg", "help gets_nnls", or "help gets_offline" to display MATLAB-style help for these functions, along with examples. For questions or comments, contact Srinivas Subramanian (email:[email protected]).
