sxtforreal / tensor_scrna-seq

Leveraged tensor-based statistical methods on single-cell RNA-sequencing datasets to predict the progression stage of Covid-19

License: MIT License

R 100.00%
covid-19 scrna-seq-analysis tensor

tensor_scrna-seq's Introduction

Ji Lab, Duke University, School of Medicine - Tensor analysis project

Background

Precision medicine is a rapidly growing field that aims to tailor treatment to each patient's health data. Advances in technology and the growing volume of health-related data, such as patient demographics, health records, and single-cell RNA sequencing (scRNA-seq) data, create both opportunities and challenges. Traditional data analysis methods typically handle one data matrix at a time and cannot take full advantage of such multi-source data; this is where tensor methods come into play. Tensor methods represent the data as a multi-way array, which lets them extract information from large datasets and uncover latent structure that traditional methods might miss, especially the complex inter-relationships between multiple data sources.

More specifically, tensor-based statistical analysis offers several benefits:

  1. Integrating multiple data sources into a single tensor representation captures complex relationships between them.
  2. Analyzing multiple data modalities simultaneously gives a more complete picture of the underlying relationships in the data.
  3. Accounting for the inter-relationships between data sources can improve interpretability and prediction performance.
  4. Relationships between data sources can be leveraged to impute missing values, which improves the handling of missing data.

In this project, we developed a pipeline that integrates multiple data sources into a high-dimensional tensor representation and performs downstream analyses, including tensor decomposition and regression. The goal is to predict the progression stage of Covid-19 from three patient-specific data sources: health records, gene expression, and cell types. Our research may contribute to the development of precision medicine and to optimized clinical resource allocation.
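
As a rough illustration of the integration step, the sketch below arranges long-format expression records into a patient x gene x cell-type array in base R. The data frame, its column names, and all values are made up for illustration and are not taken from this repository. Combinations that were never observed stay `NA`, so a later imputation step can find them.

```r
# Hypothetical long-format records: one row per (patient, gene, cell type).
# All names and values here are illustrative, not from the project data.
expr <- data.frame(
  patient   = c("P1", "P1", "P1", "P2", "P2", "P2", "P2"),
  gene      = c("G1", "G2", "G1", "G1", "G2", "G1", "G2"),
  cell_type = c("B",  "B",  "T",  "B",  "B",  "T",  "T"),
  value     = c(1.2,  0.4,  2.1,  0.9,  0.0,  1.7,  0.3),
  stringsAsFactors = FALSE
)

# Fix the label order of each mode so array indices are reproducible.
patients   <- sort(unique(expr$patient))
genes      <- sort(unique(expr$gene))
cell_types <- sort(unique(expr$cell_type))

# Start from an all-NA array; unobserved combinations stay NA.
X <- array(
  NA_real_,
  dim      = c(length(patients), length(genes), length(cell_types)),
  dimnames = list(patient = patients, gene = genes, cell_type = cell_types)
)

# Fill the observed entries using a three-column index matrix.
idx <- cbind(
  match(expr$patient, patients),
  match(expr$gene, genes),
  match(expr$cell_type, cell_types)
)
X[idx] <- expr$value

dim(X)         # 2 2 2
sum(is.na(X))  # count of (patient, gene, cell type) cells with no record
```

Keeping the mode labels in `dimnames` makes it possible to map rows of any later factor matrix back to patients, genes, or cell types.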

Details

The pipeline includes:

  1. Data source integration (matching, imputation, restriction).
  2. Low-rank estimation with the Importance Sketching Low-rank Estimation for Tensors (ISLET) algorithm, which approximates the true low-rank structure of the tensor.
  3. Tensor decomposition with the Higher-Order Orthogonal Iteration (HOOI) algorithm, which identifies the major modes of variation in the latent structure across multiple data sources.
  4. Predictive modeling using the factor matrices obtained from the tensor decomposition.
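
The decomposition and modeling steps can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the repository's implementation: the helpers `unfold`, `mode_product`, and `hooi` are hypothetical names, the data are simulated, the iteration count is fixed instead of checking convergence, and ISLET is not shown. HOOI is initialised with a truncated higher-order SVD and then alternately refits each mode's factor matrix with the other modes projected onto their current subspaces.

```r
# Mode-k unfolding: rows index mode k, columns index the remaining modes.
unfold <- function(X, k) {
  d <- dim(X)
  matrix(aperm(X, c(k, seq_along(d)[-k])), nrow = d[k])
}

# Mode-k product of X with M, where ncol(M) == dim(X)[k].
mode_product <- function(X, M, k) {
  d <- dim(X)
  rest <- seq_along(d)[-k]
  Y <- array(M %*% unfold(X, k), dim = c(nrow(M), d[rest]))  # mode k first
  aperm(Y, order(c(k, rest)))                                # restore order
}

# HOOI for a Tucker decomposition with the given multilinear ranks.
hooi <- function(X, ranks, n_iter = 25) {
  modes <- seq_along(dim(X))
  # HOSVD initialisation: leading left singular vectors of each unfolding.
  U <- lapply(modes, function(k) svd(unfold(X, k), nu = ranks[k], nv = 0)$u)
  for (iter in seq_len(n_iter)) {
    for (k in modes) {
      Y <- X
      for (j in modes[-k]) Y <- mode_product(Y, t(U[[j]]), j)
      U[[k]] <- svd(unfold(Y, k), nu = ranks[k], nv = 0)$u
    }
  }
  G <- X
  for (k in modes) G <- mode_product(G, t(U[[k]]), k)
  list(core = G, factors = U)
}

# Simulated low-rank tensor plus noise (sizes and ranks are illustrative).
set.seed(1)
dims  <- c(20, 15, 10)  # e.g. patients x genes x cell types
ranks <- c(3, 2, 2)
A <- lapply(seq_along(dims), function(k) {
  matrix(rnorm(dims[k] * ranks[k]), dims[k], ranks[k])
})
signal <- array(rnorm(prod(ranks)), dim = ranks)
for (k in seq_along(dims)) signal <- mode_product(signal, A[[k]], k)
X <- signal + array(rnorm(prod(dims), sd = 0.01), dim = dims)

fit <- hooi(X, ranks)

# Reconstruct from the core and factors, then measure the relative error.
X_hat <- fit$core
for (k in seq_along(dims)) X_hat <- mode_product(X_hat, fit$factors[[k]], k)
rel_err <- sqrt(sum((X - X_hat)^2) / sum(X^2))

# Patient-mode factor scores as predictors; labels here are random
# placeholders standing in for a real progression-stage variable.
patient_features <- fit$factors[[1]]  # 20 x 3
stage <- rbinom(dims[1], 1, 0.5)
model <- glm(stage ~ ., data = data.frame(stage, patient_features),
             family = binomial)
```

Each row of `patient_features` summarises one patient across genes and cell types in a few coordinates, so it can serve as a compact design matrix for whatever classifier or regression model suits the outcome.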

Software

R - 4.2.2

References

  1. Zhang, A., Luo, Y., Raskutti, G., and Yuan, M. (2020). ISLET: Fast and optimal low-rank tensor regression via importance sketching. SIAM Journal on Mathematics of Data Science.
  2. Zhang, A. (2019). Cross: Efficient tensor completion. Annals of Statistics.
  3. Zhang, A. and Xia, D. (2018). Tensor SVD: Statistical and computational limits. IEEE Transactions on Information Theory.
  4. Zhang, A. and Han, R. (2019). Optimal denoising and singular value decomposition for sparse high-dimensional high-order data. Journal of the American Statistical Association.
  5. Han, R., Willett, R., and Zhang, A. (2020). An optimal statistical and computational framework for generalized tensor estimation. Annals of Statistics.
