precise-db

PRECISE is an RNA-seq compendium for Escherichia coli from the Systems Biology Research group at UC San Diego.

Data

The following data files are available in the data folder:

  • log_tpm_full.csv: Expression levels for all genes in E. coli
  • log_tpm.csv: Expression levels for 3,923 genes in E. coli (noisy genes have been removed)
  • log_tpm_norm.csv: log_tpm.csv centered to reference condition (WT on glucose M9 media)
  • metadata.csv: Experimental metadata (e.g. strain descriptions, carbon source) for all 278 conditions in PRECISE
  • gene_info.csv: Descriptive characteristics of genes, including location, operon, and COG group
  • TRN.csv: Known regulator-gene interactions from RegulonDB 10.0
  • S.csv: Gene coefficients for each iModulon
  • A.csv: Condition-specific activities for each iModulon
  • curated_enrichments.csv: Detailed information on iModulons and their linked regulator(s)
  • imodulon_gene_names.txt: List of gene names in each iModulon
  • imodulon_gene_bnumbers.txt: List of genes (as b-numbers) in each iModulon
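
As a rough illustration of how S.csv and A.csv relate to the expression data, the sketch below assumes the usual ICA factorization, in which a centered expression value is approximately the sum over iModulons of gene coefficient times iModulon activity. The file layout (genes as rows in S, iModulons as rows in A), the toy values, and the names read_matrix, reconstruct_expression, geneA, and cond1 are assumptions for illustration and are not taken from the repository.

```python
import csv
import io


def read_matrix(handle):
    """Read a labeled CSV: first row holds column labels, first column holds row labels."""
    reader = csv.reader(handle)
    header = next(reader)[1:]
    rows = {}
    for line in reader:
        rows[line[0]] = dict(zip(header, (float(v) for v in line[1:])))
    return rows


def reconstruct_expression(S, A, gene, condition):
    """Approximate one centered expression value as sum_k S[gene][k] * A[k][condition]."""
    return sum(coef * A[imod][condition] for imod, coef in S[gene].items())


# Toy stand-ins for S.csv and A.csv (hypothetical labels and values).
S_TEXT = "gene,iM1,iM2\ngeneA,0.5,0.0\ngeneB,0.0,-0.25\n"
A_TEXT = "imodulon,cond1,cond2\niM1,2.0,4.0\niM2,8.0,0.0\n"

S = read_matrix(io.StringIO(S_TEXT))
A = read_matrix(io.StringIO(A_TEXT))

print(reconstruct_expression(S, A, "geneA", "cond2"))
print(reconstruct_expression(S, A, "geneB", "cond1"))
```

If the real files follow this layout, the same functions could be pointed at file handles such as open("data/S.csv", newline="") instead of the in-memory strings.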

Scripts

A conda environment for this code is specified in environment.yml.
To generate robust independent components for a dataset, execute the run_ica.sh script:
run_ica <filename.csv>
where <filename.csv> is a comma-separated file of gene expression. The data must be centered on a reference condition (see data/log_tpm_norm.csv for an example). Additional options are available as flags. Loosening the tolerance (e.g. -t 1e-3) reduces runtime, but also reduces the final number of independent components.
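
The centering requirement can be sketched as follows. This is a minimal example that assumes centering means subtracting, for each gene, its mean value across the reference-condition samples. The function name center_on_reference and the sample names ref_1, ref_2, and treated_1 are hypothetical; the actual column names in log_tpm.csv may differ.

```python
def center_on_reference(expression, reference_columns):
    """Subtract each gene's mean over the reference columns from all of its values.

    expression: dict mapping gene -> dict mapping sample name -> log-TPM value.
    reference_columns: sample names of the reference condition (assumed replicates).
    """
    centered = {}
    for gene, samples in expression.items():
        baseline = sum(samples[c] for c in reference_columns) / len(reference_columns)
        centered[gene] = {name: value - baseline for name, value in samples.items()}
    return centered


# Hypothetical sample names and values, for illustration only.
log_tpm = {
    "geneA": {"ref_1": 4.0, "ref_2": 6.0, "treated_1": 8.0},
    "geneB": {"ref_1": 1.0, "ref_2": 1.0, "treated_1": 0.5},
}

centered = center_on_reference(log_tpm, ["ref_1", "ref_2"])
print(centered["geneA"])
print(centered["geneB"])
```

After centering, the reference samples average to zero for every gene, so each remaining value reads as a log-fold change relative to the reference condition.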

Notebooks

The Jupyter notebook exploratory_analysis.ipynb walks users through the data files and includes a few small functions for interrogating iModulons.

Requirements:

Python 3.6 or greater
Conda environment specifications are listed in environment.yml
Versions of scikit-learn above 0.20.3 cause an error when performing ICA.
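
A quick way to guard against the scikit-learn issue is to compare the installed version with 0.20.3 before running ICA. The helper below is a sketch, not part of the repository: the name sklearn_version_ok is hypothetical, and it only inspects the leading numeric part of the version string, so a suffix such as .post1 is ignored.

```python
import re


def sklearn_version_ok(version, ceiling=(0, 20, 3)):
    """Return True if the leading numeric part of `version` is at most `ceiling`."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    # Keep only the numeric components that were actually present.
    parts = tuple(int(p) for p in match.groups() if p is not None)
    return parts <= ceiling


print(sklearn_version_ok("0.20.3"))
print(sklearn_version_ok("0.21.0"))
```

In practice the string to check would come from the installed package, for example sklearn.__version__.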
