franciscozorrilla / melanie_screen_gems

This repo contains inputs, outputs, and step-by-step details regarding how genome scale metabolic models were reconstructed for strains used in Melanie's screen.

License: MIT License

⚗️ Melanie's screen GEMs

📕 Description

This repo contains inputs, outputs, and step-by-step details regarding how models were reconstructed for strains used in Melanie's screen. For more details check out the SymbNET metabolic modeling tutorial.

🧱 Contents

  • genomes: DNA fasta files for 40 NCBI strains used in the experimental screen
  • models: Genome scale metabolic models gapfilled on various media
  • ec_jaccard: EC-number-based Jaccard distance analysis files
  • media: Growth media formulations used for gapfilling and simulation
  • proteomes: Open reading frame (ORF)-annotated protein fasta files
  • ensembles: Ensemble models for network uncertainty quantification
  • scripts: Code used to run flux balance analysis (FBA) on the models
  • simulations: FBA simulation outputs
  • notebooks: Notebooks used to visualize results
  • plots: Figures generated by the notebooks

🐪 Software

  • git
  • Prodigal
  • CarveMe
    • diamond
    • CPLEX

🩺 Methods

0. Clone repo

$ git clone https://github.com/franciscozorrilla/melanie_screen_GEMs.git

1. Translate genomes to ORF-annotated protein fasta files using prodigal

Move into cloned repository directory and create proteomes folder

$ cd melanie_screen_GEMs
$ mkdir -p proteomes

Run prodigal on each input genome file found in the genomes/ folder

$ while read file; do prodigal -i genomes/$file -a proteomes/${file%.*}.faa;done< <(ls genomes/)
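
The `${file%.*}` expansion in the loop above strips the last extension from each genome filename, so every protein file keeps its strain name. A minimal sketch of that naming step, assuming a hypothetical helper `faa_name` and illustrative filenames (neither is part of this repo):

```shell
# Hypothetical helper: map a genome filename to its protein fasta filename.
faa_name() {
  # ${1%.*} removes the shortest trailing match of ".*", i.e. the last extension
  printf '%s.faa\n' "${1%.*}"
}

faa_name "NT5001.fna"        # prints NT5001.faa
faa_name "strain.v2.fasta"   # prints strain.v2.faa
```

Only the final extension is removed, so dots inside a strain name are preserved.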

2. Create genome scale metabolic models using CarveMe

Output models in fbc2 format, gapfilled on M3 media

$ mkdir -p models/M3_gapfilled
$ while read model;do carve -v --mediadb media/media_db.tsv -g M3 --fbc2 -o models/M3_gapfilled/${model%.*}.xml proteomes/$model; done< <(ls proteomes)

Output models in fbc2 format, without gapfilling

$ mkdir -p models/no_gapfill
$ while read model;do carve -v --fbc2 -o models/no_gapfill/${model%.*}.xml proteomes/$model; done< <(ls proteomes/)
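
After a long reconstruction loop it can be useful to confirm that every proteome produced a model. The sketch below is one possible check, not part of the original workflow; the function name `missing_models` and the demo strain names are hypothetical.

```shell
# Hypothetical check: print the name of each proteome that has no matching model.
missing_models() {
  proteome_dir=$1
  model_dir=$2
  for faa in "$proteome_dir"/*.faa; do
    [ -e "$faa" ] || continue              # skip the unexpanded pattern if the folder is empty
    name=$(basename "${faa%.*}")           # strip the directory and the .faa extension
    [ -f "$model_dir/$name.xml" ] || printf '%s\n' "$name"
  done
}

# Demo on a throwaway directory with made-up strain names
demo=$(mktemp -d)
mkdir -p "$demo/proteomes" "$demo/models"
touch "$demo/proteomes/strainA.faa" "$demo/proteomes/strainB.faa" "$demo/models/strainA.xml"
missing_models "$demo/proteomes" "$demo/models"   # prints strainB
rm -rf "$demo"
```

From the repository root, the same function could be called as `missing_models proteomes models/M3_gapfilled` or `missing_models proteomes models/no_gapfill`.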

3. Create ensemble models

Create ensembles of 100 model versions for each strain

$ mkdir -p ensembles
$ while read model; do carve -v --fbc2 -n 100 -o ensembles/${model%.*}.xml proteomes/$model;  done< <(ls proteomes)

🏌️‍♂️ Results

1. Genes, reactions, and metabolites

2. Ensemble model Jaccard distance

3. Extract EC number information from model sets

The following loop was run on the command line from within the folder containing the M3-gapfilled models, to generate a list of EC numbers across models:

while read model;do 
  paste $model|grep "EC Number"|sed 's/^.*: //g'|sed 's/<.*$//g'|sort|uniq|sed "s/^/${model%.*}\t/g";
done< <(ls|grep xml) >> M3_ec_models.tsv

Alternatively, extract reaction IDs:

while read model;do    
  paste $model|grep "reaction metaid"|sed 's/^.*reaction metaid="//g'|sed 's/".*$//g'|sort|uniq|sed "s/^/${model%.*}\t/g"; 
done< <(ls|grep xml) >> model_rxns.tsv
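
To see what the two `sed` steps in the EC number loop do, they can be applied to a single line. The sketch below assumes an annotation line of the form `<p>EC Number: 1.1.1.1</p>`; that line shape is an assumption made for illustration and is not copied from a model in this repo.

```shell
# Assumed shape of an EC annotation line in a model file (illustrative only)
line='        <p>EC Number: 1.1.1.1</p>'

# Same two sed steps as in the EC number loop:
#   1. drop everything up to and including the last ": "
#   2. drop everything from the first "<" onwards
ec=$(printf '%s\n' "$line" | sed 's/^.*: //' | sed 's/<.*$//')

# Prefix with the model name, as the final sed in the loop does
printf 'NT5001\t%s\n' "$ec"   # prints NT5001<TAB>1.1.1.1
```

Each output row is therefore a model name and one EC number separated by a tab, with no header row.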

In R:

library(tidyverse)
library(vegan)

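# Load list of EC numbers extracted from models.
# The shell loop above writes no header row, so the file is assumed to have been
# given the header "model<TAB>ec_number", which the column names below rely on.

ecnum=read.delim("melanie_screen_GEMs/M3_ec_models.tsv")

# Create presence/absence matrix (rows: models, columns: EC numbers)

ecnum %>% mutate(presence=1) %>%
          pivot_wider(names_from = ec_number,values_from = presence,values_fill = 0) %>%
          column_to_rownames(.,var="model") -> ec_mat

# Compute pairwise Jaccard distances between models

vegdist(ec_mat, method="jaccard", binary=TRUE) -> D
as.matrix(D) -> D

# Reshape to long format; cols=-rowname selects every model column

as.data.frame(D) %>% rownames_to_column() %>% pivot_longer(cols=-rowname,names_to = "model",values_to = "jaccard") -> jaccard_list

# Plot the distances as a heatmap

ggplot(jaccard_list) + geom_tile(aes(x=rowname,y=model,fill=jaccard)) + theme(axis.text.x = element_text(angle = 45, hjust = 1))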
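
For a quick spot check of a single pair of models without R, the binary Jaccard distance (1 minus shared EC numbers over the union of EC numbers) can be computed directly from the two-column table. This is a minimal sketch under assumptions: the function name `jaccard_distance`, the strain names, and the EC numbers in the demo are all made up, and the input is expected to be tab-separated `model<TAB>ec_number` rows.

```shell
# Hypothetical helper: Jaccard distance between two models in a model/EC table.
# Usage: jaccard_distance TABLE MODEL_A MODEL_B
jaccard_distance() {
  LC_ALL=C awk -F '\t' -v a="$2" -v b="$3" '
    $1 == a { in_a[$2] = 1 }                 # EC numbers present in model A
    $1 == b { in_b[$2] = 1 }                 # EC numbers present in model B
    END {
      for (ec in in_a) { n_union++; if (ec in in_b) n_shared++ }
      for (ec in in_b) if (!(ec in in_a)) n_union++
      if (n_union == 0) { print "NA"; exit }  # neither model found in the table
      printf "%.3f\n", 1 - n_shared / n_union
    }' "$1"
}

# Demo: two made-up models sharing 2 of 4 distinct EC numbers
demo=$(mktemp)
printf 'strainA\t1.1.1.1\nstrainA\t2.7.1.1\nstrainA\t3.1.3.1\n' > "$demo"
printf 'strainB\t1.1.1.1\nstrainB\t2.7.1.1\nstrainB\t4.1.1.1\n' >> "$demo"
jaccard_distance "$demo" strainA strainB   # prints 0.500
rm -f "$demo"
```

A value of 0 means the two models carry identical EC number sets, and 1 means they share none.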
