greenelab / biobombe

BioBombe: Sequentially compressed gene expression features enhance biological signatures

Home Page: https://greenelab.github.io/BioBombe/

License: BSD 3-Clause "New" or "Revised" License

Jupyter Notebook 72.80% Python 1.57% R 1.29% Shell 0.07% HTML 24.27%

autoencoder biobombe compression gene-expression gene-sets hetnet msigdb network tcga

biobombe's Issues

Generate Supplementary Figure Describing GTEx and TCGA Sex Features

Related to #163 and specifically #163 (comment)

Also need to explore gene coefficients in both models

Add t-test for NBL Cell lines in MYCN amplification signature application

Add Transcription Factor Analysis to Supplementary Coverage Figure

Update panels in coverage figure

Currently, the panels are labeled by gene set; they should be lettered by model type.

Move E and F of GTEx Figure to Supplement

Restructure visualize_genesets.R

Currently the results are being read in for all gene sets. They should be read in once, and then visualized and subset.

Add Results Table For TCGA Classifier Figure Panel D

Probably good to map compressed features with high weights to their respective genesets

Remake Supplementary Figure 1

Need to update the strip text background color. The figure should also be laid out so it works in portrait orientation.

Add Vince as an Author

Will need to update the author list (and title) on the website once new preprint is posted

related to #181

cc @vincerubinetti

Update GTEx Figure to Include Correlation

Add correlation estimates for panels E and F

Consider changing `Z` dimension to `k` dimension

This may alleviate potential confusion between z dimension and z score language

Validate MYCN status in NBL Cell Lines

Related with #163

Resources include https://www.nature.com/articles/sdata201733/tables/3 and https://figshare.com/articles/STAR-reads/7613975

Remove Redundant Supplementary TCGA Figures

A couple figures are redundant - added with different names

Update the colors in the stability boxplot figures

The colors in these figures are not adding anything - they are actually a bit confusing

Describe Directory Structure in Module README

A more complete description of the directory tree structure will help orient a new viewer to the results.

Network Projection Results

I will need to determine how to store these results. They are quite large and there are many of them. I am thinking some sort of figshare or zenodo link

Also, see what happens for all feature predictions on TCGA cancer type and mutation

Determine which genes were removed from the GTEx Signature Transformation

Split GTEx Figure G and H into New Supplementary Figure

This Figure is large. Panels G and H can be moved to a supplement.

Change y axis label for Feature Rank Plot

The plot generated here needs an updated y axis label. It should read: "Absolute Rank Enrichment"

Move SVCCA

SVCCA does not seem to work for the sample activation patterns in our models. I will apply SVCCA to the weight matrices instead to see if the results appear more promising.

Reorder Modules to Squish in New Module 7

I am adding a new module 7 in #71 - I will need to update the other module numbers (GTEx and TCGA)

Update GTEx Supplementary Figure

Currently, A and B are plotted on the same row with two columns. I need to make two rows and 1 column instead

Find Sex Feature in TCGA

Related to #163; this was previously done in GTEx. The box plots can also be changed to display different correlations with transformed data in both cases.

Make Mission of Predicting Cancer-Types and Mutations Clear

Need to write carefully about this point in README (see #90) and especially in the manuscript

I removed the colorblindr dependency in #13 because the package is not currently a conda recipe. Adding back this dependency will require a conda-forge pull request that I will save for a later date.

Update Supplementary Figure 3 - Correlation Summary

Switch panels c and d with a and b

Update Supplementary Figure S4 - Stability

Switch labels for panels B and C

Add analysis to supplementary TCGA classification

Need to predict with only the top feature.
Also determine which z dimension the features are coming from.

Additional Analysis: Applications to External Datasets

Get scores for all top scoring features across k dimension and algorithm for two publicly available datasets.

See if the score is associated with "separation" of target samples

Can also split out "monocyte" vs other in that plot

Add TCGA Classify Module README

Need to add results generated in #89 to an archived resource

Update GTEx Supplementary Figure

Need to lowercase the panel labels and rename z to k

https://github.com/greenelab/BioBombe/blob/master/8.gtex-interpret/1.visualize-gtex-blood-interpretation.ipynb

Update Figure 5 - TCGA Classify

Should add points representing raw data in panel C. What is the performance and percent zero coefficients?

Update GTEx Geneset Panels C and D

After changes are merged in #125, the function plot_gene_set() will change. I will need to rerun the visualize notebook in the gtex module after the update

Update TCGA Supplementary Figure

Switch panels A and B - also, the two panels currently in A are not in the correct order

Missing x axis label in main coverage figure

Convert z score to p value and Bonferroni correct in k dimension by geneset top feature plot

related to #108
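The conversion named in this issue can be sketched as follows. This is a minimal illustration, assuming the z scores are approximately standard normal and that a two-sided test is wanted; the function names and the example z scores are hypothetical and are not taken from the BioBombe code.

```python
import math


def z_to_two_sided_p(z):
    """Two-sided p value for a z score under a standard normal (assumed)."""
    # P(|Z| >= |z|) = erfc(|z| / sqrt(2)) for Z ~ N(0, 1)
    return math.erfc(abs(z) / math.sqrt(2.0))


def bonferroni(p_values):
    """Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Hypothetical z scores of the top feature at each k dimension
z_by_k = {2: 1.2, 10: 3.5, 50: -4.8}
p_raw = [z_to_two_sided_p(z) for z in z_by_k.values()]
p_adj = dict(zip(z_by_k, bonferroni(p_raw)))
```

Here the number of tests is the number of k dimensions shown; if the correction should instead span every gene set and algorithm in the plot, `n_tests` would need to count those comparisons as well.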

Add GTEx Module README and analysis bash script

Visualize Max Score Feature by Dimension + Algorithm

I have BioBombe scores for many datasets by collection; plot the z dimension of the max-scoring feature
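One way to pick out the dimension of the max-scoring feature before plotting is sketched below. The record layout, the collection name, and the scores are hypothetical placeholders; the actual BioBombe results files may be organized differently.

```python
# Hypothetical records: (algorithm, gene set collection, z dimension, score)
scores = [
    ("pca", "collection_a", 2, 3.1),
    ("pca", "collection_a", 10, 7.4),
    ("pca", "collection_a", 50, 5.0),
    ("vae", "collection_a", 2, 2.2),
    ("vae", "collection_a", 10, 4.9),
    ("vae", "collection_a", 50, 9.6),
]


def max_score_dimension(records):
    """Map (algorithm, collection) to the (dimension, score) with the highest score."""
    best = {}
    for algorithm, collection, dimension, score in records:
        key = (algorithm, collection)
        # Keep the first record seen, then replace it only on a strictly higher score
        if key not in best or score > best[key][1]:
            best[key] = (dimension, score)
    return best


best_by_group = max_score_dimension(scores)
```

The resulting dictionary has one entry per algorithm and collection, which is the shape needed for a plot of max-feature dimension by algorithm.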

Update Figure 1 - Add numbering for each BioBombe analysis bit

Update Figure 6 - Coverage Analysis

I don't think I need to label all facets; labeling just A, B, and C is probably sufficient

Update Stability Figure Names

Names of main and supplementary figures need to be updated. Also files should be removed.

Rename Module 6 to `6.biobombe-projection`

Hex Color Tables

As @ajlee21 pointed out in #56:

"Is it worth creating a lookup table with colors -- HEX code as you've done before?"

It will be good to update HEX colors in a table lookup. Also related to #14
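A lookup table of this kind could look like the sketch below. The algorithm labels, the HEX values, and the `get_color` helper are all assumptions made for illustration; they are not the colors or names used in the BioBombe figures.

```python
import re

# Hypothetical algorithm-to-color table; HEX values are placeholders only
ALGORITHM_COLORS = {
    "pca": "#1b9e77",
    "ica": "#d95f02",
    "nmf": "#7570b3",
    "dae": "#e7298a",
    "vae": "#66a61e",
}

HEX_PATTERN = re.compile(r"#[0-9a-fA-F]{6}")


def get_color(name, default="#808080"):
    """Look up the HEX color for a label, falling back to gray if unknown."""
    return ALGORITHM_COLORS.get(name.lower(), default)


# Fail early if any entry in the table is not a six-digit HEX code
for label, color in ALGORITHM_COLORS.items():
    if not HEX_PATTERN.fullmatch(color):
        raise ValueError(f"Invalid HEX color for {label}: {color}")
```

Keeping the colors in one table means every figure script reads the same values, so a color change is made in a single place.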

Split out by HGSC subtype assignment
