ggoncoplot's People

Contributors

selkamand

ggoncoplot's Issues

Update combine_plots to respect gg_tmb_height and friends

gg_tmb_height and gg_gene_width are currently ignored by combine_plots.

Similarly, there is no gg_metadata_height option yet.

We need to

  1. add gg_metadata_height as a user-configurable parameter

  2. rework combine_plots to respect gg_tmb_height, gg_gene_width and gg_metadata_height

Shorten error message when lots of samples with metadata have no mutations

[X] samples with metadata have no mutations. Fitering these out
ℹ To keep these samples, set metadata_require_mutations = FALSE. To view them in the oncoplot ensure you additionally set show_all_samples = TRUE

The message then lists every sample, even if there are hundreds.

If more than 10 samples are missing, just print the number.
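
The proposed behaviour could be sketched in base R as follows. The helper name `format_missing_samples()` and its `max_listed` argument are assumptions made for illustration, not part of ggoncoplot:

```r
# Hypothetical helper: build the message text for samples that have metadata
# but no mutations. Sample IDs are listed only when there are few of them.
format_missing_samples <- function(samples, max_listed = 10) {
  n <- length(samples)
  msg <- sprintf("%d samples with metadata have no mutations. Filtering these out", n)
  if (n > max_listed) {
    return(msg)  # too many to list: report the count only
  }
  paste0(msg, ": ", paste(samples, collapse = ", "))
}

format_missing_samples(c("S1", "S2"))        # count plus the two IDs
format_missing_samples(paste0("S", 1:250))   # count only
```

The cutoff is an argument rather than a constant so that callers who do want the full list can raise it.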

Remove commented code from minimal usage example in usage vignette

gbm_df |> 
  ggoncoplot(
    col_genes = 'Hugo_Symbol', 
    col_samples = 'Tumor_Sample_Barcode', 
    #col_mutation_type = 'Variant_Classification', 
    # topn = 10, 
    # interactive = TRUE
  )

to

gbm_df |> 
  ggoncoplot(
    col_genes = 'Hugo_Symbol', 
    col_samples = 'Tumor_Sample_Barcode'
  )

Create Gene Barplot Functionality

To the right side of an oncoplot, we should optionally plot a barplot showing the number of samples in which each gene is mutated (fill colour based on mutation type).

Oncoplots upside down

Oncoplot gene rankings are inverted. Tests should have picked this up.

Fix the unit tests so that they check the gene ranking order.
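
One possible cause (an assumption, not confirmed in this issue) is that ggplot2 draws the first level of a discrete y axis at the bottom, so a ranking with the top gene as the first factor level ends up inverted. A minimal base R sketch of a ranking check on made-up data:

```r
# Toy data (values made up): one row per mutated (Sample, Gene) pair
mutations <- data.frame(
  Sample = c("S1", "S2", "S3", "S1", "S2", "S1"),
  Gene   = c("TP53", "TP53", "TP53", "PTEN", "PTEN", "EGFR")
)

# Number of distinct mutated samples per gene, most frequently mutated first
samples_per_gene <- vapply(
  split(mutations$Sample, mutations$Gene),
  function(x) length(unique(x)),
  integer(1)
)
ranked_genes <- names(sort(samples_per_gene, decreasing = TRUE))

# The first level of a discrete y axis is drawn at the bottom, so the factor
# levels are reversed to put the top-ranked gene at the top of the plot
gene_levels <- rev(ranked_genes)

# Checks a unit test could make: ranking order, and top gene is the last level
stopifnot(identical(ranked_genes, c("TP53", "PTEN", "EGFR")))
stopifnot(identical(gene_levels[length(gene_levels)], "TP53"))
```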

Enforce fixed colour scheme across different calls

Problem:

To maximise the flexibility of ggoncoplot, we don't force the mutation types defined by col_mutation_type to align to any ontology. The end-user can use whatever mutation types they like. The problem is that this makes it difficult to automatically choose colours for the different mutation types in a manner that's consistent across datasets.

Currently, we use an RColorBrewer palette and decide which colour is attached to each mutation type based on the frequency of the mutation types. To demonstrate why this is not ideal, let's go through an example. Say you produce an oncoplot for two different cohorts, one of which is dominated by missense mutations, the other by silent mutations. In one of these oncoplots missense mutations will be the same colour as silent mutations in the other. This would be extremely confusing.

Potential solutions:

  1. Force users to use some ontology for 'mutation_type'. We would then know all the possible mutation types in advance and could make a single manual palette that maps each value to a colour consistently, no matter what data is input. The major downside is the lack of choice for the end user. It may also be a lot of work for end-users to convert their mutation_type ontology to whatever we enforce. What ontology should be enforced? Should we try to guess the mapping based on the names of mutation_type? We might be able to provide mappings from one ontology to another to help users streamline data preprocessing.

  2. Force users to define a mapping of mutation_types to colours. We make sure they have accounted for every value in their dataset. We could help with this by providing users with a basic example palette that they can adapt and supply to ggoncoplot. ggoncoplot would error unless the user supplied this mapping.

  3. Both -- force an ontology UNLESS user supplies a palette mapping all mutation_types to colours. Best of both worlds

Each potential solution has its benefits and drawbacks. Option 1 is more work for the end-user but will make it easier to integrate ggoncoplot into shiny apps and pipelines. Option 2 is easier and more flexible for the end-user, and allows domain-specific mutation_types to be used (e.g. there'd be the option to colour mutations based on germline/somatic origins in cancer data visualisation). Option 3 is more work for me and adds some complexity to the usage, BUT with some careful info/warning messages sent to cli we could probably make this quite intuitive for end-users.

Plan of attack

  1. Start implementing (1) as step 1. If I have time I'll work towards (3)
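
Option (3) could look roughly like the base R sketch below. The function `resolve_palette()`, the `default_palette` vector, and its mutation type names and colours are all hypothetical and only stand in for whatever ontology is eventually chosen:

```r
# Hypothetical fixed palette for an assumed ontology (illustrative values only)
default_palette <- c(Missense = "#1B9E77", Nonsense = "#D95F02", Silent = "#7570B3")

# Prefer a user-supplied named palette, otherwise fall back to the fixed one.
# Errors if any mutation type in the data has no colour assigned.
resolve_palette <- function(mutation_types, palette = NULL) {
  if (is.null(palette)) palette <- default_palette
  types <- unique(as.character(mutation_types))
  missing_types <- setdiff(types, names(palette))
  if (length(missing_types) > 0) {
    stop("No colour defined for mutation type(s): ",
         paste(missing_types, collapse = ", "))
  }
  palette[types]
}

# Same colour for 'Missense' regardless of how frequent it is in the cohort
resolve_palette(c("Silent", "Silent", "Missense"))
resolve_palette(c("Germline", "Somatic"),
                palette = c(Germline = "steelblue", Somatic = "firebrick"))
```

Because the result is a named character vector, it could be handed to `ggplot2::scale_fill_manual(values = ...)`, which matches colours to values by name rather than by frequency.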

On click copy sample ID to clipboard

ggiraph supports running javascript on click events (without shiny)

See below for details
https://davidgohel.github.io/ggiraph/articles/offcran/using_ggiraph.html#using-onclick-1

One typically annoying thing about oncoplots is seeing interesting samples and having to copy out sample IDs. It would be more convenient to just click the id and automatically copy the sample name.

In JavaScript, you can write text to the clipboard:

navigator.clipboard.writeText('text')

Could fire this on an onclick event
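
A minimal base R sketch of building the per-sample JavaScript strings. The helper name `clipboard_onclick()` is hypothetical, and rather than attempting escaping it simply refuses IDs that contain quotes or backslashes:

```r
# Hypothetical helper: one JavaScript snippet per sample ID, to be run on click.
# IDs that would break the JS string literal are rejected rather than escaped.
clipboard_onclick <- function(sample_id) {
  unsafe <- grepl("'", sample_id, fixed = TRUE) |
    grepl("\\", sample_id, fixed = TRUE)
  if (any(unsafe)) {
    stop("Sample IDs containing quotes or backslashes need escaping first")
  }
  sprintf("navigator.clipboard.writeText('%s')", sample_id)
}

js <- clipboard_onclick(c("TCGA-02-0003", "TCGA-02-0033"))
js
```

The resulting character vector could then be mapped to the `onclick` aesthetic of a ggiraph interactive geom. Whether the snippet needs different quoting once embedded in the rendered SVG is untested here.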

separate business vs vis logic in ggoncoplot

I need to be able to unit test the data transformation code required to plot an oncoplot.

Currently the data transformation code is packaged in the same function as ggoncoplot. I should pull the data transformation out into a separate function, e.g. ggoncoplot_data_prep. Then I can unit test it separately from the visualisation code.

Sort by clinical annotations

Should be powered by the rank package.

There's a commented section that indicates where the sample sorting code should go (right before the refactor of the clinical & mutational dataframes). No need to use the inbuilt sorting functionality of the gg1d package.

Deal with size issue

CRAN is the best place for this package, but currently the package size is over the 5 MB limit.

What's blowing out our size:

  1. documentation (3.8MB)
  2. testdata (1.6MB)

Neither of these has anything to do with the actual package functionality, so this should be very solvable.
2 is the easiest to solve: just move the MAF csv files / R dataframes that we're using for testing into their own GitHub R package with functions that stream the data. We can then install this package and, since the data is only used for testing and docs, add it as a Suggests (not an Imports, i.e. not a required dependency).

1 is a little trickier. It's probably all the interactive plots in the vignette; storing these requires some space. Rendering static plots would save the space but really take away from the documentation. The best solution would be to keep the docs big but decouple them from the R package. Not sure of the best way to do this without causing too much pain long term. It's just so convenient to use vignettes and CI workflows. Will need more thought.

Add support for grouping oncoplot by pathways

Think about:

  1. How should sorting be affected?
  2. How should you choose which pathways to show first/second/third etc.?
  3. What should the input look like? (Almost certainly a 2-column dataframe: 1 with genes, 1 with pathways.)
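
For point 3, a two-column input might be used as in the base R sketch below. The column names `Gene` and `Pathway` and all values are assumptions for illustration only:

```r
# Hypothetical gene-to-pathway mapping supplied by the user (illustrative values)
pathways <- data.frame(
  Gene    = c("TP53", "MDM2", "PTEN", "PIK3CA"),
  Pathway = c("p53", "p53", "PI3K", "PI3K")
)

# Toy mutation data (values made up)
mutations <- data.frame(
  Sample = c("S1", "S2", "S3"),
  Gene   = c("TP53", "PTEN", "EGFR")
)

# Left join: genes without a pathway are kept and get Pathway = NA
annotated <- merge(mutations, pathways, by = "Gene", all.x = TRUE)
annotated
```

Keeping unmapped genes (here EGFR) leaves open whether they are dropped or shown in a separate group, which ties into questions 1 and 2.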

Add data_id to ggoncoplot_prep_df returned dataframes and add test

Current tests

# Check dataframe has required names
  expect_named(prepped_df, expected = c('Sample', 'Gene', 'MutationType', 'Tooltip'), ignore.order = TRUE)
  expect_named(prepped_df_no_mutation_type, expected = c('Sample', 'Gene', 'MutationType', 'Tooltip'), ignore.order = TRUE)

Should we add a test for the data_id column?

Avoid double rendering of mutated tiles (grey tile rendered underneath coloured)

To get the grey tiles on unmutated squares we render a base tile layer of grey, then render colour over that.
This may lead to nontrivial increases in render time for large cohorts (untested).

Let's change this to only render grey on the tiles that won't have mutations present. This basically means we filter the data first.
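
The filtering step could be sketched in base R like this, on made-up data. In the package the full sample and gene lists would come from wherever they are already tracked, not from the mutation table alone:

```r
# Toy mutation data (values made up): one row per mutated (Sample, Gene) tile
mutated <- data.frame(
  Sample = c("S1", "S2", "S1"),
  Gene   = c("TP53", "TP53", "PTEN")
)

# Every possible tile in the sample x gene grid
all_tiles <- expand.grid(
  Sample = unique(mutated$Sample),
  Gene   = unique(mutated$Gene),
  stringsAsFactors = FALSE
)

# Keep only tiles with no mutation; these are the only ones drawn in grey.
# A tab separator is assumed not to occur inside sample or gene names.
key <- function(df) paste(df$Sample, df$Gene, sep = "\t")
unmutated_tiles <- all_tiles[!key(all_tiles) %in% key(mutated), ]
unmutated_tiles
```

The grey layer would then use `unmutated_tiles` and the coloured layer `mutated`, so no tile is drawn twice.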
