At the moment some function files have s at the top: remove this code, and have

Cleanly separate functions from scripts + ensure all functions pass dev::check() about rare_disease_celltyping HOT 13 CLOSED

NathanSkene commented on June 1, 2024

Cleanly separate functions from scripts + ensure all functions pass dev::check()

from rare_disease_celltyping.

Comments (13)

bobGSmith commented on June 1, 2024

@bschilder Ok nice. Let me know if there's anything you think I should change. I made the parallel EWCE function save each completed analysis as a separate rds file, as that makes it easier to run the analysis in chunks and come back to it later. I made another function that reads through the results output directory to check which gene lists have already been analysed. And finally there's a function for merging all of the small rds files into one data frame, with an extra column for the gene list names (e.g. the phenotype). I should probably add an option to do BH corrections to the merge results function.
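The workflow described above (one rds file per gene list, a check for what has already been analysed, and a merge step with optional BH correction) could be sketched roughly as follows. This is not the code from the package: all function names are hypothetical, `analyse_fun` is a stand-in for the real EWCE call, and the p-value column is assumed to be called `p`.

```r
# Hypothetical helper names; `analyse_fun` is a placeholder for the real EWCE call.

# List the gene lists that already have a saved result file.
finished_lists <- function(results_dir) {
  files <- list.files(results_dir, pattern = "\\.rds$")
  sub("\\.rds$", "", files)
}

# Analyse only the gene lists without a saved result, one rds file per list.
run_in_chunks <- function(gene_lists, results_dir, analyse_fun) {
  dir.create(results_dir, showWarnings = FALSE, recursive = TRUE)
  todo <- setdiff(names(gene_lists), finished_lists(results_dir))
  for (nm in todo) {
    res <- analyse_fun(gene_lists[[nm]])
    saveRDS(res, file.path(results_dir, paste0(nm, ".rds")))
  }
  invisible(todo)
}

# Merge all per-list results into one data frame with a `gene_list` column
# and, optionally, a BH-adjusted p-value column `q`.
merge_results <- function(results_dir, p_col = "p", adjust = TRUE) {
  files <- list.files(results_dir, pattern = "\\.rds$", full.names = TRUE)
  if (length(files) == 0) return(data.frame())
  parts <- lapply(files, function(f) {
    df <- readRDS(f)
    df$gene_list <- sub("\\.rds$", "", basename(f))
    df
  })
  merged <- do.call(rbind, parts)
  if (adjust) merged$q <- p.adjust(merged[[p_col]], method = "BH")
  merged
}

# Toy usage with a fake analysis function and made-up gene lists.
toy_analyse <- function(genes) {
  data.frame(celltype = c("A", "B"), p = c(0.01, 0.5))
}
out_dir <- file.path(tempdir(), "ewce_results_demo")
unlink(out_dir, recursive = TRUE)
lists <- list(pheno1 = c("G1", "G2"), pheno2 = "G3")
run_in_chunks(lists, out_dir, toy_analyse)
merged <- merge_results(out_dir)
```

Because each result is written as soon as it finishes, a second call to `run_in_chunks()` with the same directory skips everything already on disk. Real phenotype names may need sanitising before being used as file names.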

bobGSmith commented on June 1, 2024

github.com/ovrhuman/RareDiseaseEWCE
Made a repo for all the functions as a package here. @NathanSkene

bobGSmith commented on June 1, 2024

Changed the Rmd file for my report to call functions from the RareDiseaseEWCE package. I just need to finish off the parallel EWCE results-generating function and add that to the package. I'm running it on the unfinished phenotypes now, but need to tweak some things for logging failed phenotypes and merging all results into one data frame at the end.

NathanSkene commented on June 1, 2024

Just having a look through the functions here... many of them seem related to processing the HPO ontology. Might be worth pulling them out into a separate repo. What percentage would you say are HPO specific? Could have an "HPO_explorer" package?

If you were to group the functions into "HPO ontology related functions" and "other", what would "other" be?

Ideally you want the functions within a package to be a somewhat coherent set, which can have a vignette describing their general function.

bobGSmith commented on June 1, 2024

The results-generating stuff (not included yet) is for running EWCE in parallel and merging results into one data frame. Then, as you say, there are functions for doing stuff with the HPO, like finding ontology levels or identifying connected components within a subset. Most of those are then used in another bunch of functions which are for plotting the HPO EWCE results. I could split it up into generating EWCE results from multiple gene lists, generic HPO_explorer stuff, and specific HPO EWCE results visualisation and analysis?

It might be possible to make some of the stuff for plotting HPO EWCE results more generic, for cases when you're working with multiple gene lists as well. I would have to take out the bits where it does something HPO specific and make that part into its own function for the other package, I suppose?
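To illustrate the kind of generic HPO helpers mentioned above (ontology levels and connected components within a subset of terms), here is a minimal base R sketch on a toy ontology. The term IDs, the `parents` list, and both function names are made up for illustration and are not taken from the package. "Level" is assumed here to mean the longest path from a term up to a root.

```r
# Toy ontology: each term maps to its parent terms (made-up IDs, not real HPO terms).
parents <- list(
  root = character(0),
  A    = "root",
  B    = "root",
  A1   = "A",
  A2   = "A",
  B1   = "B"
)

# Level of a term: length of the longest path up to a root (roots are level 0).
term_level <- function(term, parents) {
  ps <- parents[[term]]
  if (length(ps) == 0) return(0L)
  1L + max(vapply(ps, term_level, integer(1), parents = parents))
}

# Connected components within a subset of terms, treating each parent-child
# link between two terms of the subset as an undirected edge.
connected_components <- function(terms, parents) {
  comp <- setNames(seq_along(terms), terms)  # every term starts alone
  for (term in terms) {
    for (p in intersect(parents[[term]], terms)) {
      old_label <- comp[[p]]
      new_label <- comp[[term]]
      comp[comp == old_label] <- new_label   # merge the two components
    }
  }
  unname(split(names(comp), comp))
}

levels_found <- vapply(names(parents), term_level, integer(1), parents = parents)
components   <- connected_components(c("A", "A1", "A2", "B1"), parents)
```

In the subset above, `B1` ends up in its own component because its parent `B` is not part of the subset. With a real ontology, the parent list could come from an ontology object instead of being written by hand.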

NathanSkene commented on June 1, 2024

I'd say the stuff for parallelising EWCE to run on multiple gene lists should be incorporated into the EWCE package. Or, given that there are functions for doing plotting etc. as well, you could set up a new MultiEWCE package?

I'd defo lean towards separating the EWCE related stuff from the HPO specific stuff though.

bobGSmith commented on June 1, 2024

Ok yeah, makes sense. I will try to split it into two separate packages for now then.

NathanSkene commented on June 1, 2024

For the HPO focused package, @bschilder figured out a neat trick for storing data files as release files within a GitHub repo. Probably worth using that to store the HPO ontology data. It would save worrying about them changing the URL or file structure.

bschilder commented on June 1, 2024

I used a package called piggyback which lets you store large files as "Assets". Here's some example code of how I implemented this:
https://github.com/neurogenomics/phenomix/blob/main/R/get_data.R

For the HPO itself, there are several ways you could do this. You could either use the ontologyIndex package, which includes the HPO as a dataset, or you could download it in tabular format like I did here. In this case, I converted it to a binary matrix and then put it with all its metadata in a Seurat object:

https://github.com/neurogenomics/phenomix/blob/main/R/prepare_HPO.R

I've uploaded the processed Seurat object here:
https://github.com/neurogenomics/phenomix/releases/tag/latest
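For anyone wanting to pull a release asset like this into R, a rough sketch using `piggyback::pb_download()` might look like the following. The wrapper name `fetch_release_asset` and the example file name are hypothetical (the actual asset name is not given in this thread), and the call needs the piggyback package and network access.

```r
# Sketch only: `fetch_release_asset` is a hypothetical wrapper, and the asset
# file name must be supplied by the caller because it is not stated here.
fetch_release_asset <- function(file,
                                repo = "neurogenomics/phenomix",
                                tag = "latest",
                                dest = tempdir()) {
  if (!requireNamespace("piggyback", quietly = TRUE)) {
    stop("Install the 'piggyback' package to download release assets.")
  }
  # Download the named asset from the given release tag into `dest`.
  piggyback::pb_download(file = file, dest = dest, repo = repo, tag = tag)
  file.path(dest, file)
}

# Example call (hypothetical file name, not run here):
# obj <- readRDS(fetch_release_asset("processed_object.rds"))
```

Keeping the repo and tag as arguments means the same wrapper could point at a different package's releases without changes.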

bobGSmith commented on June 1, 2024

Thanks @bschilder, at the moment I've just got ontologyIndex as a dependency. Do you think it would be better to store it like that?

I've also finished splitting the packages up into HPOExplorer (general HPO functions), MultiEWCE (stuff related to EWCE on multiple gene lists), and HPOEWCE, which is really just plotting functions fairly specific to this project.

bschilder commented on June 1, 2024

@ovrhuman I think keeping ontologyIndex as a dep is fine bc then you can benefit from any updates that might occur later.

bschilder commented on June 1, 2024

Regarding MultiEWCE, I'll take a look and see if there are some functions that would be useful to add to EWCE 2.0, which I'm working on.

bschilder commented on June 1, 2024

Packages now passing CRAN and Bioc checks here:
https://github.com/neurogenomics/HPOExplorer
https://github.com/neurogenomics/MultiEWCE
