Comments (13)
@bschilder Ok nice. Let me know if there's anything you think I should change. I made the parallel EWCE function save each completed analysis as a separate rds file, as that makes it easier to run the analysis in chunks and then come back to it later. I made another function that reads through the results output directory to check which gene lists have already been analysed. And finally there's a function for merging all of the small rds files into one data frame, with an extra column for the gene list names (e.g. the phenotype). I should probably add an option to do BH corrections to the merge results function.
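For illustration, the checkpoint-and-merge workflow described above could look roughly like the base R sketch below. This is not the code from the package: the function names, the one-file-per-gene-list naming scheme (`<name>.rds`), and the `p` column holding the enrichment p-values are all assumptions.

```r
# Hypothetical helpers; names and file layout are assumptions for illustration.

# Return the gene lists that do not yet have a <name>.rds file in results_dir.
get_unfinished_lists <- function(list_names, results_dir) {
  done <- sub("\\.rds$", "", list.files(results_dir, pattern = "\\.rds$"))
  setdiff(list_names, done)
}

# Read every .rds file, tag its rows with the gene list name, and stack them.
merge_results <- function(results_dir, list_name_column = "phenotype") {
  files <- list.files(results_dir, pattern = "\\.rds$", full.names = TRUE)
  if (length(files) == 0) stop("No .rds result files found in ", results_dir)
  per_list <- lapply(files, function(f) {
    res <- readRDS(f)
    res[[list_name_column]] <- sub("\\.rds$", "", basename(f))
    res
  })
  merged <- do.call(rbind, per_list)
  # Benjamini-Hochberg correction across all tests in the merged table.
  # Assumes each result table has a column named "p".
  merged$q <- p.adjust(merged$p, method = "BH")
  merged
}

# Toy example with made-up results for two of three gene lists.
results_dir <- tempfile("ewce_results_")
dir.create(results_dir)
saveRDS(data.frame(CellType = c("A", "B"), p = c(0.01, 0.20)),
        file.path(results_dir, "pheno1.rds"))
saveRDS(data.frame(CellType = c("A", "B"), p = c(0.03, 0.04)),
        file.path(results_dir, "pheno2.rds"))

unfinished <- get_unfinished_lists(c("pheno1", "pheno2", "pheno3"), results_dir)
merged <- merge_results(results_dir)
```

Here the correction is applied once over the whole merged table rather than per gene list; which of the two is appropriate depends on the analysis.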
from rare_disease_celltyping.
github.com/ovrhuman/RareDiseaseEWCE
Made a repo for all the functions as a package here. @NathanSkene
from rare_disease_celltyping.
Changed the rmd file for my report to call functions from the RareDiseaseEWCE package. I just need to finish off with the parallel EWCE results generating function and add that to the package. Just running it on the unfinished phenotypes now but need to tweak some things for logging failed phenotypes and merging all results to one dataframe at the end.
from rare_disease_celltyping.
Just having a look through the functions here... many of them seem related to processing the HPO ontology. Might be worth pulling them out into a separate repo. What percentage would you say are HPO specific? Could have an "HPO_explorer" package?
If you were to group the functions into "HPO ontology related functions" and "other", what would "other" be?
Ideally you want the functions within a package to be a somewhat coherent set which can have a vignette describing their general function.
from rare_disease_celltyping.
The results generating stuff (not included yet) is for running EWCE in parallel and merging results into 1 data frame. Then as you say there are functions for doing stuff with the HPO, like finding ontology levels or identifying connected components within a subset. Most of those are then used in another bunch of functions which are for plotting the HPO EWCE results. I could split it up into generating EWCE results from multiple gene lists, generic HPO_explorer stuff, and specific HPO EWCE results visualisation and analysis?
It might be possible to make some of the stuff for plotting HPO EWCE results more generic for cases when you're working with multiple gene lists as well. Would have to take out bits where it does something HPO specific and make that part into its own function for the other package, I suppose?
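As a rough idea of what the generic HPO helpers mentioned above do, here is a minimal base R sketch on a toy ontology. The term IDs are made up, the function names are hypothetical, and "level" is assumed here to mean the number of steps from a term up to the root; the actual package functions may define these differently.

```r
# Toy ontology: each term maps to its parent term(s). IDs are made up.
parents <- list(
  root = character(0),
  a    = "root",
  b    = "root",
  a1   = "a",
  a2   = "a",
  b1   = "b"
)

# Level of a term = number of steps up to the root along the shortest path.
get_level <- function(term, parents) {
  level <- 0
  frontier <- term
  # Keep climbing until some term in the frontier has no parents (a root).
  while (all(lengths(parents[frontier]) > 0)) {
    frontier <- unique(unlist(parents[frontier]))
    level <- level + 1
  }
  level
}

# Connected components within a subset of terms, where two terms are linked
# if one is a direct parent of the other and both are in the subset.
get_components <- function(terms, parents) {
  comp <- setNames(seq_along(terms), terms)
  for (child in terms) {
    for (p in intersect(parents[[child]], terms)) {
      # Merge the two components by relabelling both with the smaller id.
      ids <- c(comp[[child]], comp[[p]])
      comp[comp %in% ids] <- min(ids)
    }
  }
  unname(split(names(comp), comp))
}

level_a1 <- get_level("a1", parents)
components <- get_components(c("a", "a1", "a2", "b1"), parents)
```

In this toy case `a`, `a1` and `a2` end up in one component and `b1` sits alone, because its parent `b` is not part of the subset.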
from rare_disease_celltyping.
I'd say the stuff for parallelising EWCE to run on multiple gene lists should be incorporated into the EWCE package. Or, given that there are functions for doing plotting etc. as well, you could set up a new MultiEWCE package?
I'd defo lean towards separating the EWCE related stuff from the HPO specific stuff though.
from rare_disease_celltyping.
Ok yeah makes sense. I will try to split it into two separate packages for now then.
from rare_disease_celltyping.
For the HPO focused package, @bschilder figured out a neat trick for storing the data files as release assets within GitHub repos. Probably worth using that to store the HPO ontology data. Would save worrying about them changing the URL or file structure.
from rare_disease_celltyping.
I used a package called piggyback, which lets you store large files as "Assets". Here's some example code of how I implemented this:
https://github.com/neurogenomics/phenomix/blob/main/R/get_data.R
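As a minimal sketch of the idea (not the code in the linked file), a download helper built on `piggyback::pb_download()` might look like the following. The helper name `get_release_file` is hypothetical, and the repo, tag and file name in the usage comment are placeholders.

```r
# Sketch only: a hypothetical wrapper around piggyback::pb_download().
# Downloads one release asset into dest and returns the local path.
get_release_file <- function(file, repo, tag = "latest", dest = tempdir()) {
  if (!requireNamespace("piggyback", quietly = TRUE)) {
    stop("Install the piggyback package to download release assets.")
  }
  piggyback::pb_download(file = file, repo = repo, tag = tag, dest = dest)
  file.path(dest, file)
}

# Example usage (placeholder names; needs network access, so not run here):
# path <- get_release_file("hpo_data.rds", repo = "owner/repo", tag = "latest")
# hpo_data <- readRDS(path)
```

Uploading works the other way round with `piggyback::pb_upload()`, which needs a GitHub token with write access to the repo.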
For the HPO itself, there are several ways you could do this. You could either use the ontologyIndex package, which includes the HPO as a dataset, or you could download it in tabular format like I did here. In this case, I converted it to a binary matrix and then put it with all its metadata in a Seurat object:
https://github.com/neurogenomics/phenomix/blob/main/R/prepare_HPO.R
I've uploaded the processed Seurat object here:
https://github.com/neurogenomics/phenomix/releases/tag/latest
from rare_disease_celltyping.
Thanks @bschilder, at the moment I've just got ontologyIndex as a dependency. Do you think it would be better to store it like that?
I've also finished splitting the packages up into HPOExplorer (general HPO functions), MultiEWCE (stuff related to EWCE on multiple gene lists), and HPOEWCE, which is really just plotting functions fairly specific to this project.
from rare_disease_celltyping.
@ovrhuman I think keeping ontologyIndex as a dep is fine bc then you can benefit from any updates that might occur later.
from rare_disease_celltyping.
Regarding MultiEWCE, I'll take a look and see if there are some functions that would be useful to add to EWCE 2.0 that I'm working on.
from rare_disease_celltyping.
Packages now passing CRAN and Bioc checks here:
https://github.com/neurogenomics/HPOExplorer
https://github.com/neurogenomics/MultiEWCE
from rare_disease_celltyping.