The cellrank_reproducibility_preprint from theislab

Clean up delta cell differentiation, Fig. 3

I will include this in the main pancreas notebook; that is much easier to maintain and keeps a single source of truth.

Clean up STEMNET comparison benchmarking

CoC

My remarks:

for the paths, instead of a get_paths function which returns a dictionary, let's create 1 file where the paths are defined as constants - I think that will be more readable, and we only import the paths that we need, e.g.

from path import CACHE_DIR, FIG_DIR

If we go for the approach above, let's also define some naming convention for these constants (e.g. directories ending with _DIR, data-related stuff prefixed with DATA_, caching with CACHE_, etc.)
environment.yml or requirements.txt - I'd provide a conda environment.yaml named cellrank-reproduciblity with all the correct package versions (if something requires a different version, then a separate yaml file will be within that directory)
I'd also create a small skeleton .ipynb as a basis for all notebooks - this should contain e.g. importing the default packages (like scanpy/cellrank/etc), printing the versions and the sections you mention (sections that are not needed, like Plot results in preprocessing notebooks, will be removed when filling in the notebook)
initials in the notebooks: it's sometimes hard for me to distinguish ml and mk, maybe we could use different aliases, capitalize them, or move them to the front before the date
same for the dates, YYYYMMDD is not a friendly format (at least for me) to read, I'd include dashes as YYYY-MM-DD.
relative paths: do you mean relative to this repo's root or relative to the position of the file/notebook? I assume you mean the latter
I'd also make 1 issue per figure or its dependencies and do regular PRs
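
The constants file proposed in the remarks above could look roughly like the following. This is only a sketch: the directory names (data, cache, figures), the per-dataset constants and the assumption that the file sits at the repository root are illustrative and not taken from the repo. It follows the suggested naming convention (directories end in _DIR, data-related constants start with DATA_, caching ones with CACHE_).

```python
# path.py (hypothetical file at the repository root; all directory names are assumptions)
from pathlib import Path

try:
    # Directory containing this file, assumed to be the repository root.
    ROOT_DIR = Path(__file__).resolve().parent
except NameError:
    # __file__ is undefined when the code is pasted into a notebook cell.
    ROOT_DIR = Path.cwd()

# Top-level directories.
DATA_DIR = ROOT_DIR / "data"
CACHE_DIR = ROOT_DIR / "cache"
FIG_DIR = ROOT_DIR / "figures"

# Per-dataset locations, prefixed according to the proposed convention.
DATA_PANCREAS_DIR = DATA_DIR / "pancreas"
CACHE_PANCREAS_DIR = CACHE_DIR / "pancreas"
```

A notebook would then import only what it needs, e.g. `from path import CACHE_DIR, FIG_DIR`, and build file names with `FIG_DIR / "some_figure.pdf"`.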

Clean the GPCCA toy example notebook

Supplemental figure where we illustrate the idea behind the GPCCA algorithm.

Clean up and add the pancreas notebooks

This concerns the main figures 2 and 3 as well as a number of supplemental figures. Add the Palantir pseudotime to the dataset on figshare and add the MAGIC-imputed data as an extra array to figshare.

Clean up memory performance

Prettify the table in the README a bit

The repo is public now, so we can...

add links to nbviewer

Clean up Palantir comparison benchmarking

Add skeleton notebook

This should serve as a starting point for the restructuring of notebooks.

Clean up the lung analysis

Package requirements

Informally:

R.utils
peakMEM
SparseMM
destiny
FateID
RaceID
STEMNET

Test if the pipeline works

TODOs:

Clean up FateID comparison benchmarking

Repo size

For some reason, it's huge... Did any of us commit any data inside?
Inspecting this, it's git objects (245M ./.git/objects).
And 99M ./notebooks

I suggest we prune this once everything is done - I can do a test run on my private fork to see if we can prune the git objects.
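
To repeat the size check above without relying on `du`, something like the following could be used. It is a rough sketch with a hypothetical helper name; it sums apparent file sizes, so the numbers can differ slightly from what `du` reports as disk usage.

```python
import os
from pathlib import Path


def dir_size_mb(path):
    """Return the total size of all regular files below `path`, in MiB."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = Path(root) / name
            # Skip symlinks so that linked files are not counted twice.
            if not file_path.is_symlink():
                total += file_path.stat().st_size
    return total / 1024**2


if __name__ == "__main__":
    # Directories mentioned in the issue; run from the repository root.
    for sub in (".git/objects", "notebooks"):
        if Path(sub).is_dir():
            print(f"{dir_size_mb(sub):8.1f}M  ./{sub}")
```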

Share the final conda environment.

I had to install a couple of new packages; let's share the final version once we're done.

Clean pancreas main notebook for figure 2

Caching

I haven't yet added this to the README. I'm still going to need scachepy in my notebooks because I don't want to re-compute velocities and my stochastic kernel each time I have to re-generate a figure. I suggest we have a caching directory that mirrors the structure of the data directory. We won't share the actual cached files because they are too large but I will place .gitkeep files so that we have the same folder structure. What are your thoughts on this?
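
One way to set up such a mirrored caching directory is sketched below. The helper name and the example directory names are hypothetical; the function copies only the folder structure of the data directory, never the files, and drops an empty .gitkeep into every folder so that git tracks it.

```python
from pathlib import Path
from typing import List


def mirror_dirs(data_dir: Path, cache_dir: Path) -> List[Path]:
    """Recreate the folder tree of `data_dir` under `cache_dir` with .gitkeep files."""
    data_dir, cache_dir = Path(data_dir), Path(cache_dir)
    # The data root itself plus every sub-directory below it.
    sources = [data_dir] + sorted(p for p in data_dir.rglob("*") if p.is_dir())
    gitkeeps = []
    for src in sources:
        dst = cache_dir / src.relative_to(data_dir)
        dst.mkdir(parents=True, exist_ok=True)
        keep = dst / ".gitkeep"
        keep.touch()  # empty placeholder so git keeps the otherwise empty folder
        gitkeeps.append(keep)
    return gitkeeps
```

Calling `mirror_dirs(Path("data"), Path("cache"))` from the repository root would then give both of us the same folder layout, while the cached files themselves stay out of the repo (for instance through a .gitignore rule, which is not shown here).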

Also in the links please...

Print all relevant versions

@michalk8 let's keep in mind to print important package versions like FateID, STEMNET or Palantir in the benchmark notebooks as these are not included in cr.logging.print_versions()
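
For the Python side, a small helper along these lines could print the extra versions. This is a sketch with a hypothetical function name; it only covers packages installed in the Python environment (e.g. palantir). FateID and STEMNET are R packages, so their versions would have to be queried from R instead, which is not covered here.

```python
from importlib import metadata


def print_extra_versions(packages):
    """Print and return the installed version of each Python package in `packages`."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing the whole notebook.
            versions[name] = "not installed"
        print(f"{name}=={versions[name]}")
    return versions
```

In a benchmark notebook this could be called right after `cr.logging.print_versions()`, e.g. `print_extra_versions(["palantir"])`.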

Update comparison benchmarks

Clean up

Palantir notebook (I think you should do this one @Marius1311 )
STEMNET notebook
FateID notebook (almost done)

in the main uncertainty notebook, insert links to the stochastic MC notebook and also to the robustness notebook
make sure supplemental gene trends are saved to the same directory.
remove the code of conduct again.
