
scvelo's Introduction


scVelo - RNA velocity generalized through dynamical modeling

scVelo is a scalable toolkit for RNA velocity analysis in single cells; RNA velocity enables the recovery of directed dynamic information by leveraging splicing kinetics [1]. scVelo collects different methods for inferring RNA velocity using an expectation-maximization framework [2], deep generative modeling [3], or metabolically labeled transcripts [4].

scVelo's key applications

  • estimate RNA velocity to study cellular dynamics.
  • identify putative driver genes and regimes of regulatory changes.
  • infer a latent time to reconstruct the temporal sequence of transcriptomic events.
  • estimate reaction rates of transcription, splicing and degradation.
  • use statistical tests, e.g., to detect different kinetics regimes.
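As a minimal usage sketch (assuming an AnnData object with 'spliced'/'unspliced' layers and a precomputed embedding such as UMAP; parameter values are illustrative only, see the documentation at scvelo.org for details):

    import scvelo as scv

    # read a loom file produced e.g. by velocyto ('my_data.loom' is a placeholder name)
    adata = scv.read('my_data.loom', cache=True)

    # filter, normalize and compute first/second-order moments
    scv.pp.filter_and_normalize(adata, min_shared_counts=20, n_top_genes=2000)
    scv.pp.moments(adata, n_pcs=30, n_neighbors=30)

    # estimate velocities and build the velocity graph
    scv.tl.velocity(adata, mode='stochastic')
    scv.tl.velocity_graph(adata)

    # project the velocities onto an existing embedding (assumes adata.obsm['X_umap'] is present)
    scv.pl.velocity_embedding_stream(adata, basis='umap')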

Citing scVelo

If you include or rely on scVelo when publishing research, please adhere to the following citation guide:

EM and steady-state model

If you use the EM (dynamical) or steady-state model, cite

@article{Bergen2020,
  title = {Generalizing RNA velocity to transient cell states through dynamical modeling},
  volume = {38},
  ISSN = {1546-1696},
  url = {http://dx.doi.org/10.1038/s41587-020-0591-3},
  DOI = {10.1038/s41587-020-0591-3},
  number = {12},
  journal = {Nature Biotechnology},
  publisher = {Springer Science and Business Media LLC},
  author = {Bergen, Volker and Lange, Marius and Peidli, Stefan and Wolf, F. Alexander and Theis, Fabian J.},
  year = {2020},
  month = aug,
  pages = {1408–1414}
}

veloVI

If you use veloVI (VI model), cite

@article{Gayoso2023,
  title = {Deep generative modeling of transcriptional dynamics for RNA velocity analysis in single cells},
  ISSN = {1548-7105},
  url = {http://dx.doi.org/10.1038/s41592-023-01994-w},
  DOI = {10.1038/s41592-023-01994-w},
  journal = {Nature Methods},
  publisher = {Springer Science and Business Media LLC},
  author = {Gayoso, Adam and Weiler, Philipp and Lotfollahi, Mohammad and Klein, Dominik and Hong, Justin and Streets, Aaron and Theis, Fabian J. and Yosef, Nir},
  year = {2023},
  month = sep
}

RNA velocity inference through metabolic labeling information

If you use the implemented method for estimating RNA velocity from metabolic labeling information, cite

@article{Weiler2023,
  title = {Unified fate mapping in multiview single-cell data},
  url = {http://dx.doi.org/10.1101/2023.07.19.549685},
  DOI = {10.1101/2023.07.19.549685},
  publisher = {Cold Spring Harbor Laboratory},
  author = {Weiler, Philipp and Lange, Marius and Klein, Michal and Pe’er, Dana and Theis, Fabian J.},
  year = {2023},
  month = jul
}

Support

Found a bug or would like to see a feature implemented? Feel free to submit an issue. Have a question or would like to start a new discussion? Head over to GitHub discussions. Your help to improve scVelo is highly appreciated. For further information visit scvelo.org.


scvelo's Issues

scanpy 1.4.2 breaks pp.moments

The new version of scanpy released today appears to break scvelo.pp.moments (and presumably much of scanpy itself). This is reproducible with the Dentate Gyrus example data. Full error (following the example vignette):

>>> scv.pp.moments(adata, n_pcs=30, n_neighbors=30)
computing neighbors
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/site-packages/scvelo/preprocessing/moments.py", line 45, in moments
    neighbors(adata, n_neighbors=n_neighbors, use_rep=use_rep, n_pcs=n_pcs, method=method, metric=metric)
  File "/usr/local/lib/python3.7/site-packages/scvelo/preprocessing/neighbors.py", line 82, in neighbors
    neighbors = Neighbors(adata)
  File "/usr/local/lib/python3.7/site-packages/scanpy/neighbors/__init__.py", line 466, in __init__
    self.n_neighbors = int(self._distances.count_nonzero() / self._distances.shape[0])
AttributeError: 'NoneType' object has no attribute 'count_nonzero'

Rolling back to scanpy 1.4.1 solves the problem. @theislab - Filing this issue here rather than in the scanpy repo as I don't have a reproducible upstream example.

scv.pp.moments error

Hello,

This is my first try with this testing release.
I cloned the git repo and installed it with pip install . on Python 3.6.5.
I loaded an annotated data object I am working on (created/preprocessed with scanpy), then:

>>> scv.pp.moments(adata)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/path/to/scvelo/scvelo/preprocessing/moments.py", line 42, in moments
    adata.layers['Ms'] = csr_matrix.dot(connectivities, adata.layers['spliced']).toarray()
  File "/path/to/local/python/python-dev/lib/python3.6/site-packages/anndata/layers.py", line 41, in __getitem__
    return self._layers[key]
KeyError: 'spliced'

Many thanks for your work,
Best.

I'm trying to run the DentateGyrus.ipynb example and facing this error

adata = scv.read("data/DentateGyrus/10X43_1.loom", sparse=True, cache=True, backup_url="http://pklab.med.harvard.edu/velocyto/DG1/10X43_1.loom")


TypeError Traceback (most recent call last)
in ()
1 adata = scv.read("data/DentateGyrus/10X43_1.loom", sparse=True, cache=True,
----> 2 backup_url="http://pklab.med.harvard.edu/velocyto/DG1/10X43_1.loom")

~/anaconda3/lib/python3.6/site-packages/scanpy/readwrite.py in read(filename, backed, sheet, ext, delimiter, first_column_names, backup_url, cache, **kwargs)
76 return _read(filename, backed=backed, sheet=sheet, ext=ext,
77 delimiter=delimiter, first_column_names=first_column_names,
---> 78 backup_url=backup_url, cache=cache, **kwargs)
79 # generate filename and read to dict
80 filekey = filename

~/anaconda3/lib/python3.6/site-packages/scanpy/readwrite.py in _read(filename, backed, sheet, ext, delimiter, first_column_names, backup_url, cache, suppress_cache_warning, **kwargs)
472 adata = _read_softgz(filename)
473 elif ext == 'loom':
--> 474 adata = read_loom(filename=filename, **kwargs)
475 else:
476 raise ValueError('Unkown extension {}.'.format(ext))

~/anaconda3/lib/python3.6/site-packages/anndata/readwrite/read.py in read_loom(filename, sparse, cleanup, X_name, obs_names, var_names, dtype)
182 var=var,
183 layers=layers,
--> 184 dtype=dtype)
185 return adata
186

~/anaconda3/lib/python3.6/site-packages/anndata/base.py in __init__(self, X, obs, var, uns, obsm, varm, layers, raw, dtype, shape, filename, filemode, asview, oidx, vidx)
664 layers=layers,
665 dtype=dtype, shape=shape,
--> 666 filename=filename, filemode=filemode)
667
668 def _init_as_view(self, adata_ref: 'AnnData', oidx: Index, vidx: Index):

~/anaconda3/lib/python3.6/site-packages/anndata/base.py in _init_as_actual(self, X, obs, var, uns, obsm, varm, raw, layers, dtype, shape, filename, filemode)
889
890 # layers
--> 891 self._layers = AnnDataLayers(self, layers, dtype)
892
893 def __sizeof__(self) -> int:

~/anaconda3/lib/python3.6/site-packages/anndata/layers.py in __init__(self, adata, layers, dtype, adata_ref, oidx, vidx)
35 raise ValueError('Shape does not fit.')
36 if layer.dtype != np.dtype(dtype):
---> 37 self._layers[key] = layer.astype(dtype, copy=False)
38 else:
39 self._layers[key] = layer

TypeError: astype() got an unexpected keyword argument 'copy'

scvelo==0.1.16.dev18+3f160f2 scanpy==1.4 anndata==0.6.18 loompy==2.0.16 numpy==1.13.1 scipy==0.19.1 matplotlib==2.1.2 sklearn==0.20.2 pandas==0.20.3

grid embedding vs stream show opposite directions

Similar to #55 I see a discordance of arrows when using the grid embedding:

The preprocessing had been done using scvelo:

    scv.pp.filter_and_normalize(adata, n_top_genes=2000)
    scv.pp.pca(adata)
    scv.pp.neighbors(adata)
    scv.pp.moments(adata, n_pcs=15, n_neighbors=30)
    scv.tl.velocity(adata, mode='stochastic')
    scv.tl.velocity_graph(adata, basis='draw_graph_fa')
    scv.tl.velocity_embedding(adata, basis='draw_graph_fa')

Then I select a single cluster to look in detail at the arrows:

adata_cluster = adata[adata.obs.leiden=='10'].copy()

image

However, the grid arrows point in the opposite direction (the stream arrows agree with the individual velocities):
image

To try to address the problem I recompute the velocity embedding:

    scv.tl.velocity_embedding(adata_cluster, basis='draw_graph_fa')

Now the grid arrows agreed with the individual velocity:

image

But if I look at the velocity arrows again, now they changed direction (the stream plot also agrees with the individual velocities):

image

Interestingly this only happens to some clusters that seem to overlap in 2D with other clusters. For the majority of clusters I don't have a problem.

The results are reproducible also using the syntax:

scv.pl.velocity_embedding(adata, basis='draw_graph_fa', color='leiden', groups=['10'])
scv.pl.velocity_embedding_grid(adata, basis='draw_graph_fa', color='leiden', groups=['10'])

Could you help me find the reason behind this? It is very annoying to see that the arrows change direction.
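For what it's worth, a hedged sketch of one way to keep the per-cell, grid and stream projections consistent on a subset, by recomputing both the velocity graph and the embedding projection on the subset itself (whether this is the intended workflow is an assumption on my part):

    # subset, then recompute graph and embedding projection on the subset only
    adata_cluster = adata[adata.obs.leiden == '10'].copy()
    scv.tl.velocity_graph(adata_cluster)
    scv.tl.velocity_embedding(adata_cluster, basis='draw_graph_fa')
    scv.pl.velocity_embedding_grid(adata_cluster, basis='draw_graph_fa')
    scv.pl.velocity_embedding_stream(adata_cluster, basis='draw_graph_fa')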

clean up object

Dear,
I cannot find any documentation about the cleanup parameter in sc.read(), so I don't know whether I should set cleanup=True or cleanup=False. Could you explain this parameter to me?
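For reference, a hedged sketch of how the parameter appears elsewhere on this page (my understanding, which may be incomplete, is that cleanup when reading a loom file drops loom attributes that are not needed, while scv.utils.cleanup is a separate helper that strips extra layers/annotations from an existing object; please check the docstrings):

    # as used in the Dentate Gyrus example further below
    adata = scv.read('data/DentateGyrus/10X43_1.loom', cleanup=True, sparse=True, cache=True)
    scv.utils.cleanup(adata, clean='all')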

using velocity_embedding_grid on a subset of adata

Hi guys,
I am interested in calling the function velocity_embedding_grid not on the full adata object but only on a subset of it (e.g. on batch j). I would be happy with having a plot of the subset data in the old grid, i.e. the grid interpolation computed on the full data set. But actually it would even be better, to have a new interpolation grid based on the subset data only.
Thanks for any comments on this!
Best, Jonas
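For illustration, a minimal sketch of two possible approaches, assuming velocities were computed on the full object ('batch' and 'j' are hypothetical names; the groups argument is used the same way in the issue above):

    # plot only batch j on the embedding computed from the full data set
    scv.pl.velocity_embedding_grid(adata, basis='umap', color='batch', groups=['j'])

    # or subset first and recompute graph and projection, so the grid is interpolated
    # from the subset only
    bdata = adata[adata.obs['batch'] == 'j'].copy()
    scv.tl.velocity_graph(bdata)
    scv.tl.velocity_embedding(bdata, basis='umap')
    scv.pl.velocity_embedding_grid(bdata, basis='umap')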

merging with a loom file

Hej,

I am trying to use scv.utils.merge(adata, adata_loom) to merge my dataset used with scanpy and the related loom dataset opened with scvelo. However, adata has dimensions 7370 × 15000, while adata_loom has dimensions 282016 × 33694. adata is the concatenation of 3 datasets, one from each individual, and the loom file I used is the combination of the 3 separate individuals' files through loompy.combine.

I am pretty OK with the second dimension (I kept the 15000 most expressed genes in scanpy), but the first dimension is not clear to me: it is really large, and it generates an error because adata and adata_loom have layers containing matrices of incompatible shapes.
Is the first dimension to be interpreted in another way than just cells? Should I run the velocity analysis on the loom file without being allowed to merge?

Cheers,
Samuele
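For reference, a hedged sketch of how cell barcodes are sometimes harmonized before merging (this assumes the row mismatch comes from differently formatted barcodes between the scanpy object and the combined loom file; scv.utils.clean_obs_names is the helper I believe handles this, but verify against its docstring):

    # harmonize barcode formatting on both objects, then merge on the common cells/genes
    scv.utils.clean_obs_names(adata)
    scv.utils.clean_obs_names(adata_loom)
    adata = scv.utils.merge(adata, adata_loom)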

Is the transition probability matrix in scVelo same as RNA velocity paper ?

Hi,
I realized that the transition probability matrix estimated from RNA velocity, together with some graph analysis on this matrix, might be a good way to quantify cell state transitions. scVelo appears to implement such an idea, judging by the function names. Since I am not familiar with Python, I was wondering whether the transition probability matrix calculated by scVelo is the same as in the RNA velocity paper.
According to RNA velocity paper,

We therefore developed an approach that places the velocity arrow in the direction in which the expression difference is best correlated with the estimated velocity vector, controlling for the cell density distribution. The direction was estimated as follows. We calculated a transition probability matrix P by applying an exponential kernel to the Pearson correlation coefficient between the velocity vector and the cell state difference vectors:

P_{ij} \propto \exp\left( \mathrm{corr}(r_{ij}, d_i) / \sigma \right),

where r_{ij} is the difference vector between the expression vectors s_i and s_j of cells i and j, transformed with a variance-stabilizing (element-wise) transformation S, and d_i is the S-transformed velocity extrapolation vector of cell i:

r_{ij} = S(s_j) - S(s_i), \qquad d_i = S(s_i + v_i) - S(s_i).

Error when importing scvelo in notebook

Hi,
I'm getting the following error when trying to initially import scvelo through import scvelo as scv (currently running scvelo 0.1.16, scanpy 1.4, numpy 1.15.4).
Any help would be greatly appreciated, thanks!


TypeError Traceback (most recent call last)
in
----> 1 import scvelo as scv
2 import scanpy as sc
3 scv.logging.print_versions()

~\Anaconda3\lib\site-packages\scvelo\__init__.py in <module>
1 """scvelo - stochastic single cell RNA velocity"""
2
----> 3 from .get_version import get_version
4 __version__ = get_version(__file__)
5 del get_version

~\Anaconda3\lib\site-packages\scvelo\get_version.py in
143
144
--> 145 __version__ = get_version(__file__)
146
147

~\Anaconda3\lib\site-packages\scvelo\get_version.py in get_version(package)
137 return str(
138 get_version_from_dirname(name, parent)
--> 139 or get_version_from_git(parent)
140 or get_version_from_metadata(name, parent)
141 or "0.0.0"

~\Anaconda3\lib\site-packages\scvelo\get_version.py in get_version_from_git(parent)
55 try:
56 p = run(["git", "rev-parse", "--show-toplevel"],
---> 57 cwd=parent, stdout=PIPE, stderr=PIPE, encoding="utf-8", check=True)
58 except (OSError, CalledProcessError):
59 return None

~\Anaconda3\lib\subprocess.py in run(input, timeout, check, *popenargs, **kwargs)
401 kwargs['stdin'] = PIPE
402
--> 403 with Popen(*popenargs, **kwargs) as process:
404 try:
405 stdout, stderr = process.communicate(input, timeout=timeout)

~\Anaconda3\lib\subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors)
705 c2pread, c2pwrite,
706 errread, errwrite,
--> 707 restore_signals, start_new_session)
708 except:
709 # Cleanup if the child failed starting.

~\Anaconda3\lib\subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, unused_restore_signals, unused_start_new_session)
988 env,
989 cwd,
--> 990 startupinfo)
991 finally:
992 # Child is launched. Close the parent's copy of those pipe

TypeError: CreateProcess() argument 8 must be str or None, not WindowsPath

Discordance between arrows in different velocity embeddings.

I started experimenting with scVelo and got some interesting results but also noticed some discordance between the different velocity embeddings.

Result of scv.pl.velocity_embedding:
velocity_embedding

Result of scv.pl.velocity_embedding_grid:
velocity_embedding_grid

Result of scv.pl.velocity_embedding_stream:
velocity_embedding_stream

Out of those 3 the grid representation seems a little off, inverting many of the thick arrows in the green and blue clusters.

Additionally, the arrows in the PAGA plot don't seem to correspond very well either:
paga_compare

sc.pl.paga_compare(
  adata, basis = "umap", threshold = .15, arrowsize = 10,
  edge_width_scale = .5, transitions = "transitions_confidence",
  dashed_edges = "connectivities"
)

Did I misunderstand the output of these representations?

adding the scvelo features to another matrix

Hi,

Thank you very much for making this software, it's great!
My question might seem easy, but I am still new at this. I have run my data in two ways: the first using scanpy with mnnpy for batch effect correction, which I called adata, and a second time using scvelo from loom files, which I called bdata. My question is how I can add the bdata features onto the adata matrix, or vice versa (I don't know which is best), so that when I run the scv.pl.velocity_embedding_grid command it runs on the umap (or tsne) generated from adata.
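For illustration, a minimal sketch of how the two objects are sometimes combined, assuming scv.utils.merge matches cells by name and copies the spliced/unspliced layers onto the scanpy-processed object so that its existing umap/tsne can be reused (variable names are taken from the question):

    # transfer the loom layers (bdata) onto the scanpy-processed object (adata)
    adata = scv.utils.merge(adata, bdata)   # adata keeps its embeddings in .obsm
    scv.pp.moments(adata, n_pcs=30, n_neighbors=30)
    scv.tl.velocity(adata)
    scv.tl.velocity_graph(adata)
    scv.pl.velocity_embedding_grid(adata, basis='umap')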


error from scv.pp.moments(adata)

Hi,
I'm attempting to run a scanpy processed .h5ad file through scvelo but am running into the error pasted below.
I'm relatively new to the whole scene, any help is much appreciated. Let me know if you need to see any other info from the attempted run, cheers!

KeyError Traceback (most recent call last)
in ()
----> 1 scv.pp.moments(adata)

~\Anaconda3\lib\site-packages\scvelo\preprocessing\moments.py in moments(data, n_neighbors, n_pcs, mode, use_rep, recurse_neighbors, renormalize, plot, copy)
58
59 logg.info('computing moments based on ' + str(mode), r=True)
---> 60 normalize_layers(adata)
61
62 connectivities = get_connectivities(adata, mode, recurse_neighbors=recurse_neighbors)

~\Anaconda3\lib\site-packages\scvelo\preprocessing\utils.py in normalize_layers(data, layers, by_total_size, max_proportion_per_cell, copy)
273
274 for layer in layers:
--> 275 if not_normalized_yet(adata, layer):
276 counts_per_cell = get_initial_size(adata, layer, by_total_size)
277 if max_proportion_per_cell is not None and (0 < max_proportion_per_cell < 1):

~\Anaconda3\lib\site-packages\scvelo\preprocessing\utils.py in not_normalized_yet(adata, layer)
269
270 def not_normalized_yet(adata, layer):
--> 271 X = adata.layers[layer]
272 return np.allclose((X.data[:10] if issparse(X) else X[0]) % 1, 0, atol=1e-3)
273

~\Anaconda3\lib\site-packages\anndata\layers.py in __getitem__(self, key)
43 return self._adata_ref.layers[key][self._oidx, self._vidx]
44 else:
---> 45 return self._layers[key]
46
47 def __setitem__(self, key, value):

KeyError: 'unspliced'

use even smaller data in tests

Everything in the tests is now blazingly fast except for package installation (44s) and the tests themselves (200s).

I think we can't get much faster on the installation (most of it is probably dependency resolution), but I'm sure there's a way to speed up the tests.

Failure upon GitHub installation

Hey Volker!

(python3_backup) ubuntu@ip-172-31-35-11:~/code$ git clone https://github.com/theislab/scvelo.git
Cloning into 'scvelo'...
remote: Counting objects: 541, done.
remote: Total 541 (delta 0), reused 0 (delta 0), pack-reused 541
Receiving objects: 100% (541/541), 108.40 KiB | 0 bytes/s, done.
Resolving deltas: 100% (331/331), done.
Checking connectivity... done.
(python3_backup) ubuntu@ip-172-31-35-11:~/code$ 
(python3_backup) ubuntu@ip-172-31-35-11:~/code$ 
(python3_backup) ubuntu@ip-172-31-35-11:~/code$ cd scvelo
(python3_backup) ubuntu@ip-172-31-35-11:~/code/scvelo$ 
(python3_backup) ubuntu@ip-172-31-35-11:~/code/scvelo$ 
(python3_backup) ubuntu@ip-172-31-35-11:~/code/scvelo$ ls
docs  LICENSE  pyproject.toml  pytest.ini  README.rst  scvelo  setup.py  tests
(python3_backup) ubuntu@ip-172-31-35-11:~/code/scvelo$ 
(python3_backup) ubuntu@ip-172-31-35-11:~/code/scvelo$ pip install -e .
Obtaining file:///home/ubuntu/code/scvelo
  Missing build requirements in pyproject.toml for file:///home/ubuntu/code/scvelo.
  This version of pip does not implement PEP 517 so it cannot build a wheel without 'setuptools' and 'wheel'.
  Installing build dependencies ... done
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/home/ubuntu/code/scvelo/setup.py", line 60, in <module>
        setup()
      File "/home/ubuntu/code/scvelo/setup.py", line 25, in setup
        metadata = common.make_metadata(module, ini_info)
      File "/tmp/pip-build-env-ifv505m8/lib/python3.6/site-packages/flit/common.py", line 305, in make_metadata
        md_dict.update(get_info_from_module(module))
      File "/tmp/pip-build-env-ifv505m8/lib/python3.6/site-packages/flit/common.py", line 111, in get_info_from_module
        docstring, version = get_docstring_and_version_via_import(target)
      File "/tmp/pip-build-env-ifv505m8/lib/python3.6/site-packages/flit/common.py", line 95, in get_docstring_and_version_via_import
        m = sl.load_module()
      File "<frozen importlib._bootstrap_external>", line 399, in _check_name_wrapper
      File "<frozen importlib._bootstrap_external>", line 823, in load_module
      File "<frozen importlib._bootstrap_external>", line 682, in load_module
      File "<frozen importlib._bootstrap>", line 265, in _load_module_shim
      File "<frozen importlib._bootstrap>", line 684, in _load
      File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
      File "<frozen importlib._bootstrap_external>", line 678, in exec_module
      File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
      File "scvelo/__init__.py", line 3, in <module>
        from get_version import get_version
    ModuleNotFoundError: No module named 'get_version'

I tried fixing it by setting the line

from ._version import get_versions

as in Scanpy (added by versioneer). But this still fails. I then figured that you haven't committed any versioneer files to GitHub, it seems. Maybe it's just that.

ValueError: cannot reindex from a duplicate axis

I'm trying to merge the loom file with my scanpy object using the following:

adata_velocity = scv.read('/home/ec2-user/velocyto/aggregate.loom', cache=False)
scv.utils.merge(adata, adata_velocity)

But I get the following error:

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in
1 adata_velocity = scv.read('/home/ec2-user/velocyto/aggregate.loom', cache=False)
----> 2 scv.utils.merge(adata, adata_velocity)

~/anaconda3/envs/sc-tutorial/lib/python3.7/site-packages/scvelo/read_load.py in merge(adata, ldata, copy)
134 same_vars = (len(_adata.var_names) == len(_ldata.var_names) and np.all(_adata.var_names == _ldata.var_names))
135 if len(common_vars) > 0 and not same_vars:
--> 136 _adata._inplace_subset_var(common_vars)
137 _ldata._inplace_subset_var(common_vars)
138

~/anaconda3/envs/sc-tutorial/lib/python3.7/site-packages/anndata/base.py in _inplace_subset_var(self, index)
1505 Same as adata = adata[:, index], but inplace.
1506 """
-> 1507 adata_subset = self[:, index].copy()
1508 self._init_as_actual(adata_subset, dtype=self._X.dtype)
1509

~/anaconda3/envs/sc-tutorial/lib/python3.7/site-packages/anndata/base.py in __getitem__(self, index)
1371 def __getitem__(self, index: Index) -> 'AnnData':
1372 """Returns a sliced view of the object."""
-> 1373 return self._getitem_view(index)
1374
1375 def _getitem_view(self, index: Index) -> 'AnnData':

~/anaconda3/envs/sc-tutorial/lib/python3.7/site-packages/anndata/base.py in _getitem_view(self, index)
1374
1375 def _getitem_view(self, index: Index) -> 'AnnData':
-> 1376 oidx, vidx = self._normalize_indices(index)
1377 return AnnData(self, oidx=oidx, vidx=vidx, asview=True)
1378

~/anaconda3/envs/sc-tutorial/lib/python3.7/site-packages/anndata/base.py in _normalize_indices(self, index)
1351 obs, var = unpack_index(index)
1352 obs = _normalize_index(obs, self.obs_names)
-> 1353 var = _normalize_index(var, self.var_names)
1354 return obs, var
1355

~/anaconda3/envs/sc-tutorial/lib/python3.7/site-packages/anndata/base.py in _normalize_index(index, names)
264 # incredibly faster one
265 positions = pd.Series(index=names, data=range(len(names)))
--> 266 positions = positions[index]
267 if positions.isnull().values.any():
268 not_found = positions.index[positions.isnull().values]

~/anaconda3/envs/sc-tutorial/lib/python3.7/site-packages/pandas/core/series.py in __getitem__(self, key)
909 key = check_bool_indexer(self.index, key)
910
--> 911 return self._get_with(key)
912
913 def _get_with(self, key):

~/anaconda3/envs/sc-tutorial/lib/python3.7/site-packages/pandas/core/series.py in _get_with(self, key)
951 return self.loc[key]
952
--> 953 return self.reindex(key)
954 except Exception:
955 # [slice(0, 5, None)] will break if you convert to ndarray,

~/anaconda3/envs/sc-tutorial/lib/python3.7/site-packages/pandas/core/series.py in reindex(self, index, **kwargs)
3736 @Appender(generic.NDFrame.reindex.__doc__)
3737 def reindex(self, index=None, **kwargs):
-> 3738 return super(Series, self).reindex(index=index, **kwargs)
3739
3740 def drop(self, labels=None, axis=0, index=None, columns=None,

~/anaconda3/envs/sc-tutorial/lib/python3.7/site-packages/pandas/core/generic.py in reindex(self, *args, **kwargs)
4354 # perform the reindex on the axes
4355 return self._reindex_axes(axes, level, limit, tolerance, method,
-> 4356 fill_value, copy).__finalize__(self)
4357
4358 def _reindex_axes(self, axes, level, limit, tolerance, method, fill_value,

~/anaconda3/envs/sc-tutorial/lib/python3.7/site-packages/pandas/core/generic.py in _reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
4372 obj = obj._reindex_with_indexers({axis: [new_index, indexer]},
4373 fill_value=fill_value,
-> 4374 copy=copy, allow_dups=False)
4375
4376 return obj

~/anaconda3/envs/sc-tutorial/lib/python3.7/site-packages/pandas/core/generic.py in _reindex_with_indexers(self, reindexers, fill_value, copy, allow_dups)
4488 fill_value=fill_value,
4489 allow_dups=allow_dups,
-> 4490 copy=copy)
4491
4492 if copy and new_data is self._data:

~/anaconda3/envs/sc-tutorial/lib/python3.7/site-packages/pandas/core/internals/managers.py in reindex_indexer(self, new_axis, indexer, axis, fill_value, allow_dups, copy)
1222 # some axes don't allow reindexing with dups
1223 if not allow_dups:
-> 1224 self.axes[axis]._can_reindex(indexer)
1225
1226 if axis >= self.ndim:

~/anaconda3/envs/sc-tutorial/lib/python3.7/site-packages/pandas/core/indexes/base.py in _can_reindex(self, indexer)
3085 # trying to reindex on an axis with duplicates
3086 if not self.is_unique and len(indexer):
-> 3087 raise ValueError("cannot reindex from a duplicate axis")
3088
3089 def reindex(self, target, method=None, level=None, limit=None,

ValueError: cannot reindex from a duplicate axis

shape does not fit

Dear,
I am using scvelo to process my data. I have a mouse single-cell dataset (9 samples and 40,000 cells in total) for which scvelo works well. For another mouse single-cell dataset (2 samples and 4,000 cells in total), scvelo also works well if I keep all the genes (about 18,000 genes), but if I filter genes with adata = adata[:, filter_result.gene_subset] (keeping about 2,000 genes), the following error occurs. Could you help me find the cause?

In [100]: scv.pp.moments(adata)
computing neighbors
finished (0:00:04.18) --> added to .uns['neighbors']
'distances', weighted adjacency matrix
'connectivities', weighted adjacency matrix
computing moments based on connectivities
filtered out 1 cells that have less than 1 counts

ValueError Traceback (most recent call last)
in ()
----> 1 scv.pp.moments(adata)

~/bin/miniconda3/lib/python3.6/site-packages/scvelo/preprocessing/moments.py in moments(data, n_neighbors, n_pcs, mode, use_rep, recurse_neighbors, renormalize, plot, copy)
58
59 logg.info('computing moments based on ' + str(mode), r=True)
---> 60 normalize_layers(adata)
61
62 connectivities = get_connectivities(adata, mode, recurse_neighbors=recurse_neighbors)

~/bin/miniconda3/lib/python3.6/site-packages/scvelo/preprocessing/utils.py in normalize_layers(data, layers, by_total_size, max_proportion_per_cell, copy)
277 if max_proportion_per_cell is not None and (0 < max_proportion_per_cell < 1):
278 counts_per_cell = counts_per_cell_quantile(adata.X, max_proportion_per_cell, counts_per_cell)
--> 279 adata.layers[layer] = normalize_per_cell(adata.layers[layer], None, counts_per_cell, copy=True)
280 return adata if copy else None
281

~/bin/miniconda3/lib/python3.6/site-packages/anndata/layers.py in __setitem__(self, key, value)
58 else:
59 if value.shape != self._adata.shape:
---> 60 raise ValueError('Shape does not fit.')
61 self._layers[key] = value
62

ValueError: Shape does not fit.

TypeError with tl.rank_velocity_genes

Hello, I am trying to use scv.tl.rank_velocity_genes(adata, match_with='clusters'), but I keep encountering this error: TypeError: unsupported operand type(s) for +: 'int' and 'str'. The full traceback is:
File "<stdin>", line 1, in <module>

File "/data1/lan/miniconda3/lib/python3.6/site-packages/scvelo/tools/rank_velocity_genes.py", line 151, in rank_velocity_genes velocity_clusters(adata, vkey=vkey, match_with=match_with, resolution=resolution)

File "/data1/lan/miniconda3/lib/python3.6/site-packages/scvelo/tools/rank_velocity_genes.py", line 101, in velocity_clusters vc = vc.cat.rename_categories({cat: new_cat + ' ' + cat})

TypeError: unsupported operand type(s) for +: 'int' and 'str'

Detect if adata.X is log normalised or not.

Hi

I wanted to know how the check 'Did not modify X as it looks preprocessed already' works. In the code it compares the counts of spliced and X:
log_advised = np.allclose(adata.X[:10].sum(), adata.layers['spliced'][:10].sum())

Can you comment on why X would be log-normalised if these counts are equal?

Using precomputed UMAP coordinates

Hello! I have a dataset for which I have already performed clustering using UMAP (Seurat). I want to use the same coordinates and compute velocity. For some reason "velocity.R" keeps failing. Is it possible to use scvelo with pre-computed UMAP coordinates? Thanks!
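For illustration, a minimal sketch of one way precomputed coordinates are commonly attached, assuming the Seurat UMAP was exported to a CSV indexed by the same cell barcodes as adata ('seurat_umap.csv' and the column names are hypothetical):

    import pandas as pd

    umap = pd.read_csv('seurat_umap.csv', index_col=0)   # hypothetical export from Seurat
    adata.obsm['X_umap'] = umap.loc[adata.obs_names, ['UMAP_1', 'UMAP_2']].values

    scv.pp.moments(adata, n_pcs=30, n_neighbors=30)
    scv.tl.velocity(adata)
    scv.tl.velocity_graph(adata)
    scv.pl.velocity_embedding_stream(adata, basis='umap')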

passing an ax induces strange behavior

Hi Volker,

sorry for the insufficient example, but depending on whether I pass a matplotlib Axes or not, I get completely different behavior. Specifically, all colors are removed when passing ax.

scv.pl.velocity_embedding_grid(adata, basis='umap', ax=pl.figure(figsize=(10, 10)).gca())

Not a top prio, but for reference. 😉

Question about modes in velocity function

Could you explain the difference between the modes in the tl.velocity() function?

From a cursory look at your code and readme, it looks like deterministic is the original La Manno velocity, stochastic is your new one, and I have no clue what 'bayes' is.

Just wanted clarification.
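For illustration, a hedged sketch of how the mode argument is typically passed; the set of modes listed here reflects my understanding of the API ('deterministic', 'stochastic', and in later versions 'dynamical'), so check the scv.tl.velocity docstring for the authoritative list:

    scv.tl.velocity(adata, mode='deterministic')   # steady-state model, as in La Manno et al.
    scv.tl.velocity(adata, mode='stochastic')      # default; additionally uses second-order moments

    # the dynamical model requires fitting the full kinetics first
    scv.tl.recover_dynamics(adata)
    scv.tl.velocity(adata, mode='dynamical')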

invalid value encountered in double_scalars

Dear,
when I run scv.tl.velocity_embedding(adata, basis='draw_graph_fa') and scv.pl.velocity_embedding_grid(adata, legend_loc='on data', size=10, basis='draw_graph_fa'), an error occurs:
invalid value encountered in double_scalars, and there are no velocity arrows on the FA plot. But when I use umap instead, everything is OK.

pp.show_proportions cannot detect abundance of transcripts

Hi, I am trying to run scVelo using a merged object (a combination of 16 loom files). Unfortunately, when I run scv.pp.show_proportions(adata) I encounter this error: RuntimeWarning: invalid value encountered in true_divide [np.mean(tot_mol_cell / np.sum(tot_mol_cell_layers, 0)) for tot_mol_cell in tot_mol_cell_layers], 2). This prevents me from later running scv.tl.velocity_graph(adata).
When I run the same object using velocyto, I don't encounter an issue, which is why I am wondering what could be happening here. Any clarity would be appreciated.

Velocity and Diffmap

Hi!

I have an issue plotting the velocity on the diffusion map. The arrows and the diffusion map are both plotted, but the arrows are all stacked one below the other on the right or left side of the plot and don't actually show the velocity of each cell. There is no error message when I call the function.

Thank you!
Best,
Lena

Logarithmization message

Hej,
I noticed a small thing that has created a little bit of confusion for me :)

I have been using scvelo.pp.filter_and_normalize(all_data, min_counts=20, min_counts_u=10) when playing around with my data, which has been merged with a dataset on which I applied the scanpy clustering tools (louvain, umap, tsne and paga). The matrix X was logarithmized during that analysis.

Nonetheless I get the following message when running scvelo.pp.moments(all_data, n_pcs=15, n_neighbors=30):

Consider to logarithmize adata.X with `scv.pp.log1p` for better results.

The same happens even when I first apply scvelo.pp.log1p(all_data_loom). However, the message disappears when I use a dataset not merged with another one handled by scanpy. Is it possible that scanpy and scvelo check differently whether the data has been logarithmized?

Can't read loom files with scv.read>=0.1.4

Hello,
I previously did some testing with scvelo 0.1.4.
I am now running scvelo 0.1.11.dev6+133abe9 after a git pull.
I can't read loom files anymore. I tried with my own files and with downloaded examples:

adata = scv.read('data/ForebrainGlut/hgForebrainGlut.loom', cleanup=True,
sparse=True, cache=True, backup_url='http://pklab.med.harvard.edu/velocyto/hgForebrainGlut/hgForebrainGlut.loom')

adata = scv.read("data/DentateGyrus/10X43_1.loom", sparse=True,
cache=True, backup_url="http://pklab.med.harvard.edu/velocyto/DG1/10X43_1.loom")
WARNING:root:initialising LoomConnection to data/ForebrainGlut/hgForebrainGlut.loom failed, closing file connection
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/path/to/local/python/python-dev/lib/python3.6/site-packages/scanpy/readwrite.py", line 78, in read
    backup_url=backup_url, cache=cache, **kwargs)
  File "/path/to/local/python/python-dev/lib/python3.6/site-packages/scanpy/readwrite.py", line 445, in _read
    adata = read_loom(filename=filename, **kwargs)
  File "/path/to/local/python/python-dev/lib/python3.6/site-packages/anndata/readwrite/read.py", line 152, in read_loom
    with connect(filename, 'r') as lc:
  File "/path/to/local/python/python-dev/lib/python3.6/site-packages/loompy/loompy.py", line 957, in connect
    return LoomConnection(filename, mode)
  File "/path/to/local/python/python-dev/lib/python3.6/site-packages/loompy/loompy.py", line 83, in __init__
    raise e
  File "/path/to/local/python/python-dev/lib/python3.6/site-packages/loompy/loompy.py", line 72, in __init__
    self.row_graphs = loompy.GraphManager(self, axis=0)
  File "/path/to/local/python/python-dev/lib/python3.6/site-packages/loompy/graph_manager.py", line 34, in __init__
    ds._file.create_group(a)
  File "/path/to/local/python/python-dev/lib/python3.6/site-packages/h5py/_hl/group.py", line 60, in create_group
    gid = h5g.create(self.id, name, lcpl=lcpl, gcpl=gcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5g.pyx", line 161, in h5py.h5g.create
ValueError: Unable to create group (no write intent on file)

I tried checking out 0.1.10 and 0.1.6; same issue.
Are there new (missing) library requirements since v0.1.4?

Thanks.

Mathematical formula for scvelo velocity estimation

Thanks for this nice package! It works pretty well in my hands. I am quite curious about the method you use to estimate velocity, in comparison to the original RNA velocity paper. Are there any plans to publish or post this to bioRxiv soon?
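For context, a sketch of the splicing kinetics the cited models build on (notation is mine; alpha, beta and gamma denote transcription, splicing and degradation rates):

    \frac{du(t)}{dt} = \alpha(t) - \beta\,u(t), \qquad
    \frac{ds(t)}{dt} = \beta\,u(t) - \gamma\,s(t)

As far as I understand, the steady-state model takes the velocity as the deviation from the estimated steady-state ratio, roughly v_i = u_i - (\gamma/\beta)\,s_i, with the ratio fitted on extreme quantiles; the stochastic model additionally uses second-order moments, and the EM (dynamical) model infers the full kinetics, as described in [2].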

'NoneType' object error

Hi,
I am trying to run velocity using scVelo after I analyzed my data using the scanpy package.
My AnnData is like this:

AnnData object with n_obs × n_vars = 708 × 3064 
    uns: 'neighbors'
    obsm: 'X_pca', 'X_umap', 'X_tsne'
    varm: 'PCs'
    layers: 'spliced', 'unspliced'

But when I move to the next step scv.pp.filter_genes(adata, min_counts=20, min_counts_u=10) I get the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-110-9aec77ebb497> in <module>
      6 
      7 
----> 8 scv.pp.filter_genes(adata, min_counts=20, min_counts_u=10)
      9 
     10 

~/.conda/envs/velocyto_env/lib/python3.7/site-packages/scvelo/preprocessing/utils.py in filter_genes(data, min_counts, min_cells, max_counts, max_cells, min_counts_u, min_cells_u, max_counts_u, max_cells_u, min_shared_counts, min_shared_cells, copy)
    142 
    143     # set initial cell sizes before filtering
--> 144     set_initial_size(adata)
    145 
    146     layers = [layer for layer in ['spliced', 'unspliced'] if layer in adata.layers.keys()]

~/.conda/envs/velocyto_env/lib/python3.7/site-packages/scvelo/preprocessing/utils.py in set_initial_size(adata, layers)
     74     for layer in layers:
     75         adata.obs['initial_size_' + layer] = get_size(adata, layer)
---> 76     if 'initial_size' not in adata.obs.keys(): adata.obs['initial_size'] = get_size(adata)
     77 
     78 

~/.conda/envs/velocyto_env/lib/python3.7/site-packages/scvelo/preprocessing/utils.py in get_size(adata, layer)
     66 def get_size(adata, layer=None):
     67     X = adata.X if layer is None else adata.layers[layer]
---> 68     return X.sum(1).A1 if issparse(X) else X.sum(1)
     69 
     70 

AttributeError: 'NoneType' object has no attribute 'sum'

Also, the scv.pp.moments function is not working either. I found in a previous issue that I should prepare the data with scanpy==1.4.1; unfortunately, I still have the same problem. Any help?

Thanks, HM

API considerations

Hi!

While scanpy is a toolbox with many independent parts, I think scvelo is unlikely to outgrow a certain size.

I think it would be sufficient to import everything directly in scvelo/__init__.py. I have the following in mind:

  • Get rid of scvelo.preprocessing:

    • show_proportions is a debugging function that prints stuff and is, for a weird reason, also able to remove layers… something not reflected in the name. Maybe put it in some internal scvelo/_utils module and split it into print_proportions and clear_layers functions.
    • moments can directly be scvelo.moments. The function shouldn't be named the same as the module. Maybe add_moments? Then you can also import it directly in scvelo without shadowing the module name.
    • recipes can also just be scvelo.recipes.
  • Get rid of tools:

    • scvelo.tools.velocity, scvelo.tools.velocity_embedding, scvelo.tools.velocity_graph, … can just be scvelo.velocity, scvelo.velocity_embedding, …
    • the main velocity function can also be named e.g. add_velocity and imported in scvelo directly.
  • Rename plotting.velocity to plotting.plot_velocity and import it in scvelo.

Now we have a flat module and example code would be

import scvelo  # no need to rename this, keep the short full name

adata = ...
scvelo.velocity_graph(adata, ...)
scvelo.plot_velocity(adata, ...)

or so.

"AttributeError: 'numpy.ndarray' object has no attribute 'toarray'" in scv.pp.moments

Hi!

I tried the example notebook and encountered the error below.

The libraries are
scvelo==0.1.3+1.g7af8818, Scanpy==1.3.1, anndata==0.6.10 , numpy==1.15.1 , scipy==1.0.0,
pandas==0.23.4 , scikit-learn==0.19.1 , statsmodels==0.9.0

[screenshot of the error traceback]

The applied dot function seems to disturb the csr type. Or should I specify particular library versions?

I appreciate your work and look forward to the continued development of scvelo and scanpy every single day.
Best regards.

meaning of rank_velocity_genes

Hi,
I am wondering what the function rank_velocity_genes is actually doing.
In my case, I am working on a static tissue, a non-developmental organ: the liver. That means I have no prior knowledge about the differentiation processes in the tissue, and I am not expecting a flow of information between different cell types, even though velocity gives me nice vectors.
The rank_velocity_genes function gives me a group of biologically meaningless genes. My question is: what would you suggest to test for such results? The number of PCs, neighbors...?

*I have three clusters and I am getting the same genes, in a different order, for all three clusters.

Thanks a lot,
HM

PosixPath Error when using scv.settings.figdir

Hi there, when I specify figdir as a string path in the settings and use scv.settings.autosave=True and scv.settings.autoshow=False, I get the following output:
scv.pl.velocity_embedding(avdata, basis='umap')
  File "/utils/miniconda3/envs/scanpy/lib/python3.6/site-packages/scvelo/plotting/velocity_embedding.py", line 138, in velocity_embedding
    ax = scatter(adata, x=x, y=y, layer=layer, color=color, size=size, title=title, ax=ax, zorder=0, **scatter_kwargs)
  File "/utils/miniconda3/envs/scanpy/lib/python3.6/site-packages/scvelo/plotting/scatter.py", line 171, in scatter
    savefig_or_show('' if basis is None else basis, dpi=dpi, save=save, show=show)
  File "/utils/miniconda3/envs/scanpy/lib/python3.6/site-packages/scvelo/plotting/utils.py", line 263, in savefig_or_show
    if settings.figdir[-1] != '/': settings.figdir += '/'
TypeError: 'PosixPath' object does not support indexing

Starting file?

Hello,

I have 10x datasets that I have been using for Scanpy analysis, and would really like to incorporate scvelo into my analysis!

Sorry, this is such a basic question, but what are the starting files for running scvelo (i.e. BAM files converted into loom files)? In the example you have posted (dentate gyrus dataset) it seems like the unspliced/spliced counts are already incorporated into the dataset.

Do I need to create a file from the BAM files, such as with velocyto?

Thank you in advance!

Velocity changes after update to 0.1.15

Hi Volker,
I just wondered why the RNA velocity flows have changed in my case (not entirely, but significantly) after updating scvelo to version 0.1.15, without having made any further changes. Did you change/improve anything that could explain this?
Best, Jonas

pp.neighbors vs pp.moments

Hi Volker,
I noticed that computing the neighbourhood graph with scanpy before computing the moments makes a difference. I would have thought doing so would simply be redundant (for the same parameters, i.e. number of principal components and neighbours), but it seems to matter: the UMAP looks different, the number of Louvain clusters at fixed resolution differs, and the velocity flows change. So how does pp.moments actually compute the neighbourhood graph?
Best, Jonas

quantify scvelo to measure dynamics

Hi,

I tested scvelo on inflamed immune cells from differently aged mice and I see a nice trend. The arrows going from normal to inflamed cells become smaller with age, correlating with reduced immune function in this particular tissue and assay. I want to quantify this. How can I extract the values of the vectors for each cell so I can average them or plot them in a boxplot?

Thanks for your help!
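For illustration, a hedged sketch of one way per-cell velocity magnitudes can be extracted; my understanding is that the high-dimensional vectors live in adata.layers['velocity'] and that scv.tl.velocity_confidence adds a per-cell 'velocity_length' to adata.obs, but treat the key names as assumptions to verify:

    import numpy as np

    # per-cell speed from the high-dimensional velocity vectors
    # (assumes a dense layer; NaNs for non-velocity genes are ignored)
    v = adata.layers['velocity']
    adata.obs['velocity_speed'] = np.sqrt(np.nansum(v ** 2, axis=1))

    # alternatively, scVelo's own per-cell summary statistics
    scv.tl.velocity_confidence(adata)
    adata.obs.groupby('age')['velocity_length'].mean()   # 'age' is a hypothetical obs column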

scvelo neighbors

Dear,
I am trying to compute RNA velocity with scvelo and I find that scvelo will search for neighbors by default. I have a scanpy AnnData object whose neighbors were computed with a customized method. I want to keep these neighbors; is there any way to stop scvelo from re-computing neighbors automatically?
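A hedged sketch of what I believe happens, based on the moments code quoted elsewhere on this page: scv.pp.moments only recomputes neighbors when none are stored in adata.uns['neighbors'] or when more neighbors are requested than were precomputed, so keeping the requested n_neighbors at or below the precomputed value should reuse the custom graph (this is an interpretation, not an official statement):

    import scanpy as sc

    sc.pp.neighbors(adata, n_neighbors=30, use_rep='X_custom')   # 'X_custom' is a hypothetical representation
    scv.pp.moments(adata, n_neighbors=30)                        # should reuse the existing graph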

Long time reading a file with scv.read

Hi!
I am trying to read a .loom file, whose size is approx 900MB. I have installed scvelo v0.1.13 and I have used the command

scvelo.read(myFile, sparse=True, cache=True)

The command seems to take a long time (it has been running for almost one hour), but when I look at the job running on my cluster, I can see that no more data gets loaded into memory.
I was wondering if such a long read time has happened to anyone else using scvelo, or if it could be caused by dependencies.

Thanks for help, you guys are doing great job 👍
Samuele

Group constraints for aggregated samples?

Hi there,

I was wondering if there is a way (or if it is necessary) to manage situations where the data is aggregated from multiple samples, and neighbouring cells in PC space may not have the same properties due to different sample conditions (e.g. differentiation time-course experiments, where the lagging cells of one time point may be neighbours of fast-responding cells of the previous time point).

I know Velocyto has problems with this when using default settings because of knn imputation (a group_constrain parameter in the knn_imputation function deals with this though), but I'm not super familiar with the differences between scvelo and Velocyto, so it may not be relevant.

Any help is appreciated!

David

Plot smoothed velocities

Hello,

I opened the GitHub page of scvelo and noticed that there is a plot where the velocities are smoothed instead of being drawn over each single cell. That seems to give a clearer view of the cell dynamics.

I cannot find a way to recreate something like it with my data from the documentation. Could you please tell me which command and options to use? Or is it something the author of the picture implemented outside of the scvelo tool?

Thanks for the great job with scvelo,
Samuele
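For reference, a minimal sketch of the plotting calls that draw averaged/smoothed velocities instead of one arrow per cell, assuming velocities and the velocity graph have already been computed:

    scv.pl.velocity_embedding_grid(adata, basis='umap')     # arrows averaged on a regular grid
    scv.pl.velocity_embedding_stream(adata, basis='umap')   # streamlines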

error in scvelo.tl.rank_velocity_genes()

The command that gives me errors:

scv.tl.rank_velocity_genes(data, match_with = 'louvain',resolution = 0.2)

TypeError Traceback (most recent call last)
/anaconda3/envs/scpMay/lib/python3.6/site-packages/pandas/core/ops.py in na_op(x, y)
1788 try:
-> 1789 result = op(x, y)
1790 except TypeError:

/anaconda3/envs/scpMay/lib/python3.6/site-packages/pandas/core/ops.py in rand_(left, right)
186 def rand_(left, right):
--> 187 return operator.and_(right, left)
188

TypeError: unsupported operand type(s) for &: 'numpy.ndarray' and 'Categorical'

During handling of the above exception, another exception occurred:

AssertionError Traceback (most recent call last)
in
----> 1 scv.tl.rank_velocity_genes(day11, match_with = 'louvain',resolution = 0.2)

/anaconda3/envs/scpMay/lib/python3.6/site-packages/scvelo/tools/rank_velocity_genes.py in rank_velocity_genes(data, vkey, n_genes, groupby, match_with, resolution, min_counts, min_r2, min_dispersion, copy)
149
150 if groupby is None or groupby == 'velocity_clusters':
--> 151 velocity_clusters(adata, vkey=vkey, match_with=match_with, resolution=resolution)
152 groupby = 'velocity_clusters'
153

/anaconda3/envs/scpMay/lib/python3.6/site-packages/scvelo/tools/rank_velocity_genes.py in velocity_clusters(data, vkey, match_with, resolution, copy)
65 tmp_filter = np.ones(vdata.n_vars, dtype=bool)
66 if vkey + '_genes' in vdata.var.keys():
---> 67 tmp_filter &= vdata.var[vkey + '_genes']
68
69 if 'unspliced' in vdata.layers.keys():

/anaconda3/envs/scpMay/lib/python3.6/site-packages/pandas/core/ops.py in wrapper(self, other)
1848 filler = (fill_int if is_self_int_dtype and is_other_int_dtype
1849 else fill_bool)
-> 1850 res_values = na_op(self.values, ovalues)
1851 unfilled = self._constructor(res_values,
1852 index=self.index, name=res_name)

/anaconda3/envs/scpMay/lib/python3.6/site-packages/pandas/core/ops.py in na_op(x, y)
1792 if isinstance(y, np.ndarray):
1793 # bool-bool dtype operations should be OK, should not get here
-> 1794 assert not (is_bool_dtype(x) and is_bool_dtype(y))
1795 x = ensure_object(x)
1796 y = ensure_object(y)

AssertionError:

session info

import scvelo as scv
scv.logging.print_version()
Running scvelo 0.1.17 (python 3.6.7) on 2019-05-31 15:02.

import numpy as np
import pandas as pd
import scanpy as sc
sc.settings.verbosity = 3
sc.logging.print_versions()
sc.settings.set_figure_params(dpi=80)
scanpy==1.4.3 anndata==0.6.20 umap==0.3.9 numpy==1.15.4 scipy==1.1.0 pandas==0.24.2 scikit-learn==0.20.0 statsmodels==0.9.0 python-igraph==0.7.1 louvain==0.6.1

attempt

I have tried
data.obs['cluster'] = data.obs['louvain'].astype(str)
but it still gives me the same error.

error when running exemplary DentateGyrus.ipynb

Hi, I am trying to run DentateGyrus.ipynb. However, I get stuck on scv.tl.velocity_graph(adata).
Any suggestions?

All previous cells work well.

from IPython.core.display import display, HTML
display(HTML("<style>.container { width:90% !important; }</style>"))
%matplotlib inline

import scvelo as scv
scv.logging.print_version()
Running scvelo 0.1.17.dev4+95beed0 (python 3.6.8) on 2019-05-23 15:23.
scv.settings.set_figure_params('scvelo')
adata = scv.read("data/DentateGyrus/10X43_1.loom", sparse=True, cache=True,
                 backup_url="http://pklab.med.harvard.edu/velocyto/DG1/10X43_1.loom")
scv.utils.show_proportions(adata)
scv.utils.cleanup(adata, clean='all')

adata
Abundance of ['spliced', 'unspliced', 'ambiguous']: [0.68 0.08 0.24]

AnnData object with n_obs × n_vars = 3396 × 25919 
    layers: 'spliced', 'unspliced'
scv.pp.filter_and_normalize(adata, min_counts=20, min_counts_u=10, n_top_genes=3000)
Filtered out 13864 genes that are detected in less than 20 counts (spliced).
Filtered out 6520 genes that are detected in less than 10 counts (unspliced).
Normalized count data: X, spliced, unspliced.
Logarithmized X.
scv.pp.moments(adata, n_pcs=30, n_neighbors=30)
computing neighbors
    finished (0:00:07.31) --> added to `.uns['neighbors']`
    'distances', weighted adjacency matrix
    'connectivities', weighted adjacency matrix
computing moments based on connectivities
    finished (0:00:00.63) --> added 
    'Ms' and 'Mu', moments of spliced/unspliced abundances (adata.layers)
scv.tl.velocity(adata)
computing velocities
    finished (0:00:00.29) --> added 
    'velocity', velocity vectors for each individual cell (adata.layers)
scv.tl.velocity_graph(adata)
---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-8-699687a77a69> in <module>
----> 1 scv.tl.velocity_graph(adata)


/work/06032/tg853315/stampede2/software/miniconda3/lib/python3.6/site-packages/scvelo/tools/velocity_graph.py in velocity_graph(data, vkey, xkey, tkey, basis, n_neighbors, n_recurse_neighbors, random_neighbors_at_max, sqrt_transform, approx, copy)
    153     vgraph = VelocityGraph(adata, vkey=vkey, xkey=xkey, tkey=tkey, basis=basis, n_neighbors=n_neighbors, approx=approx,
    154                            n_recurse_neighbors=n_recurse_neighbors, random_neighbors_at_max=random_neighbors_at_max,
--> 155                            sqrt_transform=sqrt_transform, report=True)
    156 
    157     logg.info('computing velocity graph', r=True)


/work/06032/tg853315/stampede2/software/miniconda3/lib/python3.6/site-packages/scvelo/tools/velocity_graph.py in __init__(self, adata, vkey, xkey, tkey, basis, n_neighbors, sqrt_transform, n_recurse_neighbors, random_neighbors_at_max, approx, report)
     54         if 'neighbors' not in adata.uns.keys(): neighbors(adata)
     55         if n_neighbors is None or n_neighbors < adata.uns['neighbors']['params']['n_neighbors']:
---> 56             self.indices = get_indices(dist=adata.uns['neighbors']['distances'], n_neighbors=n_neighbors)[0]
     57         else:
     58             if basis is None: basis = [key for key in ['X_pca', 'X_tsne', 'X_umap'] if key in adata.obsm.keys()][-1]


/work/06032/tg853315/stampede2/software/miniconda3/lib/python3.6/site-packages/scvelo/tools/utils.py in get_indices(dist, n_neighbors)
     92         dat[rm_idx] = 0
     93     D.eliminate_zeros()
---> 94     indices = D.indices.reshape((-1, n_neighbors))
     95     return indices, D
     96 


ValueError: cannot reshape array of size 0 into shape (0)

problem running the scv.pp.moments

Hi,

scvelo used to run fine before, at least with scanpy version 1.3. But now, with the new update, I am having the following issue. I appreciate your help.
from os import path
import numpy as np
import matplotlib.pyplot as pl
import scvelo as scv
import scanpy.api as sc

scv.pp.filter_and_normalize(adata1)
scv.pp.moments(adata1)
scv.tl.velocity(adata1, mode='stochastic')# Estimating the velocities for each individual cell
scv.tl.velocity_graph(adata1)
scv.tl.velocity_embedding(adata1, basis='tsne')
scv.tl.velocity_embedding(adata1, basis='umap')
scv.tl.velocity_embedding(adata1, basis='phate')
scv.tl.velocity_embedding(adata1, basis='diffmap')

Did not modify X as it looks preprocessed already.

KeyError Traceback (most recent call last)
in
7 scv.pp.filter_and_normalize(adata1)
8
----> 9 scv.pp.moments(adata1)
10 scv.tl.velocity(adata1, mode='stochastic')# Estimating the velocities for each individual cell
11 scv.tl.velocity_graph(adata1)

~/miniconda3/envs/scanpy/lib/python3.6/site-packages/scvelo/preprocessing/moments.py in moments(data, n_neighbors, n_pcs, mode, method, metric, use_rep, recurse_neighbors, renormalize, copy)
41 if any([not_yet_normalized(adata.layers[layer]) for layer in {'spliced', 'unspliced'}]):
42 normalize_per_cell(adata)
---> 43 if 'neighbors' not in adata.uns.keys() or neighbors_to_be_recomputed(adata, n_neighbors=n_neighbors):
44 if use_rep is None: use_rep = 'X_pca'
45 neighbors(adata, n_neighbors=n_neighbors, use_rep=use_rep, n_pcs=n_pcs, method=method, metric=metric)

~/miniconda3/envs/scanpy/lib/python3.6/site-packages/scvelo/preprocessing/neighbors.py in neighbors_to_be_recomputed(adata, n_neighbors)
129
130
--> 131 def neighbors_to_be_recomputed(adata, n_neighbors=None):
132 # check if neighbors graph is disrupted
133 n_neighs = (adata.uns['neighbors']['distances'] > 0).sum(1)

KeyError: 'distances'

Losing transcripts

Hi. I am new to RNA velocity analysis, and this may already have been discussed elsewhere...

It appears that I am losing a bunch of genes/transcripts during the Velocyto analysis, and I am trying to figure out what is going on. After running Velocyto, there are approximately 5000 final filtered transcripts (presumably the ones used for the vector calculation). However, several genes/transcripts of interest that should be highly expressed in most cells (as seen using software like Seurat) are not listed in the final filtered genes. Thus, I suspect that many genes/transcripts are, for some reason, lost during the Velocyto calculation.

Does Velocyto filter out genes that are not differentially expressed (namely, genes/transcripts that are not useful to plot vector)?

My cells are captured using Fluidigm.

Very much appreciated.

New version totally inverts velocity

Hi, I have been using scvelo version 0.1.16 for some time on various datasets with success, but after upgrading the package to the latest 0.1.18 version I cannot reproduce my results. The new version even leads to a somehow inverted velocity map, which does not make sense biologically. Here is the code used for both versions:

scv.pp.filter_and_normalize(adata, min_counts=20, min_counts_u=10,n_top_genes=4000)
scv.pp.moments(adata)
scv.tl.velocity(adata)
scv.tl.velocity_graph(adata)

The embedding is imported because it has been computed externally.

Here is a stream plot from 0.1.16:
[image: velocity_pca_0.1.16]

And here is the one made with the latest version:
[image: velocity_pca_0.1.18]

What could explain such dramatic change?

Thanks for the help!

data to make plots using matlab

Hi,
Unfortunately, I am not a Python expert! Somehow I managed to run the pipeline and get nice results. However, I feel more comfortable and have more control using Matlab.
I managed to save my work as a loom file and open it in Matlab. My question is: which data slots are relevant for replotting, for example, the phase diagram output of scv.pl.velocity?

Thanks a lot, HM
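For reference, a hedged sketch of the slots I believe the phase portrait in scv.pl.velocity is built from ('velocity_gamma' and the example gene name are assumptions to verify against the scVelo source/docs):

    # adata.layers['Ms'], adata.layers['Mu']   smoothed spliced (x) and unspliced (y) abundances
    # adata.layers['velocity']                 the estimated velocities
    # adata.var['velocity_gamma']              per-gene slope of the steady-state line (assumed key)
    gene = 'Actb'                              # hypothetical gene name
    ms = adata[:, gene].layers['Ms']
    mu = adata[:, gene].layers['Mu']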

Unable to install and use scvelo

I tried to install scvelo using pip install scvelo and got no error. But when I try to import scvelo I get this error:


AttributeError Traceback (most recent call last)
in ()
----> 1 import scvelo as scv
2 scv.logging.print_version()

/anaconda3/lib/python3.6/site-packages/scvelo/__init__.py in <module>()
1 """scvelo - stochastic single cell RNA velocity"""
2
----> 3 from . import preprocessing as pp
4 from . import tools as tl
5 from . import plotting as pl

/anaconda3/lib/python3.6/site-packages/scvelo/preprocessing/__init__.py in <module>()
----> 1 from .utils import show_proportions, cleanup
2 from .moments import moments
3 from .recipes import recipe_velocity

/anaconda3/lib/python3.6/site-packages/scvelo/preprocessing/utils.py in ()
----> 1 import numpy as np
2
3
4 def show_proportions(adata, copy=False):
5 """Fraction of spliced/unspliced/ambiguous abundances

/anaconda3/lib/python3.6/site-packages/numpy/__init__.py in <module>()
140 from . import _distributor_init
141
--> 142 from . import core
143 from .core import *
144 from . import compat

/anaconda3/lib/python3.6/site-packages/numpy/core/__init__.py in <module>()
55 from . import umath
56 from . import _internal # for freeze programs
---> 57 from . import numerictypes as nt
58 multiarray.set_typeDict(nt.sctypeDict)
59 from . import numeric

/anaconda3/lib/python3.6/site-packages/numpy/core/numerictypes.py in ()
109 )
110
--> 111 from ._type_aliases import (
112 sctypeDict,
113 sctypeNA,

/anaconda3/lib/python3.6/site-packages/numpy/core/_type_aliases.py in ()
61 _concrete_typeinfo[k] = v
62
---> 63 _concrete_types = {v.type for k, v in _concrete_typeinfo.items()}
64
65

/anaconda3/lib/python3.6/site-packages/numpy/core/_type_aliases.py in (.0)
61 _concrete_typeinfo[k] = v
62
---> 63 _concrete_types = {v.type for k, v in _concrete_typeinfo.items()}
64
65

AttributeError: 'tuple' object has no attribute 'type'

Any help on how to fix this would be great. I also tried the Git clone install and that gives me a similar error.
