emdann / milopy

Python implementation of Milo for differential abundance testing on KNN graph

License: MIT License

Python 2.92% Jupyter Notebook 97.08%

milopy's Issues

Native python implementation for NB-GLM instead of `edgeR`

Gathering some suggestions:

Patsy for model matrix encoding https://patsy.readthedocs.io/en/latest/overview.html
GLM implementation in pymc3 https://docs.pymc.io/notebooks/GLM-poisson-regression.html

error with milo.count_nhoods(adata, sample_col="sample")

Hi, I get an error when I run milo.count_nhoods: 'Length of values (6516) does not match length of index (6512)'. The 'nhood_ixs_refined' column in the obs dataframe has 6516 nonzero values, but adata.obsm["nhoods"] has only 6512 columns. I have not run into this problem before.
My code is below:
adata.obs['nhood_ixs_refined'].value_counts()
0 30573
1 6516
adata.obsm["nhoods"]
<37089x6512 sparse matrix of type '<class 'numpy.float32'>' with 232791 stored elements in Compressed Sparse Row format>
The line that raises the error:
milo.count_nhoods(adata, sample_col="sample")
`ValueError Traceback (most recent call last)
Cell In[28], line 1
----> 1 milo.count_nhoods(adata, sample_col="sample")
2 # Length of values (5250) does not match length of index (5245)

File /opt/conda/lib/python3.10/site-packages/milopy/core.py:165, in count_nhoods(adata, sample_col)
163 nhood_adata.uns["sample_col"] = sample_col
164 # Save nhood index info
--> 165 nhood_adata.obs["index_cell"] = adata.obs_names[adata.obs["nhood_ixs_refined"] == 1]
166 nhood_adata.obs["kth_distance"] = adata.obs.loc[adata.obs["nhood_ixs_refined"]
167 == 1, "nhood_kth_distance"].values
168 adata.uns["nhood_adata"] = nhood_adata

File /opt/conda/lib/python3.10/site-packages/pandas/core/frame.py:3980, in DataFrame.__setitem__(self, key, value)
3977 self._setitem_array([key], value)
3978 else:
3979 # set column
-> 3980 self._set_item(key, value)

File /opt/conda/lib/python3.10/site-packages/pandas/core/frame.py:4174, in DataFrame._set_item(self, key, value)
4164 def _set_item(self, key, value) -> None:
4165 """
4166 Add series to DataFrame in specified column.
4167
(...)
4172 ensure homogeneity.
4173 """
-> 4174 value = self._sanitize_column(value)
4176 if (
4177 key in self.columns
4178 and value.ndim == 1
4179 and not is_extension_array_dtype(value)
4180 ):
4181 # broadcast across multiple columns if necessary
4182 if not self.columns.is_unique or isinstance(self.columns, MultiIndex):

File /opt/conda/lib/python3.10/site-packages/pandas/core/frame.py:4915, in DataFrame._sanitize_column(self, value)
4912 return _reindex_for_setitem(Series(value), self.index)
4914 if is_list_like(value):
-> 4915 com.require_length_match(value, self.index)
4916 return sanitize_array(value, self.index, copy=True, allow_2d=True)

File /opt/conda/lib/python3.10/site-packages/pandas/core/common.py:571, in require_length_match(data, index)
567 """
568 Check the length of data matches the length of the index.
569 """
570 if len(data) != len(index):
--> 571 raise ValueError(
572 "Length of values "
573 f"({len(data)}) "
574 "does not match length of index "
575 f"({len(index)})"
576 )

ValueError: Length of values (6516) does not match length of index (6512)`

package version:
scanpy==1.9.6 anndata==0.10.4 umap==0.5.5 numpy==1.26.3 scipy==1.11.4 pandas==1.5.3 scikit-learn==1.3.2 statsmodels==0.14.1 igraph==0.11.2 pynndescent==0.5.11 milopy == 0.1.1
Thanks.
Looking forward to your reply.
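
A quick way to confirm the mismatch before calling count_nhoods is to compare the number of flagged index cells with the number of columns in the neighbourhood matrix. The sketch below does this on toy stand-ins for adata.obs["nhood_ixs_refined"] and adata.obsm["nhoods"]; the helper name is hypothetical, and the check only detects the problem, it does not explain why the two numbers differ.

```python
import numpy as np
import pandas as pd
from scipy import sparse

def check_nhood_consistency(nhood_flags, nhoods):
    """Return (n_index_cells, n_nhood_columns, match) for a quick sanity check."""
    n_index_cells = int((np.asarray(nhood_flags) == 1).sum())
    n_columns = nhoods.shape[1]
    return n_index_cells, n_columns, n_index_cells == n_columns

# Toy stand-ins: four cells are flagged as index cells,
# but the neighbourhood matrix has only three columns.
flags = pd.Series([1, 0, 1, 1, 0, 1])
nhoods = sparse.csr_matrix(np.ones((6, 3), dtype=np.float32))

print(check_nhood_consistency(flags, nhoods))
```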

plotNhoodGraphDA in python version

Hi all,

Is it possible to get the same plot as plotNhoodGraphDA() in R with the Python version too? I'm aware of milopy.plot.plot_nhood_graph() but it's not as pretty :) Thanks!

Add informative error when running plot_nhood_graph before graph construction

The current implementation raises a KeyError for the missing Nhood_size.
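
A minimal sketch of such a guard is shown below. It assumes that Nhood_size is a column of the neighbourhood table (adata.uns["nhood_adata"].obs) created by milopy.utils.build_nhood_graph; the helper name require_nhood_graph is hypothetical.

```python
import pandas as pd

def require_nhood_graph(nhood_obs, column="Nhood_size"):
    """Raise an informative error if the neighbourhood graph has not been built."""
    if column not in nhood_obs.columns:
        raise KeyError(
            f"'{column}' not found in the neighbourhood table. "
            "Run milopy.utils.build_nhood_graph(adata) before plotting."
        )

# Toy stand-in for adata.uns["nhood_adata"].obs before graph construction.
nhood_obs = pd.DataFrame({"index_cell": ["cell_1", "cell_7"]})

try:
    require_nhood_graph(nhood_obs)
except KeyError as err:
    message = err.args[0]
    print(message)
```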

Bug: errors if no cells covered for a certain sample

If one sample (in adata.uns['nhood_adata'].var_names) has no cells in any neighbourhood, then DA_nhoods throws the following error:

RRuntimeError: Error in glmFit.default(sely, design, offset = seloffset, dispersion = 0.05,  : 
  nrow(design) disagrees with ncol(y)

Fix:

The problem is that the count matrix is filtered by library size, but the model matrix is not.

In milopy/milopy/core.py (line 250 in be1a6cc), change

dge = edgeR.estimateDisp(dge, model)

to

dge = edgeR.estimateDisp(dge, model[keep_smp,])

Add warning when keep_smp is used
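
The idea can be sketched in plain NumPy: the same boolean mask is applied to the columns of the count matrix and the rows of the design matrix, with a warning when anything is dropped. The function name and the toy arrays are made up for illustration; in milopy the filtering happens on the R objects passed to edgeR.

```python
import warnings
import numpy as np

def filter_empty_samples(counts, design):
    """Drop samples with zero library size from both counts and design.

    counts: (n_nhoods, n_samples) array; design: (n_samples, n_covariates) array.
    """
    lib_size = counts.sum(axis=0)
    keep_smp = lib_size > 0
    if not keep_smp.all():
        n_dropped = int((~keep_smp).sum())
        warnings.warn(f"Dropping {n_dropped} sample(s) with no cells in any neighbourhood.")
    return counts[:, keep_smp], design[keep_smp, :], keep_smp

# Toy data: the third sample has no cells in any neighbourhood.
counts = np.array([[2, 1, 0, 4],
                   [3, 0, 0, 1]])
design = np.array([[1, 0], [1, 0], [1, 1], [1, 1]])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    counts_f, design_f, keep_smp = filter_empty_samples(counts, design)

print(counts_f.shape, design_f.shape)
```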

Get all cells of a neighborhood given the index_cell

Hello,
is there a simple way to determine all cells of a neighborhood by the index cell?
Using the adata.obsm['nhoods'] matrix only works if the index_cell does not belong to another neighborhood.

Thank you for your help
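
One possible approach, sketched on toy data: the count_nhoods traceback elsewhere on this page shows index_cell being set to adata.obs_names[adata.obs["nhood_ixs_refined"] == 1], which suggests that column j of adata.obsm["nhoods"] belongs to the j-th flagged cell in obs order. Under that assumption, the column can be looked up by position rather than by membership, so it does not matter whether the index cell also sits in other neighbourhoods. The helper name is hypothetical.

```python
import numpy as np
from scipy import sparse

def cells_in_nhood(index_cell, obs_names, nhood_flags, nhoods):
    """Return the names of all cells in the neighbourhood of `index_cell`."""
    obs_names = np.asarray(obs_names)
    # Index cells, in the same order as the columns of the nhoods matrix (assumed).
    index_cells = obs_names[np.asarray(nhood_flags) == 1]
    matches = np.flatnonzero(index_cells == index_cell)
    if matches.size == 0:
        raise ValueError(f"{index_cell!r} is not an index cell")
    members = nhoods[:, [int(matches[0])]].toarray().ravel() > 0
    return obs_names[members].tolist()

# Toy stand-ins: c0 and c3 are index cells, and c3 also belongs to c0's neighbourhood.
obs_names = ["c0", "c1", "c2", "c3", "c4"]
nhood_flags = [1, 0, 0, 1, 0]
nhoods = sparse.csr_matrix(np.array([
    [1, 0],
    [1, 0],
    [0, 1],
    [1, 1],
    [0, 1],
], dtype=np.float32))

print(cells_in_nhood("c3", obs_names, nhood_flags, nhoods))
```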

plotNhoodExpressionDA in milopy?

Thank you for maintaining milopy!

Regarding the function plotNhoodExpressionDA in MarioniLab/miloR#31, is there a variant of this function exposed through this package? I think it is only R at the moment. It could be useful to also provide it here.

Error in make_nhoods if the reduced dims are in pandas.DataFrame

The error:

milo.make_nhoods(adata, prop=0.05)

InvalidIndexError: (array([    6,    14,    31,    32,    36,    41,    44,    45,    46,
        1226,  1893,  2742,  3122,  3475,  3494,  3578,  3588,  4416,
        4446,  5146,  5274,  5686,  6432,  6524,  7218,  7323,  8239,
        8945, 13048, 14444, 14774, 15035, 15580, 16304, 20604, 20842,
       21208, 21289, 21313, 21352, 23599], dtype=int32), slice(None, None, None))

The quick fix:

adata.obsm[adata.uns['neighbors']['use_rep']] = adata.obsm[adata.uns['neighbors']['use_rep']].values.copy()

Solution: convert X_dimred to array if it's a DataFrame
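
A small sketch of that conversion, using a hypothetical helper name. Positional indexing with an integer array and a slice, as in the failing call, works on the resulting NumPy array.

```python
import numpy as np
import pandas as pd

def as_dense_array(X_dimred):
    """Return the embedding as a NumPy array, whether or not it is a DataFrame."""
    if isinstance(X_dimred, pd.DataFrame):
        return X_dimred.to_numpy()
    return np.asarray(X_dimred)

# Toy embedding stored as a DataFrame, as in the failing case.
embedding = pd.DataFrame(np.arange(12.0).reshape(4, 3), index=list("abcd"))
X = as_dense_array(embedding)
rows = X[np.array([0, 2]), :]
print(type(X).__name__, rows.shape)
```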

Fix DA_nhoods warning from rpy2

/home/jovyan/my-conda-envs/emma_env/lib/python3.7/site-packages/rpy2/robjects/vectors.py:980: UserWarning: R object inheriting from "POSIXct" but without attribute "tzone".
  warnings.warn('R object inheriting from "POSIXct" but without '

DA_nhoods() .iteritems deprecated

Hi @emdann, running milo.DA_nhoods(adata, design="~diagnosis") gave me an error caused by the removal of .iteritems in pandas 2. It has been deprecated since pandas 1.5.0 (https://pandas.pydata.org/pandas-docs/version/1.5/reference/api/pandas.DataFrame.iteritems.html). See the traceback below.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_59903/2774797340.py in ?()
----> 1 milo.DA_nhoods(adata, design="~diagnosis")

/lustre/scratch126/cellgen/team205/jc48/miniconda3/envs/milopy_env/lib/python3.9/site-packages/milopy/core.py in ?(adata, design, model_contrasts, subset_samples, add_intercept)
    240 
    241     # Define model matrix
    242     if not add_intercept or model_contrasts is not None:
    243         design = design + ' + 0'
--> 244     model = stats.model_matrix(object=stats.formula(
    245         design), data=design_df)
    246 
    247     # Fit NB-GLM

/lustre/scratch126/cellgen/team205/jc48/miniconda3/envs/milopy_env/lib/python3.9/site-packages/rpy2/robjects/functions.py in ?(self, *args, **kwargs)
    194             r_k = prm_translate.get(k, None)
    195             if r_k is not None:
    196                 v = kwargs.pop(k)
    197                 kwargs[r_k] = v
--> 198         return (super(SignatureTranslatedFunction, self)
    199                 .__call__(*args, **kwargs))

/lustre/scratch126/cellgen/team205/jc48/miniconda3/envs/milopy_env/lib/python3.9/site-packages/rpy2/robjects/functions.py in ?(self, *args, **kwargs)
    120             # TODO: shouldn't this be handled by the conversion itself ?
    121             if isinstance(v, rinterface.Sexp):
    122                 new_kwargs[k] = v
    123             else:
--> 124                 new_kwargs[k] = conversion.py2rpy(v)
    125         res = super(Function, self).__call__(*new_args, **new_kwargs)
    126         res = conversion.rpy2py(res)
    127         return res

/lustre/scratch126/cellgen/team205/jc48/miniconda3/envs/milopy_env/lib/python3.9/functools.py in ?(*args, **kw)
    884         if not args:
    885             raise TypeError(f'{funcname} requires at least '
    886                             '1 positional argument')
    887 
--> 888         return dispatch(args[0].__class__)(*args, **kw)

/lustre/scratch126/cellgen/team205/jc48/miniconda3/envs/milopy_env/lib/python3.9/site-packages/rpy2/robjects/pandas2ri.py in ?(obj)
     52 @py2rpy.register(PandasDataFrame)
     53 def py2rpy_pandasdataframe(obj):
     54     od = OrderedDict()
---> 55     for name, values in obj.iteritems():
     56         try:
     57             od[name] = conversion.py2rpy(values)
     58         except Exception as e:

/lustre/scratch126/cellgen/team205/jc48/miniconda3/envs/milopy_env/lib/python3.9/site-packages/pandas/core/generic.py in ?(self, name)
   5985             and name not in self._accessors
   5986             and self._info_axis._can_hold_identifiers_and_holds_name(name)
   5987         ):
   5988             return self[name]
-> 5989         return object.__getattribute__(self, name)

AttributeError: 'DataFrame' object has no attribute 'iteritems'

In case others face this issue, here is what worked for me:

# with the milopy-env environment active
# uninstall/reinstall older pandas
pip uninstall pandas
pip install pandas==1.3.5

Then I got a different error running milo.DA_nhoods(adata, design="~diagnosis"), saying the R package statmod was not found. Good old chatGPT suggested running this in the notebook, which worked 😁.

from rpy2.robjects.packages import importr
utils = importr('utils')
utils.install_packages('statmod')

Perhaps this could be avoided by specifying pandas 1.5.0 in the yaml file?
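
An alternative to downgrading, offered as an untested-with-rpy2 sketch: since the traceback shows rpy2's pandas2ri calling obj.iteritems() on the DataFrame, the old name can be aliased to DataFrame.items before calling DA_nhoods. This is a compatibility shim and not part of milopy or rpy2; it assumes no other removed pandas API is involved.

```python
import pandas as pd

# Compatibility shim (assumption-based workaround): pandas 2 removed
# DataFrame.iteritems, so point the old name at DataFrame.items.
if not hasattr(pd.DataFrame, "iteritems"):
    pd.DataFrame.iteritems = pd.DataFrame.items

# Toy design table to show the old call working again.
design_df = pd.DataFrame({"diagnosis": ["ctrl", "case"], "n": [10, 12]})
column_names = [name for name, _ in design_df.iteritems()]
print(column_names)
```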

Interpret the clustering result

Thank you for your great work.

In deriving cell states with milo, we have two questions about interpreting the clustering result found in adata.obsm['nhoods'].

1. Our understanding of the milo paper is that adata.obsm['nhoods'] may not contain a clustering result for every cell in the data, since the neighbourhoods are built around a random sample of the cells. Are we correct on this?
2. We understand that the condition information is mainly used in conducting the differential abundance tests. Regarding the clustering process in question 1, was the condition information also used in deriving the result found in adata.obsm['nhoods'] after running milopy.core.make_nhoods? If so, could you please explain how it was used?

Thank you.

Where can I find the information about to which neighborhood a cell belongs

Thank you for your amazing work.

I am trying to use your method to group all the cells of a single-cell dataset into neighborhoods or clusters. However, I could not find any information about to which neighborhood or cluster a cell belongs after running Milopy. It would be great if you would tell me how to find this information.

Thank you.

Fan
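
A possible starting point, assuming adata.obsm["nhoods"] is a cells-by-neighbourhoods indicator matrix (its 37089x6512 shape in another report on this page is consistent with that): the non-zero entries in a cell's row give the neighbourhoods it belongs to. Note that under this reading a cell can sit in several neighbourhoods or in none, so the result is not a partition like a clustering. The helper name and toy data are made up.

```python
import numpy as np
from scipy import sparse

def nhoods_of_cell(cell, obs_names, index_cells, nhoods):
    """Return the index cells of every neighbourhood that contains `cell`."""
    row = int(np.flatnonzero(np.asarray(obs_names) == cell)[0])
    cols = np.sort(nhoods[[row], :].nonzero()[1])
    return [index_cells[c] for c in cols]

# Toy stand-ins: six cells, two neighbourhoods indexed by c0 and c3.
obs_names = ["c0", "c1", "c2", "c3", "c4", "c5"]
index_cells = ["c0", "c3"]
nhoods = sparse.csr_matrix(np.array([
    [1, 0],
    [1, 0],
    [0, 1],
    [1, 1],
    [0, 1],
    [0, 0],
], dtype=np.float32))

print(nhoods_of_cell("c3", obs_names, index_cells, nhoods))
print(nhoods_of_cell("c5", obs_names, index_cells, nhoods))
```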

ValueError: Length mismatch

Hi,

I was running the code in the quick start from the tutorial section and this line milo.DA_nhoods(adata, design="~ condition")
gave me this error: ValueError: Length mismatch: Expected axis has 5 elements, new values have 119 elements. I am wondering whether it's something wrong on my side. The weird thing is that I wrote a script with milopy a while ago that used to work fine but did not run through when I tried to re-run it today, giving me a similar error. Please let me know if you need more information.

Thanks

Can't use model.contrasts with categorical variables

make_nhoods Error: Mean of empty slice.

Hi Emma,

Really looking forward to trying the Milo implementation in Python.

When I run:
milo.make_nhoods(adata_LP_ILE, prop=0.1)

I get the following error:

/home/jupyter/.local/lib/python3.7/site-packages/numpy/core/fromnumeric.py:3441: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
/home/jupyter/.local/lib/python3.7/site-packages/numpy/core/_methods.py:182: RuntimeWarning: invalid value encountered in true_divide
  ret, rcount, out=ret, casting='unsafe', subok=False)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_193/3513822349.py in <module>
      1 ## Assign cells to neighbourhoods
----> 2 milo.make_nhoods(adata_LP_ILE, prop=1) #default prop=0.1

~/.local/lib/python3.7/site-packages/milopy/core.py in make_nhoods(adata, neighbors_key, prop, seed)
     92         # Find closest real point (amongst nearest neighbors)
     93         dists = euclidean_distances(
---> 94             X_dimred[non_zero_cols[non_zero_rows == i], :], nh_pos.T)
     95         # Update vertex index
     96         refined_vertices[i] = nn_ixs[dists.argmin()]

/opt/conda/lib/python3.7/site-packages/sklearn/metrics/pairwise.py in euclidean_distances(X, Y, Y_norm_squared, squared, X_norm_squared)
    300            [1.41421356]])
    301     """
--> 302     X, Y = check_pairwise_arrays(X, Y)
    303 
    304     if X_norm_squared is not None:

/opt/conda/lib/python3.7/site-packages/sklearn/metrics/pairwise.py in check_pairwise_arrays(X, Y, precomputed, dtype, accept_sparse, force_all_finite, copy)
    160             copy=copy,
    161             force_all_finite=force_all_finite,
--> 162             estimator=estimator,
    163         )
    164         Y = check_array(

/opt/conda/lib/python3.7/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
    806                 "Found array with %d sample(s) (shape=%s) while a"
    807                 " minimum of %d is required%s."
--> 808                 % (n_samples, array.shape, ensure_min_samples, context)
    809             )
    810 

ValueError: Found array with 0 sample(s) (shape=(0, 30)) while a minimum of 1 is required by check_pairwise_arrays.

I would like Milo to use the previously calculated KNN graph and connectivities based on the scVI reduced dimension space, so I skipped recalculating them based on PCA. When I recalculated neighbors based on PCA, Milo ran without errors. To try to troubleshoot where the error was coming from when using the scVI-based neighbors, I ran through make_nhoods line-by-line. I noticed there were empty arrays at certain indices of non_zero_rows. This happened at random indices (the first two times being at indices 196 and 351). The function ran normally for other indices.

For example, at index 196, this line was producing an array of nan values and the 'Mean of empty slice' error:

nh_pos = np.median(
        X_dimred[non_zero_cols[non_zero_rows == 196], :], 0).reshape(-1, 1)

Thank you for your help!
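
To narrow this down, it may help to list which sampled cells have no non-zero entries in the graph matrix being used, since that would produce exactly such empty slices. This is only one possible explanation; the sketch below uses a toy graph and a hypothetical helper name, and does not reproduce milopy's sampling.

```python
import numpy as np
from scipy import sparse

def vertices_without_neighbors(graph, vertices):
    """Return the sampled vertices whose rows in the KNN graph hold no non-zero entry."""
    sub = sparse.csr_matrix(graph)[vertices, :]
    sub.eliminate_zeros()  # ignore explicitly stored zeros
    n_neighbors = sub.getnnz(axis=1)
    return [int(v) for v, n in zip(vertices, n_neighbors) if n == 0]

# Toy 4-cell graph in which cell 2 has no neighbours.
graph = sparse.csr_matrix(np.array([
    [0, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
], dtype=float))

print(vertices_without_neighbors(graph, [0, 2, 3]))
```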

Assigning neighbors_key in milopy.core.make_nhoods

Hi Emma,

Thank you for developing Milo!

I am currently using scvi-tools to correct for batch effects in my single cell data, and so, as you would probably know very well, my anndata does not have X_pca computed. I was trying to find implementation of using milopy on scvi corrected dataset, and I ran into your work again in the Pan_fetal_immune project. Also thank you very much for that as well. It's been a huge help.

I've run into one issue when setting neighbors_key to "scvi" in milopy.core.make_nhoods(). I believe there is an error on line 58 (and maybe on line 46 as well) of milopy/core.py: use_rep = adata.uns["neighbors"]["params"]["use_rep"]. I replaced "neighbors" with neighbors_key and it has worked fine since.

I hope this helps someone else as well.
Thank you again,
Seyoon.
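
The change described above can be sketched with a plain dictionary standing in for adata.uns. The helper name and the "X_pca" fallback for graphs that recorded no use_rep are assumptions for illustration, not milopy's code.

```python
def get_use_rep(uns, neighbors_key=None, default="X_pca"):
    """Look up the representation recorded for a given neighbours key."""
    key = "neighbors" if neighbors_key is None else neighbors_key
    params = uns[key]["params"]
    # Fall back to a default when no representation was recorded (assumption).
    return params.get("use_rep", default)

# Toy stand-in for adata.uns with a default graph and an scVI-based graph.
uns = {
    "neighbors": {"params": {"n_neighbors": 10}},
    "scvi": {"params": {"n_neighbors": 10, "use_rep": "X_scVI"}},
}

print(get_use_rep(uns), get_use_rep(uns, neighbors_key="scvi"))
```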

Error in glmFit.default(sely, design, offset = seloffset, dispersion = 0.05, : nrow(design) disagrees with ncol(y)

Hello,

I get an error when doing the following: I subset my adata to a certain cell type, copy it into a new adata, recalculate the KNN graph on that subset and run the milo workflow. I get the following error, but only for one cell type out of about 20:

Error in glmFit.default(sely, design, offset = seloffset, dispersion = 0.05,  : 
  nrow(design) disagrees with ncol(y)

Where should I look for the cause? What could differ in that cell type such that the DA analysis fails for it but not for the others?

Thank you!

annotate_nhood error

Hi @emdann ,

I've been trying to run through the tutorial (gastrulation) and I'm encountering the following problem. I'm using the pbmc3k dataset as an example, as it gives the same error message shown below. My software versions are listed below. Would you be able to help?

import scanpy as sc
import numpy as np

import milopy
import milopy.core as milo

adata = sc.datasets.pbmc3k_processed()
adata

## Simulate experimental condition ##
adata.obs["condition"] = np.random.choice(["ConditionA", "ConditionB"], size=adata.n_obs, p=[0.5,0.5])
# we simulate differential abundance in NK cells
DA_cells = adata.obs["louvain"] == "NK cells"
adata.obs.loc[DA_cells, "condition"] = np.random.choice(["ConditionA", "ConditionB"], size=sum(DA_cells), p=[0.2,0.8])

## Simulate replicates ##
adata.obs["replicate"] = np.random.choice(["R1", "R2", "R3"], size=adata.n_obs)
adata.obs["sample"] = adata.obs["replicate"] + adata.obs["condition"]

sc.pl.umap(adata, color=["louvain","condition", "sample"])

## Build KNN graph
sc.pp.neighbors(adata, n_neighbors=10)

## Assign cells to neighbourhoods
milo.make_nhoods(adata)

## Count cells from each sample in each nhood
milo.count_nhoods(adata, sample_col="sample")

## Test for differential abundance between conditions
milo.DA_nhoods(adata, design="~ condition")

## Check results
milo_results = adata.uns["nhood_adata"].obs

milopy.utils.build_nhood_graph(adata)
milopy.plot.plot_nhood_graph(adata, alpha=0.2, min_size=5)
milopy.utils.annotate_nhoods(adata, anno_col='louvain')

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[6], line 1
----> 1 milopy.utils.annotate_nhoods(adata, anno_col='louvain')

File ~/anaconda3/envs/MEDI7281/lib/python3.11/site-packages/milopy/utils.py:152, in annotate_nhoods(adata, anno_col)
    148 anno_count = adata.obsm["nhoods"].T.dot(
    149     scipy.sparse.csr_matrix(anno_dummies.values))
    150 anno_frac = np.array(anno_count/anno_count.sum(1))
--> 152 anno_frac = pd.DataFrame(anno_frac,
    153                          columns=anno_dummies.columns,
    154                          index=adata.uns["nhood_adata"].obs_names
    155                          )
    156 adata.uns["nhood_adata"].obsm["frac_annotation"] = anno_frac.values
    157 # Turn this to list so that writing out h5ad works

File ~/anaconda3/envs/MEDI7281/lib/python3.11/site-packages/pandas/core/frame.py:722, in DataFrame.__init__(self, data, index, columns, dtype, copy)
    712         mgr = dict_to_mgr(
    713             # error: Item "ndarray" of "Union[ndarray, Series, Index]" has no
    714             # attribute "name"
   (...)
    719             typ=manager,
    720         )
    721     else:
--> 722         mgr = ndarray_to_mgr(
    723             data,
    724             index,
    725             columns,
    726             dtype=dtype,
    727             copy=copy,
    728             typ=manager,
    729         )
    731 # For data is list-like, or Iterable (will consume into list)
    732 elif is_list_like(data):

File ~/anaconda3/envs/MEDI7281/lib/python3.11/site-packages/pandas/core/internals/construction.py:329, in ndarray_to_mgr(values, index, columns, dtype, copy, typ)
    324         values = values.reshape(-1, 1)
    326 else:
    327     # by definition an array here
    328     # the dtypes will be coerced to a single dtype
--> 329     values = _prep_ndarraylike(values, copy=copy_on_sanitize)
    331 if dtype is not None and not is_dtype_equal(values.dtype, dtype):
    332     # GH#40110 see similar check inside sanitize_array
    333     rcf = not (is_integer_dtype(dtype) and values.dtype.kind == "f")

File ~/anaconda3/envs/MEDI7281/lib/python3.11/site-packages/pandas/core/internals/construction.py:583, in _prep_ndarraylike(values, copy)
    581     values = values.reshape((values.shape[0], 1))
    582 elif values.ndim != 2:
--> 583     raise ValueError(f"Must pass 2-d input. shape={values.shape}")
    585 return values

ValueError: Must pass 2-d input. shape=()

I originally had pandas 2.0.3, but downgrading to 1.5 didn't help.

pip freeze
anndata==0.9.1
appnope @ file:///home/conda/feedstock_root/build_artifacts/appnope_1649077682618/work
asttokens @ file:///home/conda/feedstock_root/build_artifacts/asttokens_1670263926556/work
backcall @ file:///home/conda/feedstock_root/build_artifacts/backcall_1592338393461/work
backports.functools-lru-cache @ file:///home/conda/feedstock_root/build_artifacts/backports.functools_lru_cache_1687772187254/work
Bottleneck @ file:///Users/ec2-user/ci_py311/bottleneck_1678322312632/work
cffi==1.15.1
Choco==1.0.5
click==8.1.6
comm @ file:///home/conda/feedstock_root/build_artifacts/comm_1679481329611/work
contourpy==1.1.0
cycler==0.11.0
debugpy @ file:///Users/runner/miniforge3/conda-bld/debugpy_1680755705686/work
decorator @ file:///home/conda/feedstock_root/build_artifacts/decorator_1641555617451/work
executing @ file:///home/conda/feedstock_root/build_artifacts/executing_1667317341051/work
fonttools==4.40.0
h5py==3.9.0
igraph==0.10.5
importlib-metadata @ file:///home/conda/feedstock_root/build_artifacts/importlib-metadata_1688754491823/work
ipykernel @ file:///Users/runner/miniforge3/conda-bld/ipykernel_1688739399853/work
ipython @ file:///Users/runner/miniforge3/conda-bld/ipython_1685727999785/work
jedi @ file:///home/conda/feedstock_root/build_artifacts/jedi_1669134318875/work
Jinja2==3.1.2
joblib==1.3.1
jupyter_client @ file:///home/conda/feedstock_root/build_artifacts/jupyter_client_1687700988094/work
jupyter_core @ file:///Users/runner/miniforge3/conda-bld/jupyter_core_1686775757864/work
kiwisolver==1.4.4
leidenalg==0.10.0
llvmlite==0.40.1
loompy==3.0.7
MarkupSafe==2.1.3
matplotlib==3.7.2
matplotlib-inline @ file:///home/conda/feedstock_root/build_artifacts/matplotlib-inline_1660814786464/work
milopy @ file:///Users/Ali/milopy
mkl-fft==1.3.6
mkl-random @ file:///Users/ec2-user/mkl/mkl_random_1682994911338/work
mkl-service==2.4.0
natsort==8.4.0
nest-asyncio @ file:///home/conda/feedstock_root/build_artifacts/nest-asyncio_1664684991461/work
networkx==3.1
numba==0.57.1
numexpr @ file:///private/var/folders/sy/f16zz6x50xz3113nwtb9bvq00000gp/T/abs_1b50c1js9s/croot/numexpr_1683227065029/work
numpy==1.23.5
numpy-groupies==0.9.22
packaging @ file:///home/conda/feedstock_root/build_artifacts/packaging_1681337016113/work
pandas==1.5.3
parso @ file:///home/conda/feedstock_root/build_artifacts/parso_1638334955874/work
patsy==0.5.3
pexpect @ file:///home/conda/feedstock_root/build_artifacts/pexpect_1667297516076/work
pickleshare @ file:///home/conda/feedstock_root/build_artifacts/pickleshare_1602536217715/work
Pillow==10.0.0
platformdirs @ file:///home/conda/feedstock_root/build_artifacts/platformdirs_1688739404342/work
prompt-toolkit @ file:///home/conda/feedstock_root/build_artifacts/prompt-toolkit_1688565951714/work
psutil @ file:///Users/runner/miniforge3/conda-bld/psutil_1681775196112/work
ptyprocess @ file:///home/conda/feedstock_root/build_artifacts/ptyprocess_1609419310487/work/dist/ptyprocess-0.7.0-py2.py3-none-any.whl
pure-eval @ file:///home/conda/feedstock_root/build_artifacts/pure_eval_1642875951954/work
pycparser==2.21
Pygments @ file:///home/conda/feedstock_root/build_artifacts/pygments_1681904169130/work
pynndescent==0.5.10
pyparsing==3.0.9
python-dateutil @ file:///home/conda/feedstock_root/build_artifacts/python-dateutil_1626286286081/work
pytz==2023.3
pyzmq @ file:///Users/runner/miniforge3/conda-bld/pyzmq_1685519327106/work
rpy2==3.5.13
scanpy==1.9.3
scikit-learn==1.3.0
scipy==1.11.1
scvelo==0.2.5
seaborn==0.12.2
session-info==1.0.0
six @ file:///home/conda/feedstock_root/build_artifacts/six_1620240208055/work
stack-data @ file:///home/conda/feedstock_root/build_artifacts/stack_data_1669632077133/work
statsmodels==0.14.0
stdlib-list==0.9.0
texttable==1.6.7
threadpoolctl==3.1.0
tornado @ file:///Users/runner/miniforge3/conda-bld/tornado_1684150418528/work
tqdm==4.65.0
traitlets @ file:///home/conda/feedstock_root/build_artifacts/traitlets_1675110562325/work
typing_extensions @ file:///home/conda/feedstock_root/build_artifacts/typing_extensions_1688315532570/work
tzdata==2023.3
tzlocal==5.0.1
umap-learn==0.5.3
wcwidth @ file:///home/conda/feedstock_root/build_artifacts/wcwidth_1673864653149/work
zipp @ file:///home/conda/feedstock_root/build_artifacts/zipp_1688902909232/work
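
A possible reading of the traceback: shape=() is what np.array returns when it is handed a SciPy sparse matrix (a 0-d object array), so the division on line 150 of utils.py may have stayed sparse with this SciPy version. That is an assumption, not something confirmed in this thread. The sketch below reproduces the same computation on toy data but densifies the counts before dividing, so the result is an ordinary 2-d array.

```python
import numpy as np
import pandas as pd
from scipy import sparse

# Toy stand-ins: 4 cells x 2 neighbourhoods, and one-hot annotations for 3 labels.
nhoods = sparse.csr_matrix(np.array([[1, 0], [1, 1], [0, 1], [1, 1]], dtype=float))
anno_dummies = pd.get_dummies(pd.Series(["B", "NK", "T", "NK"]))

# Count annotations per neighbourhood, then densify before dividing so the
# result is a plain 2-d ndarray rather than a sparse or matrix object.
anno_count = nhoods.T.dot(sparse.csr_matrix(anno_dummies.to_numpy(dtype=float)))
anno_count = anno_count.toarray()
anno_frac = anno_count / anno_count.sum(axis=1, keepdims=True)

anno_frac = pd.DataFrame(anno_frac, columns=anno_dummies.columns)
print(anno_frac)
```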

milo.make_nhoods(combined_emb) fails if 'X' was used for nearest neighbors

milopy assumes neighbors were calculated on a reduced dimension. Especially for protein assays, this is not always the case:

milopy/milopy/core.py

Line 65 in 49db7b8

X_dimred = adata.obsm[use_rep]
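
One way this could be handled is to fall back to the expression matrix itself when the graph was built on 'X'. The sketch uses plain arrays and a dictionary as stand-ins for adata.X and adata.obsm; the helper name is hypothetical, and a sparse X would need to be densified or handled separately.

```python
import numpy as np

def get_neighbor_representation(X, obsm, use_rep):
    """Return the matrix the KNN graph was built on: X itself or an obsm entry."""
    if use_rep == "X":
        return np.asarray(X)
    return np.asarray(obsm[use_rep])

# Toy stand-ins for adata.X and adata.obsm.
X = np.arange(8.0).reshape(4, 2)
obsm = {"X_pca": np.zeros((4, 3))}

print(get_neighbor_representation(X, obsm, "X").shape,
      get_neighbor_representation(X, obsm, "X_pca").shape)
```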

Unable to save adata.uns["nhood_adata"]

If I run the complete milo pipeline and want to save all results in the end using adata.write(path) I get the following error:

NotImplementedError: Failed to write value for uns/nhood_adata, since a writer for type <class 'anndata._core.anndata.AnnData'> has not been implemented yet.
Above error raised while writing key 'uns/nhood_adata' of <class 'h5py._hl.files.File'> from /.

Even if I save adata.uns["nhood_adata"] separately (adata_save = adata.uns["nhood_adata"], then adata_save.write(path)), I get an error and only the obs and var parts are saved; the rest is ignored:

ValueError: Unable to create dataset (name already exists)
Above error raised while writing key '__categories/sample' of <class 'h5py._hl.group.Group'> from /.
Above error raised while writing key 'sample' of <class 'h5py._hl.group.Group'> from /.
Above error raised while writing key 'var' of <class 'h5py._hl.files.File'> from /.

AttributeError: recarray has no attribute columns

Hello,

thank you for creating milopy! I have been using it quite often lately, but now I discovered an issue:

When I start a Jupyterlab kernel, load milo as in your documentation and run it on an AnnData Object, it works flawlessly. However, if I then run it again directly after in the same session (even after reloading the AnnData Object), I get the error AttributeError: recarray has no attribute columns.

This is the code I ran:

milo.make_nhoods(adata)
milo.count_nhoods(adata, sample_col="SAMPLE")
milo.DA_nhoods(adata, design="~ GROUP")

I traced this down to the core module in line 240 (DA_nhoods), where the res object is not properly converted from R. I can quickfix this by adding

# Convert to a DataFrame if rpy2 returned a NumPy record array (assumes np and pd are imported)
if isinstance(res, np.recarray):
    res = pd.DataFrame(res)

But probably this issue runs deeper and concerns the rpy2 conversion process?

Is this something you have encountered as well? Or do you need further information from my side to reproduce it?

Retrieving adata.obs_names belonging to each adata.uns["nhood_adata"].obs_names

Hi, is there a way to know which original cells are making each neighborhood?

Thanks!
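
A possible way to get this for all neighbourhoods at once is to turn adata.obsm["nhoods"] into a long table with one row per (neighbourhood, cell) pair. The sketch assumes rows are cells in adata.obs_names order and columns follow the order of adata.uns["nhood_adata"].obs["index_cell"]; the helper name and toy data are made up.

```python
import numpy as np
import pandas as pd
from scipy import sparse

def nhood_membership_table(obs_names, index_cells, nhoods):
    """Return one row per (neighbourhood, member cell) pair."""
    coo = sparse.coo_matrix(nhoods)
    keep = coo.data != 0
    return pd.DataFrame({
        "nhood_index_cell": np.asarray(index_cells)[coo.col[keep]],
        "cell": np.asarray(obs_names)[coo.row[keep]],
    })

# Toy stand-ins: five cells, two neighbourhoods indexed by c0 and c3.
obs_names = ["c0", "c1", "c2", "c3", "c4"]
index_cells = ["c0", "c3"]
nhoods = sparse.csr_matrix(np.array([
    [1, 0],
    [1, 0],
    [0, 1],
    [1, 1],
    [0, 1],
], dtype=np.float32))

table = nhood_membership_table(obs_names, index_cells, nhoods)
members = table.groupby("nhood_index_cell")["cell"].apply(list).to_dict()
print(members)
```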