Comments (10)
The output does not match the standard output format of any commercial technology, unless you use Xenium Ranger to convert it and then use readXenium. The output includes the gene count matrix and cell metadata as csv files when the options are enabled. Use data.table to read it if it's large. Then reformat the gene count matrix data frame into a dgCMatrix with cell IDs as column names and gene IDs or symbols as row names. The cell segmentation output is GeoJSON, which can be read with sf::st_read(), which will give you an sf data frame. Then call the SpatialFeatureExperiment() constructor. See the SFE vignette for an example of calling the constructor: https://pachterlab.github.io/SpatialFeatureExperiment/articles/SFE.html#object-construction
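As a rough illustration of the reformatting step, here is a minimal sketch that turns a cells-by-genes data frame into a genes-by-cells dgCMatrix. The helper name counts_df_to_dgCMatrix, the id_col argument, and the toy values are assumptions made for this example; they are not part of SFE or of the tool's output.

```r
library(Matrix)

# Hypothetical helper (not part of SFE): convert a cells-by-genes data frame
# into a genes-by-cells dgCMatrix. `id_col` names the column with cell IDs.
counts_df_to_dgCMatrix <- function(df, id_col) {
  df <- as.data.frame(df)  # also accepts a data.table
  cell_ids <- as.character(df[[id_col]])
  # Keep only the gene columns so the ID column does not end up in the counts
  m <- as.matrix(df[, setdiff(names(df), id_col), drop = FALSE])
  storage.mode(m) <- "double"
  rownames(m) <- cell_ids
  # Transpose so that genes are rows and cells are columns
  as(t(m), "CsparseMatrix")
}

# Toy input with made-up counts; "V1" stands in for the cell ID column
toy <- data.frame(V1 = c("0", "1", "2"),
                  Uba52 = c(1, 0, 3),
                  Carmn = c(0, 2, 0))
mat <- counts_df_to_dgCMatrix(toy, "V1")
dim(mat)
```

The ID column is removed before as.matrix() on purpose: if it stays in, it becomes an extra "gene" row after transposing and the real cell IDs never reach the column names.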
from voyager.
Thanks, I gave it a try but haven't been able to create the SFE object.
Here is what I did:
library(SpatialFeatureExperiment)
cellmeta <- data.table::fread('data/cell-metadata.csv.gz')
expcounts <- data.table::fread('data/expected-counts.csv.gz')
segmentation <- geojsonsf::geojson_sf('data/cell-polygons.geojson')
###
mat <- as.matrix(expcounts)
mat <- as(t(mat), "dgCMatrix")
sfe <- SpatialFeatureExperiment(list(counts = mat), colData = cellmeta,
colGeometries = list(foo = segmentation))
This is the error I get:
> sfe <- SpatialFeatureExperiment(list(counts = mat), colData = cellmeta,
+ colGeometries = list(foo = segmentation))
Error in wk_handle.wk_wkb(wkb, s2_geography_writer(oriented = oriented, :
Loop 0 is not valid: Edge 34 crosses edge 36
In addition: Warning messages:
1: In st_is_longlat(x) :
bounding box has potentially an invalid value range for longlat data
2: In st_is_longlat(x) :
bounding box has potentially an invalid value range for longlat data
This is the error I get if I try without the geojson file:
> sfe <- SpatialFeatureExperiment(list(counts = mat), colData = cellmeta)
Error in dimnames(x) <- dn :
length of 'dimnames' [2] not equal to array extent
I checked for dimnames and dim and it looks fine:
> str(dimnames(mat))
List of 2
$ : chr [1:1193] "Uba52" "Carmn" "Eif5a" "Ccnd1" ...
$ : chr [1:2061] "1" "2" "3" "4" ...
> dim(mat)
[1] 1193 2061
I have tried a couple of things but can't get it to work, would it be possible to send you the files so that you can take a look?
Regarding this error:
Error in wk_handle.wk_wkb(wkb, s2_geography_writer(oriented = oriented, :
Loop 0 is not valid: Edge 34 crosses edge 36
In addition: Warning messages:
1: In st_is_longlat(x) :
bounding box has potentially an invalid value range for longlat data
2: In st_is_longlat(x) :
bounding box has potentially an invalid value range for longlat data
Did sf::st_read("data/cell-polygons.geojson") not work? An annoying thing about using geospatial packages is that they often assume a coordinate reference system (CRS), which is irrelevant to histological space. Basically, CRSs are different ways of projecting the 3D globe onto 2D maps. It seems that in your case, geojsonsf gave your data a CRS (must be 4326), which makes sf think that your values are longitudes and latitudes. You can do st_crs(segmentation) <- NA to remove the CRS. st_read doesn't give you a CRS by default.
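A small sketch of the CRS fix on a toy polygon, in case it helps. The square and its coordinates are made up, and the CRS is set to 4326 by hand only to mimic what geojsonsf appears to have done here.

```r
library(sf)

# Toy cell polygon in pixel-like units (made-up coordinates), deliberately
# tagged with EPSG:4326 to mimic the situation described in the thread
square <- st_polygon(list(rbind(c(349, 287), c(359, 287), c(359, 297),
                                c(349, 297), c(349, 287))))
segmentation <- st_sf(cell = 0L, geometry = st_sfc(square, crs = 4326))

# With a geographic CRS attached, sf treats the values as longitude/latitude.
# The check is wrapped in try() because the coordinates are far outside the
# valid range and may trigger the warning quoted above.
try(st_is_longlat(segmentation))

# Remove the CRS so that sf falls back to planar (Cartesian) geometry
st_crs(segmentation) <- NA
st_crs(segmentation)
```

After this, geometry operations no longer go through the spherical s2 code path, which is where the "Loop 0 is not valid" message came from.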
Regarding the error:
Error in dimnames(x) <- dn :
length of 'dimnames' [2] not equal to array extent
You haven't supplied the spatialCoords or spatialCoordsNames argument. The default for spatialCoordsNames is c("x", "y"). If spatialCoords is not specified, then spatialCoordsNames should be found among the columns of colData. I suppose I should update that example in the vignette for a more general case.
What's the output of traceback() after you get this error? I don't know where it comes from. It might be that the length of the column names or row names in dn doesn't match the number of columns or rows in the array. That can happen when the number of rows in segmentation or the number of rows in cellmeta doesn't match the number of columns in mat. From your output, it seems that you didn't assign the cell IDs to the column names of mat. What follows assumes that the columns in mat already match the rows in cellmeta, and that both cellmeta and segmentation have a column cell_id for the cell ID. If the rows of segmentation are not in the same order as cellmeta, you can reorder them with segmentation <- segmentation[match(cellmeta$cell_id, segmentation$cell_id),].
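To make the matching concrete, here is a minimal sketch on toy data. The values are made up, cell_id is only an example column name, and a plain data frame stands in for the sf data frame, since row subsetting works the same way for both.

```r
# Toy stand-ins for the objects in the thread (all values are made up)
cellmeta <- data.frame(cell_id = c("0", "1", "2"),
                       centroid_x = c(353.2, 213.0, 77.9),
                       centroid_y = c(293.1, 403.7, 142.7))
# A plain data frame stands in for the sf data frame of polygons
segmentation <- data.frame(cell_id = c("2", "0", "1"))
mat <- matrix(0, nrow = 2, ncol = 3,
              dimnames = list(c("Uba52", "Carmn"), NULL))

# Every cell should be present in both tables before reordering
stopifnot(setequal(cellmeta$cell_id, segmentation$cell_id))

# Reorder segmentation so that its rows follow the order of cellmeta
idx <- match(cellmeta$cell_id, segmentation$cell_id)
segmentation <- segmentation[idx, , drop = FALSE]

# The columns of mat are assumed to already follow cellmeta; label them
stopifnot(ncol(mat) == nrow(cellmeta))
colnames(mat) <- cellmeta$cell_id
```

The first argument to match() is the order you want to end up with, and the second is the vector being reordered. Swapping them gives positions in the wrong table.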
Another possible cause is that segmentation has geometry type POLYGON but some cells have multiple polygons, causing segmentation to have more rows than cellmeta. If that's the case, then use segmentation2 <- aggregate(segmentation, list(cell_id = segmentation$cell_id), unique) to convert to MULTIPOLYGON where each cell has one geometry. I think I'll add a function to the SFE package to deal with scenarios like this.
See the Geocomputation with R book to learn more about operations on sf data frames and geometries.
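A toy sketch of the aggregation, under the assumption that the polygons sit in an sf data frame with a cell_id column. The fov column and the coordinates are made up. Only the fov column is passed to aggregate() so that the grouping column is not duplicated in the result.

```r
library(sf)

# Helper to build an axis-aligned square polygon (made-up toy geometry)
make_square <- function(x0, y0, size = 5) {
  st_polygon(list(rbind(c(x0, y0), c(x0 + size, y0),
                        c(x0 + size, y0 + size), c(x0, y0 + size),
                        c(x0, y0))))
}

# Cell 0 is split into two separate polygons; cell 1 has a single polygon
segmentation <- st_sf(
  cell_id = c(0, 0, 1),
  fov = c(99, 99, 99),
  geometry = st_sfc(make_square(0, 0), make_square(10, 10),
                    make_square(30, 30))
)

# One row per cell: polygons that share a cell_id are merged into one geometry
segmentation2 <- aggregate(segmentation["fov"],
                           by = list(cell_id = segmentation$cell_id),
                           FUN = unique)
nrow(segmentation2)
```

If a downstream step needs a single geometry type, sf::st_cast(segmentation2, "MULTIPOLYGON") should make the column uniform, since cells with only one polygon may otherwise stay as POLYGON.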
Thanks, I'll try this. I guess I just need to understand the object construction better.
Are there any columns that the constructor looks for under a hardcoded name, like cell_id?
The column name "cell_id" is just an example. Any name can be used, but there should be a column for cell IDs to match the cell metadata to the gene count matrix and to the cell segmentation polygons, unless you are already very sure that they are all in the same order in the cell metadata, gene count matrix, and polygons.
It does seem that there is a slight mismatch between the cell ID in the raw cellmeta and the names in the count matrix and segmentation.
The rownames of expcounts (genes in columns and cells in rows in the raw file) are cell names starting at "1":
> head(rownames(expcounts))
[1] "1" "2" "3" "4" "5" "6"
Whereas in cellmeta the 'cell' value starts at '0':
> head(cellmeta)
cell centroid_x centroid_y centroid_z fov cluster volume population
<int> <num> <num> <num> <int> <int> <num> <int>
1: 0 353.22656 293.0781 4.765625 99 1 320.0 27
2: 1 213.03372 403.7463 4.519795 99 1 852.5 186
3: 2 77.85435 142.6783 6.038043 99 0 1150.0 508
4: 3 319.43000 233.6000 4.500000 99 1 500.0 80
5: 4 114.46059 110.4803 6.422414 99 0 507.5 197
6: 5 165.97500 464.0200 5.937500 99 1 500.0 122
And the values in the segmentation cell column also start at '0', so I'll try to match them all.
Also, sf::st_read didn't work, I think because I was trying to read the .gz file directly(?). That's why I moved to geojson_sf, and you are correct that it added a CRS. Also, the geojson file I used is made of multipolygons:
> head(segmentation)
Simple feature collection with 6 features and 1 field
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 68 ymin: 104 xmax: 359 ymax: 470
Geodetic CRS: WGS 84
cell geometry
1 0 MULTIPOLYGON (((349 287, 34...
2 1 MULTIPOLYGON (((204 402, 20...
3 2 MULTIPOLYGON (((70 138, 70 ...
4 3 MULTIPOLYGON (((314 233, 31...
5 4 MULTIPOLYGON (((110 111, 11...
6 5 MULTIPOLYGON (((161 463, 16...
Thanks for the feedback, I'll try again tomorrow with your suggestions.
Also, sf::st_read didn't work, I think because I was trying to read the .gz file directly(?)
I think it's probably because of the gz. Do cellmeta and segmentation have the same number of rows, which should be the same as the number of columns in the gene count matrix? Also, when the rows are cells, the gene count matrix csv file might have an unnamed column of cell IDs, which are meant to be the row names of the matrix. When it's unnamed, data.table reads it into R as the first column with the name V1, and whatever row names you get are just the default, which is 1:nrow(df).
@cstrlln did it work for you to make the SFE object?
@alikhuseynov I got sidetracked trying another tool, so I haven't tested the suggestions yet, but I plan to try in the next couple of weeks. I really think it might be the cell names though.
OK, if you send us the output files or deposit them somewhere we can download them, I will try to help make that SFE object.