Published

June 13, 2025

Technology for analysis

R and Python have become the most popular languages for analysis of (spatial) omics data over the last few years. Traditionally, R has been the main language used in statistics and therefore the number of statistics libraries is still larger compared to Python. Python, however, has become the language of choice for tasks such as image analysis and machine learning where R lacks behind (Moses and Pachter 2022).

Our goal is to use the best of both worlds. In this spirit, we try to show analyses in both R and Python where possible. Especially in point pattern analysis however, the existing libraries in Python lack behind such that we focus on the analysis in R in these chapters.

To facilitate the transition between R and Python, we will show how to convert objects between the languages such that analyses can be transitioned from one language to another.

In R we will use SpatialExperiment and SpatialFeatureExperiment objects Moses et al. (2023). The latter is an extension of the first and additionally contains geometric annotations that are encoded as simple features of the sf library (Pebesma and Bivand 2023).

In Python we will use the anndata object (Virshup et al. 2024).

Conversion from R to Python

Show the code

suppressPackageStartupMessages({
  library(SpatialFeatureExperiment)
  library(SFEData)
  library(scater)
})

Here we will load a dataset by (McKellar et al. 2021) as a SpatialFeatureExperiment from the package SFEData.

Show the code

# Load the dataset
sfe <- SFEData::McKellarMuscleData(dataset = "full")
# Take spots that are covered with tissue
sfe_tissue <- sfe[, colData(sfe)$in_tissue]
# Filter out genes with no counts
sfe_tissue <- sfe_tissue[rowSums(counts(sfe_tissue)) > 0, ]
# Convert counts log-transformed normalized expression values
sfe_tissue <- scater::logNormCounts(sfe_tissue)

For the conversion from a SingleCellExperiment to a AnnData object we will use the zellkonverter package. More details can be found in their vignettes. Note that the SpatialExperiment and SpatialFeatureExperiment objects inherit from the SingleCellExperiment class, therefore we can use the function SCE2AnnData.

Show the code

# Convert to AnnData object
ann <-  zellkonverter::SCE2AnnData(sfe_tissue)
anndata::write_h5ad(ann, "../data/McKellarMuscleData.h5ad")

The spatial coordinates however are not directly converted and stored as x and y in the obs data frame.

Show the code

import numpy as np
import scanpy as sc
import squidpy as sq

# Read AnnData object
adata = sc.read_h5ad("../data/McKellarMuscleData.h5ad")
# Create the spatial coordinates in AnnData
adata.obsm['spatial'] = np.array(list(zip((adata.obs['x'] * -1), adata.obs['y'])))

We preform more preprocessing such that we can use all function of the package esda a subpackage of pysal.

Show the code

adata.var_names = adata.var["symbol"]
adata.raw = adata.copy()
# Necessary for Local Moran function to work
adata.X = adata.layers["logcounts"].astype(np.float64) 
# Save for later use
adata.write("../data/McKellarMuscleData_processed.h5ad")

Conversion from Python to R

Here we will use 10x Visium data set from 10x Genomics (2021) downloaded from the squidpy visium datasets (Palla et al. 2022).

Show the code

import numpy as np
import scanpy as sc

#load the dataset and store as anndata object
adata = sq.datasets.visium("Visium_FFPE_Mouse_Kidney")


  0%|          | 0.00/8.72M [00:00<?, ?B/s]
  5%|5         | 456k/8.72M [00:00<00:01, 4.64MB/s]
 17%|#7        | 1.49M/8.72M [00:00<00:00, 8.33MB/s]
 48%|####8     | 4.20M/8.72M [00:00<00:00, 17.4MB/s]
100%|##########| 8.72M/8.72M [00:00<00:00, 24.2MB/s]

  0%|          | 0.00/23.6M [00:00<?, ?B/s]
  7%|7         | 1.76M/23.6M [00:00<00:01, 18.4MB/s]
 53%|#####3    | 12.6M/23.6M [00:00<00:00, 74.6MB/s]
100%|##########| 23.6M/23.6M [00:00<00:00, 88.2MB/s]

Show the code

# normalise counts already here that both R and python have same values
adata.raw = adata.copy()
sc.pp.normalize_total(adata)
sc.pp.log1p(adata)
# save file as h5ad
adata.write("../data/adata_neighbors.h5ad")

For the conversion from a AnnData to a SingleCellExperiment object we will again use the zellkonverter package.

Show the code

library(SpatialFeatureExperiment)
library(SpatialExperiment)

sce <- zellkonverter::readH5AD("../data/adata_neighbors.h5ad")
colnames(reducedDim(sce, "spatial")) <- c("x", "y")

Now we have to convert to a SpatialExperiment / SpatialFeatureExperiment object.

Show the code

spe <- toSpatialExperiment(sce)
# Reduced dimension to spatial coords
spatialCoords(spe) <- reducedDim(sce, "spatial")
# SpatialExperiment to SpatialFeatureExperiment
sfe <- SpatialFeatureExperiment::toSpatialFeatureExperiment(spe)
# the counts were already normalised in Python
assayNames(sfe) <- "logcounts"
saveRDS(sfe, "../data/sfe_Visium_Mouse_Kidney.rds")

Conversion from Seurat to Bioconductor objects

Another popular analysis framework for single cell and spatial omics in R is Seurat. To convert from Seurat object so SpatialExperiment you can use Seurat::as.SingleCellExperiment(x). For more details please have a look at this vignette. From there on you can convert to a SpatialExperiment object using toSpatialExperiment().

Alternatively, to convert from Seurat to SpatialFeatureExperiment one can use the toSpatialFeatureExperiment function as specified in the SpatialFeatureExperiment vignette.

Datasets

Dependencies

Show the code

library(ggplot2)
library(RColorBrewer)
library(ExperimentHub)
library(SpatialExperiment)
library(STexampleData)

Molecule-based dataset

Here we will process the MERFISH data from Moffitt et al. (2018) to be used as a SpatialExperiment (Righelli et al. 2022) in our vigenttes.

Setup

The dataset is loaded from ExperimentHub

Show the code

eh <- ExperimentHub()
q <- query(eh, "MERFISH")
df <- eh[["EH7546"]]

Wrangling

Show the code

# extract cell metadata
i <- seq_len(9)
cd <- data.frame(df[, i], row.names = 1)

# set sample identifiers
id <- grep("Bregma", names(cd))
names(cd)[id] <- "sample_id"

# rename spatial coordinates
xy <- grep("Centroid", names(cd))
xy <- names(cd)[xy] <- c("x", "y")

# simplify annotations
cd$cluster_id <- cd$Cell_class
for (. in c("Endothelial", "OD Mature", "OD Immature"))
  cd$cluster_id[grep(., cd$cluster_id)] <- .

# extract & sparsify assay data
y <- data.frame(df[, -i], row.names = df[, 1])
y <- as(t(as.matrix(y)), "dgCMatrix")

# construct SPE
(spe <- SpatialExperiment(
  assays = list(exprs  = y),
  spatialCoordsNames = xy,
  colData = cd))

class: SpatialExperiment 
dim: 161 73655 
metadata(0):
assays(1): exprs
rownames(161): Ace2 Adora2a ... Ucn3 Vgf
rowData names(0):
colnames(73655): 6749ccb4-2ed1-4029-968f-820a287f43c8
  6cac74bd-4ea7-4701-8701-42563cc65eb8 ...
  6b666f81-7b73-4100-9e02-b5381b39f0f3
  fdcddd97-7701-462a-b48f-979111245bd5
colData names(7): Animal_ID Animal_sex ... Neuron_cluster_ID cluster_id
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):
spatialCoords names(2) : x y
imgData names(0):

Show the code

gg <- data.frame(spatialCoords(spe), colData(spe))
pal <- brewer.pal(length(unique(gg$cluster_id)), "Paired")
ggplot(gg, aes(x, y, col = cluster_id)) +
  facet_wrap(~ sample_id, scales = "free") +
  geom_point(size = 0.1) + scale_color_manual(values = pal) +
  guides(col = guide_legend(override.aes = list(size = 2))) +
  theme_void() + theme(legend.key.size = unit(0.5, "lines"))

Save data

Show the code

# Define the directory and file paths
dir_path <- "../data"

# Check if the directory exists, and create it if it doesn't
if (!dir.exists(dir_path)) {
  dir.create(dir_path)
}

saveRDS(spe, "../data/spe.rds")

Packages and helpful functions for later use

Show the code

suppressPackageStartupMessages({
  library(SpatialExperiment)
  library(spatstat)
  library(dplyr)
  library(ggplot2)
  library(rlang)
  library(stats)
  library(STexampleData)
  library(patchwork)
  library(reshape2)
  library(sf)
  library(rgeoda)
  library(Voyager)
  library(SpatialFeatureExperiment)
  library(SFEData)
  library(scran)
  library(scater)
  library(tmap)
  library(spdep)
  library(stringr)
  library(magrittr)
  library(bluster)
  library(dixon)
  library(ExperimentHub)
  library(STexampleData)
})

Show the code

.ppp <- \(spe, marks = NULL) {
  xy <- spatialCoords(spe)
  w <- owin(range(xy[, 1]), range(xy[, 2]))
  m <- if (is.character(marks)) {
    stopifnot(
      length(marks) == 1,
      marks %in% c(
        gs <- rownames(spe),
        cd <- names(colData(spe))))
    if (marks %in% gs) {
      assay(spe)[marks, ]
    } else if (marks %in% cd) {
      spe[[marks]]
    } else stop("'marks' should be in ",
      "'rownames(.)' or 'names(colData(.))'")
  }
  ppp(xy[, 1], xy[, 2], window = w, marks = factor(m))
}

Show the code

pValuesHotspotMarks <- function(pp, alpha = 0.05){
  # Code source: https://idblr.rbind.io/post/pvalues-spatial-segregation/ 
  # License: CC-BY-SA
  # Significant p-values assumming normality of the Poisson process
  ## relrisk() computes standard errors based on asymptotic theory, assuming a Poisson process
  # call relative risk function
  f1 <- relrisk(pp_sel,se=TRUE)
  
  z <- qnorm(alpha/2, lower.tail = F)     # z-statistic
  f1$u <- f1$estimate + z*f1$SE           # Upper CIs
  f1$l <- f1$estimate - z*f1$SE           # Lower CIs
  mu_0 <- as.vector(table(spatstat.geom::marks(pp))/pp$n) # null expectations by type
  f1$p <- f1$estimate # copy structure of pixels, replace values
  for (i in 1:length(f1$p)) {
    f1$p[[i]]$v <- factor(ifelse(mu_0[i] > f1$u[[i]]$v, "lower",
                                ifelse( mu_0[i] < f1$l[[i]]$v, "higher", "none")),
                          levels = c("lower", "none", "higher"))
  }
  return(f1)
}

pValuesHotspot <- function(pp, alpha = 0.05){
  # Code source: https://idblr.rbind.io/post/pvalues-spatial-segregation/
  # License: CC-BY-SA
  # density estimate for all marks
  f1 <- density(unmark(pp), se = TRUE)
  # Significant p-values assumming normality of the Poisson process
  z <- qnorm(alpha/2, lower.tail = F)     # z-statistic
  f1$u <- f1$estimate + z*f1$SE           # Upper CIs
  f1$l <- f1$estimate - z*f1$SE           # Lower CIs
  mu_0 <- intensity(unmark(pp)) # null expectations by type
  f1$p <- f1$estimate # copy structure of pixels, replace values
  f1$p$v <- factor(ifelse(mu_0 > f1$u$v, "lower",
                          ifelse( mu_0 < f1$l$v, "higher", "none")),
                   levels = c("lower", "none", "higher"))
  
  return(f1)
}

Appendix

Session info

Show the code

sessionInfo()

R version 4.5.0 (2025-04-11)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.5 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

time zone: UTC
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices datasets  utils     methods  
[8] base     

other attached packages:
 [1] dixon_0.0-10                    splancs_2.01-45                
 [3] sp_2.2-0                        bluster_1.18.0                 
 [5] magrittr_2.0.3                  stringr_1.5.1                  
 [7] spdep_1.3-11                    spData_2.3.4                   
 [9] tmap_4.0                        scran_1.36.0                   
[11] Voyager_1.10.0                  rgeoda_0.1.0                   
[13] digest_0.6.37                   sf_1.0-20                      
[15] reshape2_1.4.4                  patchwork_1.3.0                
[17] rlang_1.1.6                     dplyr_1.1.4                    
[19] spatstat_3.3-2                  spatstat.linnet_3.2-5          
[21] spatstat.model_3.3-5            rpart_4.1.24                   
[23] spatstat.explore_3.4-2          nlme_3.1-168                   
[25] spatstat.random_3.3-3           spatstat.geom_3.3-6            
[27] spatstat.univar_3.1-2           spatstat.data_3.1-6            
[29] STexampleData_1.16.0            ExperimentHub_2.16.0           
[31] AnnotationHub_3.16.0            BiocFileCache_2.16.0           
[33] dbplyr_2.5.0                    RColorBrewer_1.1-3             
[35] SpatialExperiment_1.18.0        scater_1.36.0                  
[37] ggplot2_3.5.2                   scuttle_1.18.0                 
[39] SingleCellExperiment_1.30.0     SummarizedExperiment_1.38.0    
[41] Biobase_2.68.0                  GenomicRanges_1.60.0           
[43] GenomeInfoDb_1.44.0             IRanges_2.42.0                 
[45] S4Vectors_0.46.0                BiocGenerics_0.54.0            
[47] generics_0.1.3                  MatrixGenerics_1.20.0          
[49] matrixStats_1.5.0               SFEData_1.10.0                 
[51] SpatialFeatureExperiment_1.10.0

loaded via a namespace (and not attached):
  [1] spatialreg_1.3-6          spatstat.sparse_3.1-0    
  [3] bitops_1.0-9              EBImage_4.50.0           
  [5] httr_1.4.7                tools_4.5.0              
  [7] R6_2.6.1                  HDF5Array_1.36.0         
  [9] mgcv_1.9-3                anndata_0.7.5.6          
 [11] rhdf5filters_1.20.0       withr_3.0.2              
 [13] gridExtra_2.3             leaflet_2.2.2            
 [15] leafem_0.2.3              cli_3.6.5                
 [17] sandwich_3.1-1            labeling_0.4.3           
 [19] mvtnorm_1.3-3             proxy_0.4-27             
 [21] R.utils_2.13.0            spacesXYZ_1.5-1          
 [23] dichromat_2.0-0.1         scico_1.5.0              
 [25] limma_3.64.0              RSQLite_2.3.11           
 [27] crosstalk_1.2.1           Matrix_1.7-3             
 [29] ggbeeswarm_0.7.2          logger_0.4.0             
 [31] abind_1.4-8               R.methodsS3_1.8.2        
 [33] terra_1.8-42              lifecycle_1.0.4          
 [35] multcomp_1.4-28           yaml_2.3.10              
 [37] edgeR_4.6.1               tmaptools_3.2            
 [39] rhdf5_2.52.0              SparseArray_1.8.0        
 [41] grid_4.5.0                blob_1.2.4               
 [43] dqrng_0.4.1               crayon_1.5.3             
 [45] dir.expiry_1.16.0         lattice_0.22-7           
 [47] beachmat_2.24.0           KEGGREST_1.48.0          
 [49] magick_2.8.6              metapod_1.16.0           
 [51] zeallot_0.1.0             pillar_1.10.2            
 [53] knitr_1.50                rjson_0.2.23             
 [55] boot_1.3-31               codetools_0.2-20         
 [57] wk_0.9.4                  glue_1.8.0               
 [59] data.table_1.17.0         memuse_4.2-3             
 [61] vctrs_0.6.5               png_0.1-8                
 [63] gtable_0.3.6              assertthat_0.2.1         
 [65] cachem_1.1.0              xfun_0.52                
 [67] S4Arrays_1.8.0            mime_0.13                
 [69] DropletUtils_1.28.0       cols4all_0.8             
 [71] coda_0.19-4.1             survival_3.8-3           
 [73] sfheaders_0.4.4           units_0.8-7              
 [75] statmod_1.5.0             TH.data_1.1-3            
 [77] bit64_4.6.0-1             filelock_1.0.3           
 [79] irlba_2.3.5.1             vipor_0.4.7              
 [81] KernSmooth_2.23-26        colorspace_2.1-1         
 [83] DBI_1.2.3                 leaflegend_1.2.1         
 [85] raster_3.6-32             zellkonverter_1.18.0     
 [87] tidyselect_1.2.1          bit_4.6.0                
 [89] compiler_4.5.0            curl_6.2.2               
 [91] BiocNeighbors_2.2.0       h5mread_1.0.0            
 [93] basilisk.utils_1.20.0     DelayedArray_0.34.1      
 [95] scales_1.4.0              classInt_0.4-11          
 [97] rappdirs_0.3.3            tiff_0.1-12              
 [99] goftest_1.2-3             fftwtools_0.9-11         
[101] spatstat.utils_3.1-3      rmarkdown_2.29           
[103] basilisk_1.20.0           XVector_0.48.0           
[105] base64enc_0.1-3           htmltools_0.5.8.1        
[107] pkgconfig_2.0.3           jpeg_0.1-11              
[109] sparseMatrixStats_1.20.0  fastmap_1.2.0            
[111] htmlwidgets_1.6.4         UCSC.utils_1.4.0         
[113] DelayedMatrixStats_1.30.0 farver_2.1.2             
[115] zoo_1.8-14                jsonlite_2.0.0           
[117] BiocParallel_1.42.0       R.oo_1.27.0              
[119] BiocSingular_1.24.0       RCurl_1.98-1.17          
[121] GenomeInfoDbData_1.2.14   s2_1.1.7                 
[123] Rhdf5lib_1.30.0           Rcpp_1.0.14              
[125] ggnewscale_0.5.1          viridis_0.6.5            
[127] reticulate_1.42.0         leafsync_0.1.0           
[129] stringi_1.8.7             MASS_7.3-65              
[131] plyr_1.8.9                parallel_4.5.0           
[133] ggrepel_0.9.6             deldir_2.0-4             
[135] stars_0.6-8               Biostrings_2.76.0        
[137] splines_4.5.0             tensor_1.5               
[139] locfit_1.5-9.12           igraph_2.1.4             
[141] ScaledMatrix_1.16.0       LearnBayes_2.15.1        
[143] XML_3.99-0.18             BiocVersion_3.21.1       
[145] evaluate_1.0.3            leaflet.providers_2.0.0  
[147] renv_1.1.4                BiocManager_1.30.25      
[149] purrr_1.0.4               polyclip_1.10-7          
[151] rsvd_1.0.5                lwgeom_0.2-14            
[153] e1071_1.7-16              RSpectra_0.16-2          
[155] viridisLite_0.4.2         class_7.3-23             
[157] tibble_3.2.1              memoise_2.0.1            
[159] beeswarm_0.4.0            AnnotationDbi_1.70.0     
[161] cluster_2.1.8.1           BiocStyle_2.36.0

References

10x Genomics. 2021. “Adult Mouse Kidney (FFPE), Spatial Gene Expression Dataset Analyzed Using Space Ranger 1.3.0.”

McKellar, David W., Lauren D. Walter, Leo T. Song, Madhav Mantri, Michael F. Z. Wang, Iwijn De Vlaminck, and Benjamin D. Cosgrove. 2021. “Large-Scale Integration of Single-Cell Transcriptomic Data Captures Transitional Progenitor States in Mouse Skeletal Muscle Regeneration.” Communications Biology 4 (1): 1–12. https://doi.org/10.1038/s42003-021-02810-x.

Moffitt, Jeffrey R., Dhananjay Bambah-Mukku, Stephen W. Eichhorn, Eric Vaughn, Karthik Shekhar, Julio D. Perez, Nimrod D. Rubinstein, et al. 2018. “Molecular, Spatial, and Functional Single-Cell Profiling of the Hypothalamic Preoptic Region.” Science 362 (6416): eaau5324. https://doi.org/10.1126/science.aau5324.

Moses, Lambda, Pétur Helgi Einarsson, Kayla C. Jackson, Laura Luebbert, Ali Sina Booeshaghi, Sindri Emmanúel Antonsson, Nicolas Bray, Páll Melsted, and Lior Pachter. 2023. “Voyager: Exploratory Single-Cell Genomics Data Analysis with Geospatial Statistics.” bioRxiv. https://doi.org/10.1101/2023.07.20.549945.

Moses, Lambda, and Lior Pachter. 2022. “Museum of Spatial Transcriptomics.” Nature Methods 19 (5): 534–46. https://doi.org/10.1038/s41592-022-01409-2.

Palla, Giovanni, Hannah Spitzer, Michal Klein, David Fischer, Anna Christina Schaar, Louis Benedikt Kuemmerle, Sergei Rybakov, et al. 2022. “Squidpy: A Scalable Framework for Spatial Omics Analysis.” Nature Methods 19 (2): 171–78. https://doi.org/10.1038/s41592-021-01358-2.

Pebesma, Edzer, and Roger Bivand. 2023. Spatial Data Science: With Applications in R. 1st ed. New York: Chapman and Hall/CRC. https://doi.org/10.1201/9780429459016.

Righelli, Dario, Lukas M Weber, Helena L Crowell, Brenda Pardo, Leonardo Collado-Torres, Shila Ghazanfar, Aaron T L Lun, Stephanie C Hicks, and Davide Risso. 2022. “SpatialExperiment: Infrastructure for Spatially-Resolved Transcriptomics Data in R Using Bioconductor.” Bioinformatics 38 (11): 3128–31. https://doi.org/10.1093/bioinformatics/btac299.

Virshup, Isaac, Sergei Rybakov, Fabian J Theis, Philipp Angerer, and F Alexander Wolf. 2024. “Anndata: Access and Store Annotated Data Matrices.” Journal of Open Source Software 9 (101): 4371. https://doi.org/10.21105/joss.04371.