The SpatialExperiment
class is designed to represent spatially
resolved transcriptomics (ST) data. It inherits from the
SingleCellExperiment
class and is used in the same manner. In
addition, the class supports storage of spatial information via
spatialCoords
and storage of images via imgData
.
Arguments
- ...
Arguments passed to the
SingleCellExperiment
constructor to fill the slots of the base class.- sample_id
A
character
sample identifier, which matches thesample_id
inimgData
. Thesample_id
will also be stored in a new column incolData
, if not already present. Default =sample01
.- spatialCoordsNames
A
character
vector of column names fromcolData
containing spatial coordinates, which will be accessible withspatialCoords
. Alternatively, thespatialCoords
argument may be provided. If both are provided,spatialCoordsNames
is given precedence, and a warning is returned. Default =c("x", "y")
.- spatialCoords
A numeric matrix containing columns of spatial coordinates, which will be accessible with
spatialCoords
. Alternatively,spatialCoordsNames
may be provided. If both are provided,spatialCoordsNames
is given precedence, and a warning is returned.- scaleFactors
Optional scale factors associated with the image(s). This can be provided as a numeric value, numeric vector, list, or file path to a JSON file for the 10x Genomics Visium platform. For 10x Genomics Visium, the correct scale factor will automatically be selected depending on the resolution of the image from
imageSources
. Default =1
.- imgData
Optional
DataFrame
containing the image data. Alternatively, this can be built from the argumentsimageSources
andimage_id
(see Details).- imageSources
Optional file path(s) or URL(s) for one or more image sources.
- image_id
Optional character vector (same length as
imageSources
) containing unique image identifiers.- loadImage
Logical indicating whether to load image into memory. Default =
FALSE
.- spatialDataNames
(Deprecated) A
character
vector of column names fromcolData
to include inspatialData
. Alternatively, thespatialData
argument may be provided. If both are provided,spatialDataNames
is given precedence, and a warning is returned. (Note:spatialData
andspatialDataNames
have been deprecated;colData
andspatialCoords
should be used for all columns. The arguments have been retained for backward compatibility but may be removed in the future.)- spatialData
(Deprecated) A
DataFrame
containing columns to store inspatialData
, which must contain at least the columns of spatial coordinates. Alternatively,spatialDataNames
may be provided. If both are provided,spatialDataNames
is given precedence, and a warning is returned. (Note:spatialData
andspatialDataNames
have been deprecated;colData
andspatialCoords
should be used for all columns. The arguments have been retained for backward compatibility but may be removed in the future.)
Details
In this class, rows represent genes, and columns represent spots (for
spot-based ST platforms) or cells (for molecule-based ST platforms). As for
SingleCellExperiment
, counts
and logcounts
can be
stored in the assays
slot, and row and column metadata in
rowData
and colData
. For molecule-based ST data,
the additional measurements per molecule per cell can be stored in a
BumpyMatrix
-formatted assay
named molecules
.
The additional arguments in the constructor documented above (e.g.
spatialCoords
, imgData
, and others) represent the extensions to
the SingleCellExperiment
class to store associated spatial and
imaging information for ST data.
The constructor expects colData
to contain a column named
sample_id
. If this is not present, it will assign the value from the
sample_id
argument. If the imgData
argument is provided, the
constructor expects the imgData
DataFrame
to
already be built. Otherwise, it will build it from the imageSources
and (optional) image_id
arguments. If image_id
is not provided,
this will be assumed from sample_id
and imageSources
instead.
To combine multiple samples within a single object, see
combine
.
For 10x Genomics Visium datasets, the function read10xVisium
can be used to load data and create a SpatialExperiment
object
directly from Space Ranger output files.
See also
?"SpatialExperiment-methods"
, which includes:
spatialCoords
, spatialCoordsNames
,
imgData
, scaleFactors
?"SpatialExperiment-assays"
, which includes:
molecules
Examples
#########################################################
# Example 1: Spot-based ST (10x Visium) using constructor
#########################################################
dir <- system.file(
file.path("extdata", "10xVisium", "section1", "outs"),
package = "SpatialExperiment")
# read in counts
fnm <- file.path(dir, "raw_feature_bc_matrix")
sce <- DropletUtils::read10xCounts(fnm)
# read in image data
img <- readImgData(
path = file.path(dir, "spatial"),
sample_id="foo")
# read in spatial coordinates
fnm <- file.path(dir, "spatial", "tissue_positions_list.csv")
xyz <- read.csv(fnm, header = FALSE,
col.names = c(
"barcode", "in_tissue", "array_row", "array_col",
"pxl_row_in_fullres", "pxl_col_in_fullres"))
# construct observation & feature metadata
rd <- S4Vectors::DataFrame(
symbol = rowData(sce)$Symbol)
# construct 'SpatialExperiment'
(spe <- SpatialExperiment(
assays = list(counts = assay(sce)),
rowData = rd,
colData = DataFrame(xyz),
spatialCoordsNames = c("pxl_col_in_fullres", "pxl_row_in_fullres"),
imgData = img,
sample_id = "foo"))
#> class: SpatialExperiment
#> dim: 50 50
#> metadata(0):
#> assays(1): counts
#> rownames(50): ENSMUSG00000051951 ENSMUSG00000089699 ...
#> ENSMUSG00000005886 ENSMUSG00000101476
#> rowData names(1): symbol
#> colnames: NULL
#> colData names(5): barcode in_tissue array_row array_col sample_id
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
#> imgData names(4): sample_id image_id data scaleFactor
#############################################################
# Example 2: Spot-based ST (10x Visium) using 'read10xVisium'
#############################################################
# see ?read10xVisium for details
example(read10xVisium)
#>
#> rd10xV> dir <- system.file(
#> rd10xV+ file.path("extdata", "10xVisium"),
#> rd10xV+ package = "SpatialExperiment")
#>
#> rd10xV> sample_ids <- c("section1", "section2")
#>
#> rd10xV> samples <- file.path(dir, sample_ids, "outs")
#>
#> rd10xV> list.files(samples[1])
#> [1] "raw_feature_bc_matrix" "spatial"
#>
#> rd10xV> list.files(file.path(samples[1], "spatial"))
#> [1] "scalefactors_json.json" "tissue_lowres_image.png"
#> [3] "tissue_positions_list.csv"
#>
#> rd10xV> file.path(samples[1], "raw_feature_bc_matrix")
#> [1] "/tmp/RtmppdZi9A/temp_libpathcf263c794d1/SpatialExperiment/extdata/10xVisium/section1/outs/raw_feature_bc_matrix"
#>
#> rd10xV> (spe <- read10xVisium(samples, sample_ids,
#> rd10xV+ type = "sparse", data = "raw",
#> rd10xV+ images = "lowres", load = FALSE))
#> class: SpatialExperiment
#> dim: 50 99
#> metadata(0):
#> assays(1): counts
#> rownames(50): ENSMUSG00000051951 ENSMUSG00000089699 ...
#> ENSMUSG00000005886 ENSMUSG00000101476
#> rowData names(1): symbol
#> colnames(99): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ...
#> AAAGTCGACCCTCAGT-1 AAAGTGCCATCAATTA-1
#> colData names(4): in_tissue array_row array_col sample_id
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
#> imgData names(4): sample_id image_id data scaleFactor
#>
#> rd10xV> # base directory 'outs/' from Space Ranger can also be omitted
#> rd10xV> samples2 <- file.path(dir, sample_ids)
#>
#> rd10xV> (spe2 <- read10xVisium(samples2, sample_ids,
#> rd10xV+ type = "sparse", data = "raw",
#> rd10xV+ images = "lowres", load = FALSE))
#> class: SpatialExperiment
#> dim: 50 99
#> metadata(0):
#> assays(1): counts
#> rownames(50): ENSMUSG00000051951 ENSMUSG00000089699 ...
#> ENSMUSG00000005886 ENSMUSG00000101476
#> rowData names(1): symbol
#> colnames(99): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ...
#> AAAGTCGACCCTCAGT-1 AAAGTGCCATCAATTA-1
#> colData names(4): in_tissue array_row array_col sample_id
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
#> imgData names(4): sample_id image_id data scaleFactor
#>
#> rd10xV> # tabulate number of spots mapped to tissue
#> rd10xV> cd <- colData(spe)
#>
#> rd10xV> table(
#> rd10xV+ in_tissue = cd$in_tissue,
#> rd10xV+ sample_id = cd$sample_id)
#> sample_id
#> in_tissue section1 section2
#> FALSE 28 27
#> TRUE 22 22
#>
#> rd10xV> # view available images
#> rd10xV> imgData(spe)
#> DataFrame with 2 rows and 4 columns
#> sample_id image_id data scaleFactor
#> <character> <character> <list> <numeric>
#> 1 section1 lowres #### 0.0510334
#> 2 section2 lowres #### 0.0510334
##############################
# Example 3: Molecule-based ST
##############################
# create simulated data
n <- 1000; ng <- 50; nc <- 20
# sample xy-coordinates in [0,1]
x <- runif(n)
y <- runif(n)
# assign each molecule to some gene-cell pair
gs <- paste0("gene", seq(ng))
cs <- paste0("cell", seq(nc))
gene <- sample(gs, n, TRUE)
cell <- sample(cs, n, TRUE)
# construct data.frame of molecule coordinates
df <- data.frame(gene, cell, x, y)
# (assure gene & cell are factor so that
# missing observations aren't dropped)
df$gene <- factor(df$gene, gs)
df$cell <- factor(df$cell, cs)
# construct BumpyMatrix
mol <- BumpyMatrix::splitAsBumpyMatrix(
df[, c("x", "y")],
row = df$gene, column = df$cell)
# get count matrix
y <- with(df, table(gene, cell))
y <- as.matrix(unclass(y))
# construct SpatialExperiment
(spe_mol <- SpatialExperiment(
assays = list(
counts = y,
molecules = mol)))
#> class: SpatialExperiment
#> dim: 50 20
#> metadata(0):
#> assays(2): counts molecules
#> rownames(50): gene1 gene2 ... gene49 gene50
#> rowData names(0):
#> colnames(20): cell1 cell2 ... cell19 cell20
#> colData names(1): sample_id
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(0) :
#> imgData names(0):