vignettes/SpatialExperiment.Rmd
The 10X Genomics CellRanger pipeline writes its results in standard output file formats that are saved, for each sample, in a single directory `/<sample>/outs/` with the following structure:
sample
|—outs
··|—raw/filtered_feature_bc_matrix.h5
··|—raw/filtered_feature_bc_matrix
····|—barcodes.tsv
····|—features.tsv
····|—matrix.mtx
··|—spatial
····|—tissue_hires_image.png
····|—tissue_lowres_image.png
····|—detected_tissue_image.jpg
····|—aligned_fiducials.jpg
····|—scalefactors_json.json
····|—tissue_positions_list.csv
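As a quick sanity check before reading any data, one could verify that a sample directory contains the spatial files listed above. The following is a minimal base-R sketch; the helper name `check_visium_spatial()` and the choice of which files to treat as required are assumptions made for illustration and are not part of the `SpatialExperiment` package.

```r
# Hypothetical helper (not part of SpatialExperiment): return the names of
# expected files that are missing from <sample_dir>/spatial.
check_visium_spatial <- function(sample_dir,
    required = c("scalefactors_json.json", "tissue_positions_list.csv")) {
    paths <- file.path(sample_dir, "spatial", required)
    required[!file.exists(paths)]
}

# Toy demonstration on a temporary directory that mimics the layout above
sample_dir <- file.path(tempdir(), "sample", "outs")
dir.create(file.path(sample_dir, "spatial"),
    recursive = TRUE, showWarnings = FALSE)
file.create(file.path(sample_dir, "spatial", "scalefactors_json.json"))

# An empty character vector would mean that all required files were found
missing_files <- check_visium_spatial(sample_dir)
missing_files
```

In this toy directory only one of the two files was created, so the other one is reported as missing.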
The `SpatialExperiment` package provides example 10X Visium spatial gene expression data from two serial mouse brain sections (Sagittal-Posterior), available from the 10X Genomics website. These are located in the `extdata/10xVisium` directory:
dir <- system.file(
file.path("extdata", "10xVisium"),
package = "SpatialExperiment")
sample_ids <- c("section1", "section2")
samples <- file.path(dir, sample_ids)
We can load these data into a `SpatialExperiment` using the `read10xVisium()` function, which will read in all relevant information, including the count data, spatial coordinates, scale factors, and images:
list.files(samples[1])
## [1] "raw_feature_bc_matrix.h5" "spatial"
list.files(file.path(samples[1], "spatial"))
## [1] "scalefactors_json.json" "tissue_lowres_image.png"
## [3] "tissue_positions_list.csv"
(ve <- read10xVisium(samples, sample_ids,
images = "lowres", # specify which image(s) to include
load = TRUE)) # specify whether or not to load image(s)
## class: SpatialExperiment
## dim: 32285 9984
## metadata(2): Samples Samples
## assays(1): counts
## rownames(32285): ENSMUSG00000051951 ENSMUSG00000089699 ...
## ENSMUSG00000095019 ENSMUSG00000095041
## rowData names(1): symbol
## colnames(9984): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ...
## TTGTTTGTATTACACG-1 TTGTTTGTGTAAATTC-1
## colData names(7): Barcode sample_id ... array_row array_col
## reducedDimNames(0):
## altExpNames(0):
## spatialCoordsNames(5) : x_coord y_coord in_tissue array_row array_col
## inTissue(1): 6710
## imgData(6): sample_id image_id ... height scaleFactor
SpatialExperiment class

Spatial data are stored as observation metadata (`colData`) and include:

* `sample_id` specifying unique sample identifiers
* `in_tissue` indicating whether an observation was mapped to tissue
* `x/y_coord` storing spatial coordinates
* `array_row/col` giving the spots’ row/column coordinates in the array (see footnote 1)

A `DataFrame` of spatially-related data can be accessed using the `spatialCoords()` accessor:
```r
head(spatialCoords(ve))
```
```
## DataFrame with 6 rows and 5 columns
## x_coord y_coord in_tissue array_row array_col
## <integer> <integer> <logical> <integer> <integer>
## AAACAACGAATAGTTC-1 1419 2534 FALSE 0 16
## AAACAAGTATCTCCCA-1 7409 8455 TRUE 50 102
## AAACAATCTACTAGCA-1 1778 4393 FALSE 3 43
## AAACACCAATAACTGC-1 8487 2740 TRUE 59 19
## AAACAGAGCGACTCCT-1 3096 7905 TRUE 14 94
## AAACAGCTTTCAGAAG-1 6570 2052 TRUE 43 9
```
Alternatively, we can access these data using `colData()` or, even simpler, the `$` accessor:
```r
# tabulate number of spots mapped to tissue
table(
in_tissue = ve$in_tissue,
sample_id = ve$sample_id)
```
```
## sample_id
## in_tissue section1 section2
## FALSE 1637 1637
## TRUE 3355 3355
```
Image-related data are stored in the `int_metadata`’s `imgData` field as a `DataFrame` with the following columns:

* `sample_id` and `image_id` specifying the image’s sample and image identifiers
* `data`: a list of `SpatialImage`s containing the image’s `grob`, path and/or URL
* `width` and `height` giving the image’s dimensions (in pixels)
* `scaleFactor` used to rescale spatial coordinates according to the image’s resolution

We can retrieve these data using the `imgData()` accessor:
imgData(ve)
## DataFrame with 2 rows and 6 columns
## sample_id image_id data width height scaleFactor
## <character> <character> <list> <integer> <integer> <numeric>
## 1 section1 lowres 600 600 0.0516351
## 2 section2 lowres 600 600 0.0516351
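To illustrate the role of `scaleFactor`, the sketch below rescales a few full-resolution spot coordinates to pixel positions in the low-resolution image. It uses plain base R, with the coordinate values copied from the `spatialCoords()` output and the scale factor and image dimensions copied from the `imgData()` output shown above. Treating the rescaling as a simple multiplication is an assumption based on how Visium scale factors are commonly applied, not something this vignette states explicitly.

```r
# Full-resolution coordinates of the first three spots (from the output above)
xy <- data.frame(
    x_coord = c(1419, 7409, 1778),
    y_coord = c(2534, 8455, 4393))

# Scale factor and dimensions of the 'lowres' image (from imgData(ve))
scale_factor <- 0.0516351
img_width <- 600
img_height <- 600

# Assumption: pixel position = full-resolution coordinate * scaleFactor
xy_px <- xy * scale_factor
xy_px

# Check that all rescaled positions fall within the image bounds
in_bounds <-
    xy_px$x_coord >= 0 & xy_px$x_coord <= img_width &
    xy_px$y_coord >= 0 & xy_px$y_coord <= img_height
all(in_bounds)
```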
SpatialImage class

Images inside a `SpatialExperiment`’s `imgData` are stored as objects of class `SpatialImage`. These contain three slots that can accommodate any available information associated with an image:

* `@grob`: NULL or an object of class `rastergrob` from the `grid` package
* `@path`: NULL or a character string specifying an image file name (.png, .jpg or .tif)
* `@url`: NULL or a character string specifying a URL from which to retrieve the image

A list of `SpatialImage`s can be retrieved from the `imgData`’s `data` field using the `$` accessor:
imgData(ve)$data
## [[1]]
## A SpatialImage with 2 source(s):
## > loaded
## grob: Av
## path: /usr/local/lib/R/site-library/SpatialExperiment/ex
## tdata/10xVisium/section1/spatial/tissue_lowres_ima
## ge.png
## url: NA
##
## [[2]]
## A SpatialImage with 2 source(s):
## > loaded
## grob: Av
## path: /usr/local/lib/R/site-library/SpatialExperiment/ex
## tdata/10xVisium/section2/spatial/tissue_lowres_ima
## ge.png
## url: NA
Data available in an object of class `SpatialImage` may be accessed via the `imgGrob()`, `imgPath()` and `imgUrl()` accessors:
si <- imgData(ve)$data[[1]]
imgGrob(si)
## rastergrob[GRID.rastergrob.11]
imgPath(si)
## [1] "/usr/local/lib/R/site-library/SpatialExperiment/extdata/10xVisium/section1/spatial/tissue_lowres_image.png"
imgUrl(si)
## NULL
`grob`s can be used directly for plotting (e.g., using `grid.draw()` or `ggplot2`’s `layer()` and `annotation_custom()`):
```r
si <- imgData(ve)$data[[1]]
grid.draw(imgGrob(si))
```
<img src="/__w/EuroBioc2020_SpatialWorkshop/EuroBioc2020_SpatialWorkshop/docs/articles/SpatialExperiment_files/figure-html/unnamed-chunk-10-1.png" width="700" />
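As an alternative to `grid.draw()`, a `grob` can be placed underneath a `ggplot2` plot with `annotation_custom()`. The sketch below is self-contained and therefore builds a stand-in `rastergrob` from a small gradient matrix rather than using `imgGrob(si)`; the spot positions and the 600 x 600 plotting extent are made-up values for illustration.

```r
library(grid)
library(ggplot2)

# Stand-in image: a 10 x 10 grayscale gradient with values in [0, 1]
# (with a real object, imgGrob(si) would take the place of 'g')
img <- matrix(seq(0, 1, length.out = 100), nrow = 10)
g <- rasterGrob(img, interpolate = FALSE)

# Made-up spot positions in image (pixel) coordinates
spots <- data.frame(x = c(100, 300, 500), y = c(150, 300, 450))

# Draw the image across the full plotting extent, then overlay the spots
p <- ggplot(spots, aes(x, y)) +
    annotation_custom(g, xmin = 0, xmax = 600, ymin = 0, ymax = 600) +
    geom_point(colour = "red") +
    coord_fixed(xlim = c(0, 600), ylim = c(0, 600))
p
```

Because `annotation_custom()` takes explicit `xmin/xmax/ymin/ymax` limits, the image extent has to be expressed in the same units as the plotted coordinates.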
`path` and `url` provide the option to store an image’s source at minimal storage cost. This is desirable when multiple images are to be stored (say, for many samples and at different resolutions), or when a `SpatialExperiment` is to be exported.
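The difference in storage cost can be made concrete with a rough base-R comparison. This sketch does not use `SpatialImage` objects; it simply contrasts a 600 x 600 raster held in memory with a character string pointing to a file, and the path used is a hypothetical placeholder.

```r
# A stand-in for a loaded 600 x 600 image, held in memory as a raster
img <- as.raster(matrix(0.5, nrow = 600, ncol = 600))

# The corresponding source, stored as a character string only
# (hypothetical path, for illustration)
path <- "section1/spatial/tissue_lowres_image.png"

# Compare the in-memory size of the two representations
size_img <- object.size(img)
size_path <- object.size(path)
format(size_img, units = "Kb")
format(size_path, units = "Kb")
```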
The `SpatialExperiment` package provides various functions to control which image data are stored in the object, and how. These include:

* `loadImg` to actively load (an) image(s) from a path or URL and store it as a `grob`
* `unloadImg` to drop the `grob`, while retaining the source path and/or URL
* `addImg` to add a new image entry (as a path, URL, or `grob`)
* `removeImg` to drop an image entry entirely

`loadImg()` and `add/removeImg()` are flexible in the specification of the `sample/image_id` arguments. Specifically,

* `TRUE` is equivalent to all, e.g., `sample_id = "<sample>", image_id = TRUE` will drop all images for a given sample.
* `NULL` defaults to the first entry available, e.g., `sample_id = "<sample>", image_id = NULL` will drop the first image for a given sample.

For example, `sample_id,image_id = TRUE,TRUE` will specify all images; `NULL,NULL` corresponds to the first image entry in the `imgData`; `TRUE,NULL` equals the first image for all samples; and `NULL,TRUE` matches all images for the first sample.
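The selection rules above can be mimicked in a few lines of base R. The helper `resolve_img_idx()` below is hypothetical and is not the package's internal implementation; it only illustrates how `TRUE` and `NULL` could resolve to rows of an `imgData`-like table, here a plain `data.frame` with made-up entries.

```r
# Hypothetical helper: map sample_id/image_id specifications to row indices.
# TRUE selects all entries, NULL selects the first available entry.
resolve_img_idx <- function(img_data, sample_id = NULL, image_id = NULL) {
    # resolve which sample(s) are requested
    sids <- if (is.null(sample_id)) {
        img_data$sample_id[1]
    } else if (isTRUE(sample_id)) {
        unique(img_data$sample_id)
    } else {
        sample_id
    }
    idx <- integer(0)
    for (s in sids) {
        rows <- which(img_data$sample_id == s)
        if (is.null(image_id)) {
            rows <- head(rows, 1)  # first image of this sample
        } else if (!isTRUE(image_id)) {
            rows <- rows[img_data$image_id[rows] %in% image_id]
        }
        idx <- c(idx, rows)
    }
    sort(idx)
}

# Toy table with two images for the first sample and one for the second
img_data <- data.frame(
    sample_id = c("section1", "section2", "section1"),
    image_id = c("lowres", "lowres", "hires"),
    stringsAsFactors = FALSE)

resolve_img_idx(img_data, TRUE, TRUE)  # all images
resolve_img_idx(img_data, NULL, NULL)  # first entry only
resolve_img_idx(img_data, TRUE, NULL)  # first image of every sample
resolve_img_idx(img_data, NULL, TRUE)  # all images of the first sample
```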
In the example below, we unload all images, i.e., drop all `grob`s. As a result, `grob` slots will be set to `NULL`, and all `SpatialImage`s now say `> not loaded`.
## [[1]]
## A SpatialImage with 1 source(s):
## > not loaded
## grob: NA
## path: /usr/local/lib/R/site-library/SpatialExperiment/ex
## tdata/10xVisium/section1/spatial/tissue_lowres_ima
## ge.png
## url: NA
##
## [[2]]
## A SpatialImage with 1 source(s):
## > not loaded
## grob: NA
## path: /usr/local/lib/R/site-library/SpatialExperiment/ex
## tdata/10xVisium/section2/spatial/tissue_lowres_ima
## ge.png
## url: NA
We can reload a single image or a set of images using `loadImg()`:
## [[1]]
## A SpatialImage with 1 source(s):
## > not loaded
## grob: NA
## path: /usr/local/lib/R/site-library/SpatialExperiment/ex
## tdata/10xVisium/section1/spatial/tissue_lowres_ima
## ge.png
## url: NA
##
## [[2]]
## A SpatialImage with 2 source(s):
## > loaded
## grob: Av
## path: /usr/local/lib/R/site-library/SpatialExperiment/ex
## tdata/10xVisium/section2/spatial/tissue_lowres_ima
## ge.png
## url: NA
Besides a path or URL to source the image from and a numeric scale factor, `addImg()` requires specification of the `sample_id` the new image belongs to, and an `image_id` that is not yet in use for that sample:
url <- "https://i.redd.it/3pw5uah7xo041.jpg"
ve <- addImg(ve,
sample_id = "section1", image_id = "pomeranian",
imageSource = url, scaleFactor = NA_real_, load = TRUE)
## Warning in sprintf(" 'image_id' and 'sample_id'", dQuote(c(image_id,
## sample_id))): one argument not used by format ' 'image_id' and 'sample_id''
The above code chunk has added a new image entry to the input `SpatialExperiment`’s `imgData` field:
```r
imgData(ve)
```
```
## DataFrame with 3 rows and 6 columns
## sample_id image_id data width height scaleFactor
## <character> <character> <list> <integer> <integer> <numeric>
## 1 section1 lowres 600 600 0.0516351
## 2 section2 lowres 600 600 0.0516351
## 3 section1 pomeranian 1200 1186 NA
```
We can remove specific images with `removeImg()`:
```r
ve <- removeImg(ve,
sample_id = "section1",
image_id = "pomeranian")
imgData(ve)
```
```
## DataFrame with 2 rows and 6 columns
## sample_id image_id data width height scaleFactor
## <character> <character> <list> <integer> <integer> <numeric>
## 1 section1 lowres 600 600 0.0516351
## 2 section2 lowres 600 600 0.0516351
```
colData replacement

While storing `sample_id`s, the `in_tissue` indicator, and spatial `x/y_coord`s inside the `SpatialExperiment`’s `colData` enables direct access via the `colData` and `$` accessors, these fields are protected against arbitrary modification. This affects the following operations:

Renaming is generally not permitted:
## Warning in .local(x, ..., value): cannot rename 'colData' fields 'x_coord',
## 'y_coord', 'in_tissue', 'array_row', 'array_col'
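Replacement of `sample_id`s is permitted provided that the number of unique `sample_id`s is retained and the new `sample_id`s map uniquely onto the existing ones: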
## Warning in .local(x, ..., value): Number of unique 'sample_id's is 2, but 3 were provided.
## Overwriting
## Warning in .local(x, ..., value): New 'sample_id's must map uniquely
Valid replacement will be propagated to the `imgData`:
tmp <- ve
i <- as.numeric(factor(ve$sample_id))
tmp$sample_id <- c("sample1", "sample2")[i]
imgData(tmp)
## DataFrame with 2 rows and 6 columns
## sample_id image_id data width height scaleFactor
## <character> <character> <list> <integer> <integer> <numeric>
## 1 sample1 lowres 600 600 0.0516351
## 2 sample2 lowres 600 600 0.0516351
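The two conditions reflected in the warnings above (same number of unique identifiers, and a unique mapping between old and new identifiers) can be expressed as a small base-R check. The function `is_valid_sample_id_replacement()` is a hypothetical illustration and does not reproduce the package's own validation code.

```r
# Hypothetical helper: TRUE if 'new' keeps the number of unique samples and
# every old id is paired with exactly one new id (and vice versa).
is_valid_sample_id_replacement <- function(old, new) {
    if (length(old) != length(new)) return(FALSE)
    pairs <- unique(data.frame(old = old, new = new, stringsAsFactors = FALSE))
    n_old <- length(unique(old))
    n_new <- length(unique(new))
    n_old == n_new && nrow(pairs) == n_old
}

old <- rep(c("section1", "section2"), each = 3)

# One-to-one renaming: valid
ok <- is_valid_sample_id_replacement(
    old, rep(c("sample1", "sample2"), each = 3))

# Three unique ids provided for two samples: invalid
too_many <- is_valid_sample_id_replacement(
    old, c("a", "a", "b", "c", "c", "c"))

# Two unique ids, but they do not map uniquely onto the old ones: invalid
not_unique <- is_valid_sample_id_replacement(
    old, c("a", "b", "a", "b", "a", "b"))

c(ok = ok, too_many = too_many, not_unique = not_unique)
```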
The `x/y_coord` and `in_tissue` fields may be modified provided that the former is a two- or three-column numeric matrix, and the latter is a logical vector:
ve$x_coord <- "x"
ve$in_tissue <- "x"
## Warning in .local(x, ..., value): 'in_tissue' field in 'colData' should be
## 'logical'
`colData() <- NULL` will retain only required fields, i.e., `sample_id`, `in_tissue` and `x/y_coord`:
names(colData(ve))
## [1] "Barcode" "sample_id" "x_coord" "y_coord" "in_tissue" "array_row"
## [7] "array_col"
colData(ve) <- NULL
names(colData(ve))
## [1] "sample_id" "x_coord" "y_coord" "in_tissue" "array_row" "array_col"
1. `array_row`s range from 0 to 77 (78 rows); `array_col`s are even in 0-126 for even rows, and odd in 1-127 for odd rows (64 columns per row), giving \(78 \times 64 = 4992\) spots per sample.
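The spot count in the footnote can be checked by enumerating the array layout it describes. This is a base-R sketch that builds the grid from the stated row and column ranges only; it does not read any Visium data, and the variable names are chosen for illustration.

```r
# Enumerate the array positions: rows 0-77, with even columns (0-126)
# on even rows and odd columns (1-127) on odd rows
rows <- 0:77
spots <- do.call(rbind, lapply(rows, function(r) {
    cols <- if (r %% 2 == 0) seq(0, 126, by = 2) else seq(1, 127, by = 2)
    data.frame(array_row = r, array_col = cols)
}))

# Number of columns per row and total number of spots
spots_per_row <- table(spots$array_row)
n_spots <- nrow(spots)
n_spots
```

The total equals 78 rows times 64 columns, and two such arrays account for the 9984 columns of the two-sample object shown earlier.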