library(ggplot2)
library(grid)
library(SpatialExperiment)

Reading 10X Visium data

The 10X Genomics Space Ranger pipeline writes its results in standard output file formats that are saved, for each sample, in a single directory /<sample>/outs/ with the following structure:

sample
|outs 
··|raw/filtered_feature_bc_matrix.h5
··|raw/filtered_feature_bc_matrix
····|barcodes.tsv
····|features.tsv
····|matrix.mtx
··|spatial
····|tissue_hires_image.png
····|tissue_lowres_image.png
····|detected_tissue_image.jpg
····|aligned_fiducials.jpg
····|scalefactors_json.json
····|tissue_positions_list.csv
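
Before reading a sample, it can help to confirm that the spatial files listed above are actually present. The following is a minimal base-R sketch; the helper name check_spatial_files and the choice of which files count as required are assumptions made for illustration, not part of SpatialExperiment:

```r
# Hypothetical helper: report which of the expected files exist
# in the 'spatial' subdirectory of a sample's output directory.
check_spatial_files <- function(sample_dir,
    required = c("scalefactors_json.json", "tissue_positions_list.csv")) {
  paths <- file.path(sample_dir, "spatial", required)
  setNames(file.exists(paths), required)
}

# Demonstrate on a throwaway directory that mimics the layout above,
# containing only one of the two expected files.
demo_dir <- file.path(tempdir(), "check_spatial_demo", "outs")
dir.create(file.path(demo_dir, "spatial"),
  recursive = TRUE, showWarnings = FALSE)
invisible(file.create(
  file.path(demo_dir, "spatial", "scalefactors_json.json")))

res <- check_spatial_files(demo_dir)
res
```

A missing file shows up as FALSE in the named result, which makes it easy to stop early with an informative message.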

The SpatialExperiment package provides example 10X Visium spatial gene expression data from two serial mouse brain sections (Sagittal-Posterior), available from the 10X Genomics website. These are located in the extdata/10xVisium directory:

dir <- system.file(
  file.path("extdata", "10xVisium"),
  package = "SpatialExperiment")

sample_ids <- c("section1", "section2")
samples <- file.path(dir, sample_ids)

We can load these data into a SpatialExperiment using the read10xVisium() function, which will read in all relevant information, including the count data, spatial coordinates, scale factors, and images:

list.files(samples[1])
## [1] "raw_feature_bc_matrix.h5" "spatial"
list.files(file.path(samples[1], "spatial"))
## [1] "scalefactors_json.json"    "tissue_lowres_image.png"  
## [3] "tissue_positions_list.csv"
(ve <- read10xVisium(samples, sample_ids,
  images = "lowres", # specify which image(s) to include
  load = TRUE))      # specify whether or not to load image(s)
## class: SpatialExperiment 
## dim: 32285 9984 
## metadata(2): Samples Samples
## assays(1): counts
## rownames(32285): ENSMUSG00000051951 ENSMUSG00000089699 ...
##   ENSMUSG00000095019 ENSMUSG00000095041
## rowData names(1): symbol
## colnames(9984): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ...
##   TTGTTTGTATTACACG-1 TTGTTTGTGTAAATTC-1
## colData names(7): Barcode sample_id ... array_row array_col
## reducedDimNames(0):
## altExpNames(0):
## spatialCoordsNames(5) : x_coord y_coord in_tissue array_row array_col
## inTissue(1): 6710
## imgData(6): sample_id image_id ... height scaleFactor

The SpatialExperiment class

Spatial data

Spatial data are stored as observation metadata (colData) and include:

  • sample_id specifying unique sample identifiers
  • in_tissue indicating whether an observation was mapped to tissue
  • x/y_coord storing spatial coordinates
  • array_row/col giving the spots’ row/column coordinate in the array (see footnote 1)

A DataFrame of spatially-related data can be accessed using the spatialCoords() accessor:

```r
head(spatialCoords(ve))
```

```
## DataFrame with 6 rows and 5 columns
##                      x_coord   y_coord in_tissue array_row array_col
##                    <integer> <integer> <logical> <integer> <integer>
## AAACAACGAATAGTTC-1      1419      2534     FALSE         0        16
## AAACAAGTATCTCCCA-1      7409      8455      TRUE        50       102
## AAACAATCTACTAGCA-1      1778      4393     FALSE         3        43
## AAACACCAATAACTGC-1      8487      2740      TRUE        59        19
## AAACAGAGCGACTCCT-1      3096      7905      TRUE        14        94
## AAACAGCTTTCAGAAG-1      6570      2052      TRUE        43         9
```

Alternatively, we can access these data using colData() or, even simpler, the $ accessor:

```r
# tabulate number of spots mapped to tissue
table(
  in_tissue = ve$in_tissue,
  sample_id = ve$sample_id)
```

```
##          sample_id
## in_tissue section1 section2
##     FALSE     1637     1637
##     TRUE      3355     3355
```
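
The tabulated counts add up to 1,637 + 3,355 = 4,992 spots per sample, which is the size of the Visium array described in footnote 1 (78 rows by 64 columns, with even column indices on even rows and odd ones on odd rows). A minimal base-R sketch that rebuilds this layout; the object name grid_layout is illustrative only:

```r
# Rebuild the array layout: rows 0-77, with even column indices (0-126)
# on even rows and odd column indices (1-127) on odd rows.
grid_layout <- do.call(rbind, lapply(0:77, function(r) {
  data.frame(array_row = r, array_col = seq(r %% 2, 127, by = 2))
}))

# number of spots per row and in total
table(table(grid_layout$array_row))
nrow(grid_layout)
```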

Image data

Image-related data are stored in the int_metadata’s imgData field as a DataFrame with the following columns:

  • sample_id and image_id specifying the image’s sample and image identifier
  • data: a list of SpatialImages containing the image’s grob, path and/or URL
  • width and height giving the image’s dimensions (in pixels)
  • scaleFactor used to rescale spatial coordinates according to the image’s resolution

We can retrieve these data using the imgData() accessor:

imgData(ve)
## DataFrame with 2 rows and 6 columns
##     sample_id    image_id   data     width    height scaleFactor
##   <character> <character> <list> <integer> <integer>   <numeric>
## 1    section1      lowres              600       600   0.0516351
## 2    section2      lowres              600       600   0.0516351
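
To illustrate the role of scaleFactor, the sketch below rescales a few of the full-resolution coordinates shown earlier so that they fall inside the 600 x 600 low-resolution image. It is plain base R on copied values rather than a SpatialExperiment method, and it assumes that rescaling is a simple multiplication by the scale factor:

```r
# x/y coordinates of three in-tissue spots, copied from the output above
xy <- cbind(
  x_coord = c(7409, 8487, 3096),
  y_coord = c(8455, 2740, 7905))

# scale factor of the low-resolution image, as reported by imgData()
scale_factor <- 0.0516351

# coordinates expressed in pixels of the low-resolution image
xy_lowres <- xy * scale_factor
round(xy_lowres, 1)
```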

The SpatialImage class

Images inside a SpatialExperiment’s imgData are stored as objects of class SpatialImage. These contain three slots that can accommodate any available information associated with an image:

  • @grob: NULL or an object of class rastergrob from the grid package
  • @path: NULL or a character string specifying an image file name (.png, .jpg or .tif)
  • @url: NULL or a character string specifying a URL from which to retrieve the image

A list of SpatialImages can be retrieved from the imgData’s data field using the $ accessor:

imgData(ve)$data
## [[1]]
## A SpatialImage with 2 source(s):
##       > loaded
## grob: Av
## path: /usr/local/lib/R/site-library/SpatialExperiment/ex
##       tdata/10xVisium/section1/spatial/tissue_lowres_ima
##       ge.png
##  url: NA
## 
## [[2]]
## A SpatialImage with 2 source(s):
##       > loaded
## grob: Av
## path: /usr/local/lib/R/site-library/SpatialExperiment/ex
##       tdata/10xVisium/section2/spatial/tissue_lowres_ima
##       ge.png
##  url: NA

Data available in an object of class SpatialImage may be accessed via the imgGrob(), imgPath() and imgUrl() accessors:

si <- imgData(ve)$data[[1]]
imgGrob(si)
## rastergrob[GRID.rastergrob.11]
imgPath(si)
## [1] "/usr/local/lib/R/site-library/SpatialExperiment/extdata/10xVisium/section1/spatial/tissue_lowres_image.png"
imgUrl(si)
## NULL

grobs can be used directly for plotting (e.g. using grid.draw() or ggplot2’s layer() and annotation_custom()):

```r
si <- imgData(ve)$data[[1]]
grid.draw(imgGrob(si))
```
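
For the ggplot2 route, annotation_custom() places a grob within a given coordinate range, so that points can be drawn on top of it. The sketch below is only an illustration: it uses a synthetic raster grob in place of imgGrob(si), made-up spot positions, and an assumed 600 x 600 pixel extent. Image rows usually run top-down, so real data may additionally need the y axis reversed.

```r
library(grid)
library(ggplot2)

# synthetic 600 x 600 grayscale raster; stands in for imgGrob(si)
g <- rasterGrob(matrix(seq(0, 1, length.out = 600 * 600), 600, 600))

# made-up spot positions, already expressed in image pixels
spots <- data.frame(x = c(100, 300, 500), y = c(150, 350, 450))

# draw the image as a background layer, then overlay the spots
p <- ggplot(spots, aes(x, y)) +
  annotation_custom(g, xmin = 0, xmax = 600, ymin = 0, ymax = 600) +
  geom_point(colour = "red") +
  coord_fixed(xlim = c(0, 600), ylim = c(0, 600), expand = FALSE)
```

Printing p renders the plot on the active graphics device.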


path and url provide the option to store an image’s source at minimal storage cost. This is desirable when multiple images are to be stored (say, for many samples and of different resolutions), or when a SpatialExperiment is to be exported.
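
The difference in storage cost can be made concrete with a rough comparison. This sketch uses a synthetic 600 x 600 image and a hypothetical file path rather than the package's own data, so the exact sizes are indicative only:

```r
library(grid)

# synthetic 600 x 600 image standing in for a loaded low-resolution image
img  <- matrix(runif(600 * 600), nrow = 600)
grob <- rasterGrob(img)

# hypothetical file path; storing it costs a single short string
path <- "section1/spatial/tissue_lowres_image.png"

# in-memory size of the loaded image versus its source path
size_grob <- object.size(grob)
size_path <- object.size(path)
format(size_grob, units = "Kb")
format(size_path, units = "Kb")
```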

Methods for image handling

The SpatialExperiment package provides several functions to control which image data are stored in the object, and how. These include:

  • loadImg to actively load (an) image(s) from a path or URL and store it as a grob
  • unloadImg to drop the grob, while retaining the source path and/or URL
  • addImg to add a new image entry (as a path, URL, or grob)
  • removeImg to drop an image entry entirely

Loading & unloading images

loadImg() and add/removeImg() are flexible in the specification of the sample/image_id arguments. Specifically,

  • TRUE is equivalent to all, e.g. sample_id = "<sample>", image_id = TRUE will drop all images for a given sample.
  • NULL defaults to the first entry available, e.g., sample_id = "<sample>", image_id = NULL will drop the first image for a given sample.

For example, sample_id,image_id = TRUE,TRUE will specify all images; NULL,NULL corresponds to the first image entry in the imgData; TRUE,NULL equals the first image for all samples; and NULL,TRUE matches all images for the first sample.
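
The four combinations can be mimicked in base R. The helper below, select_images(), is hypothetical and is not how SpatialExperiment implements the lookup; it only reproduces the TRUE/NULL semantics described above on an imgData-like data.frame:

```r
# Hypothetical helper: return the row indices of an imgData-like
# data.frame selected by 'sample_id' and 'image_id'.
select_images <- function(df, sample_id = NULL, image_id = NULL) {
  # resolve samples: TRUE = all, NULL = first, otherwise match by name
  samples <- unique(df$sample_id)
  if (is.null(sample_id)) {
    samples <- samples[1]
  } else if (!isTRUE(sample_id)) {
    samples <- intersect(samples, sample_id)
  }
  # resolve images separately within each selected sample
  idx <- lapply(samples, function(s) {
    rows <- which(df$sample_id == s)
    if (is.null(image_id)) {
      rows[1]
    } else if (isTRUE(image_id)) {
      rows
    } else {
      rows[df$image_id[rows] %in% image_id]
    }
  })
  sort(as.integer(unlist(idx)))
}

# made-up image table: two images for section1, one for section2
img <- data.frame(
  sample_id = c("section1", "section2", "section1"),
  image_id  = c("lowres", "lowres", "hires"))

select_images(img, TRUE, TRUE)  # all images
select_images(img, NULL, NULL)  # first image entry
select_images(img, TRUE, NULL)  # first image of every sample
select_images(img, NULL, TRUE)  # all images of the first sample
```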

In the example below, we unload all images, i.e., drop all grobs. As a result, grob slots will be set to NULL, and all SpatialImages now say > not loaded.

ve <- unloadImg(ve, sample_id = TRUE, image_id = TRUE)
imgData(ve)$data
## [[1]]
## A SpatialImage with 1 source(s):
##       > not loaded
## grob: NA
## path: /usr/local/lib/R/site-library/SpatialExperiment/ex
##       tdata/10xVisium/section1/spatial/tissue_lowres_ima
##       ge.png
##  url: NA
## 
## [[2]]
## A SpatialImage with 1 source(s):
##       > not loaded
## grob: NA
## path: /usr/local/lib/R/site-library/SpatialExperiment/ex
##       tdata/10xVisium/section2/spatial/tissue_lowres_ima
##       ge.png
##  url: NA

We can reload a single image or a set of images using loadImg():

ve <- loadImg(ve, sample_id = "section2")
imgData(ve)$data
## [[1]]
## A SpatialImage with 1 source(s):
##       > not loaded
## grob: NA
## path: /usr/local/lib/R/site-library/SpatialExperiment/ex
##       tdata/10xVisium/section1/spatial/tissue_lowres_ima
##       ge.png
##  url: NA
## 
## [[2]]
## A SpatialImage with 2 source(s):
##       > loaded
## grob: Av
## path: /usr/local/lib/R/site-library/SpatialExperiment/ex
##       tdata/10xVisium/section2/spatial/tissue_lowres_ima
##       ge.png
##  url: NA

Adding & removing images

Besides a path or URL to source the image from and a numeric scale factor, addImg() requires specification of the sample_id the new image belongs to, and an image_id that is not yet in use for that sample:

url <- "https://i.redd.it/3pw5uah7xo041.jpg"
ve <- addImg(ve,
  sample_id = "section1", image_id = "pomeranian",
  imageSource = url, scaleFactor = NA_real_, load = TRUE)
## Warning in sprintf(" 'image_id' and 'sample_id'", dQuote(c(image_id,
## sample_id))): one argument not used by format ' 'image_id' and 'sample_id''

The above code chunk has added a new image entry in the input SpatialExperiment’s imgData field:

```r
imgData(ve)
```

```
## DataFrame with 3 rows and 6 columns
##     sample_id    image_id   data     width    height scaleFactor
##   <character> <character> <list> <integer> <integer>   <numeric>
## 1    section1      lowres              600       600   0.0516351
## 2    section2      lowres              600       600   0.0516351
## 3    section1  pomeranian             1200      1186          NA
```

grb <- imgGrob(ve,
               sample_id = "section1",
               image_id = "pomeranian")
grid.draw(grb)

We can remove specific images with removeImg():

```r
ve <- removeImg(ve,
            sample_id = "section1",
            image_id = "pomeranian")
imgData(ve)
```

```
## DataFrame with 2 rows and 6 columns
##     sample_id    image_id   data     width    height scaleFactor
##   <character> <character> <list> <integer> <integer>   <numeric>
## 1    section1      lowres              600       600   0.0516351
## 2    section2      lowres              600       600   0.0516351
```

colData replacement

Storing sample_ids, the in_tissue indicator, and spatial x/y_coords inside the SpatialExperiment’s colData makes them directly accessible via the colData and $ accessors. However, these fields are protected against arbitrary modification, which affects the following operations:

Renaming is generally not permitted:

i <- grep("^(x|y)_coord$", names(colData(ve))) # match x_coord and y_coord only
names(colData(ve))[i] <- "foo"
## Warning in .local(x, ..., value): cannot rename 'colData' fields 'x_coord',
## 'y_coord', 'in_tissue', 'array_row', 'array_col'

Replacement of sample_ids is permitted provided that

  1. the number of unique sample identifiers is retained
  2. newly provided sample identifiers map one-to-one onto the existing ones

ve$sample_id <- sample(c("a", "b", "c"), ncol(ve), TRUE)
## Warning in .local(x, ..., value): Number of unique 'sample_id's is 2, but 3 were provided.
## Overwriting
ve$sample_id <- sample(c("a", "b"), ncol(ve), TRUE)
## Warning in .local(x, ..., value): New 'sample_id's must map uniquely
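
The two conditions can be expressed as a small stand-alone check. The function is_valid_relabel() is a hypothetical illustration in base R, not the validity code used by the package:

```r
# Hypothetical check: 'new' is a valid relabelling of 'old' if the number
# of unique identifiers is unchanged and old/new labels pair up one-to-one.
is_valid_relabel <- function(old, new) {
  pairs <- unique(data.frame(old = old, new = new))
  length(unique(new)) == length(unique(old)) &&
    !anyDuplicated(pairs$old) &&
    !anyDuplicated(pairs$new)
}

old <- rep(c("section1", "section2"), each = 3)

# too many unique identifiers
is_valid_relabel(old, rep(c("a", "b", "c"), 2))
# right number of identifiers, but not a one-to-one mapping
is_valid_relabel(old, rep(c("a", "b"), 3))
# consistent renaming of each sample
is_valid_relabel(old, c("sample1", "sample2")[as.numeric(factor(old))])
```

Only the last call satisfies both conditions.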

Valid replacement will be propagated to the imgData:

tmp <- ve
i <- as.numeric(factor(ve$sample_id))
tmp$sample_id <- c("sample1", "sample2")[i]
imgData(tmp)
## DataFrame with 2 rows and 6 columns
##     sample_id    image_id   data     width    height scaleFactor
##   <character> <character> <list> <integer> <integer>   <numeric>
## 1     sample1      lowres              600       600   0.0516351
## 2     sample2      lowres              600       600   0.0516351

The x/y_coord and in_tissue fields may be modified provided that the former is a two- or three-column numeric matrix, and the latter is a logical vector:

ve$x_coord <- "x"
ve$in_tissue <- "x"
## Warning in .local(x, ..., value): 'in_tissue' field in 'colData' should be
## 'logical'
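
As a rough illustration of these type constraints, the sketch below checks a candidate replacement before assignment. The helper valid_spatial_fields() is hypothetical; it assumes that coordinates must form a two- or three-column numeric matrix and that in_tissue must be a logical vector with one entry per spot:

```r
# Hypothetical pre-check for replacement values (not part of the package)
valid_spatial_fields <- function(coords, in_tissue) {
  is.matrix(coords) && is.numeric(coords) &&
    ncol(coords) %in% c(2, 3) &&
    is.logical(in_tissue) &&
    length(in_tissue) == nrow(coords)
}

# made-up coordinates for three spots
coords <- cbind(x_coord = c(1419, 7409, 1778), y_coord = c(2534, 8455, 4393))

valid_spatial_fields(coords, c(FALSE, TRUE, FALSE))  # acceptable types
valid_spatial_fields(coords, rep("x", 3))            # in_tissue is not logical
valid_spatial_fields(matrix("x", 3, 2), rep(TRUE, 3)) # coordinates are not numeric
```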

colData() <- NULL will retain only the protected fields, i.e. sample_id, x/y_coord, in_tissue and array_row/col:

names(colData(ve))
## [1] "Barcode"   "sample_id" "x_coord"   "y_coord"   "in_tissue" "array_row"
## [7] "array_col"
colData(ve) <- NULL
names(colData(ve))
## [1] "sample_id" "x_coord"   "y_coord"   "in_tissue" "array_row" "array_col"

Visualization


  1. array_rows range from 0-77 (78 rows); array_cols are even in 0-126 for even rows, and odd in 1-127 for odd rows (64 columns), giving \(78 \times 64 = 4,992\) spots per sample.