Here, we will merge two 10X PBMC datasets: one containing 4K cells and one containing 8K cells. The datasets can be found here.
First, we read in the data and create two Seurat objects.
library(Seurat)
pbmc4k.data <- Read10X(data.dir = "../data/pbmc4k/filtered_gene_bc_matrices/GRCh38/")
pbmc4k <- CreateSeuratObject(counts = pbmc4k.data, project = "PBMC4K")
pbmc4k
## An object of class Seurat
## 33694 features across 4340 samples within 1 assay
## Active assay: RNA (33694 features, 0 variable features)
pbmc8k.data <- Read10X(data.dir = "../data/pbmc8k/filtered_gene_bc_matrices/GRCh38/")
pbmc8k <- CreateSeuratObject(counts = pbmc8k.data, project = "PBMC8K")
pbmc8k
## An object of class Seurat
## 33694 features across 8381 samples within 1 assay
## Active assay: RNA (33694 features, 0 variable features)
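Merging two Seurat objects
merge() combines the raw count matrices of two Seurat objects and creates a new Seurat object from the combined matrix. The add.cell.ids argument prepends an identifier to each cell name so you can tell which object a cell came from; the original project name remains available in the orig.ident metadata column.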
pbmc.combined <- merge(pbmc4k, y = pbmc8k, add.cell.ids = c("4K", "8K"), project = "PBMC12K")
pbmc.combined
## An object of class Seurat
## 33694 features across 12721 samples within 1 assay
## Active assay: RNA (33694 features, 0 variable features)
# notice the cell names now have an added identifier
head(colnames(pbmc.combined))
## [1] "4K_AAACCTGAGAAGGCCT-1" "4K_AAACCTGAGACAGACC-1" "4K_AAACCTGAGATAGTCA-1"
## [4] "4K_AAACCTGAGCGCCTCA-1" "4K_AAACCTGAGGCATGGT-1" "4K_AAACCTGCAAGGTTCT-1"
table(pbmc.combined$orig.ident)
##
## PBMC4K PBMC8K
##   4340   8381
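To make the role of add.cell.ids concrete, here is a minimal base-R sketch of the idea on toy matrices. It is not Seurat's implementation: the matrices, the barcodes, and the helper `add_cell_id` are hypothetical and exist only for illustration. It assumes two samples reuse the same barcode, which would produce duplicate cell names if the matrices were combined without a prefix.

```r
# Toy count matrices (genes x cells); both samples reuse the barcode "AAAC-1".
counts_a <- matrix(c(1, 0, 2, 3), nrow = 2,
                   dimnames = list(c("GeneA", "GeneB"), c("AAAC-1", "AAAG-1")))
counts_b <- matrix(c(0, 5, 4, 0), nrow = 2,
                   dimnames = list(c("GeneA", "GeneB"), c("AAAC-1", "AAAT-1")))

# Hypothetical helper: prepend a sample identifier to every cell name.
add_cell_id <- function(counts, id) {
  colnames(counts) <- paste(id, colnames(counts), sep = "_")
  counts
}

# Column-bind the prefixed matrices, one column per cell.
merged <- cbind(add_cell_id(counts_a, "4K"), add_cell_id(counts_b, "8K"))

colnames(merged)
anyDuplicated(colnames(merged))  # 0 means every cell name is unique
```

Without the prefix, "AAAC-1" would appear twice in the combined matrix; with it, the two cells become "4K_AAAC-1" and "8K_AAAC-1".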
Merging more than two Seurat objects
To merge more than two objects, pass a vector of objects to the y parameter. We demonstrate this with the 4K and 8K PBMC datasets together with our previously computed Seurat object of 2,700 PBMCs.
library(SeuratData)
InstallData("pbmc3k")
pbmc3k <- LoadData("pbmc3k", type = "pbmc3k.final")
pbmc3k
## An object of class Seurat
## 13714 features across 2638 samples within 1 assay
## Active assay: RNA (13714 features, 2000 variable features)
## 2 dimensional reductions calculated: pca, umap
pbmc.big <- merge(pbmc3k, y = c(pbmc4k, pbmc8k), add.cell.ids = c("3K", "4K", "8K"), project = "PBMC15K")
pbmc.big
## An object of class Seurat
## 34230 features across 15359 samples within 1 assay
## Active assay: RNA (34230 features, 0 variable features)
head(colnames(pbmc.big))
## [1] "3K_AAACATACAACCAC" "3K_AAACATTGAGCTAC" "3K_AAACATTGATCAGC"
## [4] "3K_AAACCGTGCTTCCG" "3K_AAACCGTGTATGCG" "3K_AAACGCACTGGTAC"
tail(colnames(pbmc.big))
## [1] "8K_TTTGTCAGTTACCGAT-1" "8K_TTTGTCATCATGTCCC-1" "8K_TTTGTCATCCGATATG-1"
## [4] "8K_TTTGTCATCGTCTGAA-1" "8K_TTTGTCATCTCGAGTA-1" "8K_TTTGTCATCTGCTTGC-1"
unique(sapply(X = strsplit(colnames(pbmc.big), split = "_"), FUN = "[", 1))
## [1] "3K" "4K" "8K"
table(pbmc.big$orig.ident)
## pbmc3k PBMC4K PBMC8K
##   2638   4340   8381
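The merged object above reports 34230 features, more than either input (13714 and 33694). That is consistent with merge() keeping the union of features across objects and treating a feature that is absent from one object as zero counts there. The base-R sketch below illustrates that idea on toy matrices. It is not Seurat's code, and the helper `expand_features` and all gene and cell names are hypothetical.

```r
# Toy count matrices with partially overlapping gene sets.
counts_x <- matrix(c(1, 2, 3, 4), nrow = 2,
                   dimnames = list(c("GeneA", "GeneB"), c("3K_cell1", "3K_cell2")))
counts_y <- matrix(c(5, 6, 7, 8), nrow = 2,
                   dimnames = list(c("GeneB", "GeneC"), c("4K_cell1", "4K_cell2")))

# Hypothetical helper: expand a matrix to a given feature set,
# filling features that are absent from the input with zeros.
expand_features <- function(counts, features) {
  out <- matrix(0, nrow = length(features), ncol = ncol(counts),
                dimnames = list(features, colnames(counts)))
  out[rownames(counts), ] <- counts
  out
}

# Union of features, then column-bind the expanded matrices.
all_features <- union(rownames(counts_x), rownames(counts_y))
merged <- cbind(expand_features(counts_x, all_features),
                expand_features(counts_y, all_features))
merged
```

The toy result has three rows (GeneA, GeneB, GeneC), and the cells from each input carry zeros for the gene they never measured.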
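Merging based on normalized data
By default, merge() combines objects based on their raw count matrices only. If you want to merge the normalized data matrices as well as the raw count matrices, pass merge.data = TRUE. This is appropriate when the same normalization approach was applied to all objects.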
pbmc4k <- NormalizeData(pbmc4k)
pbmc8k <- NormalizeData(pbmc8k)
pbmc.normalized <- merge(pbmc4k, y = pbmc8k, add.cell.ids = c("4K", "8K"), project = "PBMC12K",
    merge.data = TRUE)
GetAssayData(pbmc.combined)[1:10, 1:15]
## 10 x 15 sparse Matrix of class "dgCMatrix"
##
## RP11-34P13.3  . . . . . . . . . . . . . . .
## FAM138A       . . . . . . . . . . . . . . .
## OR4F5         . . . . . . . . . . . . . . .
## RP11-34P13.7  . . . . . . . . . . . . . . .
## RP11-34P13.8  . . . . . . . . . . . . . . .
## RP11-34P13.14 . . . . . . . . . . . . . . .
## RP11-34P13.9  . . . . . . . . . . . . . . .
## FO538757.3    . . . . . . . . . . . . . . .
## FO538757.2    . . . . . . . . . 1 . . . . .
## AP006222.2    . . . . . . . . . . . 1 . . .
GetAssayData(pbmc.normalized)[1:10, 1:15]
## 10 x 15 sparse Matrix of class "dgCMatrix"
##
## RP11-34P13.3  . . . . . . . . . . . . . . .
## FAM138A       . . . . . . . . . . . . . . .
## OR4F5         . . . . . . . . . . . . . . .
## RP11-34P13.7  . . . . . . . . . . . . . . .
## RP11-34P13.8  . . . . . . . . . . . . . . .
## RP11-34P13.14 . . . . . . . . . . . . . . .
## RP11-34P13.9  . . . . . . . . . . . . . . .
## FO538757.3    . . . . . . . . . . . . . . .
## FO538757.2    . . . . . . . . . 0.7721503 . . . . .
## AP006222.2    . . . . . . . . . . . 1.087928 . . .
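The two printouts differ only in scale: the raw count of 1 becomes a log-normalized value. The base-R sketch below assumes the default NormalizeData() settings (the "LogNormalize" method with a scale factor of 10000), under which each count is divided by its cell's total, multiplied by the scale factor, and passed through log1p. Because that calculation uses only values from the same cell, normalizing before or after combining cells gives the same numbers, which is why carrying normalized data through a merge is reasonable when every object was normalized the same way. The helper `log_normalize` and the toy matrices are hypothetical, not Seurat code.

```r
# Hypothetical helper mirroring the LogNormalize formula:
# log1p(count / total counts in that cell * scale factor).
log_normalize <- function(counts, scale_factor = 10000) {
  log1p(sweep(counts, 2, colSums(counts), "/") * scale_factor)
}

# Toy count matrices (genes x cells) for two samples.
counts_a <- matrix(c(1, 3, 0, 4), nrow = 2,
                   dimnames = list(c("GeneA", "GeneB"), c("4K_c1", "4K_c2")))
counts_b <- matrix(c(2, 2, 5, 0), nrow = 2,
                   dimnames = list(c("GeneA", "GeneB"), c("8K_c1", "8K_c2")))

# Normalize each sample and then combine, versus combine and then normalize.
norm_then_merge <- cbind(log_normalize(counts_a), log_normalize(counts_b))
merge_then_norm <- log_normalize(cbind(counts_a, counts_b))

all.equal(norm_then_merge, merge_then_norm)
```

The equivalence would not hold for a normalization that depends on other cells in the object, so this sketch says nothing about such methods.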