Single-cell data integration 1: Harmony principles and the official tutorial

Contents:
1. Principles
2. Walkthrough
3. Does running Harmony affect differential expression analysis?
4. Summary of multi-sample batch-correction methods

1. Principles

Official site: https://github.com/immunogenomics/harmony
(Harmony requires R 3.4 or later and supports Linux, OS X, and Windows.)
Paper: https://www.biorxiv.org/content/early/2018/11/04/461954
Advantages of the Harmony algorithm over other integration algorithms:

(1) It integrates datasets while remaining sensitive to rare cell populations;
(2) It is memory-efficient;
(3) It suits more complex single-cell experimental designs and can compare cells from different donors, tissues, and technology platforms.

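How it works: in the schematic, colors denote datasets and shapes denote cell types. Harmony first applies principal component analysis (PCA) to embed the transcriptome expression profiles in a low-dimensional space, then applies an iterative procedure to remove dataset-specific effects.

(A) Harmony probabilistically assigns cells to clusters so that the diversity of datasets within each cluster is maximized.
(B) For each cluster, Harmony computes a global centroid across all datasets as well as dataset-specific centroids.
(C) Within each cluster, Harmony computes a correction factor for each dataset based on these centroids.
(D) Finally, Harmony corrects each cell with a cell-specific factor derived from (C). Because Harmony uses soft clustering, each cell's correction is a linear combination of the per-cluster correction factors, weighted by its soft cluster assignments from (A).
Steps A to D are repeated until convergence. The dependence between cluster assignment and dataset decreases with each round.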
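
Steps A to D can be illustrated with a deliberately simplified base R sketch. It is not the harmony package's implementation: it uses plain Euclidean soft k-means without the diversity penalty, subtracts centroid offsets instead of fitting Harmony's regression model, and runs on a made-up 2-D embedding. The function names (soft_assign, harmony_sketch), the toy data, and the parameter values are all assumptions chosen for illustration.

```r
# Toy illustration of steps A-D (NOT the harmony package's implementation).
set.seed(1)

# Made-up 2-D embedding: 2 cell types x 2 datasets; dataset 2 is shifted along dim 2.
n <- 50
type_centers <- rbind(c(-3, 0), c(3, 0))
batch_shift  <- rbind(c(0, 0), c(0, 2))
cell_type <- rep(rep(1:2, each = n), times = 2)
batch     <- rep(1:2, each = 2 * n)
Z <- type_centers[cell_type, ] + batch_shift[batch, ] +
  matrix(rnorm(4 * n * 2, sd = 0.3), ncol = 2)

# (A) Soft assignment: each cell gets a probability for every cluster.
# Assumption: plain Euclidean soft k-means, without Harmony's diversity penalty.
soft_assign <- function(Z, centers, sigma) {
  d2 <- sapply(seq_len(nrow(centers)), function(k) {
    rowSums(sweep(Z, 2, centers[k, ])^2)
  })
  R <- exp(-(d2 - apply(d2, 1, min)) / sigma)  # subtract row minimum for stability
  R / rowSums(R)
}

harmony_sketch <- function(Z, batch, centers, sigma = 1, n_iter = 5) {
  for (iter in seq_len(n_iter)) {
    R <- soft_assign(Z, centers, sigma)                 # (A)
    shift <- matrix(0, nrow(Z), ncol(Z))
    for (k in seq_len(nrow(centers))) {
      w <- R[, k]
      centers[k, ] <- colSums(w * Z) / sum(w)           # (B) global centroid
      for (b in unique(batch)) {
        in_b <- batch == b
        if (sum(w[in_b]) < 1e-8) next                   # dataset absent from this cluster
        # (B) dataset-specific centroid within cluster k
        mu_b <- colSums(w[in_b] * Z[in_b, , drop = FALSE]) / sum(w[in_b])
        offset <- mu_b - centers[k, ]                   # (C) correction factor
        # (D) weight the factor by each cell's soft assignment to cluster k
        shift[in_b, ] <- shift[in_b, ] + outer(w[in_b], offset)
      }
    }
    Z <- Z - shift                                      # (D) apply cell-specific correction
  }
  Z
}

# Initial centroids: one cell of each type (hypothetical choice for the toy data).
Z_corr <- harmony_sketch(Z, batch, centers = Z[c(1, n + 1), ])

# Gap between dataset means along dim 2, before and after correction.
gap <- function(M) abs(mean(M[batch == 1, 2]) - mean(M[batch == 2, 2]))
print(c(before = gap(Z), after = gap(Z_corr)))
```

On this toy data the gap between the two datasets along the second dimension should shrink to nearly zero, while the separation between the two cell types along the first dimension is preserved.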

2. Walkthrough

Installing the R package

library(devtools)
install_github("immunogenomics/harmony")

Installation may involve compiling C++ code from source, so it can take a few minutes.

Download the example sparse matrices (https://www.dropbox.com/s/t06tptwbyn7arb6/pbmc_stim.RData?dl=1)

library(Seurat)
library(cowplot)
library(harmony)
library(dplyr) # provides the %>% pipe in case it is not already attached
load('data/pbmc_stim.RData') # load the sparse matrices stim.sparse and ctrl.sparse
# Before running Harmony, create a Seurat object and run the standard workflow up to PCA.
# %>% is the pipe operator: it passes the value on its left to the expression on its right as that function's first argument.
pbmc <- CreateSeuratObject(counts = cbind(stim.sparse, ctrl.sparse), project = "PBMC", min.cells = 5) %>%
    Seurat::NormalizeData(verbose = FALSE) %>%
    FindVariableFeatures(selection.method = "vst", nfeatures = 2000) %>%
    ScaleData(verbose = FALSE) %>%
    RunPCA(npcs = 20, verbose = FALSE) # the Seurat v2-style pc.genes argument is dropped; RunPCA uses the variable features by default
pbmc@meta.data$stim <- c(rep("STIM", ncol(stim.sparse)), rep("CTRL", ncol(ctrl.sparse))) # assign the condition variable

There are clear differences between the datasets in the uncorrected PCs:

options(repr.plot.height = 5, repr.plot.width = 12)
p1 <- DimPlot(object = pbmc, reduction = "pca", pt.size = .1, group.by = "stim")
p2 <- VlnPlot(object = pbmc, features = "PC_1", group.by = "stim", pt.size = .1)
plot_grid(p1,p2)

Run Harmony

The simplest way to run Harmony is to pass the Seurat object and specify the variable(s) to integrate over. RunHarmony returns a Seurat object that also holds the corrected Harmony coordinates, which are then used in place of PCA. Setting plot_convergence to TRUE lets us check that the Harmony objective function improves with each round.

Main parameters of RunHarmony:

group.by.vars: the metadata variable(s) to integrate over.

max.iter.harmony: the maximum number of iterations; the default is 10. RunHarmony reports after how many iterations it converged.

lambda: ridge regression penalty; the default is 1. It determines how strongly Harmony integrates: decreasing lambda strengthens the correction, and increasing it weakens it. This is the main parameter to tune for integration strength, usually within the range 0.5 to 2.

theta: Diversity clustering penalty parameter. Specify for each variable in group.by.vars. Default theta=2. theta=0 does not encourage any diversity. Larger values of theta result in more diverse clusters. I usually keep the default, but published studies often use different values.

dims.use: Which PCA dimensions to use for Harmony. By default, use all.

sigma: Width of soft kmeans clusters. Default sigma=0.1. Sigma scales the distance from a cell to cluster centroids. Larger values of sigma result in cells assigned to more clusters. Smaller values of sigma make soft kmeans cluster approach hard clustering.
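
The effect of two of these parameters can be illustrated without running Harmony at all. The base R sketch below assumes soft k-means weights proportional to exp(-d / sigma) and models lambda as an ordinary ridge penalty on per-dataset offsets with an unpenalized intercept. The helper names (soft_kmeans_weights, ridge_batch_offsets), the distances, and the toy values are hypothetical, and this is not harmony's internal code.

```r
# sigma: width of the soft k-means clusters.
# Assumption: weights proportional to exp(-d / sigma), d = distance to each centroid.
soft_kmeans_weights <- function(d, sigma) {
  w <- exp(-(d - min(d)) / sigma)   # subtract the minimum for numerical stability
  w / sum(w)
}
d <- c(0.1, 0.3, 1.5)               # hypothetical distances from one cell to 3 centroids
w_small <- soft_kmeans_weights(d, sigma = 0.05)  # close to a hard assignment
w_large <- soft_kmeans_weights(d, sigma = 1)     # weight spread over several clusters
print(round(rbind(w_small, w_large), 3))

# lambda: ridge penalty on the per-dataset offsets (illustrative model only).
ridge_batch_offsets <- function(z, batch, lambda) {
  Phi <- cbind(1, model.matrix(~ 0 + factor(batch)))  # intercept + one-hot dataset
  penalty <- diag(c(0, rep(lambda, ncol(Phi) - 1)))   # the intercept is not penalized
  W <- solve(crossprod(Phi) + penalty, crossprod(Phi, z))
  as.numeric(W[-1, ])                                 # estimated offset per dataset
}
z <- c(0.1, -0.2, 0.0, 2.1, 1.9, 2.2)  # toy 1-D embedding; dataset 2 shifted by about 2
toy_batch <- c(1, 1, 1, 2, 2, 2)
off_small <- ridge_batch_offsets(z, toy_batch, lambda = 0.5)
off_large <- ridge_batch_offsets(z, toy_batch, lambda = 2)
print(rbind(off_small, off_large))
```

A smaller sigma concentrates a cell's weight on its nearest cluster, and a larger lambda shrinks the estimated dataset offsets, which corresponds to a more conservative correction.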

options(repr.plot.height = 2.5, repr.plot.width = 6)
pbmc <- pbmc %>%
RunHarmony("stim", plot_convergence = TRUE) #Harmony converged after 8 iterations

The Harmony results are stored in:

pbmc@reductions$harmony

Use the Embeddings command to access the new Harmony embeddings.

harmony_embeddings <- Embeddings(pbmc, 'harmony')
harmony_embeddings[1:5, 1:5]

Let's confirm that the datasets are well integrated in the first two dimensions after running Harmony.

options(repr.plot.height = 5, repr.plot.width = 12)
p1 <- DimPlot(object = pbmc, reduction = "harmony", pt.size = .1, group.by = "stim")
p2 <- VlnPlot(object = pbmc, features = "harmony_1", group.by = "stim", pt.size = .1)
plot_grid(p1,p2)

Downstream analysis

Many downstream analyses are performed on low-dimensional embeddings rather than on gene expression. To use the corrected Harmony embeddings instead of the PCs, set reduction = 'harmony'.

pbmc <- pbmc %>%
    RunUMAP(reduction = "harmony", dims = 1:20) %>%
    FindNeighbors(reduction = "harmony", dims = 1:20) %>%
    FindClusters(resolution = 0.5) %>%
    identity()

In the UMAP embedding we can see more intricate structure. Because we used the Harmony embeddings, the datasets are well mixed in the UMAP embedding.

options(repr.plot.height = 4, repr.plot.width = 10)
DimPlot(pbmc, reduction = "umap", group.by = "stim", pt.size = .1, split.by = 'stim')

t-SNE analysis

pbmc=RunTSNE(pbmc,reduction = "harmony", dims = 1:20)
TSNEPlot(object = pbmc, pt.size = 0.5, label = TRUE,split.by='stim')

Combined t-SNE and UMAP plots of the two samples

DimPlot(pbmc, reduction = "umap",pt.size = .1,  label = TRUE)
TSNEPlot(pbmc, pt.size = .1, label = TRUE)

From here, you can look for differentially expressed genes and annotate the cells.

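3. Does running Harmony affect differential expression analysis?

Harmony takes the data in scRNA@reductions$pca as input and stores its result in scRNA@reductions$harmony (here scRNA is the Seurat object, called pbmc in the walkthrough above).

Differential expression analysis, by contrast, uses the expression data in scRNA@assays$RNA@counts, which Harmony does not modify, so the two do not interfere with each other.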

4. Summary of multi-sample batch-correction methods

Tool	Language (R = R, P = Python)	Batch-effect-corrected output	Method
Seurat2	R	Normalized canonical components	Canonical correlation analysis and dynamic time warping
Seurat3	R	Normalized gene expression matrix	CCA and mutual nearest neighbors-anchors
Harmony	R	Normalized feature reduction vectors	Iterative clustering in dimensionally reduced space
MNN Correct	R	Normalized gene expression matrix	Mutual nearest neighbor in gene expression space
fastMNN	R	Normalized principal components	MNN in dimensionally reduced space
ComBat	R	Normalized gene expression matrix	Adjusts for known batches using an empirical Bayesian framework
limma	R	Normalized gene expression matrix	Linear model/empirical Bayes model
scGen	P	Normalized gene expression matrix	Variational auto-encoders neural network model and latent space
Scanorama	R/P	Normalized gene expression matrix	Mutual nearest neighbor and panoramic stitching
MMD-ResNet	P	Normalized principal components	Residual neural network for calibration
ZINB-WaVE	R	Normalized gene expression matrix	Zero-inflated negative binomial model, extension of RUV model
scMerge	R	Normalized gene expression matrix	Stably expressed genes (scSEGs) and RUVIII model
LIGER	R	Normalized feature reduction vectors	Integrative non-negative matrix factorization (iNMF) and joint clustering + quantile alignment
BBKNN	P	Connectivity graph and normalized dimension reduction vectors (UMAP)	Batch balanced k-nearest neighbors
