Single-Cell Trajectory Analysis 7: Seurat + scVelo


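By the time you compute RNA velocity, the data has usually already been processed: dimensionality reduction, clustering, differential expression analysis, and so on. The practical question is therefore how to display the RNA velocity results together with the earlier results, rather than re-running dimensionality reduction and clustering on the same data inside the velocity workflow.
Approach: convert the Seurat object to h5ad, read the loom files generated by velocyto (or kb-python), merge the two in Python, and then run the velocity analysis with scVelo.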

1. Introduction

Software required:

  • scVelo (For RNA Velocity)
  • Velocyto or Kallisto Bustools (To produce our initial RNA Velocity Object)
  • Anndata (For manipulation of our RNA Velocity object)
  • Seurat
  • Samtools -- optional (Velocyto will run Samtools sort on unsorted .bam)

2. Generating the loom files

The loom files are generated from FASTQ files (with kb-python) or from BAM files (with velocyto).


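# Option A: kb-python (kallisto | bustools), starting from FASTQ files
pip install git+https://github.com/pachterlab/kb_python@devel
# Build the cDNA + intron index used by the lamanno (RNA velocity) workflow
kb ref -i index.idx -g t2g.txt -f1 cdna.fa -f2 intron.fa -c1 cdna_t2c.txt -c2 intron_t2c.txt --workflow lamanno -n 4 \
fasta.fa \
gtf.gtf
# Quantify spliced/unspliced counts and write a loom file.
# -i must point to the index file(s) that kb ref actually wrote; the name here differs from the kb ref call above, so adjust it to your files.
kb count -i transcriptome.idx -g t2g.txt -x 10xv2 --workflow lamanno --loom -c1 cdna_t2c.txt -c2 intron_t2c.txt read_1.fastq.gz read_2.fastq.gz

# Option B: velocyto, starting from a BAM file
# Install the dependencies first
conda install numpy scipy cython numba matplotlib scikit-learn h5py click
pip install velocyto
# -b: filtered barcodes, -o: output directory, -m: repeat-mask GTF, followed by the BAM file and the gene annotation GTF
velocyto run -b filtered_barcodes.tsv -o output_path -m repeat_msk_srt.gtf bam_file.bam annotation.gtf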

3. Reading the Seurat object and the loom files

The Seurat object first has to be converted to h5ad format; see the post on converting between Seurat, SingleCellExperiment, and scanpy objects.

# Data conversion: Seurat object -> h5Seurat -> h5ad
library(scater)
library(Seurat)
library(SeuratData)
# remotes::install_github("mojaveazure/seurat-disk")
library(SeuratDisk)
library(patchwork)
pbmc <- readRDS("pbmc.rds")
# Write pbmc.h5Seurat, then convert it to pbmc.h5ad
SaveH5Seurat(pbmc, filename = "pbmc.h5Seurat")
Convert("pbmc.h5Seurat", dest = "h5ad")

Read the integrated Seurat object:

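import anndata
import scvelo as scv
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import scanpy as sc
# Jupyter/IPython magic; only needed if R cells are run from the same notebook
%load_ext rpy2.ipython

# Load the converted Seurat object
adata = sc.read_h5ad('pbmc.h5ad')
# Store the cluster labels as a categorical so they are plotted as discrete groups
adata.obs['seurat_clusters'] = adata.obs['seurat_clusters'].astype('category')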

Read the loom file of each sample:

data1 = anndata.read_loom("data1.loom")
data2 = anndata.read_loom("data2.loom")
data3 = anndata.read_loom("data3.loom")
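
4. Renaming the loom cell IDs to match the Seurat cell IDs

# velocyto names cells "<sample>:<barcode>x"; keep only the part after the colon
barcodes = [bc.split(':')[1] for bc in data1.obs.index.tolist()]
# Drop the trailing "x" and append the suffix this sample carries in the Seurat object
barcodes = [bc[0:len(bc)-1] + '-1_1' for bc in barcodes]
data1.obs.index = barcodes
# Make duplicated gene names unique
data1.var_names_make_unique()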

Apply the same steps to data2 and data3, using the suffix that each sample carries in the Seurat object.
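
Because the same renaming is repeated for every sample, it can be wrapped in a small helper. The sketch below is plain Python and assumes velocyto-style IDs of the form `<sample>:<barcode>x`. The function name `rename_barcodes` and the example IDs are made up for illustration, and unlike the code above it only strips the last character when it really is an `x`.

```python
def rename_barcodes(obs_names, suffix):
    """Convert velocyto cell IDs ("<sample>:<barcode>x") to "<barcode><suffix>"."""
    renamed = []
    for name in obs_names:
        barcode = name.split(':')[1]      # drop the "<sample>:" prefix
        if barcode.endswith('x'):         # drop the trailing "x" added by velocyto
            barcode = barcode[:-1]
        renamed.append(barcode + suffix)
    return renamed


# Example with made-up IDs
example_ids = ["data1:AAACCTGAGAAACCATx", "data1:AAACCTGCAGTCGATTx"]
print(rename_barcodes(example_ids, "-1_1"))
```

With real data this would be used as `data2.obs.index = rename_barcodes(data2.obs.index.tolist(), suffix)`, where `suffix` is whatever the Seurat object uses for that sample.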

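5. Merging the loom files

# Note: by default concatenate() appends a batch suffix ("-0", "-1", ...) to every cell ID;
# passing index_unique=None keeps the IDs set in step 4 unchanged
ldata = data1.concatenate([data2, data3])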
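
6. Merging the loom data with the Seurat metadata

# Combine the Seurat-derived object with the spliced/unspliced layers from the loom data
adata = scv.utils.merge(adata, ldata)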
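
Since the merge depends on the two objects sharing cell IDs, a quick overlap check (run before the merge) can catch a wrong suffix early. This is a minimal sketch in plain Python; the helper name `barcode_overlap` and the IDs are hypothetical, and in practice the inputs would be `adata.obs_names` and `ldata.obs_names`.

```python
def barcode_overlap(seurat_ids, loom_ids):
    """Return the shared IDs and the fraction of Seurat cells found in the loom data."""
    seurat_set = set(seurat_ids)
    shared = seurat_set & set(loom_ids)
    fraction = len(shared) / len(seurat_set) if seurat_set else 0.0
    return shared, fraction


shared, fraction = barcode_overlap(
    ["AAAC-1_1", "AAAG-1_1", "AAAT-1_2"],   # made-up Seurat cell IDs
    ["AAAC-1_1", "AAAT-1_2", "CCCC-1_3"],   # made-up renamed loom cell IDs
)
print(f"{len(shared)} shared cells ({fraction:.0%} of the Seurat object)")
```

A fraction close to zero usually means the renaming in step 4 did not produce the IDs that Seurat uses.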

Plot a UMAP as a quick check:

sc.pl.umap(adata, color='celltype', frameon=False, legend_loc='on data', title='', save='_celltypes.pdf')

Optionally, set colors for cell types, samples, clusters, and so on.
(The key is the name of the obs column followed by "_colors".)

adata.uns['Group_colors'] = np.array(["#66c2a5", "#8da0cb", "#e78ac3"])
adata.uns['celltype_colors'] = np.array(["#33a02c", "#b2df8a", "#a6cee3", "#fb9a99", "#cab2d6"])
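
As far as I know, scanpy reads a `_colors` entry in the order of the categories of the matching obs column, so the array has to follow that order. The sketch below builds such an ordered list from a label-to-color mapping. It is plain Python, and the helper name `ordered_palette` and the labels are made up for illustration.

```python
def ordered_palette(categories, color_by_label):
    """Return the colors in the same order as the category labels."""
    missing = [label for label in categories if label not in color_by_label]
    if missing:
        raise KeyError(f"no color defined for: {missing}")
    return [color_by_label[label] for label in categories]


# Made-up labels; with real data use list(adata.obs['celltype'].cat.categories)
categories = ["B", "CD4T", "CD8T"]
color_by_label = {"CD4T": "#33a02c", "CD8T": "#b2df8a", "B": "#a6cee3"}
print(ordered_palette(categories, color_by_label))
```

The result can then be stored with `adata.uns['celltype_colors'] = np.array(ordered_palette(categories, color_by_label))`.
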
7. scVelo analysis

Follow the standard scVelo workflow.
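
For orientation, the usual scVelo steps can be sketched as below. This is only an outline under assumptions: the wrapper name `run_velocity`, the stochastic mode, and the `basis` and `color` defaults are my choices, not something fixed by this post, and the function expects the merged `adata` from step 6 with a UMAP embedding and a `celltype` column.

```python
def run_velocity(adata, basis="umap", color="celltype"):
    """Sketch of the standard scVelo steps on a merged AnnData object."""
    import scvelo as scv  # imported here so the function can be defined without scVelo installed

    # Filter genes, normalize, and log-transform the counts
    scv.pp.filter_and_normalize(adata)
    # Compute first- and second-order moments among nearest neighbors
    scv.pp.moments(adata)
    # Estimate velocities (stochastic mode) and build the velocity graph
    scv.tl.velocity(adata, mode="stochastic")
    scv.tl.velocity_graph(adata)
    # Project the velocities onto the embedding that came from Seurat
    scv.pl.velocity_embedding_stream(adata, basis=basis, color=color)
    return adata
```

Calling `run_velocity(adata)` keeps the Seurat embedding and cluster labels and only adds the velocity layers and graph on top of them.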

8. Analyzing a subset of cell types

cur_celltypes = ['CD4T', 'CD8T', 'Treg', 'Tnaive']
# .copy() gives an independent AnnData object instead of a view
adata_subset = adata[adata.obs['celltype'].isin(cur_celltypes)].copy()
sc.pl.umap(adata_subset, color=['celltype', 'condition'], frameon=False, title=['', ''])

# Recompute the neighbor graph on the subset
sc.pp.neighbors(adata_subset, n_neighbors=15, use_rep='X_pca')
# Pre-process
scv.pp.filter_and_normalize(adata_subset)
scv.pp.moments(adata_subset)

The downstream analysis is the same as in the standard scVelo workflow.


References:
scVelo GitHub repository: https://github.com/theislab/scvelo
scVelo official documentation: https://scvelo.readthedocs.io/index.html
Seurat to RNA-Velocity tutorial: https://github.com/basilkhuder/Seurat-to-RNA-Velocity#multiple-sample-integration
scVelo hands-on tutorial: https://smorabit.github.io/tutorials/8_velocyto/
RNA velocity: applying scVelo
