1. GENIE3 accepts either UMI counts or library-size normalized counts as the input expression matrix; the two give similar results.
SCENIC: single-cell regulatory network inference and clustering
To evaluate to what extent the normalization of the input matrix affects the output of SCENIC, we also ran SCENIC on the Zeisel et al. [9] data set after library-size normalization (using the standard pipeline from scran [27], which performs within-cluster size-factor normalization). The results are highly comparable, both in regards to resulting clusters or cell types (ARI between the cell types obtained from raw UMI counts or normalized counts: 0.90, ARI from normalized counts compared to the author's cell types: 0.87) and to the TFs identifying the groups (26 out of the 30 regulons highlighted in Fig. 1b). Furthermore, during the course of this project we have applied GENIE3 to multiple data sets, some of them having UMI counts (e.g., mouse brain and oligodendrocytes) and others TPM (e.g., human brain and melanoma), and both units provided reliable results.
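As a rough illustration of what library-size normalization does to a UMI count matrix, here is a minimal base-R sketch. It scales each cell to the mean library size, which is an assumption made for brevity and is simpler than the within-cluster size-factor approach of scran mentioned in the quote. The toy matrix and the names `umi`, `size_factors`, and `norm_counts` are made up for this example.

```r
# Toy UMI count matrix: genes in rows, cells in columns (made-up values)
umi <- matrix(c(10, 0, 5,
                20, 2, 8,
                 0, 1, 3),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("geneA", "geneB", "geneC"),
                              c("cell1", "cell2", "cell3")))

# Library size = total UMIs per cell
lib_size <- colSums(umi)

# Simple size factors: each cell's library size relative to the mean
# (assumption: plain scaling, not scran's size-factor estimation)
size_factors <- lib_size / mean(lib_size)

# Divide every column (cell) by its size factor
norm_counts <- sweep(umi, 2, size_factors, "/")

# After scaling, every cell has the same total count
print(colSums(norm_counts))
```

Either `umi` or `norm_counts` could then serve as the expression matrix, which is the comparison the quoted paragraph reports on.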
2. Detailed SCENIC workflow:
Running SCENIC (htmlpreview.github.io)
The relevant part:
## If launched in a new session, you will need to reload...
# setwd("...")
# library(SCENIC)
# library(SCopeLoomR)  # provides open_loom(), get_dgem(), close_loom()
# scenicOptions <- readRDS("int/scenicOptions.Rds")
# loomPath <- "..."
# loom <- open_loom(loomPath)
# exprMat <- get_dgem(loom)
# close_loom(loom)
# genesKept <- loadInt(scenicOptions, "genesKept")
# exprMat_filtered <- exprMat[genesKept,]
# Optional: add log (if it is not logged/normalized already)
exprMat_filtered <- log2(exprMat_filtered+1)
# Run GENIE3
runGenie3(exprMat_filtered, scenicOptions)
So the workflow appears to run GENIE3 on normalized (log-transformed) counts.
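Whether the optional log step is needed depends on what the matrix already contains. Below is a hedged base-R sketch of a check one could run first. The helper `looks_like_raw_counts` is hypothetical (it is not part of SCENIC) and relies on a simple heuristic: a matrix made only of non-negative whole numbers is treated as unlogged counts. This heuristic would not flag normalized but unlogged values such as TPM, so it is a starting point at best.

```r
# Hypothetical helper, not part of SCENIC: TRUE if all values are
# non-negative whole numbers, i.e. the matrix looks like raw counts
looks_like_raw_counts <- function(mat) {
  all(mat >= 0) && all(mat == round(mat))
}

# Toy matrix (made-up values): genes in rows, cells in columns
expr_toy <- matrix(c(0, 2, 5, 9), nrow = 2,
                   dimnames = list(c("geneA", "geneB"),
                                   c("cell1", "cell2")))

# Apply log2(x + 1) only when the input still looks like raw counts
if (looks_like_raw_counts(expr_toy)) {
  expr_toy_log <- log2(expr_toy + 1)
} else {
  expr_toy_log <- expr_toy
}

print(expr_toy_log)
```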
3. SCENIC does not detect repressive regulons
SCENIC: single-cell regulatory network inference and clustering
To build the final regulons, we merge the predicted target genes of each TF module that show enrichment of any motif of the given TF. To detect repression, it is theoretically possible to follow the same approach with the negative-correlated TF modules. However, in the data sets we analyzed, these modules were less numerous and showed very low motif enrichment. For this reason, we finally decided to exclude the detection of direct repression from the workflow and continue only with the positive-correlated targets. The databases used for the analyses presented in this paper are the “18k motif collection” from iRegulon (gene-based motif rankings) for human and mouse. For each species, we used two gene-motif rankings (10 kb around the TSS or 500 bp upstream the TSS), which determine the search space around the TSS.
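The split into positive- and negative-correlated targets described in the quote can be sketched in base R. This is only an illustration under assumptions: the toy expression values are made up, Spearman correlation with a cutoff at zero is a choice made here for simplicity, and the names `tf_expr`, `targets`, and `positive_targets` are hypothetical rather than taken from SCENIC.

```r
# Toy expression of one TF across 6 cells (made-up values)
tf_expr <- c(1, 2, 3, 4, 5, 6)

# Toy expression of candidate target genes (rows) across the same cells
targets <- rbind(
  geneA = c(2, 3, 5, 6, 8, 9),   # rises with the TF
  geneB = c(9, 7, 6, 4, 2, 1),   # falls as the TF rises
  geneC = c(1, 1, 2, 2, 3, 4)    # rises with the TF
)

# Spearman correlation between the TF and each candidate target
tf_cor <- apply(targets, 1,
                function(g) cor(tf_expr, g, method = "spearman"))

# Positively correlated targets are kept (candidate activation);
# negatively correlated ones would form the repression module that is dropped
positive_targets <- names(tf_cor)[tf_cor > 0]
negative_targets <- names(tf_cor)[tf_cor < 0]

print(positive_targets)
print(negative_targets)
```

In this toy case only `geneB` falls into the negative-correlated set, which is the part the authors chose not to carry forward.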
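4. pySCENIC output: the reg.csv file contains the regulons and their target genes. Each row of reg.csv represents one motif and its target genes, and one regulon may correspond to several motifs. The SCENIC workflow takes the union of the target genes across all motifs of a regulon and then scores it with AUCell.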
How to get the list of target genes for one regulon from the output regulon.csv file of ctx · Issue #301 · aertslab/pySCENIC (github.com)
SCENIC: single-cell regulatory network inference and clustering
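The union step in item 4 can be illustrated with a small base-R sketch. The table below is a toy stand-in: its column names, motif names, and gene lists are made up, and the real reg.csv has a different and more complex layout that would need to be parsed first.

```r
# Toy stand-in for the regulon table: one row per enriched motif.
# Column names and values are made up; the real file layout differs.
reg_toy <- data.frame(
  TF    = c("Sox10", "Sox10", "Dlx5"),
  motif = c("motif_1", "motif_2", "motif_3"),
  stringsAsFactors = FALSE
)

# Target genes per motif row, kept as a list of character vectors
reg_toy$targets <- list(
  c("Plp1", "Mbp"),
  c("Mbp", "Mog"),
  c("Gad1", "Gad2")
)

# Group row indices by TF, then take the union of target sets within each TF
rows_by_tf <- split(seq_len(nrow(reg_toy)), reg_toy$TF)
regulons <- lapply(rows_by_tf,
                   function(idx) Reduce(union, reg_toy$targets[idx]))

print(regulons)
```

Each element of `regulons` is then one merged gene set per TF, which is the kind of set that would be passed on for scoring.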