在運(yùn)行ESTIMATE包的時(shí)候,需要先把表達(dá)矩陣輸出成txt,然后運(yùn)行filterCommonGenes函數(shù),轉(zhuǎn)換成gct文件,在轉(zhuǎn)換這一步總是報(bào)錯(cuò),錯(cuò)誤提示是:
> filterCommonGenes(input.f="./step_7_estimate/exp.txt",
+ output.f="exp_genes.gct",
+ id="GeneSymbol")
[1] "Merged dataset includes 0 genes (10412 mismatched)."
Error in dimnames(x) <- dn :
length of 'dimnames' [2] not equal to array extent
排除了變量類型(numric)的問題,打開txt文件看了一眼,發(fā)現(xiàn)行名和列名多了很多引號(hào),于是回過頭去看輸出txt的代碼:
write.table(a,file="./step_7_estimate/exp.txt")
突發(fā)奇想加上了sep="\t",于是不報(bào)錯(cuò)了,但是依然顯示沒有merge到基因:
> write.table(a,file="./step_7_estimate/exp.txt",sep="\t")
> filterCommonGenes(input.f="./step_7_estimate/exp.txt",
+ output.f="exp_genes.gct",
+ id="GeneSymbol")
[1] "Merged dataset includes 0 genes (10412 mismatched)."
但是手動(dòng)grep一下,發(fā)現(xiàn)很多基因其實(shí)是可以找到的
> grep("DCN",rownames(a))
[1] 4783
> grep("THBS2",rownames(a))
[1] 1903
于是去看了filterCommonGenes這個(gè)函數(shù)的源碼
> filterCommonGenes
function (input.f, output.f, id = c("GeneSymbol", "EntrezID"))
{
stopifnot((is.character(input.f) && length(input.f) == 1 &&
nzchar(input.f)) || (inherits(input.f, "connection") &&
isOpen(input.f, "r")))
stopifnot((is.character(output.f) && length(output.f) ==
1 && nzchar(output.f)))
id <- match.arg(id)
input.df <- read.table(input.f, header = TRUE, row.names = 1,
sep = "\t", quote = "", stringsAsFactors = FALSE)
merged.df <- merge(common_genes, input.df, by.x = id, by.y = "row.names")
rownames(merged.df) <- merged.df$GeneSymbol
merged.df <- merged.df[, -1:-ncol(common_genes)]
print(sprintf("Merged dataset includes %d genes (%d mismatched).",
nrow(merged.df), nrow(common_genes) - nrow(merged.df)))
outputGCT(merged.df, output.f)
}
<bytecode: 0x000002a43c866268>
<environment: namespace:estimate>
把變量名賦值好之后,嘗試分行run一下,在讀取表達(dá)矩陣這一步發(fā)現(xiàn)了問題
> input.df <- read.table(input.f, header = TRUE, row.names = 1,
+ sep = "\t", quote = "", stringsAsFactors = FALSE)
可以看到讀取的時(shí)候其實(shí)是有很多參數(shù)的,除了sep="\t"之外,還有quete=“”,而且發(fā)現(xiàn)讀進(jìn)來的表達(dá)矩陣,行名和列名都有引號(hào)。所以可能是引號(hào)的問題。
在write.table函數(shù)里加上quote=“”這個(gè)參數(shù)看一下
write.table(a,file="./step_7_estimate/exp.txt",sep="\t",
+ quote = F)
> filterCommonGenes(input.f="./step_7_estimate/exp.txt",
+ output.f="exp_genes.gct",
+ id="GeneSymbol")
[1] "Merged dataset includes 7099 genes (3313 mismatched)."
問題解決了,merge到3313個(gè)基因。