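Starting with cellranger: rename the FASTQ files into the naming format it expects.
scrna-filename.txt holds the sample names, one per line.
The command
cat scrna-filename.txt | while read i ;do (mv ${i}_f1*.gz ${i}_S1_L001_R1_001.fastq.gz;mv ${i}_r2*.gz ${i}_S1_L001_R2_001.fastq.gz);done
fails with:
mv: cannot stat 'HRR573093'$'\r''_r2*.gz': No such file or directory
The $'\r' in the message shows that every name read from the list carries a trailing carriage return, i.e. the list was saved with Windows line endings.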

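dos2unix is a utility that converts files from Windows format to Unix/Linux format. Windows files end each line with \r\n, while Unix and Linux files end each line with \n; dos2unix simply replaces every \r\n in the file with \n.
First install dos2unix, then run:
dos2unix -o scrna-filename.txt scrna-filename.txt
(the -o option writes the converted output over the source file)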
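
For reference, the same conversion can be sketched in Python. This only illustrates what dos2unix does to the list file and is not a replacement for it; the helper name crlf_to_lf is made up for this example, and the demo runs on a throwaway copy rather than the real scrna-filename.txt.

```python
from pathlib import Path
import tempfile


def crlf_to_lf(path):
    # Rewrite the file in place, replacing Windows line endings with Unix ones.
    p = Path(path)
    data = p.read_bytes()
    p.write_bytes(data.replace(b"\r\n", b"\n"))


# Demo on a temporary file that mimics a list saved on Windows.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "scrna-filename.txt"
    demo.write_bytes(b"HRR573093\r\nHRR572950\r\n")
    crlf_to_lf(demo)
    converted = demo.read_bytes()
    sample_ids = converted.decode().split()

print(converted)
print(sample_ids)
```

After the conversion the sample names no longer carry a trailing \r, which is what made mv look for a file that does not exist.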

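Re-running the cat command above now finishes without errors, and the files are renamed from the original ${i}_f1*.gz and ${i}_r2*.gz names to ${i}_S1_L001_R1_001.fastq.gz and ${i}_S1_L001_R2_001.fastq.gz.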
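
The rename step can also be sketched in Python, which makes it easier to fail loudly when a pattern matches nothing. This is a minimal sketch under assumptions: the function name rename_for_cellranger is hypothetical, and the demo file names (the part after _f1 and _r2) are invented placeholders, since the notes only give the glob patterns.

```python
from pathlib import Path
import tempfile


def rename_for_cellranger(directory, sample):
    # Rename <sample>_f1*.gz and <sample>_r2*.gz to the names used in the mv command.
    directory = Path(directory)
    renamed = []
    for pattern, read in ((f"{sample}_f1*.gz", "R1"), (f"{sample}_r2*.gz", "R2")):
        matches = sorted(directory.glob(pattern))
        if len(matches) != 1:
            # Refuse to guess when the pattern matches nothing or several files.
            raise FileNotFoundError(f"{pattern}: expected 1 match, found {len(matches)}")
        target = directory / f"{sample}_S1_L001_{read}_001.fastq.gz"
        matches[0].rename(target)
        renamed.append(target.name)
    return renamed


with tempfile.TemporaryDirectory() as tmp:
    # Empty placeholder files; the suffixes are made up for the demo.
    for name in ("HRR573093_f1.fq.gz", "HRR573093_r2.fq.gz"):
        (Path(tmp) / name).touch()
    # strip() drops a stray "\r" left over from a Windows-edited list.
    result = rename_for_cellranger(tmp, "HRR573093\r".strip())

print(result)
```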

Next comes quality control. I skipped this step, which should be fine.

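Next is cellranger count. This is the most important step: it quantifies genes per cell, and it wraps alignment, quality control, and quantification into a single command.
Example for one sample:
cellranger count --id=HRR572950 --transcriptome=refdata-cellranger-GRCh38-1.2.0 --fastqs=/data1/肝癌單細胞GSA數據-HCC --sample=HRR572950
The reference files have to be downloaded first, because count takes a refdata directory as its --transcriptome.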

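The download reported that the file could not be found, so I downloaded it by hand through the link and uploaded it to the server.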

Good news: it runs.
Bad news: the disk is full again.
I deleted the non-HCC data to free up space.

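Strangely, this morning cellranger ran without "./", but this afternoon it only runs with it.
With ./ it appears to start, but then it reports that the reference has not been indexed, even though I had already built it: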
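
One possible explanation for suddenly needing "./" (an assumption, not something these notes confirm) is that the cellranger directory was on PATH in the morning session but not in the afternoon one. A quick way to check from Python uses shutil.which; the helper name describe_lookup is made up for this sketch.

```python
import shutil


def describe_lookup(program):
    # Report whether a bare command name would be found through PATH.
    found = shutil.which(program)
    if found is None:
        return f"{program}: not on PATH; call it with an explicit path such as ./{program}"
    return f"{program}: resolved to {found}"


print(describe_lookup("cellranger"))
```

If the lookup fails, adding the cellranger directory to PATH in the shell startup file should make the bare command work again in new sessions.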

[error] Your reference doesn't appear to be indexed. Please run the mkreference tool
2023-06-01 08:11:06 Shutting down.
Saving pipestance info to "HRR572950/HRR572950.mri.tgz"
For assistance upload this file to 10x Genomics by running:
cellranger upload <your email> "HRR572950/HRR572950.mri.tgz"
Then the command for building the index was rejected for no obvious reason:

error: The subcommand 'mkref --genome=GRCh38' wasn't recognized
Did you mean 'mkref'?
If you believe you received this message in error, try re-running with 'cellranger -- mkref --genome=GRCh38'
My question on GitHub: https://github.com/10XGenomics/cellranger/issues/217
Running the command
cellranger mkref --genome=GRCh38 --fasta=Homo_sapiens.GRCh38.dna.primary_assembly.fa --genes=Homo_sapiens.GRCh38.84.filtered.gtf
printed "Reference successfully created", but count still failed with "Your reference doesn't appear to be indexed. Please run the mkreference tool".


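Solved! I found someone on Xianyu (閑魚) to help, and it turned out the count command was wrong. The prebuilt reference genome (refdata-gex-GRCh38-2020-A) can be used directly. The command is: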
cellranger count --id=HRR572950 --transcriptome=refdata-gex-GRCh38-2020-A --fastqs=/data1/liver-cancer-GSA-HCC --sample=HRR572950
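Batch run:
# Run cellranger count on a batch of samples
import os

def cellranger():
    # range() excludes its end value, so the original range(572951,572951) ran nothing.
    # 572977 is assumed here so that the loop reaches HRR572976, the last sample in the aggr CSV;
    # IDs in between that have no FASTQ files will simply fail in cellranger.
    for i in range(572951, 572977):
        x = "HRR" + str(i)
        cmd_string = "cellranger count --id="+x+" --transcriptome=refdata-gex-GRCh38-2020-A --fastqs=/data1/liver-cancer-GSA-HCC --sample="+x
        print('cmd:{}'.format(cmd_string))
        # Run the command and print whatever it wrote to standard output
        print(os.popen(cmd_string).read())

cellranger()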
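
Because the sample IDs are not contiguous, an alternative is to loop over an explicit list of IDs instead of a numeric range. The following is a sketch under assumptions, not the method used in these notes: the names build_count_command and run_all are hypothetical, and dry_run defaults to True so that nothing is launched until the printed commands have been checked.

```python
import subprocess

TRANSCRIPTOME = "refdata-gex-GRCh38-2020-A"
FASTQ_DIR = "/data1/liver-cancer-GSA-HCC"


def build_count_command(sample):
    # Build the same cellranger count call as above, as an argument list.
    return [
        "cellranger", "count",
        f"--id={sample}",
        f"--transcriptome={TRANSCRIPTOME}",
        f"--fastqs={FASTQ_DIR}",
        f"--sample={sample}",
    ]


def run_all(samples, dry_run=True):
    commands = [build_count_command(s) for s in samples]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the batch at the first sample that fails.
            subprocess.run(cmd, check=True)
    return commands


commands = run_all(["HRR572951", "HRR572952"])
```

Passing the arguments as a list avoids shell quoting problems, and the sample list could be read from scrna-filename.txt once its line endings have been converted.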
aggr: merge the samples
50_76_libraries.csv:
sample_id,molecule_h5
HRR572950,/data1/liver-cancer-GSA-HCC/cellranger-7.1.0/HRR572950/outs/molecule_info.h5
HRR572951,/data1/liver-cancer-GSA-HCC/cellranger-7.1.0/HRR572951/outs/molecule_info.h5
HRR572952,/data1/liver-cancer-GSA-HCC/cellranger-7.1.0/HRR572952/outs/molecule_info.h5
HRR572954,/data1/liver-cancer-GSA-HCC/cellranger-7.1.0/HRR572954/outs/molecule_info.h5
HRR572955,/data1/liver-cancer-GSA-HCC/cellranger-7.1.0/HRR572955/outs/molecule_info.h5
HRR572956,/data1/liver-cancer-GSA-HCC/cellranger-7.1.0/HRR572956/outs/molecule_info.h5
HRR572962,/data1/liver-cancer-GSA-HCC/cellranger-7.1.0/HRR572962/outs/molecule_info.h5
HRR572964,/data1/liver-cancer-GSA-HCC/cellranger-7.1.0/HRR572964/outs/molecule_info.h5
HRR572965,/data1/liver-cancer-GSA-HCC/cellranger-7.1.0/HRR572965/outs/molecule_info.h5
HRR572966,/data1/liver-cancer-GSA-HCC/cellranger-7.1.0/HRR572966/outs/molecule_info.h5
HRR572967,/data1/liver-cancer-GSA-HCC/cellranger-7.1.0/HRR572967/outs/molecule_info.h5
HRR572968,/data1/liver-cancer-GSA-HCC/cellranger-7.1.0/HRR572968/outs/molecule_info.h5
HRR572969,/data1/liver-cancer-GSA-HCC/cellranger-7.1.0/HRR572969/outs/molecule_info.h5
HRR572972,/data1/liver-cancer-GSA-HCC/cellranger-7.1.0/HRR572972/outs/molecule_info.h5
HRR572973,/data1/liver-cancer-GSA-HCC/cellranger-7.1.0/HRR572973/outs/molecule_info.h5
HRR572974,/data1/liver-cancer-GSA-HCC/cellranger-7.1.0/HRR572974/outs/molecule_info.h5
HRR572976,/data1/liver-cancer-GSA-HCC/cellranger-7.1.0/HRR572976/outs/molecule_info.h5
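
Typing this file by hand is error-prone, so it can also be generated. A minimal sketch, assuming every sample's output sits under the same base directory as in the listing above; the function name write_aggr_csv is made up, and the demo writes to an in-memory buffer with only two of the samples.

```python
import csv
import io


def write_aggr_csv(samples, base_dir, out):
    # lineterminator="\n" avoids the csv module's default "\r\n" line endings.
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["sample_id", "molecule_h5"])
    for sample in samples:
        writer.writerow([sample, f"{base_dir}/{sample}/outs/molecule_info.h5"])


buffer = io.StringIO()
write_aggr_csv(
    ["HRR572950", "HRR572951"],
    "/data1/liver-cancer-GSA-HCC/cellranger-7.1.0",
    buffer,
)
csv_text = buffer.getvalue()
print(csv_text, end="")
```

To write the real file, the same function can be called with an open file handle and the full sample list.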
Command:
cellranger aggr --id=5076 --csv=./50_76_libraries.csv --normalize=mapped