hbctraining-Introduction to ChIP-Seq Lesson 2

Quality Control of Sequence Reads

Understanding the Illumina sequencing technology

Unmapped read data (FASTQ)

The FASTQ format evolved from FASTA: in addition to the sequence data, it stores quality information for every base. There are two main quality scoring systems, Phred33 and Phred64, which differ by the offset used in the ASCII table (33 versus 64). Figure 3 shows the mapping of quality encoding characters for Phred33.

Figure 1. FASTQ format
Figure 2. Quality scoring systems
Figure 3. Phred33
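
To make the encoding concrete, here is a minimal sketch of reading FASTQ records and decoding a quality string. It assumes the common four-lines-per-record layout, and the function names `read_fastq` and `decode_quality` are hypothetical helpers written for this note, not part of any library.

```python
import io
from typing import Iterator, List, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality_string) from a FASTQ stream.

    Assumes exactly four lines per record (no wrapped sequences).
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input
        seq = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line, ignored here
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        yield header[1:], seq, qual


def decode_quality(qual: str, offset: int = 33) -> List[int]:
    """Convert quality characters to Phred scores by subtracting the ASCII offset."""
    return [ord(ch) - offset for ch in qual]


# A made-up record for illustration only.
example = "@read1\nACGT\n+\nI5+!\n"
records = list(read_fastq(io.StringIO(example)))
scores = decode_quality(records[0][2])  # Phred33 by default
```

With a Phred64 file, the same string would be decoded with `decode_quality(qual, offset=64)`; using the wrong offset shifts every score by 31, which is why the encoding needs to be known before any filtering.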

Each quality score represents the probability that the corresponding nucleotide call is incorrect. The quality score is logarithmic and is calculated as:

Q = -10 \times \log_{10}(P), where P is the probability that a base call is erroneous.
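
As a quick worked example of the formula above, the sketch below converts between Phred scores and error probabilities. The function names are hypothetical and chosen only for illustration.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Invert Q = -10 * log10(P) to get P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Apply Q = -10 * log10(P) directly."""
    return -10 * math.log10(p)


p_q20 = phred_to_error_prob(20)  # about 1 error in 100 base calls
p_q30 = phred_to_error_prob(30)  # about 1 error in 1000 base calls
```

Every 10-point increase in Q corresponds to a tenfold drop in the error probability, so Q30 bases are roughly ten times more reliable than Q20 bases.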

Assessing quality with FastQC

FastQC is a widely used quality control tool. Its main functions are:

* Import of data from BAM, SAM or Fastq files (any variant)

* Providing a quick overview to tell you in which areas there may be problems

* Summary graphs and tables to quickly assess your data

* Export of results to an HTML based permanent report

* Offline operation to allow automated generation of reports without running the interactive application

Among all the results of FastQC, the "Per base sequence quality" plot is the most important analysis module for ChIP-Seq. It shows the distribution of quality scores across all reads at each position in the read. This information can help determine whether there were any problems at the sequencing facility during sequencing. Generally, we expect a decrease in quality towards the ends of the reads, but we shouldn't see any quality drops at the beginning or in the middle of the reads.

A good-quality sample
A not-so-good-quality sample
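
The idea behind the "Per base sequence quality" plot can be sketched in a few lines: collect the scores observed at each read position and summarise them. This is only an illustration of the concept under a Phred33 assumption, not FastQC's actual implementation, and `per_base_quality` is a hypothetical name.

```python
from statistics import mean, median
from typing import Dict, List, Sequence


def per_base_quality(quality_strings: Sequence[str], offset: int = 33) -> List[Dict[str, float]]:
    """Summarise Phred scores per read position (1-based) across all reads."""
    columns: List[List[int]] = []
    for qual in quality_strings:
        for i, ch in enumerate(qual):
            if i == len(columns):
                columns.append([])  # first read long enough to reach this position
            columns[i].append(ord(ch) - offset)
    return [
        {"position": i + 1, "mean": mean(col), "median": median(col)}
        for i, col in enumerate(columns)
    ]


# Made-up quality strings: scores stay high at the start and drop at the 3' end.
quals = ["IIII", "II5+", "II5!"]
summary = per_base_quality(quals)
```

In this toy input the mean stays at 40 for the first two positions and falls towards the end of the read, which is the pattern we expect; a dip at the first or middle positions would be the warning sign described above.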

PS: In my opinion, the adapter content module is important as well.
