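This post is mainly based on the 2016 review article "A comparison of tools for the simulation of genomic next-generation sequencing data", published in Nature Reviews Genetics.
When learning about high-throughput sequencing, we often run into two problems: (1) we cannot find the data we want; (2) the data are too large to download and analyze. This is where software for simulating high-throughput sequencing data comes in handy.
Put simply, sequencing data simulators are mainly used for the following four purposes: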
- planning experiments
- testing hypotheses
- benchmarking tools
- evaluating particular results
The simulation of NGS data can be extremely useful for planning experiments,
testing hypotheses, benchmarking tools and evaluating particular results.
Given a reference genome or dataset, for instance, one can play with
an array of sequencing technologies to choose the best-suited technology and parameters for the particular goal,
possibly optimizing time and costs.
Yet, this is still not the standard practice and researchers often base their choices on
practical considerations like technology and money availability.
As shown throughout this Review, simulation of NGS data from known genomes or transcriptomes can be extremely useful
when evaluating assembly, mapping, phasing or genotyping algorithms exposing their advantages and drawbacks under different circumstances.
The review evaluates 23 sequencing data simulators, describes their distinguishing features, requirements and potential applications, and offers guidance on choosing a suitable tool.
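To make the idea of read simulation concrete, here is a minimal sketch in Python. It is not the method of any of the 23 tools in the review. It is a toy model that assumes uniform read sampling, single-end reads, substitution errors only and a constant placeholder quality score. The function names `simulate_reads` and `to_fastq` and all parameter names are made up for illustration. The number of reads is derived from the target mean coverage as N = C × G / L (coverage times genome length divided by read length).

```python
import random


def simulate_reads(reference, read_length, coverage, error_rate, seed=0):
    """Sample reads uniformly from `reference` and add substitution errors.

    Returns a list of (start_position, read_sequence) tuples, so the true
    origin of every simulated read is known.
    """
    rng = random.Random(seed)
    # Number of reads needed for the target mean coverage: N = C * G / L.
    n_reads = round(coverage * len(reference) / read_length)
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(reference) - read_length + 1)
        bases = list(reference[start:start + read_length])
        for i, base in enumerate(bases):
            if rng.random() < error_rate:
                # Replace the base with a different nucleotide (toy error model).
                bases[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append((start, "".join(bases)))
    return reads


def to_fastq(reads, quality_char="I"):
    """Format reads as FASTQ text using a constant placeholder quality."""
    records = []
    for idx, (start, seq) in enumerate(reads):
        header = f"@read{idx}_pos{start}"
        records.append(f"{header}\n{seq}\n+\n{quality_char * len(seq)}\n")
    return "".join(records)


if __name__ == "__main__":
    ref_rng = random.Random(42)
    # A random 1 kb "genome" stands in for a real reference sequence.
    reference = "".join(ref_rng.choice("ACGT") for _ in range(1000))
    reads = simulate_reads(reference, read_length=50, coverage=10, error_rate=0.01)
    print(len(reads))  # 10 * 1000 / 50 = 200 reads
    print(to_fastq(reads[:2]), end="")
```

Because the start position of each read is stored in its header, the output can serve as a small ground-truth dataset for checking a mapper. Real simulators model far more, such as technology-specific error profiles, quality scores, paired-end reads and genomic variants.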

Software list
NGS genomic simulators decision tree.
This decision tree briefly illustrates the principles for choosing among the different tools.

Main characteristics of current NGS technologies
Some characteristics of the current NGS technologies. Note that an "X" indicates that a feature is present.

General overview of the sequencing process and steps that can be parameterized in the simulations

General overview of NGS simulation

General information about 23 NGS genomic simulators
Characteristics of the 23 simulators

Technical information about 23 NGS genomic simulators

Genomic variants

Finally, here is the article's online summary:

[Image: the article's online summary]
You are welcome to follow my WeChat official account 『生信family』
