Benchmark of Short Read Analysis Tools

Machine spec
Intel(R) Xeon(R) 3 GHz, 2 CPUs, 8 GB memory; Linux Red Hat 3.4.6-8, x86_64
Alignment Tools
ID    Name     Version     Information                                          Comments
TM1   blat     v.34        http://genome.ucsc.edu/cgi-bin/hgBlat
TM2   MAQ      v.0.7.1     http://maq.sourceforge.net/
TM3   bwa      v.0.4.9     http://maq.sourceforge.net/bwa-man.shtml
TM4   ssaha2   v.2.3.0.1   http://www.sanger.ac.uk/Software/analysis/SSAHA2/
Assembly Tools
ID    Name       Version   Information                             Comments
TA1   velvet     v.34      http://www.ebi.ac.uk/~zerbino/velvet/
TA2   EDENA      v.0.7.1
TA3   allpaths   v.0.4.9
Evaluation Data
E1 (bacteria): Pseudomonas syringae pv. syringae B728a
    Targeted genome: Pseudomonas syringae pv. syringae B728a (Feil et al., 2005), RefSeq: NC_007005, 6 094 698 bp
    Sequencing instrument: Illumina/Solexa
    Reference: Farrer et al., “De novo assembly of the Pseudomonas syringae pv. syringae B728a genome using Illumina/Solexa short sequence reads”, FEMS Microbiol Lett. 2009 Feb;291(1):103-11. PMID: 19077061
    Raw data (Short Read Archives): European Nucleotide Archive, project ID: 32555
E2 (plant): Arabidopsis thaliana Col-0
    Targeted genome: Arabidopsis thaliana Col-0 (1001 genomes)
    Sequencing instrument: Illumina/Solexa
    Reference: Ossowski et al., “Sequencing of natural strains of Arabidopsis thaliana with short reads”, Genome Res. 2008 Dec;18(12):2024-33. PMID: 18818371
    Raw data (Short Read Archives): SRX000702
    Project: http://1001genomes.org
Evaluation Data
Data   Name    Size of targeted genome   Original number of short reads   Number of quality-filtered reads   Mapped number                  Process time   Options
E1_1   B728a   6 094 698 bp              3 551 133 paired reads           3 535 967 paired reads
E1_2                                     7 102 266 unpaired reads         7 071 934 unpaired reads           (ssaha2) (blat) (MAQ) (BWA2)
A2
A3
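The E1 read counts can be cross-checked with a little arithmetic. The sketch below assumes that E1_2 is the same read set as E1_1 with the pairing ignored, so each pair counts as two unpaired reads. The page does not state this explicitly, but the numbers are consistent with it. The variable names are illustrative only.

```python
# Read counts for E1 as listed in the evaluation data table.
paired_raw = 3_551_133         # E1_1, original paired reads
paired_filtered = 3_535_967    # E1_1, quality-filtered paired reads
unpaired_raw = 7_102_266       # E1_2, original unpaired reads
unpaired_filtered = 7_071_934  # E1_2, quality-filtered unpaired reads

# Assumption: each pair contributes exactly two unpaired reads.
assert 2 * paired_raw == unpaired_raw
assert 2 * paired_filtered == unpaired_filtered

# How many pairs the quality filter removed, and what fraction survived.
removed_pairs = paired_raw - paired_filtered
retained_fraction = paired_filtered / paired_raw

print(f"pairs removed by quality filtering: {removed_pairs}")   # 15166
print(f"fraction of pairs retained: {retained_fraction:.4f}")   # 0.9957
```

Under that assumption, quality filtering removed 15 166 pairs, about 0.43% of the original data.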
Evaluation Data
Data    Tool   Options       Process time
E1_2?   blat   -out=blast8   real 5m16.141s, user 5m12.198s, sys 0m1.952s
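The process time above appears to be in the format printed by the shell `time` command (minutes, then seconds). To compare tools, it can help to convert these values to plain seconds. The following is a minimal sketch; `parse_time_field` is a hypothetical helper written for this page and is not part of any of the benchmarked tools.

```python
import re


def parse_time_field(text: str) -> float:
    """Convert a value such as '5m16.141s' to seconds (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized time value: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Timings reported for the blat run with -out=blast8.
timings = {"real": "5m16.141s", "user": "5m12.198s", "sys": "0m1.952s"}
seconds = {name: parse_time_field(value) for name, value in timings.items()}

# CPU time is user + sys; comparing it with wall-clock time shows
# how much of the run was spent computing rather than waiting.
cpu_seconds = seconds["user"] + seconds["sys"]
cpu_share = cpu_seconds / seconds["real"]

print(f"wall clock: {seconds['real']:.3f} s")
print(f"CPU time:   {cpu_seconds:.3f} s ({cpu_share:.1%} of wall clock)")
```

For this run the wall-clock time is about 316 s and the CPU time about 314 s, so the run was almost entirely CPU-bound.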