Finished genomes
NC_011748 Escherichia coli 55989, complete genome; Length: 5,154,862 nt
NC_011752 Escherichia coli 55989 plasmid 55989p, complete sequence; Length: 72,482 nt
NC_013353 Escherichia coli O103:H2 str. 12009, complete genome; Length: 5,449,314 nt
NC_013354 Escherichia coli O103:H2 str. 12009 plasmid pO103, complete sequence; Length: 75,546 nt
...
NC_004914 Stx2 converting phage II, complete genome; Length: 62,706 nt
Data
. elem min q1 q2 q3 max mean n50 sum
run1 92370 5 99 106 109 123 102 107 9433083
run2 122208 5 96 105 109 133 100 106 12248530
run3 96765 5 100 106 110 129 103 107 9958873
run4 222275 5 101 107 110 135 103 107 22924825
run5 95750 5 92 103 108 133 97 104 9379125
run1-5 629368 5 99 106 109 135 102 107 63944436 (12.2X)
run6 79341 5 104 108 111 137 105 108 8355410
run7 74388 5 97 105 109 129 100 106 7469876
run1-7 783097 5 99 106 110 137 102 107 79769722 (15.2X)
ctg 1217 100 251 938 5215 72019 4274 13577 5201850
ctg.v2 513 62 346 896 4903 204342 10330 53266 5299150
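The columns in the table above (elem, min, q1, q2, q3, max, mean, n50, sum) can be recomputed from a list of sequence lengths. The sketch below is a minimal illustration, not the script that produced the table: the quartile rule (simple rank on the sorted list) and the N50 definition (the usual "length at which half of the total bases is reached") are assumptions, so values may differ slightly from the ones reported. The 5.25 Mbp genome size is also an assumption; the page does not state which size was used for the 12.2X and 15.2X figures, but this value reproduces both.

```python
def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no lengths given")


def quantile(ordered, fraction):
    """Simple rank-based quantile on an already sorted list (assumed rule)."""
    index = min(len(ordered) - 1, int(fraction * len(ordered)))
    return ordered[index]


def summarize(lengths):
    """Return the same columns as the tables on this page."""
    ordered = sorted(lengths)
    total = sum(ordered)
    return {
        "elem": len(ordered),
        "min": ordered[0],
        "q1": quantile(ordered, 0.25),
        "q2": quantile(ordered, 0.50),
        "q3": quantile(ordered, 0.75),
        "max": ordered[-1],
        "mean": round(total / len(ordered)),
        "n50": n50(ordered),
        "sum": total,
    }


def coverage(total_bases, genome_size):
    """Average depth of coverage: sequenced bases divided by genome size."""
    return total_bases / genome_size


# Hypothetical genome size; not stated on this page.
ASSUMED_GENOME_SIZE = 5_250_000

toy_lengths = [100, 200, 300, 400, 500]  # toy data, not from the runs
print(summarize(toy_lengths))
print(f"run1-5: {coverage(63_944_436, ASSUMED_GENOME_SIZE):.1f}X")
print(f"run1-7: {coverage(79_769_722, ASSUMED_GENOME_SIZE):.1f}X")
```

In practice the lengths would come from the read or contig FASTA/FASTQ files rather than a hard-coded list.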
Links
EdgeBio Assembly
read QC: FastQC
read trimming: CLC
. elem min q1 q2 q3 max mean n50 sum
run 629368 5 99 106 109 135 102 107 63944436
run.trimmed 617257 10 46 69 86 94 64 79 39745910
read alignment: CLC; 85% of untrimmed reads aligned
Regions with 6X+ coverage: 490 SNPs & 1,848 indels.
De novo assembly of the 90K reads that did not map: 58K reads assembled into 363 contigs ranging in size from 400 to 7,800 bp.
Contigs were searched with BLAST against the NCBI nr/nt database with an e-value cutoff of 0.01.
Hits: E. coli strains (such as O83:H1 and O42), Shigella flexneri, Salmonella, and Cronobacter.
De novo assembly => 3,297 contigs with an N25 of 4K and N50 of 2.5K.
CBCB best assembly
run1-5
. elem min q1 q2 q3 max mean n50 sum
ctg 505 35 204 758 8927 185186 9823 41503 4960767
ctg.denovo 4261 31 41 58 104 2713 119 31 506842
total 4766 31 44 59 161 185186 1147 41503 5467609
The reads were first assembled using Ecoli_55989 as the reference. The unmapped reads were then assembled de novo using ABySS.
total reads: 629,368
reads aligned to Ecoli_55989 using "bwa bwasw": 545,233 (the consensus was called using "samtools pileup")
unaligned reads: 84,135
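As a quick consistency check on the three counts above, the unaligned count should equal total minus aligned. The sketch below uses only the numbers listed here; the variable names are illustrative.

```python
total_reads = 629_368
aligned_reads = 545_233  # aligned to Ecoli_55989 with "bwa bwasw"

# Reads left over for the de novo step.
unaligned_reads = total_reads - aligned_reads
aligned_percent = 100 * aligned_reads / total_reads

print(f"unaligned reads: {unaligned_reads:,}")
print(f"aligned: {aligned_percent:.1f}% of all reads")
```

The resulting percentage is close to the 85% reported for the CLC alignment of the same reads.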
- FTP file locations (run1-5)
ftp://ftp.cbcb.umd.edu/pub/data/assembly/Ecoli_TY-2482/
ftp://ftp.cbcb.umd.edu/pub/data/assembly/Ecoli_TY-2482/assemble.sh
ftp://ftp.cbcb.umd.edu/pub/data/assembly/Ecoli_TY-2482/asm.summary
ftp://ftp.cbcb.umd.edu/pub/data/assembly/Ecoli_TY-2482/asm.ctg.fasta
ftp://ftp.cbcb.umd.edu/pub/data/assembly/Ecoli_TY-2482/asm.ctg.denovo.fasta
run1-7
. elem min q1 q2 q3 max mean n50 sum
ctg 445 41 202 651 7622 185186 11156 46725 4964630
ctg.denovo 4704 31 39 57 88 2990 113 31 531947
total 5149 31 40 58 129 185186 1068 46725 5496577
run1-7 using Escherichia coli 55989 & Stx2 converting phage II as reference
. #ctgs maxCtg sumCtg
NC_011748 387 185186 4922625
NC_011752 50 6554 41684
NC_004914* 38 16960 44334
denovo 3745 2990 445588
/fs/szattic-asmg5/dpuiu/Ecoli_TY-2482/Assembly/sam.run1-7_phage
run1-7 using Escherichia coli 55989 & viral db as reference
ref #ctgs maxCtg sumCtg refLen refGC refDescription
NC_011748* 408 185186 4914924 5154862 50.66 Escherichia coli 55989, complete genome
denovo* 3504 2990 412557
NC_004914* 38 18079 52920 62706 49.9 Stx2 converting phage II, complete genome
NC_011752* 51 6554 41755 72482 46.13 Escherichia coli 55989 plasmid 55989p, complete sequence
NC_009514 48 2592 21527 47021 49.11 Phage cdtI, complete genome
NC_005344 7 7217 11583 39043 47.46 Enterobacteria phage Sf6, complete genome
NC_011357 15 1613 7044 62147 50.91 Stx2-converting phage 1717, complete prophage genome
NC_002371 3 6346 6565 41724 47.09 Enterobacteria phage P22 virus, complete genome
NC_011356 9 2036 5840 54896 51.12 Enterobacteria phage YYZ-2008, complete prophage genome
NC_003444 2 2817 4159 37074 50.76 Enterobacteria phage SfV, complete genome
NC_004813 10 769 3269 57930 50.6 Enterobacteria phage BP-4795, complete genome
NC_008464 4 1036 2922 60238 49.06 Stx2-converting phage 86, complete genome
NC_005856 5 1204 2246 94800 47.31 Enterobacteria phage P1, complete genome
NC_001416 9 351 1880 48502 49.85 Enterobacteria phage lambda, complete genome
NC_000924 4 1199 1776 61670 49.36 Enterobacteria phage 933W, complete genome
NC_002167 2 1339 1496 39732 49.78 Enterobacteria phage HK97, complete genome
NC_003525 2 1078 1349 61765 49.38 Stx2 converting phage I, complete genome
NC_010392 4 470 1118 48491 51.09 Phage Gifsy-1, complete genome
NC_005841 5 412 1117 41391 47.43 Enterobacteria phage ST104, complete genome
NC_003356 4 471 1107 42575 49.35 Enterobacteria phage phiP27, complete genome
NC_002730 2 467 862 38297 46.68 Enterobacteria phage HK620, complete genome
NC_002166 1 672 672 40751 49.48 Enterobacteria phage HK022, complete genome
NC_004313 1 493 493 40149 51.01 Salmonella phage ST64B, complete genome
NC_011976 1 419 419 43016 47.26 Salmonella phage epsilon34, complete genome
NC_001954 3 148 349 8454 43.7 Enterobacteria phage If1, complete genome
NC_007804 1 272 272 39104 48.97 Escherichia phage phiV10, complete genome
NC_001895 2 130 209 33593 50.16 Enterobacteria phage P2, complete genome
/fs/szattic-asmg5/dpuiu/Ecoli_TY-2482/Assembly/sam.run1-7_phage.redo/
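One way to read the per-reference table above is to compare sumCtg with refLen for each reference. The sketch below parses a few rows copied from the table and prints that ratio. It is only a rough indicator: it assumes contigs mapped to a reference do not overlap each other, which this page does not confirm, and it strips the trailing `*` from accessions without interpreting it (its meaning is not stated here). The function name is hypothetical.

```python
# Columns: ref, #ctgs, maxCtg, sumCtg, refLen (rows copied from the table above)
ROWS = """\
NC_011748* 408 185186 4914924 5154862
NC_004914* 38 18079 52920 62706
NC_011752* 51 6554 41755 72482
NC_009514 48 2592 21527 47021
"""


def contig_to_reference_ratio(table_text):
    """Map each reference accession to sumCtg / refLen."""
    ratios = {}
    for line in table_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # e.g. the denovo row, which has no refLen
        accession = fields[0].rstrip("*")
        sum_ctg = int(fields[3])
        ref_len = int(fields[4])
        ratios[accession] = sum_ctg / ref_len
    return ratios


ratios = contig_to_reference_ratio(ROWS)
for accession, ratio in ratios.items():
    print(f"{accession}\t{100 * ratio:.1f}%")
```

Rows with a low ratio and only a few short contigs (most of the phage entries) likely reflect shared regions between related phages rather than the presence of the whole phage.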
Other CBCB assemblies run1-5
. elem min q1 q2 q3 max mean n50 sum
CA.ctg+deg 22395 64 107 159 253 2367 204 216 4567575
newbler.Ecoli_55989.ctg 8357 100 218 397 696 5038 534 639 4465330
AMOScmp.Ecoli_55989.ctg 1321 65 425 1780 5257 40345 3774 7775 4985978