Bacillus anthracis: Difference between revisions
Revision as of 16:09, 18 September 2008
Background
- 89 known strains
- Most virulent: Ames (USA 2001), Vollum (WWII, biological weapon)
- Benign: Sterne (used as vaccine)
Virulence factors that distinguish Bacillus anthracis from Bacillus cereus are encoded on two plasmids, pXO1 (anthrax toxin) and pXO2 (capsule genes). The capsule protects against phagocytosis once the vegetative bacterium enters the bloodstream. The anthrax toxin consists of three components: protective antigen (PA), lethal factor (LF), and edema factor (EF). PA/LF and PA/EF complexes are internalized by host cells, where the LF (metalloprotease) and EF (calmodulin-dependent adenylate cyclase) components act. At high levels, LF induces cell death and release of the bacterium, while EF increases host susceptibility to infection and promotes fluid accumulation in the cells. (NCBI)
- Ames, AmesAncestor, Sterne are 99.9% identical; no rearrangements
Genome Projects
Center      Complete  Assembly  Progress  Total
TIGR/JCVI   2         8         1         11
LANL        0         6         0         6
DOE         1         0         0         1
NMRC        0         0         2         2
Total       3         14        3         20
TIGR/JCVI Strains
Contigs    Traces    Status    Completed   Strain                         Notes
0          96,532    Progress  .           A0039
62         67,600    Assembly  2007/07/25  Tsiankovskii-I                 AA ; ??? possible update
1(+2)      101,379   Complete  2004/05/20  Ames Ancestor (Ames 0581)      AA ; Insignia; pXO1, pXO2
42         86,181    Assembly  2004/06/04  A1055                          AA ; Insignia
1(+2)469   83,552    Assembly  2005/05/16  A2012                          ??? not in Insignia; The 1 contig contains 469 gaps; 65508 of the traces have no qualities; pXO1, pXO2
1          125,879   Complete  2002/05/16  Ames                           ??? complete but not in AA ; Insignia
49         0         Assembly  2004/06/07  Australia 94                   ??? no TRACES ; Insignia
30         90,308    Assembly  2004/06/04  CNEVA-9066                     Insignia
64         92,429    Assembly  2004/06/07  Kruger B                       AA ; Insignia
52         103,144   Assembly  2004/06/04  Vollum                         Insignia
44         95,078    Assembly  2004/06/07  Western North America USA6153  Insignia
LANL Strains
No traces in TA; none in Insignia
Contigs  Traces  Status    Completed   Strain
60       0       Assembly  2008/04/08  A0174
60       0       Assembly  2008/02/12  A0193
68       0       Assembly  2008/03/24  A0389
46       0       Assembly  2008/02/12  A0442
57       0       Assembly  2008/03/24  A0465
63       0       Assembly  2008/01/16  A0488
+ 2 plasmid genome projects (pXO1, pXO2) completed in 1999
DOE Strains
Contigs  Traces   Status    Completed   Strain  Notes
1        147,665  Complete  2004/06/24  Sterne  ??? complete but not in AA; Insignia; pXO1, pXO2
NMRC Strains
Contigs  Traces  Status    Strain
0        0       Progress  34F2(NMRC)
0        0       Progress  34F2 delta gerH
Complete
Strain        Status    chromosome  pXO1    pXO2
A2012         Assembly  .           181677  94829
Ames          Complete  5227293     .       .
AmesAncestor  Complete  5227419     181677  94830
Sterne        Complete  5228663     181654  96231
Strain Assemblies
A2012
- NCBI Genome
            RefId            Len      GC%
chromosome  NZ_AAAC02000001  5093554  35.36
pXO1        NC_003980        181677   32
pXO2        NC_003981        94829    33
chromosome
      #elem  min  max     mean   median  n50    sum
ctg   469    2    132589  10770  5988    22044  5051208   # 42,346 N's
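The chromosome figures above are internally consistent: the contig sum (5051208) plus the 42,346 N's equals the NCBI chromosome length (5093554). A minimal sketch of how such numbers could be obtained, assuming the contig pieces come from splitting the single gapped sequence at runs of N (the page does not say how the table was produced; `split_on_gaps` is a hypothetical helper name):

```python
import re


def split_on_gaps(seq):
    """Split a gapped sequence at runs of N and count the N's.

    Returns (pieces, n_count), where pieces are the maximal N-free
    substrings. Assumption: gaps are written as N/n characters.
    """
    pieces = [m.group(0) for m in re.finditer(r"[^Nn]+", seq)]
    n_count = len(seq) - sum(len(p) for p in pieces)
    return pieces, n_count


# Toy sequence, not real data.
pieces, n_count = split_on_gaps("ACGTNNNACNNG")
print(len(pieces), n_count)

# Consistency check on the values reported above.
assert 5051208 + 42346 == 5093554
```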
Traces:
- 18,045 reads have quality values; 65,507 do not
Libraries:
Lib            Mean   Stdev  Count
T13322         2000   600    32133
T13323         4000   1200   31455
1047127226559  2000   600    18036
T10914         3000   900    1719
T10930         10000  3000   150
GBZH           4500   .      29
...
Total                        83553
CA
Summary
          #elem  min   max     mean   median  n50    sum
ctg       476    844   122753  11450  2025    33522  5450075   # larger N50 than the NCBI assembly
deg       1760   172   4878    852    818     882    1498894
ctg+deg   2236   172   122753  3108   881     23707  6948969
scf       325    1001  245405  16824  1250    86284  5467834
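For reference, the summary columns used throughout this page (#elem, min, max, mean, median, n50, sum) can be computed from a list of sequence lengths. The sketch below assumes the common N50 definition (the largest length L such that sequences of length >= L hold at least half of the total bases); the tool that produced the tables here is not named, so its exact convention may differ. `length_stats` is a hypothetical helper name.

```python
from statistics import median


def length_stats(lengths):
    """Summary statistics for a list of contig/scaffold lengths.

    N50 uses the common definition: walk the lengths from longest to
    shortest and report the length at which the running total first
    reaches half of the overall sum (an assumption, see lead-in).
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    n50 = ordered[-1]
    for length in ordered:
        running += length
        if 2 * running >= total:
            n50 = length
            break
    return {
        "elem": len(ordered),
        "min": ordered[-1],
        "max": ordered[0],
        "mean": total / len(ordered),
        "median": median(ordered),
        "n50": n50,
        "sum": total,
    }


# Toy lengths, not real data.
print(length_stats([2, 3, 4, 5, 6]))
```

A larger N50 at a similar total length, as in the ctg row versus the NCBI assembly, means that more of the sequence sits in long contigs.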
0cvg(no plasmids)
          #elem  min  max    mean   median  n50  sum
1con      223    1    783    91.28  33      252  20355
ctg-deg   973    1    10316  239    140     444  232946
0cvg(including plasmids)
          #elem  min  max   mean  median  n50  sum
ctg-deg   841    1    3183  183   82      402  154114
Ames
- Complete
NC_003997.3 5227293 35.38 Bacillus anthracis str. Ames, complete genome
- not in AA
AMOScmp-alignmentTrimmed
- no 0 cvg regions when factory trimmed reads aligned to it
- -D LAYERR=90 => 1 piece
ref = 5227293 bp
assembly = 5227311 bp
amosvalidate => 1555 snps
nucmer align of assembly to ref & filter -q => 287 snps
- Many stretched & misoriented mates
- Location: /fs/szasmg3/dpuiu/Bacilus_anthracis/Ames/Assembly/2008_0820_AMOScmp-alignmentTrimmed
CA
Output:
      #elem  min   max     mean   median  n50     sum      snps
scf   59     1021  893362  83396  1970    593436  4920387
ctg   67     1021  736364  73430  2483    280568  4919826  496
deg   245    70    141988  2016   671     34515   493924
- There are many stretched mates; no compressed ones !!!
Ctg 0 cvg regions:
#id               len     gc%    start  end    len   cvg  notes
ctg7180000001099  2637    35.04  439    2637   2198  0    33.97% gc ; blastn: 96% identity to Bacillus cereus ; blastx: alpha/beta hydrolase & protein disulfide isomerase hits
ctg7180000001288  28653   36.17  19125  19233  108   0    52.29% gc ; blastn: 100% identity to Ames Ancestor & Sterne
ctg7180000001288  28653   36.17  28636  28653  17    0    aligned to other Bacillus anthracis strains
ctg7180000001300  156978  35.35  1      161    160   0    46.58% gc ; cloning vector
ctg7180000001302  70072   34.28  69464  70072  608   0    55.83% gc ; cloning vector
Deg 0 cvg regions:
All have high GC% (>48.24%); probably cloning vector
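The GC% argument used above (regions well above the roughly 35.4% genome average are suspected cloning vector) can be sketched as follows. The 10-point margin is an illustrative assumption, not a threshold taken from this page, and `gc_percent` / `is_gc_outlier` are hypothetical helper names. A high GC% is only a hint; the blastn hits listed above are what actually identify the sequence.

```python
def gc_percent(seq):
    """GC content in percent, ignoring anything that is not A, C, G or T."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence has no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def is_gc_outlier(seq, genome_gc=35.38, margin=10.0):
    """Flag a region whose GC% exceeds the genome average by `margin` points.

    genome_gc is the chromosome GC% reported for Ames on this page;
    margin is an arbitrary illustrative cutoff.
    """
    return gc_percent(seq) - genome_gc >= margin


# Toy sequences, not real data.
print(gc_percent("GGCCAATT"), is_gc_outlier("GGCCAATT"))
print(is_gc_outlier("AAATTTGGC"))
```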
Ref breaks:
NC_003997.3 145564
NC_003997.3 627742
NC_003997.3 1151234
NC_003997.3 2085561
NC_003997.3 3515380
Ames Ancestor
- Complete & in AA
NC_007530.2  5227419  35.38  Bacillus anthracis str. 'Ames Ancestor', complete genome
NC_007322.2  181677   32.53  Bacillus anthracis str. 'Ames Ancestor' plasmid pXO1, complete sequence
NC_007323.3  94830    33.04  Bacillus anthracis str. 'Ames Ancestor' plasmid pXO2, complete sequence
- Downloaded from AA and converted to bank
- Location: /fs/szasmg3/dpuiu/Bacilus_anthracis/AmesAncestor/CS--AI-293
Sterne
- Complete but not in AA.
            RefId        Len      GC%
chromosome  NC_005945.1  5228663  35.38
pXO1        NC_001496.1  181654   32
pXO2        NC_002146.1  96231    33
!!! the plasmids are not listed with the genome project
- Traces available in TA. Looks like some reads are missing; I'm getting many 0 cvg regions when aligning the reads to the finished genome
       #elem  min  max   mean  median  n50   sum
0cvg   53     8    4946  529   272     1508  28017
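Zero-coverage (0cvg) regions like the ones counted above can be derived from the read alignment intervals on the reference. The sketch below is a generic interval sweep, not the procedure actually used for this page. It assumes 0-based, half-open (start, end) coordinates, so a region's length is end - start; `zero_coverage_regions` is a hypothetical helper name.

```python
def zero_coverage_regions(ref_len, alignments):
    """Return the reference intervals not covered by any alignment.

    alignments: iterable of (start, end) pairs, 0-based and half-open.
    The result uses the same convention, so len = end - start.
    """
    gaps = []
    covered_to = 0  # everything before this position is already accounted for
    for start, end in sorted(alignments):
        if start > covered_to:
            gaps.append((covered_to, start))
        covered_to = max(covered_to, end)
    if covered_to < ref_len:
        gaps.append((covered_to, ref_len))
    return gaps


# Toy alignments on a 100 bp reference, not real data.
regions = zero_coverage_regions(100, [(0, 10), (5, 20), (30, 50), (60, 100)])
print(regions)
print([end - start for start, end in regions])
```

The lengths of the returned regions are the values that a summary row such as the 0cvg line above would be computed from.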
AMOScmp
- version: June 12 2007
- untrimmed reads => 358 ctg
- 53 zero cvg regions, max is almost 5K
AMOScmp-alignmentTrimmed
- reads are trimmed according to alignment coords
      #elem  min  max     mean    median  n50     sum
ctg   46     736  468060  112792  49000   355370  5188425
CA
- runCA-OBT.pl script
- version 5.1
- Output:
            #elem  min   max     mean   median  n50     sum
ctg         204    1000  468299  26560  1310    189443  5418252
deg         145    266   32820   1756   804     21949   254596
ctg+deg     349    266   468299  16255  1146    181331  5672848
scf         186    1000  671877  29170  1274    294809  5425585
singleton   2418
Ctg 0 cvg regions:
All have high GC% (>52.08%); probably cloning vector
Deg 0 cvg regions: actually they align to pXO1
#id                          len    gc%
deg7180000001258.1-297       297    31.65
deg7180000001300.420-6468    6049   32.05
deg7180000001254.1-28442     28442  33.01
deg7180000001258.557-1060    504    34.52
deg7180000001253.1-576       576    35.76
deg7180000001300.6728-6824   97     36.08
deg7180000001300.1-160       160    38.12
..
minimus2
Input:
          #elem  min  max     mean    median  n50     sum
AMOS      46     736  468060  112792  49000   355370  5188425
CA        349    266  468299  16255   1146    181331  5672848
AMOS+CA   395    266  468299  27497   1210    215336  10861273
Output:
            #elem  min  max     mean    median  n50     sum
ctg         47     931  468316  110527  37238   355370  5194753
singl       282    24   32820   1602    1053    1341    451821
ctg+singl   329    24   468316  17163   1103    296377  5646574
Vollum
NCBI data:
      #elem  min  max     mean    median  n50     sum
ctg   52     311  812727  105547  29178   400992  5488459
CA
          #elem  min   max      mean    median  n50     sum
ctg+deg   57     693   1541457  96154   14641   422289  5480756
ctg       39     1073  1541457  136843  51912   422289  5336858
deg       18     693   54711    7994    2244    54711   143898
scf       25     1440  1593252  214001  94958   676360  5350037
- No alignment breaks vs NCBI assembly