Bacillus anthracis

From Cbcb
Dpuiu (talk | contribs)

Revision as of 14:40, 17 September 2008

Background

  • 89 known strains
  • Most virulent: Ames (USA, 2001), Vollum (WWII, biological weapon)
  • Benign: Sterne (used as vaccine)
  • Ames, Ames Ancestor, and Sterne are 99.9% identical; no rearrangements

Genome Projects

NIH Genome Projects

 Center          Complete     Assembly    Progress  Total
 TIGR/JCVI       2            8           1         11
 LANL            0            6           0         6
 DOE             1            0           0         1 
 NMRC            0            0           2         2
 Total           3            14          3         20

TIGR/JCVI Strains

 Contigs  Traces   Status    Completed   Strain 
 0        96,532   Progress  .           A0039  
 62       67,600   Assembly  2007/07/25  Tsiankovskii-I AA ; ??? possible update
 1(+2)    101,379  Complete  2004/05/20  Ames Ancestor (Ames 0581)  AA ; Insignia
 42       86,181   Assembly  2004/06/04  A1055 AA ; Insignia
 1(+2)    83,552   Assembly  2005/05/16  A2012 ??? not in Insignia; the single contig contains 469 gaps
 1        125,879  Complete  2002/05/16  Ames  ??? complete but not in AA ; Insignia
 49       0        Assembly  2004/06/07  Australia 94   ??? no TRACES ; Insignia
 30       90,308   Assembly  2004/06/04  CNEVA-9066 Insignia
 64       92,429   Assembly  2004/06/07  Kruger B AA ; Insignia
 52       103,144  Assembly  2004/06/04  Vollum Insignia
 44       95,078   Assembly  2004/06/07  Western North America USA6153 Insignia

LANL Strains

No traces in TA; none in Insignia

 Contigs  Traces   Status    Completed    Strain       
 60       0        Assembly  2008/04/08   A0174
 60       0        Assembly  2008/02/12   A0193                              
 68       0        Assembly  2008/03/24   A0389
 46       0        Assembly  2008/02/12   A0442
 57       0        Assembly  2008/03/24   A0465
 63       0        Assembly  2008/01/16   A0488

+ 2 plasmid genome projects (pXO1, pXO2) completed in 1999

DOE Strains

 Contigs  Traces   Status    Completed    Strain
 1        147,665  Complete  2004/06/24   Sterne ??? complete but not in AA; Insignia

Naval Medical Research Center

 Contigs  Traces   Status    Strain
 0        0        Progress  34F2(NMRC)
 0        0        Progress  34F2 delta gerH

Strain Assemblies

A2012

  • Assembled into 1 sequence with ~469 gaps & 42,346 N's
 NZ_AAAC02000001 5093554 35.36
       #elem   min     max     mean    median  n50     sum
 ctg   469     2       132589  10770   5988    22044   5051208
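
The contig statistics above come from treating every run of N's in the single sequence as a gap. A minimal Python sketch of that bookkeeping follows; the function names and the toy sequence are illustrative assumptions, not the script that produced the table.

```python
import re
from statistics import mean, median

def split_contigs(scaffold: str) -> list[str]:
    """Split one scaffold sequence into contigs at every run of N (a gap)."""
    return [piece for piece in re.split(r"[Nn]+", scaffold) if piece]

def n50(lengths: list[int]) -> int:
    """Length of the contig at which the running sum (longest first) reaches half the total."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0

def summarize(lengths: list[int]) -> dict[str, int]:
    """Same columns as the tables in these notes: #elem min max mean median n50 sum."""
    return {
        "elem": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": round(mean(lengths)),
        "median": round(median(lengths)),
        "n50": n50(lengths),
        "sum": sum(lengths),
    }

# Toy scaffold (hypothetical data): three contigs separated by two N runs.
toy = "ACGTACGTAC" + "NNN" + "ACGT" + "NN" + "ACGTAC"
contigs = split_contigs(toy)
n_count = toy.upper().count("N")
stats = summarize([len(c) for c in contigs])
```

With this convention the scaffold length equals the contig sum plus the N count, which matches the A2012 numbers: 5,051,208 + 42,346 = 5,093,554.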

Ames

  • Complete
 NC_003997.3    5227293 35.38  Bacillus anthracis str. Ames, complete genome
  • not in AA

AMOScmp-alignmentTrimmed

  • No 0 cvg regions when the factory-trimmed reads are aligned to it
  • -D LAYERR=90 => 1 piece
 ref=5227293 bp
 assembly=5227311
 amosvalidate=>1555 snps
 nucmer align of assembly to ref & filter -q => 287 snps
  • Many stretched & misoriented mates
  • Location: /fs/szasmg3/dpuiu/Bacilus_anthracis/Ames/Assembly/2008_0820_AMOScmp-alignmentTrimmed

CA

Output:

         #elem   min     max     mean    median  n50     sum       snps
 scf     59      1021    893362  83396   1970    593436  4920387
 ctg     67      1021    736364  73430   2483    280568  4919826   496
 deg     245     70      141988  2016    671     34515   493924
  • There are many stretched mates; no compressed ones !!!

Ctg 0 cvg regions:

 #id                   ctg_len gc%     start   end     len     cvg      notes
 ctg7180000001099      2637    35.04   439     2637    2198    0        33.97% gc ; blastn: 96% identity to Bacillus cereus ; blastx: alpha/beta hydrolase & protein disulfide isomerase hits
 ctg7180000001288      28653   36.17   19125   19233   108     0        52.29% gc ; blastn: 100% identity to Ames Ancestor & Sterne
 ctg7180000001288      28653   36.17   28636   28653   17      0        aligned to other Bacillus anthracis strains
 ctg7180000001300      156978  35.35   1       161     160     0        46.58% gc ; cloning vector
 ctg7180000001302      70072   34.28   69464   70072   608     0        55.83% gc ; cloning vector

Deg 0 cvg regions:

 All have high GC% (>48.24%); probably cloning vector
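
The GC% screen used here can be sketched in a few lines of Python. This is only an illustration: the 48.0 threshold, the function names, and the toy sequences are assumptions chosen because the genome itself sits near 35.4% GC, so regions far above that stand out as likely vector.

```python
def gc_percent(seq: str) -> float:
    """GC content as a percentage of the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt

def flag_high_gc(regions: dict[str, str], threshold: float = 48.0) -> list[str]:
    """Names of regions whose GC% exceeds the (assumed) threshold."""
    return [name for name, seq in regions.items() if gc_percent(seq) > threshold]

# Hypothetical zero-coverage region sequences.
regions = {"region_a": "GCGCGCAT", "region_b": "ATATATGC"}
suspects = flag_high_gc(regions)
```

A high GC% only suggests vector contamination; a blast search against a vector database is still needed to confirm it.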

Ref breaks:

 NC_003997.3   145564
 NC_003997.3   627742
 NC_003997.3   1151234
 NC_003997.3   2085561
 NC_003997.3   3515380

Ames Ancestor

  • Complete & in AA
 NC_007530.2    5227419 35.38  Bacillus anthracis str. 'Ames Ancestor', complete genome
 NC_007322.2    181677 32.53  Bacillus anthracis str. 'Ames Ancestor' plasmid pXO1, complete sequence
 NC_007323.3    94830  33.04  Bacillus anthracis str. 'Ames Ancestor' plasmid pXO2, complete sequence
  • Downloaded from AA and converted to bank
  • Location: /fs/szasmg3/dpuiu/Bacilus_anthracis/AmesAncestor/CS--AI-293

Sterne

  • Complete but not in AA.
  • NC_005945.1 5228663 bp 35.38% gc
  • Traces are available in TA, but some reads appear to be missing: aligning the reads to the finished genome leaves many 0 cvg regions.
       #elem   min     max     mean    median  n50     sum
 0cvg  53      8       4946    529     272     1508    28017
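
Zero-coverage regions such as the ones counted above can be derived from the read alignment intervals alone. The sketch below assumes 1-based inclusive reference coordinates and a hypothetical list of (start, end) pairs; it is not the tool used for these notes.

```python
def zero_coverage_regions(ref_len: int, alignments: list[tuple[int, int]]) -> list[tuple[int, int, int]]:
    """Return (start, end, length) for every reference stretch covered by no alignment.

    Coordinates are 1-based and inclusive (an assumption of this sketch).
    """
    gaps = []
    next_uncovered = 1  # first reference position not yet covered
    for start, end in sorted(alignments):
        if start > next_uncovered:
            gaps.append((next_uncovered, start - 1, start - next_uncovered))
        next_uncovered = max(next_uncovered, end + 1)
    if next_uncovered <= ref_len:
        gaps.append((next_uncovered, ref_len, ref_len - next_uncovered + 1))
    return gaps

# Toy example: a 100 bp reference with three read alignments.
gaps = zero_coverage_regions(100, [(1, 30), (20, 50), (61, 90)])
```

Feeding the gap lengths into a summary routine gives the same kind of #elem/min/max/mean row as the 0cvg line above.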

AMOScmp

  • version: June 12 2007
  • untrimmed reads => 358 ctg
  • 53 zero cvg regions; the largest is almost 5 kb (4,946 bp)

AMOScmp-alignmentTrimmed

  • reads are trimmed according to alignment coords
       #elem   min     max     mean    median  n50     sum
 ctg   46      736     468060  112792  49000   355370  5188425
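
Trimming a read to its alignment coordinates amounts to keeping only the aligned slice. A minimal Python sketch, assuming 1-based inclusive coordinates on the forward-oriented read; the function name is hypothetical and this is not AMOScmp's own code.

```python
def trim_to_alignment(read_seq: str, aln_start: int, aln_end: int) -> str:
    """Keep only the part of the read that aligned to the reference.

    aln_start and aln_end are 1-based inclusive positions on the read (assumed convention).
    """
    if not (1 <= aln_start <= aln_end <= len(read_seq)):
        raise ValueError("alignment coordinates fall outside the read")
    return read_seq[aln_start - 1:aln_end]

# Toy read: two unaligned bases on each end are clipped away.
trimmed = trim_to_alignment("NNACGTACGTNN", 3, 10)
```

Clipping the unaligned ends removes vector and low-quality tails that would otherwise prevent overlaps, which is consistent with the drop from 358 contigs (untrimmed) to 46 here.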

CA

  • runCA-OBT.pl script
  • version 5.1
  • Output:
 #elem   min     max     mean    median  n50     sum
 ctg     204     1000    468299  26560   1310    189443  5418252
 deg     145     266     32820   1756    804     21949   254596
 ctg+deg 349     266     468299  16255   1146    181331  5672848
 scf     186     1000    671877  29170   1274    294809  5425585
 singleton  2418

Ctg 0 cvg regions:

 All have high GC% (>52.08%); probably cloning vector

Deg 0 cvg regions: actually they align to pXO1

 #id                             len     gc%
 deg7180000001258.1-297          297     31.65
 deg7180000001300.420-6468       6049    32.05
 deg7180000001254.1-28442        28442   33.01
 deg7180000001258.557-1060       504     34.52
 deg7180000001253.1-576          576     35.76
 deg7180000001300.6728-6824      97      36.08
 deg7180000001300.1-160          160     38.12
 ..

minimus2

Input:

                 #elem   min     max     mean    median  n50     sum
 AMOS            46      736     468060  112792  49000   355370  5188425
 CA              349     266     468299  16255   1146    181331  5672848
 AMOS+CA         395     266     468299  27497   1210    215336  10861273
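
The AMOS+CA row can be cross-checked from the two rows above it, since element count, min, max, mean, and sum all combine directly. A small Python sketch of that arithmetic (the dictionary layout is an assumption; median and n50 cannot be recovered from summary rows and are left out):

```python
def combine(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Merge two summary rows; only the columns derivable from summaries are returned."""
    elem = a["elem"] + b["elem"]
    total = a["sum"] + b["sum"]
    return {
        "elem": elem,
        "min": min(a["min"], b["min"]),
        "max": max(a["max"], b["max"]),
        "mean": round(total / elem),
        "sum": total,
    }

# Values copied from the AMOS and CA rows of the minimus2 input table.
amos = {"elem": 46, "min": 736, "max": 468060, "sum": 5188425}
ca = {"elem": 349, "min": 266, "max": 468299, "sum": 5672848}
merged = combine(amos, ca)
```

The result agrees with the AMOS+CA row: 395 elements, min 266, max 468299, mean 27497, sum 10861273.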

Output:

                 #elem   min     max     mean    median  n50     sum
 ctg             47      931     468316  110527  37238   355370  5194753
 singl           282     24      32820   1602    1053    1341    451821
 ctg+singl       329     24      468316  17163   1103    296377  5646574

Vollum

NCBI data:

         #elem   min     max     mean    median  n50     sum
 ctg     52      311     812727  105547  29178   400992  5488459

CA

         #elem   min     max     mean    median  n50     sum
 ctg+deg 57      693     1541457 96154   14641   422289  5480756
 ctg     39      1073    1541457 136843  51912   422289  5336858
 deg     18      693     54711   7994    2244    54711   143898
 scf     25      1440    1593252 214001  94958   676360  5350037
  • No alignment breaks vs NCBI assembly