Clostridium botulinum: Difference between revisions
		
		
		
| Line 52: | Line 52: | ||
| == Assembly  == | == Assembly  == | ||
| '''Location:''' | |||
|   /fs/szasmg/Bacteria/C_botulinum | |||
| '''2007_0725_WGA''' | '''2007_0725_WGA''' | ||
|      on the 63,115 Sanger reads | |||
|      runCA-OBT.pl (default params)   |      runCA-OBT.pl (default params)   | ||
|      location: 2007_0725_WGA |      location: 2007_0725_WGA | ||
|      => 109 scaffolds, 243 contigs |      => 109 scaffolds, 243 contigs, 3,823,075 bp | ||
|      => library inser estimates mean=1840.917 stdev=866.039 |      => library inser estimates mean=1840.917 stdev=866.039 | ||
| '''2007_0801_AMOScmp-relaxed''' | '''2007_0801_AMOScmp-relaxed''' | ||
|    on the 63,115 Sanger reads | |||
|     MINCLUSTER=30 , MAXTRIM=50 |     MINCLUSTER=30 , MAXTRIM=50 | ||
|     => 2 scaffolds, 148 contigs |     => 2 scaffolds, 148 contigs, 3,883,789 bp | ||
|    [[Media:CB.2007_0801_AMOScmp-relaxed.qc|CB.qc]] |    [[Media:CB.2007_0801_AMOScmp-relaxed.qc|CB.qc]] | ||
|    [[Media:CB.2007_0801_AMOScmp-relaxed.chromo.png|CB.chromo.png]] |    [[Media:CB.2007_0801_AMOScmp-relaxed.chromo.png|CB.chromo.png]] | ||
| Line 68: | Line 72: | ||
|    [[Media:CB-scaff.2007_0801_AMOScmp-relaxed.png|CB-scaff.png]] |    [[Media:CB-scaff.2007_0801_AMOScmp-relaxed.png|CB-scaff.png]] | ||
| ''' | '''2007_0830_AMOScmp-relaxed''' | ||
| '''2007_0830_AMOScmp-relaxed''' | |||
Revision as of 16:44, 4 September 2007
Data sources
Sanger:
Hall strain A (ATCC 3502)
  chromosome: 3,886,916 bp, 28.24% GC
  plasmid: 16,344 bp, 26.80% GC
Mummerplot: Complete Genome vs Complete Genome
63,115 Sanger reads
 Read problems:
   no quality       : default 20 assigned to all the bases
   no mate pairing  : can be inferred from names (.p1c, .q1c => 27,331 mates); however there seem to be many errors (links from chromosome to the plasmid)
   no library info  : assumed there was only one library used
   no trimming info : almost all reads have "CONTAINED" alignments to the reference
                      CLR=1,len(read)
   there are 124 regions in the reference which are not covered by reads
   17K reads missing from Sanger ftp
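The name-based mate inference described above could be sketched roughly as follows. This is only an illustration: it assumes the text before the `.p1c` / `.q1c` suffix identifies the clone, and the read names in the example are hypothetical, not taken from the data set. It does nothing about the suspicious chromosome-to-plasmid links.

```python
from collections import defaultdict

# Assumption: ".p1c" and ".q1c" mark the two ends of one clone, and the
# part of the name before the suffix is the clone identifier.
MATE_SUFFIXES = {".p1c": "forward", ".q1c": "reverse"}


def infer_mates(read_names):
    """Group reads by clone name and return (forward, reverse) pairs."""
    clones = defaultdict(dict)
    for name in read_names:
        for suffix, direction in MATE_SUFFIXES.items():
            if name.endswith(suffix):
                clone = name[: -len(suffix)]
                # Keep the first read seen for each direction of a clone.
                clones[clone].setdefault(direction, name)
                break
    return [
        (ends["forward"], ends["reverse"])
        for _, ends in sorted(clones.items())
        if "forward" in ends and "reverse" in ends
    ]


# Hypothetical read names, for illustration only.
example = ["Cbot12a01.p1c", "Cbot12a01.q1c", "Cbot12a02.p1c", "CBOTC01.w"]
pairs = infer_mates(example)
print(pairs)
```

Reads with only one end present, or with other suffixes, are simply left unpaired here.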
78,975 Sanger reads
 Cbot[1-9]*.[pq][12]   68028   #article: insert sizes of 1.5–2.0 kb, 2.0–2.2 kb, 2.2–2.5 kb, and 2.5–4.0 kb
 CbBAC1*.s1c             305
 CbBAC4*.[pq]1c          430
 CbBAC7*.[spq]1c         474
 Cbot_ends*.[pq]1c      1604   #article: 19 kb inserts (2 kb stdev); based on nucmer alignments: 9 kb inserts (2 kb stdev)
 CBOT[1-9]*.[pqw]        509   #415 primer walks
 CBOTC                   166   #all primer walks
 J*.[pqs]               7459
 Total                 78975
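A tally like the one above could be reproduced by matching read names against the patterns. The sketch below is an assumption about how the patterns are meant to be read: each one is treated as a case-sensitive shell-style glob matched as a prefix (a trailing `*` is appended), and the first matching pattern wins. The read names passed to `classify` are hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns and counts copied from the tally above.
TALLY = {
    "Cbot[1-9]*.[pq][12]": 68028,
    "CbBAC1*.s1c": 305,
    "CbBAC4*.[pq]1c": 430,
    "CbBAC7*.[spq]1c": 474,
    "Cbot_ends*.[pq]1c": 1604,
    "CBOT[1-9]*.[pqw]": 509,
    "CBOTC": 166,
    "J*.[pqs]": 7459,
}


def classify(read_name):
    """Return the first pattern the read name matches, or None."""
    for pattern in TALLY:
        # Assumption: patterns are prefixes, so allow any trailing text.
        if fnmatchcase(read_name, pattern + "*"):
            return pattern
    return None


# The per-pattern counts add up to the 78,975 reads in the heading.
total_reads = sum(TALLY.values())
print(total_reads)

# Hypothetical read names, for illustration only.
for name in ["Cbot12a01.p1c", "Cbot_ends01a.q1c", "CBOTC05.w"]:
    print(name, "->", classify(name))
```

`fnmatchcase` is used instead of `fnmatch` because the `Cbot` and `CBOT` groups differ only in case.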
NCBI:
Reads have not been submitted to the Trace Archive (TA)
The initial genome assembly was obtained from:
- 69,632 paired-end sequences (giving 9.15-fold coverage) derived from four genomic shotgun libraries (all in pUC18, with insert sizes of 1.5–2.0 kb, 2.0–2.2 kb, 2.2–2.5 kb, and 2.5–4.0 kb) using dye terminator chemistry on ABI3700 automated sequencers;
- 1,604 paired-end sequences from one pBACe3.6 library with insert sizes of 15–23 kb (a clone coverage of 3.9-fold), used as a scaffold;
- 9,343 directed sequencing reads were generated during finishing.
(Total 80,579 reads => 17,464 missing from ftp site)
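The read accounting in the parenthetical can be checked with a few lines of arithmetic. The variable names are illustrative; the numbers are the ones quoted above.

```python
# Read counts quoted from the article summary above.
shotgun_reads = 69_632      # paired-end shotgun sequences
bac_end_reads = 1_604       # pBACe3.6 paired-end sequences
finishing_reads = 9_343     # directed finishing reads

# Reads actually obtained from the Sanger ftp site.
downloaded_reads = 63_115

published_total = shotgun_reads + bac_end_reads + finishing_reads
missing_reads = published_total - downloaded_reads

print(published_total)  # 80579
print(missing_reads)    # 17464
```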
Assembly
Location:
/fs/szasmg/Bacteria/C_botulinum
2007_0725_WGA
 on the 63,115 Sanger reads
 runCA-OBT.pl (default params)
 location: 2007_0725_WGA
 => 109 scaffolds, 243 contigs, 3,823,075 bp
 => library insert estimates mean=1840.917 stdev=866.039
2007_0801_AMOScmp-relaxed
 on the 63,115 Sanger reads
 MINCLUSTER=30 , MAXTRIM=50
 => 2 scaffolds, 148 contigs, 3,883,789 bp
 CB.qc CB.chromo.png CB.plasmid.png CB-scaff.png
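As a rough sanity check, the total assembled lengths can be compared with the finished reference (chromosome plus plasmid, from the data sources section). This is a sketch under the assumption that total bp is a fair proxy for completeness; it ignores misassemblies, overlaps between contigs, and the uncovered regions noted earlier.

```python
# Finished reference lengths from the data sources section.
REFERENCE_BP = 3_886_916 + 16_344  # chromosome + plasmid

# Total assembled bp reported for each run.
ASSEMBLIES = {
    "2007_0725_WGA": 3_823_075,
    "2007_0801_AMOScmp-relaxed": 3_883_789,
}

# Fraction of the reference length represented by each assembly.
fractions = {name: bp / REFERENCE_BP for name, bp in ASSEMBLIES.items()}

for name, frac in fractions.items():
    print(f"{name}: {frac:.2%} of reference length")
```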
2007_0830_AMOScmp-relaxed