Francisella tularensis

From Cbcb



Revision as of 19:49, 8 August 2007

Data sources:

NCBI:

Broad:

Baylor:

Type A:

FSC033 : Broad assembly (15 contigs)

Assemblies location:

 /fs/szasmg/Bacteria/F_tularensis_tularensis_FSC033/

Type B:

OSU18 : BCM complete

Final assembly steps:

 1. The complete genome sequence was downloaded from NCBI: NC_008369.1
 2. Reads were downloaded from TA and formatted using tarchive2ca 
 3. There are 2 Sanger libraries for this project 
      BFTBP: #reads=58051 , insert_mean=2000, insert_stdev=666 
      BFTDP: #reads=10409 , insert_mean=2000, insert_stdev=666 
 4. The reads have been retrimmed using veraTrim (-T 10 -M 100 -E 500)
 5. runCA-OBT.pl has been used to assemble all the reads 
    location: 2007_0724_WGA-default/ 
    => 160 scaffolds, 163 contigs, 23X coverage
 6. The library sizes were updated using the WGA estimates
      BFTBP: insert_mean=2690.042, insert_stdev=643.126
      BFTDP: insert_mean=3675.914, insert_stdev=1225
 7. The WGA was aligned to the reference using nucmer; one rearrangement, one  
    deletion and several SNPs were noticed
 8. The reads were assembled using AMOScmp (default parameters) 
    location: 2007_0724_AMOSCMP-default/
    => 1 scaffold, 22 contigs
    2 misoriented read pile regions were noticed
 9. The assembly was aligned to itself; 950 bp inverted repeats were identified as 
    flanking the problem regions; the coordinates are:
      16336-21562   (5  KB)
      167086-184936 (17 KB)
 10. The 2 regions were flipped; the new reference is called NC_008369.2
 11. The clear ranges of reads in several small contigs (step 8) were extended to their OBT 
     trimming points
 12. AMOScmp was rerun using more relaxed parameters: 
       nucmer      MINCLUSTER=30 
       casm-layout MAXTRIM=50
    location: 2007_0731_AMOSCMP-veraTrim-updateDst-relaxed-updateClr-fixRef2->best
    => 1 scaffold, 8 contigs 
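
The region flip in steps 9 and 10 can be sketched as below. This is only an illustration, not the script that was actually used: it assumes the listed coordinates are 1-based and inclusive, and that "flipped" means each interval was replaced by its reverse complement. The names `flip_regions` and `REGIONS` are hypothetical.

```python
# Minimal sketch: reverse-complement selected intervals of a reference sequence.
# Assumption: coordinates are 1-based, inclusive, and non-overlapping.

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

# Coordinates from step 9 (interpreted as 1-based inclusive; an assumption).
REGIONS = [(16336, 21562), (167086, 184936)]


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def region_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def flip_regions(seq: str, regions) -> str:
    """Replace each (start, end) interval with its reverse complement.

    The sequence length is unchanged, so the intervals can be applied
    one after another as long as they do not overlap.
    """
    out = seq
    for start, end in regions:
        if not (1 <= start <= end <= len(seq)):
            raise ValueError(f"region {start}-{end} is outside the sequence")
        flipped = reverse_complement(out[start - 1:end])
        out = out[:start - 1] + flipped + out[end:]
    return out
```

Under the inclusive interpretation the two regions span 5227 bp and 17851 bp. Applying `flip_regions` twice with the same regions returns the original sequence, which gives a quick sanity check on the coordinates.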
 

Final assembly location:

 /fs/szasmg/Bacteria/F_tularensis_holarctica_OSU18/best

Assembly locations:

 /fs/szasmg/Bacteria/F_tularensis_holarctica_OSU18/

Novicida

Mediasiatica