Salmonella: Difference between revisions

From Cbcb
No edit summary
Line 7: Line 7:
   /fs/ftp-cbcb/pub/data/dsommer/
   /fs/ftp-cbcb/pub/data/dsommer/
   /fs/szasmg/Bacteria/Salmonella/
   /fs/szasmg/Bacteria/Salmonella/
   /tmp/B_SPA (host: sycamore)
   /fs/szasmg/Bacteria/Salmonella/S_enterica_paratyphi_A/


== SPA ==
== SPA ==
Line 16: Line 16:


Best assembly:  
Best assembly:  
   B_SPA/edit_dir/B_SPA.fasta.screen.ace.83
   /fs/szasmg/Bacteria/Salmonella/S_enterica_paratyphi_A/edit_dir/B_SPA.fasta.screen.ace.83
File location:
 
  /tmp/B_SPA/edit_dir/B_SPA.fasta.screen.ace.83
Longest contig:  
Longest contig:  
   CO Contig1368 4813926 88824 1869182 C  
   CO Contig1368 4813926 88824 1869182 C  
Line 41: Line 40:
Create mate pair file (Bambus format, tab delimited)
Create mate pair file (Bambus format, tab delimited)
   $ cat B_SPA.mates
   $ cat B_SPA.mates
    library small  2000    4000    .*
    library small  2000    4000    (p).*
    pair    p(.*).b1$       p(.*).g1$
    pair    (p.*)\.b1$     (p.*)\.g1$
   
   
    library medium  4500    5500    .*
    library medium  4500    5500    (oyg).*
    pair    oyg(.*).b1$    oyg(.*).g1$
    pair    (oyg.*).b1$    (oyg.*).g1$
 
   
    library large  35000  45000  .*
    library large  35000  45000  (P_AA).*
    pair    P_AA(.*).b1$    P_AA(.*).g1$
    pair    (P_AA.*).b1$    (P_AA.*).g1$


Rerun conversion utilities:
Rerun conversion utilities:
   $ toAmos -m B_SPA.mates -ace B_SPA.fasta.screen.ace.83 -o B_SPA.afg  
   $ toAmos -m B_SPA.mates -ace B_SPA.fasta.screen.ace.83 -o B_SPA.afg  
   $ bank-transact -b B_SPA.bnk -m B_SPA.afg -c
   $ bank-transact -b B_SPA.bnk -m B_SPA.afg -c

Revision as of 14:54, 22 October 2007

Data

Strain:

 Salmonella enterica subsp. enterica serovar Paratyphi A str. ATCC 9150: B_SPA
 Salmonella typhimurium LT2                                            : B_STM

File locations:

 /fs/ftp-cbcb/pub/data/dsommer/
 /fs/szasmg/Bacteria/Salmonella/
 /fs/szasmg/Bacteria/Salmonella/S_enterica_paratyphi_A/

SPA

  1. Traces:
 All directories: 103971 (unique)
 B_SPA : 102405  (unique) => 1566 missing

Best assembly:

 /fs/szasmg/Bacteria/Salmonella/S_enterica_paratyphi_A/edit_dir/B_SPA.fasta.screen.ace.83

Longest contig:

 CO Contig1368 4813926 88824 1869182 C 
 Note: other Salmonella genomes are also about 4.8 Mbp, so this contig appears to span nearly the whole genome.

The *.b1 and *.g1 reads appear to be mated.

Mate pairs:

 p(.*).[bg]1
 oyg(.*).[bg]1
 P_AA(.*).[bg]1
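
One way to check the mate-pair assumption is to count templates that have both a .b1 and a .g1 read. The sketch below is not part of the original notes; the function name count_mated_templates is hypothetical, and it assumes read names arrive one per line on stdin.

```shell
# Hypothetical helper: count templates with both a forward (.b1) and a
# reverse (.g1) read. Input: one read name per line on stdin.
count_mated_templates() {
  awk '
    {
      template = $1
      if (sub(/\.b1$/, "", template)) {
        fwd[template] = 1          # saw the .b1 read of this template
      } else if (sub(/\.g1$/, "", template)) {
        rev[template] = 1          # saw the .g1 read of this template
      }
    }
    END {
      n = 0
      for (t in fwd) {
        if (t in rev) n++          # both ends present
      }
      print n
    }
  '
}
```

Read names can be pulled from the ACE file, where they are the second field of each RD line, for example: awk '$1 == "RD" { print $2 }' B_SPA.fasta.screen.ace.83 | count_mated_templates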

Estimate lib insert sizes:

 $ toAmos -ace B_SPA.fasta.screen.ace.83 -o B_SPA.afg
 $ grep -c ^rds B_SPA.afg         # check if links were created
 $ more toAmos.error              # check if there were any conversion errors
 $ bank-transact -b B_SPA.bnk -m B_SPA.afg -c
 $ bank2contig B_SPA.bnk > B_SPA.contig
 $ cat B_SPA.contig | grep ^# | grep -v ^## | sort 
 # look at distances between mated reads
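
The distances between mated reads could also be computed rather than read by eye. The sketch below is an illustration only: the function name mate_distances is hypothetical, and it assumes the TIGR-style .contig layout in which contig headers start with "##" and read headers look like "#name(offset) [RC] N bases, CHK checksum. {clr} <start end>". Check this against the actual bank2contig output before relying on it.

```shell
# Hypothetical helper: print "template contig span" for every .b1/.g1 pair
# whose two reads fall in the same contig. Input: a .contig file on stdin.
mate_distances() {
  awk '
    /^##/ { contig = substr($1, 3); next }     # contig header: remember its name
    /^#/ {
      name = substr($1, 2)
      sub(/\(.*/, "", name)                    # drop the "(offset)" suffix
      template = name
      if (sub(/\.[bg]1$/, "", template) == 0) next   # not a .b1/.g1 read
      if (!match($0, /<[0-9]+ [0-9]+>/)) next        # assumed <start end> field
      split(substr($0, RSTART + 1, RLENGTH - 2), pos, " ")
      lo = pos[1] + 0; hi = pos[2] + 0
      if (lo > hi) { tmp = lo; lo = hi; hi = tmp }
      key = contig SUBSEP template
      if (key in first_lo) {
        start = (lo < first_lo[key]) ? lo : first_lo[key]
        end   = (hi > first_hi[key]) ? hi : first_hi[key]
        print template, contig, end - start + 1      # outer span of the pair
      } else {
        first_lo[key] = lo; first_hi[key] = hi
      }
    }
  '
}
```

Usage: mate_distances < B_SPA.contig | sort -k3,3n, then compare the spans for each read-name prefix (p, oyg, P_AA) to choose library size ranges.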

Create mate pair file (Bambus format, tab delimited)

 $ cat B_SPA.mates
    library small   2000    4000    (p).*
    pair    (p.*)\.b1$      (p.*)\.g1$
    
    library medium  4500    5500    (oyg).*
    pair    (oyg.*)\.b1$    (oyg.*)\.g1$
    
    library large   35000   45000   (P_AA).*
    pair    (P_AA.*)\.b1$   (P_AA.*)\.g1$
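
Because the file must be tab delimited, typing it by hand in an editor that expands tabs can silently break it. A minimal sketch that writes the same library and pair records with explicit tabs follows; the function name write_mates is hypothetical. The dots before b1 and g1 are escaped throughout so they match a literal ".", and the blank separator lines are left out.

```shell
# Hypothetical helper: emit the Bambus-style mates records with real tabs.
# Patterns are passed as %s arguments so backslashes are written literally.
write_mates() {
  printf 'library\t%s\t%s\t%s\t%s\n' small 2000 4000 '(p).*'
  printf 'pair\t%s\t%s\n' '(p.*)\.b1$' '(p.*)\.g1$'
  printf 'library\t%s\t%s\t%s\t%s\n' medium 4500 5500 '(oyg).*'
  printf 'pair\t%s\t%s\n' '(oyg.*)\.b1$' '(oyg.*)\.g1$'
  printf 'library\t%s\t%s\t%s\t%s\n' large 35000 45000 '(P_AA).*'
  printf 'pair\t%s\t%s\n' '(P_AA.*)\.b1$' '(P_AA.*)\.g1$'
}
```

Usage: write_mates > B_SPA.mates, then confirm the tabs with cat -A B_SPA.mates (GNU cat shows each tab as ^I).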

Rerun conversion utilities:

 $ toAmos -m B_SPA.mates -ace B_SPA.fasta.screen.ace.83 -o B_SPA.afg 
 $ bank-transact -b B_SPA.bnk -m B_SPA.afg -c