Megachile rotundata: Difference between revisions

From Cbcb
Revision as of 16:11, 2 September 2010

Data

Original Traces

  • 8 pairs of data files (paired ends)
 cat trace.count | grep _1_ | sed 's/_sequence.txt//' | perl -ane 'print "  ",$F[1],"\t",$F[0]/4,"\t",$F[0]/2,"\n";'
 lib        insert   mates           reads        readLen   ~coverage(500M genome)  reverse  adaptors            comments
 s_2_1_3kbp 3000     21563283        43,126,566   124       11                      ?        circularization
 s_2_1_5kbp 5000/300 36218589        72,437,178   35        5                       yes      ?                   insert size is << 5kbp
 s_2_1_8kbp 8000     198377          396,754      124       0.1                     ?        ?
 s_3_1      475      35548153        71,096,306   124       18
 s_4_1      475      35471044        70,942,088   124       18
 s_5_1      475      35616846        71,233,692   124       18
 s_6_1      475      35303840        70,607,680   124       18
 s_7_1      475      34893313        69,786,626   124       18
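The ~coverage column above appears to be reads × readLen divided by the 500M genome size named in the header. A minimal sketch of that arithmetic follows; the function name `coverage` is made up for illustration, and the 500 Mbp figure is the estimate from the table header, not a measured genome size.

```shell
# Approximate fold coverage = reads * read length / genome size.
# "coverage" is a hypothetical helper, not part of any tool used on this page.
coverage() {
    awk -v reads="$1" -v len="$2" -v genome="$3" \
        'BEGIN { printf "%.1f\n", reads * len / genome }'
}

genome=500000000   # assumed genome size (500M), taken from the table header

# lib, reads, readLen for three of the libraries listed above
while read -r lib reads len; do
    printf '%s\t%s\n' "$lib" "$(coverage "$reads" "$len" "$genome")"
done <<'EOF'
s_2_1_3kbp 43126566 124
s_2_1_5kbp 72437178 35
s_3_1 71096306 124
EOF
```

Rounded to whole numbers, the results agree with the 11, 5 and 18 shown in the table.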

Adaptors

 >circularization
 CGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGA
 >circularization.revcomp
 TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG 
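The two adaptor records above are meant to be reverse complements of each other. A quick sanity check can be sketched with `rev` and `tr`; the helper name `revcomp` is made up for illustration, and the sketch assumes the sequences contain only A, C, G and T.

```shell
# Reverse-complement a DNA string (A/C/G/T only; no IUPAC ambiguity codes).
# "revcomp" is a hypothetical helper for this check.
revcomp() {
    printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

fwd=CGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGA
rc=TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG

if [ "$(revcomp "$fwd")" = "$rc" ]; then
    echo "adaptor sequences are reverse complements"
else
    echo "adaptor sequences do NOT match"
fi
```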

Location

 /fs/szattic-asmg5/Bees/Megachile_rotundata/error_correction/large_libs/s_?_?_?kb.sequence.cor.all.txt
 ftp://ftp.cbcb.umd.edu/pub/data/assembly/Megachile_rotundata/reads/s_?_?_?kb.sequence.cor.all.txt.gz

 /fs/szattic-asmg5/Bees/Megachile_rotundata/frg  # frg files to assemble

Assemblies

  • CA version 6.1 (09/01/2010): /fs/szdevel/dpuiu/SourceForge/wgs-6.1/Linux-amd64/bin/runCA
  • SOAPdenovo version 1.04: /nfshomes/dpuiu/szdevel/SOAPdenovo_Release1.04/

CA noOBT

Gatekeeper

  • ~74X coverage (500M genome)
 LOAD                  STATS          
 7                     libInput       
 7                     libLoaded      
 0                     libErrors      
 5                     libWarnings    
 
 326,236,387           frgLoaded        
 326236387             numRandom      
 326236387             numPacked       

 LibraryName           numActiveFRG  numDelFRG  numMatedFRG  readLength   clearLength  
 GLOBAL                326236387     0          315518526    37451489553  37418130441  
 LegacyUnmatedReads    0             0          0            0            0            
 s_2_3kb               9107424       0          9107424      942165284    910444046    
 s_2_8kb               209336        0          209336       21814418     20787384     
 s_3                   63618839      0          61696784     7343024554   7342819494   
 s_4                   63544688      0          61255960     7291557748   7291478152   
 s_5                   63370860      0          61084368     7271218123   7271051639   
 s_6                   63780887      0          61685156     7359094156   7359012512   
 s_7                   62604353      0          60479498     7222615270   7222537214   
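One way to sanity-check the gatekeeper table is to confirm that the per-library rows add up to the GLOBAL row. The sketch below does this for the numActiveFRG column with awk; `sum_libs` is a made-up helper name, and the input is a trimmed copy of the first two columns of the table above.

```shell
# Sum one column over the per-library rows (every row except GLOBAL).
# "sum_libs" is a hypothetical helper; %.0f avoids integer overflow in some awks.
sum_libs() {
    awk -v col="$1" '$1 != "GLOBAL" { sum += $col } END { printf "%.0f\n", sum }'
}

# LibraryName and numActiveFRG, copied from the gatekeeper table
sum_libs 2 <<'EOF'
GLOBAL 326236387
LegacyUnmatedReads 0
s_2_3kb 9107424
s_2_8kb 209336
s_3 63618839
s_4 63544688
s_5 63370860
s_6 63780887
s_7 62604353
EOF
```

The printed total is 326236387, which matches the GLOBAL numActiveFRG value.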

Meryl

 meryl -Dh -s 0-mercounts/asm-C-ms22-cm0 | more
 Found 30567166217 mers.
 Found 268251409 distinct mers.
 Found 9679077 unique mers.
 Largest mercount is 87908217; 1896 mers are too big for histogram.
 1       9679077 0.0361  0.0003
 2       8374869 0.0673  0.0009
 3       2494762 0.0766  0.0011
 ...
 54      5310305 0.6544  0.1789
 ...
 1047970 1       1.0000  0.6652
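The histogram columns above read as: mer count, number of distinct mers with that count, cumulative fraction of distinct mers, and cumulative fraction of all mers. That reading is inferred from the numbers themselves rather than taken from the meryl documentation. A small awk sketch reproduces the first three rows from the totals reported above; `mer_hist` is a made-up name.

```shell
# Recompute the cumulative columns of a mer-count histogram.
# Input: "<count> <distinct mers with that count>" per line.
# Arguments: total number of distinct mers, total number of mers.
# "mer_hist" is a hypothetical helper, not a meryl command.
mer_hist() {
    awk -v distinct="$1" -v total="$2" '{
        cum_distinct += $2          # distinct mers seen so far
        cum_total    += $1 * $2     # mer occurrences seen so far
        printf "%d\t%d\t%.4f\t%.4f\n", $1, $2, cum_distinct / distinct, cum_total / total
    }'
}

# Totals taken from the "Found ... mers" lines above
mer_hist 268251409 30567166217 <<'EOF'
1 9679077
2 8374869
3 2494762
EOF
```

The output matches the first three histogram rows (0.0361/0.0003, 0.0673/0.0009, 0.0766/0.0011).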

Overlap

  • Job count:
 cat 1-overlapper/ovlopts.pl | grep ^\"h | wc -l
 924
  • Stats
 overlapStore -d asm.ovlStore | awk '{print $1}' | uniq -c | awk '{print $1}' | count.pl | getSummary.pl -i 0 -j 1
 overlapStats -G asm.gkpStore -O asm.ovlStore -o asm

Location

 mulberry:/scratch2/dpuiu/Megachile_rotundata/Assembly/wgs-noOBT

SOAPdenovo (Tanja)

 cat *.ContigIndex | grep -v ^E | grep -v ^i | count.pl -i 1 | getSummary.pl -j 1 -t "contigs"
 cat *.ContigIndex | grep -v ^E | grep -v ^i | count.pl -i 1 | getSummary.pl -j 1 -min 100 -t "contigs(>100bp)"
 grep "^>" *.scaf | getSummary.pl -i 2 -t scaf
  • Stats
 .                    elem       min    q1     q2     q3     max        mean       n50        sum            
 contigs              9742349    31     32     33     37     114832     60.09      44         585430821      
 contigs(>100bp)      177327     100    131    261    1398   114832     1333.68    3897       236496823     # N50 for Bee was 7K
 scaf                 7863       102    903    3272   17692  2338728    37825.70   240706     297423517     # N50 for Bee was 1.17M
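The n50 column is produced by the local getSummary.pl script, whose exact convention is not shown here. As a rough illustration only, the common definition (the length L such that sequences of length at least L contain half of all bases) can be sketched with sort and awk; `n50` is a made-up helper and may differ from getSummary.pl in tie or rounding details.

```shell
# N50 under the common definition: sort lengths in descending order and
# report the length at which the running sum first reaches half the total.
# "n50" is a hypothetical helper, not the getSummary.pl implementation.
n50() {
    sort -rn | awk '
        { len[NR] = $1; total += $1 }
        END {
            for (i = 1; i <= NR; i++) {
                running += len[i]
                if (running >= total / 2) { print len[i]; exit }
            }
        }'
}

# Toy input: one sequence length per line (total 20, half 10)
printf '%s\n' 2 3 4 5 6 | n50   # prints 5
```

On real data the input would be one length per line, for example a length column extracted from the contig or scaffold listing.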

  • Location
 /fs/szattic-asmg5/Bees/Megachile_rotundata/Assembly/assembly5kbForAll

SOAPdenovo (Daniela)

  • Stats
 .                    elem       min    q1     q2     q3     max        mean       n50        sum            
 contigs(all)         6917796    31     32     34     40     121554     70.46      73         487401812      
 contigs(>100bp)      210666     100    124    222    1174   121554     1108.69    3138       233563401
 scaff                25119      351    1896   4444   10914  1102803    11041.00   26876      277338897
  • Location
 mulberry:/scratch2/dpuiu/Megachile_rotundata/Assembly/SOAPdenovo-redo