Assembly merge: Difference between revisions

From Cbcb

Revision as of 13:46, 28 March 2008

Cases

No reference assembly

One data set, multiple denovo assemblers

Example:

 * Solexa data
 * edena & velvet assemblers

Solutions:

 * merge 2 assembly sets
 * run minimus on them
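
The merge step above can be sketched as follows: the two contig sets are concatenated into one FASTA file, with each record ID prefixed by its assembler name so that IDs from the two sets cannot collide. This is a minimal sketch assuming plain FASTA input. The function name, prefixes, and file names are hypothetical and not taken from this page, and converting the merged file into minimus input is not shown.

```python
def merge_contig_sets(named_paths, out_path):
    """Concatenate FASTA files, prefixing each record ID with its set name.

    named_paths: list of (set_name, fasta_path) pairs, e.g. one per assembler.
    Returns the number of sequences written.
    """
    count = 0
    with open(out_path, "w") as out:
        for set_name, path in named_paths:
            with open(path) as handle:
                for line in handle:
                    # Guard against a final line that lacks a trailing newline.
                    if not line.endswith("\n"):
                        line += "\n"
                    if line.startswith(">"):
                        # Prefix the ID so contigs from different sets stay distinct.
                        out.write(">" + set_name + "_" + line[1:])
                        count += 1
                    else:
                        out.write(line)
    return count
```

A call might look like `merge_contig_sets([("edena", "edena.contigs.fasta"), ("velvet", "velvet.contigs.fasta")], "merged.fasta")`, where the file names are placeholders.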

Multiple data sets, one (or multiple) denovo assemblers

Example:

 * Solexa & 454 data
 * velvet assembler for each set

One reference assembly

Multiple reference assemblies


Examples

Pseudomonas_syringae

Reference:

 Name           Length  %GC
 NC_004578.1    6397126 58.40
 NC_004633.1    73661   55.15
 NC_004632.1    67473   56.17

Repeats:

 desc    #repeats   min     max     mean    stdev    sum
 50bp+   991        50      7362    393.73  792.41   390192
 100bp+  429        100     7362    815.36  1060.29  349793

Data:

 Type            #reads       min     max     mean
 Solexa          6340136      32      32      32
 Sim(ulated)     6538167      32      32      32
 454             77466        35      371     240

Single assemblies:

 Assembler   type         input-data  #reads         #ctgs   min     max     mean      stdev     ctgs-sum      #singl       
 edena       denovo       Solexa      6340136        14084   100     5075    210.92    145.68    2970720       4893301(77%)
 velvet      denovo       Solexa      6340136        25161   45      5057    241.83    212.61    6084887
 edena-sim   denovo       Sim         6538167        2068    100     47881   2994.03   4857.76   6191673       198699(3%)
 velvet-sim  denovo       Sim         6538167        2207    45      56810   2820.91   5348.36   6225757       123591(2%)
 AMOScmp     comparative  Solexa      6340136        187     20      577929  34863.06  91692.34  6519394       698638(11%)
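
The length columns in the table above (#ctgs, min, max, mean, stdev, ctgs-sum) can be recomputed from a contig FASTA file along the following lines. This is a sketch, not the script used for this page: the function names are hypothetical, and the use of the population standard deviation is an assumption, since the page does not say which variant was used.

```python
import statistics

def fasta_lengths(path):
    """Return the sequence length of every record in a FASTA file."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths

def summarize(lengths):
    """Summarize contig lengths using the column names of the table."""
    return {
        "#ctgs": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": round(statistics.mean(lengths), 2),
        # Assumption: population standard deviation; the page does not specify.
        "stdev": round(statistics.pstdev(lengths), 2),
        "ctgs-sum": sum(lengths),
    }
```

For example, `summarize(fasta_lengths("velvet.contigs.fasta"))` would produce one row of values, with the file name being a placeholder.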

Merged assemblies (contigs & singletons):

 assemblers     type         input-data                #reads  #ctgs   min     max     mean     stdev   ctgs-sum     comments    
 minimus(ovl40) denovo       edena+velvet(contigs)     39245   23644   45      6688    257.15   232.94  6080063      very few 40bp overlaps are found
 minimus(ovl20) denovo       edena+velvet(contigs)     39245   18603   45      6688    322.32   311.02  5996244
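
The #reads value in the merged rows appears to be the number of input sequences: it equals the edena and velvet contig counts from the Solexa rows of the single assemblies table added together. The short check below uses only numbers from the tables on this page and shows how much each overlap setting reduces the sequence count. The variable names are illustrative.

```python
edena_ctgs = 14084   # edena contigs from Solexa reads (single assemblies table)
velvet_ctgs = 25161  # velvet contigs from Solexa reads (single assemblies table)

# Input to the merge: both contig sets together.
merge_input = edena_ctgs + velvet_ctgs
assert merge_input == 39245  # matches the #reads column of the merged rows

# Output sequence counts (#ctgs column) for the two minimum-overlap settings.
merged = {"minimus(ovl40)": 23644, "minimus(ovl20)": 18603}

for name, n_out in merged.items():
    reduction = 100.0 * (merge_input - n_out) / merge_input
    print(f"{name}: {merge_input} -> {n_out} sequences ({reduction:.1f}% fewer)")
```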