

  • Create a "group" object for sub-clustering sequences, contigs, scaffolds, etc.
  • Remove the iid concept: keep the bid and let message files use eids for cross-referencing, which will allow files to be merged
  • Remove the layout objects, which are equivalent to contigs without a sequence
  • Add a new way to store sequence data, targeted at short sequences for which storing names and similar details is too verbose
  • Figure out a way to handle 454 mate-pairs, especially in Hawkeye
  • Build a better 'toAmos'

Note: some progress in these directions has been made but the changes need to be cleaned up.


  • Integrate other aligners. Daniela has a preliminary pipeline that works with SOAP, but there is no reason not to support Bowtie as well. One solution would be to accept input in BAM format.

Bambus 2

  • Handle scaffolds better than is currently done (the code currently uses "contigs" in place of scaffolds)
  • Write good documentation and put together pipelines for various types of data and underlying assemblers