Cbcb:Pop-Lab:Chris-Report

Revision as of 00:47, 13 June 2009 by Cmhill (talk | contribs) (→‎Tasks)

June 12, 2009

Tasks

Intergenic space and gene ontology work

  • Investigated GO annotation tools that Bo has linked me to.
  • I have been working on the genome of the snail Lottia gigantea.
  • Found the top 10% of intergenic spaces with relative ease; however, there was no GO annotation file for the genes. Most of the genomes of interest will not have a GO annotation file (shocker, I know).
  • BLASTed the sequences using Blast2GO, an all-in-one tool for functional annotation of (novel) sequences and the analysis of annotation data: http://www.blast2go.org/
  • Blast2GO has a pipeline version, so I have been planning a way to take new incomplete genomes and have them run through this annotation pipeline.
  • BLAST took around 8 hours for ~2200 sequences using an online database. I will run another group on CBCB's BLAST.
  • GO annotation using Blast2GO took around 24 hours.
  • This weekend I plan to run HMMs on the approximately 200 sequences with no suitable hits.

Partition

  • Drastically overhauling the backend of my partition program to make it more extendable/robust.
  • Adding XML support to make parsing easier and increase possible functionality.
  • Need to focus on getting James something usable ASAP.

Summer Goals

  • Finish analyzing the intergenic space of the incomplete genomes. Determine future possibilities/publications.
  • Modify/extend partition program to incorporate into James' metagenomic pipeline.
  • Incorporate viral metagenomic data into GeneParser.

Mar 2, 2009

Tasks

Intergenic space and gene ontology work with Cristian

  • Approach
    • Get the gene-spacing information: typically a simple parse of GFF files, if they are available.
    • Get the gene function information: use GO if available.
    • Rank the genes based on 5' spacing size.
    • Take the 10% longest, the 10% shortest, and the middle 20%, and use GeneMerge to find out what they do.
  • Completed genomes
    • Anemone, Nematostella vectensis
    • Frog, Xenopus tropicalis
    • Water flea, Daphnia pulex
  • Incomplete genomes that require BLAST
    • Gastropod Snail, Lottia gigantea
    • Polychaete Worm, Capitella sp
  • Future genomes
    • Leech, Helobdella robusta
    • Flatworm (Planaria), Schmidtea mediterranea
    • Sea Slug, Aplysia californica
    • Snail, Biomphalaria glabrata
    • Slime-mold, Dictyostelium purpureum QSDP1
  • Waiting to hear back from Cristian about BLAST cut-off values for incomplete genomes.
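
The ranking steps in the approach above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual scripts used: the function names (`parse_genes`, `five_prime_spacing`, `select_groups`) are hypothetical, the raw GFF attributes column stands in for a gene identifier, the 5' spacing is taken as the gap to the adjacent gene on the upstream side, and genes with no upstream neighbor on their sequence are skipped. The GO lookup and the GeneMerge step are not shown.

```python
from collections import defaultdict


def parse_genes(gff_lines):
    """Collect 'gene' features per sequence as (start, end, strand, gene_id)."""
    genes = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF comments/directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue
        # Assumption: the whole attributes column is used as the gene id.
        genes[cols[0]].append((int(cols[3]), int(cols[4]), cols[6], cols[8]))
    return genes


def five_prime_spacing(genes):
    """Map gene id -> gap (in bases) to the adjacent gene on its 5' side."""
    spacing = {}
    for feats in genes.values():
        feats = sorted(feats)  # order by start coordinate
        for i, (start, end, strand, gene_id) in enumerate(feats):
            if strand == "+" and i > 0:
                # 5' side of a forward-strand gene is the previous gene.
                spacing[gene_id] = max(0, start - feats[i - 1][1] - 1)
            elif strand == "-" and i + 1 < len(feats):
                # 5' side of a reverse-strand gene is the next gene.
                spacing[gene_id] = max(0, feats[i + 1][0] - end - 1)
    return spacing


def select_groups(spacing):
    """Return the 10% shortest, middle 20%, and 10% longest genes by spacing."""
    ranked = sorted(spacing, key=spacing.get)
    n = len(ranked)
    tenth = n // 10
    mid_start = (n - 2 * tenth) // 2
    return {
        "shortest": ranked[:tenth],
        "middle": ranked[mid_start:mid_start + 2 * tenth],
        "longest": ranked[n - tenth:],
    }
```

The three gene lists returned by `select_groups` would then be the study sets handed to GeneMerge, with the full gene list as the background population.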

Partitioning System

  • Splitting contigs based on "subgroup" information: essentially breaking up an entire assembly into multiple "sub-assemblies", each containing only reads from a single subgroup.
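
As a rough illustration of the splitting idea above (not the partition program itself), the sketch below assumes the assembly is available as a mapping from contig IDs to read IDs and that each read has a known subgroup label. The function name `partition_contigs`, the input layout, and the "unassigned" fallback label are all hypothetical.

```python
from collections import defaultdict


def partition_contigs(contig_reads, read_subgroup):
    """Split each contig's reads by subgroup.

    Returns {subgroup: {contig_id: [read_id, ...]}}, i.e. one
    "sub-assembly" per subgroup containing only that subgroup's reads.
    """
    sub_assemblies = defaultdict(dict)
    for contig_id, reads in contig_reads.items():
        for read_id in reads:
            # Assumption: reads without a label go to an "unassigned" group.
            group = read_subgroup.get(read_id, "unassigned")
            sub_assemblies[group].setdefault(contig_id, []).append(read_id)
    return dict(sub_assemblies)
```

A contig whose reads come from several subgroups appears in each corresponding sub-assembly, restricted to that subgroup's reads.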

Conserved genomic elements in bacteria

  • Update elements based on Adam's changes to Insignia.
  • Find something to write about.

Interesting Stuff