Cbcb:Pop-Lab:Chris-Report
June 12, 2009
Tasks
Intergenic space and gene ontology work
- Investigated GO annotation tools that Bo has linked me to.
- I have been working on the Snail, Lottia gigantea, genome.
- Found the top 10% of intergenic space with relative ease; however, there was no GO annotation file for the genes. Most of the genomes of interest will not have a GO annotation file (shocker, I know).
- BLAST'd the sequences using Blast2GO, an all-in-one tool for functional annotation of (novel) sequences and analysis of annotation data: http://www.blast2go.org/
- Blast2GO has a pipeline version, so I have been planning a way to take new incomplete genomes and have them run through this annotation pipeline.
- BLAST took around 8 hours for ~2200 sequences using an online database. I will run another group on the CBCB's BLAST.
- GO annotation using Blast2GO took around 24 hours.
- This weekend I plan to run HMMs on the approximately 200 sequences with no suitable hits.
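A rough sketch of how the sequences with no suitable hits could be pulled out for the HMM step. It assumes the BLAST results are available as standard 12-column tabular output (query ID in column 1, e-value in column 11); the function name and the `1e-5` cut-off are placeholders of mine, not values from this report, since the real cut-off has not been decided.

```python
def queries_without_hits(query_ids, blast_tab_lines, max_evalue=1e-5):
    """Return the query IDs that have no hit at or below max_evalue.

    blast_tab_lines: iterable of lines in 12-column BLAST tabular format.
    max_evalue: hypothetical cut-off, to be replaced by the agreed value.
    """
    with_hit = set()
    for line in blast_tab_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 12:
            continue  # not a complete tabular record
        if float(cols[10]) <= max_evalue:
            with_hit.add(cols[0])
    # Preserve the original query order in the result.
    return [q for q in query_ids if q not in with_hit]
```

The returned list can then be used to write out a FASTA subset for the HMM searches.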
Partition
- Drastically overhauling the backend of my partition program to make it more extendable/robust.
- Adding XML support to make parsing easier and increase possible functionality.
- Need to focus on getting James something usable asap.
Summer Goals
- Finish analyzing the intergenic space of the incomplete genomes. Determine future possibilities/publications.
- Modify/extend partition program to incorporate into James' metagenomic pipeline.
- Incorporate viral metagenomic data into GeneParser.
Mar 2, 2009
Tasks
Intergenic space and gene ontology work with Cristian
- Approach
- Get the gene-spacing information: typically a simple parse of GFF files, if they are available.
- Get the gene-function information: use GO if available.
- Rank the genes by 5' spacing size.
- Take the 10% longest, 10% shortest, and middle 20%, and find out what they do with GeneMerge.
- Completed genomes
- Anemone, Nematostella vectensis
- Frog, Xenopus tropicalis
- Water flea, Daphnia pulex
- Incomplete genomes that require blast
- Gastropod Snail, Lottia gigantea
- Polychaete Worm, Capitella sp
- Future genomes
- Leech, Helobdella robusta
- Flatworm (Planaria), Schmidtea mediterranea
- Sea Slug, Aplysia californica
- Snail, Biomphalaria glabrata
- Slime-mold, Dictyostelium purpureum QSDP1
- Waiting to hear back from Cristian about blast value cut-offs for incomplete genomes.
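The spacing and ranking steps of the approach can be sketched roughly as below. This is a minimal illustration, not the actual scripts: it assumes 9-column GFF lines with `gene` features, treats the 5' spacing as the gap to the nearest neighbouring gene on the upstream side (ignoring nested or overlapping genes), and uses the raw attribute column as the gene name. All function and class names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Gene:
    name: str
    seqid: str
    start: int   # 1-based, inclusive (GFF convention)
    end: int
    strand: str  # "+" or "-"

def parse_gff_genes(lines):
    """Collect 'gene' features from 9-column GFF lines."""
    genes = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue
        genes.append(Gene(cols[8], cols[0], int(cols[3]), int(cols[4]), cols[6]))
    return genes

def five_prime_spacing(genes):
    """Map gene name -> gap (bp) between its 5' end and the adjacent gene."""
    by_seq = {}
    for g in genes:
        by_seq.setdefault(g.seqid, []).append(g)
    spacing = {}
    for seq_genes in by_seq.values():
        seq_genes.sort(key=lambda g: g.start)
        for i, g in enumerate(seq_genes):
            if g.strand == "+" and i > 0:
                gap = g.start - seq_genes[i - 1].end - 1
            elif g.strand == "-" and i + 1 < len(seq_genes):
                gap = seq_genes[i + 1].start - g.end - 1
            else:
                continue  # no upstream neighbour on this sequence
            spacing[g.name] = max(gap, 0)  # overlapping genes count as 0
    return spacing

def select_groups(spacing, tail=0.10, middle=0.20):
    """Rank by spacing and take the longest 10%, shortest 10%, middle 20%."""
    ranked = sorted(spacing, key=spacing.get, reverse=True)
    n = len(ranked)
    k = int(n * tail)
    m = int(n * middle)
    mid_start = (n - m) // 2
    return {
        "longest": ranked[:k],
        "shortest": ranked[n - k:],
        "middle": ranked[mid_start:mid_start + m],
    }
```

Each of the three gene lists would then be handed to GeneMerge as a study set against the full gene population.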
Partitioning System
- Splitting contigs based on "subgroup" information: essentially breaking up an entire assembly into multiple "sub-assemblies", each containing just the reads from a single subgroup.
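A minimal sketch of the splitting idea, assuming the assembly is available as a mapping from contig ID to read IDs and that each read carries a subgroup label. The data structures, the function name, and the "unassigned" fallback are illustrative assumptions, not the partition program's actual design.

```python
from collections import defaultdict

def split_by_subgroup(contig_reads, read_subgroup):
    """Split an assembly into per-subgroup sub-assemblies.

    contig_reads: dict of contig ID -> list of read IDs in that contig.
    read_subgroup: dict of read ID -> subgroup label.
    Returns: dict of subgroup -> {contig ID -> reads from that subgroup}.
    """
    sub_assemblies = defaultdict(lambda: defaultdict(list))
    for contig, reads in contig_reads.items():
        for read in reads:
            # Reads with no known subgroup go into a catch-all bucket.
            group = read_subgroup.get(read, "unassigned")
            sub_assemblies[group][contig].append(read)
    # Convert nested defaultdicts to plain dicts for easier downstream use.
    return {g: dict(contigs) for g, contigs in sub_assemblies.items()}
```

A contig whose reads come from several subgroups ends up represented in each of the corresponding sub-assemblies, with only that subgroup's reads.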
Conserved genomic elements in bacteria
- Update elements based on Adam's changes to Insignia.
- Find something to write about.
Interesting Stuff
- New insights into aging based on transcription factors: "Prevailing theory of aging challenged in Stanford worm study"