Cbcb:Pop-Lab:Chris-Report
Latest revision as of 00:47, 13 June 2009
June 12, 2009
Tasks
Intergenic space and gene ontology work
- Investigated GO annotation tools that Bo has linked me to.
- I have been working on the snail (Lottia gigantea) genome.
- Identified the top 10% of intergenic spaces with relative ease; however, there was no GO annotation file for the genes. Most of the genomes of interest will not have a GO annotation file (shocker, I know).
- BLASTed the sequences using Blast2GO, an "all-in-one tool for functional annotation of (novel) sequences and the analysis of annotation data" (http://www.blast2go.org/).
- Blast2GO has a pipeline version, so I have been planning a way to take new incomplete genomes and have them run through this annotation pipeline.
- BLAST took around 8 hours for ~2200 sequences using an online database. I will run another group on the CBCB's BLAST.
- GO annotation using Blast2GO took around 24 hours.
- This weekend I plan on running HMMs for the approximately 200 sequences with no suitable hits.
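The "no suitable hits" set mentioned above could be pulled out of a BLAST result file along these lines. This is only a minimal sketch: it assumes the hits are available as standard 12-column tabular BLAST output (query ID in column 1, E-value in column 11), which may not match what Blast2GO exports. The function name `queries_without_hits` and the `max_evalue=1e-5` threshold are hypothetical; the report does not state which cut-off was used.

```python
def queries_without_hits(all_query_ids, blast_tab_lines, max_evalue=1e-5):
    """Return the query IDs that have no hit at or below max_evalue.

    Assumes standard 12-column tabular BLAST output:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    The default threshold is an illustrative assumption, not the report's value.
    """
    hit_ids = set()
    for line in blast_tab_lines:
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 12:
            continue  # not a complete tabular record
        if float(cols[10]) <= max_evalue:
            hit_ids.add(cols[0])
    # Preserve the input order of the queries.
    return [q for q in all_query_ids if q not in hit_ids]
```

The returned IDs would then be the candidates for the HMM search.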
Partition
- Drastically overhauling the backend of my partition program to make it more extensible and robust.
- Adding XML support to make parsing easier and increase possible functionality.
- Need to focus on getting James something usable ASAP.
Summer Goals
- Finish analyzing the intergenic space of the incomplete genomes. Determine future possibilities/publications.
- Modify/extend partition program to incorporate into James' metagenomic pipeline.
- Incorporate viral metagenomic data into GeneParser.
Mar 2, 2009
Tasks
Intergenic space and gene ontology work with Cristian
- Approach
  - Get the gene-spacing information: typically a simple parse of GFF files, if they are available.
  - Get the gene function information: use GO if available.
  - Rank the genes based on 5' spacing size.
  - Take the 10% longest, the 10% shortest, and the middle 20%, and find out what they do with GeneMerge.
- Completed genomes
  - Anemone, Nematostella vectensis
  - Frog, Xenopus tropicalis
  - Water flea, Daphnia pulex
- Incomplete genomes that require BLAST
  - Gastropod snail, Lottia gigantea
  - Polychaete worm, Capitella sp.
- Future genomes
  - Leech, Helobdella robusta
  - Flatworm (Planaria), Schmidtea mediterranea
  - Sea slug, Aplysia californica
  - Snail, Biomphalaria glabrata
  - Slime mold, Dictyostelium purpureum QSDP1
- Waiting to hear back from Cristian about BLAST cut-off values for incomplete genomes.
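The approach described above (parse GFF, rank genes by 5' spacing, take the extremes and the middle) could be sketched roughly as follows. This is an illustrative sketch, not the actual script used for this work. It assumes tab-separated GFF with `gene` features and an `ID=` attribute, measures 5' spacing as the gap to the neighbouring gene on the upstream side (ignoring strand of the neighbour), and treats neighbours in start order, which is a simplification when genes overlap or nest. All function names are hypothetical.

```python
from collections import defaultdict


def gene_id(attrs):
    """Extract the ID= value from a GFF attribute column (falls back to the raw column)."""
    for field in attrs.split(";"):
        if field.startswith("ID="):
            return field[3:]
    return attrs


def parse_genes(gff_lines):
    """Collect (seqid, start, end, strand, id) for every 'gene' feature."""
    genes = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue
        genes.append((cols[0], int(cols[3]), int(cols[4]), cols[6], gene_id(cols[8])))
    return genes


def five_prime_spacing(genes):
    """Map gene id -> gap (bp) to the adjacent gene on its 5' side.

    Genes with no 5' neighbour on their sequence are skipped.
    """
    by_seq = defaultdict(list)
    for g in genes:
        by_seq[g[0]].append(g)
    spacing = {}
    for seq_genes in by_seq.values():
        seq_genes.sort(key=lambda g: g[1])  # order by start coordinate
        for i, (_, start, end, strand, gid) in enumerate(seq_genes):
            if strand == "+" and i > 0:
                # 5' side of a forward gene is the previous gene.
                spacing[gid] = max(0, start - seq_genes[i - 1][2] - 1)
            elif strand == "-" and i + 1 < len(seq_genes):
                # 5' side of a reverse gene is the next gene.
                spacing[gid] = max(0, seq_genes[i + 1][1] - end - 1)
    return spacing


def select_groups(spacing):
    """Return the 10% shortest, middle 20%, and 10% longest genes by spacing."""
    ranked = sorted(spacing, key=spacing.get)  # shortest spacing first
    n = len(ranked)
    tenth = n // 10
    mid_lo = (n - 2 * tenth) // 2
    return {
        "shortest": ranked[:tenth],
        "middle": ranked[mid_lo:mid_lo + 2 * tenth],
        "longest": ranked[n - tenth:],
    }
```

Each of the three gene lists would then be passed, together with the GO annotations, to GeneMerge as a separate study set.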
Partitioning System
- Splitting contigs based on "subgroup" information: essentially, breaking up an entire assembly into multiple "sub-assemblies", each containing just the reads from a single subgroup.
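The splitting idea can be illustrated with a small sketch. It is not the partition program itself: it assumes the assembly is already available as a mapping from contig ID to read IDs and that each read has a known subgroup label. The function name, the data layout, and the "unassigned" label are all hypothetical. A real implementation would also need to recompute each sub-contig's layout and consensus, which is omitted here.

```python
from collections import defaultdict


def split_by_subgroup(contigs, read_subgroup):
    """Split an assembly into per-subgroup sub-assemblies.

    contigs:        dict of contig id -> list of read ids
    read_subgroup:  dict of read id -> subgroup label
    Returns a dict of subgroup -> {contig id -> reads from that subgroup only}.
    Reads without a known subgroup go to the hypothetical "unassigned" group.
    """
    sub_assemblies = defaultdict(dict)
    for contig_id, reads in contigs.items():
        for read in reads:
            group = read_subgroup.get(read, "unassigned")
            sub_assemblies[group].setdefault(contig_id, []).append(read)
    return dict(sub_assemblies)
```

A contig whose reads come from two subgroups therefore appears in two sub-assemblies, each time with only that subgroup's reads.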
Conserved genomic elements in bacteria
- Update elements based on Adam's changes to Insignia.
- Find something to write about.
Interesting Stuff
- New insights into aging based on transcription factors: "Prevailing theory of aging challenged in Stanford worm study" (http://med.stanford.edu/news_releases/2008/july/aging-worm.html)