Cbcb:Pop-Lab:Chris-Report
Latest revision as of 00:47, 13 June 2009
June 12, 2009
Tasks
Intergenic space and gene ontology work
- Investigated GO annotation tools that Bo has linked me to.
- I have been working on the genome of the snail Lottia gigantea.
- Found the top 10% of intergenic space with relative ease; however, there was no GO annotation file for the genes. Most of the genomes of interest will not have a GO annotation file (shocker, I know).
- BLASTed the sequences using Blast2GO, which describes itself as an "ALL in ONE tool for functional annotation of (novel) sequences and the analysis of annotation data." http://www.blast2go.org/
- Blast2GO has a pipeline version, so I have been planning a way to take new incomplete genomes and have them run through this annotation pipeline.
- BLAST took around 8 hours for ~2200 sequences using an online database. I will run another group on the CBCB's BLAST.
- GO annotation using Blast2GO took around 24 hours.
- I plan on running HMMs this weekend for the approximately 200 sequences with no suitable hits.
Partition
- Drastically overhauling the backend of my partition program to make it more extensible and robust.
- Adding XML support to make parsing easier and increase possible functionality.
- Need to focus on getting James something usable ASAP.
Summer Goals
- Finish analyzing the intergenic space of the incomplete genomes. Determine future possibilities/publications.
- Modify/extend partition program to incorporate into James' metagenomic pipeline.
- Incorporate viral metagenomic data into GeneParser.
Mar 2, 2009
Tasks
Intergenic space and gene ontology work with Cristian
- Approach
Get the gene-spacing information (typically a simple parse of GFF files, if they are available). Get the gene function information (use GO if available). Rank the genes by 5' spacing size. Take the 10% longest, the 10% shortest, and the middle 20%, and find out what they do with GeneMerge.
- Completed genomes
- Anemone, Nematostella vectensis
- Frog, Xenopus tropicalis
- Water flea, Daphnia pulex
- Incomplete genomes that require blast
- Gastropod Snail, Lottia gigantea
- Polychaete Worm, Capitella sp.
- Future genomes
- Leech, Helobdella robusta
- Flatworm (Planaria), Schmidtea mediterranea
- Sea Slug, Aplysia californica
- Snail, Biomphalaria glabrata
- Slime-mold, Dictyostelium purpureum QSDP1
- Waiting to hear back from Cristian about blast value cut-offs for incomplete genomes.
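The ranking step in the approach above can be sketched in Python. This is only an illustration under stated assumptions, not the code actually used: the function names (`parse_gff_genes`, `five_prime_spacing`, `select_groups`) are hypothetical, the GFF is assumed to be GFF3-style with `gene` features carrying an `ID=` attribute, and 5' spacing is approximated as the gap to the adjacent gene on the upstream side (previous gene for `+` strand, next gene for `-` strand). Nested genes are not handled specially, and overlapping genes are given a spacing of 0.

```python
from collections import defaultdict


def parse_gff_genes(lines):
    """Collect (seqid, start, end, strand, gene_id) for 'gene' features."""
    genes = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "gene":
            continue
        # Assumption: GFF3-style attributes such as "ID=gene1;Name=foo".
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        genes.append((cols[0], int(cols[3]), int(cols[4]), cols[6],
                      attrs.get("ID", "")))
    return genes


def five_prime_spacing(genes):
    """Map gene_id -> gap (bp) to the adjacent gene on the 5' side.

    Genes with no upstream neighbour on their sequence are skipped.
    """
    by_seq = defaultdict(list)
    for gene in genes:
        by_seq[gene[0]].append(gene)

    spacing = {}
    for seq_genes in by_seq.values():
        seq_genes.sort(key=lambda g: g[1])  # order by start coordinate
        for i, (_, start, end, strand, gene_id) in enumerate(seq_genes):
            if strand == "+" and i > 0:
                gap = start - seq_genes[i - 1][2] - 1
            elif strand == "-" and i + 1 < len(seq_genes):
                gap = seq_genes[i + 1][1] - end - 1
            else:
                continue
            spacing[gene_id] = max(gap, 0)  # overlapping genes count as 0
    return spacing


def select_groups(spacing):
    """Return the 10% longest, 10% shortest, and middle 20% of genes."""
    ranked = sorted(spacing, key=spacing.get, reverse=True)  # longest first
    n = len(ranked)
    tenth = n // 10
    fifth = n // 5
    mid_start = (n - fifth) // 2
    return {
        "longest": ranked[:tenth],
        "shortest": ranked[n - tenth:],
        "middle": ranked[mid_start:mid_start + fifth],
    }
```

The three gene lists returned by `select_groups` would then serve as the study sets for the functional analysis, with all ranked genes as the background population.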
Partitioning System
- Splitting contigs based on "subgroup" information: essentially breaking up an entire assembly into multiple "sub-assemblies", each containing only reads from a single subgroup.
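A minimal sketch of the splitting idea described above, assuming the assembly is available as a mapping from contig names to read IDs and that each read's subgroup is known. The function name `partition_contigs`, both input mappings, and the `"unassigned"` fallback label are hypothetical, and this is not the actual partition program.

```python
from collections import defaultdict


def partition_contigs(contig_reads, read_subgroup):
    """Split an assembly into per-subgroup sub-assemblies.

    contig_reads:  dict mapping contig name -> list of read IDs
    read_subgroup: dict mapping read ID -> subgroup label
    Returns a dict: subgroup -> {contig name -> reads from that subgroup}.
    """
    sub_assemblies = defaultdict(dict)
    for contig, reads in contig_reads.items():
        for read in reads:
            # Reads with no known subgroup go into a catch-all bucket.
            group = read_subgroup.get(read, "unassigned")
            sub_assemblies[group].setdefault(contig, []).append(read)
    return dict(sub_assemblies)
```

Each sub-assembly keeps the original contig names, so a contig that mixes reads from several subgroups shows up once per subgroup with only that subgroup's reads.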
Conserved genomic elements in bacteria
- Update elements based on Adam's changes to Insignia.
- Find something to write about.
Interesting Stuff
- New insights into aging based on transcription factors: "Prevailing theory of aging challenged in Stanford worm study"