Cbcb:Pop-Lab:Ted-Report: Difference between revisions
Revision as of 19:09, 13 June 2009
Summer 2009 Goals
- Research traditional approaches taken to gene-level analysis of metagenomic data
- Critically evaluate the traditional approaches in general and in the context of current Pop Lab projects
- Identify portions of the analysis that can be automated
- Develop scalable tools to do the automated analysis
May 15, 2009
- Read VisANT paper and user manual[1]. Determined that VisANT will work for manual metabolic pathway analysis of even large-scale data sets and can be automated by running it in "Batch Mode".
- Need to read about FastHMM[2]
- Still need to make "Welcome Wiki" for n00bs (read: new members)
May 22, 2009
- Made Welcome Wiki
- Read metagenomics papers
- Determined that VisANT can be used with Bo's data by importing it as MicroArray data
May 29, 2009
- Took an early Summer vacation last weekend:
  - Drove to NC to see a friend graduate with BS degrees in CS and Physics
  - Went sailing for the first time at my girlfriend's parents' place in VA
- Refined Welcome Wiki
- Read metagenomics/pathway reconstruction/analysis papers
- Organized reading group for Palsson Systems Bio book
June 5, 2009
- Read metagenomics/pathway reconstruction/analysis papers and first two chapters of Palsson book.
- Currently building test set for incorporation of Phymm into metagenomics pipeline.
  - A single archaeal genome was chosen from the findings of Mihai's 2006 Science paper analyzing the human distal gut.
  - Two pairs of bacterial genomes were chosen for the test set using columns on the NCBI RefSeq Complete Genomes website[3]:
    1. The pairs of bacterial genomes were taken from the groups Bacteroidetes/Chlorobi and Firmicutes because these are the two predominant groups in the human gut.
    2. I chose genomes with a complete set of NCBI Tools available.
    3. After this I attempted to choose genomes with significantly different GC content.
    4. Finally, preference was given to organisms I recognized from gut microbiome papers/discussions, or failing that, just a really awesome name.
  - The final list is:
Organism | Classification | Genome Length |
---|---|---|
Methanobrevibacter smithii ATCC 35061 | archaea | 1853160 bp |
Bacteroides fragilis NCTC 9343 | bacteroidetes | 5205140 bp |
Porphyromonas gingivalis W83 | bacteroidetes | 2343476 bp |
Aster yellows witches'-broom phytoplasma AYWB | firmicutes | 706569 bp |
Bacillus subtilis subsp. subtilis str. 168 | firmicutes | 4214630 bp |
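As a rough illustration of the GC-content criterion used in the selection above, here is a minimal Python sketch that computes the GC fraction of a sequence. The function name `gc_content` and the choice to ignore ambiguous bases such as N are my own assumptions for illustration; this is not the method actually used to pick the genomes, which relied on the columns of the NCBI website.

```python
def gc_content(seq):
    """Return the fraction of G and C among the unambiguous bases (A, C, G, T).

    Assumption: ambiguous symbols such as N are ignored rather than counted.
    Returns 0.0 for a sequence with no unambiguous bases.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def gc_difference(seq_a, seq_b):
    """Absolute difference in GC fraction between two sequences."""
    return abs(gc_content(seq_a) - gc_content(seq_b))
```

Two genomes from the same group could then be compared with `gc_difference(genome_a, genome_b)`, where each argument is the full genome sequence as a string.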
June 12, 2009
- Today is my birthday!!! :D
- Last week's meeting was a success!
- The books came in Wednesday so the reading is now readily available.
- I am reading chapter 3 and the 2006 paper for Friday.
- I met with Arthur to discuss Phymm.
- I have gotten better at using MetaSim and am currently generating the previously described test data set, composed of 1 million 200 bp reads.
- I have also been relearning how to use the AMOS package in preparation for piping the output from Phymm into it.
- Note: It appears that Phymm can be "parallelized" by dividing the query file into smaller files and merging the output files. According to Arthur, each read is scored independently, so the only limits are the number of reads and the number of processors. I am considering splitting the test set into as many as 10 files.
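The split-and-merge idea in the note above can be sketched in Python as follows. This is only a sketch under stated assumptions: the query file is assumed to be FASTA, the helper names (`read_fasta`, `split_reads`, `merge_outputs`) are hypothetical, and the merge step assumes each per-part result is a list of lines with one line per read, optionally preceded by a single header line. I have not checked this against Phymm's actual output format.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_reads(records, n_parts):
    """Distribute records round-robin into n_parts lists of similar size."""
    parts = [[] for _ in range(n_parts)]
    for i, record in enumerate(records):
        parts[i % n_parts].append(record)
    return parts


def merge_outputs(outputs, has_header=False):
    """Concatenate per-part result lines.

    Assumption: if has_header is True, every part starts with the same
    header line, and only the first copy is kept.
    """
    merged = []
    for index, lines in enumerate(outputs):
        if has_header and index > 0:
            lines = lines[1:]
        merged.extend(lines)
    return merged
```

Because each read is scored independently, round-robin assignment keeps the parts balanced without caring which reads end up together; each part would be written to its own query file, run separately, and the result files merged afterwards.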