Bumblebee
Revision as of 03:23, 4 March 2010
Data
- Location:
/fs/szattic-asmg4/Bees/Bombus_impatiens
- There are 7 pairs of paired-end data files, one pair per lane: lanes 1..3 and 5..8 (lane 4 was not used)
- Tasks to figure out:
  1. Erroneous reads/bases, which we need to correct or discard
  2. GC bias, so we can compute a-stats properly
  3. Redundancy in the long paired ends, which are lane 1 and lane 2.
- Lane 1: 3Kbp insert; 34,944,099 reads
- Lane 3: 8Kbp insert; 32,540,640 reads
- Formatting: keep only the first 100bp
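
The formatting step above (keep only the first 100bp) could be sketched as follows. This is a minimal illustration and not the script actually used for this data set: it assumes the reads are plain 4-line FASTQ records, and the function name `trim_fastq` is hypothetical.

```python
from typing import Iterable, Iterator, List


def trim_fastq(lines: Iterable[str], keep: int = 100) -> Iterator[str]:
    """Yield FASTQ lines with sequence and quality cut to the first `keep` bases.

    Assumption: every record is exactly 4 lines (header, sequence, '+', quality).
    """
    record: List[str] = []
    for line in lines:
        # Ignore blank lines between records (for example a trailing newline).
        if not line.strip() and not record:
            continue
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            header, seq, plus, qual = record
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError(f"malformed FASTQ record near {header!r}")
            yield header
            yield seq[:keep]   # sequence truncated to `keep` bases
            yield plus
            yield qual[:keep]  # quality string truncated to the same length
            record = []
    if record:
        raise ValueError("truncated FASTQ record at end of input")
```

Usage, with hypothetical file names: open the input with `open("lane1_1.fastq")`, pass the file object to `trim_fastq`, and write each yielded line plus a newline to the output file. Reads shorter than 100bp pass through unchanged.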
Assembly
- Meryl
meryl -Dh -s 0-mercounts/asm-C-ms22-cm1 >! 22mers.hist
Found 3136399464 mers.
Found 379123530 distinct mers.
Found 201257394 unique mers.
Largest mercount is 12006651; 90 mers are too big for histogram.
- countKmers
most frequent 22mer : AGCATACATTATACGAAGTTAT ~ 16% of the seqs
most frequent 42mer : CGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGA ~ 10% of the seqs (pPAC7.9124-9165)
- Location
/fs/szdevel/dpuiu/SourceForge/wgs-assembler.030210/Linux-amd64/bin/runCA
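
The countKmers figures above ("most frequent 22mer ... ~ 16% of the seqs") can be illustrated with a small sketch. This is not the countKmers tool itself, whose implementation is not shown here. It assumes that "% of the seqs" means the fraction of sequences containing the k-mer at least once, and the function name `most_common_kmer` is hypothetical. It keeps all k-mers in memory, so it is only suitable for a sample of reads.

```python
from collections import Counter
from typing import Iterable, Tuple


def most_common_kmer(seqs: Iterable[str], k: int) -> Tuple[str, float]:
    """Return the k-mer found in the most sequences and that fraction of sequences.

    Each sequence counts a given k-mer at most once, so the result is a
    per-sequence frequency, not a total occurrence count.
    """
    seq_hits: Counter = Counter()
    n_seqs = 0
    for seq in seqs:
        n_seqs += 1
        seq = seq.upper()
        # Set of distinct k-mers in this sequence (forward strand only).
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        seq_hits.update(kmers)
    if not seq_hits:
        raise ValueError("no k-mers found; are the sequences shorter than k?")
    # Ties are broken by taking the lexicographically largest k-mer.
    kmer, hits = max(seq_hits.items(), key=lambda kv: (kv[1], kv[0]))
    return kmer, hits / n_seqs
```

For example, `most_common_kmer(reads, 22)` on a list of read strings returns the top 22-mer together with the fraction of reads that contain it. Reverse complements are not merged in this sketch.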