Comparative assemblies: Difference between revisions
Revision as of 20:12, 26 February 2008
AMOScmp pipeline
Short reads (Solexa)
Modified parameters
* Smaller nucmer alignment/cluster sizes: defaults are 20/65; drop to 16/16, or as low as 14/14; 12/12 gives too many spurious alignments: -D MINMATCH=20 -D MINCLUSTER=20
* Drop casm-layout minimum overlap from 10 to 5: -D MINOVL=5
* Drop casm-layout majority from 70 to 50: -D MAJORITY=50
* Drop make-consensus alignment wiggle from 15 to 2: -D ALIGNWIGGLE=2
* Use make-consensus -x option ???
* Use promer instead of nucmer: alignment/cluster sizes of 6/11 (in amino acids)
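The -D overrides listed above can be collected in one place and turned into a command line. This is a minimal sketch: the values are copied from the -D flags in the list, while the bank prefix `solexa_run` and the assumption that AMOScmp takes the prefix as its last argument are illustrative and should be checked against the local AMOS installation.

```python
import shlex

# Overrides copied from the -D flags in the list above.
OVERRIDES = {
    "MINMATCH": 20,
    "MINCLUSTER": 20,
    "MINOVL": 5,
    "MAJORITY": 50,
    "ALIGNWIGGLE": 2,
}


def amoscmp_command(prefix, overrides):
    """Build an argument list: AMOScmp -D KEY=VALUE ... prefix.

    The trailing positional prefix is an assumption of this sketch.
    """
    args = ["AMOScmp"]
    for name, value in overrides.items():
        args += ["-D", f"{name}={value}"]
    args.append(prefix)
    return args


# "solexa_run" is a hypothetical prefix used only for illustration.
cmd = amoscmp_command("solexa_run", OVERRIDES)
print(shlex.join(cmd))
```

The list form can be passed to `subprocess.run(cmd, check=True)` without going through a shell.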
Read trimming
* Quality trimming: too stringent
* Align to reference using nucmer (small -c -l); trim reads to alignment coordinates
* Identify zero-coverage regions; don't trim reads adjacent to these regions
* Update the reads' clear ranges (clr); run AMOScmp
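The trimming steps above can be sketched as follows. This is not the script used for the assemblies on this page. It assumes one forward-strand alignment per read, already parsed into 1-based inclusive coordinates, and it treats a read as "adjacent" to a zero-coverage region when its alignment ends on the base next to the gap. The `Hit` record and the function names are hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    read_id: str
    ref_start: int   # 1-based, inclusive
    ref_end: int
    read_start: int  # 1-based, inclusive
    read_end: int
    read_len: int


def zero_coverage_regions(hits, ref_len):
    """Return (start, end) reference intervals covered by no alignment."""
    covered = [False] * (ref_len + 2)
    for h in hits:
        for pos in range(h.ref_start, h.ref_end + 1):
            covered[pos] = True
    regions, start = [], None
    for pos in range(1, ref_len + 1):
        if not covered[pos] and start is None:
            start = pos
        elif covered[pos] and start is not None:
            regions.append((start, pos - 1))
            start = None
    if start is not None:
        regions.append((start, ref_len))
    return regions


def clear_ranges(hits, ref_len):
    """Trim each read to its alignment, except reads next to a gap."""
    edges = set()
    for start, end in zero_coverage_regions(hits, ref_len):
        edges.add(start - 1)  # last covered base before the gap
        edges.add(end + 1)    # first covered base after the gap
    result = {}
    for h in hits:
        if h.ref_start in edges or h.ref_end in edges:
            # Keep the full read so it can still extend into the gap.
            result[h.read_id] = (1, h.read_len)
        else:
            result[h.read_id] = (h.read_start, h.read_end)
    return result


hits = [
    Hit("r1", 1, 30, 3, 32, 35),
    Hit("r2", 20, 50, 1, 31, 36),
    Hit("r3", 61, 95, 2, 36, 36),
]
gaps = zero_coverage_regions(hits, 100)
clr = clear_ranges(hits, 100)
print(gaps)
print(clr)
```

Here r1 is trimmed to its aligned part, while r2 and r3 border the uncovered stretch 51-60 and keep their full length. Converting the resulting ranges to the clr convention of the AMOS bank is left out of the sketch.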
Contig merging
* Identify adjacent contig end overlaps
* Programs that do alignment & sequence merging:
* EMBOSS merger: does not handle long sequences
* fastaMerge.pl
 Input: multiFasta file; contigs must be ordered and oriented; only checks adjacent contig ends
 Example: ~dpuiu/bin/fastaMerge.pl -min 5 -max 30 -id 0.8 file.fasta -debug 1 > file.merge.fasta
 ctg1_id  ctg2_id  ovl_len  ovl_id
 20       21       10       1
 34       35       18       1
 36       37       9        0.88
 ...
 2008_0109_AMOSCmp-PA14-relaxed-17-nucmer-redo2 assembly: # contigs 2053 -> 1927
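The idea of checking only adjacent contig ends can be illustrated with a small sketch. This is not fastaMerge.pl and may differ from how that script aligns sequences: it assumes an ungapped suffix/prefix comparison and takes the longest overlap that passes the thresholds. The defaults mirror the -min, -max and -id values of the example above; the function names and the test contigs are hypothetical.

```python
def end_overlap(left, right, min_len=5, max_len=30, min_id=0.8):
    """Longest ungapped overlap between the end of left and start of right.

    Returns (length, identity) or None if no overlap passes the thresholds.
    """
    upper = min(max_len, len(left), len(right))
    for n in range(upper, min_len - 1, -1):
        matches = sum(a == b for a, b in zip(left[-n:], right[:n]))
        identity = matches / n
        if identity >= min_id:
            return n, identity
    return None


def merge_adjacent(contigs, **thresholds):
    """Merge ordered, oriented (id, sequence) contigs where ends overlap.

    Returns the merged contigs and a report of
    (ctg1_id, ctg2_id, ovl_len, ovl_id) rows.
    """
    if not contigs:
        return [], []
    merged = [contigs[0]]
    last_id = contigs[0][0]
    report = []
    for cid, seq in contigs[1:]:
        prev_name, prev_seq = merged[-1]
        hit = end_overlap(prev_seq, seq, **thresholds)
        if hit is None:
            merged.append((cid, seq))
        else:
            n, identity = hit
            report.append((last_id, cid, n, round(identity, 2)))
            # Keep the left copy of the overlap, append the rest of the right contig.
            merged[-1] = (f"{prev_name}+{cid}", prev_seq + seq[n:])
        last_id = cid
    return merged, report


contigs = [
    ("20", "ACGTACGTTTGACCA"),
    ("21", "TGACCAGGGTTTAAA"),
    ("22", "CCCCCCCCCC"),
]
merged, report = merge_adjacent(contigs, min_len=5, max_len=30, min_id=0.8)
for row in report:
    print(*row)
print(len(contigs), "->", len(merged), "contigs")
```

With very short minimum overlaps, chance matches between unrelated ends become likely, so the identity threshold and the ordering/orientation requirement carry most of the weight.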