Data
Original Traces
- 8 pairs of data files (paired ends)
cat trace.count | grep _1_ | sed 's/_sequence.txt//' | perl -ane 'print " ",$F[1],"\t",$F[0]/4,"\t",$F[0]/2,"\n";'
lib         insert    mates     reads       readLen  ~coverage(500M genome)  reverse  adaptors         comments
s_2_1_3kbp  3000      21563283  43,126,566  124      11                      ?        circularization
s_2_1_5kbp  5000/300  36218589  72,437,178  35       5                       yes      ?                insert size is << 5kbp
s_2_1_8kbp  8000      198377    396,754     124      0.1                     ?        ?
s_3_1       475       35548153  71,096,306  124      18
s_4_1       475       35471044  70,942,088  124      18
s_5_1       475       35616846  71,233,692  124      18
s_6_1       475       35303840  70,607,680  124      18
s_7_1       475       34893313  69,786,626  124      18
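The ~coverage column is consistent with reads x readLen / genome size, using the 500M genome size from the table header. A minimal sketch of that arithmetic follows; the `coverage` helper is a hypothetical name, not one of the scripts used above.

```shell
# Hypothetical helper: approximate coverage = reads * readLen / genome size.
# Genome size defaults to the 500M estimate from the table header.
coverage() {
  awk -v reads="$1" -v len="$2" -v genome="${3:-500000000}" \
    'BEGIN { printf "%.1f\n", reads * len / genome }'
}

coverage 43126566 124   # s_2_1_3kbp: prints 10.7 (table: 11)
coverage 72437178 35    # s_2_1_5kbp: prints 5.1  (table: 5)
coverage 71096306 124   # s_3_1:      prints 17.6 (table: 18)
```

The table values are these results rounded to the nearest integer (or to one decimal for the 8kbp library).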
Adaptors
>circularization
CGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGA
>circularization.revcomp
TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG
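The second adaptor record is the reverse complement of the first. A quick way to confirm this, sketched with standard tools only (`revcomp` is a hypothetical helper name):

```shell
# Hypothetical helper: reverse-complement a DNA string (A<->T, C<->G).
revcomp() {
  printf '%s\n' "$1" | tr ACGTacgt TGCAtgca |
    awk '{ out = ""; for (i = length($0); i >= 1; i--) out = out substr($0, i, 1); print out }'
}

fwd=CGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGA
rc=TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG
[ "$(revcomp "$fwd")" = "$rc" ] && echo "revcomp matches"
```

The character reversal is done in awk rather than with `rev`, since `rev` is not installed on every system.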
Location
/fs/szattic-asmg5/Bees/Megachile_rotundata/error_correction/large_libs/s_?_?_?kb.sequence.cor.all.txt
ftp://ftp.cbcb.umd.edu/pub/data/assembly/Megachile_rotundata/reads/s_?_?_?kb.sequence.cor.all.txt.gz
/fs/szattic-asmg5/Bees/Megachile_rotundata/frg # frg files to assemble
Assemblies
- CA (Celera Assembler) version 6.1 (09/01/2010): /fs/szdevel/dpuiu/SourceForge/wgs-6.1/Linux-amd64/bin/runCA
- SOAPdenovo version 1.04: /nfshomes/dpuiu/szdevel/SOAPdenovo_Release1.04/
CA noOBT
Gatekeeper
LOAD STATS
7 libInput
7 libLoaded
0 libErrors
5 libWarnings
326,236,387 frgLoaded
326236387 numRandom
326236387 numPacked
LibraryName         numActiveFRG  numDelFRG  numMatedFRG  readLength   clearLength
GLOBAL              326236387     0          315518526    37451489553  37418130441
LegacyUnmatedReads  0             0          0            0            0
s_2_3kb             9107424       0          9107424      942165284    910444046
s_2_8kb             209336        0          209336       21814418     20787384
s_3                 63618839      0          61696784     7343024554   7342819494
s_4                 63544688      0          61255960     7291557748   7291478152
s_5                 63370860      0          61084368     7271218123   7271051639
s_6                 63780887      0          61685156     7359094156   7359012512
s_7                 62604353      0          60479498     7222615270   7222537214
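As a sanity check, the per-library rows should add up to the GLOBAL row. A minimal sketch, assuming the column order shown in the table (name, numActiveFRG, numDelFRG, numMatedFRG); `check_global` is a hypothetical helper, not part of the gatekeeper tools:

```shell
# Hypothetical check: do the per-library rows add up to the GLOBAL row?
# Expects rows of: LibraryName numActiveFRG numDelFRG numMatedFRG ... on stdin.
check_global() {
  awk '
    $1 == "LibraryName" { next }                      # skip the header row
    $1 == "GLOBAL" { gact = $2; gmated = $4; next }   # remember the reported totals
    { act += $2; mated += $4 }                        # sum the library rows
    END {
      if (act == gact && mated == gmated)
        print "OK"
      else
        print "MISMATCH"
    }'
}

check_global <<'EOF'
GLOBAL 326236387 0 315518526
LegacyUnmatedReads 0 0 0
s_2_3kb 9107424 0 9107424
s_2_8kb 209336 0 209336
s_3 63618839 0 61696784
s_4 63544688 0 61255960
s_5 63370860 0 61084368
s_6 63780887 0 61685156
s_7 62604353 0 60479498
EOF
```

For the numbers above this prints OK: both the active and the mated fragment counts sum to the GLOBAL values.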
Meryl
meryl -Dh -s 0-mercounts/asm-C-ms22-cm0 | more
Found 30567166217 mers.
Found 268251409 distinct mers.
Found 9679077 unique mers.
Largest mercount is 87908217; 1896 mers are too big for histogram.
1 9679077 0.0361 0.0003
2 8374869 0.0673 0.0009
3 2494762 0.0766 0.0011
...
54 5310305 0.6544 0.1789
...
1047970 1 1.0000 0.6652
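The histogram rows read as: mer count, number of distinct mers with that count, cumulative fraction of distinct mers, cumulative fraction of all mers. That reading is an interpretation, but it reproduces the printed numbers. A minimal sketch that recomputes the two fraction columns from the first two (`meryl_cumulative` is a hypothetical helper; the totals are taken from the "Found ... mers" lines):

```shell
# Hypothetical helper: recompute the cumulative fractions of a meryl histogram.
# Usage: meryl_cumulative DISTINCT_MERS TOTAL_MERS < "count nmers" rows
# Rows must start at count 1 and be contiguous for the cumulative sums to hold.
meryl_cumulative() {
  awk -v distinct="$1" -v total="$2" '
    {
      d += $2          # distinct mers seen so far
      t += $1 * $2     # mer occurrences seen so far
      printf "%d %d %.4f %.4f\n", $1, $2, d / distinct, t / total
    }'
}

printf '%s\n' '1 9679077' '2 8374869' '3 2494762' |
  meryl_cumulative 268251409 30567166217
# prints:
# 1 9679077 0.0361 0.0003
# 2 8374869 0.0673 0.0009
# 3 2494762 0.0766 0.0011
```

Only the first three rows can be reproduced this way, because the rows in between are elided in the listing above.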
Overlap
cat 1-overlapper/ovlopts.pl | grep ^\"h | wc -l
924
overlapStore -d asm.ovlStore | awk '{print $1}' | uniq -c | awk '{print $1}' | count.pl | getSummary.pl -i 0 -j 1
overlapStats -G asm.gkpStore -O asm.ovlStore -o asm
Location
mulberry:/scratch2/dpuiu/Megachile_rotundata/Assembly/wgs-noOBT
SOAPdenovo (Tanja)
cat *.ContigIndex | grep -v ^E | grep -v ^i | count.pl -i 1 | getSummary.pl -j 1 -t "contigs"
cat *.ContigIndex | grep -v ^E | grep -v ^i | count.pl -i 1 | getSummary.pl -j 1 -min 100 -t "contigs(>100bp)"
grep "^>" *.scaf | getSummary.pl -i 2 -t scaf
.                elem     min  q1   q2    q3     max      mean      n50     sum
contigs          9742349  31   32   33    37     114832   60.09     44      585430821
contigs(>100bp)  177327   100  131  261   1398   114832   1333.68   3897    236496823
scaf             7863     102  903  3272  17692  2338728  37825.70  240706  297423517
Location
/fs/szattic-asmg5/Bees/Megachile_rotundata/Assembly/assembly5kbForAll
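The n50 column comes from getSummary.pl, which is not shown here. For reference, a minimal sketch of the usual N50 definition: the largest length L such that sequences of length >= L account for at least half of the total. It assumes one length per line on stdin; `n50` is a hypothetical helper and may differ from getSummary.pl in edge cases.

```shell
# Hypothetical helper: N50 of a list of sequence lengths (one per line on stdin).
n50() {
  sort -nr | awk '
    { len[NR] = $1; total += $1 }       # store lengths, longest first
    END {
      for (i = 1; i <= NR; i++) {
        run += len[i]                   # running sum over the longest sequences
        if (run * 2 >= total) {         # reached half of the total length
          print len[i]
          exit
        }
      }
    }'
}

printf '%s\n' 100 200 300 400 | n50   # prints 300
```

In the example the total is 1000; the two longest sequences (400 + 300) are the first to reach half of it, so the N50 is 300.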
SOAPdenovo (Daniela)
.                elem     min  q1    q2    q3     max      mean      n50    sum
contigs(all)     6917796  31   32    34    40     121554   70.46     73     487401812
contigs(>100bp)  210666   100  124   222   1174   121554   1108.69   3138   233563401
scaff            25119    351  1896  4444  10914  1102803  11041.00  26876  277338897
Location
mulberry:/scratch2/dpuiu/Megachile_rotundata/Assembly/SOAPdenovo-redo