Trace formatting: Difference between revisions

From Cbcb

Revision as of 20:03, 30 October 2007

Sanger

454 (single reads)

454 (paired ends)

Features:

 * Reads are DNA fragments of approximately 84 nucleotides.
 * Each read has a ~44-mer linker sequence in the middle,
 * flanked by a ~20-mer sequence on each side.
 * The two flanking 20-mers are segments of DNA that were originally located approximately 2.5 kb apart in the genome of interest.
 * This distance information is used to order and orient contigs into scaffolds, which provide a high-quality draft sequence of the genome.

Anomalies:

 * the linker can appear more than once in a read, in tandem, as complete or partial copies
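
A minimal Python sketch of how the two flanking tags could be recovered from a paired-end read, given the layout and the tandem-linker anomaly described above. The linker sequence used here is a made-up placeholder (the real linker is not given on this page), and the function name is hypothetical. Only exact, complete copies of the linker are detected; partial copies would need approximate matching and are left in the flanks.

```python
from typing import Optional, Tuple

# Placeholder linker for illustration only; NOT the real 454 linker sequence.
EXAMPLE_LINKER = "ACGTTGCA"


def split_paired_end(read: str, linker: str) -> Optional[Tuple[str, str]]:
    """Split a read into (left_flank, right_flank) around the linker.

    If the linker occurs several times (for example in tandem), everything
    from the first copy through the last copy is discarded.
    Returns None when no exact copy of the linker is found.
    """
    first = read.find(linker)
    if first == -1:
        return None
    last = read.rfind(linker)
    return read[:first], read[last + len(linker):]


# Usage: a read carrying two tandem copies of the placeholder linker.
left, right = "AAAAACCCCC", "GGGGGTTTTT"
flanks = split_paired_end(left + EXAMPLE_LINKER * 2 + right, EXAMPLE_LINKER)
```

Taking the span from the first to the last exact copy is one simple way to cope with tandem linkers; it assumes the linker does not occur by chance inside a flank.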

Links:

 1_paired_end.pdf

Solexa/Illumina

Links:

 Strep suis Solexa data set for download at Sanger
 NCBI Solexa example data set
 ismb2007Poster.pdf
 Smith_Rennes_2007.pdf

Software:

 * IO_LIB package /fs/sz-user-supported/common/packages/io_lib-1.11-x86_64/bin/
 * STADEN package /fs/sz-user-supported/common/packages/staden-src-1-7-0/distrib/unix-rel-1-7-0/linux-bin

Example:

 $ solexa2srf s_8_0100_seq.txt -o s_8_0100_seq.srf
 $ srf2fastq s_8_0100_seq.srf

   @s_8_100_293_551
   CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCACC
   +
   IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
   @s_8_100_35_698
   TATATGATTGACAATATAAAAATATGAGTATAAAAT
   +
   IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII4/:I
   @s_8_100_880_947
   TTATTATCTTTATTGACGTACCTCTAGAAGACCCAA
   +
   IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII;>1
   ...
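
The four-line records printed by srf2fastq above can be read with a few lines of Python. This is a sketch under two assumptions that the page does not confirm: that quality characters use an ASCII offset of 33, and that read names follow the pattern s_<lane>_<tile>_<x>_<y>. All function and class names are hypothetical.

```python
from typing import Dict, Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream with exactly four lines per read."""
    while True:
        header = handle.readline()
        if not header:          # end of file
            return
        header = header.strip()
        if not header:          # tolerate blank lines between records
            continue
        if not header.startswith("@"):
            raise ValueError(f"expected '@' header, got: {header!r}")
        sequence = handle.readline().strip()
        plus = handle.readline().strip()
        quality = handle.readline().strip()
        if not plus.startswith("+") or len(sequence) != len(quality):
            raise ValueError(f"malformed record: {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def quality_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode a quality string; offset 33 is an assumption, not confirmed."""
    return [ord(char) - offset for char in quality]


def parse_read_name(name: str) -> Dict[str, int]:
    """Split a name such as s_8_100_293_551 (assumed lane, tile, x, y)."""
    _prefix, lane, tile, x, y = name.split("_")
    return {"lane": int(lane), "tile": int(tile), "x": int(x), "y": int(y)}
```

Under the offset-33 assumption, the trailing quality characters `4/:I` of the second read above decode to 19, 14, 25 and 40.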

Assembly Formats

 ace