Kalanchoe: Difference between revisions

= Data =
* 300M genome
* ~5x 454 from a variety of library sizes  
* ~20x Illumina (Illumina-scale qualities).
CABOG ... throws a lot of errors on our architecture for some reason.
 
* Location:
  /fs/szattic-asmg7/Kalenchoe_genome/
 
== 454 ==
* Read stats
  LIB          reads  mates       meaIns          stdIns          20mers                  ids
  fff01.frg.gz 545792 0                                           AT                      GLMTKIY01
  fff02.frg.gz 459461 0                                           AT                      GLMTKIY02
  fff03.frg.gz 477691 0                                           AC                      GLZRKVN01
  fff04.frg.gz 610848 0                                           AAACCCTAAACCCTAAACCCTA  GKFZ9MZ01
  fff05.frg.gz 450912 0                                           AAAACCCATAAAGTTGTTATTT  GKFZ9MZ02
  fff06.frg.gz 548462 0                                           AACAAGGCACACAGGGGATAGG  GKH094001
  fff11.frg.gz 418299 118317      20k,17072       4268            CG                      GMF8K3302
  fff12.frg.gz 807808 273276      8k,6609         1652            AT                      GK7ZAL002
  fff13.frg.gz 638072 205830      8k,6571         1642            ACGTACGTACGTACGTACGTAC  GLC77YN02
  fff14.frg.gz 771593 231598      3k,2749         687             AT                      GK7ZAL001
  fff15.frg.gz 634113 165697      3k,2768         692             AT                      GLC77YN01
 
* linker issues
'titanium' == TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG and
              CGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGA
 
* Locations
  /fs/szattic-asmg7/Kalenchoe_genome/rawreads/FFF/
  /fs/szattic-asmg7/Kalenchoe_genome/rawreads/FFFp/
 
== Illumina ==
* 7 libs, 250 bp insert size
* trimmed all reads to 64bp
* Location
  /fs/szattic-asmg7/Kalenchoe_genome/rawreads/Illumina/
  /fs/szattic-asmg7/Kalenchoe_genome/rawreads/correctlyMatedIllumina/
  /fs/szattic-asmg7/dpuiu/Kalenchoe_genome/Data/IlluminaTrimmedMatedCor/


= Assembly =


== Blogs ==
* http://groups.google.com/group/bgi-soap/browse_thread/thread/0a4b727e8ec526c9
   Assemble the Illumina lane with SOAPdenovo, then use Newbler to assemble the 454 lanes together with the resulting SOAPdenovo contigs.
   SOAP uses a de Bruijn graph data structure that is well suited to short Illumina reads but is not flexible enough to handle 454 reads.
   Newbler, instead, is based on the overlap-layout approach, which works well with long reads.
 
* CA.454
  .                        elem     min  q1    q2    q3    max     mean  n50    sum
  scf                      29225    235  1161  1449  3293  233867  5972  22778  174536065
  ctg                      65154    64   1160  1502  2240  19244   1931  2133   125781315
  deg                      279542   62   302   452   579   6662    461   534    129007100
 
* SOAPdenovo.Illumina
  .                        elem     min  q1    q2    q3    max     mean  n50    sum
  scf                      320855   100  141   275   590   56087   716   1987   229678477
  ctg                      3630647  32   34    50    79    21193   97    143    352876334
  scf2                     320855   100  141   269   579   56087   709   1993   227485865
  ctg2                     333643   33   138   260   560   45815   680   1873   226926306
  shreds2                  690592   1    139   270   610   2000    524   1107   362212087
 
* CA.454.IlluminaShreds
  .                        elem     min  q1    q2    q3    max     mean  n50    sum
  scf                      26334    187  1200  1610  3603  583607  8703  43384  229194795
  ctg                      55625    64   1262  1975  3990  70870   3575  5949   198884011
  deg                      251480   64   267   437   560   17662   442   527    111232984
 
* Locations
  /fs/szattic-asmg7/dpuiu/Kalenchoe_genome/Assembly/CA.454/
  /fs/szattic-asmg7/dpuiu/Kalenchoe_genome/Assembly/SOAPdenovo.Illumina/
  /fs/szattic-asmg7/dpuiu/Kalenchoe_genome/Assembly/CA.454.IlluminaShreds/
 
* Ftp Location:
  ftp://ftp.cbcb.umd.edu/pub/data/assembly/Kalenchoe/
  ftp://ftp.cbcb.umd.edu/pub/data/assembly/Kalenchoe/README
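
The shreds2 row in the SOAPdenovo.Illumina table (max length 2000) and the CA.454.IlluminaShreds assembly suggest that the Illumina contigs were cut into overlapping pseudo-reads before being fed to CA together with the 454 data. The exact procedure is not recorded on this page, so the following is only a minimal sketch of contig shredding in general: the function name `shred`, the 2000 bp window (taken from the table's max column) and the 1000 bp step are assumptions, not the settings actually used.

```python
def shred(seq, size=2000, step=1000):
    """Cut a contig into overlapping windows of at most `size` bases.

    `size` and `step` are hypothetical defaults chosen for illustration.
    Contigs no longer than `size` are returned unchanged as a single shred.
    """
    if size <= 0 or step <= 0 or step > size:
        raise ValueError("need 0 < step <= size")
    if len(seq) <= size:
        return [seq]
    last_start = len(seq) - size
    # Regular windows starting every `step` bases.
    shreds = [seq[i:i + size] for i in range(0, last_start + 1, step)]
    # Add one final window so the end of the contig is always covered.
    if last_start % step != 0:
        shreds.append(seq[last_start:])
    return shreds
```

With these assumed settings every base of a long contig ends up in at least one shred and neighbouring shreds overlap by at least `size - step` bases, which gives an overlap-based assembler something to join them on.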

Latest revision as of 14:50, 15 June 2011

Data

  • 300M genome
  • ~5x 454 from a variety of library sizes
  • ~20x Illumina (Illumina-scale qualities).
  • Location:
 /fs/szattic-asmg7/Kalenchoe_genome/

454

  • Read stats
 LIB          reads  mates       meaIns          stdIns          20mers                  ids                       
 fff01.frg.gz 545792 0                                           AT                      GLMTKIY01
 fff02.frg.gz 459461 0                                           AT                      GLMTKIY02
 fff03.frg.gz 477691 0                                           AC                      GLZRKVN01
 fff04.frg.gz 610848 0                                           AAACCCTAAACCCTAAACCCTA  GKFZ9MZ01
 fff05.frg.gz 450912 0                                           AAAACCCATAAAGTTGTTATTT  GKFZ9MZ02
 fff06.frg.gz 548462 0                                           AACAAGGCACACAGGGGATAGG  GKH094001

 fff11.frg.gz 418299 118317      20k,17072       4268            CG                      GMF8K3302    
 fff12.frg.gz 807808 273276      8k,6609         1652            AT                      GK7ZAL002    
 fff13.frg.gz 638072 205830      8k,6571         1642            ACGTACGTACGTACGTACGTAC  GLC77YN02    
 fff14.frg.gz 771593 231598      3k,2749         687             AT                      GK7ZAL001    
 fff15.frg.gz 634113 165697      3k,2768         692             AT                      GLC77YN01    
  • linker issues
'titanium' == TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG and
              CGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGA
  • Locations
 /fs/szattic-asmg7/Kalenchoe_genome/rawreads/FFF/
 /fs/szattic-asmg7/Kalenchoe_genome/rawreads/FFFp/
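
The two 'titanium' linker sequences listed under "linker issues" are reverse complements of each other, so a read can contain the linker in either orientation. A minimal sketch of checking this and of splitting a read at an exact linker hit follows; the helper names are hypothetical, and real 454 reads would normally need error-tolerant matching, which this sketch does not attempt.

```python
# Linker sequences copied verbatim from the "linker issues" note.
LINKER_FWD = "TCGTATAACTTCGTATAATGTATGCTATACGAAGTTATTACG"
LINKER_REV = "CGTAATAACTTCGTATAGCATACATTATACGAAGTTATACGA"

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_on_linker(read, linkers=(LINKER_FWD, LINKER_REV)):
    """Split a read at the first exact linker match.

    Returns (left, right) with the linker removed, or None when neither
    orientation of the linker occurs in the read. Exact matching only.
    """
    for linker in linkers:
        pos = read.find(linker)
        if pos != -1:
            return read[:pos], read[pos + len(linker):]
    return None
```

Usage: `split_on_linker(read_sequence)` returns the two mate halves for a read that spans the linker, and `None` for a read that should be treated as unmated.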

Illumina

  • 7 libs, 250 bp insert size
  • trimmed all reads to 64bp
  • Location
 /fs/szattic-asmg7/Kalenchoe_genome/rawreads/Illumina/
 /fs/szattic-asmg7/Kalenchoe_genome/rawreads/correctlyMatedIllumina/
 /fs/szattic-asmg7/dpuiu/Kalenchoe_genome/Data/IlluminaTrimmedMatedCor/
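
The coverage figures in the Data section can be related to the read counts with simple arithmetic (coverage = reads x read length / genome size). The sketch below takes the 300M genome size, the ~5x and ~20x figures, the 64bp trim length and the per-library 454 read counts from this page; everything derived from them is approximate, and it assumes the "reads" column counts individual sequences.

```python
GENOME_SIZE = 300_000_000  # "300M genome"


def coverage(n_reads, read_len, genome_size=GENOME_SIZE):
    """Fold coverage given a read count and a (mean) read length."""
    return n_reads * read_len / genome_size


def reads_needed(target_cov, read_len, genome_size=GENOME_SIZE):
    """Number of reads of length read_len needed to reach target_cov."""
    return target_cov * genome_size / read_len


# "reads" column of the 454 read stats table (fff01 .. fff15).
READS_454 = [545792, 459461, 477691, 610848, 450912, 548462,
             418299, 807808, 638072, 771593, 634113]

total_454_reads = sum(READS_454)

# Mean 454 read length implied by ~5x coverage (an estimate, not a measurement).
implied_454_read_len = 5 * GENOME_SIZE / total_454_reads

# Number of 64bp Illumina reads implied by ~20x coverage.
implied_illumina_reads = reads_needed(20, 64)
```

Under these assumptions ~20x corresponds to roughly 94 million 64bp Illumina reads, and ~5x from about 6.4 million 454 reads implies a mean 454 read length somewhere in the low-to-mid 200s of bases.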

Assembly

  • CA.454
 .                        elem     min  q1    q2    q3    max     mean  n50    sum
 scf                      29225    235  1161  1449  3293  233867  5972  22778  174536065
 ctg                      65154    64   1160  1502  2240  19244   1931  2133   125781315
 deg                      279542   62   302   452   579   6662    461   534    129007100
  • SOAPdenovo.Illumina
 .                        elem     min  q1    q2    q3    max     mean  n50    sum
 scf                      320855   100  141   275   590   56087   716   1987   229678477
 ctg                      3630647  32   34    50    79    21193   97    143    352876334

 scf2                     320855   100  141   269   579   56087   709   1993   227485865
 ctg2                     333643   33   138   260   560   45815   680   1873   226926306
 shreds2                  690592   1    139   270   610   2000    524   1107   362212087
  • CA.454.IlluminaShreds
 .                        elem     min  q1    q2    q3    max     mean  n50    sum
 scf                      26334    187  1200  1610  3603  583607  8703  43384  229194795
 ctg                      55625    64   1262  1975  3990  70870   3575  5949   198884011
 deg                      251480   64   267   437   560   17662   442   527    111232984
  • Locations
 /fs/szattic-asmg7/dpuiu/Kalenchoe_genome/Assembly/CA.454/
 /fs/szattic-asmg7/dpuiu/Kalenchoe_genome/Assembly/SOAPdenovo.Illumina/
 /fs/szattic-asmg7/dpuiu/Kalenchoe_genome/Assembly/CA.454.IlluminaShreds/
  • Ftp Location:
 ftp://ftp.cbcb.umd.edu/pub/data/assembly/Kalenchoe/
 ftp://ftp.cbcb.umd.edu/pub/data/assembly/Kalenchoe/README
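
The assembly tables above report, for each set of sequences (scf, ctg, deg), the columns elem, min, q1, q2, q3, max, mean, n50 and sum. The script that produced them is not named on this page, so the following is only a sketch of how such a summary could be computed from a list of sequence lengths. The quartile convention is an assumption; the rounded mean matches the tables (for example, 125781315 / 65154 rounds to 1931 for the CA.454 contigs).

```python
def length_stats(lengths):
    """Summarise sequence lengths with the same columns as the tables above."""
    if not lengths:
        raise ValueError("no lengths given")
    s = sorted(lengths)
    n = len(s)
    total = sum(s)

    def quantile(frac):
        # Simple index-based quantile; other tools may interpolate instead.
        return s[min(n - 1, int(frac * n))]

    # N50: length L such that sequences of length >= L hold at least half
    # of the total bases, scanning from the longest sequence down.
    running = 0
    n50 = s[-1]
    for length in reversed(s):
        running += length
        if 2 * running >= total:
            n50 = length
            break

    return {
        "elem": n,
        "min": s[0],
        "q1": quantile(0.25),
        "q2": quantile(0.50),
        "q3": quantile(0.75),
        "max": s[-1],
        "mean": round(total / n),
        "n50": n50,
        "sum": total,
    }
```

Usage: collect the length of every scaffold or contig in a FASTA file into a list and pass it to `length_stats`; comparing the n50 and sum columns across CA.454, SOAPdenovo.Illumina and CA.454.IlluminaShreds is what the tables are for.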