Bos taurus redo: Difference between revisions

Revision as of 15:39, 26 December 2008

BCM

NCBI

  • CENTER_NAME counts
 COUNT           CENTER_NAME     
 35629020        BCM             Baylor College of Medicine
 737900          NISC            NIH Intramural Sequencing Center
 652614          BCCAGSC         British Columbia Cancer Agency Genome Sciences Centre                           # TA query_tracedb CENTER_NAME = "BCCAGSC" => 652,510 
 378871          MARC            USDA, ARS, US Meat Animal Research Center
 114753          UIUC            University of Illinois at Urbana-Champaign                                      # TA query_tracedb CENTER_NAME = "UIUC" => 106,368
 107367          BARC            USDA, ARS, Beltsville Agricultural Research Center
 65171           TIGR            The Institute for Genomic Research
 53556           GSC             Genoscope
 43033           CENARGEN        Embrapa Genetic Resources and Biotechnology
 18623           SC              The Sanger Center
 15301           UOKNOR          University of Oklahoma Norman Campus, Advanced Center for Genome Technology
 10651           TIGR_JCVIJTC    The Institute for Genomic Research, Traces generated at JCVIJTC                 # TA query_tracedb CENTER_NAME="JCVI"
 2485            UIACBCB         University of Iowa Center for Bioinformatics and Computational Biology (UIACBCB)
 49              WUGSC           Washington University, Genome Sequencing Center                                 # TA query_tracedb CENTER_NAME = "WUGSC" => 9
 37829394        total           total                                                                           # TA query_tracedb SPECIES_CODE = "BOS TAURUS" => 37,788,710 
  • Avg read LEN=984
  • Avg read CLIPPING RANGE (CLB intersect CLV)=760
  • TRACE_TYPE_CODE counts
 COUNT         CENTER_NAME     TRACE_TYPE_CODE        #LIBS(all)     #LIBS(10K+ reads)
 24863599      BCM             WGS                    89             31
 10748529      BCM             SHOTGUN                10             10
 737900        NISC            SHOTGUN                4              3
 125597        BCCAGSC         CLONEEND
 114753        UIUC            CLONEEND
 65171         TIGR            CLONEEND
 53556         GSC             CLONEEND
 26246         CENARGEN        WGS
 25454         BARC            CLONEEND
 16892         BCM             CLONEEND               1              1      VBBAA   mean=167000  std=25000
 16787         CENARGEN        CLONEEND
 15150         UOKNOR          SHOTGUN
 10651         TIGR_JCVIJTC    CLONEEND
 151           UOKNOR          FINISHING
 49            WUGSC           CLONEEND
 36809945      total

 #don't include 
 527017        BCCAGSC         EST
 207204        MARC            EST
 171667        MARC            PCR
 81913         BARC            EST
 18623         SC              EST
 2485          UIACBCB         EST
 1019449       total
  • STRATEGY & TRACE_TYPE_CODE counts
 COUNT           CENTER_NAME     STRATEGY        TRACE_TYPE_CODE
 12545304        BCM             .               WGS
 11425910        BCM             WGA             WGS
 5223683         BCM             CLONE           SHOTGUN
 4479883         BCM             POOLCLONE       SHOTGUN
 1044963         BCM             .               SHOTGUN
 892385          BCM             SNP             WGS
 737900          NISC            CLONE           SHOTGUN
 125597          BCCAGSC         CLONEEND        CLONEEND
 114753          UIUC            CLONEEND        CLONEEND 
 65171           TIGR            CLONEEND        CLONEEND
 53556           GSC             CLONEEND        CLONEEND
 26246           CENARGEN        .               WGS
 25454           BARC            .               CLONEEND
 16892           BCM             CLONEEND        CLONEEND
 16787           CENARGEN        CLONEEND        CLONEEND
 12195           UOKNOR          .               SHOTGUN
 10651           TIGR_JCVIJTC    CLONEEND        CLONEEND
 2955            UOKNOR          CLONE           SHOTGUN
 151             UOKNOR          .               FINISHING
 49              WUGSC           CLONEEND        CLONEEND

 #don't include
 527017          BCCAGSC         EST             EST
 145820          MARC            EST             EST
 117958          MARC            COMPARATIVE     PCR
 81913           BARC            EST             EST
 61384           MARC            CLONE           EST
 53709           MARC            Re-Sequencing   PCR
 18623           SC              EST             EST
 2485            UIACBCB         .               EST
  • TOTAL and 3' VECTOR TRIMMED counts
 CENTER_NAME     TRACE_TYPE_CODE TOTAL           3'CLV<LEN
 BCM             WGS             24863599        10968979
 BCM             SHOTGUN         10748529        5052692
 NISC            SHOTGUN         737900          28972
 BCCAGSC         CLONEEND        125597          125484
 UIUC            CLONEEND        114753          90243
 TIGR            CLONEEND        65171           46389
 GSC             CLONEEND        53556           53556
 CENARGEN        WGS             26246           26246
 BARC            CLONEEND        25454           25454
 BCM             CLONEEND        16892           6751
 CENARGEN        CLONEEND        16787           16787
 UOKNOR          SHOTGUN         15150           2885
 TIGR_JCVIJTC    CLONEEND        10651           339
 UOKNOR          FINISHING       151             ?
 WUGSC           CLONEEND        49              ?
 BCCAGSC         EST             527017          524173
 MARC            EST             207204          207204
 MARC            PCR             171667          171667
 BARC            EST             81913           78597
 SC              EST             18623           7350
 UIACBCB         EST             2485            2485
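
The CLIPPING RANGE and 3'CLV<LEN figures above can be illustrated with a small sketch. It assumes that CLB and CLV are each a (left, right) clear range in 0-based, half-open coordinates, and that a read counts as 3' vector trimmed when the right end of its CLV range is smaller than the read length. The function names and the toy coordinates are hypothetical and are not taken from the scripts that produced these tables.

```python
def intersect_clip(clb, clv):
    """Return the overlap of two (left, right) clear ranges, or None if empty."""
    left = max(clb[0], clv[0])
    right = min(clb[1], clv[1])
    return (left, right) if left < right else None


def is_3prime_vector_trimmed(clv, read_len):
    """True when the vector clear range ends before the end of the read."""
    return clv[1] < read_len


# Toy read with made-up clip coordinates (assumption, not real trace data).
read_len = 984
clb = (20, 800)
clv = (40, 950)

clear = intersect_clip(clb, clv)
print(clear, clear[1] - clear[0], is_3prime_vector_trimmed(clv, read_len))
```

Counting the reads for which `is_3prime_vector_trimmed` is true, grouped by CENTER_NAME and TRACE_TYPE_CODE, would give a column of the same kind as 3'CLV<LEN under these assumptions.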

Data

Local Files:

 /fs/szasmg3/bos_taurus/data/
 /fs/szasmg2/Drosophila/D_pseudoobscura/Vectors
 /nfshomes/dpuiu/db/UniVec

Software

Figaro

  • trims vector only at 5' end
  • calls Lucy for quality trimming

Lucy

  • both vector sequence and splice sites are required

Atlas

  • web site
  • atlas-screen-trim-file: "calls cross_match and atlas-screen-window to create trimmed reads file (scan in from each end of read looking for 50-base windows of high quality and no vector)"
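
The window scan described in the quote can be sketched as follows. This is not the Atlas implementation: it assumes the input is a list of per-base quality values plus a per-base vector mask (such as one derived from a vector screen), and it assumes that "high quality" means every base in the window has quality at least `min_q`. The names `window_ok`, `window_trim` and the default threshold of 20 are hypothetical.

```python
def window_ok(quals, is_vector, start, size, min_q):
    """A window qualifies if it has no vector base and no base below min_q."""
    end = start + size
    if any(is_vector[start:end]):
        return False
    return min(quals[start:end]) >= min_q


def window_trim(quals, is_vector, size=50, min_q=20):
    """Scan in from each end for the first qualifying window.

    Returns the (left, right) half-open clear range, or None if no
    window of the requested size qualifies.
    """
    starts = range(0, len(quals) - size + 1)
    left = next(
        (s for s in starts if window_ok(quals, is_vector, s, size, min_q)),
        None,
    )
    if left is None:
        return None
    # The left window itself qualifies, so the scan from the right always stops.
    right = next(
        s for s in reversed(starts) if window_ok(quals, is_vector, s, size, min_q)
    )
    return (left, right + size)


# Toy read: low-quality ends, vector in the first 40 bases (made-up values).
quals = [5] * 30 + [40] * 200 + [5] * 20
is_vector = [True] * 40 + [False] * 210
print(window_trim(quals, is_vector))
```

Requiring every base to pass, rather than the window mean, keeps low-quality tails out of the clear range in this toy case; the real tool may use a different criterion.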