Bos taurus

Revision as of 20:30, 16 December 2008

BCM

NCBI

 SPECIES_CODE = "BOS TAURUS"                                                          37,788,710 traces
 SPECIES_CODE = "BOS TAURUS" and CENTER_NAME = "BCM"                                  35,596,825 traces
 
 SPECIES_CODE = "BOS TAURUS" and CENTER_NAME = "BCM" and TRACE_TYPE_CODE = "WGS"      24,863,627 traces
 SPECIES_CODE = "BOS TAURUS" and CENTER_NAME = "BCM" and TRACE_TYPE_CODE = "SHOTGUN"  10,716,306 traces
 SPECIES_CODE = "BOS TAURUS" and CENTER_NAME = "BCM" and TRACE_TYPE_CODE = "CLONEEND"     16,892 traces
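
As a quick consistency check on the query counts above, the three BCM trace types can be summed and compared with the BCM total. This is only a sketch over the numbers copied from this page; the variable names are illustrative and nothing here queries the NCBI Trace Archive.

```python
# Trace counts copied from the query results above (not fetched from NCBI).
all_bos_taurus = 37_788_710
bcm_total = 35_596_825
bcm_by_type = {
    "WGS": 24_863_627,
    "SHOTGUN": 10_716_306,
    "CLONEEND": 16_892,
}

# The three trace types should account for every BCM trace.
bcm_sum = sum(bcm_by_type.values())
types_cover_bcm = bcm_sum == bcm_total

# Traces submitted by centers other than BCM.
non_bcm = all_bos_taurus - bcm_total

print(bcm_sum, types_cover_bcm, non_bcm)
```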

Submission

  • Title: "A whole-genome assembly of the cow, Bos taurus"
  • Authors:
 Steven Salzberg
 Aleksey Zimin
 Arthur Delcher
 Liliana Florea
 David Kelley
 Finian Hanrahan
 Guillaume Marcais
 Geo Pertea
 Michael Roberts
 Michael Schatz
 Curt Van Tassell
 James Yorke
 Poorani S.
  • Assembler:
 Celera Assembler and UMD Overlapper.
  • Sequencing Center:
 Baylor College of Medicine.
  • Source of DNA used for sequencing:

The source of the BAC library DNA was Hereford bull L1 Domino 99375, registration number 41170496. Dr. Michael MacNeil's laboratory, USDA-ARS, Miles City, MT, provided the blood. The DNA for the whole-genome shotgun sequences was provided by Dr. Timothy Smith's laboratory, U.S. Meat Animal Research Center, Clay Center, NE, from white blood cells from L1 Dominette 01449, American Hereford Association registration number 42190680 (a daughter of L1 Domino 99375). A skin fibroblast cell line from the same animal is available from Dr. Carol Chitko-McKown's laboratory, although there is no sequence from that cell line.

  • Sequence modifiers:
 [organism=Bos taurus][breed=Hereford][tech=wgs][chromosome=...]
  • Submission: use Sequin
 /nfshomes/dpuiu/szdevel/sequin.8.10/sequin
  • Sequence:
 Contig length summary:
           #seqs   min     max     mean    median  n50     sum
 all       210657  71      840370  13709   1523    78511   2887902366
 placed    75775   88      840370  34512   13416   88287   2615171268
 unplaced  134882  71      166670  2022    1322    1742    272731098
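
The columns of the contig length summary can be computed from a list of contig lengths along these lines. This is a minimal sketch, not the script that produced the table: the function name `length_stats` is hypothetical, N50 is taken here as the length of the contig at which the running total (longest first) reaches half of all bases, and the integer rounding of mean and median is an assumption.

```python
from statistics import median
from typing import Dict, Sequence


def length_stats(lengths: Sequence[int]) -> Dict[str, int]:
    """Return #seqs, min, max, mean, median, n50 and sum for contig lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    # N50: walk contigs from longest to shortest until half the bases are covered.
    running = 0
    n50 = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            n50 = length
            break
    return {
        "#seqs": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": total // len(lengths),   # assumption: mean truncated to an integer
        "median": int(median(lengths)),  # assumption: truncated for even counts
        "n50": n50,
        "sum": total,
    }


toy = [2, 3, 4, 5, 6]  # made-up lengths, not taken from the assembly
stats = length_stats(toy)
print(stats)
```

For the toy lengths the total is 20 bases, so the running total first reaches half of that at the contig of length 5.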


Problems

Duplicates

 deg0003136509,7180003440308 : both unplaced
 deg0003084562,7180002954167 : both unplaced
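
The page does not say how these pairs were found. One simple way to look for such pairs, assuming "duplicate" means two contigs with exactly the same sequence, is to group FASTA records by sequence. The helper names `read_fasta` and `duplicate_groups` and the toy identifiers are hypothetical, and the sketch ignores reverse complements and partial overlaps.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from FASTA-formatted lines."""
    name = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            fields = line[1:].split()
            name = fields[0] if fields else ""
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def duplicate_groups(records: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Return lists of identifiers whose sequences are identical (case-insensitive)."""
    by_seq: Dict[str, List[str]] = defaultdict(list)
    for name, seq in records:
        by_seq[seq.upper()].append(name)
    return [names for names in by_seq.values() if len(names) > 1]


# Toy input with made-up identifiers and sequences.
toy_fasta = """>ctgA
ACGTAC
GT
>ctgB
TTTT
>ctgC
acgtacgt
""".splitlines()

groups = duplicate_groups(read_fasta(toy_fasta))
print(groups)
```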

Contaminants

210657 TOTAL

All:

                             #elem   min     max     mean    median  n50     sum          unplaced       placed 
 list.exclude_contigs(ctgs)  4813    316     16661   1511    1485    1514    7271697      4801           12
 list.trim_contigs(ctgs)     19049   457     433546  4424    1204    70673   84270091     16631          2418
 list.trim_contigs(regions)  30279   48      2479    354     319     445     10724357     .              .
 

Exclude: 4813 DELETED

                             #elem   min     max     mean    median  n50     sum          unplaced       placed    
 all                         4813    316     16661   1511    1485    1514    7271697      4801           12     all DELETED
 mitochondrion               73      1003    2081    1368    1299    1390    99872        73             0
 degenerates                 43      316     1098    842     872     889     36201        38             5
 contaminants                4697    861     16661   1519    1487    1517    7135624      4690           7

Trim: 17870 DELETED

                             #elem   min     max     mean    median  n50     sum          unplaced      placed
 all                         19049   457     433546  4424    1204    70673   84270091     16631         2418
 2000-                       17870   457     1995    1234    1188    1224    22044745     16448         1422       all DELETED
 2000+                       1179    2004    433546  52778   28023   112295  62225346     183           996        some DELETED
 

Trim 2000+ vector in the middle: 7+4+6=17 DELETED, 4*2+6*2=20 ADDED

                             #elem   min     max     mean    median  n50     sum          unplaced      placed    notes
 all                         24      2190    274661  64464   20588   162585  1547139      8             16
 deleted                     7       2190    3246    2667    2654    2654    18668        6             1         all DELETED
 trimmed                     7       4182    162585  53460   20588   102303  374220       2             5
 replaced_by_2               4       35208   224013  113166  143258  143258  452665       0             4         join100004686,join100006091,join100007519,join100007923
 split_into_2                6       5365    274661  116931  149360  227489  701586       0             6         7180003327752,7180003312564,7180003340990,7180003313669,7180003378076,7180003341221
 TOTAL+ADDED         = 210657+20=210677
 TOTAL+ADDED-DELETED = 210657+20-4813-17870-17=187977 contigs
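
The contig bookkeeping in the two lines above can be reproduced with a few lines of arithmetic. This is just a sketch over the counts listed on this page; the variable names are illustrative.

```python
total_before = 210_657

# Contigs added: each replaced or split contig yields two new contigs.
replaced_by_2 = 4
split_into_2 = 6
added = replaced_by_2 * 2 + split_into_2 * 2

# Contigs deleted in each step.
deleted_exclude = 4_813       # Exclude
deleted_trim_short = 17_870   # Trim, the "2000-" row
deleted_mid_vector = 7 + replaced_by_2 + split_into_2  # Trim 2000+, vector in the middle

total_with_added = total_before + added
remaining = (
    total_with_added - deleted_exclude - deleted_trim_short - deleted_mid_vector
)
print(added, deleted_mid_vector, total_with_added, remaining)
```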

Summary:

                             #ctgs   min     max     mean    median  n50     total_bp
 UMD_Freeze2.0_contam        210657  71      840370  13709   1523    78508   2887902366
 UMD_Freeze2.0               187704  71      840370  15224   1609    79580   2857605192

Files: /fs/szasmg3/dpuiu/bos_taurus/submission/decontam
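
The net effect of decontamination can be read off the Summary table by subtracting the two rows. A small sketch using only the #ctgs and total_bp values reported there (the dictionary layout is illustrative):

```python
before = {"ctgs": 210_657, "total_bp": 2_887_902_366}  # UMD_Freeze2.0_contam
after = {"ctgs": 187_704, "total_bp": 2_857_605_192}   # UMD_Freeze2.0

# Net change; contigs were both deleted and added during decontamination.
ctgs_removed = before["ctgs"] - after["ctgs"]
bp_removed = before["total_bp"] - after["total_bp"]
pct_bp_removed = 100 * bp_removed / before["total_bp"]

print(ctgs_removed, bp_removed, f"{pct_bp_removed:.2f}%")
```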

Local files

  • Freeze dir files
 /fs/szasmg3/bos_taurus/Bos_taurus_UMD_2.0/contigs.unplaced.fa  : sequences
 /fs/szasmg3/bos_taurus/Bos_taurus_UMD_2.0/bos_taurus.agp       : all scaffolds
  • Files uploaded
 FTP server: ftp-private.ncbi.nlm.nih.gov
 Account: cbcb_trc
 Dir: uploads/
 Local files: /fs/szasmg3/dpuiu/bos_taurus/submission/ftp/   : 22 *sqn + 1 agp