Bos taurus

Revision as of 20:47, 11 December 2008 by Dpuiu (talk | contribs) (→‎Contaminants)

BCM

NCBI

 SPECIES_CODE = "BOS TAURUS"                                                          37,788,710 traces
 SPECIES_CODE = "BOS TAURUS" and CENTER_NAME = "BCM"                                  35,596,825 traces
 
 SPECIES_CODE = "BOS TAURUS" and CENTER_NAME = "BCM" and TRACE_TYPE_CODE = "WGS"      24,863,627 traces
 SPECIES_CODE = "BOS TAURUS" and CENTER_NAME = "BCM" and TRACE_TYPE_CODE = "SHOTGUN"  10,716,306 traces
 SPECIES_CODE = "BOS TAURUS" and CENTER_NAME = "BCM" and TRACE_TYPE_CODE = "CLONEEND"     16,892 traces
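
The counts above can be cross-checked with a few lines of Python. This is only a sketch that re-adds the numbers listed in the queries; the variable names are illustrative and not part of any NCBI tool.

```python
# Trace counts copied from the query results above.
all_centers_total = 37_788_710   # SPECIES_CODE = "BOS TAURUS"
bcm_total = 35_596_825           # ... and CENTER_NAME = "BCM"
bcm_by_type = {
    "WGS": 24_863_627,
    "SHOTGUN": 10_716_306,
    "CLONEEND": 16_892,
}

# Do the three trace types account for every BCM trace?
types_cover_bcm = sum(bcm_by_type.values()) == bcm_total

# Traces submitted by centers other than BCM.
non_bcm = all_centers_total - bcm_total

print("types cover all BCM traces:", types_cover_bcm)
print("non-BCM traces:", non_bcm)
```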

Submission

  • Title: "A whole-genome assembly of the cow, Bos taurus"
  • Authors:
 Steven Salzberg
 Aleksey Zimin
 Arthur Delcher
 Liliana Florea
 David Kelley
 Finian Hanrahan
 Guillaume Marcais
 Geo Pertea
 Michael Roberts
 Michael Schatz
 Curt Van Tassell
 James Yorke
 Poorani S.
  • Assembler:
 Celera Assembler and UMD Overlapper.
  • Sequencing Center:
 Baylor College of Medicine. 
  • Source of DNA used for sequencing:

The BAC library DNA came from Hereford bull L1 Domino 99375, registration number 41170496; Dr. Michael MacNeil's laboratory, USDA-ARS, Miles City, MT provided the blood. The DNA for the whole-genome shotgun sequences was provided by Dr. Timothy Smith's laboratory, U.S. Meat Animal Research Center, Clay Center, NE, from white blood cells of L1 Dominette 01449, American Hereford Association registration number 42190680 (a daughter of L1 Domino 99375). A skin fibroblast cell line from the same animal is available from Dr. Carol Chitko-McKown's laboratory, although no sequence was generated from that cell line.

  • Sequence modifiers:
 [organism=Bos taurus][breed=Hereford][tech=wgs][chromosome=...]
  • Submission: use Sequin
 /nfshomes/dpuiu/szdevel/sequin.8.10/sequin
  • Sequence:
 Contig length summary:
           #seqs   min     max     mean    median  n50     sum
 all       210657  71      840370  13709   1523    78511   2887902366
 placed    75775   88      840370  34512   13416   88287   2615171268
 unplaced  134882  71      166670  2022    1322    1742    272731098
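
For reference, the columns of the contig length summary can be recomputed from a list of contig lengths roughly as follows. This is a minimal sketch, not the script that produced the table: the function names are hypothetical, and the rounding of the mean and the choice of median element are assumptions.

```python
def n50(lengths):
    """Largest length L such that contigs of length >= L together
    contain at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def summarize(lengths):
    """Return the summary columns for a non-empty list of contig lengths."""
    ordered = sorted(lengths)
    count = len(ordered)
    total = sum(ordered)
    return {
        "#seqs": count,
        "min": ordered[0],
        "max": ordered[-1],
        "mean": total // count,         # assumption: mean truncated to an integer
        "median": ordered[count // 2],  # assumption: upper middle element
        "n50": n50(ordered),
        "sum": total,
    }


# Toy input, not real contig lengths.
toy_summary = summarize([2, 3, 4, 5, 6, 10])
print(toy_summary)
```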


Problems

Duplicates

 deg0003136509,7180003440308 : both unplaced
 deg0003084562,7180002954167 : both unplaced
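
The page does not record how these duplicate pairs were found. One simple way to look for exact duplicates (on either strand) is to group sequences by a canonical form, as in the sketch below; the function names and the toy records are hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def canonical(seq):
    """Return the lexicographically smaller of a sequence and its
    reverse complement, so both strands map to the same key."""
    seq = seq.upper()
    reverse_complement = seq.translate(_COMPLEMENT)[::-1]
    return min(seq, reverse_complement)


def find_duplicates(records):
    """records: iterable of (name, sequence) pairs.
    Returns lists of names whose sequences are identical on either strand."""
    groups = defaultdict(list)
    for name, seq in records:
        groups[canonical(seq)].append(name)
    return [names for names in groups.values() if len(names) > 1]


# Toy records; "ctgB" is the reverse complement of "ctgA".
toy_records = [("ctgA", "ACGG"), ("ctgB", "CCGT"), ("ctgC", "AAAA")]
duplicates = find_duplicates(toy_records)
print(duplicates)
```

This only catches identical sequences; near-duplicates or contained contigs would need an alignment-based comparison instead.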

Contaminants

210657 TOTAL

All:

                             #elem   min     max     mean    median  n50     sum          unplaced       placed 
 list.exclude_contigs(ctgs)  4813    316     16661   1511    1485    1514    7271697      4801           12
 list.trim_contigs(ctgs)     19049   457     433546  4424    1204    70673   84270091     16631          2418
 list.trim_contigs(regions)  30279   48      2479    354     319     445     10724357     .              .
 

Exclude: 4813 DELETED

                             #elem   min     max     mean    median  n50     sum          unplaced       placed    
 all                         4813    316     16661   1511    1485    1514    7271697      4801           12     all DELETED
 mitochondrion               73      1003    2081    1368    1299    1390    99872        73             0
 degenerates                 43      316     1098    842     872     889     36201        38             5
 contaminants                4697    861     16661   1519    1487    1517    7135624      4690           7

Trim: 17870 DELETED

                             #elem   min     max     mean    median  n50     sum          unplaced      placed
 all                         19049   457     433546  4424    1204    70673   84270091     16631         2418
 2000-                       17870   457     1995    1234    1188    1224    22044745     16448         1422       all DELETED
 2000+                       1179    2004    433546  52778   28023   112295  62225346     183           996        some DELETED
 

Trim 2000+ vector in the middle: 7+4+6=17 DELETED, 4*2+6*2=20 ADDED

                             #elem   min     max     mean    median  n50     sum          unplaced      placed
 all                         24      2190    274661  64464   20588   162585  1547139      8             16
 deleted                     7       2190    3246    2667    2654    2654    18668        6             1         all DELETED
 trimmed                     7       4182    162585  53460   20588   102303  374220       2             5
 replaced_by_2               4       35208   224013  113166  143258  143258  452665       0             4         join100004686,join100006091,join100007519,join100007923                 
 split_into_2                6       5365    274661  116931  149360  227489  701586       0             6         7180003327752,7180003312564,7180003340990,7180003313669,7180003378076,7180003341221
 TOTAL+ADDED         = 210657+20=210677
 TOTAL+ADDED-DELETED = 210657+20-4813-17870-17=187977 contigs
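
When a flagged region falls in the middle of a contig, removing it leaves two pieces, which is why each split contig above adds two new sequences. A minimal sketch of that operation is shown below. The function name, the 0-based half-open coordinates, and the use of 2000 bp as the minimum piece length are assumptions for illustration, not the procedure actually used.

```python
def split_around_regions(seq, regions, min_len=2000):
    """Cut the given regions out of seq and return the remaining pieces.

    regions: list of (start, end) pairs, 0-based and half-open.
    Pieces shorter than min_len are dropped (assumed threshold).
    """
    pieces = []
    cursor = 0
    for start, end in sorted(regions):
        if start > cursor:
            pieces.append(seq[cursor:start])
        cursor = max(cursor, end)  # also handles overlapping regions
    if cursor < len(seq):
        pieces.append(seq[cursor:])
    return [piece for piece in pieces if len(piece) >= min_len]


# Toy contig: 10 bases, a 3-base flagged region, then 8 more bases.
toy_contig = "A" * 10 + "N" * 3 + "C" * 8
print(split_around_regions(toy_contig, [(10, 13)], min_len=5))
```

With a higher `min_len`, one or both pieces may be discarded, which corresponds to the "trimmed" and "deleted" rows of the table.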

Summary:

              #ctgs   min     max     mean    median  n50     total_bp
 before        210657  71      840370  13709   1523    78508   2887902366
 after         187977  71      840370  15204   1607    79580   2857967035
 difference    22680                                           29935331 
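
The bookkeeping behind the summary can be re-derived from the figures reported in the sections above. The sketch below only repeats that arithmetic; the variable names are illustrative.

```python
# Figures copied from the tables above.
before_contigs = 210657
before_bp = 2887902366
after_bp = 2857967035

excluded = 4813             # "Exclude" set, all deleted
trimmed_short = 17870       # "2000-" trim set, all deleted
middle_deleted = 7 + 4 + 6  # deleted + replaced_by_2 + split_into_2
added = 4 * 2 + 6 * 2       # two new contigs per replaced or split contig

deleted = excluded + trimmed_short + middle_deleted
after_contigs = before_contigs + added - deleted

contig_difference = before_contigs - after_contigs
bp_difference = before_bp - after_bp

print("contigs after decontamination:", after_contigs)
print("contigs removed (net):", contig_difference)
print("bases removed:", bp_difference)
```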

Files: /fs/szasmg3/dpuiu/bos_taurus/submission/decontam

Local files

  • Freeze dir files
 /fs/szasmg3/bos_taurus/Bos_taurus_UMD_2.0/contigs.unplaced.fa  : sequences
 /fs/szasmg3/bos_taurus/Bos_taurus_UMD_2.0/bos_taurus.agp       : all scaffolds
  • Files uploaded
 Ftp server: ftp-private.ncbi.nlm.nih.gov
 Account: cbcb_trc
 Dir: uploads/
 Local files: /fs/szasmg3/dpuiu/bos_taurus/submission/ftp/   : 22 *sqn + 1 agp