Bos taurus: Difference between revisions

== Contaminants ==


210657 TOTAL
All:
                              #elem  min  max     mean  median  n50    sum       unplaced  placed
  list.exclude_contigs(ctgs)  4813   316  16661   1511  1485    1514   7271697   4801      12
  list.trim_contigs(ctgs)     19049  457  433546  4424  1204    70673  84270091  16631     2418
  list.trim_contigs(regions)  30279  48   2479    354   319     445    10724357  .         .
 
----
Exclude: 4813 DELETED
                              #elem  min   max    mean  median  n50   sum      unplaced  placed
  all                         4813   316   16661  1511  1485    1514  7271697  4801      12      all DELETED
  mitochondrion               73     1003  2081   1368  1299    1390  99872    73        0
  degenerates                 43     316   1098   842   872     889   36201    38        5
  contaminants                4697   861   16661  1519  1487    1517  7135624  4690      7
----
Trim: 17870 DELETED
                              #elem  min   max     mean   median  n50     sum       unplaced  placed
  all                         19049  457   433546  4424   1204    70673   84270091  16631     2418
  2000-                       17870  457   1995    1234   1188    1224    22044745  16448     1422    all DELETED
  2000+                       1179   2004  433546  52778  28023   112295  62225346  183       996     some DELETED
 
Trim 2000+ vector in the middle: 7+4+6=17 DELETED, 4*2+6*2=20 ADDED
                              #elem  min    max     mean    median  n50     sum      unplaced  placed
  all                         24     2190   274661  64464   20588   162585  1547139  8         16
  deleted                     7      2190   3246    2667    2654    2654    18668    6         1       all DELETED
  trimmed                     7      4182   162585  53460   20588   102303  374220   2         5
  replaced_by_2               4      35208  224013  113166  143258  143258  452665   0         4       join100004686,join100006091,join100007519,join100007923
  split_into_2                6      5365   274661  116931  149360  227489  701586   0         6       7180003327752,7180003312564,7180003340990,7180003313669,7180003378076,7180003341221
  TOTAL+ADDED         = 210657+20=210677
  TOTAL+ADDED-DELETED = 210657+20-4813-17870-17=187977 contigs
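
The contig bookkeeping above can be re-checked with a few lines of Python. This is only a sketch for verifying the arithmetic; the variable names are illustrative and are not taken from the decontamination scripts.

```python
# Re-derive the post-decontamination contig count from the tables above.
# All variable names are hypothetical; only the numbers come from the page.
total_before = 210657            # "210657 TOTAL"

# Contigs added when vector in the middle of a 2000+ contig was trimmed:
# 4 contigs replaced by 2 pieces each, 6 contigs split into 2 pieces each.
added = 4 * 2 + 6 * 2

deleted_exclude = 4813           # mitochondrion + degenerates + contaminants
deleted_trim_short = 17870       # trimmed contigs shorter than 2000 bp
deleted_trim_vector = 7 + 4 + 6  # deleted + replaced_by_2 + split_into_2

total_after = (total_before + added
               - deleted_exclude - deleted_trim_short - deleted_trim_vector)
print(total_before + added, total_after)
```

The result agrees with the 187977 contigs stated in the TOTAL+ADDED-DELETED line.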
Summary:
                               #ctgs  min    max    mean    median  n50    total_bp
   UMD_Freeze2.0_contam        210657  71      840370  13709  1523    78508  2887902366

Revision as of 15:08, 18 December 2008

BCM

NCBI

 SPECIES_CODE = "BOS TAURUS"                                                          37,788,710 traces
 SPECIES_CODE = "BOS TAURUS" and CENTER_NAME = "BCM"                                  35,596,825 traces
 
 SPECIES_CODE = "BOS TAURUS" and CENTER_NAME = "BCM" and TRACE_TYPE_CODE = "WGS"      24,863,627 traces
 SPECIES_CODE = "BOS TAURUS" and CENTER_NAME = "BCM" and TRACE_TYPE_CODE = "SHOTGUN"  10,716,306 traces
 SPECIES_CODE = "BOS TAURUS" and CENTER_NAME = "BCM" and TRACE_TYPE_CODE = "CLONEEND"     16,892 traces
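
As a quick consistency check on the query results above, the three BCM trace types can be summed and compared with the BCM total. This is a sketch only; the dictionary and variable names are illustrative.

```python
# Trace counts copied from the query results above.
all_traces = 37_788_710   # SPECIES_CODE = "BOS TAURUS"
bcm_traces = 35_596_825   # ... and CENTER_NAME = "BCM"

bcm_by_type = {
    "WGS": 24_863_627,
    "SHOTGUN": 10_716_306,
    "CLONEEND": 16_892,
}

# The three listed trace types account for every BCM trace.
bcm_sum = sum(bcm_by_type.values())
print(bcm_sum == bcm_traces)

# Traces deposited by centers other than BCM.
non_bcm_traces = all_traces - bcm_traces
print(non_bcm_traces)
```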

Submission

  • Title: "A whole-genome assembly of the cow, Bos taurus"
  • Authors:
 Steven Salzberg
 Aleksey Zimin
 Arthur Delcher
 Liliana Florea
 David Kelley
 Finian Hanrahan
 Guillaume Marcais
 Geo Pertea
 Michael Roberts
 Michael Schatz
 Curt Van Tassell
 James Yorke
 Poorani S.
  • Assembler:
 Celera Assembler and UMD Overlapper.
  • Sequencing Center:
 Baylor College of Medicine.
  • Source of DNA used for sequencing:

The source of the BAC library DNA was Hereford bull L1 Domino 99375, registration number 41170496. Dr. Michael MacNeil's laboratory, USDA-ARS, Miles City, MT provided the blood. The DNA for the whole genome shotgun sequences was provided by Dr. Timothy Smith's laboratory, U.S. Meat Animal Research Center, Clay Center, NE from white blood cells from L1 Dominette 01449, American Hereford Association registration number 42190680 (a daughter of L1 Domino 99375). A skin cell fibroblast cell line from the same animal is available from Dr. Carol Chitko-McKown's laboratory, although there is no sequence from that cell line.

  • Sequence modifiers:
 [organism=Bos taurus][breed=Hereford][tech=wgs][chromosome=...]
  • Submission: use Sequin
 /nfshomes/dpuiu/szdevel/sequin.8.10/sequin
  • Sequence:
 Contig length summary:
           #seqs   min     max     mean    median  n50     sum
 all       210657  71      840370  13709   1523    78511   2887902366
 placed    75775   88      840370  34512   13416   88287   2615171268
 unplaced  134882  71      166670  2022    1322    1742    272731098
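
The columns of the contig length summary can be reproduced from a plain list of contig lengths. The sketch below is not the script that produced the table. It assumes the common N50 definition (the largest length L such that contigs of length at least L cover at least half of the total bases) and a truncated integer mean, and the function name `contig_stats` is hypothetical.

```python
from statistics import median


def contig_stats(lengths):
    """Summarize contig lengths: #seqs, min, max, mean, median, n50, sum."""
    if not lengths:
        raise ValueError("no contig lengths given")

    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)

    # Walk from the longest contig down until half of the bases are covered.
    running = 0
    n50 = ordered[-1]
    for length in ordered:
        running += length
        if 2 * running >= total:
            n50 = length
            break

    return {
        "#seqs": len(ordered),
        "min": ordered[-1],
        "max": ordered[0],
        "mean": total // len(ordered),  # assumption: the table truncates the mean
        "median": median(ordered),
        "n50": n50,
        "sum": total,
    }


# Toy input, not real contig data.
stats = contig_stats([2, 3, 4, 5, 6])
print(stats)
```

For the toy input the total is 20 bases, and the two longest contigs (6 and 5) already cover more than half of it, so the N50 is 5.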


Problems

Duplicates

 deg0003136509,7180003440308 : both unplaced
 deg0003084562,7180002954167 : both unplaced

Contaminants

                             #ctgs   min     max     mean    median  n50     total_bp
 UMD_Freeze2.0_contam        210657  71      840370  13709   1523    78508   2887902366
 UMD_Freeze2.0               187704  71      840370  15224   1609    79580   2857605192

Files: /fs/szasmg3/dpuiu/bos_taurus/submission/decontam

Local files

  • Freeze dir files
 /fs/szasmg3/bos_taurus/Bos_taurus_UMD_2.0/contigs.unplaced.fa  : sequences
 /fs/szasmg3/bos_taurus/Bos_taurus_UMD_2.0/bos_taurus.agp       : all scaffolds
  • Files uploaded
 Ftp server: ftp-private.ncbi.nlm.nih.gov
 Account: cbcb_trc
 Dir: uploads/
 Local files: /fs/szasmg3/dpuiu/bos_taurus/submission/ftp/   : 22 *sqn + 1 agp