Pine tree: Difference between revisions
Revision as of 16:55, 15 July 2011
Links
- dendrome@ucdavis
- pinegenome.org
- NCBI Taxonomy record Pinus taeda or "loblolly pine"
Data
- UCDAVIS plone
 https://dendrome.ucdavis.edu/TGPlone/research-projects/pinerefseq
 dpuiu ddr5fft6
 https://dendrome.ucdavis.edu/TGPlone/research-projects/pinerefseq/files/library-and-flow-cell-data/prs-tracking-database-archive/
- IPST ftp
 ftp genomepc1.umd.edu
 ftpuser
 pinegenome
 cd PineUpload052911/
 bin
 prompt # no Y/N?
 mget *
- Local data
 ginkgo:
 /scratch1/dpuiu/PINE/PineUpload052911
 /scratch1/dpuiu/PINE/PineUpload070711
PineUpload052911
Chloroplast
               len     gc%
 cChloroplast  120481  38.55
cBACs
 .    elem  min    q1     q2      q3      max     mean    n50     sum
 len  102   8288   89909  116121  140549  172161  113400  126689  11566806
 gc%  102   34.44  36.56  37.61   38.80   52.88   37.94   37.66   3870.87
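The n50 column in the table above is presumably the usual N50 statistic: the length L such that sequences of length >= L together cover at least half of the total. The tool that produced the table is not named here, so the following is only a minimal sketch under that assumption; the function name `n50` is illustrative.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Assumes the common definition: walk the lengths from longest to
    shortest and return the first length at which the running sum
    reaches at least half of the total.
    """
    lengths = sorted(lengths, reverse=True)
    if not lengths:
        raise ValueError("n50() needs at least one length")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Toy input, not the cBAC data: total is 20, and 6 + 5 = 11 >= 10.
    print(n50([2, 3, 4, 5, 6]))
```

Applied to the 102 cBAC lengths, this should give a value comparable to the n50 entry in the len row, provided the original tool used the same definition.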
Reads
 library        readLen  #mates
 FC638TR_001_8  146      22,729,231
 FC638TR_002_8  146      18,412,638
- Notes
- First 2bp of each read have higher A count
- GC% variation:
                 medianGC%
   cChloroplast  38.55
   cBAC          37.61
   reads         39.04
- cChloroplast alignments (bwasw)
   library          #hits    %hits
   FC638TR_001_8_1  475254   2.09
   FC638TR_001_8_2  473331   2.08
   FC638TR_002_8_1  1009331  5.48
   FC638TR_002_8_2  1004341  5.45
- cBAC alignments (bwasw)
   library          #hits    %hits
   FC638TR_001_8_1  9722204  42.77
   FC638TR_001_8_2  9481188  41.71
   FC638TR_002_8_1  7684164  41.73
   FC638TR_002_8_2  7469151  40.56
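The %hits values in the two alignment tables appear to be #hits divided by the #mates count of the corresponding library in the Reads table. A quick sketch to re-derive them, assuming that interpretation; the function and dictionary names are illustrative, and the numbers are copied from the tables above.

```python
def percent_hits(hits, total_reads):
    """Percentage of reads with a hit, given the total number of reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * hits / total_reads


# #mates per library, from the Reads table.
MATES = {"FC638TR_001_8": 22_729_231, "FC638TR_002_8": 18_412_638}

# bwasw hits against cChloroplast, one entry per read file.
CHLOROPLAST_HITS = {
    "FC638TR_001_8_1": 475_254,
    "FC638TR_001_8_2": 473_331,
    "FC638TR_002_8_1": 1_009_331,
    "FC638TR_002_8_2": 1_004_341,
}

if __name__ == "__main__":
    for read_file, hits in CHLOROPLAST_HITS.items():
        # Drop the trailing _1/_2 mate suffix to recover the library name.
        library = read_file.rsplit("_", 1)[0]
        print(f"{read_file}\t{hits}\t{percent_hits(hits, MATES[library]):.2f}")
```

The last digit can differ from the tables depending on whether the original values were rounded or truncated.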
Sampled reads
- 100K sampled reads from each library (2*2*100K=400K)
   .    elem    min   q1     q2     q3     max    mean   n50    sum
   gc%  400000  0.68  34.93  39.04  43.15  95.89  39.20  40.41  .
- FC638TR_001_8_1 : 100K reads
   ref           qry              aligner  #hits  %hits  %identity(median)
   cBAC          FC638TR_001_8_1  bwasw    42971  43
                                  nucmer   12477  12.5   95
                                  bowtie   1186   1.2
   cChloroplast                   bwasw    2031   2
                                  nucmer   1943   1.9    100
                                  bowtie   1490   1.5
- FC638TR_00[12]_8_[12] : 4*100K reads
   ref           qry              aligner  #hits  %hits
   cBAC          FC638TR_001_8_1  bwasw    42971  43
                 FC638TR_001_8_2           41915  42
                 FC638TR_002_8_1           42128  42
                 FC638TR_002_8_2           40606  41
   cChloroplast  FC638TR_001_8_1           2031   2
                 FC638TR_001_8_2           2033   2
                 FC638TR_002_8_1           5370   5.3
                 FC638TR_002_8_2           5330   5.3
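The gc% row for the sampled reads (elem, min, q1, q2, q3, max, mean) could be reproduced along these lines. This is only a sketch: how the reads were sampled and how the original script treated ambiguous bases is not recorded here, so `gc_percent` and `gc_summary` are hypothetical helpers that count every base (including N) in the denominator.

```python
from statistics import quantiles


def gc_percent(seq):
    """GC content of one read, as a percentage of all its bases."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def gc_summary(reads):
    """Summarize per-read GC% (needs at least two reads for quartiles)."""
    values = sorted(gc_percent(read) for read in reads)
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "elem": len(values),
        "min": values[0],
        "q1": q1,
        "q2": q2,
        "q3": q3,
        "max": values[-1],
        "mean": sum(values) / len(values),
    }


if __name__ == "__main__":
    # Toy reads, not pine data.
    print(gc_summary(["AAAA", "ACAA", "ACGT", "GCGA", "GCGC"]))
```

Quartile conventions vary between tools, so small differences from the table in q1 and q3 would not be surprising.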
PineUpload070711
Ecoli
          len      gc%
 cE_coli  4639675  50.79
Cloning vector
            len   gc%
 pFosDT5_2  8345  47.93
Reads (Drosophila)
 lib                    readLen  #mates    mean,std
 FC70M6V_6_001          160,156  23546475  343,30
 TIL_242_FC70M6V_2_002  160,156  9917211   242
 TIL_242_FC70M6V_3_002  160,156  6276300   242
 TIL_254_FC70M6V_2_004  160,156  9279789   254
 TIL_254_FC70M6V_3_004  160,156  5924239   254
 TIL_270_FC70M6V_2_003  160,156  10188776  270
 TIL_270_FC70M6V_3_003  160,156  6556676   270
 TIL_288_FC70M6V_2_001  160,156  9524524   288
 TIL_288_FC70M6V_3_001  160,156  6158919   288
- kastevens@ucdavis.edu:
- The files labeled TIL_XXX_FC70M6V_Y_00Z are Drosophila libraries with a median target insert size of XXX. They come in pairs and can be merged.
- Regarding pairing, each insert size was run in two lanes Y at two different concentrations.
- Lane 3, with the lower concentration, should have higher quality data than lane 2 but with a higher cost per bp.
- The loss in quality was quantitatively small, so we don't expect the extra expense of lowering the concentration to be justified empirically.
- The first library, FC70M6V_6_001, is a ~40x library created from a pool of ~1000 fosmids. In general, we do not put the insert size in the filename.
- However, we did estimate the insert size to be 343bp with a below-median standard deviation of 30. So roughly 15% of the inserts are < 313bp and have > 3bp overlap. This seems to fit well with your result.
- Each lane is multiplexed into sub-lanes indicated by 00Z, so the number of reads in a file is variable and not necessarily reflective of the cluster density.
- The Drosophila libraries were each run in 1/4 lane and the fosmid pool was run in 1/2 lane. The pool has roughly double the sequence content of the Drosophila libraries run in lane 2 at nominal density.
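The "roughly 15%" and "> 3bp overlap" figures in the note on FC70M6V_6_001 can be checked with a little arithmetic. The sketch below makes two assumptions that are not stated in the notes: that insert sizes are approximately normally distributed, and that the readLen entry 160,156 gives the lengths of the two mates (316bp combined). The function names are illustrative.

```python
from statistics import NormalDist


def overlap_bp(insert_len, read1_len=160, read2_len=156):
    """Bases shared by the two mates of a pair; 0 if they do not overlap."""
    return max(0, read1_len + read2_len - insert_len)


def fraction_shorter_than(threshold, mean=343.0, std=30.0):
    """Fraction of inserts shorter than threshold under a normal model."""
    return NormalDist(mu=mean, sigma=std).cdf(threshold)


if __name__ == "__main__":
    # 313bp is one standard deviation below the estimated mean of 343bp.
    print(f"overlap at 313bp insert: {overlap_bp(313)}bp")
    print(f"fraction of inserts < 313bp: {fraction_shorter_than(313):.3f}")
```

Under these assumptions an insert of 313bp leaves a 3bp overlap, and the fraction of inserts below 313bp comes out near 16%, in line with the "roughly 15%" quoted above.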