Turkey


Data

Chicken (Gallus gallus)

Stats:

.                                                elem       min    q1       q2       q3       max        mean       n50        sum            
Chr1..28,32,MT,W,Z,E22C19W28_E50C23,E64          34         1028   4512026  12968165 30671729 200994015  30377803   94230402   1,032,845,329
gaps(N's)                                        524913     1      30       64       254      1504285    268        792        141,055,297
chicken.len
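
Note on the stats columns: a minimal sketch of how the columns used in the tables on this page (elem, min, q1, q2, q3, max, mean, n50, sum) could be computed from a list of sequence lengths such as a .len file. The quartile rule is an assumption, since the script that produced the tables is not shown here. The integer mean is chosen because the means in the tables appear to be truncated (e.g. 1,032,845,329 / 34 is reported as 30377803). The function names are hypothetical.

```python
def n50(lengths):
    """Smallest length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50 of an empty list is undefined")


def summarize(lengths):
    """Return the table columns for one row, given the sequence lengths."""
    ordered = sorted(lengths)
    n = len(ordered)
    if n == 0:
        raise ValueError("no lengths given")

    def quantile(fraction):
        # Simple rank-based pick; the exact quartile convention is an assumption.
        return ordered[min(n - 1, int(fraction * n))]

    total = sum(ordered)
    return {
        "elem": n,
        "min": ordered[0],
        "q1": quantile(0.25),
        "q2": quantile(0.50),
        "q3": quantile(0.75),
        "max": ordered[-1],
        "mean": total // n,  # truncated, as the table values appear to be
        "n50": n50(ordered),
        "sum": total,
    }


if __name__ == "__main__":
    print(summarize([2, 3, 4, 5, 6, 10]))
```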

Files:

 /fs/szasmg3/dpuiu/chicken/

Zebrafinch (Taeniopygia guttata)

Chr stats:

.                                                elem       min    q1      q2       q3       max        mean       n50        sum            
all(random duplication)                          70         9909   369730  2517995  16419078 175225315  17616947   73657157   1,233,186,341  
all(gaps)                                        107061     25     100     100      100      500000     92         100        9,879,775
  
Chr1,1A,1B,2,3,4,4A,5..28,LG2,LG5,LGE22,M,Un,Z   37         9909   4907541 15652063 36305782 175225315  32343381   73657157   1,196,705,108
zebrafinch.len

Files:

 /fs/szasmg3/dpuiu/zebrafinch/

Turkey (Meleagris gallopavo)

Files:

 /fs/szasmg3/dpuiu/turkey/

Assembly2.0

Stats: preliminary

.                     elem       min    q1       q2       q3       max        mean       n50        sum            
 Ch1..30,40,41        32         531    6400446  15119779 34928883 184590300  28263595   70426150   904,435,047
 gaps                 147792     100    100      100      100      2999       268        860        39,738,918

Stats: final

                      elem       min    q1      q2       q3        max        mean       n50        sum
 Ch1..30,40,41        32         531    7024757 18811362 37793329  207174646  31576111   75696247   1,010,435,575
 Ch1..30,40,41,Un     33         531    7024757 18811362 37793329  207174646  32954439   75696247   1,087,496,503
 turkey.len

Reads:

 TotalUsableReads=151,843,863 (151M)
 AvgClearRange=102
 ContigReads=139021843(91.56%)
 DegenContigReads=8392124(5.53%)
 SurrogateReads=1317962(0.87%)
 SingletonReads=3314375(2.18%)
 Cvg=15X
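
Coverage check: a rough sketch of how the 15X figure can be reproduced as reads × average clear range / genome size. The denominator is an assumption: the scaffold total from the "Assembly stats(all)" table (1,022,394,764 bp) is used here, and the notes do not say which genome size was actually used. The function names are hypothetical.

```python
TOTAL_USABLE_READS = 151_843_863
AVG_CLEAR_RANGE = 102
# Assumption: scaffold total from "Assembly stats(all)" stands in for genome size.
ASSUMED_GENOME_SIZE = 1_022_394_764


def coverage(reads, mean_read_len, genome_size):
    """Average depth of coverage: total sequenced bases over genome size."""
    return reads * mean_read_len / genome_size


def percent_of_reads(count, total=TOTAL_USABLE_READS):
    """Share of the usable reads, in percent."""
    return 100.0 * count / total


cvg = coverage(TOTAL_USABLE_READS, AVG_CLEAR_RANGE, ASSUMED_GENOME_SIZE)

if __name__ == "__main__":
    print(f"coverage ~ {cvg:.1f}X")
    print(f"ContigReads = {percent_of_reads(139_021_843):.2f}%")
```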

Assembly stats(all):

 .                    elem        min    q1     q2     q3     max        mean       n50        sum
 scf                  27,007      66     1354   1988   4793   9558742    37856      1538143    1,022,394,764
 ctg                  145,663     64     1512   3433   8500   91891      6391       12594      930,953,352
 deg                  440,796     64     102    256    485    8055       312        483        137,835,235

Assembly stats(placed): preliminary AGP

 .                    elem        min    q1     q2     q3     max        mean       n50        sum
 scf                  2,504       1001   5868   35589  272564 9558742    362085     1830406    906,662,877
 ctg                  111,752     64     1919   4886   10524  91891      7616       13635      851,209,123
 deg                  36,072      64     144    331    530    8055       373        521        13,487,006
 ctg+deg              147,824     64     520    2783   8197   91891      5849       13426      864,696,129 

Assembly stats(placed by synteny): final AGP

 .                    elem       min    q1     q2     q3     max        mean       n50        sum            
 ctg                  131,217    64     1651   3975   9289   91891      6866       12989      901,044,472      
 deg                  31,426     64     128    283    530    8055       357        540        11,241,382      
 ctg+deg              162,643    64     731    2602   7576   91891      5609       12829      912,285,854

Files:

 /fs/szattic-asmg4/turkey/Assembly2.0/

Scaffold alignment to chicken

  • Parameters:
 nucmer -l 12 -c 65 -g 1000 -b 1000
 delta-filter -1
  • Scf stats
                      elem        min    q1     q2     q3     max        mean       n50        sum      
 aligned              22,045      66     1450   2276   5670   9558742    45827      1562815    1,010,256,240  
 unaligned            4,962       73     1159   1411   1935   119729     2446       2654       12,138,524

 1+alignments/scf     22045       1      1      1      2      1660       6          136        153866

 2+alignments/2+chr   50         11625  55577  1398890 3387095 7409211   1883381    4298282    94,169,060
  • Ctg stats (ctgs in aligned scaff)
                      elem        min    q1     q2     q3     max        mean       n50        sum      
 aligned              139790      64     1580   3665   8822   91891      6585       12739      920,634,899
 unaligned            5873        64     1148   1399   1887   22071      1756       1766       10,318,453
  • Alignment stats
 .                    elem        min    q1     q2     q3     max        mean       n50        sum
 len(all)             202105      11     681    1895   5189   134408     4315       10045      872,231,977 
 len(filter-1)        163390      12     1191   2673   6437   134409     5188       10410      847,715,057
 %id(filter-1)        163390      11.24  81.10  84.82  87.68  100.00     83         85         .
  • turkey scf vs chicken & turkey chr: about 15% of the scaffold sequence seems to align in the opposite orientation. Could the scaffolds have been misoriented by mistake?
 .                    elem       min    q1     q2     q3     max        mean       n50        sum
 opposite             1527       925    2604   7579   32323  6964320    78342      1018939    119629225
 same                 2619       97     2591   11510  128530 9558742    306323     1873938    802261737
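
Orientation arithmetic: a small sketch that recomputes the opposite-orientation share from the "sum" column of the opposite/same rows above. It shows two ways to express it. Which of the two the 15% figure refers to is an assumption; the ratio of opposite to same comes out closest to it.

```python
# Values copied from the "opposite" and "same" rows (elem and sum columns).
OPPOSITE_SCF, OPPOSITE_BP = 1527, 119_629_225
SAME_SCF, SAME_BP = 2619, 802_261_737

# Opposite-orientation bases as a share of all oriented bases.
share_of_total_bp = OPPOSITE_BP / (OPPOSITE_BP + SAME_BP)
# Opposite-orientation bases relative to same-orientation bases.
ratio_to_same_bp = OPPOSITE_BP / SAME_BP
# The same share, counted by number of scaffolds instead of bases.
share_of_scaffolds = OPPOSITE_SCF / (OPPOSITE_SCF + SAME_SCF)

if __name__ == "__main__":
    print(f"opposite / (opposite + same), bp : {100 * share_of_total_bp:.1f}%")
    print(f"opposite / same, bp              : {100 * ratio_to_same_bp:.1f}%")
    print(f"opposite / (opposite + same), scf: {100 * share_of_scaffolds:.1f}%")
```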

Mapping

  • (200+ alignments)
chickenChr    turkeyChr  #alignments
1             Chr1       35025

2             Chr3       18143 :  Chr6 followed by Chr3
2             Chr6       7612

3             Chr2       17765  : Chr2 5' flipped

4             Chr4       11226  : Chr6 followed by Chr4
4             Chr9       2132

5             Chr5       8516

6             Chr8       4552

7             Chr7       4394

8             Chr10      3654 : Chr10 5' flipped

9             Chr11      2729

10            Chr12      2500

11            Chr13      2629

12            Chr14      2158 

13            Chr15      2136

14            Chr16      2109

15            Chr17      1524

17            Chr19      1285

18            Chr20      1374 : Chr20 3' flipped

19            Chr21      1155

20            Chr22      1828

21            Chr23      887

22            Chr24      511

23            Chr25      751

24            Chr26      862

25            Chr27      3

26            Chr28      592

27            Chr29      568

28            Chr30      553

Z             Chr41      4178
Z             Chr1       404
 
W             Chr41      24

E22C19W28_E50C23  ChrUn  71

E64               ChrUn  20
  • Scaffolds with multiple alignment blocks:
    • 44 on different Chr
    • 30 on same chr; 11 appear to be partially flipped
    nl scfid         chickenChr
    1  7180002103050 2
    2  7180002103154 6
    3  7180002103203 3 25 # new
    4  7180002103204 10 28
    5  7180002103206 18
    6  7180002103213 4 26 # new
    7  7180002103242 5    # partially flipped 
    8  7180002103280 1 8
    9  7180002103298 7
   10  7180002103329 6    # partially flipped 
   11  7180002103402 8    # partially flipped 
   12  7180002103421 9
   13  7180002103425 2 7  # new
   14  7180002103431 6    # partially flipped 
   15  7180002103433 1    # partially flipped 
   16  7180002103480 5 6
   17  7180002103500 12 13 # new
   18  7180002103519 3 9
   19  7180002103555 6 18  # new
   20  7180002103557 8
   21  7180002103561 3
   22  7180002103574 1
   23  7180002103597 2 17  # new
   24  7180002103605 2 3   # new
   25  7180002103608 8
   26  7180002103614 2
   27  7180002103617 2     # partially flipped 
   28  7180002103618 1     # partially flipped 
   29  7180002103619 11    # partially flipped 
   30  7180002103620 4
   31  7180002103621 1 2 28
   32  7180002103627 1
   33  7180002103637 6 7   # new
   34  7180002103638 2 18  # new
   35  7180002103642 4
   36  7180002103648 1 3
   37  7180002103653 1 5   # new
   38  7180002103663 6
   39  7180002103668 1
   40  7180002103669 8 9   # new
   41  7180002103670 1 4   # new
   42  7180002103672 2 3
   43  7180002103675 1     # partially flipped 
   44  7180002103677 1
   45  7180002103679 2     # partially flipped 
   46  7180002103681 1 5
   47  7180002103682 1 21
   48  7180002103683 4 17
   49  7180002103684 13    # partially flipped 
   50  7180002103685 1 2
   51  7180002103686 1 3
   52  7180002103688 3 8
   53  7180002103693 12 15
   54  7180002103694 1 2   # new
   55  7180002103695 3
   56  7180002103698 2 12
   57  7180002103702 6 11
   58  7180002103714 4 5
   59  7180002103715 1 2 4
   60  7180002103717 2 10
   61  7180002103720 1 6 7 # new
   62  7180002103723 4 6
   63  7180002103725 1 14
   64  7180002103728 1 9   # new
   65  7180002103736 1 5   # new
   66  7180002103740 7
   67  7180002103742 1 3   # new
   68  7180002103743 6 8 17 
   69  7180002103744 1 17  # new
   70  7180002103750 2 3   # new
   71  7180002103752 9 18
   72  7180002103762 2
   73  7180002103771 1 3 19
   74  7180002103798 7 26  # new
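
Tallying sketch: one way the scaffold list above could be derived, assuming the input is one (scaffold id, chicken chromosome) pair per alignment block. That input layout and the function name are assumptions; the actual script is not recorded here. Scaffolds hitting more than one chicken chromosome are reported separately from those hitting a single one.

```python
from collections import Counter, defaultdict


def classify_scaffolds(blocks):
    """Group alignment blocks per scaffold and split by number of chicken chromosomes hit.

    blocks: iterable of (scaffold_id, chicken_chr) pairs, one per alignment block.
    Returns (multi, single): dicts mapping scaffold id to its sorted chromosome list.
    """
    per_scaffold = defaultdict(Counter)
    for scaffold_id, chicken_chr in blocks:
        per_scaffold[scaffold_id][chicken_chr] += 1

    multi, single = {}, {}
    for scaffold_id, counts in per_scaffold.items():
        # Sort numerically where possible so that "2" comes before "10".
        chroms = sorted(counts, key=lambda c: (not c.isdigit(), int(c) if c.isdigit() else 0, c))
        (multi if len(chroms) > 1 else single)[scaffold_id] = chroms
    return multi, single


if __name__ == "__main__":
    # Three rows taken from the list above, expanded to one pair per block.
    example = [
        ("7180002103050", "2"), ("7180002103050", "2"),
        ("7180002103204", "28"), ("7180002103204", "10"),
        ("7180002103621", "1"), ("7180002103621", "28"), ("7180002103621", "2"),
    ]
    multi, single = classify_scaffolds(example)
    print("different chr:", multi)
    print("same chr     :", single)
```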

Scaffold alignment to zebrafinch

  • Parameters:
 nucmer -l 12 -c 65 -g 1000 -b 1000
 delta-filter -1
  • Alignment stats (44 scf : subset 10)
 .                    elem        min    q1     q2     q3     max        mean       n50        sum
 len(subset 10)*      5286        12     233    485    860    12853      675        1033       3570025
 %id(subset 10)       5286        40.99  74.20  78.57  85.63  100.00     80         79         .

Chromosome alignment to chicken

  • Parameters:
 nucmer -l 12 -c 65 -g 1000 -b 1000
 delta-filter -1 # not yet
  • Alignment stats
 .                    elem        min    q1     q2     q3     max        mean       n50        sum
 len(all)             185138      11     600    2011   5567   134408     4407       10093      815928282
 len(delta-filter -r) 155094      11     1065   2783   6592   134408     5165       10302      801185719
 len(delta-filter -1) 148515      11     1144   2953   6836   134408     5341       10421      793361287

BACs.old

  • Markers:
 37918 : total CH260's
 8558  : assembled in scaffolds
 8641  : total 78TKNMI
  • Scf stats:
                    elem       min    q1      q2      q3      max        mean       n50        sum
 1+markers          1228       1001   24541   247381  879303  9558742    696129     1984837    854,846,919
 0markers           25779      66     1338    1911    4245    1214147    6499       26354      167,547,845

 1+markers/scf      1228       1      1       2       7       110        6          19         8,262

 2+markers/2+chr    38         671404 1525677 2968427 4298282 7409211    3084380    4013969    117,206,475

BACs

  • Scf len stats:
                    elem       min    q1      q2      q3      max        mean       n50        sum 
 1+markers          2478       1001   6013    36597   278486  9558742    365837     1830406    906,544,909
 0 markers          24529      66     1323    1848    3839    325966     4722       11201      115,849,855
  
 2+markers/2+chr    60         283784 1158965 2021582 3549120 7409211    2457241    3411361    147,434,495
 3+markers/2+chr    38         426424 1609106 2833228 4013969 7409211    3061980    3819803    116,355,251


  • Ctg len stats:
                    elem       min    q1      q2      q3      max        mean       n50        sum 
 1+markers          23077      76     6408    11837   19433   91891      14425      19768      332,889,618

Scf splits (Aleksey)

    1  7180002103685 6 156 161 jumps from chr6 to chr1 4049114-4201400
    2  7180002103648 1 45 79 1187881-1198679
    3  7180002103620 241786-307810                                                         # aligns to one chicken chr
    4  7180002103280 56334-114382
    5  7180002103762 386780-485750                                                         # aligns to one chicken chr
    6  7180002103638 111865-184832
    7  7180002103743 707755-712324
    8  7180002103743 1618441-1646472
    9  7180002103743 1895159-1956617
   10  7180002103683 3122611-3324351
   11  7180002103642 536597-587034                                                         # aligns to one chicken chr
   12  7180002103204 94910-122663
   13  7180002103681 5 33 57 jumps from chr5 to chr1 943178-1075454 map looks ok
   14  7180002103715 9 243 270 jumps from chr3 to chr9 547913-610659 map looks ok
   15  7180002103725 1 129 187 jumps from chr16 to chr1 1904425-2067581, map looks ok
   16  7180002103728 11 83 131 jumps from chr11 to chr1 2456073-2532176, map looks ok
   17  7180002103698 3 240 266 jumps from chr14 to chr3 588551-618407, map looks ok
   18  7180002103686 1 34 41 jumps from chr2 to chr1 292876-340742, map looks ok
   19  7180002103621 3 40 57 jumps from chr1 to chr3 707868-766695, map looks ok
   20  7180002103720 7 63 130 jumps from chr7 to chr13 1890283-1900965, map looks ok
   21  7180002103682 23 68 75 jumps from chr1 to chr23 270646-281964, map looks ok
   22  7180002103605 2 43 60 jumps from chr2 to chr3 1059724-1121629, map looks ok
   23  7180002103688 10 131 162 jumps from chr10 to chr2 3129178-3331813, map looks ok
   24  7180002103672 6 31 55 jumps from chr2 to chr6, 800904-850720, map looks ok
   25  7180002103771 2 13 26 jumps from chr21 to chr2 516684-703439, map looks ok
   26  7180002103519 11 52 62 jumps from chr11 to chr2 1685597-1695161, map looks ok
   27  7180002103597 3 120 150 jumps from chr3 to chr19 2839516-3067987, map looks ok
   28  7180002103717 3 61 96 jumps from chr3 to chr12, 2101452-2251116, map looks ok
   29  7180002103743 10 101 257 jumps from chr8 to chr10, 3601251-3670472, map looks ok
   30  7180002103743 jumps from chr10 to chr19, 6212398-6251410, map looks ok
   31  7180002103714 4 95 146 jumps from chr5 to chr4 1553913-1593600, map looks ok
   32  7180002103723 4 133 179 jumps from chr9 to chr4 1656209-1721059, map looks ok
   33  7180002103752 20 100 166 jumps from chr11 to chr20, 1951227-2017628,map looks ok
   34  7180002103480 5 79 119 jumps from chr8 to chr5, 1086539-1133932, map looks ok
   35  7180002103702 13 124 145 jumps from chr8 to chr13 935622-1070705, map looks ok
   36  7180002103693 14 73 84 jumps from chr17 to chr14, 477273-532094, map looks ok
   37  7180002103614                                                                          # aligns to one chicken chr
   38  7180002103677                                                                          # aligns to one chicken chr

Split ids: cat Chr_preliminary.agp | grep W | grep -v ChrUn | awk '{print $11}' | grep ^7181 | sort -u | nl

    1  7181002103204
    2  7181002103280
    3  7181002103480
    4  7181002103519
    5  7181002103620
    6  7181002103621
    7  7181002103648
    8  7181002103672
    9  7181002103681
   10  7181002103682
   11  7181002103683
   12  7181002103685
   13  7181002103686
   14  7181002103688
   15  7181002103693
   16  7181002103698
   17  7181002103702
   18  7181002103714
   19  7181002103715
   20  7181002103717
   21  7181002103723
   22  7181002103725
   23  7181002103743
   24  7181002103752
   25  7181002103771
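
Python equivalent of the "Split ids" pipeline: a sketch that mirrors the shell one-liner step by step (grep W, grep -v ChrUn, awk '{print $11}', grep ^7181, sort -u). The sample lines in the usage part are made up for illustration only; the real column layout of Chr_preliminary.agp is not shown on this page.

```python
def split_ids(agp_lines):
    """Return the sorted, unique 7181... ids found in field 11 of the selected lines."""
    ids = set()
    for line in agp_lines:
        # grep W | grep -v ChrUn : plain substring tests, as in the shell version.
        if "W" not in line or "ChrUn" in line:
            continue
        fields = line.split()  # awk splits on runs of whitespace by default
        # awk '{print $11}' | grep ^7181
        if len(fields) >= 11 and fields[10].startswith("7181"):
            ids.add(fields[10])
    return sorted(ids)  # sort -u


if __name__ == "__main__":
    # Hypothetical lines with 11 whitespace-separated fields.
    sample = [
        "Chr1 1 100 1 W ctgA 1 100 + x 7181002103204",
        "Chr1 101 200 2 W ctgB 1 100 + x 7180002103050",
        "ChrUn 1 100 1 W ctgC 1 100 + x 7181002103999",
        "Chr2 1 100 1 N 100 fragment yes . . 7181002103280",
        "Chr2 101 200 2 W ctgD 1 100 + x 7181002103204",
    ]
    for number, split_id in enumerate(split_ids(sample), start=1):
        print(f"{number:6d}  {split_id}")
```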

Zebrafinch chr sample vs Chicken chr

  • Sample 1Kbp every 1M in Zebrafinch chr
 ChickenChr          ZebraChr  count(>2)       
 1                   chr1      406
 1*                  chr1A     287
 1*                  chr1B     124  # not sampled

 2                   chr2      589

 3                   chr3      436

 4                   chr4      217
 4*                  chr4A     77

 5                   chr5      244

 6                   chr6      132

 7                   chr7      155

 8                   chr8      116

 9                   chr9      103

 10                  chr10     108

 11                  chr11     105

 12                  chr12     88

 13                  chr13     75

 14                  chr14     65

 15                  chr15     56

 16                  nothing

 17                  chr17     49

 18                  chr18     45

 19                  chr19     53

 20                  chr20     63

 21                  chr21     26

 22                  chr22     11

 23                  chr23     20

 24                  chr24     32

 26                  chr26     14

 27                  chr27     13

 28                  chr28     14

 Z                   chrZ      165

 W                   chrZ      30    # not sampled

 E64                 nothing

 E22C19W28_E50C23*   chrLGE22  3
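
Sampling sketch: a minimal illustration of taking a 1 Kbp window every 1 Mbp along a chromosome, as described for the table above. Starting at position 0 and using 0-based half-open coordinates are assumptions; the function name is hypothetical.

```python
def sample_windows(chrom_length, window=1_000, step=1_000_000):
    """Yield (start, end) of one window per step along a chromosome.

    Coordinates are 0-based, end-exclusive; the last window is clipped
    to the chromosome end.
    """
    for start in range(0, chrom_length, step):
        yield start, min(start + window, chrom_length)


if __name__ == "__main__":
    # A toy 2.5 Mbp chromosome gives three sampled windows.
    for start, end in sample_windows(2_500_000):
        print(start, end)
```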

Synteny

MSU:

 "We do see a couple of very small translocations between chromosomes 1 and 4, but these are so small that they could be errors in the chicken assembly or, more likely, paralogous sequences that perhaps were two copies in the last common ancestor and chicken kept one and turkey the other. We don't see translocations between chromosomes Z and 1, so I expect that these alignments are due to a repetitive element (CR1 being the most likely), but the Z assembly is tentative even in chicken, so it's hard to be sure."

Syntenic regions:

           chickenRegions  turkeyRegions  chickenChr  turkeyChr
 all       209166          311363                                    # nucmer -l 12 -c 65 -g 1000 -b 1000                       
 filter-1  183058          259760         142         186            # delta-filter -1
 filter    170658          239592         125         129            # filter-anc.pl -maxDist 200000 -W 20 -p 0.1 
 merge0    3260            2250           125         130            # merge-anc.pl  -maxDist 200000 
 merge1    1573            1368           110         93             # merge-anc.pl  -maxDist 200000  -minCount 8  -minLen 10000
 merge2    376             488            49          47             # merge-anc.pl  -maxDist 1000000 -minCount 20 -minLen 100000

Problems

ctg7180001625741

  • 1 ctg scaff: 7180002083787(1.4Kbp)
  • Single links to 2 diff scaff: 7180002103637 & 7180002103666
  • Synteny info (Daniela)
 cat /fs/szasmg3/dpuiu/turkey/Alignment2.0/chicken-turkey.ctg/turkey.ctg.posmap.merge | grep -C 20 7180001625741
 #                chickenChr                                  turkeyChr
 7180002057801    6                 36246991   36257816   -1  Chr8   35888371   35899195   r  U  100
 7180001625741    6                 36269529   36271001   -1  .      .          .          .  .  .
 ...
 7180002074579    6                 36382217   36386350   -1  Chr8   35899296   35903428   r  N  20910
  • Synteny info (Aleksey)
 cat /fs/ftp-cbcb/pub/data/turkey/Assembly2.0/place_by_sinteny/contigs.chicken.order.with_AGP.valid.txt | grep -C 1 7180001625741 | pretty
 #                                                chickenChr  turkeyChr
 1       1790  36269140  36267359  7180001578245  chr6        Chr7   20109532  20111855  2324  -  7180001578245
 307     1472  36270694  36269529  7180001625741  chr6        ChrUn  32131240  32132711  1472  0  7180001625741*
 2343    5341  36282706  36279707  7180001914610  chr6        Chr7   36045401  36052860  7460  -  7180001914610
 cat turkey.posmap.ctgscf | grep 7180002103637 | egrep -n '7180001578245|7180001914610'
 ...
 302:7180001914610       7180002103637   2403512 2410972 f
 391:7180001578245       7180002103637   3013067 3015391 f
 ...
 463: 

Scf 7180002103637 aligns both to Chr6 & Chr7

 cat /fs/szasmg3/dpuiu/turkey/Alignment2.0/chicken-turkey.scf/turkey.scf-chicken.filter-1.merge0.anc | grep 7180002103637
 7180002103637    7                 3817505  38384769   2          2285066  23413757   21084724   651501   705639   284          -1       23.41
 7180002103637    6                 3817505  37400442   2285109    2410972  36410067   36277407   37596    43600    21           -1       38.69
 7180002103637    7                 3817505  38384769   2410993    3817505  21084676   19700317   384609   354790   151          -1       23.49
 grep 7180002103637 /fs/szasmg3/dpuiu/turkey/BACs/BAC_map_final.txt | pretty | sort -nk9 | nl
    1  CH260098J15_SP6      7_10  7  21623779   150000  7180001914412  7180002103637  2833   2833
    2  78TKNMI023L02_SP6    7_10  7  21655259   150000  7180001914413  7180002103637  462    8036
    3  78TKNMI020I14_T7     7_10  7  21786579   150000  7180001914413  7180002103637  6568   14142
    ...
   91  78TKNMI028M05_T7     8_13  8  34451891   150000  7180001914600  7180002103637  4694   2314126
   92  CH260110M21_T7       8_13  8  34375922   150000  7180001914602  7180002103637  4382   2344157
   93  CH260102C12_T7       8_13  8  34561953   150000  7180001914608  7180002103637  7429   2400413
    ...
  155  CH260102B06_SP6      7_10  7  18173403   150000  7180001914714  7180002103637  578    3753466
  156  CH260091G02_T7       7_10  7  18147518   150000  7180001914716  7180002103637  14232  3777334
  157  78TKNMI020K20_T7     7_10  7  17944915   150000  7180001914719  7180002103637  5969   3809779

Annotation