Methanobrevibacter smithii
		
		
		
		
Data
NCBI
1 complete + 2 draft assembly strains:
Methanobrevibacter smithii ATCC 35061
- complete: NC_009515.1 1,853,160bp 31.03%GC
- Ref: /fs/szdata/ncbi/ftp.ncbi.nih.gov/genomes/Bacteria/Methanobrevibacter_smithii_ATCC_35061/*fna
- Published: "Genomic and metabolic adaptations of Methanobrevibacter smithii to the human gut", PNAS
- Assembled: Phrap and PCAP
- IS elements:
 cat NC_009515.gff | grep "IS%20element" | grep CDS | awk '{print $1,$2,$3,$4,$5,$6,$7,$5-$4+1}' 
 NC_009515.1    RefSeq  CDS  504499   505509   .  -  1011  
 NC_009515.1    RefSeq  CDS  505464   505748   .  -  285   
 NC_009515.1    RefSeq  CDS  508688   508984   .  +  297   
 NC_009515.1    RefSeq  CDS  509066   509491   .  +  426   
 NC_009515.1    RefSeq  CDS  509745   509900   .  +  156   
 NC_009515.1    RefSeq  CDS  1542734  1543111  .  -  378   
 NC_009515.1    RefSeq  CDS  1543120  1543254  .  -  135   
 NC_009515.1    RefSeq  CDS  1543220  1543417  .  -  198   
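The last column above is the feature length, end - start + 1 (GFF coordinates are 1-based and inclusive). A minimal Python sketch of the same filter follows. It is slightly stricter than the grep pipeline, since it requires CDS in the type column and the tag in the attributes column. The function name and the attribute text in the example row are hypothetical.

```python
def cds_lengths(gff_text, tag="IS%20element"):
    """Return (start, end, strand, length) for CDS rows whose attributes contain tag."""
    rows = []
    for line in gff_text.splitlines():
        # Skip blank lines and GFF comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        # A GFF row has 9 tab-separated columns; column 3 is the feature type.
        if len(cols) < 9 or cols[2] != "CDS" or tag not in cols[8]:
            continue
        start, end = int(cols[3]), int(cols[4])
        # Coordinates are 1-based and inclusive, hence the +1.
        rows.append((start, end, cols[6], end - start + 1))
    return rows


# Hypothetical row: coordinates taken from the first line of the output above,
# attribute column invented for illustration.
example = "NC_009515.1\tRefSeq\tCDS\t504499\t505509\t.\t-\t0\tnote=IS%20element\n"
print(cds_lengths(example))  # [(504499, 505509, '-', 1011)]
```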
 ~/bin/RepeatSearch.amos -D REPEATLEN=36 ms
 .             elem    min     q1      q2      q3      max     mean      n50     sum
 repeats.36+   317     36      43      63      104     3732    182.96    1363    57999
 uniq.36+      309     1       24      109     2195    97573   5809.58   42534   1795160
 repeats.350+  26      386     691     1386    1406    3732    1369.12   1390    35597
 uniq.350+     25      1       13517   52288   90848   258405  72702.48  131389  1817562
Methanobrevibacter smithii DSM 2374
- draft: NZ_ABYV02000000
- 1,727,775 bp
- 25 contigs: NZ_ABYV02000001 .. NZ_ABYV02000025
Methanobrevibacter smithii DSM 2375
- draft: NZ_ABYW00000000
- 1,704,865 bp
- 24 contigs: NZ_ABYW01000001 .. NZ_ABYW01000024
Methanobrevibacter smithii DSM 11975
- in progress
WUSTL (Gordon Lab)
/fs/szattic-asmg4/methanobrevibacter_smithii/Data
- 22 strains
- sequencing is about half done (all the data expected by early to mid-January)
- status as of --Dpuiu 15:19, 15 December 2009 (EST):
- 10 strains sequenced on the Illumina GAII (36-mers), 3-8 million reads per strain (50-150x coverage)
- 12 strains sequenced by 454 Titanium, 20,000 to 90,000 reads per strain (~5-20x coverage)
- 7 strains sequenced by both Illumina and 454
 
Illumina: 36 bp single reads
   nl  strain      reads   bases       cvg
    1  FR1LH1      5186537 186715332   100
    2  FR1LH3      7211080 259598880   140
    3  FR1LH6      4968262 178857432   96
    4  TS145A      6536457 235312452   126
    5  TS145B      8277390 297986040   160
       TS146e4    26899427 968379372   522  # 8417021 of the reads (~ 160X cvg)  have qual==32
    6  TS94-3      4886376 175909536   94
    7  TS94-5      4785200 172267200   92
    8  TS95-2      2896065 104258340   56
    9  TS95-3      5064150 182309400   98
   10  TS95-4      3557512 128070432   69
   11  TS95-5      4559830 164153880   88
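The bases column above is reads x 36. The cvg column appears consistent with total bases divided by the 1,853,160 bp ATCC 35061 genome, truncated to an integer. That is an inference from the numbers, not a documented method. A minimal sketch under that assumption, with illustrative constant and function names:

```python
GENOME_SIZE = 1_853_160  # ATCC 35061 complete genome length in bp (NC_009515.1)
READ_LEN = 36            # Illumina single-read length in bp


def coverage(total_bases, genome_size=GENOME_SIZE):
    """Fold coverage, truncated to an integer (assumed convention)."""
    return total_bases // genome_size


# Read counts copied from two rows of the Illumina table.
illumina_reads = {"FR1LH1": 5186537, "TS145A": 6536457}
for strain, n_reads in illumina_reads.items():
    bases = n_reads * READ_LEN
    print(strain, n_reads, bases, coverage(bases))
# FR1LH1 5186537 186715332 100
# TS145A 6536457 235312452 126
```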
454: avg 342 bp single reads
   nl  strain      reads   bases       cvg
    1  TS145A      83667   28587141    15
    2  TS145B      45203   15720997    8
    3  TS146-3     49854   17608862    10
    4  TS146e4     58633   18306825    10
    5  TS146e5A    27844   8311560     4
    6  TS146e5B    73182   23547825    13
    7  TS147e8     68487   20662109    11
    8  TS94-3      449545  157306990   85
    9  TS94-5      76513   26734802    14
   10  TS95-2      73255   25779806    14
   11  TS95-4      85737   29231201    16
   12  TS95-5      96757   35000794    19
Illumina & 454: 7 strains have both Illumina & 454 reads
                                Illumina                        454
                       ---------------------------     ---------------------------                             
   nl  strain          #reads  #bases          cvg     #reads  #bases          cvg       avg%idFinishedGenome
    1  FR1LH1          5186537 186715332       100     .       .               .         98
    2  FR1LH3          7211080 259598880       140     .       .               .       
    3  FR1LH6          4968262 178857432       96      .       .               .       
    4* TS145A          6536457 235312452       126     83667   28587141        15        98
    5  TS145B          8277390 297986040       160     45203   15720997        8       
    6  TS146-3         .       .               .       49854   17608862        10
    7  TS146e4        26899427 968379372       522     58633   18306825        10
    8* TS146e5A        .       .               .       27844   8311560         4         98
    9  TS146e5B        .       .               .       73182   23547825        13
   10  TS147e8         .       .               .       68487   20662109        11
   11* TS94-3          4886376 175909536       94      449545  157306990       85        92
   12  TS94-5          4785200 172267200       92      76513   26734802        14
   13  TS95-2          2896065 104258340       56      73255   25779806        14
   14  TS95-3          5064150 182309400       98      .       .               .       
   15  TS95-4          3557512 128070432       69      85737   29231201        16
   16  TS95-5          4559830 164153880       88      96757   35000794        19
   nl  strain      IlluminaGC   454GC
    1  FR1LH1      38.89        .      
    2  FR1LH3      33.33        .      
    3  FR1LH6      33.33        .      
    4  TS145A      41.67        31.54  
    5  TS145B      36.11        31.52  
    6  TS146-3     .            31.85  
    7  TS146e4     33.33        31.72  
    8  TS146e5A    .            32.77  
    9  TS146e5B    .            31.96  
   10  TS147e8     .            31.11  
   11  TS94-3      30.56        30.67  
   12  TS94-5      30.56        30.69  
   13  TS95-2      36.11        30.60  
   14  TS95-3      33.33        .      
   15  TS95-4      30.56        30.70  
   16  TS95-5      33.33        30.56  
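GC content is the percentage of G and C among the called bases. Every IlluminaGC value above is a multiple of 1/36 (for example 14/36 = 38.89%). This suggests a per-read statistic on 36 bp reads, such as a median, rather than a pooled percentage, but that is only a guess. A minimal sketch of the per-sequence calculation, with a hypothetical function name:

```python
def gc_percent(seq):
    """Percent G+C among A/C/G/T bases; ambiguous bases such as N are ignored."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / acgt


# A synthetic 36 bp read with 14 G/C bases.
read = "G" * 14 + "A" * 22
print(round(gc_percent(read), 2))  # 38.89
```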
Based on nucmer alignments of the finished genome and the newbler contigs, it looks like:
- TS145A.4 & TS146e5A.8 99% id
- TS145A.4 & TS94-3.11 92% id
Assembly
- Available online at: ftp://ftp.cbcb.umd.edu/pub/data/dpuiu/Methanobrevibacter_smithii/
- newbler assemblies are generally the "best": longer contigs, fewer 0cvg regions, ...
- CBCB velvet & newbler assemblies are slightly better than the WUSTL ones
- velvet contigs are slightly longer than the SOAPdenovo ones
- velvet assemblies contain slightly fewer bases than the SOAPdenovo ones & have more 0cvg regions (compared to the reference)
- the newbler (454) & velvet/SOAPdenovo (Illumina) assemblies can be merged further, which would join some contigs
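The tables in this section summarize contig lengths (#ctgs, min, max, mean, n50, sum). A minimal sketch of how such a summary could be computed from a list of contig lengths. N50 is taken here as the largest length L such that contigs of length >= L hold at least half of the total bases. The quartile columns are left out because the exact definition used for them is not stated, and the function names are illustrative.

```python
def n50(lengths):
    """Largest length L such that contigs of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # only reached for an empty list


def summary(lengths):
    """Basic contig-length statistics for a non-empty list of lengths."""
    if not lengths:
        raise ValueError("no contigs")
    s = sorted(lengths)
    return {
        "ctgs": len(s),
        "min": s[0],
        "max": s[-1],
        "mean": sum(s) / len(s),
        "n50": n50(s),
        "sum": sum(s),
    }


print(summary([2, 3, 4, 5, 6]))
# {'ctgs': 5, 'min': 2, 'max': 6, 'mean': 4.0, 'n50': 5, 'sum': 20}
```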
AMOScmp.Illumina (CBCB)
               #ctgs   min     q1      q2      q3      max     mean    n50     sum     0cvg
 FR1LH1.1      1684    36      55      109     650     33114   958     5106    1614205 275309
 FR1LH3.2      1642    36      55      106     631     33115*  984     5331    1616391 273301
 FR1LH6.3      1649    36      55      108     660     26387   979     5180    1615744 273654
 TS145A.4      1918    36      59      124     805     28850   838     3234    1608620 284953
 TS145B.5      1723    36      58      110     647     33114   936     4463    1612823 277233
 TS94-3.11     ??? 
 TS94-5.12     8859    36      56      85      155     10840   159     242     1408957 658124
 TS95-2.13     9535    36      55      82      147     7508    144     209     1380166 695098
 TS95-3.14     9152    36      56      85      154     10159   156     238     1436137 642175
 TS95-4.15     8859    36      56      85      155     10840   159     242     1408957 658124
 TS95-5.16     9238    36      55      84      152     10160   155     235     1435894 645655
SOAPdenovo.Illumina (CBCB)
               #ctgs   min     q1      q2      q3      max     mean    n50     sum     0cvg
 FR1LH1.1      774     45      75      137     2158    50285   2353    10320   1821758 218968
 FR1LH3.2      767     45      73      108     977     56788*  2371    14018   1818948 221732
 FR1LH6.3      757     45      74      129     1396    50013   2402    12294   1818674 219385
 TS145A.4      1798    45      127     459     1381    13456   991     2225    1781905 230976
 TS145B.5      1098    45      83      306     2231    20523   1631    4869    1791302 216380
 TS94-3.11     1405    45      59      98      634     35690   1349    7940    1895871 249447
 TS94-5.12     1562    45      58      101     706     25048   1216    6456    1899512 251003
 TS95-2.13     6611    45      131     215     364     2994    281     375     1863491 404797
 TS95-3.14     1687    45      58      85      388     25368   1184    6931    1997443 213660
 TS95-4.15     1562    45      58      101     706     25048   1216    6456    1899512 251003
 TS95-5.16     1259    45      63      85      285     43875   1576    12739   1984835 209415
velvet.Illumina (CBCB)
 strain          #ctgs   min     q1      q2      q3      max     mean    n50     sum     0cvg
 FR1LH1.1        518     45      73      158     3131    67401   3491    16101   1808646 219880
 FR1LH3.2        532     45      72      105     729     111372  3382    27042   1799368 227816
 FR1LH6.3        547     45      72      111     992     85069   3309    20751   1810235 220469
 TS145A.4        1002    45      398     1154    2228    14317   1575    2707    1578999 355737
 TS145B.5        687     45      105     773     3424    31627   2567    6873    1763953 223089
 TS146e4.7       375     45      71      129     1127    153113* 4740    40215   1777614 209640  # 522X cvg
 TS146e4.7.filt  7641    45      99      143     218     1244    177     207     1355197 626004  # 150X cvg (only q32 reads)
 TS94-3.11       738     45      63      102     755     99478   2526    17184   1864601 250197
 TS94-5.12       751     45      63      98      887     51451   2455    15583   1844453 259266
 TS95-2.13       7235    45      107     198     349     2547    265     370     1922674 382074
 TS95-3.14       1018    45      68      85      341     60536   1934    15001   1969398 211351
 TS95-4.15       751     45      63      98      887     51451   2455    15583   1844453 259266
 TS95-5.16       881     45      65      76      350     47906   2226    15468   1961518 212351
velvet.Illumina (WUSTL)
 strain        elem       min    q1     q2     q3     max        mean       n50        sum            
 FR1LH1.1      403        101    282    1950   5891   41268      4388       10473      1768436
 FR1LH3.2      274        100    267    1375   7601   88330*     6469       19398      1772541
 FR1LH6.3      291        100    224    1225   7594   84927      5768       17024      1678650
 TS145A.4      1127       101    409    1004   2075   11833      1479       2545       1667855
 TS145B.5      445        100    721    2392   4817   20558      3396       5896       1511226
 TS94-3.11     449        100    152    671    5121   50277      3958       12756      1777364
 TS94-5.12     468        100    154    634    4768   44436      3818       12667      1786983
 TS95-2.13     4476       100    179    303    531    3089       414        567        1853314
 TS95-3.14     484        100    186    725    5581   49558      3918       11725      1896543
 TS95-4.15     508        100    235    1211   5315   33145      3685       10173      1872243
 TS95-5.16     469        100    203    927    5304   42207      4038       12739      1894003
 MsmALI        209        102    371    3882   9790   58310      7882       21331      1647404
 MsmPS-copy    497        100    341    1399   4492   43636      3497       8361       1738127
AMOScmp.454 (CBCB)
 strain        #ctgs   min     q1      q2      q3      max     mean    n50     sum     0cvg
 TS146e4.7     213     71      1454    3954    9800    62523   7725    15060   1645592 208920  # original reads
 TS146e4.7     168     56      1312    4609    13025   62537   9908    24440   1664607 194003  # alignment trimmed reads
CA.bog.454 (CBCB)
 strain        #ctgs   min     q1      q2      q3      max     mean    n50     sum     0cvg
 TS145A.4      39      1135    12340   37904   61833   166391  44902   83237   1751181 227444
 TS146e4.7     111     1039    5766    11729   21286   105399  16055   22905   1782182 200113
newbler.deNovo.454 (CBCB)
 strain        #ctgs   min     q1      q2      q3      max     mean    n50     sum     0cvg
 TS145A.4      48      131     883     9320    56233   210376  37056   112225  1778716 196515
 TS145B.5      117     104     2185    7850    22256   79935   15147   34389   1772299 203614
 TS146-3.6     441     100     248     398     497     106331  4295    37197   1894514 202108
 TS146e4.7     99      135     926     13013   24440   147105  17974   36272   1779476 198936
 TS146e5A.8    1183    98      326     770     1953    19970   1446    2812    1710725 350637
 TS146e5B.9    432     100     191     325     462     212980  4411    106361  1905607 195632
 TS147e8.10    86      132     1662    8952    35506   122338  22610   55697   1944471 221774
 TS94-3.11     723     96      238     361     456     395484* 2934    91958   2121818 224389
 TS94-5.12     67      103     778     5001    46879   173674  28164   82955   1887011 226423
 TS95-2.13     78      108     626     5843    45099   140027  25376   88829   1979337 187754
 TS95-4.15     58      111     387     11498   59591   169655  34241   89436   1986032 186252
 TS95-5.16     58      103     286     5198    59620   188214  34081   115574  1976737 186181
newbler.deNovo.454 (WUSTL)
 strain        #ctgs   min     q1      q2      q3      max     mean    n50     sum     0cvg
 TS145A.4      62      106     386     3049    46046   166070  28692   82515   1778941 .
 TS145B.5      130     101     2104    7697    18636   79979   13620   34155   1770674 .
 TS146-3.6     384     100     349     447     1527    83020   4885    28235   1876197 .
 TS146e4.7     145     101     1901    7807    17092   73674   12238   24421   1774620 .
 TS146e5A.8    1199    100     591     932     1529    7108    1199    1566    1438364 .       # much worse
 TS146e5B.9    112     111     490     1598    20451   129166  16084   52376   1801428 .
 TS147e8.10    136     100     1197    5950    17566   140484  14275   41840   1941435 .
 TS94-3.11     567     111     332     408     471     284245* 3642    93923   2065443 .
 TS94-5.12     87      130     375     1511    19610   189466  21686   71682   1886713 .
 TS95-2.13     78      116     350     8420    41426   136894  25419   73086   1982724 .
 TS95-4.15     60      112     373     2519    59614   200840  33056   115852  1983394 .
 TS95-5.16     50      132     463     20422   73626   191648  39557   89465   1977893 .
newbler.refMapper.454 (CBCB)
 strain        #ctgs   min     q1      q2      q3      max     mean    n50     sum     0cvg
 TS146e4.7     254     101     206     851     6803    72759   6408    25310   1627809 229435
best
Generated by merging newbler with velvet/SOAPdenovo contigs using minimus2
 strain        #ctgs   min     q1      q2      q3      max     mean    n50     sum     0cvg
 FR1LH1.1      169     47      307     3469    13513   113154  10746   29975   1816093 210847
 FR1LH3.2      159     47      162     2932    13757   113158  11288   37473   1794914 221522
 FR1LH6.3      157     50      269     3785    15911   126755  11603   29272   1821689 211256
 TS145A.4      48      131     883     9320    56233   210376  37056   112225  1778716 196515
 TS145B.5      53      71      1018    11425   57301   147750  33590   83876   1780320 198279
 TS146-3.6     441     100     248     398     497     106331  4295    37197   1894514 202108
 TS146e4.7     99      135     926     13013   24440   147105  17974   36272   1779476 198936
 TS146e5A.8    1183    98      326     770     1953    19970   1446    2812    1710725 350637
 TS146e5B.9    432     100     191     325     462     212980  4411    106361  1905607 195632
 TS147e8.10    86      132     1662    8952    35506   122338  22610   55697   1944471 221774
 TS94-3.11     38      55      764     9841    66968   395587  49738   144195  1890074 223341
 TS94-5.12     49      65      766     7044    51688   285237  38534   111850  1888200 225957
 TS95-2.13     57      110     1383    9455    50972   165660  34754   94352   1981005 182973
 TS95-3.14     408     47      141     419     5698    52172   4663    16369   1902753 202507
 TS95-4.15     79      47      147     2317    16017   200685  26493   138013  2092967 179552
 TS95-5.16     47      69      311     14482   77588   188308  42099   115630  1978688 184590
Strains
TS145A.4
- 126x Illumina & 15x 454
 .             elem    min     q1      q2      q3      max     mean    n50     sum        cvg
 Illumina      6536457 36      36      36      36      36      36.00   36      235312452  126
 454           85737   30      247     377     453     626     340.94  422     29231201   15
- Read assemblies:
                                               #ctgs   min     q1      q2      q3      max     mean    n50     sum     0cvg
 AMOScmp.Illumina (soap2 -v5 -g0 -f2)          1918    36      59      124     805     28850   838     3234    1608620 284953
 AMOScmp.Illumina (soap1 -v3 -g3 -f1 -c35)     1717    36      56      100     528     28868   964     4879    1656159 255687
 SOAPdenovo.Illumina*                          1798    45      127     459     1381    13456   991     2225    1781905 230976
 velvet.Illumina                               1002    45      398     1154    2228    14317   1575    2707    1578999 355737
 AMOScmp.454                                   166     116     1110    3614    11776   110905  9952    24150   1652197 203385
 CA.454                                        203     1002    2802    6431    12196   41845   8646    13047   1755208 245426
 newbler.454**                                 48      131     883     9320    56233   210376  37056   112225  1778716 196515
 SOAPdenovo.454                                6531    45      103     179     306     1806    237     321     1553899 478469
 velvet.454                                    21165   45      53      65      83      548     73      74      1561277 821841
 AMOScmp.Illumina_454(soap1 -v3 -g3 -f1 -c35)  274     36      46      86      4129    117510  6136    34524   1681406 186848
 SOAPdenovo.Illumina_454                       5795    45      75      184     425     3983    317     588     1840810 243391
 velvet.Illumina_454                           4229    45      135     269     507     3529    384     584     1626062 396716
- Contig assemblies (no singletons):
                                           #ctgs   min     q1      q2      q3      max     mean    n50     sum     0cvg
 minimus2.SOAPdenovo.Illumina-newbler.454  41      46      883     17898   61580   287497  43446   114143  1781287 197555  # 3 contigs don't contain any newbler original ctg
 minimus2.velvet.Illumina-newbler.454      36      289     1802    38243   70184   210459  49402   113849  1778497 197611  # all contigs contain at least 1 newbler original ctg
- Contig assemblies (include singletons):
                                           #ctgs   min     q1      q2      q3      max     mean    n50     sum     0cvg
 minimus2.SOAPdenovo.Illumina-newbler.454  83      45      97      425     17898   287497  21689   114143  1800192 196885
 minimus2.velvet.Illumina-newbler.454      61      45      677     2956    43814   210459  29821   113849  1819092 197424
TS146e5A.8
- 4x 454
- Read assemblies:
               #ctgs   min     q1      q2      q3      max     mean    n50     sum     0cvg
 AMOScmp.454   736     56      675     1452    2902    17360   2092    3319    1540404 324145
 newbler.454*  1183    98      326     770     1953    19970   1446    2812    1710725 350637
TS94-3.11
- 94x Illumina & 85x 454
 .             elem    min     q1      q2      q3      max     mean    n50     sum
 454           449545  29      255     390     463     1190    349.92  433     157306990
 Illumina      4886376 36      36      36      36      36      36.00   36      175909536
- Read assemblies:
                     #ctgs   min     q1      q2      q3      max     mean    n50     sum     0cvg
 SOAPdenovo.Illumina 1405    45      59      98      634     35690   1349    7940    1895871 249447
 velvet.Illumina     738     45      63      102     755     99478   2526    17184   1864601 250197
 AMOScmp.454         255     52      585     2580    8321    61854   6531    16349   1665535 225050
 newbler.454         723     96      238     361     456     395484  2934    91958   2121818 224389
 newbler.454.1000+   44      1007    2480    9817    64051   395484  42923   106659  1888613 .
 CA.454              55      1001    1777    20479   51987   188883  34348   81660   1889177 244396
newbler.454
- ReadStatus counts
                      total    %
 Assembled            435637   96.91
 Singleton            8178     1.82
 PartiallyAssembled   4511     1
 Outlier              856      0.19
 Repeat               363      0.08
 total                449545   100
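The % column is each ReadStatus count divided by the total number of reads. A minimal sketch that recomputes it from the counts listed above. The dictionary and function names are illustrative, not part of newbler.

```python
# ReadStatus counts for TS94-3.11 (newbler.454), copied from the table above.
counts = {
    "Assembled": 435637,
    "Singleton": 8178,
    "PartiallyAssembled": 4511,
    "Outlier": 856,
    "Repeat": 363,
}


def percentages(status_counts):
    """Percent of all reads in each status, rounded to two decimals."""
    total = sum(status_counts.values())
    return {status: round(100.0 * n / total, 2) for status, n in status_counts.items()}


print(sum(counts.values()))  # 449545
print(percentages(counts))
```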
- Location
/fs/szattic-asmg4/methanobrevibacter_smithii/Assembly.CBCB/TS94-3.11/newbler.454/
CA.bog
 .     elem    min     q1      q2      q3      max     mean      n50     sum
 ctg   55      1001    1777    20479   51987   188883  34348.67  81660   1889177
 deg   922     64      341     457     515     13313   460.62    489     424691
- Location
/fs/szattic-asmg4/methanobrevibacter_smithii/Assembly.CBCB/TS94-3.11/CA.bog/