Methanobrevibacter smithii
Data
NCBI
1 complete + 2 draft assembly strains:
Methanobrevibacter smithii ATCC 35061
- complete: NC_009515.1
- 1,853,160 bp
- Ref: /fs/szdata/ncbi/ftp.ncbi.nih.gov/genomes/Bacteria/Methanobrevibacter_smithii_ATCC_35061/*fna
- Published: "Genomic and metabolic adaptations of Methanobrevibacter smithii to the human gut", PNAS
- Assembled: Phrap and PCAP
- IS elements:
cat NC_009515.gff | grep "IS%20element" | grep CDS | awk '{print $1,$2,$3,$4,$5,$6,$7,$5-$4+1}'
NC_009515.1 RefSeq CDS 504499 505509 . - 1011
NC_009515.1 RefSeq CDS 505464 505748 . - 285
NC_009515.1 RefSeq CDS 508688 508984 . + 297
NC_009515.1 RefSeq CDS 509066 509491 . + 426
NC_009515.1 RefSeq CDS 509745 509900 . + 156
NC_009515.1 RefSeq CDS 1542734 1543111 . - 378
NC_009515.1 RefSeq CDS 1543120 1543254 . - 135
NC_009515.1 RefSeq CDS 1543220 1543417 . - 198
~/bin/RepeatSearch.amos -D REPEATLEN=36 ms
. elem min q1 q2 q3 max mean n50 sum
repeats.36+ 317 36 43 63 104 3732 182.96 1363 57999
uniq.36+ 309 1 24 109 2195 97573 5809.58 42534 1795160
repeats.350+ 26 386 691 1386 1406 3732 1369.12 1390 35597
uniq.350+ 25 1 13517 52288 90848 258405 72702.48 131389 1817562
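The awk one-liner in the IS elements listing above appends each feature's length as end - start + 1, since GFF coordinates are 1-based and inclusive. A minimal Python sketch of roughly the same filter follows. It assumes tab-separated GFF rows with the IS-element note URL-encoded in the attributes column (as the grep pattern suggests); the function name and the sample attribute strings are hypothetical.

```python
def is_element_cds(gff_lines):
    """Yield (seqid, start, end, strand, length) for CDS rows annotated as IS elements."""
    for line in gff_lines:
        # Skip GFF comment/pragma lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        seqid, _source, ftype, start, end, _score, strand, _phase, attrs = fields[:9]
        # Unlike the grep pipeline, only the type and attributes columns are checked.
        if ftype == "CDS" and "IS%20element" in attrs:
            start, end = int(start), int(end)
            yield seqid, start, end, strand, end - start + 1


# Hypothetical sample rows; only the first should pass the filter.
sample = [
    "NC_009515.1\tRefSeq\tCDS\t504499\t505509\t.\t-\t0\tnote=IS%20element",
    "NC_009515.1\tRefSeq\tgene\t504499\t505509\t.\t-\t.\tnote=IS%20element",
    "NC_009515.1\tRefSeq\tCDS\t100\t400\t.\t+\t0\tnote=other",
]
rows = list(is_element_cds(sample))
```

Matching on the parsed columns rather than on the whole line avoids accidental hits where "CDS" appears inside an attribute value.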
Methanobrevibacter smithii DSM 2374
- draft: NZ_ABYV02000000
- 1,727,775 bp
- 25 contigs: NZ_ABYV02000001 .. NZ_ABYV02000025
Methanobrevibacter smithii DSM 2375
- draft: NZ_ABYW00000000
- 1,704,865 bp
- 24 contigs: NZ_ABYW01000001 .. NZ_ABYW01000024
Methanobrevibacter smithii DSM 11975
- in progress
WUSTL (Gordon Lab)
/fs/szattic-asmg4/methanobrevibacter_smithii/Data
- 22 strains
- half-way through sequencing (should have all the data by early to mid-January)
- right now (--Dpuiu 15:19, 15 December 2009 (EST)):
- 10 strains sequenced by GAII Illumina (36mers) with 3-8 million reads per strain (coverage is 50-150x),
- 12 strains sequenced by 454-Titanium sequencing, with 20,000 to 90,000 reads per strain (coverage is ~5-20x).
- 7 strains sequenced by Illumina and 454
Illumina: 36 bp single reads
nl strain reads bases cvg
1 FR1LH1 5186537 186715332 100
2 FR1LH3 7211080 259598880 140
3 FR1LH6 4968262 178857432 96
4 TS145A 6536457 235312452 126
5 TS145B 8277390 297986040 160
. TS146e4 26899427 968379372 522 # 8417021 of the reads (~ 160X cvg) have qual==32
6 TS94-3 4886376 175909536 94
7 TS94-5 4785200 172267200 92
8 TS95-2 2896065 104258340 56
9 TS95-3 5064150 182309400 98
10 TS95-4 3557512 128070432 69
11 TS95-5 4559830 164153880 88
454: avg 342 bp single reads
nl strain reads bases cvg
1 TS145A 83667 28587141 15
2 TS145B 45203 15720997 8
3 TS146-3 49854 17608862 10
4 TS146e4 58633 18306825 10
5 TS146e5A 27844 8311560 4
6 TS146e5B 73182 23547825 13
7 TS147e8 68487 20662109 11
8 TS94-3 449545 157306990 85
9 TS94-5 76513 26734802 14
10 TS95-2 73255 25779806 14
11 TS95-4 85737 29231201 16
12 TS95-5 96757 35000794 19
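The cvg columns above look consistent with total sequenced bases divided by the 1,853,160 bp length of the ATCC 35061 complete genome. That divisor is an assumption, since the page does not state which genome size was used, and the tabulated values appear to be truncated or rounded to whole numbers. A small Python sketch of the arithmetic (the function name is hypothetical):

```python
REF_LEN = 1_853_160  # ATCC 35061 complete genome length in bp (assumed divisor)


def coverage(total_bases, genome_len=REF_LEN):
    """Return fold coverage as total sequenced bases over genome length."""
    if genome_len <= 0:
        raise ValueError("genome length must be positive")
    return total_bases / genome_len


# FR1LH1 Illumina: 5,186,537 single reads of 36 bp each.
fr1lh1_bases = 5_186_537 * 36
fr1lh1_cvg = coverage(fr1lh1_bases)

# TS94-3 454: total bases taken directly from the table.
ts94_3_cvg = coverage(157_306_990)

print(f"FR1LH1 Illumina: {fr1lh1_cvg:.1f}x, TS94-3 454: {ts94_3_cvg:.1f}x")
```

For fixed-length Illumina reads the bases column is simply reads times 36; for 454 the read lengths vary, so the summed base count has to be used.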
Illumina & 454: 7 strains have both Illumina & 454 reads
Illumina 454
--------------------------- ---------------------------
nl strain #reads #bases cvg #reads #bases cvg avg%idFinishedGenome
1 FR1LH1 5186537 186715332 100 . . . 98
2 FR1LH3 7211080 259598880 140 . . .
3 FR1LH6 4968262 178857432 96 . . .
4* TS145A 6536457 235312452 126 83667 28587141 15 98
5 TS145B 8277390 297986040 160 45203 15720997 8
6 TS146-3 . . . 49854 17608862 10
7 TS146e4 26899427 968379372 522 58633 18306825 10
8* TS146e5A . . . 27844 8311560 4 98
9 TS146e5B . . . 73182 23547825 13
10 TS147e8 . . . 68487 20662109 11
11* TS94-3 4886376 175909536 94 449545 157306990 85 92
12 TS94-5 4785200 172267200 92 76513 26734802 14
13 TS95-2 2896065 104258340 56 73255 25779806 14
14 TS95-3 5064150 182309400 98 . . .
15 TS95-4 3557512 128070432 69 85737 29231201 16
16 TS95-5 4559830 164153880 88 96757 35000794 19
Based on nucmer alignments of the finished genome and the newbler contigs, it looks like:
- TS145A.4 & TS146e5A.8 99% id
- TS145A.4 & TS94-3.11 92% id
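How the average percent identity was derived from the nucmer output is not stated here. One common choice, shown as a sketch only, is a length-weighted mean over the individual alignments, so that long alignments count for more than short ones. The function name and the input tuples are hypothetical, not taken from the actual alignments.

```python
def weighted_identity(alignments):
    """Length-weighted mean percent identity.

    alignments: iterable of (aligned_length, percent_identity) pairs,
    e.g. extracted from a coordinates table of pairwise alignments.
    """
    pairs = list(alignments)  # allow generators; we iterate twice
    total_len = sum(length for length, _ in pairs)
    if total_len == 0:
        raise ValueError("no aligned bases")
    return sum(length * pid for length, pid in pairs) / total_len


# Hypothetical example: one short high-identity and one long lower-identity alignment.
example = weighted_identity([(1000, 99.0), (3000, 91.0)])
```

An unweighted mean of the same two alignments would give 95, which overstates similarity when the long alignment is the divergent one.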
Assembly
- Available online at: ftp://ftp.cbcb.umd.edu/pub/data/dpuiu/Methanobrevibacter_smithii/
- newbler assemblies are generally the "best": longer contigs, fewer 0cvg regions ...
- CBCB velvet & newbler assemblies are slightly better than the WUSTL ones
- velvet contigs are slightly longer than the SOAPdenovo ones
- velvet assemblies contain slightly fewer bases than the SOAPdenovo ones & have more 0cvg regions (compared to the reference)
- the newbler (454) & velvet/SOAPdenovo (Illumina) assemblies can be merged further, which joins some contigs together
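The assembly tables on this page report contig counts, length extremes, mean, n50 and sum. As a reference for reading them, here is a minimal Python sketch of those summary values computed from a list of contig lengths. It uses the usual N50 definition (the length of the contig at which the running total, taken from longest to shortest, first reaches half of the assembly size); whether the tool that produced the tables uses exactly this definition, or which quartile method it applies for q1 and q3, is an assumption, so only the median (q2) is computed here.

```python
from statistics import mean, median


def n50(lengths):
    """Contig length at which the cumulative sum (longest first) reaches half the total."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contigs given")


def summarize(lengths):
    """Return the table-style summary for a non-empty list of contig lengths."""
    return {
        "#ctgs": len(lengths),
        "min": min(lengths),
        "q2": median(lengths),
        "max": max(lengths),
        "mean": mean(lengths),
        "n50": n50(lengths),
        "sum": sum(lengths),
    }


# Toy example with made-up contig lengths.
stats = summarize([2, 3, 4, 5, 6])
```

N50 weights contigs by their length, which is why an assembly with many tiny contigs can still show a large n50 next to a small median.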
AMOScmp.Illumina (CBCB)
strain #ctgs min q1 q2 q3 max mean n50 sum 0cvg
FR1LH1.1 1684 36 55 109 650 33114 958 5106 1614205 275309
FR1LH3.2 1642 36 55 106 631 33115* 984 5331 1616391 273301
FR1LH6.3 1649 36 55 108 660 26387 979 5180 1615744 273654
TS145A.4 1918 36 59 124 805 28850 838 3234 1608620 284953
TS145B.5 1723 36 58 110 647 33114 936 4463 1612823 277233
TS94-3.11 ???
TS94-5.12 8859 36 56 85 155 10840 159 242 1408957 658124
TS95-2.13 9535 36 55 82 147 7508 144 209 1380166 695098
TS95-3.14 9152 36 56 85 154 10159 156 238 1436137 642175
TS95-4.15 8859 36 56 85 155 10840 159 242 1408957 658124
TS95-5.16 9238 36 55 84 152 10160 155 235 1435894 645655
SOAPdenovo.Illumina (CBCB)
strain #ctgs min q1 q2 q3 max mean n50 sum 0cvg
FR1LH1.1 774 45 75 137 2158 50285 2353 10320 1821758 218968
FR1LH3.2 767 45 73 108 977 56788* 2371 14018 1818948 221732
FR1LH6.3 757 45 74 129 1396 50013 2402 12294 1818674 219385
TS145A.4 1798 45 127 459 1381 13456 991 2225 1781905 230976
TS145B.5 1098 45 83 306 2231 20523 1631 4869 1791302 216380
TS94-3.11 1405 45 59 98 634 35690 1349 7940 1895871 249447
TS94-5.12 1562 45 58 101 706 25048 1216 6456 1899512 251003
TS95-2.13 6611 45 131 215 364 2994 281 375 1863491 404797
TS95-3.14 1687 45 58 85 388 25368 1184 6931 1997443 213660
TS95-4.15 1562 45 58 101 706 25048 1216 6456 1899512 251003
TS95-5.16 1259 45 63 85 285 43875 1576 12739 1984835 209415
velvet.Illumina (CBCB)
strain #ctgs min q1 q2 q3 max mean n50 sum 0cvg
FR1LH1.1 518 45 73 158 3131 67401 3491 16101 1808646 219880
FR1LH3.2 532 45 72 105 729 111372* 3382 27042 1799368 227816
FR1LH6.3 547 45 72 111 992 85069 3309 20751 1810235 220469
TS145A.4 1002 45 398 1154 2228 14317 1575 2707 1578999 355737
TS145B.5 687 45 105 773 3424 31627 2567 6873 1763953 223089
TS146e4.7 375 45 71 129 1127 153113 4740 40215 1777614 209640
TS94-3.11 738 45 63 102 755 99478 2526 17184 1864601 250197
TS94-5.12 751 45 63 98 887 51451 2455 15583 1844453 259266
TS95-2.13 7235 45 107 198 349 2547 265 370 1922674 382074
TS95-3.14 1018 45 68 85 341 60536 1934 15001 1969398 211351
TS95-4.15 751 45 63 98 887 51451 2455 15583 1844453 259266
TS95-5.16 881 45 65 76 350 47906 2226 15468 1961518 212351
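The last column is not defined on this page. Assuming it counts reference bases that no contig covers (in line with "compared to the reference" in the assembly notes), one way to obtain such a number is to merge the reference intervals hit by contig alignments and subtract the covered total from the reference length. The sketch below is illustrative only; the function name and the interval values are hypothetical.

```python
def uncovered_bases(ref_len, intervals):
    """Count reference positions not covered by any alignment.

    intervals: (start, end) pairs on the reference, 1-based inclusive,
    with start <= end. Overlapping or adjacent intervals are merged first.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end + 1:
            # Gap before this interval: close the previous merged block.
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlaps or touches the current block: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return ref_len - covered


# Toy example: a 100 bp reference with two covered blocks, 1-20 and 31-40.
gap_total = uncovered_bases(100, [(1, 10), (5, 20), (31, 40)])
```

Merging before counting matters because contigs from repeats or overlapping alignments would otherwise be counted twice.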
velvet.Illumina (WUSTL)
strain elem min q1 q2 q3 max mean n50 sum
FR1LH1.1 403 101 282 1950 5891 41268 4388 10473 1768436
FR1LH3.2 274 100 267 1375 7601 88330* 6469 19398 1772541
FR1LH6.3 291 100 224 1225 7594 84927 5768 17024 1678650
TS145A.4 1127 101 409 1004 2075 11833 1479 2545 1667855
TS145B.5 445 100 721 2392 4817 20558 3396 5896 1511226
TS94-3.11 449 100 152 671 5121 50277 3958 12756 1777364
TS94-5.12 468 100 154 634 4768 44436 3818 12667 1786983
TS95-2.13 4476 100 179 303 531 3089 414 567 1853314
TS95-3.14 484 100 186 725 5581 49558 3918 11725 1896543
TS95-4.15 508 100 235 1211 5315 33145 3685 10173 1872243
TS95-5.16 469 100 203 927 5304 42207 4038 12739 1894003
MsmALI 209 102 371 3882 9790 58310 7882 21331 1647404
MsmPS-copy 497 100 341 1399 4492 43636 3497 8361 1738127
newbler.deNovo.454 (CBCB)
strain #ctgs min q1 q2 q3 max mean n50 sum 0cvg
TS145A.4 48 131 883 9320 56233 210376 37056 112225 1778716 196515
TS145B.5 117 104 2185 7850 22256 79935 15147 34389 1772299 203614
TS146-3.6 441 100 248 398 497 106331 4295 37197 1894514 202108
TS146e4.7 99 135 926 13013 24440 147105 17974 36272 1779476 198936
TS146e5A.8 1183 98 326 770 1953 19970 1446 2812 1710725 350637
TS146e5B.9 432 100 191 325 462 212980 4411 106361 1905607 195632
TS147e8.10 86 132 1662 8952 35506 122338 22610 55697 1944471 221774
TS94-3.11 723 96 238 361 456 395484* 2934 91958 2121818 224389
TS94-5.12 67 103 778 5001 46879 173674 28164 82955 1887011 226423
TS95-2.13 78 108 626 5843 45099 140027 25376 88829 1979337 187754
TS95-4.15 58 111 387 11498 59591 169655 34241 89436 1986032 186252
TS95-5.16 58 103 286 5198 59620 188214 34081 115574 1976737 186181
newbler.deNovo.454 (WUSTL)
strain #ctgs min q1 q2 q3 max mean n50 sum 0cvg
TS145A.4 62 106 386 3049 46046 166070 28692 82515 1778941
TS145B.5 130 101 2104 7697 18636 79979 13620 34155 1770674
TS146-3.6 384 100 349 447 1527 83020 4885 28235 1876197
TS146e4.7 145 101 1901 7807 17092 73674 12238 24421 1774620
TS146e5A.8 1199 100 591 932 1529 7108 1199 1566 1438364 # much worse
TS146e5B.9 112 111 490 1598 20451 129166 16084 52376 1801428
TS147e8.10 136 100 1197 5950 17566 140484 14275 41840 1941435
TS94-3.11 567 111 332 408 471 284245* 3642 93923 2065443
TS94-5.12 87 130 375 1511 19610 189466 21686 71682 1886713
TS95-2.13 78 116 350 8420 41426 136894 25419 73086 1982724
TS95-4.15 60 112 373 2519 59614 200840 33056 115852 1983394
TS95-5.16 50 132 463 20422 73626 191648 39557 89465 1977893
newbler.refMapper.454 (CBCB)
strain #ctgs min q1 q2 q3 max mean n50 sum 0cvg
TS146e4.7 254 101 206 851 6803 72759 6408 25310 1627809 229435
best
Generated by merging newbler with velvet/SOAPdenovo contigs using minimus2
strain #ctgs min q1 q2 q3 max mean n50 sum 0cvg
FR1LH1.1 169 47 307 3469 13513 113154 10746 29975 1816093 210847
FR1LH3.2 159 47 162 2932 13757 113158 11288 37473 1794914 221522
FR1LH6.3 157 50 269 3785 15911 126755 11603 29272 1821689 211256
TS145A.4 48 131 883 9320 56233 210376 37056 112225 1778716 196515
TS145B.5 53 71 1018 11425 57301 147750 33590 83876 1780320 198279
TS146-3.6 441 100 248 398 497 106331 4295 37197 1894514 202108
TS146e4.7 99 135 926 13013 24440 147105 17974 36272 1779476 198936
TS146e5A.8 1183 98 326 770 1953 19970 1446 2812 1710725 350637
TS146e5B.9 432 100 191 325 462 212980 4411 106361 1905607 195632
TS147e8.10 86 132 1662 8952 35506 122338 22610 55697 1944471 221774
TS94-3.11 38 55 764 9841 66968 395587 49738 144195 1890074 223341
TS94-5.12 49 65 766 7044 51688 285237 38534 111850 1888200 225957
TS95-2.13 57 110 1383 9455 50972 165660 34754 94352 1981005 182973
TS95-3.14 408 47 141 419 5698 52172 4663 16369 1902753 202507
TS95-4.15 79 47 147 2317 16017 200685 26493 138013 2092967 179552
TS95-5.16 47 69 311 14482 77588 188308 42099 115630 1978688 184590
Strains
TS145A.4
- 126x Illumina & 15x 454
. elem min q1 q2 q3 max mean n50 sum cvg
Illumina 6536457 36 36 36 36 36 36.00 36 235312452 126
454 85737 30 247 377 453 626 340.94 422 29231201 15
- Read assemblies:
assembly #ctgs min q1 q2 q3 max mean n50 sum 0cvg
AMOScmp.Illumina 1918 36 59 124 805 28850 838 3234 1608620 284953
SOAPdenovo.Illumina* 1798 45 127 459 1381 13456 991 2225 1781905 230976
velvet.Illumina 1002 45 398 1154 2228 14317 1575 2707 1578999 355737
AMOScmp.454 135 114 1012 4414 12554 110913 12255 36870 1654428 203524
CA.454 203 1002 2802 6431 12196 41845 8646 13047 1755208 245426
newbler.454** 48 131 883 9320 56233 210376 37056 112225 1778716 196515
SOAPdenovo.454 6531 45 103 179 306 1806 237 321 1553899 478469
velvet.454 21165 45 53 65 83 548 73 74 1561277 821841
SOAPdenovo.Illumina_454 5795 45 75 184 425 3983 317 588 1840810 243391
velvet.Illumina_454 4229 45 135 269 507 3529 384 584 1626062 396716
- Contig assemblies (no singletons):
assembly #ctgs min q1 q2 q3 max mean n50 sum 0cvg
minimus2.SOAPdenovo.Illumina-newbler.454 41 46 883 17898 61580 287497 43446 114143 1781287 197555 # 3 contigs don't contain any newbler original ctg
minimus2.velvet.Illumina-newbler.454 36 289 1802 38243 70184 210459 49402 113849 1778497 197611 # all contigs contain at least 1 newbler original ctg
- Contig assemblies (include singletons):
assembly #ctgs min q1 q2 q3 max mean n50 sum 0cvg
minimus2.SOAPdenovo.Illumina-newbler.454 83 45 97 425 17898 287497 21689 114143 1800192 196885
minimus2.velvet.Illumina-newbler.454 61 45 677 2956 43814 210459 29821 113849 1819092 197424
TS146e5A.8
- 4x 454
- Read assemblies:
assembly #ctgs min q1 q2 q3 max mean n50 sum 0cvg
AMOScmp.454 736 56 675 1452 2902 17360 2092 3319 1540404 324145
newbler.454* 1183 98 326 770 1953 19970 1446 2812 1710725 350637
TS94-3.11
- 94x Illumina & 85x 454
. elem min q1 q2 q3 max mean n50 sum
454 449545 29 255 390 463 1190 349.92 433 157306990
Illumina 4886376 36 36 36 36 36 36.00 36 175909536
- Read assemblies:
assembly #ctgs min q1 q2 q3 max mean n50 sum 0cvg
SOAPdenovo.Illumina 1405 45 59 98 634 35690 1349 7940 1895871 249447
velvet.Illumina 738 45 63 102 755 99478 2526 17184 1864601 250197
AMOScmp.454 255 52 585 2580 8321 61854 6531 16349 1665535 225050
newbler.454 723 96 238 361 456 395484 2934 91958 2121818 224389
newbler.454.1000+ 44 1007 2480 9817 64051 395484 42923 106659 1888613
CA.454 55 1001 1777 20479 51987 188883 34348 81660 1889177 244396
newbler.454
- ReadStatus counts
status reads %
Assembled 435637 96.91
Singleton 8178 1.82
PartiallyAssembled 4511 1
Outlier 856 0.19
Repeat 363 0.08
total 449545 100
- Location
/fs/szattic-asmg4/methanobrevibacter_smithii/Assembly.CBCB/TS94-3.11/newbler.454/
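The percentages in the ReadStatus counts above are each category's share of all 454 reads for this strain. A short Python check of that arithmetic, using only the counts listed on this page (the variable names are illustrative):

```python
# Read counts per newbler ReadStatus category for TS94-3.11, as listed above.
counts = {
    "Assembled": 435637,
    "Singleton": 8178,
    "PartiallyAssembled": 4511,
    "Outlier": 856,
    "Repeat": 363,
}

total = sum(counts.values())

# Share of each category, as a percentage rounded to two decimals.
percent = {status: round(100 * n / total, 2) for status, n in counts.items()}

for status, n in counts.items():
    print(f"{status:20s} {n:7d} {percent[status]:6.2f}")
print(f"{'total':20s} {total:7d}")
```

The category counts add up to the 449,545 reads reported for the TS94-3 454 run, so no reads are unaccounted for in the status breakdown.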
CA.bog
. elem min q1 q2 q3 max mean n50 sum
ctg 55 1001 1777 20479 51987 188883 34348.67 81660 1889177
deg 922 64 341 457 515 13313 460.62 489 424691
- Location
/fs/szattic-asmg4/methanobrevibacter_smithii/Assembly.CBCB/TS94-3.11/CA.bog/