Brugia malayi
Articles
Genome Info
- 6 chromosomes: 1-5, XY
- ~ 90M, 30% GC, 32% coding, 15% repeats
Other sequences
- mitochondrion finished: 13,657 bp; 24% GC
- Wolbachia endosymbiont strain TRS of Brugia malayi (wBm), complete: 1,080,084 bp; 34% GC (New England Biolabs)
- Wolbachia endosymbiont strain wMel progress (TIGR)
- Rodent: some trace contamination; Example: Mus musculus is ~40%GC
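The GC figures above are what make rodent contamination visible: B. malayi sequence sits near 30% GC, while mouse-like sequence is closer to 40%. Below is a minimal Python sketch of computing GC% for one sequence. The function name `gc_percent` and the toy sequences are illustrative assumptions, not part of the project pipeline.

```python
def gc_percent(seq: str) -> float:
    """Return the GC% of a nucleotide string.

    Characters other than A, C, G, T (for example N) are left out of the
    denominator. An empty or all-N sequence returns 0.0.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    print(gc_percent("ATATATGCAT"))  # 2 of 10 bases are G or C
    print(gc_percent("GCGCATNNAT"))  # the two N bases are ignored
```

GC% alone does not settle whether a contig is a contaminant, since the good and contaminant GC ranges overlap; it is only a first-pass filter.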
Genome Project
Brugia malayi has a diploid genome of approximately 110 Mb, organized in 6 pairs of chromosomes (five pairs of autosomes and one pair of sex chromosomes). In addition to the nuclear genome, B. malayi has a mitochondrial genome of about 14 kb and harbors the bacterial endosymbiont Wolbachia sp., which has its own genome (1-2 Mb).
The B. malayi genome project has been completed by The Institute for Genomic Research. Whole Genome Shotgun sequencing was used to obtain more than eight-fold coverage of the genome. The complete genome was assembled into approximately 8200 scaffolds and deposited in GenBank. The accession for the WGS project is AAQA00000000 and consists of sequences AAQA01000001-AAQA01029808.
- TIGR Genome project (TRS strain)
- NCBI AAQA00000000 AAQA01000001-AAQA01029808
  * 26,879 good ctgs
  * 2,929 jird contaminants (Example: AAQA01001321 : mouse 99%id hits)
good ctg len          #elem    min     max   mean  median    n50       sum
all                   26879    200  611244   3241    1005  19005  87119350
10K+                   1224  10036  611244  41018   23135  60727  50206329

good ctg GC%          #elem    min     max   mean  median    n50
all                   26878   0.00   72.30  28.86   28.56  29.46
10K+                   1224  24.38   38.44  30.38   30.43  30.62

contaminant ctg len   #elem    min     max   mean  median    n50       sum
all                    2929    200    8994    740     675    763   2167588

contaminant ctg GC%   #elem    min     max   mean  median    n50
all                    2929  18.09   75.96   44.1   43.59  44.80
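The length tables above report #elem, min, max, mean, median, n50 and sum for each contig set. The following Python sketch shows one way such a row could be computed from a list of contig lengths. It assumes the usual N50 definition (the length of the contig at which the running total of lengths, taken longest first, reaches half of all bases); the notes do not say which tool produced the tables, and the names `n50` and `length_stats` are hypothetical.

```python
from statistics import mean, median


def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # only reached for an empty list


def length_stats(lengths):
    """Return one table row (as a dict) for a non-empty list of contig lengths."""
    return {
        "#elem": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": round(mean(lengths)),
        "median": median(lengths),
        "n50": n50(lengths),
        "sum": sum(lengths),
    }


if __name__ == "__main__":
    # Toy lengths, invented for illustration only.
    toy = [200, 300, 500, 1000, 3000]
    print(length_stats(toy))
    # The "10K+" rows would correspond to filtering first:
    print(length_stats([x for x in toy if x >= 1000]))
```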
Traces
Libraries:
* 2K : bulk of the sequence @TIGR
* 15-20K @TIGR
* 8,000 BAC clones @Children's Hospital Oakland Research Institute (!!! no NCBI TA submission)
Trace summary:
* all: 1,260,215
* TRACE_TYPE_CODE
  * WGS: 1,258,277
  * TRANSPOSON: 1,437
  * PRIMER_WALK: 501
* CENTER NAME
  * TIGR: 856,624
  * JCVI: 403,591
* NO BACS !!!; max INSERT_SIZE=23K
- TI's: 1172642810, ... ,1174845185
- SEQ_LIB_ID's : 1047111480027, ... , 1047174912885
FRG file:
- FRG.src : same as TI's above
- FRG.acc: 2 ..
- DST.acc: 1260217, ... , 1260234
Problems:
- All library insert sizes are underestimated
- The contaminant reads align at ~91-93% identity to the contaminant ctgs, while the mitochondrial and Wolbachia (Mt/We) reads align at 99% identity to the finished Mt/We sequences. What % identity threshold should be used to call a read a contaminant?
Assemblies
TIGR
- 9X coverage, 856K Sanger traces => 8,200 scaff & 29,808 ctg (avg. scaff=~10K & avg ctg=~3K)
- "scaffolds totaling ~71 Mb of data with a further ~17.5 Mb of contigs not integrated into any scaffold (orphan contigs)" (Science 2007)
CBCB
[Scaffolds]
TotalScaffolds=10317
TotalContigsInScaffolds=12753
MeanContigsPerScaffold=1.24
MinContigsPerScaffold=1
MaxContigsPerScaffold=53

[Contigs]
TotalContigsInScaffolds=12753
TotalBasesInScaffolds=77964006
TotalVarRecords=87058
MeanContigLength=6113
MinContigLength=273
MaxContigLength=376744
N50ContigBases=24748

[Reads]
TotalReadsInput=1178192
TotalUsableReads=1173016
AvgClearRange=791
ContigReads=663383(56.55%)
BigContigReads=544689(46.43%)
SmallContigReads=118694(10.12%)
DegenContigReads=124230(10.59%)
SurrogateReads=295861(25.22%)
PlacedSurrogateReads=44577(3.80%)
SingletonReads=134119(11.43%)
ChaffReads=134119(11.43%)

[Coverage]
ContigsOnly=6.86
Contigs_Surrogates=9.47
Contigs_Degens_Surrogates=9.33
AllReads=11.91
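As a sanity check on the coverage figures, the standard estimate is coverage = reads x mean read length / assembled bases. The Python sketch below plugs in TotalUsableReads, AvgClearRange and TotalBasesInScaffolds from the report above. That the assembler derives AllReads in exactly this way is an assumption, not something these notes confirm, and the function name `coverage` is hypothetical.

```python
def coverage(n_reads: int, mean_read_len: float, assembled_bases: int) -> float:
    """Fold coverage estimated as total read bases divided by assembled bases."""
    return n_reads * mean_read_len / assembled_bases


if __name__ == "__main__":
    # Values copied from the [Reads] and [Contigs] sections of the report.
    usable_reads = 1_173_016          # TotalUsableReads
    avg_clear_range = 791             # AvgClearRange (bp)
    bases_in_scaffolds = 77_964_006   # TotalBasesInScaffolds

    cov = coverage(usable_reads, avg_clear_range, bases_in_scaffolds)
    print(f"estimated all-reads coverage: {cov:.2f}X")
```

The estimate lands close to the reported AllReads value; the small difference is expected because AvgClearRange is itself a rounded figure.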
Files
* /fs/szattic/asmg1/adelcher/Genomes/Brugia : Art's files
* /fs/sztmpscratch/cole/tarchive_download/brugia_malay : Cole's files
* /fs/szasmg3/dpuiu/Brugia_malayi/ : Daniela's files