Repeat search: Difference between revisions

Revision as of 18:47, 6 November 2008

Mobile elements

  • plasmids
  • bacteriophages:
    • up to 20% of the genome
    • most common transporters of virulence genes in bacteria
    • have site specificity
  • transposable elements
    • up to 2 kbp
    • no site specificity


Tandem repeats

  • satellites (spanning megabases of DNA, associated with heterochromatin)
  • minisatellites (repeat units in the range 6-100 bp, spanning hundreds of base-pairs)
  • microsatellites (repeat units in the range 1-5 bp, spanning a few tens of nucleotides)
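
The microsatellite definition above (repeat units of 1-5 bp in tandem) can be illustrated with a small regular-expression scan. This is only a sketch: the function name `find_microsatellites` and the `min_copies` threshold are assumptions made for illustration, not part of any package listed on this page, and only perfect (uninterrupted) repeats are reported.

```python
import re


def find_microsatellites(seq, min_unit=1, max_unit=5, min_copies=4):
    """Return (start, end, unit, copies) for perfect tandem repeats.

    min_unit/max_unit follow the 1-5 bp unit range given above;
    min_copies is an assumed cutoff, chosen only for illustration.
    """
    # Lazy unit match so the shortest repeating unit is tried first,
    # followed by at least (min_copies - 1) further copies of it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        copies = len(m.group(0)) // len(unit)
        hits.append((m.start(), m.end(), unit, copies))
    return hits


if __name__ == "__main__":
    print(find_microsatellites("GGCACACACACATT"))
```

Coordinates are 0-based and half-open. Imperfect repeats, which are common in real genomes, would need a dedicated tandem-repeat finder rather than a regular expression.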


Insertion Elements (IS)

  • 0.7-2.5 kbp
  • small, genetically compact (1-2 ORFs): transposase and/or reverse transcriptase
  • end in short terminal inverted repeat sequences (IR), 10-40 bp
  • ISFinder (http://www-is.biotoul.fr/)
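
The terminal inverted repeat property listed above can be checked roughly as follows: the left end of a candidate element should match the reverse complement of its right end. This is a minimal sketch under the assumption of perfect matches (real IRs often contain mismatches); `terminal_ir_length` and `has_is_like_ends` are hypothetical helper names, and the 10-40 bp bounds are taken from the list above.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def terminal_ir_length(seq, max_len=40):
    """Length of the longest perfect terminal inverted repeat.

    Compares the first bases of seq with the reverse complement of
    its last bases and counts matching positions from the ends inward.
    """
    seq = seq.upper()
    rc = revcomp(seq)
    limit = min(max_len, len(seq) // 2)
    n = 0
    while n < limit and seq[n] == rc[n]:
        n += 1
    return n


def has_is_like_ends(seq, min_ir=10, max_ir=40):
    """True if the perfect terminal IR is at least min_ir bp long."""
    return terminal_ir_length(seq, max_len=max_ir) >= min_ir


if __name__ == "__main__":
    left = "GGAAGGTGCGAA"
    element = left + "A" * 20 + revcomp(left)
    print(terminal_ir_length(element), has_is_like_ends(element))
```

For real annotation work, comparing candidates against ISFinder entries is likely to be more reliable than an exact-match test like this one.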

Software Packages

 Library:
   $ ls /fs/szdevel/dpuiu/RepeatMasker/Libraries/RepeatMaskerLib.embl 
 
   $ ~/bin/readseq.sh -f Fasta -o RepeatMaskerLib.fasta RepeatMaskerLib.embl
 
   $ infoseq RepeatMaskerLib.fasta | getSummary.pl -c 1 -t Len
             #elem   min     max     mean    median  n50     sum
     Len     9055    4       35042   2205    890     4846    19966330
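
For readers without access to the local `getSummary.pl` script, the same kind of length summary can be approximated in Python. This is a sketch only: the function names are hypothetical, and the N50 definition used here (the length at which sequences of that length or longer cover at least half of the total bases) is an assumption that may differ from what `getSummary.pl` computes.

```python
import statistics


def read_fasta_lengths(lines):
    """Return the sequence length of every record in FASTA-formatted lines."""
    lengths = []
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Assumed definition: smallest length L such that sequences of
    length >= L together contain at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def summarize(lengths):
    """Mirror the columns of the summary table above."""
    if not lengths:
        raise ValueError("no sequences found")
    return {
        "elem": len(lengths),
        "min": min(lengths),
        "max": max(lengths),
        "mean": sum(lengths) / len(lengths),
        "median": statistics.median(lengths),
        "n50": n50(lengths),
        "sum": sum(lengths),
    }


if __name__ == "__main__":
    # Assumes RepeatMaskerLib.fasta was produced by the conversion step above.
    with open("RepeatMaskerLib.fasta") as handle:
        print(summarize(read_fasta_lengths(handle)))
```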