Difference between revisions of "Partition"

Revision as of 14:32, 13 August 2009

Summary

Partition is a Python script that takes a set of regular expressions, plus optional metadata, and builds an XML file of the matching header information from a FASTA-formatted file. Partition.py is located at:

   /fs/szasmg/metagenomics/Partition/Partition.py [stable]
   /fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py [latest, but possibly unstable builds]

Options

-p    Populate the given partition/XML.
-b    Given the input file, build a partition.
-m    Metadata file that will be used to populate the partitions.
-h    Header information for the metadata. If not present, the column information is read from the first line of the metadata file.
-f    Input FASTA file.
-s    Split the FASTA file based on the partition information and write the output to the given directory.
-o    Name of the output .part file.
-c    Convert an old partition format into the new XML format.

Tutorial

Format of an input regular expression (*.re) file

Example of an input file that clusters animals by their first letter:

  [animals]
  info = all animals
  [A]
  info = animals that start with A
  regexp = [Aa].*
  [B]
  info = animals that start with B
  regexp = [Bb].*
  ...
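
To make the format concrete, here is a minimal Python sketch of how a file like the one above could be read and applied to sequence headers. It is not taken from Partition.py. It assumes the *.re layout can be parsed as INI-style sections with the standard configparser module, that a section without a regexp entry (here [animals]) only describes the partition as a whole, and that each regexp is matched from the start of the header. The function names and the sample headers are hypothetical.

```python
import configparser
import re

# Sample .re content copied from the example above; the elided "..." entries are left out.
RE_TEXT = """\
[animals]
info = all animals
[A]
info = animals that start with A
regexp = [Aa].*
[B]
info = animals that start with B
regexp = [Bb].*
"""


def load_patterns(text):
    """Return {section name: compiled regex} for every section that defines 'regexp'."""
    # interpolation=None keeps characters such as '%' in a regexp literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(text)
    return {
        name: re.compile(parser[name]["regexp"])
        for name in parser.sections()
        if parser.has_option(name, "regexp")
    }


def assign_headers(headers, patterns):
    """Group headers under each section whose regex matches from the start (assumption)."""
    groups = {name: [] for name in patterns}
    for header in headers:
        for name, pattern in patterns.items():
            if pattern.match(header):
                groups[name].append(header)
    return groups


# Hypothetical FASTA header names, without the leading '>'.
headers = ["aardvark", "Bison", "antelope", "cat"]
groups = assign_headers(headers, load_patterns(RE_TEXT))
print(groups)  # {'A': ['aardvark', 'antelope'], 'B': ['Bison']}
```

In this sketch a header that matches no regexp (such as "cat") is simply left out of every group. How Partition.py itself treats unmatched headers is not stated here.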