Difference between revisions of "Partition"

Revision as of 14:33, 13 August 2009


Partition is a Python script that takes regular expressions and metadata as input and builds an XML file of the matching header information from a FASTA-formatted file. Partition.py is located at:

   /fs/szasmg/metagenomics/Partition/Partition.py [stable]
   /fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py [latest, but possibly unstable builds]


-p    Populate the given partition/XML.
-b    Given the input file, build a partition.
-m    Metadata file that will be used to populate the partitions.
-h    Header information for the metadata. If not given, the column names are read from the first line of the metadata file.
-f    Input fasta file.
-s    Split the fasta file based on the partition information and write the output to the given directory.
-o    Name of the output .part file.
-c    Convert an old partition format into the new xml format.
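
As a rough illustration of what splitting a fasta file by partition (the -s option) involves, the sketch below groups fasta records by matching regular expressions against their headers. It is not taken from Partition.py: the function names `read_fasta` and `split_by_pattern`, the sample records, and the choice to search anywhere in the header are all assumptions made for this example.

```python
import re
from collections import defaultdict


def read_fasta(lines):
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, seq = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        elif header is not None:
            seq.append(line.strip())
    if header is not None:
        yield header, "".join(seq)


def split_by_pattern(records, patterns):
    """Group records under every label whose regex is found in the header.

    `patterns` maps a label to a compiled regex (hypothetical layout,
    not the actual partition format).
    """
    groups = defaultdict(list)
    for header, seq in records:
        for label, pattern in patterns.items():
            if pattern.search(header):
                groups[label].append((header, seq))
    return dict(groups)


# Made-up sample data, used only to exercise the sketch.
SAMPLE = [
    ">aardvark sample 1",
    "ACGT",
    "TTGA",
    ">baboon sample 2",
    "GGCC",
]
PATTERNS = {"A": re.compile(r"^a"), "B": re.compile(r"^b")}

records = list(read_fasta(SAMPLE))
groups = split_by_pattern(records, PATTERNS)
```

Each group could then be written to its own file in the output directory; how Partition.py names those files is not specified here.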


Format of an input regular expression (*.re) file

Example of an input file that clusters animals by their first letter:

  info = all animals
  info = animals that start with A
  regexp = a.*
  info = animals that start with B
  regexp = b.*
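
To make the example above concrete, here is a minimal sketch of how such a file might be read and applied to a list of names. It assumes that each `regexp` line belongs to the nearest preceding `info` line, that an `info` line without a `regexp` matches nothing on its own, and that matching is case-insensitive and anchored at the start of the name. None of this is confirmed by the page, and `parse_re_file` and `assign` are hypothetical helpers, not functions from Partition.py.

```python
import re


def parse_re_file(text):
    """Parse 'key = value' lines into a list of partition dicts."""
    partitions = []
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "info":
            # Start a new partition; its regexp may follow on a later line.
            partitions.append({"info": value, "regexp": None})
        elif key == "regexp" and partitions:
            # Assumption: attach the regexp to the most recent info line.
            partitions[-1]["regexp"] = re.compile(value, re.IGNORECASE)
    return partitions


def assign(names, partitions):
    """Map each partition's info text to the names its regexp matches."""
    groups = {p["info"]: [] for p in partitions}
    for name in names:
        for p in partitions:
            if p["regexp"] is not None and p["regexp"].match(name):
                groups[p["info"]].append(name)
    return groups


EXAMPLE = """\
info = all animals
info = animals that start with A
regexp = a.*
info = animals that start with B
regexp = b.*
"""

partitions = parse_re_file(EXAMPLE)
groups = assign(["aardvark", "baboon", "antelope", "cat"], partitions)
```

Under these assumptions, "aardvark" and "antelope" land in the A group, "baboon" in the B group, and "cat" matches neither.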