https://wiki.umiacs.umd.edu/cbcb/api.php?action=feedcontributions&user=Cmhill&feedformat=atomCbcb - User contributions [en]2024-03-28T13:19:57ZUser contributionsMediaWiki 1.39.6https://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5688Partition2009-08-27T21:23:45Z<p>Cmhill: </p>
<hr />
<div>==Summary==<br />
Partition is a Python script that takes regular expressions and, optionally, metadata as input and builds an XML file of the matching header information from a FASTA-formatted file. Partition.py is located at:<br />
/fs/szasmg/metagenomics/Partition/Partition.py ['''stable''']<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py ['''latest, but possibly unstable builds''']<br />
<br />
===Options===<br />
<pre><br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata. If omitted, the column names are read from the first line of the metadata file.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.<br />
</pre><br />
<br />
==Tutorial==<br />
===Format of an input regular expression (*.re) file===<br />
The format of the *.re input file is:<br />
[name]<br />
key1 = value1<br />
key2 = value2<br />
...<br />
[name2]<br />
key1 = value1<br />
...<br />
where [name] is the name of the partition and the key-value pairs are its attributes.<br />
<br />
Example of an input file that clusters animals by their first letter:<br />
[animals]<br />
info = all animals<br />
[A]<br />
info = animals that start with A<br />
regexp = a.*<br />
[B]<br />
info = animals that start with B<br />
regexp = b.*<br />
...<br />
With the format above, a partition can have only two levels. Deeper hierarchies are possible, but the input file then needs to be an XML file (explained below).<br />
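The *.re layout shown above closely resembles an INI file, so its structure can be illustrated with Python's standard configparser module. The following is only a sketch under stated assumptions, not the actual Partition.py code: the function names load_partitions and assign are hypothetical, and matching each header from its start with re.match is a guess at how the regexp attribute is applied.

```python
import configparser
import re

# The animal example from above, without the "..." placeholder lines.
RE_TEXT = """\
[animals]
info = all animals
[A]
info = animals that start with A
regexp = a.*
[B]
info = animals that start with B
regexp = b.*
"""


def load_partitions(text):
    """Return {partition name: attribute dict} in file order (hypothetical helper)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep attribute names exactly as written
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


def assign(headers, partitions):
    """Map each partition that defines a regexp to the headers it matches."""
    result = {}
    for name, attrs in partitions.items():
        pattern = attrs.get("regexp")
        if pattern is None:
            continue  # the top-level [animals] entry has no regexp
        compiled = re.compile(pattern)
        # Assumption: the pattern is anchored at the start of the header.
        result[name] = [h for h in headers if compiled.match(h)]
    return result


partitions = load_partitions(RE_TEXT)
matches = assign(["aardvark", "bat", "antelope", "cat"], partitions)
print(matches)  # {'A': ['aardvark', 'antelope'], 'B': ['bat']}
```

Note that "cat" lands in no partition here, because the example file defines no matching regexp for it.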
===Given the input file, build and populate a partition===<br />
./Partition.py -b [input.re] -f [input.fasta] [-o dir/file]<br />
<br />
Use the -o option to save the partition file under a name other than the default, temppart.xml.<br />
===Use a metadata file to populate the partitions.===<br />
A metadata file contains information about a set of sequences, for example:<br />
#SampleID LRHand Sex <br />
S1 R F <br />
S2 L M<br />
<br />
If the first row of the metadata file does not contain the column information, then specify a separate column header file with '''-h [file]'''.<br />
<br />
The format of a *.re file changes slightly with the addition of the metadata file. In addition to specifying the regexp, a '''category''' field must also be added to specify which column the regexp will be checked against.<br />
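To make the role of the '''category''' field concrete, here is a minimal Python sketch of checking a regexp against one metadata column. It is an illustration only, not the Partition.py implementation: read_metadata and select are hypothetical names, the columns are assumed to be whitespace-separated, and the leading # on the first line is assumed to mark the column names.

```python
import re

# The metadata example from above.
METADATA = """\
#SampleID LRHand Sex
S1 R F
S2 L M
"""


def read_metadata(text, header=None):
    """Return one dict per row, keyed by column name (hypothetical helper)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if header is None:
        # No separate header supplied: take the column names from the
        # first line, dropping the leading '#'.
        header = lines[0].lstrip("#").split()
        lines = lines[1:]
    return [dict(zip(header, ln.split())) for ln in lines]


def select(rows, category, regexp, id_column="SampleID"):
    """IDs of the rows whose `category` column matches `regexp`."""
    pattern = re.compile(regexp)
    return [row[id_column] for row in rows if pattern.match(row[category])]


rows = read_metadata(METADATA)
print(select(rows, category="Sex", regexp="F"))     # ['S1']
print(select(rows, category="LRHand", regexp="L"))  # ['S2']
```

Passing an explicit header list covers the case where the first row of the file holds data rather than column names.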
<br />
To populate the xml file with the metadata information and split a given fasta file:<br />
 ./Partition.py -b [input.re] -f [input.fasta] -m [metadata.map] -s [dir/output]</div>
<hr />
<div>==Summary==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
/fs/szasmg/metagenomics/Partition/Partition.py ['''stable''']<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py ['''latest, but possibly unstable builds''']<br />
<br />
===Options===<br />
<pre><br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.<br />
</pre><br />
<br />
==Tutorial==<br />
===Format of an input regular expression (*.re) file===<br />
The format of the *.re input file is:<br />
[name]<br />
key1 = value1<br />
key2 = value2<br />
...<br />
[name2]<br />
key1 = value1<br />
...<br />
where [name] represents the name of the partition, and the key-value pairs will represent the attributes.<br />
<br />
Example of an input file that clusters animals by their first letter:<br />
[animals]<br />
info = all animals<br />
[A]<br />
info = animals that start with A<br />
regexp = a.*<br />
[B]<br />
info = animals that start with B<br />
regexp = b.*<br />
...<br />
Using the format above, the partition can only have two levels. It is possible to have multiple levels, but the input file needs to be an xml file (explained below).</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5563Partition2009-08-13T14:52:50Z<p>Cmhill: </p>
<hr />
<div>==Summary==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
/fs/szasmg/metagenomics/Partition/Partition.py ['''stable''']<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py ['''latest, but possibly unstable builds''']<br />
<br />
===Options===<br />
<pre><br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.<br />
</pre><br />
<br />
==Tutorial==<br />
===Format of an input regular expression (*.re) file===<br />
The format of an<br />
[name]<br />
key1 = value1<br />
key2 = value2<br />
...<br />
[name2]<br />
key1 = value1<br />
...<br />
Example of an input file that clusters animals by their first letter:<br />
[animals]<br />
info = all animals<br />
[A]<br />
info = animals that start with A<br />
regexp = a.*<br />
[B]<br />
info = animals that start with B<br />
regexp = b.*<br />
...<br />
Using the format above, the partition can only have two levels. It is possible to have multiple levels, but the input file needs to be an xml file (explained below).</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5562Partition2009-08-13T14:33:27Z<p>Cmhill: </p>
<hr />
<div>==Summary==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
/fs/szasmg/metagenomics/Partition/Partition.py ['''stable''']<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py ['''latest, but possibly unstable builds''']<br />
<br />
===Options===<br />
<pre><br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.<br />
</pre><br />
<br />
==Tutorial==<br />
===Format of a input regular expression (*.re) file===<br />
Example of an input file that clusters animals by their first letter:<br />
[animals]<br />
info = all animals<br />
[A]<br />
info = animals that start with A<br />
regexp = a.*<br />
[B]<br />
info = animals that start with B<br />
regexp = b.*<br />
...</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5561Partition2009-08-13T14:33:05Z<p>Cmhill: /* Format of a input regular expression (*.re) file */</p>
<hr />
<div>==Summary==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
/fs/szasmg/metagenomics/Partition/Partition.py ['''stable''']<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py ['''latest, but possibly unstable builds''']<br />
<br />
'''Options'''<br />
<pre><br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.<br />
</pre><br />
<br />
==Tutorial==<br />
===Format of a input regular expression (*.re) file===<br />
Example of an input file that clusters animals by their first letter:<br />
[animals]<br />
info = all animals<br />
[A]<br />
info = animals that start with A<br />
regexp = a.*<br />
[B]<br />
info = animals that start with B<br />
regexp = b.*<br />
...</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5560Partition2009-08-13T14:32:46Z<p>Cmhill: </p>
<hr />
<div>==Summary==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
/fs/szasmg/metagenomics/Partition/Partition.py ['''stable''']<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py ['''latest, but possibly unstable builds''']<br />
<br />
'''Options'''<br />
<pre><br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.<br />
</pre><br />
<br />
==Tutorial==<br />
===Format of a input regular expression (*.re) file===<br />
Example of an input file that clusters animals by their first letter:<br />
[animals]<br />
info = all animals<br />
[A]<br />
info = animals that start with A<br />
regexp = [Aa].*<br />
[B]<br />
info = animals that start with B<br />
regexp = [Bb].*<br />
...</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5559Partition2009-08-13T14:14:55Z<p>Cmhill: </p>
<hr />
<div>==Summary==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
/fs/szasmg/metagenomics/Partition/Partition.py ['''stable''']<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py ['''latest, but possibly unstable builds''']<br />
<br />
'''Options'''<br />
<pre><br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.<br />
</pre><br />
<br />
==Tutorial==<br />
===Format of a input regular expression (*.re) file===</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5558Partition2009-08-13T14:14:09Z<p>Cmhill: </p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
/fs/szasmg/metagenomics/Partition/Partition.py ['''stable''']<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py ['''latest, but possibly unstable builds''']<br />
<br />
''' Options '''<br />
<pre><br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.<br />
</pre><br />
<br />
== Tutorial ==<br />
=== Format of a input regular expression (*.re) file ===</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5557Partition2009-08-13T14:13:19Z<p>Cmhill: /* Tutorial */</p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
/fs/szasmg/metagenomics/Partition/Partition.py ['''stable''']<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py ['''latest, but possibly unstable builds''']<br />
<br />
''' Options '''<br />
<pre><br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.<br />
</pre><br />
<br />
== Tutorial ==<br />
''' Format of a input regular expression (*.re) file '''</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5556Partition2009-08-13T14:13:10Z<p>Cmhill: /* Tutorial */</p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
/fs/szasmg/metagenomics/Partition/Partition.py ['''stable''']<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py ['''latest, but possibly unstable builds''']<br />
<br />
''' Options '''<br />
<pre><br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.<br />
</pre><br />
<br />
=== Tutorial ===<br />
''' Format of a input regular expression (*.re) file '''</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5555Partition2009-08-13T14:11:45Z<p>Cmhill: </p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
/fs/szasmg/metagenomics/Partition/Partition.py ['''stable''']<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py ['''latest, but possibly unstable builds''']<br />
<br />
''' Options '''<br />
<pre><br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.<br />
</pre><br />
<br />
== Tutorial ==<br />
''' Format of a input regular expression (*.re) file '''</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5554Partition2009-08-13T14:10:28Z<p>Cmhill: /* Summary */</p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
/fs/szasmg/metagenomics/Partition/Partition.py ['''stable''']<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py ['''latest, but possibly unstable builds''']<br />
<br />
''' Options '''<br />
<pre><br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.<br />
</pre></div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5553Partition2009-08-13T14:10:11Z<p>Cmhill: </p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
/fs/szasmg/metagenomics/Partition/Partition.py ['''stable''']<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py ['''latest, but possibly unstable builds''']<br />
<br />
'' Options ''<br />
<pre><br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.<br />
</pre></div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5552Partition2009-08-13T14:09:42Z<p>Cmhill: /* Summary */</p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
/fs/szasmg/metagenomics/Partition/Partition.py ['''stable''']<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py ['''latest, but possibly unstable builds''']<br />
<br />
== Options ==<br />
<pre><br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.<br />
</pre></div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5551Partition2009-08-13T14:09:24Z<p>Cmhill: /* Summary */</p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
/fs/szasmg/metagenomics/Partition/Partition.py ['''stable''']<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py ['''latest but possibly unstable builds''']<br />
<br />
== Options ==<br />
<pre><br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.<br />
</pre></div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5550Partition2009-08-13T14:08:42Z<p>Cmhill: /* Summary */</p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
<pre>/fs/szasmg/metagenomics/Partition/Partition.py ['''stable''']<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py ['''latest but possibly unstable builds''']</pre><br />
<br />
== Options ==<br />
<pre><br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.<br />
</pre></div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5549Partition2009-08-13T14:08:08Z<p>Cmhill: /* Summary */</p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
<pre>/fs/szasmg/metagenomics/Partition/Partition.py [stable]<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py [latest but possibly unstable builds]</pre><br />
<br />
== Options ==<br />
<pre><br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.<br />
</pre></div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5548Partition2009-08-13T14:07:47Z<p>Cmhill: </p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
<pre>/fs/szasmg/metagenomics/Partition/Partition.py [latest stable]<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py [latest but possibly unstable builds]</pre><br />
== Options ==<br />
<pre><br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.<br />
</pre></div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5547Partition2009-08-13T14:07:06Z<p>Cmhill: </p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
*/fs/szasmg/metagenomics/Partition/Partition.py [latest stable]<br />
*/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py [latest but possibly unstable builds]<br />
== Options ==<br />
<pre><br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.<br />
</pre></div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5546Partition2009-08-13T14:04:06Z<p>Cmhill: /* Options */</p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
*/fs/szasmg/metagenomics/Partition/Partition.py [latest stable]<br />
*/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py [latest but possibly unstable builds]<br />
== Options ==<br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5545Partition2009-08-13T14:03:54Z<p>Cmhill: /* Options */</p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
*/fs/szasmg/metagenomics/Partition/Partition.py [latest stable]<br />
*/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py [latest but possibly unstable builds]<br />
== Options ==<br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information<br />
for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5544Partition2009-08-13T14:03:33Z<p>Cmhill: /* Summary */</p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
*/fs/szasmg/metagenomics/Partition/Partition.py [latest stable]<br />
*/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py [latest but possibly unstable builds]<br />
== Options ==<br />
-p Populate the given partition/XML.<br />
-b Given the input file, build a partition.<br />
-m Metadata file that will be used to populate the partitions.<br />
-h Header information for the metadata, if not present column information<br />
for metadata will be found in first line of the metadata.<br />
-f Input fasta file.<br />
-s Split the fasta file based on the partition information<br />
and output to the directory.<br />
-o Name of the output .part file.<br />
-c Convert an old partition format into the new xml format.</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5543Partition2009-08-13T14:02:31Z<p>Cmhill: /* Summary */</p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
*/fs/szasmg/metagenomics/Partition/Partition.py [latest stable]<br />
*/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py [latest but possibly unstable builds]</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5542Partition2009-08-13T13:51:25Z<p>Cmhill: /* Summary */</p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
**/fs/szasmg/metagenomics/Partition/Partition.py**<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py [latest but possibly unstable builds]</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5541Partition2009-08-13T13:51:07Z<p>Cmhill: /* Summary */</p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
'/fs/szasmg/metagenomics/Partition/Partition.py'<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py [latest but possibly unstable builds]</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5540Partition2009-08-13T13:50:16Z<p>Cmhill: /* Summary */</p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
*<br />
/fs/szasmg/metagenomics/Partition/Partition.py<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py [latest but possibly unstable builds]</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5539Partition2009-08-13T13:49:59Z<p>Cmhill: /* Summary */</p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
*/fs/szasmg/metagenomics/Partition/Partition.py*<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py [latest but possibly unstable builds]</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5538Partition2009-08-13T13:49:51Z<p>Cmhill: /* Summary */</p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file. Partition.py is located at:<br />
*/fs/szasmg/metagenomics/Partition/Partition.py<br />
/fs/szasmg/metagenomics/Partition/MetaPart/src/Partition.py [latest but possibly unstable builds]</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5537Partition2009-08-13T13:47:06Z<p>Cmhill: /* Summary */</p>
<hr />
<div>== Summary ==<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file.</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5536Partition2009-08-13T13:46:53Z<p>Cmhill: </p>
<hr />
<div>== Summary ==<br />
<br />
Partition is a python script that takes an input of regular expressions and metadata to build an xml file of matching header information from a fasta-formatted file.</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Partition&diff=5522Partition2009-08-10T14:59:44Z<p>Cmhill: </p>
<hr />
<div>Testing.</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Cbcb:Pop-Lab:Software&diff=5521Cbcb:Pop-Lab:Software2009-08-10T14:59:35Z<p>Cmhill: /* Software developed by members of our group */</p>
<hr />
<div>==Software developed by members of our group==<br />
[http://amos.sourceforge.net AMOS] - open-source assembly package <br><br />
[http://amos.sourceforge.net/docs/bambus Bambus] - hierarchical genome scaffolder <br><br />
[http://amos.sourceforge.net/Figaro Figaro] - sequence trimmer that does not require knowledge about the vector sequence <br><br />
[http://www.cbcb.umd.edu/~whitej/metastats/detection.shtml MetaStats] - statistical methods for comparing metagenomic samples <br><br />
[http://www.cbcb.umd.edu/soma On-Soma] - scaffolder for optical mapping data <br><br />
[http://ardb.cbcb.umd.edu ARDB] - antibiotic resistance database <br><br />
[http://bowtie-bio.sourceforge.net Bowtie] - ultrafast and memory-efficient short read aligner<br><br />
Prokaryotic annotation pipeline <br><br />
Comparative annotation viewer <br><br />
[[Partition]]</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Cbcb:Pop-Lab:Partition&diff=5520Cbcb:Pop-Lab:Partition2009-08-10T14:59:21Z<p>Cmhill: </p>
<hr />
<div>Testing.</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Cbcb:Pop-Lab:Software&diff=5519Cbcb:Pop-Lab:Software2009-08-10T14:59:14Z<p>Cmhill: /* Software developed by members of our group */</p>
<hr />
<div>==Software developed by members of our group==<br />
[http://amos.sourceforge.net AMOS] - open-source assembly package <br><br />
[http://amos.sourceforge.net/docs/bambus Bambus] - hierarchical genome scaffolder <br><br />
[http://amos.sourceforge.net/Figaro Figaro] - sequence trimmer that does not require knowledge about the vector sequence <br><br />
[http://www.cbcb.umd.edu/~whitej/metastats/detection.shtml MetaStats] - statistical methods for comparing metagenomic samples <br><br />
[http://www.cbcb.umd.edu/soma On-Soma] - scaffolder for optical mapping data <br><br />
[http://ardb.cbcb.umd.edu ARDB] - antibiotic resistance database <br><br />
[http://bowtie-bio.sourceforge.net Bowtie] - ultrafast and memory-efficient short read aligner<br><br />
Prokaryotic annotation pipeline <br><br />
Comparative annotation viewer <br><br />
[[Cbcb:Pop-Lab:Partition]]</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Cbcb:Pop-Lab:How_to_use_the_partition_software%3F&diff=5518Cbcb:Pop-Lab:How to use the partition software?2009-08-10T01:31:13Z<p>Cmhill: </p>
<hr />
<div>Testing.</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Cbcb:Pop-Lab:How-to&diff=5517Cbcb:Pop-Lab:How-to2009-08-10T01:31:06Z<p>Cmhill: /* How-To repository */</p>
<hr />
<div>==How-To repository==<br />
[http://www.cbcb.umd.edu/intranet/resources.shtml Getting started at the CBCB] <br><br />
[[Cbcb:Pop-Lab:AMOS-CVS How to use AMOS through CVS]] <br><br />
[[Cbcb:Pop-Lab:AMOScmp-SR How to use AMOScmp with short read data ]] <br><br />
How do I annotate a genome at the CBCB?<br><br />
[[Cbcb:Pop-Lab:How do I run the new Bambus | How do I run the new Bambus?]] <br><br />
How do I use the antibiotic resistance database?<br><br />
How do I use the antibiotic resistance database locally? <br><br />
How do I run jobs on the grid? <br><br />
[[Cbcb:Pop-Lab:OTUs How do I create OTUs from 16S rRNA sequence data? | How do I create OTUs from 16S rRNA sequence data?]] <br><br />
How do I compare metagenomic datasets through the metastats website?<br><br />
How do I compare metagenomic datasets using R directly? <br><br />
How do I find CRISPRs in a new genome?<br><br />
[[Cbcb:Pop-Lab:SOMA How do I scaffold a genome using optical maps (both locally and through the web) ]] <br><br />
How do I generate graph information out of Minimus?<br><br />
What tools are available for doing <i>in silico</i> finishing at the CBCB? <br><br />
How do I generate a scaffold graph starting from a 454 .ace file?<br><br />
How do I draw a pretty picture of a scaffold stored in an AMOS bank?<br><br />
[[Cbcb:Pop-Lab:How to use the partition software? | How do I use the partition software?]]</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Cbcb:Pop-Lab:Chris-Report&diff=5379Cbcb:Pop-Lab:Chris-Report2009-06-13T00:47:18Z<p>Cmhill: /* Tasks */</p>
<hr />
<div>== June 12, 2009 ==<br />
===Tasks===<br />
'''Intergenic space and gene ontology work'''<br />
*Investigated GO annotation tools that Bo has linked me to.<br />
*I have been working on the Snail, ''Lottia gigantea'', genome.<br />
*Found the top 10% of intergenic spaces with relative ease; however, there was no GO annotation file for the genes. Most of the genomes of interest will not have a GO annotation file (shocker, I know).<br />
*BLAST'd the sequences using Blast2GO, '''ALL in ONE tool for functional annotation of (novel) sequences and the analysis of annotation data.''' http://www.blast2go.org/<br />
*Blast2GO has a pipeline version, so I have been planning a way to take new incomplete genomes and have them run through this annotation pipeline.<br />
*BLAST took around 8 hours for ~2200 sequences using an online database. I will run another group on the CBCB's BLAST.<br />
*GO annotation using Blast2GO took around 24 hours.<br />
*I plan on running HMMs this weekend for the approximately 200 sequences with no suitable hits.<br />
'''Partition'''<br />
*Drastically overhauling the backend of my partition program to make it more extendable/robust.<br />
*Adding XML support to make parsing easier and increase possible functionality.<br />
*Need to focus on getting James something usable asap.<br />
<br />
===Summer Goals===<br />
*Finish analyzing the intergenic space of the incomplete genomes. Determine future possibilities/publications.<br />
*Modify/extend partition program to incorporate into James' metagenomic pipeline.<br />
*Include viral metagenomic data into GeneParser. <br />
<br />
== Mar 2, 2009 ==<br />
===Tasks===<br />
'''Intergenic space and gene ontology work with Cristian''' <br />
*Approach<br />
**Get the gene-spacing information: typically a simple parse of GFF files, if they are available.<br />
**Get the gene function information: use GO annotations if available.<br />
**Rank the genes by 5' spacing size.<br />
**Take the 10% longest, the 10% shortest, and the middle 20%, and use GeneMerge to find out what they do.<br />
*Completed genomes<br />
**Anemone, ''Nematostella vectensis''<br />
**Frog, ''Xenopus tropicalis''<br />
**Water flea, ''Daphnia pulex''<br />
*Incomplete genomes that require blast<br />
**Gastropod Snail, ''Lottia gigantea''<br />
**Polychaete Worm, ''Capitella sp''<br />
*Future genomes<br />
**Leech, ''Helobdella robusta''<br />
**Flatworm (Planaria), ''Schmidtea mediterranea''<br />
**Sea Slug, ''Aplysia californica''<br />
**Snail, ''Biomphalaria glabrata''<br />
**Slime-mold, ''Dictyostelium purpureum QSDP1''<br />
*Waiting to hear back from Cristian about blast value cut-offs for incomplete genomes.<br />
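The ranking steps in the approach above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the script actually used: the function names are hypothetical, features of type <code>gene</code> are assumed to carry an <code>ID=</code> attribute, genes are assumed not to be nested, and the "middle 20%" is taken to be the 20% of genes centred on the median spacing. The GO lookup and the GeneMerge step are not shown.

```python
from collections import namedtuple

# Hypothetical record for one GFF "gene" feature.
Gene = namedtuple("Gene", "seqid start end strand gene_id")


def gene_id_from_attributes(attrs):
    """Return the value of the ID= attribute, or the raw string if absent."""
    for field in attrs.split(";"):
        if field.startswith("ID="):
            return field[3:]
    return attrs


def parse_gff_genes(lines):
    """Collect 'gene' features from tab-separated GFF lines."""
    genes = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "gene":
            continue
        genes.append(Gene(cols[0], int(cols[3]), int(cols[4]), cols[6],
                          gene_id_from_attributes(cols[8])))
    return genes


def five_prime_spacing(genes):
    """Map gene id -> gap between the gene's 5' end and its upstream neighbour.

    Assumption: genes on a sequence do not nest, so the neighbour in
    start-sorted order is the nearest one. Overlaps are reported as 0.
    """
    by_seq = {}
    for g in genes:
        by_seq.setdefault(g.seqid, []).append(g)
    spacing = {}
    for seq_genes in by_seq.values():
        seq_genes.sort(key=lambda g: g.start)
        for i, g in enumerate(seq_genes):
            if g.strand == "+" and i > 0:
                gap = g.start - seq_genes[i - 1].end - 1
            elif g.strand == "-" and i + 1 < len(seq_genes):
                gap = seq_genes[i + 1].start - g.end - 1
            else:
                continue  # no upstream neighbour on this sequence
            spacing[g.gene_id] = max(gap, 0)
    return spacing


def select_groups(spacing):
    """Split ranked genes into 10% shortest, middle 20%, and 10% longest."""
    ranked = sorted(spacing, key=spacing.get)
    n = len(ranked)
    tenth = n // 10
    mid_lo = (n - 2 * tenth) // 2
    return {
        "shortest": ranked[:tenth],
        "middle": ranked[mid_lo:mid_lo + 2 * tenth],
        "longest": ranked[n - tenth:],
    }
```

The three lists returned by <code>select_groups</code> would then serve as the study sets for the functional analysis, with the full gene list as the background population.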
<br />
'''Partitioning System'''<br />
*Splitting contigs based on "subgroup" information: essentially, extending the system to break up an entire assembly into multiple "sub-assemblies", each containing only reads from a single subgroup.<br />
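As a rough illustration of the idea, the grouping step might look like the sketch below. It assumes two hypothetical inputs that are not described on this page: a mapping from contig IDs to the read IDs they contain, and a mapping from read IDs to subgroup names. It only regroups read IDs; rebuilding contig layouts or consensus sequences for each sub-assembly is not shown.

```python
from collections import defaultdict


def split_by_subgroup(contig_reads, read_subgroup):
    """Group each contig's reads by subgroup.

    contig_reads:  contig id -> list of read ids (hypothetical input)
    read_subgroup: read id -> subgroup name (hypothetical input)
    Returns: subgroup name -> {contig id -> list of read ids}
    """
    sub_assemblies = defaultdict(lambda: defaultdict(list))
    for contig, reads in contig_reads.items():
        for read in reads:
            # Reads with no known subgroup are kept under a placeholder name.
            group = read_subgroup.get(read, "unassigned")
            sub_assemblies[group][contig].append(read)
    return {group: dict(contigs) for group, contigs in sub_assemblies.items()}
```

A contig whose reads come from several subgroups therefore appears once in each of the corresponding sub-assemblies, with only that subgroup's reads.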
<br />
'''Conserved genomic elements in bacteria'''<br />
*Update elements based on Adam's changes to Insignia.<br />
*Find something to write about.<br />
===Interesting Stuff===<br />
*New insights into aging based on transcription factors, [http://med.stanford.edu/news_releases/2008/july/aging-worm.html Prevailing theory of aging challenged in Stanford worm study]</div>Cmhillhttps://wiki.umiacs.umd.edu/cbcb/index.php?title=Cbcb:Pop-Lab:Chris-Report&diff=5378Cbcb:Pop-Lab:Chris-Report2009-06-13T00:45:46Z<p>Cmhill: /* Tasks */</p>
<hr />
<div>== June 12, 2009 ==<br />
===Tasks===<br />
'''Intergenic space and gene ontology work'''<br />
*Investigated GO annotation tools that Bo has linked me to.<br />
*I have been working on the Snail, ''Lottia gigantea'', genome.<br />
*Found the top 10% of intergenic spaces with relative ease; however, there was no GO annotation file for the genes. <br />
*BLAST'd the sequences using Blast2GO, '''ALL in ONE tool for functional annotation of (novel) sequences and the analysis of annotation data.''' http://www.blast2go.org/<br />
*Blast2GO has a pipeline version, so I have been planning a way to take new incomplete genomes and have them run through this annotation pipeline.<br />
*BLAST took around 8 hours for ~2200 sequences using an online database. I will run another group on the CBCB's BLAST.<br />
*GO annotation using Blast2GO took around 24 hours.<br />
*I plan on running HMMs this weekend for the approximately 200 sequences with no suitable hits.<br />
'''Partition'''<br />
*Drastically overhauling the backend of my partition program to make it more extendable/robust.<br />
*Adding XML support to make parsing easier and increase possible functionality.<br />
*Need to focus on getting James something usable asap.<br />
<br />
===Summer Goals===<br />
*Finish analyzing the intergenic space of the incomplete genomes. Determine future possibilities/publications.<br />
*Modify/extend partition program to incorporate into James' metagenomic pipeline.<br />
*Include viral metagenomic data into GeneParser. <br />
<br />
== Mar 2, 2009 ==<br />
===Tasks===<br />
'''Intergenic space and gene ontology work with Cristian''' <br />
*Approach<br />
**Get the gene-spacing information: typically a simple parse of GFF files, if they are available.<br />
**Get the gene function information: use GO annotations if available.<br />
**Rank the genes by 5' spacing size.<br />
**Take the 10% longest, the 10% shortest, and the middle 20%, and use GeneMerge to find out what they do.<br />
*Completed genomes<br />
**Anemone, ''Nematostella vectensis''<br />
**Frog, ''Xenopus tropicalis''<br />
**Water flea, ''Daphnia pulex''<br />
*Incomplete genomes that require blast<br />
**Gastropod Snail, ''Lottia gigantea''<br />
**Polychaete Worm, ''Capitella sp''<br />
*Future genomes<br />
**Leech, ''Helobdella robusta''<br />
**Flatworm (Planaria), ''Schmidtea mediterranea''<br />
**Sea Slug, ''Aplysia californica''<br />
**Snail, ''Biomphalaria glabrata''<br />
**Slime-mold, ''Dictyostelium purpureum QSDP1''<br />
*Waiting to hear back from Cristian about blast value cut-offs for incomplete genomes.<br />
<br />
'''Partitioning System'''<br />
*Splitting contigs based on "subgroup" information: essentially, extending the system to break up an entire assembly into multiple "sub-assemblies", each containing only reads from a single subgroup.<br />
<br />
'''Conserved genomic elements in bacteria'''<br />
*Update elements based on Adam's changes to Insignia.<br />
*Find something to write about.<br />
===Interesting Stuff===<br />
*New insights into aging based on transcription factors, [http://med.stanford.edu/news_releases/2008/july/aging-worm.html Prevailing theory of aging challenged in Stanford worm study]</div>Cmhill