<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE pise SYSTEM "PARSER/pise.dtd" [
<!ENTITY emboss_init SYSTEM "XMLDIR/emboss.xml">
]>

<pise>

<head>
<title>COMPSEQ</title>
<description>Counts the composition of dimer/trimer/etc words in a sequence (EMBOSS)</description>
<category>nucleic:composition</category>
<category>protein:composition</category>
<doclink>http://bioweb.pasteur.fr/docs/EMBOSS/compseq.html</doclink>
</head>

<command>compseq</command>

<parameters>

&emboss_init;


<parameter type="Paragraph">
<paragraph>
<name>input</name>
	<prompt>Input section</prompt>

<parameters>
	<parameter type="Sequence" ismandatory="1" issimple="1" ishidden="0">
	<name>sequence</name>
	<attributes>
		<prompt>sequence -- any [sequences] (-sequence)</prompt>
		<format>
			<language>perl</language>
			<code>" -sequence=$value -sformat=fasta"</code>
		</format>
		<group>1</group>
		<seqtype><value>any</value></seqtype>
		<seqfmt>
			<value>8</value>
		</seqfmt>
		<pipe>
			<pipetype>seqsfile</pipetype>
				<language>perl</language>
				<code>1</code>
		</pipe>
	</attributes>
	</parameter>

	</parameters>
</paragraph>
</parameter>


<parameter type="Paragraph">
<paragraph>
<name>required</name>
	<prompt>Required section</prompt>

<parameters>
	<parameter type="Integer" ismandatory="1" issimple="1" ishidden="0">
	<name>word</name>
	<attributes>
		<prompt>Word size to consider (e.g. 2=dimer) (-word)</prompt>
		<vdef>
			<value>2</value>
		</vdef>
		<format>
			<language>perl</language>
			<code>" -word=$value"</code>
		</format>
		<group>2</group>
		<comment>
			<value>This is the size of word (n-mer) to count. &lt;BR&gt; Thus if you want to count codon frequencies, you should enter 3 here.</value>
		</comment>
		<scalemin><value>1</value></scalemin>
		<scalemax><value>20</value></scalemax>
	</attributes>
	</parameter>

	</parameters>
</paragraph>
</parameter>


<parameter type="Paragraph">
<paragraph>
<name>advanced</name>
	<prompt>Advanced section</prompt>

<parameters>
	<parameter type="InFile" ismandatory="0" issimple="0" ishidden="0">
	<name>infile</name>
	<attributes>
		<prompt>'compseq' file to use for expected word frequencies (-infile)</prompt>
		<format>
			<language>perl</language>
			<code>($value)? " -infile=$value" : ""</code>
		</format>
		<group>3</group>
		<comment>
			<value>This is a file previously produced by 'compseq' that can be used to set the expected frequencies of words in this analysis. &lt;BR&gt; The word size in the current run must be the same as the one in this results file. Obviously, you should use a file produced from protein sequences if you are counting protein sequence word frequencies, and you must use one made from nucleotide frequencies if you and analysing a nucleotide sequence.</value>
		</comment>
	</attributes>
	</parameter>

	<parameter type="Integer" ismandatory="0" issimple="0" ishidden="0">
	<name>frame</name>
	<attributes>
		<prompt>Frame of word to look at (0=all frames) (-frame)</prompt>
		<vdef>
			<value>0</value>
		</vdef>
		<format>
			<language>perl</language>
			<code>(defined $value &amp;&amp; $value != $vdef)? " -frame=$value" : ""</code>
		</format>
		<group>4</group>
		<comment>
			<value>The normal behaviour of 'compseq' is to count the frequencies of all words that occur by moving a window of length 'word' up by one each time. &lt;BR&gt; This option allows you to move the window up by the length of the word each time, skipping over the intervening words. &lt;BR&gt; You can count only those words that occur in a single frame of the word by setting this value to a number other than zero. &lt;BR&gt; If you set it to 1 it will only count the words in frame 1, 2 will only count the words in frame 2 and so on.</value>
		</comment>
		<scalemax>
			<language>acd</language>
			<code>$word</code>
		</scalemax>
	</attributes>
	</parameter>

	<parameter type="Switch" ismandatory="0" issimple="0" ishidden="0">
	<name>ignorebz</name>
	<attributes>
		<prompt>Ignore the amino acids B and Z and just count them as 'Other' (-ignorebz)</prompt>
		<vdef>
			<value>1</value>
		</vdef>
		<format>
			<language>perl</language>
			<code>($value)? "" : " -noignorebz"</code>
		</format>
		<group>5</group>
		<comment>
			<value>The amino acid code B represents Asparagine or Aspartic acid and the code Z represents Glutamine or Glutamic acid. &lt;BR&gt; These are not commonly used codes and you may wish not to count words containing them, just noting them in the count of 'Other' words.</value>
		</comment>
	</attributes>
	</parameter>

	<parameter type="Switch" ismandatory="0" issimple="0" ishidden="0">
	<name>reverse</name>
	<attributes>
		<prompt>Count words in the forward and reverse sense (-reverse)</prompt>
		<vdef>
			<value>0</value>
		</vdef>
		<format>
			<language>perl</language>
			<code>($value)? " -reverse" : ""</code>
		</format>
		<group>6</group>
		<comment>
			<value>Set this to be true if you also wish to also count words in the reverse complement of a nucleic sequence.</value>
		</comment>
	</attributes>
	</parameter>

	</parameters>
</paragraph>
</parameter>


<parameter type="Paragraph">
<paragraph>
<name>output</name>
	<prompt>Output section</prompt>

<parameters>
	<parameter type="OutFile" ismandatory="1" issimple="1" ishidden="0">
	<name>outfile</name>
	<attributes>
		<prompt>outfile (-outfile)</prompt>
		<vdef><value>outfile.out</value></vdef>
		<format>
			<language>perl</language>
			<code>" -outfile=$value"</code>
		</format>
		<group>7</group>
		<comment>
			<value>This is the results file.</value>
		</comment>
	</attributes>
	</parameter>

	<parameter type="Switch" ismandatory="0" issimple="0" ishidden="0">
	<name>zerocount</name>
	<attributes>
		<prompt>Display the words that have a frequency of zero (-zerocount)</prompt>
		<vdef>
			<value>1</value>
		</vdef>
		<format>
			<language>perl</language>
			<code>($value)? "" : " -nozerocount"</code>
		</format>
		<group>8</group>
		<comment>
			<value>You can make the output results file much smaller if you do not display the words with a zero count.</value>
		</comment>
	</attributes>
	</parameter>

	</parameters>
</paragraph>
</parameter>

<parameter type="String" ishidden="1">
<name>auto</name>
<attributes>
	<format>
		<language>perl</language>
		<code>" -auto -stdout"</code>
	</format>
	<group>9</group>
</attributes>
</parameter>

</parameters>
</pise>
