Noisy on XSEDE
1.5.12
Identify homoplastic characters in multiple sequence alignments - run on XSEDE
Christoph Flamm, Sonja J Prohaska, Guido Fritzsch, Peter F Stadler
Andreas W. M. Dress, Christoph Flamm, Guido Fritzsch, Stefan Grünewald, Matthias Kruspe, Sonja J. Prohaska, Peter F. Stadler. Noisy: identification of problematic columns in multiple sequence alignments. Algorithms Mol Biol, 3:7 (2008). doi:10.1186/1748-7188-3-7
Stefan Grünewald, Kristoffer Forslund, Andreas W. M. Dress, Vincent Moulton. QNet: An Agglomerative Method for the Construction of Phylogenetic Networks from Weighted Quartets. Mol Biol Evol, 24(2):532-538 (2007). doi:10.1093/molbev/msl180
David Bryant, Vincent Moulton. Neighbor-Net: An Agglomerative Method for the Construction of Phylogenetic Networks. Mol Biol Evol, 21:255-265 (2004).
Phylogeny / Alignment
noisy_xsede
noisy_comet
perl: ""
0

number_nodes
perl: !$more_memory
2
scheduler.conf
perl:
"nodes=1\\n" .
"node_exclusive=0\\n" .
"threads_per_process=1\\n"

number_nodes2
perl: $more_memory
2
scheduler.conf
perl:
"nodes=1\\n" .
"node_exclusive=1\\n" .
"threads_per_process=1\\n"

infile
Input File (AFA format)
perl: "input.fasta"
99
input.fasta

all_results
*

runtime
1
Maximum Hours to Run (up to 168 hours)
scheduler.conf
0.5
The maximum hours to run must be less than 168
perl: $runtime > 168.0
The maximum hours to run must be greater than 0.05
perl: $runtime < 0.05
perl: "runhours=$value\\n"
The job will run on 3 processors as configured. If it runs for the entire configured time, it will consume 2 x $runtime cpu hours
perl: $runtime ne 0
Estimate the maximum time your job will need to run. We recommend testing initially with a run of 0.5 h or less, because jobs set for 0.5 h or less dependably run immediately in the "debug" queue. Once you are sure the configuration is correct, you can then increase the time. Jobs longer than 0.5 h are submitted to the "normal" queue, where jobs configured for one or a few hours may run sooner than jobs configured for the full 168 hours.
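
The runtime checks and the `scheduler.conf` fragments above (`nodes`, `node_exclusive`, `threads_per_process`, `runhours`) can be sketched in a few lines of Python. This is only an illustration, not the portal's actual code. The function names are hypothetical. The order of the lines in the file is an assumption, as is the reading that `more_memory` selects the `node_exclusive=1` fragment.

```python
def check_runtime(hours):
    """Return the error messages the form would show for this runtime."""
    errors = []
    if hours > 168.0:
        errors.append("The maximum hours to run must be less than 168")
    if hours < 0.05:
        errors.append("The maximum hours to run must be greater than 0.05")
    return errors


def scheduler_conf(hours, more_memory=False):
    """Assemble scheduler.conf text from the fragments listed in the source."""
    errors = check_runtime(hours)
    if errors:
        raise ValueError("; ".join(errors))
    # Assumption: more_memory switches node_exclusive from 0 to 1.
    node_exclusive = 1 if more_memory else 0
    # Assumption: the fragments are written in this order.
    return (
        "nodes=1\n"
        f"node_exclusive={node_exclusive}\n"
        "threads_per_process=1\n"
        f"runhours={hours}\n"
    )


def expected_queue(hours):
    """Queue named in the help text: 'debug' up to 0.5 h, otherwise 'normal'."""
    return "debug" if hours <= 0.5 else "normal"
```

For a first test run, `scheduler_conf(0.5)` yields a configuration that `expected_queue(0.5)` places in the "debug" queue.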

specify_ntaxa
How many taxa does your alignment have?
This option will help when more memory is needed

more_memory
I need more memory
This option will help when more memory is needed

specify_cutoff
2
Set the lower bound of the reliability score
perl: defined $value ? "--cutoff $value":""
Set the lower bound of the reliability score for an alignment column to FLOAT. Columns with a score below FLOAT are removed from the output alignment. The name of the output MSA is constructed from the base name of the input MSA by adding the postfix _out.fas

specify_distcalc
Set distance calculation of NeighborNet (--distance)
HAMMING, GTR
perl: " --distance $value "
3
Set distance calculation of NeighborNet to HAMMING or GTR

distance_matrix
Select substitution matrix file for NeighborNet (--matrix)
perl: !defined $specify_distcalc
perl: defined $value ? " --matrix distance_matrix.txt":""
4
distance_matrix.txt

specify_string
Treat this character as missing data (--missing)
perl: defined $value ? "--missing $value":""
5
Each character of STRING is treated as missing data and is removed from a column before changes between character states are calculated.

set_nogap
Add the gap symbol to the set of missing characters (--nogap)
perl: ($value) ? "--nogap":""
6
Add the gap symbol to the set of missing characters.

suppress_constant
Suppress constant columns in the output MSA. (--noconstant)
perl: ($value) ? "--noconstant":""
7
Suppress constant columns in the output MSA.

specify_ordering
Set the method to calculate the cyclic order (--ordering)
nnet, qnet, rand, all, INT
nnet: "--ordering nnet"
qnet: "--ordering qnet"
rand: "--ordering rand,$specify_randint"
all: "--ordering all"
INT: "--ordering $specify_intint "
8
If the number of taxa for the all option is greater than 8, the run can become quite lengthy
perl: $specify_ordering eq "all" && $specify_ntaxa > 8
More than 120 taxa can cause you to run out of memory.
Consider the more memory option
perl: $specify_ntaxa > 120
Sorry, Noisy cannot handle more than 338 taxa
perl: $specify_ntaxa > 338

specify_randint
Specify an integer for the ordering (rand or INT)
perl: $specify_ordering eq "rand"
Please enter an integer for the ordering
perl: $specify_ordering eq "rand" && !defined $specify_randint
With rand, the reliability score is calculated for a random sample of all possible orderings of the TAXA. The size of the random sample (default is 1000) can be set by adding an integer after a comma to rand, e.g. rand,42. (All orderings with a reliability score below the cutoff are written to a text file with the postfix "_best.gr".)

specify_intint
Specify a cyclic ordering (INT)
perl: $specify_ordering eq "INT"
Please enter an integer for the ordering
perl: $specify_ordering eq "INT" && !defined $specify_intint
Specified by a comma-separated list of TAXA indices in the range [0, NumberOfTAXA[ (no spaces are allowed), e.g. 3,0,4,1,2 as ordering for the 5 TAXA in the input MSA.

specify_shuffles
Specify number of random shufflings per column of the MSA (--shuffles)
perl: (defined $value) ? "--shuffles $value":""
Perform INT random shufflings per column of the MSA.
9

specify_smoothing
Calculate a running average over the reliability score of x columns (--smooth)
perl: (defined $value) ? "--smooth $value":""
10
Calculate a running average over the reliability score of INT columns and use these smoothed values to remove unreliable columns from the MSA.

specify_datatype
Set sequence type of input MSA (--seqtype)
D, P, R
D
perl: "--seqtype $value"
11
Set sequence type of input MSA to DNA (the default), Protein, or RNA. This information is used by NeighborNet during distance matrix calculation.

increase_verbosity
Increase the verbosity level (--verbose)
7
perl: ($value) ? "--verbose":""
12
Provide more verbose output.
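
To show how the option fragments above combine, here is a minimal Python sketch that builds an argument list from the flags named in this file (`--cutoff`, `--distance`, `--missing`, `--nogap`, `--noconstant`, `--ordering`, `--shuffles`, `--smooth`, `--seqtype`, `--verbose`). Several things are assumptions, not taken from the source: the executable name `noisy`, placing the input file last, the order of the flags, and the helper names `build_noisy_args` and `validate_cyclic_order`. Requiring each index exactly once is also an assumption, inferred from the 3,0,4,1,2 example.

```python
def validate_cyclic_order(order, ntaxa):
    """Check a comma-separated ordering such as '3,0,4,1,2' for ntaxa taxa."""
    if " " in order:
        raise ValueError("no spaces are allowed in the ordering")
    indices = [int(token) for token in order.split(",")]
    # Assumption: every index in [0, ntaxa) must appear exactly once.
    if sorted(indices) != list(range(ntaxa)):
        raise ValueError("ordering must list each index in [0, %d) once" % ntaxa)
    return indices


def build_noisy_args(infile="input.fasta", cutoff=None, distance=None,
                     missing=None, nogap=False, noconstant=False,
                     ordering=None, shuffles=None, smooth=None,
                     seqtype=None, verbose=False):
    """Return an argument list; options left unset are simply omitted."""
    args = ["noisy"]  # assumption: name of the executable
    if cutoff is not None:
        args += ["--cutoff", str(cutoff)]
    if distance is not None:
        args += ["--distance", distance]      # HAMMING or GTR
    if missing is not None:
        args += ["--missing", missing]
    if nogap:
        args.append("--nogap")
    if noconstant:
        args.append("--noconstant")
    if ordering is not None:
        args += ["--ordering", ordering]      # nnet, qnet, rand,N, all, or a list
    if shuffles is not None:
        args += ["--shuffles", str(shuffles)]
    if smooth is not None:
        args += ["--smooth", str(smooth)]
    if seqtype is not None:
        args += ["--seqtype", seqtype]        # D, P, or R
    if verbose:
        args.append("--verbose")
    args.append(infile)  # assumption: the input file comes last
    return args
```

An explicit ordering could be checked with `validate_cyclic_order("3,0,4,1,2", 5)` before being passed as `ordering="3,0,4,1,2"`.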