Noisy on ACCESS 1.5.12
Identify homoplastic characters in multiple sequence alignments - run on XSEDE
Christoph Flamm, Sonja J Prohaska, Guido Fritzsch, Peter F Stadler

Andreas W. M. Dress, Christoph Flamm, Guido Fritzsch, Stefan Grünewald, Matthias Kruspe, Sonja J. Prohaska, Peter F. Stadler Noisy: identification of problematic columns in multiple sequence alignments. Algorithms Mol Biol, 3:7 (2008). doi:10.1186/1748-7188-3-7
Stefan Grünewald, Kristoffer Forslund, Andreas W. M. Dress, Vincent Moulton QNet: An Agglomerative Method for the Construction of Phylogenetic Networks from Weighted Quartets. Mol Biol Evol, 24(2):532-538 (2007). doi:10.1093/molbev/msl180
Bryant, David and Moulton, Vincent (2004) Neighbor-Net: An Agglomerative Method for the Construction of Phylogenetic Networks. Mol. Biol. Evol. 21:255-265

Phylogeny / Alignment
noisy_xsede

noisy_comet perl "" 0

number_nodes 2 scheduler.conf perl "threads_per_process=1\\n" . "node_exclusive=0\\n" . "mem=15G\\n" . "nodes=1\\n"

infile Input File (AFA format) perl "input.afa" 3 input.afa

all_results *

runtime 1 Maximum Hours to Run (up to 168 hours) scheduler.conf 0.5
The maximum hours to run must not exceed 168 perl $runtime > 168.0
The maximum hours to run must be at least 0.05 perl $runtime < 0.05
perl "runhours=$value\\n"
The job will run on 1 processor as configured. If it runs for the entire configured time, it will consume $runtime CPU hours perl $runtime ne 0
Estimate the maximum time your job will need to run. We recommend testing initially with a run of less than 0.5 h, because jobs set for 0.5 h or less dependably run immediately in the "debug" queue. Once you are sure the configuration is correct, you can then increase the time. Jobs longer than 0.5 h are submitted to the "normal" queue, where jobs configured for one or a few hours may run sooner than jobs configured for the full 168 hours.

specify_cutoff 5 Set the lower bound of the reliability score perl defined $value ? "--cutoff $value":""
Set the lower bound of the reliability score for an alignment column to FLOAT. Columns with a score below FLOAT are removed from the output alignment. The name of the output MSA is constructed from the base name of the input MSA by adding the postfix _out.fas

specify_distcalc Set distance calculation of NeighborNet (--distance) HAMMING GTR HAMMING perl " --distance $value " 6
Set distance calculation of NeighborNet to HAMMING or GTR

distance_matrix Select Substitution matrix file for NeighborNet (--matrix) perl !defined $specify_distcalc perl defined $value ? " --matrix distance_matrix.txt":"" 7
The matrix file is not compatible with using the --distance option perl defined $distance_matrix && defined $distance distance_matrix.txt
Read the distance matrix used by NeighborNet to generate the cyclic order from FILE instead of letting NeighborNet calculate the distance matrix by one of the methods given to option --distance

specify_string Treat these character(s) as missing data (--missing) perl defined $value ? "--missing $value":"" N 8
Each character of STRING is treated as missing data and is removed from a column before changes between character states are calculated.

set_nogap Add the gap symbol to the set of missing characters (--nogap) perl ($value) ? "--nogap":"" 9
Add the gap symbol to the set of missing characters.

suppress_constant Suppress constant columns in the output MSA (--noconstant) perl ($value) ? "--noconstant":"" 10
Suppress constant columns in the output MSA.

specify_ordering Set the method to calculate the cyclic order (--ordering) nnet qnet rand all INT nnet "--ordering nnet" qnet "--ordering qnet" rand "--ordering rand,$specify_randint" all "--ordering all" INT "--ordering $specify_intint " 11
If the number of taxa for the all option is greater than 8, the run can become quite lengthy perl $specify_ordering eq "all" && $specify_ntaxa > 8
More than 120 taxa can cause you to run out of memory.
Consider using the more-memory option perl $specify_ntaxa > 120
Sorry, Noisy cannot handle more than 338 taxa perl $specify_ntaxa > 338

specify_randint Specify an integer for the ordering (rand or INT) perl $specify_ordering eq "rand" 12
Please enter an integer for the ordering perl $specify_ordering eq "rand" && !defined $specify_randint
With rand, the reliability score is calculated for a random sample of all possible orderings of the TAXA. The size of the random sample (default is 1000) can be set by adding an integer after a comma to rand, i.e. rand,42. (All orderings with a smaller reliability than cutoff are singled out to a text file with "_best.gr" as postfix)

specify_intint Specify a cyclic ordering (INT) perl $specify_ordering eq "INT" 13
Please enter an integer for the ordering perl $specify_ordering eq "INT" && !defined $specify_intint
Specified by a comma-separated list of TAXA indices in the range [0, NumberOfTAXA[ (no spaces are allowed), e.g. 3,0,4,1,2 as ordering for the 5 TAXA in the input MSA.

specify_shuffles Specify number of random shufflings per column of the MSA (--shuffles) perl (defined $value) ? "--shuffles $value":"" 14
Perform INT random shufflings per column of the MSA. 9

specify_smoothing Calculate a running average over the reliability score of x columns (--smooth) perl (defined $value) ? "--smooth $value":"" 15
Calculate a running average over the reliability score of INT columns and use these smoothed values to remove unreliable columns from the MSA.

specify_datatype Set sequence type of input MSA (--seqtype) D P R D perl "--seqtype $value" 16
Set sequence type of input MSA to DNA (the default), Protein, or RNA. This information is used by NeighborNet during distance matrix calculation.

increase_verbosity Increase the verbosity level (--verbose) 7 perl ($value) ? "--verbose":"" 17
Provide more verbose output.
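
As a rough illustration of how the option fragments in this form could combine into a single Noisy argument list, here is a minimal Python sketch. The helper name build_noisy_args and its keyword names are hypothetical and are not part of Noisy or of the submission portal; only the option strings (--cutoff, --distance, --matrix, --missing, --nogap, --noconstant, --ordering, --shuffles, --smooth, --seqtype, --verbose) and the consistency checks are taken from the parameter descriptions. The order of the options in the result is also an assumption.

```python
from typing import List, Optional, Sequence


def build_noisy_args(
    cutoff: Optional[float] = None,
    distance: Optional[str] = None,          # "HAMMING" or "GTR"
    matrix_file: Optional[str] = None,       # e.g. "distance_matrix.txt"
    missing: Optional[str] = None,           # characters treated as missing data
    nogap: bool = False,
    noconstant: bool = False,
    ordering: str = "nnet",                  # nnet, qnet, rand, all, or INT
    rand_samples: Optional[int] = None,      # sample size used with "rand"
    cyclic_order: Optional[Sequence[int]] = None,  # taxa indices used with "INT"
    shuffles: Optional[int] = None,
    smooth: Optional[int] = None,
    seqtype: str = "D",                      # D (DNA), P (protein), R (RNA)
    verbose: bool = False,
) -> List[str]:
    """Hypothetical helper: turn form values into a list of Noisy options."""
    # The form rejects a matrix file combined with --distance.
    if matrix_file is not None and distance is not None:
        raise ValueError("The matrix file is not compatible with the --distance option")
    if distance is not None and distance not in ("HAMMING", "GTR"):
        raise ValueError("distance must be HAMMING or GTR")
    if seqtype not in ("D", "P", "R"):
        raise ValueError("seqtype must be D, P, or R")

    args: List[str] = []
    if cutoff is not None:
        args += ["--cutoff", str(cutoff)]
    if distance is not None:
        args += ["--distance", distance]
    if matrix_file is not None:
        args += ["--matrix", matrix_file]
    if missing is not None:
        args += ["--missing", missing]
    if nogap:
        args.append("--nogap")
    if noconstant:
        args.append("--noconstant")

    if ordering in ("nnet", "qnet", "all"):
        args += ["--ordering", ordering]
    elif ordering == "rand":
        # The form requires an explicit sample size for rand.
        if rand_samples is None:
            raise ValueError("Please enter an integer for the ordering")
        args += ["--ordering", f"rand,{rand_samples}"]
    elif ordering == "INT":
        if not cyclic_order:
            raise ValueError("Please enter a cyclic ordering")
        # Comma-separated taxa indices, no spaces (e.g. 3,0,4,1,2).
        args += ["--ordering", ",".join(str(i) for i in cyclic_order)]
    else:
        raise ValueError(f"unknown ordering: {ordering}")

    if shuffles is not None:
        args += ["--shuffles", str(shuffles)]
    if smooth is not None:
        args += ["--smooth", str(smooth)]
    args += ["--seqtype", seqtype]
    if verbose:
        args.append("--verbose")
    return args


if __name__ == "__main__":
    print(build_noisy_args(cutoff=0.8, ordering="rand", rand_samples=42))
```

Returning a list rather than one concatenated string avoids the stray leading and trailing spaces that some of the form's fragments carry, and the list can be handed directly to a process launcher.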