EPA-NG on ACCESS
0.3.8
Massively Parallel Evolutionary Placement of Genetic Sequences - run on XSEDE
Pierre Barbera, Alexey M Kozlov, Lucas Czech, Benoit Morel, Diego Darriba, Tomáš Flouri, Alexandros Stamatakis
Pierre Barbera, Alexey M Kozlov, Lucas Czech, Benoit Morel, Diego Darriba, Tomáš Flouri, Alexandros Stamatakis, EPA-ng: Massively Parallel Evolutionary Placement of Genetic Sequences, Systematic Biology, Volume 68, Issue 2, March 2019, Pages 365–369, https://doi.org/10.1093/sysbio/syy054
Phylogeny / Alignment
epa_ng_xsedeexabayes_15perl"epa_ng_expanse"0conf_fileregmem2scheduler.confperl$set_memory == 96perl
"ChargeFactor=1.0\\n" .
"mem=96G\\n" .
"nodes=1\\n" .
"node_exclusive=0\\n" .
"cpus-per-task=48\\n" .
"threads_per_process=48\\n"
conf_file243mem2scheduler.confperl$set_memory == 243perl
"ChargeFactor=1.0\\n" .
"mem=243G\\n" .
"nodes=1\\n" .
"node_exclusive=1\\n" .
"cpus-per-task=128\\n" .
"threads_per_process=128\\n"
conf_file500mem2scheduler.confperl$set_memory == 500perl
"ChargeFactor=1.0\\n" .
"mem=500G\\n" .
"large_data=1\\n" .
"nodes=1\\n" .
"node_exclusive=0\\n" .
"cpus-per-task=32\\n" .
"threads_per_process=32\\n"
specify_threads_96memperl$set_memory == 96perl"-T 48"99specify_threads_243memperl$set_memory == 243perl"-T 128"99specify_threads_500memperl$set_memory == 500perl"-T 32"99infileInput Tree File (must be in Newick format)perl"--tree infile_tree.txt"5infile_tree.txtset_outdirperl"--out-dir ./"all_results*runtime1scheduler.confMaximum Hours to Run (up to 168 hours)perl"runhours=$value\\n"0.5The maximum hours to run must not exceed 168perl$set_memory < 500 && $runtime > 168.0For high memory jobs, the maximum hours to run is 48 hoursperl$runtime > 48 && $set_memory == 500 For high memory jobs, the runhours request must be greater than 6, but you will only be charged for the time your run actually usesperl$runtime <= 6 && $set_memory == 500 The maximum hours to run must be at least 0.05perl$runtime < 0.05The job will run on 48 processors as configured. If it runs for the entire configured time, it will consume 48 X $runtime cpu hoursperl$runtime > 0 && $set_memory == 96The job will run on 128 processors as configured. If it runs for the entire configured time, it will consume 128 X $runtime cpu hoursperl$runtime > 0 && $set_memory == 243The job will run on 64 processors as configured. If it runs for the entire configured time, it will consume 64 X $runtime cpu hoursperl$runtime > 0 && $set_memory == 500Estimate the maximum time your job will need to run. We recommend testing initially with a test run of 0.5 h or less, because jobs set for 0.5 h or less dependably run immediately in the "debug" queue.
Once you are sure the configuration is correct, you can increase the time. Jobs longer than 0.5 h are submitted to the "normal" queue, where jobs configured for one or a few hours may
run sooner than jobs configured for the full 168 hours.
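The runtime rules above (the Perl preconditions on `$runtime` and `$set_memory`) can be restated as a single check that returns every message that would be triggered. This is a minimal Python sketch under stated assumptions: `check_runtime` and `max_cpu_hours` are hypothetical names, the message wording is paraphrased, and the core count is passed in explicitly rather than derived from the memory tier.

```python
def check_runtime(set_memory: int, runtime: float) -> list[str]:
    """Return the error messages triggered by a runtime request (in hours)."""
    errors = []
    if set_memory < 500 and runtime > 168.0:
        errors.append("maximum hours to run must not exceed 168")
    if set_memory == 500 and runtime > 48:
        errors.append("high memory jobs may run for at most 48 hours")
    if set_memory == 500 and runtime <= 6:
        errors.append("high memory jobs must request more than 6 hours")
    if runtime < 0.05:
        errors.append("maximum hours to run must be at least 0.05")
    return errors


def max_cpu_hours(cores: int, runtime: float) -> float:
    """Upper bound on CPU hours if the job uses its whole time allocation."""
    return cores * runtime
```

For example, `check_runtime(96, 0.5)` returns an empty list, which corresponds to the short test run recommended above, and `max_cpu_hours(48, 0.5)` gives the upper bound for that run on the 48-core configuration.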
ref_file12Select ref-msa filepise(defined $ref_file) ? "--ref-msa ref_msa.fasta":""ref_msa.fastabinary_file12Select binary fileperl!defined $ref_filepise(defined $binary_file) ? "--binary ref_binary.bin":""ref_binary.binquery_file12Select query filepise(defined $query_file) ? "--query query.txt":""query.txtPlease select either a text reference file or a binaryperl!defined $ref_file && !defined $binary_filePlease select a query fileperl!defined $query_fileraxquery_file12Select RAxML info filepise(defined $raxquery_file) ? "--model model.txt":""model.txtAs of version 0.2.0, GTRGAMMA model parameters have to be specified explicitly. There are currently two ways of doing this: Either specify a raxml-ng-style model descriptor (elaborated here), like so:
epa-ng --model GTR{0.7/1.8/1.2/0.6/3.0/1.0}+FU{0.25/0.23/0.30/0.22}+G4{0.47} ... or pass a file containing the relevant information, produced by one of the supported tree inference programs.
RECOMMENDED: In the case of raxml-ng, pass the [...].bestModel file resulting from an evaluation run to EPA-ng.
This method supports nearly every model that raxml-ng supports,
so it is highly recommended. Alternatively, EPA-ng can also parse
the model parameters from RAxML 8.x info files or from IQ-TREE report files,
though there may be parsing problems as not all models are covered.model_string12Specify the model with a text stringpise(defined $model_string) ? "--model $model_string":""Please specify the model parameters with a file or a text stringperl!defined $raxquery_file && !defined $model_stringSorry, you cannot select a parameter file AND add these values as a stringperldefined $raxquery_file && defined $model_stringAs of version 0.2.0, GTRGAMMA model parameters have to be specified explicitly. There are currently two ways of doing this: Either specify a raxml-ng-style model descriptor (elaborated here), like so:
epa-ng --model GTR{0.7/1.8/1.2/0.6/3.0/1.0}+FU{0.25/0.23/0.30/0.22}+G4{0.47} ... or pass a file containing the relevant information, produced by one of the supported tree inference programs.
RECOMMENDED: In the case of raxml-ng, pass the [...].bestModel file resulting from an evaluation run to EPA-ng.
This method supports nearly every model that raxml-ng supports,
so it is highly recommended. Alternatively, EPA-ng can also parse
the model parameters from RAxML 8.x info files or from IQ-TREE report files,
though there may be parsing problems as not all models are covered.set_memorySelect the memory required9624350096compute_optsCompute Optionschoose_heuristicChoose your heuristicdyn-heurfix-heurbaseball-heurno-heurdyn-heur"--dyn-heur $specify_dynamic"fix-heur"--fix-heur $specify_fixed"baseball-heur"--baseball-heur"no-heur"--no-heur" -g,--dyn-heur FLOAT:FLOAT in [0 - 1]=0.99999 Excludes: --fix-heur --baseball-heur --no-heur
Two-phase heuristic, determination of candidate edges using accumulative threshold. Enabled by default! See --no-heur for disabling it.
-G,--fix-heur FLOAT:FLOAT in [0 - 1] Excludes: --dyn-heur --baseball-heur --no-heur
Two-phase heuristic, determination of candidate edges by specified percentage of total edges.
--baseball-heur Excludes: --dyn-heur --fix-heur --no-heur
Baseball heuristic as known from pplacer. strike_box=3,max_strikes=6,max_pitches=40.
--no-heur Excludes: --dyn-heur --fix-heur --baseball-heurspecify_dynamicProvide a value for the dynamic heuristicperl$choose_heuristic eq "dyn-heur"0.9999Please enter a value for the dynamic heuristicperl$choose_heuristic eq "dyn-heur" && !defined $specify_dynamicValue for the dynamic heuristic must be between 0 and 1perl$choose_heuristic eq "dyn-heur" && ($specify_dynamic > 1 || $specify_dynamic < 0 ) specify_fixedProvide a value for the fixed heuristicperl$choose_heuristic eq "fix-heur"Please enter a value for the fixed heuristicperl$choose_heuristic eq "fix-heur" && !defined $specify_fixedThe value for the fixed heuristic must be between 0 and 1 perl$choose_heuristic eq "fix-heur" && ($specify_fixed > 1 || $specify_fixed < 0)specify_queryseqNumber of query sequences to be read in at a time. (--chunk-size)perl(defined $specify_queryseq) ? "--chunk-size $specify_queryseq":"" employ_queryseqEmploy old style of branch length optimization during thorough insertion. (--raxml-blo)perl($value) ? "--raxml-blo":"" no_premaskDo not pre-mask sequences. (--no-pre-mask)perl($value) ? "--no-pre-mask":"" rate_scalersUse individual rate scalers? (--rate-scalers)offonautoperl"--rate-scalers $value"output_optsOutput Optionslikelihood_weightLikelihood weightfilter-acc-lwrfilter-min-lwrspecify_acclwAccumulated likelihood weightperl$likelihood_weight eq "filter-acc-lwr" perl"--filter-acc-lwr $specify_acclw"specify_minlwMinimum likelihood weightperl$likelihood_weight eq "filter-min-lwr" perl"--filter-min-lwr $specify_minlw"0.01specify_minplaceMinimum number of placements per sequence to include (--filter-min)perl(defined $specify_minplace) ? "--filter-min $specify_minplace":"" 1specify_maxplaceMaximum number of placements per sequence to include (--filter-max)perl(defined $specify_maxplace) ? "--filter-max $specify_maxplace":"" 7specify_precisionOutput decimal point precision for floating point (--precision)perl(defined $specify_precision) ? 
"--precision $specify_precision":"" 10preserve_rootingPreserve rooting (--preserve-rooting)perl($value) ? "--preserve-rooting on":"--preserve-rooting off" Preserve the rooting of rooted trees. When disabled, EPA-ng will print the result as an unrooted tree.