Genetree on XSEDE

Version 8.3
Estimation of mutation, migration and growth rates, and ancestral inference
Bob Griffiths
R. C. Griffiths, S. Tavaré. Simulating probability distributions in the coalescent. Theor. Popn. Biol., 46, 131-159, 1994.
Phylogeny / Alignment
genetree_xsede

infile Input sequence file haplotype_dat
genetree_invocation perl "/expanse/projects/ngbt/opt/comet/genetree/gtree/genetree haplotype_dat -z" 0
genetree_invocation2 perl

"&& /expanse/projects/ngbt/opt/comet/genetree/gtree/genetree haplotype_dat.# $theta_val $run_val $seed_val -A mutation_age.# -M genetree_temp.#"

2
genetree_invocation3 perl "; wait ; chmod 770 ./output.sh ; wait ; sleep 2 ; convert_parallel.sh output.sh" 95
genetree_invocation4 perl "; wait ; sleep 2 ; ./output.sh" 99
genetree_scheduler scheduler.conf perl $run_val < 3000 perl


									"threads_per_process=1\\n" .
									"mem=5G\\n" .
									"node_exclusive=0\\n" .
									"nodes=1\\n"

genetree_scheduler2 scheduler.conf perl $run_val > 2999 perl


									"threads_per_process=18\\n" .
									"mem=36G\\n" .
									"node_exclusive=0\\n" .
									"nodes=1\\n"
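The two scheduler blocks above choose resources from the number of runs: fewer than 3000 runs get 1 thread and 5G of memory, 3000 or more get 18 threads and 36G. A minimal Python sketch of that selection follows. The function name `scheduler_conf` is hypothetical, and the sketch assumes that the escaped `\\n` sequences end up as ordinary line breaks in scheduler.conf.

```python
def scheduler_conf(run_val):
    """Return scheduler.conf text for a given number of runs (illustrative sketch)."""
    if run_val < 3000:
        # Mirrors the genetree_scheduler block: $run_val < 3000
        threads, mem = 1, "5G"
    else:
        # Mirrors the genetree_scheduler2 block: $run_val > 2999
        threads, mem = 18, "36G"
    # Assumption: each "\\n" in the source becomes a real newline in the file.
    return (
        f"threads_per_process={threads}\n"
        f"mem={mem}\n"
        "node_exclusive=0\n"
        "nodes=1\n"
    )


print(scheduler_conf(500), end="")
```

For integer run counts the two source conditions (`< 3000` and `> 2999`) are complementary, which is why a single if/else is enough here.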

all_output All output *

runtime 1 scheduler.conf Maximum Hours to Run (click here for help setting this correctly) perl "runhours=$value\\n" 0.25
    Maximum Hours to Run must be less than 168 perl $runtime > 168.0
    Maximum Hours to Run must be greater than 0.1 perl $runtime < 0.1
    The job will run on 1 processor as configured. If it runs for the entire configured time, it will consume $numthreads x $runtime cpu hours perl $runtime ne 0
    Estimate the maximum time your job will need to run. We recommend testing initially with a run of less than 0.5 h, because jobs set for 0.5 h or less dependably run immediately in the "debug" queue. Once you are sure the configuration is correct, increase the time. Jobs longer than 0.5 h are submitted to the "normal" queue, where jobs configured for one or a few hours may run sooner than jobs configured for the full 168 hours.

theta_val Specify a value for theta 3
run_val Specify the number of runs 3
seed_val Specify a seed value 3
make_batchfile Make a batch file (-b) 4 perl ($value) ? "-b":""
multifile_sum Multiple file summary (-l) 4 perl ($value) ? "-l":""
run_allrootedtrees Provide a migration rate file (matrix of rates, diagonals zero) (-m) 5 infile.mig perl (defined $value) ? "-m infile.mig":""
allow_popsizes Provide subpopulation relative size file (array of rates) (-p) 6 pop.dat perl (defined $value) ? "-p pop.dat":""
num_subpops Number of subpopulations (default - number in tree_file) (-s) 7 perl (defined $value) ? "-s $value":""
num_segsites Number of segregating sites + 1 (-Z) 7 perl (defined $value) ? "-Z $value":""
max_eventspersim Maximum events in one simulation run (default - 500) (-x) 8 pop.dat perl (defined $value) ? "-x $value":""
max_typesanc Maximum types in the ancestry of the sample (-y) 8 pop.dat perl (defined $value) ? "-y $value":""
output_level Output Level (-o) 8 perl (defined $value) ? "-o $value":""
    1 likelihood
    2 show tree
    3 TMRCA
    4 age of mutations
    5 MRCA distribution in sub-populations
    6 Mutation distribution in sub-population
    7 Subpopulation TMRCA
surface_outfile Surface outfile_name (g m theta likelihood sd_like) (-f) 9 surf.file perl (defined $value) ? "-f surf.file":""
theta_1 Theta value 1 (-g) 9 perl (defined $value) ? "-g $theta_1 $theta_2 $surface_1 $surface_2":""
theta_2 Theta value 2
surface_1 Surface point 1
surface_2 Surface point 2
outfile_name_val Specify the output file name perl "> output.sh" 10
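The parameter definitions above follow two patterns: range checks on the run time, and Perl conditionals that emit a command-line flag only when a value is true or defined. The Python sketch below illustrates both. The function names `validate_runtime` and `optional_flags` are hypothetical, and the flag ordering (by the group number attached to each parameter) is an assumption about how the fragments are concatenated, not something the definitions state.

```python
def validate_runtime(runtime):
    """Return the error messages triggered by a run time given in hours."""
    errors = []
    if runtime > 168.0:
        errors.append("Maximum Hours to Run must be less than 168")
    if runtime < 0.1:
        errors.append("Maximum Hours to Run must be greater than 0.1")
    return errors


def optional_flags(params):
    """Assemble the optional genetree flags from a dict of parameter values."""
    parts = []
    # Switches use `($value) ? ...`: emitted when the value is true.
    for name, flag in (("make_batchfile", "-b"), ("multifile_sum", "-l")):
        if params.get(name):
            parts.append(flag)
    # File parameters use `(defined $value) ? ...` with a fixed file name.
    for name, text in (("run_allrootedtrees", "-m infile.mig"),
                       ("allow_popsizes", "-p pop.dat")):
        if params.get(name) is not None:
            parts.append(text)
    # Value parameters pass the value through when it is defined.
    for name, flag in (("num_subpops", "-s"), ("num_segsites", "-Z"),
                       ("max_eventspersim", "-x"), ("max_typesanc", "-y"),
                       ("output_level", "-o")):
        if params.get(name) is not None:
            parts.append(f"{flag} {params[name]}")
    if params.get("surface_outfile") is not None:
        parts.append("-f surf.file")
    # -g is keyed on theta_1 only; this sketch assumes the other three
    # values are supplied with it and raises KeyError otherwise.
    if params.get("theta_1") is not None:
        parts.append("-g {theta_1} {theta_2} {surface_1} {surface_2}".format(**params))
    return " ".join(parts)


print(validate_runtime(0.25))
print(optional_flags({"make_batchfile": True, "num_subpops": 2, "output_level": 3}))
```

The distinction between true and defined matters for values such as 0: a switch set to 0 emits nothing, while a defined numeric parameter equal to 0 would still be passed through.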