GeneMark on ACCESS
4.371
Novel eukaryotic genomes can be analyzed by the self-training GeneMark-ES.
Alexandre Lomsadze, Vardges Ter-Hovhannisyan, Yury O. Chernoff, Mark Borodovsky
GeneMark-ES: Gene identification in novel eukaryotic genomes by self-training algorithm. Nucl Acids Res (2005) 33, 6494-6506
Assembly / Genecalling
genemark_xsede
infile_fasta
Input File (must be in fasta format)
perl "--sequence infile.fa"
99
infile.fa
genemark_invocation
perl "genemark_4.371_expanse"
0
genemark_scheduler1
scheduler.conf
perl $specify_cores < 128
perl
"threads_per_process=$specify_cores\\n" .
"mem=" . ($specify_cores * 2) . "G\\n" .
"node_exclusive=0\\n" .
"nodes=1\\n"
genemark_scheduler2
scheduler.conf
perl $specify_cores > 64
perl
"threads_per_process=$specify_cores\\n" .
"mem=243G\\n" .
"node_exclusive=1\\n" .
"nodes=1\\n"
specify_cores2
scheduler.conf
perl --cores $specify_cores
3
all_output
All output
*
runtime
1
scheduler.conf
Maximum Hours to Run (click here for help setting this correctly)
perl "runhours=$value\\n"
0.25
Maximum Hours to Run must be 168 or less
perl $runtime > 168.0
Maximum Hours to Run must be at least 0.1
perl $runtime < 0.1
The job will run on 1 processor as configured. If it runs for the entire configured time, it will consume $specify_cores x $runtime cpu hours
perl $runtime ne 0 && $specify_cores == 1
The job will run on 2 processors as configured. If it runs for the entire configured time, it will consume $specify_cores x $runtime cpu hours
perl $runtime ne 0 && $specify_cores == 2
The job will run on 4 processors as configured. If it runs for the entire configured time, it will consume $specify_cores x $runtime cpu hours
perl $runtime ne 0 && $specify_cores == 4
The job will run on 8 processors as configured. If it runs for the entire configured time, it will consume $specify_cores x $runtime cpu hours
perl $runtime ne 0 && $specify_cores == 8
The job will run on 16 processors as configured. If it runs for the entire configured time, it will consume $specify_cores x $runtime cpu hours
perl $runtime ne 0 && $specify_cores == 16
The job will run on 32 processors as configured. If it runs for the entire configured time, it will consume $specify_cores x $runtime cpu hours
perl $runtime ne 0 && $specify_cores == 32
The job will run on 64 processors as configured. If it runs for the entire configured time, it will consume $specify_cores x $runtime cpu hours
perl $runtime ne 0 && $specify_cores == 64
The job will run on 128 processors as configured. If it runs for the entire configured time, it will consume $specify_cores x $runtime cpu hours
perl $runtime ne 0 && $specify_cores == 128
Estimate the maximum time your job will need to run. We recommend testing initially with a run of 0.5 h or less, because jobs set for 0.5 h or less dependably run immediately in the "debug" queue.
Once you are sure the configuration is correct, you can then increase the time. The reason is that jobs longer than 0.5 h are submitted to the "normal" queue, where jobs configured for one or a few hours may
run sooner than jobs configured for the full 168 hours.
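The scheduler settings above can be summarized in a short Python sketch. It is only an illustration of the rules stated in this definition, not the gateway's actual code: the function names, the order of the lines in scheduler.conf, and the use of real newlines are assumptions. It also assumes the core count is one of the offered values, so that the two conditions ($specify_cores < 128 and $specify_cores > 64) never apply at the same time.

```python
def build_scheduler_conf(specify_cores: int, runtime: float) -> str:
    """Return scheduler.conf text for a core count and a run time in hours."""
    # Run-time limits taken from the validation rules in the definition.
    if runtime > 168.0:
        raise ValueError("runtime exceeds 168 hours")
    if runtime < 0.1:
        raise ValueError("runtime is below 0.1 hours")

    if specify_cores < 128:
        # Shared node: 2 GB of memory per requested core.
        mem, exclusive = f"{specify_cores * 2}G", 0
    else:
        # Full node (128 cores): fixed memory, exclusive access.
        mem, exclusive = "243G", 1

    lines = [
        f"runhours={runtime}",
        f"threads_per_process={specify_cores}",
        f"mem={mem}",
        f"node_exclusive={exclusive}",
        "nodes=1",
    ]
    return "\n".join(lines) + "\n"


def max_cpu_hours(specify_cores: int, runtime: float) -> float:
    """Upper bound on consumption: cores x configured hours."""
    return specify_cores * runtime


def expected_queue(runtime: float) -> str:
    """Jobs of 0.5 h or less go to "debug", longer jobs to "normal"."""
    return "debug" if runtime <= 0.5 else "normal"


if __name__ == "__main__":
    print(build_scheduler_conf(8, 0.25), end="")
    print(max_cpu_hours(8, 0.25), expected_queue(0.25))
```

For example, 8 cores with the default 0.25 hours would request 16G on a shared node and consume at most 2 cpu hours.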
specify_cores
How many cores?
1 2 4 8 16 32 64 128
perl "--cores $value"
specify_analysis
Which analysis?
EP ES ET
ES "--ES"
ET "--ET RNAseq_alignment.gff"
EP ""
ES
5
specify_etfile
Specify a file with intron coordinates
perl $specify_analysis eq "ET"
RNAseq_alignment.gff
specify_etscore
Minimum score of the intron for ET?
perl $specify_analysis eq "ET"
perl "--etscore $value"
10
5
specify_epfile1
Specify a file with a protein database in FASTA format
perl $specify_analysis eq "EP"
protein_db.fa
perl (defined $specify_epfile1) ? "--EP protein_db.fa" : ""
specify_epfile2
Specify a file with intron coordinates
perl $specify_analysis eq "EP"
protein_splice_alignment.gff
perl (defined $specify_epfile2) ? "--EP protein_splice_alignment.gff" : ""
specify_epscore
Minimum score of the intron for EP?
perl $specify_analysis eq "EP"
perl "--ep_score $value"
4
Please specify a value for ep_score
perl $specify_analysis eq "EP" && !defined $specify_epscore
5
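How the analysis parameters above turn into command-line fragments can be sketched as follows. This is a hedged Python illustration, not the gateway's implementation: the function name, its argument names, and the order of the fragments are hypothetical (the real order is presumably governed by the group numbers), and the option strings are copied from this definition as written rather than checked against the GeneMark command line.

```python
from typing import List, Optional


def analysis_args(specify_analysis: str = "ES",
                  specify_cores: int = 1,
                  etscore: int = 10,
                  epscore: Optional[int] = 4,
                  have_epfile1: bool = False,
                  have_epfile2: bool = False) -> List[str]:
    """Assemble the option fragments selected by the form values."""
    args = ["--sequence", "infile.fa", "--cores", str(specify_cores)]

    if specify_analysis == "ES":
        # Self-training only: no extra input files.
        args.append("--ES")
    elif specify_analysis == "ET":
        # Intron coordinates from RNA-seq alignments plus a minimum score.
        args += ["--ET", "RNAseq_alignment.gff", "--etscore", str(etscore)]
    elif specify_analysis == "EP":
        # The definition rejects an EP run without an ep_score value.
        if epscore is None:
            raise ValueError("Please specify a value for ep_score")
        if have_epfile1:
            args += ["--EP", "protein_db.fa"]
        if have_epfile2:
            args += ["--EP", "protein_splice_alignment.gff"]
        args += ["--ep_score", str(epscore)]
    else:
        raise ValueError("analysis must be one of EP, ES, ET")
    return args


if __name__ == "__main__":
    print(" ".join(analysis_args("ET", specify_cores=8)))
```

With the defaults (ES on one core), the sketch yields only the sequence, core count, and --ES fragments.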