PICARD on XSEDE 2.2.8
Tools for manipulating high-throughput sequencing (HTS) data and formats
http://broadinstitute.github.io/picard/
Category: Assembly: Assemble_reads
Tool id: picard_xsede

Command blocks

  picard_sortsam (group 1)
    runs if (perl): $run_sortsam
    format (perl): "picard_expanse picard SortSam VALIDATION_STRINGENCY=LENIENT MAX_RECORDS_IN_RAM=7500000 INPUT=A1.sam OUTPUT=A1.bam.sorted.bam $select_sortorder"

  picard_sortsam_and (group 2)
    runs if (perl): $run_sortsam && $run_markduplicates
    format (perl): "&&"

  picard_sammarkduplicates (group 3)
    runs if (perl): $run_sortsam && $run_markduplicates
    format (perl): "picard_expanse picard MarkDuplicates INPUT=A1.bam.sorted.bam OUTPUT=A1.bam.sorted_marked.bam METRICS_FILE=metrics.txt OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 CREATE_INDEX=true TMP_DIR=/tmp"

  picard_markduplicates (group 3)
    runs if (perl): !$run_sortsam && $run_markduplicates
    format (perl): "picard_expanse picard MarkDuplicates INPUT=A1.bam.sorted.bam OUTPUT=A1.bam.sorted_marked.bam METRICS_FILE=metrics.txt OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 CREATE_INDEX=true TMP_DIR=/tmp"

  picard_readgroups_and (group 4)
    runs if (perl): $run_addreadgroups && $run_markduplicates
    format (perl): "&&"

  picard_addreadgroups (group 5)
    runs if (perl): $run_addreadgroups
    format (perl): "picard_expanse picard AddOrReplaceReadGroups I=A1.bam.sorted_marked.bam O=A1.bam.sorted_marked_readgroups.bam TMP_DIR=tmp SORT_ORDER=coordinate RGID=$sample RGLB=$sample RGPL=illumina RGPU=$sample RGSM=$sample CREATE_INDEX=True VALIDATION_STRINGENCY=LENIENT"

Parameters

  infile -- Input fasta file
    example: A1.sam

  picard_scheduler -- scheduler.conf (group 0)
    format (perl):
      "threads_per_process=1\\n" .
      "node_exclusive=0\\n" .
      "mem=2G\\n" .
      "nodes=1\\n"
  allresults -- all result files
    filenames: *

  runtime -- Maximum Hours to Run (click here for help setting this correctly) (group 1, written to scheduler.conf)
    default: 0.25
    help: Estimate the maximum time your job will need to run. We recommend testing initially with a < 0.5 h run, because jobs set for 0.5 h or less dependably run immediately in the "debug" queue.
    Once you are sure the configuration is correct, you can then increase the time. The reason is that jobs > 0.5 h are submitted to the "normal" queue, where jobs configured for one or a few hours may
    run sooner than jobs configured for the full 168 hours.
    error if (perl) $runtime > 168.0: Maximum Hours to Run must be less than 168
    error if (perl) $runtime < 0.1: Maximum Hours to Run must be greater than 0.1
    error if (perl) $run_addreadgroups && $run_sortsam && !$run_markduplicates: Sorry, you cannot run SortSam and AddReadGroups unless you also run MarkDuplicates
    note if (perl) defined $runtime: The job will run on 1 processor as configured. If it runs for the entire configured time, it will consume 1 x $runtime CPU hours.
    format (perl): "runhours=$value\\n"

  run_sortsam -- Run SortSam command
    default: 1

  select_sortorder -- Sort Order of Output File
    options: unsorted, queryname, coordinate, duplicate, unknown
    default: coordinate
    format (perl): ($value ne $vdef) ? "SO=$value" : ''

  run_addreadgroups -- Run AddReadgroups command
    default: 1

  mark_duplicates -- Mark Duplicates (parameter group)

  run_markduplicates -- Run MarkDuplicates command
    default: 1

  infile2 -- Select Input for MarkDuplicates
    shown if (perl): $run_markduplicates && !$run_sortsam
    example: markdups.bam
    error if (perl) !defined $infile2 && $run_markduplicates && !$run_sortsam: Please enter a file for the MarkDuplicates stage
    note if (perl) $run_markduplicates && !$run_sortsam: To use the MarkDuplicates command without SortSam, the input file must be "coordinate" sorted

  infile3 -- Select Input for Addreadgroups command
    shown if (perl): $run_addreadgroups && !$run_markduplicates && !$run_sortsam
    example: A1.bam.sorted_marked.bam
    error if (perl) !defined $infile3 && $run_addreadgroups && !$run_markduplicates && !$run_sortsam: Please enter a file for the AddReadGroups stage

  outfile -- Outfile Name
    help: This is not the fasta file. This is the sam/bam file.

  sample -- sample name
    help: Name of the sample (e.g. A3)
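The following is a minimal, illustrative Perl sketch of how these pieces could be strung together when all three stages are enabled. It is not the gateway's actual code; it simply reuses the toggle names, command strings, and group order documented above, and the sample name A3 comes from the example given for the sample parameter.

  #!/usr/bin/env perl
  # Illustrative only: mimics concatenating the command blocks above in group
  # order, with "&&" separators emitted only when consecutive stages are enabled.
  use strict;
  use warnings;

  # Stage toggles (all default to 1 in the interface).
  my $run_sortsam        = 1;
  my $run_markduplicates = 1;
  my $run_addreadgroups  = 1;

  # Example sample name from the "sample" parameter (e.g. A3).
  my $sample = 'A3';

  # select_sortorder contributes "SO=<value>" only when it differs from the
  # default of "coordinate", so with the default it is empty.
  my $select_sortorder = '';

  my @blocks;
  push @blocks, "picard_expanse picard SortSam VALIDATION_STRINGENCY=LENIENT "
              . "MAX_RECORDS_IN_RAM=7500000 INPUT=A1.sam OUTPUT=A1.bam.sorted.bam $select_sortorder"
      if $run_sortsam;                                              # group 1
  push @blocks, "&&" if $run_sortsam && $run_markduplicates;        # group 2
  push @blocks, "picard_expanse picard MarkDuplicates INPUT=A1.bam.sorted.bam "
              . "OUTPUT=A1.bam.sorted_marked.bam METRICS_FILE=metrics.txt "
              . "OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 CREATE_INDEX=true TMP_DIR=/tmp"
      if $run_markduplicates;                                       # group 3
  push @blocks, "&&" if $run_addreadgroups && $run_markduplicates;  # group 4
  push @blocks, "picard_expanse picard AddOrReplaceReadGroups I=A1.bam.sorted_marked.bam "
              . "O=A1.bam.sorted_marked_readgroups.bam TMP_DIR=tmp SORT_ORDER=coordinate "
              . "RGID=$sample RGLB=$sample RGPL=illumina RGPU=$sample RGSM=$sample "
              . "CREATE_INDEX=True VALIDATION_STRINGENCY=LENIENT"
      if $run_addreadgroups;                                        # group 5

  # Print the assembled one-line pipeline command.
  print join(" ", @blocks), "\n";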