PICARD on XSEDE
2.3.0
Tools for manipulating high-throughput sequencing (HTS) data and formats
http://broadinstitute.github.io/picard/
Assembly:Assemble_reads
picard_xsede
picard_sortsam
perl
$run_sortsam
perl
"picard_expanse picard SortSam VALIDATION_STRINGENCY=LENIENT MAX_RECORDS_IN_RAM=7500000 INPUT=A1.sam OUTPUT=A1.bam.sorted.bam SO=coordinate"
1
picard_sortsam_and
perl
$run_sortsam && $run_markduplicates
perl
"&&"
2
picard_markduplicates
perl
$run_markduplicates
perl
"picard_expanse picard MarkDuplicates INPUT=A1.bam.sorted.bam OUTPUT=A1.bam.sorted_marked.bam METRICS_FILE=metrics.txt OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 CREATE_INDEX=true TMP_DIR=/tmp"
3
picard_readgroups_and
perl
$run_addreadgroups && $run_markduplicates
perl
"&&"
4
picard_addreadgroups
perl
$run_addreadgroups
perl
"picard_expanse picard AddOrReplaceReadGroups I=A1.bam.sorted_marked.bam O=A1.bam.sorted_marked_readgroups.bam TMP_DIR=tmp SORT_ORDER=coordinate RGID=$sample RGLB=$sample RGPL=illumina RGPU=$sample RGSM=$sample CREATE_INDEX=True VALIDATION_STRINGENCY=LENIENT"
5
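The five fragments above appear to be joined in numeric order (1 to 5), each emitted only when its Perl condition is true. The following is a minimal shell sketch of that assembly, assuming space-separated concatenation. `build_pipeline` is a hypothetical helper, not part of the tool definition, and the three Picard command lines are abbreviated to placeholder tokens.

```shell
# Sketch only: mirrors the precondition of each fragment (groups 1 to 5).
# Arguments: run_sortsam run_markduplicates run_addreadgroups (1 = enabled).
build_pipeline() {
    run_sortsam=$1
    run_markduplicates=$2
    run_addreadgroups=$3
    cmd=""
    # Group 1: SortSam command
    if [ "$run_sortsam" = 1 ]; then
        cmd="SORTSAM"
    fi
    # Group 2: "&&" only when SortSam and MarkDuplicates both run
    if [ "$run_sortsam" = 1 ] && [ "$run_markduplicates" = 1 ]; then
        cmd="$cmd &&"
    fi
    # Group 3: MarkDuplicates command
    if [ "$run_markduplicates" = 1 ]; then
        cmd="${cmd:+$cmd }MARKDUPLICATES"
    fi
    # Group 4: "&&" only when AddReadGroups and MarkDuplicates both run
    if [ "$run_addreadgroups" = 1 ] && [ "$run_markduplicates" = 1 ]; then
        cmd="$cmd &&"
    fi
    # Group 5: AddOrReplaceReadGroups command
    if [ "$run_addreadgroups" = 1 ]; then
        cmd="${cmd:+$cmd }ADDREADGROUPS"
    fi
    printf '%s\n' "$cmd"
}

build_pipeline 1 1 1   # SORTSAM && MARKDUPLICATES && ADDREADGROUPS
build_pipeline 1 0 1   # SORTSAM ADDREADGROUPS (no separator between stages)
```

Under this assumption, enabling SortSam and AddReadGroups without MarkDuplicates leaves no `&&` between the two commands, which is consistent with that combination being rejected by a validation rule.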
infile
Input SAM file
A1.sam
picard_scheduler
scheduler.conf
perl
"threads_per_process=1\\n" .
"node_exclusive=0\\n" .
"mem=2G\\n" .
"nodes=1\\n"
0
allresults
*
runtime
1
scheduler.conf
Maximum Hours to Run (click here for help setting this correctly)
0.25
Estimate the maximum time your job will need to run. We recommend testing initially with a run of less than 0.5 h, because jobs set for 0.5 h or less dependably run immediately in the "debug" queue.
Once you are sure the configuration is correct, you can increase the time. Jobs longer than 0.5 h are submitted to the "normal" queue, where jobs configured for one or a few hours may
run sooner than jobs configured for the full 168 hours.
Maximum Hours to Run must be no more than 168
perl
$runtime > 168.0
Maximum Hours to Run must be at least 0.1
perl
$runtime < 0.1
Sorry, you cannot run SortSam and Addreadgroups unless you also run Markduplicates
perl
$run_addreadgroups && $run_sortsam && !$run_markduplicates
The job will run on 1 processor as configured. If it runs for the entire configured time, it will consume 1 x $runtime CPU hours.
perl
defined $runtime
perl
"runhours=$value\\n"
run_sortsam
Run SortSam command
1
run_markduplicates
Run Markduplicates command
1
run_addreadgroups
Run Addreadgroups command
1
infile2
Select Input for Markduplicates
perl
$run_markduplicates && !$run_sortsam
A1.bam.sorted.bam
Please enter a file for the Markduplicates stage
perl
!defined $infile2 && $run_markduplicates && !$run_sortsam
infile3
Select Input for Addreadgroups command
perl
$run_addreadgroups && !$run_markduplicates && !$run_sortsam
A1.bam.sorted_marked.bam
Please enter a file for the Addreadgroups stage
perl
!defined $infile3 && $run_addreadgroups && !$run_markduplicates && !$run_sortsam
sample
Sample name
Name of the sample (e.g. A3)