Cutadapt on XSEDE2.10Remove adapter sequences from high-throughput sequencing readsMarcel MartinMarcel Martin (2011) Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.Journal, 17(1):10-12. http://dx.doi.org/10.14806/ej.17.1.200
Assembly:Assemble_readscutadapt_xsedecutadapt_invokeperl"/projects/ps-ngbt/home/cipres/.local/bin/cutadapt"1infile_placeholder97placeholder.txtinfile_notcompressed2Input fasta file97perl!$compressed_inputperl"input.fastq"infile_compressed297perl$compressed_inputperl"input.fastq.gz"cutadapt_schedulerscheduler.confperl$num_cores <24perl
"threads_per_process=$num_cores\\n" .
"node_exclusive=0\\n" .
"nodes=1\\n"
0cutadapt_scheduler2scheduler.confperl$num_cores == 24perl
"threads_per_process=24\\n" .
"node_exclusive=1\\n" .
"nodes=1\\n"
0allresults*runtime1scheduler.confMaximum Hours to Run (click here for help setting this correctly)0.25Estimate the maximum time your job will need to run. We recommend testing initially with a test run of less than 0.5 h, because jobs set for 0.5 h or less dependably run immediately in the "debug" queue.
Once you are sure the configuration is correct, increase the time. Jobs longer than 0.5 h are submitted to the "normal" queue, where jobs configured for one or a few hours may
run sooner than jobs configured for the full 168 hours.
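The scheduler rules in this file (core count, node exclusivity, run-time limits, queue choice) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the portal's implementation: the function names `scheduler_conf`, `queue_for` and `max_cpu_hours` are hypothetical, and the lower bound of one core is an assumption that the file itself does not state.

```python
MAX_CORES = 24            # the file rejects num_cores > 24
MIN_HOURS = 0.1           # the file rejects runtime < 0.1
MAX_HOURS = 168.0         # the file rejects runtime > 168.0
DEBUG_QUEUE_LIMIT = 0.5   # jobs of 0.5 h or less go to the "debug" queue


def scheduler_conf(num_cores, runtime_hours):
    """Return scheduler.conf text mirroring the rules in this file (hypothetical helper)."""
    if num_cores < 1 or num_cores > MAX_CORES:  # lower bound of 1 is an assumption
        raise ValueError("The number of cores must be less than or equal to 24")
    if runtime_hours > MAX_HOURS:
        raise ValueError("Maximum Hours to Run must be less than 168")
    if runtime_hours < MIN_HOURS:
        raise ValueError("Maximum Hours to Run must be greater than 0.1")
    # A full 24-core request takes the node exclusively; smaller requests share it.
    exclusive = 1 if num_cores == MAX_CORES else 0
    lines = [
        f"threads_per_process={num_cores}",
        f"node_exclusive={exclusive}",
        "nodes=1",
        f"runhours={runtime_hours}",
    ]
    return "\n".join(lines) + "\n"


def queue_for(runtime_hours):
    """Queue a job lands in, per the help text (hypothetical helper)."""
    return "debug" if runtime_hours <= DEBUG_QUEUE_LIMIT else "normal"


def max_cpu_hours(num_cores, runtime_hours):
    """Upper bound on CPU hours consumed: num_cores x runtime."""
    return num_cores * runtime_hours
```

With the defaults in this file (3 cores, 0.25 h), the job would go to the "debug" queue and consume at most 0.75 CPU hours.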
Maximum Hours to Run must be less than 168perl$runtime > 168.0Maximum Hours to Run must be greater than 0.1 perl$runtime < 0.1The job will run on $num_cores processors as configured. If it runs for the entire configured time, it will consume $num_cores x $runtime cpu hoursperldefined $runtimeperl"runhours=$value\\n"num_coresHow many cores?3perl(defined $value) ? "-j $num_cores":""1The number of cores must be less than or equal to 24perl$num_cores > 24paired_endsI have paired end readscompressed_inputMy input is compressedinfile_compressedSelect your first input file (compressed)perl$compressed_input97input.fastq.gzPlease select your first input file (compressed)perl$compressed_input && !defined $infile_compressedinfile_notcompressedSelect your first input file (not compressed)perl!$compressed_input97input.fastqPlease select your first input file (not compressed)perl!$compressed_input && !defined $infile_notcompressedpaired_endfilecomprSelect your second paired end reads file (compressed)99perl$paired_ends && $compressed_inputperl"input2.fastq.gz"input2.fastq.gzPlease select your second paired end reads file (compressed)perl$paired_ends && $compressed_input && !defined $paired_endfilecomprpaired_endfilenotcomprSelect your second paired end reads file (not compressed)99perl!$compressed_input && $paired_endsperl"input2.fastq"input2.fastqPlease select your second paired end reads file (not compressed)perl$paired_ends && !$compressed_input && !defined $paired_endfilenotcomprspecify_3adapterEnter the 3' adapter2perl(defined $value) ? "-a $value":"" Sequence of an adapter ligated to the 3 end (paired data: of the first read). The adapter and subsequent
bases are trimmed. If a $ character is appended (anchoring), the adapter is only found if it is a suffix of the read.specify_3adapterpairedEnter the 3' adapter for paired-end reads (-A)2perl$paired_endsperl(defined $value) ? "-A $value":"" Sequence of an adapter ligated to the 3' end of the second read in a pair. The adapter and subsequent
bases are trimmed. If a $ character is appended (anchoring), the adapter is only found if it is a suffix of the read.specify_5adapterEnter the 5' adapter2perl(defined $value) ? "-g $value":""5_adapterSequence of an adapter ligated to the 5' end (paired data: of the first read). The adapter and any preceding bases are trimmed. Partial matches at the 5'
end are allowed. If a ^ character is prepended (anchoring), the adapter is only found if it is a prefix of the read.specify_5adapterpairedEnter the 5' adapter for paired end reads (-G)2perl$paired_endsperl(defined $value) ? "-G $value":""5_adapterSequence of an adapter ligated to the 5' end of the second read in a pair. The adapter and any preceding bases are trimmed. Partial matches at the 5'
end are allowed. If a ^ character is prepended (anchoring), the adapter is only found if it is a prefix of the read.specify_anywhichwayadapterEnter an adapter that may be 3' or 5'2perl(defined $value) ? "-b $value":""35_adapterSequence of an adapter that may be ligated to the 5' or 3' end (paired data: of the first read). Both types of matches as described under -a and -g are allowed.
If the first base of the read is part of the match, the behavior is as with -g, otherwise as with -a. This option is mostly for rescuing failed library
preparations - do not use if you know which end your adapter was ligated to! specify_anywhichwayadapterpairedEnter an adapter that may be 3' or 5' for paired end reads (-B)2perl$paired_endsperl(defined $value) ? "-B $value":""35_adapterSequence of an adapter that may be ligated to the 5' or 3' end of the second read in a pair. Both types of matches as described under -a and -g are allowed.
If the first base of the read is part of the match, the behavior is as with -g, otherwise as with -a. This option is mostly for rescuing failed library
preparations - do not use if you know which end your adapter was ligated to! specify_maxerrorrateMaximum allowed error rate (-e, --error-rate)9perl($value ne $vdef) ? "-e $specify_maxerrorrate":""0.1Maximum allowed error rate as value between 0 and 1 (no. of errors divided by length of matching region). Default: 0.1 (=10%) only_mismatchesAllow only mismatches in alignments11perl($value) ? "--no-indels":""specify_removenadaptersRemove up to COUNT adapters from each read (-n, --times)13perl($value ne $vdef) ? "-n $specify_removenadapters":""1overlap_minlengthRequire MINLENGTH overlap between read and adapter (--overlap)15perl($value ne $vdef) ? "-O $overlap_minlength":""3Require MINLENGTH overlap between read and adapter for an adapter to be found. (--overlap) Default=3interpret_wildcardsInterpret IUPAC wildcards in reads17perl($value) ? "--match-read-wildcards":""donot_interpretwildcardsDo not interpret IUPAC wildcards in adapters.19perl($value) ? "--no-match-adapter-wildcards":""specify_adapterhandlingWhat to do with found adapters?21trimmasklowercasenoneperl(defined $value) ? "--action $value":""trimcheck_reversecompCheck read AND reverse complement for adapter matches23perl$paired_endsperl($value) ? "--rc":"" Check both the read and its reverse complement for adapter matches. If match is on reverse-complemented
version, output that one. Default: check only readadditional_readmodsAdditional read modificationscut_lengthRemove this many bases from each read (-u)25perl(defined $value) ? "-u $cut_length":""Remove bases from each read (first read only if paired). If LENGTH is positive, remove bases from the
beginning. If LENGTH is negative, remove bases from the end. Can be used twice if LENGTHs have different signs. This is applied *before* adapter trimming.nextseq_trim3' quality cutoff for NextSeq-specific quality trimming (each read; --nextseq-trim).27perl(defined $value) ? "--nextseq-trim=$value":""NextSeq-specific quality trimming (each read). Also trims dark cycles appearing as high-quality G bases. quality_cutoffTrim low-quality bases from 5' and/or 3' ends -q ([5',]3prime).27perl(defined $value) ? "-q $value":""Trim low-quality bases from the 5' and/or 3' ends of each read before adapter removal. Applied to both reads if
data is paired. If one value is given, only the 3' end is trimmed. If two comma-separated cutoffs are given,
the 5' end is trimmed with the first cutoff, the 3' end with the second. quality_baseAssume that quality values in FASTQ are encoded as ascii (quality + N)27perl(defined $value) ? "--quality-base $value":""33Assume that quality values in FASTQ are encoded as ascii(quality + N). This needs to be set to 64 for
some old Illumina FASTQ files. Default: 33 specify_lengthShorten reads to LENGTH (-l, --length)29perl(defined $value) ? "-l $value":""Shorten reads to LENGTH. Positive values remove bases at the end while negative ones remove bases at the
beginning. This and the following modifications are applied after adapter trimming. specify_trimnTrim N's on ends of reads. (--trim-n)29perl($value) ? "--trim-n":""specify_lengthtagSearch for TAG followed by a decimal number (--length-tag)31perl(defined $value) ? "--length-tag $value":""Search for TAG followed by a decimal number in the description field of the read. Replace the decimal
number with the correct length of the trimmed read.For example, use --length-tag 'length=' to correct fields like 'length=123'. specify_stripsuffix Remove this suffix from read names if present (--strip-suffix)31perl(defined $value) ? "--strip-suffix $value":""specify_addprefixAdd this prefix to read names (-x)31perl(defined $value) ? "-x $value":""specify_addsuffixAdd this suffix to read names (-y)31perl(defined $value) ? "-y $value":""specify_negtozeroChange negative quality values to zero (-z)31perl($value) ? "-z":""filtering_processedreadsFiltering of processed readsspecify_discardlengthDiscard reads shorter than LEN (-m; minimum)perl!$paired_ends55perl(defined $value) ? "-m $value":""specify_maxdiscardlengthDiscard reads longer than LEN (-M; maximum)57perl!$paired_endsperl(defined $value) ? "-M $value":""specify_maxnbases Discard reads with more than COUNT 'N' bases. (--max-n)59perl!$paired_endsperl(defined $value) ? "--max-n $value":""Discard reads with more than COUNT 'N' bases. If COUNT is a number between 0 and 1, it is interpreted as a
fraction of the read length. discard_maxerrsDiscard reads whose expected number of errors exceeds ERRORS (--max-expected-errors)61perl!$paired_endsperl(defined $value) ? "--max-expected-errors $value":""Discard reads whose expected number of errors (computed from quality values) exceeds ERRORSdiscard_trimmedDiscard reads that contain an adapter (--discard-trimmed)63perl!$paired_endsperl($value) ? "--discard-trimmed":""Discard reads that contain an adapter. Use also -O to avoid discarding too many randomly matching reads.discard_untrimmedDiscard untrimmed reads (--discard-untrimmed)65perl!$paired_endsperl($value) ? "--discard-untrimmed":""Discard reads that do not contain an adapter.discard_casavaDiscard reads that did not pass CASAVA filtering (--discard-casava)67perl!$paired_endsperl($value) ? "--discard-casava":""Discard reads that did not pass CASAVA filtering (header has :Y:).
filtering_pairedreadsFiltering of paired end readsspecify_paireddiscardlengthDiscard paired end reads shorter than LEN (-m, --minimum-length LEN:LEN )perl$paired_ends55perl(defined $value) ? "-m $value":"" When trimming paired-end reads, the minimum lengths for R1 and R2 can be specified separately by separating them with a colon (:). If the colon syntax is not used, the same minimum length applies to both reads, as discussed above. Also, one of the values can be omitted to impose no restrictions. For example, with -m 17:, the length of R1 must be at least 17, but the length of R2 is ignored.specify_pairedmaxlengthDiscard paired end reads longer than LEN (-M, --maximum-length LEN:LEN )57perl$paired_endsperl(defined $value) ? "-M $value":"" When trimming paired-end reads, the maximum lengths for R1 and R2 can be specified separately by separating them with a colon (:). If the colon syntax is not used, the same maximum length applies to both reads, as discussed above. Also, one of the values can be omitted to impose no restrictions. For example, with -M 100:, the length of R1 must be at most 100, but the length of R2 is ignored.specify_pairedreadfilterWhich reads in a paired-end read have to match the filtering criterion (--pair-filter)67perl$paired_endsanybothfirstperl($value) ? "--pair-filter=$value":""Which of the reads in a paired-end read have to match the filtering criterion in order for it to be filtered. output_optionsOutput optionsspecify_fullreportPrint full report (unchecked gives minimal)81perl($value) ? "--report full":"--report minimal"specify_outputfileWrite trimmed reads to FILE (-o)81perl(defined $value) ? "-o $value":""Please enter a name for the trimmed reads output fileperl!defined $specify_outputfile Write trimmed reads to FILE. FASTQ or FASTA format is chosen depending on input. Summary report is sent to
standard output. Use '{name}' for demultiplexing (see docs). Default: write to standard outputspecify_pairedoutputfileWrite trimmed paired end reads to FILE (-p)perl$paired_ends81perl(defined $value) ? "-p $value":""Please enter a name for the second paired end trimmed reads output fileperl$paired_ends && !defined $specify_pairedoutputfile Write trimmed reads to FILE. FASTQ or FASTA format is chosen depending on input. Summary report is sent to
standard output. Use '{name}' for demultiplexing (see docs). Default: write to standard outputspecify_fastaoutOutput FASTA to standard output even on FASTQ input (--fasta)83perl($value) ? "--fasta":""specify_compressionUse compression level 1 for gzipped output files (-Z)85perl($value) ? "-Z":"" Use compression level 1 for gzipped output files (faster, but uses more space) specify_infofileWrite information about each read and its adapter matches into FILE (--info-file)87perl(defined $value) ? "--info-file $value":""Write information about each read and its adapter matches into FILE. See the documentation for the file format. specify_restfileWrite the rest of the read after a mid-read adapter match into FILE (-r)89perl(defined $value) ? "-r $value":""When the adapter matches in the middle of a read, write the rest (after the adapter) to FILE.specify_tooshortfileWrite reads that are too short into FILE (--too-short-output)89perl(defined $value) ? "--too-short-output $value":""Write reads that are too short (according to length specified by -m) to FILE. Default: discard reads specify_untooshortpairedreadoutfileWrite the second read in a pair to this file if pair is too short (--too-short-paired-output)67perl$paired_endsperl($value) ? "--too-short-paired-output too-short-paired-output.txt":""too-short-paired-output.txtWrite the second read in a pair to this file if pair is too short. Use together with --too-short-output.specify_toolongfileWrite reads that are too long into FILE (--too-long-output)89perl(defined $value) ? "--too-long-output $value":"" Write reads that are too long (according to length specified by -M) to FILE. Default: discard readsspecify_toolongpairedreadoutfileWrite too long second reads to a file (--too-long-paired-output)67perl$paired_endsperl($value) ? "--too-long-paired-output toolong_paired_outfile.txt":""toolong_paired_outfile.txtWrite the second read in a pair to this file if pair is too long. Use together with --too-long-output.
specify_untrimmedfileWrite reads that do not contain any adapter into FILE (--untrimmed-output)89perl(defined $value) ? "--untrimmed-output $value":""Write reads that do not contain any adapter to FILE. Default: output to same file as trimmed readsspecify_untrimmedpairedreadoutfileWrite paired reads that do not contain any adapter into a file (--untrimmed-paired-output)67perl$paired_endsperl($value) ? "--untrimmed-paired-output untrimmed_paired_outfile.txt":""untrimmed_paired_outfile.txt
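
Each parameter above contributes one fragment to the cutadapt command line, and the paired-end fragments are added only when paired-end reads are selected. The sketch below shows that assembly for a small subset of the options (-j, -a, -A, -m, --pair-filter, -o, -p). It is an illustration only: the function `build_cutadapt_args` and its argument names are hypothetical, the adapter sequences in the usage lines are made up, and the ordering of fragments is an assumption rather than the order the portal uses.

```python
def build_cutadapt_args(reads1, out1, *, reads2=None, out2=None,
                        adapter3=None, adapter3_r2=None,
                        min_length=None, pair_filter=None, cores=3):
    """Assemble a cutadapt argument list (hypothetical helper, subset of options)."""
    paired = reads2 is not None
    if paired and out2 is None:
        # Mirrors the file's check that a second output name is required.
        raise ValueError("paired-end input needs a second output file (-p)")

    args = ["cutadapt", "-j", str(cores)]
    if adapter3:
        args += ["-a", adapter3]            # 3' adapter, first read
    if paired:
        if adapter3_r2:
            args += ["-A", adapter3_r2]     # 3' adapter, second read
        if pair_filter:
            args.append(f"--pair-filter={pair_filter}")  # any, both or first
    if min_length is not None:
        args += ["-m", str(min_length)]     # LEN, or LEN:LEN for paired data
    args += ["-o", out1]
    if paired:
        args += ["-p", out2]
    args.append(reads1)
    if paired:
        args.append(reads2)
    return args


# Single-end example with a made-up adapter sequence.
single = build_cutadapt_args("input.fastq.gz", "trimmed.fastq.gz",
                             adapter3="AACCGGTT")

# Paired-end example: R1 must be at least 17 bases, R2 is unrestricted.
paired = build_cutadapt_args("input.fastq.gz", "out1.fastq.gz",
                             reads2="input2.fastq.gz", out2="out2.fastq.gz",
                             adapter3="AACCGGTT", adapter3_r2="ACGTACGT",
                             min_length="17:", pair_filter="any", cores=8)
```

Returning a list rather than a single string keeps each value as its own argument, so it can be passed to `subprocess.run` without shell quoting.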