DIFFSEQ

Description: Find differences between nearly identical sequences (EMBOSS)
Category: alignment:differences
Documentation: http://bioweb.pasteur.fr/docs/EMBOSS/diffseq.html
Command: diffseq
Initialization: &emboss_init;

Input section (input)

  asequence
    Prompt: asequence -- any [single sequence] (-asequence)
    Command-line template (perl): " -asequence=$value"
    Group: 1
    Sequence type: any
    Sequence formats: 2 4 14
    Pipe: seqfile (perl: 1)

  bsequence
    Prompt: bsequence [single sequence] (-bsequence)
    Command-line template (perl): " -bsequence=$value"
    Group: 2
    Sequence type (acd): @($(acdprotein) ? stopprotein : nucleotide)
    Sequence formats: 2 4 14

Required section (required)

  wordsize
    Prompt: Word size (-wordsize)
    Default: 10
    Command-line template (perl): " -wordsize=$value"
    Group: 3
    Minimum: 2
    Comment: The similar regions between the two sequences are found by
      creating a hash table of subsequences of length 'wordsize'. 10 is a
      reasonable default. Making this value larger (20?) may speed up the
      program slightly, but any two differences within 'wordsize' of each
      other will then be grouped as a single region of difference. Making
      this value smaller (4?) improves the resolution of nearby
      differences, but the program will run much more slowly.

Output section (output)

  outfile
    Prompt: outfile (-outfile)
    Default: outfile.out
    Command-line template (perl): " -outfile=$value"
    Group: 4

  afeatout
    Prompt: afeatout (-afeatout)
    Default (acd): $asequence.name.diffgff
    Command-line template (perl): " -afeatout=$value"
    Group: 6
    Comment: File for output of first sequence's features

  afeatout_offormat
    Prompt: Feature output format (-offormat)
    Command-line template (perl): ($value)? " -offormat=$value" : ""
    Values: embl, gff, swiss, pir, nbrf
    Default: gff
    Group: 6

  bfeatout
    Prompt: bfeatout (-bfeatout)
    Default (acd): $bsequence.name.diffgff
    Command-line template (perl): " -bfeatout=$value"
    Group: 7
    Comment: File for output of second sequence's features

  bfeatout_offormat
    Prompt: Feature output format (-offormat)
    Command-line template (perl): ($value)? " -offormat=$value" : ""
    Values: embl, gff, swiss, pir, nbrf
    Default: gff
    Group: 7

  columns
    Prompt: Output in columns format (-columns)
    Default: 0
    Command-line template (perl): ($value)? " -columns" : ""
    Group: 8
    Comment: By default, the output report file has several lines per
      difference, giving the sequence positions, sequences and features.
      If this option is set to true, the report is written as a set of
      columns instead, and no feature information is given.

  auto
    Command-line template (perl): " -auto -stdout"
    Group: 9
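
The word-size trade-off described for -wordsize can be illustrated with a small sketch. This is not the EMBOSS implementation: the function names (word_index, shared_words, difference_regions) are hypothetical, the example sequences are made up, and the sketch assumes the two sequences differ only by substitutions, so only words found at the same position in both sequences are counted as matches. It shows a hash table of fixed-length words being used to mark matching regions, and how a larger word size merges two nearby differences into one region.

```python
def word_index(seq, wordsize):
    """Hash table: each wordsize-long subsequence -> list of start positions."""
    index = {}
    for i in range(len(seq) - wordsize + 1):
        index.setdefault(seq[i:i + wordsize], []).append(i)
    return index


def shared_words(a, b, wordsize):
    """Return (pos_in_a, pos_in_b) pairs where the same word occurs in both."""
    index = word_index(a, wordsize)
    hits = []
    for j in range(len(b) - wordsize + 1):
        for i in index.get(b[j:j + wordsize], []):
            hits.append((i, j))
    return hits


def difference_regions(a, b, wordsize):
    """Return (start, end) runs of positions in a not covered by any match.

    Simplifying assumption (not taken from diffseq itself): only hits at the
    same position in both sequences count, i.e. substitutions but no indels.
    """
    covered = [False] * len(a)
    for i, j in shared_words(a, b, wordsize):
        if i == j:
            for k in range(i, i + wordsize):
                covered[k] = True

    regions = []
    start = None
    for k, is_covered in enumerate(covered):
        if not is_covered and start is None:
            start = k                      # a difference region opens here
        elif is_covered and start is not None:
            regions.append((start, k - 1)) # the region closed at k - 1
            start = None
    if start is not None:
        regions.append((start, len(a) - 1))
    return regions


# Two made-up sequences that differ at positions 8 and 13 (0-based).
a = "GATTACAGGCTTAACGTCCA"
b = "GATTACAGTCTTAGCGTCCA"

print(difference_regions(a, b, 4))  # [(8, 8), (13, 13)]
print(difference_regions(a, b, 6))  # [(8, 13)]
```

With a word size of 4, the four matching bases between the two substitutions form a word of their own, so the differences are reported separately. With a word size of 6, no matching word fits between them and they are reported as a single region, which is the behaviour the -wordsize comment warns about.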