OUTPUT
The output from FastX is a list file, suitable for input to any GCG program that allows indirect file specifications. (For information about indirect file specification, see Chapter 2, Using Sequence Files and Databases, of the User's Guide.)

Here is some of the output file:

!!SEQUENCE_LIST 1.0

(Nucleotide) FASTX of: singlepass.seq  from: 1 to: 102  September 25, 1998 13:36

A fragment of ggammacod.seq with simulated frameshift errors.
Human fetal beta globins G and A gamma
from Shen, Slightom and Smithies,  Cell 26; 191-203.
Analyzed by Smithies et al. Cell 26; 345-353.

 TO: PIR:*  Sequences:    109,075  Symbols: 34,814,664  Word Size: 2

 Databases searched:
   NBRF, Release 57.0, Released on 30Jun1998, Formatted on 18Aug1998

 Searching with both strands of the query.
 Scoring matrix: GenRunData:Blosum50.Cmp
 Constant pamfactor used
 Gap creation penalty: 15  Gap extension penalty: 2  Frameshift penalty: 20

Histogram Key:
 Each histogram symbol represents 327 search set sequences
 Each inset symbol represents 7 search set sequences
 z-scores computed from opt scores

z-score obs    exp
        (=)    (*)

< 20  1731     0:======
  22     6     0:=
  24    44     0:=
  26    93     5:*
  28   246    49:*
  30   664   300:*==
  32  1804  1160:===*==
  34  4373  3146:=========*====
  36  7914  6461:===================*=====
  38 12508 10677:================================*======
  40 17296 14893:=============================================*=======
  42 19386 18205:=======================================================*====
  44 19567 20082:===========================================================*
  46 19089 20454:===========================================================*
  48 17629 19583:======================================================     *
  50 15834 17869:=================================================     *
  52 14363 15710:============================================    *
  54 12398 13419:======================================   *
  56 10183 11209:================================  *
  58  8834  9202:============================*
  60  7074  7455:======================*
  62  5747  5976:==================*
  64  4660  4753:==============*
  66  3865  3757:===========*
  68  3084  2955:=========*
  70  2187  2316:=======*
  72  1735  1809:=====*
  74  1306  1411:====*
  76  1079  1098:===*
  78   868   853:==*
  80   559   663:==*
  82   467   507:=*
  84   352   402:=*
  86   239   311:*
  88   178   240:*
  90   131   186:*
  92    92   144:*         :==============      *
  94    80   111:*         :============   *
  96    33    86:*         :=====       *
  98    28    67:*         :====     *
 100    27    52:*         :====   *
 102     8    40:*         :==   *
 104    15    31:*         :=== *
 106     5    24:*         :=  *
 108     7    18:*         := *
 110     4    14:*         :=*
 112     7    11:*         :=*
 114     3     9:*         :=*
 116     2     7:*         :*
 118     2     5:*         :*
>120   344     4:*=        :*=======================================

Joining threshold: 36, opt. threshold: 24, opt. width:  16, reg.-scaled

The best scores are:                    init1 initn   opt    z-sc E(217753)..

PIR2:A30213
! hemoglobin epsilon chain - North Am...  130   130   141   223.3  2.7e-05
PIR1:HGMQP
! hemoglobin gamma chain - pig-tailed...  129   129   140   221.8  3.3e-05
PIR1:HGBAY
! hemoglobin gamma chain - yellow baboon  129   129   140   221.8  3.3e-05

//////////////////////////////////////////////////////////////////////////

\\End of List

singlepass.seq
PIR2:A30213

P1;A30213 - hemoglobin epsilon chain - North American opossum
C;Species: Didelphis virginiana, Didelphis marsupialis virginiana (North
 American opossum)
C;Date: 18-Oct-1989 #sequence_revision 18-Oct-1989 #text_change 21-Nov-1997
C;Accession: A30213
R;Koop, B.F.; Goodman, M.
Proc. Natl. Acad. Sci. U.S.A. 85, 3893-3897, 1988 . . .

SCORES   Init1: 130   Initn: 130   Opt: 141   z-score: 223.3 E(): 2.7e-05
Smith-Waterman score: 141;    76.5% identity in 34 aa overlap

                10        20        30       39
singlepass.s LVVYP/WTQRFV\DSFGNLSSASASWATPXVKAH
             ||||| |||||  |||||||||||  ::| ||||
A30213       LVVYP-WTQRFF-DSFGNLSSASAVMGNPKVKAH
                    40         50        60

//////////////////////////////////////////////////////////////////////////

! Distributed over 1 thread.
!      Start time: Fri Sep 25 13:32:20 1998
! Completion time: Fri Sep 25 13:39:08 1998

! CPU time used:
!        Database scan:  0:02:29.6
! Post-scan processing:  0:00:04.5
!       Total CPU time:  0:02:34.1
! Output File: singlepass.fastx

What is the Output?

The first part of the output file contains a histogram showing the distribution of the z-scores between the query and search set sequences. (See the ALGORITHM topic for an explanation of z-score.) The histogram is composed of bins of size 2, each labeled with the higher score for that bin (the leftmost column of the histogram). For example, the bin labeled 24 stores the number of sequence pairs that had z-scores of 23 or 24.

The next two columns of the histogram list the number of z-scores that fell within each bin. The second column lists the number of z-scores observed in the search and the third column lists the number of z-scores that were expected. 
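The bin labelling described above can be sketched in a few lines of Python. This is an illustration only, not part of FastX: the function name bin_label is hypothetical, and the treatment of the two open-ended bins (scores at or below 20 go to "< 20", scores above 118 go to ">120") is an assumption read off the sample histogram rather than documented behavior.

```python
import math


def bin_label(z_score, low=20, high=120, size=2):
    """Return the histogram label for a z-score (hypothetical helper).

    Each bin of width `size` is labeled with its higher score, so both
    23 and 24 land in the bin labeled 24. The limits of the open-ended
    bins are assumptions taken from the sample output.
    """
    label = size * math.ceil(z_score / size)
    if label <= low:
        return f"< {low}"
    if label >= high:
        return f">{high}"
    return str(label)


if __name__ == "__main__":
    for z in (12, 23, 24, 57.4, 223.3):
        print(z, "->", bin_label(z))
```

In the sample search, the three best hits (z-scores above 220) would all be counted in the ">120" bin.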

The body of the histogram displays a graphical representation of the score distributions. Equal signs (=) indicate the number of scores of that magnitude that were observed during the search, while asterisks (*) plot the number of scores of that magnitude that were expected. 
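As a rough illustration of how one row of the histogram body relates to the observed and expected counts, the Python sketch below redraws a row from its two numbers. The scale of 327 sequences per symbol comes from the Histogram Key in the sample output. The rounding rule (round up) and the 60-column limit are inferred from the sample rows and are assumptions, not documented FastX behavior; render_row is a hypothetical name.

```python
def render_row(observed, expected, per_symbol, width=60):
    """Redraw one histogram row from its observed and expected counts.

    Assumptions inferred from the sample output: counts are rounded up
    to whole symbols, the row is at most `width` columns wide, and the
    asterisk overwrites whatever is in its column.
    """
    n_obs = min(-(-observed // per_symbol), width)    # ceiling division
    pos_exp = min(-(-expected // per_symbol), width)  # column of the '*'
    row = ["="] * n_obs + [" "] * max(0, pos_exp - n_obs)
    if pos_exp > 0:
        row[pos_exp - 1] = "*"
    return "".join(row)


if __name__ == "__main__":
    # Rows 30, 32 and 50 of the sample histogram (327 sequences per symbol).
    for obs, exp in ((664, 300), (1804, 1160), (15834, 17869)):
        print(f"{obs:6d} {exp:6d}:{render_row(obs, exp, 327)}")
```

The inset at the bottom right of the sample histogram uses a finer scale (7 sequences per symbol) and is not handled by this sketch.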

At the bottom of the histogram is a list of some of the parameters pertaining to the search. 

Below the histogram, FastX displays a listing of the best scores. Strand:- after the sequence name in this list indicates that the match was found between the search set sequence and the reverse complement of the query sequence.
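If the list of best scores needs to be read by a script, a minimal Python sketch such as the following may help. It is not a GCG utility. It assumes the layout shown in the sample output: a sequence name on one line, then a line starting with "!" that holds the description followed by the init1, initn, opt, z-score and E() columns, with the list ending at the "\\End of List" line. The function name parse_best_scores is hypothetical.

```python
def parse_best_scores(text):
    """Collect the entries of the 'best scores' list (hypothetical helper).

    Assumes each entry is a name line followed by a '!' line whose last
    five whitespace-separated fields are init1, initn, opt, z-score, E().
    """
    hits = []
    in_list = False
    name = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("The best scores are:"):
            in_list = True
            continue
        if not in_list:
            continue
        if stripped.startswith("\\\\End of List"):
            break
        if not stripped or stripped.startswith("/"):
            continue  # blank lines and the row of slashes
        if stripped.startswith("!"):
            fields = stripped[1:].rsplit(None, 5)
            if name is not None and len(fields) == 6:
                desc, init1, initn, opt, z_score, e_value = fields
                hits.append({
                    "name": name,  # kept as written, including any suffix
                    "description": desc.strip(),
                    "init1": int(init1),
                    "initn": int(initn),
                    "opt": int(opt),
                    "z_score": float(z_score),
                    "e_value": float(e_value),
                })
            name = None
        else:
            name = stripped
    return hits


if __name__ == "__main__":
    with open("singlepass.fastx") as handle:  # output file from the sample run
        for hit in parse_best_scores(handle.read()):
            print(hit["name"], hit["opt"], hit["z_score"], hit["e_value"])
```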

Following the list of best scores, FastX displays the alignments of the regions of best overlap between the query and search sequences. /rev following the query sequence name indicates that the search sequence is aligned with the reverse complement of the query sequence. 

This program displays only the region of overlap between the two aligned sequences (plus some residues on either side of the region to provide context for the alignment) unless you use -SHOWall. The display of identities and conservative replacements between the aligned sequences depends on the value of -MARKx. By default (-MARKx=3), the pipe character (|) denotes identities and the colon (:) denotes conservative replacements.
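The default marker line can also be tallied by a script. The Python sketch below counts the pipe and colon characters in the marker line of the sample alignment. It assumes the default -MARKx=3 symbols and that the caller passes only the marker columns spanning the overlap, with leading indentation removed. Dividing identities by the number of alignment columns reproduces the 76.5% reported in the sample, but that this is how FastX computes the figure is an assumption. The name marker_summary is hypothetical.

```python
def marker_summary(marker_line):
    """Count default -MARKx=3 symbols in one marker line (hypothetical helper).

    `marker_line` should contain exactly the columns of the overlap:
    '|' for identities, ':' for conservative replacements, and spaces
    for everything else.
    """
    columns = len(marker_line)
    identities = marker_line.count("|")
    conservative = marker_line.count(":")
    percent = round(100 * identities / columns, 1) if columns else 0.0
    return {
        "columns": columns,
        "identities": identities,
        "conservative": conservative,
        "percent_identity": percent,
    }


if __name__ == "__main__":
    # Marker line of the sample alignment between singlepass.seq and A30213.
    markers = "||||| |||||  |||||||||||  ::| ||||"
    print(marker_summary(markers))
```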