Section 9-4: Comparison Programs

Programs may distinguish between the


Subsection 9.4.1

Two Sequences of Similar Length

Best suited for comparing homologous sequences from different species, or similar sequences with approximately the same length:

% gap
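To make the idea of an end-to-end (global) comparison concrete, here is a minimal Python sketch of a Needleman-Wunsch style alignment score. It is not the gap program's implementation: the function name and the match, mismatch, and gap values are assumptions chosen for illustration, and a single linear gap penalty stands in for whatever gap weighting the real program applies.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of a and b (illustrative scoring values)."""
    # prev[j] is the best score for aligning the already processed prefix of a with b[:j].
    # Before any character of a is used, b[:j] can only be aligned against j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Aligning a[:i] against an empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # a[i-1] aligned against a gap
            left = curr[j - 1] + gap  # b[j-1] aligned against a gap
            curr.append(max(diag, up, left))
        prev = curr
    # Every position of both sequences is accounted for in the final cell.
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))  # three matches and one gap
```

Because both sequences must be aligned over their full length, every unmatched end is paid for with gaps, which is why this style of comparison fits sequences of roughly equal length.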


Subsection 9.4.2

Two Sequences of Different Length

Best suited for comparing sequences discovered in searches, or sequences that share local regions of homology rather than similarity over their whole length.

% bestfit
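For sequences of different length, only the best matching region matters. The following Python sketch shows a Smith-Waterman style local alignment score as an illustration of that idea. It is a simplified stand-in, not the bestfit program itself: the function name and the scoring values are assumptions, and a linear gap penalty is used for brevity.

```python
def local_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best local alignment between a and b (illustrative scoring values)."""
    best = 0
    # prev[j] is the best score of a local alignment ending at the previous row and column j.
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Flooring at 0 lets an alignment start anywhere, so flanking
            # regions that do not match are ignored instead of penalised.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


if __name__ == "__main__":
    # Only the shared "CGT" segment contributes to the score.
    print(local_alignment_score("AAAACGTAAAA", "CCCCGTCCC"))
```

The only differences from a global score are the floor at zero and taking the maximum over the whole matrix, which is what makes the result independent of the unmatched ends.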


Subsection 9.4.3

DNA and Protein Sequences

Best suited for comparing sequences discovered in searches, or sequences with site homology and a suspected reading-frame shift.

% framealign
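Why a frame shift matters when DNA is compared with protein can be seen from a small Python sketch that translates one DNA string in the three forward reading frames. This is only an illustration and says nothing about how framealign works internally: the function names are hypothetical, the standard genetic code is assumed, and unknown codons are rendered as "X" by choice.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [x + y + z for x in BASES for y in BASES for z in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(dna, frame=0):
    """Translate dna starting at offset frame (0, 1, or 2); '*' marks a stop codon."""
    dna = dna.upper()
    protein = []
    # Step through complete codons only; trailing bases are dropped.
    for i in range(frame, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


def three_frame_translation(dna):
    """Return the translations of the three forward reading frames."""
    return [translate(dna, frame) for frame in range(3)]


if __name__ == "__main__":
    print(three_frame_translation("ATGGCCTAA"))  # ['MA*', 'WP', 'GL']
```

A shift of a single base gives a completely different protein, so a DNA sequence containing a frame shift only matches a protein well if the alignment is allowed to change frame.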


Subsection 9.4.4

Programs to Display Two Aligned Sequences, Text

publish displays alignments (DNA or protein) as formatted text.


Subsection 9.4.5

Programs to Display Two Aligned Sequences, Graphics

Best suited for visualising overlaps or regions of homology. Requires graphics: if you work with GCG locally, remember to configure the graphics environment correctly with setplot first. On X-Windows setups, the DISPLAY environment variable must be set correctly.

% gap -out

or

% bestfit -out

Next, display the graphics using the ".out" files generated by the commands given above.

% gapshow


Subsection 9.4.6

Significance Evaluation

Preparation of Data

Randomisation during Alignment

Best suited for estimating whether the alignment produced is statistically significant. Use considerably more than the default 10 randomisations (try at least 50).

% gap -ran=50

or

% bestfit -ran=50
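The reasoning behind randomisation can be sketched in Python: score the real pair, then score the first sequence against shuffled copies of the second (same length and composition), and compare the real score with the mean and standard deviation of the shuffled scores. Everything here is an assumption for illustration, including the function names, the simple scoring scheme, the choice to shuffle the second sequence, and the fixed seed. It does not reproduce the output of gap or bestfit.

```python
import random
import statistics


def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score with a linear gap penalty (illustrative values)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def randomisation_test(a, b, n=50, seed=0):
    """Return (observed score, mean shuffled score, standard deviation of shuffled scores)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = global_score(a, b)
    letters = list(b)
    shuffled_scores = []
    for _ in range(n):
        # Shuffling keeps length and composition but destroys the residue order.
        rng.shuffle(letters)
        shuffled_scores.append(global_score(a, "".join(letters)))
    return observed, statistics.mean(shuffled_scores), statistics.stdev(shuffled_scores)


if __name__ == "__main__":
    observed, mean, sd = randomisation_test("ACGTTGCAAGCT", "ACGTTGCAAGCT")
    print(observed, mean, sd)
```

An observed score several standard deviations above the shuffled mean suggests the alignment reflects more than base or residue composition. With only 10 shuffles the mean and standard deviation are themselves poorly estimated, which is the reason for raising the number of randomisations.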



================================= Begin Exercise 10

Pairwise sequence analysis: Understand the use of comparison matrices in the alignment procedure for protein sequences. Apply different algorithms to the sequences obtained from DNA after translation, and evaluate the significance of the result at both the DNA and the protein level.

From the previous exercises, you should now have two DNA sequences: my1.seq, the typed-in sequence, and my2.seq, the DNA sequence of the reading frame extracted in the seqed exercise. Compare these two sequences at both the DNA and the protein level. If you have not yet translated the sequences to protein, do so now.

To solve this problem, follow this schedule:

================================= End Exercise 10

