Section 8.6: Protein Tools

There are various tools available which allow you to analyse single protein sequences.


Subsection 8.6.1

Secondary Structure Prediction

Principle

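The task of predicting a secondary or even tertiary structure from the amino acid sequence is known as the folding problem. Unfortunately, no general solution is available at present. Two approaches are in use: prediction from scratch, based on the sequence alone, and modelling by homology to known three-dimensional structures.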

Programs for secondary structure prediction

Remember that predicting secondary structure without reasonable homology to three-dimensional data is rather unreliable. Programs which employ three-dimensional modelling techniques require special hardware (powerful computers) and dedicated software, and are therefore beyond the scope of the BioCompanion.

The programs available to you in the desktop environment will typically be restricted to secondary structure prediction from scratch. To display the secondary structure plots, you need a computer screen capable of displaying graphics; a colour graphics device is recommended. If you work with GCG locally, remember to set up the graphics environment correctly with setplot. Under X Windows, the DISPLAY environment variable must be set correctly.

To display several measures of secondary structure, use

% pepplot

To generate a table of several measures (with a comparison of Garnier-Robson and Chou-Fasman predictions), use

% peptidestructure

The generated output file can be plotted "two-dimensionally", but for serious inspection the one-dimensional plot is recommended (choose the corresponding menu option):

% plotstructure


Subsection 8.6.2

Visualisation of Secondary Structure

If you assume that a protein fragment adopts a helical structure, you can visualise it with the program helicalwheel.
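As a rough illustration of what a helical wheel represents, the following Python sketch assigns each residue an angular position around the helix axis. It assumes an ideal alpha helix with 100 degrees of rotation per residue. The function name wheel_positions and the example peptide are hypothetical and not part of GCG; helicalwheel itself produces a graphical plot.

```python
def wheel_positions(peptide, degrees_per_residue=100.0):
    """Return (residue, angle) pairs for a helical wheel projection.

    Assumption: an ideal alpha helix, where each residue is rotated by
    degrees_per_residue relative to the previous one.
    """
    return [(aa, (i * degrees_per_residue) % 360.0)
            for i, aa in enumerate(peptide)]


# Hypothetical example peptide, chosen only for illustration.
for aa, angle in wheel_positions("LKELLKKL"):
    print(f"{aa} {angle:5.1f}")
```

Residues whose angles fall close together lie on the same face of the helix, which is what the wheel plot makes visible.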

The program moments plots a three-dimensional map which displays the moments of hydropathy as a function of the sequence position and the rotational angle of the peptide bond. Angles of 90 to 110 degrees are typical for helices, while 0 or 180 degrees suggests a beta sheet.
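The quantity behind such a map can be sketched in a few lines of Python. The example below computes the magnitude of the hydrophobic moment of one sequence window for a range of rotation angles. It is a minimal sketch under assumptions: the Kyte-Doolittle hydropathy scale is used here for illustration (GCG may use a different scale), and the names hydrophobic_moment and moment_profile are hypothetical.

```python
import math

# Kyte-Doolittle hydropathy values (assumption: illustrative scale only,
# not necessarily the one used by the GCG program).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(window, angle_deg, scale=KD):
    """Magnitude of the hydrophobic moment of a window.

    angle_deg is the assumed rotation angle per residue in degrees.
    """
    delta = math.radians(angle_deg)
    s = sum(scale[aa] * math.sin(n * delta) for n, aa in enumerate(window))
    c = sum(scale[aa] * math.cos(n * delta) for n, aa in enumerate(window))
    return math.hypot(s, c)


def moment_profile(window, angles=range(0, 181, 10)):
    """Moment of one window for each angle in `angles`."""
    return {a: hydrophobic_moment(window, a) for a in angles}
```

Repeating moment_profile for a window sliding along the sequence would give the two axes (position and angle) of the map described above.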


Subsection 8.6.3

Fragmentation

The programs peptidemap and peptidesort work like their DNA counterparts.
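To illustrate the idea of proteolytic fragmentation, here is a minimal Python sketch that cuts a peptide with a simplified trypsin rule (cleavage after K or R, but not before P) and sorts the fragments by length. The rule and the function name tryptic_fragments are assumptions made for this example; the enzyme definitions actually used by peptidemap and peptidesort may differ.

```python
def tryptic_fragments(peptide):
    """Split a peptide after K or R, except when the next residue is P.

    Assumption: simplified trypsin rule, for illustration only.
    """
    fragments, start = [], 0
    for i in range(len(peptide) - 1):
        if peptide[i] in "KR" and peptide[i + 1] != "P":
            fragments.append(peptide[start:i + 1])
            start = i + 1
    fragments.append(peptide[start:])
    return fragments


# Hypothetical example peptide; fragments sorted by length, longest first.
fragments = tryptic_fragments("AAKGGRPCCKDD")
for frag in sorted(fragments, key=len, reverse=True):
    print(len(frag), frag)
```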

The program peptidesort can also be used to determine the composition of a peptide. If you press <SPACE BAR> at the prompt "Which Enzymes?", no fragmentation is calculated. Instead, the composition is reported in much greater detail than the composition program provides.
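For comparison, a bare-bones composition count can be sketched in Python as follows. This is only an illustration of the concept; the function name composition is hypothetical, and the GCG programs report considerably more detail.

```python
from collections import Counter


def composition(peptide):
    """Return {residue: (count, percent)} for a peptide sequence."""
    counts = Counter(peptide.upper())
    total = sum(counts.values())
    return {aa: (n, 100.0 * n / total) for aa, n in sorted(counts.items())}


# Hypothetical example peptide.
for aa, (n, percent) in composition("MKDEEDLLKA").items():
    print(f"{aa} {n:3d} {percent:5.1f}%")
```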


Subsection 8.6.4

Isoelectric Point

The isoelectric point of the denatured protein can be determined from the titration curve plotted by the program isoelectric.
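The relationship between the titration curve and the isoelectric point can be sketched in Python. The example below computes the net charge of a peptide as a function of pH and finds the pH where it crosses zero. It is a minimal sketch under assumptions: the pKa values are one commonly used illustrative set and not necessarily those used by GCG, all ionisable groups are treated as independent (as in a denatured protein), and the names net_charge and isoelectric_point are hypothetical.

```python
# Illustrative pKa values (assumption: not necessarily the GCG set).
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERM, PKA_C_TERM = 8.6, 3.6


def net_charge(peptide, ph):
    """Net charge at a given pH, treating all groups as independent."""
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))    # N-terminal amino group
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))   # C-terminal carboxyl group
    for aa in peptide:
        if aa in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return charge


def isoelectric_point(peptide, lo=0.0, hi=14.0, tol=1e-4):
    """Find the pH of zero net charge by bisection.

    The net charge falls steadily with rising pH, so bisection converges.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(peptide, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

Evaluating net_charge over a range of pH values gives the points of a titration curve; the isoelectric point is where that curve crosses zero.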


Subsection 8.6.5

Simplification of Protein Sequences

Frequently, you might want to know where "acidic" or other regions of your protein sequence are located. As ambiguity symbols are not defined in the single-letter peptide alphabet, you can rewrite your sequence with the simplify program and then use the window program so that the result can be plotted with statplot. The data for the simplify program are located in a file which you can get from the GCG program database with the command

% fetch simplify.txt

This file has a self-describing format. In essence, each amino acid listed in the second column is replaced with the amino acid listed in the first column. For example, the line

 D   DEQN

converts all D, E, Q and N symbols to D. This may look biologically questionable, but it is a practical way to make all acidic amino acids (and their amides) read "D", so that they can be plotted with the window and statplot programs.
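The effect of such a replacement rule, followed by a windowed count, can be sketched in Python as below. This is a minimal sketch under assumptions: the parser handles only lines with exactly two whitespace-separated columns like the example rule and skips everything else, and the function names parse_rules, simplify and window_counts are hypothetical. It does not read the real simplify.txt format in full, nor does it reproduce the output of the window program.

```python
def parse_rules(lines):
    """Build a replacement table from two-column rule lines.

    Assumption: a rule is "<replacement> <symbols to replace>";
    any other line is skipped.
    """
    table = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2 or len(fields[0]) != 1:
            continue
        target, sources = fields
        for aa in sources:
            table[aa] = target
    return table


def simplify(peptide, table):
    """Replace every residue that has an entry in the table."""
    return "".join(table.get(aa, aa) for aa in peptide)


def window_counts(peptide, symbol, width):
    """Count occurrences of symbol in each window of the given width."""
    return [peptide[i:i + width].count(symbol)
            for i in range(len(peptide) - width + 1)]


table = parse_rules([" D   DEQN"])
simplified = simplify("MKDEEQNLLA", table)  # hypothetical example peptide
print(simplified)
print(window_counts(simplified, "D", 5))
```

Plotting the window counts against the window position shows where the "D" residues cluster along the sequence.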

================================= Begin Exercise 7

Summary of single-sequence tools: Translate the sequence GENEMBL:M19311 in the determined reading frame, perform a secondary structure prediction from scratch, and plot the acidic amino acids as a function of sequence position.

To use amino acid sequences, the computer needs a defined reading frame in the DNA sequence which allows the translation into a peptide sequence. The translated amino acids are written into a peptide sequence. The purpose of this exercise is to create the sequence M19311.pep and predict its secondary structure. Proceed as follows:

================================= End Exercise 7

