JAMF Archive
BioCompanion as published in 1995
THIS IS THE REFERENCE CODE AS PUBLISHED. Doelz, R. Optimal production of biological documentation: the JAM format. Comput. Applic. Biosci. 11, 224-226 (1995).
The version you are currently viewing is the one printed and distributed via the Internet from the server of BioComputing Basel. Version 3.1 of the BioCompanion was published with version 2 of the JAMF software. The server that was indicated in the documentation has ceased to exist. Version 3.2 of the BioCompanion was not publicly available for free but was shareware that was distributed with GCG's software release 9. For the purpose of enhanced editing, JAMF was partially rewritten and the proprietary version 3.x of JAMF was used from 1996 onwards. The BioCompanion is available in a current version from the publisher. It has significantly changed both in software and content.
The requirement to identify a single sequence in a database
is very much different from a keyword search. Keywords
are expected to match exactly. This type of keyword searching
has been described earlier.
Pattern searching programs will match a pattern exactly
at single, defined positions in a sequence.
A sequence searching program, however, is expected
to report, most importantly, sequences which are similar
but not identical to the query. The major problem of sequence
searching, therefore, is to find a reasonable definition for the
similarity of sequences. Application programs doing "sequence searching"
in general imply that each entry
of a sequence database is compared to the query sequence sequentially,
and the result is a list of database entries which are similar
to the query sequence. The recipe for this program type
is given in the pseudocode listing under "Tools for Sequence Searching".
A powerful heuristic for fast screening can be expressed
as follows:
Two sequences are similar if a sufficient number of identical
oligomers is found at a given arrangement of the two.
The internal loop of a sequence searching program (see above)
requires the identification of a score, which
is a numerical value describing the quality of a sequence comparison.
The observed score depends on several factors.
The numerical values of scores are specific for a given program
and must not be used for comparison between different algorithms
or result output. Unless a statistical significance or another
type of normalisation is computed (see below), the score values
are relative to the search sequence and its algorithm. The following
example shall demonstrate how a heuristic sequence search algorithm
might be implemented. We do this on DNA level but no fundamental
difference will be seen in protein comparisons, as the heuristic
algorithms score with identities of short sequence fragments
rather than using comparison tables. To a certain extent, the comparison
using identities of oligomers can be visualised as a
dotplot-type of comparison. We reuse the example of earlier
chapters, with the two sequences labelled Seq. A and Seq. B.
To improve sensitivity, we need to search each column of identical
shift for entries at consecutively increasing positions, such as
7,8,9 in the column of shift -4 above. Doing
this, we find the following table of diagonals, sorted by the
length of the observed diagonal:
The 'fasta' program (as listed in the "programs"
section below) has the init scores in the first
two columns of its numerical output which refer to the crude
and the joined diagonals, respectively. Joined diagonals are
found by extending the segment. The extension will be counted
if adjacent diagonals can be joined to increase the overall score
of the newly found segment.
The 'blast' type of programs, also mentioned
below, will gain speed by determining the occurrence of oligomers
in the database sequences in a preprocessing step (to be executed
once only at formatting time of the database) before
the actual search is started. This reduces the need
to scan the database each time for every comparison. However,
as the pre-computation is done on the entire database, an arbitrary
search of subsections of the database is no longer possible
without first executing the format process (which converts the
subsection into a 'blast' database).
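The preprocessing idea can be illustrated with a small word index. This is only a sketch under stated assumptions: the function names build_word_index and candidates, the word size k=3 and the toy database are hypothetical, and the real 'blast' database format is not reproduced here.

```python
from collections import defaultdict

def build_word_index(database, k=3):
    """Preprocessing step: record once which entries contain each k-letter word."""
    index = defaultdict(set)
    for name, seq in database.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index

def candidates(query, index, k=3):
    """Count shared words per entry by index lookup, without rescanning the database."""
    counts = defaultdict(int)
    query_words = {query[i:i + k] for i in range(len(query) - k + 1)}
    for word in query_words:
        for name in index.get(word, ()):
            counts[name] += 1
    return dict(counts)

# Toy database (hypothetical entries); the index is built once and reused.
database = {"e1": "atggtaatgg", "e2": "cccccc"}
index = build_word_index(database)
print(candidates("ggtaa", index))
```

The index covers the whole database, which is why a search restricted to a subsection would need its own index in this sketch.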
After collecting sequences of interest, the ranking and
display of the found alignments is performed, which is also a
crucial step for successful result evaluation. The most difficult
problem of reliable sequence similarity searching is the proper
selection of criteria which describe statistical relevance
and map as closely as possible to biological
relevance.
The 'fasta' type of programs will redo the comparison
of the top-scoring entries found in the database with a rigorous
algorithm (such as in the GCG program bestfit and
in rigorous searches as described below). A more accurate alignment
of the database sequences with the query sequence, which
eventually goes along with a re-scoring of the
hits, increases the power of a sequence comparison.
It should be kept in mind that the re-scoring (in the 'fasta'
programs called the opt score) will
not be reflected in the listing order of top-scoring entries,
as these are sorted by the so-called "initn" score (the joined
fragment score).
NOTE: The fasta 2.x version of this package adds very
comprehensive statistics. However, at the time of writing, the
GCG software did not include this version of the fasta package.
The 'blast' programs use a statistical approach
in order to list only sequence segments in the result which are
not expected by chance. To do this, it is necessary to determine
a score which can discriminate between "hit" and "random". The
information unit is a "bit" and might be explained as follows:
The amount of information stored in biological sequences is,
from a statistical point of view, a function of the sequence
composition. E.g., a poly-A stretch in a DNA sequence carries
only two pieces of information: the symbol
A and the length of the sequence. Information
can be measured in bits, each of which is either YES
or NO, or 0 or 1
in the computer science world. A nucleotide, therefore,
which can be any of four characters, can be written in two
bits, as four different symbols
require four different combinations:
The amount of data you will need to review after a sequence
search might be enormous. To help you evaluate the result,
most searching programs can generate a histogram
which displays the scoring table graphically: The number
of found scores is plotted against the score. The very large
peak in the area of low scores can be attributed to "random"
hits which occur by chance. Depending on algorithm and sequence,
the hits of remote similarity will occur in the downhill area
of this peak. Sequences from other organisms score at higher
values, and the identity score is, typically, at very high values
if a sequence is found in the database which matches the query
sequence directly. The following graph shall illustrate this
description of a search histogram (NOTE that
the peak at low scores has been purposely truncated and is not
shown in its full size):
NOTE: Unless a reliable statistical estimation is part
of the search program, you should investigate the region of the
beginning "statistical noise" carefully. However, keep in mind
that statistical relevance might not match biological requirements
and might be misleading.
The significance of searches can be improved if the search
is conducted on the protein level rather than
the DNA level. This is due to the fact that codon usage differences
between different organisms will increase the number of mismatches
even if the proteins resulting from the two different sequence
fragments are identical. Programs are available which will
search the DNA sequence database after an on-the-fly translation
into all six possible reading frames. Doing these kinds of searches,
two major considerations will apply:
The 'framesearch' program
of the GCG package takes care of this effect to some extent,
but utilises extreme resources for completion.
$ blast
This program will be able to do searches on protein level
if you use DNA sequence databases. The original NCBI
blast program package as available from the NCBI includes
the following programs:
The GCG implementation of the 'blast' program
suite uses a single program - 'blast' - which
launches any of the programs mentioned above. (This is called
a front-end program.)
The 'blast' suite
may run either locally or via network. The 'blast' system is
not implemented at the VMS site, but runs on the UNIX cluster
via the HASSLE system. Databases available at Basel, or via the
HASSLE system, include:
1) The definition for "nr" might vary. Depending on the
location, you might use either GENBANK with an exclusion set
of EMBL data not in GENBANK, or use EMBL with an exclusion set
of GENBANK (e.g., in Basel). Depending on whether or not you
are connected to a network which is used to update the data on
a periodic basis, the "nr" set also includes daily updates. At
Basel, both EMBL and GENBANK are updated weekly, and EMBL is
the basis of exclusion using the NCBI's 'nrdb' program.
2) The definition of "nr" might vary from site to site. At Basel,
all available databases and their corresponding updates are
computed on a weekly basis with NCBI's 'nrdb' program.
Additional databases are available at some sites, and will
be displayed in the menu if you start the 'blast' program suite.
It might happen that the 'blast' suite of programs is not enabled
or temporarily unavailable at the time of your command due to
resource limitations. It should be kept in mind that 'blast'
programs are good for screening but should always be supplemented
with other screening methods (e.g., swsearch
or MPsrch, see below) in order to confirm findings.
General purpose, fast and reliable searches are done using
$ fasta
The GCG implementation of the 'fasta' program
is typically a (s)lower version than the one distributed by the
author, William Pearson. The "original" version of the 'fasta'
program (i.e., not the version adopted by GCG) is much faster
and searches only one strand in DNA searches. However, version
2.x
and higher of the fasta package do an automatic statistical
evaluation of the result. Problems with the 'fasta' programs
are most often observed when the database has been specified
incorrectly. Use the database name and a colon with an asterisk
to give the correct specification, following the instructions
below.
Databases available at Basel
University include:
1) The definition of GENEMBL can vary. Depending on the
location, you can use either GENBANK with an exclusion set of
EMBL data not found in GENBANK, or vice versa (e.g., in Basel).
Depending on whether you are connected to a network which is
used to update data on a periodic basis, the GENEMBL set may
also include daily updates.
2) containing weekly updates
3) PATCHX is updated quarterly and includes the previous release
of SWISSPROT, an automatic translation of EMBL, and some other
databases.
4) The definitions vary. XEMBL, EM_NEW, EMBL_DAILY, GB_NEW, XSWISS,
SW_NEW, PIR4, etc. are names that denote the character of the
preliminary entries.
5) This is a Basel-specific item. The main purpose of this database
is to find new data in the annotation, as updates rarely include
changes in the sequence. In order to keep the main EMBL database
from showing too many entries in FASTA runs, the XXEMBL database
is not included in the usual GENEMBL set.
6) This is a Basel-specific item. The weekly updated GENBANK
database is calculated against EMBL and XEMBL to find those entries
which are not in the EMBL updates yet.
Additional databases are
available at Basel. Their names are displayed when you start
the molecular biology environment. Examples are Amos Bairoch's
PROSITE database of protein motifs, or Rich Roberts' REBASE database
of restriction enzymes.
NOTE: The term GENEMBLPLUS, introduced in GCG version 8.1, is
equivalent to GENEMBL. This is a deviation from the standard
GCG installation which uses GENEMBL:* to describe all databases
except EST and STS sections.
If you need to search for a peptide in the DNA database, this can
be achieved with the tfasta program, which
translates the DNA database on-the-fly into all possible reading
frames. This search is computationally more expensive but considered
to be more sensitive (see above, and below).
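The on-the-fly translation into all six reading frames can be sketched as follows. This is an illustration only, assuming the standard genetic code; the names translate and six_frames are hypothetical and do not describe how tfasta is implemented internally.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def translate(dna):
    """Translate one reading frame; unknown codons become 'X', '*' marks a stop."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def six_frames(dna):
    """Three frames on the given strand, then three on the reverse complement."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    return [translate(strand[offset:])
            for strand in (dna, reverse)
            for offset in range(3)]

print(six_frames("ATGGCC"))
```

Every database entry yields six peptide sequences to compare, which gives a feeling for why this kind of search is computationally more expensive.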
Very sensitive search program implementations use the "Smith
and Waterman" algorithm. In contrast to the heuristic methods
mentioned above, rigorous searching will compute a complete
alignment of each possible sequence pair of the query sequence
versus the database sequence. Depending on the program or implementation,
various matrices will be used at that time. Refer to the pairwise comparison section
for details.
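A minimal sketch of a Smith and Waterman type local alignment score is shown below. It is an illustration under simplifying assumptions, not the scheme of any program named here: the scores match=1, mismatch=-1 and the linear gap penalty gap=-2 are arbitrary choices, whereas real implementations use comparison matrices and usually separate gap opening and extension penalties.

```python
def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best local alignment score of a versus b (score only, no traceback)."""
    prev = [0] * (len(b) + 1)   # previous row of the dynamic programming table
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero: a local alignment may start anywhere.
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best

print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Every cell of the table is computed for every database entry, so the running time grows with the product of the two sequence lengths; this is why rigorous searches are slow compared to the heuristic methods.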
The framesearch program of the GCG package
does this type of search for a protein sequence if a suitable
DNA library is specified.
Programs running the Smith and Waterman type of rigorous searching
might use quite a long time to achieve completion, or require
special hardware in order to complete in shorter time. In particular,
searches in DNA databases can take significant resources. Famous
programs are 'swsearch' on the Bioccelerator
and 'MPsrch' on the MasPar Computer.
If you happen to have access
to W.Pearson's sequence analysis software you could try the
ssearch program. This requires that you first
convert the sequences of interest to STADEN format with the command
tostaden. Alternatively, you can use the
program readseq.
Also, via HASSLE, the programs by Coulson et al. (marketed by
IntelliGenetics, Inc.) are available as
$ mpsrch
If you can get hold of an alignment of several sequences, and
can produce a profile, use the program 'profilesearch' (see section
on patterns).
Frequently, the combination of methods will get more comprehensive
results than a single search. Therefore, even if the first trial
of a sequence search produces apparently satisfactory results,
it is suggested to run all available methods. Additionally, the
following measures will help.
Use the sequence editor
seqed to create smaller sequences (100 bp, or 30 AA),
or cut out frequently occurring parts such as ALU I repeats.
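Cutting a long query into smaller pieces can also be scripted. The sketch below is hypothetical: the function name fragments and the use of overlapping windows (step smaller than size) are assumptions made for illustration, so that a region of similarity is less likely to be split at a fragment boundary.

```python
def fragments(seq, size=100, step=50):
    """Return (1-based start, fragment) pairs covering seq with windows of 'size'."""
    if len(seq) <= size:
        return [(1, seq)]
    starts = list(range(0, len(seq) - size + 1, step))
    if starts[-1] + size < len(seq):
        starts.append(len(seq) - size)   # final window flush with the sequence end
    return [(s + 1, seq[s:s + size]) for s in starts]

# A 260 bp toy sequence gives windows starting at 1, 51, 101, 151 and 161.
for start, piece in fragments("a" * 260):
    print(start, len(piece))
```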
The following criteria might be used to split DNA sequences:
Determine the reading frame with the
single-sequence analysis methods (e.g., frames)
and convert the DNA sequence to a protein with
map followed by extractpeptide,
and run the search on protein level with tfasta
instead of DNA level.
The software package 'fasta' from W.Pearson contains the
'prdf' program to analyse the results of arbitrary sequence pairs.
However, version
2.x of fasta does statistical analysis automatically.
If the EGCG package programs are installed
you might try fastacheck to check for significance
of the results obtained.
In case of doubt, you might use the
'bestfit' program with the randomise option.
Make sure that you give at least 200 randomisations to get a
reasonable statistical distribution. Alternatively, you might
use the shuffle program to generate a new
sequence with identical length and composition. However, as the
ordering of the symbols is different, the subsequent search should
give significantly different groups of hits than the original
search sequence.
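The idea behind randomisation can be sketched as follows. This is not how 'bestfit' or 'shuffle' work internally: the toy score shared_words, the function names and the fixed random seed are assumptions for illustration. The query is shuffled repeatedly (same length and composition, different order) and the observed score is compared with the distribution of the shuffled scores.

```python
import random
import statistics

def shared_words(a, b, k=2):
    """Toy similarity score (assumption): number of distinct k-letter words in common."""
    words = lambda s: {s[i:i + k] for i in range(len(s) - k + 1)}
    return len(words(a) & words(b))

def shuffled(seq, rng):
    """New sequence with identical length and composition but random order."""
    letters = list(seq)
    rng.shuffle(letters)
    return "".join(letters)

def shuffle_statistics(query, subject, n=200, seed=0, score_fn=shared_words):
    """Return observed score plus mean and standard deviation of n shuffled scores."""
    rng = random.Random(seed)
    observed = score_fn(query, subject)
    background = [score_fn(shuffled(query, rng), subject) for _ in range(n)]
    return observed, statistics.mean(background), statistics.pstdev(background)

seq_a = "atggtaatggcacaattgactttcctgaatttctga"
seq_b = "tgatggtcaagtaaactatgaagagttt"
observed, mean, sd = shuffle_statistics(seq_a, seq_b)
print(observed, round(mean, 2), round(sd, 2))
```

An observed score that lies far above the mean of the shuffled scores, measured in units of the standard deviation, is less likely to be a chance hit. With fewer randomisations the estimate of the distribution becomes unreliable.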
If you assume that your sequence is similar to a given group
but failed to detect it with the selected search algorithm, you
might consider running a "prototype" search and using the list of
sequences as a subset (see below).
The fastalert program as developed by F.Eggenberger
at BioComputing Basel is a network application which will do
the statistical analysis for you.
Sequence similarity searches will result in a list of sequences
which are reported to be similar to the original. However, in
contrast to a pattern search, query sequences might be of considerable
length and, therefore, show similarity to other sequences in
several regions. This requires that the inspection of the sequence
searching output is classified by sequence coordinates of the
query sequence. As no programs currently exist which
allow for an
automatic assignment, manual mapping of the detected sequence
features is required. This manual mapping might also go along
with the labelling of additional sequence features as revealed
by the methods of single sequence
analysis.
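As a small aid for this manual mapping, hits can at least be sorted and grouped by their query coordinates. The sketch below is hypothetical: the hit tuples (query from, query to, entry name) and the function group_by_query_region are illustrative assumptions, and the coordinates would have to be copied by hand from the search output.

```python
def group_by_query_region(hits):
    """Merge hits whose query ranges overlap into common regions."""
    regions = []
    for q_from, q_to, entry in sorted(hits):
        if regions and q_from <= regions[-1]["to"]:
            # Overlaps the current region: extend it and record the entry.
            regions[-1]["to"] = max(regions[-1]["to"], q_to)
            regions[-1]["entries"].append(entry)
        else:
            regions.append({"from": q_from, "to": q_to, "entries": [entry]})
    return regions

# Hypothetical hits, not taken from any real search output.
hits = [(400, 520, "entryC"), (1, 120, "entryA"), (100, 180, "entryB")]
for region in group_by_query_region(hits):
    print(region["from"], region["to"], region["entries"])
```

Each resulting region still needs to be inspected and labelled by hand.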
If several hits are encountered in the result of a sequence
search, a close inspection of the actually occurring hits is
essential. It might sound trivial, but the title of a sequence,
if listed in a search output, does not allow the conclusion that
the segment of similarity actually accounts for the functionality
of the spotted protein. Rather, a look at the annotation
of the sequence is required in order to confirm that
the segment of similarity is relevant for protein function. In
order to determine whether the similarity is accidental or meaningful,
the seqed sequence
editor might be used to partition the sequence of interest and
search the detected similarity as a separate sequence. The following
Figure shall illustrate this technique schematically:
Very low similarity of sequences might not be easily detected
if the search is performed in the entire database. Due to the
noise level of similarities scored by chance,
important matches might be missed. The use of filters
is essential in this case. A filter is any procedure
applied to reduce the total number of sequences searched, most
desirably using criteria which match the expectation of the performed
search. These might be
The basic difference between these methods is the way in which the sublibraries
are addressed. Depending on the algorithm, some of the procedures
might not be available to you. E.g., the 'blast' database
searching programs will not allow the use of user-specified subsets.
There is a special manual
of the GCG package which will tell you about database sub-libraries
(see below). Depending on whether your site uses the EMBL database
or the GENBANK database as base set, the corresponding counterpart
will be available as a subset. As a result,
the GCG program package always has both EMBL and GENBANK logicals
defined even if a subset contains only a small number of sequences.
On rare occasions, these subsets might even be entirely empty
- this will happen if EMBL and GENBANK subsections are perfectly
in sync.
The WPI version of the interface will present
these database subsections to you in database-neutral fashion
if you use the correct window.
To see what sub-libraries are supported, you might try to obtain
an on-line list as follows in the command line version:
$ show log EM*
$ show log GB*
Use the resulting names as GCG libraries. Additional help
is provided in the data set
manual of the GCG package. E.g., the EST:* specification applies
if you are interested only in the expressed sequence tag section
of the EMBL data library.
WARNING
The EST section of the DNA databases usually covers all sorts
of species. If you want to utilize data subsections by organism
rather than the section in its entirety, you would presumably need to employ
large lists (such as created with
suitable search programs) and process these as described
below.
TIP
The SWISS-PROT database uses the organism name as part of the
entry name. E.g., Swissprot:*yeast will cover all yeast sequences.
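The effect of such a wildcard specification can be imitated on a list of entry names. The sketch below only illustrates the matching idea with Python's fnmatch module; the entry names are illustrative and the function select_entries is hypothetical, not part of the GCG package.

```python
from fnmatch import fnmatchcase

def select_entries(entry_names, pattern):
    """Return the entry names matching a wildcard pattern, ignoring case."""
    pattern = pattern.upper()
    return [name for name in entry_names if fnmatchcase(name.upper(), pattern)]

# Illustrative entry names in the name_organism style described above.
names = ["ADH1_YEAST", "ADH1_HUMAN", "CYC1_YEAST"]
print(select_entries(names, "*yeast"))
```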
To use groups of sequences, a reasonable paradigm is supplied
by each program package in a specified syntax. This syntax tells
the software that the specification given shall be used as a group
of sequences rather than as a single sequence. The GCG package calls
this mechanism a Sequence List. Documentation before the 8.0
release of the package might refer to this feature as a File
of Sequence Names (FOSN). The idea is straightforward: Programs
no longer read the sequence from a file which holds
the sequence data but rather use a file as a "pointer"
to where the data can be found, i.e., they read the sequence
from a file which is specified in a list file.
To maintain compatibility with the established input handling,
files which specify a list of sequences rather than a sequence
directly are tagged by the character @ (English
spelling: "at" character). A Sequence List is
produced by a number of programs, such as:
To utilize the resulting file of sequence names, you might
use the @ character in front of the file name in all programs
which use multiple files. Sequence searching is such an application.
To use a Sequence List as a library, e.g.,
@my.fil in the fasta program,
you may use this nomenclature at the prompt "which sequence(s)?"
NOTES
1) The file-of-sequence-names method might not be available if
you run your sequence analysis via networks.
2) The 'blast' suite
of programs cannot use files of sequence names and requires its own
database formatting (see below).
3) WPI users may use Sequence Lists much more conveniently
by using the correct window - see below.
To reformat Sequence Lists into other formats, refer to the
reformatting section.
The sequences of a List of sequences are not stored in the
list file itself. Rather, the List file is a file of pointers
to the files which shall be worked on. This implies
the danger that, if a file being pointed to is deleted, the list
of sequences is no longer valid.
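The danger of dangling pointers can be checked for before a long search is started. The sketch below assumes a deliberately simplified list file with one file name per line; this is NOT the GCG Sequence List format, and the function name missing_entries is hypothetical.

```python
import os

def missing_entries(list_path):
    """Return the names in a simplified list file whose files no longer exist.

    Assumed format (hypothetical): one file name per line, blank lines ignored,
    relative names resolved against the directory of the list file.
    """
    base = os.path.dirname(os.path.abspath(list_path))
    missing = []
    with open(list_path) as handle:
        for line in handle:
            name = line.strip()
            if name and not os.path.exists(os.path.join(base, name)):
                missing.append(name)
    return missing
```

Calling missing_entries on a list file returns an empty list as long as every file pointed to is still present.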
An alternative to the lists of sequences, therefore, is the
option to write all sequence data into a single file. This will
enlarge the file size, and also requires that a specific format
is defined which allows multiple sequences rather than a single sequence
in one file. Most conveniently, such a file is produced by the
program 'pileup'. This application produces
a multiple sequence alignment automatically, and stores the result
in a single file, including gaps and the specific
shift for each sequence. Multiple Sequence Files
(MSF) (*.msf) are named as my.msf{*}. Details
on the 'pileup' program are in the section on
multiple sequence analysis.
To reformat multiple sequence files into other formats, refer
to the reformatting section.
Since Version 8, you may use the Wisconsin
Package Interface (WPI) via the
X-Windows system. Lists are readily handled and are the basic
principle of this user interface. Specifically, lists might be
expanded with a mouse click to select individual sequences. Refer
to the corresponding WPI
section for details.
The usage of Lists might be restricted as not all databases
are available at each site. Specifically, if you run your sequence
analysis via networks or move from one site to another the lists
might become affected if site-specific features are included.
Keep in mind that Lists are created at a defined point in time.
If you use keyword searching, your List will reflect the status
of the database at this
specific time point. You might want to redo the keyword search
frequently in order to maintain an up-to-date set of sequences.
See also the notes below on the creation of own databases.
Lists are notorious troublemakers
if disk space is tight and references are made to specific user-provided
files. This implies that any 'cleaning' of sequence files from
your directories might render lists unusable if the references
are obsolete. Similarly, if you work on several machines, the
simultaneous use of Lists on different computers requires an
identical directory structure and the presence of all desired
files in the expected locations. The output of the
profilesearch program (see
next chapter) is a List as well, and known to inherit the
location of the data used for searching. Eventually, manual editing
is required to overcome this limitation.
Lists of sequences or Files of Sequence Names can be very time-consuming
if you need to search a large amount of data. If you have enough
disk space, you can create, or ask your system manager to create,
your own database with the command dataset.
NOTE: ================================= Begin Exercise 12
Sequence searching: Use of the 'blast' and 'fasta' searching
programs to analyse DNA and protein sequences derived from previous
analysis.
Use the blast and fasta
programs to search the typed-in sequence in the largest database
you might access. Write down which of the sequence features occur
where in the database:
Use the blast and fasta
programs to search the translated sequence in the largest database
you might access. Write down which of the sequence features occur
where in the database:
Tools for Sequence Searching
start program
initialise top-scoring list
for each entry in database
{
    compare database sequence with query sequence
    evaluate similarity as "score"
    compare "score" to top-scoring list
    if "score" better than lowest entry
    {
        place entry in top-scoring list
    }
}
normalise top-scoring list
for each entry in top-scoring list
{
    determine statistics
    print out result
}
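The recipe above can be sketched in Python as follows. This is a minimal illustration, not the implementation of any particular program: the toy scoring function word_score, the small database and the names search and top_n are assumptions introduced here, and the normalisation and statistics steps are omitted.

```python
import heapq

def word_score(query, subject, k=2):
    """Toy similarity (assumption): number of distinct k-letter words in common."""
    words = lambda s: {s[i:i + k] for i in range(len(s) - k + 1)}
    return len(words(query) & words(subject))

def search(query, database, top_n=3, score_fn=word_score):
    """Scan every entry once and keep only the top_n best-scoring entries."""
    top = []  # min-heap of (score, name); top[0] is the lowest retained entry
    for name, seq in database.items():
        score = score_fn(query, seq)
        if len(top) < top_n:
            heapq.heappush(top, (score, name))
        elif score > top[0][0]:
            # Better than the lowest entry: replace it in the top-scoring list.
            heapq.heapreplace(top, (score, name))
    return sorted(top, reverse=True)

# Hypothetical three-entry database for illustration.
database = {
    "entry1": "tgatggtcaagtaaactatgaagagttt",
    "entry2": "cccccccccccc",
    "entry3": "atggtaatggcacaattg",
}
hits = search("atggtaatggcacaattgactttcctgaatttctga", database, top_n=2)
for score, name in hits:
    print(name, score)
```

Keeping the top-scoring list as a heap means that each database entry is compared only against the lowest retained score.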
Depending on the algorithm or implementation, some of
the steps might be missing or integrated in other steps. Sophisticated
sequence searching which performs extremely fast might be based
on two approaches:
Sequence Searching with Heuristic Methods
Principle of Similarity Detection
atggtaatggcacaattgactttcctgaatttctga Seq. A (formerly, horizontal sequence)
tgatggtcaagtaaactatgaagagttt Seq. B (formerly, vertical sequence)
In contrast to the dotplot, however, we now create a
table of oligomers and note the location of
the occurrence of these oligomers in the sequences. We are going
to use di-nucleotides here, which is for the sake of brevity
only, as nucleotide sequences typically use larger "words" (such
as six or more). Sixteen oligomers are possible, but not all
of them are found in the two sequences:
     A              B                  A                B
AA   6,14,28        9,13,14,21    CA   11,13            8
AC   12,19          15            CC   24               -
AG   -              10,22,24      CG   -                -
AT   1,7,15,29      3,18          CT   20,25,33         16
GA   18,27,35       2,20,23       TA   5                12,17
GC   10             -             TC   23,32            7
GG   3,9            5             TG   2,8,17,26,34     1,4,19
GT   4              6,11,25       TT   16,21,22,30,31   26,27
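A table of this kind can be produced with a few lines of Python. The function name oligomer_positions is hypothetical; positions are counted from 1, as in the table above.

```python
from collections import defaultdict

SEQ_A = "atggtaatggcacaattgactttcctgaatttctga"
SEQ_B = "tgatggtcaagtaaactatgaagagttt"

def oligomer_positions(seq, k=2):
    """Map each k-letter oligomer to the 1-based positions where it starts."""
    table = defaultdict(list)
    for i in range(len(seq) - k + 1):
        table[seq[i:i + k].upper()].append(i + 1)
    return dict(table)

table_a = oligomer_positions(SEQ_A)
table_b = oligomer_positions(SEQ_B)
for word in sorted(set(table_a) | set(table_b)):
    print(word, table_a.get(word, "-"), table_b.get(word, "-"))
```

Larger word sizes (the parameter k) give fewer matches per word, which is what makes the lookup fast for real sequences.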
Having created such a table, we may compute now the largest
segment of identity between the two sequences. The question,
therefore, is whether we can find oligomers which match at a
given, but identical shift of the two sequences. This can be
readily achieved by calculating all observed differences and
their occurrence:
Shift
observed -10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10
(in A) ----------------------------------------------------------------------
Locations 25 8 13 7 11 2 14 1 6 22 18 4 5 6 9 16
26 14 8 14 2 12 21 21 6
27 30 9 3 15 22 14
28 31 19 4
20 17
27 18
30
31
The result
could be counted in a very primitive fashion as in the dotplot-type
of example. In dotplots, a point was shown for each matching
oligomer. The columns in the table above represent identities
on a diagonal. The largest number of matches on a single diagonal occurs at
a shift of -4 from sequence A to sequence B;
compare this to what was already observed in the schematic comparison section.
Similarly, the second-best column with a shift of +2
was already known. However, this type of crude approach
might not be sufficient to find the best matching segment of
identity, as we ignore gaps entirely and just count occurrences
of oligomer matches on a diagonal.
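The crude counting of matches per diagonal can be sketched as follows. The function name shift_counts is hypothetical; the shift is taken as the position in sequence B minus the position in sequence A, which reproduces the sign convention used above.

```python
from collections import Counter

SEQ_A = "atggtaatggcacaattgactttcctgaatttctga"
SEQ_B = "tgatggtcaagtaaactatgaagagttt"

def shift_counts(seq_a, seq_b, k=2):
    """Count matching k-letter oligomers per shift (position in B minus position in A)."""
    positions_b = {}
    for j in range(len(seq_b) - k + 1):
        positions_b.setdefault(seq_b[j:j + k], []).append(j + 1)
    counts = Counter()
    for i in range(len(seq_a) - k + 1):
        for j in positions_b.get(seq_a[i:i + k], []):
            counts[j - (i + 1)] += 1
    return counts

counts = shift_counts(SEQ_A, SEQ_B)
for shift, n in counts.most_common(2):
    print(shift, n)
```

For the two example sequences this ranks the shift of -4 first and the shift of +2 second, but it says nothing yet about whether the matches on a diagonal are adjacent.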
Length = (number of oligomers - 1) + (size of oligomer)
Segment with shift         2   -4   -7   -5
Segment start position     1    7   26   13
Length                     5    4    4    3
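The search for uninterrupted diagonals and the length formula can be sketched as follows. The function name diagonal_runs is hypothetical; a run is a series of oligomer matches at the same shift and at consecutively increasing positions in sequence A, and no joining of diagonals or gap handling is attempted.

```python
SEQ_A = "atggtaatggcacaattgactttcctgaatttctga"
SEQ_B = "tgatggtcaagtaaactatgaagagttt"

def diagonal_runs(seq_a, seq_b, k=2):
    """Return (length, shift, start in A) for every run, longest first."""
    hits = set()
    for i in range(len(seq_a) - k + 1):
        for j in range(len(seq_b) - k + 1):
            if seq_a[i:i + k] == seq_b[j:j + k]:
                hits.add((j - i, i + 1))        # (shift, 1-based position in A)
    runs = []
    for shift, start in sorted(hits):
        if (shift, start - 1) in hits:
            continue                            # not the beginning of a run
        n = 1
        while (shift, start + n) in hits:
            n += 1
        # Length = (number of oligomers - 1) + (size of oligomer)
        runs.append(((n - 1) + k, shift, start))
    return sorted(runs, reverse=True)

runs = diagonal_runs(SEQ_A, SEQ_B)
for length, shift, start in runs[:5]:
    print(shift, start, length)
```

The longest run for the example sequences is found at shift 2, starting at position 1 of sequence A, with four adjacent di-nucleotides and therefore a length of 5.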
The occurrence of mismatches will be
noticed as an interruption of the segment. Gaps will
occur as a change of shift. Interruptions and changes of shift,
therefore, will not be considered in the initial identification,
and many algorithms will not consider gaps at later stages either.
A 00
G 01
C 10
T 11
The information content of a match,
if expressed in bits of information, will therefore contribute
to the ranking of the sequence. The 'blast'
type of programs uses this method to determine the
relevance of a sequence searching result. Additionally, the
probability of finding the segment of interest
by chance is calculated on the basis of the database and query
information content. The lower the probability,
the higher the significance, as the hit is less likely to
occur by chance. The concept of information content can be used
to introduce thresholds in sequence comparison,
which will determine whether a match is carried out or a score is
computed at all: If the information content of a sequence is
below such a threshold, no comparison will be made, and the result
of a search will be that no relevant hits are reported. This
is a rare case but will occasionally occur in 'blast'
searches.
Expectations
number
of hits
^
|      ////     statistical
|     *    *    noise
|     *    *  <----------
|     *    *
|    *      *                     species
|    *      *                   similarity
|   *        *      remote         score
|   *        *    similarity         |
|  *          *       |              |  identities
|  *           *      v              v       |
| *             *     *              *       v
| *              *    *             ***      *
+-----------------*********-----***********-***-->
                                             score
The discrimination between "remote similarity"
and "statistical noise" is crucial and severely depends on the
algorithm and the data used for comparison.
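A score histogram of this kind can be printed as plain text. The sketch below is hypothetical (function name, bin width and bar length are assumptions); as in the graph above, very large bins are truncated, here marked with a trailing "+".

```python
from collections import Counter

def score_histogram(scores, bin_width=10, max_bar=40):
    """Return text lines: bin start, number of hits, and a bar of '*' characters."""
    bins = Counter((s // bin_width) * bin_width for s in scores)
    lines = []
    for start in sorted(bins):
        n = bins[start]
        bar = "*" * min(n, max_bar)
        mark = "+" if n > max_bar else ""    # bar was truncated
        lines.append(f"{start:5d} {n:6d} {bar}{mark}")
    return lines

# Hypothetical integer scores: many low "noise" scores and a few high ones.
scores = [12] * 60 + [25] * 8 + [31, 33, 47, 95]
print("\n".join(score_histogram(scores)))
```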
Programs
Database name            GCG name     contents
----------------------------------------------------------------
EMBL + Updates
GENBANK + Updates
(GB as exclusion set)    nr           all DNA databases (1)
SWISSPROT                swissprot    most proteins
SWISSPROT +
PIR International
+ PATCHX + OWL           nr           all peptide databases (2)
Database name            GCG name     contents
----------------------------------------------------------------
EMBL + Updates
GENBANK + Updates
(GB as exclusion set)    GENEMBL:     all DNA databases (1)
SWISSPROT                SWISSPROT:   most proteins (2)
PIR International        PIR:         most proteins
PATCHX + PIR             MIPSX:       MIPS merged database (3)
NEW entries of EMBL      XEMBL:       EMBL new entries (4)
UPDATED entries EMBL     XXEMBL:      EMBL updated entries (5)
GENBANK update excl.     GB_NEW:      GENBANK exclusion (6)
Rigorous Searching in the Twilight Zone
Principle
Programs
Searching Strategies
Tuning of your Sequence
Translate DNA
Tuning of the 'fasta' Parameter "word size"
Default settings: 2 for proteins, 6 for DNA
To get different output, try:
1 for proteins, 3 for DNA (or even 1)
Tuning of the 'fasta' Parameter "list size"
Default setting: 40
To get longer lists, try:
100
Statistics Analysis of Hits
Mapping Result Data
Analysis of Target Sequences
             Region of
             similarity
|------------------------------> query sequence
             ||||||
|------------------------------> database sequence
             :    :
             :    :    Redo the sequence search with the
             ------    isolated fragment of database sequence
This second sequence search should retrieve a pattern similar
to that of the original search if the homology was significant.
Careful inspection might also be useful to identify this segment
as a member of a sequence family which can be used further on
to validate the originally found sequence.
Use of Specific Searching Libraries
Database Sub-Libraries
Sequence Lists (formerly File Of Sequence Names (FOSN))
Multiple Sequence Files (MSF)
Lists within the Wisconsin Package Interface (WPI)
Impact of Electronic Networks and Time Effects
Creation of own Databases
| Query | Database | Feature or name of
| from-to| entry |from-to| identified sequence
|--------+--------+-------+------------------
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| Reading | Query | Database | Feature or name of
|frame no.|from-to| entry |from-to| identified sequence
|---------+-------+--------+-------+---------------------
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
================================= End Exercise 12