JAMF Archive

BioCompanion as published in 1995
THIS IS THE REFERENCE CODE AS PUBLISHED.
		Doelz, R.   
		Optimal production of biological documentation: the JAM format.
		Comput. Applic. Biosci. 11, 224-226 (1995).    
		
The version you are currently viewing is the one printed and distributed via the Internet from the server of BioComputing Basel. Version 3.1 of the BioCompanion was published with version 2 of the JAMF software. The server that was indicated in the documentation has ceased to exist.

Version 3.2 of the BioCompanion was not publicly available for free but was shareware that was distributed with GCG's software release 9. For the purpose of enhanced editing, JAMF was partially rewritten and the proprietary version 3.x of JAMF was used from 1996 onwards. The BioCompanion is available in a current version from the publisher. It has changed significantly in both software and content.


Chapter 5: Data Transfer, Import, Handling, and Formatting


Transfer of Data in between Computers

In order to work with sequence data that you have transferred from another computer, you need to know what a sequence format is. The procedure to type sequence data manually into files is described in section "Sequence Editing".

Note that word processors on personal computers quite frequently store data in non-ASCII format. Make sure that the file you want to transfer is really plain text. If needed, use the <save as> option and add <printed text> or <printed text with line breaks>.
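One way to verify that a file is really plain text before transferring it is to check that every byte is printable 7-bit ASCII or common whitespace. The following Python sketch illustrates the idea. The function names and the set of allowed control characters are assumptions made for this example; they are not part of any transfer program.

```python
# Hypothetical helpers: check that data contains only printable 7-bit ASCII
# characters plus tab, line feed, and carriage return.
ALLOWED_CONTROL = {0x09, 0x0A, 0x0D}  # tab, LF, CR (an assumption for this sketch)


def is_plain_ascii(data: bytes) -> bool:
    """Return True if every byte is printable ASCII or allowed whitespace."""
    return all(0x20 <= b <= 0x7E or b in ALLOWED_CONTROL for b in data)


def first_offending_offset(data: bytes) -> int:
    """Return the offset of the first non-plain byte, or -1 if there is none."""
    for offset, b in enumerate(data):
        if not (0x20 <= b <= 0x7E or b in ALLOWED_CONTROL):
            return offset
    return -1
```

To check a file, read it in binary mode and pass the bytes to is_plain_ascii. A file that fails the check was probably saved in the word processor's own format.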

'ftp'

The file transfer program (ftp) requires that the computers exchanging data run the same protocol suite, TCP/IP (the Internet protocol suite). Your computer has to be configured accordingly with the appropriate software.

Personal Computer Setups

To find the 'ftp' program on your desktop computer, search for ftp, NCSA, or fetch on your PC's or Macintosh's hard disk. Ask your system manager if you have questions about the configuration of your network. The big advantage of the personal computer setups is the user-friendly interface. All 'ftp' applications follow the same scheme:

Take care to transfer files in the correct mode. Sequence data files are usually text-only files and, therefore, need to be transferred in text mode (also called ASCII mode).

NOTE: Personal computers with modern operating systems (Macintosh, Windows 95, OS/2) allow file names which cannot be used on UNIX or VMS systems (which are usually installed at the host of interest).

Bear in mind the following rules for file names in order to facilitate working:

From and to other Hosts via 'ftp'

The details of this procedure depend on the implementations of the 'ftp' program in UNIX, VMS, or other operating systems. In general, the following points are important:

The following table gives an overview of the most important 'ftp' commands:

 
   
what                           | 'ftp' command   
-------------------------------+-------------------   
see where you are              | pwd   
-------------------------------+-------------------   
look for files                 | ls    
-------------------------------+-------------------   
change to subdirectory "test"  | cd test   
-------------------------------+-------------------   
go one level up                | cd ..   
-------------------------------+-------------------   
set file type binary           | bin   
-------------------------------+-------------------   
set file type text             | ascii   
-------------------------------+-------------------   
get file "t1.seq"              | get t1.seq   
-------------------------------+-------------------   
get all files "t1.*"           | mget t1.*   
-------------------------------+-------------------   
transfer "t1.seq" to remote    | put t1.seq   
-------------------------------+-------------------   
toggle question in 'mget' mode | prompt   
-------------------------------+-------------------   
print progress during transfer | hash   
  
Note that the use of FTP requires that you provide a user name and password in order to access the remote directory.
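For scripted transfers, the same steps can be sketched with Python's standard ftplib module. This is a minimal illustration and not part of the original procedure: the function names, the subdirectory "test", and the file "t1.seq" are placeholders taken from the table for the purpose of the example.

```python
from ftplib import FTP


def fetch_text_file(ftp, remote_name, local_name):
    """Retrieve remote_name in text (ASCII) mode and write it locally.

    `ftp` is expected to be a logged-in ftplib.FTP object (or any object
    offering a compatible retrlines method).
    """
    lines = []
    # retrlines transfers in ASCII mode and calls the callback once per line,
    # with the line terminator already stripped.
    ftp.retrlines("RETR " + remote_name, lines.append)
    with open(local_name, "w", newline="\n") as handle:
        for line in lines:
            handle.write(line + "\n")
    return len(lines)


def example_session(host, user, password):
    """Rough equivalent of: pwd, cd test, ascii, get t1.seq (placeholders)."""
    with FTP(host) as ftp:          # connect to the remote host
        ftp.login(user, password)   # user name and password are required
        print(ftp.pwd())            # 'pwd': see where you are
        ftp.cwd("test")             # 'cd test'
        return fetch_text_file(ftp, "t1.seq", "t1.seq")
```

The transfer function takes the connection as an argument so that it can be tried out without a network, for example with a small stand-in object that supplies a retrlines method.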

SECURITY ADVICE: You should never leave your terminal or PC unattended if you are logged in to the computer. To ensure data security, you should avoid using other people's accounts. FTP access to a remote computer implies full read/write access to the remote data and is as sensitive as login via telnet or similar.

The option to use anonymous ftp is important for data retrieval and for access to programs such as the JAMF code which is used to customize this BioCompanion.

VMS Import from other VMS Hosts via DECnet

The file name specification in VMS is

node::device:[directory.subdirectory]file.extension;versionnumber

If you do not have a so-called "proxy account", you need to type your user name (e.g., doelz) and password (e.g., gar34rwq) in the following way to copy the file in your current directory:

$ copy yogi"doelz gar34rwq"::d$bio:[doelz.sequence]t.seq []

 
%COPY-S-COPIED, YOGI"doelz password"::D$BIO:[DOELZ.SEQUENCE]T.SEQ;1 copied to D$D:[DOELZ]T.SEQ;1 (7 blocks)
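The parts of a VMS file name specification can be made explicit with a short Python sketch that splits a specification of the form given above into its components. The regular expression and the function name parse_vms_spec are assumptions for illustration only. The sketch does not cover every form VMS accepts, and it does not separate a quoted user name and password from the node name.

```python
import re

# Pattern for: node::device:[directory.subdirectory]file.extension;versionnumber
# Every part except the file name is treated as optional (an assumption).
VMS_SPEC = re.compile(
    r"""
    ^(?:(?P<node>[^:\[\]]+)::)?        # optional node, followed by '::'
    (?:(?P<device>[^:\[\]]+):)?        # optional device, followed by ':'
    (?:\[(?P<directory>[^\]]*)\])?     # optional [directory.subdirectory]
    (?P<name>[^.;\[\]:]*)              # file name
    (?:\.(?P<extension>[^.;]*))?       # optional .extension
    (?:;(?P<version>\d+))?$            # optional ;versionnumber
    """,
    re.VERBOSE,
)


def parse_vms_spec(spec):
    """Split a VMS file specification into a dictionary of its parts."""
    match = VMS_SPEC.match(spec)
    if match is None:
        raise ValueError("not a recognized file specification: " + spec)
    parts = match.groupdict()
    if parts["directory"] is not None:
        # 'doelz.sequence' becomes ['doelz', 'sequence']
        parts["directory"] = parts["directory"].split(".")
    return parts
```

For example, parse_vms_spec("yogi::d$bio:[doelz.sequence]t.seq;1") yields the node "yogi", the device "d$bio", the directory list ["doelz", "sequence"], the name "t", the extension "seq", and the version "1".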
  

UNIX Import from other UNIX Hosts via Remote Copy

The file name specification in UNIX is

user@node:subdirectory/file.extension

If you have a so-called "trusted login", you may use the following type of command. Otherwise, you need to use the 'ftp' program as described above.

% rcp doelz@biox:sequence/t.seq ./

'kermit'

If you are connected via a serial line (i.e., rather old networking or modem lines), you may need to use the program 'kermit'. On the remote computer you must give the following command to receive a data file:

$ kermit receive test.seq

The procedures are reversed if you want to transfer a data file from the remote to the local computer. On the remote computer you need to type the following command to send a data file:

$ kermit send test.seq

The local options are reversed. On MS-DOS you must start with the local escape character which is shown at the bottom of the screen.

<CTRL><A> c

Then, you can give the command to send (or receive) a file, e.g.,

KERMIT-MS> send test.seq

To get back, type

KERMIT-MS> connect

NOTE: The local escape character might vary (e.g., <CTRL><[> c), but is usually shown on the bottom line of the screen.

On Macintosh and other Graphical User Interfaces there are usually options in the <File> or similar menu.

ZMODEM

This method is known from bulletin boards and other servers, but rarely used in molecular biology environments. It covers file transfer with built-in compression and its use is similar to the 'kermit' program.


File Handling Commands on Various Operating Systems

Depending on your environment, you need to know various commands to move, rename and view files. Some workstations or central computers offer a graphical user interface which you can use if you run the X-Windows environment. Ask your local computer experts if you want to do this. Don't be disappointed if the performance does not meet your expectations. These visual file systems work best if you are sitting directly in front of a workstation screen. They do not respond as quickly as you would like if you are in front of a PC or Mac-based X-terminal.

NOTE: The following sections refer to the command line.

Navigation

 
what                              |      VMS               |     UNIX   
----------------------------------+------------------------+-------------------   
see where you are                 | show default           | pwd   
----------------------------------+------------------------+-------------------   
list files                        | dir                    | ls    
                                  +------------------------+-------------------   
ditto, in detail                  | dir/size/date          | ls -lsa    
----------------------------------+------------------------+-------------------   
create subdirectory "test"        | create/dir [.test]     | mkdir test   
----------------------------------+------------------------+-------------------   
change to subdirectory "test"     | set default [.test]    | cd test  
----------------------------------+------------------------+-------------------   
go one level up                   | set default [-]        | cd ..   
----------------------------------+------------------------+-------------------   
go to login directory             | set default sys$login  | cd ~/   
----------------------------------+------------------------+-------------------                       
  

Manipulation

 
what                              | VMS                    | UNIX   
----------------------------------+------------------------+-------------------   
copy file f1.dat to file f2.dat   | copy f1.dat f2.dat     | cp f1.dat f2.dat  
----------------------------------+------------------------+-------------------                        
rename file f1.dat to file f2.dat | rename f1.dat f2.dat   | mv f1.dat f2.dat          
----------------------------------+------------------------+-------------------                        
delete file f1.dat                | delete f1.dat;*        | rm f1.dat   
----------------------------------+------------------------+-------------------                         
get rid of old file versions      | purge                  | (no file versions)  
----------------------------------+------------------------+-------------------   
edit file f1.dat                  | edit f1.dat            | vi f1.dat   
                                  | OR eve f1.dat          | OR emacs f1.dat   
                                  |    (see sections below for details)   
----------------------------------+--------------------------------------------   
  

Output

 
what                              | VMS                    | UNIX   
----------------------------------+------------------------+------------------   
view file f1.dat on screen        | type/page f1.dat       | more f1.dat   
view next page                    | <RETURN>               | <SPACE>   
quit this mode                    | <q>                    | <q>   
----------------------------------+------------------------+------------------   
print file on queue "test"        | print/queue=test       | lpr -Ptest   
----------------------------------+------------------------+------------------   

CAUTION: Print only what you have looked at before. If you print a file (text, graphics), make sure that you know where it will be printed (on a campus network). See below for details. Read the section "Need to Stop a Print Session".

Local Site Information for Printing

At the campus of Basel University, you can access various printers via the network. Type

$ show queue

to see how many print queues there are. The following text-only print queues are available:

 
  
Name           Location   
-------------------------------------------------------  
LPSPRA         Biozentrum, Praktikumsraum   
LPSANS         Biozentrum, 1st floor (URZ)   
LPPRB          Biozentrum, fast line printer (URZ)  
BIOPRINT       Biozentrum, 2nd floor (BioComputing Lab)  
LPSBIOZ1       Biozentrum, 3rd floor (Xray)   
LASBIOZ6       Biozentrum, 6th floor (Messraum Biophysics)  
LASPUK         PUK   
LAS...         other locations, laser printers  
LPS...         other laser printers, postscript-capable   
  
There is a postscript laser print server in the URZ (Biozentrum, 1st floor):
 
  
Name           Location   
-------------------------------------------------------  
LPSPOS         Biozentrum, room 184A (URZ)  

NOTE: Names of printers may change.

Remember to print with

$ print/queue=<QUEUENAME> <FILENAME>

where <QUEUENAME> is one of the print queue names mentioned above. If you need to stop a print job, type

$ show entry

to determine the "entry" number of your print job, and then stop the job with

$ stop/entry=<ENTRYNUMBER>


Local Site Information for Editing

NOTE: The following is a very basic "cookbook" introduction. You should read a manual or ask colleagues if you intend to work more closely with the EDT editor. Other, possibly more user-friendly, editors are not described here.

Start the EDT Editor

To edit a file with the EDT editor, type

$ edit file.dat

If you see the symbol

[C]*

use the 'exit' command. Then set the terminal to "vt100" as described in section "Unknown Terminal" before starting to edit again. If you see the symbol

*

give the command 'change' to get in full-screen editing mode. This is done by typing a c (c for change).

Typing Text

All you type appears on the screen and will appear in the file after saving it. The cursor keys can be used for navigation. The <DELETE> key should delete the previous character and move the cursor to this position. If this does not work, try <BACKSPACE> instead.

Help for Sophisticated Functions

The second character in the top row of the numeric block is usually the <HELP> key. On terminal emulators, other keystrokes might be needed (refer to your manual for details).

Screen Refresh

If the screen is corrupted by operator messages or behaves strangely, try <CTRL><W>. On terminal emulators, other keystrokes might be needed (refer to your manual for details).

Exit the Editor

When you have finished editing, use <CTRL><Z> to get the symbol

*

There, type the command 'exit'. The file will be saved, and you should be back at the $ prompt.


Import of Sequences to the GCG Package

To use sequence data on the computer, you need to know what a sequence format is. After you have transferred a sequence file to your computer, you may need to reformat the sequence to work with a given sequence analysis package. This section explains most of the solutions using the GCG package.

Sequence Formats

Briefly, a sequence format is a convention which defines what part of a data file is interpreted as sequence and what part as additional data. Depending on the software package used for sequence analysis, some of these additional data are of importance for processing. E.g., the GCG sequence format defines the type of the sequence data (protein or DNA). Other elements set the date, or log a line containing the length of the file. Therefore, a given sequence format is difficult to maintain in a normal text editor, and, usually, computer programs dedicated to sequence editing will deal with the details.

Plain Text Sequence Format

The plain text sequence format is typically generated by word processors (saved as text file with line breaks) or by electronic sources such as mail messages. A plain text format should contain only sequence data; files from such sources may therefore need editing to strip all additional data.

Sequence Formats Ready to Use with Sequence Analysis Packages

Sequence formats ready to use with sequence analysis packages are either generated within a sequence analysis package, e.g.,

or come from the original databases. This can be either from a local installation, or by network retrieval tools, such as electronic mail or World-Wide Web. Examples:

 
ID  (entry code)  
... (other fields) ...  
SQ  (then the sequence)   
//  

 
LOCUS     (entry code)  
..........(other fields) ...  
ORIGIN ...(then the sequence)   
//  

 
>P1; (entry code)  
... (one line of text) ...   
(sequence, finished by a *)  
(optionally, more text)   
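The three layouts above can be told apart by their first non-blank line. The following Python sketch shows one way to do this. The function name and the labels it returns are assumptions for illustration: the labels reflect the reading that the layouts correspond to EMBL-style, GenBank-style, and PIR-style entries, and only the markers shown above ("ID", "LOCUS", ">P1;") are checked.

```python
def guess_sequence_format(text):
    """Guess the layout of a sequence entry from its first non-blank line.

    The returned labels are illustrative names, not official format names.
    """
    for line in text.splitlines():
        if not line.strip():
            continue  # skip leading blank lines
        if line.startswith("ID "):
            return "EMBL-style"
        if line.startswith("LOCUS"):
            return "GenBank-style"
        if line.startswith(">P1;"):
            return "PIR-style"
        return "unknown"  # first non-blank line matches none of the markers
    return "unknown"  # empty input
```

A result of "unknown" does not mean the file is unusable. It may simply be plain text that still has to be converted.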

Reformatting Sequences

Refer to the section "Transfer of Data" for details on how to copy data from and to other computers.

Reformatting from other Packages

The program readseq is very useful to interconvert all kinds of sequence formats. Alternatively, try one of the programs of the GCG package. To get information about GCG's reformatting programs, use

$ genmanual sequence_exchange

The following selection of programs should cover most of your needs.

NOTE: When reformatting a sequence, the sequence name of the original sequence is adopted. The original file name is replaced by the name of the corresponding sequence in the originating database; e.g., if you have used the file name 'test.seq' in an export from electronic mail, WWW, ENTREZ, or similar, and the entry obtained from EMBL is M12345, the reformatting will result in a file called 'm12345.embl' and will not retain the file name used before.

 
from GENBANK (NCBI)              

$ fromgenbank

 
from EMBL (EBI)               

$ fromembl

 
from the IG suite package   

$ fromig

 
from programs of PIR (e.g., ATLAS)  

$ frompir

 
from ASCII files (e.g., electronic mail, or STADEN package)  

$ fromstaden

If errors occur (because lines are too long), first use

$ chopup

Reformatting from Established GCG Sequences

The program 'reformat' allows you to convert from and to various GCG-type formats and also helps if sequences are corrupted (checksum changed). To get information on this program, use

$ genhelp reformat

or

$ reformat /check

(The sequence of exercise 1 must be treated this way.)

Reformatting from "Unknowns"

A plain text file (only sequence data) is a good place to start. Use your text editor to create such a file. To convert the file to the GCG sequence format, put two periods (..) at the beginning of the text. Then, use

$ reformat

to obtain the final GCG-type format.
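The editing step described above, putting two periods before the sequence text, can also be done by a small script. The Python sketch below is an illustration under stated assumptions: the function name is hypothetical, and placing the two periods on a line of their own is a choice made for this example. The result still has to be passed through 'reformat' as described.

```python
def add_gcg_divider(sequence_text):
    """Return the text with a line containing '..' placed before it.

    Leading empty lines are dropped. If the text already starts with '..',
    it is returned without a second divider.
    """
    body = sequence_text.lstrip("\n")
    if body.lstrip().startswith(".."):
        return body  # divider already present; leave unchanged
    return "..\n" + body
```

For instance, add_gcg_divider("ACGT\nTTGA\n") returns the same two sequence lines preceded by a line that contains only the two periods.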

