CHAPTER 3
DATA ANALYSIS SOFTWARE

One objective of this dissertation is to characterize the spectral content of RS CVn light curves so that generalizations can be made about the physical structure of the systems. Because of the volume of signal processing to be performed, the analysis environment must be capable of operating in a "production mode" where all the sources can be reduced in the same manner. Another objective is to define the methods of analysis that can be used to reduce the plethora of APT data. In this case, the analysis environment needs to be user interactive so that various methods may be tried and evaluated. In order to address both these problems, I developed TISAN (TIme Series Analysis eNvironment). This chapter describes the TISAN data analysis system, its structure and operation. I assume that the reader has a working knowledge of MS-DOS and is familiar with the file and input/output (I/O) structure of MS-DOS based systems (IBM 1986).

3.1 The TISAN Package

TISAN is written in a combination of Microsoft C 4.0 (Microsoft 1986) and 8086 assembly for MS-DOS microcomputers based on the 8086, 80286 and 80386 family of processors (as well as the 8087, 80287 and 80387 family of math coprocessors). The architecture is similar to that of the Astronomical Image Processing System (AIPS) produced by the National Radio Astronomy Observatory (NRAO). The C language was chosen because of its use in writing operating systems (UNIX, for example, is written in C) and because of its high level of mathematics support. The MS-DOS family of microcomputers was chosen because the 80286 and 80386 based work stations provide the speed, mass storage and low cost per cpu-hour needed for this work. A complete listing of the TISAN system can be found in Appendix A.

In TISAN, programs that perform specific operations are written as separate entities (tasks) which are spawned as child processes by a simple command line interpreter (CLI). This modular structure allows tasks to be written independently of the main program and all other tasks. All tasks share a common set of data structures, variables and I/O routines, which are called from a custom library module. This structural similarity ensures compatibility between files created by different tasks as well as communications across tasks by the CLI. TISAN also supports an extensive on-line help facility, so the source listings need not be addressed when assigning the input values for the more complex tasks.

At the interpreter level, variable arguments (referred to as adverbs) are used to control the execution of the tasks. All tasks are limited to the same set of adverbs, but the adverbs can have different meanings for each task. Aside from setting adverb values, the interpreter also supports simple single-action commands (verbs) and more complex compound commands (pseudoverbs). TISAN is also designed to support the use of external execution files (referred to as RUN files), which can be used to process data in a production mode.

The TISAN operating environment requires three support files and two subdirectories: TISAN.IDX, TISAN.MSG, TISAN.HLP, RUN and INPUTS. The first file is an index to the adverbs required by the individual tasks. When the adverb values (inputs) for a task are displayed by the INPUTS pseudoverb, this file determines which adverbs will actually be displayed and what task specific messages are to be shown alongside the values. TISAN.MSG is the file that contains the messages referenced by TISAN.IDX, and TISAN.HLP is the help facility. The files are maintained by three corresponding support programs that translate ASCII files into the proper formats. The two subdirectories are the default locations for all RUN and INPUTS files, although the RUN subdirectory is optional.

The CLI communicates with the individual tasks through "inputs files" written to the INPUTS subdirectory. When the GO pseudoverb (see section 3.4) is executed, the adverb values are written to a file in the INPUTS subdirectory with an extension of '.INP' and a root name which matches the task being called. Because the root name must be a legal MS-DOS file name, all task names are limited to eight characters. The task is spawned as a child process of the CLI, suspending the CLI in memory, and then reads in the inputs file appropriate to its name. After completing execution, a task may pass values back to the CLI by simply updating the inputs file used in the original call. This structure ensures that any aborted or failed task will return to the CLI and that the inputs of the most recent run of any task can always be retrieved.
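The naming rule for inputs files can be illustrated with a short sketch. It is written in Python rather than the C used by TISAN, and the function name inputs_file_for, the conversion to upper case and the error handling are assumptions made for illustration only; the internal layout of an inputs file is not described here and is not modeled.

```python
import ntpath  # MS-DOS style path handling, independent of the host system

INPUTS_DIR = "INPUTS"  # default subdirectory named in the text
MAX_ROOT = 8           # MS-DOS root names are limited to eight characters


def inputs_file_for(task_name: str) -> str:
    """Return the inputs file path for a task, e.g. INPUTS\\IMEAN.INP.

    Hypothetical helper: upper-casing the name is an assumption based on
    MS-DOS file names being case insensitive.
    """
    root = task_name.strip().upper()
    if not 1 <= len(root) <= MAX_ROOT:
        raise ValueError("task names must be 1 to 8 characters: %r" % task_name)
    return ntpath.join(INPUTS_DIR, root + ".INP")
```

Under these assumptions, inputs_file_for("IMEAN") gives INPUTS\IMEAN.INP, the file the task would read on start-up and could update before returning.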

TISAN currently supports four file formats: two real and two complex. Time series files contain only amplitude information. The time information is implicit in a data point's position within the file, so each point has a validity flag associated with it. The header of the data file contains the scaling constants to convert from position within the file to time. Time labeled files have the timing information explicitly entered in the file, so the scaling information in the header is unused, and there are no validity flags.
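The implicit timing of a time series file can be sketched as follows. The example is in Python, not the C of the TISAN sources. It assumes that the header scaling constants are the slope and intercept of a linear relation (as suggested by the description of GETHEAD in section 3.4) and that positions are counted from zero; the function name valid_samples is hypothetical.

```python
def valid_samples(amplitudes, flags, slope, intercept):
    """Pair each valid amplitude with the time implied by its position.

    Assumption: time = intercept + slope * position, with positions
    counted from zero. Points whose validity flag is false are skipped.
    """
    return [
        (intercept + slope * position, amplitude)
        for position, (amplitude, is_valid) in enumerate(zip(amplitudes, flags))
        if is_valid
    ]
```

A flagged point leaves a gap in the reconstructed times without shifting the points that follow it, which is why a validity flag is needed in this format but not in a time labeled file.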

3.2 The Command Line Interpreter

The TISAN CLI uses a simple parser and evaluator that employs a "minimum match" entry structure, so only enough characters must be entered to uniquely identify the command. For example, H and HELP both execute the help facility since HELP is the only command that begins with the letter H. Commands are delimited with spaces while values are delimited with spaces or commas. Multiple commands may be entered on a single line if they are separated by semicolons.
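A minimum match lookup of this kind can be sketched in a few lines of Python. This is an illustration, not the TISAN parser: the command list is taken from tables 3-1 and 3-2, the function name resolve is hypothetical, and two behaviors are assumed rather than documented, namely that an exact match always wins and that an ambiguous entry is rejected.

```python
COMMANDS = [
    # verbs (table 3-1)
    "CLS", "CONFIG", "DEFAULTS", "ECHO", "END", "EXIT", "NOECHO", "QUIT",
    # pseudoverbs (table 3-2)
    "CATALOG", "COPY", "DELETE", "DOS", "GET", "GETHEAD", "GO", "HELP",
    "INPUTS", "PUT", "PUTHEAD", "REM", "RENAME", "RUN", "SETPOINT",
    "SETWIN", "SPAWN", "WAIT",
]


def resolve(entry, names=COMMANDS):
    """Expand an abbreviated entry to the single command it identifies."""
    key = entry.upper()
    if key in names:  # assumption: an exact match is never ambiguous
        return key
    matches = [name for name in names if name.startswith(key)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError("unknown command: %r" % entry)
    raise ValueError("ambiguous command %r: %s" % (entry, ", ".join(matches)))
```

With this list, H resolves to HELP, while E is rejected because ECHO, END and EXIT all begin with that letter.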

When assigning adverb values, the parser can only evaluate numeric, string and adverb expressions. Thus the following are all legitimate assignment statements:

FACTOR 0
TITLE 'This is a String Expression'
CODE FACTOR.

The first two expressions are direct assignments to the adverbs FACTOR and TITLE (see section 3.5) while the last expression will assign the value currently identified with FACTOR to CODE. It should be noted that entering an adverb name without an argument will display the current value of that adverb.

When assigning values to arrayed adverbs, an optional argument index value can be given. For example, the statement PARMS[2]=5 would set the second element of the PARMS array equal to five while PARMS=0 would set all the elements of the PARMS adverb to zero.
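The two forms of array assignment can be summarized with a small Python sketch. The function name assign is hypothetical; the only behavior taken from the text is that indices are counted from one and that omitting the index sets every element.

```python
def assign(array, value, index=None):
    """Return a copy of an arrayed adverb after an assignment.

    index=None sets every element (as in PARMS=0); otherwise index is
    one-based (as in PARMS[2]=5, which sets the second element).
    """
    if index is None:
        return [value] * len(array)
    if not 1 <= index <= len(array):
        raise IndexError("index %d is outside 1..%d" % (index, len(array)))
    updated = list(array)
    updated[index - 1] = value
    return updated
```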

3.3 Verbs

Verbs are stand-alone commands (i.e., no arguments can be given) that perform simple actions. The TISAN environment supports eight verbs, which are summarized in table 3-1.

TISAN supports several hardware configurations which are set with the CONFIG verb. When executed, the user is prompted for the type of text display screen (monochrome or CGA) and the type of printer being used (IBM graphics or HP Laser Jet). If the display is an IBM Monochrome screen, then the type of graphics display (CGA or Hercules) is also requested. If TISAN cannot find a previous system configuration when the program is started, the CONFIG verb is automatically executed.

Production mode processing may be performed with the RUN pseudoverb (see section 3.4). This mode of operation supports three verbs: ECHO, NOECHO and END. As each line of the file is processed, it is normally displayed on the screen. ECHO and NOECHO are used to turn this output on and off, respectively. A RUN execution will normally terminate on reaching the end of file; however, the END verb can be used to terminate the processing prematurely.

TISAN may be terminated by the user with EXIT or QUIT. EXIT will save the current inputs under the file name TISAN.INP in the INPUTS subdirectory and then exit to MS-DOS, while QUIT will exit without saving the current environment.

The last two verbs are CLS and DEFAULTS. CLS is used to clear the screen and reset the video mode. If a graphics task has been aborted, then the CLS verb will return the screen to the normal (text) state. The DEFAULTS command is used to reset all string adverbs, except TASKNAME, to zero length and all numeric adverbs to their predefined default values.

3.4 Pseudoverbs

Pseudoverbs are commands that accept at least one optional argument, although some pseudoverbs, such as COPY and DELETE, have mandatory arguments. TISAN supports eighteen pseudoverbs, which are summarized in table 3-2.

The most commonly used pseudoverb is INPUTS. This command displays the adverbs, adverb values and task specific messages for the individual tasks. INPUTS takes an optional argument that must specify a task that is included in the TISAN.IDX file. If the task is not in the index file, an error message will be issued. If no argument is given, then the task specified by the adverb TASKNAME is used. Figure 3-1 shows the inputs for the task IMEAN.

Once the inputs have been specified, the GO command is used to execute a task. If no argument is provided, then the task specified by the adverb TASKNAME is used. The outcome of the GO command is task dependent (see section 3.6).

TISAN supports six MS-DOS interface commands. Files may be renamed, copied, deleted and cataloged using the RENAME, COPY, DELETE and CATALOG pseudoverbs, respectively. These commands are identical to their MS-DOS counterparts except that file name "wild cards" are not accepted.

TISAN supports two interfaces to foreign executable files: DOS and SPAWN. The DOS pseudoverb can be followed by any valid MS-DOS command, which is executed as if it were given at the system level. The file COMMAND.COM must be accessible through the current MS-DOS command specification (COMSPEC) setting. SPAWN is similar to the DOS pseudoverb in that it executes a child process specified by a mandatory argument, but it does not invoke the MS-DOS command processor and can only be used to execute external program files (i.e., MS-DOS batch files cannot be called and programs cannot be called with arguments). The MS-DOS level can also be entered interactively by pressing the ^Z (Control Z) key combination. This feature is useful for suspending RUN file executions.

There are two commands available for saving and restoring the TISAN adverb environment: GET and PUT. These commands take an optional argument which is the name of the file to read or write. If no argument is given, then the current TASKNAME is used and a file is read or written from the INPUTS subdirectory.

The pseudoverbs GETHEAD and PUTHEAD allow the user to inspect TISAN file types and directly modify the time scaling values in time series files. GETHEAD will get the header information from a file while PUTHEAD will put the header information to a file. If no file is specified, then the INNAME, INCLASS and INPATH adverbs are used to build the file name. When executed, the type of file is displayed, and the slope and intercept values for time scaling are placed into or written from the TRANGE adverb.

The RUN pseudoverb is used for production mode processing. A RUN file is a text file created with any editor (EDLIN for example) that holds the commands to be executed by TISAN. RUN files may call other RUN files, but if the nesting is too deep, a stack overflow error will occur. WAIT and REM allow for run-time interaction and commenting of RUN files. WAIT generates an audible tone and prints whatever message follows the command. Program execution will pause until a key is pressed by the user. The REM pseudoverb will cause the interpreter to ignore the remainder of the line, semicolons notwithstanding, so it is used for commenting.
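The way a RUN file breaks down into individual commands can be sketched as follows. This Python fragment is an illustration under stated assumptions, not the TISAN interpreter: the function name run_commands is hypothetical, keywords are assumed to be spelled in full, and semicolons inside quoted strings are not treated specially. It shows only the three rules given in the text: semicolons separate commands, REM discards the rest of its line, and END stops processing.

```python
import re


def run_commands(lines):
    """Yield the commands a RUN file would execute, in order."""
    for line in lines:
        rest = line.strip()
        while rest:
            # REM hides the remainder of the line, semicolons included.
            if re.match(r"rem\b", rest, re.IGNORECASE):
                break
            command, _, rest = rest.partition(";")
            command, rest = command.strip(), rest.strip()
            if not command:
                continue
            if command.upper() == "END":
                return  # terminate before the end of file
            yield command
```

For a hypothetical RUN file containing the lines "REM set up; this is ignored", "INNAME '93LEO'; INCLASS 'U'", "GO IMEAN", "END" and "GO PGRAM", the sketch yields the two assignments and GO IMEAN, and never reaches GO PGRAM.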

TISAN supports two interactive screen graphics controls: SETWIN and SETPOINT. These pseudoverbs take an optional argument which is the name of a graphics image. If a name is given, the graphics screen is loaded with the image and the setting function is entered, otherwise the graphics screen is cleared. The cursor control keys are used to move about the screen, and a point is set by pressing the RETURN key. When execution has completed, the adverbs WINDOW or POINT are updated with the new values.

3.5 Adverbs

Adverbs are the TISAN environment variables that are used to control the execution of individual tasks. There are currently thirty-one adverbs used by the system. Four additional serial communications adverbs are defined, but they are currently unused. The TISAN adverbs are summarized in table 3-3. Adverbs can be integer (2 bytes), real (8 bytes) or string (132 bytes). String values must be delimited by single quotes.

In TISAN, file names are constructed by providing the three basic elements required by MS-DOS. Each file name consists of a path (INPATH, IN2PATH and OUTPATH), which can include drive and directory specifiers (up to a maximum of 63 characters), an eight character root name (INNAME, IN2NAME and OUTNAME) and a three character extension (INCLASS, IN2CLASS and OUTCLASS). The user may specify an input, secondary input and output file name. For most tasks, if the output file is not specified, then the input file is overwritten.
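The construction of a file name from its three adverbs can be sketched in Python. The length limits are those given above; the function name build_filename and the rule for inserting the backslash between the path and the root name are assumptions made for illustration.

```python
def build_filename(path, name, fclass):
    """Join path, root name and class into an MS-DOS file name."""
    if len(path) > 63:
        raise ValueError("path is limited to 63 characters")
    if not 1 <= len(name) <= 8:
        raise ValueError("root name must be 1 to 8 characters")
    if len(fclass) > 3:
        raise ValueError("class (extension) is limited to 3 characters")
    full = name + "." + fclass if fclass else name
    # Assumption: a separator is added unless the path already ends in one
    # or consists only of a drive specifier such as 'C:'.
    if path and not path.endswith(("\\", ":")):
        path += "\\"
    return path + full
```

With INPATH 'C:\APT\U', INNAME '93LEO' and INCLASS 'U', the values shown in figure 3-1, the result is C:\APT\U\93LEO.U.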

The five most widely used adverbs, besides the I/O specifiers, are TASKNAME, FACTOR, CODE, PARMS and QUIET. TASKNAME is a string that is used to specify the current task. Unlike other string adverbs, the single quotes are optional. TASKNAME is used as the default value for many of the pseudoverbs. FACTOR is a real value that is normally used as a processing argument for a given task. For example, the task PGRAM uses FACTOR to specify the resolution of the resultant periodogram. CODE is an integer value that normally controls program execution. If a task supports multiple operations, then CODE is used to select between them. PARMS is an array of ten real values. This adverb allows a wide range of parameters to be passed to a given task. Lastly, QUIET is used to control the level of messages displayed to the screen or printer by a task. If a task sends a message that has a priority less than the QUIET level, it is not displayed. If QUIET is negative, then the display output is also sent to the printer.

TRANGE and YRANGE are used to select the time and amplitude ranges to be used for a specific task. TRANGE[1] (YRANGE[1]) is the starting value and TRANGE[2] (YRANGE[2]) is the ending value. If the ending value is less than or equal to the starting value (for example, the default setting of 0, 0), then the entire range is used.

Two adverbs are available for numeric format control: TFORMAT and YFORMAT. These string adverbs allow the user to specify the precision, field width, justification and other formatting controls. The syntax is:

%[Flags][Width][.Precision]Type.        (3-1)

The arguments in brackets are optional, but the total number cannot exceed 11 characters. The fields have the following meanings: Width is the minimum field width. If the printed value is larger than this prescribed width, it will overflow the field. If Width is preceded by a zero, then the value will be left filled with zeros. Precision is the number of digits to be printed after the decimal point (the default value is six). The Flags and Type fields are described in Table 3-4.
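The effect of the individual fields can be demonstrated with Python's % operator, which follows the same C printf conventions for these type characters. The sketch is illustrative only: the names FORMAT_RE and check_format are hypothetical, the flag set is limited to the characters '-', '+' and the blank, and the 11 character limit is assumed to apply to the whole format string.

```python
import re

# %[Flags][Width][.Precision]Type, with the types listed in table 3-4.
FORMAT_RE = re.compile(r"%[-+ ]*\d*(?:\.\d+)?[feEgG]")


def check_format(fmt):
    """Return True if fmt matches the syntax of equation 3-1."""
    return len(fmt) <= 11 and FORMAT_RE.fullmatch(fmt) is not None


print("[%s]" % ("%10.3f" % 3.14159))    # [     3.142]  right justified
print("[%s]" % ("%-10.3f" % 3.14159))   # [3.142     ]  left justified
print("[%s]" % ("%010.3f" % 3.14159))   # [000003.142]  zero filled
print("[%s]" % ("%+.2E" % 12345.678))   # [+1.23E+04]   forced sign
print("[%s]" % ("%f" % 1.5))            # [1.500000]    default precision
```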

The remaining adverbs are currently used only for the graphics tasks SCRNPLOT, PNTRPLOT and CADPLOT. BORDER and FRAME are logical values that turn on the graph border and graph frame, respectively. Plots may be labeled with a title, an abscissa label and an ordinate label using TITLE, TLABEL and YLABEL, while tic-marks are controlled with the TMAJOR, YMAJOR, TMINOR and YMINOR adverbs. Plots do not necessarily need to cover the entire graphics area. WINDOW is a four element adverb that corresponds to the physical plotting limits used for graphics display. The first two values are the lower left corner and the second two are the upper right corner. If the lower left corner is greater than the upper right, then the entire area will be used. The available limits depend on the graphics device in use. Table 3-5 shows the resolution of the currently supported devices.

The three remaining adverbs are not used by all the graphics routines. COLOR is used by SCRNPLOT and CADPLOT to set the drawing colors, while DEVICE is used by PNTRPLOT to select the output port for the printer (LPT1:, LPT2: or PRN:). CADPLOT uses one additional adverb, POINT. The first value of this two element array specifies the Z coordinate plane for the plot.

3.6 Tasks

Tasks are the application programs written for the TISAN environment. Each task uses the adverb values in its own way. Table 3-6 lists the currently supported tasks.

There are ten general operation tasks supported by TISAN: DBCALC, DBCMB, DBCON, DBLIST, DBMOD, DBSCALE, DBSMOOTH, DBSUBSET, DBTRANS and IMEAN. These tasks handle the more mundane aspects of data base management by performing integration and differentiation, data base combination, data format conversion, listing, modification using arithmetic and log functions, scaling, smoothing, editing, modification using transcendental functions and basic statistical analysis. These tasks are the basic foundation of the TISAN system. When used in combination with the other, more specialized tasks, the TISAN environment becomes a powerful data manipulation tool.

There are three spectral decomposition routines available from within TISAN: DFT, DCDFT and PGRAM. All three tasks can be given an angular frequency range over which to perform the analysis and a resolution factor.

The DFT task performs a discrete Fourier transform or its inverse. The DFT algorithm has been defined to be compatible with the DCDFT routine, so the sign in the exponent is the negative of that in equation 2-1. The DCDFT task performs a data-compensated discrete Fourier transform, as described in section 2.3. This task can also filter a signal at a given frequency and display the parameters used for the filter. Both these tasks generate a complex time series file.

PGRAM is used to calculate the periodogram of a data base as described in section 2.2. The final periodogram can be scaled by the variance, displayed as a level of probability or left unscaled.

The output files of DFT and DCDFT can be decomposed with DBX. This task extracts the amplitude, phase, real or imaginary part of a complex data base and writes it to a real data file.
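The four quantities that DBX can extract are the standard decompositions of a complex number, as the following Python sketch shows. It is not the DBX source: the function name extract and the selector strings are hypothetical, and the phase is assumed to be expressed in radians.

```python
import cmath


def extract(samples, part):
    """Reduce a list of complex samples to one real-valued component."""
    pick = {
        "amplitude": abs,          # sqrt(re**2 + im**2)
        "phase": cmath.phase,      # atan2(im, re), in radians
        "real": lambda z: z.real,
        "imaginary": lambda z: z.imag,
    }[part]
    return [pick(z) for z in samples]
```

For example, the sample 3+4i has a real part of 3, an imaginary part of 4 and an amplitude of 5.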

There are three graphics routines supported by TISAN which allow output to the screen, printer or the AutoDesk AutoCAD package. Two files may be plotted against one another and multiple files can be plotted on a single screen.

The user can exercise a great deal of control over an individual plot. The defaults will automatically scale and label the axes, but the user can control the plotting ranges, plot size, color, line type, orientation and position of all the labels, tic-marks or grid lines, axis directionality (i.e., values increase down or to the left), semi-log or log-log scales and plotted symbols.

SCRNPLOT is used to display data on an IBM CGA or Hercules screen, while PNTRPLOT will plot data to either an IBM graphics or HP Laser Jet printer. The hardware requirements for both these tasks are set with the system configuration verb CONFIG. The inputs and operation of PNTRPLOT are nearly identical to SCRNPLOT except that the COLOR adverb is not used.

CADPLOT will plot data to an external file in the AutoDesk AutoCAD DXF file format (AutoDesk 1987). The file is automatically given the file extension of '.DXF'. The inputs and operation are nearly identical to SCRNPLOT except that the CODE adverb is used to select between three dimensional or two dimensional line types, and a layer name can be selected instead of a line type. This form of graphics output allows for extremely high resolution plots (11000 x 17000), plus all the graphics editing capabilities that the computer aided design (CAD) package has to offer.


Figure 3-1
Inputs Listing for the Task IMEAN

Figure 3-1 shows a typical inputs listing for the task IMEAN. The first field is the adverb name, the second is the adverb's value and the third is a short description of the adverb's use within the task. In this example, the file C:\APT\U\93LEO.U is to be read. Since the time range and format fields have not been specified, the entire file will be processed and the output will be displayed with the default of six decimal places. The CODE value of 3 will cause IMEAN to return the maximum value and its bracketed time range in the FACTOR and TRANGE adverbs. These values are retrieved with the GET pseudoverb.

IMEAN - I Determine Basic Statistics on a Range of Data

INNAME   ... '93LEO'      Input Name
INCLASS  ... 'U'          Input Class
INPATH   ... 'C:\APT\U'   Input Path
TRANGE   ... 0, 0         Time Range, T2<=T1 -> ALL
TFORMAT  ... ''           Time Output Format, Default -> '%G'
YFORMAT  ... ''           Y Output Format, Default -> '%G'
CODE     ... 3            Completion Control Code
                          1 -> Return MEAN
                          2 -> Return Minimum
                          3 -> Return Maximum

Table 3-1
TISAN Verbs

Table 3-1 is a summary of the TISAN verb commands. The verbs are listed alphabetically in the first column followed by a one line description of the action.

CLS       Clears and resets the screen
CONFIG    Configure for the display hardware
DEFAULTS  Reset the adverbs to their default values
ECHO      Echo the execution of RUN files
END       Terminate the execution of a RUN file
EXIT      Exit to MS-DOS after saving current environment
NOECHO    Do not echo the execution of RUN files
QUIT      Exit to MS-DOS without saving environment

Table 3-2
TISAN Pseudoverbs

Table 3-2 is a summary of the TISAN pseudoverb commands. The pseudoverbs are listed alphabetically in the first column followed by a one line description of the action.

CATALOG   Display a disk directory
COPY      Copy disk files
DELETE    Delete disk files
DOS       Execute an external MS-DOS command
GET       Get inputs from disk
GETHEAD   Get header information from a file
GO        Execute a task
HELP      Display help information
INPUTS    Display the inputs for a task
PUT       Put inputs to disk
PUTHEAD   Put header information to a file
REM       Remark
RENAME    Rename disk files
RUN       Execute a run file
SETPOINT  Interactively set POINT adverb
SETWIN    Interactively set WINDOW adverb
SPAWN     Spawn an external child process
WAIT      Pause execution of a RUN file

Table 3-3
TISAN Adverbs

Table 3-3 is a summary of the TISAN adverbs. The adverbs are listed alphabetically in the first column followed by a one line description of the usage.

BORDER    Draw border around a graph
CODE      Execution code. Implementation dependent
COLOR     Color specifier for graphics
DEVICE    Output device
FACTOR    Numeric factor. Implementation dependent
FRAME     Draw frame around graph
IN2CLASS  Secondary input file class
IN2NAME   Secondary input file name
IN2PATH   Secondary input file path
INCLASS   Input file class
INNAME    Input file name
INPATH    Input file path
OUTCLASS  Output file class
OUTNAME   Output file name
OUTPATH   Output file path
PARMS     Array of parameters used by selected tasks
POINT     X,Y data pair
QUIET     System adverb to suppress inputs listings on GO
TASKNAME  Name of default task
TFORMAT   Format string for data display
TITLE     Plot title for graphics display
TLABEL    Time axis label for graphics display
TMAJOR    Time axis major tic-marks
TMINOR    Time axis minor tic-marks
TRANGE    Time range to be used
WINDOW    Physical plotting window
YFORMAT   Format string for data display
YLABEL    Y axis label for graphics display
YMAJOR    Y axis major tic-marks
YMINOR    Y axis minor tic-marks
YRANGE    Amplitude range to be used

Table 3-4
Flag and Type Fields for Format Adverbs

Table 3-4 is a listing of the meaning of the Flag and Type fields used by the TFORMAT and YFORMAT adverbs.

FLAG    MEANING                 DEFAULT
-       Left Justify            Right Justify
+       Force Sign              Sign Not Forced
blank   Force a decimal point   Decimal Printed Only if Digits Follow it

TYPE    OUTPUT FORMAT
f       Non-exponential floating point
e       Exponential floating point using a lower case e
E       Exponential floating point using an upper case E
g       Selects 'e' or 'f' type, whichever is more compact
G       Selects 'E' or 'f' type, whichever is more compact

Table 3-5
Display Resolution for Graphics Devices

Table 3-5 is a summary of the resolution of the available graphics devices. It should be noted that the last entry, AutoCAD DXF File, is not a device, but a file format read by the AutoDesk AutoCAD program which is scaled to 11 by 17 inches.

DEVICE                  RESOLUTION
IBM CGA                 640 x 200 (Black and White)
                        320 x 200 (Color)
HERCULES                720 x 348
Dot Matrix Printers     760 x 480
HP Laser Jet Printers   750 x 592
AutoCAD DXF File        17000 x 11000

Table 3-6
TISAN Tasks

Table 3-6 is a summary of the TISAN tasks. The tasks are listed alphabetically in the first column followed by a one line description of the action.

CADPLOT   Plot data to an AutoCAD DXF file
DBCALC    Integration and differentiation
DBCMB     Combine two data bases
DBCON     Convert between data formats
DBLIST    Data tabulation
DBMOD     Modify an existing data base
DBSCALE   Change the scaling of an existing data base
DBSMOOTH  Data smoothing
DBSUBSET  Take a subset of an existing data base
DBTRANS   Operate transcendental functions on a data base
DBX       Extract components of a complex data base
DCDFT     Data-compensated discrete Fourier transform
DFT       Discrete Fourier transform
IMEAN     Data ranges, mean and deviation
PGRAM     Periodogram analysis
PNTRPLOT  Plot data to a graphics printer
SCRNPLOT  Plot data to a CGA or Hercules graphics screen



05/22/02 ern

Copyright (c) 1988-1997, Eric R. Nelson, Ph.D.