S_PARSE

Parse a string and return information about it

Supported on Windows
Supported on Unix
Supported on OpenVMS
Supported in Synergy .NET
xcall S_PARSE(string, start, dimension, item_position, item_length, item_type, #items, end
&     [, no_quote])

Arguments

string

The string to parse. (a)

start

The beginning parse position within string. (n)

dimension

The dimension of the item position, item length, and item type arrays. (n)

item_position

The first element of the item position array. (n)

item_length

The first element of the item length array. (n)

item_type

The first element of the item type array. (n)

#items

The variable that will be loaded with the number of items parsed. (n)

end

The variable that will be loaded with either the ending parse position (if string contained more than dimension items) or zero (if all items were parsed). (n)

no_quote

(optional) Overrides the default handling of quote characters. (n)

Discussion

The S_PARSE subroutine parses a string and loads three arrays with information about each item (or token) in the string.

String is an alphanumeric literal or variable. Each item in string will be parsed, and the item characteristics will be loaded into item_position, item_length, and item_type.

The starting position (base one), the length, and the type of each token are loaded into the item position, item length, and item type arrays, respectively. Each array must have at least dimension elements. In other words, item_position(1), item_length(1), and item_type(1) describe the first item within string; item_position(2), item_length(2), and item_type(2) describe the second item; and so forth.

The type of each item is coded as follows. The I_ mnemonics are defined by the compiler.

Item Type Coding

Item   Mnemonic    Description
1      I_ANUM      Case-insensitive alpha character, followed by zero or more case-insensitive alphanumeric characters.
2      I_IDENT     Case-insensitive alpha character, followed by one or more case-insensitive alphanumeric, dollar sign ($), or underscore (_) characters, with at least one of the characters being a dollar sign or an underscore.
3      I_DIGIT     One or more decimal digits.
4      I_FIXED     One or more decimal digits, followed by a period (.), followed by one or more decimal digits.
5      I_SPACE     One or more spaces and/or tabs.
6      I_SQUOTE    String enclosed in single quotation marks.
7      I_DQUOTE    String enclosed in double quotation marks.
8      I_SPECIAL   Any single character that is not part of one of the other item types.
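The unquoted item types above can be illustrated with a small Python sketch. This is only a model of the rules as stated in the table, not the Synergy implementation: the function name classify is hypothetical, letters are assumed to be ASCII, and the effect of LOCALIZE is ignored. Quoted items (types 6 and 7) are outside the scope of this helper.

```python
import re

# Type codes from the item type table; the names mirror the I_ mnemonics.
I_ANUM, I_IDENT, I_DIGIT, I_FIXED, I_SPACE, I_SQUOTE, I_DQUOTE, I_SPECIAL = range(1, 9)

def classify(token: str) -> int:
    """Return the item type code for one complete, unquoted token.

    Hypothetical helper: assumes ASCII letters and ignores LOCALIZE.
    """
    if re.fullmatch(r"[A-Za-z][A-Za-z0-9]*", token):
        return I_ANUM
    # Reached only when the token contains at least one $ or _ character.
    if re.fullmatch(r"[A-Za-z][A-Za-z0-9$_]+", token):
        return I_IDENT
    if re.fullmatch(r"[0-9]+", token):
        return I_DIGIT
    if re.fullmatch(r"[0-9]+\.[0-9]+", token):
        return I_FIXED
    if re.fullmatch(r"[ \t]+", token):
        return I_SPACE
    if len(token) == 1:
        return I_SPECIAL
    raise ValueError("not a single item: %r" % token)
```

For instance, under these assumptions classify("TOTAL_DUE") yields I_IDENT because of the underscore, while classify("$") yields I_SPECIAL because a dollar sign cannot start an identifier.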

Parsing continues either until all items within string have been parsed or until dimension items have been parsed. If dimension items are parsed and string still contains more items, end is the base one position within string at which the next item begins. (In other words, passing end as start on another S_PARSE call will continue the parsing.) If all of the items within string are parsed, end is returned with a value of zero.
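The continuation protocol described above (feed end back in as start until end is zero) can be sketched in Python. Both function names are hypothetical, and crude_parse is only a stand-in for S_PARSE: it recognizes runs of blanks and runs of non-blanks, which is enough to show how start, dimension, and end interact. The sketch assumes dimension is at least 1.

```python
import re

def crude_parse(string, start, dimension):
    """Hypothetical stand-in for S_PARSE (blank and non-blank runs only).

    Returns (items, end): items is a list of (position, length) pairs with
    base-one positions, and end follows the convention described above.
    """
    items = []
    for m in re.finditer(r"[ \t]+|[^ \t]+", string[start - 1:]):
        pos = start + m.start()
        if len(items) == dimension:
            return items, pos      # arrays are full; the next item begins at pos
        items.append((pos, len(m.group())))
    return items, 0                # every item was parsed

def parse_all(string, dimension):
    """Call the parser repeatedly, passing each returned end as the next start."""
    collected, start = [], 1
    while True:
        items, end = crude_parse(string, start, dimension)
        collected.extend(items)
        if end == 0:
            return collected
        start = end
```

With dimension set to 2, the string "FRED EARNS $17/HR" takes three calls in this model; the first call returns two items and an end value of 6, the position of "EARNS".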

When a quoted string item is parsed, the item_position array element is the position of the first character that follows the double or single quotation mark, and item_length is the length of the item up to the closing quotation mark. Thus, the delimiters of a quoted string are the only characters within string that aren’t enclosed in one of the items.

Note that quoted strings can be implicitly terminated by the end of string, and that it is possible to have a quoted string with a length of zero (two successive quote characters).

If you want successive quotes in a string to represent a single occurrence of that quote (for example, 'O''Leary' to represent “O’Leary”), the calling program must detect successive occurrences of I_DQUOTE or I_SQUOTE items. In particular, if one of these items is the last item parsed and more items remain on the line, the item should be “pushed back” before the next call to S_PARSE. In other words, don’t process the last item from the current call, and for the next call set start to the following value instead of end:

item_position(dimension) - 1

This value is the position of the item’s opening quotation mark, because item_position points to the first character after that mark.
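The push-back rule and the detection of successive quoted items can be sketched in Python as follows. The function names are hypothetical, the item tuples are (position, length, type) values such as the three arrays would hold, and the sketch assumes dimension is at least 2 so that pushing an item back cannot repeat the same call forever. Two quoted items are treated as successive when the second item's opening quote immediately follows the first item's closing quote, which is one reading of the rule above.

```python
I_SQUOTE, I_DQUOTE = 6, 7   # item type codes for quoted strings

def plan_next_call(items, end):
    """Decide what to process now and where the next call should start.

    items: (position, length, type) tuples returned by one call.
    end:   the end value returned by that call (0 means nothing remains).
    """
    if end != 0 and items and items[-1][2] in (I_SQUOTE, I_DQUOTE):
        # Push the last quoted item back: restart at its opening quote.
        return items[:-1], items[-1][0] - 1
    return items, end

def merge_doubled_quotes(string, items):
    """Join successive quoted items of the same type into (type, text) pairs."""
    merged = []
    prev_close = None   # position of the previous quoted item's closing quote
    for pos, length, typ in items:
        text = string[pos - 1:pos - 1 + length]
        quoted = typ in (I_SQUOTE, I_DQUOTE)
        if quoted and merged and merged[-1][0] == typ and prev_close == pos - 2:
            quote = "'" if typ == I_SQUOTE else '"'
            merged[-1] = (typ, merged[-1][1] + quote + text)
        else:
            merged.append((typ, text))
        prev_close = pos + length if quoted else None
    return merged
```

As a hand-derived illustration (not captured output), parsing SAY 'O''LEARY' with dimension 3 would end the first call on the quoted item O at position 6 with more text remaining, so plan_next_call keeps the first two items and restarts at position 5. Merging the items from both calls then yields the single quoted value O'LEARY.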

On completion, #items is returned with the number of items that were parsed.

If no_quote is passed and nonzero, S_PARSE returns I_SPECIAL for single and double quotation marks instead of I_SQUOTE and I_DQUOTE.
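The rules in this discussion can be pulled together in a compact Python model. It is a sketch for reasoning about expected results, not the Synergy runtime's implementation: the name s_parse_model and its return shape are hypothetical, letters are assumed to be ASCII, LOCALIZE is ignored, and the longest-match ordering of the patterns is an assumption.

```python
import re

I_ANUM, I_IDENT, I_DIGIT, I_FIXED, I_SPACE, I_SQUOTE, I_DQUOTE, I_SPECIAL = range(1, 9)

# Unquoted item patterns, tried in order; the first one that matches wins.
_PATTERNS = [
    (re.compile(r"[A-Za-z][A-Za-z0-9$_]*"), None),   # I_ANUM or I_IDENT
    (re.compile(r"[0-9]+\.[0-9]+"), I_FIXED),
    (re.compile(r"[0-9]+"), I_DIGIT),
    (re.compile(r"[ \t]+"), I_SPACE),
]

def s_parse_model(string, start, dimension, no_quote=False):
    """Return (items, end); items holds (position, length, type), base one."""
    items = []
    i = start - 1                       # zero-based scan index
    while i < len(string):
        if len(items) == dimension:
            return items, i + 1         # position at which the next item begins
        ch = string[i]
        if ch in "'\"" and not no_quote:
            close = string.find(ch, i + 1)
            if close == -1:             # implicitly terminated by end of string
                close = len(string)
            typ = I_SQUOTE if ch == "'" else I_DQUOTE
            # Position is the first character after the opening quote.
            items.append((i + 2, close - i - 1, typ))
            i = close + 1
            continue
        for pattern, typ in _PATTERNS:
            m = pattern.match(string, i)
            if m:
                text = m.group()
                if typ is None:
                    typ = I_IDENT if ("$" in text or "_" in text) else I_ANUM
                items.append((i + 1, len(text), typ))
                i = m.end()
                break
        else:
            # No pattern matched: a single special character.
            items.append((i + 1, 1, I_SPECIAL))
            i += 1
    return items, 0                     # all items were parsed
```

Under these assumptions, a string consisting of two successive single quotes produces one I_SQUOTE item of length zero, and the same string with no_quote set produces two I_SPECIAL items.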

Note

S_PARSE is affected by the case value of the LOCALIZE routine wherever the operation depends on (or is independent of) the case of a character.

Examples

.define TTCHN   ,1
record
    line        ,a20
    start       ,i4,    1
    dim         ,i4,    20
    pos         ,20i4
    len         ,20i4
    type        ,20i4
    items       ,i4
    end         ,i4
    ix          ,i4
proc
    open(TTCHN, o, "tt:")
    writes(TTCHN, "Enter a string to parse: ")
    reads(TTCHN, line)
    xcall s_parse(line, start, dim, pos, len, type, items, end)
    writes(TTCHN, %string(items) + " items parsed")
    if (end) then
      writes(TTCHN, %string(end) + " is the end position")
    else
      writes(TTCHN, "All items were parsed!")
    for ix from 1 thru items            ;Display contents of arrays
      writes(TTCHN, "item " + %string(ix) + " pos=" +  
  &          %string(pos(ix), "ZX") + " len=" + 
  &          %string(len(ix)) + " type=" + 
  &          %string(type(ix)))
    close TTCHN
    stop
end

Let’s assume the following line is input:

FRED EARNS $17/HR

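The program above produces the following output (based on the recommended .DEFINEs). Because line is a 20-character field, the 17-character input is padded with three trailing spaces, which are reported as item 9: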

Enter a string to parse:
FRED EARNS $17/HR
9 items parsed
All items were parsed!
item 1 pos= 1 len=4 type=1
item 2 pos= 5 len=1 type=5
item 3 pos= 6 len=5 type=1
item 4 pos=11 len=1 type=5
item 5 pos=12 len=1 type=8
item 6 pos=13 len=2 type=3
item 7 pos=15 len=1 type=8
item 8 pos=16 len=2 type=1
item 9 pos=18 len=3 type=5