Go to the first, previous, next, last section, table of contents.


Token semantics

We now look at the semantics of tokens. The semantic value of a `(' is void: the fact that it is an open parenthesis is all the information it contains. However, when reading a number, the value of the number is important. To illustrate this, the next example is the language of numbers; the parser will add the numbers together and display the result at the end of a sentence.

A first go at the parser looks like this:

%class ex2

%token NUMBER

language
                {
                  int value;
                }
        :
          [NUMBER
                {
                  value += ...;
                }
          ]*
                {
                  [[[stdio out] print ("result: ", value)] nl];
                }
        ;

The semantic block `{ int value; }' between the name of the non-terminal (`language') and the colon at the start of the rule, is visible in the whole rule, and therefore the designated place to declare local variables.

The `*'-operator is preceded by two entities between `[' and `]'. These brackets group the entities in between them and make the operator apply to the whole group. Every time a NUMBER is matched, the action will also be executed.

One thing remains to be defined: how is the value of the NUMBER token retrieved from the lexer, i.e., what must be filled in at the place of the `...'? We can make our lexer provide a method

int
  number_value;

which we can invoke to retrieve the value of the most recently read NUMBER token. In this case, the action becomes

value += [lex number_value];

`lex' is an instance variable of the parser, of the type gps.Lexer. Obviously, this class does not implement the `number_value' method. We could add it as an extension, but prefer to employ a subclass instead. To indicate this class to the TOM compiler, the lex instance variable must be redeclared. For the purpose of (re-)declaring instance variables, gp provides the `%instance' directive, which must be placed following the `%class' directive.

%instance
{
  <doc> Our lexer is an {ex2.Lexer}.  </doc>
  redeclare ex2.Lexer lex;
}

This completes our number-adding parser. The directory `ex2' contains the full source, including a lexer that can recognize integer numbers, which it returns as NUMBER tokens. Some sample runs:

$ echo | ./ex2
result: 0
$ echo 1 2 3 4 | ./ex2
result: 10
$ echo -1 | ./ex2
result: 0
stdin:1: error at '-'
$ ./ex2
40
2
result: 42


Go to the first, previous, next, last section, table of contents.