\section{Sections}

\subsection{unit}
The unit section specifies the unit name of the generated source file.
Its syntax is identical to that of Object Pascal.

\subsection{uses}
The §uses{...}§ section specifies the units that must be
included in the interface's uses clause of the generated Pascal
unit. Every unit name must be terminated by a semicolon. Repeated
units are included only once.
\begin{verbatim}
uses
{
    Classes;
    Windows;
}
\end{verbatim}

\subsection{const}
The §const{...}§ section specifies items that appear in
the interface's const clause of the generated Pascal unit. The
content of this section is copied verbatim into the unit.
\begin{verbatim}
const
{
    const1 = 12;
    const2 = 'FOO';
}
\end{verbatim}

\subsection{type}
The §type{...}§ section specifies items that appear in
the interface's type clause of the generated Pascal unit. The
content of this section is copied verbatim into the unit.
\begin{verbatim}
type
{
    TmyType1 = integer;
    TmyType2 = array [0..16] of TmyType1;
}
\end{verbatim}

\subsection{options}
The §options{...}§ section contains options for a given grammar
element. Options can be defined for lexer and parser classes, rules,
and subrules.

\subsection{tokens}
To define an ``imaginary'' token (i.e., one that has no
corresponding real input symbol), use the §tokens{...}§ section.
You can also define literals in this section.

\begin{verbatim}
tokens
{
    "procedure";
    "function";
    INTEGER;
}
\end{verbatim}

Strings defined in this way are treated just as if you had referenced them in
the parser. The formal syntax is:

\begin{verbatim}
tokenSpecification
    : "tokens"
      LCURLY
      (tokenItem SEMI)*
      RCURLY
    ;

tokenItem
    : TOKEN
    | STRING
    ;
\end{verbatim}

The §tokens{...}§ section is only valid in lexer grammars.

\subsection{memberdecl}
The §memberdecl{...}§ section contains additional member
declarations for the grammar class. It lets you extend the
grammar class with user-defined members, so there is no need to
derive a new class from the generated class to implement
additional functionality. The content of this section is copied
verbatim into the class declaration of the generated grammar
class.
\begin{verbatim}
memberdecl
{
    procedure proc1;
    procedure proc2;
}
\end{verbatim}

\subsection{memberdef}
The §memberdef{...}§ section contains the implementation of the
class's additional functionality. The content of this section is
copied verbatim into the implementation part of the generated
unit. This section may also contain the initialization and
finalization clauses.

\begin{verbatim}
memberdef
{
    procedure TmyClass.proc1;
    begin
        ...
    end;

    procedure TmyClass.proc2;
    begin
        ...
    end;
}
\end{verbatim}

\subsection{parser}
Parser rules must be associated with a parser class. The parser
class specification precedes the options and rule definitions of
the parser. A grammar file (§.g§) can hold only one class definition.
A parser specification in a grammar file looks like this:
\begin{verbatim}
unit myParser;
uses...          // optional uses {...} section
const...         // optional const {...} section
type...          // optional type {...} section

parser TmyParser;

options...       // optional options {...} section
memberdecl...    // optional memberdecl {...} section
parser rules...
memberdef...     // optional memberdef {...} section
\end{verbatim}

In the generated code, the parser class becomes an Object
Pascal class, and the rules become member methods of that class.

Note that the content of the §memberdecl{...}§ section is copied
verbatim into the class declaration part of the generated parser
class, while the content of the §memberdef{...}§ section is copied
after the implementation of the member rules, so the
initialization and finalization clauses of a Pascal unit can be
placed in the §memberdef{...}§ section.

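As a rough illustration of the placement described above, the unit
generated from such a grammar might be laid out as sketched below. This
is an outline under stated assumptions, not actual generator output: the
method names §rule1§ and §rule2§, the omitted ancestor class, and the
exact order of the clauses are hypothetical.
\begin{verbatim}
unit myParser;

interface

uses
    Classes, Windows;      // from the uses {...} section

const
    const1 = 12;           // from the const {...} section

type
    TmyType1 = integer;    // from the type {...} section

    TmyParser = class      // ancestor class not shown (assumption)
    public
        procedure rule1;   // hypothetical rule methods
        procedure rule2;
        procedure proc1;   // from the memberdecl {...} section
    end;

implementation

procedure TmyParser.rule1;
begin
    ...
end;

// the content of the memberdef {...} section follows the rule methods
procedure TmyParser.proc1;
begin
    ...
end;

initialization
    ...

end.
\end{verbatim}
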
\subsection{lexer}
To perform lexical analysis, you need to specify a lexer class that describes
how to break up the input character stream into a stream of tokens. The syntax
is similar to that of a parser class:
\begin{verbatim}
unit myLexer;
uses...          // optional uses {...} section
const...         // optional const {...} section
type...          // optional type {...} section

lexer TmyLexer;

options...       // optional options {...} section
tokens...        // optional tokens {...} section
memberdecl...    // optional memberdecl {...} section
lexer rules...
memberdef...     // optional memberdef {...} section
\end{verbatim}

Lexical rules contained within a lexer class become member methods in the
generated class. A lexer grammar may have a §tokens{...}§ section to specify
imaginary tokens and string literals.

\subsection{rule definitions}
The structure of an input stream of atoms is specified by a set of
mutually referential rules. Each rule has a name, one or more
alternatives, and any of the following optional attributes: a scope
specifier, a set of arguments, an init-action, a return value, local
variable definitions, and an exception handler. Each alternative
contains a series of elements that specify what to match and where.
The scope is specified with the private, protected, or public
keyword; a rule has public scope by default. The basic form of a
rule is:
\begin{verbatim}
(scope) rulename
    : alternative_1
    | alternative_2
    ...
    | alternative_n
    ;
\end{verbatim}

Parameters for a rule can be specified in the following form:
\begin{verbatim}
rulename [formal parameters] : ... ;
\end{verbatim}

If the rule returns a value, its type can be defined with the
returns keyword:
\begin{verbatim}
rulename returns [typename] : ... ;
\end{verbatim}

where §typename§ is a valid Object Pascal type specifier.

Local variables for a rule can be defined in the §local{...}§ section:

\begin{verbatim}
rule
local
{
    foo: integer;
    bar: string;
}
\end{verbatim}

Init-actions are specified before the colon. Init-actions differ from normal
actions because they are always executed regardless of guess mode.

\begin{verbatim}
rule
{
    init-action
}
    : ... ;
\end{verbatim}

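The attributes above might be combined in a single rule as sketched
here. The order in which the attributes are written is an assumption,
since this section shows them only one at a time. The rule name
§expr§, the parameter §level§, the variable §count§, and the action
body are hypothetical.
\begin{verbatim}
protected expr [level: integer] returns [integer]
local
{
    count: integer;
}
{
    count := 0;    // init-action: executed even in guess mode
}
    : alternative_1
    | alternative_2
    ;
\end{verbatim}
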
\paragraph{Parser rules} apply structure to a stream of tokens, whereas
lexer rules apply structure to a stream of characters. Parser
rules, therefore, must not reference cha\-rac\-ter literals.
Double-quoted strings in parser rules are considered to be token
references. Note that the name of every parser rule must begin
with a lowercase letter.

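A minimal sketch of a parser rule that follows these conventions is
given below. The rule name §declaration§ and the token names §IDENT§
and §SEMI§ are hypothetical and assumed to be supplied by the
accompanying lexer. The double-quoted strings are token references,
not character sequences.
\begin{verbatim}
declaration
    : "procedure" IDENT SEMI
    | "function" IDENT SEMI
    ;
\end{verbatim}
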
\paragraph{Lexer rules} defined within a lexer grammar must have a name beginning
with an uppercase letter. These rules implicitly match
cha\-rac\-ters on the input stream instead of tokens on the token
stream. Referenced grammar elements include token references
(implicit lexer rule references), cha\-rac\-ters and strings.
Lexer rules are processed in the same manner as parser rules, and
may also specify arguments and return values. A scope specifier
on a lexer rule has a special meaning. In the
generated Object Pascal unit, the lexer class has a §nextToken§
function, which is the interface between the lexer and the parser.
This function is synthesized from the public lexer rules only. This
means that non-public lexer rules do not affect the prediction logic of
the lexer; they are usually helper rules. If the lexer grammar has
no public rule at all, the §nextToken§ function returns EOF to the
parser.
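
The effect of the scope specifier can be sketched with two lexer rules.
The rule names are hypothetical, and the character-range notation
§'0'..'9'§ and the §(...)+§ closure are assumed to follow the usual
ANTLR-style conventions, which this section does not confirm. Under
these assumptions, §INT§ is public by default and would take part in
§nextToken§, while §DIGIT§ is a helper rule that would not.
\begin{verbatim}
protected DIGIT
    : '0'..'9'
    ;

INT
    : (DIGIT)+
    ;
\end{verbatim}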