* derived_type_definition ::= [abstract] new subtype_ind [record_extension_part]
@ 2003-03-14 2:34 Oliver Kellogg
2003-03-16 18:30 ` Oliver Kellogg
2003-03-16 23:06 ` Oliver Kellogg
0 siblings, 2 replies; 3+ messages in thread
From: Oliver Kellogg @ 2003-03-14 2:34 UTC (permalink / raw)
The Ada95 grammar has just one rule, derived_type_definition,
to denote both an ordinary derived type and a derived record
extension. How come the distinction is missing?
(When making an abstract syntax tree for Ada, I wonder
whether this is fine-grained enough. In particular, it
doesn't feel right to have just one node representing
both alternatives.)
O. Kellogg
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: derived_type_definition ::= [abstract] new subtype_ind [record_extension_part]
2003-03-14 2:34 derived_type_definition ::= [abstract] new subtype_ind [record_extension_part] Oliver Kellogg
@ 2003-03-16 18:30 ` Oliver Kellogg
2003-03-16 23:06 ` Oliver Kellogg
1 sibling, 0 replies; 3+ messages in thread
From: Oliver Kellogg @ 2003-03-16 18:30 UTC (permalink / raw)
I similar problem exists for the mixing of PROCEDURE
and FUNCTION in the subprogram related rules.
I'm resolving the problem by propagating those tokens
into the tree nodes. Not beautiful, but it works.
OTOH, some of the Ada grammar rules seem overly
detailed. `statement' for example, which resolves to
`simple_statement' or `compound_statement', and then
to the individual statements.
I guess I just don't yet grasp the design rationale of
the Ada grammar in all detail. Does anyone know where to
find such information?
Thanks,
Oliver Kellogg
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: derived_type_definition ::= [abstract] new subtype_ind [record_extension_part]
2003-03-14 2:34 derived_type_definition ::= [abstract] new subtype_ind [record_extension_part] Oliver Kellogg
2003-03-16 18:30 ` Oliver Kellogg
@ 2003-03-16 23:06 ` Oliver Kellogg
1 sibling, 0 replies; 3+ messages in thread
From: Oliver Kellogg @ 2003-03-16 23:06 UTC (permalink / raw)
Pardon the bad style of replying to one's own posting...
I wrote:
> (When making an abstract syntax tree for Ada, I wonder
> whether this is fine-grained enough. In particular, it
> doesn't feel right to have just one node representing
> both alternatives.)
Well, after some more thinking I know where the "bad feeling"
comes from, see the end of the below text. (These are my
internal guidelines for the ANTLR Ada tree design.)
$Id: TREMULA,v 0.0 2003/03/16 22:36:38 kellogg Exp kellogg $
TREe design Meta rULes for Ada
In principle, the tree structure closely follows the RM grammar,
except where good reasons suggest not to.
- Most of the RM rules are mirrored here, both as rules and as tree
nodes. For an explanation of which rules are not mirrored, see
the next paragraph.
Rule names are given in lowercase letters.
Node types are given in uppercase letters.
Node types named after RM grammar rules represent the corresponding
items of the RM.
Node types not named after RM grammar rules are auxiliary nodes.
The main purpose of such nodes is to normalize the node structure.
(For an index of all node types implemented, see the second and
third part of the "tokens" section of the AdaLexer in ada.g.)
- Only a few RM grammar rules are not mirrored, namely those which
are trivially reconstructed from finer grained rules/nodes.
An example of such are the simple_statement and compound_statement
rules/nodes which are trivially reconstructed from the individual
statement nodes.
The reason for omitting this kind of nodes is to avoid burdening
the tree with "pass-thru" nodes that offer very little added value.
- Additionally, optional items in the RM grammar (i.e. the parts
enclosed in square brackets) have their own rules, and also their
own nodes in the tree. The rule name ends in _opt and the node type
ends in _OPT.
Subsequent optional items, be they nested or in series, are mapped
to a single _OPT node, for example:
[ abstract [ tagged ] ] [ limited ] => ABSTRACT_TAGGED_LIMITED_OPT
- Tree Design Formula:
To go from an RM rule to the tree, prune the RM rule of terminals.
The items that remain after this pruning have a direct correspondence
in the tree node, with the abovementioned augmentation of optional
items applying. The sequence of items is preserved in the node.
The only syntactic items not mirrored in the tree are:
1. The "END [identifier]" at the end of Ada units et al.
The optional identifier, if given, is checked to match with the
corresponding defining_identifier at the start of the unit during
parsing, but the END keyword and the ending identifier do not
enter the tree.
2. The ending semicolon of all statements/declarations.
Here is an example for the mapping from RM grammar to the ANTLR rule,
and to the Abstract Syntax Tree.
RM grammar rule:
block_statement ::= [block_statement_identifier:]
[ DECLARE
declarative_part ]
BEGIN
handled_sequence_of_statements
END [block_identifier];
Corresponding ANTLR rule:
block_statement
Direct subordinate rules invoked by block_statement:
statement_identifier_opt
Handles the "[block_statement_identifier:]".
This is a node of type STATEMENT_IDENTIFIER_OPT which is
always present, but whose contents may be empty because the
block_statement_identifier is optional. The ending colon
does not appear in the node.
declare_opt
Handles the "[ DECLARE declarative_part ]".
This is a node of type DECLARE_OPT which is always present,
but whose contents may be empty because the "DECLARE
declarative_part" is optional.
The DECLARE keyword does not appear in the node.
block_body
Handles the "BEGIN handled_sequence_of_statements".
Returns a node of type HANDLED_SEQUENCE_OF_STATEMENTS.
The BEGIN keyword does not appear in the node.
end_id_opt
Handles the "END [block_identifier]".
Checks that the block_identifier matches the
block_statement_identifier given at the beginning.
However, no node is constructed.
semi
Handles the ending semicolon.
No node is constructed.
Note: The existence of the last three rules (block_body, end_id_opt,
and semi) does not directly follow from the Tree Design Formula
given above. In fact, those three rules could be inlined into
the block_statement rule without ripple. It just so happens that
these rules can be reused in a number of other rules.
Thus, the Tree Design Formula is extended as follows:
- If a subset of a rule is found to occur in more than one place
within the grammar, then this subset is factored into a separate
rule. The new rule is "synthetic" in that it does not appear in
the RM grammar. There may also exist a node type that corresponds
to the synthetic rule. However, the design decision about whether
or not such a node exists is not yet formalized. @@FIXME
AST node: type = BLOCK_STATEMENT
Subordinate nodes:
STATEMENT_IDENTIFIER_OPT
DECLARE_OPT
HANDLED_SEQUENCE_OF_STATEMENTS
==
Applying the Tree Design Formula to the Ada grammar has worked out
fine - with two prominent exceptions:
1. Mentions of PROCEDURE and FUNCTION
The Annex P grammar always lumps together mentions of PROCEDURE
and FUNCTION in a single rule. (The rules are:
subprogram_specification, access_to_subprogram_definition,
generic_renaming_declaration, generic_instantiation.)
However, this is not fine grained enough, i.e. applying the Tree
Design Formula it is not possible to discard the terminals,
PROCEDURE and FUNCTION, to arrive at the node structure.
This problem is solved by maintaining the PROCEDURE and FUNCTION
terminals within the corresponding nodes.
2. derived_type_definition
The derived_type_definition rule does not exclude the following:
TYPE child IS ABSTRACT NEW parent;
which of course is not valid Ada. In order to maintain a clean
node design, the derived_type_definition is split into two
rules/nodes, namely ordinary_derived_type and
derived_record_extension (rules) resp. ORDINARY_DERIVED_TYPE
and DERIVED_RECORD_EXTENSION (nodes).
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2003-03-16 23:06 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-03-14 2:34 derived_type_definition ::= [abstract] new subtype_ind [record_extension_part] Oliver Kellogg
2003-03-16 18:30 ` Oliver Kellogg
2003-03-16 23:06 ` Oliver Kellogg
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox