* lexical ambiguity
From: bla_bla1357 @ 2006-06-02 22:13 UTC

I'm doing a lexical analysis of Ada using Lex as part of a student project. The emphasis is on using Lex, not on the Ada programming language, and I'm not familiar with Ada. So what I would like to find out is whether there is any lexical ambiguity in Ada (like the ambiguity in C between the unary and binary plus and minus). Thanks in advance...

* Re: lexical ambiguity
From: Frank J. Lhota @ 2006-06-02 22:35 UTC

The biggest lexical issue with Ada is the multiple uses of the single quote. Single quotes:

- surround character literals (e.g. 'A'),
- prefix attributes (for example List'First), and
- are used in aggregates, such as Rational'(Num => 1, Denom => 2).

* Re: lexical ambiguity
From: Jeffrey R. Carter @ 2006-06-03 5:20 UTC

Frank J. Lhota wrote:
>
> - are used in aggregates, such as Rational'(Num => 1, Denom => 2).

This is a qualified expression, as is Integer'(I). It just happens that the expression is an aggregate. Aggregates themselves don't use the apostrophe:

    R : Rational := (Num => 1, Denom => 2);

--
Jeff Carter
"Why don't you bore a hole in yourself and let the sap run out?"
Horse Feathers
49

* Re: lexical ambiguity
From: Frank J. Lhota @ 2006-06-04 17:33 UTC

"Jeffrey R. Carter" <spam.not.jrcarter@acm.not.spam.org> wrote in message news:w_8gg.760526$084.110855@attbi_s22...
> Frank J. Lhota wrote:
>>
>> - are used in aggregates, such as Rational'(Num => 1, Denom => 2).
>
> This is a qualified expression, as is Integer'(I). It just happens that
> the expression is an aggregate. Aggregates themselves don't use the
> apostrophe:
>
>     R : Rational := (Num => 1, Denom => 2);

Yes, of course you're right. The main point is that the multiple uses of the single quote are the one thing that the Ada lexer needs to be especially careful about. Make sure that your lexer can handle the following expression properly:

    Foo'(',',',',',' ... )

* Re: lexical ambiguity
From: Jeffrey R. Carter @ 2006-06-05 1:36 UTC

Frank J. Lhota wrote:
>
> Yes, of course you're right. The main point is that the multiple uses of
> the single quote are the one thing that the Ada lexer needs to be especially
> careful about. Make sure that your lexer can handle the following expression
> properly:

Yes, but it's good to be precise.

>     Foo'(',',',',',' ... )

Clearly you have an evil mind :)

--
Jeff Carter
"What I wouldn't give for a large sock with horse manure in it."
Annie Hall
42

* Re: lexical ambiguity
From: Frank J. Lhota @ 2006-06-05 18:30 UTC

Jeffrey R. Carter wrote:
> Yes, but it's good to be precise.

Absolutely! I should have said "qualified expression" in my original post. Sorry for any confusion that I may have caused.

>>     Foo'(',',',',',' ... )
>
> Clearly you have an evil mind :)

Well, there is a good reason to consider this worst-case scenario. I have seen quick-and-dirty Ada lexers that try to determine whether a single quote starts a character literal by looking ahead two characters. As this scenario shows, that approach is not guaranteed to work.

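The failure of the two-character look-ahead described above can be reproduced with a small sketch. This is not taken from any real Ada lexer: `naive_tokens` is a hypothetical name, the input is a shortened variant of the `Foo'(...)` example, and the scanner only knows identifiers, quotes, and single-character delimiters.

```python
def naive_tokens(src):
    """Tokenize with the flawed rule: a quote starts a character literal
    whenever the character two positions later is also a quote."""
    tokens = []
    i = 0
    while i < len(src):
        c = src[i]
        if c.isspace():
            i += 1
        elif c.isalpha():
            # Identifier: letters, digits, underscores.
            j = i
            while j < len(src) and (src[j].isalnum() or src[j] == "_"):
                j += 1
            tokens.append(src[i:j])
            i = j
        elif c == "'" and i + 2 < len(src) and src[i + 2] == "'":
            # Look-ahead says "character literal", regardless of context.
            tokens.append(src[i:i + 3])
            i += 3
        else:
            # Anything else, including a lone quote, is a one-character token.
            tokens.append(c)
            i += 1
    return tokens


if __name__ == "__main__":
    print(naive_tokens("Foo'(',',',')"))
```

On this input the look-ahead rule groups the tick and the opening parenthesis after `Foo` into a bogus character literal `'('`, and everything after it is misaligned; a correct lexer would produce `Foo`, a tick, `(`, two `','` literals separated by a comma, and `)`.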
* Re: lexical ambiguity
From: Keith Thompson @ 2006-06-05 20:27 UTC

"Frank J. Lhota" <flhota@NOSPAM.ll.mit.edu> writes:
> Jeffrey R. Carter wrote:
>> Yes, but it's good to be precise.
>
> Absolutely! I should have said "qualified expression" in my original
> post. Sorry for any confusion that I may have caused.
>
>>>     Foo'(',',',',',' ... )
>> Clearly you have an evil mind :)
>
> Well, there is a good reason to consider this worst-case scenario. I
> have seen quick-and-dirty Ada lexers that try to determine whether a single
> quote starts a character literal by looking ahead two characters. As this
> scenario shows, that approach is not guaranteed to work.

If I recall correctly, it's sufficient to remember what the previous token was. A character literal cannot follow an identifier.

I think that might break down if an implementation chooses to define an attribute with a single-character name, but I don't remember the details; presumably no implementation will actually do this.

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

* Re: lexical ambiguity
From: Jeffrey R. Carter @ 2006-06-05 22:11 UTC

Keith Thompson wrote:
>
> If I recall correctly, it's sufficient to remember what the previous
> token was. A character literal cannot follow an identifier.

Right, so it must be either an attribute, a qualified expression, or an error. An attribute must be an identifier, so it can't be an attribute, so it's either a qualified expression or an error.

In this case, it's an error, since you can't have "..." as part of an aggregate :)

--
Jeff Carter
"Nobody expects the Spanish Inquisition!"
Monty Python's Flying Circus
22

* Re: lexical ambiguity
From: Georg Bauhaus @ 2006-06-06 10:39 UTC

On Mon, 2006-06-05 at 22:11 +0000, Jeffrey R. Carter wrote:
> Keith Thompson wrote:
> >
> > If I recall correctly, it's sufficient to remember what the previous
> > token was. A character literal cannot follow an identifier.
>
> Right, so it must be either an attribute, a qualified expression, or an
> error.

Though the previous token shouldn't be a reserved word, as in

    if'('="-"("="('='=',',','=','))

-- Georg

* Re: lexical ambiguity
From: M E Leypold @ 2006-06-06 11:38 UTC

Georg Bauhaus <bauhaus@futureapps.de> writes:
> On Mon, 2006-06-05 at 22:11 +0000, Jeffrey R. Carter wrote:
> > Keith Thompson wrote:
> > >
> > > If I recall correctly, it's sufficient to remember what the previous
> > > token was. A character literal cannot follow an identifier.
> >
> > Right, so it must be either an attribute, a qualified expression, or an
> > error.
>
> Though the previous token shouldn't be a reserved word, as in
>
>     if'('="-"("="('='=',',','=','))

Or

    return'a';

So now (question to all): Is the following rule enough?

- "'" is the beginning of a character literal if the token before "'" has not been an identifier (reserved words not counted as identifier in this case).

Regards -- Markus

* Re: lexical ambiguity
From: Dmitry A. Kazakov @ 2006-06-07 9:02 UTC

On 06 Jun 2006 13:38:06 +0200, M E Leypold wrote:
> Georg Bauhaus <bauhaus@futureapps.de> writes:
>
>> On Mon, 2006-06-05 at 22:11 +0000, Jeffrey R. Carter wrote:
>>> Keith Thompson wrote:
>>> >
>>> > If I recall correctly, it's sufficient to remember what the previous
>>> > token was. A character literal cannot follow an identifier.
>>>
>>> Right, so it must be either an attribute, a qualified expression, or an
>>> error.
>>
>> Though the previous token shouldn't be a reserved word, as in
>>
>>     if'('="-"("="('='=',',','=','))
>
> Or
>
>     return'a';
>
> So now (question to all): Is the following rule enough?
>
> - "'" is the beginning of a character literal if the token before
>   "'" has not been an identifier (reserved words not counted as
>   identifier in this case).

It does not differ from the case of +/-. In the infix context, i.e. after an operand (whatever it might be), ' is an infix operation, just as +/- are. In the prefix context, where an operand is expected, ' introduces a character literal (= operand), while +/- are unary prefix operations.

Your rule is wrong: 'A' and 'B'. "and" is a reserved word.

Then of course "..." comments should be parsed before. Which gives you a nice vicious circle around ' " ' and " ' ". (:-))

The bottom line: parsing has state.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

* Re: lexical ambiguity
From: Georg Bauhaus @ 2006-06-07 13:15 UTC

On Tue, 2006-06-06 at 13:38 +0200, M E Leypold wrote:
> - "'" is the beginning of a character literal if the token before
>   "'" has not been an identifier (reserved words not counted as
>   identifier in this case).

You could change the words of the rule slightly by considering

    '''

Georg

* Re: lexical ambiguity
From: Robert A Duff @ 2006-06-07 14:49 UTC

M E Leypold <development-2006-8ecbb5cc8a-REMOVETHIS@m-e-leypold.de> writes:
> So now (question to all): Is the following rule enough?
>
> - "'" is the beginning of a character literal if the token before
>   "'" has not been an identifier (reserved words not counted as
>   identifier in this case).

Not quite:

    function F(X: Integer) return String;

    Length: constant Natural := F(123)'Length;

    Y: access T'Class := ...;
    Z: access T2'Class := Y.all'Access;

For reserved words, I think you have to study the grammar, and determine which ones can precede a tick mark.

- Bob

* Re: lexical ambiguity
From: M E Leypold @ 2006-06-07 17:18 UTC

Robert A Duff <bobduff@shell01.TheWorld.com> writes:
> M E Leypold <development-2006-8ecbb5cc8a-REMOVETHIS@m-e-leypold.de> writes:
>
> > So now (question to all): Is the following rule enough?
> >
> > - "'" is the beginning of a character literal if the token before
> >   "'" has not been an identifier (reserved words not counted as
> >   identifier in this case).
>
> Not quite:
>
>     function F(X: Integer) return String;
>
>     Length: constant Natural := F(123)'Length;

Ouch.

OK. First a message to Dmitry A. Kazakov and Georg Bauhaus: Sorry, I understood neither all of what you said nor the exact implications. But thanks!

Then: The original poster asked a question about 'lexical ambiguity'. The ensuing discussion leaves me more and more doubtful: Can lexical analysis (grouping characters into tokens) and grammatical analysis (building a parse tree from a token sequence) be separated cleanly in Ada?

My first approach would have been (no, I'm not implementing an Ada parser, but since compiler construction has been a favorite subject of mine for a number of years, I'm a bit curious about the position of Ada in all this) -- now: My first approach would have been to write a lexer with a minimal amount of state. It would shift into a collect-string state when encountering a '"' (I mean a double quote :-) and into a maybe-now-comes-a-character-literal state at certain points. My first take was that the "certain points" are always after identifiers. In view of the case quoted above (F(123)'Length) I could amend this rule by adding ')' to the certain points.

But now things become rather ad hoc. Well -- as I said, it's just curiosity driving me, so I'm not going to examine the RM now, nor am I going to reverse engineer GNAT to find out how it is done in reality.

But if anyone in c.l.a. has the answer to the following questions, I'd be eternally grateful. Well, grateful, anyway. :-)

- Is it possible (for Ada parsers) to separate lexical analysis and grammatical analysis into separate phases without tricky feedback from parser to lexer, possibly by using a lexer with a finite number of states?

- What is the complete rule for deciding when the next token might be a character literal? Or is that undecidable by just looking at past input (i.e. using lexer state)?

BTW: The "evil" case

    if'('="-"("="('='=',',','=','))

is not parsed correctly by syntax highlighting in emacs ada-mode (I wouldn't have expected it to be, actually). The rule there seems to be my incomplete rule without the reserved words exception. Everything falls magically into place if a " " is inserted immediately after "if".

>     Y: access T'Class := ...;
>     Z: access T2'Class := Y.all'Access;
>
> For reserved words, I think you have to study the grammar, and determine
> which ones can precede a tick mark.

OK. That I understand now.

Regards -- Markus

* Re: lexical ambiguity
From: Robert A Duff @ 2006-06-08 21:30 UTC

M E Leypold <development-2006-8ecbb5cc8a-REMOVETHIS@m-e-leypold.de> writes:
> Robert A Duff <bobduff@shell01.TheWorld.com> writes:
>
> > M E Leypold <development-2006-8ecbb5cc8a-REMOVETHIS@m-e-leypold.de> writes:
> >
> > > So now (question to all): Is the following rule enough?
> > >
> > > - "'" is the beginning of a character literal if the token before
> > >   "'" has not been an identifier (reserved words not counted as
> > >   identifier in this case).
> >
> > Not quite:
> >
> >     function F(X: Integer) return String;
> >
> >     Length: constant Natural := F(123)'Length;
>
> Ouch.

It's not a BIG ouch.

To determine whether a single quote begins a character literal versus a tick, it is sufficient to look back one token. Some tokens can be followed by a tick, some by a char_lit, and some by neither. None can be followed by both. It's fairly straightforward to study the grammar and determine which are which. Or look at the GNAT sources.

It might be wise to include a sentinel token at the start of the token stream (Begin_File_Token or whatever), just in case ' comes first (that would be illegal, but you don't want to crash on it).

It can all be done in the lexer, with no feedback from the parser -- the lexer just needs to keep track of the previous token, and check it when it sees a single quote. Lookahead will get you in trouble; look-back is the better answer here.

> OK. First a message to Dmitry A. Kazakov and Georg Bauhaus: Sorry, I
> understood neither all of what you said nor the exact
> implications. But thanks!

I didn't entirely understand that, either.

> Then: The original poster asked a question about 'lexical
> ambiguity'. The ensuing discussion leaves me more and more doubtful:
> Can lexical analysis (grouping characters into tokens) and grammatical
> analysis (building a parse tree from a token sequence) be separated
> cleanly in Ada?

Yes. The look-back is localized to the lexer (which is not "clean", but at least it's localized (separated from the parser)).

> My first approach would have been (no, I'm not implementing an Ada
> parser, but since compiler construction has been a favorite subject of
> mine for a number of years, I'm a bit curious about the position of Ada
> in all this) -- now: My first approach would have been to write a
> lexer with a minimal amount of state. It would shift into a
> collect-string state when encountering a '"' (I mean a double quote
> :-) and into a maybe-now-comes-a-character-literal state
> at certain points. My first take was that the "certain points" are
> always after identifiers. In view of the case quoted above
> (F(123)'Length) I could amend this rule by adding ')' to the certain
> points.

Right. But you have to study the grammar to know which tokens have this property. It's not that big of a deal.

> But now things become rather ad hoc. Well -- as I said, it's just
> curiosity driving me, so I'm not going to examine the RM now, nor am I
> going to reverse engineer GNAT to find out how it is done in reality.
>
> But if anyone in c.l.a. has the answer to the following questions, I'd
> be eternally grateful. Well, grateful, anyway. :-)
>
> - Is it possible (for Ada parsers) to separate lexical analysis and
>   grammatical analysis into separate phases without tricky feedback
>   from parser to lexer, possibly by using a lexer with a finite
>   number of states?

Yes. Just a tiny bit of state -- the previous token. The lexer writer needs to understand the grammar, but the lexer does not need to understand the parser.

> - What is the complete rule for deciding when the next token might
>   be a character literal? Or is that undecidable by just looking at
>   past input (i.e. using lexer state)?

It is decidable by looking at the previous token. I forget the exact rule, but it can be deduced easily from the grammar.

> BTW: The "evil" case
>
>     if'('="-"("="('='=',',','=','))
>
> is not parsed correctly by syntax highlighting in emacs ada-mode (I wouldn't
> have expected it to be, actually). The rule there seems to be my incomplete
> rule without the reserved words exception. Everything falls magically
> into place if a " " is inserted immediately after "if".

I'm not surprised. Emacs ada-mode uses some ad-hoc technique that doesn't always work properly. Anyway, Emacs is trying to parse bits and pieces of things without seeing the whole file, and that's a whole 'nother thing. It is certainly easy to parse the above "evil" thing properly, but not necessarily if you start in the middle of it.

> >     Y: access T'Class := ...;
> >     Z: access T2'Class := Y.all'Access;
> >
> > For reserved words, I think you have to study the grammar, and determine
> > which ones can precede a tick mark.
>
> OK. That I understand now.
>
> Regards -- Markus

- Bob

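The one-token look-back described in this message can be sketched roughly as follows. Everything named here is an assumption made for illustration: the token kinds, the `BOF` sentinel, the tiny `RESERVED` subset, and above all the `TICK_MAY_FOLLOW` set, which covers only the cases raised in the thread (identifier, `)`, and `all`) and is almost certainly incomplete compared with what a study of the grammar would give. It is not the GNAT rule.

```python
RESERVED = {"if", "return", "and", "then", "access"}  # tiny illustrative subset
TICK_MAY_FOLLOW = {"IDENT", "RPAREN", "ALL"}          # assumed, likely incomplete


def tokenize(src):
    """Return (kind, text) pairs, deciding tick vs. character literal
    purely from the kind of the previous token."""
    tokens = []
    prev = "BOF"  # sentinel so a leading quote has something to look back at
    i = 0
    while i < len(src):
        c = src[i]
        if c.isspace():
            i += 1
            continue
        if c.isalpha():
            j = i
            while j < len(src) and (src[j].isalnum() or src[j] == "_"):
                j += 1
            word = src[i:j].lower()
            if word == "all":
                kind = "ALL"
            elif word in RESERVED:
                kind = "RESERVED"
            else:
                kind = "IDENT"
        elif c.isdigit():
            j = i
            while j < len(src) and src[j].isdigit():
                j += 1
            kind = "NUM"
        elif c == '"':
            # Simplified string literal: doubled quotes are not handled.
            j = src.index('"', i + 1) + 1
            kind = "STRING"
        elif c == "'":
            if prev in TICK_MAY_FOLLOW:
                j, kind = i + 1, "TICK"
            elif i + 2 < len(src) and src[i + 2] == "'":
                j, kind = i + 3, "CHAR"
            else:
                raise ValueError("stray single quote at offset %d" % i)
        else:
            j = i + 1
            kind = "RPAREN" if c == ")" else "DELIM"
        tokens.append((kind, src[i:j]))
        prev = kind
        i = j
    return tokens


if __name__ == "__main__":
    for sample in ["Foo'(',',',')", "F(123)'Length", "Y.all'Access",
                   "if'('=\"-\"(\"=\"('='=',',','=','))"]:
        print([text for _, text in tokenize(sample)])
```

With these assumptions the quote after `Foo`, after `)` in `F(123)'Length`, and after `all` becomes a tick, while the quote after `if` or after a delimiter starts a character literal, so the "evil" example splits into `if`, `'('`, `=`, `"-"`, and so on. Extending `TICK_MAY_FOLLOW` from the grammar is the part the sketch leaves open.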
* Re: lexical ambiguity
From: Jeffrey R. Carter @ 2006-06-09 4:41 UTC

M E Leypold wrote:
>
> But now things become rather ad hoc. Well -- as I said, it's just
> curiosity driving me, so I'm not going to examine the RM now, nor am I
> going to reverse engineer GNAT to find out how it is done in reality.

You don't need to reverse engineer it. The sources are freely available.

--
Jeff Carter
"Run away! Run away!"
Monty Python and the Holy Grail
58

* Re: lexical ambiguity
From: Georg Bauhaus @ 2006-06-09 8:23 UTC

M E Leypold wrote:
>> M E Leypold <development-2006-8ecbb5cc8a-REMOVETHIS@m-e-leypold.de> writes:
>>
>>> So now (question to all): Is the following rule enough?
>>>
>>> - "'" is the beginning of a character literal if the token before
>>>   "'" has not been an identifier (reserved words not counted as
>>>   identifier in this case).

> OK. First a message to Dmitry A. Kazakov and Georg Bauhaus: Sorry, I
> understood neither all of what you said nor the exact
> implications. But thanks!

Just a sloppy remark that in ''' the second single quote isn't the beginning of a character literal even though the token before it has not been an identifier. Just another case I could think of.

Georg

* Re: lexical ambiguity
From: Simon Clubley @ 2006-06-06 13:50 UTC

In article <1149590366.8521.5.camel@localhost>, Georg Bauhaus <bauhaus@futureapps.de> writes:
>
> Though the previous token shouldn't be a reserved word, as in
>
>     if'('="-"("="('='=',',','=','))
>

Hmmm. :-) Perhaps somebody should run an "Obfuscated Ada" contest...

Simon.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
If Google's motto is "don't be evil", then how did we get Google Groups 2 ?

* Re: lexical ambiguity
From: Peter C. Chapin @ 2006-06-06 18:56 UTC

Georg Bauhaus <bauhaus@futureapps.de> wrote in news:1149590366.8521.5.camel@localhost:

>     if'('="-"("="('='=',',','=','))

Now *that* is evil. :-)

Peter

* Re: lexical ambiguity
From: Georg Bauhaus @ 2006-06-06 19:41 UTC

On Tue, 2006-06-06 at 18:56 +0000, Peter C. Chapin wrote:
> Georg Bauhaus <bauhaus@futureapps.de> wrote in
> news:1149590366.8521.5.camel@localhost:
>
> >     if'('="-"("="('='=',',','=','))
>
> Now *that* is evil. :-)

;)

When it came to the tick mark in ASnip's tokenizer, I had to consider the case when there isn't a token at which to look back (a source snippet might well start with 'x'). So the solution isn't perfect. I should add another piece of history, for better classing of tokens where possible.

-- Georg

* Re: lexical ambiguity
From: Jeffrey R. Carter @ 2006-06-05 22:16 UTC

Frank J. Lhota wrote:
>
> Well, there is a good reason to consider this worst-case scenario. I
> have seen quick-and-dirty Ada lexers that try to determine whether a single
> quote starts a character literal by looking ahead two characters. As this
> scenario shows, that approach is not guaranteed to work.

That's too simple-minded. A character literal can't follow an identifier, so this must be either an attribute or a qualified expression (presuming it's not an error). Since "(" can't be an attribute, it must be a qualified expression. I'm not sure how to parse "...", though.

You still have an evil mind, since you didn't include any spaces between the components of the aggregate, making it even harder for humans to parse (lack of spaces shouldn't make any difference to machine parsing).

--
Jeff Carter
"Nobody expects the Spanish Inquisition!"
Monty Python's Flying Circus
22

* Re: lexical ambiguity
From: Frank J. Lhota @ 2006-06-06 13:20 UTC

Jeffrey R. Carter wrote:
> Frank J. Lhota wrote:
>>
>> Well, there is a good reason to consider this worst-case scenario. I
>> have seen quick-and-dirty Ada lexers that try to determine whether a single
>> quote starts a character literal by looking ahead two characters. As this
>> scenario shows, that approach is not guaranteed to work.
>
> That's too simple-minded. A character literal can't follow an
> identifier, so this must be either an attribute or a qualified
> expression (presuming it's not an error). Since "(" can't be an
> attribute, it must be a qualified expression. I'm not sure how to parse
> "...", though.

That is precisely my point: the character look-ahead is too simple-minded. As you and other posters have pointed out, if we simply keep track of the last token, we can use that information to determine how to handle the single quote.

> You still have an evil mind, since you didn't include any spaces between
> the components of the aggregate, making it even harder for humans to
> parse (lack of spaces shouldn't make any difference to machine parsing).

This example was to illustrate a worst-case scenario for an Ada lexer. It was *not* presented as an example of recommended programming style, which it clearly is not.

* Re: lexical ambiguity
From: Keith Thompson @ 2006-06-02 23:27 UTC

bla_bla1357 <bla_bla1357MaknispaM@yahoo.com> writes:
> I'm doing a lexical analysis of Ada using Lex as part of a student project.
> The emphasis is on using Lex, not on the Ada programming language, and
> I'm not familiar with Ada. So what I would like to find out is whether
> there is any lexical ambiguity in Ada (like the ambiguity in C between the
> unary and binary plus and minus). Thanks in advance...

I suppose it depends on what you mean by "lexical ambiguity". Strictly speaking, there are no grammatical ambiguities in either language. There are plenty of things that look like ambiguities, but they're all resolved by the rules of the language.

In C, for example, this:

    x+++++y

looks like it could be parsed as

    x ++ + ++ y

which would be a legal expression, but in fact it's tokenized as

    x ++ ++ + y

which results in a syntax error. (C's typedef names do cause some interesting lexical problems, but that's another topic.)

Ada, like C, has unary and binary "+" and "-" operators, but each operator is easily identified based on the syntactic context in which it appears.

One well-known case of a near ambiguity is:

    Character'('x')

If Ada followed C's "maximal munch" rule, this would be tokenized as

    Character '(' x '...

leading to a syntax error; instead, it's tokenized as:

    Character ' ( 'x' )

So, there are no real ambiguities in either language, but each uses different rules to resolve things that would otherwise have been ambiguous.

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

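The "maximal munch" behaviour on `x+++++y` can be checked with a few lines of code. This is only a sketch of the longest-match idea, not a C lexer: `munch` and `OPERATORS` are hypothetical names, and the operator table holds just the two operators needed for this example.

```python
OPERATORS = ["++", "+"]  # longest first, so "++" wins over "+" when both match


def munch(src):
    """Split src into tokens, always taking the longest operator that
    matches at the current position (maximal munch)."""
    tokens = []
    i = 0
    while i < len(src):
        if src[i].isspace():
            i += 1
            continue
        if src[i].isalnum() or src[i] == "_":
            # Identifier or number: take the whole run.
            j = i
            while j < len(src) and (src[j].isalnum() or src[j] == "_"):
                j += 1
        else:
            for op in OPERATORS:
                if src.startswith(op, i):
                    j = i + len(op)
                    break
            else:
                j = i + 1  # unknown character: single-character token
        tokens.append(src[i:j])
        i = j
    return tokens


if __name__ == "__main__":
    print(munch("x+++++y"))  # ['x', '++', '++', '+', 'y']
```

The lexer never considers the split `x ++ + ++ y`, because at each position it commits to the longest operator without regard to what the parser could accept afterwards.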