From: "Dmitry A. Kazakov"
Subject: Re: GNAT.Regpat problem.
Newsgroups: comp.lang.ada
Date: Fri, 25 Mar 2011 10:02:14 +0100
Reply-To: mailbox@dmitry-kazakov.de
Organization: cbb software GmbH
Message-ID: <1nldz3m96wp6o.1gxjoy2zlrwon.dlg@40tude.net>
References: <41cae3ac-97b6-4b01-ad73-9ff1ec2fdf86@q12g2000prb.googlegroups.com> <1thhy53rdlhrs.1tygcc01f6i11$.dlg@40tude.net> <5exbsdc3coio.1ke3vi3sk7ss4.dlg@40tude.net>

On Thu, 24 Mar 2011 17:12:04 -0400, Peter C. Chapin wrote:

> On Thu, 24 Mar 2011, Dmitry A. Kazakov wrote:
>
>> So, yes, it is difficult to find a case where RE could find place. And in
>> general, any patterns are unusable for syntax analyzers for many reasons,
>> I don't want to go into. A manually written scanner is simpler and safer.
>
> I did notice that it was harder, I thought, to get high quality error
> reporting when using packaged REs. I also had a case that used a hand
> written finite state machine.

I prefer to split the input into smaller elements (e.g. "comment",
"literal") for a recursive descent parser and then code them without
an FSM. I hate FSMs.
I have to use FSMs when dealing with layered protocols, which makes me
hate them more and more.

> The code was longer but the error reporting
> was very specific. I'm not saying it would be impossible to get the same
> error reporting using REs but it seemed unnatural. Right now my code using
> REs just reports "Unrecognized input" (or something to that effect) and
> doesn't bother trying to figure out what's unrecognized and why. It was no
> big deal to get the hand written FSM to do that, however.

That reminds me of one case for patterns that I forgot. When you get
something incomprehensible, you might wish to jump forward to a place
where the parser could recover, e.g. to the end of a comment or to the
n-th closing bracket (standard REs cannot count brackets). For this kind
of superficial, error-tolerant analysis, patterns might indeed be usable.
A similar case would be syntax coloring.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
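
To make the two points above concrete, here is a rough sketch of a
hand-coded element scanner that reports a specific error, plus a recovery
helper that counts brackets to find a resynchronization point. It is in
Python rather than Ada purely for brevity, and every name in it
(ScanError, scan_string_literal, skip_to_closing_bracket) is hypothetical;
nothing here comes from GNAT.Regpat or from the code discussed in this
thread.

```python
class ScanError(Exception):
    """Scanner error that records where and why scanning failed."""

    def __init__(self, position, message):
        super().__init__(f"{message} at offset {position}")
        self.position = position
        self.message = message


def scan_string_literal(text, start):
    """Scan a double-quoted literal starting at text[start].

    Returns (value, next_index). No escape sequences are handled; this is
    only meant to show element-level scanning with specific errors.
    """
    if start >= len(text) or text[start] != '"':
        raise ScanError(start, "expected opening double quote")
    end = text.find('"', start + 1)
    if end < 0:
        raise ScanError(start, "unterminated string literal")
    return text[start + 1:end], end + 1


def skip_to_closing_bracket(text, start, open_ch="(", close_ch=")"):
    """Return the index just past the bracket closing the current group.

    Assumes `start` lies inside an already opened group, so the nesting
    depth begins at 1. Returns len(text) if the input ends first.
    """
    depth = 1
    for i in range(start, len(text)):
        ch = text[i]
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth -= 1
            if depth == 0:
                return i + 1
    return len(text)
```

A parser could catch ScanError, report its position and message, and then
call skip_to_closing_bracket to resume after the enclosing group. The
explicit depth counter is the part a standard RE cannot express.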