From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-0.3 required=5.0 tests=BAYES_00,
	REPLYTO_WITHOUT_TO_CC autolearn=no autolearn_force=no version=3.4.4
X-Google-Thread: 103376,e0fa6eae2c537e3d
X-Google-NewGroupId: yes
X-Google-Attributes: gida07f3367d7,domainid0,public,usenet
X-Google-Language: ENGLISH,ASCII-7-bit
Path: 
 g2news2.google.com!news3.google.com!feeder1-2.proxad.net!proxad.net!feeder2-2.proxad.net!newsfeed.arcor.de!newsspool2.arcor-online.net!news.arcor.de.POSTED!not-for-mail
From: "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de>
Subject: Re: GNAT.Regpat problem.
Newsgroups: comp.lang.ada
User-Agent: 40tude_Dialog/2.0.15.1
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Reply-To: mailbox@dmitry-kazakov.de
Organization: cbb software GmbH
References: <alpine.WNT.2.00.1103221429170.5412@WHIRLWIND>
 <41cae3ac-97b6-4b01-ad73-9ff1ec2fdf86@q12g2000prb.googlegroups.com>
 <m2mxkmganb.fsf@pushface.org> <alpine.WNT.2.00.1103240622590.2220@WHIRLWIND>
 <1thhy53rdlhrs.1tygcc01f6i11$.dlg@40tude.net>
 <alpine.WNT.2.00.1103241003330.4896@WHIRLWIND>
 <5exbsdc3coio.1ke3vi3sk7ss4.dlg@40tude.net>
 <alpine.WNT.2.00.1103241708520.7004@WHIRLWIND>
Date: Fri, 25 Mar 2011 10:02:14 +0100
Message-ID: <1nldz3m96wp6o.1gxjoy2zlrwon.dlg@40tude.net>
NNTP-Posting-Date: 25 Mar 2011 10:02:14 CET
NNTP-Posting-Host: 34e7814b.newsspool1.arcor-online.net
X-Trace: 
 DXC=:>6bb7Uoc\\AX0F2i><W:Sic==]BZ:af^4Fo<]lROoRQ<`=YMgDjhgR>21F<XZeO;Z[6LHn;2LCV^7enW;^6ZC`T\`mfM[68DCS;hIe^g;BJfQ
X-Complaints-To: usenet-abuse@arcor.de
Xref: g2news2.google.com comp.lang.ada:19410
Date: 2011-03-25T10:02:14+01:00
List-Id: <comp.lang.ada>

On Thu, 24 Mar 2011 17:12:04 -0400, Peter C. Chapin wrote:

> On Thu, 24 Mar 2011, Dmitry A. Kazakov wrote:
> 
>> So, yes, it is difficult to find a case where RE could find place. And in 
>> general, any patterns are unusable for syntax analyzers for many reasons, 
>> I don't want to go into. A manually written scanner is simpler and safer.
> 
> I did notice that it was harder, I thought, to get high quality error 
> reporting when using packaged REs. I also had a case that used a hand 
> written finite state machine.

I prefer to split it into smaller elements (e.g. "comment", "literal") for
a recursive descent parser and then code each of them without an FSM. I
hate FSMs. I have to use them when dealing with layered protocols, which
makes me hate them more and more.
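
To illustrate what coding such elements by hand might look like, here is a
minimal sketch in Python (not Ada, and not taken from any real project).
The names ScanError, scan_comment and scan_string_literal are hypothetical,
and the lexical rules (a "--" comment running to end of line, a
double-quoted literal in which "" stands for one quote) are assumptions
chosen for the example. Each element is a plain function with no state
table, and each failure reports what was expected and where.

```python
class ScanError(Exception):
    """Hypothetical error type carrying the offset where scanning failed."""

    def __init__(self, message, position):
        super().__init__(f"{message} at offset {position}")
        self.position = position


def scan_comment(text, pos):
    """Scan a '--' comment starting at pos; return the offset of its end."""
    if not text.startswith("--", pos):
        raise ScanError("comment expected", pos)
    end = text.find("\n", pos)
    # The comment runs to the line end, or to the end of input.
    return len(text) if end < 0 else end


def scan_string_literal(text, pos):
    """Scan a double-quoted literal; return (value, offset past the literal)."""
    if pos >= len(text) or text[pos] != '"':
        raise ScanError("string literal expected", pos)
    start = pos
    pos += 1
    chars = []
    while pos < len(text):
        ch = text[pos]
        if ch == '"':
            if text.startswith('""', pos):
                # A doubled quote inside the literal denotes one quote.
                chars.append('"')
                pos += 2
                continue
            return "".join(chars), pos + 1
        if ch == "\n":
            break  # literals may not span lines in this sketch
        chars.append(ch)
        pos += 1
    raise ScanError("unterminated string literal opened", start)
```

A recursive descent parser would call these where the grammar expects the
corresponding element, so the error message can name that element instead
of saying only "Unrecognized input".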

> The code was longer but the error reporting 
> was very specific. I'm not saying it would be impossible to get the same 
> error reporting using REs but it seemed unnatural. Right now my code using 
> REs just reports "Unrecognized input" (or something to that effect) and 
> doesn't bother trying to figure out what's unrecognized and why. It was no 
> big deal to get the hand written FSM to do that, however.

That reminds me of one case for patterns that I forgot. When you get
something incomprehensible you might wish to jump forward to a place where
the parser could recover, e.g. to the end of a comment, to the n-th closing
bracket etc. (standard REs cannot count brackets). For this kind of
superficial, error-tolerant analysis patterns might indeed be usable. A
similar case would be syntax coloring.
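
A rough sketch of such recovery, again in Python and under stated
assumptions: the synchronization points (";" or end of line) and the
function names skip_to_sync and skip_closing_brackets are hypothetical,
picked only for illustration. The first function uses a pattern to find a
resynchronization point; the second counts brackets by hand, since that is
the part a standard RE cannot do.

```python
import re

# Assumed synchronization points: a statement terminator or a line end.
SYNC = re.compile(r";|\n")


def skip_to_sync(text, pos):
    """Return the offset just past the next ';' or newline, else len(text)."""
    match = SYNC.search(text, pos)
    return match.end() if match else len(text)


def skip_closing_brackets(text, pos, n=1):
    """Return the offset just past the n-th unmatched ')' at or after pos.

    Brackets opened after pos are balanced off first. Returns len(text)
    if no such bracket exists.
    """
    depth = 0
    for i in range(pos, len(text)):
        if text[i] == "(":
            depth += 1
        elif text[i] == ")":
            if depth > 0:
                depth -= 1  # closes a bracket opened after pos
            else:
                n -= 1
                if n == 0:
                    return i + 1
    return len(text)
```

After reporting the error, the parser would resume at the returned offset.
The sketch ignores brackets inside comments and literals, which a real
recovery routine would have to skip as well.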

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de