From: Georg Bauhaus
Newsgroups: comp.lang.ada
Subject: Re: ada is getting spanked :(
Date: Sun, 29 Oct 2006 21:23:27 +0100
Message-ID: <1162153407.18869.34.camel@localhost.localdomain>
References: <1162052997.664967.135910@e3g2000cwe.googlegroups.com> <3321666.DLNnW6yRHq@linux1.krischik.com> <1162085683.30292.23.camel@localhost.localdomain>

On Sun, 2006-10-29 at 16:27 +0000, Björn Persson wrote:
> Georg Bauhaus wrote:
> > In order to optimize GNAT's standing, I have made a small
> > but quite effective change to one of Jim Rogers' programs
> > (regex-dna #2), the factor is 14 (fourteen).

The speed factor is now close to 18 :-), after some simplifications
and after (I think) more closely reflecting the benchmark description.

http://home.arcor.de/bauhaus/Ada/

The program now also works more like the others (in my view).

> Ah, the one that uses Spitbol.
> Doesn't it really belong under
> "interesting alternative programs"? The requirements clearly state that
> regular expressions should be used.

The pattern strings in all programs look regular to me,
including the GNAT.SPITBOL ones. But that doesn't mean that the
various programs' calls to RE routines such as .findall, -all,
m//g, global subst(...) etc. imply "normal" naive regular expression
processing. So GNAT.SPITBOL is no exception here.
(In particular, I have left out obvious standard SPITBOL
improvements in order to reflect the benchmark description.)

I guess that the benchmark is also about how well an
implementation deals with just simple REs.

Anyway, I think that all of the RE, PCRE, and SPITBOL patterns
I've seen in many of the contributions reflect the spirit of the
benchmark, as is required (literally).

How could a notion of REs be both precise and precisely applicable
to the ways in which various PLs implement them? E.g., how can you
turn off Boyer-Moore string searching when one implementation has
it, just so that only some specified internal way of pattern
matching is compared?

So my rule was: simple patterns, no tricks.

> I fixed up the regex version some time ago and achieved a dramatic
> improvement, but I didn't touch the Spitbol version. They were very
> similar before that, and as I recall they had similar performance too.
> My changes are now in "regex-dna Ada 95 GNAT #3".

I did have a look at this program; however, with program #3 as is,
I keep getting segmentation faults due to the
Sequence_Lines (1 .. 1_000_000) array of Unbounded_String.
ulimit -s 10000 didn't help; reducing the number of lines did.

regards,
Georg
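
For what it's worth, here is a minimal sketch of one possible way
around the segmentation fault. It assumes that the array in program #3
is declared as an ordinary (stack) object, which the post does not
confirm. Only the name Sequence_Lines and the bounds 1 .. 1_000_000
come from the post; Heap_Lines, Line_Array and Line_Array_Access are
hypothetical names chosen for illustration. The idea is to allocate
the array with "new" so that it lives on the heap and no longer
depends on the stack limit.

```ada
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Heap_Lines is
   --  Hypothetical type names; not taken from program #3.
   type Line_Array is array (Positive range <>) of Unbounded_String;
   type Line_Array_Access is access Line_Array;

   --  The allocator places the million components on the heap,
   --  so the size of the array is not limited by "ulimit -s".
   Sequence_Lines : constant Line_Array_Access :=
     new Line_Array (1 .. 1_000_000);
begin
   --  Components start out as Null_Unbounded_String and are
   --  assigned through the access value like a normal array.
   Sequence_Lines (1) := To_Unbounded_String ("GGCCGGGCGCGGTGGCTCAC");
   Ada.Text_IO.Put_Line (To_String (Sequence_Lines (1)));
end Heap_Lines;
```

Since the array is needed for the whole run of the program, this
sketch never frees it; a longer-lived program would pair the allocator
with Ada.Unchecked_Deallocation.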