comp.lang.ada
From: "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de>
Subject: Re: Parallel Text Corpus Processing with Ada?
Date: Mon, 12 Nov 2007 14:31:37 +0100
Date: 2007-11-12T14:24:39+01:00
Message-ID: <qde6vz562brd.yx5hzl15of6t.dlg@40tude.net>
In-Reply-To: 1194796479.6547.13.camel@K72

On Sun, 11 Nov 2007 16:54:40 +0100, Georg Bauhaus wrote:

> On Sun, 2007-11-11 at 09:23 +0100, Dmitry A. Kazakov wrote:
> 
>> Why necessarily RE? Or else why patterns? Patterns come at a high price.
>> They are significantly slower than tailored string processing algorithms.
> 
> SPITBOL patterns serve "tailored string processing algorithms";
> I don't quite understand how you could have more targeted algorithms
> for large corpora. What do you have in mind?

There is a wide class of problems that patterns do not solve, or solve
inefficiently, such as building dictionaries, finding the longest common
substring, etc. (Compiling Ada programs is also in this class. (:-))
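
To make the longest common substring example concrete, here is a minimal
sketch of the usual dynamic programming approach. It is written in Python
only for brevity; the function name is hypothetical and nothing here comes
from a particular library. The point is that the algorithm works on a table
of suffix lengths, which is not something a pattern expresses naturally.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return one longest substring that occurs in both a and b.

    Classic O(len(a) * len(b)) dynamic programming: curr[j] holds the
    length of the longest common suffix of a[:i] and b[:j].
    """
    best_len = 0   # length of the best match found so far
    best_end = 0   # end index (exclusive) of that match in a
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common suffix ending at the previous characters.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]
```

For very large corpora one would more likely reach for a suffix array or
suffix automaton; the table version is shown only because it is short.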

> A few RE packages have a Boyer-Moore algorithm built in, and more.
> How can string processing be substantially faster?

That depends on the problem. RE is not just about string search. Searching
for a substring is itself a very specialized problem which IMO is rarely
needed. On the other hand, string processing is often more than pure
matching (i.e. moving a cursor plus a failure/success outcome). SNOBOL had
immediate assignment to address this problem. As a general note, pattern
matching languages are declarative, with all the disadvantages that entails.
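
As a rough illustration of "cursor moving + failure/success" and of what
immediate assignment adds to it, consider the sketch below. It is Python, not
SNOBOL, and all names in it (lit, span, seq, assign_now) are hypothetical
helpers invented for this example. A matcher takes the text and a cursor and
returns the new cursor, or None on failure. The assign_now wrapper stores the
matched text as soon as its sub-pattern succeeds, whether or not the
enclosing pattern succeeds later. Backtracking is deliberately left out.

```python
import string
from typing import Callable, Dict, Optional

# A matcher maps (text, cursor) to the new cursor, or None on failure.
Matcher = Callable[[str, int], Optional[int]]

def lit(s: str) -> Matcher:
    """Match the literal string s at the cursor."""
    def m(text: str, pos: int) -> Optional[int]:
        return pos + len(s) if text.startswith(s, pos) else None
    return m

def span(chars: str) -> Matcher:
    """Match one or more characters drawn from chars."""
    def m(text: str, pos: int) -> Optional[int]:
        end = pos
        while end < len(text) and text[end] in chars:
            end += 1
        return end if end > pos else None
    return m

def seq(*parts: Matcher) -> Matcher:
    """Match all parts one after another (no backtracking in this sketch)."""
    def m(text: str, pos: int) -> Optional[int]:
        cursor: Optional[int] = pos
        for part in parts:
            cursor = part(text, cursor)
            if cursor is None:
                return None
        return cursor
    return m

def assign_now(p: Matcher, env: Dict[str, str], name: str) -> Matcher:
    """Store the text matched by p in env[name] as soon as p succeeds."""
    def m(text: str, pos: int) -> Optional[int]:
        end = p(text, pos)
        if end is not None:
            env[name] = text[pos:end]  # side effect happens immediately
        return end
    return m

env: Dict[str, str] = {}
pattern = seq(
    assign_now(span(string.ascii_lowercase), env, "key"),
    lit("="),
    assign_now(span(string.digits), env, "value"),
)
ok_cursor = pattern("width=42", 0)    # whole pattern matches

env_fail: Dict[str, str] = {}
failing = seq(assign_now(span(string.ascii_lowercase), env_fail, "key"), lit("="))
fail_cursor = failing("width:42", 0)  # fails at "=", yet "key" was assigned
```

In the second call the overall match fails, yet env_fail already holds the
key. That side effect is what turns a pure success/failure matcher into
something that can do processing along the way.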

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de




Thread overview: 17+ messages
2007-11-10 23:05 Parallel Text Corpus Processing with Ada? braver
2007-11-11  0:11 ` tmoran
2007-11-11  1:10 ` Georg Bauhaus
2007-11-11  8:23 ` Dmitry A. Kazakov
2007-11-11 15:54   ` Georg Bauhaus
2007-11-11 16:13     ` Georg Bauhaus
2007-11-12 13:31     ` Dmitry A. Kazakov [this message]
2007-11-12 15:07       ` Georg Bauhaus
2007-11-12 16:11         ` Dmitry A. Kazakov
2007-11-11 22:49   ` braver
2007-11-12 16:17     ` Dmitry A. Kazakov
2007-11-13 22:45     ` Simon Wright
2007-11-14 23:38       ` braver
2007-11-15  9:39         ` Ludovic Brenta
2007-11-15 11:12           ` Dmitry A. Kazakov
2007-11-15 21:11         ` Simon Wright
2007-11-17  1:05           ` Randy Brukardt