From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-0.3 required=5.0 tests=BAYES_00,
	REPLYTO_WITHOUT_TO_CC autolearn=no autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,4f316de357ae35e9
X-Google-Attributes: gid103376,public
X-Google-ArrivalTime: 2002-08-05 08:56:03 PST
Path: 
 archiver1.google.com!news1.google.com!newsfeed.stanford.edu!newsmi-us.news.garr.it!newsmi-eu.news.garr.it!NewsITBone-GARR!fu-berlin.de!news.uar.net!carrier.kiev.ua!news.lucky.net!not-for-mail
From: Oleg Goodyckov <og@videoproject.kiev.ua>
Newsgroups: comp.lang.ada
Subject: Re: FAQ and string functions
Date: Mon, 5 Aug 2002 18:12:31 +0300
Organization: unknown
Distribution: world
Message-ID: <20020805181231.C2351@videoproject.kiev.ua>
References: <20020730093206.A8550@videoproject.kiev.ua>
 <20020731182308.K1083@videoproject.kiev.ua>
 <aib0a6$139lkn$1@ID-77047.news.dfncis.de>
 <20020801161052.M1080@videoproject.kiev.ua>
 <aidq39$13rmja$1@ID-77047.news.dfncis.de>
 <20020802193535.N1101@videoproject.kiev.ua>
 <b0osku0tktsihgp0hoih183250hq3pjhq5@4ax.com>
Reply-To: og@videoproject.kiev.ua
NNTP-Posting-Host: news.lucky.net
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: news.lucky.net 1028562955 3645 193.193.193.102 (5 Aug 2002 15:55:55
 GMT)
X-Complaints-To: usenet@news.lucky.net
NNTP-Posting-Date: Mon, 5 Aug 2002 15:55:55 +0000 (UTC)
Keywords: 265282490
X-Return-Path: oleg@videoproject.kiev.ua
Xref: archiver1.google.com comp.lang.ada:27706
Date: 2002-08-05T18:12:31+03:00
List-Id: <comp.lang.ada>

On Mon, Aug 05, 2002 at 01:50:38PM +0200, Dmitry A. Kazakov wrote:
> >me it is more effective to process only the correct data (which is reliably
> >recognized) and simply to drop everything else.
> 
> Ah, that practice, which makes HTML a disaster because browsers
> silently ignore what they do not understand. The results are known.

It seems you don't like this "known" result. Why?

> The problem of all global methods is that the parameters they need
> cannot be optimal in a  large context. Split is an example. It
> requires a separator and a notion of a token which may vary from point
> to point, making the approach useless.

Baseless assertions. Again.

> I remember a project with a config file ~2 MBytes big. (it was a
> Windows registry folder). I wonder how much time it would take to
> parse it using split technique.

Why did you pick such a nasty example?

> >> that as the complexity of syntax increases it becomes almost impossible at 
> >> some point to write a correct pattern and prove that it is correct.
> >
> >What "complexity of syntax"? The syntax could not be simpler: fields
> >with separators (of one type) between them.
> 
> It is not a real syntax.

That is what I am trying to say.

> >Take record, split it by separators and enjoy.
> 
> Well, how long a record is allowed to be?

There is no need for such a constraint. Any length will do.
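
To make "take a record, split it by separators" concrete, here is a minimal
sketch. It is in Python rather than Ada, and it assumes a ':' separator and a
made-up sample record; the name split_record is only an illustration, not
something from the skipped example.

```python
def split_record(record, sep=":"):
    """Split one record into fields on a single separator character."""
    # No limit on the record length is assumed; empty fields are kept.
    return record.rstrip("\n").split(sep)


# Hypothetical sample record with one empty field.
fields = split_record("og:x:1000:1000::/home/og:/bin/sh\n")
print(fields)
```

Fields that cannot be recognized afterwards can then simply be dropped by the
caller.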

> >Really? Empty words. Try and show me. In the skipped example I saw one
> >attempt. Show me another, a better one.
> >The task solved in the skipped example has a name: building a histogram
> >of words. Why do you call this task unrealistic?
> 
> Because histogram is also a global method (used for I suppose sort of
> clustering) which also has great limitations and is by no means an end
> product of the program.

OK. That answers my second question (not very impressively, BTW). Now how
about the first one, about a better implementation of the task?
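
For reference, the task in question (a histogram of words) can be sketched
with split alone. This is only an illustration in Python, assuming that words
are separated by whitespace; it is not the code from the skipped example, and
the name word_histogram is hypothetical.

```python
from collections import Counter


def word_histogram(lines):
    """Count how often each whitespace-separated word occurs."""
    counts = Counter()
    for line in lines:
        # split() with no argument splits on any run of whitespace.
        counts.update(line.split())
    return counts


# Hypothetical input: two short lines of text.
hist = word_histogram(["to be or not", "to be"])
print(hist.most_common())
```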

> >So, if that 80% of the code is thrown out, will the program still work?
> >Or is it necessary after all?
> 
> Not for text processing. I supposed that it does something more than
> only that.
> 
> Generally, if you have a problem to solve you must first decompose it
> into subproblems. You should do it properly. Surely one could use
> eigenvalues and vectors to invert a matrix but this would be a *bad*
> idea. To decompose some text analysing problem into a bunch of split
> operations is also a *bad* idea. This is my point.

Baseless point. Exorcisms.