From: Oleg Goodyckov
Newsgroups: comp.lang.ada
Subject: Re: FAQ and string functions
Date: Mon, 5 Aug 2002 18:12:31 +0300
Message-ID: <20020805181231.C2351@videoproject.kiev.ua>
References: <20020730093206.A8550@videoproject.kiev.ua> <20020731182308.K1083@videoproject.kiev.ua> <20020801161052.M1080@videoproject.kiev.ua> <20020802193535.N1101@videoproject.kiev.ua>

On Mon, Aug 05, 2002 at 01:50:38PM +0200, Dmitry A. Kazakov wrote:
> >me it is more effectively to process only correct data (which are reliably
> >recognized) and any other simply to drop nuffig.
>
> Ah, that practice, which makes HTML a disaster because browsers
> silently ignore what they do not understand. The results are known.

It seems you don't like this "known" result. Why?

> The problem of all global methods is that the parameters they need
> cannot be optimal in a large context. Split is an example. It
> requires a separator and a notion of a token which may vary from point
> to point, making the approach useless.

Baseless assertions. Again.

> I remember a project with a config file of ~2MBytes big. (it was a
> Windows registry folder). I wonder how much time it would take to
> parse it using split technique.

Why did you pick such a nasty example?

> >> that as the complexity of syntax increases it becomes almost impossible at
> >> some point to write a correct pattern and prove that it is correct.
> >
> >Which nuffig "complexity of syntax"? Syntax is - no more simplest: fields
> >with separators (of one type) between of them.
>
> It is not a real syntax.

That is what I am trying to say.

> >Take record, split it by separators and enjoy.
>
> Well, how long a record is allowed to be?

There is no need for such a constraint. Any length.

> >Really? Empty words. Try and show me. In skipped example I've seen one
> >attempt. Show me another - better.
> >Task solved in skipped example has name - building histogram of words
> >implementation. Why you name this task not realistic?
>
> Because histogram is also a global method (used for I suppose sort of
> clustering) which also has great limitations and is by no means an end
> product of the program.

OK. That answers my second question (not very impressively, BTW). Now how
about the first one, about a better implementation of the task?

> >So, if that 80% of code throw out, then program will work? Or they are
> >necessary though?
>
> Not for text processing. I supposed that it does something more than
> only that.
>
> Generally, if you have a problem to solve you must first decompose it
> into subproblems. You should do it properly. Surely one could use
> eigenvalues and vectors to invert a matrix but this would be a *bad*
> idea. To decompose some text analysing problem into a bunch of split
> operations is also a *bad* idea. This is my point.

Baseless point. Exorcisms.
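
For readers following the thread, the two things argued about above (splitting a record on a single type of separator, and building a histogram of words) can be sketched roughly as below. This is only an illustration in Python, not the code from the skipped example; the function names `split_record` and `word_histogram`, the `:` separator, and the sample data are all assumptions made for the sketch.

```python
from collections import Counter


def split_record(record, sep=":"):
    """Split one record into fields on a single separator type.

    The ':' default is an assumption; empty fields are preserved.
    """
    return record.split(sep)


def word_histogram(lines):
    """Count how often each whitespace-separated word occurs."""
    counts = Counter()
    for line in lines:
        # str.split() with no argument splits on runs of whitespace
        counts.update(line.split())
    return counts


# Hypothetical sample input, for illustration only
fields = split_record("alpha:beta::gamma")
hist = word_histogram(["to be or not to be", "that is the question"])
```

Note that the sketch puts no limit on record length, and it silently accepts whatever lies between separators; whether that is a virtue or a defect is exactly the disagreement in the thread.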