From: "Dmitry A. Kazakov"
Newsgroups: comp.lang.ada
Subject: Re: Q: type ... is new String
Date: Tue, 5 Jun 2012 09:32:48 +0200
Organization: cbb software GmbH
Message-ID: <1tr1nuc1xy9mp$.d5s1fz9vuczz.dlg@40tude.net>

On Mon, 04 Jun 2012 22:56:01 +0200, Georg Bauhaus wrote:

> On 04.06.12 19:05, Dmitry A. Kazakov wrote:
>> There is nothing ambiguous in character encoding,
>
> In processing data from any source that speaks HTTP, you don't really know
> the character encoding: you may be told the encoding is X but actually it
> is Y.

<=> I do know the encoding. You are trying to pursue some absolute truth,
e.g. "true encoding" of a broken page, which simply does not exist and is
irrelevant.

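As a rough illustration of the position taken here (not code from this thread), the sketch below keeps the decoding component ignorant of everything except the encoding it is handed, and puts any guessing in a separate layer. It is written in Python only for brevity; `decode_body`, `guess_encoding`, and the candidate list are hypothetical names invented for the example.

```python
def decode_body(raw: bytes, encoding: str) -> str:
    """Decode `raw` with exactly the encoding the caller declares.

    The component does not second-guess its input: bytes that violate
    the declared encoding raise UnicodeDecodeError.
    """
    return raw.decode(encoding, errors="strict")


def guess_encoding(raw: bytes, candidates=("utf-8", "latin-1")) -> str:
    """Hypothetical, separate guessing layer (an assumption of this sketch).

    Returns the first candidate under which `raw` decodes cleanly.
    """
    for name in candidates:
        try:
            raw.decode(name, errors="strict")
        except UnicodeDecodeError:
            continue
        return name
    raise ValueError("no candidate encoding fits the input")


text = "Gr\u00fc\u00dfe"

# The decoder is simply told the encoding.
declared = decode_body(text.encode("utf-8"), "utf-8")

# Guessing, if wanted at all, happens elsewhere and feeds the decoder.
latin1_bytes = text.encode("latin-1")
guessed = guess_encoding(latin1_bytes)
via_guess = decode_body(latin1_bytes, guessed)
```

The decoder's contract stays narrow: given bytes and an encoding, it either produces text or fails. Whether the declared encoding came from a header, a configuration file, or a guessing layer is not its concern.
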
You should define an encoding, and that is all the corresponding component
needs to know about it. Note again a connection to error checks: the
program shall not check itself. A consequence of this: if you use an input,
it is not your responsibility to make guesses. You do as you are told. If
you want to add some encoding-guessing layer, do it elsewhere. These are
just the basics of good software design, where each component shall have a
well-defined, narrow functionality.

>> For each possible input
>> there is a defined output the parser should spill. Where is a problem?
>
> Here is the problem: There is no complete description of the set of
> possible inputs.

See, that is the problem. People didn't do their job.

> It's the web. Changing. Data are not quite as lucid as a set of ways
> to designate a file. Is the latter set changing so frequently that the
> Ada standard would not be able to follow?

Nope, things changing are as irrelevant as the computer's relative position
to Proxima Centauri.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de