From: Shark8
Newsgroups: comp.lang.ada
Subject: Re: Strange crash on custom iterator
Date: Wed, 4 Jul 2018 18:46:05 -0700 (PDT)

On Wednesday, July 4, 2018 at 6:12:06 PM UTC-6, Dan'l Miller wrote:
> On Wednesday, July 4, 2018 at 5:04:13 PM UTC-5, Shark8 wrote:
> > On Wednesday, July 4, 2018 at 2:05:17 PM UTC-6, Lucretia wrote:
> > > On Wednesday, 4 July 2018 20:53:21 UTC+1, Shark8 wrote:
> > >
> > > > But let's take a step backward; what about displaying the text? One certainly could argue that Unicode is a good solution in this arena, after all havng the ability to encode all of human language is it's stated design-goal, so surely it must be well-suited to that, right?
> > >
> > > You're wrong. Unicode is not about displaying text, it even says that in the spec, it's about representation. Stop trying to force Unicode into Lisp or Forth or whatever to try to add meaning to text.
> >
> > I didn't say it *was*, I used display as an example.
> > But you bring up a good point: it's a terrible representation, for all that I've said, and more.
>
> Shark8, it seems that your criticisms were that instead of representing the Hebrew letters, we ought to represent the whole Hebrew word. Isn't that an entirely different problem-space higher in the food chain?
>
> My qualms with Unicode is that it gets into far more topics than character encoding and then for some odd reason refuses to standardize single-codepoint representation of some language's letters (and then for some even odder reason standardizes offbeat emojis far beyond the original Japanese single-codepoint representations of old 1980s emoticons). I guess all that billion codepoints beyond BMP is reserved for all the extra-terrestrial space-alien languages, not for us mere mortals on planet Earth. Poor old Lithuanian needs to not only stand in line behind all the Western European nations (and their former colonies) but also poor old Lithuanian needs to stand in line behind E.T.
>
> Shark8, what would be the better solution for character-encoding itself? (not whole words)

Whole-word isn't a terrible idea, per se. But the thrust I was getting at is the delineation between languages: with Unicode, text is a sequence of codepoints, independent of the actual item (word, sentence, etc.) other than [perhaps] its graphical presentation. That the example is (Eng,Eng,Eng...Eng, Heb,Heb,Heb,Heb, Eng,Eng,Eng...) codepoints is not the problem in itself, though it is related: the problem is that Unicode discards all of that information in favor of (num, num, num, num, ...) rather than actually considering alternate languages. IMO, ("The Hebrew word for man" (quote ADAM) (quote "Adam") ".") is much better as 'text' because we're preserving structure: [ENGLISH [THIS SECTION HEBREW] ENGLISH].
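
To make the contrast concrete, here is a minimal sketch in Python (chosen only for brevity, not because the thread uses it). Everything in it is an illustrative assumption: the `Span` type, the `to_codepoints` and `language_runs` helpers, the `"en"`/`"he"` labels, and the choice of the three Hebrew letters alef, dalet, final mem to stand in for ADAM. It only shows that a nested representation keeps the [ENGLISH [HEBREW] ENGLISH] shape, while the flat codepoint list does not.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Span:
    """Hypothetical node: text in one language; parts are strings or nested Spans."""
    lang: str
    parts: List[Union[str, "Span"]]


def to_codepoints(node: Span) -> List[int]:
    """Flatten to a bare (num, num, num, ...) sequence; language labels are lost."""
    out: List[int] = []
    for part in node.parts:
        if isinstance(part, Span):
            out.extend(to_codepoints(part))
        else:
            out.extend(ord(ch) for ch in part)
    return out


def language_runs(node: Span) -> List[Tuple[str, str]]:
    """Walk the structure in order, keeping each string with its enclosing language."""
    runs: List[Tuple[str, str]] = []
    for part in node.parts:
        if isinstance(part, Span):
            runs.extend(language_runs(part))
        else:
            runs.append((node.lang, part))
    return runs


# [ENGLISH [THIS SECTION HEBREW] ENGLISH]; the Hebrew letters are an assumed stand-in.
sentence = Span("en", [
    "The Hebrew word for man is ",
    Span("he", ["\u05d0\u05d3\u05dd"]),
    ' ("Adam").',
])

flat = to_codepoints(sentence)      # just integers, no language boundaries
runs = language_runs(sentence)      # (language, text) pairs in reading order
```

Both views carry the same characters, but only `runs` can tell you where the Hebrew section starts and ends without guessing from codepoint ranges.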