From: Luke A. Guest
Newsgroups: comp.lang.ada
Subject: Re: Strange crash on custom iterator
Date: Thu, 5 Jul 2018 03:07:55 +0100
Organization: Aioe.org NNTP Server
Message-ID: <176034645.552448963.078419.laguest-archeia.com@nntp.aioe.org>
References: <5de5f768-40bf-4518-a647-22788658de74@googlegroups.com> <64454862-b293-4ed7-9c3e-c8a1252344db@googlegroups.com> <0ebf920a-61fa-47e8-a34f-54da2e143bb6@googlegroups.com> <6af9d974-b2b4-4ab9-82e6-690ffaee2901@googlegroups.com> <795161eb-b58c-4146-9721-9b553039868a@googlegroups.com>

Shark8 wrote:
>> Shark8, what would be the better solution for character-encoding itself?
>> (not whole words)
>
> Whole-word isn't a terrible idea, per se. But the thrust I was getting at
> is the delineation between languages: with Unicode it's a sequence of
> codepoints, independent of the actual item (word, sentence, etc) other
> than [perhaps] graphic-presented. That the example is (Eng,Eng,Eng...Eng,
> Heb,Heb,Heb,Heb, Eng,Eng,Eng...) codepoints is not the problem, though
> related, because it discards all information in favor of (num, num, num,
> num, ...)
> rather than actually considering alternate languages: IMO,
> ("The Hebrew word for man" (quote ADAM) (quote "Adam") ".") is much
> better as 'text' because we're preserving structure: [ENGLISH [THIS
> SECTION HEBREW] ENGLISH].
>

I don’t understand why you think Unicode should carry linguistic
information when all it has ever been designed to do is encode symbols
across all languages and their direction.
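
To make the "symbols and their direction" point concrete, here is a minimal sketch in Python (chosen only for illustration; nothing in the thread uses it). It relies on the standard `unicodedata` module. The sample string and the `describe` helper are assumptions made up for this example, not anything from the thread. It shows that each codepoint carries a bidirectional class, but no language tag:

```python
import unicodedata

# Hypothetical sample: English text followed by the Hebrew letters
# alef, dalet, final mem (the word "adam").
text = "man = \u05d0\u05d3\u05dd"


def describe(s):
    """Return a (codepoint, bidi class) pair for each character in s."""
    return [(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch)) for ch in s]


for codepoint, bidi in describe(text):
    # Prints one line per character, e.g. a codepoint and 'L' or 'R'.
    print(codepoint, bidi)
```

The Latin letters report class `L` (left-to-right) and the Hebrew letters report `R` (right-to-left); the space and `=` get neutral classes. Nothing in the per-character data says "this run is Hebrew" or "this run is English", which is the gap Shark8's nested structure is trying to fill.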