Newsgroups: comp.lang.ada
Date: Wed, 11 Apr 2018 16:57:10 -0700 (PDT)
References: <7d5b8717-1e70-4153-af13-dfab24679ed9@googlegroups.com>
Message-ID: <6518518c-5153-4afd-a2c5-d173ec0fe268@googlegroups.com>
Subject: Re: The extension of Is_Basic to unicode (about AI12-0260-1)
From: ytomino

On Thursday, April 12, 2018 at 7:20:28 AM UTC+9, Randy Brukardt wrote:
> "J-P. Rosen" wrote in message
> news:palsmv$g18$1@gioia.aioe.org...
> > On 11/04/2018 at 16:32, Dan'l Miller wrote:
> >>> True if Item is a basic letter.
> >>> A basic letter is a character that
> >>> is in one of the ranges 'A'..'Z' and 'a'..'z', or that is one of
> >>> the following: 'Æ', 'æ', 'Ð', 'ð', 'Þ', 'þ', or 'ß'.
> >> If this Ada-specific definition of this is-basic/base-Latin-letter
> >> property is the official normative list, then it seems rather
> >> arbitrary and capricious, not conforming to Unicode or to linguistic
> >> reality.
> >>
> >> In Unicode-speak's terminology/jargon, the definition of base
> >> character at https://definedterm.com/a/definition/160575 would admit
> >> quite a few more, [...]
> > The above Is_Basic is about Character, and is defined only when using
> > Latin-1. Unicode is a different standard.
>
> Moreover, its definition is historical -- it was defined this way for Ada
> 95, and whether or not that would be the correct definition had it been
> defined in 2018 is irrelevant. Changing the definition would potentially
> silently break programs that use it. There are a number of things in
> Ada.Characters.Handling that aren't correct for Unicode purposes, one of
> them is even called out by the third note in A.3.2.
>
> Randy.

Thanks for your detailed description.

If Ada.Characters.Handling.Is_Basic cannot be changed for compatibility
reasons, then all the more this *overloading* will create a new problem in
the future.

For example, when rewriting an application from Character to
Wide_Character, one can imagine that the two meanings of Is_Basic will
cause confusion.
They also make it hard to use a use clause, or to pass Is_Basic as a
generic formal subprogram.

Excuse me for repeating myself, but should a new function name be used for
the new definition?

   function Is_Base (Item : Wide_Character) return Boolean;
   -- according to Unicode

   function Is_Basic (Item : Wide_Character) return Boolean is
      (Is_Base (Item) and Is_Letter (Item));
   -- for compatibility
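
To make the use-clause concern concrete, here is a minimal sketch. It assumes a hypothetical local package Wide_Sketch standing in for a Wide_Character overload of Is_Basic. Its body is only a placeholder, not the definition discussed in AI12-0260-1, and the names Overload_Sketch and Wide_Sketch are made up for illustration.

```ada
with Ada.Characters.Handling;
with Ada.Text_IO;

procedure Overload_Sketch is

   --  Hypothetical stand-in for a Wide_Character overload of Is_Basic.
   --  The expression below is a placeholder (ASCII letters only), NOT the
   --  Unicode-based definition under discussion.
   package Wide_Sketch is
      function Is_Basic (Item : Wide_Character) return Boolean is
        (Item in 'A' .. 'Z' | 'a' .. 'z');
   end Wide_Sketch;

   --  Both Is_Basic functions become directly visible as overloads.
   use Ada.Characters.Handling;
   use Wide_Sketch;

   C : constant Character      := 'E';
   W : constant Wide_Character := 'E';

begin
   --  A typed argument resolves, but each call silently selects a
   --  different definition of "basic".
   Ada.Text_IO.Put_Line (Boolean'Image (Is_Basic (C)));  --  Latin-1 rule
   Ada.Text_IO.Put_Line (Boolean'Image (Is_Basic (W)));  --  placeholder rule

   --  A bare character literal matches both profiles, so the following
   --  call would be rejected as ambiguous if uncommented:
   --  Ada.Text_IO.Put_Line (Boolean'Image (Is_Basic ('E')));
end Overload_Sketch;
```

With a distinct name such as Is_Base for the new definition, the literal call would not be ambiguous, and a reader could tell from the name alone which rule is meant.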