From: "J-P. Rosen"
Newsgroups: comp.lang.ada
Subject: Re: The extension of Is_Basic to unicode (about AI12-0260-1)
Date: Wed, 11 Apr 2018 05:38:06 +0200
Organization: Adalog

On 11/04/2018 at 02:52, ytomino wrote:
> AI12-0260-1/04 Functions Is_Basic and To_Basic in Wide_Characters.Handling
> I found an inconsistency between the existing Characters.Handling.Is_Basic and the new Wide_Characters.Handling.Is_Basic.
>
> Characters.Handling.Is_Basic in RM:
>
> True if Item is a basic letter. A basic letter is a character that is in one of the ranges 'A'..'Z' and 'a'..'z', or that is one of the following: 'Æ', 'æ', 'Ð', 'ð', 'Þ', 'þ', or 'ß'.
>
> Characters.Handling.Is_Basic includes only letters; it does not include other symbols.
> Is_Basic ('+') = False.
>
> Wide_Characters.Handling.Is_Basic in AI:
>
> Returns True if the Wide_Character designated by Item has no Decomposition Mapping in the code charts of ISO/IEC 10646:2017; otherwise returns False.
>
> Wide_Characters.Handling.Is_Basic includes all non-decomposable characters, called "base characters" in the Unicode world.
> It includes symbols as well.
> Is_Basic ('+') = True.
>
> Perhaps Is_Basic must be defined as the intersection of the set of base characters *and the set of letters* (categorized as 'Ll', 'Lu', 'Lt', 'Lm', 'Lo'... in the Unicode Character Database).

Right, but the old definition was wrong and the new one is right. In general, Ada prefers to use existing standards rather than inventing its own special definitions. If you need to make sure that something is a letter, there is the Is_Letter function.

-- 
J-P. Rosen
Adalog
2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX
Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00
http://www.adalog.fr
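
A minimal sketch of how Is_Letter and Is_Basic differ in the existing Latin-1 package, assuming only Ada.Characters.Handling as described in RM A.3.2. The names Basic_Demo, Show and E_Acute are made up for this illustration and do not come from the thread.

```ada
with Ada.Text_IO;             use Ada.Text_IO;
with Ada.Characters.Handling; use Ada.Characters.Handling;

procedure Basic_Demo is
   --  Latin-1 e-acute (code 16#E9#), written by position so that the
   --  example does not depend on the encoding of the source file.
   E_Acute : constant Character := Character'Val (16#E9#);

   --  Hypothetical helper: print both classifications for one character.
   procedure Show (Label : String; Item : Character) is
   begin
      Put_Line (Label
                & ": Is_Letter = " & Boolean'Image (Is_Letter (Item))
                & ", Is_Basic = "  & Boolean'Image (Is_Basic (Item)));
   end Show;
begin
   Show ("a", 'a');            --  a letter, and a basic letter
   Show ("e-acute", E_Acute);  --  a letter, but not a basic one
   Show ("+", '+');            --  neither a letter nor basic
end Basic_Demo;
```

With a compiler that implements AI12-0260-1, the same check on a Wide_Character would presumably be written as Is_Letter (Item) and then Is_Basic (Item) against Ada.Wide_Characters.Handling, which restricts the new "no Decomposition Mapping" test to letters. That combination is an assumption drawn from the discussion above, not something the AI itself prescribes.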