From: "Dmitry A. Kazakov"
Newsgroups: comp.lang.ada
Subject: Re: Bug in Ada - Latin 1 is not a subset of UTF-8
Date: Tue, 18 Oct 2016 09:41:39 +0200
Organization: Aioe.org NNTP Server
References: <86f0d2fe-d498-4bc4-bb9d-e34629c89bb4@googlegroups.com>

On 18/10/2016 01:25, G.B. wrote:
> On 17.10.16 22:18, Lucretia wrote:
> According to ISO 10646, UTF stands for UCS Transformation
> Format. So, it's a format, suggesting a representation.
>
> On similar grounds, one could define a string subtype for
> other types of objects, for example
>
>     subtype Number_String is String;

You are wrong. A string of numeric characters is not an encoding; it is a constraint, which by definition means that each instance of a numeric string is a string. [An example of an encoding (= representation) is IEEE 754 vs. IBM 360 float.]

A UTF-8 string is not a constrained string and, conversely, a string is not a constrained UTF-8 string. These are two distinct types, some of whose values overlap and can be converted into each other. The latter would allow making them subtypes, but the Ada language lacks the means for that.
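
To make the distinction concrete, here is a minimal sketch, assuming an Ada 2012 compiler. The Dynamic_Predicate on Number_String and the names Constraint_Vs_Encoding, Digits_Only, Mixed, Latin and UTF8 are illustrative only and do not come from the quoted post. The subtype narrows the set of String values, whereas Encode changes the representation of the same text.

```ada
with Ada.Text_IO;
with Ada.Characters.Handling;
with Ada.Characters.Latin_1;
with Ada.Strings.UTF_Encoding.Strings;

procedure Constraint_Vs_Encoding is
   use Ada.Text_IO;

   --  A constraint: every value of Number_String is itself a String.
   --  (Illustrative predicate, Ada 2012; not from the quoted post.)
   subtype Number_String is String
     with Dynamic_Predicate =>
       (for all C of Number_String =>
          Ada.Characters.Handling.Is_Digit (C));

   Digits_Only : constant String := "2016";
   Mixed       : constant String := "20x6";

   --  An encoding: the same text in a different representation.
   --  Latin holds 4 Latin-1 characters; the last one is e-acute.
   Latin : constant String := "caf" & Ada.Characters.Latin_1.LC_E_Acute;

   --  UTF_8_String is declared as a subtype of String, so the compiler
   --  cannot tell the two apart, although the contents differ.
   UTF8 : constant Ada.Strings.UTF_Encoding.UTF_8_String :=
     Ada.Strings.UTF_Encoding.Strings.Encode (Latin);
begin
   --  Membership tests evaluate the predicate: TRUE, then FALSE.
   Put_Line (Boolean'Image (Digits_Only in Number_String));
   Put_Line (Boolean'Image (Mixed in Number_String));

   --  Same text, different lengths: 4 characters versus 5 octets.
   Put_Line (Integer'Image (Latin'Length));
   Put_Line (Integer'Image (UTF8'Length));

   --  Conversion in the other direction restores the Latin-1 value.
   Put_Line
     (Boolean'Image
        (Ada.Strings.UTF_Encoding.Strings.Decode (UTF8) = Latin));
end Constraint_Vs_Encoding;
```

Nothing stops UTF8 from being passed wherever a Latin-1 String is expected, since both are String as far as the type system is concerned.
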
In Ada, a subtype can be either a constraint (AKA an "Ada subtype") or a class member / class-wide type. UTF-8 is not a constraint and String is not tagged. The decision to force UTF-8 strings and strings [Latin-1 strings, to be precise] to be subtypes in the Ada sense is the lesser of two evils. It is bad and wrong, but the alternative would be even worse.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de