comp.lang.ada
From: "G.B." <bauhaus@futureapps.invalid>
Subject: Re: Bug in Ada - Latin 1 is not a subset of UTF-8
Date: Tue, 18 Oct 2016 12:09:11 +0200
Message-ID: <nu4sbm$4m3$1@dont-email.me>
In-Reply-To: <nu4nee$18le$1@gioia.aioe.org>

On 18.10.16 10:45, Dmitry A. Kazakov wrote:
> On 18/10/2016 10:23, G.B. wrote:
>> On 18.10.16 09:41, Dmitry A. Kazakov wrote:
>>> On 18/10/2016 01:25, G.B. wrote:
>>>> On 17.10.16 22:18, Lucretia wrote:
>>>
>>>> According to ISO 10646, UTF stands for UCS Transformation
>>>> Format. So, it's a format, suggesting a representation.
>>>>
>>>> On similar grounds, one could define a string subtype for
>>>> other types of objects, for example
>>>>
>>>>   subtype Number_String is String;
>>>
>>> You are wrong.
>>
>> The constraints on either UTF_String or Number_String are
>> not expressible as simple Ada subtypes. They are given by
>> description and normative reference, respectively.
>
> In the case of UTF-8 it is not a constraint.

Not an Ada constraint, in particular insofar as UTF-8 means
a representation. Still, any UTF-8 encoded "string" of UCS objects
is well-formed: it satisfies a predicate that involves all components
x, x', x'', ... of a UTF_8_String object, stating that if x matches
2#10......#, then x' is such-and-such, and so on. I'm not sure this
predicate is easily stated as a stand-alone type invariant, for
example, but that's the idea. It shouldn't have to be visible to Ada
programmers.
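
To make the idea concrete, here is a rough sketch of such a predicate
attached to a String subtype with an Ada 2012 Dynamic_Predicate. The
names Is_Well_Formed and Checked_UTF_8 are made up for illustration and
are not part of Ada.Strings.UTF_Encoding. The check is structural only
(lead byte followed by the right number of 2#10......# bytes); it does
not reject overlong forms, surrogates, or values above 16#10FFFF#.

```ada
with Ada.Text_IO;

procedure UTF_8_Predicate_Sketch is

   --  Hypothetical helper: checks only that every lead byte is followed
   --  by the expected number of continuation bytes (2#10xxxxxx#).
   function Is_Well_Formed (S : String) return Boolean is
      Remaining : Natural := 0;  --  continuation bytes still expected
   begin
      for C of S loop
         declare
            Byte : constant Natural := Character'Pos (C);
         begin
            if Remaining > 0 then
               --  Inside a multi-byte sequence.
               if Byte not in 16#80# .. 16#BF# then
                  return False;
               end if;
               Remaining := Remaining - 1;
            elsif Byte <= 16#7F# then
               null;                       --  single-byte (ASCII) form
            elsif Byte in 16#C0# .. 16#DF# then
               Remaining := 1;             --  2#110xxxxx#
            elsif Byte in 16#E0# .. 16#EF# then
               Remaining := 2;             --  2#1110xxxx#
            elsif Byte in 16#F0# .. 16#F7# then
               Remaining := 3;             --  2#11110xxx#
            else
               return False;               --  stray continuation or bad lead
            end if;
         end;
      end loop;
      return Remaining = 0;                --  no truncated sequence at the end
   end Is_Well_Formed;

   --  Assumed name; the predicate refers to the current instance.
   subtype Checked_UTF_8 is String
     with Dynamic_Predicate => Is_Well_Formed (Checked_UTF_8);

   --  "A" followed by U+00E9 encoded as UTF-8 (16#C3# 16#A9#).
   Good : constant String :=
     "A" & Character'Val (16#C3#) & Character'Val (16#A9#);

   --  "A" followed by the Latin-1 code for the same letter (16#E9#).
   Bad : constant String := "A" & Character'Val (16#E9#);

begin
   --  Membership tests evaluate the predicate.
   Ada.Text_IO.Put_Line (Boolean'Image (Good in Checked_UTF_8));  --  TRUE
   Ada.Text_IO.Put_Line (Boolean'Image (Bad in Checked_UTF_8));   --  FALSE
end UTF_8_Predicate_Sketch;
```

The second test string is the Latin-1 case from the subject line: the
byte 16#E9# looks like the start of a three-byte sequence, so the
string fails the check.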

>
> Numeric character is a constraint expressible in Ada:
>
>    subtype Numeric is Character range '0'..'9';
>
> Numeric string constraint is not expressible, but it is still a constraint.

(Although, the Numeric_String subtype described earlier will have
a meaningless constraint on Numeric, since all remainders
are values both in base 256 and in Character. Come to think of it,
the example format is broken. #-)
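
For what it's worth, a per-character rule on a whole string can be
written down in Ada 2012 as a subtype predicate, although that is a
predicate and not a constraint in the RM's sense. A minimal sketch,
with Digit_String as an assumed name chosen for illustration:

```ada
with Ada.Text_IO;

procedure Digit_String_Sketch is

   --  Assumed name: every component of the string must be a digit.
   --  An empty string satisfies the predicate vacuously.
   subtype Digit_String is String
     with Dynamic_Predicate =>
       (for all C of Digit_String => C in '0' .. '9');

   Ok  : constant String := "2016";
   Bad : constant String := "20x6";

begin
   --  Membership tests evaluate the predicate.
   Ada.Text_IO.Put_Line (Boolean'Image (Ok in Digit_String));   --  TRUE
   Ada.Text_IO.Put_Line (Boolean'Image (Bad in Digit_String));  --  FALSE
end Digit_String_Sketch;
```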



-- 
"HOTDOGS ARE NOT BOOKMARKS"
Springfield Elementary teaching staff

