From: "G.B." <bauhaus@futureapps.invalid>
Subject: Re: Bug in Ada - Latin 1 is not a subset of UTF-8
Date: Tue, 18 Oct 2016 12:09:11 +0200
Message-ID: <nu4sbm$4m3$1@dont-email.me>
In-Reply-To: <nu4nee$18le$1@gioia.aioe.org>
On 18.10.16 10:45, Dmitry A. Kazakov wrote:
> On 18/10/2016 10:23, G.B. wrote:
>> On 18.10.16 09:41, Dmitry A. Kazakov wrote:
>>> On 18/10/2016 01:25, G.B. wrote:
>>>> On 17.10.16 22:18, Lucretia wrote:
>>>
>>>> According to ISO 10646, UTF stands for UCS Transformation
>>>> Format. So, it's a format, suggesting a representation.
>>>>
>>>> On similar grounds, one could define a string subtype for
>>>> other types of objects, for example
>>>>
>>>> subtype Number_String is String;
>>>
>>> You are wrong.
>>
>> The constraints on either UTF_String or Number_String are
>> not expressible as simple Ada subtypes. They are given by
>> description and normative reference, respectively.
>
> In the case of UTF-8 it is not a constraint.
Not an Ada constraint, in particular insofar as UTF-8 means
a representation. Still, any UTF-8 encoded "string" of UCS objects
is well-formed: it satisfies a predicate over all components
x, x', x'', ... of a UTF_8_String object, stating that if x matches
2#10......#, then x' is such-and-such, and so on. I'm not sure this
predicate is easily stated as a stand-alone type invariant, for
example, but that's the idea. It shouldn't have to be visible to
Ada programmers.
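
To make the idea a bit more concrete, here is a minimal sketch of such a
predicate in Ada 2012. The names Is_Well_Formed and Checked_UTF_8 are made
up for illustration and are not part of any standard package. The check is
structural only (lead byte, then the right number of 2#10......# bytes);
it does not reject overlong forms, surrogates, or values above 16#10FFFF#.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure UTF_8_Check is

   --  Hypothetical helper: True if every lead byte in S is followed by
   --  the number of 2#10......# continuation bytes that it announces.
   function Is_Well_Formed (S : String) return Boolean is
      I     : Integer := S'First;
      B     : Natural;
      Extra : Natural;
   begin
      while I <= S'Last loop
         B := Character'Pos (S (I));
         if B < 16#80# then
            Extra := 0;                     --  2#0.......#
         elsif B in 16#C0# .. 16#DF# then
            Extra := 1;                     --  2#110.....#
         elsif B in 16#E0# .. 16#EF# then
            Extra := 2;                     --  2#1110....#
         elsif B in 16#F0# .. 16#F7# then
            Extra := 3;                     --  2#11110...#
         else
            return False;  --  stray continuation byte or invalid lead
         end if;
         if Extra > S'Last - I then
            return False;  --  sequence is cut off at the end of S
         end if;
         for J in I + 1 .. I + Extra loop
            if Character'Pos (S (J)) not in 16#80# .. 16#BF# then
               return False;  --  expected 2#10......#
            end if;
         end loop;
         I := I + Extra + 1;
      end loop;
      return True;
   end Is_Well_Formed;

   --  Hypothetical subtype: a subtype predicate, not an Ada constraint.
   subtype Checked_UTF_8 is String
     with Dynamic_Predicate => Is_Well_Formed (Checked_UTF_8);

   E_UTF_8   : constant String :=
     Character'Val (16#C3#) & Character'Val (16#A9#);
   E_Latin_1 : constant String := (1 => Character'Val (16#E9#));

begin
   --  Membership tests evaluate the predicate.
   Put_Line (Boolean'Image (String'("abc") in Checked_UTF_8));
   Put_Line (Boolean'Image (E_UTF_8 in Checked_UTF_8));
   Put_Line (Boolean'Image (E_Latin_1 in Checked_UTF_8));
end UTF_8_Check;
```

The last test string is a lone Latin-1 byte 16#E9#, which announces two
continuation bytes that never arrive, so it fails the check.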
>
> Numeric character is a constraint expressible in Ada:
>
> subtype Numeric is Character range '0'..'9';
>
> Numeric string constraint is not expressible, but it is still a constraint.
(Although the Number_String subtype described earlier will have
a meaningless constraint on Numeric, since all remainders
are values both in base 256 and in Character. Come to think of it,
the example format is broken. #-)
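
For comparison, the "all characters are digits" property from the quoted
text can be written as an Ada 2012 subtype predicate. This is only a
sketch: Digit_String is a made-up name, and a predicate is not a
constraint in the RM sense, so it does not settle the terminology.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Digit_String_Demo is

   subtype Numeric is Character range '0' .. '9';

   --  Hypothetical subtype: every component must lie in Numeric.
   subtype Digit_String is String
     with Dynamic_Predicate =>
       (for all C of Digit_String => C in Numeric);

begin
   --  Membership tests evaluate the predicate.
   Put_Line (Boolean'Image (String'("2016") in Digit_String));
   Put_Line (Boolean'Image (String'("20x6") in Digit_String));
end Digit_String_Demo;
```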
--
"HOTDOGS ARE NOT BOOKMARKS"
Springfield Elementary teaching staff
Thread overview: 30+ messages
2016-10-17 20:18 Bug in Ada - Latin 1 is not a subset of UTF-8 Lucretia
2016-10-17 20:57 ` Jacob Sparre Andersen
2016-10-18 5:44 ` J-P. Rosen
2016-10-17 23:25 ` G.B.
2016-10-18 7:41 ` Dmitry A. Kazakov
2016-10-18 8:23 ` G.B.
2016-10-18 8:45 ` Dmitry A. Kazakov
2016-10-18 10:09 ` G.B. [this message]
2016-10-18 12:24 ` Dmitry A. Kazakov
2016-10-18 15:10 ` G.B.
2016-10-18 16:35 ` Dmitry A. Kazakov
2016-10-18 17:35 ` G.B.
2016-10-18 20:03 ` Dmitry A. Kazakov
2016-10-19 8:15 ` G.B.
2016-10-19 8:25 ` G.B.
2016-10-19 8:49 ` Dmitry A. Kazakov
2016-10-19 14:20 ` G.B.
2016-10-19 16:20 ` Dmitry A. Kazakov
2016-10-20 0:31 ` Randy Brukardt
2016-10-20 7:36 ` Dmitry A. Kazakov
2016-10-21 12:28 ` G.B.
2016-10-21 16:13 ` Lucretia
2016-10-21 16:43 ` Dmitry A. Kazakov
2016-10-22 5:51 ` G.B.
2016-10-22 7:49 ` Dmitry A. Kazakov
2016-10-24 11:35 ` Luke A. Guest
2016-10-24 13:01 ` Dmitry A. Kazakov
2016-10-24 14:54 ` Luke A. Guest
2016-10-22 1:53 ` Randy Brukardt
2016-10-28 21:08 ` Shark8