From: "Luke A. Guest" <laguest@archeia.com>
Subject: Re: Ada and Unicode
Date: Mon, 19 Apr 2021 12:56:34 +0100
Message-ID: <s5jr59$1tkq$1@gioia.aioe.org>
In-Reply-To: 86mttuk5f0.fsf@stephe-leake.org
On 19/04/2021 10:08, Stephen Leake wrote:
>> What's the way to manage Unicode correctly ?
>
> There are two issues: Unicode in source code, that the compiler must
> understand, and Unicode in strings, that your program must understand.
And this is where the Ada standard gets it wrong, in the encodings
package, regarding UTF-8.
UTF-8 is a superset of 7-bit ASCII, not Latin 1. The high bit in the
leading octet indicates whether there are trailing octets. See
https://github.com/Lucretia/uca/blob/master/src/uca.ads#L70 for the data
layout. Only the first 128 code points encode to the same single octets
as 7-bit ASCII; code points 128 to 255 take two octets each, so UTF-8
text is not 8-bit ASCII, and certainly not Latin 1. Therefore this:
   package Ada.Strings.UTF_Encoding
      ...
      subtype UTF_8_String is String;
      ...
   end Ada.Strings.UTF_Encoding;
was absolutely and totally wrong.
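
To make the complaint concrete, here is a minimal sketch (not taken from the standard or from the uca library). It encodes U+00E9 with the standard Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Encode and compares the result with the same character held as a single Latin 1 Character. The procedure name UTF8_Sketch and the helper Trailing_Octets are hypothetical, invented for illustration only; the helper looks only at the bit pattern of the leading octet and does not reject overlong or out-of-range forms.

```ada
with Ada.Strings.UTF_Encoding;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Text_IO;

procedure UTF8_Sketch is

   package UTF renames Ada.Strings.UTF_Encoding;

   --  Hypothetical helper: number of trailing octets announced by the
   --  bit pattern of a UTF-8 leading octet. No further validation.
   function Trailing_Octets (Lead : Character) return Natural is
      Code : constant Natural := Character'Pos (Lead);
   begin
      if Code < 16#80# then
         return 0;   --  0xxxxxxx: 7-bit ASCII, a single octet
      elsif Code in 16#C0# .. 16#DF# then
         return 1;   --  110xxxxx
      elsif Code in 16#E0# .. 16#EF# then
         return 2;   --  1110xxxx
      elsif Code in 16#F0# .. 16#F7# then
         return 3;   --  11110xxx
      else
         --  10xxxxxx is a trailing octet; 11111xxx is never valid.
         raise Constraint_Error with "not a UTF-8 leading octet";
      end if;
   end Trailing_Octets;

   --  U+00E9 held as one Latin 1 Character.
   Latin_1 : constant String := (1 => Character'Val (16#E9#));

   --  The same code point as a Wide_Wide_String, then encoded as UTF-8.
   E_Acute : constant Wide_Wide_String :=
     (1 => Wide_Wide_Character'Val (16#E9#));
   Encoded : constant UTF.UTF_8_String :=
     UTF.Wide_Wide_Strings.Encode (E_Acute);

   --  UTF_8_String is only a subtype of String, so the compiler accepts
   --  this mix of Latin 1 text and UTF-8 octets without complaint.
   Mixed : constant String := Latin_1 & Encoded;

begin
   Ada.Text_IO.Put_Line
     ("Latin 1 length:" & Natural'Image (Latin_1'Length));
   Ada.Text_IO.Put_Line
     ("UTF-8 length:  " & Natural'Image (Encoded'Length));
   Ada.Text_IO.Put_Line
     ("Trailing octets after the lead octet:"
      & Natural'Image (Trailing_Octets (Encoded (Encoded'First))));
   Ada.Text_IO.Put_Line
     ("Mixed length:  " & Natural'Image (Mixed'Length));
end UTF8_Sketch;
```

The declaration of Mixed is the point: because both objects are plain String, nothing in the type system separates "Latin 1 characters" from "UTF-8 octets", which is the objection to the subtype declaration quoted above.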