comp.lang.ada
From: "Lawrence D’Oliveiro" <ldo@nz.invalid>
Subject: Re: Ada 202x; 2022; and 2012 and Unicode package (UTF-nn encodings handling)
Date: Tue, 2 Sep 2025 22:56:12 -0000 (UTC)
Message-ID: <1097smc$qe34$5@dont-email.me>
In-Reply-To: 10974d1$jn0e$1@dont-email.me

On Tue, 2 Sep 2025 10:01:34 -0600, Alex // nytpu wrote:

> ... (UCS-4 has a number of additional differences from UTF-32
> regarding "valid encodings", namely that all valid Unicode
> codepoints (0x0--0x10FFFF inclusive) are allowed in UCS-4 but only
> Unicode scalar values (0x0--0xD7FF and 0xE000--0x10FFFF inclusive)
> are valid in UTF-32) ...

So what do those codes mean in UCS-4?
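
For reference, the range distinction in the quoted text can be sketched as below. This is only an illustration: the function names are made up for this example, and the "allowed in UCS-4" reading is taken from the quoted text rather than checked against ISO/IEC 10646. The codes in question are the surrogate range 0xD800--0xDFFF, which are code points but not scalar values.

```python
# Hypothetical helper names, for illustration only.
SURROGATES = range(0xD800, 0xE000)  # 0xD800--0xDFFF inclusive


def is_code_point(cp: int) -> bool:
    """Any Unicode code point, surrogates included (0x0--0x10FFFF)."""
    return 0 <= cp <= 0x10FFFF


def is_scalar_value(cp: int) -> bool:
    """A code point that is not a surrogate; only these are valid in UTF-32."""
    return is_code_point(cp) and cp not in SURROGATES


# Python's own UTF-32 encoder refuses a lone surrogate, which matches
# the "only scalar values are valid in UTF-32" rule in the quote.
try:
    chr(0xD800).encode("utf-32-le")
    utf32_accepts_surrogate = True
except UnicodeEncodeError:
    utf32_accepts_surrogate = False
```

Here `is_code_point(0xD800)` is true while `is_scalar_value(0xD800)` is false, and the encoder attempt ends in the `except` branch.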

> ... and are missing some additional information: a key detail is
> that even with UTF-32 where each Unicode scalar value is held in one
> array element rather than being variable-width like UTF-8/UTF-16,
> you still can't treat them as arbitrary arrays like 7-bit ASCII
> because a grapheme can be made up of multiple Unicode scalar values.
> Even with ASCII characters there's the possibility of combining
> diacritics or such that would break if you split the string between
> them.

This is why you have “normalization”.
<https://www.unicode.org/faq/char_combmark.html>
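
A minimal sketch of what normalization does, using Python's standard unicodedata module (the sample strings are chosen for this example, not taken from the thread). NFC composes a base letter and combining mark into one code point where a precomposed character exists; NFD does the reverse.

```python
import unicodedata

decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE

# NFC composes the pair into the single precomposed code point.
nfc = unicodedata.normalize("NFC", decomposed)

# NFD splits the precomposed character back into base + mark.
nfd = unicodedata.normalize("NFD", precomposed)

# Composition only happens where a precomposed character exists.
# There is none for "q" + COMBINING TILDE, so NFC leaves two code
# points that still form one grapheme.
q_tilde = unicodedata.normalize("NFC", "q\u0303")

print(len(decomposed), len(nfc), len(nfd), len(q_tilde))
```

After NFC the "e" case can be split anywhere without separating a mark from its base, whereas the "q" case still needs grapheme-aware handling.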
