From: "Lawrence D’Oliveiro" <ldo@nz.invalid>
Subject: Re: Ada 202x; 2022; and 2012 and Unicode package (UTF-nn encodings handling)
Date: Tue, 2 Sep 2025 22:56:12 -0000 (UTC)
Message-ID: <1097smc$qe34$5@dont-email.me>
In-Reply-To: 10974d1$jn0e$1@dont-email.me
On Tue, 2 Sep 2025 10:01:34 -0600, Alex // nytpu wrote:
> ... (UCS-4 has a number of additional differences from UTF-32
> regarding "valid encodings", namely that all valid Unicode
> codepoints (0x0--0x10FFFF inclusive) are allowed in UCS-4 but only
> Unicode scalar values (0x0--0xD7FF and 0xE000--0x10FFFF inclusive)
> are valid in UTF-32) ...
So what do those codes mean in UCS-4?
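
For reference, the two ranges in the quoted text differ only by the surrogate block 0xD800--0xDFFF: those values are code points but not scalar values. A minimal Python sketch of that distinction follows. The function and constant names are my own, purely illustrative, and not taken from any Ada or Unicode library:

```python
# Illustrative only: the names below are hypothetical helpers, not a library API.
MAX_CODE_POINT = 0x10FFFF
SURROGATE_FIRST = 0xD800
SURROGATE_LAST = 0xDFFF


def is_code_point(n: int) -> bool:
    """True for any Unicode code point, 0x0 to 0x10FFFF inclusive."""
    return 0 <= n <= MAX_CODE_POINT


def is_scalar_value(n: int) -> bool:
    """True for code points outside the surrogate range 0xD800 to 0xDFFF."""
    return is_code_point(n) and not (SURROGATE_FIRST <= n <= SURROGATE_LAST)
```

So `is_code_point(0xD800)` holds while `is_scalar_value(0xD800)` does not, which is exactly the set of values my question is about.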
> ... and are missing some additional information: a key detail is
> that even with UTF-32 where each Unicode scalar value is held in one
> array element rather than being variable-width like UTF-8/UTF-16,
> you still can't treat them as arbitrary arrays like 7-bit ASCII
> because a grapheme can be made up of multiple Unicode scalar values.
> Even with ASCII characters there's the possibility of combining
> diacritics or such that would break if you split the string between
> them.
This is why you have “normalization”.
<https://www.unicode.org/faq/char_combmark.html>
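
As a rough illustration of what normalization does, here is a sketch using Python's standard `unicodedata` module. The helper name `same_text` is hypothetical; the two sample strings are the usual "e with acute" spellings:

```python
import unicodedata

decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE


def same_text(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after putting both into the same normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# NFC composes the pair into one element; NFD splits it back into two.
nfc_length = len(unicodedata.normalize("NFC", decomposed))
nfd_length = len(unicodedata.normalize("NFD", precomposed))
```

The raw strings compare unequal, but `same_text(decomposed, precomposed)` is true. One caveat worth keeping in mind: NFC can only compose where a precomposed character exists, so a sequence such as "q" plus U+0301 stays two elements after normalization.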