comp.lang.ada
* UTF-8 in strings - a bug?
@ 2004-05-05 22:12 Björn Persson
  2004-05-05 23:31 ` Robert I. Eachus
  2004-05-06  9:06 ` David Starner
From: Björn Persson @ 2004-05-05 22:12 UTC (permalink / raw)


The reference manual says:

3.5.2(2): The predefined type Character is a character type whose values 
correspond to the 256 code positions of Row 00 (also known as Latin-1) 
of the ISO 10646 Basic Multilingual Plane (BMP).

3.6.3(4): type String is array(Positive range <>) of Character;

It seems clear to me: Strings are Latin-1 (except for programs compiled 
in nonstandard modes). But when I set my Fedora system to use UTF-8, the 
strings I get from Ada.Command_Line.Argument contain UTF-8. This means 
that some of the elements in the string aren't characters, only byte 
values that are parts of multi-byte characters. And of course 'Length 
returns the number of bytes, not the number of characters. This looks 
like a violation of the standard. Should I consider this a bug in the 
library? Or in the compiler (GNAT (GCC) 3.3.2 and 3.4.0)?
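
To make the effect concrete, here is a minimal sketch that prints both counts for each command-line argument. It assumes the arguments really are UTF-8 encoded, as on the system described above. The procedure name Count_Chars and the helper UTF_8_Length are made up for illustration and are not part of any library; the helper simply skips UTF-8 continuation bytes (16#80# .. 16#BF#) and does no validation of the input.

```ada
with Ada.Command_Line;
with Ada.Text_IO;

procedure Count_Chars is

   --  Hypothetical helper, not a library routine: counts the characters
   --  in S under the ASSUMPTION that S holds well-formed UTF-8. Every
   --  byte that is not a continuation byte (bit pattern 10xxxxxx)
   --  starts a new character.
   function UTF_8_Length (S : String) return Natural is
      Count : Natural := 0;
   begin
      for I in S'Range loop
         if S (I) not in Character'Val (16#80#) .. Character'Val (16#BF#) then
            Count := Count + 1;
         end if;
      end loop;
      return Count;
   end UTF_8_Length;

begin
   for N in 1 .. Ada.Command_Line.Argument_Count loop
      declare
         Arg : constant String := Ada.Command_Line.Argument (N);
      begin
         --  'Length counts String elements, i.e. bytes in this case.
         Ada.Text_IO.Put_Line
           ("bytes:" & Natural'Image (Arg'Length)
            & "  characters:" & Natural'Image (UTF_8_Length (Arg)));
      end;
   end loop;
end Count_Chars;
```

Run in a UTF-8 locale with a five-letter argument containing one non-ASCII letter, such as my first name, this should report six bytes but five characters, since "ö" occupies two String elements.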

-- 
Björn Persson

jor ers @sv ge.
b n_p son eri nu




