From: Jacob Sparre Andersen <sparre@nbi.dk>
Subject: Re: Character Sets (plain text police report)
Date: Sun, 01 Dec 2002 12:28:14 +0100
Message-ID: <3DE9F24E.3010002@nbi.dk>
In-Reply-To: asaj6c$ts5$1@slb9.atl.mindspring.net
Marin David Condic wrote:
> It might make an easy extension to the Ada standard to include 32-bit
> Unicode. After all, its pretty much just a matter of taking existing
> packages and changing a few things so you could have Wide_Wide_Character.
> The question is, would it have sufficient utility to make it worth the
> effort? (Is there much use out there for 32-bit characters?)
Maybe not directly (except in the Far East), but there
is a rather large and growing indirect need for full support
for ISO-10646.
In Europe, people are starting to switch from ISO-8859
encodings to the UTF-8 encoding of ISO-10646. This means
that although in practice people will seldom use more than
the 470-something European characters, they will start to
expect access to all of ISO-10646.
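To make the size question concrete, here is a minimal sketch in Python (not Ada, and not part of any proposed library) of how many bytes a few ISO-10646 characters need in UTF-8, UTF-16 and raw 32-bit form. The sample characters and the helper name `encoded_sizes` are chosen purely for illustration.

```python
# Sample characters, written as escapes so the source stays pure ASCII.
SAMPLES = {
    "LATIN CAPITAL A": "\u0041",         # plain ASCII
    "LATIN SMALL O WITH STROKE": "\u00f8",  # in ISO-8859-1
    "EURO SIGN": "\u20ac",               # outside ISO-8859-1
    "MUSICAL SYMBOL G CLEF": "\U0001d11e",  # beyond the 16-bit range
}


def encoded_sizes(ch: str) -> dict:
    """Return the number of bytes one character needs in each encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-be")),  # -be: no byte order mark
        "utf-32": len(ch.encode("utf-32-be")),
    }


if __name__ == "__main__":
    for name, ch in SAMPLES.items():
        print(f"U+{ord(ch):04X} {name}: {encoded_sizes(ch)}")
```

UTF-8 stays compact for mostly European text, while the last sample shows that a 16-bit character type cannot hold every code point in a single unit.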
> Perhaps if some additional utility was piled on top of it so that reading a
> text file, Ada would automatically determine what it was looking at and give
> you back text in the proper size (create something like "Universal_String"
> and a whole bunch of utilities around it so it would hold 8, 16 or 32-bit
> characters depending on how it was loaded) - but I don't see how that could
> be done for all text files.
Agreed. One needs some kind of information about which
encoding is used - but that is already the case. The best
solution I can think of is to demand that the operating
system keeps track of the file type (including encoding for
text files). The second best solution is (IMHO) to
introduce a sensible common standard encoding. I don't know
if it should be UTF-8 or raw 32-bit ISO-10646. And I
certainly cannot advise people to use the current procedure on
Unix systems, where each user chooses his/her assumed
encoding of text files.
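The need for out-of-band encoding information can be illustrated with a minimal Python sketch (again not Ada; the sample word and the helper name `decode_as` are assumptions made for the example). The same bytes yield different text depending on the encoding the reader assumes, and some assumptions fail outright.

```python
# "Blaabaergroed" with the Danish letters a-ring, ae and o-slash,
# written as escapes so the source stays pure ASCII.
TEXT = "Bl\u00e5b\u00e6rgr\u00f8d"


def decode_as(data: bytes, encoding: str) -> str:
    """Decode bytes under an assumed encoding; the bytes carry no label."""
    return data.decode(encoding)


utf8_bytes = TEXT.encode("utf-8")
latin1_bytes = TEXT.encode("iso-8859-1")

# Correct assumption: the text round-trips.
right = decode_as(utf8_bytes, "utf-8")

# Wrong assumption, silently accepted: every byte is a valid ISO-8859-1
# character, so the reader gets garbled text and no error.
garbled = decode_as(utf8_bytes, "iso-8859-1")

# Wrong assumption, detected: ISO-8859-1 bytes are not valid UTF-8 here.
try:
    decode_as(latin1_bytes, "utf-8")
    detected = False
except UnicodeDecodeError:
    detected = True

if __name__ == "__main__":
    print(right, garbled, detected)
```

Only the last case is caught automatically; the silent case is why guessing the encoding from the file contents cannot work for all text files.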
> The concept is a little vague in my mind, but I could imagine how something
> like this might be a useful idea for a standard Ada library. It really
> doesn't require any fundamental changes to the language.
No. But it would be nice if one could demand that
compilers handle UTF-8 or raw 32-bit ISO-10646 encoded
source files.
Greetings,
Jacob
--
"I don't want to gain immortality in my works.
I want to gain it by not dying."