From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,e136d2bb18e6fb60
X-Google-Attributes: gid103376,public
X-Google-ArrivalTime: 2002-12-01 03:28:17 PST
Path: 
 archiver1.google.com!news1.google.com!newsfeed.stanford.edu!newsfeed1.bredband.com!bredband!uio.no!newsfeed1.uni2.dk!news.net.uni-c.dk!not-for-mail
From: Jacob Sparre Andersen <sparre@nbi.dk>
Newsgroups: comp.lang.ada
Subject: Re: Character Sets (plain text police report)
Date: Sun, 01 Dec 2002 12:28:14 +0100
Organization: UNI-C
Message-ID: <3DE9F24E.3010002@nbi.dk>
References: <mailman.1038602282.10532.comp.lang.ada@ada.eu.org>
 <asaj6c$ts5$1@slb9.atl.mindspring.net>
NNTP-Posting-Host: kaoslx07.nbi.dk
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Trace: news.net.uni-c.dk 1038742096 22620 130.225.212.98 (1 Dec 2002
 11:28:16 GMT)
X-Complaints-To: usenet@news.net.uni-c.dk
NNTP-Posting-Date: Sun, 1 Dec 2002 11:28:16 +0000 (UTC)
User-Agent: Any Browser, HTML 4.01, XHTML 1.0
X-Accept-Language: fo, sv, no, is, da, German [de]
Xref: archiver1.google.com comp.lang.ada:31325
Date: 2002-12-01T12:28:14+01:00
List-Id: <comp.lang.ada>

Marin David Condic wrote:
> It might make an easy extension to the Ada standard to include 32-bit
> Unicode. After all, its pretty much just a matter of taking existing
> packages and changing a few things so you could have Wide_Wide_Character.
> The question is, would it have sufficient utility to make it worth the
> effort? (Is there much use out there for 32-bit characters?)

Maybe not directly (except in the Far East), but there 
is a large and growing indirect need for full support 
for ISO-10646.

In Europe people are starting to switch from ISO-8859 
encodings to the UTF-8 encoding of ISO-10646.  This means 
that although in practice people will seldom use more than 
the 470-something European characters, they will start to 
expect access to all of ISO-10646.
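
To illustrate how UTF-8 treats European characters compared with
the rest of ISO-10646, here is a small sketch in Python.  The
sample characters and the helper name utf8_length are my own
choices for illustration only.

```python
# Sample code points picked for illustration (not an exhaustive list).
SAMPLES = {
    "LATIN CAPITAL LETTER A": "\u0041",
    "LATIN SMALL LETTER AE": "\u00e6",
    "LATIN SMALL LETTER ETH": "\u00f0",
    "EURO SIGN": "\u20ac",
    "GOTHIC LETTER AHSA": "\U00010330",  # outside the 16-bit range
}


def utf8_length(ch: str) -> int:
    """Return the number of bytes UTF-8 needs for one character."""
    return len(ch.encode("utf-8"))


if __name__ == "__main__":
    for name, ch in SAMPLES.items():
        print(f"U+{ord(ch):06X} {name}: {utf8_length(ch)} byte(s)")
```

Plain ASCII stays at one byte, the accented and special Latin
letters take two, and characters beyond the 16-bit range take
four, so a UTF-8 reader has to be prepared for all of ISO-10646
even if most text never leaves the European subset.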

> Perhaps if some additional utility was piled on top of it so that reading a
> text file, Ada would automatically determine what it was looking at and give
> you back text in the proper size (create something like "Universal_String"
> and a whole bunch of utilities around it so it would hold 8, 16 or 32-bit
> characters depending on how it was loaded) - but I don't see how that could
> be done for all text files.

Agreed.  One needs some kind of information about which 
encoding is used - but that is already the case.  The best 
solution I can think of is to require the operating 
system to keep track of the file type (including the 
encoding for text files).  The second best solution is 
(IMHO) to introduce a sensible common standard encoding.  
I don't know whether it should be UTF-8 or raw 32-bit 
ISO-10646.  And I certainly cannot advise people to use the 
current procedure on Unix systems, where each user chooses 
his/her own assumed encoding of text files.
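
A small sketch in Python of why the encoding has to be known
from outside the file: the same bytes give different text, or no
text at all, depending on what the reader assumes.  The sample
word and the helper name decode_or_none are made up for this
illustration.

```python
from typing import Optional

# The Danish word for "blueberry", written out as ISO-8859-1 bytes.
RAW_LATIN1 = bytes([0x62, 0x6C, 0xE5, 0x62, 0xE6, 0x72])

# The same word encoded as UTF-8 instead.
RAW_UTF8 = "bl\u00e5b\u00e6r".encode("utf-8")


def decode_or_none(data: bytes, encoding: str) -> Optional[str]:
    """Decode data under an assumed encoding; None if it is invalid."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    for assumed in ("latin-1", "iso8859_5", "utf-8"):
        print(assumed, ascii(decode_or_none(RAW_LATIN1, assumed)))
    # Reading UTF-8 bytes as ISO-8859-1 raises no error; it just
    # silently produces the wrong characters.
    print("latin-1", ascii(decode_or_none(RAW_UTF8, "latin-1")))
```

Only the UTF-8 reader can notice that something is wrong with
the ISO-8859-1 bytes.  The 8-bit encodings accept any byte
sequence, so a wrong guess there goes undetected.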

> The concept is a little vague in my mind, but I could imagine how something
> like this might be a useful idea for a standard Ada library. It really
> doesn't require any fundamental changes to the language.

No.  But it would be nice if one could require compilers 
to handle source files encoded in UTF-8 or raw 32-bit 
ISO-10646.

Greetings,

Jacob
-- 
"I don't want to gain immortality in my works.
  I want to gain it by not dying."