From: Jacob Sparre Andersen
Newsgroups: comp.lang.ada
Subject: Re: Character Sets (plain text police report)
Date: Sun, 01 Dec 2002 12:28:14 +0100
Organization: UNI-C
Message-ID: <3DE9F24E.3010002@nbi.dk>

Marin David Condic wrote:

> It might make an easy extension to the Ada standard to include 32-bit
> Unicode. After all, its pretty much just a matter of taking existing
> packages and changing a few things so you could have Wide_Wide_Character.
> The question is, would it have sufficient utility to make it worth the
> effort? (Is there much use out there for 32-bit characters?)

Maybe not directly (except in the Far East), but there is a rather
large and growing indirect need for full support for ISO-10646. In
Europe people are starting to switch from ISO-8859 encodings to the
UTF-8 encoding of ISO-10646.
This means that although in practice people will seldom use more than
the 470-something European characters, they will start to expect
access to all of ISO-10646.

> Perhaps if some additional utility was piled on top of it so that reading a
> text file, Ada would automatically determine what it was looking at and give
> you back text in the proper size (create something like "Universal_String"
> and a whole bunch of utilities around it so it would hold 8, 16 or 32-bit
> characters depending on how it was loaded) - but I don't see how that could
> be done for all text files.

Agreed. One needs some kind of information about which encoding is
used - but that is already the case. The best solution I can think of
is to demand that the operating system keep track of the file type
(including the encoding for text files). The second best solution is
(IMHO) to introduce a sensible common standard encoding. I don't know
whether it should be UTF-8 or raw 32-bit ISO-10646. And I certainly
cannot advise people to use the current procedure on Unix systems,
where each user chooses his/her assumed encoding of text files.

> The concept is a little vague in my mind, but I could imagine how something
> like this might be a useful idea for a standard Ada library. It really
> doesn't require any fundamental changes to the language.

No. But it would be nice if one could demand that compilers handle
UTF-8 or raw 32-bit ISO-10646 encoded source files.

Greetings,

Jacob
-- 
"I don't want to gain immortality in my works. I want to gain it by
not dying."