Path: archiver1.google.com!postnews1.google.com!not-for-mail
From: 402450@cepsz.unizar.es (Jano)
Newsgroups: comp.lang.ada
Subject: Ada, Gnat and Unicode
Date: 23 Oct 2003 07:48:12 -0700
Organization: http://groups.google.com
Message-ID: <5d6fdb61.0310230648.62219442@posting.google.com>
NNTP-Posting-Host: 80.81.106.18
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Trace: posting.google.com 1066920492 12956 127.0.0.1 (23 Oct 2003 14:48:12 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: Thu, 23 Oct 2003 14:48:12 +0000 (UTC)
Xref: archiver1.google.com comp.lang.ada:1518

Hello sirs,

I'm thinking about the best way to internationalize an Ada program and I have some doubts. Please shed some light if you can.

AFAIK, the Ada Character type is the first 256 values of ISO 10646 (Latin-1). In the same fashion, Wide_Character covers the first 2**16 values of that same standard. The ARM furthermore says that an implementation can provide alternate representations conforming to local conventions, but later it states that such a representation should be a proper subset of these two. I'm not sure what that implies.

Some old discussions suggest that ISO 10646 and Unicode are equivalent, but it seems they later diverged. In any case, Unicode has more than the 2**16 values that Wide_Character can hold, so I'm not sure that Wide_Character is useful at all (?)

Anyhow, I was thinking of using UTF-8 encoding.
That's convenient as it can hold anything in the Unicode world, is space efficient, and provides good interoperability with other languages and packages (GtkAda, Java, ...).

My doubt mainly concerns the behavior when you're not using a Latin-1 OS, for example a Chinese Windows: when you do some I/O, such as reading from the console with Text_IO.Get (Wide_Text_IO?), or when using Gnat.Directory_Operations to enumerate files. I can't find information about these things in the Gnat UG/RM.

What will these functions return? Is it specified somewhere, or will they pass the bytes from the underlying OS calls inside a String, so that I can't know in advance what to expect?

Thanks for any clarifications,

Alex.
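
P.S. To make the range question concrete, here is a minimal sketch of what I mean. It only prints the upper bounds of Character and Wide_Character next to the highest Unicode code point, plus how many bytes UTF-8 would need for a few sample code points. The names Unicode_Ranges, Max_Code_Point and UTF8_Length are my own illustrations and not part of any library, and I'm assuming Integer is at least 32 bits (as it is under Gnat).

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Unicode_Ranges is

   --  Highest code point in Unicode; an illustrative named number,
   --  not something defined by the ARM.
   Max_Code_Point : constant := 16#10FFFF#;

   --  Hypothetical helper: number of bytes UTF-8 needs to encode
   --  a single code point.
   function UTF8_Length (Code : Natural) return Positive is
   begin
      if Code <= 16#7F# then
         return 1;
      elsif Code <= 16#7FF# then
         return 2;
      elsif Code <= 16#FFFF# then
         return 3;
      else
         return 4;
      end if;
   end UTF8_Length;

begin
   --  Upper bounds of the two standard character types.
   Put_Line ("Character'Last position      :"
             & Integer'Image (Character'Pos (Character'Last)));
   Put_Line ("Wide_Character'Last position :"
             & Integer'Image (Wide_Character'Pos (Wide_Character'Last)));
   Put_Line ("Highest Unicode code point   :"
             & Integer'Image (Max_Code_Point));

   --  UTF-8 sizes for an ASCII letter, a Latin-1 letter, a CJK
   --  ideograph, and a code point beyond Wide_Character's range.
   Put_Line ("UTF-8 bytes for 16#41#    :"
             & Integer'Image (UTF8_Length (16#41#)));
   Put_Line ("UTF-8 bytes for 16#E9#    :"
             & Integer'Image (UTF8_Length (16#E9#)));
   Put_Line ("UTF-8 bytes for 16#4E2D#  :"
             & Integer'Image (UTF8_Length (16#4E2D#)));
   Put_Line ("UTF-8 bytes for 16#10000# :"
             & Integer'Image (UTF8_Length (16#10000#)));
end Unicode_Ranges;
```

The last sample is the kind of value that does not fit in Wide_Character but still encodes fine in UTF-8 held in a plain String, which is why I lean towards UTF-8. It says nothing about what Text_IO or Gnat.Directory_Operations actually return on a non-Latin-1 system, which is still my open question.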