From: "Steve"
Newsgroups: comp.lang.ada
Subject: Re: Ada, Gnat and Unicode
Date: Fri, 24 Oct 2003 04:01:11 GMT

A good place to start is to download XML/Ada and have a look at its
Unicode part. There appears to be extensive support there.

Steve
(The Duck)

"Jano" <402450@cepsz.unizar.es> wrote in message
news:5d6fdb61.0310230648.62219442@posting.google.com...
> Hello sirs,
>
> I'm thinking about the best procedure to internationalize some Ada
> program and I have some doubts. Please shed some light if you can.
>
> AFAIK, the Ada Character type is the first 256 values from ISO 10646
> (Latin1). In the same fashion, Wide_Character is the 2**16 values of
> that same ISO. The ARM furthermore says that an implementation can
> provide alternate representations conforming to local conventions, but
> later it states that said representation should be a proper subset of
> these two. I'm not very sure about what that implies.
>
> Some old discussions suggest that 10646 and Unicode are equivalent, but
> it seems that later they dissociated. In any case Unicode is more than
> the 2**16 values that Wide_Character can hold, so I'm not sure that
> Wide_Character is useful at all (?)
>
> Anyhow, I was thinking of using UTF-8 encoding. That's convenient as it
> can hold anything in the Unicode world, is space efficient, and provides
> good interoperability with other languages/packages (GtkAda, Java,
> ...).
>
> My doubt principally comes from the behavior when you're not using a
> Latin1 OS, for example a Chinese Windows. When you do some I/O, for
> example a read from console with Text_IO.Get (Wide_Text_IO?). Or when
> using Gnat.Directory_Operations to enumerate files.
>
> I don't find information in the Gnat UG/RM about these things. What
> will these functions return? Is it specified somewhere, or will they
> pass the bytes from the underlying OS calls inside a String so I can't
> know in advance what to expect?
>
> Thanks for any clarifications,
>
> Alex.
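
To make the size question in the quoted message concrete (Unicode holds
more than the 2**16 values a 16-bit character type can represent, while
UTF-8 can carry any code point), here is a minimal sketch. It is written
in Python rather than Ada purely for brevity, and the helper names
fits_in_16_bits and utf8_length are made up for this illustration; they
are not part of any Ada, GNAT, or XML/Ada API.

```python
# Illustration only: the helper names below are hypothetical and do not
# correspond to any Ada, GNAT, or XML/Ada routine.

def fits_in_16_bits(ch: str) -> bool:
    """True if the code point is representable in a 16-bit character type."""
    return ord(ch) < 2**16


def utf8_length(ch: str) -> int:
    """Number of bytes the code point occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


samples = {
    "LATIN SMALL LETTER E WITH ACUTE": "\u00e9",  # within the first 256 values
    "CJK UNIFIED IDEOGRAPH-4E2D": "\u4e2d",       # within the first 2**16 values
    "GOTHIC LETTER AHSA": "\U00010330",           # beyond 2**16
}

for name, ch in samples.items():
    print(
        f"U+{ord(ch):04X} {name}: "
        f"16-bit={fits_in_16_bits(ch)}, utf8 bytes={utf8_length(ch)}"
    )
```

The third sample is the case the question worries about: it does not fit
in 16 bits, yet it still encodes as an ordinary UTF-8 byte sequence.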