From: Jano
Newsgroups: comp.lang.ada
Subject: Re: Ada, Gnat and Unicode
Date: Thu, 23 Oct 2003 19:38:57 +0200
References: <5d6fdb61.0310230648.62219442@posting.google.com> <3F97F83A.6060103@comcast.net>

Robert I. Eachus wrote:

(Snipped some interesting bits).

> If you use UTF-8 for source input in GNAT, be aware that they only
> support UTF-8 for BMP characters, full UTF-8 including 6 octet encodings
> is not supported. (Note that all Unicode characters are effectively
> supported in GNAT, although you will have to use two 16-bit encodings as
> three octet sequences giving a six octet encoding...)

Thanks for your reply, and now for some clarifications and more doubts ;)

First, I wasn't referring to using anything outside of Latin1 in my source code. I think it's best if I explain my problem better.

I'm trying out an open source p2p protocol. It permits file searches by keyword. These keywords are filenames and/or metadata about the files. This data is exchanged UTF8 encoded.

As you can probably see now, I want to scan a folder and transform the filenames into UTF8.
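
For illustration only, the conversion described above can be sketched at the byte level. The sketch is in Python rather than Ada, purely to show the mapping; `latin1_to_utf8` is a hypothetical name, not something from GNAT or the protocol, and the code assumes that every input byte really is a Latin1 character.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Re-encode Latin1 bytes as UTF-8 (hypothetical helper, for illustration).

    Assumption: every byte in `raw` is a Latin1 character. Latin1 code
    points coincide with the first 256 Unicode code points, so each byte
    needs at most two UTF-8 octets.
    """
    out = bytearray()
    for code in raw:
        if code < 0x80:
            # ASCII range: encoded as the same single octet.
            out.append(code)
        else:
            # 0x80..0xFF: two octets, 110000xx followed by 10xxxxxx.
            out.append(0xC0 | (code >> 6))
            out.append(0x80 | (code & 0x3F))
    return bytes(out)
```

For example, `latin1_to_utf8(b"caf\xe9")` yields `b"caf\xc3\xa9"`. If the incoming bytes are in some other encoding, this mapping produces wrong output without raising any error.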
That's fine for me, since I know that I'm getting Latin1 encoded strings from the Directory_Operations package and from any metadata entered by the user. But I was wondering what would happen for a Chinese user (not that I foresee my program being widely deployed, but when faced with the problem one *must* know ;)

> > I don't find information in the Gnat UG/RM about these things.
>
> Look again, in the GNAT Users Guide for "Foreign Language Representation."

Correct me if I'm wrong, but doesn't that refer to source representation? (I had missed it anyway ^_^)

(Of course, if my program were to be translated, that would apply. I'm not so concerned about this, but I should have been clearer.)

As a final side note, my program is GUI-less, which is why I'm not concerned about translation. However, it has a SOAP interface. Through that I've plugged in a Java GUI, which correctly decodes and shows my UTF8 strings (a few traces and status reports).

Thanks,

-- 
-------------------------
Jano
402450.at.cepsz.unizar.es
-------------------------