From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jano <nono@celes.unizar.es>
Newsgroups: comp.lang.ada
Subject: Re: Ada, Gnat and Unicode
Date: Thu, 23 Oct 2003 19:38:57 +0200
Message-ID: <MPG.1a0230a0729e3ce8989778@News.CIS.DFN.DE>
References: <5d6fdb61.0310230648.62219442@posting.google.com>
 <3F97F83A.6060103@comcast.net>
List-Id: <comp.lang.ada>

Robert I. Eachus says...

(Snipped some interesting bits).

> If you use UTF-8 for source input in GNAT, be aware that they only 
> support UTF-8 for BMP characters, full UTF-8 including 6 octet encodings 
> is not supported.  (Note that all Unicode characters are effectively 
> supported in GNAT, although you will have to use two 16-bit encodings as 
> three octet sequences giving a six octet encoding...)

Thanks for your reply. Now for some clarifications and more questions 
;)

Firstly, I wasn't referring to using anything outside Latin1 in my 
source code. I think it's best if I explain my problem better.

I'm experimenting with an open source p2p protocol. It permits file 
searches by keyword. These keywords are filenames and/or metadata about 
the files. This data is exchanged UTF8 encoded.

As you may see now, I want to scan a folder and transform the 
filenames into UTF8. That's fine for me, since I know that I'm getting 
Latin1 encoded strings from the Directory_Operations package, and 
likewise for any metadata entered by the user. But I was wondering what 
would happen for a Chinese user (not that I foresee wide deployment of 
my program, but when faced with the problem one *must* know ;)
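
To make the Latin1 case concrete, here is a minimal sketch of the 
conversion step, written in Python only for brevity. The name 
latin1_to_utf8 is hypothetical (it is not part of GNAT or of the 
protocol), and the code assumes every input byte really is a Latin1 
character, which is exactly the assumption that would not hold for 
filenames in another encoding.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Re-encode Latin1 bytes as UTF8 (hypothetical helper, sketch only).

    Assumption: each input byte is one Latin1 character, so its value
    equals its Unicode code point (0..255).
    """
    out = bytearray()
    for b in raw:
        if b < 0x80:
            # ASCII range: one octet, unchanged.
            out.append(b)
        else:
            # Code points 0x80..0xFF: two octets, 110xxxxx 10xxxxxx.
            out.append(0xC0 | (b >> 6))
            out.append(0x80 | (b & 0x3F))
    return bytes(out)


if __name__ == "__main__":
    # "café" in Latin1 is 4 bytes; in UTF8 the é takes two octets.
    print(latin1_to_utf8(b"caf\xe9"))
```

Since Latin1 maps one byte to one code point, no input can fail here; 
bytes in any other encoding would be converted without error but would 
come out as the wrong characters.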

> > I don't find information in the Gnat UG/RM about these things.
> 
> Look again, in the GNAT Users Guide for "Foreign Language Representation."

Correct me if I'm wrong, but doesn't that refer to source 
representation? (I had missed it anyway ^_^)

(Of course, if my program were to be translated, that would apply. I'm 
not so concerned about this, but I should have been clearer.)

As a final side note, my program is GUI-less, which is why I'm not 
concerned about translation. However, it has a SOAP interface, through 
which I've plugged in a Java GUI that correctly decodes and shows my 
UTF8 strings (a few traces and status reports).

Thanks,

-- 
-------------------------
Jano
402450.at.cepsz.unizar.es
-------------------------