From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-0.3 required=5.0 tests=BAYES_00,
	REPLYTO_WITHOUT_TO_CC autolearn=no autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII
X-Google-Thread: 103376,1086bab45b40d4b0
X-Google-Attributes: gid103376,public
Path: 
 controlnews3.google.com!news2.google.com!news.maxwell.syr.edu!newsfeed.icl.net!feed.news.tiscali.de!newsfeed01.sul.t-online.de!newsfeed00.sul.t-online.de!newsmm00.sul.t-online.de!t-online.de!news.t-online.com!not-for-mail
From: Martin Krischik <krischik@users.sourceforge.net>
Newsgroups: comp.lang.ada
Subject: Re: UTF-8 in strings - a bug?
Date: Sat, 08 May 2004 08:38:23 +0200
Organization: AdaCL
Message-ID: <1146278.PRGNMAO9kp@linux1.krischik.com>
References: <TEdmc.58085$mU6.237063@newsb.telia.net>
 <WJOdndbsxKPZ5ATdRVn-iQ@comcast.com> <lMmmc.58280$mU6.237078@newsb.telia.net>
 <200456-112553-85684@foorum.com> <2178612.8V5KANFFf5@linux1.krischik.com>
 <q0Vmc.58459$mU6.237464@newsb.telia.net>
Reply-To: krischik@users.sourceforge.net
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Transfer-Encoding: 8Bit
X-Trace: news.t-online.com 1084030376 07 24930 HJz0Go8rQk13EqV 040508 15:32:56
X-Complaints-To: usenet-abuse@t-online.de
X-ID: V9KWjwZOYe9p+ms9WYryuKEl54HWWmtWOROInsNU5JxbMXblfYxFYN
User-Agent: KNode/0.7.7
Xref: controlnews3.google.com comp.lang.ada:387
Date: 2004-05-08T08:38:23+02:00
List-Id: <comp.lang.ada>

Björn Persson wrote:

> Martin Krischik wrote:
> 
>> XMLAda comes with a Unicode library which can do some transcoding.
> 
> Well, I suppose the existence of that library is a good thing, but after
> reading the introduction in unicode.ads I have to wonder whether it's
> them or me who have misunderstood Unicode. It mentions "Utf32 Latin1"
> and "Utf8 Latin2" strings. This looks really weird to me. You don't
> encode Latin-1 in UTF-32 or Latin-2 in UTF-8. You encode Unicode in
> UTF-8 or UTF-32, or you encode a subset of Unicode in Latin-1, or
> another subset in Latin-2.

Well, I have worked a bit more with that library, and it seems that there
are special versions of UTF-8 and that you can place an info block at the
beginning of the UTF-8 string for fine-tuning.

UTF-16 is a variable-length encoding as well (code points above U+FFFF take
a surrogate pair), while UTF-32 is fixed-width. Just in case
extraterrestrials finally drop in and we need 64-bit character sets.
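
For anyone who wants to check the widths themselves, here is a minimal
sketch in Python (not Ada, and not using XMLAda's Unicode package at all)
that counts the code units per character in each encoding form. The helper
name code_units is made up for this illustration; only the standard
str.encode codecs are used.

```python
# Count how many code units one character needs in each Unicode encoding form.
# "code_units" is a hypothetical helper name, not part of any library.

def code_units(ch: str) -> dict:
    """Return the number of code units for a single character per encoding."""
    return {
        "utf-8": len(ch.encode("utf-8")),            # 8-bit code units
        "utf-16": len(ch.encode("utf-16-le")) // 2,  # 16-bit code units
        "utf-32": len(ch.encode("utf-32-le")) // 4,  # 32-bit code units
    }

# ASCII letter, Latin-1 letter, euro sign, and a character outside the BMP.
for ch in ("A", "\u00f6", "\u20ac", "\U0001d11e"):
    print(f"U+{ord(ch):04X}", code_units(ch))
```

The little-endian codec variants are used so that Python does not prepend a
byte order mark, which would otherwise inflate the counts. In this sketch
only U+1D11E needs two UTF-16 code units, and every character needs exactly
one UTF-32 code unit.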

So XMLAda seems more complete than the average Unicode implementation.

With Regards

Martin

-- 
mailto://krischik@users.sourceforge.net
http://www.ada.krischik.com