From: Martin Krischik
Subject: Re: System.WCh_Cnv
Newsgroups: comp.lang.ada
Date: Sun, 16 Jul 2006 13:41:46 +0200
Message-Id: <57677082.XS7luc1THj@linux1.krischik.com>
References: <3082414.k9Jeq3hKxq@linux1.krischik.com> <1152811469.003475.301520@s13g2000cwa.googlegroups.com> <1152862832.649761.205770@75g2000cwc.googlegroups.com>

Björn Persson wrote:

> Martin Krischik wrote:
>> I wonder about that. UCS character sets are fixed length and UTF
>> character sets are variable length. So is it right to say that UCS-4
>> is UTF-32?
>
> I believe every possible text will be encoded identically in UCS-4BE and
> UTF-32BE, as well as in UCS-4LE and UTF-32LE. If you have a
> counter-example then I would like to see it. What character could take
> up more than one code unit in UTF-32?

A few years ago you could have said the same, replacing every '32' with '16'. Many programmers relied on UTF-16 and UCS-2 being the same. There were no counter-examples at the time either.
But one fine day in 2001 the Unicode authority(s) defined the 65537th character, which UTF-16 can only encode as a pair of code units and UCS-2 cannot encode at all.

I know that currently only 21 bits are actually used and the Unicode authority(s) have given up on using more code points. Still, I am unsure about just declaring the two the same.

Martin
-- 
mailto://krischik@users.sourceforge.net
Ada programming at: http://ada.krischik.com
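
To make the UTF-16 versus UTF-32 point concrete, here is a minimal Python sketch (not part of the original exchange). The helper name code_units and the sample characters are assumptions chosen for illustration; U+10400 simply stands in for any code point above U+FFFF. It counts how many fixed-size code units each encoding needs for a character.

```python
# Illustrative sketch only: code_units is a hypothetical helper, and the
# sample characters are arbitrary picks from inside and outside the BMP.

def code_units(text: str, encoding: str, unit_bytes: int) -> int:
    """Return how many code units of unit_bytes bytes the encoded text uses."""
    return len(text.encode(encoding)) // unit_bytes

bmp_char = "\u00e4"           # a code point below U+10000
supplementary = "\U00010400"  # a code point above U+FFFF
last_code_point = chr(0x10FFFF)  # highest code point Python accepts

for label, ch in [("BMP", bmp_char),
                  ("supplementary", supplementary),
                  ("U+10FFFF", last_code_point)]:
    utf16 = code_units(ch, "utf-16-be", 2)  # 16-bit code units, no BOM
    utf32 = code_units(ch, "utf-32-be", 4)  # 32-bit code units, no BOM
    print(f"{label}: UTF-16 units = {utf16}, UTF-32 units = {utf32}")
```

With the code space as it stands today, every code point fits in one UTF-32 code unit, while anything above U+FFFF needs two UTF-16 code units. The sketch says nothing about whether that will always remain so, which is the doubt raised above.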