From: "Randy Brukardt"
Newsgroups: comp.lang.ada
Subject: Re: System.WCh_Cnv
Date: Mon, 24 Jul 2006 18:35:00 -0500

"Björn Persson" wrote in message news:Sxaxg.10370$E02.3445@newsb.telia.net...
>...
> How about a *hypothetical* counter-example? If you had a character with
> the code point 100000000 hexadecimal, how would you encode it in UTF-32?
> I believe it's impossible; I believe UTF-32 is a fixed-width encoding.
It would have to be hypothetical: Unicode is a 31-bit character set. Note that I said 31 bits, not 32 bits. Thus, UTF-32 and UCS-4 are the same *if encoding Unicode characters*. (UTF-32 would need extra bytes to encode 32-bit characters with the high bit on, but those would not be Unicode characters.) Of course, some future character set could use more than 31 bits, but that seems well into the future.

Randy Brukardt.
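For anyone who wants to check the bit counts in this exchange, here is a minimal Python sketch. It only counts bits; it does not model the actual UTF-32 or UCS-4 encoding rules. The names bits_needed and fits_in_one_unit are made up for illustration, and the 32-bit unit width is an assumption taken from the "fixed-width" remark in the quoted question.

```python
def bits_needed(code_point: int) -> int:
    """Return the number of bits needed to hold a non-negative code point."""
    if code_point < 0:
        raise ValueError("code points are non-negative")
    # int.bit_length() returns 0 for 0; treat that as needing one bit.
    return max(code_point.bit_length(), 1)


def fits_in_one_unit(code_point: int, unit_bits: int = 32) -> bool:
    """True if the value fits in a single fixed-width unit of unit_bits bits."""
    return bits_needed(code_point) <= unit_bits


hypothetical = 0x100000000   # the code point from the quoted question
largest_31_bit = 0x7FFFFFFF  # top of the 31-bit range described in the reply
high_bit_set = 0x80000000    # a 32-bit value with the high bit on

for value in (largest_31_bit, high_bit_set, hypothetical):
    print(hex(value), bits_needed(value), fits_in_one_unit(value))
```

The hypothetical code point 100000000 hexadecimal is 2**32, so it needs 33 bits and cannot fit in a single 32-bit unit, while everything in the 31-bit range does.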