From: Georg Bauhaus
Newsgroups: comp.lang.ada
Subject: Re: Avatox 1.0: Trouble with encoding in Windows
Date: Sat, 16 Sep 2006 00:29:24 +0200
Message-Id: <1158359363.29388.36.camel@localhost.localdomain>

On Fri, 2006-09-15 at 20:53 +0200, Dmitry A. Kazakov wrote:

> IMO, the idea to use Unicode for program sources is wrong. The language (be
> it formal or natural) should have a finite and reasonably small alphabet.
> Unicode is practically an open-end set of symbols most of them you wouldn't
> be able to either recognize or remember again.

Unicode is quite flexible and allows a project to choose a reasonable subset of characters. A portable subset is fairly easy to describe because both Ada and UCS define a common character set from which you can choose. No lengthy discussions of how to interpret 8 bits, no issues with conforming compilers.
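To make the subset idea concrete, here is a minimal sketch in Python (not Ada, and not anything Avatox or a compiler does) of how a project might check identifiers against a subset it has chosen. The ranges in ALLOWED_RANGES and the helper names are assumptions made up for illustration. A real policy would be whatever the project agrees on.

```python
import unicodedata

# Hypothetical project policy (an assumption for this sketch): identifiers
# may use ASCII letters, digits, underscore, and code points from the
# Greek and Coptic block (U+0370..U+03FF).
ALLOWED_RANGES = [
    (0x0030, 0x0039),  # digits 0-9
    (0x0041, 0x005A),  # A-Z
    (0x005F, 0x005F),  # underscore
    (0x0061, 0x007A),  # a-z
    (0x0370, 0x03FF),  # Greek and Coptic block
]


def in_project_subset(ch):
    """True if the character's code point lies in one of the allowed ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ALLOWED_RANGES)


def rejected_characters(identifier):
    """Return (character, Unicode name) pairs that fall outside the subset."""
    return [
        (ch, unicodedata.name(ch, "<unnamed>"))
        for ch in identifier
        if not in_project_subset(ch)
    ]


# U+03A9 is inside the chosen subset, U+2126 is not.
print(rejected_characters("Greek_\u03A9"))     # []
print(rejected_characters("Electric_\u2126"))  # [('\u2126', 'OHM SIGN')]
```

Extending the subset later only means adding a range to the table. Identifiers that were accepted before stay accepted.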
Greek.Ω /= Electric.Ω is an issue in Ada 95, too, when you use local character sets for two different files. Shou1d the number l, sorry, 1, not occur in source text because it is too easy to miss the difference? So please, remove it from the Ada grammar? ;-)

You can extend the Unicode subset chosen for the project later, without introducing ambiguity or a configuration issue. Using Unicode for program source text lets you write identifiers that just cannot coexist in Latin_1, or any 8-bit character set.

-- Georg