From: dewar@cs.nyu.edu (Robert Dewar)
Newsgroups: comp.lang.ada
Subject: Re: Ada + Multi-Byte/Wide Chars = Modern Language?
Date: 1 Feb 1995 07:13:27 -0500
Organization: Courant Institute of Mathematical Sciences
Message-ID: <3gntt7$8dg@gnat.cs.nyu.edu>
References: <3g3kde$9p4@gnat.cs.nyu.edu> <3g866d$6hb@cnj.digex.net>
 <3gc88j$7i1@gamma.ois.com> <1995Jan29.171712.4531@midway.uchicago.edu>

Richard Goerwitz says:

>In R. William Beckwith writes:
>>
>>    -- The declaration of type Wide_Character is based on the standard
>>    -- ISO 10646 BMP character set. The first 256 positions have the
>>    -- same contents as type Character. See 3.5.2.
>        ^^^^^^^^^^^^^
>
>What is meant by "same contents"? There are several multi-byte and wide
>character standards. By using the word "contents" here is the standard
>implying that type Character is assumed to encode specific glyphs?

Yes, there are several multi-byte and wide character standards, but what's
that got to do with it? As Bill (and the RM!) make clear, Ada uses the
ISO 10646 BMP standard, which is identical to Unicode. The first 256
character positions of the BMP set correspond to Latin-1, which is what
type Character in Ada is.

I.e., the answer to your question is that not only is the Standard
"implying" the encoding of specific glyphs, it is in a sense requiring it.
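The correspondence between Latin-1 and the first 256 BMP positions can be checked mechanically. The sketch below uses Python's built-in "latin-1" codec as a stand-in for type Character; the function name is made up for the illustration, and nothing here is Ada or GNAT code.

```python
def latin1_matches_low_unicode():
    """Return True if every Latin-1 byte value decodes to the Unicode
    code point with the same number (positions 0 through 255)."""
    for code in range(256):
        # Decode one byte as Latin-1, compare with the same-numbered code point.
        if bytes([code]).decode("latin-1") != chr(code):
            return False
    return True


print(latin1_matches_low_unicode())
```

In Ada terms this is the property that lets a Character value be converted to Wide_Character by position number alone.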
Of course in practice there are no language semantics that depend on the
specific glyphs, so a given Ada compiler can be used in an environment
with a quite different set of glyphs.

In addition, Ada compilers are free to provide alternative localizations
of the definition of Character and/or Wide_Character. For example, in GNAT
there is a switch gnati with the following settings:

  -gnati1   Latin-1 (the standard setting)
  -gnati2   Latin-2
  -gnati3   Latin-3
  -gnati4   Latin-4
  -gnatip   IBM-PC character set
  -gnatif   Full upper half allowed in identifiers with no case equivalence
  -gnatin   No upper half characters allowed in identifiers
  -gnatiw   Wide characters allowed in identifiers

These settings govern the set of characters that are accepted in
identifiers, and the definition of upper-lower case correspondence.
For detailed definitions of these character sets, see the source of
package Csets in the GNAT sources.
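To see why the choice of character set changes the upper-lower case correspondence, it helps to look at one upper-half byte under two of the sets. This is only a rough illustration using Python's "latin-1" and "iso8859_2" codecs; it does not reproduce the tables in package Csets, and the helper name `describe` is hypothetical.

```python
def describe(byte_value, encoding):
    """Return (character, lowercase form) for one byte under an 8-bit
    encoding, using Python's codec tables and Unicode case mapping."""
    ch = bytes([byte_value]).decode(encoding)
    return ch, ch.lower()


# The same byte is a currency sign in Latin-1 but a letter in Latin-2.
for enc in ("latin-1", "iso8859_2"):
    ch, low = describe(0xA3, enc)
    print(f"{enc}: 0xA3 is {ch!r}, lowercase form {low!r}")
```

Under Latin-1 the byte 16#A3# is the pound sign and has no case partner, while under Latin-2 it is an upper case letter whose lower case form sits at 16#B3#. A compiler that folds case in identifiers therefore needs to know which set is in force.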