From: Stephen Leake
Newsgroups: comp.lang.ada
Subject: Re: Character Sets
Date: 02 Dec 2002 13:28:58 -0500
Organization: NASA Goddard Space Flight Center (skates.gsfc.nasa.gov)

"Robert C. Leif" writes:

> Christoph Grein responded to my inquiry by stating that, "
> Latin_9.Euro_Sign is a name for a character. The same character in
> Latin_1 has a different name, it is the Currency_Sign." "So why do
> you expect this character not to be in the set only because you use
> a different name for it?"
> The Euro_Sign and the Currency_Sign have a
> different representation according to The ISO 8859 Alphabet Soup
> http://czyborra.com/charsets/iso8859.html
> ------------------------------------------------
> GNAT Latin_9 (ISO-8859-15)includes the following:
> -- Summary of Changes from Latin-1 => Latin-9 --
> ------------------------------------------------
>
> -- 164 Currency => Euro_Sign
> -- 166 Broken_Bar => UC_S_Caron
> -- 168 Diaeresis => LC_S_Caron
> -- 180 Acute => UC_Z_Caron
> -- 184 Cedilla => LC_Z_Caron
> -- 188 Fraction_One_Quarter => UC_Ligature_OE
> -- 189 Fraction_One_Half => LC_Ligature_OE
> -- 190 Fraction_Three_Quarters => UC_Y_Diaeresis

Hmm. This says to me:

"In the Latin-1 character set, the character with internal value 164
is called 'Currency'. In the Latin-9 character set, the character with
internal value 164 is called 'Euro_Sign'".

Presumably, elsewhere in the Latin-1 and Latin-9 standards, they
specify the "glyph" used to display those characters on a screen or
paper, and the glyph for character 164 is different between Latin-1
and Latin-9.

> Since these are changes, they should not be the same character.

By "same character", we (and Ada) mean "same internal value", i.e.
"164". However, I suspect you mean "same glyph", in which case they
are not the "same character"; they do not have the same glyph.

> Below are the results of an extension of my original program that
> now tests the characters of Latin_9 from character number 164
> through 190 and prints them out.

What results would you like from this program?

> I understand that choice of the Windows font will change their
> representation.

Yes, because the choice of font determines the glyph.

> anyone interested, I have put my program at the end of this note. I
> suspect that the best solution would be to introduce UniCode,

I'm not clear what the "problem" is, so I can't tell if this is a
"solution".

-- 
-- Stephe
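
The point about internal value 164 can be checked outside Ada. Below is a minimal sketch in Python (not Ada, and not the program from the original post), assuming Python's standard "latin-1" and "iso8859-15" codecs. The function name is made up for illustration. It decodes every 8-bit value under both character sets and reports the positions where the same internal value names a different character.

```python
import unicodedata


def latin1_vs_latin9_differences():
    """Return (value, latin1_char, latin9_char) for every 8-bit value
    that Latin-1 and Latin-9 map to different characters."""
    differences = []
    for value in range(256):
        raw = bytes([value])                    # one byte holding the internal value
        in_latin1 = raw.decode("latin-1")       # ISO-8859-1 interpretation
        in_latin9 = raw.decode("iso8859-15")    # ISO-8859-15 interpretation
        if in_latin1 != in_latin9:
            differences.append((value, in_latin1, in_latin9))
    return differences


if __name__ == "__main__":
    # Print the internal value and the Unicode name of each character,
    # which plays the role of the Ada constant names in the quoted table.
    for value, c1, c9 in latin1_vs_latin9_differences():
        print(value, unicodedata.name(c1), "=>", unicodedata.name(c9))
```

The positions reported should be the same eight values listed in the GNAT comment quoted above. The internal value is identical in each row; only the character assigned to it, and hence the glyph a font would draw, differs.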