From: "Robert C. Leif" <rleif@rleif.com>
Subject: Re: Character Sets
Date: Thu, 28 Nov 2002 09:53:14 -0800
Message-ID: <mailman.1038506043.17255.comp.lang.ada@ada.eu.org>
Christoph Grein responded to my inquiry by stating:
"Latin_9.Euro_Sign is a name for a character. The same character in Latin_1 has a different name, it is the Currency_Sign."
"So why do you expect this character not to be in the set only because you use a different name for it?"
The Euro_Sign and the Currency_Sign have different representations, according to The ISO 8859 Alphabet Soup: http://czyborra.com/charsets/iso8859.html
GNAT Latin_9 (ISO-8859-15) includes the following:
------------------------------------------------
-- Summary of Changes from Latin-1 => Latin-9 --
------------------------------------------------
-- 164 Currency                => Euro_Sign
-- 166 Broken_Bar              => UC_S_Caron
-- 168 Diaeresis               => LC_S_Caron
-- 180 Acute                   => UC_Z_Caron
-- 184 Cedilla                 => LC_Z_Caron
-- 188 Fraction_One_Quarter    => UC_Ligature_OE
-- 189 Fraction_One_Half       => LC_Ligature_OE
-- 190 Fraction_Three_Quarters => UC_Y_Diaeresis
Since these are listed as changes, they should not be the same characters.
Below are the results from an extension of my original program, which now tests the Latin_9 characters at positions 164 through 190 and prints them. I understand that the choice of Windows font will change how they are displayed. The correct glyphs can be found at The ISO 8859 Alphabet Soup. For anyone interested, I have put my program at the end of this note.
I suspect that the best solution would be to introduce Unicode (ISO/IEC 10646) into the Ada standard. The arguments for this are given in the W3C Character Model for the World Wide Web 1.0, W3C Working Draft, 30 April 2002:
http://www.w3.org/TR/charmod/
"The choice of Unicode was motivated by the fact that Unicode: is the only universal character repertoire available, covers the widest possible range, provides a way of referencing characters independent of the encoding of a resource, is being updated/completed carefully, is widely accepted and implemented by industry."
"W3C adopted Unicode as the document character set for HTML in [HTML 4.0]. The same approach was later used for specifications such as XML 1.0 [XML 1.0] and CSS2 [CSS2]. Unicode now serves as a common reference for W3C specifications and applications."
"The IETF has adopted some policies on the use of character sets on the Internet (see [RFC 2277])."
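For comparison, Ada 95 already has Wide_Character, whose positions follow the ISO/IEC 10646 Basic Multilingual Plane. The following is only a minimal sketch of what that gives us today; the procedure name Euro_In_10646 is made up for illustration, and Ada.Characters.Handling.Is_Character is the Ada 95 form of the test (later revisions of the language mark it obsolescent). It shows that the 10646 euro sign, at position 16#20AC#, lies outside the 256 positions of type Character.

```ada
with Ada.Text_IO;
with Ada.Characters.Handling;

--  Hypothetical illustration, not part of the test program in this note.
procedure Euro_In_10646 is
   --  ISO/IEC 10646 places the euro sign at position 16#20AC#.
   Euro : constant Wide_Character := Wide_Character'Val (16#20AC#);
begin
   --  Is_Character is True only for positions 0 .. 255,
   --  so this prints FALSE.
   Ada.Text_IO.Put_Line
     (Boolean'Image (Ada.Characters.Handling.Is_Character (Euro)));

   --  Prints the position as a decimal number: 8364.
   Ada.Text_IO.Put_Line (Integer'Image (Wide_Character'Pos (Euro)));
end Euro_In_10646;
```

In other words, a 10646-based euro sign cannot be confused with any Latin_1 character, which is part of the appeal of the approach.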
Bob Leif
------------------------Starting Test-----------------------
Latin_9_Diff is ñѪº¿⌐¬½¼¡«»░▒▓│┤╡╢╖╕╣║╗╝╜╛
The Character ñ is in Latin_1 is TRUE. Its position is 164
The Character Ñ is in Latin_1 is TRUE. Its position is 165
The Character ª is in Latin_1 is TRUE. Its position is 166
The Character º is in Latin_1 is TRUE. Its position is 167
The Character ¿ is in Latin_1 is TRUE. Its position is 168
The Character ⌐ is in Latin_1 is TRUE. Its position is 169
The Character ¬ is in Latin_1 is TRUE. Its position is 170
The Character ½ is in Latin_1 is TRUE. Its position is 171
The Character ¼ is in Latin_1 is TRUE. Its position is 172
The Character ¡ is in Latin_1 is TRUE. Its position is 173
The Character « is in Latin_1 is TRUE. Its position is 174
The Character » is in Latin_1 is TRUE. Its position is 175
The Character ░ is in Latin_1 is TRUE. Its position is 176
The Character ▒ is in Latin_1 is TRUE. Its position is 177
The Character ▓ is in Latin_1 is TRUE. Its position is 178
The Character │ is in Latin_1 is TRUE. Its position is 179
The Character ┤ is in Latin_1 is TRUE. Its position is 180
The Character ╡ is in Latin_1 is TRUE. Its position is 181
The Character ╢ is in Latin_1 is TRUE. Its position is 182
The Character ╖ is in Latin_1 is TRUE. Its position is 183
The Character ╕ is in Latin_1 is TRUE. Its position is 184
The Character ╣ is in Latin_1 is TRUE. Its position is 185
The Character ║ is in Latin_1 is TRUE. Its position is 186
The Character ╗ is in Latin_1 is TRUE. Its position is 187
The Character ╝ is in Latin_1 is TRUE. Its position is 188
The Character ╜ is in Latin_1 is TRUE. Its position is 189
The Character ╛ is in Latin_1 is TRUE. Its position is 190
------------------------Ending Test-----------------------
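One way to read the output above: Latin_1_Char_Set spans every position of type Character (0 through 255), and the GNAT Latin_9 names are constants of that same type, so Is_In cannot return anything but TRUE for them. The following is a minimal sketch of that point, assuming GNAT's Ada.Characters.Latin_9 package (it is not part of the Ada 95 standard); the procedure name Same_Position is made up for illustration.

```ada
with Ada.Text_IO;
with Ada.Characters.Latin_1;
with Ada.Characters.Latin_9;  --  GNAT-provided, not in the Ada 95 standard

--  Hypothetical illustration, not part of the test program in this note.
procedure Same_Position is
   package Latin_1 renames Ada.Characters.Latin_1;
   package Latin_9 renames Ada.Characters.Latin_9;
begin
   --  Both names are constants of type Character at position 164,
   --  so the comparison prints TRUE.
   Ada.Text_IO.Put_Line
     (Boolean'Image (Latin_9.Euro_Sign = Latin_1.Currency_Sign));

   --  Prints the shared position: 164.
   Ada.Text_IO.Put_Line
     (Integer'Image (Character'Pos (Latin_9.Euro_Sign)));
end Same_Position;
```

Which glyph appears for position 164 is then decided by the encoding that the terminal or font assumes, not by the Ada program.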
--Robert C. Leif, Ph.D & Ada_Med Copyright all rights reserved.
--Main Procedure
--Created 27 November 2002
with Ada.Text_Io;
with Ada.Io_Exceptions;
with Ada.Exceptions;
with Ada.Strings;
with Ada.Strings.Maps;
with Ada.Characters.Latin_1;
with Ada.Characters.Latin_9;
procedure Char_Sets_Test is
   ------------------Table of Contents-------------
   package T_Io renames Ada.Text_Io;
   package Str_Maps renames Ada.Strings.Maps;
   package Latin_1 renames Ada.Characters.Latin_1;
   package Latin_9 renames Ada.Characters.Latin_9;
   subtype Character_Set_Type is Str_Maps.Character_Set;
   subtype Character_Sequence_Type is Str_Maps.Character_Sequence;
   -----------------End Table of Contents-------------
   --The full Latin_1 range, positions 0 through 255.
   Latin_1_Range : constant Str_Maps.Character_Range
      := (Low => Latin_1.Nul, High => Latin_1.Lc_Y_Diaeresis);
   Latin_1_Char_Set : Character_Set_Type
      := Str_Maps.To_Set (Span => Latin_1_Range);
   --Standard for Ada '95
   -- Latin_9 Differences: Euro_Sign, Uc_S_Caron, Lc_S_Caron, Uc_Z_Caron,
   -- Lc_Z_Caron, Uc_Ligature_Oe, Lc_Ligature_Oe, Uc_Y_Diaeresis.
   --The range below covers every position from 164 through 190,
   --not only the eight changed ones.
   Latin_9_Diff_Latin_1_Super_Range : constant Str_Maps.Character_Range
      := (Low => Latin_9.Euro_Sign, High => Latin_9.Uc_Y_Diaeresis);
   Latin_9_Diff_Latin_1_Super_Set : Character_Set_Type
      := Str_Maps.To_Set (Span => Latin_9_Diff_Latin_1_Super_Range);
   Latin_9_Diff_Latin_1_Super_String : Character_Sequence_Type
      := Str_Maps.To_Sequence (Latin_9_Diff_Latin_1_Super_Set);
   Character_Set_Name : constant String
      := "Latin_1";
   ---------------------------------------------
   --Reports, for each character of the sequence, whether it is in Set.
   procedure Test_Character_Sets (
         Character_Sequence_Var : in Character_Sequence_Type;
         Set                    : in Character_Set_Type ) is
      Is_In_Character_Set    : Boolean   := False;
      Char                   : Character := 'X';
      Character_Set_Position : Positive  := 164; -- Euro_Sign
   begin--Test_Character_Sets
      T_Io.Put_Line("Latin_9_Diff is " & Character_Sequence_Var);
      T_Io.Put_Line("");
      Test_Chars:
      for I in Character_Sequence_Var'range loop
         Char:= Character_Sequence_Var(I);
         --Test against the set passed in, not the global set.
         Is_In_Character_Set:= Str_Maps.Is_In(
            Element => Char,
            Set     => Set);
         T_Io.Put_Line("The Character " & Char & " is in " & Character_Set_Name
            & " is " & Boolean'Image (
               Is_In_Character_Set) & ". Its position is "
            & Positive'Image(Character_Set_Position));
         Character_Set_Position:= Character_Set_Position + 1;
      end loop Test_Chars;
   end Test_Character_Sets;
   ---------------------------------------------
begin--Char_Sets_Test
   T_Io.Put_Line("----------------------Starting Test---------------------");
   Test_Character_Sets (
      Character_Sequence_Var => Latin_9_Diff_Latin_1_Super_String,
      Set                    => Latin_1_Char_Set);
   ---------------------------------------------
   T_Io.Put_Line("------------------------Ending Test---------------------");
exception
   when A: Ada.Io_Exceptions.Status_Error =>
      T_Io.Put_Line("Status_Error in Char_Sets_Test.");
      T_Io.Put_Line(Ada.Exceptions.Exception_Information(A));
   when O: others =>
      T_Io.Put_Line("Others_Error in Char_Sets_Test.");
      T_Io.Put_Line(Ada.Exceptions.Exception_Information(O));
end Char_Sets_Test;