comp.lang.ada
 help / color / mirror / Atom feed
* Re: Character Sets
@ 2002-11-28 17:53 Robert C. Leif
  2002-11-28 18:08 ` Character Sets (plain text police report) Warren W. Gay VE3WWG
                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Robert C. Leif @ 2002-11-28 17:53 UTC (permalink / raw)


Christoph Grein responded to my inquiry by stating that,
" Latin_9.Euro_Sign is a name for a character. The same character in Latin_1 has a different name, it is the Currency_Sign."
"So why do you expect this character not to be in the set only because you use a different name for it?"
The Euro_Sign and the Currency_Sign have a different representation according to The ISO 8859 Alphabet Soup http://czyborra.com/charsets/iso8859.html
------------------------------------------------
GNAT Latin_9 (ISO-8859-15)includes the following:
   -- Summary of Changes from Latin-1 => Latin-9 --
   ------------------------------------------------

   --   164     Currency                => Euro_Sign
   --   166     Broken_Bar              => UC_S_Caron
   --   168     Diaeresis               => LC_S_Caron
   --   180     Acute                   => UC_Z_Caron
   --   184     Cedilla                 => LC_Z_Caron
   --   188     Fraction_One_Quarter    => UC_Ligature_OE
   --   189     Fraction_One_Half       => LC_Ligature_OE
   --   190     Fraction_Three_Quarters => UC_Y_Diaeresis
Since these are changes, they should not be the same character.
Below are the results of an extension of my original program that now tests the characters of Latin_9 from character number 164 through 190 and prints them out. I understand that choice of the Windows font will change their representation. The correct glyphs can be found at The ISO 8859 Alphabet Soup. For anyone interested, I have put my program at the end of this note.
I suspect that the best solution would be to introduce UniCode, ISO/IEC 10646, into the Ada standard. The arguments for this are contained in W3C Character Model for the World Wide Web 1.0, W3C Working Draft 30 April 2002
http://www.w3.org/TR/charmod/
"The choice of Unicode was motivated by the fact that Unicode: is the only universal character repertoire available, covers the widest possible range, provides a way of referencing characters independent of the encoding of a resource, is being updated/completed carefully, is widely accepted and implemented by industry."
"W3C adopted Unicode as the document character set for HTML in [HTML 4.0]. The same approach was later used for specifications such as XML 1.0 [XML 1.0] and CSS2 [CSS2]. Unicode now serves as a common reference for W3C specifications and applications."
"The IETF has adopted some policies on the use of character sets on the Internet (see [RFC 2277])."
Bob Leif
------------------------Starting Test-----------------------
Latin_9_Diff is ñѪº¿⌐¬½¼¡«»░▒▓│┤╡╢╖╕╣║╗╝╜╛

The Character ñ is in Latin_1 is TRUE. Its position is  164
The Character Ñ is in Latin_1 is TRUE. Its position is  165
The Character ª is in Latin_1 is TRUE. Its position is  166
The Character º is in Latin_1 is TRUE. Its position is  167
The Character ¿ is in Latin_1 is TRUE. Its position is  168
The Character ⌐ is in Latin_1 is TRUE. Its position is  169
The Character ¬ is in Latin_1 is TRUE. Its position is  170
The Character ½ is in Latin_1 is TRUE. Its position is  171
The Character ¼ is in Latin_1 is TRUE. Its position is  172
The Character ¡ is in Latin_1 is TRUE. Its position is  173
The Character « is in Latin_1 is TRUE. Its position is  174
The Character » is in Latin_1 is TRUE. Its position is  175
The Character ░ is in Latin_1 is TRUE. Its position is  176
The Character ▒ is in Latin_1 is TRUE. Its position is  177
The Character ▓ is in Latin_1 is TRUE. Its position is  178
The Character │ is in Latin_1 is TRUE. Its position is  179
The Character ┤ is in Latin_1 is TRUE. Its position is  180
The Character ╡ is in Latin_1 is TRUE. Its position is  181
The Character ╢ is in Latin_1 is TRUE. Its position is  182
The Character ╖ is in Latin_1 is TRUE. Its position is  183
The Character ╕ is in Latin_1 is TRUE. Its position is  184
The Character ╣ is in Latin_1 is TRUE. Its position is  185
The Character ║ is in Latin_1 is TRUE. Its position is  186
The Character ╗ is in Latin_1 is TRUE. Its position is  187
The Character ╝ is in Latin_1 is TRUE. Its position is  188
The Character ╜ is in Latin_1 is TRUE. Its position is  189
The Character ╛ is in Latin_1 is TRUE. Its position is  190
------------------------Ending Test-----------------------
--Robert C. Leif, Ph.D & Ada_Med Copyright all rights reserved.
--Main Procedure 
--Created 27 November 2002
with Ada.Text_Io;
with Ada.Io_Exceptions;
with Ada.Exceptions;
with Ada.Strings;
with Ada.Strings.Maps;
with  Ada.Characters.Latin_1;
with  Ada.Characters.Latin_9;
procedure Char_Sets_Test is 
   ------------------Table of Contents------------- 
   package T_Io renames Ada.Text_Io;
   package Str_Maps renames Ada.Strings.Maps;
   package Latin_1 renames Ada.Characters.Latin_1;
   package Latin_9 renames Ada.Characters.Latin_9;
   subtype Character_Set_Type is Str_Maps.Character_Set;
   subtype Character_Sequence_Type is Str_Maps.Character_Sequence;

   -----------------End Table of Contents-------------
   Latin_1_Range    : constant Str_Maps.Character_Range
      := (Low => Latin_1.Nul, High => Latin_1.Lc_Y_Diaeresis);  
   Latin_1_Char_Set :          Character_Set_Type      
      := Str_Maps.To_Set (Span => Latin_1_Range);  
   --Standard for Ada '95
   -- Latin_9 Differences: Euro_Sign, Uc_S_Caron, Lc_S_Caron, Uc_Z_Caron, 
   -- Lc_Z_Caron, Uc_Ligature_Oe, Lc_Ligature_Oe, Uc_Y_Diaeresis.
   Latin_9_Diff_Latin_1_Super_Range  : constant Str_Maps.Character_Range
      := (Low => Latin_9.Euro_Sign, High => Latin_9.Uc_Y_Diaeresis);  
   Latin_9_Diff_Latin_1_Super_Set    :          Character_Set_Type      
      := Str_Maps.To_Set (Span => Latin_9_Diff_Latin_1_Super_Range);  
   Latin_9_Diff_Latin_1_Super_String :          Character_Sequence_Type 
      := Str_Maps.To_Sequence (Latin_9_Diff_Latin_1_Super_Set);  
   Character_Set_Name                :          String                 
      := "Latin_1";  
   ---------------------------------------------   
   procedure Test_Character_Sets (
         Character_Sequence_Var : in     Character_Sequence_Type; 
         Set                    : in     Character_Set_Type       ) is 
      Is_In_Character_Set : Boolean   := False;  
      Char                : Character := 'X';  
      Character_Set_Position : Positive := 164; -- Euro_Sign   
   begin--Test_Character_Sets
      T_Io.Put_Line("Latin_9_Diff is " & Latin_9_Diff_Latin_1_Super_String);
      T_Io.Put_Line("");
      Test_Chars:
         for I in Character_Sequence_Var'range loop
         Char:= Character_Sequence_Var(I);
         Is_In_Character_Set:= Str_Maps.Is_In(
            Element => Char,            
            Set     => Latin_1_Char_Set);
         T_Io.Put_Line("The Character " & Char & " is in " & Character_Set_Name
            &  " is " & Boolean'Image (
               Is_In_Character_Set) & ". Its position is "
                  & Positive'Image(Character_Set_Position));
         Character_Set_Position:= Character_Set_Position + 1;
      end loop Test_Chars;
   end Test_Character_Sets;
   ---------------------------------------------     
begin--Bd_W_Char_Sets_Test
   T_Io.Put_Line("----------------------Starting Test---------------------);
   Test_Character_Sets (
      Character_Sequence_Var => Latin_9_Diff_Latin_1_Super_String, 
      Set                    => Latin_1_Char_Set);
   ---------------------------------------------
   T_Io.Put_Line("------------------------Ending Test---------------------);

exception
   when A: Ada.Io_Exceptions.Status_Error =>
      T_Io.Put_Line("Status_Error in Char_Sets_Test.");
      T_Io.Put_Line(Ada.Exceptions.Exception_Information(A));
   when O: others =>
      T_Io.Put_Line("Others_Error in Char_Sets_Test.");
      T_Io.Put_Line(Ada.Exceptions.Exception_Information(O));

end Char_Sets_Test;




^ permalink raw reply	[flat|nested] 29+ messages in thread
* Re: Character Sets
@ 2002-11-27  9:00 Grein, Christoph
  0 siblings, 0 replies; 29+ messages in thread
From: Grein, Christoph @ 2002-11-27  9:00 UTC (permalink / raw)


> From: Bob Leif
> I am trying to test if a character is not in the Latin_1 character set.
> I choose the Euro because it is in Latin_9 and not in Latin_1. I tested
> the function Ada.Strings.Maps.Is_In. It returns that the Euro_Sign is in
> the Latin_1 character set. What have I done wrong?
> My test program, which compiled and executed under GNAT 3.15p under
> Windows XP, produced:
> ------------------------Starting Test-----------------------
> Is_In_Character_Set is TRUE
> ------------------------Ending Test-----------------------
>  The test program is as follows:
> ---------------------------------------------------------
> with Ada.Text_Io;
> with Ada.Io_Exceptions;
> with Ada.Exceptions;
> with Ada.Strings;
> with Ada.Strings.Maps;
> with  Ada.Characters.Latin_1;
> with  Ada.Characters.Latin_9;
> procedure Char_Sets_Test is 
>    ------------------Table of Contents------------- 
>    package T_Io renames Ada.Text_Io;
>    package Str_Maps renames Ada.Strings.Maps;
>    package Latin_1 renames Ada.Characters.Latin_1;
>    package Latin_9 renames Ada.Characters.Latin_9;
>    subtype Character_Set_Type is Str_Maps.Character_Set;
>    -----------------End Table of Contents-------------
>    Latin_1_Range    : constant Str_Maps.Character_Range
> 	 := (Low => Latin_1.Nul, High => Latin_1.Lc_Y_Diaeresis);  

This is the full range of type Character, isn't it.

>    Latin_1_Char_Set :          Character_Set_Type       :=
> Str_Maps.To_Set 	(Span => Latin_1_Range);  

So this is the set of all characters.

>    --Standard for Ada '95
>    Is_In_Character_Set : Boolean := False;  
>    ---------------------------------------------
> begin--Bd_W_Char_Sets_Test
>    T_Io.Put_Line("-----------------------Starting
> Test--------------------);
>    ---------------------------------------------
>    --Test Character_Sets
>    Is_In_Character_Set:=Ada.Strings.Maps.Is_In (
>       Element => Latin_9.Euro_Sign, 
>       Set     => Latin_1_Char_Set);

Latin_9.Euro_Sign is a name for a character. The same character in Latin1 has a 
different name, it is the Currency_Sign.

So why do you expect this character not to be in the set only because you use a 
different name for it?

>    T_Io.Put_Line("Is_In_Character_Set is " & Boolean'Image
> (Is_In_Character_Set));
>    ---------------------------------------------   
>    ---------------------------------------------
>    T_Io.Put_Line("-----------------------Ending
> Test----------------------);
> 
> exception
>    when A: Ada.Io_Exceptions.Status_Error =>
>       T_io.Put_Line("Status_Error in Char_Sets_Test.");
>       T_Io.Put_Line(Ada.Exceptions.Exception_Information(A));
>    when O: others =>
>       T_Io.Put_Line("Others_Error in Char_Sets_Test.");
>       T_Io.Put_Line(Ada.Exceptions.Exception_Information(O));
> end Char_Sets_Test;
> 
> _______________________________________________
> comp.lang.ada mailing list
> comp.lang.ada@ada.eu.org
> http://ada.eu.org/mailman/listinfo/comp.lang.ada



^ permalink raw reply	[flat|nested] 29+ messages in thread
* Character Sets
@ 2002-11-26 21:41 Robert C. Leif
  0 siblings, 0 replies; 29+ messages in thread
From: Robert C. Leif @ 2002-11-26 21:41 UTC (permalink / raw)


From: Bob Leif
I am trying to test if a character is not in the Latin_1 character set.
I choose the Euro because it is in Latin_9 and not in Latin_1. I tested
the function Ada.Strings.Maps.Is_In. It returns that the Euro_Sign is in
the Latin_1 character set. What have I done wrong?
My test program, which compiled and executed under GNAT 3.15p under
Windows XP, produced:
------------------------Starting Test-----------------------
Is_In_Character_Set is TRUE
------------------------Ending Test-----------------------
 The test program is as follows:
---------------------------------------------------------
with Ada.Text_Io;
with Ada.Io_Exceptions;
with Ada.Exceptions;
with Ada.Strings;
with Ada.Strings.Maps;
with  Ada.Characters.Latin_1;
with  Ada.Characters.Latin_9;
procedure Char_Sets_Test is 
   ------------------Table of Contents------------- 
   package T_Io renames Ada.Text_Io;
   package Str_Maps renames Ada.Strings.Maps;
   package Latin_1 renames Ada.Characters.Latin_1;
   package Latin_9 renames Ada.Characters.Latin_9;
   subtype Character_Set_Type is Str_Maps.Character_Set;
   -----------------End Table of Contents-------------
   Latin_1_Range    : constant Str_Maps.Character_Range
	 := (Low => Latin_1.Nul, High => Latin_1.Lc_Y_Diaeresis);  
   Latin_1_Char_Set :          Character_Set_Type       :=
Str_Maps.To_Set 	(Span => Latin_1_Range);  
   --Standard for Ada '95
   Is_In_Character_Set : Boolean := False;  
   ---------------------------------------------
begin--Bd_W_Char_Sets_Test
   T_Io.Put_Line("-----------------------Starting
Test--------------------);
   ---------------------------------------------
   --Test Character_Sets
   Is_In_Character_Set:=Ada.Strings.Maps.Is_In (
      Element => Latin_9.Euro_Sign, 
      Set     => Latin_1_Char_Set);
   T_Io.Put_Line("Is_In_Character_Set is " & Boolean'Image
(Is_In_Character_Set));
   ---------------------------------------------   
   ---------------------------------------------
   T_Io.Put_Line("-----------------------Ending
Test----------------------);

exception
   when A: Ada.Io_Exceptions.Status_Error =>
      T_io.Put_Line("Status_Error in Char_Sets_Test.");
      T_Io.Put_Line(Ada.Exceptions.Exception_Information(A));
   when O: others =>
      T_Io.Put_Line("Others_Error in Char_Sets_Test.");
      T_Io.Put_Line(Ada.Exceptions.Exception_Information(O));
end Char_Sets_Test;




^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2002-12-15 23:26 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-11-28 17:53 Character Sets Robert C. Leif
2002-11-28 18:08 ` Character Sets (plain text police report) Warren W. Gay VE3WWG
2002-11-28 18:11   ` Warren W. Gay VE3WWG
2002-11-29 11:12     ` Lutz Donnerhacke
2002-11-29 14:58       ` Frank J. Lhota
2002-11-29 20:37   ` Robert C. Leif
2002-11-30 14:49     ` Marin David Condic
2002-12-01 11:28       ` Jacob Sparre Andersen
2002-12-01 14:38         ` Marin David Condic
2002-12-01 20:25           ` Jacob Sparre Andersen
2002-12-02  9:43             ` Preben Randhol
2002-12-02 13:26               ` Marin David Condic
2002-12-02  6:44           ` Robert C. Leif
2002-12-02  9:41           ` Preben Randhol
2002-12-02 16:58           ` Charles Lindsey
2002-12-02 19:29     ` A suggestion, completely unrelated to the original topic Wes Groleau
2002-12-02 23:21       ` David C. Hoos, Sr.
2002-11-29 12:28 ` Character Sets Georg Bauhaus
2002-12-02 18:28 ` Stephen Leake
2002-12-03  2:45   ` Robert C. Leif
2002-12-03 13:33     ` Robert A Duff
2002-12-03 15:32       ` Juanma Barranquero
2002-12-04  0:49       ` Robert C. Leif
2002-12-14  3:27         ` David Starner
2002-12-14 22:53           ` Vadim Godunko
2002-12-15  3:46             ` David Starner
2002-12-15 23:26             ` Robert C. Leif
  -- strict thread matches above, loose matches on Subject: below --
2002-11-27  9:00 Grein, Christoph
2002-11-26 21:41 Robert C. Leif

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox