Character Sets

comp.lang.ada

* Character Sets
@ 2002-11-26 21:41 Robert C. Leif
  0 siblings, 0 replies; 29+ messages in thread
From: Robert C. Leif @ 2002-11-26 21:41 UTC (permalink / raw)


From: Bob Leif
I am trying to test whether a character is not in the Latin_1 character set.
I chose the Euro sign because it is in Latin_9 and not in Latin_1. I tested
the function Ada.Strings.Maps.Is_In. It reports that Euro_Sign is in
the Latin_1 character set. What have I done wrong?
My test program, which compiled and executed under GNAT 3.15p under
Windows XP, produced:
------------------------Starting Test-----------------------
Is_In_Character_Set is TRUE
------------------------Ending Test-----------------------
 The test program is as follows:
---------------------------------------------------------
with Ada.Text_Io;
with Ada.Io_Exceptions;
with Ada.Exceptions;
with Ada.Strings;
with Ada.Strings.Maps;
with  Ada.Characters.Latin_1;
with  Ada.Characters.Latin_9;
procedure Char_Sets_Test is 
   ------------------Table of Contents------------- 
   package T_Io renames Ada.Text_Io;
   package Str_Maps renames Ada.Strings.Maps;
   package Latin_1 renames Ada.Characters.Latin_1;
   package Latin_9 renames Ada.Characters.Latin_9;
   subtype Character_Set_Type is Str_Maps.Character_Set;
   -----------------End Table of Contents-------------
   Latin_1_Range    : constant Str_Maps.Character_Range
      := (Low => Latin_1.Nul, High => Latin_1.Lc_Y_Diaeresis);
   Latin_1_Char_Set :          Character_Set_Type
      := Str_Maps.To_Set (Span => Latin_1_Range);
   --Standard for Ada '95
   Is_In_Character_Set : Boolean := False;
   ---------------------------------------------
begin--Char_Sets_Test
   T_Io.Put_Line("-----------------------Starting Test--------------------");
   ---------------------------------------------
   --Test Character_Sets
   Is_In_Character_Set:=Ada.Strings.Maps.Is_In (
      Element => Latin_9.Euro_Sign,
      Set     => Latin_1_Char_Set);
   T_Io.Put_Line("Is_In_Character_Set is "
      & Boolean'Image (Is_In_Character_Set));
   ---------------------------------------------
   T_Io.Put_Line("-----------------------Ending Test----------------------");

exception
   when A: Ada.Io_Exceptions.Status_Error =>
      T_io.Put_Line("Status_Error in Char_Sets_Test.");
      T_Io.Put_Line(Ada.Exceptions.Exception_Information(A));
   when O: others =>
      T_Io.Put_Line("Others_Error in Char_Sets_Test.");
      T_Io.Put_Line(Ada.Exceptions.Exception_Information(O));
end Char_Sets_Test;





* Re: Character Sets
@ 2002-11-27  9:00 Grein, Christoph
  0 siblings, 0 replies; 29+ messages in thread
From: Grein, Christoph @ 2002-11-27  9:00 UTC (permalink / raw)


> From: Bob Leif
> I am trying to test if a character is not in the Latin_1 character set.
> I choose the Euro because it is in Latin_9 and not in Latin_1. I tested
> the function Ada.Strings.Maps.Is_In. It returns that the Euro_Sign is in
> the Latin_1 character set. What have I done wrong?
> My test program, which compiled and executed under GNAT 3.15p under
> Windows XP, produced:
> ------------------------Starting Test-----------------------
> Is_In_Character_Set is TRUE
> ------------------------Ending Test-----------------------
>  The test program is as follows:
> ---------------------------------------------------------
> with Ada.Text_Io;
> with Ada.Io_Exceptions;
> with Ada.Exceptions;
> with Ada.Strings;
> with Ada.Strings.Maps;
> with  Ada.Characters.Latin_1;
> with  Ada.Characters.Latin_9;
> procedure Char_Sets_Test is 
>    ------------------Table of Contents------------- 
>    package T_Io renames Ada.Text_Io;
>    package Str_Maps renames Ada.Strings.Maps;
>    package Latin_1 renames Ada.Characters.Latin_1;
>    package Latin_9 renames Ada.Characters.Latin_9;
>    subtype Character_Set_Type is Str_Maps.Character_Set;
>    -----------------End Table of Contents-------------
>    Latin_1_Range    : constant Str_Maps.Character_Range
> 	 := (Low => Latin_1.Nul, High => Latin_1.Lc_Y_Diaeresis);  

This is the full range of type Character, isn't it?

>    Latin_1_Char_Set :          Character_Set_Type       :=
> Str_Maps.To_Set 	(Span => Latin_1_Range);  

So this is the set of all characters.

>    --Standard for Ada '95
>    Is_In_Character_Set : Boolean := False;  
>    ---------------------------------------------
> begin--Bd_W_Char_Sets_Test
>    T_Io.Put_Line("-----------------------Starting
> Test--------------------);
>    ---------------------------------------------
>    --Test Character_Sets
>    Is_In_Character_Set:=Ada.Strings.Maps.Is_In (
>       Element => Latin_9.Euro_Sign, 
>       Set     => Latin_1_Char_Set);

Latin_9.Euro_Sign is a name for a character value. The same value in Latin_1
has a different name: Currency_Sign. Both names denote Character'Val (164).

So why do you expect this character not to be in the set only because you use a 
different name for it?
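
The point can be checked directly. The following is a minimal sketch, assuming
GNAT's Ada.Characters.Latin_9 package as used in the original program; the
procedure name Same_Code_Point is made up for illustration. It prints the
position of both named constants and compares them as Character values.

```ada
with Ada.Text_IO;
with Ada.Characters.Latin_1;
with Ada.Characters.Latin_9;  --  GNAT package, as in the original program

procedure Same_Code_Point is
   package Latin_1 renames Ada.Characters.Latin_1;
   package Latin_9 renames Ada.Characters.Latin_9;
begin
   --  Both names are constants of type Character; only the name differs.
   Ada.Text_IO.Put_Line
     ("Euro_Sign position:     "
      & Integer'Image (Character'Pos (Latin_9.Euro_Sign)));
   Ada.Text_IO.Put_Line
     ("Currency_Sign position: "
      & Integer'Image (Character'Pos (Latin_1.Currency_Sign)));
   Ada.Text_IO.Put_Line
     ("Same value: "
      & Boolean'Image (Latin_9.Euro_Sign = Latin_1.Currency_Sign));
end Same_Code_Point;
```

Because type Character has only 256 values and carries no record of which
ISO 8859 part was intended, a set built from the range Nul .. Lc_Y_Diaeresis
necessarily contains every value that a Latin_9 name can denote.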

>    T_Io.Put_Line("Is_In_Character_Set is " & Boolean'Image
> (Is_In_Character_Set));
>    ---------------------------------------------   
>    ---------------------------------------------
>    T_Io.Put_Line("-----------------------Ending
> Test----------------------);
> 
> exception
>    when A: Ada.Io_Exceptions.Status_Error =>
>       T_io.Put_Line("Status_Error in Char_Sets_Test.");
>       T_Io.Put_Line(Ada.Exceptions.Exception_Information(A));
>    when O: others =>
>       T_Io.Put_Line("Others_Error in Char_Sets_Test.");
>       T_Io.Put_Line(Ada.Exceptions.Exception_Information(O));
> end Char_Sets_Test;




* Re: Character Sets
@ 2002-11-28 17:53 Robert C. Leif
  2002-11-28 18:08 ` Character Sets (plain text police report) Warren W. Gay VE3WWG
                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Robert C. Leif @ 2002-11-28 17:53 UTC (permalink / raw)


Christoph Grein responded to my inquiry by stating that
"Latin_9.Euro_Sign is a name for a character. The same character in Latin_1 has a different name, it is the Currency_Sign."
"So why do you expect this character not to be in the set only because you use a different name for it?"
The Euro_Sign and the Currency_Sign have different representations according to The ISO 8859 Alphabet Soup, http://czyborra.com/charsets/iso8859.html
GNAT Latin_9 (ISO-8859-15) includes the following:
   ------------------------------------------------
   -- Summary of Changes from Latin-1 => Latin-9 --
   ------------------------------------------------

   --   164     Currency                => Euro_Sign
   --   166     Broken_Bar              => UC_S_Caron
   --   168     Diaeresis               => LC_S_Caron
   --   180     Acute                   => UC_Z_Caron
   --   184     Cedilla                 => LC_Z_Caron
   --   188     Fraction_One_Quarter    => UC_Ligature_OE
   --   189     Fraction_One_Half       => LC_Ligature_OE
   --   190     Fraction_Three_Quarters => UC_Y_Diaeresis
Since these are changes, they should not be the same character.
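
The eight changed positions in the table above can be collected in a
Character_Set of their own. This is only a sketch; the names Changed_Positions
and Changed are made up for illustration. Such a set records which positions
have a different meaning in the two standards. It cannot tell which standard a
given Character value was meant in, since the value itself is just a number
from 0 to 255.

```ada
with Ada.Text_IO;
with Ada.Strings.Maps;

procedure Changed_Positions is
   package Maps renames Ada.Strings.Maps;

   --  Positions taken from the "Summary of Changes" table above.
   Changed : constant Maps.Character_Set :=
     Maps.To_Set
       (Maps.Character_Sequence'
          (Character'Val (164), Character'Val (166),
           Character'Val (168), Character'Val (180),
           Character'Val (184), Character'Val (188),
           Character'Val (189), Character'Val (190)));
begin
   --  Walk the same span the test program covers and report the changes.
   for Pos in 164 .. 190 loop
      if Maps.Is_In (Character'Val (Pos), Changed) then
         Ada.Text_IO.Put_Line
           (Integer'Image (Pos) & " differs between Latin-1 and Latin-9");
      end if;
   end loop;
end Changed_Positions;
```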
Below are the results of an extension of my original program that now tests the characters of Latin_9 from position 164 through 190 and prints them out. I understand that the choice of Windows font will change how they are displayed. The correct glyphs can be found at The ISO 8859 Alphabet Soup. For anyone interested, I have put my program at the end of this note.
I suspect that the best solution would be to introduce Unicode, ISO/IEC 10646, into the Ada standard. The arguments for this are contained in the W3C Character Model for the World Wide Web 1.0, W3C Working Draft 30 April 2002
http://www.w3.org/TR/charmod/
"The choice of Unicode was motivated by the fact that Unicode: is the only universal character repertoire available, covers the widest possible range, provides a way of referencing characters independent of the encoding of a resource, is being updated/completed carefully, is widely accepted and implemented by industry."
"W3C adopted Unicode as the document character set for HTML in [HTML 4.0]. The same approach was later used for specifications such as XML 1.0 [XML 1.0] and CSS2 [CSS2]. Unicode now serves as a common reference for W3C specifications and applications."
"The IETF has adopted some policies on the use of character sets on the Internet (see [RFC 2277])."
Bob Leif
------------------------Starting Test-----------------------
Latin_9_Diff is ñÑªº¿⌐¬½¼¡«»░▒▓│┤╡╢╖╕╣║╗╝╜╛

The Character ñ is in Latin_1 is TRUE. Its position is  164
The Character Ñ is in Latin_1 is TRUE. Its position is  165
The Character ª is in Latin_1 is TRUE. Its position is  166
The Character º is in Latin_1 is TRUE. Its position is  167
The Character ¿ is in Latin_1 is TRUE. Its position is  168
The Character ⌐ is in Latin_1 is TRUE. Its position is  169
The Character ¬ is in Latin_1 is TRUE. Its position is  170
The Character ½ is in Latin_1 is TRUE. Its position is  171
The Character ¼ is in Latin_1 is TRUE. Its position is  172
The Character ¡ is in Latin_1 is TRUE. Its position is  173
The Character « is in Latin_1 is TRUE. Its position is  174
The Character » is in Latin_1 is TRUE. Its position is  175
The Character ░ is in Latin_1 is TRUE. Its position is  176
The Character ▒ is in Latin_1 is TRUE. Its position is  177
The Character ▓ is in Latin_1 is TRUE. Its position is  178
The Character │ is in Latin_1 is TRUE. Its position is  179
The Character ┤ is in Latin_1 is TRUE. Its position is  180
The Character ╡ is in Latin_1 is TRUE. Its position is  181
The Character ╢ is in Latin_1 is TRUE. Its position is  182
The Character ╖ is in Latin_1 is TRUE. Its position is  183
The Character ╕ is in Latin_1 is TRUE. Its position is  184
The Character ╣ is in Latin_1 is TRUE. Its position is  185
The Character ║ is in Latin_1 is TRUE. Its position is  186
The Character ╗ is in Latin_1 is TRUE. Its position is  187
The Character ╝ is in Latin_1 is TRUE. Its position is  188
The Character ╜ is in Latin_1 is TRUE. Its position is  189
The Character ╛ is in Latin_1 is TRUE. Its position is  190
------------------------Ending Test-----------------------
--Robert C. Leif, Ph.D & Ada_Med Copyright all rights reserved.
--Main Procedure 
--Created 27 November 2002
with Ada.Text_Io;
with Ada.Io_Exceptions;
with Ada.Exceptions;
with Ada.Strings;
with Ada.Strings.Maps;
with  Ada.Characters.Latin_1;
with  Ada.Characters.Latin_9;
procedure Char_Sets_Test is 
   ------------------Table of Contents------------- 
   package T_Io renames Ada.Text_Io;
   package Str_Maps renames Ada.Strings.Maps;
   package Latin_1 renames Ada.Characters.Latin_1;
   package Latin_9 renames Ada.Characters.Latin_9;
   subtype Character_Set_Type is Str_Maps.Character_Set;
   subtype Character_Sequence_Type is Str_Maps.Character_Sequence;

   -----------------End Table of Contents-------------
   Latin_1_Range    : constant Str_Maps.Character_Range
      := (Low => Latin_1.Nul, High => Latin_1.Lc_Y_Diaeresis);  
   Latin_1_Char_Set :          Character_Set_Type      
      := Str_Maps.To_Set (Span => Latin_1_Range);  
   --Standard for Ada '95
   -- Latin_9 Differences: Euro_Sign, Uc_S_Caron, Lc_S_Caron, Uc_Z_Caron, 
   -- Lc_Z_Caron, Uc_Ligature_Oe, Lc_Ligature_Oe, Uc_Y_Diaeresis.
   Latin_9_Diff_Latin_1_Super_Range  : constant Str_Maps.Character_Range
      := (Low => Latin_9.Euro_Sign, High => Latin_9.Uc_Y_Diaeresis);  
   Latin_9_Diff_Latin_1_Super_Set    :          Character_Set_Type      
      := Str_Maps.To_Set (Span => Latin_9_Diff_Latin_1_Super_Range);  
   Latin_9_Diff_Latin_1_Super_String :          Character_Sequence_Type 
      := Str_Maps.To_Sequence (Latin_9_Diff_Latin_1_Super_Set);  
   Character_Set_Name                :          String                 
      := "Latin_1";  
   ---------------------------------------------   
   procedure Test_Character_Sets (
         Character_Sequence_Var : in     Character_Sequence_Type; 
         Set                    : in     Character_Set_Type       ) is 
      Is_In_Character_Set : Boolean   := False;  
      Char                : Character := 'X';  
      Character_Set_Position : Positive := 164; -- Euro_Sign   
   begin--Test_Character_Sets
      T_Io.Put_Line("Latin_9_Diff is " & Latin_9_Diff_Latin_1_Super_String);
      T_Io.Put_Line("");
      Test_Chars:
         for I in Character_Sequence_Var'range loop
         Char:= Character_Sequence_Var(I);
         Is_In_Character_Set:= Str_Maps.Is_In(
            Element => Char,            
            Set     => Set);  -- use the parameter rather than the global set
         T_Io.Put_Line("The Character " & Char & " is in " & Character_Set_Name
            &  " is " & Boolean'Image (
               Is_In_Character_Set) & ". Its position is "
                  & Positive'Image(Character_Set_Position));
         Character_Set_Position:= Character_Set_Position + 1;
      end loop Test_Chars;
   end Test_Character_Sets;
   ---------------------------------------------     
begin--Char_Sets_Test
   T_Io.Put_Line("----------------------Starting Test---------------------");
   Test_Character_Sets (
      Character_Sequence_Var => Latin_9_Diff_Latin_1_Super_String, 
      Set                    => Latin_1_Char_Set);
   ---------------------------------------------
   T_Io.Put_Line("------------------------Ending Test---------------------");

exception
   when A: Ada.Io_Exceptions.Status_Error =>
      T_Io.Put_Line("Status_Error in Char_Sets_Test.");
      T_Io.Put_Line(Ada.Exceptions.Exception_Information(A));
   when O: others =>
      T_Io.Put_Line("Others_Error in Char_Sets_Test.");
      T_Io.Put_Line(Ada.Exceptions.Exception_Information(O));

end Char_Sets_Test;





* Re: Character Sets (plain text police report)
  2002-11-28 17:53 Character Sets Robert C. Leif
@ 2002-11-28 18:08 ` Warren W. Gay VE3WWG
  2002-11-28 18:11   ` Warren W. Gay VE3WWG
  2002-11-29 20:37   ` Robert C. Leif
  2002-11-29 12:28 ` Character Sets Georg Bauhaus
  2002-12-02 18:28 ` Stephen Leake
  2 siblings, 2 replies; 29+ messages in thread
From: Warren W. Gay VE3WWG @ 2002-11-28 18:08 UTC (permalink / raw)


Hmmm... I guess since Robert Dewar is avoiding this group these
days, we also lost our "plain text" police force ;-)

In case you were not aware of it, you are posting HTML to this
newsgroup. This is generally discouraged so that others who
are not using HTML-capable news readers are still able to make
sense of your posting.

Robert C. Leif wrote:
> Christoph Grein responded to my inquiry by stating that,
> " Latin_9.Euro_Sign is a name for a character. The same character in Latin_1 has a different name, it is the Currency_Sign."
> "So why do you expect this character not to be in the set only because you use a different name for it?"
> The Euro_Sign and the Currency_Sign have a different representation according to The ISO 8859 Alphabet Soup http://czyborra.com/charsets/iso8859.html
> ------------------------------------------------
> GNAT Latin_9 (ISO-8859-15)includes the following:
>    -- Summary of Changes from Latin-1 => Latin-9 --
>    ------------------------------------------------
...
> end Char_Sets_Test;

-- 
Warren W. Gay VE3WWG
http://home.cogeco.ca/~ve3wwg





* Re: Character Sets (plain text police report)
  2002-11-28 18:08 ` Character Sets (plain text police report) Warren W. Gay VE3WWG
@ 2002-11-28 18:11   ` Warren W. Gay VE3WWG
  2002-11-29 11:12     ` Lutz Donnerhacke
  2002-11-29 20:37   ` Robert C. Leif
  1 sibling, 1 reply; 29+ messages in thread
From: Warren W. Gay VE3WWG @ 2002-11-28 18:11 UTC (permalink / raw)


ARRRG!!!

It seems that Netscape 7 replies to HTML in HTML!  Grrr!  That
means this message will likely be HTML as well...

Now _I_ must apologize! ;-)

Warren W. Gay VE3WWG wrote:
> Hmmm... I guess since Robert Dewar is avoiding this group these
> days, we also lost our "plain text" police force ;-)
> 
> In case you were not aware of it, you are posting HTML to this
> news group. This is generally discouraged so that others who
> are not using HTML capable news readers, are still able to make
> sense of your posting.
> 
> Robert C. Leif wrote:
> 
>> Christoph Grein responded to my inquiry by stating that,
>> " Latin_9.Euro_Sign is a name for a character. The same character in 
>> Latin_1 has a different name, it is the Currency_Sign."
>> "So why do you expect this character not to be in the set only because 
>> you use a different name for it?"
>> The Euro_Sign and the Currency_Sign have a different representation 
>> according to The ISO 8859 Alphabet Soup 
>> http://czyborra.com/charsets/iso8859.html
>> ------------------------------------------------
>> GNAT Latin_9 (ISO-8859-15)includes the following:
>>    -- Summary of Changes from Latin-1 => Latin-9 --
>>    ------------------------------------------------
> 
> ....
> 
>> end Char_Sets_Test;

-- 
Warren W. Gay VE3WWG
http://home.cogeco.ca/~ve3wwg





* Re: Character Sets (plain text police report)
  2002-11-28 18:11   ` Warren W. Gay VE3WWG
@ 2002-11-29 11:12     ` Lutz Donnerhacke
  2002-11-29 14:58       ` Frank J. Lhota
  0 siblings, 1 reply; 29+ messages in thread
From: Lutz Donnerhacke @ 2002-11-29 11:12 UTC (permalink / raw)


* Warren W. Gay VE3WWG wrote:
> It seems that Netscape 7 replies to HTML in HTML!  Grrr!  That
> means this message will likely be HTML as well...

No. There is no HTML message in the whole thread.




* Re: Character Sets
  2002-11-28 17:53 Character Sets Robert C. Leif
  2002-11-28 18:08 ` Character Sets (plain text police report) Warren W. Gay VE3WWG
@ 2002-11-29 12:28 ` Georg Bauhaus
  2002-12-02 18:28 ` Stephen Leake
  2 siblings, 0 replies; 29+ messages in thread
From: Georg Bauhaus @ 2002-11-29 12:28 UTC (permalink / raw)


Robert C. Leif <rleif@rleif.com> wrote:
: I suspect that the best solution would be to introduce Unicode,
: ISO/IEC 10646, into the Ada standard. The arguments for this are
: contained in W3C Character Model for the World Wide Web 1.0, W3C
: Working Draft 30 April 2002

Yes, and with Wide_String you can have the Basic Multilingual Plane,
as per ISO 10646.  There is at least one compiler with support for
different wide character encodings.
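
As a sketch of what that buys for the original question: in ISO 10646 the
Euro sign (U+20AC) and the currency sign (U+00A4) are distinct
Wide_Character values, so a membership test against the Latin-1 range can
tell them apart. The procedure and object names below are made up for
illustration; the calls are the standard ones from Ada.Strings.Wide_Maps.

```ada
with Ada.Text_IO;
with Ada.Strings.Wide_Maps;

procedure Wide_Euro_Test is
   package Wide_Maps renames Ada.Strings.Wide_Maps;

   --  ISO/IEC 10646 positions: U+20AC EURO SIGN, U+00A4 CURRENCY SIGN.
   Euro     : constant Wide_Character := Wide_Character'Val (16#20AC#);
   Currency : constant Wide_Character := Wide_Character'Val (16#00A4#);

   --  The first 256 positions of the BMP coincide with Latin-1.
   Latin_1_Set : constant Wide_Maps.Wide_Character_Set :=
     Wide_Maps.To_Set
       (Wide_Maps.Wide_Character_Range'
          (Low  => Wide_Character'Val (0),
           High => Wide_Character'Val (255)));
begin
   Ada.Text_IO.Put_Line
     ("Euro in Latin-1 range:     "
      & Boolean'Image (Wide_Maps.Is_In (Euro, Latin_1_Set)));
   Ada.Text_IO.Put_Line
     ("Currency in Latin-1 range: "
      & Boolean'Image (Wide_Maps.Is_In (Currency, Latin_1_Set)));
end Wide_Euro_Test;
```

Here the first test is expected to yield FALSE and the second TRUE, which is
the distinction the 8-bit Character version cannot make.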

-- georg




* Re: Character Sets (plain text police report)
  2002-11-29 11:12     ` Lutz Donnerhacke
@ 2002-11-29 14:58       ` Frank J. Lhota
  0 siblings, 0 replies; 29+ messages in thread
From: Frank J. Lhota @ 2002-11-29 14:58 UTC (permalink / raw)


Yes, there are HTML posts in this thread, although the only use of HTML is
to set the font.






* RE: Character Sets (plain text police report)
  2002-11-28 18:08 ` Character Sets (plain text police report) Warren W. Gay VE3WWG
  2002-11-28 18:11   ` Warren W. Gay VE3WWG
@ 2002-11-29 20:37   ` Robert C. Leif
  2002-11-30 14:49     ` Marin David Condic
  2002-12-02 19:29     ` A suggestion, completely unrelated to the original topic Wes Groleau
  1 sibling, 2 replies; 29+ messages in thread
From: Robert C. Leif @ 2002-11-29 20:37 UTC (permalink / raw)


Oops. My apologies.
Bob Leif
The correct text version is below. 
Addendum: The solution is the creation of versions of Ada.Strings.Bounded for 16- and 32-bit characters. The 32-bit Unicode characters allow direct comparison of characters based on their position in Unicode.

-----Original Message-----
From: comp.lang.ada-admin@ada.eu.org [mailto:comp.lang.ada-admin@ada.eu.org] On Behalf Of Warren W. Gay VE3WWG
Sent: Thursday, November 28, 2002 10:09 AM
To: comp.lang.ada@ada.eu.org
Subject: Re: Character Sets (plain text police report)

Hmmm... I guess since Robert Dewar is avoiding this group these
days, we also lost our "plain text" police force ;-)

In case you were not aware of it, you are posting HTML to this
news group. This is generally discouraged so that others who
are not using HTML capable news readers, are still able to make
sense of your posting.
--------------------------------------------------------
Christoph Grein responded to my inquiry by stating that
"Latin_9.Euro_Sign is a name for a character. The same character in Latin_1 has a different name, it is the Currency_Sign."
"So why do you expect this character not to be in the set only because you use a different name for it?"
The Euro_Sign and the Currency_Sign have different representations according to The ISO 8859 Alphabet Soup, http://czyborra.com/charsets/iso8859.html
GNAT Latin_9 (ISO-8859-15) includes the following:
   ------------------------------------------------
   -- Summary of Changes from Latin-1 => Latin-9 --
   ------------------------------------------------

   --   164     Currency                => Euro_Sign
   --   166     Broken_Bar              => UC_S_Caron
   --   168     Diaeresis               => LC_S_Caron
   --   180     Acute                   => UC_Z_Caron
   --   184     Cedilla                 => LC_Z_Caron
   --   188     Fraction_One_Quarter    => UC_Ligature_OE
   --   189     Fraction_One_Half       => LC_Ligature_OE
   --   190     Fraction_Three_Quarters => UC_Y_Diaeresis
Since these are changes, they should not be the same character.
Below are the results of an extension of my original program that now tests the characters of Latin_9 from position 164 through 190 and prints them out. I understand that the choice of Windows font will change how they are displayed. The correct glyphs can be found at The ISO 8859 Alphabet Soup. For anyone interested, I have put my program at the end of this note.
I suspect that the best solution would be to introduce Unicode, ISO/IEC 10646, into the Ada standard. The arguments for this are contained in the W3C Character Model for the World Wide Web 1.0, W3C Working Draft 30 April 2002
http://www.w3.org/TR/charmod/
"The choice of Unicode was motivated by the fact that Unicode: is the only universal character repertoire available, covers the widest possible range, provides a way of referencing characters independent of the encoding of a resource, is being updated/completed carefully, is widely accepted and implemented by industry."
"W3C adopted Unicode as the document character set for HTML in [HTML 4.0]. The same approach was later used for specifications such as XML 1.0 [XML 1.0] and CSS2 [CSS2]. Unicode now serves as a common reference for W3C specifications and applications."
"The IETF has adopted some policies on the use of character sets on the Internet (see [RFC 2277])."
Bob Leif
------------------------Starting Test-----------------------
Latin_9_Diff is ñÑªº¿⌐¬½¼¡«»░▒▓│┤╡╢╖╕╣║╗╝╜╛

The Character ñ is in Latin_1 is TRUE. Its position is  164
The Character Ñ is in Latin_1 is TRUE. Its position is  165
The Character ª is in Latin_1 is TRUE. Its position is  166
The Character º is in Latin_1 is TRUE. Its position is  167
The Character ¿ is in Latin_1 is TRUE. Its position is  168
The Character ⌐ is in Latin_1 is TRUE. Its position is  169
The Character ¬ is in Latin_1 is TRUE. Its position is  170
The Character ½ is in Latin_1 is TRUE. Its position is  171
The Character ¼ is in Latin_1 is TRUE. Its position is  172
The Character ¡ is in Latin_1 is TRUE. Its position is  173
The Character « is in Latin_1 is TRUE. Its position is  174
The Character » is in Latin_1 is TRUE. Its position is  175
The Character ░ is in Latin_1 is TRUE. Its position is  176
The Character ▒ is in Latin_1 is TRUE. Its position is  177
The Character ▓ is in Latin_1 is TRUE. Its position is  178
The Character │ is in Latin_1 is TRUE. Its position is  179
The Character ┤ is in Latin_1 is TRUE. Its position is  180
The Character ╡ is in Latin_1 is TRUE. Its position is  181
The Character ╢ is in Latin_1 is TRUE. Its position is  182
The Character ╖ is in Latin_1 is TRUE. Its position is  183
The Character ╕ is in Latin_1 is TRUE. Its position is  184
The Character ╣ is in Latin_1 is TRUE. Its position is  185
The Character ║ is in Latin_1 is TRUE. Its position is  186
The Character ╗ is in Latin_1 is TRUE. Its position is  187
The Character ╝ is in Latin_1 is TRUE. Its position is  188
The Character ╜ is in Latin_1 is TRUE. Its position is  189
The Character ╛ is in Latin_1 is TRUE. Its position is  190
------------------------Ending Test-----------------------
--Robert C. Leif, Ph.D & Ada_Med Copyright all rights reserved.
--Main Procedure 
--Created 27 November 2002
with Ada.Text_Io;
with Ada.Io_Exceptions;
with Ada.Exceptions;
with Ada.Strings;
with Ada.Strings.Maps;
with  Ada.Characters.Latin_1;
with  Ada.Characters.Latin_9;
procedure Char_Sets_Test is 
   ------------------Table of Contents------------- 
   package T_Io renames Ada.Text_Io;
   package Str_Maps renames Ada.Strings.Maps;
   package Latin_1 renames Ada.Characters.Latin_1;
   package Latin_9 renames Ada.Characters.Latin_9;
   subtype Character_Set_Type is Str_Maps.Character_Set;
   subtype Character_Sequence_Type is Str_Maps.Character_Sequence;

   -----------------End Table of Contents-------------
   Latin_1_Range    : constant Str_Maps.Character_Range
      := (Low => Latin_1.Nul, High => Latin_1.Lc_Y_Diaeresis);  
   Latin_1_Char_Set :          Character_Set_Type      
      := Str_Maps.To_Set (Span => Latin_1_Range);  
   --Standard for Ada '95
   -- Latin_9 Differences: Euro_Sign, Uc_S_Caron, Lc_S_Caron, Uc_Z_Caron, 
   -- Lc_Z_Caron, Uc_Ligature_Oe, Lc_Ligature_Oe, Uc_Y_Diaeresis.
   Latin_9_Diff_Latin_1_Super_Range  : constant Str_Maps.Character_Range
      := (Low => Latin_9.Euro_Sign, High => Latin_9.Uc_Y_Diaeresis);  
   Latin_9_Diff_Latin_1_Super_Set    :          Character_Set_Type      
      := Str_Maps.To_Set (Span => Latin_9_Diff_Latin_1_Super_Range);  
   Latin_9_Diff_Latin_1_Super_String :          Character_Sequence_Type 
      := Str_Maps.To_Sequence (Latin_9_Diff_Latin_1_Super_Set);  
   Character_Set_Name                :          String                 
      := "Latin_1";  
   ---------------------------------------------   
   procedure Test_Character_Sets (
         Character_Sequence_Var : in     Character_Sequence_Type; 
         Set                    : in     Character_Set_Type       ) is 
      Is_In_Character_Set : Boolean   := False;  
      Char                : Character := 'X';  
      Character_Set_Position : Positive := 164; -- Euro_Sign   
   begin--Test_Character_Sets
      T_Io.Put_Line("Latin_9_Diff is " & Latin_9_Diff_Latin_1_Super_String);
      T_Io.Put_Line("");
      Test_Chars:
         for I in Character_Sequence_Var'range loop
         Char:= Character_Sequence_Var(I);
         Is_In_Character_Set:= Str_Maps.Is_In(
            Element => Char,            
            Set     => Set);  -- use the parameter rather than the global set
         T_Io.Put_Line("The Character " & Char & " is in " & Character_Set_Name
            &  " is " & Boolean'Image (
               Is_In_Character_Set) & ". Its position is "
                  & Positive'Image(Character_Set_Position));
         Character_Set_Position:= Character_Set_Position + 1;
      end loop Test_Chars;
   end Test_Character_Sets;
   ---------------------------------------------     
begin--Char_Sets_Test
   T_Io.Put_Line("----------------------Starting Test---------------------");
   Test_Character_Sets (
      Character_Sequence_Var => Latin_9_Diff_Latin_1_Super_String, 
      Set                    => Latin_1_Char_Set);
   ---------------------------------------------
   T_Io.Put_Line("------------------------Ending Test---------------------");

exception
   when A: Ada.Io_Exceptions.Status_Error =>
      T_Io.Put_Line("Status_Error in Char_Sets_Test.");
      T_Io.Put_Line(Ada.Exceptions.Exception_Information(A));
   when O: others =>
      T_Io.Put_Line("Others_Error in Char_Sets_Test.");
      T_Io.Put_Line(Ada.Exceptions.Exception_Information(O));

end Char_Sets_Test;





* Re: Character Sets (plain text police report)
  2002-11-29 20:37   ` Robert C. Leif
@ 2002-11-30 14:49     ` Marin David Condic
  2002-12-01 11:28       ` Jacob Sparre Andersen
  2002-12-02 19:29     ` A suggestion, completely unrelated to the original topic Wes Groleau
  1 sibling, 1 reply; 29+ messages in thread
From: Marin David Condic @ 2002-11-30 14:49 UTC (permalink / raw)


It might make an easy extension to the Ada standard to include 32-bit
Unicode. After all, it's pretty much just a matter of taking existing
packages and changing a few things so you could have Wide_Wide_Character.
The question is, would it have sufficient utility to make it worth the
effort? (Is there much use out there for 32-bit characters?)

Perhaps if some additional utility was piled on top of it so that reading a
text file, Ada would automatically determine what it was looking at and give
you back text in the proper size (create something like "Universal_String"
and a whole bunch of utilities around it so it would hold 8, 16 or 32-bit
characters depending on how it was loaded) - but I don't see how that could
be done for all text files. Only those that conformed to some other
standard, like XML, where you can determine from convention what sort of
characters will follow.
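
One such convention is a byte order mark at the start of the file. The
following is a rough sketch of that idea only, not a proposal from this
thread: the names Guess_Encoding_Demo, Encoding, B, Starts_With and Guess are
all made up, and a file without a mark simply comes back as Unknown, which is
exactly the limitation described above.

```ada
with Ada.Text_IO;

procedure Guess_Encoding_Demo is
   type Encoding is
     (UTF_8, UTF_16_BE, UTF_16_LE, UTF_32_BE, UTF_32_LE, Unknown);

   --  Shorthand for one octet stored in a Character.
   function B (Value : Natural) return Character is
   begin
      return Character'Val (Value);
   end B;

   function Starts_With (Text, Mark : String) return Boolean is
   begin
      return Text'Length >= Mark'Length
        and then Text (Text'First .. Text'First + Mark'Length - 1) = Mark;
   end Starts_With;

   --  The 4-octet marks are tested first because the UTF-32 LE mark
   --  begins with the same two octets as the UTF-16 LE mark.
   function Guess (Prefix : String) return Encoding is
   begin
      if Starts_With (Prefix, B (0) & B (0) & B (16#FE#) & B (16#FF#)) then
         return UTF_32_BE;
      elsif Starts_With (Prefix, B (16#FF#) & B (16#FE#) & B (0) & B (0)) then
         return UTF_32_LE;
      elsif Starts_With (Prefix, B (16#EF#) & B (16#BB#) & B (16#BF#)) then
         return UTF_8;
      elsif Starts_With (Prefix, B (16#FE#) & B (16#FF#)) then
         return UTF_16_BE;
      elsif Starts_With (Prefix, B (16#FF#) & B (16#FE#)) then
         return UTF_16_LE;
      else
         return Unknown;
      end if;
   end Guess;
begin
   --  First call: UTF-8 mark followed by text. Second call: no mark at all.
   Ada.Text_IO.Put_Line
     (Encoding'Image (Guess (B (16#EF#) & B (16#BB#) & B (16#BF#) & "<?xml")));
   Ada.Text_IO.Put_Line (Encoding'Image (Guess ("plain text")));
end Guess_Encoding_Demo;
```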

The concept is a little vague in my mind, but I could imagine how something
like this might be a useful idea for a standard Ada library. It really
doesn't require any fundamental changes to the language.

MDC
--
======================================================================
Marin David Condic
I work for: http://www.belcan.com/
My project is: http://www.jast.mil/

Send Replies To: m c o n d i c @ a c m . o r g

    "I'd trade it all for just a little more"
        --  Charles Montgomery Burns, [4F10]
======================================================================

Robert C. Leif <rleif@rleif.com> wrote in message
news:mailman.1038602282.10532.comp.lang.ada@ada.eu.org...
Oops. My apologies.
Bob Leif
The correct text version is below.
Addendum: The solution is the creation of versions of Ada.Strings.Bounded
for 16 and 32 bit characters. The 32 bit Unicode characters allow direct
comparison of characters based on their position in Unicode.







* Re: Character Sets (plain text police report)
  2002-11-30 14:49     ` Marin David Condic
@ 2002-12-01 11:28       ` Jacob Sparre Andersen
  2002-12-01 14:38         ` Marin David Condic
  0 siblings, 1 reply; 29+ messages in thread
From: Jacob Sparre Andersen @ 2002-12-01 11:28 UTC (permalink / raw)

Marin David Condic wrote:
> It might make an easy extension to the Ada standard to include 32-bit
> Unicode. After all, its pretty much just a matter of taking existing
> packages and changing a few things so you could have Wide_Wide_Character.
> The question is, would it have sufficient utility to make it worth the
> effort? (Is there much use out there for 32-bit characters?)

Maybe not directly (except in the Far East), but there 
is a rather large and growing indirect need for full support 
for ISO-10646.

In Europe people are starting to switch from ISO-8859 
encodings to the UTF-8 encoding of ISO-10646.  This means 
that although people in practice seldom will use more than 
the 470-something European characters, they will start to 
expect to have access to use all of ISO-10646.
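
To make the UTF-8 encoding concrete, here is a small sketch that turns one
ISO 10646 code position into its UTF-8 octets. The names UTF_8_Demo, To_UTF_8
and Octet are made up for illustration, and the function does no validation
(for instance, it does not reject surrogate positions); it only shows the bit
layout.

```ada
with Ada.Text_IO;

procedure UTF_8_Demo is
   --  Hypothetical helper: encode one code position as 1 to 4 octets.
   function To_UTF_8 (Code : Natural) return String is
      function Octet (Value : Natural) return Character is
      begin
         return Character'Val (Value);
      end Octet;
   begin
      if Code < 16#80# then
         return (1 => Octet (Code));
      elsif Code < 16#800# then
         return Octet (16#C0# + Code / 64)
           & Octet (16#80# + Code mod 64);
      elsif Code < 16#1_0000# then
         return Octet (16#E0# + Code / 4096)
           & Octet (16#80# + (Code / 64) mod 64)
           & Octet (16#80# + Code mod 64);
      else
         return Octet (16#F0# + Code / 262144)
           & Octet (16#80# + (Code / 4096) mod 64)
           & Octet (16#80# + (Code / 64) mod 64)
           & Octet (16#80# + Code mod 64);
      end if;
   end To_UTF_8;

   Euro : constant String := To_UTF_8 (16#20AC#);  --  U+20AC EURO SIGN
begin
   --  Prints the octet values 226 130 172, i.e. E2 82 AC in hexadecimal.
   for I in Euro'Range loop
      Ada.Text_IO.Put (Integer'Image (Character'Pos (Euro (I))));
   end loop;
   Ada.Text_IO.New_Line;
end UTF_8_Demo;
```

Positions below 128 stay one octet long, so plain ASCII text is unchanged,
while the Euro sign takes three octets instead of the single octet 164 it
occupies in Latin_9.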

> Perhaps if some additional utility was piled on top of it so that reading a
> text file, Ada would automatically determine what it was looking at and give
> you back text in the proper size (create something like "Universal_String"
> and a whole bunch of utilities around it so it would hold 8, 16 or 32-bit
> characters depending on how it was loaded) - but I don't see how that could
> be done for all text files.

Agreed.  One needs some kind of information about which 
encoding is used - but that is already the case.  The best 
solution I can think of is to demand that the operating 
system keeps track of the file type (including encoding for 
text files).  The second best solution is (IMHO) to 
introduce a sensible common standard encoding.  I don't know 
if it should be UTF-8 or raw 32-bit ISO-10646.  And I can 
certainly not advise people to use the current procedure on 
Unix systems, where each user chooses his/her assumed 
encoding of text files.

> The concept is a little vague in my mind, but I could imagine how something
> like this might be a useful idea for a standard Ada library. It really
> doesn't require any fundamental changes to the language.

No.  But it would be nice, if one could demand that 
compilers can handle UTF-8 or raw 32-bit ISO-10646 encoded 
source files.

Greetings,

Jacob
-- 
"I don't want to gain immortality in my works.
  I want to gain it by not dying."

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Character Sets (plain text police report)
  2002-12-01 11:28       ` Jacob Sparre Andersen
@ 2002-12-01 14:38         ` Marin David Condic
  2002-12-01 20:25           ` Jacob Sparre Andersen
                             ` (3 more replies)
  0 siblings, 4 replies; 29+ messages in thread
From: Marin David Condic @ 2002-12-01 14:38 UTC (permalink / raw)


Jacob Sparre Andersen <sparre@nbi.dk> wrote in message
news:3DE9F24E.3010002@nbi.dk...
> > effort? (Is there much use out there for 32-bit characters?)
>
> Maybe not directly (except for in the far east), but there
> is a rather large and growing indirect need for full support
> for ISO-10646.
>
My understanding was that the 16 bit characters covered most of the
practical uses one would find in modern languages. The reason for the 32 bit
characters was to provide for things that might be truly obscure (Egyptian
hieroglyphics and such) or other special character sets that may not be that
big a deal if Ada didn't support it.



> In Europe people are starting to switch from ISO-8859
> encodings to the UTF-8 encoding of ISO-10646.  This means
> that although people in practice seldom will use more than
> the 470-something European characters, they will start to
> expect to have access to use all of ISO-10646.
>
So possibly if there was some kind of variant of Text_IO that dealt with
UTF-8 files, it might be useful. You'd need special data types and
operations, but that wouldn't be insurmountable. Some set of packages that
would be wrapped around UTF-8 as an extension to Ada or part of a standard
Ada library might make sense.
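
To make the "special data types and operations" a little more
concrete, here is a minimal sketch of one such operation: encoding a
single ISO-10646 code point as UTF-8 bytes. The names Code_Point, Byte
and Encode are made up for this illustration and are not part of any
standard Ada package. The sketch does not reject the surrogate range
16#D800# .. 16#DFFF#, which a real library would have to do.

```ada
with Ada.Text_IO;

procedure UTF8_Encode_Demo is

   --  Hypothetical type for this sketch: any ISO-10646 code point.
   type Code_Point is range 0 .. 16#10FFFF#;

   --  Turn a value in 0 .. 255 into the Character at that position.
   function Byte (Value : Code_Point) return Character is
   begin
      return Character'Val (Value);
   end Byte;

   --  Encode one code point as a String of 1 to 4 UTF-8 bytes.
   function Encode (CP : Code_Point) return String is
   begin
      if CP < 16#80# then
         return (1 => Byte (CP));
      elsif CP < 16#800# then
         return (Byte (16#C0# + CP / 16#40#),
                 Byte (16#80# + CP mod 16#40#));
      elsif CP < 16#10000# then
         return (Byte (16#E0# + CP / 16#1000#),
                 Byte (16#80# + (CP / 16#40#) mod 16#40#),
                 Byte (16#80# + CP mod 16#40#));
      else
         return (Byte (16#F0# + CP / 16#40000#),
                 Byte (16#80# + (CP / 16#1000#) mod 16#40#),
                 Byte (16#80# + (CP / 16#40#) mod 16#40#),
                 Byte (16#80# + CP mod 16#40#));
      end if;
   end Encode;

   --  The Euro sign, which encodes as 16#E2# 16#82# 16#AC#.
   Euro : constant String := Encode (16#20AC#);

begin
   --  Print the position number of each encoded byte.
   for I in Euro'Range loop
      Ada.Text_IO.Put (Integer'Image (Character'Pos (Euro (I))));
   end loop;
   Ada.Text_IO.New_Line;
end UTF8_Encode_Demo;
```

A Text_IO variant would wrap something like this (plus the matching
decoder) around ordinary byte-oriented file I/O.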



>
> Agreed.  One needs some kind of information about which
> encoding is used - but that is already the case.  The best
> solution I can think of is to demand that the operating
> system keeps track of the file type (including encoding for
> text files).  The second best solution is (IMHO) to
> introduce a sensible common standard encoding.  I don't know
> if it should be UTF-8 or raw 32-bit ISO-10646.  And I can
> certainly not advise people to use the current procedure on
> Unix systems, where each user chooses his/her assumed
> encoding of text files.
>
You'd almost certainly want some indication from the OS that a file was a
UTF-8 file. The "Form" parameter in the Text_IO.Open procedure would be the
natural place to be specifying it, I'd think. Or if it was a set of new
packages, the underlying implementation would want a means of checking that
the file was of the appropriate type. The alternative is to dump it on the
user's head - as one generally must with Unix OS's since files there tend to
be viewed as a stream of bytes. "I ask you for a UTF-8 input file and if you
give me a relational database file, well, that's your tough luck..."
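
As a rough sketch of what passing the encoding through "Form" could
look like: the meaning of a Form string is implementation-defined, so
the value "WCEM=8" below is an assumption based on GNAT's documented
convention for selecting UTF-8 in Wide_Text_IO. Other compilers may
reject it (for instance by raising Use_Error) or use a different
spelling. The file name is hypothetical.

```ada
with Ada.Text_IO;
with Ada.Wide_Text_IO;

procedure Form_Demo is
   package WIO renames Ada.Wide_Text_IO;

   --  Hypothetical file name, used only for this illustration.
   Name : constant String := "utf8_demo.txt";

   --  Assumption: GNAT-style Form string selecting UTF-8 for wide
   --  characters. The meaning of Form is implementation-defined.
   UTF_8_Form : constant String := "WCEM=8";

   File : WIO.File_Type;
   Line : Wide_String (1 .. 80);
   Last : Natural;
begin
   --  Write one line that ends with the Euro sign (16#20AC#).
   WIO.Create (File, WIO.Out_File, Name, UTF_8_Form);
   WIO.Put_Line (File, "Price: 10 " & Wide_Character'Val (16#20AC#));
   WIO.Close (File);

   --  Read it back, again telling the run-time which encoding to expect.
   WIO.Open (File, WIO.In_File, Name, UTF_8_Form);
   WIO.Get_Line (File, Line, Last);
   WIO.Close (File);

   Ada.Text_IO.Put_Line ("Characters read:" & Natural'Image (Last));
end Form_Demo;
```

Nothing here checks that the file really is UTF-8; as noted above,
that part is still left on the user's head.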



>
> No.  But it would be nice, if one could demand that
> compilers can handle UTF-8 or raw 32-bit ISO-10646 encoded
> source files.
>
That sounds like an implementation issue. (You're talking about the Ada
compiler eating Ada source that is in UTF-8? No reason that can't be done
without a language revision.) Otherwise, I'd think you could provide all the
tools by creating a Wide_Wide_Character and Wide_Wide_String type and
providing all the customary packages that would involve. From there,
additional utility probably should come from a standard Ada library so that
it could be enhanced and extended without formal language revision.

MDC
--
======================================================================
Marin David Condic
I work for: http://www.belcan.com/
My project is: http://www.jast.mil/

Send Replies To: m c o n d i c @ a c m . o r g

    "I'd trade it all for just a little more"
        --  Charles Montgomery Burns, [4F10]
======================================================================






^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Character Sets (plain text police report)
  2002-12-01 14:38         ` Marin David Condic
@ 2002-12-01 20:25           ` Jacob Sparre Andersen
  2002-12-02  9:43             ` Preben Randhol
  2002-12-02  6:44           ` Robert C. Leif
                             ` (2 subsequent siblings)
  3 siblings, 1 reply; 29+ messages in thread
From: Jacob Sparre Andersen @ 2002-12-01 20:25 UTC (permalink / raw)


Marin David Condic wrote:
> Jacob Sparre Andersen <sparre@nbi.dk> wrote in message
> news:3DE9F24E.3010002@nbi.dk...


> My understanding was that the 16 bit characters covered most of the
> practical uses one would find in modern languages.

There is, as I understand it, some disagreement with that 
explanation, if you ask a native Korean, Chinese or Japanese.

> So possibly if there was some kind of variant of Text_IO that dealt with
> UTF-8 files, it might be useful.

Yes, but that is something I should be able to write myself.

> You'd need special data types and
> operations, but that wouldn't be insurmountable.

Agreed.

>>But it would be nice, if one could demand that
>>compilers can handle UTF-8 or raw 32-bit ISO-10646 encoded
>>source files.
> 
> That sounds like an implementation issue.

Yes.

> (You're talking about the Ada
> compiler eating Ada source that is in UTF-8? No reason that can't be done
> without a language revision.)

Yes and yes.

Jacob
-- 
"The point is that I am now a perfectly safe penguin!"




^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: Character Sets (plain text police report)
  2002-12-01 14:38         ` Marin David Condic
  2002-12-01 20:25           ` Jacob Sparre Andersen
@ 2002-12-02  6:44           ` Robert C. Leif
  2002-12-02  9:41           ` Preben Randhol
  2002-12-02 16:58           ` Charles Lindsey
  3 siblings, 0 replies; 29+ messages in thread
From: Robert C. Leif @ 2002-12-02  6:44 UTC (permalink / raw)


The problem is that wide character versions of Ada.Strings.Bounded and
other string packages were not included in Ada 95. Although I have
nothing against extending Text_Io, I strongly believe that many future
applications will be based on XML. The simplest solution is to create an
Ada API interface to the XML languages. These XML languages constitute a
very rich GUI environment. The packages for this API can be also used to
extend Text_Io. 
In principle, the equivalent of XML could be created in Ada; and
probably would be better. Unfortunately, this is, at present, not
economically feasible. However, Ada could drive all or part of an XML
based windowing system. Since one can create XML schema with very close
to Ada semantics, the Ada community should take advantage of this. I
will talk on this subject at SIGAda 2002.
Parenthetically, an Ada.Strings.Bounded with the character size as a
generic type could permit the creation of 4 bit characters, Char_4.
These Char_4s would be an elegant coding for DNA and RNA base sequences.

Bob Leif








^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Character Sets (plain text police report)
  2002-12-01 14:38         ` Marin David Condic
  2002-12-01 20:25           ` Jacob Sparre Andersen
  2002-12-02  6:44           ` Robert C. Leif
@ 2002-12-02  9:41           ` Preben Randhol
  2002-12-02 16:58           ` Charles Lindsey
  3 siblings, 0 replies; 29+ messages in thread
From: Preben Randhol @ 2002-12-02  9:41 UTC (permalink / raw)


Marin David Condic wrote:
> My understanding was that the 16 bit characters covered most of the
> practical uses one would find in modern languages. The reason for the
                                   ^^^^^^^^^^^^^^^^
                                   And modern languages here means?

> So possibly if there was some kind of variant of Text_IO that dealt
> with UTF-8 files, it might be useful. You'd need special data types
> and operations, but that wouldn't be insurmountable. Some set of
> packages that would be wrapped around UTF-8 as an extension to Ada or
> part of a standard Ada library might make sense.

That would be very useful yes.
-- 
Preben Randhol ------------------------ http://www.pvv.org/~randhol/ --
                          «1984 is soon coming to a computer near you.»



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Character Sets (plain text police report)
  2002-12-01 20:25           ` Jacob Sparre Andersen
@ 2002-12-02  9:43             ` Preben Randhol
  2002-12-02 13:26               ` Marin David Condic
  0 siblings, 1 reply; 29+ messages in thread
From: Preben Randhol @ 2002-12-02  9:43 UTC (permalink / raw)


Jacob Sparre Andersen wrote:
> Marin David Condic wrote:
>> So possibly if there was some kind of variant of Text_IO that dealt
>> with UTF-8 files, it might be useful.
> 
> Yes, but that is something I should be able to write myself.

Make a library so I don't have to ;-)


-- 
Preben Randhol ------------------------ http://www.pvv.org/~randhol/ --
                          «1984 is soon coming to a computer near you.»



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Character Sets (plain text police report)
  2002-12-02  9:43             ` Preben Randhol
@ 2002-12-02 13:26               ` Marin David Condic
  0 siblings, 0 replies; 29+ messages in thread
From: Marin David Condic @ 2002-12-02 13:26 UTC (permalink / raw)

Exactly! As far as the Ada Standard goes, you should only need a
Wide_Wide_Character and all that goes with that. (Including
Wide_Wide_Text_IO). Beyond that, if one wished to deal with more exotic
forms of text files (Such as UTF-8 or XML), it would be best to provide a
standard Ada library component (separate from the Ada language standard) to
deal with it. Since it can be built out of core Ada components and the file
formats themselves might undergo changes, plus probably spotty support from
different operating systems, it's best to include this in a library that,
while being "Standard", isn't subject to the same rules and long revision
cycles of the Ada standard.

MDC
--
======================================================================
Marin David Condic
I work for: http://www.belcan.com/
My project is: http://www.jast.mil/

Send Replies To: m c o n d i c @ a c m . o r g

    "I'd trade it all for just a little more"
        --  Charles Montgomery Burns, [4F10]
======================================================================

Preben Randhol <randhol+news@pvv.org> wrote in message
news:slrnaumap6.en.randhol+news@kiuk0152.chembio.ntnu.no...
> Jacob Sparre Andersen wrote:
> > Marin David Condic wrote:
> >> So possibly if there was some kind of variant of Text_IO that dealt
> >> with UTF-8 files, it might be useful.
> >
> > Yes, but that is something I should be able to write myself.
>
> Make a library so I don't have to ;-)
>
>
> --
> Preben Randhol ------------------------ http://www.pvv.org/~randhol/ --
>                           «1984 is soon coming to a computer near you.»

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Character Sets (plain text police report)
  2002-12-01 14:38         ` Marin David Condic
                             ` (2 preceding siblings ...)
  2002-12-02  9:41           ` Preben Randhol
@ 2002-12-02 16:58           ` Charles Lindsey
  3 siblings, 0 replies; 29+ messages in thread
From: Charles Lindsey @ 2002-12-02 16:58 UTC (permalink / raw)

In <asd6tj$isb$1@slb2.atl.mindspring.net> "Marin David Condic" <mcondic.auntie.spam@acm.org> writes:

>My understanding was that the 16 bit characters covered most of the
>practical uses one would find in modern languages. The reason for the 32 bit
>characters was to provide for things that might be truly obscure (Egyptian
>hieroglyphics and such) or other special character sets that may not be that
>big a deal if Ada didn't support it.

Indeed so. If 16bit Wide-Characters are already available in Ada, then the
simplest thing is to use them as representing UTF-16. That would allow you
everything except the obscure Egyptian hieroglyphics (and they would
appear escaped, which would be messy).

You would use UTF-8 in external files (or whatever the OS wanted). Literal
strings in program source text would need some attention (ASCII characters
you just pad out to 16 bits).
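
To illustrate the "escaped" form: in UTF-16 a code point above
16#FFFF# is split into two 16-bit units (a surrogate pair). The sketch
below only does the arithmetic. The type names are made up for this
example, and 16#13000# is simply a code point outside the 16-bit range
picked for illustration.

```ada
with Ada.Text_IO;

procedure Surrogate_Demo is

   --  Hypothetical types for this sketch.
   type Code_Point  is range 0 .. 16#10FFFF#;
   type UTF_16_Unit is mod 2**16;

   --  A code point that does not fit in 16 bits.
   CP : constant Code_Point := 16#13000#;

   --  UTF-16 encodes CP - 16#10000# as two 10-bit halves.
   Offset : constant Code_Point := CP - 16#10000#;

   High : constant UTF_16_Unit :=
     UTF_16_Unit (16#D800# + Offset / 16#400#);    --  16#D80C#
   Low  : constant UTF_16_Unit :=
     UTF_16_Unit (16#DC00# + Offset mod 16#400#);  --  16#DC00#

begin
   --  Prints the two units in decimal.
   Ada.Text_IO.Put_Line
     (UTF_16_Unit'Image (High) & UTF_16_Unit'Image (Low));
end Surrogate_Demo;
```

The messy part is that a Wide_String holding such text no longer has
one element per character, so Length, indexing and slicing all need
care.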

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl@clw.cs.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Character Sets
  2002-11-28 17:53 Character Sets Robert C. Leif
  2002-11-28 18:08 ` Character Sets (plain text police report) Warren W. Gay VE3WWG
  2002-11-29 12:28 ` Character Sets Georg Bauhaus
@ 2002-12-02 18:28 ` Stephen Leake
  2002-12-03  2:45   ` Robert C. Leif
  2 siblings, 1 reply; 29+ messages in thread
From: Stephen Leake @ 2002-12-02 18:28 UTC (permalink / raw)


"Robert C. Leif" <rleif@rleif.com> writes:

> Christoph Grein responded to my inquiry by stating that,
> "Latin_9.Euro_Sign is a name for a character. The same character in
> Latin_1 has a different name, it is the Currency_Sign." "So why do
> you expect this character not to be in the set only because you use
> a different name for it?" The Euro_Sign and the Currency_Sign have a
> different representation according to The ISO 8859 Alphabet Soup
> http://czyborra.com/charsets/iso8859.html
>
> GNAT Latin_9 (ISO-8859-15) includes the following:
>
>    ------------------------------------------------
>    -- Summary of Changes from Latin-1 => Latin-9 --
>    ------------------------------------------------
> 
>    --   164     Currency                => Euro_Sign
>    --   166     Broken_Bar              => UC_S_Caron
>    --   168     Diaeresis               => LC_S_Caron
>    --   180     Acute                   => UC_Z_Caron
>    --   184     Cedilla                 => LC_Z_Caron
>    --   188     Fraction_One_Quarter    => UC_Ligature_OE
>    --   189     Fraction_One_Half       => LC_Ligature_OE
>    --   190     Fraction_Three_Quarters => UC_Y_Diaeresis

Hmm. This says to me:

"In the Latin-1 character set, the character with internal value 164
is called 'Currency'. In the Latin-9 character set, the character with
internal value 164 is called 'Euro_Sign'".

Presumably, elsewhere in the Latin-1 and Latin-9 standards, they
specify the "glyph" used to display those characters on a screen or
paper, and the glyph for character 164 is different between Latin-1
and Latin-9.

> Since these are changes, they should not be the same character.

By "same character", we (and Ada) mean "same internal value", ie
"164". However, I suspect you mean "same glyph", in which case they
are not the "same character"; they do not have the same glyph.
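
A small sketch that makes the point visible, assuming a compiler that
provides Ada.Characters.Latin_9 (GNAT does; the package is not part of
the Ada 95 standard library). Both names denote the value at position
164 of type Character, so as far as Ada is concerned they are equal.

```ada
with Ada.Text_IO;
with Ada.Characters.Latin_1;
with Ada.Characters.Latin_9;  --  Assumption: provided by the compiler (GNAT)

procedure Same_Code_Demo is
   package Latin_1 renames Ada.Characters.Latin_1;
   package Latin_9 renames Ada.Characters.Latin_9;
begin
   --  Both constants are plain values of type Character.
   Ada.Text_IO.Put_Line
     ("Currency_Sign position:" &
      Integer'Image (Character'Pos (Latin_1.Currency_Sign)));
   Ada.Text_IO.Put_Line
     ("Euro_Sign position:    " &
      Integer'Image (Character'Pos (Latin_9.Euro_Sign)));

   --  Equality compares position numbers, not glyphs.
   Ada.Text_IO.Put_Line
     ("Same Character value:   " &
      Boolean'Image (Latin_1.Currency_Sign = Latin_9.Euro_Sign));
end Same_Code_Demo;
```

This is why Is_In reports the Euro_Sign as a member of a set built
from the Latin_1 range: the set holds Character values, and 164 is one
of them.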

> Below are the results of an extension of my original program that
> now tests the characters of Latin_9 from character number 164
> through 190 and prints them out. 

What results would you like from this program?

> I understand that choice of the Windows font will change their
> representation.

Yes, because the choice of font determines the glyph.

> anyone interested, I have put my program at the end of this note. I
> suspect that the best solution would be to introduce UniCode,

I'm not clear what the "problem" is, so I can't tell if this is a
"solution". 

-- 
-- Stephe



^ permalink raw reply	[flat|nested] 29+ messages in thread

* A suggestion, completely unrelated to the original topic
  2002-11-29 20:37   ` Robert C. Leif
  2002-11-30 14:49     ` Marin David Condic
@ 2002-12-02 19:29     ` Wes Groleau
  2002-12-02 23:21       ` David C. Hoos, Sr.
  1 sibling, 1 reply; 29+ messages in thread
From: Wes Groleau @ 2002-12-02 19:29 UTC (permalink / raw)


When I see something like:

> exception
>    when A: Ada.Io_Exceptions.Status_Error =>
>       T_Io.Put_Line("Status_Error in Char_Sets_Test.");
>       T_Io.Put_Line(Ada.Exceptions.Exception_Information(A));
>    when O: others =>
>       T_Io.Put_Line("Others_Error in Char_Sets_Test.");
>       T_Io.Put_Line(Ada.Exceptions.Exception_Information(O));

I generally think it would be easier to use:

exception
    when E : others =>
       T_Io.Put_Line(Ada.Exceptions.Exception_Name(E) &
                     " in Char_Sets_Test.");
       T_Io.Put_Line(Ada.Exceptions.Exception_Information(E));




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: A suggestion, completely unrelated to the original topic
  2002-12-02 19:29     ` A suggestion, completely unrelated to the original topic Wes Groleau
@ 2002-12-02 23:21       ` David C. Hoos, Sr.
  0 siblings, 0 replies; 29+ messages in thread
From: David C. Hoos, Sr. @ 2002-12-02 23:21 UTC (permalink / raw)



"Wes Groleau" <wesgroleau@despammed.com> wrote in message
news:4wOG9.2144$c6.2494@bos-service2.ext.raytheon.com...
> When I see something like:
>
> > exception
> >    when A: Ada.Io_Exceptions.Status_Error =>
> >       T_Io.Put_Line("Status_Error in Char_Sets_Test.");
> >       T_Io.Put_Line(Ada.Exceptions.Exception_Information(A));
> >    when O: others =>
> >       T_Io.Put_Line("Others_Error in Char_Sets_Test.");
> >       T_Io.Put_Line(Ada.Exceptions.Exception_Information(O));
>
> I generally think it would be easier to use:
>
> exception
>     when E : others =>
>        T_Io.Put_Line(Ada.Exceptions.Exception_Name(E) &
>                      " in Char_Sets_Test.");
>        T_Io.Put_Line(Ada.Exceptions.Exception_Information(E));
>
Exception_Information repeats the Exception_Name if the implementation
advice is followed, so explicitly outputting the name is probably
redundant.  I think it's also good practice to write information about
exceptions to the Standard_Error file instead of the Standard_Output,
as this makes it easier for the execution environment (e.g., a shell)
to separate exception information from normal output.
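
A minimal sketch of that suggestion, using only Ada.Text_IO and
Ada.Exceptions. The procedure name and the deliberately raised
Constraint_Error are just for illustration.

```ada
with Ada.Text_IO;
with Ada.Exceptions;

procedure Report_Demo is
begin
   --  Force an exception so that the handler below runs.
   raise Constraint_Error;
exception
   when E : others =>
      --  Exception_Information normally includes the exception name,
      --  so it is not printed separately. The report goes to
      --  Standard_Error rather than Standard_Output.
      Ada.Text_IO.Put_Line
        (File => Ada.Text_IO.Standard_Error,
         Item => Ada.Exceptions.Exception_Information (E));
end Report_Demo;
```

Run from a shell, normal output and the exception report can then be
redirected independently.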

Just my $0.02 worth





^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: Character Sets
  2002-12-02 18:28 ` Stephen Leake
@ 2002-12-03  2:45   ` Robert C. Leif
  2002-12-03 13:33     ` Robert A Duff
  0 siblings, 1 reply; 29+ messages in thread
From: Robert C. Leif @ 2002-12-03  2:45 UTC (permalink / raw)

Since XML documents specify a character set, it would be useful to have
the equivalent in Ada. The common meaning of character refers to what
one sees on a screen or paper. If one only considers position, then
Latin-1 and Latin-9 are identical. 
I might note that I do not see how with Ada 95 one could directly create
a bounded string or unbounded string of wide characters?
Bob Leif

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Character Sets
  2002-12-03  2:45   ` Robert C. Leif
@ 2002-12-03 13:33     ` Robert A Duff
  2002-12-03 15:32       ` Juanma Barranquero
  2002-12-04  0:49       ` Robert C. Leif
  0 siblings, 2 replies; 29+ messages in thread
From: Robert A Duff @ 2002-12-03 13:33 UTC (permalink / raw)

"Robert C. Leif" <rleif@rleif.com> writes:

> I might note that I do not see how with Ada 95 one could directly create
> a bounded string or unbounded string of wide characters?

Umm, you could use the Strings.Wide_Bounded and Strings.Wide_Unbounded
packages.  ;-)

These are documented in RM-A.4.7.
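
For example, a short sketch using both packages. The instance name
Wide_80 and the bound of 80 are arbitrary choices for this
illustration.

```ada
with Ada.Text_IO;
with Ada.Strings.Wide_Bounded;
with Ada.Strings.Wide_Unbounded;

procedure Wide_Strings_Demo is

   --  Bounded wide strings of at most 80 characters.
   package Wide_80 is
     new Ada.Strings.Wide_Bounded.Generic_Bounded_Length (Max => 80);

   package WU renames Ada.Strings.Wide_Unbounded;

   Euro : constant Wide_Character := Wide_Character'Val (16#20AC#);

   B : Wide_80.Bounded_Wide_String :=
     Wide_80.To_Bounded_Wide_String ("Bounded: ");
   U : WU.Unbounded_Wide_String :=
     WU.To_Unbounded_Wide_String ("Unbounded: ");

begin
   --  Append a character that does not exist in Latin-1.
   Wide_80.Append (B, Euro);
   WU.Append (U, Euro);

   --  Report lengths with plain Text_IO to avoid console encoding issues.
   Ada.Text_IO.Put_Line
     ("Bounded length:  " & Natural'Image (Wide_80.Length (B)));
   Ada.Text_IO.Put_Line
     ("Unbounded length:" & Natural'Image (WU.Length (U)));
end Wide_Strings_Demo;
```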

There is also an AI in the works, having something to do with 32-bit
characters.  I don't remember the AI number.

- Bob

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Character Sets
  2002-12-03 13:33     ` Robert A Duff
@ 2002-12-03 15:32       ` Juanma Barranquero
  2002-12-04  0:49       ` Robert C. Leif
  1 sibling, 0 replies; 29+ messages in thread
From: Juanma Barranquero @ 2002-12-03 15:32 UTC (permalink / raw)


On Tue, 3 Dec 2002 13:33:24 GMT, Robert A Duff
<bobduff@shell01.TheWorld.com> wrote:

>There is also an AI in the works, having something to do with 32-bit
>characters.  I don't remember the AI number.

AI-00285, perhaps:

!subject Latin-9, Ada.Characters.Handling, and 32-bit characters


                                                      /L/e/k/t/u




^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: Character Sets
  2002-12-03 13:33     ` Robert A Duff
  2002-12-03 15:32       ` Juanma Barranquero
@ 2002-12-04  0:49       ` Robert C. Leif
  2002-12-14  3:27         ` David Starner
  1 sibling, 1 reply; 29+ messages in thread
From: Robert C. Leif @ 2002-12-04  0:49 UTC (permalink / raw)


Many thanks,
Bob Leif





^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Character Sets
  2002-12-04  0:49       ` Robert C. Leif
@ 2002-12-14  3:27         ` David Starner
  2002-12-14 22:53           ` Vadim Godunko
  0 siblings, 1 reply; 29+ messages in thread
From: David Starner @ 2002-12-14  3:27 UTC (permalink / raw)


> There is also an AI in the works, having something to do with 32-bit
> characters.  I don't remember the AI number.

In response to AI-00285:

Why is Latin-9's introduction such a big deal? Latin-1 is still the
"standard" 8-bit character set, and so immortalized in HTML and
other places. Latin-9 is just another character set, no more 
important than any other 8-bit set. Sure, people in Western
Europe are using it; but I bet more people still use Latin-1
than Latin-9, and more people probably use KOI8-R than Latin-9.
There are many character sets out there; adding support for just
one more doesn't help things. Especially as anyone writing for
international systems needs at the very least to set the character
set at startup rather than at compile time.


From: Pascal Leroy
> I still think
> that we want to retain the capacity of using 16-bit blobs to represent
> characters in the BMP, as 99.5% of practical applications will only need the
> BMP.

I sort of feel like this is saying that 99.5% of practical
applications will never need a "q". For any program that handles text,
there shouldn't be arbitrary restrictions on what comes in and out; a
program that handles Unicode should handle Unicode, instead of the
subset the programmer thought people would use. That's half the use of
Unicode; being able to use Latin letter Kra, and knowing that you
aren't limited to the systems that handle ISO-6937, or Ogham and
NSAI-434.

> Anyway, I don't think it is reasonable to force applications to go to the
> full 32-bit overhead just because they use, say, the french OE ligature.

Applications don't use the French OE ligature; users do. And
arbitrarily limiting users does not make your system a pleasure to
use.

In any case, how much overhead are we talking? In worst case
scenarios, we're talking a doubling of the memory the program uses.
But embedded systems are rarely heavy text users, and can probably
stay with Latin-1. I don't work with text files much larger than a
megabyte, and don't know of anyone who does. And if you're working
with large amounts of data and need to reduce size, compression - both
standard (e.g. LZW) and Unicode-specific (e.g. SCSU or BOCU-1) work
better than just using 16 bits.

> We certainly don't want to get into that business.  The designers of Ada 95
> wisely decided to lump all of the characters in the range 16#0100# ..
> 16#FFFD# into the category special_character, so that they don't have to
> decide which is a letter, a number, etc.  Similarly they didn't provide
> classification functions or upper/lower conversions for wide characters.

So it's left for a dozen implementations to do.

> This seems reasonable if we don't want to have to amend Ada each time a
> bunch of characters are added to 10646.

Why would you have to amend Ada? Add a Unicode version constant, and
define the data in terms of its Unicode properties. Then the
recentness of the characters is just a quality of implementation
issue.
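
As a rough sketch of what that could look like, assuming nothing beyond the idea above: a package that exposes the Unicode version as a constant and answers property queries. Every name here (Unicode_Properties, Unicode_Version, Code_Point, General_Category, Category) is hypothetical, the version string is only an illustrative value, and the two hard-coded ranges stand in for a table generated from the Unicode Character Database.

```ada
with Ada.Text_IO;

procedure Unicode_Version_Sketch is

   --  Hypothetical package: none of these names exist in the Ada
   --  standard or in GNAT.
   package Unicode_Properties is
      Unicode_Version : constant String := "3.2";  --  illustrative value

      type Code_Point is range 0 .. 16#10_FFFF#;
      type General_Category is (Uppercase_Letter, Lowercase_Letter, Other);

      function Category (C : Code_Point) return General_Category;
   end Unicode_Properties;

   package body Unicode_Properties is
      function Category (C : Code_Point) return General_Category is
      begin
         if C in 16#41# .. 16#5A# then       --  'A' .. 'Z'
            return Uppercase_Letter;
         elsif C in 16#61# .. 16#7A# then    --  'a' .. 'z'
            return Lowercase_Letter;
         else
            return Other;                    --  placeholder for all the rest
         end if;
      end Category;
   end Unicode_Properties;

   use Unicode_Properties;
begin
   Ada.Text_IO.Put_Line ("Unicode data version: " & Unicode_Version);
   Ada.Text_IO.Put_Line
     ("U+0041 is " & General_Category'Image (Category (16#41#)));
end Unicode_Version_Sketch;
```

A program written against such an interface depends only on the property names; which characters the table covers would then track the constant, not the language definition.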

From: Robert Dewar
> We certainly
> put in a lot of work in GNAT in implementing wide character with many
> different representation schemes,

GNAT supports input files in a dozen mostly bizarre or archaic
formats. It doesn't strike me as very useful, especially considering
that it supports Latin-1, Latin-2 (both useful), but also Latin-4
(completely unused) and Latin-3 (good for Maltese and Esperanto, and
most Esperanto users don't use it). It doesn't support ISO-8859-5 or
KOI8-R (Russian), or ISO-8859-7 (Greek). It doesn't support changing
formats on the fly - many users have multiple encodings around,
besides the fact that having to compile a different binary for each
user is a pain. Oh, and last time I submitted a bug on it, it got
ignored, until I brought it up on the gcc list, when it was pointed
out that the feature I was using (style checking on source files)
wasn't supported with UTF-8.

From: Pascal Leroy
> Remember, we are talking Ada applications here.  There are probably many
> applications out there that deal with mathematical symbols or with Tengwar, 
> but I doubt that they are written in Ada.

Mathematical symbols and Tengwar are text. Any text handling system
that supports Unicode should handle them like any other text, because
sooner or later users will expect it to handle them. (If you're
unlucky, it will be the day that you're showing your system off in
Hong Kong, and the potential buyer decides to put in his name that
isn't in the BMP.) If people don't want Ada to be a general-purpose
programming language, then that's fine; but it's not acceptable for a
general-purpose programming language not to be able to handle text,
and for a modern language, that means Unicode.




* Re: Character Sets
  2002-12-14  3:27         ` David Starner
@ 2002-12-14 22:53           ` Vadim Godunko
  2002-12-15  3:46             ` David Starner
  2002-12-15 23:26             ` Robert C. Leif
  0 siblings, 2 replies; 29+ messages in thread
From: Vadim Godunko @ 2002-12-14 22:53 UTC (permalink / raw)


starner@okstate.edu (David Starner) wrote in message news:<81f70ac6.0212131927.4fa6b642@posting.google.com>...
> 
> > This seems reasonable if we don't want to have to amend Ada each time a
> > bunch of characters are added to 10646.
> 
> Why would you have to amend Ada? Add a Unicode version constant, and
> define the data in terms of its Unicode properties. Then the
> recentness of the characters is just a quality of implementation
> issue.
> 
How much memory is required to store all the data from the Unicode
Character Database? What do you do if this constant changes? Retest
all existing applications?

> From: Robert Dewar
> > We certainly
> > put in a lot of work in GNAT in implementing wide character with many
> > different representation schemes,
> 
> GNAT supports input files in a dozen mostly bizarre or archaic
> formats. It doesn't strike me as very useful, especially considering
> that it supports Latin-1, Latin-2 (both useful), but also Latin-4
> (completely unused) and Latin-3 (good for Maltese and Esperanto, and
> most Esperanto users don't use it). It doesn't support ISO-8859-5 or
> KOI8-R (Russian), or ISO-8859-7 (Greek).
The latest public GNAT version and GCC3/GNAT both support ISO-8859-5
encoding in identifiers. And I don't know of any GNAT users who use
KOI8-R/U/B encodings outside comments, character literals, and string
literals.

> It doesn't support changing
> formats on the fly - many users have multiple encodings around,
> besides the fact that having to compile a different binary for each
> user is a pain. 
> 
Can you propose a method for detecting the encoding of an Ada source
file "on the fly"?

> From: Pascal Leroy
> > Remember, we are talking Ada applications here.  There are probably many
> > applications out there that deal with mathematical symbols or with Tengwar, 
> > but I doubt that they are written in Ada.
> 
> Mathematical symbols and Tengwar are text. Any text handling system
> that supports Unicode should handle them like any other text, because
> sooner or later users will expect it to handle them. (If you're
> unlucky, it will be the day that you're showing your system off in
> Hong Kong, and the potential buyer decides to put in his name that
> isn't in the BMP.) If people don't want Ada to be a general-purpose
> programming language, then that's fine; but it's not acceptable for a
> general-purpose programming language not to be able to handle text,
> and for a modern language, that means Unicode.

The main problem with encodings in Ada is history.

Many programs assume that Character is Latin-1. If we change the
semantics of Ada.Characters.Handling, what results will we get?

Ada83 defines type Character as an enumeration. The order of the
symbols is defined by their order in this enumeration, not by the
actual code. This allows simple porting of programs from, for example,
ASCII to EBCDIC encodings. Ada95 simply extends 7-bit ASCII to 8-bit
ISO-8859-1.
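
A minimal sketch of the enumeration view, using only predefined attributes; the procedure name Char_Order is made up. The program compares and steps through characters without ever writing a numeric code itself.

```ada
with Ada.Text_IO;

procedure Char_Order is
   use Ada.Text_IO;

   --  Typed constants avoid ambiguity between Character and the wide
   --  character types when literals are compared directly.
   A : constant Character := 'A';
   B : constant Character := 'B';
begin
   --  'Pos is the position of the value in the enumeration.
   Put_Line ("Pos of 'A':" & Integer'Image (Character'Pos (A)));

   --  Relational operators follow enumeration order.
   Put_Line ("'A' < 'B': " & Boolean'Image (A < B));

   --  'Succ steps to the next enumeration value.
   Put_Line ("Succ of 'A': " & Character'Succ (A));
end Char_Order;
```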

The difference between the logical code order in an encoding and the
collation order of the current user's language environment is another
problem. Neither Ada9X nor AI-00285 solves this.
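
A small sketch of that mismatch, assuming only the predefined Latin-1 ordering of Character; the procedure and constant names are made up. Both comparisons are true in code order, while a dictionary for most languages would put 'a' before 'Z' and an accented e next to the plain e.

```ada
with Ada.Text_IO;
with Ada.Characters.Latin_1;

procedure Code_Vs_Collation is
   use Ada.Text_IO;

   E_Acute : constant Character := Ada.Characters.Latin_1.LC_E_Acute;
   Lower_Z : constant Character := 'z';
   Upper_Z : constant Character := 'Z';
   Lower_A : constant Character := 'a';
begin
   --  In code order every upper-case basic letter precedes every
   --  lower-case one.
   Put_Line ("'Z' < 'a': " & Boolean'Image (Upper_Z < Lower_A));

   --  Accented letters sit above the whole basic alphabet.
   Put_Line ("e-acute > 'z': " & Boolean'Image (E_Acute > Lower_Z));
end Code_Vs_Collation;
```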

The best way to implement localization/internationalization support
in Ada is to define a special needs annex rather than change existing
interfaces, because (1) this does not affect portability and (2) it
allows new applications (where internationalization is critical) to
use the new interfaces.


Vadim Godunko




* Re: Character Sets
  2002-12-14 22:53           ` Vadim Godunko
@ 2002-12-15  3:46             ` David Starner
  2002-12-15 23:26             ` Robert C. Leif
  1 sibling, 0 replies; 29+ messages in thread
From: David Starner @ 2002-12-15  3:46 UTC (permalink / raw)


vgodunko@vipmail.ru (Vadim Godunko) wrote in message news:<665e587a.0212141453.42386f5d@posting.google.com>...
>
> How much memory is required to store all the data from the Unicode
> Character Database?

After stripping the converters, ICU takes up 3 MB.
<http://oss.software.ibm.com/icu/userguide/icudata.html> But that
includes a lot of locale data, and could probably be compressed more
with work.
There's no reason it would need to be paged into memory.

> What do you do if this constant changes? Retest all existing
> applications?

If the constant changed, then your version of the compiler changed,
and it's certainly possible that it broke your program, constant or
not. Given a stable API, a program should not break from a change in
the Unicode data, especially as they try not to make major changes to
the data between versions.
 
> The latest public GNAT version and GCC3/GNAT both support ISO-8859-5
> encoding in identifiers.

Which may explain why people weren't using it in earlier versions. 

> And I don't know of any GNAT users who use KOI8-R/U/B encodings
> outside comments, character literals, and string literals.

The problem is, source encoding is tied into the encoding that I/O
uses.

> The best way to implement localization/internationalization support
> in Ada is to define a special needs annex,

The non-BMP Unicode is not l10n/i18n - it's basic text handling just
like the rest of Unicode. As for the character data and encodings -
sure, whatever. Just so long as it's supported in some way.




* RE: Character Sets
  2002-12-14 22:53           ` Vadim Godunko
  2002-12-15  3:46             ` David Starner
@ 2002-12-15 23:26             ` Robert C. Leif
  1 sibling, 0 replies; 29+ messages in thread
From: Robert C. Leif @ 2002-12-15 23:26 UTC (permalink / raw)


I believe that we need to change to Latin_9. The European Economic
Community needs to have a Euro character. In the long run, an XML_Io or
Unicode_Io package will have to be created. However, it should be an
Application Program Interface, rather than being part of the core
language or an annex.
Bob Leif
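
A minimal sketch of why an 8-bit Character cannot tell the two sets apart: ISO 8859-15 (Latin-9) puts the euro sign at code 16#A4#, the code that Latin-1 assigns to the currency sign. The procedure name Euro_Position is made up, and the code position is written out by hand so that the sketch does not rely on a compiler-specific Latin_9 package.

```ada
with Ada.Text_IO;
with Ada.Characters.Latin_1;

procedure Euro_Position is
   --  Assumption: the Latin-9 euro sign occupies code 16#A4#,
   --  as assigned by ISO 8859-15.
   Euro_In_Latin_9 : constant Character := Character'Val (16#A4#);
begin
   --  Both names denote the same Character value, so any set built from
   --  the full Latin-1 range contains it.
   Ada.Text_IO.Put_Line
     ("Same value as Latin_1.Currency_Sign: "
      & Boolean'Image
          (Euro_In_Latin_9 = Ada.Characters.Latin_1.Currency_Sign));
end Euro_Position;
```

Telling the euro sign from the currency sign therefore takes either a wider character type or separate information about which encoding the data is in.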



Thread overview: 29+ messages
2002-11-28 17:53 Character Sets Robert C. Leif
2002-11-28 18:08 ` Character Sets (plain text police report) Warren W. Gay VE3WWG
2002-11-28 18:11   ` Warren W. Gay VE3WWG
2002-11-29 11:12     ` Lutz Donnerhacke
2002-11-29 14:58       ` Frank J. Lhota
2002-11-29 20:37   ` Robert C. Leif
2002-11-30 14:49     ` Marin David Condic
2002-12-01 11:28       ` Jacob Sparre Andersen
2002-12-01 14:38         ` Marin David Condic
2002-12-01 20:25           ` Jacob Sparre Andersen
2002-12-02  9:43             ` Preben Randhol
2002-12-02 13:26               ` Marin David Condic
2002-12-02  6:44           ` Robert C. Leif
2002-12-02  9:41           ` Preben Randhol
2002-12-02 16:58           ` Charles Lindsey
2002-12-02 19:29     ` A suggestion, completely unrelated to the original topic Wes Groleau
2002-12-02 23:21       ` David C. Hoos, Sr.
2002-11-29 12:28 ` Character Sets Georg Bauhaus
2002-12-02 18:28 ` Stephen Leake
2002-12-03  2:45   ` Robert C. Leif
2002-12-03 13:33     ` Robert A Duff
2002-12-03 15:32       ` Juanma Barranquero
2002-12-04  0:49       ` Robert C. Leif
2002-12-14  3:27         ` David Starner
2002-12-14 22:53           ` Vadim Godunko
2002-12-15  3:46             ` David Starner
2002-12-15 23:26             ` Robert C. Leif
  -- strict thread matches above, loose matches on Subject: below --
2002-11-27  9:00 Grein, Christoph
2002-11-26 21:41 Robert C. Leif
