From: "Robert C. Leif"
Newsgroups: comp.lang.ada
Subject: Re: Character Sets
Date: Thu, 28 Nov 2002 09:53:14 -0800

Christoph Grein responded to my inquiry by stating that:

"Latin_9.Euro_Sign is a name for a character. The same character in Latin_1 has a different name, it is the Currency_Sign."

"So why do you expect this character not to be in the set only because you use a different name for it?"

The Euro_Sign and the Currency_Sign have a different representation according to The ISO 8859 Alphabet Soup, http://czyborra.com/charsets/iso8859.html

GNAT Latin_9 (ISO-8859-15) includes the following:

   ------------------------------------------------
   -- Summary of Changes from Latin-1 => Latin-9 --
   ------------------------------------------------

   --   164     Currency                => Euro_Sign
   --   166     Broken_Bar              => UC_S_Caron
   --   168     Diaeresis               => LC_S_Caron
   --   180     Acute                   => UC_Z_Caron
   --   184     Cedilla                 => LC_Z_Caron
   --   188     Fraction_One_Quarter    => UC_Ligature_OE
   --   189     Fraction_One_Half       => LC_Ligature_OE
   --   190     Fraction_Three_Quarters => UC_Y_Diaeresis

Since these are changes, they should not be the same character.

Below are the results of an extension of my original program, which now tests the characters of Latin_9 from character number 164 through 190 and prints them out. I understand that the choice of Windows font will change their representation. The correct glyphs can be found at The ISO 8859 Alphabet Soup. For anyone interested, I have put my program at the end of this note.

I suspect that the best solution would be to introduce Unicode, ISO/IEC 10646, into the Ada standard. The arguments for this are contained in W3C Character Model for the World Wide Web 1.0, W3C Working Draft 30 April 2002, http://www.w3.org/TR/charmod/

"The choice of Unicode was motivated by the fact that Unicode: is the only universal character repertoire available, covers the widest possible range, provides a way of referencing characters independent of the encoding of a resource, is being updated/completed carefully, is widely accepted and implemented by industry."

"W3C adopted Unicode as the document character set for HTML in [HTML 4.0]. The same approach was later used for specifications such as XML 1.0 [XML 1.0] and CSS2 [CSS2].
Unicode now serves as a common reference for W3C specifications and applications."

"The IETF has adopted some policies on the use of character sets on the Internet (see [RFC 2277])."

Bob Leif

------------------------Starting Test-----------------------
Latin_9_Diff is ñÑªº¿⌐¬½¼¡«»░▒▓│┤╡╢╖╕╣║╗╝╜╛

The Character ñ is in Latin_1 is TRUE. Its position is 164
The Character Ñ is in Latin_1 is TRUE. Its position is 165
The Character ª is in Latin_1 is TRUE. Its position is 166
The Character º is in Latin_1 is TRUE. Its position is 167
The Character ¿ is in Latin_1 is TRUE. Its position is 168
The Character ⌐ is in Latin_1 is TRUE. Its position is 169
The Character ¬ is in Latin_1 is TRUE. Its position is 170
The Character ½ is in Latin_1 is TRUE. Its position is 171
The Character ¼ is in Latin_1 is TRUE. Its position is 172
The Character ¡ is in Latin_1 is TRUE. Its position is 173
The Character « is in Latin_1 is TRUE. Its position is 174
The Character » is in Latin_1 is TRUE. Its position is 175
The Character ░ is in Latin_1 is TRUE. Its position is 176
The Character ▒ is in Latin_1 is TRUE. Its position is 177
The Character ▓ is in Latin_1 is TRUE. Its position is 178
The Character │ is in Latin_1 is TRUE. Its position is 179
The Character ┤ is in Latin_1 is TRUE. Its position is 180
The Character ╡ is in Latin_1 is TRUE. Its position is 181
The Character ╢ is in Latin_1 is TRUE. Its position is 182
The Character ╖ is in Latin_1 is TRUE. Its position is 183
The Character ╕ is in Latin_1 is TRUE. Its position is 184
The Character ╣ is in Latin_1 is TRUE. Its position is 185
The Character ║ is in Latin_1 is TRUE. Its position is 186
The Character ╗ is in Latin_1 is TRUE. Its position is 187
The Character ╝ is in Latin_1 is TRUE. Its position is 188
The Character ╜ is in Latin_1 is TRUE. Its position is 189
The Character ╛ is in Latin_1 is TRUE. Its position is 190
------------------------Ending Test-----------------------

--Robert C. Leif, Ph.D & Ada_Med Copyright all rights reserved.
--Main Procedure
--Created 27 November 2002
with Ada.Text_Io;
with Ada.Io_Exceptions;
with Ada.Exceptions;
with Ada.Strings;
with Ada.Strings.Maps;
with Ada.Characters.Latin_1;
with Ada.Characters.Latin_9;
procedure Char_Sets_Test is
   ------------------Table of Contents-------------
   package T_Io renames Ada.Text_Io;
   package Str_Maps renames Ada.Strings.Maps;
   package Latin_1 renames Ada.Characters.Latin_1;
   package Latin_9 renames Ada.Characters.Latin_9;
   subtype Character_Set_Type is Str_Maps.Character_Set;
   subtype Character_Sequence_Type is Str_Maps.Character_Sequence;
   -----------------End Table of Contents-------------
   Latin_1_Range : constant Str_Maps.Character_Range :=
     (Low  => Latin_1.Nul,
      High => Latin_1.Lc_Y_Diaeresis);
   Latin_1_Char_Set : Character_Set_Type :=
     Str_Maps.To_Set (Span => Latin_1_Range);
   --Standard for Ada '95
   -- Latin_9 Differences: Euro_Sign, Uc_S_Caron, Lc_S_Caron, Uc_Z_Caron,
   -- Lc_Z_Caron, Uc_Ligature_Oe, Lc_Ligature_Oe, Uc_Y_Diaeresis.
   Latin_9_Diff_Latin_1_Super_Range : constant Str_Maps.Character_Range :=
     (Low  => Latin_9.Euro_Sign,
      High => Latin_9.Uc_Y_Diaeresis);
   Latin_9_Diff_Latin_1_Super_Set : Character_Set_Type :=
     Str_Maps.To_Set (Span => Latin_9_Diff_Latin_1_Super_Range);
   Latin_9_Diff_Latin_1_Super_String : Character_Sequence_Type :=
     Str_Maps.To_Sequence (Latin_9_Diff_Latin_1_Super_Set);
   Character_Set_Name : String := "Latin_1";
   ---------------------------------------------
   procedure Test_Character_Sets (
         Character_Sequence_Var : in Character_Sequence_Type;
         Set                    : in Character_Set_Type) is
      Is_In_Character_Set    : Boolean   := False;
      Char                   : Character := 'X';
      -- Counter that tracks the position of the character being printed.
      Character_Set_Position : Positive  := 164; -- Euro_Sign
   begin--Test_Character_Sets
      T_Io.Put_Line("Latin_9_Diff is " & Character_Sequence_Var);
      T_Io.Put_Line("");
      Test_Chars:
      for I in Character_Sequence_Var'range loop
         Char := Character_Sequence_Var(I);
         -- Test against the Set parameter, not the global Latin_1_Char_Set.
         Is_In_Character_Set := Str_Maps.Is_In(
            Element => Char,
            Set     => Set);
         T_Io.Put_Line("The Character " & Char & " is in " &
            Character_Set_Name & " is " &
            Boolean'Image(Is_In_Character_Set) & ". Its position is " &
            Positive'Image(Character_Set_Position));
         Character_Set_Position := Character_Set_Position + 1;
      end loop Test_Chars;
   end Test_Character_Sets;
   ---------------------------------------------
begin--Char_Sets_Test
   T_Io.Put_Line("----------------------Starting Test---------------------");
   Test_Character_Sets (
      Character_Sequence_Var => Latin_9_Diff_Latin_1_Super_String,
      Set                    => Latin_1_Char_Set);
   ---------------------------------------------
   T_Io.Put_Line("------------------------Ending Test---------------------");
exception
   when A: Ada.Io_Exceptions.Status_Error =>
      T_Io.Put_Line("Status_Error in Char_Sets_Test.");
      T_Io.Put_Line(Ada.Exceptions.Exception_Information(A));
   when O: others =>
      T_Io.Put_Line("Others_Error in Char_Sets_Test.");
      T_Io.Put_Line(Ada.Exceptions.Exception_Information(O));
end Char_Sets_Test;
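
Christoph Grein's point quoted at the top of this note can be checked directly. The following is a minimal sketch, not part of the program above. It assumes GNAT's Ada.Characters.Latin_9 package, which is not in the Ada 95 standard, and the procedure name Same_Position_Check is made up for illustration. It prints the position of each named constant and then compares the two as Character values.

```ada
with Ada.Text_IO;
with Ada.Characters.Latin_1;
with Ada.Characters.Latin_9;  --  Assumption: GNAT-specific package

--  Hypothetical name, used only for this illustration.
procedure Same_Position_Check is
   package Latin_1 renames Ada.Characters.Latin_1;
   package Latin_9 renames Ada.Characters.Latin_9;
begin
   --  Position of the Latin_1 name within type Character.
   Ada.Text_IO.Put_Line
     ("Currency_Sign position:"
      & Integer'Image (Character'Pos (Latin_1.Currency_Sign)));

   --  Position of the Latin_9 name within the same type Character.
   Ada.Text_IO.Put_Line
     ("Euro_Sign position:"
      & Integer'Image (Character'Pos (Latin_9.Euro_Sign)));

   --  Both constants are of type Character, so they compare directly.
   Ada.Text_IO.Put_Line
     ("Same Character value: "
      & Boolean'Image (Latin_1.Currency_Sign = Latin_9.Euro_Sign));
end Same_Position_Check;
```

Both packages declare their constants as values of the one 8-bit type Character, so with GNAT this should report position 164 for both names and TRUE for the comparison. That would be consistent with every line of the test output reporting TRUE: Ada.Strings.Maps tests Character values, not the names or glyphs attached to them.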