From: "Robert C. Leif"
Newsgroups: comp.lang.ada
Subject: RE: Character Sets (plain text police report)
Date: Fri, 29 Nov 2002 12:37:26 -0800

Oops. My apologies.
Bob Leif

The correct text version is below.

Addendum: The solution is the creation of versions of Ada.Strings.Bounded for 16 and 32 bit characters.
The 32 bit Unicode characters allow direct comparison of characters based on their position in Unicode.

-----Original Message-----
From: comp.lang.ada-admin@ada.eu.org [mailto:comp.lang.ada-admin@ada.eu.org] On Behalf Of Warren W. Gay VE3WWG
Sent: Thursday, November 28, 2002 10:09 AM
To: comp.lang.ada@ada.eu.org
Subject: Re: Character Sets (plain text police report)

Hmmm... I guess since Robert Dewar is avoiding this group these days,
we also lost our "plain text" police force ;-)

In case you were not aware of it, you are posting HTML to this
news group. This is generally discouraged so that others who are
not using HTML capable news readers, are still able to make sense
of your posting.

--------------------------------------------------------

Christoph Grein responded to my inquiry by stating that "Latin_9.Euro_Sign is a name for a character. The same character in Latin_1 has a different name, it is the Currency_Sign." "So why do you expect this character not to be in the set only because you use a different name for it?"

The Euro_Sign and the Currency_Sign have a different representation according to The ISO 8859 Alphabet Soup, http://czyborra.com/charsets/iso8859.html

------------------------------------------------
GNAT Latin_9 (ISO-8859-15) includes the following:
-- Summary of Changes from Latin-1 => Latin-9 --
------------------------------------------------
--   164   Currency                  => Euro_Sign
--   166   Broken_Bar                => UC_S_Caron
--   168   Diaeresis                 => LC_S_Caron
--   180   Acute                     => UC_Z_Caron
--   184   Cedilla                   => LC_Z_Caron
--   188   Fraction_One_Quarter      => UC_Ligature_OE
--   189   Fraction_One_Half         => LC_Ligature_OE
--   190   Fraction_Three_Quarters   => UC_Y_Diaeresis

Since these are changes, they should not be the same character. Below are the results of an extension of my original program that now tests the characters of Latin_9 from character number 164 through 190 and prints them out.
I understand that the choice of Windows font will change their representation. The correct glyphs can be found at The ISO 8859 Alphabet Soup. For anyone interested, I have put my program at the end of this note.

I suspect that the best solution would be to introduce Unicode, ISO/IEC 10646, into the Ada standard. The arguments for this are contained in W3C Character Model for the World Wide Web 1.0, W3C Working Draft 30 April 2002, http://www.w3.org/TR/charmod/

"The choice of Unicode was motivated by the fact that Unicode: is the only universal character repertoire available, covers the widest possible range, provides a way of referencing characters independent of the encoding of a resource, is being updated/completed carefully, is widely accepted and implemented by industry."

"W3C adopted Unicode as the document character set for HTML in [HTML 4.0]. The same approach was later used for specifications such as XML 1.0 [XML 1.0] and CSS2 [CSS2]. Unicode now serves as a common reference for W3C specifications and applications."

"The IETF has adopted some policies on the use of character sets on the Internet (see [RFC 2277])."

Bob Leif

------------------------Starting Test-----------------------
Latin_9_Diff is ñÑªº¿⌐¬½¼¡«»░▒▓│┤╡╢╖╕╣║╗╝╜╛

The Character ñ is in Latin_1 is TRUE. Its position is 164
The Character Ñ is in Latin_1 is TRUE. Its position is 165
The Character ª is in Latin_1 is TRUE. Its position is 166
The Character º is in Latin_1 is TRUE. Its position is 167
The Character ¿ is in Latin_1 is TRUE. Its position is 168
The Character ⌐ is in Latin_1 is TRUE. Its position is 169
The Character ¬ is in Latin_1 is TRUE. Its position is 170
The Character ½ is in Latin_1 is TRUE. Its position is 171
The Character ¼ is in Latin_1 is TRUE. Its position is 172
The Character ¡ is in Latin_1 is TRUE. Its position is 173
The Character « is in Latin_1 is TRUE. Its position is 174
The Character » is in Latin_1 is TRUE. Its position is 175
The Character ░ is in Latin_1 is TRUE. Its position is 176
The Character ▒ is in Latin_1 is TRUE. Its position is 177
The Character ▓ is in Latin_1 is TRUE. Its position is 178
The Character │ is in Latin_1 is TRUE. Its position is 179
The Character ┤ is in Latin_1 is TRUE. Its position is 180
The Character ╡ is in Latin_1 is TRUE. Its position is 181
The Character ╢ is in Latin_1 is TRUE. Its position is 182
The Character ╖ is in Latin_1 is TRUE. Its position is 183
The Character ╕ is in Latin_1 is TRUE. Its position is 184
The Character ╣ is in Latin_1 is TRUE. Its position is 185
The Character ║ is in Latin_1 is TRUE. Its position is 186
The Character ╗ is in Latin_1 is TRUE. Its position is 187
The Character ╝ is in Latin_1 is TRUE. Its position is 188
The Character ╜ is in Latin_1 is TRUE. Its position is 189
The Character ╛ is in Latin_1 is TRUE. Its position is 190
------------------------Ending Test-----------------------

--Robert C. Leif, Ph.D & Ada_Med Copyright all rights reserved.
--Main Procedure
--Created 27 November 2002
with Ada.Text_Io;
with Ada.Io_Exceptions;
with Ada.Exceptions;
with Ada.Strings;
with Ada.Strings.Maps;
with Ada.Characters.Latin_1;
with Ada.Characters.Latin_9;
procedure Char_Sets_Test is
   ------------------Table of Contents-------------
   package T_Io renames Ada.Text_Io;
   package Str_Maps renames Ada.Strings.Maps;
   package Latin_1 renames Ada.Characters.Latin_1;
   package Latin_9 renames Ada.Characters.Latin_9;
   subtype Character_Set_Type is Str_Maps.Character_Set;
   subtype Character_Sequence_Type is Str_Maps.Character_Sequence;
   -----------------End Table of Contents-------------
   -- The full range of type Character, positions 0 .. 255.
   Latin_1_Range : constant Str_Maps.Character_Range :=
     (Low  => Latin_1.Nul,
      High => Latin_1.Lc_Y_Diaeresis);
   Latin_1_Char_Set : constant Character_Set_Type :=
     Str_Maps.To_Set (Span => Latin_1_Range);
   --Standard for Ada '95
   -- Latin_9 Differences: Euro_Sign, Uc_S_Caron, Lc_S_Caron, Uc_Z_Caron,
   -- Lc_Z_Caron, Uc_Ligature_Oe, Lc_Ligature_Oe, Uc_Y_Diaeresis.
   -- Contiguous span from position 164 through 190.
   Latin_9_Diff_Latin_1_Super_Range : constant Str_Maps.Character_Range :=
     (Low  => Latin_9.Euro_Sign,
      High => Latin_9.Uc_Y_Diaeresis);
   Latin_9_Diff_Latin_1_Super_Set : constant Character_Set_Type :=
     Str_Maps.To_Set (Span => Latin_9_Diff_Latin_1_Super_Range);
   Latin_9_Diff_Latin_1_Super_String : constant Character_Sequence_Type :=
     Str_Maps.To_Sequence (Latin_9_Diff_Latin_1_Super_Set);
   Character_Set_Name : constant String := "Latin_1";
   ---------------------------------------------
   procedure Test_Character_Sets (
      Character_Sequence_Var : in Character_Sequence_Type;
      Set                    : in Character_Set_Type) is
      Is_In_Character_Set    : Boolean   := False;
      Char                   : Character := 'X';
      Character_Set_Position : Positive  := 164; -- Euro_Sign
   begin--Test_Character_Sets
      T_Io.Put_Line ("Latin_9_Diff is " & Character_Sequence_Var);
      T_Io.Put_Line ("");
      Test_Chars:
      for I in Character_Sequence_Var'range loop
         Char := Character_Sequence_Var (I);
         -- Test membership in the set passed by the caller.
         Is_In_Character_Set := Str_Maps.Is_In (Element => Char, Set => Set);
         T_Io.Put_Line ("The Character " & Char & " is in "
            & Character_Set_Name & " is "
            & Boolean'Image (Is_In_Character_Set)
            & ". Its position is "
            & Positive'Image (Character_Set_Position));
         Character_Set_Position := Character_Set_Position + 1;
      end loop Test_Chars;
   end Test_Character_Sets;
   ---------------------------------------------
begin--Char_Sets_Test
   T_Io.Put_Line ("----------------------Starting Test---------------------");
   Test_Character_Sets (
      Character_Sequence_Var => Latin_9_Diff_Latin_1_Super_String,
      Set                    => Latin_1_Char_Set);
   ---------------------------------------------
   T_Io.Put_Line ("------------------------Ending Test---------------------");
exception
   when A: Ada.Io_Exceptions.Status_Error =>
      T_Io.Put_Line ("Status_Error in Char_Sets_Test.");
      T_Io.Put_Line (Ada.Exceptions.Exception_Information (A));
   when O: others =>
      T_Io.Put_Line ("Others_Error in Char_Sets_Test.");
      T_Io.Put_Line (Ada.Exceptions.Exception_Information (O));
end Char_Sets_Test;
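
Christoph Grein's point, quoted earlier in this note, can be checked with a much smaller program. What follows is only a sketch and not part of the original test: the procedure name Same_Position_Check and the set Only_Currency are made up for illustration, and it assumes GNAT, since Ada.Characters.Latin_9 is a GNAT-provided package. It prints the positions of Latin_1.Currency_Sign and Latin_9.Euro_Sign, compares the two constants, and tests the Euro_Sign against a set built from the Currency_Sign alone.

```ada
with Ada.Text_IO;
with Ada.Strings.Maps;
with Ada.Characters.Latin_1;
with Ada.Characters.Latin_9;

--  Hypothetical demonstration; the names below are illustrative only.
procedure Same_Position_Check is
   package Latin_1 renames Ada.Characters.Latin_1;
   package Latin_9 renames Ada.Characters.Latin_9;

   --  A set holding exactly one element, the Latin_1 Currency_Sign.
   Only_Currency : constant Ada.Strings.Maps.Character_Set :=
     Ada.Strings.Maps.To_Set (Latin_1.Currency_Sign);
begin
   --  Both names are constants of the one type Character.
   Ada.Text_IO.Put_Line
     ("Latin_1.Currency_Sign position:"
      & Integer'Image (Character'Pos (Latin_1.Currency_Sign)));
   Ada.Text_IO.Put_Line
     ("Latin_9.Euro_Sign position:"
      & Integer'Image (Character'Pos (Latin_9.Euro_Sign)));

   --  Equality compares positions, not the names or the glyphs.
   Ada.Text_IO.Put_Line
     ("Same Character value: "
      & Boolean'Image (Latin_1.Currency_Sign = Latin_9.Euro_Sign));

   --  Membership is decided by Character value as well.
   Ada.Text_IO.Put_Line
     ("Euro_Sign in set built from Currency_Sign: "
      & Boolean'Image
          (Ada.Strings.Maps.Is_In (Latin_9.Euro_Sign, Only_Currency)));
end Same_Position_Check;
```

With GNAT, both positions should come out as 164 and both Boolean results as TRUE. A Character_Set cannot tell the two names apart, so a membership test like the one above cannot show that Latin_9 differs from Latin_1. Only the interpretation of position 164 changes, and that is decided by whatever encoding and font the display uses.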