From: Georg Bauhaus
Newsgroups: comp.lang.ada
Subject: Re: Avatox 1.1: Trouble with encoding in Windows
Date: Mon, 11 Sep 2006 18:43:39 +0200
Message-ID: <45059117$0$5144$9b4e6d93@newsspool1.arcor-online.net>
In-Reply-To: <4505696b@news.upm.es>

Manuel Collado wrote:

>> And it might help prevent dodgy arguments like the ones presented
>> by implementers against the clever requirement to write the
>> identifier π in the Ada 2005 library. :-)
>
> Spanish identifiers like 'tamaño' (size) or 'año' (year) are currently
> accepted by GNAT.

Which makes the argument against π in the library even more bogus
in my book ;-)

> XML markup is meant to be written and read mostly by tools, not by
> humans. So it doesn't matter if a text fragment is coded as 'España' or
> as 'Espa&#241;a'. In fact, after parsing, an XML processing agent cannot
> know how it was coded.
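The quoted point about parsing can be checked with a small sketch. It uses Python's standard xml.etree.ElementTree as a stand-in for "an XML processing agent"; the element name `country` and all variable names are made up for illustration and do not come from Avatox.

```python
import xml.etree.ElementTree as ET

# Two spellings of the same content: a literal n-tilde and a numeric
# character reference (241 is the code point of the n-tilde).
literal_doc = "<country>Espa\u00f1a</country>"
reference_doc = "<country>Espa&#241;a</country>"

literal_text = ET.fromstring(literal_doc).text
reference_text = ET.fromstring(reference_doc).text

# After parsing, both documents yield the same string; this parser keeps
# no record of which spelling the input used.
print(literal_text == reference_text)

# Writing the tree back out in a 7-bit encoding forces the character
# reference to reappear in the serialized bytes.
seven_bit = ET.tostring(ET.fromstring(literal_doc), encoding="us-ascii")
print(seven_bit)
```

That this particular parser discards the input spelling is a property of its API, not of XML as such; a processor that wanted to record it could do so.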
Oh, there is nothing stopping an XML processor from keeping track
of input properties, even when the character representation is not an
issue after parsing. Just like an ASIS tool could (should?) know
the character encoding of the Ada sources it has read.

> it doesn't matter if a text fragment is coded as 'España' or as
> 'Espa&#241;a'.

>> Country: Wide_String := "Espa" & Wide_Character'Val(241) & "a"; ...
>> Town: String := "New" & Character'Val(32) & "York";
>>
>
> This is outside of scope. I've not spoken about adequate character
> representation in Ada sources, just in XML documents.

Right, this was meant as an analogy: When I have to look at the
text, not process it, I'll be glad if identifiers and literals are
easy to read.

I think there is still a tradeoff between a 7bit external
representation of ASIS in XML and its usability[1]. For example,
when you look at ASIS streams in order to find out why one
of them is broken, XML processors can't do much, because their
input is broken as a consequence. Or when I am developing an XSL
transformation for "refactoring" some of the identifiers
in a program, then I will have to look hard at "tama&#241;o" in
order to see that it just is "tamaño". That's not productive
in my view.

[1] 7bit might seem simple bitwise, but it isn't necessarily easier
to process because character entities must be handled, too.

-- Georg