From: "G.B."
Newsgroups: comp.lang.ada
Subject: Re: UTF-8 Output and "-gnatW8"
Date: Mon, 4 Apr 2016 12:52:07 +0200
References: <35689862-61dc-4186-87d3-37b17abed5a2@googlegroups.com> <3a65e71c-41ee-49eb-916d-c0be8be9abc6@googlegroups.com> <6406289c-06a8-46d1-a633-8a1c8a22f79b@googlegroups.com>

On 30.03.16 00:35, Randy Brukardt wrote:
> "Michael Rohan" wrote in message
> news:6406289c-06a8-46d1-a633-8a1c8a22f79b@googlegroups.com...
> ...
>> It does, however, feel like there is something missing where it's
>> "difficult" to have a Wide_String literal without having to have
>> extra meta data for compiler (-gnatW8) or having a relatively
>> cumbersome concatenation of Wide_Character's based on code points.
>> BTW, the performance of GNAT for such a concatenated string is
>> pretty dismal.
>
> Both of these are clearly implementation issues as opposed to language
> issues.

Ada users would expect to be able to express numeric literals, I think, without any implementation issues whatsoever. This includes numeric capacity, which they expect a compiler to report correctly, and that in turn implies no implementation issues when parsing numeric literals.

However, and I'm guessing here, there is embarrassment lurking behind the handling of non-ASCII strings. It mostly hinges on the old, pampered misunderstanding that a char has eight bits, seven of which are to be used, and that each char represents exactly one ASCII character. Hence, trying to handle more than that in any tool, including a compiler reading a source unit, is deemed equivalent to tackling a hard problem of number theory.

No one would tolerate that kind of alleged complexity, historically grown as it may be, if it were claimed for Ada's numeric literals instead of for contemporary character sets.

I would hope there is room for compromise when standardizing source character sets through ISO, just as there is room for compromise in that a compiler is not required to solve problems of number theory when lexing and parsing numeric literals.

C++ has a related problem with string literals. It costs customers time and money.