From: "Dmitry A. Kazakov"
Newsgroups: comp.lang.ada
Subject: Re: Bug in Ada - Latin 1 is not a subset of UTF-8
Date: Fri, 21 Oct 2016 18:43:49 +0200

On 2016-10-21 18:13, Lucretia wrote:
> On Friday, 21 October 2016 13:28:52 UTC+1, G.B. wrote:
>
>> Test_Bom : constant My_Utf_8_String := Bom & "ABC";
>> Test_US : constant My_Utf_8_String := "ABC";
>> Test_GR : constant My_Utf_8_String := "ΑΒΓ";
>> Test_RU : constant My_Utf_8_String := "АБГ";
>> Test_Xx : constant My_Utf_8_String :=
>> ('A', Character'Val (16#E4#), 'E');
>
> Also, the most inefficient string ever:
>
> Appended : My_UTF_8_String := "App";
>
> Appended := Some_Other_String & 'e'; -- Call's Is_Well_Formed for each assignment! Sloooooooooooooow
> Appended := Some_Other_String & 'n';
> Appended := Some_Other_String & 'd';

For a proper UTF-8 string, no checks would ever be required when a character is appended. The above is a sorry mess of representation colliding with semantics, octets with characters.
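
To make the octet/character distinction concrete, here is a minimal sketch using the standard Ada 2012 package Ada.Strings.UTF_Encoding.Strings. The procedure name and the "App" prefix are only illustrative, and this is not the My_Utf_8_String type from the quoted code. It appends the Latin-1 character 16#E4# once as a raw octet and once as an encoded character.

```ada
with Ada.Text_IO;
with Ada.Strings.UTF_Encoding.Strings;

procedure Octets_Vs_Characters is
   --  Latin-1 a-umlaut: one octet with value 16#E4#.
   A_Umlaut : constant Character := Character'Val (16#E4#);

   --  Appending the raw octet: 16#E4# is a UTF-8 lead byte that expects
   --  two continuation bytes, so this string is ill-formed as UTF-8.
   Raw : constant Ada.Strings.UTF_Encoding.UTF_8_String :=
     "App" & A_Umlaut;

   --  Appending the character: Encode turns it into 16#C3# 16#A4#,
   --  so the result is well-formed by construction.
   Encoded : constant Ada.Strings.UTF_Encoding.UTF_8_String :=
     "App" & Ada.Strings.UTF_Encoding.Strings.Encode ((1 => A_Umlaut));
begin
   --  Raw holds 4 octets, Encoded holds 5.
   Ada.Text_IO.Put_Line ("Raw octets:    " & Natural'Image (Raw'Length));
   Ada.Text_IO.Put_Line ("Encoded octets:" & Natural'Image (Encoded'Length));
end Octets_Vs_Characters;
```

When the append operation takes a character and encodes it, the result cannot become ill-formed, so there is nothing to re-validate after each assignment.
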
'e' is a Latin-1 character appended as an octet where a Unicode character was meant. Wrong design always gets punished, one way or another.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de