From: "Dmitry A. Kazakov"
Newsgroups: comp.lang.ada
Subject: Re: Exclusive file access
Date: Sat, 29 Aug 2015 14:02:36 +0200
Organization: cbb software GmbH

On Sat, 29 Aug 2015 10:31:56 +0200, Pascal Obry wrote:

> On Saturday, 29 August 2015 at 09:05 +0200, Dmitry A. Kazakov wrote:
>> I doubt that. Windows "non-ascii" file names are UTF-16. The only
>> consistent way to have them would be Wide_Wide_Text_IO with names in
>> Wide_Wide_String internally recoded to UTF-16. Does GNAT do this? I
>> didn't look at the implementation, but I bet it does not. Then how
>> would you do I/O if the content is not Wide_Wide_String?
>
> I bet you meant Wide_ instead of Wide_Wide_ above. Right?

No, I meant Wide_Wide_String. Ada's Wide_String is, according to the
standard, UCS-2: each Wide_Character is a single 16-bit code point from
the Basic Multilingual Plane. Windows file names are UTF-16. The only
full Unicode string type is Wide_Wide_String. For an Indo-European
language there is no difference, of course.

Under Linux most applications simply ignore the Ada standard and use
String encoded in UTF-8.
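
To make the UCS-2 versus UTF-16 distinction concrete, here is a small sketch in Python (used only because it makes the code units easy to print; it says nothing about how GNAT is implemented). The helper names utf16_code_units and fits_in_ucs2 are made up for this illustration, and the Deseret letter U+10400 is just an arbitrary character outside the Basic Multilingual Plane.

```python
def utf16_code_units(text: str) -> list:
    """Return the 16-bit code units of text when encoded as UTF-16."""
    raw = text.encode("utf-16-le")  # little-endian, no byte order mark
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def fits_in_ucs2(text: str) -> bool:
    """True if every character is a single BMP code point, i.e. the kind
    of value an Ada Wide_Character is defined to hold."""
    return all(
        ord(ch) <= 0xFFFF and not (0xD800 <= ord(ch) <= 0xDFFF)
        for ch in text
    )


bmp_char = "\u00e4"         # a-umlaut, inside the BMP
astral_char = "\U00010400"  # Deseret capital letter, outside the BMP

# One code unit: UCS-2 and UTF-16 agree here.
print([hex(u) for u in utf16_code_units(bmp_char)])     # ['0xe4']
print(fits_in_ucs2(bmp_char))                           # True

# Two code units (a surrogate pair): valid UTF-16, not representable in UCS-2.
print([hex(u) for u in utf16_code_units(astral_char)])  # ['0xd801', '0xdc00']
print(fits_in_ucs2(astral_char))                        # False
```

For names made only of BMP characters the two encodings coincide, which is why the difference goes unnoticed for most European languages. It shows up only when a name needs a surrogate pair.
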

I suppose that under Linux GNAT simply passes String file names as-is,
i.e. as UTF-8 [*]. A conformant but totally useless implementation would
assume names in Latin-1 and recode them into UTF-8 before passing them
to Linux.

GNAT under Windows is non-conformant as well. I doubt it recodes UCS-2
Wide_String into UTF-16. Thus an application that uses Wide_String names
has to recode the names into UTF-16 itself first, i.e. the same mess as
under Linux.

A properly designed (Unicode-aware) Text_IO should have used
Wide_Wide_String and/or a UTF-8 encoded string type for all file names
everywhere.

That is why I use GIO instead of the Ada standard library. GIO is UTF-8
on both Windows and Linux, which makes the applications using it
portable.

------------------------------------------------------------
* Here is a program illustrating the non-conformity of Linux GNAT:

   with Ada.Text_IO;             use Ada.Text_IO;
   with Ada.Characters.Latin_1;  use Ada.Characters.Latin_1;

   procedure Test_Latin is
      File : File_Type;
   begin
      --  The name is the single Latin-1 character 16#E4# (a-umlaut).
      Create (File, Out_File, "" & LC_A_Diaeresis);
      Close (File);
   end Test_Latin;

The created file name is garbage instead of "ä" (a-umlaut).

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
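
P.S. A sketch of why the footnote's file name comes out as garbage, under the assumption made above that the String is handed to the OS byte-for-byte and that the shell or file manager interprets names as UTF-8. Python is used only to show the bytes; is_valid_utf8 and latin1_to_utf8 are hypothetical helpers, not part of any Ada or GNAT API.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """True if raw decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def latin1_to_utf8(raw: bytes) -> bytes:
    """Recode Latin-1 bytes into UTF-8, which is what an implementation
    that treats String as Latin-1 would have to do before calling the OS."""
    return raw.decode("latin-1").encode("utf-8")


# The name built by Test_Latin: one Latin-1 character, LC_A_Diaeresis.
name_as_passed = "\u00e4".encode("latin-1")

print(name_as_passed)                  # b'\xe4'
print(is_valid_utf8(name_as_passed))   # False: a lone 0xE4 is a truncated sequence

name_recoded = latin1_to_utf8(name_as_passed)
print(name_recoded)                    # b'\xc3\xa4'
print(is_valid_utf8(name_recoded))     # True: this is how a UTF-8 system stores the name
```

The byte 0xE4 on its own is the start of a three-byte UTF-8 sequence with nothing following it, so a UTF-8 aware tool cannot display it as a character.
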