From: Simon Wright
Newsgroups: comp.lang.ada
Subject: Re: GNAT vs UTF-8 source file names
Date: Tue, 04 Jul 2017 14:57:03 +0100

Simon Wright writes:

> PR ada/81114 refers[1].
>
> It turns out that this failure occurs on Windows and macOS. The problem
> is that GNAT smashes the file name to lower case if it knows that the
> file system is case-insensitive (using an ASCII to-lower, so of course
> 'smash' is the right word if there are UTF-8 characters in there).

> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81114

It's worse than that, on macOS anyway[2].

   $ GNAT_FILE_NAME_CASE_SENSITIVE=1 gnatmake -c p*.ads
   gcc -c páck3.ads
   páck3.ads:1:10: warning: file name does not match unit name, should be "páck3.ads"

The reason for this apparently-bizarre message is[3] that macOS takes the
composed form (lowercase a acute) and converts it under the hood to what
HFS+ insists on, the fully decomposed form (lowercase a, combining
acute); thus the names are actually different even though they _look_
the same.

I have to say that, great as it would be to have this fixed, the changes
required would be extensive, and I can’t see that anyone would think it
worth the trouble. The recommendation would be "don’t use international
characters in the names of library units".

[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81114#c1
[3] https://stackoverflow.com/a/6153713/40851
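
For anyone who wants to see the composed/decomposed mismatch for themselves, here is a small Python sketch (Python only because it is easy to run; nothing here is GNAT code). It assumes that Unicode NFD is a fair stand-in for what HFS+ does to this particular name; the helper same_file_name is hypothetical and only illustrates what a normalization-aware comparison might look like.

```python
import unicodedata

# Composed form: 'á' is the single code point U+00E1 (what one would type).
composed = "p\u00e1ck3.ads"

# Decomposed form: 'a' followed by U+0301 COMBINING ACUTE ACCENT.
# Assumption: Unicode NFD matches what HFS+ stores for this name.
decomposed = unicodedata.normalize("NFD", composed)


def same_file_name(a: str, b: str) -> bool:
    """Hypothetical helper: compare two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The two strings render identically but are different code point sequences.
print(composed == decomposed)
print([hex(ord(c)) for c in composed[:3]])
print([hex(ord(c)) for c in decomposed[:4]])

# A normalization-aware comparison treats them as the same name.
print(same_file_name(composed, decomposed))
```

A plain string (or byte) comparison of the two forms fails, which would account for a warning whose "expected" and "actual" names look identical on screen.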