From: Jacob Sparre Andersen
Newsgroups: comp.lang.ada
Subject: Re: GNAT vs UTF-8 source file names
Date: Fri, 07 Jul 2017 13:49:57 +0200
Organization: JSA Research & Innovation
Message-ID: <87inj4xy8q.fsf@jacob-sparre.dk>

Simon Wright wrote:

> The rest is about GNAT's behaviour; to reiterate, ARM 2.1(16/3) says
>
>    "An Ada implementation shall accept Ada source code in UTF-8
>    encoding, with or without a BOM (see A.4.11), where every character
>    is represented by its code point."
>
> which for GNAT is not met unless either there is a BOM or -gnatW8 is
> used.

Which sounds perfectly okay.  There are no limitations to which
command-line arguments a program can require to behave like an Ada
compiler.

> On the other hand, ARM 2.1(4/3) says "The coded representation for
> characters is implementation defined", which seems to conflict with
> (16) - but then, the AARM ramification (4.b/2) notes that the rule
> doesn't have much force!

That sounds like the classical wording.
I suppose that the intent is that UTF-8 encoded ISO-10646 (in the right
normalization form) _has_ to be supported, but that any other encoding
is allowed in addition to that.  It would of course be nice if that was
also what the ARM actually said.

Greetings,

Jacob
-- 
"Only Hogwarts students really need spellcheckers"
                                    -- An anonymous RISKS reader