From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00
	autolearn=unavailable autolearn_force=no version=3.4.4
Path: 
 eternal-september.org!reader01.eternal-september.org!reader02.eternal-september.org!news.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Simon Wright <simon@pushface.org>
Newsgroups: comp.lang.ada
Subject: Re: GNAT vs UTF-8 source file names
Date: Thu, 06 Jul 2017 19:43:49 +0100
Organization: A noiseless patient Spider
Message-ID: <lyfue91k4a.fsf@pushface.org>
References: <lytw55kei5.fsf@pushface.org> <lyefuia5ur.fsf@pushface.org>
	<lyeftw2tlc.fsf@pushface.org> <ojhspu$sb2$1@dont-email.me>
	<ly60f72p1g.fsf@pushface.org> <ojihrl$qu2$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: mx02.eternal-september.org;
 posting-host="6cd72f177e9a97330d7091eb0fb69bbb";
	logging-data="20719"; mail-complaints-to="abuse@eternal-september.org";
	posting-account="U2FsdGVkX19kzAK8DegKdz7rOpp6dEo12DhEkVYlGKU="
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (darwin)
Cancel-Lock: sha1:KMF/u+n5bvQ0bHkvdCKEjTsu9no=
	sha1:i/N34bj9iXs7gA1/7PrZsKb6gNM=
Xref: news.eternal-september.org comp.lang.ada:47307
Date: 2017-07-06T19:43:49+01:00
List-Id: <comp.lang.ada>

"J-P. Rosen" <rosen@adalog.fr> writes:

>> GNAT uses this if
>> either you compile with -gnatW8 or the file begins with a UTF8 BOM.
> Actually, this has nothing to do with encoding or coded character sets.
> Even if you use Latin-1, the set of allowed characters is defined as
> those that belong to NFKC.

I don't understand.

If your source has no BOM and you don't say -gnatW8, GNAT expects
Latin-1 encoding. If your source has a BOM or you say -gnatW8, GNAT
expects UTF-8 encoding (I haven't tried what happens if you use NFD).
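
To make the rule above concrete, here is a rough Python sketch of it as I
understand it. It is not GNAT's actual code: the function name
expected_encoding and the gnatw8 parameter are made up for illustration
and only mirror the behaviour described in the paragraph above. The last
few lines show what NFD means for a character like e-acute, which is the
case I haven't tried.

```python
import codecs
import unicodedata


def expected_encoding(source: bytes, gnatw8: bool = False) -> str:
    """Hypothetical helper: which encoding the compiler is assumed to expect.

    Mirrors the rule described in the post: UTF-8 if -gnatW8 is given or
    the file starts with a UTF-8 BOM, otherwise Latin-1.
    """
    if gnatw8 or source.startswith(codecs.BOM_UTF8):
        return "utf-8"
    return "latin-1"


# The same visible character in two Unicode normalization forms.
nfc = unicodedata.normalize("NFC", "\u00e9")  # one code point, U+00E9
nfd = unicodedata.normalize("NFD", "\u00e9")  # 'e' followed by U+0301

# Both are valid UTF-8, but the byte sequences differ.
nfc_bytes = nfc.encode("utf-8")
nfd_bytes = nfd.encode("utf-8")
```

So an identifier typed in NFD is a different sequence of code points from
the same identifier in NFC, even though both encode cleanly as UTF-8.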

I haven't tried giving UTF-8 source without a BOM or -gnatW8 (ignoring
the use in unit names); ARM 2.1(16) says it should be accepted.

(later) Without a BOM or -gnatW8, UTF-8 is accepted in strings but not
in identifiers.