comp.lang.ada
From: "J-P. Rosen" <rosen@adalog.fr>
Subject: Re: GNAT vs UTF-8 source file names
Date: Fri, 7 Jul 2017 10:26:08 +0200
Message-ID: <ojngbr$rcp$1@dont-email.me>
In-Reply-To: <lyfue91k4a.fsf@pushface.org>

On 06/07/2017 at 20:43, Simon Wright wrote:
>> Even if you use Latin-1, the set of allowed characters is defined as
>> those that belong to NFKC.
> I don't understand.
> 
> If your source has no BOM and you don't say -gnatW8, GNAT expects
> Latin-1 encoding. If your source has a BOM or you say -gnatW8, GNAT
> expects UTF8 encoding (I haven't tried what happens if you use NFD).
> 
> I haven't tried giving UTF8 coding without BOM or -gnatW8 - ignoring the
> use in unit names - ARM 2.1(16) says it should be accepted.
> 
> (later) UTF8 is accepted in strings but not in identifiers.

This is a common confusion between characters, coded sets, and encodings...

ISO/IEC 10646 defines a coded character set (a code point for each of a
number of characters), identical to the one defined by Unicode. Some of
these characters can be represented in NFKC (Normalization Form KC).
These are the allowed characters.
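
As a rough illustration of "representable in NFKC", here is a minimal Python sketch. It assumes one reading of that phrase, namely that a character is left unchanged by NFKC normalization; the helper name `is_nfkc_stable` is made up for this example, and none of this claims to be what GNAT does internally.

```python
import unicodedata


def is_nfkc_stable(ch: str) -> bool:
    """Return True if NFKC normalization leaves the character unchanged.

    Hypothetical helper: "unchanged by NFKC" is an assumed interpretation
    of "representable in NFKC", not a definition taken from the ARM.
    """
    return unicodedata.normalize("NFKC", ch) == ch


# U+00E9 (e with acute accent) is already in NFKC form.
print(is_nfkc_stable("\u00e9"))

# U+FB01 (the "fi" ligature) is a compatibility character:
# NFKC replaces it with the two letters "f" and "i".
print(is_nfkc_stable("\ufb01"))
```

The check works on characters (code points) only; how the source file was encoded on disk plays no part in it.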

If you use Latin-1, you have a different coded set (its 256 code values
happen to coincide with the first 256 code points of ISO/IEC 10646) - and
the allowed characters are still those representable in NFKC, whatever
the code values used.

UTF-8 is an encoding, nothing more than a variable-length way of storing
numerical values as bytes. It is generally used to encode Unicode strings,
but the same scheme could in principle be applied to other numerical
values. In any case, it doesn't change logical values, just the way they
are stored.
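
To make the distinction between logical values and stored bytes concrete, here is a small Python sketch (the sample string is an arbitrary choice for illustration). The same characters are stored once as Latin-1 and once as UTF-8; the bytes differ, while the decoded code points do not.

```python
text = "caf\u00e9"  # four characters; the last one is U+00E9

latin1_bytes = text.encode("latin-1")  # one byte per character
utf8_bytes = text.encode("utf-8")      # U+00E9 is stored as two bytes

# The stored bytes differ between the two encodings...
print(latin1_bytes.hex())
print(utf8_bytes.hex())

# ...but decoding each byte string with its own encoding
# gives back exactly the same code points.
code_points = [ord(c) for c in text]
assert [ord(c) for c in latin1_bytes.decode("latin-1")] == code_points
assert [ord(c) for c in utf8_bytes.decode("utf-8")] == code_points
```

Decoding the UTF-8 bytes as if they were Latin-1 (or the reverse) is where the trouble starts, which is why the compiler has to be told, by a BOM or a switch, which encoding to expect.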


-- 
J-P. Rosen
Adalog
2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX
Tel: +33 1 45 29 21 52, Fax: +33 1 45 29 25 00
http://www.adalog.fr
