comp.lang.ada
From: Georg Bauhaus <rm.dash-bauhaus@futureapps.de>
Subject: Re: strange behaviour of utf-8 files
Date: Mon, 18 Nov 2013 09:38:06 +0100
Date: 2013-11-18T09:37:59+01:00
Message-ID: <5289d1e7$0$6643$9b4e6d93@newsspool2.arcor-online.net>
In-Reply-To: <10ec0vuld83fy.1t7bduzwsrfe.dlg@40tude.net>

On 17.11.13 21:38, Dmitry A. Kazakov wrote:
> The problem is that the common part (ASCII) is sufficient for Ada
> programming while the varying part is subtle enough to cause difficult to
> detect bugs in string literals. Bugs that cannot be detected by the
> compiler.

UTF-8 can in fact be checked (and typical implementations do check it) in
such a way that accidentally mistaking some octets of a string literal for
Latin-1 coded characters is impossible. This is a consequence of the design
of UTF-8, as you know: the {1}+0 prefix rules, under which every octet of a
multi-octet sequence begins with one or more 1 bits followed by a 0 bit.
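
To make the prefix rules concrete, here is a minimal sketch of such a check. The names Check_Prefixes and Looks_Like_UTF_8 are made up for illustration and are not taken from any compiler; the function looks only at the lead and continuation octet patterns and does not reject overlong forms or surrogates.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Check_Prefixes is

   --  Hypothetical helper: True if every octet of S fits the UTF-8
   --  prefix pattern (lead octet, then the announced number of
   --  10xxxxxx continuation octets).
   function Looks_Like_UTF_8 (S : String) return Boolean is
      I         : Integer := S'First;
      Octet     : Natural;
      Following : Natural;
   begin
      while I <= S'Last loop
         Octet := Character'Pos (S (I));
         if Octet < 16#80# then
            Following := 0;   --  0xxxxxxx: plain ASCII
         elsif Octet in 16#C2# .. 16#DF# then
            Following := 1;   --  110xxxxx
         elsif Octet in 16#E0# .. 16#EF# then
            Following := 2;   --  1110xxxx
         elsif Octet in 16#F0# .. 16#F4# then
            Following := 3;   --  11110xxx
         else
            return False;     --  stray continuation or invalid lead octet
         end if;

         if Following > S'Last - I then
            return False;     --  sequence is cut off by the end of S
         end if;

         for K in I + 1 .. I + Following loop
            if Character'Pos (S (K)) not in 16#80# .. 16#BF# then
               return False;  --  expected a 10xxxxxx continuation octet
            end if;
         end loop;

         I := I + Following + 1;
      end loop;
      return True;
   end Looks_Like_UTF_8;

   --  "ó" as two UTF-8 octets, and as a single Latin-1 octet.
   O_Acute_UTF_8   : constant String :=
     Character'Val (195) & Character'Val (179);
   O_Acute_Latin_1 : constant String := (1 => Character'Val (243));

begin
   Put_Line (Boolean'Image (Looks_Like_UTF_8 ("abc" & O_Acute_UTF_8)));
   Put_Line (Boolean'Image
     (Looks_Like_UTF_8 ("abc" & O_Acute_Latin_1 & "def")));
end Check_Prefixes;
```

The second call shows the point: a lone Latin-1 octet 243 reads as a lead octet that announces three continuation octets, and the ASCII letters that follow do not carry the 10 prefix, so the text is rejected.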

Actually, a compiler---GNAT having a helpful spell checker already---could
detect occurrences in string literals of

    String'(N   => Character'Val (195),
            N+1 => Character'Val (179))

as very likely being the valid UTF-8 sequence representing "ó". It could
then emit a warning saying that the source text might be UTF-8 rather than
Latin-1, and suggest a compiler switch accordingly. Of course, the presence
of a BOM would add further support to this warning.
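
A rough sketch of that heuristic, applied to a literal that has been read as Latin-1. Has_UTF_8_Pair and the warning text are invented for illustration; this is not how GNAT behaves today, and only two-octet sequences are considered.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Warn_On_Suspect_Literal is

   --  Hypothetical heuristic: True if Literal contains a two-octet
   --  UTF-8 lead (110xxxxx) immediately followed by a continuation
   --  octet (10xxxxxx).
   function Has_UTF_8_Pair (Literal : String) return Boolean is
   begin
      for N in Literal'First .. Literal'Last - 1 loop
         if Character'Pos (Literal (N)) in 16#C2# .. 16#DF#
           and then Character'Pos (Literal (N + 1)) in 16#80# .. 16#BF#
         then
            return True;
         end if;
      end loop;
      return False;
   end Has_UTF_8_Pair;

   --  Octets 195 and 179 are "ó" in UTF-8, but two separate
   --  characters when the source is taken to be Latin-1.
   Suspect : constant String :=
     "Stra" & Character'Val (195) & Character'Val (179) & "w";

begin
   if Has_UTF_8_Pair (Suspect) then
      Put_Line ("warning: literal may be UTF-8 rather than Latin-1");
   end if;
end Warn_On_Suspect_Literal;
```

Since such a pair is also legal Latin-1, this can only ever be a warning, never an error.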

