"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> wrote in message
news:15jxp8z1iu5fk.1oeihvavjghgg$.dlg@40tude.net...
> On 27 Dec 2006 11:06:36 -0800, Hyman Rosen wrote:
>
> > Dmitry A. Kazakov wrote:
> >> The "encoding language" is outside the programming language,
> >> so it is not the language problem
> >
> > Remember that Ada wishes to be case-insensitive,
>
> That's no problem in a closed alphabet, like English/Latin.
>
> > so it cannot ignore
> > Unicode issues if it wishes to allow Unicode characters in identifiers.
>
> Which is a BAD idea, IMO.
>
> We cannot know anything about properties of letters in Klingon. As a
> practical example consider Russian where е can be used (and is) in place
> of ё see (http://en.wikipedia.org/wiki/%D0%81), but not reverse. Or, maybe
> we should make Ada compilers capable to detect program written by Germans
> to consider ü and ue same? Should we handle diacritical vowel points of
> Hebrew as well? What about parsing the source right to left, or top to
> bottom?

The Unicode standard has grappled with these issues and produced results
which are useful for the vast majority of languages. Surely Ada is not going
to repeat that work (and arguments). And Ada is not going to drop case
insensitivity and start claiming that "this" and "This" are somehow
different.

> > Not to mention "normalization form KC".
>
> They reap what they sowed. Should Ada or C++ go into that mess?

Well, that's irrelevant because they have. Ada 2005 says that the semantics
of a program not in Normalization Form KC are implementation-defined
(2.1(4.1/2)). That was done because of concern about programs that are
represented differently being treated as the same (we originally considered
requiring conversion into that form).
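
To make the "represented differently" concern concrete, here is a rough
sketch in Python (not Ada), using the standard unicodedata module. The
helper name is_nfkc and the sample strings are made up for illustration;
nothing here is taken from the Ada standard itself.

```python
import unicodedata


def is_nfkc(text: str) -> bool:
    """Return True if text is already in Normalization Form KC."""
    return unicodedata.normalize("NFKC", text) == text


# Two spellings that display identically but differ as code point sequences.
composed = "caf\u00e9"     # 'é' as a single code point
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent

# The raw sequences are different...
print(composed == decomposed)
# ...only the first one is already in Normalization Form KC...
print(is_nfkc(composed), is_nfkc(decomposed))
# ...and normalizing the second one yields the first.
print(unicodedata.normalize("NFKC", decomposed) == composed)
```

A source file containing the second spelling is the kind of program whose
semantics are left implementation-defined.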

Similarly, upper case conversion is defined by various Unicode properties
(such as Upper Case Mapping) (2.1(5/2)). It should be noted that such
conversions aren't necessarily reversible, but that's irrelevant to
identifier equivalence. Identifier equivalence is defined in 2.3(5-5.3/2).
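
A small sketch of why reversibility doesn't matter, again in Python rather
than Ada. The function same_identifier is a hypothetical stand-in, and
Python's str.upper() is only an approximation: its case mapping is not
guaranteed to match the rules in 2.3(5-5.3/2).

```python
def same_identifier(a: str, b: str) -> bool:
    """Treat two names as equivalent if their upper-case forms match.

    Hypothetical helper: a rough analogue of case-insensitive identifier
    equivalence, not the exact rule from the Ada standard.
    """
    return a.upper() == b.upper()


# Plain ASCII case differences are ignored, as in Ada 95.
print(same_identifier("this", "This"))

# Upper-casing is not reversible: dotless i (U+0131) maps to 'I',
# but lower-casing 'I' gives the ordinary dotted 'i'.
dotless = "\u0131"
print(dotless.upper().lower() == dotless)

# Equivalence only ever compares the converted forms, so the missing
# round trip is harmless here.
print(same_identifier(dotless, "I"))
```

The comparison never needs to map back to the original spelling, which is
all that identifier equivalence requires.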

This is more complicated than the English-only definition, but it was
thought to be mandatory to get approval of a new standard. (This sort of
internationalization is being required of all languages: C++ has a number of
proposals on the table for handling this as well.) It's also a ramification
of case insensitivity - the only alternative would be to completely abandon
it, and that would be very bad for compatibility with Ada 95.

                                    Randy.