"Dmitry A. Kazakov" wrote in message news:15jxp8z1iu5fk.1oeihvavjghgg$.dlg@40tude.net... > On 27 Dec 2006 11:06:36 -0800, Hyman Rosen wrote: > > > Dmitry A. Kazakov wrote: > >> The "encoding language" is outside the programming language, > >> so it is not the language problem > > > > Remember that Ada wishes to be case-insensitive, > > That's no problem in a closed alphabet, like English/Latin. > > > so it cannot ignore > > Unicode issues if it wishes to allow Unicode characters in identifiers. > > Which is a BAD idea, IMO. > > We cannot know anything about properties of letters in Klingon. As a > practical example consider Russian where e can be used (and is) in place of > ? see (http://en.wikipedia.org/wiki/%D0%81), but not reverse. Or, maybe we > should make Ada compilers capable to detect program written by Germans to > consider � and ue same? Should we handle diacritical vowel points of Hebrew > as well? What about parsing the source right to left, or top to bottom? The Unicode standard has grappled with these issues and produced results which are useful for the vast majority of languages. Surely Ada is not going to repeat that work (and arguments). And Ada is not going to drop case insensitivity and start claiming that "this" and "This" are somehow different. > > Not to mention "normalization form KC". > > They reap what they sowed. Should Ada or C++ go into that mess? Well, that's irrelevant because they have. Ada 2005 says that the semantics of a program not in Normalization form KC are implementation-defined. (2.1(4.1/2)). That was done because there was concern about programs that are represented differently being treated the same (we originally considered requiring converting into that form). Similarly, upper case conversion is defined by various Unicode properties (such as Upper Case Mapping) (2.1(5/2)). It should be noted that such conversions aren't necessarily reversible, but that's irrelevant to identifier equivalence. 
Identifier equivalence is defined in 2.3(5-5.3/2). This is more complicated than the English-only definition, but it was thought to be mandatory to get approval of a new standard. (This sort of internationalization is being required of all languages: C++ has a number of proposals on the table for handling this as well.) It's also a ramification of case insensitivity - the only alternative would be to completely abandon it, and that would be very bad for compatibility with Ada 95.

Randy.