"Dmitry A. Kazakov" wrote in message news:1a9k0vk46bqrq.1cx6cdld0wd9f$.dlg@40tude.net... > On Fri, 29 Dec 2006 20:25:28 -0600, Randy Brukardt wrote: > > > For what it's worth, Ada says that all three of these represent the same > > identifier. That's not ideal, but it's the best that we can do without > > dropping into the character handling mess ourselves. > > > > This is even more interesting when you consider that there are alternative > > spellings for reserved words. For instance "acce�" is identical to "access". > > (See 2.3(5.c/2) in the AARM for more examples). We wrestled with that quite > > a while before deciding that such identifiers had to be illegal > > (2.3(5.3/2)); we didn't want them appearing in programs in place of reserved > > words. > > Yuck. Would "acce?" with Greek beta (?) and "if" with Cyrillic ? in it be > valid identifiers? Sure, the upper case of a Greek beta is still a Greek beta, it's not "SS" (and doesn't look anything like "ss", either). I don't know much about Cyrillic, so I don't know the answer to that (but I suspect you do). I would guess that you'll want some external style rules to prevent bogus mixing of letters from different character sets. That's not any worse that the style rules for capitalization and indentation that Gnat can enforce. I've always limited myself to using the characters commonly available on Windows systems (roughly 680 glyphs), and there needs to be something that checks for use of letters that won't necessarily display well. But all of that is outside of the language. It should be pointed out that one of the reasons for Ada's support of Unicode is that we had a long discussion of how to support Latin-9 (which contains the euro symbol). Eventually, we decided that that way lies madness - at least by using Unicode, there is only one definition to worry about, rather than a set of them. 
My only regret is that we didn't find a way to include real runtime UTF-8 support in the language: it's wasteful to store everything as 32-bit characters.

Randy.
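
As a rough illustration of two points in the message above, here is a small sketch in Python rather than Ada. It assumes that Python's str.upper(), which applies Unicode's full case mappings, is a fair stand-in for the upper-case conversion being discussed; it is not the Ada rule itself, and the helper names same_identifier and storage_bytes are made up for this example. The first helper shows why the sharp-s spelling collides with "access" while the Greek-beta spelling does not; the second compares the storage cost of UTF-8 with that of 32-bit characters for plain ASCII text.

```python
# Illustration only: Python's case mapping stands in for the upper-case
# conversion discussed in the post. This is an assumption, not Ada's rule.

SHARP_S = "\u00df"      # LATIN SMALL LETTER SHARP S
GREEK_BETA = "\u03b2"   # GREEK SMALL LETTER BETA


def same_identifier(a: str, b: str) -> bool:
    """Hypothetical helper: treat two spellings as equal if their
    upper-case forms are identical."""
    return a.upper() == b.upper()


def storage_bytes(text: str) -> tuple:
    """Return (UTF-8 size, UTF-32 size) in bytes, with no byte-order mark."""
    return len(text.encode("utf-8")), len(text.encode("utf-32-le"))


if __name__ == "__main__":
    # Sharp s upper-cases to "SS", so this spelling collides with "access".
    print(same_identifier("acce" + SHARP_S, "access"))     # True
    # Greek beta upper-cases to Greek capital beta, so there is no collision.
    print(same_identifier("acce" + GREEK_BETA, "access"))  # False
    # Six ASCII characters: 6 bytes as UTF-8, 24 bytes as 32-bit characters.
    print(storage_bytes("access"))                         # (6, 24)
```

For ASCII-heavy text the 32-bit form is four times larger, which is the waste the last paragraph refers to.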