From mboxrd@z Thu Jan  1 00:00:00 1970
Received: by 10.68.8.135 with SMTP id r7mr2662436pba.8.1318452580759;
        Wed, 12 Oct 2011 13:49:40 -0700 (PDT)
Path: 
 d5ni2852pbc.0!nntp.google.com!news1.google.com!border1.nntp.dca.giganews.com!nntp.giganews.com!feedme.ziplink.net!news.swapon.de!aioe.org!.POSTED!not-for-mail
From: "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de>
Newsgroups: comp.lang.ada
Subject: Re: sharp ß and ss in Ada keywords like ACCESS
Date: Wed, 12 Oct 2011 22:48:25 +0200
Organization: cbb software GmbH
Message-ID: <jzbw65n7sj1o.1c75ryih8kppi$.dlg@40tude.net>
References: <4e931db5$0$6541$9b4e6d93@newsspool4.arcor-online.net>
 <1f9a5099-f5f5-49a8-8773-b7eaca771427@s5g2000pra.googlegroups.com>
 <4e93381d$0$6545$9b4e6d93@newsspool4.arcor-online.net>
 <op.v2661evjz25lew@macpro-eth1.krischik.com>
 <4e959011$0$6627$9b4e6d93@newsspool2.arcor-online.net>
 <4r1gqrovnlyw$.u64367deu6pt$.dlg@40tude.net>
 <4e95db62$0$6554$9b4e6d93@newsspool4.arcor-online.net>
Reply-To: mailbox@dmitry-kazakov.de
NNTP-Posting-Host: EMY6V9w2JsuJ/8EEiAFEEw.user.speranza.aioe.org
Mime-Version: 1.0
X-Complaints-To: abuse@aioe.org
User-Agent: 40tude_Dialog/2.0.15.1
X-Notice: Filtered by postfilter v. 0.8.2
Xref: news1.google.com comp.lang.ada:18426
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
Date: 2011-10-12T22:48:25+02:00
List-Id: <comp.lang.ada>

On Wed, 12 Oct 2011 20:24:33 +0200, Georg Bauhaus wrote:

> On 12.10.11 15:48, Dmitry A. Kazakov wrote:
>> On Wed, 12 Oct 2011 15:03:13 +0200, Georg Bauhaus wrote:
>> 
>>> But I imagine a language rule that addresses common sense
>>> more than it does the mechanics of Unicode or the history
>>> of writing; it might even be easy to implement:
>> 
>> Speaking of common sense one should simply drop ß and all other letters not
>> present in 7-bit ASCII.
> 
> (Why character case? Let's save bits by dropping small letters. ;-)

This is what Ada 83 did, being case-insensitive.
 
>> If ß=ss, then sch=sh, when matching two
>> simple names of different alphabets. How are you going to tag names?
>> 
>>    German#acceß#  
>>    US#access#
>> 
>> (:-))
> 
> The "alphabet" of both "access" and "acceß" (Horrible!) shall
> be "Latin", see below.  Thus "access" is not Greek, and
> "acceβ" will be an error, because it mixes two "alphabets",
> Latin and Greek.

ß has nothing to do with the Greek alphabet; it is a ligature promoted to a
separate character.

> The compiler will detect the syntax error.

That was not my question. It was what to do with this:

   type Acceß_Type is access Integer;
   type Access_Type is access String;

Do these identifiers conflict? And what about these two:

   I : Integer; -- Latin I
   І : Integer; -- Ukrainian I
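
To make the question concrete, here is a minimal sketch in Python (not Ada),
assuming that "matching" means Unicode full case folding as implemented by
str.casefold. The helper name same_identifier is hypothetical, and nothing
here describes what an Ada compiler actually does; it only shows what one
possible rule would answer for the two pairs above.

```python
import unicodedata


def same_identifier(a: str, b: str) -> bool:
    """Compare two names under Unicode full case folding (an assumed rule)."""
    return a.casefold() == b.casefold()


# Full case folding maps ß to "ss", so under this rule the two type names collide.
print(same_identifier("Acceß_Type", "Access_Type"))

# Latin I (U+0049) and Cyrillic І (U+0406) look the same on screen,
# but they are different code points and case folding never unifies them.
latin_i, cyrillic_i = "I", "\u0406"
print(same_identifier(latin_i, cyrillic_i))
print(unicodedata.name(latin_i))
print(unicodedata.name(cyrillic_i))
```

So under this assumed rule the first pair conflicts while the second pair
does not, although a reader cannot tell the second pair apart.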

>>> Presuming some practical definition of "alphabet".
>> 
>> For example?
> 
> I'd try a KISS definition of "alphabet". It does not involve
> national languages, or meaning.
> 
> - Latin characters

Is ö Latin? Are k, u, w Latin? BTW, Latin script was all upper case.

> - Cyrillic characters

"Cyrillic characters" is a wild mixture of various characters and
ligatures (like German ß) from different national Cyrillic alphabets,
with borrowings from Greek, Latin and later inventions. There is no reason
to treat combinations of those as something cohesive.

> But this should be fairly easy
> to implement,

It is not about implementation, it is about understanding the rules without
looking into the categorization tables.

BTW, why should "ΔT" be illegal?
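
For illustration, a rough sketch in Python of the kind of "one alphabet per
identifier" check under discussion. It assumes, as a crude heuristic, that
the first word of a character's Unicode name (LATIN, GREEK, CYRILLIC) stands
in for its "alphabet"; Python's standard library has no proper script
property, and the names scripts_used and mixes_scripts are hypothetical.

```python
import unicodedata


def scripts_used(identifier: str) -> set:
    """Guess the scripts in a name from the first word of each letter's Unicode name."""
    return {
        unicodedata.name(ch).split()[0]
        for ch in identifier
        if ch.isalpha()  # skip underscores and digits
    }


def mixes_scripts(identifier: str) -> bool:
    """True if the letters of the name come from more than one guessed script."""
    return len(scripts_used(identifier)) > 1


for name in ["access", "acceß", "acceβ", "ΔT"]:
    print(name, sorted(scripts_used(name)), mixes_scripts(name))
```

Such a check rejects "acceβ" and accepts "acceß", but it rejects "ΔT" just
the same, and the programmer still has to consult the tables to learn why.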

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de