From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-0.3 required=5.0 tests=BAYES_00,
	REPLYTO_WITHOUT_TO_CC autolearn=no autolearn_force=no version=3.4.4
Path: 
 eternal-september.org!reader01.eternal-september.org!reader02.eternal-september.org!news.eternal-september.org!mx02.eternal-september.org!feeder.eternal-september.org!aioe.org!.POSTED!not-for-mail
From: "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de>
Newsgroups: comp.lang.ada
Subject: Re: Exclusive file access
Date: Tue, 1 Sep 2015 09:26:57 +0200
Organization: cbb software GmbH
Message-ID: <qj67bcoxm9k6.110vqsia49goc$.dlg@40tude.net>
References: <eec18573-876b-4151-b13a-15a103bba30e@googlegroups.com>
 <75714e3f-c047-413d-9aa5-3ff423167863@googlegroups.com>
 <nrnswuq80yyi$.uf01xt8a4dz5.dlg@40tude.net>
 <1440837116.20971.33.camel@obry.net>
 <lnt8n39l9dme$.1lq59r9ar72t1$.dlg@40tude.net>
 <87oahpovpn.fsf@mid.deneb.enyo.de>
 <mvil865iebyb$.1of2shk5faacq$.dlg@40tude.net>
 <87y4gsmut1.fsf@mid.deneb.enyo.de>
 <p4xiiq77lt19$.1ba1le2pivgsg.dlg@40tude.net>
 <87zj179n7n.fsf@mid.deneb.enyo.de>
Reply-To: mailbox@dmitry-kazakov.de
NNTP-Posting-Host: jSS3it0g+GyWYSMU5pi+5g.user.speranza.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Complaints-To: abuse@aioe.org
User-Agent: 40tude_Dialog/2.0.15.1
X-Notice: Filtered by postfilter v. 0.8.2
Xref: news.eternal-september.org comp.lang.ada:27665
Date: 2015-09-01T09:26:57+02:00
List-Id: <comp.lang.ada>

On Mon, 31 Aug 2015 23:12:28 +0200, Florian Weimer wrote:

> * Dmitry A. Kazakov:
> 
>>>> Semantically no. Wide_String according to RM 3.5.2 (3/3) represents a
>>>> narrower set of Unicode than UTF-16.
>>> 
>>> I don't have a current Windows system to try this, but I think Windows
>>> allows you to use lone surrogates in file names.  Such names are not
>>> valid UTF-16, but valid UCS-2.
>>
>> The system may have integrated AI that accepts names in English: "a file
>> with the name of Swahili dhadi". Would it make ASCII same as Unicode?
> 
> Sorry, there is no need for being silly.

It is not silly. It is the difference between the semantics of a type and
the possibility of misusing the bit patterns of its value representation for
something else. You could put a whole system kernel into a string. That
would not make characters machine instructions.

> Non-encodable file names are
> definite problems and happen in practice (see Java programs on
> non-Windows platforms in a multi-byte locale).

Anything you can encode in Unicode you can encode in Unicode.

> The user may select a
> file, but the application cannot open it.  That's a poor user
> experience.

That is not a problem at all. You cannot create a 999 TB file either.
System-specific constraints imposed on an implementation do not affect the
interface, which already has Name_Error in it.

>>> You can end up with non-expressible names, depending on how the
>>> conversion to the external representation is performed.  (I.e., the
>>> system may have file names which cannot be encoded as
>>> Wide_Wide_String.)
>>
>> Wasn't the purpose of Unicode to represent all possible characters?
> 
> This discussion isn't about characters, it's about conversions for
> sequences of code units which do not quite match the (current) Unicode
> specifications.

String is an array of characters. Anything that is not a sequence of
characters is not a sequence of characters.
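
To make the distinction between code units and characters concrete, here is
a minimal sketch in Python rather than Ada. The function name and the sample
values are made up for illustration. It checks whether a sequence of 16-bit
code units is well-formed UTF-16. A lone surrogate is a perfectly good 16-bit
value, yet it does not denote a character:

```python
def is_well_formed_utf16(units):
    """Return True if the 16-bit code units form well-formed UTF-16.

    Hypothetical helper for illustration only; 'units' is a sequence of
    integers in the range 0 .. 16#FFFF#.
    """
    i = 0
    while i < len(units):
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:
            # A high surrogate must be followed by a low surrogate.
            if i + 1 >= len(units) or not (0xDC00 <= units[i + 1] <= 0xDFFF):
                return False
            i += 2
        elif 0xDC00 <= u <= 0xDFFF:
            # A low surrogate without a preceding high surrogate.
            return False
        else:
            i += 1
    return True


# 'A', an unpaired high surrogate, 'B': fine as raw 16-bit units,
# but not a sequence of characters.
lone_surrogate = [0x0041, 0xD800, 0x0042]

# A properly paired high and low surrogate, encoding U+1F600.
surrogate_pair = [0xD83D, 0xDE00]

print(is_well_formed_utf16(lone_surrogate))
print(is_well_formed_utf16(surrogate_pair))
```

The first sequence can be stored in an array of 16-bit elements, which is all
that a file system accepting such names requires. That says nothing about
whether the array holds characters.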

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de