From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-0.3 required=5.0 tests=BAYES_00,FREEMAIL_FROM,
	REPLYTO_WITHOUT_TO_CC autolearn=no autolearn_force=no version=3.4.4
X-Google-Thread: 103376,5bcc293dc5642650
X-Google-NewGroupId: yes
X-Google-Attributes: gida07f3367d7,domainid0,public,usenet
X-Google-Language: ENGLISH,ASCII-7-bit
Received: by 10.68.15.134 with SMTP id x6mr15938259pbc.0.1319737327427;
        Thu, 27 Oct 2011 10:42:07 -0700 (PDT)
MIME-Version: 1.0
Path: 
 p6ni4349pbn.0!nntp.google.com!news1.google.com!goblin3!goblin1!goblin.stu.neva.ru!feeder.news-service.com!aioe.org!.POSTED!not-for-mail
From: anon@att.net
Newsgroups: comp.lang.ada
Subject: Re: Why no Ada.Wide_Directories?
Date: Thu, 27 Oct 2011 17:40:30 +0000 (UTC)
Organization: Aioe.org NNTP Server
Message-ID: <j8c52c$ouh$1@speranza.aioe.org>
References: <9937871.172.1318575525468.JavaMail.geo-discussion-forums@prib32>
Reply-To: anon@anon.org
NNTP-Posting-Host: aWps+rBG+eV0nU4J2KGjtQ.user.speranza.aioe.org
X-Complaints-To: abuse@aioe.org
X-Notice: Filtered by postfilter v. 0.8.2
X-Newsreader: IBM NewsReader/2 2.0
Xref: news1.google.com comp.lang.ada:18725
Date: 2011-10-27T17:40:30+00:00
List-Id: <comp.lang.ada>

Here is a reason from a link at Unicode.org: 
       http://www.cl.cam.ac.uk/~mgk25/unicode.html

    "...An ASCII or Latin-1 file can be transformed into a UCS-2 file by 
    simply inserting a 0x00 byte in front of every ASCII byte. If we 
    want to have a UCS-4 file, we have to insert three 0x00 bytes instead 
    before every ASCII byte.

    Using UCS-2 (or UCS-4) under Unix would lead to very severe problems. 
    Strings with these encodings can contain as parts of many wide 
    characters bytes like "\0" or "/" which have a special meaning in 
    filenames and other C library function parameters. In addition, the 
    majority of UNIX tools expects ASCII files and cannot read 16-bit 
    words as characters without major modifications. For these reasons, 
    UCS-2 is not a suitable external encoding of Unicode in filenames, 
    text files, environment variables, etc."
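
To make the byte-level problem in that quote visible, here is a minimal 
sketch in Python (not Ada, only because it shows the raw bytes easily). 
It assumes big-endian byte order and uses Python's "utf-16-be" and 
"utf-32-be" codecs as stand-ins for UCS-2 and UCS-4, which give the same 
bytes for the characters used here. The helper names are mine, not from 
the quoted page.

```python
# Stand-ins for UCS-2 / UCS-4: identical to UTF-16 / UTF-32 for the
# characters used below (assumption: big-endian, no byte-order mark).
def ucs2_bytes(text: str) -> bytes:
    return text.encode("utf-16-be")


def ucs4_bytes(text: str) -> bytes:
    return text.encode("utf-32-be")


# An ASCII name picks up a 0x00 byte in front of every character.
ascii_name = "a.txt"
print(ucs2_bytes(ascii_name))

# A single non-ASCII character can contain the byte for "/" (0x2F):
# U+012F is encoded as the two bytes 0x01 0x2F.
tricky = "\u012f"
print(b"/" in ucs2_bytes(tricky))
```

A C library routine that treats the name as a NUL-terminated byte string 
would stop at the first 0x00, and a path parser would see a directory 
separator inside what is really one character.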


So Wide_Character could cause problems in other parts of the OS
or the Ada/C libraries. And Ada does have "Safety and Security" 
concerns, such as paragraph 4 of Annex H:

    4  Restricting language constructs whose usage might complicate the
       demonstration of program correctness

Plus, the goal of "reliability, maintainability, and efficiency" could 
not be kept if Ada.Directories used Wide_Character, because storing 
filenames as 16-bit or 32-bit Wide_Characters is not as efficient as 
8-bit storage. Just think about the old simple 8.3 file names (8 
characters plus a 3-character extension). In Wide_Characters that would 
minimally be 16 plus 6 bytes (UCS-2) or even 32 plus 12 bytes (UCS-4). 
That means searching and comparing names could take 2 to 4 times longer 
and need 2 to 4 times more storage for each name, which is less 
efficient. A quick note on maintainability: how many systems will be 
using (16/32-bit) Unicode for their filenames?
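
The storage arithmetic above can be checked with a short sketch, again 
in Python and again assuming "utf-16-be" and "utf-32-be" as stand-ins 
for UCS-2 and UCS-4. The example name and the helper function are 
hypothetical. It only measures size; the 2 to 4 times figure for 
comparison time is an estimate and is not measured here.

```python
def encoded_size(name: str, encoding: str) -> int:
    """Number of bytes needed to store `name` in the given encoding."""
    return len(name.encode(encoding))


# A hypothetical 8.3 name, split into its two parts (dot not counted).
base, ext = "FILENAME", "EXT"

for label, encoding in (
    ("8-bit", "latin-1"),
    ("UCS-2", "utf-16-be"),
    ("UCS-4", "utf-32-be"),
):
    print(label, encoded_size(base, encoding), encoded_size(ext, encoding))

# prints:
# 8-bit 8 3
# UCS-2 16 6
# UCS-4 32 12
```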

So, to be reliable and efficient, Wide_Characters should be kept 
to the routines and data that require the additional storage to be 
accurate, not applied to files, which are already hurt because they are 
normally on slower-access media. And adding more time defeats the 
purpose of a timely, reliable program.


In <9937871.172.1318575525468.JavaMail.geo-discussion-forums@prib32>, Michael Rohan <michael.k.rohan@gmail.com> writes:
>Hi,
>
>I've working a little on accessing files and directories using
>Ada.Directories and have been using a thin wrapper layer to convert from
>Wide_String to UTF8 and back.  It does, however, seem strange there is no
>Wide_Directories version in the std library.  Was there a technical reason
>it wasn't included?
>
>Take care,
>Michael