From: anon@att.net
Newsgroups: comp.lang.ada
Subject: Re: Why no Ada.Wide_Directories?
Date: Thu, 27 Oct 2011 17:40:30 +0000 (UTC)
Organization: Aioe.org NNTP Server
References: <9937871.172.1318575525468.JavaMail.geo-discussion-forums@prib32>
Reply-To: anon@anon.org

Here is a reason from a link at Unicode.org:

  http://www.cl.cam.ac.uk/~mgk25/unicode.html

"...An ASCII or Latin-1 file can be transformed into a UCS-2 file by
simply inserting a 0x00 byte in front of every ASCII byte. If we want
to have a UCS-4 file, we have to insert three 0x00 bytes instead before
every ASCII byte. Using UCS-2 (or UCS-4) under Unix would lead to very
severe problems. Strings with these encodings can contain as parts of
many wide characters bytes like "\0" or "/" which have a special
meaning in filenames and other C library function parameters. In
addition, the majority of UNIX tools expects ASCII files and cannot
read 16-bit words as characters without major modifications.
For these reasons, UCS-2 is not a suitable external encoding of Unicode
in filenames, text files, environment variables, etc."

So Wide_Character could cause problems in other parts of the OS or the
Ada/C libraries.

And Ada does have "Safety and Security" concerns, such as paragraph 4
of Annex H:

    4   Restricting language constructs whose usage might complicate
        the demonstration of program correctness

Plus, the goal of "reliability, maintainability, and efficiency" could
not be kept if Ada.Directories used Wide_Character, because storing
names as Wide_Character, whether 16-bit or 32-bit, is not as efficient
as 8-bit storage for filenames.

Just think about the old simple 8 by 3 character file names. In
Wide_Characters that would minimally be 16 by 6 bytes (UCS-2) or even
32 by 12 bytes (UCS-4). That means searching and comparing names could
take 2 to 4 times longer and need 2 or 4 times more storage for the
name, which is less efficient.

A quick note on maintainability: how many systems will actually be
using 16-bit or 32-bit Unicode for their filenames?

So, for reliability and efficiency, Wide_Character should be kept to
the routines and data that require the additional storage to be
accurate, not applied to files, which are already penalized because
they normally reside on slower-access media. Adding more time defeats
the purpose of a timely, reliable program.

In <9937871.172.1318575525468.JavaMail.geo-discussion-forums@prib32>,
Michael Rohan writes:
>Hi,
>
>I've been working a little on accessing files and directories using
>Ada.Directories and have been using a thin wrapper layer to convert
>from Wide_String to UTF8 and back. It does, however, seem strange
>there is no Wide_Directories version in the std library. Was there a
>technical reason it wasn't included?
>
>Take care,
>Michael
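
For anyone who wants to check the byte-level points above, here is a
minimal sketch in Python (used only for illustration, it is not Ada and
not part of any library discussed here). It assumes UTF-16BE and
UTF-32BE as stand-ins for UCS-2 and UCS-4, which agree for the
characters used, and the file name and helper names are hypothetical.

```python
# Sketch only: UTF-16BE / UTF-32BE stand in for UCS-2 / UCS-4 (they agree
# for the Basic Multilingual Plane characters used here).

def encoded_sizes(name: str) -> dict:
    """Return the byte length of `name` in 8-bit, 16-bit and 32-bit encodings."""
    return {
        "latin-1": len(name.encode("latin-1")),
        "ucs-2": len(name.encode("utf-16-be")),
        "ucs-4": len(name.encode("utf-32-be")),
    }

def special_bytes(text: str, encoding: str) -> dict:
    """Count encoded bytes that C/Unix filename APIs treat specially."""
    data = text.encode(encoding)
    return {"nul": data.count(0x00), "slash": data.count(0x2F)}

# Hypothetical 8-by-3 name: 8 + 3 = 11 characters (the dot is implied).
sizes = encoded_sizes("FILENAME" + "TXT")

# U+012F (LATIN SMALL LETTER I WITH OGONEK) is bytes 0x01 0x2F in UTF-16BE,
# so its second byte looks like "/" to byte-oriented code.
ogonek = special_bytes("\u012f", "utf-16-be")

# Plain ASCII "A" becomes 0x00 0x41, so it gains an embedded NUL byte.
ascii_a = special_bytes("A", "utf-16-be")

print(sizes, ogonek, ascii_a)
```

The sizes only confirm the 2x and 4x storage figures. Whether searching
and comparing slow down by the same factor depends on the
implementation and is not measured here.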