From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,FREEMAIL_FROM
	autolearn=ham autolearn_force=no version=3.4.4
X-Google-Thread: 103376,5bcc293dc5642650
X-Google-NewGroupId: yes
X-Google-Attributes: gida07f3367d7,domainid0,public,usenet
X-Google-Language: ENGLISH,UTF8
Received: by 10.68.8.135 with SMTP id r7mr2043602pba.8.1318930329818;
        Tue, 18 Oct 2011 02:32:09 -0700 (PDT)
Path: 
 d5ni26288pbc.0!nntp.google.com!news2.google.com!goblin2!goblin.stu.neva.ru!aioe.org!.POSTED!not-for-mail
From: =?utf-8?Q?Yannick_Duch=C3=AAne_=28Hibou57?=
 =?utf-8?Q?=29?=
 <yannick_duchene@yahoo.fr>
Newsgroups: comp.lang.ada
Subject: Re: Why no Ada.Wide_Directories?
Date: Tue, 18 Oct 2011 11:32:07 +0200
Organization: Ada @ Home
Message-ID: <op.v3jjfty0ule2fv@index.ici>
References: <9937871.172.1318575525468.JavaMail.geo-discussion-forums@prib32>
 <418b8140-fafb-442f-b91c-e22cc47f8adb@y22g2000pri.googlegroups.com>
 <j7i6va$nso$1@munin.nbi.dk>
 <7156122c-b63f-487e-ad1b-0edcc6694a7a@u10g2000prl.googlegroups.com>
 <ffeeb5d0-5685-42ff-a141-72bea410f239@u10g2000prl.googlegroups.com>
 <409c81ab-bd54-493b-beb4-a0cca99ec306@p27g2000prp.googlegroups.com>
 <58a8ef13-4b67-4548-b20e-469991e445d8@h23g2000pra.googlegroups.com>
NNTP-Posting-Host: KHj9AOPOidgt0YptnGtG5g.user.speranza.aioe.org
Mime-Version: 1.0
X-Complaints-To: abuse@aioe.org
User-Agent: Opera Mail/11.51 (Linux)
X-Notice: Filtered by postfilter v. 0.8.2
Xref: news2.google.com comp.lang.ada:14028
Content-Type: text/plain; charset=utf-8; format=flowed; delsp=yes
Content-Transfer-Encoding: Quoted-Printable
Date: 2011-10-18T11:32:07+02:00
List-Id: <comp.lang.ada>

On Tue, 18 Oct 2011 06:46:13 +0200, ytomino <aghia05@gmail.com> wrote:

> Well...If my supplement is allowed, in my honest opinion ignoring the
> existing way of Ada, "File_Name_String" is better.
> (In addition, It's  welcome that UTF_8_String and UTF_16_String be new
> types like Yannick says.)
For personal and specific use cases, yes. For a standard, however, I
would rather have a Unicode_String type. To be honest, my dream would be
to replace the Ada String type with that Unicode_String type (a dream…
I said). I used to try to create packages where the String type was
redefined, but failed because of some scope trouble (I could never make
up my mind about whether or not this was a GNAT bug).

This is important, because UTF-8 versus UTF-16LE, UTF-16BE, and even
possibly UTF-32BE and UTF-32LE, is only a matter of implementation and
is not a good candidate for an interface, unless it serves a specific
use case.
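
To make the point above concrete, here is a minimal sketch, assuming an
Ada 2012 compiler that provides Ada.Strings.UTF_Encoding (it is not part
of Ada 2005). The procedure name and the sample text are only
illustrative. One abstract text value yields different byte sequences
depending on the chosen encoding scheme, which suggests the scheme is a
representation detail rather than part of the text itself. Note that the
standard Encoding_Scheme type has no UTF-32 value, so only UTF-8 and the
two UTF-16 byte orders are shown.

```ada
with Ada.Strings.UTF_Encoding; use Ada.Strings.UTF_Encoding;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Text_IO;

procedure Encoding_Demo is
   package Conv renames Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

   --  One abstract text value: U+00E9 (e acute) then U+20AC (euro sign).
   --  Built from code points so the source file encoding does not matter.
   Text : constant Wide_Wide_String :=
     (Wide_Wide_Character'Val (16#E9#),
      Wide_Wide_Character'Val (16#20AC#));

   --  The same text under three encoding schemes, as raw byte strings.
   As_UTF_8    : constant UTF_String := Conv.Encode (Text, UTF_8);
   As_UTF_16BE : constant UTF_String := Conv.Encode (Text, UTF_16BE);
   As_UTF_16LE : constant UTF_String := Conv.Encode (Text, UTF_16LE);
begin
   Ada.Text_IO.Put_Line
     ("UTF-8 bytes:   " & Natural'Image (As_UTF_8'Length));
   Ada.Text_IO.Put_Line
     ("UTF-16BE bytes:" & Natural'Image (As_UTF_16BE'Length));
   Ada.Text_IO.Put_Line
     ("UTF-16LE bytes:" & Natural'Image (As_UTF_16LE'Length));

   --  The byte sequences differ, yet each decodes to the same text.
   pragma Assert (As_UTF_16BE /= As_UTF_16LE);
   pragma Assert (Conv.Decode (As_UTF_8, UTF_8) = Text);
   pragma Assert (Conv.Decode (As_UTF_16LE, UTF_16LE) = Text);
end Encoding_Demo;
```

With GNAT, the assertions are only checked when compiling with -gnata.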

The Unicode_String representation could be encoded or not, at the sole
discretion of the implementation. The implementation could use UTF-32 if
it wishes to be simple, or favor the same encoding as the target
platform. This Unicode_String type would have methods returning a
conversion to UTF-8, UTF-16, or UTF-32, and optionally (it may raise a
runtime error) to ISO-8859-1. For efficiency, it could also provide
primitives for common iterated operations, such as concatenation,
slicing, and comparison (which can be implemented far more efficiently
at the implementation level than by getting and setting individual
characters, which involves encoding and decoding each time). I would
also suggest a Change_To_Uppercase (Unicode_String, Index), and the same
with Change_To_Lower_Case, along with Remove_Slice and Insert_Slice
primitives. These primitives would cover most use cases and help
preserve efficiency.
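
The interface described above could be sketched roughly as follows. This
is only an illustration under stated assumptions, not an existing or
proposed standard package: the package name Unicode_Strings, every
subprogram name, and the choice of an unencoded (UTF-32 style) internal
representation are all hypothetical, and Ada.Strings.UTF_Encoding
requires an Ada 2012 compiler. Only a subset of the suggested primitives
is shown (no case conversion, Remove_Slice, or Insert_Slice).

```ada
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Strings.Wide_Wide_Unbounded;
with Ada.Text_IO;

procedure Unicode_String_Demo is

   package Unicode_Strings is
      type Unicode_String is private;

      function To_Unicode_String
        (Item : Wide_Wide_String) return Unicode_String;
      function Length (Item : Unicode_String) return Natural;

      --  Encodings appear only in explicit conversions.
      function To_UTF_8
        (Item : Unicode_String)
         return Ada.Strings.UTF_Encoding.UTF_8_String;
      function To_UTF_32 (Item : Unicode_String) return Wide_Wide_String;

      --  Raises Constraint_Error for characters outside ISO-8859-1.
      function To_ISO_8859_1 (Item : Unicode_String) return String;

      --  Whole-string primitives, so callers need not loop per character.
      function "&" (Left, Right : Unicode_String) return Unicode_String;
      function Slice
        (Item : Unicode_String;
         Low  : Positive;
         High : Natural) return Unicode_String;
   private
      package WWU renames Ada.Strings.Wide_Wide_Unbounded;

      --  Assumption: code points are stored unencoded (UTF-32 style).
      type Unicode_String is record
         Data : WWU.Unbounded_Wide_Wide_String;
      end record;
   end Unicode_Strings;

   package body Unicode_Strings is
      function To_Unicode_String
        (Item : Wide_Wide_String) return Unicode_String is
      begin
         return (Data => WWU.To_Unbounded_Wide_Wide_String (Item));
      end To_Unicode_String;

      function Length (Item : Unicode_String) return Natural is
      begin
         return WWU.Length (Item.Data);
      end Length;

      function To_UTF_32 (Item : Unicode_String) return Wide_Wide_String is
      begin
         return WWU.To_Wide_Wide_String (Item.Data);
      end To_UTF_32;

      function To_UTF_8
        (Item : Unicode_String)
         return Ada.Strings.UTF_Encoding.UTF_8_String is
      begin
         return Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Encode
                  (To_UTF_32 (Item));
      end To_UTF_8;

      function To_ISO_8859_1 (Item : Unicode_String) return String is
         Source : constant Wide_Wide_String := To_UTF_32 (Item);
         Result : String (Source'Range);
      begin
         for I in Source'Range loop
            --  'Val raises Constraint_Error above code point 255.
            Result (I) :=
              Character'Val (Wide_Wide_Character'Pos (Source (I)));
         end loop;
         return Result;
      end To_ISO_8859_1;

      function "&" (Left, Right : Unicode_String) return Unicode_String is
      begin
         return (Data => WWU."&" (Left.Data, Right.Data));
      end "&";

      function Slice
        (Item : Unicode_String;
         Low  : Positive;
         High : Natural) return Unicode_String is
      begin
         return To_Unicode_String (WWU.Slice (Item.Data, Low, High));
      end Slice;
   end Unicode_Strings;

   use Unicode_Strings;

   E_Acute : constant Wide_Wide_String :=
     (1 => Wide_Wide_Character'Val (16#E9#));
   Name    : constant Unicode_String :=
     To_Unicode_String ("caf") & To_Unicode_String (E_Acute);
begin
   Ada.Text_IO.Put_Line ("Characters:" & Natural'Image (Length (Name)));
   Ada.Text_IO.Put_Line
     ("UTF-8 bytes:" & Natural'Image (To_UTF_8 (Name)'Length));
   Ada.Text_IO.Put_Line (To_ISO_8859_1 (Slice (Name, 1, 3)));
end Unicode_String_Demo;
```

Because the type is private, an implementation would remain free to
switch the internal representation (for instance to the platform's
native encoding) without changing client code.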

This could also solve a glitch. Currently, if you want to store a UTF-8
string in an Ada source, you have to cheat the compiler: edit the file
as UTF-8, and compile it as if it were ISO-8859-1 (*). Unfortunately,
this is not clean. If there were a real Unicode_String type (or if the
String type were changed into a Unicode one… in my dreams), this would
no longer be a problem.
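
To show why this trick is not clean, here is a small sketch, again
assuming an Ada 2012 compiler for Ada.Strings.UTF_Encoding; the
procedure and object names are illustrative. The String is built from
explicit byte values to mimic what the compiler would see for a single
"é" when a UTF-8 source is compiled as ISO-8859-1, so the example does
not depend on how this file itself is encoded.

```ada
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Text_IO;

procedure UTF_8_Literal_Demo is
   --  The two bytes UTF-8 uses for U+00E9. Read as ISO-8859-1, they
   --  are two unrelated Characters, not one.
   Raw : constant String :=
     (Character'Val (16#C3#), Character'Val (16#A9#));

   --  Decoding explicitly recovers the single intended character.
   Decoded : constant Wide_Wide_String :=
     Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Decode (Raw);
begin
   Ada.Text_IO.Put_Line
     ("String'Length: " & Natural'Image (Raw'Length));
   Ada.Text_IO.Put_Line
     ("Characters:    " & Natural'Image (Decoded'Length));
end UTF_8_Literal_Demo;
```

The String type reports a length of 2 for what the programmer wrote as
one character, so indexing, slicing, and 'Length all operate on bytes
rather than characters.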

On the other hand, if this would cause trouble for Ada, I prefer no
change, and to go on with personal methods.

(*) You can do the same for UTF-16, with some variation: use
Wide_Character for your strings, edit the sources in UTF-16, and cheat
the compiler by telling it the sources are UCS-2 encoded (note: UCS-2 is
another no-encoding Unicode subset, the same way ISO-8859-1 is, except
two bytes wide instead of one byte wide).

-- 
“Syntactic sugar causes cancer of the semi-colons.” [Epigrams on
Programming - Alan J. - P. Yale University]
“Structured Programming supports the law of the excluded muddle.” [Idem]
Java: Write once, Never revisit