From: Yannick Duchêne (Hibou57)
Newsgroups: comp.lang.ada
Subject: Re: Why no Ada.Wide_Directories?
Date: Tue, 18 Oct 2011 11:32:07 +0200
Organization: Ada @ Home

On Tue, 18 Oct 2011 06:46:13 +0200, ytomino wrote:

> Well...If my supplement is allowed, in my honest opinion ignoring the
> existing way of Ada, "File_Name_String" is better.
> (In addition, It's welcome that UTF_8_String and UTF_16_String be new
> types like Yannick says.)

For personal and specific use cases, yes. For a standard, however, I would be more in favor of a Unicode_String type.
To be honest, my dream would be to replace the Ada String type with that Unicode_String type (a dream… I said). I used to try creating packages in which the String type was redefined, but failed because of some scoping trouble (I could never make up my mind about whether or not this was a GNAT bug).

This is important, because UTF-8 versus UTF-16LE, UTF-16BE, and possibly even UTF-32BE and UTF-32LE, is only a matter of implementation. An encoding is not a good candidate for an interface, unless it serves a specific use case.

A Unicode_String could be stored encoded or not, at the sole discretion of the implementation. The implementation could use UTF-32 if it wishes to be simple, or favor the same encoding as the target platform. This Unicode_String type would have methods returning a conversion to UTF-8, UTF-16 or UTF-32, and optionally (it may raise a runtime error) to ISO-8859-1. For efficiency, it could also provide primitives for common iterated operations, such as concatenation, slicing and comparison. These can be implemented far more efficiently at the implementation level than by getting and setting individual characters, which involves encoding and decoding each time. I would also suggest a Change_To_Uppercase (Unicode_String, Index), and the same with Change_To_Lower_Case, along with Remove_Slice and Insert_Slice primitives. These primitives would cover most use cases and help preserve efficiency.

This could also solve a glitch. Currently, if you want to store a UTF-8 string in an Ada source file, you have to cheat the compiler: edit the file as UTF-8, and compile it as if it were ISO-8859-1 (*). Unfortunately, this is not clean. If there were a real Unicode_String type (or if the String type were changed into a Unicode one… in my dreams), this would no longer be a problem.
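
To make the idea above a bit more concrete, here is a minimal sketch of such a type. It is written in Python rather than Ada, purely for illustration. The class name UnicodeString, all of its method names, and the choice to store one integer per code point are assumptions of this sketch, not an existing Ada or GNAT API. Unlike the primitives suggested above, the operations here return new values instead of modifying the string in place, and indices follow the Python convention (zero-based, end excluded).

```python
class UnicodeString:
    """Hypothetical sketch of the Unicode_String idea (not an Ada API).

    Text is stored as a tuple of code points (effectively UTF-32). That
    choice is internal: callers only meet an encoding when they ask for
    an explicit conversion.
    """

    def __init__(self, text=""):
        # One integer per code point, so slicing never has to decode.
        self._points = tuple(ord(ch) for ch in text)

    @classmethod
    def _from_points(cls, points):
        # Internal constructor that skips the round trip through str.
        result = cls()
        result._points = tuple(points)
        return result

    def __str__(self):
        return "".join(chr(p) for p in self._points)

    def __len__(self):
        return len(self._points)

    def __eq__(self, other):
        if not isinstance(other, UnicodeString):
            return NotImplemented
        return self._points == other._points

    def __hash__(self):
        return hash(self._points)

    def __add__(self, other):
        # Concatenation works on code points directly, with no re-encoding.
        if not isinstance(other, UnicodeString):
            return NotImplemented
        return self._from_points(self._points + other._points)

    # Explicit conversions: encodings appear only at this boundary.
    def to_utf8(self):
        return str(self).encode("utf-8")

    def to_utf16(self, big_endian=False):
        return str(self).encode("utf-16-be" if big_endian else "utf-16-le")

    def to_utf32(self, big_endian=False):
        return str(self).encode("utf-32-be" if big_endian else "utf-32-le")

    def to_latin1(self):
        # Raises UnicodeEncodeError when a code point is above U+00FF.
        return str(self).encode("latin-1")

    def slice(self, first, last):
        return self._from_points(self._points[first:last])

    def remove_slice(self, first, last):
        return self._from_points(self._points[:first] + self._points[last:])

    def insert_slice(self, index, other):
        return self._from_points(
            self._points[:index] + other._points + self._points[index:]
        )

    def change_to_uppercase(self, index):
        # Upper-casing one character can yield several (U+00DF gives "SS").
        upper = tuple(ord(ch) for ch in chr(self._points[index]).upper())
        return self._from_points(
            self._points[:index] + upper + self._points[index + 1:]
        )
```

For instance, UnicodeString("café").to_utf8() gives the five bytes b"caf\xc3\xa9", while to_latin1() gives four bytes, and would raise UnicodeEncodeError for a string containing "€". The caller never needs to know which representation is used internally, which is the point of keeping the encoding out of the interface.
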
On the other hand, if this would cause trouble for Ada, I prefer no change, and to carry on with personal methods.

(*) You can do the same for UTF-16, with some variation: use Wide_Character for your strings, edit the sources in UTF-16, and cheat the compiler by telling it the sources are UCS2 encoded (note: UCS2 is another no-encoding Unicode subset, in the same way ISO-8859-1 is, except two bytes wide instead of one byte wide).

-- 
“Syntactic sugar causes cancer of the semi-colons.” [Epigrams on Programming - Alan J. - P. Yale University]
“Structured Programming supports the law of the excluded muddle.” [Idem]
Java: Write once, Never revisit