Newsgroups: comp.lang.ada
From: Shark8
Subject: Re: Is this expected behavior or not
Date: Sat, 16 Mar 2013 09:55:53 -0700 (PDT)

On Saturday, March 16, 2013 1:41:27 AM UTC-6, Dmitry A.
Kazakov wrote:
> On Fri, 15 Mar 2013 22:52:02 -0700 (PDT), Shark8 wrote:
>
> > True; but there could be some more interesting cases, say, for Ada 2020.
> > Something like:
> >
> > Abstract Type UNIVERSAL_STRING(Element : UNIVERSAL_CHARACTER) is
> > Array(Positive Range <>) of Element'Type;
>
> That would not work. I presume that here you want to create a root type for
> the class of string types and get at the members of the class (specific
> types like Wide_String) using a constraint. The problem is that string
> types must have different representations. The mechanism of constraining
> does not support.

Kind of, but not really; I'm thinking a sort of combination of generics and classes (in the general sense, not the OOP-sense): a way to specify a general behavior for a type-class. (i.e. having the ability to fully-specify things like attributes [not really shown in this example].)

> Thus either subtypes will have same representation or you
> won't have a class.

I'm thinking of it more in the terms of generic operations: independent of representation.

> Another problem is that string types must have more
> than one interface to deal with UTF-8 etc. An UTF-8 string is *both* an
> array of Wide_Wide_Character (= Unicode code points) and an array or
> sequence of Character (octets).

Ah, things get tricky here; Unicode is kind of a bear when you consider 'characters' because its codepoints aren't necessarily characters. An example would be the so-called "combining characters" which you can use for things like accents or ZALGO-text. (See these, respectively: http://en.wikipedia.org/wiki/Combining_character and, http://eeemo.net/ )

An important implication of this is that string search/manipulation becomes MUCH more complex. (`+a = à) means that you now have to search for multiple possibilities when your target is "à" -> the single-glyph code-point, or the combining-character points...
and that's not taking into consideration whether you should consider a & à & (a+diacritic) to be the same or unique entities -- and casing is another combinatorial factor.

It would be a big mistake to assume character = code-point when dealing with Unicode.

> An UTF-16 string is an array of Wide_Wide_Character and an array of Wide_String.

UTF-16 is perhaps the worst possible encoding you can have for Unicode. With UTF-8 you don't need to worry about byte-order (everything's sequential) and with UTF-32 you don't need to decode the information (each element *IS* a code-point)... but UTF-16 offers neither of these.

------------------------------------------------------

I guess what I'm trying to say is that if we did it right, we could modify/expand the type-system so that something like UNIVERSAL_INTEGER could be made/explicitly-specified. (And if done extremely well, something like UNIVERSAL_STRING where perhaps the only thing differentiating the strings would be the 'instantiation' with their proper character-type* and manipulation-functions.) -- GNAT already has the 'Universal_Literal_String which works in either of the following lines:

    Ada.Wide_Wide_Text_IO.Put_Line( Ada.Numerics.Pi'Universal_Literal_String )
    Ada.Text_IO.Put_Line( Ada.Numerics.Pi'Universal_Literal_String )

In any case; I think it worth considering not just outward/downward expansion of the language, but inward/upward [unification] as well.
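
P.S. To make the character-vs-code-point remark concrete, here is a minimal sketch, assuming an Ada 2012 compiler (it relies on the standard package Ada.Strings.UTF_Encoding.Wide_Wide_Strings). The procedure name Combining_Demo and the helper Report are made up for illustration. It builds "a with grave" both ways, as the single code point U+00E0 and as 'a' followed by the combining accent U+0300, then reports how many code points and UTF-8 octets each spelling takes and whether plain array equality treats them as the same string.

```ada
with Ada.Text_IO;
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

procedure Combining_Demo is  --  hypothetical name, for illustration only
   use Ada.Strings.UTF_Encoding;

   --  U+00E0, the precomposed "a with grave": one code point.
   Precomposed : constant Wide_Wide_String :=
     (1 => Wide_Wide_Character'Val (16#E0#));

   --  'a' followed by U+0300 COMBINING GRAVE ACCENT: two code points
   --  that are meant to display as the same accented letter.
   Decomposed : constant Wide_Wide_String :=
     (1 => 'a', 2 => Wide_Wide_Character'Val (16#300#));

   --  The same two texts viewed as UTF-8 octet sequences.
   Precomposed_UTF8 : constant UTF_8_String :=
     Wide_Wide_Strings.Encode (Precomposed);
   Decomposed_UTF8 : constant UTF_8_String :=
     Wide_Wide_Strings.Encode (Decomposed);

   --  Illustrative helper: print the sizes of one spelling.
   procedure Report (Name : String; Points, Octets : Natural) is
   begin
      Ada.Text_IO.Put_Line
        (Name & ":" & Natural'Image (Points) & " code point(s),"
         & Natural'Image (Octets) & " octet(s) in UTF-8");
   end Report;
begin
   Report ("precomposed", Precomposed'Length, Precomposed_UTF8'Length);
   Report ("decomposed", Decomposed'Length, Decomposed_UTF8'Length);

   --  Predefined array equality compares code points one by one, so the
   --  two spellings of the same visible character are not equal.
   if Precomposed = Decomposed then
      Ada.Text_IO.Put_Line ("equal");
   else
      Ada.Text_IO.Put_Line ("not equal");
   end if;
end Combining_Demo;
```

The two spellings differ in both length and content, so a naive search for one will not find the other. Treating them as the same would need a Unicode normalization step (NFC or NFD) first, which, as far as I know, the standard Ada library does not provide.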