From: "Dmitry A. Kazakov"
Newsgroups: comp.lang.ada
Subject: Re: Is this expected behavior or not
Date: Sat, 6 Apr 2013 09:54:11 +0200
Organization: cbb software GmbH
References: <1gnmajx2fdjju.1bo28xwmzt1nr.dlg@40tude.net> <3gv2jwc95otm.pl2aahsh9ox8.dlg@40tude.net> <1gkxiwepaxvtt$.u3ly33rbwthf.dlg@40tude.net> <1fmcdkj58brky.bjedt0pr39cd$.dlg@40tude.net> <1bj564vat3q1j$.1s4d00rlzx4ux$.dlg@40tude.net> <4hzv51v872q2$.1imijbwd7heqm$.dlg@40tude.net>
Reply-To: mailbox@dmitry-kazakov.de

On Fri, 5 Apr 2013 21:55:27 +0200, Stefan.Lucks@uni-weimar.de wrote:

> On Fri, 5 Apr 2013, Dmitry A. Kazakov wrote:
>
>> On Fri, 5 Apr 2013 17:16:59 +0200, Stefan.Lucks@uni-weimar.de wrote:
>
>>> I agree with you that there is no reason to distinguish between them. The
>>> entire distinction narrow, Wide_ and Wide_Wide_ Strings (and Characters)
>>> is a historical artifact, no more, no less.
>>>
>>> But if there is no reason to disting between them -- there is no reason to
>>> mix them either!
>>
>> Sorry, but it cannot be both.
>
> No, if you mix them, you must distinguish them.
> At least, when implementing your own multi-method, such as
>
>    function Longest_Common_Substring(S1, S2: Universal_String)
>       return Universal_String;
>    -- maybe "is abstract";
>
> Either Universal_String is abstract, then so would be this multi-method,
> and you would to override it n^3 times, when supporting n different string
> types.

You could inherit it per composition with conversion [the language should
support this sort of delegation].

> Or you can actually have objects of type Universal_String, but then
> using this method without making explicit conversions means a whole new
> bunch of implicit conversions.

Actually there are more than 2 alternatives. Either argument and/or the
result can be covariant or contravariant.

> One problem with implicit conversions is that Strings don't "know" their
> encoding (Is it UTF-8? Or ISO-Latin-1? Or ...?), so you don't even have
> the information you need to perform the conversion.

The space of related types is at least 3D:

1. One hierarchy follows the hierarchy of characters:

   Wide_Wide_Character :> Wide_Character :> Character :> ASCII_Character
   + EBCDIC_Character etc.

2. Another hierarchy is about constrained vs. unbounded strings.

3. The third hierarchy encompasses encoding.

For any two points in this space it is perfectly clear how to convert one
to the other.

> Which is why you need to make explicit conversions.

There is no need for explicit conversions because it is well defined how to
obtain one string from another.

> And then it is not a big leap to say "convert everything into my favourite
> kind of string, then call my method, and then convert the result back".

This is one possible implementation. Once you have a mesh of related types
you can define specific bodies for the interesting combinations of arguments
and leave the others to be generated per composition with conversion.

>> Consider Ada.Text_IO.Create. It has name a string and content.
>> You tell us that it is not necessary to be able to open an UTF-8 file
>> which name is UTF-16? Blame Microsoft.
>
> I know a lot of things to blame Microsoft for, but in the Unix world, all
> files are essentially sequences of bytes that you read or write, and it is
> your problem to know the semantic of these sequences.

Which is why you need a Text_IO package for each combination of content
encoding x name encoding.

> In any case, Ada.Text_IO.Create is a good example.
>
> As much as I understand you, if Name is of type Universal_String'Class and
> you call Ada.Text_IO.Create(File, Name) you expect the proper thing to
> happen, right?

I meant two arguments:

   Name => some string type
   Form => some encoding of the content

as an example of when different string types must be mixed. Under MS Windows
the so-called W-calls use UTF-16 encoded names, while the file content could
be anything, e.g. UTF-8. You can guess how many combinations exist.

> But firstly, the strings we have don't know their encoding.

They know. See 3.5.2, which defines the character set. Yes, some people,
including me, use String for UTF-8 and Wide_String for UTF-16. This is
clearly wrong. A UTF-8 string is equivalent to Wide_Wide_String and cannot
be reinterpreted as String.

> No Universal_ type will solve this issue -- you just cannot get rid of
> explicit conversions.

You can. Actually, what people do right now is implicit unchecked
conversions from UTF-8 to String. It is even worse than PL/1; it is plain
wrong, C-esque of the worst kind. Unfortunately Ada simply offers no means
to design it right.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de