From: "Dmitry A. Kazakov"
Newsgroups: comp.lang.ada
Subject: Re: IBM 437 encoded String to UTF-16 Wide_String
Date: Wed, 28 Nov 2012 14:36:02 +0100
Organization: cbb software GmbH
Message-ID: <347rnekt4in1.12pbyz0phdelf$.dlg@40tude.net>

On Wed, 28 Nov 2012 12:31:35 +0100, Georg Bauhaus wrote:

> On 28.11.12 10:58, Dmitry A. Kazakov wrote:
>> When I mentioned Wide_Wide_String I meant an array of code points. The
>> logical view of *any* string type is array of code points.
>> The only difference between different string types is in the
>> constraints put on the code points. E.g. String has code points
>> 0 to 255. IBM_437_String would have a non-contiguous set of code
>> points etc.
>
> What is a non-contiguous set?

A convex set in this case, i.e.: for code points x,y,z, such that x

> In case of differentiation by sets of code points, I'd rather
> have an honest type Unicode_String and---if we are already
> fixing the language---put everything that has {Wide_}String
> in its name in Annex J.
>
> But then, consider
>
>    type Index is range 1 .. 12;
>
>    type R is ('I', 'V', 'X', 'L', 'C', 'D', 'M');
>
>    type N is array (Index range <>) of R;
>
> A string of R, named N here, is just fine. In fact,
>
>    Year : constant N := "MCMLXXXIII";
>
> has a valid literal, and the year so written is not of any of
> the standard string types. The definition of type R actually
> implies a codespace, and, for example, Character'('V') or
> Wide_Character'('V') have no role in it, irrespective of
> any accidental overlap in encoding or representation or
> position.
>
> So, which by force should type N be in Whatever_String'Class?

Per inheritance:

   type N is new Wide_Wide_String and array (...) of R;

>>> As a practical alternative, why not add a generalized
>>> std::valarray)> to the language instead
>>> of fixing it?
>>
>> No idea what this is supposed to mean.
>
> Call it
>
>    generic
>       type Element_Type is ...
>       type Index_Type is ...
>    package Ada.Containers.Tuples is
>       ...
>
> and make it have standard container operations, extended as needed.

It does not solve anything. The problem is not construction of a container
type. It is the relation of the obtained type to the string interface. The
string interface is an array of code points. The container must implement
this interface in order to be a string. All strings must implement this
interface, this is why they are called "strings."
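
For illustration only, here is a minimal sketch of that requirement in Ada 2005 as it exists today, rather than with the hypothetical "new Wide_Wide_String and array (...) of R" declaration above. The names Strings, Code_Point, Code_Point_String, Roman_String, Length, Element, To_Code_Point and Put_Code_Points are assumptions made for this sketch and are not part of any standard library. Mapping R onto the Latin capital letters is likewise an assumption about which code points the type would be constrained to.

```ada
with Ada.Text_IO;

procedure String_Interface_Demo is

   package Strings is
      --  Hypothetical names; none of this is standard Ada.
      type Code_Point is range 0 .. 16#10FFFF#;

      --  The string interface: a read-only array-of-code-points view.
      type Code_Point_String is interface;
      function Length (S : Code_Point_String) return Natural is abstract;
      function Element
        (S : Code_Point_String; I : Positive) return Code_Point is abstract;

      --  Types from the quoted Roman numeral example.
      type Index is range 1 .. 12;
      type R is ('I', 'V', 'X', 'L', 'C', 'D', 'M');
      type N is array (Index range <>) of R;

      --  A container of R that also implements the string interface.
      type Roman_String (Last : Index) is new Code_Point_String with record
         Text : N (1 .. Last);
      end record;
      overriding function Length (S : Roman_String) return Natural;
      overriding function Element
        (S : Roman_String; I : Positive) return Code_Point;
   end Strings;

   package body Strings is
      --  Assumed constraint: R covers exactly these seven code points.
      To_Code_Point : constant array (R) of Code_Point :=
        ('I' => Character'Pos ('I'), 'V' => Character'Pos ('V'),
         'X' => Character'Pos ('X'), 'L' => Character'Pos ('L'),
         'C' => Character'Pos ('C'), 'D' => Character'Pos ('D'),
         'M' => Character'Pos ('M'));

      function Length (S : Roman_String) return Natural is
      begin
         return Natural (S.Last);
      end Length;

      function Element
        (S : Roman_String; I : Positive) return Code_Point is
      begin
         return To_Code_Point (S.Text (Index (I)));
      end Element;
   end Strings;

   --  Written once against the interface; works for any implementation.
   procedure Put_Code_Points (S : Strings.Code_Point_String'Class) is
   begin
      for I in 1 .. Strings.Length (S) loop
         Ada.Text_IO.Put (Strings.Code_Point'Image (Strings.Element (S, I)));
      end loop;
      Ada.Text_IO.New_Line;
   end Put_Code_Points;

   Year : Strings.Roman_String (Last => 10);

begin
   Year.Text := "MCMLXXXIII";
   Put_Code_Points (Year);
end String_Interface_Demo;
```

Saved as string_interface_demo.adb and built with a recent GNAT, this should print the code points 77 67 77 76 88 88 88 73 73 73. The literal still resolves against R alone, while Put_Code_Points sees only the interface, so any other type implementing Code_Point_String could be passed to it unchanged.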

> The idea is that if Element_Type is ordered scalars, and if container
> operations provide for writing algorithms efficiently, then
> that's a more practical way of having strings of anything
> than, say, finally removing "tagged" from the language and make
> every type be in some 'Class.

Every type is in more than just one class, trivially.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de