From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 28 Nov 2012 12:31:35 +0100
From: Georg Bauhaus <rm.dash-bauhaus@futureapps.de>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7;
 rv:16.0) Gecko/20121026 Thunderbird/16.0.2
MIME-Version: 1.0
Newsgroups: comp.lang.ada
Subject: Re: IBM 437 encoded String to UTF-16 Wide_String
References: <11112110-03b1-4977-ba80-00204926ea23@googlegroups.com>
 <68663891-14ad-4780-a00d-1cc48ed75323@googlegroups.com>
 <027679a1-dc5e-4888-9dd1-2a4ccf32e66c@googlegroups.com>
 <mv2j5fu41fb6.rc23cjp3s2bs$.dlg@40tude.net>
 <50b5dcd0$0$6581$9b4e6d93@newsspool3.arcor-online.net>
 <1mefvxxar8vn3$.16pejjtgf8hhg.dlg@40tude.net>
In-Reply-To: <1mefvxxar8vn3$.16pejjtgf8hhg.dlg@40tude.net>
Message-ID: <50b5f60e$0$9524$9b4e6d93@newsspool1.arcor-online.net>
Organization: Arcor
NNTP-Posting-Date: 28 Nov 2012 12:31:27 CET
NNTP-Posting-Host: b8fda11f.newsspool1.arcor-online.net
X-Complaints-To: usenet-abuse@arcor.de
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Date: 2012-11-28T12:31:27+01:00
List-Id: <comp.lang.ada>

On 28.11.12 10:58, Dmitry A. Kazakov wrote:
>  When I mentioned Wide_Wide_String I meant an array of code points. The
> logical view of *any* string type is array of code points. The only
> difference between different string types is in the constraints put on the
> code points. E.g. String has code points 0 to 255. IBM_437_String would
> have a non-contiguous set of code points etc.

What is a non-contiguous set?

If string types are to be differentiated by their sets of code
points, I'd rather have an honest type Unicode_String and, if we
are already fixing the language, move everything that has
{Wide_}String in its name to Annex J.

But then, consider

    type Index is range 1 .. 12;

    type R is ('I', 'V', 'X', 'L', 'C', 'D', 'M');

    type N is array (Index range <>) of R;

A string of R, named N here, is just fine. In fact,

    Year : constant N := "MCMLXXXIII";

has a valid literal, and the year so written is not of any of
the standard string types. The definition of type R actually
implies a codespace, and, for example, Character'('V') or
Wide_Character'('V') have no role in it, irrespective of
any accidental overlap in encoding or representation or
position.

So, must type N by force be in some Whatever_String'Class, and if so, in which one?
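
To make the codespace point concrete, here is a minimal sketch
built on the declarations above. Weight, Value and the procedure
name Roman_Demo are hypothetical names added only for this
illustration. The sketch shows that 'V' has position 1 in R but
position 86 in Character, and that a string of R can be processed
without touching any standard string type.

```ada
with Ada.Text_IO;

procedure Roman_Demo is

   type Index is range 1 .. 12;
   type R is ('I', 'V', 'X', 'L', 'C', 'D', 'M');
   type N is array (Index range <>) of R;

   Year : constant N := "MCMLXXXIII";

   --  Hypothetical helper table: numeric weight of each Roman digit.
   Weight : constant array (R) of Natural :=
     ('I' => 1,   'V' => 5,   'X' => 10, 'L' => 50,
      'C' => 100, 'D' => 500, 'M' => 1000);

   --  Hypothetical helper: value of a numeral, using the subtractive
   --  rule (a smaller digit before a larger one is subtracted).
   function Value (Numeral : N) return Natural is
      Total : Integer := 0;
   begin
      for K in Numeral'Range loop
         if K < Numeral'Last and then Numeral (K) < Numeral (K + 1) then
            Total := Total - Weight (Numeral (K));
         else
            Total := Total + Weight (Numeral (K));
         end if;
      end loop;
      return Total;
   end Value;

begin
   --  Position of 'V' within R's own codespace: 1
   Ada.Text_IO.Put_Line (Integer'Image (R'Pos ('V')));
   --  Position of the unrelated Character literal 'V': 86
   Ada.Text_IO.Put_Line (Integer'Image (Character'Pos ('V')));
   --  Value of the numeral held in Year
   Ada.Text_IO.Put_Line (Natural'Image (Value (Year)));
end Roman_Demo;
```

The comparison Numeral (K) < Numeral (K + 1) uses the ordering
that R itself defines, which happens to match the ordering of the
weights.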

>> As a practical alternative, why not add a generalized
>> std::valarray<type T = (<>)> to the language instead
>> of fixing it?
>
> No idea what this is supposed to mean.

Call it

generic
    type Element_Type is ...
    type Index_Type is ...
package Ada.Containers.Tuples is
    ...

and give it the standard container operations, extended as needed.
The idea is that if Element_Type is an ordered scalar type, and if
the container operations allow algorithms to be written efficiently,
then that is a more practical way of having strings of anything
than, say, finally removing "tagged" from the language and making
every type be in some 'Class.
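
As a rough sketch of what I have in mind, under assumptions of my
own: the formal parameters below (a discrete Element_Type, a signed
integer Index_Type), the package name Generic_Tuples, and the
operations Count and Maximum are all hypothetical choices for the
illustration, not a proposal for the actual contents of such a
library unit.

```ada
with Ada.Text_IO;

procedure Tuples_Demo is

   --  Hypothetical sketch; not a standard library unit.
   generic
      type Element_Type is (<>);      --  any discrete, hence ordered, type
      type Index_Type is range <>;    --  any signed integer index type
   package Generic_Tuples is

      type Tuple is array (Index_Type range <>) of Element_Type;

      --  Number of occurrences of Item in Source.
      function Count (Source : Tuple; Item : Element_Type) return Natural;

      --  Largest element of Source; Source must not be empty.
      function Maximum (Source : Tuple) return Element_Type;

   end Generic_Tuples;

   package body Generic_Tuples is

      function Count (Source : Tuple; Item : Element_Type) return Natural is
         Result : Natural := 0;
      begin
         for K in Source'Range loop
            if Source (K) = Item then
               Result := Result + 1;
            end if;
         end loop;
         return Result;
      end Count;

      function Maximum (Source : Tuple) return Element_Type is
         Result : Element_Type := Source (Source'First);
      begin
         for K in Source'Range loop
            if Source (K) > Result then
               Result := Source (K);
            end if;
         end loop;
         return Result;
      end Maximum;

   end Generic_Tuples;

   type Index is range 1 .. 12;
   type R is ('I', 'V', 'X', 'L', 'C', 'D', 'M');

   package Roman is new Generic_Tuples
     (Element_Type => R, Index_Type => Index);

   --  Written as a positional aggregate to keep the sketch conservative.
   Year : constant Roman.Tuple :=
     ('M', 'C', 'M', 'L', 'X', 'X', 'X', 'I', 'I', 'I');

begin
   --  How many 'X' digits the numeral contains
   Ada.Text_IO.Put_Line (Natural'Image (Roman.Count (Year, 'X')));
   --  The largest digit, by R's own ordering
   Ada.Text_IO.Put_Line (R'Image (Roman.Maximum (Year)));
end Tuples_Demo;
```

Nothing here requires R to belong to any class of string types;
the generic only needs equality and ordering on the elements.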