From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,FREEMAIL_FROM
	autolearn=unavailable autolearn_force=no version=3.4.4
X-Received: by 10.107.155.66 with SMTP id d63mr1037983ioe.7.1514471309321;
        Thu, 28 Dec 2017 06:28:29 -0800 (PST)
X-Received: by 10.157.88.6 with SMTP id r6mr413136oth.6.1514471309199; Thu, 28
 Dec 2017 06:28:29 -0800 (PST)
Path: 
 eternal-september.org!reader01.eternal-september.org!reader02.eternal-september.org!feeder.eternal-september.org!border1.nntp.ams1.giganews.com!nntp.giganews.com!peer01.ams1!peer.ams1.xlned.com!news.xlned.com!peer02.am4!peer.am4.highwinds-media.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!i6no3347491itb.0!news-out.google.com!b73ni12851ita.0!nntp.google.com!i6no3347488itb.0!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail
Newsgroups: comp.lang.ada
Date: Thu, 28 Dec 2017 06:28:28 -0800 (PST)
In-Reply-To: <p22mdb$1s3d$1@gioia.aioe.org>
Complaints-To: groups-abuse@google.com
Injection-Info: glegroupsg2000goo.googlegroups.com;
 posting-host=185.30.132.97;
 posting-account=hya6vwoAAADTA0O27Aq3u6Su3lQKpSMz
NNTP-Posting-Host: 185.30.132.97
References: <ouqpnm$a2j$1@gioia.aioe.org>
 <0cc30dc8-4528-4e5c-91dd-24dfbe3cbcb2@googlegroups.com>
 <96764e4c-48df-4042-845e-12341149bc87@googlegroups.com>
 <p22mdb$1s3d$1@gioia.aioe.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <37c30172-9386-45fb-86d0-a10998fcade8@googlegroups.com>
Subject: Re: When to use Bounded_String?
From: vincent.diemunsch@gmail.com
Injection-Date: Thu, 28 Dec 2017 14:28:29 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Body-CRC: 901836974
X-Received-Bytes: 2746
Xref: reader02.eternal-september.org comp.lang.ada:49677
Date: 2017-12-28T06:28:28-08:00
List-Id: <comp.lang.ada>

On Thursday, 28 December 2017 at 13:00:46 UTC+1, Dmitry A. Kazakov wrote:
> > Yes, they are really a great improvement. But they would be perfect if:
> > 1. they handled UTF-8 as the de-facto standard encoding, for strings.
>
> You can ignore encoding and use them as if they were UTF-8
>
Sure. That's what is done, at least on Unixes (Linux and OSX).


> > 2. they could see strings as sequences of 32-bits Unicode Code Points (Wide_Wide_Characters).
>
> 23 / 4 = 5 characters

No. At least 5 characters if each one needs the full 4 bytes of UTF-8,
but 23 ASCII characters.
The idea here is to decode the UTF-8 string to extract one character at a
time and return it as a Unicode code point in the most common integer
format: 32 bits.

The only limitation is that you would have sequential access to the string,
not random access as with the usual array of characters. But I really don't
see the point of having random access to the characters in a string!
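
As a rough sketch of that idea, and not a proposal for an actual library
interface: the procedure below reads one UTF-8 sequence starting at a given
index, returns it as a Wide_Wide_Character and advances the index. The names
Next_Code_Point and Decode_Demo are hypothetical, and the sketch assumes
well-formed input (it does not check continuation bytes or overlong forms).

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Decode_Demo is

   --  Hypothetical helper, not part of the standard library.
   --  Decodes the UTF-8 sequence that starts at S (Pos), returns its
   --  code point in Code and moves Pos just past the sequence.
   --  Assumes well-formed input: continuation bytes are not validated.
   procedure Next_Code_Point
     (S    : String;
      Pos  : in out Positive;
      Code : out Wide_Wide_Character)
   is
      Lead : constant Natural := Character'Pos (S (Pos));
      Len  : Positive;
      Val  : Natural;
   begin
      --  The lead byte gives the sequence length and the first bits.
      if Lead < 16#80# then
         Len := 1;
         Val := Lead;
      elsif Lead in 16#C0# .. 16#DF# then
         Len := 2;
         Val := Lead mod 16#20#;
      elsif Lead in 16#E0# .. 16#EF# then
         Len := 3;
         Val := Lead mod 16#10#;
      elsif Lead in 16#F0# .. 16#F7# then
         Len := 4;
         Val := Lead mod 16#08#;
      else
         raise Constraint_Error with "invalid UTF-8 lead byte";
      end if;

      --  Each continuation byte contributes its low 6 bits.
      for I in 1 .. Len - 1 loop
         Val := Val * 64 + Character'Pos (S (Pos + I)) mod 64;
      end loop;

      Code := Wide_Wide_Character'Val (Val);
      Pos  := Pos + Len;
   end Next_Code_Point;

   --  "caf" followed by U+00E9, which is the two bytes C3 A9 in UTF-8.
   Text : constant String :=
     "caf" & Character'Val (16#C3#) & Character'Val (16#A9#);
   Pos  : Positive := Text'First;
   Code : Wide_Wide_Character;
begin
   --  Sequential access only: walk the string one code point at a time.
   while Pos <= Text'Last loop
      Next_Code_Point (Text, Pos, Code);
      Put_Line (Integer'Image (Wide_Wide_Character'Pos (Code)));
   end loop;
end Decode_Demo;
```

The 5-byte string yields four code points (99, 97, 102, 233), which is the
byte count versus character count distinction made above.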

> P.S. Just never copy strings if you have performance concerns (even if
> you have none). Nothing to optimize then. Use string slices, pass string
> + an index to start at, do everything in a single pass, there is no
> reason to waste CPU time, memory and brain cells on "tokenizing".

True. Except for storing the identifiers in a symbol table...
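
To illustrate that exception, here is a minimal sketch under stated
assumptions: a single pass over the source with an index, where the only
copy made is the identifier slice stored in the table. Symbols_Demo, Source
and Symbols are illustrative names, and an ordered set from Ada.Containers
merely stands in for a real symbol table.

```ada
with Ada.Containers.Indefinite_Ordered_Sets;
with Ada.Characters.Handling; use Ada.Characters.Handling;
with Ada.Text_IO;             use Ada.Text_IO;

procedure Symbols_Demo is

   --  Stand-in for a symbol table: an ordered set of identifier names.
   package String_Sets is
     new Ada.Containers.Indefinite_Ordered_Sets (String);

   Source  : constant String := "count := count + step;";
   Symbols : String_Sets.Set;
   Pos     : Positive := Source'First;
   First   : Positive;
begin
   --  Single pass: Pos only ever moves forward.
   while Pos <= Source'Last loop
      if Is_Letter (Source (Pos)) then
         First := Pos;
         while Pos <= Source'Last
           and then (Is_Alphanumeric (Source (Pos))
                     or else Source (Pos) = '_')
         loop
            Pos := Pos + 1;
         end loop;
         --  The only copy: the slice is stored in the table.
         Symbols.Include (Source (First .. Pos - 1));
      else
         Pos := Pos + 1;
      end if;
   end loop;

   --  Each identifier is listed once, in alphabetical order.
   for Name of Symbols loop
      Put_Line (Name);
   end loop;
end Symbols_Demo;
```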

Kind regards,

Vincent