From: vincent.diemunsch@gmail.com
Subject: Re: When to use Bounded_String?
Date: Thu, 28 Dec 2017 06:28:28 -0800 (PST)
Message-ID: <37c30172-9386-45fb-86d0-a10998fcade8@googlegroups.com>
In-Reply-To: <p22mdb$1s3d$1@gioia.aioe.org>
On Thursday, 28 December 2017 at 13:00:46 UTC+1, Dmitry A. Kazakov wrote:
> > Yes, they are really a great improvement. But they would be perfect if:
> > 1. they handled UTF-8 as the de facto standard encoding for strings.
>
> You can ignore encoding and use them as if they were UTF-8
>
Sure. That's what is done, at least on Unixes (Linux and OSX).
> > 2. they could see strings as sequences of 32-bit Unicode code points (Wide_Wide_Characters).
>
> 23 / 4 = 5 characters
No. At least 5 characters, in the worst case where each one takes 4 bytes, but up to 23 ASCII characters.
The idea here is to decode the UTF-8 string one character at a time and return each character as a Unicode code point in the most common integer format: 32 bits.
The only limitation is that you would have sequential access to the string, not random access as with the usual array of characters. But I really don't see the
point of having random access to the characters in a string!
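To make the byte-versus-character arithmetic concrete, here is a minimal sketch of such a sequential decoder. It is written in Python only for brevity, the name `decode_utf8` is made up for this illustration, and it is deliberately simplified: it does not reject every overlong form or surrogate code point, so treat it as an outline rather than a conforming decoder.

```python
def decode_utf8(data):
    """Yield Unicode code points (as integers) from UTF-8 bytes, front to back.

    Hypothetical helper for illustration. Simplified: overlong 3- and 4-byte
    forms and surrogate code points are not rejected.
    """
    i = 0
    n = len(data)
    while i < n:
        lead = data[i]
        # The lead byte tells us how many bytes this character occupies.
        if lead < 0x80:
            length, cp = 1, lead
        elif 0xC2 <= lead <= 0xDF:
            length, cp = 2, lead & 0x1F
        elif 0xE0 <= lead <= 0xEF:
            length, cp = 3, lead & 0x0F
        elif 0xF0 <= lead <= 0xF4:
            length, cp = 4, lead & 0x07
        else:
            raise ValueError("invalid lead byte at offset %d" % i)
        if i + length > n:
            raise ValueError("truncated sequence at offset %d" % i)
        # Each continuation byte contributes its low 6 bits.
        for b in data[i + 1:i + length]:
            if b & 0xC0 != 0x80:
                raise ValueError("invalid continuation byte after offset %d" % i)
            cp = (cp << 6) | (b & 0x3F)
        yield cp
        i += length


# A 23-byte buffer holds 23 ASCII characters...
ascii_count = len(list(decode_utf8(b"x" * 23)))
# ...but only 5 characters if each one needs 4 bytes (20 bytes used, 3 spare).
wide_count = len(list(decode_utf8("\U0001F600".encode("utf-8") * 5)))
```

The decoder only ever moves forward, which is the sequential-access limitation mentioned above: finding the n-th character means decoding the n-1 characters before it.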
> P.S. Just never copy strings if you have performance concerns (even if
> you have none). Nothing to optimize then. Use string slices, pass string
> + an index to start at, do everything in a single pass, there is no
> reason to waste CPU time, memory and brain cells on "tokenizing".
True. Except for storing the identifiers in a symbol table...
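As a rough sketch of that compromise (scan in one pass with an index, and copy text only when an identifier has to go into the symbol table), something along these lines would do. The function name `collect_identifiers` and the identifier rule (a letter or underscore, then letters, digits or underscores) are assumptions made for the example, not anything prescribed in this thread.

```python
def collect_identifiers(source):
    """Scan `source` once and return {identifier: offset of first occurrence}.

    Hypothetical helper for illustration. The scanner walks the text with a
    single index; the slice taken for the symbol table is the only copy made.
    """
    symbols = {}
    pos = 0
    n = len(source)
    while pos < n:
        ch = source[pos]
        if ch.isalpha() or ch == "_":
            start = pos
            # Advance to the end of the identifier without copying anything.
            while pos < n and (source[pos].isalnum() or source[pos] == "_"):
                pos += 1
            name = source[start:pos]  # copied only because it must be stored
            symbols.setdefault(name, start)
        else:
            pos += 1
    return symbols


table = collect_identifiers("x := x + count_1;")
```

Everything else (operators, whitespace, punctuation) is stepped over in place and never copied.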
Kind regards,
Vincent
Thread overview: 13+ messages
2017-11-19 2:19 When to use Bounded_String? Victor Porton
2017-11-19 9:55 ` Niklas Holsti
2017-11-20 5:38 ` J-P. Rosen
2017-11-20 7:32 ` Niklas Holsti
2017-11-23 10:04 ` briot.emmanuel
2017-12-28 11:46 ` Vincent DIEMUNSCH
2017-12-28 12:00 ` Dmitry A. Kazakov
2017-12-28 12:29 ` Mehdi Saada
2017-12-29 0:42 ` Randy Brukardt
2017-12-29 9:11 ` Simon Wright
2017-12-28 14:28 ` vincent.diemunsch [this message]
2017-12-29 0:36 ` Randy Brukardt
2017-12-29 8:48 ` Dmitry A. Kazakov