From: "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de>
Subject: Re: GNOGA - RFC UXStrings package.
Date: Tue, 12 May 2020 11:35:53 +0200
Message-ID: <r9dqlo$1tkv$1@gioia.aioe.org>
In-Reply-To: r9dlrq$1im8$1@gioia.aioe.org
On 2020-05-12 10:13, Blady wrote:
> I've checked Simple Components; it might need to be completed with some
> parsing functions in order to fulfill all of Gnoga's needs.
You are welcome to ask.
I am not sure what kind of parsing you mean; most of the nightmarish legacy
encodings are supported already, e.g.
http://www.dmitry-kazakov.de/ada/strings_edit.htm#7.10
> But I think that a UTF-8
> (or UTF-16) internal representation would incur too great a penalty in
> terms of execution time, which is critical for Gnoga as a server.
Well, whatever minor overhead UTF-8 may have, it is many orders of
magnitude less than what Unbounded_String, or what you do in your code for
UXStrings, would inflict.
> That's why I would like to experiment with a data structure with smart
> character size (1, 2 or 4 bytes) and smart string length (either static
> or dynamic).
When I am concerned about performance:
1. I make all content in UTF-8. I convert anything to UTF-8 first, if I
get it from outside.
2. I never use dynamically allocated strings in any form, never in the
standard memory pool. If I really, really need a pool, I use a custom
arena pool and allocate a String there. As a nice side effect the server
will be resilient to all sorts of something-is-too-large attacks: no space
in the arena, drop the connection, bye.
3. I never copy anything. Thus, again, never Unbounded_String, only
String and its slices.
4. I never tokenize anything. I walk down the string in a single pass,
notice start/stop indices of a token, pass a string slice down to a
semantic callback, better, pass it straight to a look-up table. No
string copies.
5. I never use Wide or Wide_Wide. They are a mess and require conversions
=> copying => a lot of resources.
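Points 3 and 4 can be illustrated with a minimal sketch, assuming a
space-separated input line. The names For_Each_Token and Show are made up
for this example and are not taken from Simple Components. The language
leaves the parameter passing mechanism for String to the compiler, but
compilers typically pass the slice by reference, so no token is copied.

```ada
with Ada.Text_IO;

procedure Scan_Demo is

   --  Hypothetical helper: walk Source once, note the start/stop indices
   --  of each token and hand Source (Start .. Stop) to Handler as a slice.
   procedure For_Each_Token
     (Source    : String;
      Separator : Character;
      Handler   : not null access procedure (Token : String))
   is
      Start : Integer := Source'First;  --  first index of the current token
   begin
      for Index in Source'Range loop
         if Source (Index) = Separator then
            if Index > Start then
               --  Non-empty token ends just before the separator.
               Handler (Source (Start .. Index - 1));
            end if;
            Start := Index + 1;
         end if;
      end loop;
      --  Trailing token, if the line does not end with a separator.
      if Start <= Source'Last then
         Handler (Source (Start .. Source'Last));
      end if;
   end For_Each_Token;

   Count : Natural := 0;

   --  Stand-in for the semantic callback or table look-up.
   procedure Show (Token : String) is
   begin
      Count := Count + 1;
      Ada.Text_IO.Put_Line (Token);
   end Show;

begin
   For_Each_Token ("GET /index.html HTTP/1.1", ' ', Show'Access);
   pragma Assert (Count = 3);  --  checked only when assertions are enabled
end Scan_Demo;
```

The callback sees the original index range of each token, e.g. Token'First
is 5 for the second token, so it can report positions in the input without
any extra bookkeeping.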
--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de