comp.lang.ada
 help / color / mirror / Atom feed
From: "Randy Brukardt" <randy@rrsoftware.com>
Subject: Re: Reserve_Capacity for Unbounded_String?
Date: Tue, 24 Jul 2007 20:28:04 -0500
Date: 2007-07-24T20:28:04-05:00	[thread overview]
Message-ID: <f868qn$qac$1@jacob-sparre.dk> (raw)
In-Reply-To: 46A69136.3010803@obry.net

(I managed to press Send on this without actually writing a reply...sorry)
"Pascal Obry" <pascal@obry.net> wrote in message
news:46A69136.3010803@obry.net...
...
> > So I stand by my original statement. Anyone that depends on the
performance
> > characteristics of a particular implementation (of anything!) is an
idiot
> > and will be burned in the future.
>
> Well Randy, its all too easy to dismiss points like this by saying that
> you have not seen a problem with the current implementation and if
> somebody show you something wrong in this area then he has to use
> something else that Unbounded_String... At least that's the way I'm
> reading your message.

I want to see a real program compiled with using Janus/Ada that shows that
(a) the performance is not good enough currently and (b) that tweaking
Ada.Strings.Unbounded would make the performance good enough. Otherwise,
time spent on Ada.Strings.Unbounded is a waste of time (and potentially a
compatibility problem, such that existing programs might run out of memory
where they do not currently).

If I spent much time on things where (a) and (b) were not true, then I'd be
the idiot, by not making the best use of my time to improve the
compile/runtime/etc.

> Just try this on your implementation:
>
>    C : constant Character := 'x';
>
>    U : Unbounded_String;
>
>    for k in 1 .. 5_000_000 loop
>       U := U & C;
>    end loop;
>
> And then test it with GNAT. Of course that is not a real example but
> very close to something done in my templates engine.

(Aside: I hope not. It's always bad practice to append small things to large
things, be it strings, I/O, or anything else. It always leads to performance
issues, more so the more abstract the interface is. I realize that you have
to balance understandability with performance, and sometimes occassional
appends of small things make sense. But doing it in an inner loop like this
sets off my "this is a very bad idea" detector.)

The old version of GNAT I have installed allocates for each call to '&' (I
just checked the source code), so I doubt testing it would prove anything
whatsoever. (Indeed, looking at the code, the Janus/Ada version is likely to
be more efficient. But, based on your assertions, I have to assume that the
GNAT code is out of date, so there is no point in looking at it further.)

But even if it did, so what? Using any version of "&" (be it for String,
Unbounded_String, or any other type) in a tight loop is silly. The amount of
code needed to implement "&" is huge, and virtually any alternative way of
doing the operation would be better.

Moreover, the only way to save allocations in an increasing length string is
to overallocate the memory; that is, to waste memory. (If you had a way to
find out the blocking factor of the underlying storage pool, you could
possibly avoid that by rounding allocations up, but that would only help in
pathological programs that are adding single characters to end of strings
thousands of times.) Janus/Ada is all about minimum memory usage; speed is a
secondary consideration and in general we will not write runtime packages
such that they waste memory. As such, even if I was to rewrite this, I would
not allocate more than the memory needed by the current operation (modulo a
very small blocking factor), and as such this program would *still* need a
new allocation nearly every iteration (because the string is continuously
getting longer).

I'd expect a program that *deleted* parts of strings to show much more
improvement than one that added strings, because in that case there wouldn't
be a need to free anything (just adjust the length). But such programs are
far rarer than the append everything ones like this example.

> You were speaking about performance about checking the preallocated
> buffer. My point is that it is not that costly compared to the price of
> reallocating a whole string thousands of time.

I'm sure that's true, but my point also was that programs that do that are
generally pathologies (surely, that's the only examples that I've been shown
so far). And the real programs I've looked at (like our spam filter), this
overhead is not a significant part of the entire program. Even though our
spam filter does something like the above when decoding BASE64 encoded
e-mails. And I don't think that the performance improvement of a tweaked
Unbounded_String would be a tenth of simply not using it at all for
calculations -- which would make far more difference in a misbehaving
application.

After all, the first rule of performance improvement is to find a better
algorithm; tweaking an existing one to make it faster can only provide
incremental improvements (and that rarely helps anything). Surely that
applies here, too.

And now I've wasted far too much time on this worthless topic.

                                   Randy.





  parent reply	other threads:[~2007-07-25  1:28 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-07-22 19:54 Reserve_Capacity for Unbounded_String? Maciej Sobczak
2007-07-22 21:32 ` Robert A Duff
2007-07-23 19:29   ` Maciej Sobczak
2007-07-23 20:30     ` Robert A Duff
2007-07-23  4:28 ` Jeffrey R. Carter
2007-07-23 15:07 ` Adam Beneschan
2007-07-24  1:01   ` Randy Brukardt
2007-07-24  7:57     ` Pascal Obry
2007-07-24 18:58       ` Randy Brukardt
2007-07-24 23:50         ` Robert A Duff
2007-07-25  0:00           ` Randy Brukardt
2007-07-24 23:54         ` Pascal Obry
2007-07-25  0:52           ` Randy Brukardt
2007-07-25  1:28           ` Randy Brukardt [this message]
2007-07-25  7:48             ` Pascal Obry
2007-07-25  9:55               ` Georg Bauhaus
2007-07-25 10:02                 ` Georg Bauhaus
2007-07-25 18:58               ` Randy Brukardt
2007-07-25  8:50             ` Martin Krischik
2007-07-25  9:26               ` AW: " Grein, Christoph (Fa. ESG)
2007-07-25 15:32                 ` Martin Krischik
2007-07-25 15:39                 ` Martin Krischik
2007-07-24 23:41     ` Robert A Duff
2007-07-25  0:16       ` Randy Brukardt
2007-07-25  2:25         ` Robert A Duff
2007-07-25  6:07           ` Simon Wright
2007-07-25 19:08           ` Randy Brukardt
2007-07-25 20:37             ` Maciej Sobczak
2007-07-25 22:06               ` Georg Bauhaus
2007-07-26  6:24                 ` Maciej Sobczak
2007-07-26  8:09                   ` Dmitry A. Kazakov
2007-07-26  8:20                     ` Pascal Obry
2007-07-26  9:59                       ` Dmitry A. Kazakov
2007-07-26  8:35                   ` Georg Bauhaus
2007-07-26 22:11               ` Randy Brukardt
replies disabled

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox