comp.lang.ada
From: "Nick Roberts" <nick.roberts@acm.org>
Subject: Re: reading a text file into a string
Date: Sat, 24 Jul 2004 16:14:50 +0100
Message-ID: <opsbndy0o1p4pfvb@bram-2>
In-Reply-To: nvCdnXEk6MblI5zcRVn-pg@megapath.net

On Fri, 23 Jul 2004 20:42:53 -0500, Randy Brukardt <randy@rrsoftware.com>  
wrote:

> ...
> Well, first of all, books don't necessarily equal practice.

In other words, you /are/ trying to say all those computer
scientists got it wrong ;-)

> If aligning things causes a program to use more pages, it
> can make it run slower, because it makes it load code from
> disk more frequently.

But we (Robert and I) are talking about using alignments
sparingly, to improve the efficiency of the speed-critical
parts of a program. Surely you've heard of the 80-20 rule?
(Which is, of course, silly, being the 99-1 rule in reality.)

> Anyway, I wasn't arguing that alignment per-se is a bad
> idea. We do it on integers, for instance, and I think that
> virtually all compilers do that.

> I was arguing that on the x86, stack alignments beyond 4
> can only be done at run-time. (Unless *all* software in
> the system is under your control, and there are no
> interrupts/signals on your stack -- never true in
> practice.)

But Randy, if you get a signal/interrupt on your stack, it
all happens on the top of your stack. It doesn't affect the
stack's alignment! Were you actually talking about
callbacks?

In any event, all the compiler has to do to align the stack
to 2^n bytes just prior to (parameter pushing and) subroutine
call is to emit:

    and esp, -2^n

et voila!
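
To make the arithmetic behind that 'and' concrete, here is a minimal sketch in C, used purely as an illustration since it is not the language of this thread. The function name align_down is hypothetical. It assumes two's complement addresses and that n is smaller than the pointer width in bits.

```c
#include <stdint.h>

/* Round an address down to a multiple of 2^n bytes.
   In two's complement, -(2^n) has its low n bits clear and all
   higher bits set, so ANDing with it clears the low n bits.
   Assumes n is less than the width of uintptr_t in bits. */
uintptr_t align_down(uintptr_t addr, unsigned n)
{
    uintptr_t mask = ~(((uintptr_t)1 << n) - 1);  /* same bit pattern as -(2^n) */
    return addr & mask;
}
```

Because the x86 stack grows towards lower addresses, rounding the stack pointer down never overlaps live data; it skips at most 2^n - 1 bytes. The caller does need to keep the original stack pointer somewhere (typically in the frame pointer) so that it can be restored afterwards.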

> That's a distributed penalty that gets paid everywhere.

No it isn't. Only in calling those subroutines which require
alignment, and even then the penalty is an 'and' instruction
which, as you know, can probably be scheduled to take zero
time on a superscalar target.

> Similarly, existing Windows linkers don't support alignments
> beyond 16 to my knowledge -- so again you would have to do
> something at runtime with a penalty.

But then the point is that the linkers /should/ support other
alignments. It's no good saying "Oh, we can't do that because
the linker doesn't support it!" Obviously, you need to change
the linker. It's called not letting the tail wag the dog :-)

> In both cases, the penalty might very well cost more than
> the time savings possible.

I think I've demonstrated that this is very unlikely.

> Given there is a penalty, doing alignments automatically is
> a bad idea.

All I can say is that, given that there /isn't/ a penalty,
doing (cache-line) alignments automatically is a /good/
idea :-)

> Last time I checked, Intel was recommending that labels in
> code not be aligned further than 4 byte boundaries.

The latest advice is:

    Loop entry labels should be 16-byte-aligned when less than
    eight bytes away from a 16-byte boundary.

    Labels that follow a conditional branch need not be aligned.

    Labels that follow an unconditional branch or function call
    should be 16-byte-aligned when less than eight bytes away
    from a 16-byte boundary.

    Use a compiler that will assure these rules are met for the
    generated code.

[Section 2, Intel Architecture Optimization Reference Manual,
Copyright (c) 1998, 1999 Intel Corporation All Rights Reserved
Issued in U.S.A., Order Number: 245127-001]
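
As a rough sketch of how a code generator might apply the loop-entry rule quoted above, the C function below returns how many padding bytes to emit before a label. The name label_padding is hypothetical, and it assumes one reading of the rule: that "away from a 16-byte boundary" means the distance forward to the next boundary.

```c
#include <stdint.h>

/* Number of padding bytes to emit before a label at address addr.
   Assumed reading of the rule: pad up to the next 16-byte boundary
   only when that boundary is fewer than 8 bytes ahead; otherwise
   leave the label where it is. */
unsigned label_padding(uintptr_t addr)
{
    unsigned gap = (unsigned)((16u - (addr & 15u)) & 15u);  /* 0 when already aligned */
    return (gap < 8u) ? gap : 0u;
}
```

Under that reading, a label at 0x10A gets 6 bytes of padding and lands on 0x110, while a label at 0x103 (13 bytes short of the boundary) is left alone.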

> I don't know precisely why they recommended that, but I don't
> claim to know better than Intel!

Well, I don't think they ever did; maybe you need to do some
re-reading.

>> If you are worried about the fact that all stacks and heaps/
>> pools must be cache-line aligned (32, 64 bytes?), you have
>> missed the RAM revolution that has been going on for the last
>> two decades ;-)
>
> That's only possible if you build a new OS from the ground up.

Hehe :-)

> Stacks aren't aligned in Windows or Linux. So you have to pay
> a penalty to make them so;

Again, I think the penalty is tiny (or zero), and not universal.

> and because of interrupt handlers and the like,

Did you mean callbacks?

> you can't even trust your own stack.

Indeed, so you have to align it yourself using an 'and'.

> Heap allocations aren't aligned in Windows, either. (Although
> you could build your own heap on top of the page management in
> Windows -- but you better be prepared to allocate 64K at a
> time.) Again, you can fix this with run-time overhead.

Okay, but the example that Robert gave was of a (presumably)
stack allocated object, and nobody mentioned anything about
Windows or the IA-32 before you did. In general, there's
nothing to prevent heaps/pools being capable of cache-line
aligned allocation; I guess it would be harder to use the
gaps for smaller allocations, but I'm sure that doesn't
really matter.
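
A minimal sketch of what cache-line aligned allocation from a pool could look like, again in C for illustration only. The names pool_t and pool_alloc_aligned and the 64-byte line size are assumptions, not anything from an existing allocator. It is a simple bump allocator that never frees, and it makes the gaps mentioned above visible: the bytes skipped to reach the next line boundary are simply counted as used.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64u   /* assumed line size; some targets use 32 */

typedef struct {
    unsigned char *base;  /* start of the backing storage */
    size_t size;          /* bytes of backing storage */
    size_t used;          /* bytes handed out so far, including gaps */
} pool_t;

/* Return a CACHE_LINE-aligned block of `bytes` bytes from the pool,
   or NULL if the pool cannot satisfy the request. */
void *pool_alloc_aligned(pool_t *p, size_t bytes)
{
    uintptr_t next = (uintptr_t)(p->base + p->used);
    /* Round up to the next multiple of CACHE_LINE. */
    uintptr_t aligned = (next + (CACHE_LINE - 1)) & ~(uintptr_t)(CACHE_LINE - 1);
    size_t gap = (size_t)(aligned - next);   /* bytes skipped for alignment */
    size_t left = p->size - p->used;

    if (gap > left || bytes > left - gap)
        return NULL;

    p->used += gap + bytes;
    return (void *)aligned;
}
```

A pool that wanted to reuse the gaps could record them on a free list for small requests; this sketch does not attempt that.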

> But if you're willing to spend run-time overhead, an
> address clause does the same thing without any work.

Well, I would argue that a good highly optimising compiler
should provide a convenient and portable way of enabling the
programmer to achieve cache-line optimisations, for both code
and data. Probably the best way is by providing appropriate
pragmas (that will be harmlessly ignored when irrelevant).

A possibility is to interpret the humble

    pragma Optimize(Time);

to mean doing the cache-line alignments recommended for the
target processor (group or architecture).

In general, it is better for the compiler to make decisions
about code or data placement for optimisation purposes,
since only the compiler can know /all/ the other
implementational details which could affect these decisions.
I think it is best for the compiler to make these decisions
guided by hints given in the form of pragmas.

However, if a compiler does not do cache-line optimisations
itself (automatically), it ought to support some reasonable
method by which it can be done explicitly (and I don't think
using an address clause is ideal for this purpose). I think
it is implicit that by 'compiler' Robert and I mean
'the toolchain necessary to get from source to executable'.
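
For comparison only, here is how an explicit request of that kind looks in C11, which lets the programmer ask for a cache-line aligned stack object and leaves the mechanics to the compiler. This is an analogy from another language, not a proposal for Ada syntax. The function name is hypothetical, the 64-byte line size is an assumption, and support for alignments this large on automatic objects is implementation-defined in C11 (GCC and Clang accept it on x86).

```c
#include <stdalign.h>
#include <stdint.h>

/* Declare a local buffer with a requested 64-byte alignment and
   report whether the address the compiler chose honours it. */
int stack_buffer_is_line_aligned(void)
{
    alignas(64) unsigned char buffer[64];  /* 64 = assumed cache-line size */
    buffer[0] = 0;                         /* touch it so it is really allocated */
    return ((uintptr_t)buffer % 64) == 0;
}
```

The source states the requirement once, at the declaration, and the toolchain decides how to meet it on the target.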

-- 
Nick Roberts


