From mboxrd@z Thu Jan  1 00:00:00 1970
X-Google-Thread: 103376,a3fe2aac201210c0
X-Google-Attributes: gid103376,public
X-Google-Language: ENGLISH,ASCII-7-bit
Path: 
 g2news1.google.com!news2.google.com!fu-berlin.de!uni-berlin.de!not-for-mail
From: "Nick Roberts" <nick.roberts@acm.org>
Newsgroups: comp.lang.ada
Subject: Re: reading a text file into a string
Date: Tue, 27 Jul 2004 13:08:18 +0100
Message-ID: <opsbspb4bep4pfvb@bram-2>
References: <40f6bf21@dnews.tpgi.com.au> <E1HJc.101277$Oq2.96646@attbi_s52>
 <LNOdnQjer4Hf22XdRVn-sA@megapath.net> <fOednXzORfHlE2Xd4p2dnA@comcast.com>
 <MrNoSpam-C3D3BB.18074619072004@news-server.bigpond.net.au>
 <40fb8c00$1_1@baen1673807.greenlnk.net> <nM-dnegXLbmdK2DdRVn-hQ@comcast.com>
 <XMCdnQjqXrrDf2PdRVn-pw@megapath.net> <OrednWv2_cdw3Z3cRVn-uQ@comcast.com>
 <rOWdnYp1K5bk_Z3cRVn-vA@megapath.net> <opsbl1vsgsp4pfvb@bram-2>
 <nvCdnXEk6MblI5zcRVn-pg@megapath.net> <opsbndy0o1p4pfvb@bram-2>
 <A8OdnbhfjamSBZjcRVn-ug@megapath.net>
Mime-Version: 1.0
Content-Type: text/plain; format=flowed; delsp=yes; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Trace: news.uni-berlin.de G2rgRV0aPIVMMNxv2UgPqQXDYaYqpuKzv9KVmMNxpI6RmgDqk=
User-Agent: Opera M2/7.51 (Win32, build 3798)
Xref: g2news1.google.com comp.lang.ada:2411
Date: 2004-07-27T13:08:18+01:00
List-Id: <comp.lang.ada>

[I've put my replies out of order, because I think there's a bit
in the middle that needs to be said first.]

On Mon, 26 Jul 2004 18:48:04 -0500, Randy Brukardt
<randy@rrsoftware.com> wrote:

> ...
>> > Last time I checked, Intel was recommending that labels in
>> > code not be aligned further than 4 byte boundaries.
>>
>> The latest advice is:
>>
>>     Loop entry labels should be 16-byte-aligned when less than
>>     eight bytes away from a 16-byte boundary.
>>
>>     Labels that follow a conditional branch need not be aligned.
>>
>>     Labels that follow an unconditional branch or function call
>>     should be 16-byte-aligned when less than eight bytes away
>>     from a 16-byte boundary.
>>
>>     Use a compiler that will assure these rules are met for the
>>     generated code.
>>
>> [Section 2, Intel Architecture Optimization Reference Manual,
>> Copyright (c) 1998, 1999 Intel Corporation All Rights Reserved
>> Issued in U.S.A., Order Number: 245127-001]
>>
>> > I don't know precisely why they recommended that, but I don't
>> > claim to know better than Intel!
>>
>> Well, I don't think they ever did; maybe you need to do some
>> re-reading.
>
> That's it. That's the third time in the last few months that
> you've essentially called me a liar - or senile - and I'm done
> taking it without comment. Either we're going to talk without
> personal attacks, or we're not going to talk at all. OK?

Well, that comes as a bolt out of the blue, Randy.

Let me first assure you that neither this time nor at any time in
the past have I intended to imply that you were lying or to make
any personal slight against you.

On consideration, I feel that I should not have made the remark
"maybe you need to do some re-reading", and I do truly apologise
for it. It was intended to be lighthearted and to be taken in a
friendly manner. Usenet is a medium given to stripping away all
the extra cues that a different medium (such as a telephone call)
would convey that help to disambiguate communications. It is easy,
sometimes, to forget this, but I should have known better.

In fact, I'm very unhappy that this seems to be the impression
that you have got of me, Randy, because the truth is -- though
sadly you may not believe it now -- I have the greatest respect for
you, and I honestly admire you: for what you have done and continue
to do for the ARG and Ada standards and to champion the use of Ada;
for your contributions to the Ada community (as I know it, in terms
of Usenet and other Internet venues), and the friendly and helpful
manner of those contributions.

We may have had disagreements about lots of things during the
course of discussions between us, but there is a big, big difference,
as far as I am concerned, between disagreeing with someone and
having less respect for them.

I do really hope that I have not permanently destroyed any faith
you may have had in me, and I regret anything I may have said in
the past to this effect. I often have a clumsy and hasty style of
writing on Usenet, and I'm sure that often what I say comes across
with a different meaning or emphasis to what I intended.

That said, I hope my remaining replies will be taken in good part.

> For the record, my knowledge of Intel's recommendations primarily
> comes from an Intel seminar I attended some years ago. Since it
> was covered by an NDA (non-disclosure agreement), I can't even
> show you - or tell you for that matter - much more than that.

I think I once read a magazine article that said Intel were no
longer recommending cache-line (or half-line) alignments for code,
for their (as it was then) upcoming Pentium model. I have read
this sort of thing before, and dismissed it as hype or gossip,
since the official (published) Intel recommendations never changed
in the event. So I have tended to assume that repetitions of the
idea have simply been repetitions of gossip.

Obviously, since your information in fact comes direct from
Intel, I was wrong, and I was wrong to have doubted you.

> In any case, the rules that you gave above are weaker in most
> areas than the ones I remember (labels at 4, subprograms at 16),
> and certainly give no indication of the value of cache-line
> sized optimizations -- which is what I think we were talking
> about. I see nothing above recommending alignments greater than
> 16 for anything.

According to the manual, the 16-byte alignments are to do with
the way the instruction pre-decoding unit loads code, which is
16 bytes (a cache 'half-line') at a time. But is the manual
correct?
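
To make the quoted rule concrete, here is a minimal sketch of how
I read it (Python purely for illustration; the function name, and
the reading of "less than eight bytes away" as the distance
forward to the next 16-byte boundary, are my own assumptions and
not Intel's wording):

```python
def padding_before_label(address, boundary=16, threshold=8):
    """Bytes of padding to insert so that a label lands on the next
    boundary, but only when it is already close to that boundary.

    Assumption: "less than eight bytes away" is read as the distance
    forward to the next 16-byte boundary.
    """
    distance = (-address) % boundary   # bytes up to the next boundary
    if 0 < distance < threshold:
        return distance                # close enough: pad up to it
    return 0                           # already aligned, or too far away


# Show the padding chosen for a few sample label addresses.
for addr in (0x100, 0x103, 0x108, 0x109, 0x10A):
    print(hex(addr), padding_before_label(addr))
```

On this reading, a label is never padded by more than seven
bytes, which keeps the cost in code size small.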

>> > If aligning things causes a program to use more pages,
>> > it can make it run slower, because it makes it load
>> > code from disk more frequently.
>>
>> But we (Robert and I) are talking about using alignments
>> sparingly, to improve the efficiency of the speed-critical
>> parts of a program. Surely you've heard of the 80-20 rule?
>> (Which is, of course, silly, being the 99-1 rule in
>> reality.)
>
> The largest alignment that you allow impacts the design of
> your stack and of your storage pool, at least if you intend
> to do it at compile-time. That's a distributed overhead -
> it's small, but certainly not zero.

Well, that's true and I cannot argue with it per se.

However, based on the presumption that typical software does
spend something like 99% of the time in 1% of the code (and
that 1% tends to be fairly 'tight' loops), I am not convinced
that the extra memory space that a program will take up (both
code and data) due to cache-line alignments is more likely to
slow the program down than to speed it up (in that critical
1% of the code).

This will be dependent on how big the working set is during
the execution of that speed-critical code, in particular
whether the working set is caused to exceed available RAM; if
it is, then the program will indeed be slowed down. But, of
course, I am saying that even cheap computers have a lot of
RAM these days, so I think that eventuality is unlikely.

> ...
>> In any event, all the compiler has to do to align the stack
>> to 2^n bytes just prior to (parameter pushing and) subroutine
>> call is to emit:
>>
>>     and esp, -2^n
>>
>> et voila!
>
> How do you undo this when you leave the scope? You have to
> save the ESP value somewhere and restore it to do that, and
> *that* is an extra overhead.

Well, I don't think so. The usual thing to do is to save
ESP in the EBP register at stack frame creation, and restore it
from EBP just prior to return. There is, I grant, a need for a
little care, in that one would (I guess) need to do the stack
alignment I suggested before pushing anything onto the stack
that you might want to pop off it afterwards. Otherwise, I
think the 'and' instruction is the only extra thing required.

I vaguely remember that I have actually used this technique,
but a long time ago.
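
A rough sketch of the arithmetic, emulating the registers as
plain integers (Python purely for illustration; the function
names and the sample ESP values are hypothetical, and this models
only the save, 'and' and restore steps, not a real calling
convention):

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide


def align_stack(esp, n):
    """Emulate `and esp, -2^n`: clear the low n bits of ESP.

    The stack grows downwards, so rounding down never overwrites
    anything already pushed.
    """
    return esp & (-(1 << n) & MASK32)


def call_with_aligned_stack(esp, n):
    """Save ESP in EBP, align, do some work, then restore ESP."""
    ebp = esp                   # mov ebp, esp  (frame creation)
    esp = align_stack(esp, n)   # and esp, -2^n
    aligned = esp
    esp -= 12                   # e.g. three 4-byte pushes (arbitrary)
    esp = ebp                   # mov esp, ebp  (just before return)
    return aligned, esp


aligned, restored = call_with_aligned_stack(0x0012FF7C, 5)
print(hex(aligned), hex(restored))
```

The point is that the restore from EBP does not need to know how
many bytes the 'and' discarded, so nothing extra has to be saved.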

>
> ...
>> > Similarly, existing Windows linkers don't support
>> > alignments beyond 16 to my knowledge -- so again you would
>> > have to do something at runtime with a penalty.
>>
>> But then the point is that the linkers /should/ support other
>> alignments. It's no good saying "Oh, we can't do that because
>> the linker doesn't support it!" Obviously, you need to change
>> the linker. It's called not letting the tail wag the dog :-)
>
> You know as well as I do that you don't get to change your
> target system to your whim. You have to use the tools that
> users want to use, such as the Microsoft linker.
>
> But even if you wrote your own linker, I don't think that there
> is any guarantee of alignment in the loading of the parts of an
> .EXE file. So I don't know if any alignment that you have in
> your linker would actually be preserved.

I can't quickly find information on the subject, but I rather
suspect that an .EXE or .DLL is likely to be loaded page
aligned. That would mean alignments up to the page size would
be safe.
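
The reasoning can be checked with a little arithmetic. This is
only a sketch under assumptions (Python purely for illustration;
4 KB pages, and a made-up load base and offsets), not a statement
of what the Windows loader actually does:

```python
PAGE_SIZE = 4096  # assumption: 4 KB pages, as on ordinary 32-bit x86


def is_aligned(address, alignment):
    """True if address is a multiple of alignment (a power of two)."""
    return address & (alignment - 1) == 0


def runtime_address(load_base, offset):
    """Address of an item placed at `offset` within the loaded image."""
    return load_base + offset


base = 0x00401000  # hypothetical base: page-aligned, nothing more

# A 64-byte-aligned offset stays 64-byte-aligned once loaded.
print(is_aligned(runtime_address(base, 0x1040), 64))

# An 8192-byte-aligned offset need not survive a page-aligned load.
print(is_aligned(runtime_address(base, 0x2000), 8192))
```

Any power-of-two alignment up to the page size divides a
page-aligned base exactly, so it carries through to the run-time
address; larger alignments depend on where the image happens to
be loaded.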

Also, I think possibly we're arguing at cross purposes on
this point. I'm only arguing that linkers and execution
environments /should/ support cache-line alignments. I accept
that many do not, in practice, and I accept that a compiler
targeting such a linker or environment cannot be expected
to do so either. I think this is how Robert's original comment
can be construed, also.

-- 
Nick Roberts