From mboxrd@z Thu Jan  1 00:00:00 1970
From: bobduff@world.std.com (Robert A Duff)
Subject: Re: seperate keyword and seperate compilation with Gnat?
Date: 1996/07/03
Message-ID: <DtzC8p.3o2@world.std.com>
X-Deja-AN: 163567663
references: <31D95D93.28D8D15B@jinx.sckans.edu>
 <31DA7327.6CE2@epi.syr.lmco.com> <PhvqKBA1Lq2xEwOr@rk-comp.demon.co.uk>
organization: The World Public Access UNIX, Brookline, MA
newsgroups: comp.lang.ada
List-Id: <comp.lang.ada>


In article <PhvqKBA1Lq2xEwOr@rk-comp.demon.co.uk>,
Rob Kirkbride  <rob@rk-comp.demon.co.uk> wrote:
>>It appears that what GNAT in effect does is a #include of all subunit
>>bodies. If you try to compile a subunit by itself, GNAT will compile it
>>but not generate any object code for it. Likewise, if you compile a
>>package body which has within it subunit declarations (i.e. "function X
>>return Boolean is separate;") GNAT will only generate object code *if*
>>it can compile all of the subunits. 

That's right.  GNAT has an option to just check legality, and not
generate code.  You can compile something in this mode without having
its subunits available.  But if you want to generate code, all subunits
have to be available.  And if you change a subunit, you have to
recompile the parent, which re-generates code for the whole tree of
subunits.
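
For concreteness, here is a minimal sketch of a body stub and its
subunit.  The names Parent and X and the return value are made up for
illustration; the file names in the comments follow GNAT's default
naming convention.  The check-only mode mentioned above is, as far as
I know, selected with the -gnatc switch.

```ada
--  parent.ads
package Parent is
   function X return Boolean;
end Parent;

--  parent.adb
--  The parent body contains only a stub for X.
package body Parent is
   function X return Boolean is separate;
end Parent;

--  parent-x.adb
--  The subunit: the proper body that the stub stands for.
separate (Parent)
function X return Boolean is
begin
   return True;
end X;
```

With GNAT, object code comes out of compiling parent.adb, and only if
parent-x.adb can be found; editing parent-x.adb means compiling
parent.adb again.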

This is slightly annoying, but I think the designers of GNAT made a
reasonable choice here, because:

1. If you really want the benefits of separate compilation, you can use
child library units, instead of subunits.  Anything that can be done
with subunits can be restructured as a set of library units with child
units.  (This isn't much help for existing Ada 83 code, I admit.)

2. Compiling the whole tree at once allows more efficient code to be
generated.  For example, suppose you have a procedure containing a
package body stub.  If you compile that procedure without looking at the
package body subunit, you have to assume the worst.  In particular, you
have to assume that the package body might create some tasks, so the
procedure has to have a "task master" data structure as a
compiler-generated local variable, just in case.  But the package body
will usually *not* declare any tasks, so you've wasted the effort of
creating, initializing, and finalizing that data structure.  GNAT, on
the other hand, always knows whether a given procedure contains any
tasks, because it doesn't generate code until it can look at all the
package body subunits.

3. Compiling the whole tree at once makes the compiler simpler (= less
buggy).  Consider the same example as above -- a procedure containing a
package body stub.  How big is the stack frame for that procedure?
Well, GNAT knows, because it can see how many variables are declared in
that package body.  But if the compiler can't look at the package body,
then the size is unknown, which means you need a pretty clever linker,
or else you introduce some indirections.
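
The restructuring in point 1 might look roughly like the sketch below.
All names here (Parent, X, Report, Threshold) are hypothetical.  The
main thing to notice is that a child unit sees the parent's
specification, including its private part, but not the parent's body,
so anything the former subunit shared with the parent body has to move
up into the private part (or into another private child).

```ada
--  parent.ads
package Parent is
   procedure Report;
private
   --  Formerly declared in the package body; moved here so that
   --  child units can see it.
   Threshold : constant Integer := 10;
end Parent;

--  parent-x.ads
--  A private child function replaces the subunit X.  It is a library
--  unit in its own right, so it is compiled on its own.
private function Parent.X return Boolean;

--  parent-x.adb
function Parent.X return Boolean is
begin
   return Threshold > 0;
end Parent.X;

--  parent.adb
with Parent.X;
package body Parent is
   procedure Report is
   begin
      if Parent.X then
         null;   --  real work would go here
      end if;
   end Report;
end Parent;
```

Changing parent-x.adb now requires recompiling only that file, at the
cost of exposing Threshold to every child of Parent.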

On the other hand, package body subunits are a much bigger problem for
the implementer than subprogram subunits.  And the latter are more
common.  One could argue that it's better to use the "suck it all in"
approach for packages, but to use the more traditional approach for
subprograms.
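
The example used in points 2 and 3 can be sketched as follows.  The
names Main, Helper, Run, Buffer and Worker are invented for
illustration, and the commented-out task is there only to show what
the compiler would have to allow for if it could not see the subunit.

```ada
--  main.adb
procedure Main is
   package Helper is
      procedure Run;
   end Helper;

   package body Helper is separate;   --  package body stub
begin
   Helper.Run;
end Main;

--  main-helper.adb
separate (Main)
package body Helper is
   --  Declared inside a package nested in Main, so this object is
   --  part of Main's activation and affects its stack frame size.
   Buffer : String (1 .. 80) := (others => ' ');

   --  Uncommenting these lines would make Main a task master:
   --  task Worker;
   --  task body Worker is begin null; end Worker;

   procedure Run is
   begin
      Buffer (1) := '*';
   end Run;
end Helper;
```

A compiler that generates code for main.adb without reading
main-helper.adb knows neither the size of Buffer nor whether Worker
exists.  With a subprogram stub instead of a package body stub, the
subunit's locals live in its own frame, which is why that case is
easier.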

>>Now, I suppose that GNAT's "#include" approach could be defended
>>by an argument like "GNAT *compiles* the subunits and their enclosing
>>bodies in accordance with the LRM, it just doesn't generate object code
>>for them unless ..."  I think that GNAT could thus be said to obey the 
>>*letter* of the LRM, but "#include'ing" subunits does not, IMHO, obey
>>its *spirit*.

True.  The letter of the law is: the compiler reads the source code, and
says "Yes, it's legal", or "No, it's illegal".  A nice compiler will say
*why* it's illegal, but the language standard doesn't even require that.
The language standard doesn't say anything about when the object code is
generated.  For example, an interpretive implementation is allowed --
the "compiler" has to detect legality errors at "compile time", but it
might not generate any object code, and an interpreter then executes the
program from source code.  That's an extreme case, but is allowed.  It's
also allowed to generate all code at link time.  If your machine is fast
enough, or your program small enough, that's perfectly reasonable.

Most people think "compile" means "take some source code, and generate
some object code".  But the Ada RM doesn't say that -- the RM says that
"compile" means "take some source code, and tell me if it's legal".

>I did ask ACT about this and they said that it generates significantly
>better code.

Certainly true.

>... I have personally found it compiles the packages much
>faster presumably because it doesn't have to keep loading the body in
>each time a new separate is compiled. However, it may be also be due to
>just being a better compiler!

Interesting point.  Compiling subunits the "right" (i.e. traditional)
way requires some sort of program-library database stored on disk.  I've
seen compilers where reading the program library junk is horribly slow
-- so slow, that it might be faster to read the source code, as GNAT
does.  This isn't necessarily true, but in practice, it sometimes is.

Furthermore, implementing a traditional program library adds a big chunk
of complexity to the compiler (= compiler bugs -- see point 3 above).

>2) The other problem is that it seems to use a heck of a lot of virtual
>memory as it compiles, which starts to slow it down by greater amounts
>as it begins to swap. Its one thing I was going to ask Robert Dewar
>about whether separates really should make the memory requirements go up
>so much, and if so why. I have units that want over a gigabyte of
>virtual memory to compile whereas another compiler only takes 100M or
>so.

The reason GNAT is using so much memory is that it is sucking up the
entire tree of subunits.  If you compare the amount of memory GNAT uses
with the amount of memory used by a traditional compiler when you
replace all the stubs with their subunits, it should be comparable.

Of course gcc, and therefore GNAT, is not designed to run on
small-memory machines, either.

>I suppose I was surprised that they managed to wangle this feature in.

GNAT is certainly obeying the letter of the law.

As to whether the GNAT way of doing things is desirable, well, I don't
know, but it seems like a reasonable trade-off.

Note that a similar issue arises with generics.  The compiler cannot
generate code when it sees a generic body, unless it is *always* doing
generic-code-sharing, which leads to all kinds of complexity and
inefficiency.
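
A small sketch of the generic case, with made-up names (Swap_Generic,
Demo).  It assumes an implementation that expands each instantiation
separately rather than sharing code, which is my understanding of what
GNAT does.

```ada
--  swap_generic.ads
generic
   type Item is private;
procedure Swap_Generic (Left, Right : in out Item);

--  swap_generic.adb
--  The size of Temp and the code for the copies depend on the actual
--  type, which is unknown when this body is compiled by itself.
procedure Swap_Generic (Left, Right : in out Item) is
   Temp : constant Item := Left;
begin
   Left  := Right;
   Right := Temp;
end Swap_Generic;

--  demo.adb
with Swap_Generic;
procedure Demo is
   procedure Swap is new Swap_Generic (Item => Integer);
   procedure Swap is new Swap_Generic (Item => Long_Float);

   A : Integer    := 1;
   B : Integer    := 2;
   P : Long_Float := 1.0;
   Q : Long_Float := 2.0;
begin
   Swap (A, B);   --  uses the Integer instance
   Swap (P, Q);   --  uses the Long_Float instance
end Demo;
```

The code for both instances is generated when demo.adb is compiled,
much as the code for a subunit is generated when its parent is
compiled.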

- Bob