comp.lang.ada
* Hotspot. Dynamic compilers "better" than static one?
       [not found] <6knj4m$odp$1@nnrp1.dejanews.com>
@ 1998-05-30  0:00 ` nabbasi
  1998-05-30  0:00   ` Roedy Green
  0 siblings, 1 reply; 9+ messages in thread
From: nabbasi @ 1998-05-30  0:00 UTC (permalink / raw)



from comp.lang.java.programmer:

In article <6knj4m$odp$1@nnrp1.dejanews.com>, pgpagel@yahoo.com says...
>
>Check it out (well-written article and good links):
>
>http://www.developer.com/journal/techfocus/052598_hotspot.html
>

thanks for the pointer.

This below is from http://www.javaworld.com/jw-03-1998/jw-03-hotspot.html
which is a link from above link:

>Runtime information
>The second major advantage of dynamic compilation is the ability to
>take into account information that is known only at runtime. Again,
>the details are proprietary. But it's not hard to imagine, for
>example, that if a method is invoked in a loop that has an upper
>index value of 10,000, it's an immediate candidate for optimization.
>If, on the other hand, the upper loop index value is 1, the optimizer
>will know to ignore that method and let it be interpreted. Because a
>static compiler has no way of knowing for sure what the values might
>be, it can't make such judgements.

I think the above statement is all bogus hand-waving.

A static compiler has all the time in the world to do optimization;
after all, it runs before run time.  It does not matter if the static
compiler optimizes too many things, some of which may turn out at run
time not to be needed.  I can wait a few more seconds for the program
to compile, and I am willing to live with unnecessarily
"over-optimized" machine code.

Also, the dynamic compiler in the above example needs to check the
upper limit every time; maybe one time it is called with an upper
limit of 1, but the next time it is not.
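
To illustrate the overhead I mean, here is a toy Java sketch (all the
names are invented): the check on the upper limit has to stay in the
generated code, because nothing stops the next call from using a
different limit.

   // Toy illustration only -- a dynamic compiler would generate
   // something like the guard in sum(), not hand-written source.
   class GuardExample {
      static long compiledSum(int n) {        // stand-in for the
         long s = 0;                          //   "optimized" version
         for (int i = 0; i < n; i++) s += i;
         return s;
      }
      static long interpretedSum(int n) {     // stand-in for the
         long s = 0;                          //   "interpreted" version
         for (int i = 0; i < n; i++) s += i;
         return s;
      }
      static long sum(int n) {
         // This guard runs on *every* call.
         return (n >= 10000) ? compiledSum(n) : interpretedSum(n);
      }
      public static void main(String[] args) {
         System.out.println(sum(10000));  // takes the "compiled" path
         System.out.println(sum(1));      // takes the "interpreted" path
      }
   }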

I just do not see how this example makes a dynamic compiler somehow
better than a static compiler.  Maybe someone can come up with a
better example.

Nasser

------------------
Spam free Usenet news http://www.newsguy.com




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Hotspot. Dynamic compilers "better" than static one?
  1998-05-30  0:00 ` Hotspot. Dynamic compilers "better" than static one? nabbasi
@ 1998-05-30  0:00   ` Roedy Green
  1998-05-30  0:00     ` Andi Kleen
  1998-06-01  0:00     ` Norman H. Cohen
  0 siblings, 2 replies; 9+ messages in thread
From: Roedy Green @ 1998-05-30  0:00 UTC (permalink / raw)



nabbasi asked, wrote, or quoted:
>I just do not see how this example makes dynamic compiler somehow better
>than
>static compilers. may be someone can comes up with a better example.

Here are a couple of examples where information gleaned from dynamic
analysis could help substantially in optimisation.

(1) Most machine architectures run fastest when there are no jumps,
and when any conditional jumps fall through rather than branch.

By dynamic analysis you can determine which branch is the more likely, and
put that one inline and branch off to code elsewhere for the less likely
one.

Over a decade ago I spent months writing screaming-fast code for the
nucleus of a 32-bit Forth compiler that hand-optimised every jump this way.
The secondary advantage is that the most commonly used code is more likely
to be pre-fetched or in cache.  A static optimiser can't do this, since it
has no knowledge of which branch is the more likely.
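
At the source level the effect looks something like the following
Java sketch (the class is invented; a real compiler does this with
machine-code layout, not source):

   // Illustrative only.  Suppose dynamic analysis shows the negative
   // case is rare.  Testing the common case first means the
   // conditional branch usually falls through, and the hot code
   // stays contiguous for prefetching and the cache.
   class BranchLayout {
      static int process(int value) {
         if (value >= 0) {                // common case: falls through
            return value * 2;             // hot code, kept inline
         }
         return handleNegative(value);    // rare case: branch away
      }
      static int handleNegative(int value) {
         return 0;                        // cold code, placed elsewhere
      }
      public static void main(String[] args) {
         System.out.println(process(21)); // 42, via the hot path
         System.out.println(process(-5)); // 0, via the cold path
      }
   }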

(2) If, through dynamic analysis, the compiler discovers that a loop
is chewing up many cycles, it can consider unravelling it, inlining
routines in it, or partially unravelling it.  It would be
counterproductive to do this to a non-crucial loop: you would add to
the code bulk and thus slow the whole process down with longer load
times, wasted cache, and wasted RAM.
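
For example, partially unravelling a summation loop by a factor of
four might look like this invented Java sketch (not the output of any
particular compiler):

   // Illustrative only.  The original loop was:
   //    for (int i = 0; i < n; i++) total += a[i];
   class Unroll {
      static long sum(int[] a) {
         long total = 0;
         int n = a.length;
         int i = 0;
         for (; i + 3 < n; i += 4) {      // unrolled main body: one
            total += a[i] + a[i + 1]      //   test per four elements
                   + a[i + 2] + a[i + 3];
         }
         for (; i < n; i++) {             // cleanup loop for the
            total += a[i];                //   n % 4 leftover elements
         }
         return total;
      }
      public static void main(String[] args) {
         System.out.println(sum(new int[] {1, 2, 3, 4, 5})); // 15
      }
   }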



For the JAVA GLOSSARY and the CMP Utilities: <http://oberon.ark.com/~roedy>
--
Roedy Green                          Canadian Mind Products 

Sponsored by: www.athena.com, makers of Integer, a multiuser
spreadsheet JavaBean. Opinions expressed are  not necessarily 
those of Athena Design.
-30-




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Hotspot. Dynamic compilers "better" than static one?
  1998-05-30  0:00   ` Roedy Green
@ 1998-05-30  0:00     ` Andi Kleen
       [not found]       ` <dewar.896629645@merv>
  1998-06-01  0:00     ` Norman H. Cohen
  1 sibling, 1 reply; 9+ messages in thread
From: Andi Kleen @ 1998-05-30  0:00 UTC (permalink / raw)



Roedy Green <roedy@oberon.ark.com> writes:

> Over a decade ago I spent months writing screaming fast code for the
> nucleus of a 32-bit Forth compiler that hand optimised every jump this way.
> The secondary advantage is that the most commonly used code is more likely
> to be pre-fetched or in cache.  A static optimiser can't do this, since it
> has no knowledge of which branch is the more likely.

Many modern compilers support profiling feedback.  This means you
compile the program, run it to generate profiling data, and then
compile the program again, feeding that profiling data back into the
compiler.
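
A toy Java illustration of the two phases (names invented; real
compilers instrument the generated code rather than the source):

   // Illustrative only.  The "training" run counts branch outcomes;
   // a second compile would read the counts and lay out the common
   // case as the fall-through path.
   class ProfileFeedback {
      static long taken, notTaken;        // edge counters bumped by
                                          //   the instrumented build
      static int process(int v) {
         if (v >= 0) { taken++;    return v * 2; }
         else        { notTaken++; return 0; }
      }
      public static void main(String[] args) {
         for (int v = -2; v < 100; v++) process(v);   // training run
         // Feedback for the recompile: v >= 0 dominates.
         System.out.println("taken=" + taken + " notTaken=" + notTaken);
      }
   }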

Dynamic compiling has the potential advantage that the code is tuned to
the particular usage pattern of the end user, but I think for most programs
that does not matter much.

-Andi




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Hotspot. Dynamic compilers "better" than static one?
  1998-05-30  0:00   ` Roedy Green
  1998-05-30  0:00     ` Andi Kleen
@ 1998-06-01  0:00     ` Norman H. Cohen
  1998-06-03  0:00       ` John Volan
  1 sibling, 1 reply; 9+ messages in thread
From: Norman H. Cohen @ 1998-06-01  0:00 UTC (permalink / raw)



Roedy Green wrote:

> Here are a couple of examples where information gleaned from dynamic
> analysis could help substantially in optimisation.
> 
> (1) Most machine architectures run fastest if there are no jumps, and if
> any conditional jumps don't jump, just fall through.
> 
> By dynamic analysis you can determine which branch is the more likely, and
> put that one inline and branch off to code elsewhere for the less likely
> one.
> 
> Over a decade ago I spent months writing screaming fast code for the
> nucleus of a 32-bit Forth compiler that hand optimised every jump this way.
> The secondary advantage is that the most commonly used code is more likely
> to be pre-fetched or in cache.  A static optimiser can't do this, since it
> has no knowledge of which branch is the more likely.
...

In particular, dynamic analysis has been shown to be effective in
optimizing dynamically bound function calls in object-oriented
languages.  Suppose there is an Ada tagged type T with primitive
subprogram

   procedure P(X: in out T);

and types T1, T2, T3, ... derived from T.  If we know that most of the
calls on P are going to be with actual parameters of type T1, the call

   P(A);

(where A is of type T'Class) can, in effect, be transformed into

   if A'Tag = T1'Tag then
      P( T1(A) );  -- Static call on T1's version of P;
                   --   tag check for conversion can be
                   --   optimized away
   else
      P(A);  -- Dispatching call based on tag of A
   end if;

(Translation into Java:  Suppose there is a class T with method p and
final subclasses T1, T2, T3, ....  If we know that most of the calls on
p are going to be for objects of class T1, the call

   x.p();

(where x is of type T) can, in effect, be transformed into

   if ( x instanceof T1 ) {
      ((T1)x).p();
   } else {
      x.p();
   }

where the check for the cast to T1 can be optimized away and the call to
((T1)x).p can be compiled as a direct method call to T1's version of p.)

By eliminating the indirect branch (dynamic call) in the most common
case, this transformation (sometimes called "if-conversion") improves
I-cache locality and instruction prefetching.  More importantly
(especially when cache effects are negligible compared to the overhead of
interpreting Java byte codes), the static call on T1's version of P can
be inlined.  Besides eliminating the procedure-call overhead, the
inlining enables further optimizations, because the inlined copy of the
procedure body can be optimized with all the information available in
the calling context.
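
To make the payoff concrete, suppose T1's version of p just bumps a
counter (the class bodies below are invented for illustration).  After
if-conversion and inlining, the hot path contains no call at all:

   class T                   { int count; void p() { count += 2; } }
   final class T1 extends T  { void p() { count++; } }

   class IfConversionDemo {
      static void call(T x) {
         if (x instanceof T1) {
            ((T1)x).count++;   // T1's p() inlined: no call overhead,
                               //   and the increment is now visible
                               //   to the optimizer at the call site
         } else {
            x.p();             // rare case: still a dispatching call
         }
      }
      public static void main(String[] args) {
         T a = new T1();
         call(a);
         System.out.println(a.count);  // prints 1
      }
   }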

It makes no sense to perform "if-conversion" unless there is information
available indicating that more of the dispatching calls from a given
call site go to one target than to others.  This information can be
gleaned from profiling or from dynamic compilation.  The Java Hotspot
compiler goes even further, rechecking dispatching frequencies as a
program executes and recompiling if-conversions favoring different
target methods as the program's dynamic behavior evolves.  It has even
been claimed that inlining optimizations are redone dynamically, but I
suspect that the inlining optimizations performed dynamically are much
less ambitious than those traditionally performed statically.

Studies of OO code indicate that if-conversions can be very worthwhile. 
(See, for example, the paper "Reducing Indirect Function Call Overhead
in C++ Programs" by Brad Calder and Dirk Grunwald of the University of
Colorado, in the 1994 POPL proceedings.)  Indeed, often an apparently
dynamic call has only one possible target in the program (e.g. when an
Ada programmer declared a type tagged to provide flexibility for future
enhancements, but did not derive from that type, or when a Java
programmer neglected to declare a method final); in this case the
if-conversion can actually be done at link time, without any profiling
information.  In other cases, a class library with several subclasses
may be linked in, but the program might actually construct objects of
only one class; this is detectable by profiling or dynamic compilation. 
Of course there are also cases where more than one target is dispatched
to at run time, but the distribution of dispatch targets is highly
skewed.

-- 
Norman H. Cohen
mailto:ncohen@watson.ibm.com
http://www.research.ibm.com/people/n/ncohen




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Hotspot. Dynamic compilers "better" than static one?
       [not found]       ` <dewar.896629645@merv>
@ 1998-06-02  0:00         ` Dr Richard A. O'Keefe
  1998-06-02  0:00           ` Lieven Marchand
  0 siblings, 1 reply; 9+ messages in thread
From: Dr Richard A. O'Keefe @ 1998-06-02  0:00 UTC (permalink / raw)



Robert Dewar wrote:
<<Many modern compilers support profiling feedback.>>

> Actually fewer production compilers than you might imagine actually use
> this approach, though of course it has appeared in research compilers for
> a long time.

Sun's C, Pascal, and Fortran compilers for SPARC do this.
Digital's compilers for the Alpha do this.
The MIPS compilers have done this even longer.
In fact, of the machines I have or could get accounts on here,
the only ones that DON'T have 'production compilers' already
installed that use profile-driven feedback are Macs and PCs.

What may be of more interest is that I have seen profile-driven
feedback make a difference of 0-20%, with 0% actually being quite
common, and I've had better performance using gcc -O6 (without
feedback) on an Alpha than I've had from the DEC compiler _with_
feedback.

So profile-driven feedback HAS BEEN SHIPPING FOR SEVERAL YEARS
in production compilers from MIPS, Digital, and Sun, and that's
just the ones I've used.  But it's only one technique, and not
always the most important.




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Hotspot. Dynamic compilers "better" than static one?
  1998-06-02  0:00         ` Dr Richard A. O'Keefe
@ 1998-06-02  0:00           ` Lieven Marchand
  0 siblings, 0 replies; 9+ messages in thread
From: Lieven Marchand @ 1998-06-02  0:00 UTC (permalink / raw)



"Dr Richard A. O'Keefe" <ok@atlas.otago.ac.nz> writes:

> Robert Dewar wrote:
> <<Many modern compilers support profiling feedback.>>
> 
> > Actually fewer production compilers than you might imagine actually use
> > this approach, though of course it has appeared in research compilers for
> > a long time.
> 
> Sun's C, Pascal, and Fortran compilers for SPARC do this.
> Digital's compilers for the Alpha do this.

The K&R C compiler for Ultrix already had profiling feedback about 10 years 
ago.

> What may be of more interest is that I have seen profile-driven
> feedback make a difference of 0-20%, with 0% actually being quite
> common, and I've had better performance using gcc -O6 (without
> feedback) on an Alpha than I've had from the DEC compiler _with_
> feedback.
> 

I've never seen it make much difference, however.

> So profile-driven feedback HAS BEEN SHIPPING FOR SEVERAL YEARS
> in production compilers from MIPS, Digital, and Sun, and that's
> just the ones I've used.  But it's only one technique, and not
> always the most important.

Has anybody done any studies on how to choose the input for the trial
runs whose profiles you feed back to the optimizer?  There seems to be
a silent assumption that "typical" input values are adequate, and that
the choice of input values won't make performance worse on some other
kind of input.

-- 
Lieven Marchand <mal@bewoner.dma.be> 
------------------------------------------------------------------------------
Few people have a talent for constructive laziness. -- Lazarus Long



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Hotspot. Dynamic compilers "better" than static one?
  1998-06-01  0:00     ` Norman H. Cohen
@ 1998-06-03  0:00       ` John Volan
  1998-06-05  0:00         ` Norman H. Cohen
  0 siblings, 1 reply; 9+ messages in thread
From: John Volan @ 1998-06-03  0:00 UTC (permalink / raw)



Norman H. Cohen wrote:
> 
> (Translation into Java:  Suppose there is a class T with method p and
> final subclasses T1, T2, T3, ....  If we know that most of the calls on
> p are going to be for objects of class T1, the call
> 
>    x.p();
> 
> (where x is of type T) can, in effect, be transformed into
> 
>    if ( x instanceof T1 ) {
>       ((T1)x).p();
>    } else {
>       x.p();
>    }
> 
> where the check for the cast to T1 can be optimized away and the call to
> ((T1)x).p can be compiled as direct method call to T1's version of p.)

Doesn't that presume that T1 is a "final" class (i.e., a leaf on the
inheritance tree, with no further subclasses allowed)?  Or at least that
T1.p() is a "final" method (i.e., locked out from all further overriding
in any subclass of T1)?  If T1 isn't final, then even the expression
(T1)x has to be treated as potentially polymorphic. Given the JVM's
dynamic class-loading mechanism, you can never know when a subclass of
T1 might suddenly be pulled into the JVM.  If T1.p() isn't final, then
the call ((T1)x).p() is still potentially dispatching, so how can the
optimization described above be done legally?

Note that the Java expression

   ( x instanceof T1 )

is NOT equivalent to the Ada expression

   ( A in T1 )        -- i.e., A'Tag = T1'Tag

it's equivalent to

   ( A in T1'Class )  -- i.e., A'Tag = 'Tag of T1 or any subclass of T1
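
Incidentally, Java can express the exact-tag test directly with a
class literal; a sketch paralleling the fragment above:

   if ( x.getClass() == T1.class ) {
      ((T1)x).p();   // x is exactly a T1, so this call can be bound
                     //   statically to T1's p even if T1 is not final
   } else {
      x.p();         // dispatching call
   }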

If I'm missing something here, I'd be grateful for a clarification.

-- 
Signature volanSignature = 
  new Signature
  ( /*name:      */ "John G. Volan",
    /*employer:  */ "Raytheon Advanced C3I Systems, San Jose",
    /*workEmail: */ "johnv@ac3i.dseg.ti.com",
    /*homeEmail: */ "johnvolan@sprintmail.com",
    /*selfPlug:  */ "Sun Certified Java Programmer",
    /*twoCents:  */ "Java would be even cooler with Ada95's " +
                    "generics, enumerated types, function types, " +
                    "named parameter passing, etc...",
    /*disclaimer:*/ "These views not packaged in COM.ti.dseg.ac3i, " +
                    "so loading them throws DontQuoteMeError. :-)" );




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Hotspot. Dynamic compilers "better" than static one?
  1998-06-03  0:00       ` John Volan
@ 1998-06-05  0:00         ` Norman H. Cohen
  1998-06-08  0:00           ` John Volan
  0 siblings, 1 reply; 9+ messages in thread
From: Norman H. Cohen @ 1998-06-05  0:00 UTC (permalink / raw)
  To: John Volan


John Volan wrote:

> Norman H. Cohen wrote:
> >
> > (Translation into Java:  Suppose there is a class T with method p and
> > final subclasses T1, T2, T3, ....  If we know that most of the calls on
...
> 
> Doesn't that presume that T1 is a "final" class (i.e., a leaf on the
> inheritance tree, with no further subclasses allowed)? 
...
> 
> If I'm missing something here, I'd be grateful for a clarification.

Yeah, you're missing the first word on the second quoted line from my
post! :-)

-- 
Norman H. Cohen
mailto:ncohen@watson.ibm.com
http://www.research.ibm.com/people/n/ncohen




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Hotspot. Dynamic compilers "better" than static one?
  1998-06-05  0:00         ` Norman H. Cohen
@ 1998-06-08  0:00           ` John Volan
  0 siblings, 0 replies; 9+ messages in thread
From: John Volan @ 1998-06-08  0:00 UTC (permalink / raw)



Norman H. Cohen wrote:
> 
> Yeah, you're missing the first word on the second quoted line from my
> post! :-)

Sheesh! It's like when you don't notice you've doubled a word on the
the margins from one line to another... :-)

-- John Volan




^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~1998-06-08  0:00 UTC | newest]

Thread overview: 9+ messages
     [not found] <6knj4m$odp$1@nnrp1.dejanews.com>
1998-05-30  0:00 ` Hotspot. Dynamic compilers "better" than static one? nabbasi
1998-05-30  0:00   ` Roedy Green
1998-05-30  0:00     ` Andi Kleen
     [not found]       ` <dewar.896629645@merv>
1998-06-02  0:00         ` Dr Richard A. O'Keefe
1998-06-02  0:00           ` Lieven Marchand
1998-06-01  0:00     ` Norman H. Cohen
1998-06-03  0:00       ` John Volan
1998-06-05  0:00         ` Norman H. Cohen
1998-06-08  0:00           ` John Volan
