Concerning subscript bounds checks

comp.lang.ada
 help / color / mirror / Atom feed

* Concerning subscript bounds checks
@ 1996-06-21  0:00 Richard A. O'Keefe
  1996-06-21  0:00 ` Robert Dewar
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Richard A. O'Keefe @ 1996-06-21  0:00 UTC (permalink / raw)



There was some discussion in comp.lang.misc recently about array bounds
checks.  As an exercise, I converted Algorithm AS 47 (an implementation
of the Nelder-Mead simplex minimisation procedure) from Fortran to Ada.

The Fortran original contains 85 array subscripts.
The Ada version      contains 66 array subscripts.
Of these 66, all but 4 occur within the scope of a "for" loop
which *obviously* guarantees the safety of the subscript.

The remaining 4 occur in this context:
    subtype Simplex_Range is Natural range 0 .. Point'Length;
    P: "array (Simplex_Range) of ..."
    Y: "array (Simplex_Range) of ..."
    X: Point;
    J: Simplex_Range;
    ...
    J := 0;		-- at the start, J = Simplex_Range'First
    for I in X'Range loop
	...
	P(J) := ...
	Y(J) := ...
	J := J + 1;
    end loop;		-- at the end, J = Simplex_Range'Last
    P(J) := ...
    Y(J) := ...
end;

A reasonably smart compiler should be able to tell that these four
subscripts are also safe.

-- 
Fifty years of programming language research, and we end up with C++ ???
Richard A. O'Keefe; http://www.cs.rmit.edu.au/~ok; RMIT Comp.Sci.




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Concerning subscript bounds checks
  1996-06-21  0:00 Concerning subscript bounds checks Richard A. O'Keefe
@ 1996-06-21  0:00 ` Robert Dewar
  1996-06-24  0:00   ` Adam Beneschan
                     ` (2 more replies)
  1996-06-25  0:00 ` ++           robin
  1996-06-25  0:00 ` William Clodius
  2 siblings, 3 replies; 16+ messages in thread
From: Robert Dewar @ 1996-06-21  0:00 UTC (permalink / raw)



Richard said

"The remaining 4 occur in this context:
    subtype Simplex_Range is Natural range 0 .. Point'Length;
    P: "array (Simplex_Range) of ..."
    Y: "array (Simplex_Range) of ..."
    X: Point;
    J: Simplex_Range;
    ...
    J := 0;             -- at the start, J = Simplex_Range'First
    for I in X'Range loop
        ...
        P(J) := ...
        Y(J) := ...
        J := J + 1;
    end loop;           -- at the end, J = Simplex_Range'Last
    P(J) := ...
    Y(J) := ...
end;

A reasonably smart compiler should be able to tell that these four
subscripts are also safe.
"

I suspect this judgment is based on informal reasoning ("well it is pretty
obvious to me that it can be figured out"). As always compiler optimizations,
particularly range analysis are always more complicated than they appear
from simple examples. Yes, a compiler could figure this out, but "reasonably
smart" is probably an underestimate. I would be surprised if many existing
compilers can figure even this particular one out.

P.S. GNAT has not even started to think about optimizing checks yet,
you get junk checks even in simple loops. It's something we plan
to start work on soon!





^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Concerning subscript bounds checks
  1996-06-21  0:00 ` Robert Dewar
@ 1996-06-24  0:00   ` Adam Beneschan
  1996-06-24  0:00   ` Richard A. O'Keefe
  1996-06-24  0:00   ` William Clodius
  2 siblings, 0 replies; 16+ messages in thread
From: Adam Beneschan @ 1996-06-24  0:00 UTC (permalink / raw)



In article <dewar.835402601@schonberg> dewar@cs.nyu.edu (Robert Dewar) writes:
 >Richard said
 >
 >"The remaining 4 occur in this context:
 >    subtype Simplex_Range is Natural range 0 .. Point'Length;
 >    P: "array (Simplex_Range) of ..."
 >    Y: "array (Simplex_Range) of ..."
 >    X: Point;
 >    J: Simplex_Range;
 >    ...
 >    J := 0;             -- at the start, J = Simplex_Range'First
 >    for I in X'Range loop
 >        ...
 >        P(J) := ...
 >        Y(J) := ...
 >        J := J + 1;
 >    end loop;           -- at the end, J = Simplex_Range'Last
 >    P(J) := ...
 >    Y(J) := ...
 >end;
 >
 >A reasonably smart compiler should be able to tell that these four
 >subscripts are also safe.
 >"
 >
 >I suspect this judgment is based on informal reasoning ("well it is pretty
 >obvious to me that it can be figured out"). As always compiler optimizations,
 >particularly range analysis are always more complicated than they appear
 >from simple examples. Yes, a compiler could figure this out, but "reasonably
 >smart" is probably an underestimate. I would be surprised if many existing
 >compilers can figure even this particular one out.
 >
 >P.S. GNAT has not even started to think about optimizing checks yet,
 >you get junk checks even in simple loops. It's something we plan
 >to start work on soon!

Maybe I'm missing something, but it seems that the compiler should be
able to figure this out easily just from the fact that J is declared
as having subtype Simplex_Range.  If the program isn't erroneous, J
can never have a value outside Simplex_Range, and therefore no array
bounds checks should be necessary.

But is this still the case with Ada95, given the new definitions of
"bounded error" and the like?

In any case, if the declaration of J is changed to

    J : Natural;

then yes, the compiler's job would be harder.

                                -- Adam





^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Concerning subscript bounds checks
  1996-06-21  0:00 ` Robert Dewar
  1996-06-24  0:00   ` Adam Beneschan
@ 1996-06-24  0:00   ` Richard A. O'Keefe
  1996-06-24  0:00     ` Robert Dewar
  1996-06-28  0:00     ` joeuser
  1996-06-24  0:00   ` William Clodius
  2 siblings, 2 replies; 16+ messages in thread
From: Richard A. O'Keefe @ 1996-06-24  0:00 UTC (permalink / raw)

dewar@cs.nyu.edu (Robert Dewar) writes:

>Richard said
>"The remaining 4 occur in this context:
...
>A reasonably smart compiler should be able to tell that these four
>subscripts are also safe.
>"

>I suspect this judgment is based on informal reasoning ("well it is pretty
>obvious to me that it can be figured out").

Wrong.  I worked out in some detail a straightforward method for a compiler
to handle this.  I chose not to write it up here because it's (a) obvious
and (b) old hat.

>As always compiler optimizations,
>particularly range analysis are always more complicated than they appear
>from simple examples.

Look, I was *PRAISING* Ada, not criticising anything.
I have *NO IDEA* what optimisations GNAT or any other Ada compiler does
or does not do. 

I know for a fact that the analysis required for this particular example
and for a wide range of examples like it (basically looping over a bunch
of vectors and _conditionally_ routing elements from one set of vectors
to another: APL Compress and Merge operations and a few other patterns
I've seen in Fortran code) is straightforward to specify and would take
linear time to compute (I am taking the attitude here that if a variable
outlives a goto you just throw up your hands and be pessimistic).

*Of course* doing such an analysis takes time.
*Of course* the compiler's time spent doing that could have been spent
doing other things.
*Of course* the compiler writers' time spent implementing that could
have been spent implementing other things.
*Of course* it might be beyond any reasonable dispute in some particular
case that *not* doing this optimisation might be the right choice.

>P.S. GNAT has not even started to think about optimizing checks yet,
>you get junk checks even in simple loops. It's something we plan
>to start work on soon!

The central point I was making was a very simple, very obvious, very
positive one:

    I translated an algorithm from Fortran to Ada,
    and it _easily_ and _naturally_ fell out that
    62 out of 66 subscripts could _obviously_ be
    compiled without checks using purely local
    information (no data flow or range analysis needed).

I call that a _good_ result.  (The 4 subscripts that aren't quite so
obvious are executed with low frequency, and only the upper bound is
not obviously safe.)

-- 
Fifty years of programming language research, and we end up with C++ ???
Richard A. O'Keefe; http://www.cs.rmit.edu.au/~ok; RMIT Comp.Sci.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Concerning subscript bounds checks
  1996-06-24  0:00   ` Richard A. O'Keefe
@ 1996-06-24  0:00     ` Robert Dewar
  1996-06-28  0:00     ` joeuser
  1 sibling, 0 replies; 16+ messages in thread
From: Robert Dewar @ 1996-06-24  0:00 UTC (permalink / raw)


Richard said

"The central point I was making was a very simple, very obvious, very
positive one:

    I translated an algorithm from Fortran to Ada,
    and it _easily_ and _naturally_ fell out that
    62 out of 66 subscripts could _obviously_ be
    compiled without checks using purely local
    information (no data flow or range analysis needed).

I call that a _good_ result.  (The 4 subscripts that aren't quite so
obvious are executed with low frequency, and only the upper bound is
not obviously safe.)"

Yes, indeed for this kind of Fortran algorithm, almost all checks can
be eliminated (even if we don't get the last 4, we are certainly doing
well).

For reasonably thorough global analysis tecniques used to optimize checks
(e.g. the old Alsys technology using partial redundancy techniques), it
is reasonable to estimate that the total cost of checks is of the order
of 10%.

Even more interesting than the range checks is the issue of getting rid
of elaboration checks, and of course getting rid of overflow checks is
quite hard (you really need the full machinery of range analysis for
this, and even then you don't get very far). However in the Fortran
world, you are dealing with floating-point, and floating-point overflow
is often close to free to detect at the hardware level (e.g. by 
implementing IEEE infinities -- neither IEEE nor Ada requires exception
traps at runtime for floating-point).

So I certainly do not want to argue with Richard's "central point". It is
an important one that needs repeating, I was just commenting around the
edge on those remaining four checks. Richard have you actually implemented
a compiler that eliinates these checks? That would be interesting if so,
and it would be interesting to have the reference.





^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Concerning subscript bounds checks
  1996-06-24  0:00   ` Richard A. O'Keefe
  1996-06-24  0:00     ` Robert Dewar
@ 1996-06-28  0:00     ` joeuser
  1996-06-28  0:00       ` Adam Beneschan
  1996-07-01  0:00       ` Richard A. O'Keefe
  1 sibling, 2 replies; 16+ messages in thread
From: joeuser @ 1996-06-28  0:00 UTC (permalink / raw)


I think I have the right piece of text here.


        >The remaining 4 occur in this context:
        >    subtype Simplex_Range is Natural range 0 .. Point'Length;
        >    P: "array (Simplex_Range) of ..."
        >    Y: "array (Simplex_Range) of ..."
        >    X: Point;
        >    J: Simplex_Range;
        >    ...
        >    J := 0;            -- at the start, J = Simplex_Range'First
        >    for I in X'Range loop
        >       ...
        >       P(J) := ...
        >       Y(J) := ...
        >       J := J + 1;
        >    end loop;          -- at the end, J = Simplex_Range'Last
        >    P(J) := ...
        >    Y(J) := ...
        >end;

        >A reasonably smart compiler should be able to tell that these four
        >subscripts are also safe.


and this is intuitively obvious to the most casual observer?
 
I think not.

Your problem lies in the J:=J+1; statement

You would be better off to use I as your index and not J.  (and it would work 
too.)  Here is why.

What happens the first time through this loop?

I=0
J=0

BUT guess what!!!!
J:=J+1;

That means that when I=X'Last
J will become X'Last+1

This basically equates to Simplex_Range'Last+1
                 Hence-----> your constraint error

J is now out of all it's possible ranges (and the ranges for the subscripts 
for both P and Y.

Is Point a string????

I love to bitch at the compilers too, but they are never wrong.  They cannot 
be wrong because they meet their standards.  Now WE have to meet theirs.

I just hope I got this to the right person.  I don't have the original 
message and had to kind of make this up from fragments.





^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Concerning subscript bounds checks
  1996-06-28  0:00     ` joeuser
@ 1996-06-28  0:00       ` Adam Beneschan
  1996-07-01  0:00       ` Richard A. O'Keefe
  1 sibling, 0 replies; 16+ messages in thread
From: Adam Beneschan @ 1996-06-28  0:00 UTC (permalink / raw)



joeuser@satcom.whit.org (joeuser) writes:
 >I think I have the right piece of text here.
 >
 >
 >        >The remaining 4 occur in this context:
 >        >    subtype Simplex_Range is Natural range 0 .. Point'Length;
 >        >    P: "array (Simplex_Range) of ..."
 >        >    Y: "array (Simplex_Range) of ..."
 >        >    X: Point;
 >        >    J: Simplex_Range;
 >        >    ...
 >        >    J := 0;            -- at the start, J = Simplex_Range'First
 >        >    for I in X'Range loop
 >        >       ...
 >        >       P(J) := ...
 >        >       Y(J) := ...
 >        >       J := J + 1;
 >        >    end loop;          -- at the end, J = Simplex_Range'Last
 >        >    P(J) := ...
 >        >    Y(J) := ...
 >        >end;
 >
 >        >A reasonably smart compiler should be able to tell that these four
 >        >subscripts are also safe.
 >
 >
 >and this is intuitively obvious to the most casual observer?
 > 
 >I think not.
 >
 >Your problem lies in the J:=J+1; statement
 >
 >You would be better off to use I as your index and not J.  (and it would work 
 >too.)  Here is why.
 >
 >What happens the first time through this loop?
 >
 >I=0
 >J=0

Uh, no.  The definition of "Point" isn't shown here, so how can you
assume that Point'first = 0?

Just from looking at the above code, without seeing the definition of
Point, my guess is that Point is an array whose 'first is 1, and the
intent was deliberately to make P and Y arrays with one more element
than X.

 >BUT guess what!!!!
 >J:=J+1;
 >
 >That means that when I=X'Last
 >J will become X'Last+1
 >
 >This basically equates to Simplex_Range'Last+1
 >                 Hence-----> your constraint error
 
*Whose* constraint error?  Nobody ever posted this code fragment
complaining about a constraint error or about code that doesn't work.
The issue being discussed was whether the unnecessary constraint
checks could be eliminated by a smart compiler.

                                -- Adam




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Concerning subscript bounds checks
  1996-06-28  0:00     ` joeuser
  1996-06-28  0:00       ` Adam Beneschan
@ 1996-07-01  0:00       ` Richard A. O'Keefe
  1996-07-01  0:00         ` Robert A Duff
  1 sibling, 1 reply; 16+ messages in thread
From: Richard A. O'Keefe @ 1996-07-01  0:00 UTC (permalink / raw)



I wrote
>    subtype Simplex_Range is Natural range 0 .. Point'Length;
:    P: "array (Simplex_Range) of ..."
:    Y: "array (Simplex_Range) of ..."
:    X: Point;
:    J: Simplex_Range;
:    ...
:    J := 0;            -- at the start, J = Simplex_Range'First
:    for I in X'Range loop
:       ...
:       P(J) := ...
:       Y(J) := ...
:       J := J + 1;
:    end loop;          -- at the end, J = Simplex_Range'Last
:    P(J) := ...
:    Y(J) := ...
:end;
>A reasonably smart compiler should be able to tell that these four
:subscripts are also safe.

joeuser@satcom.whit.org (joeuser) writes:

>You would be better off to use I as your index and not J.  (and it would work 
>too.)  Here is why.

No, it would NOT work.  I and J *HAVE DIFFERENT TYPES*.
J has type Simplex_Range, which is the index range for Simplex.
I has type Point'Range,   which is the index range for Point.
They are simply different types.  For example, in one use of this
function,
	Point'Range is -1 .. +1
	Simplex_Range is 0 .. 3

>What happens the first time through this loop?
>I=0
>J=0

WRONG!  THe first time through the loop, I is Point'First, which
could be *anything*, and in none of my uses of this procedure is
it zero.  Most of the time it's 1, but one example used -1.


>BUT guess what!!!!
>J:=J+1;

>That means that when I=X'Last
>J will become X'Last+1

Completely wrong.  When I = X'Last, J will be Y'Last-1.

>This basically equates to Simplex_Range'Last+1
>                 Hence-----> your constraint error

But I haven't *GOT* any constraint error.

>J is now out of all it's possible ranges (and the ranges for the subscripts 
>for both P and Y.

>Is Point a string????
No it isn't.  It's a point in Euclidean N-space.

>I love to bitch at the compilers too, but they are never wrong.  They cannot 
>be wrong because they meet their standards.  Now WE have to meet theirs.

But I *wasn't* bitching at the compiler,
it *wasn't* making any mistake,
and *neither was I*.


For what it's worth, I had great trouble figuring out how to write that
loop.  This is the simplex construction phase of nelder-mead.  The basic
idea is
	for J from P'First as I in Basis_Vector'Range loop
	    P(J) := Origin + Delta * Basis_Vector(I);
	    Y(J) := Funct(P(J));
	end loop;
	P(P'Last) := Origin;
	Y(P'Last) := F(P(P'Last));

In Algol, Fortran, or PL/I it wouldn't be much of a problem:
if Basis_Vector'Range is L..U, take P'Range to be L..U+1.
However, in Ada, U+1 may not exist.  For example,
	type Coord is (X,Y,Z);
	type Point is array (Coord) of Float;
So Simplex_Range has to be some other type.  The obvious thing was
	subtype Simplex_Range is Natural range 0 .. Point'Length;
(note that not only is the first value of I in this case not zero,
it isn't even a number).  The next obvious thing is

	for I in Point'Range loop
	    let J be Simplex_Range'Val(
		Point'Range'Pos(I) - Point'Range'Pos(Point'Range'First))
	    P(J) := ...
	
However, I couldn't figure out any way to code that conversion from I to J.
(The immediate problem is that Point'Range is a _range_, not a _(sub)type_,
so Point'Range'Pos is inexpressible.)

-- 
Fifty years of programming language research, and we end up with C++ ???
Richard A. O'Keefe; http://www.cs.rmit.edu.au/~ok; RMIT Comp.Sci.




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Concerning subscript bounds checks
  1996-07-01  0:00       ` Richard A. O'Keefe
@ 1996-07-01  0:00         ` Robert A Duff
  1996-07-02  0:00           ` Richard A. O'Keefe
  0 siblings, 1 reply; 16+ messages in thread
From: Robert A Duff @ 1996-07-01  0:00 UTC (permalink / raw)



In article <4r7r65$nj@goanna.cs.rmit.EDU.AU>,
Richard A. O'Keefe <ok@goanna.cs.rmit.EDU.AU> wrote:
>	type Coord is (X,Y,Z);
>	type Point is array (Coord) of Float;
>So Simplex_Range has to be some other type.  The obvious thing was
>	subtype Simplex_Range is Natural range 0 .. Point'Length;
>(note that not only is the first value of I in this case not zero,
>it isn't even a number).  The next obvious thing is
>
>	for I in Point'Range loop
>	    let J be Simplex_Range'Val(
>		Point'Range'Pos(I) - Point'Range'Pos(Point'Range'First))
>	    P(J) := ...
>	
>However, I couldn't figure out any way to code that conversion from I to J.
>(The immediate problem is that Point'Range is a _range_, not a _(sub)type_,
>so Point'Range'Pos is inexpressible.)

What's wrong with "Coord'Pos(I) - Coord'Pos(Point'First)"?

By the way, if you have an array with unknown bounds, you can construct
a subtype for those bounds like this:

    subtype The_Range is The_Index_Type range The_Array'Range;

But there's no need to do that to use the 'Pos attribute.

Note that for T'Pos, the bounds of subtype T are ignored -- all that
matters is the type of that subtype.  For example:

    subtype S is Character range 'a'..'b';
    X: Integer := S'Pos('*'); -- No problem here.

S'Pos('*') is exactly the same as Character'Pos('*').

- Bob




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Concerning subscript bounds checks
  1996-07-01  0:00         ` Robert A Duff
@ 1996-07-02  0:00           ` Richard A. O'Keefe
  0 siblings, 0 replies; 16+ messages in thread
From: Richard A. O'Keefe @ 1996-07-02  0:00 UTC (permalink / raw)



I wrote:
	type Coord is ...
	type Point is array (Coord) of Float;
	    let J be Simplex_Range'Val(
		Point'Range'Pos(I) - Point'Range'Pos(Point'Range'First))
  However, I couldn't figure out any way to code that conversion from I to J.
  (The immediate problem is that Point'Range is a _range_, not a _(sub)type_,
  so Point'Range'Pos is inexpressible.)

bobduff@world.std.com (Robert A Duff) writes:

>What's wrong with "Coord'Pos(I) - Coord'Pos(Point'First)"?

I think it was probably too obvious.
For some reason, I didn't want to use the name of the index type.
I can no longer recall why.

-- 
Fifty years of programming language research, and we end up with C++ ???
Richard A. O'Keefe; http://www.cs.rmit.edu.au/~ok; RMIT Comp.Sci.




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Concerning subscript bounds checks
  1996-06-21  0:00 ` Robert Dewar
  1996-06-24  0:00   ` Adam Beneschan
  1996-06-24  0:00   ` Richard A. O'Keefe
@ 1996-06-24  0:00   ` William Clodius
  1996-06-27  0:00     ` Richard A. O'Keefe
  2 siblings, 1 reply; 16+ messages in thread
From: William Clodius @ 1996-06-24  0:00 UTC (permalink / raw)



In article <4ql9eq$hdt@goanna.cs.rmit.EDU.AU> ok@goanna.cs.rmit.EDU.AU
(Richard A. O'Keefe) writes:

   From: ok@goanna.cs.rmit.EDU.AU (Richard A. O'Keefe)
   Keywords: subscripts
   Newsgroups: comp.lang.ada,comp.lang.misc
   Date: 24 Jun 1996 15:35:22 +1000
   Organization: Comp Sci, RMIT, Melbourne, Australia
   Path: newshost.lanl.gov!ncar!gatech!swrinde!ihnp4.ucsd.edu!munnari.OZ.AU!news.mel.connect.com.au!news.mira.net.au!harbinger.cc.monash.edu.au!news.rmit.EDU.AU!goanna.cs.rmit.EDU.AU!not-for-mail
   Lines: 60
   References: <4qdj3e$btf@goanna.cs.rmit.EDU.AU> <dewar.835402601@schonberg>
   NNTP-Posting-Host: goanna.cs.rmit.edu.au
   NNTP-Posting-User: ok
   X-Newsreader: NN version 6.5.0 #0 (NOV)
   Xref: newshost.lanl.gov comp.lang.ada:13943 comp.lang.misc:25665

   The central point I was making was a very simple, very obvious, very
   positive one:

       I translated an algorithm from Fortran to Ada,
       and it _easily_ and _naturally_ fell out that
       62 out of 66 subscripts could _obviously_ be
       compiled without checks using purely local
       information (no data flow or range analysis needed).

   I call that a _good_ result.  (The 4 subscripts that aren't quite so
   obvious are executed with low frequency, and only the upper bound is
   not obviously safe.)

But the comparison is not complete as it is done only one way.  I
remember reading somewhere that in the 80s an experimental F77 compiler
was written that could optimize away 90% of all array bounds checks on
a large variety of codes when array bounds checking was turned on.  (I
don't know if this capability was ever implemented in a commercial
Fortran compiler, but then it hasn't been implemented in the obvious
Ada 95 compiler.)  Did you check to see what an equivalent algorithm
would do on the original F77 code?  How would it do if rewritten in
F90 stle?



-- 

William B. Clodius		Phone: (505)-665-9370
Los Alamos National Laboratory	Email: wclodius@lanl.gov
Los Alamos, NM 87545




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Concerning subscript bounds checks
  1996-06-24  0:00   ` William Clodius
@ 1996-06-27  0:00     ` Richard A. O'Keefe
  1996-06-28  0:00       ` Ken Thomas
  0 siblings, 1 reply; 16+ messages in thread
From: Richard A. O'Keefe @ 1996-06-27  0:00 UTC (permalink / raw)


clodius@hotspec.lanl.gov (William Clodius) writes:

>In article <4ql9eq$hdt@goanna.cs.rmit.EDU.AU> ok@goanna.cs.rmit.EDU.AU
>(Richard A. O'Keefe) writes:
>   I call that a _good_ result.  (The 4 subscripts that aren't quite so
>   obvious are executed with low frequency, and only the upper bound is
>   not obviously safe.)

>But the comparison is not complete as it is done only one way.

Er, *what* comparison?  Nowhere did I state or imply that Ada was
*better* than Fortran.  All I stated, and all I meant to imply, was
that
    - in an algorithm whose original coding DID NOT HAVE ADA IN MIND
    - it fell out quite naturally
    - that almost all of the subscripts in the Ada version
    - OBVIOUSLY don't need checking
    - using only local information

>Did you check to see what an equivalent algorithm
>would do on the original F77 code?  How would it do if rewritten in
>F90 stle?

There are two difficulties with doing this kind of thing in Fortran.
(1) My Ada procedure is generic, mainly because I wanted it to be
    valid Ada 83, and that was the only way to pass a function.
    But having done that, it was possible to have a generic formal
    array type as well.

generic

    type Coord is digits <>;
    type Index is (<>);
    type Point is array (Index) of Coord;
    type Value is digits <>;
    with function Funct(X: in Point) return Value;

procedure Nelder_Mead(
    Start    : in  Point;
    Step     : in  Point;
    DevMin   : in  Value;
    Konvge   : in  Positive;
    Kcount   : in  Natural;
    Ncalls   : out Natural;
    Nrestarts: out Natural;
    Xmin     : out Point;
    Ymin     : out Value;
    Good     : out Boolean
);

    This means that if the Index bounds happen to be static, the Ada
    compiler can know what they are when generating code for an instance
    of the body.  The Fortran version starts out like this:

      SUBROUTINE NELMIN(FN, N, START, XMIN, YNEWLO, REQMIN, STEP,
     *          KONVGE, KCOUNT, ICOUNT, NUMRES, IFAULT)
         REAL FN
         INTEGER N
         REAL START(N)
         REAL XMIN(N)
         REAL YNEWLO
         REAL REQMIN
         REAL STEP(N)
         INTEGER KONVGE
         INTEGER KCOUNT
         INTEGER ICOUNT
         INTEGER NUMRES
         INTEGER IFAULT

    In F77, I cannot declare a local array of the same size as XMIN.
    (One Nelder-Mead subroutine I've seen handled that by assuming
    that N <= 20 and using fixed-size local arrays; another required
    the caller to pass workspace.)  In F90, I can.  

(2) The loop control variables in Fortran, whether F77 or F90, are
    not tied to the array bounds.  They are INTEGER, not Index.


Point (2) is not actually a difficulty for the obvious algorithm, in
which (scalar) variables start out with with the empty set as their
possible interval, and the set is expanded as necessary.  The key point
is that the form of the Ada loop makes it _obvbious_ without any data
flow analysis at all that a subscript is safe.  If you have

	P: array (Simplex_Range) of Point;
	Pstar, Pbar: Point;
	...
	for I in Point'Range loop
	    PStar(I) := PBar(I) + (P(Hi)(I) - PBar(I)) * (-Reflection);
	--        #          #           #         #
	end loop;

then it is obvious that each of the flagged subscripts is safe.

These particular subscripts, of course, would vanish entirely in
APL, PL/I, or Fortran 90:

	PSTAR = PBAR + (P(Hi,*) - PBAR)*(-REFLECTION);	/*PL/I*/

Now, I _already_ eliminated a lot of these from my count, because
I wrote some inline subprograms to do special bits of vector arithmetic.

Point (1) _is_ a significant difference between Ada and Fortran 90.
In this particular example, it doesn't matter so much, because the
simple rule
	"if a subscript is a loop identifier and the range of the loop
	 identifier is the same as the range of that subscript, then
	 no subscript check is needed"
works very well.  This is what I mean by the check being local.
The Fortran equivalent would be
	"in A(...,I,...) where A was declared to have dimensions
	 (...,L:U,...) and the definition of I that is live at this
	 point is from DO[]I = L,U, no subscript check is needed for I."

In the two Fortran versions,
- the version with fixed size local arrays would need a smart compiler
  to notice that "IF (N .GT. 20)" the subroutine exits at once, so that
  it can assume N <= 20 elsewhere.
- the version with workspace provided by the caller needs _two_ size
  parameters, N and N1 (= N+1), and the compiler doesn't _know_ that
  N1 = N+1 (because it isn't told, and in F77, cannot be told).
A Fortran 90 version would not have these problems.


Just to go over the main point one more time, I was *not* saying that Ada
is *better* than anything else, only that even the simplest of local-
information-only rules is *good* for Ada numeric code.

For what it's worth, here are some times:

-- Original Fortran results on a 36-bit DEC-2060
-- Function     NCalls  NRestarts       Value of Min    Time
-- 1            177      0              1.67E-9           T1    msec
-- 2            266      0              4.03E-9          230-T1 msec
-- 3            230      0              2.79E-9           T2    msec
-- 4            712     14              5.75E-14         640-T2 msec

-- My Fortran results on a 32-bit SPARC
-- Function     NCalls  NRestarts       Value of Min    Time
-- 1            177      0              1.0E-9          0.43 msec
-- 2            213      0              2.0E-8          0.88 msec
-- 3            230      0              6.3E-9          1.1  msec
-- 4            666     13              7.1E-14         5.8  msec
-- Compiled with "f77 -native -fast -stackvar -O3"

-- My Ada results on a 32-bit SPARC
-- Function     NCalls  NRestarts       Value of Min    Time
-- 1            177      0              1.79E-9          0.40 msec
-- 2            213      0              1.98E-8          0.94 msec
-- 3            231      0              3.24E-9          2.3  msec
-- 4            736     17              9.54E-14         6.9  msec
-- Compiled with "gnatmake -cargs -gnatp -O3 -Wall"

The SPARC results were obtained on a SPARCstation-10 (sunm4)
running Solaris 2.5, f77 SC4.0, gnat 3.04, gcc 2.7.2.
Oddly enough, it was much easier to rewrite the Fortran code to Ada
than to re_type_ the Fortran code and get it working.  It was only
when I noticed the "-XlistE" flag in Sun's f77 that I made any progress
with the Fortran version.  Static checking helps a _lot_.
Why the big difference for function 4?  I've no idea.
-- 
Fifty years of programming language research, and we end up with C++ ???
Richard A. O'Keefe; http://www.cs.rmit.edu.au/~ok; RMIT Comp.Sci.




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Concerning subscript bounds checks
  1996-06-27  0:00     ` Richard A. O'Keefe
@ 1996-06-28  0:00       ` Ken Thomas
  0 siblings, 0 replies; 16+ messages in thread
From: Ken Thomas @ 1996-06-28  0:00 UTC (permalink / raw)
  To: Richard A. O'Keefe

My response to this does not concern the thread of the subject but to
suggest a technique that I have found useful in developing numerical
code that contains a subprogram in the generic formal part.

Many applications are not concerned with a single function but a family
of functions. The device I am suggesting is additional declarations to
the formal part.
> 
> generic
> 
      type USER_PARAMETERS is private;
>     type Coord is digits <>;
>     type Index is (<>);
>     type Point is array (Index) of Coord;
>     type Value is digits <>;
>     with function Funct(UP : User_Parameters; X: in Point) return Value;
> 
> procedure Nelder_Mead(UP : User_Parameters,...


-- 
Dr K.S. Thomas
Department of Electronics and Computer Science
University of Southampton
Highfield
Southampton SO17 1BJ
United Kingdom

Telephone : (+44) 01703 593029
Fax :       (+44) 01703 593903

email: kst@ecs.soton.ac.uk




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Concerning subscript bounds checks
  1996-06-21  0:00 Concerning subscript bounds checks Richard A. O'Keefe
  1996-06-21  0:00 ` Robert Dewar
@ 1996-06-25  0:00 ` ++           robin
  1996-06-27  0:00   ` Richard A. O'Keefe
  1996-06-25  0:00 ` William Clodius
  2 siblings, 1 reply; 16+ messages in thread
From: ++           robin @ 1996-06-25  0:00 UTC (permalink / raw)



	ok@goanna.cs.rmit.EDU.AU (Richard A. O'Keefe) writes:

	>There was some discussion in comp.lang.misc recently about array bounds
	>checks.  As an exercise, I converted Algorithm AS 47 (an implementation
	>of the Nelder-Mead simplex minimisation procedure) from Fortran to Ada.

	>The Fortran original contains 85 array subscripts.
	>The Ada version      contains 66 array subscripts.
	>Of these 66, all but 4 occur within the scope of a "for" loop
	>which *obviously* guarantees the safety of the subscript.

---I expect that the same observations could be made about
re-writing the program in PL/I, or in Fortran 90.

   Why not post the original Fortran version, and the Ada
translation, so we can see?

	>The remaining 4 occur in this context:
	>    subtype Simplex_Range is Natural range 0 .. Point'Length;
	>    P: "array (Simplex_Range) of ..."
	>    Y: "array (Simplex_Range) of ..."
	>    X: Point;
	>    J: Simplex_Range;
	>    ...
	>    J := 0;		-- at the start, J = Simplex_Range'First
	>    for I in X'Range loop
	>	...
	>	P(J) := ...
	>	Y(J) := ...
	>	J := J + 1;
	>    end loop;		-- at the end, J = Simplex_Range'Last
	>    P(J) := ...
	>    Y(J) := ...
	>end;

	>A reasonably smart compiler should be able to tell that these four
	>subscripts are also safe.

	>Fifty years of programming language research, and we end up with C++ ???

---50 years?  Wasn't Ada (after whom the language in which this
posting also appeared first) the first programmer?  Wouldn't that
make it ~130 years?

	>Richard A. O'Keefe




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Concerning subscript bounds checks
  1996-06-25  0:00 ` ++           robin
@ 1996-06-27  0:00   ` Richard A. O'Keefe
  0 siblings, 0 replies; 16+ messages in thread
From: Richard A. O'Keefe @ 1996-06-27  0:00 UTC (permalink / raw)

rav@goanna.cs.rmit.EDU.AU (++           robin) writes:

>---I expect that the same observations could be made about
>re-writing the program in PL/I, or in Fortran 90.

Again, this imputes to me a claim I never made.
I did not claim that Ada was *better* in this respect than PL/I or F90.
All I claimed was that it is *good*.
There is a big difference!

Obviously, APL, PL/I, and Fortran 90 have array arithmetic (although
I used APL enough to be seriously confused by PL/I's differences there).
Equally obviously, a vector arithmetic expression all of the vectors in
which have the same declared bounds needs no subscript or length checks.

>   Why not post the original Fortran version, and the Ada
>translation, so we can see?

If there's enough interest, I'll put them on the Web.
The Fortran code has _already_ been published; it's algorithm
AS 47 from the Journal "Applied Statistics".

nelder_mead.ads		 18 non-blank-non-comment lines
nelder_mead.adb		216 non-blank-non-comment lines
nelmin.f		204 non-blank-non-comment lines (my F77 version)

(rav: look in goanna:~ok/Ada.d/ if you want to see them.)

The difference between the Ada and Fortran line counts is entirely due
to my Ada style being one declaration per line, with the Fortran code
still having several grouped declarations.

>	>Fifty years of programming language research, and we end up with C++ ???

>---50 years?  Wasn't Ada (after whom the language in which this
>posting also appeared first) the first programmer?  Wouldn't that
>make it ~130 years?

I did not write "fifty years of _programming_ research" by
"fifty years or programming _language_ research".
-- 
Fifty years of programming language research, and we end up with C++ ???
Richard A. O'Keefe; http://www.cs.rmit.edu.au/~ok; RMIT Comp.Sci.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Concerning subscript bounds checks
  1996-06-21  0:00 Concerning subscript bounds checks Richard A. O'Keefe
  1996-06-21  0:00 ` Robert Dewar
  1996-06-25  0:00 ` ++           robin
@ 1996-06-25  0:00 ` William Clodius
  2 siblings, 0 replies; 16+ messages in thread
From: William Clodius @ 1996-06-25  0:00 UTC (permalink / raw)



In article <4qncnv$i94@goanna.cs.rmit.EDU.AU>
rav@goanna.cs.rmit.EDU.AU (++           robin) writes: 

   From: rav@goanna.cs.rmit.EDU.AU (++           robin)
   Keywords: subscripts
   Newsgroups: comp.lang.ada,comp.lang.misc,comp.lang.pl1,comp.lang.fortran
   Date: 25 Jun 1996 10:43:43 +1000
   Organization: Comp Sci, RMIT, Melbourne, Australia
   Path: newshost.lanl.gov!ferrari.mst6.lanl.gov!tesuque.cs.sandia.gov!lynx.unm.edu!newshub.tc.umn.edu!spool.mu.edu!munnari.OZ.AU!news.unimelb.EDU.AU!news.rmit.EDU.AU!goanna.cs.rmit.EDU.AU!not-for-mail
   Lines: 45
   References: <4qdj3e$btf@goanna.cs.rmit.EDU.AU>
   NNTP-Posting-Host: goanna.cs.rmit.edu.au
   NNTP-Posting-User: rav
   X-Newsreader: NN version 6.5.0 #0 (NOV)
   Xref: newshost.lanl.gov comp.lang.ada:14004 comp.lang.misc:25683 comp.lang.pl1:1161 comp.lang.fortran:40726


   ---50 years?  Wasn't Ada (after whom the language in which this
   posting also appeared first) the first programmer?  Wouldn't that
   make it ~130 years?

The fifty years undoubtably refers to the first high level language,
Plankalkul, designed in 1945 by the unemployed (for obvious reasons)
German computer engineer Konrad Zuse.  From the Language List
(http://cuiwww.unige.ch/langlist) 

   Plankalkul 
         Konrad Zuse, ca. 1945. The first programming language,
         implemented for the Z3 computer. Included arrays and
         records. Much of his work may have been either lost or
         confiscated in the aftermath of WWII. "The Plankalkul of
         Konrad Zuse", F.L. Bauer et al, CACM 15(7):678-685 (Jul 1972).

I believe that the language list is mistaken in saying that it was
implemented, although its design may have been planned for
implementation on the Z3.

Ada was a machine language programmer.

-- 

William B. Clodius		Phone: (505)-665-9370
Los Alamos National Laboratory	Email: wclodius@lanl.gov
Los Alamos, NM 87545




^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~1996-07-02  0:00 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1996-06-21  0:00 Concerning subscript bounds checks Richard A. O'Keefe
1996-06-21  0:00 ` Robert Dewar
1996-06-24  0:00   ` Adam Beneschan
1996-06-24  0:00   ` Richard A. O'Keefe
1996-06-24  0:00     ` Robert Dewar
1996-06-28  0:00     ` joeuser
1996-06-28  0:00       ` Adam Beneschan
1996-07-01  0:00       ` Richard A. O'Keefe
1996-07-01  0:00         ` Robert A Duff
1996-07-02  0:00           ` Richard A. O'Keefe
1996-06-24  0:00   ` William Clodius
1996-06-27  0:00     ` Richard A. O'Keefe
1996-06-28  0:00       ` Ken Thomas
1996-06-25  0:00 ` ++           robin
1996-06-27  0:00   ` Richard A. O'Keefe
1996-06-25  0:00 ` William Clodius

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox