From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: fac41,9a0ff0bffdf63657
X-Google-Attributes: gidfac41,public
X-Google-Thread: 103376,4b06f8f15f01a568
X-Google-Attributes: gid103376,public
X-Google-Thread: 1108a1,9a0ff0bffdf63657
X-Google-Attributes: gid1108a1,public
X-Google-Thread: f43e6,9a0ff0bffdf63657
X-Google-Attributes: gidf43e6,public
From: Matthew Heaney <matthew_heaney@acm.org>
Subject: Re: Software landmines (loops)
Date: 1998/09/17
Message-ID: <m3pvcv6cr5.fsf@mheaney.ni.net>
X-Deja-AN: 391918418
Sender: matt@mheaney.ni.net
References: <slrn6sopmg.4fg.jdege@jdege.visi.com> <35f51e53.48044143@
 <m3af4mq7f4 <35EDC648.76F03F32@draper.com> <EypnAz.L66@ecf.toronto.edu>
 <35EEF597.A1119EF4@draper.com> <m3ogsw503u.fsf@mheaney.ni.net>
 <35F5B529.6DC6A59A@draper.com>
NNTP-Posting-Date: Wed, 16 Sep 1998 20:03:33 PDT
Newsgroups: comp.lang.eiffel,comp.object,comp.software-eng,comp.lang.ada
Date: 1998-09-17T00:00:00+00:00
List-Id: <comp.lang.ada>

Tim McDermott <mcdermott@draper.com> writes:

> Matthew Heaney wrote:
> 
> > By separating the array iteration from the item comparison, I can reason
> > about the two different termination conditions independently.  So
> > instead of one (larger) problem with two terms, I have two (smaller)
> > problems, with one term each.
> 
> But here you are assuming that the termination conditions are
> disjoint!  In the general case they are not.  Have you never had the
> experience of writing something of the form
> 
> while (A) {
> 
> if (B) return b;
> ...
> if (C) return c;
> ...
> if (D) return d;
> 
> }
> 
> only to discover some time later that A&!B&!C (the path predicate of
> the if(D) statement forces the value of D?  Worse yet, that the
> evaluation of the loop predicate interacts with the evaluation of the
> if predicates.  My experience tells me that distributing complexity is
> a Bad Idea.
> 
> That is not to say that I always use se/se.  But I know that se/se is better
> form, and I have a real good idea of the risks of multiple return coding.

Let's consider a real-life problem that I had to solve last week.

I have a string that was padded on the right with blanks.  What I need
to do is strip off the blanks, and return the index of the last
non-blank character.

Let's analyze this problem first using Dijkstra's predicate transformer,
and then using Ada.

S is a string of length n.  It is indexed starting from 0, and therefore
ending with index n-1.

The Linear Search theorem tells us to search backwards from the maximum
index value, since we want to find the maximum index with a character
that is not blank.

If the string is all blanks, then we return -1.

Here's the postcondition:

R: -1 <= i < n and 
   Aj: i+1 <= j < n: S[j] = ' ' and 
   i /= -1 => S[i] /= ' '

Let's derive the invariant by weakening the postcondition, which in this
case means just removing some conjuncts.  Studying the postcondition for
things always true means that the postcondition is

P: -1 <= i < n and Aj: i+1 <= j < n: S[j] = ' '

We know that 

   P and not B => R

where B is the loop predicate.  So our loop predicate is the negation of
that last conjunct of the postcondition:

not B is: i /= -1 => S[i] /= ' '

Getting rid of the implication:

  i /= -1 => S[i] /= ' ' is same as

  i = -1 cor S[i] /= ' ' 

Now negate this to give us the loop predicate:

  i /= -1 cand S[i] = ' '

Initialization means making the invariant true, so just initialize i to
the value n-1.  (The universal quantifier is true by virtue of the fact
that a null range is defined to have the value True.)

Here is the result of our efforts:

i := n-1;
do i /= -1 cand S[i] = ' ' -> i := i - 1 od

QED

As Benjamin Whorf is want to remind us, the language we use has a
profound effect on how we think.  The Ada language provides loop
constructs to traverse an array, and so the Ada programmer will
"naturally" use a for loop, not a while loop, to index into the string.

This doesn't mean you can't use a while loop to do the job.  But if you
do, you'll force readers of your code to ask why you didn't just use a
for loop.  They'll ask, because you violated the Principle of Least
Astonishment.

What if the string is all blanks?  In a real program, the bounds of a
non-null string can be any positive integer (starting with 1).  Let's
therefore return the value 0 if there are all blanks in the string.  (We
could have easily decided to return S'First - 1, or indeed any number
less than S'First.)

The Ada programmer first tackles the job of setting up the loop.  We use
a for loop, iterating in reverse (because we're searching for the
largest index that's not a blank):

procedure Op (S : in String) is 
   Last : Natural := ???;
begin
   for Index in reverse S'Range loop
      ???
   end loop;


We have to give Last the value of the largest index that's not a blank.

   for Index in reverse S'Range loop
      if S (Index) /= ' ' then
         Last := Index;
...


If the string comprises all blanks, then we'll iterate over the entire
string without the if-test ever being true, and so we have to give Last
a default value to handle that case:

   Last : Natural := 0;
begin
   for Index in reverse S'Range loop
...


If we do find a value that's not blank, then we just bail out of the
loop, because we're done:

   Last : Natural := 0;
begin
   for Index in reverse S'Range loop
      if S (Index) /= ' ' then
         Last := Index;
         exit;
      end if;
   end loop;

   {R: Last = 0 or else S (Last) /= ' '}

QED

That's how an Ada programmer thinks.  Just think of a for loop as an
array iterator.  We terminate the loop early, via an exit, because we
don't need to visit all components of the array.

This is an example of reasoning by using the programming language
itself, instead of Dijkstra's formalism.  The point is that the
algorithm above can be reasoned about, in spite of the presence of an
exit from the middle of the loop.

The examples above illustrate two modes of thinking: top-down and
bottom-up.  Dijkstra is clearly a top-down thinker - and a very, very
good one.  He likes to reason about correctness in the abstract, using
the predicate calculus, independently of any specific programming
language.  (Indeed, he seems almost proud of the fact the none of his
programs are executable.)

The rest of us are bottom-up thinkers.  We reason about correctness
using constructs of the target language, and using idioms that have
greater "cognitive fit" to the human programmer.  

That's what the ACM paper I cited earlier was all about: how rules about
programming derived via top-down thinking clash with cognitive models
(bottom-up reality) used by ordinary programmers.

You tell the programmer how it "should" be done by (ie "don't terminate
a loop by exiting from the middle"), but such a solution doesn't always
match what a human programmer feels is a more "natural" solution to the
problem.

So he makes mistakes trying to reason about what he has been told is the
"right" way to do it.  Worse, he may even reason less, or not at all,
thinking that "this must be OK because I don't exit the loop from the
middle."