From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham autolearn_force=no version=3.4.4 X-Google-Language: ENGLISH,ASCII-7-bit X-Google-Thread: fac41,9a0ff0bffdf63657 X-Google-Attributes: gidfac41,public X-Google-Thread: 103376,4b06f8f15f01a568 X-Google-Attributes: gid103376,public X-Google-Thread: 1108a1,9a0ff0bffdf63657 X-Google-Attributes: gid1108a1,public X-Google-Thread: f43e6,9a0ff0bffdf63657 X-Google-Attributes: gidf43e6,public From: Matthew Heaney Subject: Re: Software landmines (loops) Date: 1998/09/17 Message-ID: X-Deja-AN: 391918418 Sender: matt@mheaney.ni.net References: <35f51e53.48044143@ <35EEF597.A1119EF4@draper.com> <35F5B529.6DC6A59A@draper.com> NNTP-Posting-Date: Wed, 16 Sep 1998 20:03:33 PDT Newsgroups: comp.lang.eiffel,comp.object,comp.software-eng,comp.lang.ada Date: 1998-09-17T00:00:00+00:00 List-Id: Tim McDermott writes: > Matthew Heaney wrote: > > > By separating the array iteration from the item comparison, I can reason > > about the two different termination conditions independently. So > > instead of one (larger) problem with two terms, I have two (smaller) > > problems, with one term each. > > But here you are assuming that the termination conditions are > disjoint! In the general case they are not. Have you never had the > experience of writing something of the form > > while (A) { > > if (B) return b; > ... > if (C) return c; > ... > if (D) return d; > > } > > only to discover some time later that A&!B&!C (the path predicate of > the if(D) statement forces the value of D? Worse yet, that the > evaluation of the loop predicate interacts with the evaluation of the > if predicates. My experience tells me that distributing complexity is > a Bad Idea. > > That is not to say that I always use se/se. But I know that se/se is better > form, and I have a real good idea of the risks of multiple return coding. Let's consider a real-life problem that I had to solve last week. I have a string that was padded on the right with blanks. What I need to do is strip off the blanks, and return the index of the last non-blank character. Let's analyze this problem first using Dijkstra's predicate transformer, and then using Ada. S is a string of length n. It is indexed starting from 0, and therefore ending with index n-1. The Linear Search theorem tells us to search backwards from the maximum index value, since we want to find the maximum index with a character that is not blank. If the string is all blanks, then we return -1. Here's the postcondition: R: -1 <= i < n and Aj: i+1 <= j < n: S[j] = ' ' and i /= -1 => S[i] /= ' ' Let's derive the invariant by weakening the postcondition, which in this case means just removing some conjuncts. Studying the postcondition for things always true means that the postcondition is P: -1 <= i < n and Aj: i+1 <= j < n: S[j] = ' ' We know that P and not B => R where B is the loop predicate. So our loop predicate is the negation of that last conjunct of the postcondition: not B is: i /= -1 => S[i] /= ' ' Getting rid of the implication: i /= -1 => S[i] /= ' ' is same as i = -1 cor S[i] /= ' ' Now negate this to give us the loop predicate: i /= -1 cand S[i] = ' ' Initialization means making the invariant true, so just initialize i to the value n-1. (The universal quantifier is true by virtue of the fact that a null range is defined to have the value True.) Here is the result of our efforts: i := n-1; do i /= -1 cand S[i] = ' ' -> i := i - 1 od QED As Benjamin Whorf is want to remind us, the language we use has a profound effect on how we think. The Ada language provides loop constructs to traverse an array, and so the Ada programmer will "naturally" use a for loop, not a while loop, to index into the string. This doesn't mean you can't use a while loop to do the job. But if you do, you'll force readers of your code to ask why you didn't just use a for loop. They'll ask, because you violated the Principle of Least Astonishment. What if the string is all blanks? In a real program, the bounds of a non-null string can be any positive integer (starting with 1). Let's therefore return the value 0 if there are all blanks in the string. (We could have easily decided to return S'First - 1, or indeed any number less than S'First.) The Ada programmer first tackles the job of setting up the loop. We use a for loop, iterating in reverse (because we're searching for the largest index that's not a blank): procedure Op (S : in String) is Last : Natural := ???; begin for Index in reverse S'Range loop ??? end loop; We have to give Last the value of the largest index that's not a blank. for Index in reverse S'Range loop if S (Index) /= ' ' then Last := Index; ... If the string comprises all blanks, then we'll iterate over the entire string without the if-test ever being true, and so we have to give Last a default value to handle that case: Last : Natural := 0; begin for Index in reverse S'Range loop ... If we do find a value that's not blank, then we just bail out of the loop, because we're done: Last : Natural := 0; begin for Index in reverse S'Range loop if S (Index) /= ' ' then Last := Index; exit; end if; end loop; {R: Last = 0 or else S (Last) /= ' '} QED That's how an Ada programmer thinks. Just think of a for loop as an array iterator. We terminate the loop early, via an exit, because we don't need to visit all components of the array. This is an example of reasoning by using the programming language itself, instead of Dijkstra's formalism. The point is that the algorithm above can be reasoned about, in spite of the presence of an exit from the middle of the loop. The examples above illustrate two modes of thinking: top-down and bottom-up. Dijkstra is clearly a top-down thinker - and a very, very good one. He likes to reason about correctness in the abstract, using the predicate calculus, independently of any specific programming language. (Indeed, he seems almost proud of the fact the none of his programs are executable.) The rest of us are bottom-up thinkers. We reason about correctness using constructs of the target language, and using idioms that have greater "cognitive fit" to the human programmer. That's what the ACM paper I cited earlier was all about: how rules about programming derived via top-down thinking clash with cognitive models (bottom-up reality) used by ordinary programmers. You tell the programmer how it "should" be done by (ie "don't terminate a loop by exiting from the middle"), but such a solution doesn't always match what a human programmer feels is a more "natural" solution to the problem. So he makes mistakes trying to reason about what he has been told is the "right" way to do it. Worse, he may even reason less, or not at all, thinking that "this must be OK because I don't exit the loop from the middle."