From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-0.3 required=5.0 tests=BAYES_00,
	REPLYTO_WITHOUT_TO_CC autolearn=no autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,a7f223906eb859a,start
X-Google-Attributes: gid103376,public
From: smize@news.imagin.net (Samuel Mize)
Subject: GOTO considered necessary
Date: 1997/06/06
Message-ID: <5n977i$2cmc$1@prime.imagin.net>
X-Deja-AN: 246556163
Organization: ImagiNet Communications Ltd, Arlington, Texas
Reply-To: smize@imagin.net (Samuel Mize)
Newsgroups: comp.lang.ada
Date: 1997-06-06T00:00:00+00:00
List-Id: <comp.lang.ada>


Greetings,

I'm rebeating a dead horse, in order to write up a consensus paper to
submit to adahome.  I'll appreciate your input.

This one concerns use of "goto" statements.

I have scanned through the back c.l.a messages in DejaNews.  I request
your comments on this summary.  I also have a few questions for everyone.

Samuel Mize

                        USE OF "GOTO" IN ADA
                        --------------------
SUMMARY:
--------

1) In general, "goto" statements should be avoided.  They create
   confusing execution flow structures -- at first, or after a few
   maintenance changes.

2) As a rule of thumb, if your code needs a "goto" to work efficiently
   (or at all), you have probably overlooked a simpler, better design.

3) "Goto" statements are acceptable in machine-generated code, but ONLY
   if a human is NEVER expected to read the code.  (For example, a
   module generated by a tool from data-flow diagrams.)

4) As an exception, Finite State Machines (FSMs) are often implemented
   with goto statements.

   The people doing this argue that, for FSMs, "goto"s improve clarity
   and readability, primarily because they let the code more directly
   reflect the FSM model.

5) Other usages are very suspect, but there may be OK cases.  If it really
   improves clarity, go ahead -- but document WHY you used a "goto," and
   why other approaches were found lacking.

6) Norm Cohen recommended Knuth's article in the December 1974 Computing
   Surveys, as "the best discussion I've ever seen of the goto issue."

[------------------------------------------------------------------------]
[ Point 2 is new material from me.  As an example, some code (originally ]
[ from Knuth) was posted as a case where a goto makes sense.  Yet when   ]
[ redesigned without it, the code is clearer and simpler.  I've appended ]
[ the example and my analysis at the end of this write-up.               ]
[------------------------------------------------------------------------]

QUESTIONS:
----------

1) Are there other enumerable cases where "goto" improves clarity?
   (Not examples -- I'll grant as given that isolated instances may
   occur -- but application classes or general situations that could
   be described and watched out for.)

2) In what application areas do goto-based FSM implementations
   commonly occur?

3) Does this template reflect the design of a goto-based FSM, or have I
   missed something?  Is there a better general approach?

      begin
         <<State_1>>                  -- A block like this exists for
            [processing for state 1]  -- each state of the machine.
                                      --
            if [condition] then       -- An explicit variable keeping
               goto [State_X];        -- the "current" state is not
            elsif [condition] then    -- needed -- it is represented
               goto [State_Y];        -- by the program counter.
            else                      --
               goto [State_Z];        -- The coder must ensure execution
            end if;                   -- does not "fall through" to the
                                      -- next state by accident.
   
         <<State2>>
            ...

4) The alternative would be as shown below.  This discussion is biased
   by my ignorance of leading-edge FSM work.  Please comment on
   advantages and disadvantages of this approach.

      -- good grief, use meaningful names in a real application
      type State is (Initial_State, State1, ... Staten);
      Current_State: State;
   
      -- variables that may affect the next transition
      ...

      function Next_State (Current_State: State) return State;
   
      loop
         exit when
           (Current_State = [a terminal state])
           or 
           (Current_State = [another terminal state])
               ...;
   
         -------------------------------------------------------- AAA
         case Current_State is
            when State_1 =>
               [Processing for state 1]
               Current_State := Next_State (Current_State);
            ...
         end case;

         if Debug then
            Text_Io.Put_Line (State'Image (Current_State)); ----- BBB
         end if;

      end loop;

   Next_State can use an if/elsif cascade, or an array or hash-table
   lookup, or whatever is appropriate for the application-specific
   transition table.
   
   It has these advantages:

   (a) The transition logic is separate from the processing logic for
       each state.  (This is good or bad, depending on how intertwined
       their logic naturally is.)

   (b) There is a place to put processing that must occur at the start
       or end of each state (note lines AAA and BBB).

   (c) The possible states are listed in an enumeration, usable for
       I/O or to represent a "memory" of previous states, if needed.

   It has these disadvantages:

   (a) The transition logic is separate from the processing logic for
       each state.

   (b) It is less efficient: at the end of each state, we jump to restart
       the loop, evaluate the case expression, and only then jump to the
       next state.

   (c) If may be harder to locate accidentally-"killed" states.  A
       cross-reference tool could identify unused statement labels.

   Adding or deleting a state is harder with the case structure because
   you have to edit in two locations, keeping the type definition and
   the case structure synchronized.  On the other hand, adding or
   deleting a state is harder with the goto structure because the logic
   that jumps to the new (or deleted) state is less localized -- you
   have to hunt through all the code for the "goto"s.  This seems to
   be a wash.

EXAMPLE OF SIMPLIFYING DESIGN TO ELIMINATE GOTOS:
-------------------------------------------------

The code as given was: - - - - - - - - - - - - - - - - - - - - - - -

> Here's an example to ponder:  Consider the declarations
> 
>    type Node_Type;
> 
>    type Tree_Type is access Node_Type;
> 
>    type Node_Type is
>       record
>          Count       : Natural := 0;
>          Left, Right : Tree_Type;
>       end record;
> 
>    type Bit_Vector_Type is array (Natural range <>) of Boolean;
> 
>    type Bit_Vector_Pointer_Type is access Bit_Vector_Type;
> 
>    Root : Tree_Type;
>    BV   : Bit_Vector_Pointer_Type;
> 
> BV points to a bit vector describing a path (False = left, True = right)
> through the tree pointed to by Root.  We are counting the number of times
> a given bit vector is encountered in the Count component of the
> corresponding tree node.  Root starts out pointing to a tree consisting
> only of a root node, and new nodes are added to the tree as needed.  The
> loop below has three exits, and the actions appropriate when one of the
> exits is taken is different from the actions appropriate when the other
> two are taken, so a goto comes in handy.  Here's the code:
> 
>       Node := Root;
>       Bit := BV'First;
>       loop
>          if Bit > BV'Last then
>             goto Increment;
>          end if;
>          if BV(Bit) then
>             if Node.Left = null then
>                Node.Left := new Node;
>                Node := Node.Left;
>                exit;
>             else
>                Node := Node.Left;
>             end if;
>          else
>             if Node.Right = null then
>                Node.Right := new Node;
>                Node := Node.Right;
>                exit;
>             else
>                Node := Node.Right;
>             end if;
>          end if;
>          Bit := Bit + 1;
>       end loop;
> 
>       -- We've reached a leaf.  Read the rest of BV.all while extending
>       --    the tree along the corresponding path only.
>       for I in Bit+1 .. BV'Last loop
>          if BV(I) then
>             Node.Left := new Node;
>             Node := Node.Left;
>          else
>             Node.Right := new Node;
>             Node := Node.Right;
>          end if;
>       end loop;
> 
>    <<Increment>>
>       Node.Count := Node.Count + 1;


Is this a good design?

It would be clearer to just let the first loop do all the work.  Eliminate
the "exit" statements and turn it into a "for" loop.  This works, but it
checks for "null" in nodes you just created.  If this makes a significant
performance difference to your application, you can optimize the loop.
(In this case, the compiler can't do it.)

However, the code should still be readable.  Specifically, you should be
able to state what each loop does each iteration, and what it has done
once it exits.  (If you like semi-mathematical jargon, you need a loop
invariant and a termination condition).

Let's use two clearly-defined loops:

    -- Move to the last existing node on the path
    FOR all existing nodes on path
        move to node

    -- Create any new nodes
    FOR all new nodes on path
        create the node

The new code is: - - - - - - - - - - - - - - - - - - - - - - - - - - -

     -- Declarations the same (but declare Bit and initialize Root)
   begin

      Node := Root;
      Bit := BV'First;

      ---------------------------------------------
      -- Move to the connect-node for the first  --
      -- node that needs to be created (if any). --
      ---------------------------------------------
      -- We are manually simulating a "for" loop to
      -- retain the index after loop termination.
      loop
         exit when Bit > BV'Last;

         if BV(Bit) then
            if Node.Left = null then
               exit;
            else
               Node := Node.Left;
            end if;
         else
            if Node.Right = null then
               exit;
            else
               Node := Node.Right;
            end if;
         end if;
         Bit := Bit + 1;
      end loop;

      ---------------------------------------------------
      -- We are at the last existing node in the path. --
      -- If there are any non-existent nodes on the    --
      -- path, Bit defines the first; create them.     --
      ---------------------------------------------------

      for I in Bit .. BV'Last loop
         if BV(I) then
            Node.Left := new Node;
            Node := Node.Left;
         else
            Node.Right := new Node;
            Node := Node.Right;
         end if;
      end loop;

      -------------------------------------------------------
      -- Increment the count at the node identified by BV. --
      -------------------------------------------------------

      Node.Count := Node.Count + 1;

This is exactly as efficient, and easier to maintain.  The loops have
clear missions, node creation is more localized, and the Increment
statement is only reached by one path of execution.

-- end --


-- 
Samuel Mize -- smize@imagin.net -- Team Ada
(personal net account)