From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me X-Spam-Level: X-Spam-Status: No, score=-0.3 required=5.0 tests=BAYES_00, REPLYTO_WITHOUT_TO_CC autolearn=no autolearn_force=no version=3.4.4 X-Google-Language: ENGLISH,ASCII-7-bit X-Google-Thread: 103376,a7f223906eb859a,start X-Google-Attributes: gid103376,public From: smize@news.imagin.net (Samuel Mize) Subject: GOTO considered necessary Date: 1997/06/06 Message-ID: <5n977i$2cmc$1@prime.imagin.net> X-Deja-AN: 246556163 Organization: ImagiNet Communications Ltd, Arlington, Texas Reply-To: smize@imagin.net (Samuel Mize) Newsgroups: comp.lang.ada Date: 1997-06-06T00:00:00+00:00 List-Id: Greetings, I'm rebeating a dead horse, in order to write up a consensus paper to submit to adahome. I'll appreciate your input. This one concerns use of "goto" statements. I have scanned through the back c.l.a messages in DejaNews. I request your comments on this summary. I also have a few questions for everyone. Samuel Mize USE OF "GOTO" IN ADA -------------------- SUMMARY: -------- 1) In general, "goto" statements should be avoided. They create confusing execution flow structures -- at first, or after a few maintenance changes. 2) As a rule of thumb, if your code needs a "goto" to work efficiently (or at all), you have probably overlooked a simpler, better design. 3) "Goto" statements are acceptable in machine-generated code, but ONLY if a human is NEVER expected to read the code. (For example, a module generated by a tool from data-flow diagrams.) 4) As an exception, Finite State Machines (FSMs) are often implemented with goto statements. The people doing this argue that, for FSMs, "goto"s improve clarity and readability, primarily because they let the code more directly reflect the FSM model. 5) Other usages are very suspect, but there may be OK cases. If it really improves clarity, go ahead -- but document WHY you used a "goto," and why other approaches were found lacking. 6) Norm Cohen recommended Knuth's article in the December 1974 Computing Surveys, as "the best discussion I've ever seen of the goto issue." [------------------------------------------------------------------------] [ Point 2 is new material from me. As an example, some code (originally ] [ from Knuth) was posted as a case where a goto makes sense. Yet when ] [ redesigned without it, the code is clearer and simpler. I've appended ] [ the example and my analysis at the end of this write-up. ] [------------------------------------------------------------------------] QUESTIONS: ---------- 1) Are there other enumerable cases where "goto" improves clarity? (Not examples -- I'll grant as given that isolated instances may occur -- but application classes or general situations that could be described and watched out for.) 2) In what application areas do goto-based FSM implementations commonly occur? 3) Does this template reflect the design of a goto-based FSM, or have I missed something? Is there a better general approach? begin <> -- A block like this exists for [processing for state 1] -- each state of the machine. -- if [condition] then -- An explicit variable keeping goto [State_X]; -- the "current" state is not elsif [condition] then -- needed -- it is represented goto [State_Y]; -- by the program counter. else -- goto [State_Z]; -- The coder must ensure execution end if; -- does not "fall through" to the -- next state by accident. <> ... 4) The alternative would be as shown below. This discussion is biased by my ignorance of leading-edge FSM work. Please comment on advantages and disadvantages of this approach. -- good grief, use meaningful names in a real application type State is (Initial_State, State1, ... Staten); Current_State: State; -- variables that may affect the next transition ... function Next_State (Current_State: State) return State; loop exit when (Current_State = [a terminal state]) or (Current_State = [another terminal state]) ...; -------------------------------------------------------- AAA case Current_State is when State_1 => [Processing for state 1] Current_State := Next_State (Current_State); ... end case; if Debug then Text_Io.Put_Line (State'Image (Current_State)); ----- BBB end if; end loop; Next_State can use an if/elsif cascade, or an array or hash-table lookup, or whatever is appropriate for the application-specific transition table. It has these advantages: (a) The transition logic is separate from the processing logic for each state. (This is good or bad, depending on how intertwined their logic naturally is.) (b) There is a place to put processing that must occur at the start or end of each state (note lines AAA and BBB). (c) The possible states are listed in an enumeration, usable for I/O or to represent a "memory" of previous states, if needed. It has these disadvantages: (a) The transition logic is separate from the processing logic for each state. (b) It is less efficient: at the end of each state, we jump to restart the loop, evaluate the case expression, and only then jump to the next state. (c) If may be harder to locate accidentally-"killed" states. A cross-reference tool could identify unused statement labels. Adding or deleting a state is harder with the case structure because you have to edit in two locations, keeping the type definition and the case structure synchronized. On the other hand, adding or deleting a state is harder with the goto structure because the logic that jumps to the new (or deleted) state is less localized -- you have to hunt through all the code for the "goto"s. This seems to be a wash. EXAMPLE OF SIMPLIFYING DESIGN TO ELIMINATE GOTOS: ------------------------------------------------- The code as given was: - - - - - - - - - - - - - - - - - - - - - - - > Here's an example to ponder: Consider the declarations > > type Node_Type; > > type Tree_Type is access Node_Type; > > type Node_Type is > record > Count : Natural := 0; > Left, Right : Tree_Type; > end record; > > type Bit_Vector_Type is array (Natural range <>) of Boolean; > > type Bit_Vector_Pointer_Type is access Bit_Vector_Type; > > Root : Tree_Type; > BV : Bit_Vector_Pointer_Type; > > BV points to a bit vector describing a path (False = left, True = right) > through the tree pointed to by Root. We are counting the number of times > a given bit vector is encountered in the Count component of the > corresponding tree node. Root starts out pointing to a tree consisting > only of a root node, and new nodes are added to the tree as needed. The > loop below has three exits, and the actions appropriate when one of the > exits is taken is different from the actions appropriate when the other > two are taken, so a goto comes in handy. Here's the code: > > Node := Root; > Bit := BV'First; > loop > if Bit > BV'Last then > goto Increment; > end if; > if BV(Bit) then > if Node.Left = null then > Node.Left := new Node; > Node := Node.Left; > exit; > else > Node := Node.Left; > end if; > else > if Node.Right = null then > Node.Right := new Node; > Node := Node.Right; > exit; > else > Node := Node.Right; > end if; > end if; > Bit := Bit + 1; > end loop; > > -- We've reached a leaf. Read the rest of BV.all while extending > -- the tree along the corresponding path only. > for I in Bit+1 .. BV'Last loop > if BV(I) then > Node.Left := new Node; > Node := Node.Left; > else > Node.Right := new Node; > Node := Node.Right; > end if; > end loop; > > <> > Node.Count := Node.Count + 1; Is this a good design? It would be clearer to just let the first loop do all the work. Eliminate the "exit" statements and turn it into a "for" loop. This works, but it checks for "null" in nodes you just created. If this makes a significant performance difference to your application, you can optimize the loop. (In this case, the compiler can't do it.) However, the code should still be readable. Specifically, you should be able to state what each loop does each iteration, and what it has done once it exits. (If you like semi-mathematical jargon, you need a loop invariant and a termination condition). Let's use two clearly-defined loops: -- Move to the last existing node on the path FOR all existing nodes on path move to node -- Create any new nodes FOR all new nodes on path create the node The new code is: - - - - - - - - - - - - - - - - - - - - - - - - - - - -- Declarations the same (but declare Bit and initialize Root) begin Node := Root; Bit := BV'First; --------------------------------------------- -- Move to the connect-node for the first -- -- node that needs to be created (if any). -- --------------------------------------------- -- We are manually simulating a "for" loop to -- retain the index after loop termination. loop exit when Bit > BV'Last; if BV(Bit) then if Node.Left = null then exit; else Node := Node.Left; end if; else if Node.Right = null then exit; else Node := Node.Right; end if; end if; Bit := Bit + 1; end loop; --------------------------------------------------- -- We are at the last existing node in the path. -- -- If there are any non-existent nodes on the -- -- path, Bit defines the first; create them. -- --------------------------------------------------- for I in Bit .. BV'Last loop if BV(I) then Node.Left := new Node; Node := Node.Left; else Node.Right := new Node; Node := Node.Right; end if; end loop; ------------------------------------------------------- -- Increment the count at the node identified by BV. -- ------------------------------------------------------- Node.Count := Node.Count + 1; This is exactly as efficient, and easier to maintain. The loops have clear missions, node creation is more localized, and the Increment statement is only reached by one path of execution. -- end -- -- Samuel Mize -- smize@imagin.net -- Team Ada (personal net account)