Re: Garbage Collection

comp.lang.ada
 help / color / mirror / Atom feed

* Re: Garbage Collection
@ 1988-12-26 23:37 Erland Sommarskog
  1988-12-27 21:24 ` William Thomas Wolfe,2847,
  1988-12-27 22:24 ` Bob Hathaway
  0 siblings, 2 replies; 6+ messages in thread
From: Erland Sommarskog @ 1988-12-26 23:37 UTC (permalink / raw)


Bill Wolfe (wtwolfe@hubcap.UUCP) first said:
>>>     The deallocation of every object in the local environment is
>>>     performed as an automatic service when a procedure, function,
>>>     or local block is exited.  This is not garbage collection,
>>>     because the programmer has implicitly directed that the 
>>>     destruction be performed.  

I tried to understand:
>> I read this as "on block exit all memory allocated to variables
>> declared in that block should be deallocated". 
>>   Isn't this very dangerous? What if programmer copied the object to
>> a variable declared in a outer block? Or stored it in a table of some
>> sort in a subprogram call made in block? 

And Bill explains:
>   Not at all.  If a programmer assigns the value stored in some local
>   object to some nonlocal object, then we simply have a new value for
>   a nonlocal object, assuming we have not engaged in the foul practice
>   of structural sharing.  Those who engage in structural sharing will
>   get the run-time errors they deserve for engaging in such space-hacking. 


And I still don't understand. Let's say we have a tree handler
   Generic 
      Data_type is limited private;
      Procedure -- Assign, "<" and ">" 
   Package Tree_handler
      Tree_type is private;
Assume further that we instantiate this package with Some_access_type, 
it could be an imported limited type, it could be a locally defined.
Then we have the procedure:

    Procedure Insert_data_to_tree(Tree : in out Tree_type;
                                  Data_of_some_sort : in Any_type); 
    Reference : Some_access_type;
    Begin
       Reference := new Some_type;  -- Or subprogram call for external type
       Prepare(Reference, Data_of_some_sort);
       Insert(Tree, Reference);
    End Insert_data_to_tree;
    
According to Bill the data allocated to Reference should be freed when 
we exit this procedure, but if Insert does what we think this is of
course a severe error. Where did I go wrong? Did I plead guilty to
structure sharing just because I inserted a pointer to a tree? 
  Or does Bill mean that when Reference is inserted to the tree that 
Reference.all should be inserted to the tree, not the pointer itself?
But what if I also insert the reference into another tree (or some other 
structure) and then modify the referred data (some counter, for instance) 
and want this change to appear in both places?
  Note also that in this example, there are no side effects. We are only
dealing with local variables and paramerters. And still deallocation
on block exit is completely wrong. (And there is no way the compiler
could deduce that Insert really saves Reference somewhere.)

Several times in this discussion Bill Wolfe has claimed that 
garbage collection is a trouble-maker for maintainers. In what 
way? I can only see one: If the system due to a heavy use of memory
is spending a lot of time on garbage collecttion, something must be 
done. But in this case the problem is often quite easy to spot.
  Much worse for maintainence is when the system dies with "System 
access violation" or "segmentation fault" from time to time and you 
don't know why. The reason may be that an allocated area was erroneously 
released and then reused and the original data, that someone appearently 
is relying in, has been over-written. Or because an already deallocated 
was deallocated a second time.
  Bill Wolfe is trying to outline some alternative, but I
think he has to think more about, until it is equally safe.
(And really, I don't think he should. He is just reinventing
the wheel.)
  
Bill Wolfe said earlier that garbage collection encourages  
sloppy programming. Whether it is sloppy or not, I don't care,
but obviously garbage collection is taking a lot of the burden
of the programmer's shoulder and the maintainer's too. Why
should a programmer waste time bothering about something that
the computer could do much safer for him? After all, the  
programmer does not work for free.

So to conclude: I think Ada should require garbage collection
that could be disabled with a pragma for critical applications.
As it is now a portable application cannot rely on it, with
all the troubles explicit deallocation implies.
-- 
Erland Sommarskog
ENEA Data, Stockholm              This signature is not to be quoted.
sommar@enea.se

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Garbage Collection
  1988-12-26 23:37 Garbage Collection Erland Sommarskog
@ 1988-12-27 21:24 ` William Thomas Wolfe,2847,
  1988-12-28 16:09   ` Snorri Agnarsson
  1988-12-27 22:24 ` Bob Hathaway
  1 sibling, 1 reply; 6+ messages in thread
From: William Thomas Wolfe,2847, @ 1988-12-27 21:24 UTC (permalink / raw)


From article <4195@enea.se>, by sommar@enea.se (Erland Sommarskog):

 [Erland describes a situation in which a user instantiates a tree 
  structure over an access type, and then considers the insertion
  procedure with which the user inserts an object into the tree]

> According to Bill the data allocated to Reference [in the example,
> a pointer to the user's object] should be freed when 
> we exit this procedure, but if Insert does what we think this is of
> course a severe error. Where did I go wrong? Did I plead guilty to
> structure sharing just because I inserted a pointer to a tree? 
>   Or does Bill mean that when Reference is inserted to the tree that 
> Reference.all should be inserted to the tree, not the pointer itself?
> But what if I also insert the reference into another tree (or some other 
> structure) and then modify the referred data (some counter, for instance) 
> and want this change to appear in both places?
>   Note also that in this example, there are no side effects. We are only
> dealing with local variables and paramerters. And still deallocation
> on block exit is completely wrong. (And there is no way the compiler
> could deduce that Insert really saves Reference somewhere.)

   The user has supplied the limited private type STORED_OBJECT, and
   defined assignment over this object.  Let's assume that the tree
   is implemented in terms of a record structure, one field of which
   is a STORED_OBJECT.  In the INSERT_ITEM procedure, the user sends
   "in out TREE_TYPE; in STORED_OBJECT".  The tree is modified as follows:
   a new record is created to represent a new node of the tree.  The 
   statement ASSIGN (NEW_NODE.STORED_ITEM, NEW_OBJECT) is executed,
   along with whatever pointer manipulations are necessary to maintain
   the structure of the tree.  

   Now we proceed to block exit.  Since the tree and the new object are
   *parameters* and not local variables, they are not subject to the
   automatic destruction which applies to the block's local variables.

   Since the user supplied an access type as the type of object to be
   stored in the tree, the user bears full responsibility for making
   responsible use of the fact that he/she is dealing with a tree of pointers.

> Several times in this discussion Bill Wolfe has claimed that 
> garbage collection is a trouble-maker for maintainers. In what way? 

      When there is great difficulty deciding whether a given object 
      exists or not, the maintainer experiences great difficulty 
      pinning down the precise state of the program.  
 
>   Much worse for maintainence is when the system dies with "System 
> access violation" or "segmentation fault" from time to time and you 
> don't know why. The reason may be that an allocated area was erroneously 
> released and then reused and the original data, that someone appearently 
> is relying in, has been over-written. Or because an already deallocated 
> was deallocated a second time.

      There are basically two cases in which this can happen:

         1) When an inexperienced ADT designer is in the process 
            of acquiring the mental discipline required of anyone
            who programs pointer-based structures.  To avoid this,
            purchase ADTs from professional houses or use experienced
            ADT designers.  Otherwise, it's a cost of professional training.

         2) When a user has engaged in S T R U C T U R A L  S H A R I N G.

                   *** Don't engage in structural sharing ***

> Bill Wolfe said earlier that garbage collection encourages  
> sloppy programming. Whether it is sloppy or not, I don't care,

    Obviously.

> Why should a programmer waste time bothering about something that
> the computer could do much safer for him? After all, the  
> programmer does not work for free.

    DESTROY procedures are easy to write, need only be written once
    per ADT, and can be reused indefinitely.  GC costs us the
    computer time required to repeatedly determine which storage
    is in use and which is not, and this cost must be paid repeatedly
    AT RUN TIME.  Given that this PERMANENT, RUN_TIME COST can be
    avoided by simply communicating the fact that a particular object
    is no longer needed, GC is a rather costly crutch for lazy programmers. 



                                             Bill Wolfe

                                     wtwolfe@hubcap.clemson.edu

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Garbage Collection
  1988-12-26 23:37 Garbage Collection Erland Sommarskog
  1988-12-27 21:24 ` William Thomas Wolfe,2847,
@ 1988-12-27 22:24 ` Bob Hathaway
  1988-12-28  6:30   ` Procedural variables, abstraction mechanisms, ADTs William Thomas Wolfe,2847,
  1 sibling, 1 reply; 6+ messages in thread
From: Bob Hathaway @ 1988-12-27 22:24 UTC (permalink / raw)


>According to Bill the data allocated to Reference should be freed when 
>we exit this procedure, but if Insert does what we think this is of
>course a severe error. Where did I go wrong? Did I plead guilty to

I think Bill meant deallocation should only occur if an object is no
longer accessible at the end of a subprogram or block, and Reference is 
accessible.  Subprogram and block exit is a very convenient time to perform 
clean-up of access types.  If Ada had an Adt construct for the "completion of
the Adt paradigm", we could specify a termination procedure for Adts which
could also be called at scope exit.  I agree that Adts/objects are an important
enough programming development to justify a new language construct.  I've 
worked with over 200,000 lines of code which used the older data-structure
oriented design and found the greatest problem in understanding and working
with these programs was caused by the use of global variables and the 
indescriminant access and modification of data-structures which were not 
implemented as Adts.  Such data-structures are poorly defined and lead to
code which is hard to read.

For my own opinion on the garbage collection discussion, I think a destroy 
procedure is desirable.  Adts need to be explicitly deallocated for long 
running daemons or loops even while objects are still accessible.  While 
calling a procedure or declaring objects in a local block from within a loop
allows automatic deallocation of objects, I'd rather use scope rules and 
subprograms for creating environments and performing actions and not for 
storage (de)allocation.  Also, I think Adt operations should include a destroy
procedure for completeness. If space management is not a problem the destroy 
procedure does not have to be called.  Concerning working sets, if a page is 
not accessed it will not reside in memory for long and should not cause a 
virtual space management problem.

I strongly agree with assignment operator overloading, and another
powerful extension is to allow any name/operator to be overloadable.

For yet another Ada 9X extension, I propose procedural variables.  As in 
Modula procedural variables can be limited to top level procedures but formal
parameter names must be included for named parameter association.  Here is an
example declaration:
    type insertNode_procedure = procedure (adt : in out structure,
					   node : in node);
Variables of type insertNode_procedure can be assigned to any procedure with
the same parameter type structure (type domain).  Procedural variables can
avoid the redundant use of case statements by allowing an operation to be 
stored within an Adt.  It also allows individually parameterizable and 
reprogrammable Adts since operations can be provided to alter the Adts actions
or structure.  Generic subprogram parameters can only allow Adts to be set 
once for any instantiation.  I use procedural variables and function pointers
in Adts frequently when programming in languages other than Ada and am 
convinced they are an elegant way to model Adt actions.

                                         Bob Hathaway
                                         rjh@purdue.edu

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Procedural variables, abstraction mechanisms, ADTs
  1988-12-27 22:24 ` Bob Hathaway
@ 1988-12-28  6:30   ` William Thomas Wolfe,2847,
  0 siblings, 0 replies; 6+ messages in thread
From: William Thomas Wolfe,2847, @ 1988-12-28  6:30 UTC (permalink / raw)


From article <5696@medusa.cs.purdue.edu>, by rjh@cs.purdue.edu (Bob Hathaway):
> 
> For yet another Ada 9X extension, I propose procedural variables.  

     This proposal has been given detailed consideration in Chapter 4 of 
     "Abstraction Mechanisms and Language Design", an ACM Distinguished 
     Dissertation by Paul N. Hilfinger.  Hilfinger was a member of the 
     Ada design team (see the Foreword in your LRM), and Chapter 4 is 
     concerned with ways in which the abstraction mechanisms in current 
     Ada can be strengthened even further.  It's published in book form 
     by the MIT Press, and should be available in the Purdue library.

     Another book I would strongly recommend to anyone interested in
     abstraction and programming languages is "Concurrency and Programming
     Languages", by David Harland.  Harland spends 2/3 of the book discussing
     *abstraction* and programming languages, and eventually does get around
     to actually discussing concurrency in Chapter 3.  Harland drives home
     the point that the default must be to maximize expressivity, and the
     burden of proof must fall upon those who argue that certain forms of
     abstraction must not be permitted.  Although I believe that such an
     argument can be made successfully in certain cases, whereas Harland
     apparently does not, the book is nevertheless a very powerful argument
     in favor of "unfettered abstraction", and it will certainly make you
     think.  Harland also takes on Lisp and the functional programming 
     languages; his comments are very terse, and quite devastating.  This  
     book is relatively small, but its intellectual content is tremendous.
     It's definitely a book you'll want to reserve several weekends for. 

     For a more practical look at abstract data types, Grady Booch's
     "Software Components with Ada" is a classic.  I also like Booch's
     definition of object-oriented programming, which essentially states
     that OOP is the practice of applying ADTs in a sensible manner.


                                          Bill Wolfe

                                   wtwolfe@hubcap.clemson.edu
 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Garbage Collection
  1988-12-27 21:24 ` William Thomas Wolfe,2847,
@ 1988-12-28 16:09   ` Snorri Agnarsson
  1988-12-30  0:46     ` Bill Wolfe
  0 siblings, 1 reply; 6+ messages in thread
From: Snorri Agnarsson @ 1988-12-28 16:09 UTC (permalink / raw)


From article <3985@hubcap.UUCP>, by billwolf@hubcap.clemson.edu (William Thomas Wolfe,2847,):
>       When there is great difficulty deciding whether a given object 
>       exists or not, the maintainer experiences great difficulty 
>       pinning down the precise state of the program.  

 Exactly - and garbage collection is precisely what you need to get rid
 of this difficulty.
 When you have garbage collection there is no difficulty in determining
 whether an object exists or not.  It exists if you have some way to
 refer to it, otherwise not.

>     DESTROY procedures are easy to write, need only be written once
>     per ADT, and can be reused indefinitely.  GC costs us the
>     computer time required to repeatedly determine which storage
>     is in use and which is not, and this cost must be paid repeatedly
>     AT RUN TIME.  Given that this PERMANENT, RUN_TIME COST can be
>     avoided by simply communicating the fact that a particular object
>     is no longer needed, GC is a rather costly crutch for lazy programmers. 

 The PERMANENT, RUN_TIME COST is very often much worse if you use the
 methods Mr. Wolfe dogmatically demands.  For example if you have a program
 which does calculations with indefinite size integers, then each new integer
 value computed is most appropriately stored in a new area in the heap.
 If you disallow sharing then each assignment must create a new area.
 This is costly.  Each time you stop using an integer variable you must
 deallocate it.  This is costly.
 If you have a simple copying garbage collector and allow sharing, which
 by the way is not dangerous at all in this case at least, you don't have
 the abovementioned costly operations, but instead you have a simple garbage
 collection which will in my experience be infrequent (e.g. every ten seconds
 for a program doing intensive calculations with 10000 digit integers) and 
 fast (less than a fraction of a second for each collection).  And this
 is on a machine with small memory (a PC with 640KB memory).

 Memory management is one of the most difficult things for programmers to
 do correctly and efficiently.  Garbage collection certainly takes this
 burden from the programmer and allows him to think in more abstract terms.

 In my opinion reusable software components can not be adequately created
 in languages that do not have good automatic memory management.
 In this respect Ada is a failure unless garbage collection is required.
 The C++ approach of using destructors is costly both in memory and time
 and does not really solve the problem, although it is better than nothing.

 It is regrettable in my opinion that the issue of garbage collection
 is so much misunderstood.  It is also regrettable that proponents of
 the various interesting new programming methodologies do not push
 garbage collection as an issue.  Functional programming, logic
 programming and object oriented programming have only one thing in
 common; reliance on garbage collection.  None of those methodologies
 would have much use in my opinion if it were not for garbage collection.
 And, I repeat: reusable modules (that includes generic packages) are
 almost useless in languages with no garbage collection compared to
 the use you can get in languages having garbage collection.

 
-- 
Snorri Agnarsson                        --  snorri@rhi.is
Raunvisindastofnun Haskolans      
Dunhaga 3
IS-107 Reykjavik ICELAND

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Garbage Collection
  1988-12-28 16:09   ` Snorri Agnarsson
@ 1988-12-30  0:46     ` Bill Wolfe
  0 siblings, 0 replies; 6+ messages in thread
From: Bill Wolfe @ 1988-12-30  0:46 UTC (permalink / raw)


From article <692@krafla.rhi.hi.is>, by snorri@rhi.hi.is (Snorri Agnarsson):
> From article <3985@hubcap.UUCP>, by billwolf@hubcap.clemson.edu (William Thomas Wolfe,2847,):
>>       When there is great difficulty deciding whether a given object 
>>       exists or not, the maintainer experiences great difficulty 
>>       pinning down the precise state of the program.  
> 
>  Exactly - and garbage collection is precisely what you need to get rid
>  of this difficulty.
>  When you have garbage collection there is no difficulty in determining
>  whether an object exists or not.  It exists if you have some way to
>  refer to it, otherwise not.

     So basically you have to sit there and total up the references to
     every piece of space manually.  Sounds like fun.  Got an algorithm
     for doing this in constant time?  If not, there is "great difficulty
     pinning down the precise state of the program". 
 
>>     DESTROY procedures are easy to write, need only be written once
>>     per ADT, and can be reused indefinitely.  GC costs us the
>>     computer time required to repeatedly determine which storage
>>     is in use and which is not, and this cost must be paid repeatedly
>>     AT RUN TIME.  Given that this PERMANENT, RUN_TIME COST can be
>>     avoided by simply communicating the fact that a particular object
>>     is no longer needed, GC is a rather costly crutch for lazy programmers. 
> 
>  The PERMANENT, RUN_TIME COST is very often much worse if you use the
>  methods Mr. Wolfe dogmatically demands.  For example if you have a program
>  which does calculations with indefinite size integers, then each new integer
>  value computed is most appropriately stored in a new area in the heap.
>  If you disallow sharing then each assignment must create a new area.

      Consider what happens to the storage associated with the old value
      of the target of the assignment statement.  The ASSIGN procedure
      can reuse it directly rather than deallocating and reallocating it.
      
>  Memory management is one of the most difficult things for programmers to
>  do correctly and efficiently.  Garbage collection certainly takes this
>  burden from the programmer and allows him to think in more abstract terms.

     Space is an important part of the abstraction.  And I challenge the
     claim that memory management is difficult.  One need only deal with
     a single structural model within the ADT; in the course of writing
     that ADT's implementation, one will become thoroughly familiar with
     that structure, and there should be no difficulty deciding when a
     part of it is no longer needed.

>  In my opinion reusable software components can not be adequately created
>  in languages that do not have good automatic memory management.

    In my opinion, reliance upon garbage collection is the single best
    way to create an inferior ADT implementation which will be totally
    rejected by anyone who is programming critical applications.  

>  In this respect Ada is a failure unless garbage collection is required.

    In this respect garbage collection is a total failure and should
    NEVER be required.  In the interests of seeing to it that portability
    to the critical-application environment is always possible, GC should
    be PROHIBITED.

>  It is regrettable in my opinion that the issue of garbage collection
>  is so much misunderstood.  

    Particularly by people who have no understanding of the issues
    applying to proper ADT design.

>  It is also regrettable that proponents of
>  the various interesting new programming methodologies do not push
>  garbage collection as an issue.  Functional programming, logic
>  programming and object oriented programming have only one thing in
>  common; reliance on garbage collection.  None of those methodologies
>  would have much use in my opinion if it were not for garbage collection.

    They don't have much use anyway.  See David Harland's "Concurrency
    and Programming Languages".


                                            Bill Wolfe

                                     wtwolfe@hubcap.clemson.edu

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~1988-12-30  0:46 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1988-12-26 23:37 Garbage Collection Erland Sommarskog
1988-12-27 21:24 ` William Thomas Wolfe,2847,
1988-12-28 16:09   ` Snorri Agnarsson
1988-12-30  0:46     ` Bill Wolfe
1988-12-27 22:24 ` Bob Hathaway
1988-12-28  6:30   ` Procedural variables, abstraction mechanisms, ADTs William Thomas Wolfe,2847,

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox