Path: utzoo!utgpu!watmath!clyde!att!pacbell!ames!haven!uflorida!gatech!hubcap!billwolf
From: billwolf@hubcap.clemson.edu (William Thomas Wolfe,2847,)
Newsgroups: comp.lang.ada
Subject: Re: Garbage Collection
Message-ID: <4066@hubcap.UUCP>
Date: 11 Jan 89 16:01:24 GMT
References: <35328@think.UUCP>
Sender: news@hubcap.UUCP
Reply-To: billwolf@hubcap.clemson.edu

From article <35328@think.UUCP>, by barmar@think.COM (Barry Margolin):

> First of all, whether the file is locked is immaterial to the
> discussion (I never actually said that the compiler compiles files --
> in Lisp the compiler can also be invoked on in-core interpreted
> functions, and many Lisp programming environments allow in-core
> editor buffers to be compiled).

Whatever is being processed, be it a file or something else, would
have to be locked.  If it is already inaccessible to all other
processes, then it is continuously locked already.

> [discussion of copying vs. read-locking]

All editors I know of work by copying the targeted file into memory,
holding it there while it's being modified, and then writing it back
to the file.  Another approach is the locking mechanism.  Since
compiler warnings generally amount to very small text files, I'd
probably go the copying route if locking/unlocking consumed too much
time.  However, a good argument can also be made that there should
not be 3 million people simultaneously editing and/or compiling the
same file anyway, so it probably makes little difference which method
is chosen.

> Most modern GC schemes have time overhead that is a function (often
> linear) of the frequency of allocation.  Since assignments are
> always more frequent than allocations, and I suspect usually MUCH
> more frequent, this difference is important.

No, not just the frequency of allocation.  GC's performance also
depends upon the frequency of running out of memory.  Furthermore, GC
is a global mechanism, and it wastes much time scanning space which is
already being properly managed.

> the subroutine for reading the database into memory read each line
> into a newly allocated string, replaced each delimiter with a null
> character (C's string terminator), and allocated structures
> containing pointers to the first character of each field.  This was
> fine for the read-only applications, as they could simply deallocate
> all the strings and record structures that were allocated (they
> didn't actually bother, since the programs run on Unix and they
> depended on everything being deallocated when the program
> terminated).  I tried to write a program that reads in this database
> and then allows the user to edit fields.  If the new field value is
> shorter than the original, I simply overwrote the original value; if
> not, I allocated a new string for the new value, and changed the
> record's field pointer to point there instead of into the middle of
> the original line.  But when it came time to deallocate, I needed to
> know whether individual fields needed to be deallocated
> independently of the lines.  Had I gotten around to finishing this
> program, I probably would have added a parallel data structure
> containing flags indicating whether each field had been reallocated.

Why was each line read into a newly allocated monolithic string, with
pointers into this string?  It would seem far more sensible to read
each *field* into a newly allocated string; then when we need to
revise a field to a larger value, deallocate the old string and
allocate a new one.  Flags are not necessary.
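A minimal sketch of that per-field approach in C (the delimiter, the
field limit, and all names here are invented for illustration; error
checking is omitted):

    #include <stdlib.h>
    #include <string.h>

    #define MAX_FIELDS 16          /* assumed upper bound on fields */

    struct record {
        char *field[MAX_FIELDS];   /* each field separately allocated */
        int nfields;
    };

    /* Split one line into separately allocated field strings;
       ':' stands in for whatever delimiter the database uses. */
    struct record *parse_line(char *line)
    {
        struct record *r = malloc(sizeof *r);
        char *tok;

        r->nfields = 0;
        for (tok = strtok(line, ":");
             tok != NULL && r->nfields < MAX_FIELDS;
             tok = strtok(NULL, ":"))
            r->field[r->nfields++] = strdup(tok);
        return r;
    }

    /* Revising a field is then uniform, whatever the new length:
       deallocate the old string, allocate a new one.  No flags. */
    void set_field(struct record *r, int i, const char *value)
    {
        free(r->field[i]);
        r->field[i] = strdup(value);
    }

(strdup is the Unix library routine, which is fair game here, since
the quoted programs already run on Unix.)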
> In general, what I see coming out of this is that manual
> deallocation may always be possible, but it frequently requires a
> non-trivial amount of extra coding to provide bookkeeping to let the
> program know when it is safe to deallocate things.  Remember, every
> additional line of code in a program increases its complexity and is
> another potential failure point.  And details such as storage
> management are generally uninteresting to the application
> programmer, who is probably more interested in the problem domain

Which is why application programmers should make use of ADTs, which
encapsulate and hide the details of storage management.

> In conclusion, I expect that I can manage storage as well as you
> can.  And I could also manually convert a number to a character
> string and send these characters to an I/O device.  Every major
> language provides runtime support for the latter so we don't all
> have to waste our time writing it.  What's so special about storage
> management that we should all be burdened with it?

Conversion of a number to a character string will perform identically
whether we use a system routine or code that same routine ourselves.
Deallocation of storage is a one-time cost (and a small one at that)
if done by the programmer.  Given the implicit destruction of local
environments and the use of ADTs, application programmers will
practically never have to do any explicit deallocation anyway.  When
it is necessary, it's not that difficult.  In the database example,
a single line of code would suffice.
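To make that concrete: with per-field allocation, cleanup is one
uniform pass, and the "single line" is a call to a per-record routine
packaged inside the database ADT.  A sketch, continuing the
hypothetical names (and headers) of the parsing example above:

    /* One pass frees everything; no flags are needed, because
       every field owns its own allocation. */
    void free_record(struct record *r)
    {
        int i;

        for (i = 0; i < r->nfields; i++)
            free(r->field[i]);
        free(r);
    }

The editing program's cleanup is then just free_record(r) for each
record.

Bill Wolfe
wtwolfe@hubcap.clemson.edu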