From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.5-pre1 (2020-06-20) on
	ip-172-31-74-118.ec2.internal
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=3.0 tests=BAYES_00 autolearn=ham
	autolearn_force=no version=3.4.5-pre1
Date: Sat, 20 Jul 1991 14:45 EDT
From: SAHARBAUGH@ROO.FIT.EDU
Subject: ZZZZ HeapGuard Naecon paper (16,047 Bytes)
Message-ID: <9107201848.AA13692@ajpo.sei.cmu.edu>
List-Id: <comp.lang.ada>

On 20 July Norm Cohen writes about task access type variables (very well I 
might add) and says in passing:

"Of course no Ada compiler yet provides garbage collection for ANY types,
let alone task types."

Thanks Norm for reminding me to post our HeapGuard paper.

----
<This paper(with figures) was published in NAECON 91, IEEE 
catalog# 91CH3007-2, ISSN# 0547-3578, pp 704-708>


HeapGuard  TM,
Eliminating Garbage Collection in Real-Time Ada Systems

Sam Harbaugh and Bill Wavering
Integrated Software, Inc., Palm Bay, Florida


<HeapGuard is a trademark of Integrated Software, Inc.,
all rights reserved>



Abstract

	In any kind of real-time system, garbage collection must 
be avoided. Garbage collection requires a great deal of time 
to move data and occurs at unpredictable times.  This paper 
introduces HeapGuard, a heap management system which 
eliminates the problem of garbage collection in real-time 
systems.  The sources of garbage and traditional garbage 
collection are reviewed.  The HeapGuard scheme and optional 
supporting hardware are presented.


The Cause of Garbage

	Programming languages, such as Ada [LRM], Lisp, 
Smalltalk, C++, make extensive use of dynamic memory and heap 
management techniques.  For example, heap memory is used 
explicitly when the Ada programmer uses the keyword "new" to 
dynamically allocate an object such as a node in a linked 
list or when declaring an access type [Booch].  The Ada 
program makes implicit calls to the Ada Runtime Environment 
(ARTE) to allocate heap memory for task stack space, I/O data 
buffers and for various size program objects.  The ARTE calls 
the Heap Manager to get the heap space.  See Figure 1.

	When the program is done with an object it calls the 
ARTE to deallocate the object's heap space.  This happens 
implicitly when the object goes out of scope, for example, a  
subprogram's local storage  is  deallocated when the  
subprogram is exited.  Deallocation occurs explicitly  when 
the program calls an instantiation of the Ada generic 
procedure "unchecked_deallocation" [Booch].  The programmer 
depends on the ARTE calls to the Heap Manager to release and 
reclaim the heap space.


<figure shows relationship of Ada program, ARTE, Heap 
Manager, Heap management information and Heap memory>

Figure 1.   Run-Time Software Architecture

	Some heap managers return the released space for 
reallocation while others do nothing at all to make the space 
reusable.  The heap managers that do nothing to reclaim heap 
space will surely cause a "STORAGE_ERROR" exception when a 
program's allocation request cannot be satisfied.  The heap 
managers which reuse heap space can create garbage (unused 
heap fragments) as explained next.

	As the Ada program allocates and deallocates various 
size objects, the heap becomes fragmented when newly 
allocated objects don't fit exactly into holes left by 
deallocations.  See Figure 2.  Eventually an allocation 
request will be denied even though the total amount of unused 
memory exceeds the request.  The heap manager must then 
garbage collect or the program must quit.

	Garbage collection is the process of moving the 
program's data in the heap to a heap space where it can be 
stored contiguously [Aho].  See Figure 2.  Garbage collection 
takes an unpredictable amount of  time and occurs at 
unpredictable times, thus rendering any real-time use of 
garbage collection as unreliable.  The processing time is 
significant because the program's data must be moved.  After 
compaction, addresses of heap objects must be changed in the 
Ada program to point to the objects' new location.  Changing 
these program addresses is a complex 


<Figure shows empty heap, fragmented heap and compacted heap>
Figure 2.   A Traditional Heap

management problem not even attempted by today's Ada vendors 
or run-time systems.  Also, changing addresses is impossible 
in an Ada program implemented in read-only memory, such as in 
an embedded computer system.  In all, garbage collection must 
be avoided in real-time Ada programs.

General Discussion of Garbage Collection

	How can garbage collection be avoided?  Garbage 
collection is typically avoided by making all objects the 
same size or by making the heap requests all the size of the 
largest object.  This is difficult in programs such as expert 
systems, object oriented systems and data base applications.  
This technique, also, wastes space and restricts 
implementation techniques.

	How is garbage collection dealt with?  A common method 
of garbage collection is to always perform some form of 
incremental garbage collection according to some method of 
allocating processor resources in the "background."  Another 
method of performing garbage collection is to size the heap 
for the worst case storage scenario.  And the easiest method 
is to perform no garbage collection at all, at the expense 
of, eventually, receiving a "STORAGE ERROR" exception.

	How much memory does garbage collection use?  
Incremental garbage collection can waste 50% of the heap 
space.  During incremental garbage collection the users' data 
is marked and copied  from the fragmented heap space to a 
"clean" contiguous heap space.  The clean heap space has to 
be "large enough" to hold all the data from the fragmented 
heap space.  Large enough is roughly the same size as the 
fragmented heap space being copied from.  Thus, the space set 
aside to manage and accommodate the heap is roughly twice the 
needed space.  The other methods of dealing with fragmented 
heap space all operate by using some amount of extra memory.

Project Background

	In 1985 the Air Force awarded a contract, to Integrated 
Software, Inc., to identify a set of primitive functions that 
could be implemented in hardware to support the functions of 
the Ada Runtime Environment (ARTE) in real-time systems, such 
as avionics.  The work focused strongly on the support of 
heap management to avoid garbage collection and allow Ada to 
be safely used in real-time avionics systems.  Research 
performed on that contract validated the HeapGuard heap 
management scheme.  Further, the HeapGuard scheme is being 
used on another contract to improve real-time performance of 
expert systems and Artificial Intelligence.  On this second 
contract, HeapGuard has been integrated into the ARTE on a 
UNIX platform.  The HeapGuard scheme has been implemented in 
both hardware and software.

The HeapGuard Scheme

	The HeapGuard scheme can be stated as follows.  
Assuming random allocations and deallocations, for a heap 

< note to Ada BB readers: note the word "Assuming">

size of N storage units, reserve a heap area of 2N-2 units 
and allocate objects on the heap using first-fit allocation*.  
The first fit allocation must always start the allocation 
attempt from the same end of the heap.  For example, see 
Figure 3, to allocate a new object of size four words on the 
heap, the first fit allocator would begin searching for 
unallocated heap storage at location zero.  The new object 
would, finally, be stored beginning at location twenty.

<figure shows program address range, 0-37, for a 20 word 
HeapGuard heap using 2N-2 derating>

Figure 3.   A 20 word HeapGuardHeap

HeapGuard  Feasibility Tests

	In order to determine the feasibility of using 
HeapGuard , tests were run to try and "bust the heap."  
HeapGuard feasibility tests were performed using a prototype 
of HeapGuard.  The tests were run, one at a time, for varying 
values of reserved heap size, M, and for varying values of 
the amount of the heap, N, used by the program at any one 
time.  The size of the stored objects varied randomly from 1 
to N and the time at which they were deallocated varied 
randomly.  For each value of N, tests were run for M < 2*N-2 
and then tests were run for M >= 2*N-2.  In all cases the 
tests failed for M < 2*N-2 and never failed for  M >= 2*N-2.

	*For non-random allocations and deallocations there is a 
heap usage pattern which, according to Robson [ROB], requires 
more than 2*N-2 for heaps of 512 or greater storage units. 

<note to Ada BB readers: pay attention to this footnote.  The 
consequence is that a software-only HeapGuard scheme for over 
512 heap objects will become more costly (in terms of wasted 
heap RAM) and therefore to absolutely guarentee no need to 
garbage collect for a worst-case usage pattern one should use 
the hardware pointer scheme which "wastes" only address space 
and not RAM.>

Hardware Supported HeapGuard

	There are two reasons to provide hardware support for 
HeapGuard.  The first reason is that the additional hardware 
can perform heap management concurrent with execution of the 
Ada program and therefore obtain higher throughput.  The 
second, and more interesting, reason is that by using address 
pointers to a heap which stores large objects, fewer total 
memory elements are needed.  The remainder of this section 
describes the benefits of using address pointers.

	When using hardware support for HeapGuard, address 
pointers point to the allocated blocks in the heap.  See 
Figures 4 and 5.  All memory accesses by the Ada program to 
the heap are made indirectly through the pointers.  This 
indirect addressing is done by address bus interface level 
hardware, transparently to the Ada program.  Pointer 
management is done by the heap pointer manager. The heap 
pointer manager can be in the support hardware.  
Alternatively, the heap pointer manager can be in a separate 
program in the Ada program address space if the support 
hardware provides a data path from the heap pointer manager 
program to the pointer storage.


<figure shows pointers from program address space to heap 
memory locations>

Figure 4.

	The decision to use address pointers depends on the 
range of heap object sizes and the cost of memory.  In the 
case of large objects (as in AI applications) and/or 
expensive memory (as in avionics applications), the address 
pointers may be the most economical implementation.  In the 
case of small objects and/or inexpensive memory (as in an 
office workstation), the software-only implementation of 
HeapGuard may be the most economical implementation.

	The size of heap memory required for software or 
hardware implementation of pointers can be calculated.  For 
purposes of explanation assume that the heap memory is 
organized as 32 bit words (4 bytes) and that each address 
pointer is 16 bits (2 bytes).  Further assume that the Ada 
program wishes to use a heap of size N words.  In order to 
simplify the explanation, assume that the heap usage is 
derated by exactly 50%. For random usage patterns, the 
address pointers will occupy a memory space of 2*N and each 
pointer will be capable of pointing to any word in the actual 
heap.  For non-random usage patterns refer for examples to 
Robson [ROB].  If the Ada program wishes to be able to 
allocate one word (and larger) objects then 2*N pointers must 
be provided.  The total storage used for actual heap (4 
bytes/word * N words) and pointers (2 bytes/pointer * 2N 
pointers) is 4*N + 2*2N = 8N bytes.  In this case of one word 
objects (see Figure 4) it would be simpler to provide the 
actual heap space of 2N words (8N bytes) and implement 
HeapGuard in software only.



<figure shows one pointer per heap block>

Figure 5.

	By instrumenting an application, the sizes of objects 
the application allocates can be determined.  Now assume that 
the Ada program's heap object's sizes are always some 
multiple of 4 words, (see figure 5) termed a 4 word block.  
The address pointer memory space, as seen by the Ada program, 
will still be 2N but only 2N/4 address pointers need to be 
provided, one to each block.  This is simply implemented by 
not decoding the 2 least significant bits of the heap memory 
address.  In this case of 4 word blocks the memory required 
for actual heap (4 bytes/word * N words) and pointers (2 
bytes/pointer * 2N/4 pointers) is 4*N + 2*2N/4 =  4*N + N 
bytes.   

Conclusions

	In real-time computer systems, heap memory could be 
managed using HeapGuard in order to provide predictable 
timing and eliminate storage errors due to fragmentation.  If 
feasible, a coprocessor could be provided to manage the 
dynamic memory and increase system throughput.  If the 
application program is known to store only small objects, 
then a software-only implementation could be effective.  If 
the application program is known to store many large objects 
in dynamic memory then the coprocessor could provide and 
manage a set of pointers to these large objects.  If the 
program stores many small and many large objects, then a 
software-only heap for the small objects and an address 
pointer based heap for large objects could be provided.

Disclaimer

Neither the author, Integrated Software, Inc., nor the U. S. 
Government in any way warrant HeapGuard for any usage and 
are not responsible for any consequences of its usage.

Integrated Software, Inc., retains all rights to the 
HeapGuard technique and a patent is pending.  As such, the 
HeapGuard technique and the HeapGuard name are subject to 
use only with permission from ISI.

Acknowledgements

This research and development was sponsored by:    
Wright Research and Development Center
Avionics Laboratory
Wright-Patterson AFB, Ohio 45433-6543

In particular we wish to acknowledge Sharon Barklay, Captain 
Rick A. Long, Lt. Rick Ricart, Terry Rogers, James Williamson 
and Lt. Jon Wood for their support.

And we wish to thank Larry Koos, of Koos Technical Services,
Orlando FL, for his support.

A special thanks is due to Professor John McHugh at the 
University of North Carolina, Chapel Hill, NC.

<note to Ada BB readers: John computed worst-case upper 
bounds values from Robsen's paper>



References 

[Aho]  "Data Structures and Algorithms",             
A. V. Aho et al, Addison-Wesley, 1983, ISBN 0-201-00023-7

[Booch]  Software Engineering with Ada, 2nd edition (1986), 
Grady Booch, Benjamin-Cummings, ISBN 0-8053-0604-8

[LRM]  ANSI/MIL-STD-1815A The Ada Programming Language

[McEntee]  "Garbage Collection Enhances AI, Symbolic 
Processing Functions", Timothy J. McEntee, Computer 
Technology Review, Summer 1986

[ROB] "An Estimate of the Store Size Necessary for Dynamic 
Storage Allocation", J. M. Robson, JACM, Vol 18, No 3, July 
71, pp 416-423.

About the Authors

Sam Harbaugh received his BSEE, MSEE and PhD EE (1964) from 
Carnegie Institute of Technology (now Carnegie-Mellon 
University).  He has 25 years experience in real-time 
computer systems, 7 years using Ada.  He is President of 
Integrated Software, Inc., which specializes in real-time Ada 
for applications such as Avionics, Expert Systems and Aircrew 
Training Simulators.

Bill Wavering received his BSCS from the University of 
Illinois (1977), and MSCS (1982) from Florida Institute of 
Technology.  He has 13 years experience in real-time computer 
systems, the past 2 years using Ada, and the past 5 years 
using Artificial Intelligence techniques.  He is currently 
concluding a research project during which hardware support 
for the Ada Fuzzy Expert System (AFES, available from ISI), 
and for HeapGuard, was developed.  He is, also, Principal 
Investigator on a NASA contract to research techniques for 
inference in a real-time expert system.  He is a Member of 
the Technical Staff of Integrated Software, Inc.

The authors may be reached at: Integrated Software, Inc., 
1945 Palm Bay Rd. N.E. #7, Palm Bay, FL 32905,  407-984-1986
---
* HeapGuard is a trademark of Integrated Software, Inc.
*   For an explanation of first-fit allocation see [Aho].