From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.5-pre1 (2020-06-20) on ip-172-31-74-118.ec2.internal X-Spam-Level: X-Spam-Status: No, score=-1.9 required=3.0 tests=BAYES_00 autolearn=ham autolearn_force=no version=3.4.5-pre1 Date: Sat, 20 Jul 1991 14:45 EDT From: SAHARBAUGH@ROO.FIT.EDU Subject: ZZZZ HeapGuard Naecon paper (16,047 Bytes) Message-ID: <9107201848.AA13692@ajpo.sei.cmu.edu> List-Id: On 20 July Norm Cohen writes about task access type variables (very well I might add) and says in passing: "Of course no Ada compiler yet provides garbage collection for ANY types, let alone task types." Thanks Norm for reminding me to post our HeapGuard paper. ---- HeapGuard TM, Eliminating Garbage Collection in Real-Time Ada Systems Sam Harbaugh and Bill Wavering Integrated Software, Inc., Palm Bay, Florida Abstract In any kind of real-time system, garbage collection must be avoided. Garbage collection requires a great deal of time to move data and occurs at unpredictable times. This paper introduces HeapGuard, a heap management system which eliminates the problem of garbage collection in real-time systems. The sources of garbage and traditional garbage collection are reviewed. The HeapGuard scheme and optional supporting hardware are presented. The Cause of Garbage Programming languages, such as Ada [LRM], Lisp, Smalltalk, C++, make extensive use of dynamic memory and heap management techniques. For example, heap memory is used explicitly when the Ada programmer uses the keyword "new" to dynamically allocate an object such as a node in a linked list or when declaring an access type [Booch]. The Ada program makes implicit calls to the Ada Runtime Environment (ARTE) to allocate heap memory for task stack space, I/O data buffers and for various size program objects. The ARTE calls the Heap Manager to get the heap space. See Figure 1. When the program is done with an object it calls the ARTE to deallocate the object's heap space. This happens implicitly when the object goes out of scope, for example, a subprogram's local storage is deallocated when the subprogram is exited. Deallocation occurs explicitly when the program calls an instantiation of the Ada generic procedure "unchecked_deallocation" [Booch]. The programmer depends on the ARTE calls to the Heap Manager to release and reclaim the heap space.
Figure 1. Run-Time Software Architecture Some heap managers return the released space for reallocation while others do nothing at all to make the space reusable. The heap managers that do nothing to reclaim heap space will surely cause a "STORAGE_ERROR" exception when a program's allocation request cannot be satisfied. The heap managers which reuse heap space can create garbage (unused heap fragments) as explained next. As the Ada program allocates and deallocates various size objects, the heap becomes fragmented when newly allocated objects don't fit exactly into holes left by deallocations. See Figure 2. Eventually an allocation request will be denied even though the total amount of unused memory exceeds the request. The heap manager must then garbage collect or the program must quit. Garbage collection is the process of moving the program's data in the heap to a heap space where it can be stored contiguously [Aho]. See Figure 2. Garbage collection takes an unpredictable amount of time and occurs at unpredictable times, thus rendering any real-time use of garbage collection as unreliable. The processing time is significant because the program's data must be moved. After compaction, addresses of heap objects must be changed in the Ada program to point to the objects' new location. Changing these program addresses is a complex
Figure 2. A Traditional Heap management problem not even attempted by today's Ada vendors or run-time systems. Also, changing addresses is impossible in an Ada program implemented in read-only memory, such as in an embedded computer system. In all, garbage collection must be avoided in real-time Ada programs. General Discussion of Garbage Collection How can garbage collection be avoided? Garbage collection is typically avoided by making all objects the same size or by making the heap requests all the size of the largest object. This is difficult in programs such as expert systems, object oriented systems and data base applications. This technique, also, wastes space and restricts implementation techniques. How is garbage collection dealt with? A common method of garbage collection is to always perform some form of incremental garbage collection according to some method of allocating processor resources in the "background." Another method of performing garbage collection is to size the heap for the worst case storage scenario. And the easiest method is to perform no garbage collection at all, at the expense of, eventually, receiving a "STORAGE ERROR" exception. How much memory does garbage collection use? Incremental garbage collection can waste 50% of the heap space. During incremental garbage collection the users' data is marked and copied from the fragmented heap space to a "clean" contiguous heap space. The clean heap space has to be "large enough" to hold all the data from the fragmented heap space. Large enough is roughly the same size as the fragmented heap space being copied from. Thus, the space set aside to manage and accommodate the heap is roughly twice the needed space. The other methods of dealing with fragmented heap space all operate by using some amount of extra memory. Project Background In 1985 the Air Force awarded a contract, to Integrated Software, Inc., to identify a set of primitive functions that could be implemented in hardware to support the functions of the Ada Runtime Environment (ARTE) in real-time systems, such as avionics. The work focused strongly on the support of heap management to avoid garbage collection and allow Ada to be safely used in real-time avionics systems. Research performed on that contract validated the HeapGuard heap management scheme. Further, the HeapGuard scheme is being used on another contract to improve real-time performance of expert systems and Artificial Intelligence. On this second contract, HeapGuard has been integrated into the ARTE on a UNIX platform. The HeapGuard scheme has been implemented in both hardware and software. The HeapGuard Scheme The HeapGuard scheme can be stated as follows. Assuming random allocations and deallocations, for a heap < note to Ada BB readers: note the word "Assuming"> size of N storage units, reserve a heap area of 2N-2 units and allocate objects on the heap using first-fit allocation*. The first fit allocation must always start the allocation attempt from the same end of the heap. For example, see Figure 3, to allocate a new object of size four words on the heap, the first fit allocator would begin searching for unallocated heap storage at location zero. The new object would, finally, be stored beginning at location twenty.
Figure 3. A 20 word HeapGuardHeap HeapGuard Feasibility Tests In order to determine the feasibility of using HeapGuard , tests were run to try and "bust the heap." HeapGuard feasibility tests were performed using a prototype of HeapGuard. The tests were run, one at a time, for varying values of reserved heap size, M, and for varying values of the amount of the heap, N, used by the program at any one time. The size of the stored objects varied randomly from 1 to N and the time at which they were deallocated varied randomly. For each value of N, tests were run for M < 2*N-2 and then tests were run for M >= 2*N-2. In all cases the tests failed for M < 2*N-2 and never failed for M >= 2*N-2. *For non-random allocations and deallocations there is a heap usage pattern which, according to Robson [ROB], requires more than 2*N-2 for heaps of 512 or greater storage units. Hardware Supported HeapGuard There are two reasons to provide hardware support for HeapGuard. The first reason is that the additional hardware can perform heap management concurrent with execution of the Ada program and therefore obtain higher throughput. The second, and more interesting, reason is that by using address pointers to a heap which stores large objects, fewer total memory elements are needed. The remainder of this section describes the benefits of using address pointers. When using hardware support for HeapGuard, address pointers point to the allocated blocks in the heap. See Figures 4 and 5. All memory accesses by the Ada program to the heap are made indirectly through the pointers. This indirect addressing is done by address bus interface level hardware, transparently to the Ada program. Pointer management is done by the heap pointer manager. The heap pointer manager can be in the support hardware. Alternatively, the heap pointer manager can be in a separate program in the Ada program address space if the support hardware provides a data path from the heap pointer manager program to the pointer storage.
Figure 4. The decision to use address pointers depends on the range of heap object sizes and the cost of memory. In the case of large objects (as in AI applications) and/or expensive memory (as in avionics applications), the address pointers may be the most economical implementation. In the case of small objects and/or inexpensive memory (as in an office workstation), the software-only implementation of HeapGuard may be the most economical implementation. The size of heap memory required for software or hardware implementation of pointers can be calculated. For purposes of explanation assume that the heap memory is organized as 32 bit words (4 bytes) and that each address pointer is 16 bits (2 bytes). Further assume that the Ada program wishes to use a heap of size N words. In order to simplify the explanation, assume that the heap usage is derated by exactly 50%. For random usage patterns, the address pointers will occupy a memory space of 2*N and each pointer will be capable of pointing to any word in the actual heap. For non-random usage patterns refer for examples to Robson [ROB]. If the Ada program wishes to be able to allocate one word (and larger) objects then 2*N pointers must be provided. The total storage used for actual heap (4 bytes/word * N words) and pointers (2 bytes/pointer * 2N pointers) is 4*N + 2*2N = 8N bytes. In this case of one word objects (see Figure 4) it would be simpler to provide the actual heap space of 2N words (8N bytes) and implement HeapGuard in software only.
Figure 5. By instrumenting an application, the sizes of objects the application allocates can be determined. Now assume that the Ada program's heap object's sizes are always some multiple of 4 words, (see figure 5) termed a 4 word block. The address pointer memory space, as seen by the Ada program, will still be 2N but only 2N/4 address pointers need to be provided, one to each block. This is simply implemented by not decoding the 2 least significant bits of the heap memory address. In this case of 4 word blocks the memory required for actual heap (4 bytes/word * N words) and pointers (2 bytes/pointer * 2N/4 pointers) is 4*N + 2*2N/4 = 4*N + N bytes. Conclusions In real-time computer systems, heap memory could be managed using HeapGuard in order to provide predictable timing and eliminate storage errors due to fragmentation. If feasible, a coprocessor could be provided to manage the dynamic memory and increase system throughput. If the application program is known to store only small objects, then a software-only implementation could be effective. If the application program is known to store many large objects in dynamic memory then the coprocessor could provide and manage a set of pointers to these large objects. If the program stores many small and many large objects, then a software-only heap for the small objects and an address pointer based heap for large objects could be provided. Disclaimer Neither the author, Integrated Software, Inc., nor the U. S. Government in any way warrant HeapGuard for any usage and are not responsible for any consequences of its usage. Integrated Software, Inc., retains all rights to the HeapGuard technique and a patent is pending. As such, the HeapGuard technique and the HeapGuard name are subject to use only with permission from ISI. Acknowledgements This research and development was sponsored by: Wright Research and Development Center Avionics Laboratory Wright-Patterson AFB, Ohio 45433-6543 In particular we wish to acknowledge Sharon Barklay, Captain Rick A. Long, Lt. Rick Ricart, Terry Rogers, James Williamson and Lt. Jon Wood for their support. And we wish to thank Larry Koos, of Koos Technical Services, Orlando FL, for his support. A special thanks is due to Professor John McHugh at the University of North Carolina, Chapel Hill, NC. References [Aho] "Data Structures and Algorithms", A. V. Aho et al, Addison-Wesley, 1983, ISBN 0-201-00023-7 [Booch] Software Engineering with Ada, 2nd edition (1986), Grady Booch, Benjamin-Cummings, ISBN 0-8053-0604-8 [LRM] ANSI/MIL-STD-1815A The Ada Programming Language [McEntee] "Garbage Collection Enhances AI, Symbolic Processing Functions", Timothy J. McEntee, Computer Technology Review, Summer 1986 [ROB] "An Estimate of the Store Size Necessary for Dynamic Storage Allocation", J. M. Robson, JACM, Vol 18, No 3, July 71, pp 416-423. About the Authors Sam Harbaugh received his BSEE, MSEE and PhD EE (1964) from Carnegie Institute of Technology (now Carnegie-Mellon University). He has 25 years experience in real-time computer systems, 7 years using Ada. He is President of Integrated Software, Inc., which specializes in real-time Ada for applications such as Avionics, Expert Systems and Aircrew Training Simulators. Bill Wavering received his BSCS from the University of Illinois (1977), and MSCS (1982) from Florida Institute of Technology. He has 13 years experience in real-time computer systems, the past 2 years using Ada, and the past 5 years using Artificial Intelligence techniques. He is currently concluding a research project during which hardware support for the Ada Fuzzy Expert System (AFES, available from ISI), and for HeapGuard, was developed. He is, also, Principal Investigator on a NASA contract to research techniques for inference in a real-time expert system. He is a Member of the Technical Staff of Integrated Software, Inc. The authors may be reached at: Integrated Software, Inc., 1945 Palm Bay Rd. N.E. #7, Palm Bay, FL 32905, 407-984-1986 --- * HeapGuard is a trademark of Integrated Software, Inc. * For an explanation of first-fit allocation see [Aho].