* Re: What's Real-Time? (was Re: Widespread C++ Competency Gap?) [not found] ` <3e26mc$n9u@Starbase.NeoSoft.COM> @ 1994-12-31 1:09 ` Henry Baker 1994-12-31 2:12 ` Don Yuniskis ` (2 more replies) 0 siblings, 3 replies; 21+ messages in thread
From: Henry Baker @ 1994-12-31 1:09 UTC (permalink / raw)

In article <3e26mc$n9u@Starbase.NeoSoft.COM>, dweller@Starbase.NeoSoft.COM (David Weller) wrote:
> In article <3e22hi$pqf@baygull.rtd.com>, Don Yuniskis <dgy@rtd.com> wrote:
> >In article <3e1rqn$ouh@news.parc.xerox.com>,
> >Hans Boehm <boehm@parc.xerox.com> wrote:
> >>I'm not an expert on real-time programming. But I'm always confused
> >>by these discussions. It seems to me that "real-time" is usually
> >>intended to mean one of the following two extremes, or anything
> >>in between:
> >>1. Reasonable interactive response on a fast machine. Maybe good enough
> >>so that a video game isn't too jerky.
> >>2. Part of a safety critical system that requires millisecond or so
> >>GUARANTEED response, and GUARANTEED space bounds.
> >
> >Actually, neither is really correct. Typically real-time == real-fast.
> >However, real-fast != real-time. Also, real-time !-> real-fast
>
> This is contrary to any text I've read. Maybe I just live in a
> closet? :-) Can you cite a "Real-Time" text or two that says this?
>
> I've always learned "Real-Time" as implying determinism. Of course,
> I may have a narrow view because I write space and flight simulators
> :-)
>
> A rate of 1Hz is _definitely_ real-time, PROVIDED you neither exceed
> nor fall below that rate. Do you understand what I'm saying?

In the RT community there are the terms '_hard_ real-time' and '_soft_ real-time'. 'Hard' RT means that the consequences of missing a deadline are catastrophic, while 'soft' RT means that the consequences are expensive, but not catastrophic.
Designing for a 'hard' RT system is different from designing for a 'soft' RT system because you have to find the latencies of _every_ operation, no matter how rare -- e.g., you have to worry about the latencies of rare combinations of rare events -- e.g., multiple page faults from a single instruction, cache faults for every memory reference of a 1000-instruction sequence, etc. Unfortunately, nearly _all_ of the progress in HW performance in the last 20 years has been in _mean_ performance, _not_ worst-case performance. In fact, worst-case performance has only improved modestly over this period. The _ratio_ of mean to worst-case performance has increased dramatically, so that it may be 100:1 or more today. This means that nearly everyone is looking for ways to convert 'hard' RT systems into 'soft' RT systems so that they can take advantage of the increased average speeds. So if you're still doin' 'hard time', wise up! :-) :-) ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: What's Real-Time? (was Re: Widespread C++ Competency Gap?) 1994-12-31 1:09 ` What's Real-Time? (was Re: Widespread C++ Competency Gap?) Henry Baker @ 1994-12-31 2:12 ` Don Yuniskis 1994-12-31 17:08 ` Przemek Klosowski 1995-01-04 22:05 ` Fred McCall 2 siblings, 0 replies; 21+ messages in thread
From: Don Yuniskis @ 1994-12-31 2:12 UTC (permalink / raw)

In article <hbaker-3012941712040001@192.0.2.1>, Henry Baker <hbaker@netcom.com> wrote:
>In the RT community there are the terms '_hard_ real-time' and
>'_soft_ real-time'. 'Hard' RT means that the consequences of missing
>a deadline are catastrophic, while 'soft' RT means that the consequences
>are expensive, but not catastrophic.

Yes, as I indicated in my reference to the value function associated with the missed deadline (for t>deadline, value <= 0 for HRT; value > 0 for SRT).

>Designing for a 'hard' RT system is different from designing for a
>'soft' RT system because you have to find the latencies of _every_
>operation, no matter how rare -- e.g., you have to worry about the
>latencies of rare combinations of rare events -- e.g., multiple page
>faults from a single instruction, cache faults for every memory reference
>of a 1000-instruction sequence, etc.

Exactly! A fact that most designers neglect either from ignorance or by choice (it's orders of magnitude harder to design for the worst-case whether you're designing hardware OR software -- tho', admittedly, a worst-case hardware design is orders of magnitude EASIER than a worst-case software design ;-) :-(

>This means that nearly everyone is looking for ways to convert 'hard'
>RT systems into 'soft' RT systems so that they can take advantage of
>the increased average speeds.
>
>So if you're still doin' 'hard time', wise up! :-) :-)

Right. However, certain parts of every application "must" be HRT. The trick is to move as much of the application into the domain of SRT -- "Gee, it would be real nice if this got done on time...
but it's more important that I know how to deal with the consequences of when it DOESN'T get done on time!" ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: What's Real-Time? (was Re: Widespread C++ Competency Gap?) 1994-12-31 1:09 ` What's Real-Time? (was Re: Widespread C++ Competency Gap?) Henry Baker 1994-12-31 2:12 ` Don Yuniskis @ 1994-12-31 17:08 ` Przemek Klosowski 1995-01-01 9:35 ` Robert J Carter 1995-01-04 22:05 ` Fred McCall 2 siblings, 1 reply; 21+ messages in thread
From: Przemek Klosowski @ 1994-12-31 17:08 UTC (permalink / raw)

In article <hbaker-3012941712040001@192.0.2.1> hbaker@netcom.com (Henry Baker) writes:

   In article <3e26mc$n9u@Starbase.NeoSoft.COM>, dweller@Starbase.NeoSoft.COM
   (David Weller) wrote:
   > I've always learned "Real-Time" as implying determinism. Of course,
   > I may have a narrow view because I write space and flight simulators
   > :-)
   >
   > A rate of 1Hz is _definitely_ real-time, PROVIDED you neither exceed
   > nor fall below that rate. Do you understand what I'm saying?

   Designing for a 'hard' RT system is different from designing for a
   'soft' RT system because you have to find the latencies of _every_
   operation, no matter how rare -- e.g., you have to worry about the
   latencies of rare combinations of rare events -- e.g., multiple page
   faults from a single instruction, cache faults for every memory reference
   of a 1000-instruction sequence, etc.

   ...

   This means that nearly everyone is looking for ways to convert 'hard'
   RT systems into 'soft' RT systems so that they can take advantage of
   the increased average speeds.

   So if you're still doin' 'hard time', wise up! :-) :-)

There is an interesting article in the latest issue of Dr. Dobbs, written by the real time honcho at DEC. He claims that 'hard real time' is basically impossible for most complicated systems (he had an interesting insight that often people start with 'hard' systems, put them in a distributed environment, and end up with a 'soft' system because of communication issues: 'hard real time' being just a self-delusion in that case).
The article claims that for large systems the average resource requirements are vastly different from worst-case requirements, and so large 'hard' systems in the traditional sense are simply too wasteful. An example is the telephone system. It is not 'hard real time': the cost would be just too big. Instead, they figured out how to manage available resources to have acceptably low failure rates. -- przemek klosowski (przemek@rrdstrad.nist.gov) Reactor Division (bldg. 235), E111 National Institute of Standards and Technology Gaithersburg, MD 20899, USA (301) 975 6249 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: What's Real-Time? (was Re: Widespread C++ Competency Gap?) 1994-12-31 17:08 ` Przemek Klosowski @ 1995-01-01 9:35 ` Robert J Carter 1995-01-02 17:10 ` Przemek Klosowski 1995-01-03 23:20 ` Robert I. Eachus 0 siblings, 2 replies; 21+ messages in thread From: Robert J Carter @ 1995-01-01 9:35 UTC (permalink / raw) In article <PRZEMEK.94Dec31120819@rrdjazz.nist.gov>, Przemek Klosowski <przemek@rrdjazz.nist.gov> wrote: >The article claims that for large systems the average resource >requirements are vastly different from worst-case requirements, and so >large 'hard' systems in the traditional sense are simply too >wasteful. An example is the telephone system. It is not 'hard real >time': the cost would be just too big. Instead, they figured out how >to manage available resources to have acceptably low failure rates. That's fine as far as it goes, but having done some work myself with reactor systems (I saw your sig), what is an acceptably low failure rate for the emergency shutdown system of a reactor? -- Robert J Carter Oghma Systems Ottawa, Ontario, Canada FAX: +1 613 231.4732 rjc@oghma.synapse.net rjc@oghma.ocunix.on.ca ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: What's Real-Time? (was Re: Widespread C++ Competency Gap?) 1995-01-01 9:35 ` Robert J Carter @ 1995-01-02 17:10 ` Przemek Klosowski 1995-01-03 23:20 ` Robert I. Eachus 1 sibling, 0 replies; 21+ messages in thread
From: Przemek Klosowski @ 1995-01-02 17:10 UTC (permalink / raw)

In article <3e5t1e$2ad@oghma.synapse.net> rjc@oghma.synapse.net (Robert J Carter) writes:

   >large 'hard' systems in the traditional sense are simply too
   >wasteful. An example is the telephone system. It is not 'hard real
   >time': the cost would be just too big. Instead, they figured out how
   >to manage available resources to have acceptably low failure rates.

   That's fine as far as it goes, but having done some work myself with
   reactor systems (I saw your sig), what is an acceptably low failure rate
   for the emergency shutdown system of a reactor?

Of course the only acceptable failure rate for reactor emergency shutdown is zero. I was talking of failure in the sense of some tasks not meeting their deadline: e.g. if we have time, we shut down the pumps gradually, but if we are behind deadlines, we just scram them.

The traditional ('bigoted', as the Dr. Dobbs article had it) real time has a discontinuous 'utility' function: one if the task finishes before deadline, minus infinity otherwise. The article claims that a more realistic approximation is a function like:

      utility to the system
           ^
           |
  ---------|1
           |\
           | \
  ---------|--\--------> tardiness ( = time of completion - deadline)
          0|   \
           |
           |

BTW, I have nothing to do with the reactor design and operation: I am a solid state physicist, doing neutron scattering.
--
   przemek klosowski (przemek@rrdstrad.nist.gov)
   Reactor Division (bldg. 235), E111
   National Institute of Standards and Technology
   Gaithersburg, MD 20899, USA
   (301) 975 6249

^ permalink raw reply	[flat|nested] 21+ messages in thread
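[Editor's note: the two utility shapes Klosowski contrasts (all-or-nothing hard real-time vs. a value that decays past the deadline) can be sketched in a few lines. This is an illustrative sketch only, not code from the article; the function names and the slope parameter are invented here.]

```python
def hard_utility(tardiness):
    """Traditional hard real-time utility: full value if the task
    finishes by the deadline, unboundedly negative otherwise."""
    return 1.0 if tardiness <= 0 else float("-inf")

def soft_utility(tardiness, slope=0.5):
    """The more realistic shape: full value on time, then linearly
    decreasing (eventually negative) value as the task finishes
    later and later past the deadline.  'tardiness' is
    time of completion minus deadline."""
    return 1.0 if tardiness <= 0 else 1.0 - slope * tardiness
```

Scheduling against `soft_utility` lets a system trade a slightly late, lower-value completion against other work, where `hard_utility` counts any lateness as total failure.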
* Re: What's Real-Time? (was Re: Widespread C++ Competency Gap?) 1995-01-01 9:35 ` Robert J Carter 1995-01-02 17:10 ` Przemek Klosowski @ 1995-01-03 23:20 ` Robert I. Eachus 1 sibling, 0 replies; 21+ messages in thread From: Robert I. Eachus @ 1995-01-03 23:20 UTC (permalink / raw) In article <3e5t1e$2ad@oghma.synapse.net> rjc@oghma.synapse.net (Robert J Carter) writes: > That's fine as far as it goes, but having done some work myself with > reactor systems (I saw your sig), what is an acceptably low failure rate > for the emergency shutdown system of a reactor? Very different for type I and type II failures! This is in fact an analysis that many projects never do, and often the specifications for a project fail to say which type of failure is worse. (Other names for the concept include FAIL SAFE systems, which only fail to operate, and criticality 1 failures for something which causes the system as a whole to fail/crash.) -- Robert I. Eachus with Standard_Disclaimer; use Standard_Disclaimer; function Message (Text: in Clever_Ideas) return Better_Ideas is... ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: What's Real-Time? (was Re: Widespread C++ Competency Gap?) 1994-12-31 1:09 ` What's Real-Time? (was Re: Widespread C++ Competency Gap?) Henry Baker 1994-12-31 2:12 ` Don Yuniskis 1994-12-31 17:08 ` Przemek Klosowski @ 1995-01-04 22:05 ` Fred McCall 2 siblings, 0 replies; 21+ messages in thread From: Fred McCall @ 1995-01-04 22:05 UTC (permalink / raw) In <hbaker-3012941712040001@192.0.2.1> hbaker@netcom.com Henry Baker writes: >In the RT community there are the terms '_hard_ real-time' and >'_soft_ real-time'. 'Hard' RT means that the consequences of missing >a deadline are catastrophic, while 'soft' RT means that the consequences >are expensive, but not catastrophic. This doesn't quite jibe with the way I learned it (although reflection seems to show that we're talking about different aspects of the same thing). Real-time (in general) means that there are time-driven constraints on the system which *must* be met. 'Hard' real-time implies that the required timeline must be met *exactly*; in other words, you need a deterministic machine where you know the timing of all operations and combinations thereof in order to do this correctly. If you blow the timeline (either early or late, within a very narrow window), the system will fail and cannot be recovered. 'Soft' real-time sets upper limits on how long things can take, but if you get there early the system will run along fine. Blown timelines can generally be recovered if other later operations can be performed fast enough to 'make up' the blown timeline. >Unfortunately, nearly _all_ of the progress in HW performance in the last >20 years has been in _mean_ performance, _not_ worst-case performance. >In fact, worst-case performance has only improved modestly over this >period. The _ratio_ of mean to worst-case performance has increased >dramatically, so that it may be 100:1 or more today. 
Not to mention that most of the things that we do to make the hardware faster also tend to make it less deterministic (super-scalar, super-pipelined designs, etc.) and hence less suitable for 'hard' real-time unless you design with extraordinary margins (so that you can still make the timeline if the pipeline stalls and has to be flushed, for example). > -- "Insisting on perfect safety is for people who don't have the balls to live in the real world." -- Mary Shafer, NASA Ames Dryden --------------------------------------------------------------------------- merlin@annwfn.com -- I don't speak for others and they don't speak for me. ^ permalink raw reply [flat|nested] 21+ messages in thread
[parent not found: <3ckb8g$841@gateway.wiltel.com>]
[parent not found: <1994Dec21.151952.8902@merlin.h>]
[parent not found: <1994Dec21.151952.8902@merlin.h <19941230.201628.350635.NETNEWS@UICVM.UIC.EDU>]
[parent not found: <3e9f60$8du@jive.cs.utexas.edu>]
[parent not found: <3epfsi$64d@gamma.ois.com>]
[parent not found: <3eua1r$4ea@gnat.cs.nyu.edu>]
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) [not found] ` <3eua1r$4ea@gnat.cs.nyu.edu> @ 1995-01-11 1:44 ` Henry Baker 1995-01-13 13:30 ` R. William Beckwith 1 sibling, 0 replies; 21+ messages in thread From: Henry Baker @ 1995-01-11 1:44 UTC (permalink / raw) In article <3eua1r$4ea@gnat.cs.nyu.edu>, dewar@cs.nyu.edu (Robert Dewar) wrote: > "Is there any work-in-progress by the large chip manufacturers to > design GC into their next-generation CPU architectures? It seems > like the next logical step." > > I sure hope not, I hope we have seen the end of this kind of incorrect > CISC thinking. At most what you want is some very specific hardware > assist instructions that are consistent with RISC instruction design > philosophy Actually, the performance of GC these days is more hindered by cache and VM designs than instruction sets. In particular, GC needs "write-allocate with subblock placement", such as is found on the DEC MIPS machines. I believe that Alphas also have write-allocate, but I'm not completely sure. The Pentium apparently does _not_ do write-allocate, which makes any kind of initialization of untouched memory pretty much of a disaster. Ditto for VM implementations -- people keep talking about 'log-based backing stores', but the major thing that is required isn't so much a log, as the ability to blind write to a page without having to read it first. PLDI'94 and LFP'94 had some good papers on cache issues in GC. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) [not found] ` <3eua1r$4ea@gnat.cs.nyu.edu> 1995-01-11 1:44 ` Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) Henry Baker @ 1995-01-13 13:30 ` R. William Beckwith 1995-01-13 14:59 ` Kelvin Nilsen 1995-01-13 21:04 ` Henry Baker 1 sibling, 2 replies; 21+ messages in thread
From: R. William Beckwith @ 1995-01-13 13:30 UTC (permalink / raw)

Robert Dewar (dewar@cs.nyu.edu) wrote:
: "Is there any work-in-progress by the large chip manufacturers to
: design GC into their next-generation CPU architectures? It seems
: like the next logical step."
: I sure hope not, I hope we have seen the end of this kind of incorrect
: CISC thinking. At most what you want is some very specific hardware
: assist instructions that are consistent with RISC instruction design
: philosophy (for an example of this kind of thinking look at the trapped
: add in the SPARC design, certainly we do NOT want to put LISP type checking
: into the hardware, but this simple RISC consistent instruction is quite a
: help for certain approaches to dynamic type checking in LISP interpreters).

I'm not looking for a complex solution. I'm looking for the simplest design changes in instruction set, cache, and register files to accommodate transparent memory management.

Henry Baker comments that the primary issues are cache management and VM. The fact that GC experts like Henry and Paul Wilson quickly digress to chip level issues leads me to believe that a concerted effort on the part of the CPU designer could lead to a much more elegant and efficient solution to memory management in compiled languages.

It seems to me that GC is more than just a process level decision. There are three layers of participation in the effort:

   1. CPU
   2. O/S kernel
   3. system libraries

This is all in hopes that the application does not have to participate in the memory management exercise.
Why not design the CPU with the GC'tor in mind instead of designing the GC'tor with the CPU in mind? It just seems silly that hardware designers aren't thinking more about relieving us programmers from thinking about what memory we are currently using.

... Bill

--------------------------------------------------------------------------
e-mail: Bill.Beckwith@ois.com     FAX: 703-264-1721   | Team Ada
Objective Interface Systems, Inc. Voice: 703-264-1900 | dist, full O-O
1895 Preston White Drive, Suite 250                   | multithreading
Reston, VA 22091-5448 U.S.A.                          | built in

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-13 13:30 ` R. William Beckwith @ 1995-01-13 14:59 ` Kelvin Nilsen 1995-01-17 2:45 ` R. William Beckwith 1995-01-17 16:29 ` Robert I. Eachus 1995-01-13 21:04 ` Henry Baker 1 sibling, 2 replies; 21+ messages in thread
From: Kelvin Nilsen @ 1995-01-13 14:59 UTC (permalink / raw)

beckwb@ois.com (R. William Beckwith) writes:

>: "Is there any work-in-progress by the large chip manufacturers to
>: design GC into their next-generation CPU architectures? It seems
>: like the next logical step."

>I'm not looking for a complex solution. I'm looking for the simplest
>design changes in instruction set, cache, and register files to
>accommodate transparent memory management.

Paul Wilson mentioned my proposed hardware support for GC. In case anyone is interested in more information on this, I suggest the following five references:

Schmidt, Nilsen. Performance of a Hardware-Assisted Real-Time Garbage Collector. Sixth ASPLOS. Oct. 1994.

Nilsen, Schmidt. Cost-Effective Object-Space Management for Hardware-Assisted Real-Time Garbage Collection. ACM LOPLAS. Dec. 1992.

Nilsen. Reliable Real-Time Garbage Collection of C++. USENIX Computing Systems. Fall 1994.

Nilsen, Schmidt. A High-Performance Hardware-Assisted Real-Time Garbage Collection System. Journal of Programming Languages. Jan. 1994.

Nilsen. Cost-Effective Hardware-Assisted Real-Time Garbage Collection. 1994 ACM PLDI Workshop on Languages, Compilers and Tool Support for Real-Time Systems. (ftp.cs.iastate.edu:/pub/kelvin/lcts-rts.PS.Z) (This paper provides an overview of the hardware prototype that is currently under development; there are a number of recent innovations that we have introduced to reduce hardware costs and improve system flexibility.)

>Why not design the CPU with the GC'tor in mind instead of designing the
>GC'tor with the CPU in mind?
>It just seems silly that hardware designers aren't thinking more about
>relieving us programmers from thinking about what memory we are currently
>using.

Our design proposes to place all of the special circuitry within a custom memory module. The design is portable between CPU architectures, and allows memory to be cached. The economic benefits are:

1. Nobody (these days) would even think of designing such specialized support into a CPU. The benefits have not been proven. The costs are not well understood. The investment required to deliver a price/performance competitive CPU in today's marketplace is too huge. By placing the custom hardware within the memory subsystem, we are able to provide the desired benefits without most of the costs of developing special-purpose CPUs.

2. The hardware can operate at memory speeds rather than at CPU speeds. This simplifies the implementation.

3. Since the memory architecture is portable between CPU architectures, we can separate issues of instruction-set compatibility from issues of real-time garbage collection support. (It would be a real bummer if our custom CPU died in the marketplace not because hardware-assisted GC was a bad idea, but because the instruction set simply wasn't "popular".)

Let me add a little bit of clarification on this topic:

1. Our work has been motivated by the need to support tight worst-case time bounds on dynamic memory management routines while supporting predictable memory utilization. Given these requirements, hardware support seemed like the "right thing to do."

2. If you do not need hard-real-time performance guarantees, then it is much more difficult to justify the costs of custom hardware. There are many good software garbage collection techniques that perform well enough in the absence of custom hardware.

3. On the other hand, if the proposed custom hardware is manufactured in large enough quantities, then it becomes commodity hardware and is much more affordable. This may raise the ante.
Programmers will have higher expectations of their programming environments, and garbage collection activities will be a more dominant part of a "typical" system workload. (We are trying to find some high-volume users for our system in order to bring the per-unit cost down. Potential buyers include the video game, virtual reality, and television-top multimedia box developers.) 4. Lots of "good ideas" never go anywhere. To turn a good idea into a useful capability requires huge amounts of technology development, salesmanship, risk taking, faith (blinders?), and a little bit of luck. We have a "chip manufacturer" partner who is currently helping to bring this idea to the marketplace, but the partner is not one of the big names and they are extremely careful to control their risks and limit their investments. A big part of "computer architecture" research seems to consist of simply learning how to "play your cards." Kelvin Nilsen/Dept. of Computer Science/Iowa State University/Ames, IA 50011 (515) 294-2259 kelvin@cs.iastate.edu uunet!cs.iastate.edu!kelvin ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-13 14:59 ` Kelvin Nilsen @ 1995-01-17 2:45 ` R. William Beckwith 1995-01-19 15:57 ` Kurt Bischoff 1995-01-17 16:29 ` Robert I. Eachus 1 sibling, 1 reply; 21+ messages in thread
From: R. William Beckwith @ 1995-01-17 2:45 UTC (permalink / raw)

: Kelvin Nilsen/Dept. of Computer Science/Iowa State University/Ames, IA 50011
: (515) 294-2259 kelvin@cs.iastate.edu uunet!cs.iastate.edu!kelvin

Thanks for the informative followup, Kelvin. If the chip you're working on has or will have an Ada95 compiler for it, please let me know. I'll try to plug in a nice hardware assisted GC'tor.

... Bill

-----------------------------------------------------
e-mail: Bill.Beckwith@ois.com       | Team Ada
Objective Interface Systems, Inc.   | dist, full O-O
1895 Preston White Drive, Suite 250 | multithreading
Reston, VA 22091-5448 U.S.A.        | built in

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-17 2:45 ` R. William Beckwith @ 1995-01-19 15:57 ` Kurt Bischoff 0 siblings, 0 replies; 21+ messages in thread
From: Kurt Bischoff @ 1995-01-19 15:57 UTC (permalink / raw)

In article <3ffb0d$qbp@gamma.ois.com> beckwb@ois.com (R. William Beckwith) writes:

>: Kelvin Nilsen/Dept. of Computer Science/Iowa State University/Ames, IA 50011
>: (515) 294-2259 kelvin@cs.iastate.edu uunet!cs.iastate.edu!kelvin
>
>Thanks for the informative followup, Kelvin. If the chip you're working on
>has or will have an Ada95 compiler for it, please let me know. I'll try to
>plug in a nice hardware assisted GC'tor.
>

I think Kelvin was making the point that GC support should be kept out of the CPU--it should be in a separate memory module. Then compiler implementation is independent of hardware-assisted GC implementation.

Kelvin wrote:
: 3. Since the memory architecture is portable between CPU architectures,
:    we can separate issues of instruction-set compatibility from issues
:    of real-time garbage collection support.

-------------
Kurt Bischoff, Odyssey Research, 301A Dates Drive, Ithaca, NY, 14850
(607) 277-2020    bischoff@oracorp.com

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-13 14:59 ` Kelvin Nilsen 1995-01-17 2:45 ` R. William Beckwith @ 1995-01-17 16:29 ` Robert I. Eachus 1995-01-18 15:27 ` Henry Baker ` (2 more replies) 1 sibling, 3 replies; 21+ messages in thread
From: Robert I. Eachus @ 1995-01-17 16:29 UTC (permalink / raw)

In article <kelvin.790009178@kickapoo> kelvin@cs.iastate.edu (Kelvin Nilsen) writes:

> By placing the custom hardware within the memory subsystem,
> we are able to provide the desired benefits without most of the
> costs of developing special-purpose CPUs.

There is another major reason for putting the custom hardware in the memory system. There is a fairly large class of problems which require time proportional to n^k lg2 n on any von Neumann architecture, but can be solved in time n^(k-1) lg2 n on hardware which supports clearing large sections of memory to zero in constant time. (Note that if k is one, you can get orders of magnitude improvement.)

Since many memory systems can support this happy property, all you need to do to get the faster algorithms is to bypass the OS, or more politely, have an OS call which clears a section of memory.

--
Robert I. Eachus

with Standard_Disclaimer;
use Standard_Disclaimer;
function Message (Text: in Clever_Ideas) return Better_Ideas is...

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-17 16:29 ` Robert I. Eachus @ 1995-01-18 15:27 ` Henry Baker 1995-01-19 19:59 ` Norman H. Cohen 1995-01-22 2:56 ` David Hanley 2 siblings, 0 replies; 21+ messages in thread From: Henry Baker @ 1995-01-18 15:27 UTC (permalink / raw) In article <EACHUS.95Jan17112913@spectre.mitre.org>, eachus@spectre.mitre.org (Robert I. Eachus) wrote: > In article <kelvin.790009178@kickapoo> kelvin@cs.iastate.edu (Kelvin Nilsen) writes: > > > By placing the custom hardware within the memory subsystem, > > we are able to provide the desired benefits without most of the > > costs of developing special-purpose CPUs. > > There is another major reason for putting the custom hardware in > the memory system. There is a fairly large class of problems which > require time proportional to n^k lg2 n on any von Neuman architecture, > but can be solved in time n^(k-1) lg2 on hardware which supports > clearing large sections of memory to zero in constant time. (Note > that if k is one, you can get orders of magnitude improvement.) Simple example, please? ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-17 16:29 ` Robert I. Eachus 1995-01-18 15:27 ` Henry Baker @ 1995-01-19 19:59 ` Norman H. Cohen 1995-01-20 2:20 ` Henry Baker 1995-01-20 14:49 ` Robert I. Eachus 1995-01-22 2:56 ` David Hanley 2 siblings, 2 replies; 21+ messages in thread From: Norman H. Cohen @ 1995-01-19 19:59 UTC (permalink / raw) In article <EACHUS.95Jan17112913@spectre.mitre.org>, eachus@spectre.mitre.org (Robert I. Eachus) writes: |> There is another major reason for putting the custom hardware in |> the memory system. There is a fairly large class of problems which |> require time proportional to n^k lg2 n on any von Neuman architecture, |> but can be solved in time n^(k-1) lg2 on hardware which supports |> clearing large sections of memory to zero in constant time. (Note |> that if k is one, you can get orders of magnitude improvement.) This doesn't make sense, because there is a well known programming trick (used more in analysis-of-algorithms than in real programs because of the large constant factor) for marking a large section of memory as uninitialized in constant time. (The effect of initializing a large section of memory to zero is achieved by checking on each read whether the location read from has been initialized, and using the value zero if it has not been.) The trick works as follows: View the section of memory as an array of N initializable units numbered 0 to N-1. Allocate two arrays of integers, Stack and Stack_Index, indexed from 0 to N-1. Initially, both arrays contain garbage. There is a stack pointer, Top, pointing to the highest-indexed element of Stack that does NOT contain garbage (so that initially it equals -1). When initializable unit I is initialized, Top is incremented, the new value of Top is placed in Stack_Index(I), and I is placed in Stack(Top). Then the index of every initialized unit is found exactly once in Stack(0 .. 
Top), and if unit I has been initialized, Stack_Index(I) contains the index within Stack(0 .. Top) containing I.

The test for whether initializable unit I is initialized is:

   if Stack_Index(I) in 0 .. Top then
      if Stack(Stack_Index(I)) = I then
         return True;
         -- Stack_Index(I) contains a valid index into the stack, and the
         -- corresponding index in the stack indicates that unit I has
         -- been initialized.
      else
         return False;
         -- The garbage in Stack_Index(I) happened to point into the
         -- initialized part of Stack, but the value in Stack reveals
         -- that unit I was not the initializable unit for which this
         -- component of Stack was initialized.
      end if;
   else
      return False;
      -- Garbage in Stack_Index(I) is not even a valid stack index.
   end if;

or, more succinctly,

   return Stack_Index(I) in 0 .. Top and then Stack(Stack_Index(I)) = I;

--
Norman H. Cohen    ncohen@watson.ibm.com

^ permalink raw reply	[flat|nested] 21+ messages in thread
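[Editor's note: Cohen's constant-time "pretend it's zeroed" trick can be modeled in executable form. A sketch under assumptions: the class and method names are invented here, and since Python has no genuinely uninitialized memory, the garbage that would sit in the two bookkeeping arrays on real hardware is only simulated by `None` placeholders; the O(1) bookkeeping itself is faithful.]

```python
class LazyZeroArray:
    """An array of n cells that behaves as if zero-initialized, with
    O(1) creation/reset, reads, and writes -- at the cost of roughly
    3x the memory (values plus two validation arrays)."""

    def __init__(self, n):
        self.value = [None] * n        # cell contents ("garbage" until written)
        self.stack = [None] * n        # indices of written cells, in write order
        self.stack_index = [None] * n  # where each cell's entry sits in stack
        self.top = -1                  # highest live entry in stack

    def _written(self, i):
        si = self.stack_index[i]
        # valid only if it points into the live part of the stack AND
        # the stack entry there points back at cell i (mutual witness)
        return si is not None and 0 <= si <= self.top and self.stack[si] == i

    def read(self, i):
        return self.value[i] if self._written(i) else 0  # default "zero"

    def write(self, i, v):
        if not self._written(i):
            self.top += 1
            self.stack[self.top] = i
            self.stack_index[i] = self.top
        self.value[i] = v

    def clear(self):
        self.top = -1   # constant-time "re-zero" of the whole array
```

After `clear()`, stale `stack_index` entries may still point into the stack, but the mutual-witness check in `_written` rejects them, which is exactly the property Cohen's Ada test relies on.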
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-19 19:59 ` Norman H. Cohen @ 1995-01-20 2:20 ` Henry Baker 1995-01-20 14:49 ` Robert I. Eachus 1 sibling, 0 replies; 21+ messages in thread From: Henry Baker @ 1995-01-20 2:20 UTC (permalink / raw) In article <3fmgbn$rns@watnews1.watson.ibm.com>, ncohen@watson.ibm.com wrote: > In article <EACHUS.95Jan17112913@spectre.mitre.org>, > eachus@spectre.mitre.org (Robert I. Eachus) writes: > > |> There is another major reason for putting the custom hardware in > |> the memory system. There is a fairly large class of problems which > |> require time proportional to n^k lg2 n on any von Neuman architecture, > |> but can be solved in time n^(k-1) lg2 on hardware which supports > |> clearing large sections of memory to zero in constant time. (Note > |> that if k is one, you can get orders of magnitude improvement.) > > This doesn't make sense, because there is a well known programming trick > (used more in analysis-of-algorithms than in real programs because of the > large constant factor) for marking a large section of memory as > uninitialized in constant time. (The effect of initializing a large > section of memory to zero is achieved by checking on each read whether > the location read from has been initialized, and using the value zero if > it has not been.) This technique is known as a 'read barrier' in GC circles. See: ftp://ftp.netcom.com/pub/hb/hbaker/RealTimeGC.html (or .ps.Z) for 1978 CACM reference. (The read barrier technique is discussed twice in this paper -- once for the basic GC algorithm itself, and once for 'moving' and/or copying large arrays incrementally.) ^ permalink raw reply [flat|nested] 21+ messages in thread
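[Editor's note: as a rough illustration of the read-barrier idea Baker refers to -- hypothetical names, and greatly simplified relative to the 1978 algorithm (no scanning, no collector interleaving) -- the barrier evacuates a from-space object on first access, so the mutator only ever sees to-space copies.]

```python
class Obj:
    """A heap object with a forwarding pointer, set once evacuated."""
    def __init__(self, fields):
        self.fields = fields
        self.forward = None   # to-space copy, once one exists

def read_barrier(obj, tospace):
    """Every object access goes through this barrier: if the object has
    not yet been copied to to-space, copy it now (this is the small,
    bounded increment of GC work) and record the forwarding pointer.
    The caller always receives the to-space version."""
    if obj.forward is None:
        copy = Obj(list(obj.fields))  # evacuate on first touch
        obj.forward = copy
        tospace.append(copy)
    return obj.forward
```

Repeated accesses follow the forwarding pointer rather than copying again, which is what bounds the per-access cost and makes the collection incremental.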
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-19 19:59 ` Norman H. Cohen 1995-01-20 2:20 ` Henry Baker @ 1995-01-20 14:49 ` Robert I. Eachus 1 sibling, 0 replies; 21+ messages in thread From: Robert I. Eachus @ 1995-01-20 14:49 UTC (permalink / raw) In article <3fmgbn$rns@watnews1.watson.ibm.com> ncohen@watson.ibm.com (Norman H. Cohen) writes: > This doesn't make sense, because there is a well known programming trick > (used more in analysis-of-algorithms than in real programs because of the > large constant factor) for marking a large section of memory as > uninitialized in constant time... Read barriers are a neat trick, but for real-time applications replacing a boolean with a pointer and... is totally intolerable. I have used algorithms in this class in real-time systems, and although we devoted a whole page to each bit-vector, the typical maximums are on the order of a few hundred bits. But the difference between, say, 5 clocks and 20 (clearing two words at a time) is crucial. So yes, you are correct that these algorithms can be replaced by more complex versions without the special memory clearing instruction, but the large constant multiplier makes that approach useless. To give a real-world example, in one application, if all the time available was assigned to the innermost loop of an N^2 lg2 N algorithm, you had 25 clocks per iteration--and a cache miss could take most of that. Fortunately I found an algorithm which met the theoretical complexity limit, and where all the inner loop did was set a bit-vector of length N to false. In that case limiting the size of the bit-vector (so I could keep it in a register pair and avoid write-throughs) was sufficient. -- Robert I. Eachus with Standard_Disclaimer; use Standard_Disclaimer; function Message (Text: in Clever_Ideas) return Better_Ideas is... ^ permalink raw reply [flat|nested] 21+ messages in thread
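Eachus's fallback -- a bit-vector small enough that clearing it costs a few clocks -- is easy to illustrate: hold the whole vector in a single machine word (or word pair), and setting all N bits to false is one store. A hypothetical sketch:

```python
class SmallBitVector:
    """Bit-vector of up to 64 bits packed into one integer; clearing is
    a single assignment, the software analogue of keeping the vector in
    a register pair and avoiding write-throughs."""

    def __init__(self, n):
        assert n <= 64          # must fit in one "register pair"
        self.bits = 0

    def set(self, i):
        self.bits |= 1 << i

    def test(self, i):
        return (self.bits >> i) & 1 == 1

    def clear(self):
        self.bits = 0           # the whole inner-loop reset: one store

v = SmallBitVector(64)
v.set(5)
```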
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-17 16:29 ` Robert I. Eachus 1995-01-18 15:27 ` Henry Baker 1995-01-19 19:59 ` Norman H. Cohen @ 1995-01-22 2:56 ` David Hanley 1995-01-23 17:06 ` Robert I. Eachus 2 siblings, 1 reply; 21+ messages in thread From: David Hanley @ 1995-01-22 2:56 UTC (permalink / raw) Robert I. Eachus (eachus@spectre.mitre.org) wrote: : There is another major reason for putting the custom hardware in : the memory system. There is a fairly large class of problems which : require time proportional to n^k lg2 n on any von Neumann architecture, : but can be solved in time n^(k-1) lg2 on hardware which supports : clearing large sections of memory to zero in constant time. (Note : that if k is one, you can get orders of magnitude improvement.) Hmmm. While I suppose this is possible, I haven't yet run across this "large body of algorithms" in which the execution time is reduced by an order of magnitude when memory blocks can be zeroed out very quickly. Even if the algorithm did have to zero out the memory itself, I don't see how that could take it from, say, n^2 to n^3. Could you please point some of these out to me? : Since many memory systems can support this happy property, all you : need to do to get the faster algorithms is to bypass the OS, or more : politely, have an OS call which clears a section of memory. Hmm. Not sure what this has to do with GC, really. In any case, it would be pretty easy to design memory hardware that would zero out pages really quickly (just don't refresh them one cycle). And the bzero() function on your C compiler could be set to use it. But maybe I'm just confused. Could you please explain if this is wrong? -- ------------------------------------------------------------------------------ | David James Hanley -- dhanley@lac.eecs.uic.edu -- C++, OOD, martial arts | | Laboratory for advanced computing | My employer barely KNOWS me. 
| ------------------------------------------------------------------------------ "There are trivial truths & there are great truths. The opposite of a trivial truth is plainly false." "The opposite of a great truth is also true." -Niels Bohr ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-22 2:56 ` David Hanley @ 1995-01-23 17:06 ` Robert I. Eachus 0 siblings, 0 replies; 21+ messages in thread From: Robert I. Eachus @ 1995-01-23 17:06 UTC (permalink / raw) In article <19950122.094355.149517.NETNEWS@UICVM.UIC.EDU> dhanley@matisse.eecs.uic.edu (David Hanley) writes: > Hmmm. While I suppose this is possible, I haven't yet run > across this "large body of algorithms" in which the execution time > is reduced by an order of magnitude when memory blocks can be > zeroed out very quickly. Even if the algorithm did have to zero > out the memory itself, I don't see how that could take it from, > say, n^2 to n^3. Could you please point some of these out to me? Maybe I should have said "real-time" or "on-line" algorithms. The distinguishing characteristic seems to be that you have to choose the solution method without knowing the size of (or all of) the data set in advance. Another way of looking at it is that you are concerned with solving a problem, but you are only concerned with the time between receiving the last data and getting a result. (Although in practice, what you want to do is, at every stage of reading in data, have a solution based on all the data seen so far.) The seminal paper is "Real-time computation" by M. O. Rabin (1963). (The reference I have is Israel J. Math 1, 203-211, but I think it has been reprinted.) Hartmanis and Stearns, "On the computational complexity of algorithms," Trans. Amer. Math. Soc. 117, 285-306 (1965). > Hmm. Not sure what this has to do with GC, really. In any > case, it would be pretty easy to design memory hardware that would > zero out pages really quickly (just don't refresh them one cycle). > And the bzero() function on your C compiler could be set to use it. > But maybe I'm just confused. Could you please explain if this is > wrong? Nothing confused in that; it is done in real-time applications all the time. 
A typical scenario is one where you build a hash table using a hash function with a very low likelihood of collision. Normally you would have to factor the time to clear the hash table, or the time for maintaining a structure just for clearing the table, into the time required for the algorithm. Poof!--if supported by the hardware--is the most elegant solution. The only thing it has to do with garbage collection AFAIK is that there is a nice technique where data is ping-ponged back and forth between two heaps. For example, you have a structure containing all current aircraft tracks. If the old data is reviewed and, if still valid, moved every sweep, all the garbage created by building and maintaining the tracks can be made to disappear with never a worry about fragmentation. -- Robert I. Eachus with Standard_Disclaimer; use Standard_Disclaimer; function Message (Text: in Clever_Ideas) return Better_Ideas is... ^ permalink raw reply [flat|nested] 21+ messages in thread
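Eachus's hash-table scenario has a well-known software analogue when the hardware "poof" isn't available: stamp every slot with a generation number, and clearing the table becomes a single counter increment. A hypothetical sketch (single-slot probing with no collision handling, relying on the low-collision hash the scenario stipulates):

```python
class ClearableTable:
    """Hash table whose clear() is O(1): each slot records the
    generation in which it was written, and bumping the generation
    counter invalidates every slot at once."""

    def __init__(self, size):
        self.keys = [None] * size
        self.values = [None] * size
        self.stamp = [0] * size      # generation each slot was written in
        self.generation = 1

    def _slot(self, key):
        return hash(key) % len(self.keys)

    def put(self, key, value):
        i = self._slot(key)
        self.keys[i], self.values[i] = key, value
        self.stamp[i] = self.generation

    def get(self, key):
        i = self._slot(key)
        if self.stamp[i] == self.generation and self.keys[i] == key:
            return self.values[i]
        return None                  # slot is stale or never written

    def clear(self):
        self.generation += 1         # O(1): every slot is now stale

t = ClearableTable(64)
t.put(1, "track-a")
```

This is Cohen's trick in another costume: it trades one extra word per slot and one comparison per lookup for a constant-time clear, which is exactly the trade the real-time budget has to absorb.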
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-13 13:30 ` R. William Beckwith 1995-01-13 14:59 ` Kelvin Nilsen @ 1995-01-13 21:04 ` Henry Baker 1995-01-17 10:37 ` Mark Reinhold 1 sibling, 1 reply; 21+ messages in thread From: Henry Baker @ 1995-01-13 21:04 UTC (permalink / raw) In article <3f5vaf$r07@gamma.ois.com>, beckwb@ois.com (R. William Beckwith) wrote: > Henry Baker comments that the primary issues are cache management and VM. > The fact that GC experts like Henry and Paul Wilson quickly digress to > chip level issues leads me to believe that a concerted effort on the > part of the CPU designer could lead to a much more elegant and efficient > solution to memory management in compiled languages. If people are interested in these issues, and haven't yet read the following paper, they are in for a treat: Reinhold, M.B. "Cache Performance of Garbage-Collected Programs". PLDI'94, ACM Sigplan Notices 29, 6 (June 1994), 206-217. (It _may_ be on the net. Reinhold works for NEC in New Jersey, I believe.) ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-13 21:04 ` Henry Baker @ 1995-01-17 10:37 ` Mark Reinhold 0 siblings, 0 replies; 21+ messages in thread From: Mark Reinhold @ 1995-01-17 10:37 UTC (permalink / raw) In article <hbaker-1301951307490001@192.0.2.1> hbaker@netcom.com writes: > If people are interested in these issues, and haven't yet read the following > paper, they are in for a treat: > Reinhold, M.B. "Cache Performance of Garbage-Collected Programs". PLDI'94, > ACM Sigplan Notices 29, 6 (June 1994), 206-217. (It _may_ be on the net. > Reinhold works for NEC in New Jersey, I believe.) It is indeed on the net. See http://www.neci.nj.nec.com/homepages/mbr.html. Comments welcome. ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~1995-01-23 17:06 UTC | newest] Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <787227087snz@wslint.demon.co.uk> [not found] ` <3da1nd$fss@gateway.wiltel.com> [not found] ` <3e1rqn$ouh@news.parc.xerox.com> [not found] ` <3e22hi$pqf@baygull.rtd.com> [not found] ` <3e26mc$n9u@Starbase.NeoSoft.COM> 1994-12-31 1:09 ` What's Real-Time? (was Re: Widespread C++ Competency Gap?) Henry Baker 1994-12-31 2:12 ` Don Yuniskis 1994-12-31 17:08 ` Przemek Klosowski 1995-01-01 9:35 ` Robert J Carter 1995-01-02 17:10 ` Przemek Klosowski 1995-01-03 23:20 ` Robert I. Eachus 1995-01-04 22:05 ` Fred McCall [not found] ` <3ckb8g$841@gateway.wiltel.com> [not found] ` <1994Dec21.151952.8902@merlin.h> [not found] ` <1994Dec21.151952.8902@merlin.h <19941230.201628.350635.NETNEWS@UICVM.UIC.EDU> [not found] ` <3e9f60$8du@jive.cs.utexas.edu> [not found] ` <3epfsi$64d@gamma.ois.com> [not found] ` <3eua1r$4ea@gnat.cs.nyu.edu> 1995-01-11 1:44 ` Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) Henry Baker 1995-01-13 13:30 ` R. William Beckwith 1995-01-13 14:59 ` Kelvin Nilsen 1995-01-17 2:45 ` R. William Beckwith 1995-01-19 15:57 ` Kurt Bischoff 1995-01-17 16:29 ` Robert I. Eachus 1995-01-18 15:27 ` Henry Baker 1995-01-19 19:59 ` Norman H. Cohen 1995-01-20 2:20 ` Henry Baker 1995-01-20 14:49 ` Robert I. Eachus 1995-01-22 2:56 ` David Hanley 1995-01-23 17:06 ` Robert I. Eachus 1995-01-13 21:04 ` Henry Baker 1995-01-17 10:37 ` Mark Reinhold
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox