* Re: What's Real-Time? (was Re: Widespread C++ Competency Gap?) [not found] ` <3e26mc$n9u@Starbase.NeoSoft.COM> @ 1994-12-31 1:09 ` Henry Baker 1994-12-31 2:12 ` Don Yuniskis ` (2 more replies) 0 siblings, 3 replies; 21+ messages in thread
From: Henry Baker @ 1994-12-31 1:09 UTC (permalink / raw)

In article <3e26mc$n9u@Starbase.NeoSoft.COM>, dweller@Starbase.NeoSoft.COM (David Weller) wrote:
> In article <3e22hi$pqf@baygull.rtd.com>, Don Yuniskis <dgy@rtd.com> wrote:
> >In article <3e1rqn$ouh@news.parc.xerox.com>,
> >Hans Boehm <boehm@parc.xerox.com> wrote:
> >>I'm not an expert on real-time programming. But I'm always confused
> >>by these discussions. It seems to me that "real-time" is usually
> >>intended to mean one of the following two extremes, or anything
> >>in between:
> >>1. Reasonable interactive response on a fast machine. Maybe good enough
> >>so that a video game isn't too jerky.
> >>2. Part of a safety critical system that requires millisecond or so
> >>GUARANTEED response, and GUARANTEED space bounds.
> >
> >Actually, neither is really correct. Typically real-time == real-fast.
> >However, real-fast != real-time. Also, real-time !-> real-fast
>
> This is contrary to any text I've read. Maybe I just live in a
> closet? :-) Can you cite a "Real-Time" text or two that says this?
>
> I've always learned "Real-Time" as implying determinism. Of course,
> I may have a narrow view because I write space and flight simulators
> :-)
>
> A rate of 1Hz is _definitely_ real-time, PROVIDED you neither exceed
> nor fall below that rate. Do you understand what I'm saying?

In the RT community there are the terms '_hard_ real-time' and '_soft_ real-time'. 'Hard' RT means that the consequences of missing a deadline are catastrophic, while 'soft' RT means that the consequences are expensive, but not catastrophic.
Designing for a 'hard' RT system is different from designing for a 'soft' RT system because you have to find the latencies of _every_ operation, no matter how rare -- e.g., you have to worry about the latencies of rare combinations of rare events -- e.g., multiple page faults from a single instruction, cache faults for every memory reference of a 1000-instruction sequence, etc. Unfortunately, nearly _all_ of the progress in HW performance in the last 20 years has been in _mean_ performance, _not_ worst-case performance. In fact, worst-case performance has only improved modestly over this period. The _ratio_ of mean to worst-case performance has increased dramatically, so that it may be 100:1 or more today. This means that nearly everyone is looking for ways to convert 'hard' RT systems into 'soft' RT systems so that they can take advantage of the increased average speeds. So if you're still doin' 'hard time', wise up! :-) :-) ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: What's Real-Time? (was Re: Widespread C++ Competency Gap?) 1994-12-31 1:09 ` What's Real-Time? (was Re: Widespread C++ Competency Gap?) Henry Baker @ 1994-12-31 2:12 ` Don Yuniskis 1994-12-31 17:08 ` Przemek Klosowski 1995-01-04 22:05 ` Fred McCall 2 siblings, 0 replies; 21+ messages in thread
From: Don Yuniskis @ 1994-12-31 2:12 UTC (permalink / raw)

In article <hbaker-3012941712040001@192.0.2.1>, Henry Baker <hbaker@netcom.com> wrote:
>In the RT community there are the terms '_hard_ real-time' and
>'_soft_ real-time'. 'Hard' RT means that the consequences of missing
>a deadline are catastrophic, while 'soft' RT means that the consequences
>are expensive, but not catastrophic.

Yes, as I indicated in my reference to the value function associated with the missed deadline (for t>deadline, value <= 0 for HRT; value > 0 for SRT).

>Designing for a 'hard' RT system is different from designing for a
>'soft' RT system because you have to find the latencies of _every_
>operation, no matter how rare -- e.g., you have to worry about the
>latencies of rare combinations of rare events -- e.g., multiple page
>faults from a single instruction, cache faults for every memory reference
>of a 1000-instruction sequence, etc.

Exactly! A fact that most designers neglect either from ignorance or by choice (it's orders of magnitude harder to design for the worst-case whether you're designing hardware OR software -- tho', admittedly, a worst-case hardware design is orders of magnitude EASIER than a worst-case software design ;-) :-(

>This means that nearly everyone is looking for ways to convert 'hard'
>RT systems into 'soft' RT systems so that they can take advantage of
>the increased average speeds.
>
>So if you're still doin' 'hard time', wise up! :-) :-)

Right. However, certain parts of every application "must" be HRT. The trick is to move as much of the application into the domain of SRT -- "Gee, it would be real nice if this got done on time...
but it's more important that I know how to deal with the consequences of when it DOESN'T get done on time!" ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: What's Real-Time? (was Re: Widespread C++ Competency Gap?) 1994-12-31 1:09 ` What's Real-Time? (was Re: Widespread C++ Competency Gap?) Henry Baker 1994-12-31 2:12 ` Don Yuniskis @ 1994-12-31 17:08 ` Przemek Klosowski 1995-01-01 9:35 ` Robert J Carter 1995-01-04 22:05 ` Fred McCall 2 siblings, 1 reply; 21+ messages in thread
From: Przemek Klosowski @ 1994-12-31 17:08 UTC (permalink / raw)

In article <hbaker-3012941712040001@192.0.2.1> hbaker@netcom.com (Henry Baker) writes:

   In article <3e26mc$n9u@Starbase.NeoSoft.COM>, dweller@Starbase.NeoSoft.COM
   (David Weller) wrote:
   > I've always learned "Real-Time" as implying determinism. Of course,
   > I may have a narrow view because I write space and flight simulators
   > :-)
   >
   > A rate of 1Hz is _definitely_ real-time, PROVIDED you neither exceed
   > nor fall below that rate. Do you understand what I'm saying?

   Designing for a 'hard' RT system is different from designing for a
   'soft' RT system because you have to find the latencies of _every_
   operation, no matter how rare -- e.g., you have to worry about the
   latencies of rare combinations of rare events -- e.g., multiple page
   faults from a single instruction, cache faults for every memory reference
   of a 1000-instruction sequence, etc.

   ...

   This means that nearly everyone is looking for ways to convert 'hard'
   RT systems into 'soft' RT systems so that they can take advantage of
   the increased average speeds.

   So if you're still doin' 'hard time', wise up! :-) :-)

There is an interesting article in the latest issue of Dr. Dobbs, written by the real time honcho at DEC. He claims that 'hard real time' is basically impossible for most complicated systems (he had an interesting insight that often people start with 'hard' systems, put them in a distributed environment, and end up with a 'soft' system because of communication issues: 'hard real time' being just a self-delusion in that case).
The article claims that for large systems the average resource requirements are vastly different from worst-case requirements, and so large 'hard' systems in the traditional sense are simply too wasteful. An example is the telephone system. It is not 'hard real time': the cost would be just too big. Instead, they figured out how to manage available resources to have acceptably low failure rates. -- przemek klosowski (przemek@rrdstrad.nist.gov) Reactor Division (bldg. 235), E111 National Institute of Standards and Technology Gaithersburg, MD 20899, USA (301) 975 6249 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: What's Real-Time? (was Re: Widespread C++ Competency Gap?) 1994-12-31 17:08 ` Przemek Klosowski @ 1995-01-01 9:35 ` Robert J Carter 1995-01-02 17:10 ` Przemek Klosowski 1995-01-03 23:20 ` Robert I. Eachus 0 siblings, 2 replies; 21+ messages in thread From: Robert J Carter @ 1995-01-01 9:35 UTC (permalink / raw) In article <PRZEMEK.94Dec31120819@rrdjazz.nist.gov>, Przemek Klosowski <przemek@rrdjazz.nist.gov> wrote: >The article claims that for large systems the average resource >requirements are vastly different from worst-case requirements, and so >large 'hard' systems in the traditional sense are simply too >wasteful. An example is the telephone system. It is not 'hard real >time': the cost would be just too big. Instead, they figured out how >to manage available resources to have acceptably low failure rates. That's fine as far as it goes, but having done some work myself with reactor systems (I saw your sig), what is an acceptably low failure rate for the emergency shutdown system of a reactor? -- Robert J Carter Oghma Systems Ottawa, Ontario, Canada FAX: +1 613 231.4732 rjc@oghma.synapse.net rjc@oghma.ocunix.on.ca ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: What's Real-Time? (was Re: Widespread C++ Competency Gap?) 1995-01-01 9:35 ` Robert J Carter @ 1995-01-02 17:10 ` Przemek Klosowski 1995-01-03 23:20 ` Robert I. Eachus 1 sibling, 0 replies; 21+ messages in thread
From: Przemek Klosowski @ 1995-01-02 17:10 UTC (permalink / raw)

In article <3e5t1e$2ad@oghma.synapse.net> rjc@oghma.synapse.net (Robert J Carter) writes:

   >large 'hard' systems in the traditional sense are simply too
   >wasteful. An example is the telephone system. It is not 'hard real
   >time': the cost would be just too big. Instead, they figured out how
   >to manage available resources to have acceptably low failure rates.

   That's fine as far as it goes, but having done some work myself with
   reactor systems (I saw your sig), what is an acceptably low failure rate
   for the emergency shutdown system of a reactor?

Of course the only acceptable failure rate for reactor emergency shutdown is zero. I was talking of failure in the sense of some tasks not meeting their deadline: e.g. if we have time, we shut down the pumps gradually, but if we are behind deadlines, we just scram them.

The traditional ('bigoted', as the Dr. Dobbs article had it) real time has a discontinuous 'utility' function: one if the task finishes before deadline, minus infinity otherwise. The article claims that a more realistic approximation is a function like:

      utility to the system
           ^
           |
  ---------|1
           |\
           | \
  ---------|--\--------> tardiness ( = time of completion - deadline)
          0|   \
           |
           |

BTW, I have nothing to do with the reactor design and operation: I am a solid state physicist, doing neutron scattering.
--
   przemek klosowski (przemek@rrdstrad.nist.gov)
   Reactor Division (bldg. 235), E111
   National Institute of Standards and Technology
   Gaithersburg, MD 20899, USA
   (301) 975 6249

^ permalink raw reply	[flat|nested] 21+ messages in thread
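[Editor's note: the two utility shapes Klosowski contrasts (all-or-nothing hard real-time vs. a value that decays past the deadline) can be sketched in a few lines. This is an illustrative sketch only, not code from the article; the function names and the slope parameter are invented here.]

```python
def hard_utility(tardiness):
    """Traditional hard real-time utility: full value if the task
    finishes by the deadline, unboundedly negative otherwise."""
    return 1.0 if tardiness <= 0 else float("-inf")

def soft_utility(tardiness, slope=0.5):
    """The more realistic shape: full value on time, then linearly
    decreasing (eventually negative) value as the task finishes
    later and later past the deadline.  'tardiness' is
    time of completion minus deadline."""
    return 1.0 if tardiness <= 0 else 1.0 - slope * tardiness
```

Scheduling against `soft_utility` lets a system trade a slightly late, lower-value completion against other work, where `hard_utility` counts any lateness as total failure.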
* Re: What's Real-Time? (was Re: Widespread C++ Competency Gap?) 1995-01-01 9:35 ` Robert J Carter 1995-01-02 17:10 ` Przemek Klosowski @ 1995-01-03 23:20 ` Robert I. Eachus 1 sibling, 0 replies; 21+ messages in thread From: Robert I. Eachus @ 1995-01-03 23:20 UTC (permalink / raw) In article <3e5t1e$2ad@oghma.synapse.net> rjc@oghma.synapse.net (Robert J Carter) writes: > That's fine as far as it goes, but having done some work myself with > reactor systems (I saw your sig), what is an acceptably low failure rate > for the emergency shutdown system of a reactor? Very different for type I and type II failures! This is in fact an analysis that many projects never do, and often the specifications for a project fail to say which type of failure is worse. (Other names for the concept include FAIL SAFE systems, which only fail to operate, and criticality 1 failures for something which causes the system as a whole to fail/crash.) -- Robert I. Eachus with Standard_Disclaimer; use Standard_Disclaimer; function Message (Text: in Clever_Ideas) return Better_Ideas is... ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: What's Real-Time? (was Re: Widespread C++ Competency Gap?) 1994-12-31 1:09 ` What's Real-Time? (was Re: Widespread C++ Competency Gap?) Henry Baker 1994-12-31 2:12 ` Don Yuniskis 1994-12-31 17:08 ` Przemek Klosowski @ 1995-01-04 22:05 ` Fred McCall 2 siblings, 0 replies; 21+ messages in thread From: Fred McCall @ 1995-01-04 22:05 UTC (permalink / raw) In <hbaker-3012941712040001@192.0.2.1> hbaker@netcom.com Henry Baker writes: >In the RT community there are the terms '_hard_ real-time' and >'_soft_ real-time'. 'Hard' RT means that the consequences of missing >a deadline are catastrophic, while 'soft' RT means that the consequences >are expensive, but not catastrophic. This doesn't quite jibe with the way I learned it (although reflection seems to show that we're talking about different aspects of the same thing). Real-time (in general) means that there are time-driven constraints on the system which *must* be met. 'Hard' real-time implies that the required timeline must be met *exactly*; in other words, you need a deterministic machine where you know the timing of all operations and combinations thereof in order to do this correctly. If you blow the timeline (either early or late, within a very narrow window), the system will fail and cannot be recovered. 'Soft' real-time sets upper limits on how long things can take, but if you get there early the system will run along fine. Blown timelines can generally be recovered if other later operations can be performed fast enough to 'make up' the blown timeline. >Unfortunately, nearly _all_ of the progress in HW performance in the last >20 years has been in _mean_ performance, _not_ worst-case performance. >In fact, worst-case performance has only improved modestly over this >period. The _ratio_ of mean to worst-case performance has increased >dramatically, so that it may be 100:1 or more today. 
Not to mention that most of the things that we do to make the hardware faster also tend to make it less deterministic (super-scalar, super-pipelined designs, etc.) and hence less suitable for 'hard' real-time unless you design with extraordinary margins (so that you can still make the timeline if the pipeline stalls and has to be flushed, for example). > -- "Insisting on perfect safety is for people who don't have the balls to live in the real world." -- Mary Shafer, NASA Ames Dryden --------------------------------------------------------------------------- merlin@annwfn.com -- I don't speak for others and they don't speak for me. ^ permalink raw reply [flat|nested] 21+ messages in thread
[parent not found: <3ckb8g$841@gateway.wiltel.com>]
[parent not found: <1994Dec21.151952.8902@merlin.h>]
[parent not found: <1994Dec21.151952.8902@merlin.h <19941230.201628.350635.NETNEWS@UICVM.UIC.EDU>]
[parent not found: <3e9f60$8du@jive.cs.utexas.edu>]
[parent not found: <3epfsi$64d@gamma.ois.com>]
[parent not found: <3eua1r$4ea@gnat.cs.nyu.edu>]
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) [not found] ` <3eua1r$4ea@gnat.cs.nyu.edu> @ 1995-01-11 1:44 ` Henry Baker 1995-01-13 13:30 ` R. William Beckwith 1 sibling, 0 replies; 21+ messages in thread From: Henry Baker @ 1995-01-11 1:44 UTC (permalink / raw) In article <3eua1r$4ea@gnat.cs.nyu.edu>, dewar@cs.nyu.edu (Robert Dewar) wrote: > "Is there any work-in-progress by the large chip manufacturers to > design GC into their next-generation CPU architectures? It seems > like the next logical step." > > I sure hope not, I hope we have seen the end of this kind of incorrect > CISC thinking. At most what you want is some very specific hardware > assist instructions that are consistent with RISC instruction design > philosophy Actually, the performance of GC these days is more hindered by cache and VM designs than instruction sets. In particular, GC needs "write-allocate with subblock placement", such as is found on the DEC MIPS machines. I believe that Alphas also have write-allocate, but I'm not completely sure. The Pentium apparently does _not_ do write-allocate, which makes any kind of initialization of untouched memory pretty much of a disaster. Ditto for VM implementations -- people keep talking about 'log-based backing stores', but the major thing that is required isn't so much a log, as the ability to blind write to a page without having to read it first. PLDI'94 and LFP'94 had some good papers on cache issues in GC. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) [not found] ` <3eua1r$4ea@gnat.cs.nyu.edu> 1995-01-11 1:44 ` Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) Henry Baker @ 1995-01-13 13:30 ` R. William Beckwith 1995-01-13 14:59 ` Kelvin Nilsen 1995-01-13 21:04 ` Henry Baker 1 sibling, 2 replies; 21+ messages in thread
From: R. William Beckwith @ 1995-01-13 13:30 UTC (permalink / raw)

Robert Dewar (dewar@cs.nyu.edu) wrote:
: "Is there any work-in-progress by the large chip manufacturers to
: design GC into their next-generation CPU architectures? It seems
: like the next logical step."
: I sure hope not, I hope we have seen the end of this kind of incorrect
: CISC thinking. At most what you want is some very specific hardware
: assist instructions that are consistent with RISC instruction design
: philosophy (for an example of this kind of thinking look at the trapped
: add in the SPARC design, certainly we do NOT want to put LISP type checking
: into the hardware, but this simple RISC consistent instruction is quite a
: help for certain approaches to dynamic type checking in LISP interpreters).

I'm not looking for a complex solution. I'm looking for the simplest design changes in instruction set, cache, and register files to accommodate transparent memory management.

Henry Baker comments that the primary issues are cache management and VM. The fact that GC experts like Henry and Paul Wilson quickly digress to chip level issues leads me to believe that a concerted effort on the part of the CPU designer could lead to a much more elegant and efficient solution to memory management in compiled languages.

It seems to me that GC is more than just a process level decision. There are three layers of participation in the effort:

   1. CPU
   2. O/S kernel
   3. system libraries

This is all in hopes that the application does not have to participate in the memory management exercise.
Why not design the CPU with the GC'tor in mind instead of designing the GC'tor with the CPU in mind? It just seems silly that hardware designers aren't thinking more about relieving us programmers from thinking about what memory we are currently using.

... Bill

--------------------------------------------------------------------------
e-mail: Bill.Beckwith@ois.com     FAX: 703-264-1721   | Team Ada
Objective Interface Systems, Inc. Voice: 703-264-1900 | dist, full O-O
1895 Preston White Drive, Suite 250                   | multithreading
Reston, VA 22091-5448 U.S.A.                          | built in

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-13 13:30 ` R. William Beckwith @ 1995-01-13 14:59 ` Kelvin Nilsen 1995-01-17 2:45 ` R. William Beckwith 1995-01-17 16:29 ` Robert I. Eachus 1995-01-13 21:04 ` Henry Baker 1 sibling, 2 replies; 21+ messages in thread
From: Kelvin Nilsen @ 1995-01-13 14:59 UTC (permalink / raw)

beckwb@ois.com (R. William Beckwith) writes:

>: "Is there any work-in-progress by the large chip manufacturers to
>: design GC into their next-generation CPU architectures? It seems
>: like the next logical step."

>I'm not looking for a complex solution. I'm looking for the simplest
>design changes in instruction set, cache, and register files to
>accommodate transparent memory management.

Paul Wilson mentioned my proposed hardware support for GC. In case anyone is interested in more information on this, I suggest the following five references:

Schmidt, Nilsen. Performance of a Hardware-Assisted Real-Time Garbage Collector. Sixth ASPLOS. Oct. 1994.

Nilsen, Schmidt. Cost-Effective Object-Space Management for Hardware-Assisted Real-Time Garbage Collection. ACM LOPLAS. Dec. 1992.

Nilsen. Reliable Real-Time Garbage Collection of C++. USENIX Computing Systems. Fall 1994.

Nilsen, Schmidt. A High-Performance Hardware-Assisted Real-Time Garbage Collection System. Journal of Programming Languages. Jan. 1994.

Nilsen. Cost-Effective Hardware-Assisted Real-Time Garbage Collection. 1994 ACM PLDI Workshop on Languages, Compilers and Tool Support for Real-Time Systems. (ftp.cs.iastate.edu:/pub/kelvin/lcts-rts.PS.Z) (This paper provides an overview of the hardware prototype that is currently under development; there are a number of recent innovations that we have introduced to reduce hardware costs and improve system flexibility.)

>Why not design the CPU with the GC'tor in mind instead of designing the
>GC'tor with the CPU in mind?
>It just seems silly that hardware designers aren't thinking more about
>relieving us programmers from thinking about what memory we are currently
>using.

Our design proposes to place all of the special circuitry within a custom memory module. The design is portable between CPU architectures, and allows memory to be cached. The economic benefits are:

1. Nobody (these days) would even think of designing such specialized support into a CPU. The benefits have not been proven. The costs are not well understood. The investment required to deliver a price/performance competitive CPU in today's marketplace is too huge. By placing the custom hardware within the memory subsystem, we are able to provide the desired benefits without most of the costs of developing special-purpose CPUs.

2. The hardware can operate at memory speeds rather than at CPU speeds. This simplifies the implementation.

3. Since the memory architecture is portable between CPU architectures, we can separate issues of instruction-set compatibility from issues of real-time garbage collection support. (It would be a real bummer if our custom CPU died in the marketplace not because hardware-assisted GC was a bad idea, but because the instruction set simply wasn't "popular".)

Let me add a little bit of clarification on this topic:

1. Our work has been motivated by the need to support tight worst-case time bounds on dynamic memory management routines while supporting predictable memory utilization. Given these requirements, hardware support seemed like the "right thing to do."

2. If you do not need hard-real-time performance guarantees, then it is much more difficult to justify the costs of custom hardware. There are many good software garbage collection techniques that perform well enough in the absence of custom hardware.

3. On the other hand, if the proposed custom hardware is manufactured in large enough quantities, then it becomes commodity hardware and is much more affordable. This may raise the ante.
Programmers will have higher expectations of their programming environments, and garbage collection activities will be a more dominant part of a "typical" system workload. (We are trying to find some high-volume users for our system in order to bring the per-unit cost down. Potential buyers include the video game, virtual reality, and television-top multimedia box developers.) 4. Lots of "good ideas" never go anywhere. To turn a good idea into a useful capability requires huge amounts of technology development, salesmanship, risk taking, faith (blinders?), and a little bit of luck. We have a "chip manufacturer" partner who is currently helping to bring this idea to the marketplace, but the partner is not one of the big names and they are extremely careful to control their risks and limit their investments. A big part of "computer architecture" research seems to consist of simply learning how to "play your cards." Kelvin Nilsen/Dept. of Computer Science/Iowa State University/Ames, IA 50011 (515) 294-2259 kelvin@cs.iastate.edu uunet!cs.iastate.edu!kelvin ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-13 14:59 ` Kelvin Nilsen @ 1995-01-17 2:45 ` R. William Beckwith 1995-01-19 15:57 ` Kurt Bischoff 1995-01-17 16:29 ` Robert I. Eachus 1 sibling, 1 reply; 21+ messages in thread
From: R. William Beckwith @ 1995-01-17 2:45 UTC (permalink / raw)

: Kelvin Nilsen/Dept. of Computer Science/Iowa State University/Ames, IA 50011
: (515) 294-2259 kelvin@cs.iastate.edu uunet!cs.iastate.edu!kelvin

Thanks for the informative followup, Kelvin. If the chip you're working on has or will have an Ada95 compiler for it, please let me know. I'll try to plug in a nice hardware assisted GC'tor.

... Bill

-----------------------------------------------------
e-mail: Bill.Beckwith@ois.com       | Team Ada
Objective Interface Systems, Inc.   | dist, full O-O
1895 Preston White Drive, Suite 250 | multithreading
Reston, VA 22091-5448 U.S.A.        | built in

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-17 2:45 ` R. William Beckwith @ 1995-01-19 15:57 ` Kurt Bischoff 0 siblings, 0 replies; 21+ messages in thread
From: Kurt Bischoff @ 1995-01-19 15:57 UTC (permalink / raw)

In article <3ffb0d$qbp@gamma.ois.com> beckwb@ois.com (R. William Beckwith) writes:

>: Kelvin Nilsen/Dept. of Computer Science/Iowa State University/Ames, IA 50011
>: (515) 294-2259 kelvin@cs.iastate.edu uunet!cs.iastate.edu!kelvin
>
>Thanks for the informative followup, Kelvin. If the chip you're working on
>has or will have an Ada95 compiler for it, please let me know. I'll try to
>plug in a nice hardware assisted GC'tor.
>

I think Kelvin was making the point that GC support should be kept out of the CPU--it should be in a separate memory module. Then compiler implementation is independent of hardware-assisted GC implementation.

Kelvin wrote:
: 3. Since the memory architecture is portable between CPU architectures,
:    we can separate issues of instruction-set compatibility from issues
:    of real-time garbage collection support.

-------------
Kurt Bischoff, Odyssey Research, 301A Dates Drive, Ithaca, NY, 14850
(607) 277-2020    bischoff@oracorp.com

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-13 14:59 ` Kelvin Nilsen 1995-01-17 2:45 ` R. William Beckwith @ 1995-01-17 16:29 ` Robert I. Eachus 1995-01-18 15:27 ` Henry Baker ` (2 more replies) 1 sibling, 3 replies; 21+ messages in thread
From: Robert I. Eachus @ 1995-01-17 16:29 UTC (permalink / raw)

In article <kelvin.790009178@kickapoo> kelvin@cs.iastate.edu (Kelvin Nilsen) writes:

> By placing the custom hardware within the memory subsystem,
> we are able to provide the desired benefits without most of the
> costs of developing special-purpose CPUs.

There is another major reason for putting the custom hardware in the memory system. There is a fairly large class of problems which require time proportional to n^k lg2 n on any von Neumann architecture, but can be solved in time n^(k-1) lg2 n on hardware which supports clearing large sections of memory to zero in constant time. (Note that if k is one, you can get orders of magnitude improvement.)

Since many memory systems can support this happy property, all you need to do to get the faster algorithms is to bypass the OS, or more politely, have an OS call which clears a section of memory.

--
Robert I. Eachus

with Standard_Disclaimer;
use Standard_Disclaimer;
function Message (Text: in Clever_Ideas) return Better_Ideas is...

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-17 16:29 ` Robert I. Eachus @ 1995-01-18 15:27 ` Henry Baker 1995-01-19 19:59 ` Norman H. Cohen 1995-01-22 2:56 ` David Hanley 2 siblings, 0 replies; 21+ messages in thread From: Henry Baker @ 1995-01-18 15:27 UTC (permalink / raw) In article <EACHUS.95Jan17112913@spectre.mitre.org>, eachus@spectre.mitre.org (Robert I. Eachus) wrote: > In article <kelvin.790009178@kickapoo> kelvin@cs.iastate.edu (Kelvin Nilsen) writes: > > > By placing the custom hardware within the memory subsystem, > > we are able to provide the desired benefits without most of the > > costs of developing special-purpose CPUs. > > There is another major reason for putting the custom hardware in > the memory system. There is a fairly large class of problems which > require time proportional to n^k lg2 n on any von Neuman architecture, > but can be solved in time n^(k-1) lg2 on hardware which supports > clearing large sections of memory to zero in constant time. (Note > that if k is one, you can get orders of magnitude improvement.) Simple example, please? ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-17 16:29 ` Robert I. Eachus 1995-01-18 15:27 ` Henry Baker @ 1995-01-19 19:59 ` Norman H. Cohen 1995-01-20 2:20 ` Henry Baker 1995-01-20 14:49 ` Robert I. Eachus 1995-01-22 2:56 ` David Hanley 2 siblings, 2 replies; 21+ messages in thread From: Norman H. Cohen @ 1995-01-19 19:59 UTC (permalink / raw) In article <EACHUS.95Jan17112913@spectre.mitre.org>, eachus@spectre.mitre.org (Robert I. Eachus) writes: |> There is another major reason for putting the custom hardware in |> the memory system. There is a fairly large class of problems which |> require time proportional to n^k lg2 n on any von Neuman architecture, |> but can be solved in time n^(k-1) lg2 on hardware which supports |> clearing large sections of memory to zero in constant time. (Note |> that if k is one, you can get orders of magnitude improvement.) This doesn't make sense, because there is a well known programming trick (used more in analysis-of-algorithms than in real programs because of the large constant factor) for marking a large section of memory as uninitialized in constant time. (The effect of initializing a large section of memory to zero is achieved by checking on each read whether the location read from has been initialized, and using the value zero if it has not been.) The trick works as follows: View the section of memory as an array of N initializable units numbered 0 to N-1. Allocate two arrays of integers, Stack and Stack_Index, indexed from 0 to N-1. Initially, both arrays contain garbage. There is a stack pointer, Top, pointing to the highest-indexed element of Stack that does NOT contain garbage (so that initially it equals -1). When initializable unit I is initialized, Top is incremented, the new value of Top is placed in Stack_Index(I), and I is placed in Stack(Top). Then the index of every initialized unit is found exactly once in Stack(0 .. 
Top), and if unit I has been initialized, Stack_Index(I) contains the index within Stack(0 .. Top) containing I.

The test for whether initializable unit I is initialized is:

   if Stack_Index(I) in 0 .. Top then
      if Stack(Stack_Index(I)) = I then
         return True;
         -- Stack_Index(I) contains a valid index into the stack, and the
         -- corresponding index in the stack indicates that unit I has
         -- been initialized.
      else
         return False;
         -- The garbage in Stack_Index(I) happened to point into the
         -- initialized part of Stack, but the value in Stack reveals
         -- that unit I was not the initializable unit for which this
         -- component of Stack was initialized.
      end if;
   else
      return False;
      -- Garbage in Stack_Index(I) is not even a valid stack index.
   end if;

or, more succinctly,

   return Stack_Index(I) in 0 .. Top and then Stack(Stack_Index(I)) = I;

--
Norman H. Cohen    ncohen@watson.ibm.com

^ permalink raw reply	[flat|nested] 21+ messages in thread
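[Editor's note: Cohen's constant-time "pretend it's zeroed" trick can be modeled in executable form. A sketch under assumptions: the class and method names are invented here, and since Python has no genuinely uninitialized memory, the garbage that would sit in the two bookkeeping arrays on real hardware is only simulated by `None` placeholders; the O(1) bookkeeping itself is faithful.]

```python
class LazyZeroArray:
    """An array of n cells that behaves as if zero-initialized, with
    O(1) creation/reset, reads, and writes -- at the cost of roughly
    3x the memory (values plus two validation arrays)."""

    def __init__(self, n):
        self.value = [None] * n        # cell contents ("garbage" until written)
        self.stack = [None] * n        # indices of written cells, in write order
        self.stack_index = [None] * n  # where each cell's entry sits in stack
        self.top = -1                  # highest live entry in stack

    def _written(self, i):
        si = self.stack_index[i]
        # valid only if it points into the live part of the stack AND
        # the stack entry there points back at cell i (mutual witness)
        return si is not None and 0 <= si <= self.top and self.stack[si] == i

    def read(self, i):
        return self.value[i] if self._written(i) else 0  # default "zero"

    def write(self, i, v):
        if not self._written(i):
            self.top += 1
            self.stack[self.top] = i
            self.stack_index[i] = self.top
        self.value[i] = v

    def clear(self):
        self.top = -1   # constant-time "re-zero" of the whole array
```

After `clear()`, stale `stack_index` entries may still point into the stack, but the mutual-witness check in `_written` rejects them, which is exactly the property Cohen's Ada test relies on.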
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-19 19:59 ` Norman H. Cohen @ 1995-01-20 2:20 ` Henry Baker 1995-01-20 14:49 ` Robert I. Eachus 1 sibling, 0 replies; 21+ messages in thread From: Henry Baker @ 1995-01-20 2:20 UTC (permalink / raw) In article <3fmgbn$rns@watnews1.watson.ibm.com>, ncohen@watson.ibm.com wrote: > In article <EACHUS.95Jan17112913@spectre.mitre.org>, > eachus@spectre.mitre.org (Robert I. Eachus) writes: > > |> There is another major reason for putting the custom hardware in > |> the memory system. There is a fairly large class of problems which > |> require time proportional to n^k lg2 n on any von Neuman architecture, > |> but can be solved in time n^(k-1) lg2 on hardware which supports > |> clearing large sections of memory to zero in constant time. (Note > |> that if k is one, you can get orders of magnitude improvement.) > > This doesn't make sense, because there is a well known programming trick > (used more in analysis-of-algorithms than in real programs because of the > large constant factor) for marking a large section of memory as > uninitialized in constant time. (The effect of initializing a large > section of memory to zero is achieved by checking on each read whether > the location read from has been initialized, and using the value zero if > it has not been.) This technique is known as a 'read barrier' in GC circles. See: ftp://ftp.netcom.com/pub/hb/hbaker/RealTimeGC.html (or .ps.Z) for 1978 CACM reference. (The read barrier technique is discussed twice in this paper -- once for the basic GC algorithm itself, and once for 'moving' and/or copying large arrays incrementally.) ^ permalink raw reply [flat|nested] 21+ messages in thread
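[Editor's note: as a rough illustration of the read-barrier idea Baker refers to -- hypothetical names, and greatly simplified relative to the 1978 algorithm (no scanning, no collector interleaving) -- the barrier evacuates a from-space object on first access, so the mutator only ever sees to-space copies.]

```python
class Obj:
    """A heap object with a forwarding pointer, set once evacuated."""
    def __init__(self, fields):
        self.fields = fields
        self.forward = None   # to-space copy, once one exists

def read_barrier(obj, tospace):
    """Every object access goes through this barrier: if the object has
    not yet been copied to to-space, copy it now (this is the small,
    bounded increment of GC work) and record the forwarding pointer.
    The caller always receives the to-space version."""
    if obj.forward is None:
        copy = Obj(list(obj.fields))  # evacuate on first touch
        obj.forward = copy
        tospace.append(copy)
    return obj.forward
```

Repeated accesses follow the forwarding pointer rather than copying again, which is what bounds the per-access cost and makes the collection incremental.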
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-19 19:59 ` Norman H. Cohen 1995-01-20 2:20 ` Henry Baker @ 1995-01-20 14:49 ` Robert I. Eachus 1 sibling, 0 replies; 21+ messages in thread From: Robert I. Eachus @ 1995-01-20 14:49 UTC (permalink / raw) In article <3fmgbn$rns@watnews1.watson.ibm.com> ncohen@watson.ibm.com (Norman H. Cohen) writes: > This doesn't make sense, because there is a well known programming trick > (used more in analysis-of-algorithms than in real programs because of the > large constant factor) for marking a large section of memory as > uninitialized in constant time... Read barriers are a neat trick, but for real-time applications replacing a boolean with a pointer and... is totally intolerable. I have used algorithms in this class in real-time systems, and although we devoted a whole page to each bit-vector, the typical maximums are on the order of a few hundred bits. But the difference between, say, 5 clocks and 20 (clearing two words at a time) is crucial. So yes, you are correct that these algorithms can be replaced by more complex versions without the special memory clearing instruction, but the large constant multiplier makes that approach useless. To give a real-world example, in one application, if all the time available was assigned to the innermost loop of an N^2 lg2 N algorithm, you had 25 clocks per iteration--and a cache miss could take most of that. Fortunately I found an algorithm which met the theoretical complexity limit, and where all the inner loop did was set a bit-vector of length N to false. In that case limiting the size of the bit-vector (so I could keep it in a register pair and avoid write-throughs) was sufficient. -- Robert I. Eachus with Standard_Disclaimer; use Standard_Disclaimer; function Message (Text: in Clever_Ideas) return Better_Ideas is... ^ permalink raw reply [flat|nested] 21+ messages in thread
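Eachus's fallback -- a bit-vector small enough that clearing it costs a few clocks -- is easy to illustrate: hold the whole vector in a single machine word (or word pair), and setting all N bits to false is one store. A hypothetical sketch:

```python
class SmallBitVector:
    """Bit-vector of up to 64 bits packed into one integer; clearing is
    a single assignment, the software analogue of keeping the vector in
    a register pair and avoiding write-throughs."""

    def __init__(self, n):
        assert n <= 64          # must fit in one "register pair"
        self.bits = 0

    def set(self, i):
        self.bits |= 1 << i

    def test(self, i):
        return (self.bits >> i) & 1 == 1

    def clear(self):
        self.bits = 0           # the whole inner-loop reset: one store

v = SmallBitVector(64)
v.set(5)
```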
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-17 16:29 ` Robert I. Eachus 1995-01-18 15:27 ` Henry Baker 1995-01-19 19:59 ` Norman H. Cohen @ 1995-01-22 2:56 ` David Hanley 1995-01-23 17:06 ` Robert I. Eachus 2 siblings, 1 reply; 21+ messages in thread From: David Hanley @ 1995-01-22 2:56 UTC (permalink / raw) Robert I. Eachus (eachus@spectre.mitre.org) wrote: : There is another major reason for putting the custom hardware in : the memory system. There is a fairly large class of problems which : require time proportional to n^k lg2 n on any von Neumann architecture, : but can be solved in time n^(k-1) lg2 on hardware which supports : clearing large sections of memory to zero in constant time. (Note : that if k is one, you can get orders of magnitude improvement.) Hmmm. While I suppose this is possible, I haven't yet run across this "large body of algorithms" in which the execution time is reduced by an order of magnitude when memory blocks can be zeroed out very quickly. Even if the algorithm did have to zero out the memory itself, I don't see how that could take it from, say, n^2 to n^3. Could you please point some of these out to me? : Since many memory systems can support this happy property, all you : need to do to get the faster algorithms is to bypass the OS, or more : politely, have an OS call which clears a section of memory. Hmm. Not sure what this has to do with GC, really. In any case, it would be pretty easy to design memory hardware that would zero out pages really quickly (just don't refresh them one cycle). And the bzero() function on your C compiler could be set to use it. But maybe I'm just confused. Could you please explain if this is wrong? -- ------------------------------------------------------------------------------ | David James Hanley -- dhanley@lac.eecs.uic.edu -- C++, OOD, martial arts | | Laboratory for advanced computing | My employer barely KNOWS me. 
| ------------------------------------------------------------------------------ "There are trivial truths & there are great truths. The opposite of a trivial truth is plainly false." "The opposite of a great truth is also true." -Niels Bohr ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-22 2:56 ` David Hanley @ 1995-01-23 17:06 ` Robert I. Eachus 0 siblings, 0 replies; 21+ messages in thread From: Robert I. Eachus @ 1995-01-23 17:06 UTC (permalink / raw) In article <19950122.094355.149517.NETNEWS@UICVM.UIC.EDU> dhanley@matisse.eecs.uic.edu (David Hanley) writes: > Hmmm. While I suppose this is possible, I haven't yet run > across this "large body of algorithms" in which the execution time > is reduced by an order of magnitude when memory blocks can be > zeroed out very quickly. Even if the algorithm did have to zero > out the memory itself, I don't see how that could take it from, > say, n^2 to n^3. Could you please point some of these out to me? Maybe I should have said "real-time" or "on-line" algorithms. The distinguishing characteristic seems to be that you have to choose the solution method without knowing the size of (or all of) the data set in advance. Another way of looking at it is that you are concerned with solving a problem, but you are only concerned with the time between receiving the last data and getting a result. (Although in practice, what you want to do is, at every stage of reading in data, have a solution based on all the data seen so far.) The seminal paper is "Real-time computation" by M. O. Rabin (1963). (The reference I have is Israel J. Math 1, 203-211, but I think it has been reprinted.) Hartmanis and Stearns, "On the computational complexity of algorithms," Trans. Amer. Math. Soc. 117, 285-306 (1965). > Hmm. Not sure what this has to do with GC, really. In any > case, it would be pretty easy to design memory hardware that would > zero out pages really quickly (just don't refresh them one cycle). > And the bzero() function on your C compiler could be set to use it. > But maybe I'm just confused. Could you please explain if this is > wrong? Nothing confused in that; it is done in real-time applications all the time. 
A typical scenario is one where you build a hash table using a hash function with a very low likelihood of collision. Normally you would have to factor the time to clear the hash table, or the time for maintaining a structure just for clearing the table, into the time required for the algorithm. Poof!--if supported by the hardware--is the most elegant solution. The only thing it has to do with garbage collection AFAIK is that there is a nice technique where data is ping-ponged back and forth between two heaps. For example, you have a structure containing all current aircraft tracks. If the old data is reviewed and, if still valid, moved every sweep, all the garbage created by building and maintaining the tracks can be made to disappear with never a worry about fragmentation. -- Robert I. Eachus with Standard_Disclaimer; use Standard_Disclaimer; function Message (Text: in Clever_Ideas) return Better_Ideas is... ^ permalink raw reply [flat|nested] 21+ messages in thread
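Eachus's hash-table scenario has a well-known software analogue when the hardware "poof" isn't available: stamp every slot with a generation number, and clearing the table becomes a single counter increment. A hypothetical sketch (single-slot probing with no collision handling, relying on the low-collision hash the scenario stipulates):

```python
class ClearableTable:
    """Hash table whose clear() is O(1): each slot records the
    generation in which it was written, and bumping the generation
    counter invalidates every slot at once."""

    def __init__(self, size):
        self.keys = [None] * size
        self.values = [None] * size
        self.stamp = [0] * size      # generation each slot was written in
        self.generation = 1

    def _slot(self, key):
        return hash(key) % len(self.keys)

    def put(self, key, value):
        i = self._slot(key)
        self.keys[i], self.values[i] = key, value
        self.stamp[i] = self.generation

    def get(self, key):
        i = self._slot(key)
        if self.stamp[i] == self.generation and self.keys[i] == key:
            return self.values[i]
        return None                  # slot is stale or never written

    def clear(self):
        self.generation += 1         # O(1): every slot is now stale

t = ClearableTable(64)
t.put(1, "track-a")
```

This is Cohen's trick in another costume: it trades one extra word per slot and one comparison per lookup for a constant-time clear, which is exactly the trade the real-time budget has to absorb.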
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-13 13:30 ` R. William Beckwith 1995-01-13 14:59 ` Kelvin Nilsen @ 1995-01-13 21:04 ` Henry Baker 1995-01-17 10:37 ` Mark Reinhold 1 sibling, 1 reply; 21+ messages in thread From: Henry Baker @ 1995-01-13 21:04 UTC (permalink / raw) In article <3f5vaf$r07@gamma.ois.com>, beckwb@ois.com (R. William Beckwith) wrote: > Henry Baker comments that the primary issues are cache management and VM. > The fact that GC experts like Henry and Paul Wilson quickly digress to > chip level issues leads me to believe that a concerted effort on the > part of the CPU designer could lead to a much more elegant and efficient > solution to memory management in compiled languages. If people are interested in these issues, and haven't yet read the following paper, they are in for a treat: Reinhold, M.B. "Cache Performance of Garbage-Collected Programs". PLDI'94, ACM Sigplan Notices 29, 6 (June 1994), 206-217. (It _may_ be on the net. Reinhold works for NEC in New Jersey, I believe.) ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) 1995-01-13 21:04 ` Henry Baker @ 1995-01-17 10:37 ` Mark Reinhold 0 siblings, 0 replies; 21+ messages in thread From: Mark Reinhold @ 1995-01-17 10:37 UTC (permalink / raw) In article <hbaker-1301951307490001@192.0.2.1> hbaker@netcom.com writes: > If people are interested in these issues, and haven't yet read the following > paper, they are in for a treat: > Reinhold, M.B. "Cache Performance of Garbage-Collected Programs". PLDI'94, > ACM Sigplan Notices 29, 6 (June 1994), 206-217. (It _may_ be on the net. > Reinhold works for NEC in New Jersey, I believe.) It is indeed on the net. See http://www.neci.nj.nec.com/homepages/mbr.html. Comments welcome. ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~1995-01-23 17:06 UTC | newest] Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <787227087snz@wslint.demon.co.uk> [not found] ` <3da1nd$fss@gateway.wiltel.com> [not found] ` <3e1rqn$ouh@news.parc.xerox.com> [not found] ` <3e22hi$pqf@baygull.rtd.com> [not found] ` <3e26mc$n9u@Starbase.NeoSoft.COM> 1994-12-31 1:09 ` What's Real-Time? (was Re: Widespread C++ Competency Gap?) Henry Baker 1994-12-31 2:12 ` Don Yuniskis 1994-12-31 17:08 ` Przemek Klosowski 1995-01-01 9:35 ` Robert J Carter 1995-01-02 17:10 ` Przemek Klosowski 1995-01-03 23:20 ` Robert I. Eachus 1995-01-04 22:05 ` Fred McCall [not found] ` <3ckb8g$841@gateway.wiltel.com> [not found] ` <1994Dec21.151952.8902@merlin.h> [not found] ` <1994Dec21.151952.8902@merlin.h <19941230.201628.350635.NETNEWS@UICVM.UIC.EDU> [not found] ` <3e9f60$8du@jive.cs.utexas.edu> [not found] ` <3epfsi$64d@gamma.ois.com> [not found] ` <3eua1r$4ea@gnat.cs.nyu.edu> 1995-01-11 1:44 ` Parallel & RT GC (was Re: Real-Time GC (was Re: Widespread C++...?) Henry Baker 1995-01-13 13:30 ` R. William Beckwith 1995-01-13 14:59 ` Kelvin Nilsen 1995-01-17 2:45 ` R. William Beckwith 1995-01-19 15:57 ` Kurt Bischoff 1995-01-17 16:29 ` Robert I. Eachus 1995-01-18 15:27 ` Henry Baker 1995-01-19 19:59 ` Norman H. Cohen 1995-01-20 2:20 ` Henry Baker 1995-01-20 14:49 ` Robert I. Eachus 1995-01-22 2:56 ` David Hanley 1995-01-23 17:06 ` Robert I. Eachus 1995-01-13 21:04 ` Henry Baker 1995-01-17 10:37 ` Mark Reinhold
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox