comp.lang.ada
 help / color / mirror / Atom feed
* Fast locking (Was Re: Java vs Ada 95)
  1996-11-03  0:00         ` Robert Dewar
@ 1996-11-05  0:00           ` Geert Bosch
  1996-11-06  0:00             ` Larry Kilgallen
  0 siblings, 1 reply; 11+ messages in thread
From: Geert Bosch @ 1996-11-05  0:00 UTC (permalink / raw)




In article <dewar.847053681@merv> Robert Dewar writes:
   Geert said
   
   "In the more general case a very simple spin-lock is enough and the
    overhead should only be one memory-read when the object is not locked.
    String objects are locked almost never and when they are locked they
    are only locked for a short time."

   What do you mean by a very simple spin lock in a uniprocessor environment.

I meant a lock that uses some form of busy-waiting instead of
blocking. Such a lock can be implemented using atomic test-and-set 
or atomic swap. Although these locks are not efficient for long
waits because of wasted CPU time and possible starving of other
tasks, for resources that are only locked for a short time, like
the ref-counted bounded string example, they are the best solution.

When implementing a sample package for spin locks I found that GNAT
doesn't provide intrinsic subprograms as described in Annex C of the
RM:  "It is recommended that intrinsic subprograms be provided for
convenient access to any machine operations that provide special
capabilities or efficiency that are not otherwise available through the
language constructs." (RM95 C.1(11))

This means that test-and-set is not possible using just GNAT at the
moment and indeed implementing efficient spin-locks is not possible.
Hopefully the extra intrinsic subprograms will be implemented
sometime.

Regards,
   Geert


-- 
E-Mail: geert@sun3.iaf.nl    




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Fast locking (Was Re: Java vs Ada 95)
  1996-11-05  0:00           ` Fast locking (Was Re: Java vs Ada 95) Geert Bosch
@ 1996-11-06  0:00             ` Larry Kilgallen
  1996-11-06  0:00               ` Robert Dewar
  1996-11-06  0:00               ` Geert Bosch
  0 siblings, 2 replies; 11+ messages in thread
From: Larry Kilgallen @ 1996-11-06  0:00 UTC (permalink / raw)



In article <55o4g4$ki8@fozzie.sun3.iaf.nl>, geert@fozzie.sun3.iaf.nl (Geert Bosch) writes:

> This means that test-and-set is not possible using just GNAT at the
> moment and indeed implementing efficient spin-locks is not possible.
> Hopefully the extra intrinsic subprograms will be implemented
> sometime.

Descriptions of the GNAT developer environment posted here have
emphasized uniformity across platforms indicating that many of
those depending on it required that uniformity.

My presumption is that those folks have portable Ada code in mind.
The machine semantics of test-and-set on whatever machine you are
describing would seem to be somewhat different from the Alpha AXP
load-locked/store-conditional semantics.  A higher level construct
which supports portable programs seems better to me than something
specific to test-and-set hardware semantics.

(When I wrote about spin locks earlier in this thread, I thought we
were discussing techniques for compiler writers to use rather than
something exposed to Ada programmers.)

Larry Kilgallen




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Fast locking (Was Re: Java vs Ada 95)
  1996-10-09  0:00 Once again, Ada absent from DoD SBIR solicitation Stanley R. Allen
  1996-11-01  0:00 ` Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation) Robert I. Eachus
@ 1996-11-06  0:00 ` Hannes Haug
  1996-11-06  0:00 ` Hannes Haug
  2 siblings, 0 replies; 11+ messages in thread
From: Hannes Haug @ 1996-11-06  0:00 UTC (permalink / raw)



geert@fozzie.sun3.iaf.nl (Geert Bosch) writes:

> I meant a lock that uses some form of busy-waiting instead of
> blocking. Such a lock can be implemented using atomic test-and-set 
> or atomic swap. Although these locks are not efficient for long
> waits because of wasted CPU time and possible starving of other
> tasks, for resources that are only locked for a short time, like
> the ref-counted bounded string example, they are the best solution.

You can yield the task if the mutex is already locked.
Untested pseudo gnu c for sparc:

typedef struct mutex {
  volatile int lock;
} mutex_t;

static void
mutex_init (mutex_t *mp)
{
  mp->lock = 0;
}

static void
mutex_unlock (mutex_t *mp)
{
  mp->lock = 0;
}

static int
mutex_trylock (mutex_t *mp)
{
  int old_lock;
  asm volatile ("ldstub [%1],%0" : "=r" (old_lock) : "r" (&mp->lock));
  return old_lock;
}

static void
mutex_lock (mutex_t *mp)
{
  while (mutex_trylock (mp) != 0)
    thread_yield ();
}


typedef struct semaphore {
  mutex_t mutex;
  volatile int count;
} semaphore_t;

static void
semaphore_init (semaphore_t *sp, int count)
{
  mutex_init (&sp->mutex);
  sp->count = count;
}

static void
semaphore_post (semaphore_t *sp)
{
  mutex_lock (&sp->mutex);
  sp->count = sp->count + 1;
  mutex_unlock (&sp->mutex);
}

static int
semaphore_trywait (semaphore_t *sp) /* this is NOT the ordinary trywait */
{
  int old_count;
  mutex_lock (&sp->mutex);
  old_count = sp->count;
  if (old_count > 0)
    sp->count = old_count - 1;
  mutex_unlock (&sp->mutex);
  return old_count;
}

static void
semaphore_wait (semaphore_t *sp)
{
  while (semaphore_trywait (sp) == 0)
    thread_yield ();
}


This should be pretty fast. Even semaphore_wait
is small enough for automatic inlining.

 -hannes




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Fast locking (Was Re: Java vs Ada 95)
  1996-10-09  0:00 Once again, Ada absent from DoD SBIR solicitation Stanley R. Allen
  1996-11-01  0:00 ` Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation) Robert I. Eachus
  1996-11-06  0:00 ` Hannes Haug
@ 1996-11-06  0:00 ` Hannes Haug
  2 siblings, 0 replies; 11+ messages in thread
From: Hannes Haug @ 1996-11-06  0:00 UTC (permalink / raw)



kilgallen@eisner.decus.org (Larry Kilgallen) writes:

> My presumption is that those folks have portable Ada code in mind.
> The machine semantics of test-and-set on whatever machine you are
> describing would seem to be somewhat different from the Alpha AXP
> load-locked/store-conditional semantics.  A higher level construct
> which supports portable programs seems better to me than something
> specific to test-and-set hardware semantics.

Something like an inlined mutex_trylock?
This is something like test-and-set-lock.

> (When I wrote about spin locks earlier in this thread, I thought we
> were discussing techniques for compiler writers to use rather than
> something exposed to Ada programmers.)

The compiler writers should use fast locks and unlocks,
too. I think the os-mutexes are not fast enough.

 -hannes






^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Fast locking (Was Re: Java vs Ada 95)
  1996-11-06  0:00             ` Larry Kilgallen
@ 1996-11-06  0:00               ` Robert Dewar
  1996-11-06  0:00               ` Geert Bosch
  1 sibling, 0 replies; 11+ messages in thread
From: Robert Dewar @ 1996-11-06  0:00 UTC (permalink / raw)



"> This means that test-and-set is not possible using just GNAT at the
> moment and indeed implementing efficient spin-locks is not possible.
> Hopefully the extra intrinsic subprograms will be implemented
> sometime."

GNAT fully supports machine intrinsics, starting with version 3.07.





^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Fast locking (Was Re: Java vs Ada 95)
  1996-11-06  0:00             ` Larry Kilgallen
  1996-11-06  0:00               ` Robert Dewar
@ 1996-11-06  0:00               ` Geert Bosch
  1996-11-07  0:00                 ` Larry Kilgallen
  1 sibling, 1 reply; 11+ messages in thread
From: Geert Bosch @ 1996-11-06  0:00 UTC (permalink / raw)



Larry Kilgallen writes:
`` My presumption is that those folks have portable Ada code in mind.
   The machine semantics of test-and-set on whatever machine you are
   describing would seem to be somewhat different from the Alpha AXP
   load-locked/store-conditional semantics.  A higher level construct
   which supports portable programs seems better to me than something
   specific to test-and-set hardware semantics. ''

The Systems Programming Annex of the RM specifically advises atomic
read-modify-write operations like test and set, compare and swap etc.

I can't imagine there are many systems around these days that do not
provide efficient test-and-set semantics. Even the 8088 processor that
does not directly have a test-and-set instruction can easily emulate
one by doing an atomic swap between a processor register and a memory
location. On very simple processors it is almost always possible to
block all interrupts while doing the test-and-set.

Of course machines that do not have any instructions that can be used
to efficiently implement hardware-assisted locking will not have a
test-and-set intrinsic sub-program, but of course nobody expects to
magically be able to use hardware locking support when the target
platform does not have it. The point is that *when* it is available,
it should be usable in a consistent way.

Note that although test-and-set kind of mutual exclusion is a
very fast solution for locking resources for which is almost
never any contention, it certainly doesn't replace high-level
high-overhead locking provided by protected types.

The difference is that protected types provide strictly 
priority-based FIFO queuing which makes fair sharing of scarce
resources possible. It would be a good idea however to use
fast test-and-set locks in these as well.

The example below shows how to do this, when there is hardware
test-and-increment and decrement-and-test. Assumptions are
that the standard queuing is expensive (expensive system calls
involved) and non-queuing but blocking event semaphores are much
cheaper. I've verified that these assumptions hold on OS/2, but
I guess the situation is the same for other operating systems.

Implementing the solution below causes slightly more expensive
blocking/unblocking and essentially free locking/unlocking in absence
of contention for the lock. The semantics of the Fast_Queuing_Mutex
are exactly the same as for the Slow_Queuing_Mutex, although some
bookkeeping should be added to record the priority for the task
owning the fast-lock in case of Ceiling_Locking policy.

Example: (untested incomplete code, details might be wrong)

type Fast_Queuing_Mutex is record
   Slow_Lock	        : Slow_Queuing_Mutex; 
   Fast_Lock    	: Integer := 0;
   Fast_Lock_Ready      : Event_Semaphore := Not_Posted;
   Have_Slow_Lock	: Boolean := False;
end record;

procedure Request (L : in out Fast_Queuing_Mutex) is
   Was_Zero	: Boolean;
begin
   Test_And_Increment(L.Fast_Lock, Was_Zero);
   if Was_Zero then
      -- We have lock without any overhead
      return;
   else
      -- The fast lock already was in use.
      -- Now wait in the queue for the Mutex,
      -- (No wait if we're first)
      Request_Mutex(L.Slow_Lock);   
      -- We also have to wait for the fast lock
      -- This can be done using a non-priority based
      -- event semaphore
      Wait_Event(L.Fast_Lock_Ready);
      -- Can't set this before fast lock is released
      L.Have_Slow_Lock := True;
   end if;
end Request;

Article Unavailable



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Fast locking (Was Re: Java vs Ada 95)
  1996-11-07  0:00                 ` Larry Kilgallen
@ 1996-11-07  0:00                   ` Robert Dewar
  1996-11-11  0:00                     ` Norman H. Cohen
  1996-11-08  0:00                   ` Geert Bosch
  1 sibling, 1 reply; 11+ messages in thread
From: Robert Dewar @ 1996-11-07  0:00 UTC (permalink / raw)



Geert says

"> I can't imagine there are many systems around these days that do not
> provide efficient test-and-set semantics. Even the 8088 processor that
> does not directly have a test-and-set instruction can easily emulate
> one by doing an atomic swap between a processor register and a memory
> location. On very simple processors it is almost always possible to
> block all interrupts while doing the test-and-set."

Your imagination is deficient. Examples are the RS6000 and all earlier
MIPS chips. There are others!





^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Fast locking (Was Re: Java vs Ada 95)
  1996-11-06  0:00               ` Geert Bosch
@ 1996-11-07  0:00                 ` Larry Kilgallen
  1996-11-07  0:00                   ` Robert Dewar
  1996-11-08  0:00                   ` Geert Bosch
  0 siblings, 2 replies; 11+ messages in thread
From: Larry Kilgallen @ 1996-11-07  0:00 UTC (permalink / raw)



In article <55r16k$m00@fozzie.sun3.iaf.nl>, geert@fozzie.sun3.iaf.nl (Geert Bosch) writes:

> I can't imagine there are many systems around these days that do not
> provide efficient test-and-set semantics. Even the 8088 processor that
> does not directly have a test-and-set instruction can easily emulate
> one by doing an atomic swap between a processor register and a memory
> location. On very simple processors it is almost always possible to
> block all interrupts while doing the test-and-set.

"Blocking all interrupts" in a multiprocessor environment would be
hazardous to the speed of other, non-interfering, activities.

> Of course machines that do not have any instructions that can be used
> to efficiently implement hardware-assisted locking will not have a
> test-and-set intrinsic sub-program, but of course nobody expects to
> magically be able to use hardware locking support when the target
> platform does not have it. The point is that *when* it is available,
> it should be usable in a consistent way.

My complaint is not that some machines lack fast locking primitives,
but rather that there was an assumption that those primitives will
always be "test and set".  DEC designers made the choice that their
"load-locked/store-conditional" semantics would be faster than more
traditional approaches.  Since they consistently win the SpecFP
and SpecInt comparisons with each new chip, I presume they know
what they are doing about fast hardware (although the portion of
their design having to do with locking primitives has nothing to
do with the portion of their design needed for the rather limited
SpecFP and SpecInt tests).

Larry Kilgallen




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Fast locking (Was Re: Java vs Ada 95)
  1996-11-07  0:00                 ` Larry Kilgallen
  1996-11-07  0:00                   ` Robert Dewar
@ 1996-11-08  0:00                   ` Geert Bosch
  1 sibling, 0 replies; 11+ messages in thread
From: Geert Bosch @ 1996-11-08  0:00 UTC (permalink / raw)



Larry Kilgallen (kilgallen@eisner.decus.org) wrote:
   "Blocking all interrupts" in a multiprocessor environment would be
   hazardous to the speed of other, non-interfering, activities.

Please don't twist my words. I specifically said for that this
solution might be good for small/old systems. If you need an example,
think of microcontrollers.

I also wrote:
   "Of course machines that do not have any instructions that can be used
    to efficiently implement hardware-assisted locking will not have a
    test-and-set intrinsic sub-program"

How could you still think I was suggesting that "Blocking all interrupts"
in a multiprocessor environment was a good idea? A multiprocessor system
without test-and-set or similar mechanisms falls into the category:
   "[...] of course nobody expects to magically be able to use hardware
    locking support when the target platform does not have it. The
    point is that *when* it is available, it should be usable in a
    consistent way.

Larry wrote:
  "My complaint is not that some machines lack fast locking primitives,
   but rather that there was an assumption that those primitives will
   always be "test and set".  "

I never said that, although test-and-set is the most well-known
primitive in this area, so I used that. I even listed some of the
other possible primitives. In any case, the test-and-set operation
can easily be implemented with other fast-locking operations. 

And the point is not that these low-level locks should be used in
most applications, but that they can be very valuable tools in
cases where they are needed just like the other parts of Annex C
of the ARM.

Larry wrote:
  "DEC designers made the choice that their "load-locked/
   store-conditional" semantics would be faster than more traditional
   approaches."

If I use that instruction as load-locked/store-whenzero than I have the
equivalent of test-and-set, which is why it makes sense to provide a
very generic fast-locking primitive like test-and-set in Ada
implementations rather than using DECs instruction-of-the-day.

Regards,
   Geert
-- 
E-Mail: geert@sun3.iaf.nl    




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Fast locking (Was Re: Java vs Ada 95)
  1996-11-07  0:00                   ` Robert Dewar
@ 1996-11-11  0:00                     ` Norman H. Cohen
  0 siblings, 0 replies; 11+ messages in thread
From: Norman H. Cohen @ 1996-11-11  0:00 UTC (permalink / raw)



Robert Dewar wrote:
> 
> Geert says
> 
> "> I can't imagine there are many systems around these days that do not
> > provide efficient test-and-set semantics. 
...
> 
> Your imagination is deficient. Examples are the RS6000 and all earlier
> MIPS chips. There are others!

The RS6000 is not a processor, but a family of systems, early versions
of which were based on the POWER architecture and later versions of
which were based on the PowerPC architecture.  Robert is presumably
referring to the POWER architecture, which did not include
syncrhonization operations for shared-memory multiprocessors.  In
contrast, the PowerPC architecture provides the lwarx (load word
arithmetic and reserve, indexed) and stwcx (store word conditional,
indexed) instructions, which provide powerful and efficient
syncrhonization.  Every PowerPC architecture document I have seen
includes an appendix with syncrhonization programming examples, showing
how a handful of instructions, generally built around a lwarx/stwcx
pair, can be used to program such syncrhonization idioms as
Fetch-and-Store, Fetch-and-Add, Test-and-Set, Compare-and-Swap, and
lock/unlock.

(My favorite PowerPC syncrhonization instruction, used to ensure that
out-of-order loads and stores do not violate the semantics of volatile
variables and memory-mapped I/O ports, is Enforce In-Order Execution of
I/O.  Its mnemonic is, naturally, eieio.)

-- 
Norman H. Cohen
mailto:ncohen@watson.ibm.com
http://www.research.ibm.com/people/n/ncohen




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Fast locking (Was Re: Java vs Ada 95)
@ 1996-11-12  0:00 Marin David Condic, 561.796.8997, M/S 731-93
  0 siblings, 0 replies; 11+ messages in thread
From: Marin David Condic, 561.796.8997, M/S 731-93 @ 1996-11-12  0:00 UTC (permalink / raw)



"Norman H. Cohen" <ncohen@WATSON.IBM.COM> writes:
>(My favorite PowerPC syncrhonization instruction, used to ensure that
>out-of-order loads and stores do not violate the semantics of volatile
>variables and memory-mapped I/O ports, is Enforce In-Order Execution of
>I/O.  Its mnemonic is, naturally, eieio.)
>
    I thought that was the Farmer's Union - The E.I.E.I.O.?

    MDC

Marin David Condic, Senior Computer Engineer    ATT:        561.796.8997
M/S 731-96                                      Technet:    796.8997
Pratt & Whitney, GESP                           Fax:        561.796.4669
P.O. Box 109600                                 Internet:   CONDICMA@PWFL.COM
West Palm Beach, FL 33410-9600                  Internet:   CONDIC@FLINET.COM
===============================================================================
    "That which belongs to another."

        --  Diogenes, when asked what wine he liked to drink.
===============================================================================




^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~1996-11-12  0:00 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1996-11-12  0:00 Fast locking (Was Re: Java vs Ada 95) Marin David Condic, 561.796.8997, M/S 731-93
  -- strict thread matches above, loose matches on Subject: below --
1996-10-09  0:00 Once again, Ada absent from DoD SBIR solicitation Stanley R. Allen
1996-11-01  0:00 ` Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation) Robert I. Eachus
1996-11-01  0:00   ` Robert A Duff
     [not found]     ` <55gkch$gg6@fozzie.sun3.iaf.nl>
1996-11-03  0:00       ` Robert A Duff
1996-11-03  0:00         ` Robert Dewar
1996-11-05  0:00           ` Fast locking (Was Re: Java vs Ada 95) Geert Bosch
1996-11-06  0:00             ` Larry Kilgallen
1996-11-06  0:00               ` Robert Dewar
1996-11-06  0:00               ` Geert Bosch
1996-11-07  0:00                 ` Larry Kilgallen
1996-11-07  0:00                   ` Robert Dewar
1996-11-11  0:00                     ` Norman H. Cohen
1996-11-08  0:00                   ` Geert Bosch
1996-11-06  0:00 ` Hannes Haug
1996-11-06  0:00 ` Hannes Haug

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox