Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))

comp.lang.ada
 help / color / mirror / Atom feed

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-16  0:00     ` Robert Dewar
@ 1996-11-17  0:00       ` Robert A Duff
  1996-11-18  0:00         ` Robert Dewar
                           ` (2 more replies)
  1996-11-20  0:00       ` Jon S Anthony
  1 sibling, 3 replies; 29+ messages in thread
From: Robert A Duff @ 1996-11-17  0:00 UTC (permalink / raw)

In article <dewar.848202876@merv>, Robert Dewar <dewar@merv.cs.nyu.edu> wrote:
>I can see that you would *wish* this to be the case, but please prove
>it from the RM. Where is the abstract semantics of unbounded strings
>specified in such a way as to make my reference count implementation
>incorrect? I don't see any language around the unbounded strings
>package that explicitly says this (or for that matter, even implies it).

I'd say A.4.5(1) proves it -- these things "represent strings", not
strange pointers into evilly shared variables.  If you want to use
strange pointers in the implementation, then you have to make it behave
as if they are really (varying length) strings.

I don't see any statement in the RM that says these things are erroneous
-- therefore they aren't.  (You *have* to read the RM that way --
otherwise *everything* is erroneous, because we rarely say "so-and-so is
not erroneous".)

By your reasoning, it is erroneous to use every type in every predefined
package, because who knows what's in the package body -- maybe the body
of Text_IO writes zeroes over the whole address space (since the RM
doesn't explicitly forbid that)!

It seems to me that the burden of proof should be on you -- you claim
that so-and-so is erroneous, so to prove it, you have to point to an
explicit statement in the RM saying it's erroneous.  You can't just
point out that it doesn't say it's not erroneous.  Otherwise, I'll say,
"Prove that 'X := 1;' is not erroneous.", and the whole language falls
apart.

- Bob

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-17  0:00       ` Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation)) Robert A Duff
@ 1996-11-18  0:00         ` Robert Dewar
  1996-11-19  0:00           ` Joel VanLaven
  1996-11-19  0:00           ` Robert A Duff
  1996-11-21  0:00         ` Robert I. Eachus
  1996-11-25  0:00         ` Robert I. Eachus
  2 siblings, 2 replies; 29+ messages in thread
From: Robert Dewar @ 1996-11-18  0:00 UTC (permalink / raw)



Bob Duff says

"I'd say A.4.5(1) proves it -- these things "represent strings", not
strange pointers into evilly shared variables.  If you want to use
strange pointers in the implementation, then you have to make it behave
as if they are really (varying length) strings.

I don't see any statement in the RM that says these things are erroneous
-- therefore they aren't.  (You *have* to read the RM that way --
otherwise *everything* is erroneous, because we rarely say "so-and-so is
not erroneous".)"


Nope, I don't buy that for a moment. By that reasoning, A.3

3   The implementation shall ensure that each language defined subprogram is
reentrant in the sense that concurrent calls on the same subprogram perform
as specified, so long as all parameters that could be passed by reference
denote nonoverlapping objects.


would be entirely redundant, since according to you, it is implicit in the
specs of the routines.

In fact it is perfectly possible to do things with the defined procedures
that definitely ARE erroneous, for example, surely you dont think that
two tasks that modify the same string at the same time must work
correctly. No of course not, that we would expect to be erroneousw.
"behave as if they are really (varying length) strings"

is meaningless, either we ignore the parenthetical remark, in which case
it is nonsense, since these do not behave like Ada strings, or we include
it, in which case we are talking about something not in the language and
so are defining varying length string semantics in terms of varying
length string semantics which is nonsense.
\x1adp





^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-18  0:00         ` Robert Dewar
  1996-11-19  0:00           ` Joel VanLaven
@ 1996-11-19  0:00           ` Robert A Duff
  1996-11-23  0:00             ` Robert Dewar
  1 sibling, 1 reply; 29+ messages in thread
From: Robert A Duff @ 1996-11-19  0:00 UTC (permalink / raw)

In article <dewar.848352010@merv>, Robert Dewar <dewar@merv.cs.nyu.edu> wrote:
>Nope, I don't buy that for a moment. By that reasoning, A.3

A(3).

>3   The implementation shall ensure that each language defined subprogram is
>reentrant in the sense that concurrent calls on the same subprogram perform
>as specified, so long as all parameters that could be passed by reference
>denote nonoverlapping objects.
>
>
>would be entirely redundant, since according to you, it is implicit in the
>specs of the routines.

Actually, I think it *is* redundant, and I had a big argument about that
with Tucker.  (I think it should have been a NOTE or an
[bracketed-in-AARM] thing.)  The reason it's there is that some Ada 83
compilers failed to make things like Text_IO work properly.  And because
Tucker wanted it there.

This is like many real-life laws -- legislatures are continually passing
laws that are just special cases of existing laws, because somebody is
griping about a rash of some particular crime.  Legislatures don't
normally sit around thinking of every possible evil, and pass a law
against it.  RM authors, on the other hand, try to do something like
that.  ;-)

>In fact it is perfectly possible to do things with the defined procedures
>that definitely ARE erroneous, for example, surely you dont think that
>two tasks that modify the same string at the same time must work
>correctly. No of course not, that we would expect to be erroneousw.

Agreed.

>"behave as if they are really (varying length) strings"
>is meaningless, either we ignore the parenthetical remark, in which case
>it is nonsense, since these do not behave like Ada strings, or we include
>it, in which case we are talking about something not in the language and
>so are defining varying length string semantics in terms of varying
>length string semantics which is nonsense.

Whenever the RM uses a term that it doesn't define, which is quite
often, you have to take the most reasonable meaning you know from plain
English, or from general computer jargon.  Perhaps I interpret this
section using a "friendly reading".  I'm not going to worry about it
until some compiler vendor tries to get away with making things
erroneous that shouldn't be.

And you still haven't answered my main objection to your reading -- if
you read it that way, then why isn't practically everything in Annex A
erroneous?  And lots of other stuff.  E.g:

    X, Y: Calendar.Time := Clock;

    X := Y; -- Erroneous?

(No tasks here.)  How do we know that the implementer has not evilly
made the full type for Time controlled, with an Adjust routine that does
something erroneous?  Or raises GNAT.Evil_Exceptions.Whatever_Error?  Or
writes the value 17 into all integer variables in the program?  We know
that, because the correct way to interpret the RM is that if it doesn't
say any of those bad things happen, then they don't.  The same way we
know that 1+1 of Integers doesn't have side effects.

(Of course I know that two tasks modifying X at the same time would be
erroneous.  That's covered by 9.10.)

- Bob

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-18  0:00         ` Robert Dewar
@ 1996-11-19  0:00           ` Joel VanLaven
  1996-11-23  0:00             ` Robert Dewar
  1996-11-19  0:00           ` Robert A Duff
  1 sibling, 1 reply; 29+ messages in thread
From: Joel VanLaven @ 1996-11-19  0:00 UTC (permalink / raw)



This discussion is related to Unbounded strings and someone's suggestion to
use a reference-count finalization scheme and the possible problems with
that with respect to tasking.  The problem is that multiple tasks could
all try to access the reference count at the same time, one decrementing
and one incrementing for instance, right?  So, to prevent that, some sort
of locking would have to occur.  However, we are talking about simply
incrementing and/or decrementing 1 value in memory.  On most processors
that I can think of that is 3 operations, two memory accesses and an
integer operation.  Is the Ada tasking/priority method so accurate that
that one lower priority task can't tie up the machine for 2 extra
instructions?  I think it ought to be able to handle even more than that.
What about making an increment/decrement function that is a task or protected
object at the very highest priotity level (higher than the highest as this
is the compiler writer ... :) So essentially all reference counts are on the
same lock.  When a task wanted to adjust a reference count it would call /
rendevous with the increment/decrementer which would take over and
deal with the requested reference count.  As long as that doesn't take
too long and the call overhead isn't too large, no ones toes will be
stepped on.
  On the issue of the RM and the implementation's responsibility I bring up
the point that there are always trade-offs.  Some users may not want to use
tasking at all and want the fastest code possible.  Some might want
tasking and can handle less performance.  The real responsibility of the
implementation is to be explicit about what trade-offs were made.  A
suggestion is to provide multiple alternitives when feasible.  Portability
concerns suggest that implementations chose similar trade-offs, but I will
leave that can of worms for another discussion.
-- 
-- Joel VanLaven




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-16  0:00     ` Robert Dewar
  1996-11-17  0:00       ` Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation)) Robert A Duff
@ 1996-11-20  0:00       ` Jon S Anthony
  1996-11-24  0:00         ` Robert Dewar
  1 sibling, 1 reply; 29+ messages in thread
From: Jon S Anthony @ 1996-11-20  0:00 UTC (permalink / raw)



In article <E14DCH.5EK@world.std.com> bobduff@world.std.com (Robert A Duff) writes:

> In article <dewar.848352010@merv>, Robert Dewar <dewar@merv.cs.nyu.edu> wrote:
> >"behave as if they are really (varying length) strings"
> >is meaningless, either we ignore the parenthetical remark, in which case
> >it is nonsense, since these do not behave like Ada strings, or we include
> >it, in which case we are talking about something not in the language and
> >so are defining varying length string semantics in terms of varying
> >length string semantics which is nonsense.
> 
> Whenever the RM uses a term that it doesn't define, which is quite
> often, you have to take the most reasonable meaning you know from plain
> English, or from general computer jargon.  Perhaps I interpret this

Criminey!  You better watch out Bob, you're beginning to sound just
like me in this sort of context! ;^)

There are indeed many of these and the worst of it is, they don't seem
all that "fundamental" or "primitive" or "intuitive" like "reasonable"
undefined terms.  As you say - you have to guess the best you can...


> And you still haven't answered my main objection to your reading -- if
> you read it that way, then why isn't practically everything in Annex A
> erroneous?  And lots of other stuff.  E.g:
> 
>     X, Y: Calendar.Time := Clock;
>     
>     X := Y; -- Erroneous?
> 
> (No tasks here.)  How do we know that the implementer has not evilly
> made the full type for Time controlled, with an Adjust routine that does
> something erroneous?  Or raises GNAT.Evil_Exceptions.Whatever_Error?  Or
> writes the value 17 into all integer variables in the program?  We know
> that, because the correct way to interpret the RM is that if it doesn't
> say any of those bad things happen, then they don't.  The same way we
> know that 1+1 of Integers doesn't have side effects.

Hmmmm.  Really.  I think Robert has been hoisted on his own petard, :-)

/Jon
-- 
Jon Anthony
Organon Motives, Inc.
Belmont, MA 02178
617.484.3383
jsa@organon.com





^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-17  0:00       ` Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation)) Robert A Duff
  1996-11-18  0:00         ` Robert Dewar
@ 1996-11-21  0:00         ` Robert I. Eachus
  1996-11-24  0:00           ` Robert Dewar
  1996-11-25  0:00         ` Robert I. Eachus
  2 siblings, 1 reply; 29+ messages in thread
From: Robert I. Eachus @ 1996-11-21  0:00 UTC (permalink / raw)

In article <E14DCH.5EK@world.std.com> bobduff@world.std.com (Robert A Duff) writes:

  > (Of course I know that two tasks modifying X at the same time would be
  > erroneous.  That's covered by 9.10.)

   And that is EXACTLY the point that kicked all this discussion off.
If an implementation works non-erroneously if the programmer read and
followed 9.10, end of discussion.

   I made an innocent seeming comment that in a tasking environment
the user of Unbounded_Strings HAS to provide a syncronization point
between any two references to the same value from different tasks, one
of which modifies it.  As a result, no simple implementation of
Unbounded_Strings should worry about race conditions.  Of course, any
hidden calls to new must be safe in a tasking environment even if some
other task is also allocating or freeing storage, but that is
different issue, and not local to Unbounded_Strings.

   However, if you do reference counts, there is a very unusual case
that needs to be considered.  But as I pointed out the potential
problems can be cleverly avoided, at the risk of doing some
"unnecessary" copies in race conditions.  So efficient "lazy"
implementations which use reference counting don't incur any
distributed overhead.  The extra copy in a transitive reference case
occurs only 1) if the case required to be non-erroneous occurs, or 2)
in an erroneous program where the reference counting version will do
the "right thing" instead of breaking.

   In other erroneous cases, the reference counting version will
break.  Tough!  The program would be just as erroneous, and more
likely to break if the unbounded string was actually a
Standard.String.  Hmmm... The language lawyer in my immediately reads
that and constructs weird cases.  There are cases where two tasks can
modify different sections of the same string non-erroneously, which
have to be considered erroneous if the operation is performed on an
unbounded string.  Those programs must be considered pathological,
however, I can imagine worst case scenarios where the erroneous
behavior is visible:

      Suppose one task creates an unbounded string, and passes it to
another task through a rendezvous.  Both task create copies, then
modify different sections of the original string.  Definitely
erroneous, but it will make lazy reference counting visible.

--

					Robert I. Eachus

with Standard_Disclaimer;
use  Standard_Disclaimer;
function Message (Text: in Clever_Ideas) return Better_Ideas is...

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-19  0:00           ` Robert A Duff
@ 1996-11-23  0:00             ` Robert Dewar
  1996-11-24  0:00               ` Robert A Duff
  1996-11-25  0:00               ` Norman H. Cohen
  0 siblings, 2 replies; 29+ messages in thread
From: Robert Dewar @ 1996-11-23  0:00 UTC (permalink / raw)



Bob Duff asks

"And you still haven't answered my main objection to your reading -- if
you read it that way, then why isn't practically everything in Annex A
erroneous?  And lots of other stuff.  E.g:

    X, Y: Calendar.Time := Clock;

    X := Y; -- Erroneous?

(No tasks here.)  How do we know that the implementer has not evilly
made the full type for Time controlled, with an Adjust routine that does
something erroneous?  Or raises GNAT.Evil_Exceptions ...."


No, I don't accept this as equivalent. The issue of the extent to which
library routines are task/thread safe is quite different from the issue
of single task semantics, which I think is clearly defined, and that is
why the above example definitely cannot be erroneous.

Obviously it is not the case that anything works in a threaded environment.
Your answer indeed agrees that certain things will be erroneous (e.g. if
the above assignment X := Y appeared in two tasks with no synhronization
then you would agree that it is probably erroneous -- not necessarily
of course, since the implementatoin may have a pragma atomic on its
Calendar.Time type, so this kind of erroneousness is definitely impl
dependent to some extent.

So just what *is* task safe, well the only guideline we have is the clause
in the reference manual that you don't think should be there, but I think
Tuck clearly was right to insist on it, otherwise we could not assume
anything about task safety of the libraryt routines.

As for your Text_IO example, what exactly IS required if two
tasks do a Put_Line to the same file at the same time? To me that
can still be erroneous (it is the same case as any other case where
two tasks molest the same variable, in this case the object for the
file type) at the same time, and I can't see why even the phrase in
the RM that you think is redundant gathers otherwise. Let's repeat
the RM phrase for reference:

3   The implementation shall ensure that each language defined subprogram is
reentrant in the sense that concurrent calls on the same subprogram perform
as specified, so long as all parameters that could be passed by reference
denote nonoverlapping objects.


Even if we decide that for the case of Put_Line with no file argument *is*
covered by this phrase, what does it mean? What level of interspersing
of output is implied by this wording (lines, characters, pixels???)

And of course it is quite clear that two writes to the same file using
Put_Line with a file argument are NOT covered by paragraph 3. So that's
mightly curious:

    Put_Line (Standard_Output, "xxxx");

appearing in two tasks is not covered, and could be erroneous, but

    Put_Line ("xxxx");

is OK. That seems quite weird to me, though I see how paragraph 3 could
be read this way.

Without paragraph 3, I have no idea why Bob Duff thinks that the language
has anything to say about either of these cases.

Friendly readings are fine, but I do not know what is friendly and what
is not.

Let'
s look at the reference count case.  I see two friendly readings here:

  1.  An efficient implementation with reference counts is permitted, no
      locks are needed, since if two tasks mess with the same sets of
      variables or values, the execution is erroneous.

  2.  If reference counts are used, task locks are required, requiring
      a kernel call on some systems, and therefore, becuase of the
      quite unacceptable efficiency effect, reference counts are out
      of the question.

Which of these is friendlier? Well if you don't use tasking, or if you
use only completely separated sets of values and variables in the
tasking case, you find 1 much friendlier, since a reference count
implementation makes a lot of sense in these cases.

If you insist on using unbounded strings across tasks, thinking that
the assignment semantics are like ordinary scalar types, then of course
you might find 1 friendlier.

Relying on a friendly reading is risky. I must say that initially I assumed
that reference counts could not be used and that consequently a copy on
modify approach is not practical, but I found Robert Eachus argument
here illuminating and convincing.

So Bob Duff thinks that the RM should say nothing, and that we should rely
on a friendly reading of what is and is not erroneous (he agrees that
certain calls to runtime routines are erroneous because of violating
shared variable rules, so this is a question of judgment). 

But in this case (reference counts), people do not agree on what is
friendly. So I think it is something the ARG should pin down. I don't
think that either decision (1 or 2 above) would be a major problem,
but failure to resolve this could lead to a very significant portability
problem.





^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-19  0:00           ` Joel VanLaven
@ 1996-11-23  0:00             ` Robert Dewar
  0 siblings, 0 replies; 29+ messages in thread
From: Robert Dewar @ 1996-11-23  0:00 UTC (permalink / raw)



Joel says

"This discussion is related to Unbounded strings and someone's suggestion to
use a reference-count finalization scheme and the possible problems with
that with respect to tasking.  The problem is that multiple tasks could
all try to access the reference count at the same time, one decrementing
and one incrementing for instance, right?  So, to prevent that, some sort
of locking would have to occur.  However, we are talking about simply
incrementing and/or decrementing 1 value in memory.  On most processors
that I can think of that is 3 operations, two memory accesses and an
integer operation.  Is the Ada tasking/priority method so accurate that
that one lower priority task can't tie up the machine for 2 extra
instructions?  I think it ought to be able to handle even more than that.
What about making an increment/decrement function that is a task or protected
object at the very highest priotity level (higher than the highest as this
is the compiler writer ... :) So essentially all reference counts are on the
same lock.  When a task wanted to adjust a reference count it would call /
rendevous with the increment/decrementer which would take over and
deal with the requested reference count.  As long as that doesn't take
too long and the call overhead isn't too large, no ones toes will be
stepped on."


The concern is not a semantic concern about taking the lock. As you point
out, as long as the lock has proper ceiling priority semantics, this is
consistent even with the strict semantic requirements of Annex D. The
concern is that in an operating system environment, taking this lock
may require a kernel call, and take thousands of instructions. This is
not just a regrettable inefficiency, it is sufficiently inefficient that
we could not consider using reference counts at all in this environment.

Basically reference counts and tasking do not go well together, which
is unfortunate, since reference counts are a powerful tool. Of course
on an architecture like the x86, which has an indivisible decrement
and increment that set the flags in a useful manner, you can implement
them efficiently, but this is not true on all architectures.





^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-24  0:00           ` Robert Dewar
@ 1996-11-24  0:00             ` Fergus Henderson
  1996-11-24  0:00               ` Robert Dewar
  1996-11-25  0:00             ` Kevin D. Heatwole
  1 sibling, 1 reply; 29+ messages in thread
From: Fergus Henderson @ 1996-11-24  0:00 UTC (permalink / raw)



dewar@merv.cs.nyu.edu (Robert Dewar) writes:

>Unbounded_String's are built into the language from a semantic point of view.
>There are two possible implementations that are reasonable

There are more than just two.  For example, a third reasonable
implementation is to use garbage-collected storage with copy-on-write.
This has potential advantages over the two implementations that you
considered: it avoids the overhead of copying, and avoids the overhead
of locking.  (Of course, there are some potential disadvantages too.)

--
Fergus Henderson <fjh@cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>   |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3         |     -- the last words of T. S. Garp.




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-23  0:00             ` Robert Dewar
@ 1996-11-24  0:00               ` Robert A Duff
  1996-11-25  0:00               ` Norman H. Cohen
  1 sibling, 0 replies; 29+ messages in thread
From: Robert A Duff @ 1996-11-24  0:00 UTC (permalink / raw)

In article <dewar.848758301@merv>, Robert Dewar <dewar@merv.cs.nyu.edu> wrote:
>And of course it is quite clear that two writes to the same file using
>Put_Line with a file argument are NOT covered by paragraph 3. So that's
>mightly curious:
>
>    Put_Line (Standard_Output, "xxxx");
>
>appearing in two tasks is not covered, and could be erroneous, but
>
>    Put_Line ("xxxx");
>
>is OK. That seems quite weird to me, ...

Agreed.  In my opinion, both of the above should be considered
erroneous.  I said that in another article, and I also admitted that the
RM doesn't clearly say so.  As you say, if one thinks the above should
work, then one needs to worry about the granularity of interleaving.

>... though I see how paragraph 3 could
>be read this way.

>Without paragraph 3, I have no idea why Bob Duff thinks that the language
>has anything to say about either of these cases.

I think you're confusing what I said with what somebody else said.  Two
simultaneous Put_Lines are erroneous if they are writing to the same
file, but not if they are writing to two different files.  At least, in
my opinion, that's the way it *should* be.  The glitch in the RM wording
is that for the Put_Line without any file argument, there's no variable
in sight -- even though we know that Put_Line("xxx") and
Put_Line(Current_Output, "xxx") ought to have identical semantics.  If I
weren't so lazy, I tried to dig up some wording that says so, and then
we would be OK -- I don't remember exactly how the no-arg version is
defined.

>But in this case (reference counts), people do not agree on what is
>friendly. So I think it is something the ARG should pin down.

Apparently.  I'm surprised that you think the unprotected reference
count implementation is reasonable.  (Unless of course the compiler
knows there aren't any tasks causing trouble.)

- Bob

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-24  0:00             ` Fergus Henderson
@ 1996-11-24  0:00               ` Robert Dewar
  0 siblings, 0 replies; 29+ messages in thread
From: Robert Dewar @ 1996-11-24  0:00 UTC (permalink / raw)



Fergus says

"There are more than just two.  For example, a third reasonable
implementation is to use garbage-collected storage with copy-on-write.
This has potential advantages over the two implementations that you
considered: it avoids the overhead of copying, and avoids the overhead
of locking.  (Of course, there are some potential disadvantages too.)"


Do you mean by this what I call copy-on-modify, if so this is indeed
one of the two possibilities that I considered. Note that if you have
a garbage collector around, you could take the position that ANYTHING
done with dynamic storage makes things erroneous acrosds tasks, on
the grounds that the garbage collector does non syncrhonized conflicting
accesses to storage, but this seems a bit extreme :-)





^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-20  0:00       ` Jon S Anthony
@ 1996-11-24  0:00         ` Robert Dewar
  0 siblings, 0 replies; 29+ messages in thread
From: Robert Dewar @ 1996-11-24  0:00 UTC (permalink / raw)



Jon Anthony said

"Hmmmm.  Really.  I think Robert has been hoisted on his own petard, :-)"

Not at all, Jon, you have just been taken in by that classical argument
technique "argument by false analogy". It goes like this,

Ah ha, you said A, but if you say A, then you must also mean B and
B is ridiculous.

Fine, in this case I agree that B (not  providing proper sequential
semantics) is ridulous, but it has  very little to do with A (knowing
exactly what task safe means).





^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-21  0:00         ` Robert I. Eachus
@ 1996-11-24  0:00           ` Robert Dewar
  1996-11-24  0:00             ` Fergus Henderson
  1996-11-25  0:00             ` Kevin D. Heatwole
  0 siblings, 2 replies; 29+ messages in thread
From: Robert Dewar @ 1996-11-24  0:00 UTC (permalink / raw)



iRobert Eachus says

"      Suppose one task creates an unbounded string, and passes it to
another task through a rendezvous.  Both task create copies, then
modify different sections of the original string.  Definitely
erroneous, but it will make lazy reference counting visible."



The "definitely erroneous" here seems too strong to me. The point is that
we have two basic possible semantics of unbounded strings, one is to copy
on all assignments, and the other is to share copies, and copy only on a
modification.

I am completely unable to determine that one of these implementations is
somehow more reasonable, natural, expected, required or whatever, they
both seem quite reasonable to me.

But if we copy only on modification, then it is indeed true that the above
scenario is erroneous because the "create copies" presumably refers to
doing assignment statements, which do not create copies at the implementation
level. 

Bob tries to argue that this implemntation is somehow improper, but I don't
see that at all. Indeed copy on modify seems quite a reasonable 
implementation to me (it is for example what all SNOBOL4 and SPITBOL
implementations have done). In a garbage collected environment it is
also the most natural implementation.

But it certainly does have consequences with regard to thread safety!

The trouble here is that of course we expect to have proper value
semantics (and that indeed is why Bob's argument about other examples,
such as 1+1 not working are irrelevant, everyone agrees on the basic
single thread value semantics).

[[oops to clarify, Bob in the above  = Bob Duff]]

The trouble is that pure value semantics breaks down in a tasking
environment because of the shared variable issues, and we introduce
the notion of erroneous access to shared variables to patch this
up -- not a very satisfactory solution, but a pragmatic one, since
the only other alternative is to put locks on all variables all the
time.

Now when we get to Unbounded_String's, some kind of similar patch up
is required, but it definitely depends on the implementation model.
Since the RM text does not specify the implementation model, we are
unable to determine exactly the scope of the corresponding pragmatic
solution.

Going back to Robert Eachus' claim of "definitely erroneous", I guess
I *do* agree in the following sense:

Unbounded_String's are built into the language from a semantic point of view.
There are two possible implementations that are reasonable
For one of these implementations, the given operation *is* erroneous
Therefore, since one allowed implementation is erroneous, the set f
 possible effects is erroneous, even if in some implementations it
 might actually work (working is certainly one of the allowed
 behaviors for an erroneous execution -- the same is true after
 all of any shared variable erroneousity -- most of the time things
 are just fine.

One thing is clear, this should have been discussed during the language
design. The implementation technique of using transparent sharing via
pointers, with copy on modify rather than copy on assign is very
familiar, especially in the realm of variable length string processing,
so we shuld have anticipated this model, an discussed its interaction
with tasking.

---

The argument that this is NOT erroneous works something like this:


From a semantic point of view, assignment involves a copy. Yes, you can
"cheat" and not do a copy, but only if "as if" semantic equivalence is
maintained.

The copy-on-modify approach, though it might be fine (i.e. have fine
"as if" semantics in a single threaded environment) does NOT work fine
in a multi-threaded environment since it does not provide "as if"
semantics.

I find some merit in this argument, but I do not find it decisive, because
at the language level we have already introduced the notion of allowing
assignments and references to become erroneous as a result of shared
variable reference -- and we did it on pragmatic grounds, there would
have been no difficulty *semantically* in requiring assignment to work
properly in a threaded environment, even if shared variables are 
accessed, and no difficulty in implementing it -- just ghastly
efficiency issues.

Well here we sure have a similar case -- we have no trouble agreeing that
Bob Duff's claimed obvious semantics are fine, and that we can implement
them with no trouble -- but the implementation has a ghastly efficiency
problem (if taking locks is expensive).

Note that it is not just reference counts that cause this problem. Even if
we use what I would guess is the expected implementation of unbounded
strings which is to use controlled types (as in GNAT), we are in trouble.

Controlled types themselves implicitly reference global shared data
structures, or at least may do so in the model where chains of objects
to be finalized are maintained. If these chains must be protected against
multi-threading, there are again ghastly efficiency implications in 
systems where taking a lock is expensive. 

Of course if you carry the argument I and Robert Eachus are lmakig to logical
extremes, since type controlled is again just a type in a library package
whose implementation is hidden, it could be argued that it is not necessary
to take locks even in the controlled case, but this would introduce an even
more dramatic restriction on what could be done between tasks (basically
you would require controlled collections to be task specific -- and that
seems awfully restrictive).

Still, we definitely need more discussion here.

This is not the first time we have run into issues of having underspecified
the preefined library routines. Two previous cases are:

  1. The interaction of predefined private types with stream attributes. Do
     these predefined types have stream attributes that "work" in an 
     expected way. The RM does not ensure this (so for example using
     Read and Write with unbounded string is not rquired to work by the
     RM, but surely (?) it should, so an AI has been approved that 
     requires it to work -- this AI is a binding interpretation, a
     recognition that the RM was inadequate here.

  2. The package categorizatoin with respect to distribution pragmas was
     not thought through. The result is that the RM does not just fail to
     guarantee that certain types cannot be used with full flexibility in
     a distributed environment -- it REQUIRES that they NOT be able to be
     used. Again the ARG is looking into this.


Reasoning by what is friendly is never a safe occupation in the world
of language semantics. Certainly I think most poeple would agree that
the stream attributes SHOULD work for unbounded strings. The RM authors
either agreed with this explicitly, or implicitly, or simply did not
think about it. But the RM does NOT pin this down.

I think the business with reference counts and shared unbounded strings
between tasks is very much in this category, and certainly needs an
ARG discussion.

--

There is an interesting lesson for package and interface design here. Some
VERY detailed thought is required to properly document exactly what the
semantics of a package is when it is used in a multi-threaded environment.
I think we have been far too casual about this in the past.

Robert






^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
@ 1996-11-25  0:00 Franco Mazzanti
  1996-11-25  0:00 ` Robert A Duff
  0 siblings, 1 reply; 29+ messages in thread
From: Franco Mazzanti @ 1996-11-25  0:00 UTC (permalink / raw)

Robert A. Duff wrote:

">And of course it is quite clear that two writes to the same file using
 >Put_Line with a file argument are NOT covered by paragraph 3. So that's
 >mightly curious:
  >
 >    Put_Line (Standard_Output, "xxxx");
 >
 >appearing in two tasks is not covered, and could be erroneous, but
 >
 >    Put_Line ("xxxx");
 >
 >is OK. That seems quite weird to me, ...

 Agreed.  In my opinion, both of the above should be considered
 erroneous.  I said that in another article, and I also admitted that the
 RM doesn't clearly say so.  As you say, if one thinks the above should
 work, then one needs to worry about the granularity of interleaving."

I know I am an heretic for my views on this subject. 

However I think that it is quite incorrect to consider "Standard_Output"
as a shared variable. "Standard_Output" is a function call, and it very likely
to return an anonymous temporary object. AT least this is what one would
expect by reading the RM (unless File_Type is a "return-by-reference" type).

Only if we make use of the version of "Standard_Output" which returns a
"File_Access" object, then in two concurrent calls of

   Put_Line (Standard_Output.all, "xxxx"); 

we would really use a shared variable.

However, even in this case, the erroneousnees is not clearly implied by the
reference manual because the File_Type argument has mode IN. So, we have
sharing, but this sharing is safe because it is "read only".

The picture which the RM naturally provides (at least to a naive reader) is
the only two concurrent Close, Delete, Reset, Flush operations on the same
File_Type variable are erroneous for 9.10 (or the execution of any of
these
operations bin parallel with a Get_Line, Put_Line, ...) because these 
operations actually take a File_Type parameter of mode IN OUT.

I agree that some attributes associated to file objects are modified by get 
and put operations (file position information). However, this kind of data
is completely outside the user view, is part of the underlying implementation
of the I/O system and A(3) seem really to imply that concurrent calls of
Put_line should be thread-safe.

With the respect to the granularity of the interleaving, I do not think that
one needs to worry about it. 
Once it is stated clearly that the effects of Put_line need not be atomic 
(with respect to the externel effect and the update of the file position
attributes) it is perfectly acceptable that the EXTERNAL effect of two 
put_line operations results in some unspecified interleaving of the data.

But this is very different from allowing the concurrent execution of even 
the operations:

   Put_Line(My_file, "my-string");
   V:Count := Col(F);

to be erroneous and "thrash my hard disk".

Franco Mazzanti

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-25  0:00 Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation)) Franco Mazzanti
@ 1996-11-25  0:00 ` Robert A Duff
  0 siblings, 0 replies; 29+ messages in thread
From: Robert A Duff @ 1996-11-25  0:00 UTC (permalink / raw)



In article <mazzanti-2511961115230001@131.114.200.115>,
Franco Mazzanti <mazzanti@iei.pi.cnr.it> wrote:
>Only if we make use of the version of "Standard_Output" which returns a
>"File_Access" object, then in two concurrent calls of
>
>   Put_Line (Standard_Output.all, "xxxx"); 
>
>we would really use a shared variable.
>
>However, even in this case, the erroneousnees is not clearly implied by the
>reference manual because the File_Type argument has mode IN. So, we have
>sharing, but this sharing is safe because it is "read only".

One reason I object to this point of view is efficiency: It forces the
implementation to do some sort of locking inside Text_IO.  (In addition
to whatever the operating system does, I mean.)  But I need to do some
sort of locking in the program, too, because if I have two tasks writing
to the same file, I'm going to want to control the granularity of
interleaving.  So the locking with Text_IO is useless for a real
program.

- Bob




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-17  0:00       ` Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation)) Robert A Duff
  1996-11-18  0:00         ` Robert Dewar
  1996-11-21  0:00         ` Robert I. Eachus
@ 1996-11-25  0:00         ` Robert I. Eachus
  2 siblings, 0 replies; 29+ messages in thread
From: Robert I. Eachus @ 1996-11-25  0:00 UTC (permalink / raw)




   I said:

 > "      Suppose one task creates an unbounded string, and passes it to
 > another task through a rendezvous.  Both task create copies, then
 > modify different sections of the original string.  Definitely
 > erroneous, but it will make lazy reference counting visible."

   Robert Dewar said:

 > The "definitely erroneous" here seems too strong to me. The point is that
 > we have two basic possible semantics of unbounded strings, one is to copy
 > on all assignments, and the other is to share copies, and copy only on a
 > modification.

   I think Robert, you may need to reread my example.  If the two
tasks modified the copies, then the erroneousness would be
questionable.  But I have both tasks updating the same value without
an intervening synchronization point.  THAT is erroneous if anything
is.  The copies allow me to determine later whether lazy updating is
used, but are not what makes the example erroneous.  Even if the areas
modified are non-overlapping, an implementation can write words
instead of bytes.

   My point was that the program must be erroneous for lazy copies to
be visible with the given implementation strategy.  (There are other
methods involving unchecked conversions, but the addresses you get are
erroneous, because Unchecked_Conversion is not required to return
values by reference.)

   Other than that, I think Robert Dewar and I are in strong
agreement.  What happens if tasks share values without explicit
synchronization can't be used to govern how a library package is
implemented.  We can only reason about the behavior in cases where the
conditions in A(3) hold.

--

					Robert I. Eachus

with Standard_Disclaimer;
use  Standard_Disclaimer;
function Message (Text: in Clever_Ideas) return Better_Ideas is...




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-24  0:00           ` Robert Dewar
  1996-11-24  0:00             ` Fergus Henderson
@ 1996-11-25  0:00             ` Kevin D. Heatwole
  1996-11-25  0:00               ` Robert A Duff
  1996-11-26  0:00               ` Geert Bosch
  1 sibling, 2 replies; 29+ messages in thread
From: Kevin D. Heatwole @ 1996-11-25  0:00 UTC (permalink / raw)

In article <dewar.848839209@merv>, dewar@merv.cs.nyu.edu (Robert Dewar) wrote:

>Well here we sure have a similar case -- we have no trouble agreeing that
>Bob Duff's claimed obvious semantics are fine, and that we can implement
>them with no trouble -- but the implementation has a ghastly efficiency
>problem (if taking locks is expensive).
>
>Note that it is not just reference counts that cause this problem. Even if
>we use what I would guess is the expected implementation of unbounded
>strings which is to use controlled types (as in GNAT), we are in trouble.
>
>Controlled types themselves implicitly reference global shared data
>structures, or at least may do so in the model where chains of objects
>to be finalized are maintained. If these chains must be protected against
>multi-threading, there are again ghastly efficiency implications in 
>systems where taking a lock is expensive. 

I think you might be going too far in categorizing this as having "ghastly
efficiency implications".  In a language that incorporates separate 
threads of execution, this "efficiency problem" occurs all over the place.
As you point out, an implementation for controlled types will probably
involve some need for serializing access to shared data structures.  So
does protected types, tasks, and even dynamic allocation of memory.

Indeed, the use of reference counts in an unbounded string implementation
is presumeably done to save a dynamic allocation which would involve
some taking of a lock anyway.

So, what is the big deal here as it applies to the specific case of unbounded
strings?

Certainly, initial implementations of controlled types, dynamic allocation,
and unbounded strings will be less than optimal, but it is up to us implementers
to provide efficient implementations of these features of Ada. 

Is it really that much harder to provide an efficient ref-counted tasking safe
implementation of unbounded strings than it is to provide an efficient tasking
safe allocator?

Kevin Heatwole
OC Systems, Inc.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-25  0:00             ` Kevin D. Heatwole
@ 1996-11-25  0:00               ` Robert A Duff
  1996-11-26  0:00                 ` Kevin D. Heatwole
  1996-11-26  0:00               ` Geert Bosch
  1 sibling, 1 reply; 29+ messages in thread
From: Robert A Duff @ 1996-11-25  0:00 UTC (permalink / raw)

In article <heatwole-ya023180002511960851580001@news.his.com>,
Kevin D. Heatwole <heatwole@his.com> wrote:
>Is it really that much harder to provide an efficient ref-counted tasking safe
>implementation of unbounded strings than it is to provide an efficient tasking
>safe allocator?

Two points:

You can make "new" efficient by having per-task pools of memory.
Allocate each task 4K (say) and use that for "new".  And if it runs out,
allocate some more from a global pool.  Or steal some from another task.
This will reduce the probability of lock contention.

I suspect that the most efficient method for unbounded strings is to use
full-fledged garbage collection.  I don't KNOW that, but from reading
various papers by GC advocates, I suspect it's probably true.  And I'm
talking about non-real-time systems here, where we care about best
"typical" performance -- I don't think one uses unbounded strings in
hard-real-time situations.  Of course, it's hard to implement an
efficient GC -- doubly so in a multi-threaded language.

OK, here's a third point:  All this talk about inefficient locks is
based on the assumption that each Ada task is mapped to a single OS
thread.  If you're willing to give up the benefits of that (on, say,
Windows 95), then it's not hard to do very efficient locking.

- Bob

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-23  0:00             ` Robert Dewar
  1996-11-24  0:00               ` Robert A Duff
@ 1996-11-25  0:00               ` Norman H. Cohen
  1996-11-27  0:00                 ` Robert Dewar
  1 sibling, 1 reply; 29+ messages in thread
From: Norman H. Cohen @ 1996-11-25  0:00 UTC (permalink / raw)



Robert Dewar wrote:

> Friendly readings are fine, but I do not know what is friendly and what
> is not.
> 
> Let's look at the reference count case.  I see two friendly readings here:
> 
>   1.  An efficient implementation with reference counts is permitted, no
>       locks are needed, since if two tasks mess with the same sets of
>       variables or values, the execution is erroneous.
> 
>   2.  If reference counts are used, task locks are required, requiring
>       a kernel call on some systems, and therefore, becuase of the
>       quite unacceptable efficiency effect, reference counts are out
>       of the question.
> 
> Which of these is friendlier? Well if you don't use tasking, or if you
> use only completely separated sets of values and variables in the
> tasking case, you find 1 much friendlier, since a reference count
> implementation makes a lot of sense in these cases.
> 
> If you insist on using unbounded strings across tasks, thinking that
> the assignment semantics are like ordinary scalar types, then of course
> you might find 1 friendlier.

What I would find friendliest is a run-time system with two versions of
Ada.Strings.Unbounded.  The version that uses reference counts but no
locks would only be linked in when I said

   pragma Restrictions (Max_Tasks => 1);

Otherwise, I would get the task-safe version (presumably without
reference counts, since the cost of locking would most likely outweigh
any savings from lazy copying).

-- 
Norman H. Cohen
mailto:ncohen@watson.ibm.com
http://www.research.ibm.com/people/n/ncohen




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-26  0:00                 ` Kevin D. Heatwole
@ 1996-11-26  0:00                   ` Robert A Duff
  1996-11-26  0:00                     ` Larry Kilgallen
                                       ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Robert A Duff @ 1996-11-26  0:00 UTC (permalink / raw)



In article <heatwole-ya023180002611960850350001@news.his.com>,
Kevin D. Heatwole <heatwole@his.com> wrote:
>I suspect that even this approach would have some difficulties involving
>pool-specific (or global) locks.  While task A may allocate the storage
>out of its per-task pool of memory, task B may be the task that deallocates
>the memory later.  If you have two threads mucking about with the same
>pool of memory, you will need to implement some form of serialization.

If you implement your pools with free lists, then when you deallocate,
you can put that chunk of memory back on the current tasks free list,
despite the fact that it was allocated from some other task's free list.

>Of course, as you point out, garbage collection techniques might help to
>make the usual case efficient but this has its drawbacks (and still might
>not reduce/eliminate the need for locking).
>
>Also, note that getting the identity of the currently executing task may be
>inefficient as well.  On our target, AIX for the PowerPC, getting the identity
>of the current thread involves an AIX call to pthread_self.  Since there is
>no thread-specific memory, this call is implemented as a search of a thread
>id table using the current stack pointer with appropriate locking (boy, is this 
>slow!).

Good point.  (Of course, IMHO, the OS *ought* to support an efficient
Current_Thread primitive -- it's not that hard to implement.  But that
doesn't help you unless you've got control of the OS.)

>...  We have chosen to reserve a register to point to thread-specific 
>memory to alleviate this problem, but this makes C calling Ada slower 
>because it needs to call pthread_self to setup this register.

And if your target were an 80x86, you wouldn't have many registers to
spare.  Sigh.

>>OK, here's a third point:  All this talk about inefficient locks is
>>based on the assumption that each Ada task is mapped to a single OS
>>thread.  If you're willing to give up the benefits of that (on, say,
>>Windows 95), then it's not hard to do very efficient locking.
>
>Why is this true?  Is it because on most OSes, OS calls are inherently
>expensive?

Well, OS calls *are* inherently expensive, because you have to cross a
protection boundary.  But for multi-threading within a single
application, the protection boundary is pointless, IMHO.  After all,
threads share memory, so they can trash one another's data -- trashing
another task's return addresses is at least as bad as destroying the
data associated with locks and so forth.

>I would think that if the OS wanted to make this efficient and it was a
>priority for the users, the lock calls could be just as efficient (whether done
>by the OS or done by the Ada runtime).  Anyway, I suspect that now that
>many OSes are becoming multi-threaded and with the more common use of
>multiple processors (even in the PC world), that the OSes will become more
>friendly and efficient for us Ada compiler implementors.  The OSes are finally
>catching up with Ada!

Well, maybe.  Don't hold your breath.

- Bob




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-26  0:00                   ` Robert A Duff
@ 1996-11-26  0:00                     ` Larry Kilgallen
  1996-11-27  0:00                     ` Norman H. Cohen
  1996-11-27  0:00                     ` Robert Dewar
  2 siblings, 0 replies; 29+ messages in thread
From: Larry Kilgallen @ 1996-11-26  0:00 UTC (permalink / raw)



In article <E1I0KG.M4r@world.std.com>, bobduff@world.std.com (Robert A Duff) writes:

> Well, OS calls *are* inherently expensive, because you have to cross a
> protection boundary.  But for multi-threading within a single
> application, the protection boundary is pointless, IMHO.  After all,
> threads share memory, so they can trash one another's data -- trashing
> another task's return addresses is at least as bad as destroying the
> data associated with locks and so forth.

Some thread-relevant OS calls affect the scheduler data in cases
where the OS can schedule separate threads on separate CPUs. The
scheduler data is typically stored together for all appications,
and one would not want an application to trash the data for some
other application.

Larry Kilgallen




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-25  0:00             ` Kevin D. Heatwole
  1996-11-25  0:00               ` Robert A Duff
@ 1996-11-26  0:00               ` Geert Bosch
  1996-11-26  0:00                 ` Robert Dewar
  1 sibling, 1 reply; 29+ messages in thread
From: Geert Bosch @ 1996-11-26  0:00 UTC (permalink / raw)



Kevin D. Heatwole (heatwole@his.com) wrote:
 ``Is it really that much harder to provide an efficient ref-counted
   tasking safe implementation of unbounded strings than it is to
   provide an efficient tasking safe allocator? ''

No, it isn't. The only problems relate to the priority requirements of
Annex D.  When you want to use ceiling priority locks, it is necessary
to change and/or check priority of tasks which requires an expensive
system call.

As far as I'm concerned it is acceptable to have an implementation that
locks a reference counter, updates the count and unlocks it without any
priority checking, since the circumstances in which starvation of a
higher priority task occurs, because a lower-priority task has the lock
are very rare and the effects are much less disastrous then the
erroneous behavior of reference counts without locks. Also, raising an
exception in such cases is not too difficult to implement.

The problems related to implementing a locking scheme with such very
specific real-time behaviour as described by Annex D. should not be
an excuse to use no locks at all.

Regards,
   Geert
-- 
E-Mail: geert@sun3.iaf.nl    




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-25  0:00               ` Robert A Duff
@ 1996-11-26  0:00                 ` Kevin D. Heatwole
  1996-11-26  0:00                   ` Robert A Duff
  0 siblings, 1 reply; 29+ messages in thread
From: Kevin D. Heatwole @ 1996-11-26  0:00 UTC (permalink / raw)

In article <E1FnJC.HI4@world.std.com>, bobduff@world.std.com (Robert A
Duff) wrote:

>In article <heatwole-ya023180002511960851580001@news.his.com>,
>Kevin D. Heatwole <heatwole@his.com> wrote:
>>Is it really that much harder to provide an efficient ref-counted tasking safe
>>implementation of unbounded strings than it is to provide an efficient tasking
>>safe allocator?
>
>Two points:
>
>You can make "new" efficient by having per-task pools of memory.
>Allocate each task 4K (say) and use that for "new".  And if it runs out,
>allocate some more from a global pool.  Or steal some from another task.
>This will reduce the probability of lock contention.

I suspect that even this approach would have some difficulties involving
pool-specific (or global) locks.  While task A may allocate the storage
out of its per-task pool of memory, task B may be the task that deallocates
the memory later.  If you have two threads mucking about with the same
pool of memory, you will need to implement some form of serialization.

Of course, as you point out, garbage collection techniques might help to
make the usual case efficient but this has its drawbacks (and still might
not reduce/eliminate the need for locking).

Also, note that getting the identity of the currently executing task may be
inefficient as well.  On our target, AIX for the PowerPC, getting the identity
of the current thread involves an AIX call to pthread_self.  Since there is
no thread-specific memory, this call is implemented as a search of a thread
id table using the current stack pointer with appropriate locking (boy, is this 
slow!).  We have chosen to reserve a register to point to thread-specific 
memory to alleviate this problem, but this makes C calling Ada slower 
because it needs to call pthread_self to setup this register.

>OK, here's a third point:  All this talk about inefficient locks is
>based on the assumption that each Ada task is mapped to a single OS
>thread.  If you're willing to give up the benefits of that (on, say,
>Windows 95), then it's not hard to do very efficient locking.

Why is this true?  Is it because on most OSes, OS calls are inherently
expensive?
I would think that if the OS wanted to make this efficient and it was a
priority for the users, the lock calls could be just as efficient (whether done
by the OS or done by the Ada runtime).  Anyway, I suspect that now that
many OSes are becoming multi-threaded and with the more common use of
multiple processors (even in the PC world), that the OSes will become more
friendly and efficient for us Ada compiler implementors.  The OSes are finally
catching up with Ada!

Kevin Heatwole
OC Systems, Inc.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-26  0:00               ` Geert Bosch
@ 1996-11-26  0:00                 ` Robert Dewar
  0 siblings, 0 replies; 29+ messages in thread
From: Robert Dewar @ 1996-11-26  0:00 UTC (permalink / raw)



Geert says, replying to Kevin:

Kevin D. Heatwole (heatwole@his.com) wrote:
 ``Is it really that much harder to provide an efficient ref-counted
   tasking safe implementation of unbounded strings than it is to
   provide an efficient tasking safe allocator? ''

No, it isn't. The only problems relate to the priority requirements of
Annex D.  When you want to use ceiling priority locks, it is necessary
to change and/or check priority of tasks which requires an expensive
system call.



Robert replies

That's incorrect, it is easy to think of strategies for writing task
safe allocators that greatly reduce the requirement for this expensive
locking. In particular, an obvious strategy is to cache a chunk of
storage for a task doing dynamic allocation, and use this cache to
satisfy local requests, so that the expensive locking is only required
to get a new chunk. We don't do that by default in GNAT, but it is
quite easy, using the storage pools mechanism and task attributes,
to write a storage pool that works that way and use it.

On the other hand, if you are forced into using locks for reference
counts, then no such tricks are at hand, and you will need the
expensive system call for every unbounded string assignment, which
is pretty fierce.

Note also that in a typical use of unbounded strings, you may do many
more assignments of UBS's than allocations, so even if you are using
the standard allocator which takes a lock, you pay that only on a new
string -- taking a lock on every assignment may be much worse.





^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-25  0:00               ` Norman H. Cohen
@ 1996-11-27  0:00                 ` Robert Dewar
  0 siblings, 0 replies; 29+ messages in thread
From: Robert Dewar @ 1996-11-27  0:00 UTC (permalink / raw)



Norman says

"What I would find friendliest is a run-time system with two versions of
Ada.Strings.Unbounded.  The version that uses reference counts but no
locks would only be linked in when I said
"


But the locks don't cost anything significant if no tasking is active,
at least this is true in the GNAT runtime (see the implementation of
System.Tasking_Soft_Links.





^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-26  0:00                   ` Robert A Duff
  1996-11-26  0:00                     ` Larry Kilgallen
  1996-11-27  0:00                     ` Norman H. Cohen
@ 1996-11-27  0:00                     ` Robert Dewar
  2 siblings, 0 replies; 29+ messages in thread
From: Robert Dewar @ 1996-11-27  0:00 UTC (permalink / raw)



Kevin D. Heatwole <heatwole@his.com> wrote:
>I suspect that even this approach would have some difficulties involving
>pool-specific (or global) locks.  While task A may allocate the storage
>out of its per-task pool of memory, task B may be the task that deallocates
>the memory later.  If you have two threads mucking about with the same
>pool of memory, you will need to implement some form of serialization.


That suspicion is wrong, there is for instance the familiar free storage
management technique (used in Microsoft Pascal, and in the x86 Alsys
implementation -- I know about the latter, I write it :-) where free simply
sets the free bit and NOTHING else, and recombination of blocks is done
entirely by the allocator. There are other techniques to avoid locking
for the free case.





^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-26  0:00                   ` Robert A Duff
  1996-11-26  0:00                     ` Larry Kilgallen
@ 1996-11-27  0:00                     ` Norman H. Cohen
  1996-11-29  0:00                       ` Fergus Henderson
  1996-11-29  0:00                       ` Robert A Duff
  1996-11-27  0:00                     ` Robert Dewar
  2 siblings, 2 replies; 29+ messages in thread
From: Norman H. Cohen @ 1996-11-27  0:00 UTC (permalink / raw)

Robert A Duff wrote:

> In article <heatwole-ya023180002611960850350001@news.his.com>,
> Kevin D. Heatwole <heatwole@his.com> wrote:
> >I suspect that even this approach would have some difficulties involving
> >pool-specific (or global) locks.  While task A may allocate the storage
> >out of its per-task pool of memory, task B may be the task that deallocates
> >the memory later.  If you have two threads mucking about with the same
> >pool of memory, you will need to implement some form of serialization.
> 
> If you implement your pools with free lists, then when you deallocate,
> you can put that chunk of memory back on the current tasks free list,
> despite the fact that it was allocated from some other task's free list.

That doesn't seem like a good idea.

Apparently you're assuming fixed-size allocations, so that there is
never a need to coalesce a freed block with neighboring free storage (or
else you are deferring the coalescing until the next allocation by the
task owning the free list).

More seriously, however, imagine a producer/consumer design in which the
producer task allocates and initializes an object and passes the access
value pointing to it to the consumer task.  The consumer task processes
the object and deallocates it.  In your scheme, the producer task will
constantly be requesting more and more storage blocks from the global
heap so that it can chop the blocks up into allocated objects; all of
the allocated objects eventually end up on the consumer task's free
list, but are never used again.

-- 
Norman H. Cohen
mailto:ncohen@watson.ibm.com
http://www.research.ibm.com/people/n/ncohen

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-27  0:00                     ` Norman H. Cohen
  1996-11-29  0:00                       ` Fergus Henderson
@ 1996-11-29  0:00                       ` Robert A Duff
  1 sibling, 0 replies; 29+ messages in thread
From: Robert A Duff @ 1996-11-29  0:00 UTC (permalink / raw)



In article <329C945E.6FCB@watson.ibm.com>,
Norman H. Cohen <ncohen@watson.ibm.com> wrote:
>That doesn't seem like a good idea.

I didn't mean it's *always* a good idea -- it depends on the program.

>Apparently you're assuming fixed-size allocations, so that there is
>never a need to coalesce a freed block with neighboring free storage (or
>else you are deferring the coalescing until the next allocation by the
>task owning the free list).

Yes, but there are a lot of programs that use fixed sizes, or a small
number of fixed sizes.

>More seriously, however, imagine a producer/consumer design in which the
>producer task allocates and initializes an object and passes the access
>value pointing to it to the consumer task.  The consumer task processes
>the object and deallocates it.  In your scheme, the producer task will
>constantly be requesting more and more storage blocks from the global
>heap so that it can chop the blocks up into allocated objects; all of
>the allocated objects eventually end up on the consumer task's free
>list, but are never used again.

Yes, if you always go to the global heap when you run out.  An
alternative is to steal memory from another task.  This means that you
have to lock the per-task heaps, but the scheme can still win, because
getting a lock that's not already locked can be made cheap, and the
scheme reduces that probability in many cases (though not in the case
you mention above).

- Bob




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation))
  1996-11-27  0:00                     ` Norman H. Cohen
@ 1996-11-29  0:00                       ` Fergus Henderson
  1996-11-29  0:00                       ` Robert A Duff
  1 sibling, 0 replies; 29+ messages in thread
From: Fergus Henderson @ 1996-11-29  0:00 UTC (permalink / raw)



"Norman H. Cohen" <ncohen@watson.ibm.com> writes:

>[...] imagine a producer/consumer design in which the
>producer task allocates and initializes an object and passes the access
>value pointing to it to the consumer task.  The consumer task processes
>the object and deallocates it.  In your scheme, the producer task will
>constantly be requesting more and more storage blocks from the global
>heap so that it can chop the blocks up into allocated objects; all of
>the allocated objects eventually end up on the consumer task's free
>list, but are never used again.

You could avoid this by keeping a count of the space in each task's
free list: each deallocation would check the amount of free space,
and if this is greater than some threshold, return most of the free
memory chunks to the global heap.

--
Fergus Henderson <fjh@cs.mu.oz.au>   |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>   |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3         |     -- the last words of T. S. Garp.




^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~1996-11-29  0:00 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1996-11-25  0:00 Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation)) Franco Mazzanti
1996-11-25  0:00 ` Robert A Duff
  -- strict thread matches above, loose matches on Subject: below --
1996-10-09  0:00 Once again, Ada absent from DoD SBIR solicitation Stanley R. Allen
1996-11-14  0:00 ` Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation) Robert Dewar
1996-11-16  0:00   ` Robert A Duff
1996-11-16  0:00     ` Robert Dewar
1996-11-17  0:00       ` Unbounded strings (Was: Java vs Ada 95 (Was Re: Once again, Ada absent from DoD SBIR solicitation)) Robert A Duff
1996-11-18  0:00         ` Robert Dewar
1996-11-19  0:00           ` Joel VanLaven
1996-11-23  0:00             ` Robert Dewar
1996-11-19  0:00           ` Robert A Duff
1996-11-23  0:00             ` Robert Dewar
1996-11-24  0:00               ` Robert A Duff
1996-11-25  0:00               ` Norman H. Cohen
1996-11-27  0:00                 ` Robert Dewar
1996-11-21  0:00         ` Robert I. Eachus
1996-11-24  0:00           ` Robert Dewar
1996-11-24  0:00             ` Fergus Henderson
1996-11-24  0:00               ` Robert Dewar
1996-11-25  0:00             ` Kevin D. Heatwole
1996-11-25  0:00               ` Robert A Duff
1996-11-26  0:00                 ` Kevin D. Heatwole
1996-11-26  0:00                   ` Robert A Duff
1996-11-26  0:00                     ` Larry Kilgallen
1996-11-27  0:00                     ` Norman H. Cohen
1996-11-29  0:00                       ` Fergus Henderson
1996-11-29  0:00                       ` Robert A Duff
1996-11-27  0:00                     ` Robert Dewar
1996-11-26  0:00               ` Geert Bosch
1996-11-26  0:00                 ` Robert Dewar
1996-11-25  0:00         ` Robert I. Eachus
1996-11-20  0:00       ` Jon S Anthony
1996-11-24  0:00         ` Robert Dewar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox