From: Niklas Holsti <niklas.holsti@nospam.please>
Subject: Equivalent keys/elements in Ada.Containers Maps and Sets
Date: Tue, 23 Jan 2007 14:56:16 +0200
Date: 2007-01-23T14:56:16+02:00 [thread overview]
Message-ID: <45b60602$0$24602$39db0f71@news.song.fi> (raw)
To learn about Ada.Containers in Ada 2005 I have been reading the LRM
section A.18 and John Barnes' Rationale. Some details about Maps and
Sets are puzzling me enough to make me consider submitting a couple of
comments to ada-auth. As I'm a novice in Ada.Containers I thought I
would first ask here, on comp.lang.ada, in case I am misunderstanding
things or raking up issues that have been discussed before. I looked
through the Containers-related Ada Issues for Ada 2005 on ada-auth.org
but found none that seemed relevant.
The points I want to ask about are so technical that I felt it necessary
to make this posting rather long and detailed.
-- Background
The generic container packages for Maps and Sets deal with a concept
called "equivalent keys" or "equivalent elements". This equivalence is
called an "equivalence relation" in RM A.18.4(5/2) for Maps and RM
A.18.7(5/2) for Sets. An equivalence relation is defined as a reflexive,
symmetric and transitive relation: any key or element X is equivalent to
itself; if X is equivalent to Y then Y is equivalent to X; and if X and
Y are equivalent and Y and Z are equivalent then X and Z are equivalent.
I think I understand the idea, which is that two different values of the
formal types Key_Type (for Maps) or Element_Type (for Sets) may still be
similar enough (that is, "equivalent") to count as the same key or the
same element when considered as (unique) keys to a map or as (unique)
elements of a set. This is analogous to user-defined relational
operators that only inspect meaningful components of their parameters.
-- First problem: "equivalence" in Ordered Maps and Sets
For hashed Maps and Sets, the actual "equivalence" is directly defined
at instantiation time by the actual functions associated to the formal
functions Equivalent_Keys (for Hashed_Maps) and Equivalent_Elements (for
Hashed_Sets). The actual functions are required to be equivalence
relations in A.18.5(44/2) and A.18.8(66/2). I have no problem with that.
For the Ordered_Maps and Ordered_Sets, however, the "equivalence" is
defined *indirectly* by the actual ordering functions associated to the
formal functions "<" on Keys or Elements, respectively, at instantiation
time. Two keys/elements X and Y are defined to be "equivalent" if they
are incomparable, that is, if both X < Y and Y < X are False. This is in
A.18.6(55/2) and A.18.9(78/2).
The first problem I see is that the "equivalence" relation defined in
this way from "<" is *not* necessarily an equivalence relation because
it may not be transitive. It is obviously symmetric because "and" is
commutative. It is reflexive because the actual for "<" is required to
be a strict ordering and thus irreflexive. However, one can easily make
an example of a strict ordering "<" in which the "incomparable" relation
is not transitive; say we have three values X, Y, Z where the only
relation is X < Z; thus X and Y are incomparable and Y and Z are
incomparable, but X and Z are not incomparable.
So what? What if this "equivalence" is not transitive and is thus not
really an equivalence (without quotes)? Well, firstly I think it is
quite confusing to call something an "equivalence" when it is not
transitive, but secondly the semantics of the Map and Set operations are
defined in terms of "equivalent" keys or elements, and this can lead to
surprising results if the "equivalence" relation is not transitive.
For example, by A.18.7(22/2) the function Equivalent_Sets reports two
sets S1 and S2 as equivalent if they have the same number of elements
and if for every element E1 of S1 there is an element E2 of S2 such that
E1 and E2 are "equivalent". (Note the asymmetry of this definition: it
"traverses" S1 and "searches" for equivalent elements in S2.)
Now, if "equivalence" of elements is not transitive, we can have two
sets S1 and S2 such that Equivalent_Sets (S1, S2) is True even if S2
contains an element that is not "equivalent" to any element in S1.
This strange result is possible because two different elements of S1 can
be "equivalent" to the *same* element of S2, leaving one or more
elements of S2 unpaired with any "equivalent" elements of S1.
A consequence is that Equivalent_Sets is not necessarily symmetric; it
can return True for parameters (S1, S2) but False for (S2, S1).
If "equivalence" of elements is transitive, this strange result cannot
happen. If two different elements E1 and E1' of S1 are "equivalent" to
the same element E2 of S2, then by transitivity E1 and E1' are
"equivalent" to each other, which is impossible because no Set can
contain two or more "equivalent" elements, by A.18.6(5/2).
I haven't looked for similar strange results from the other operations
because I think that the case of Equivalent_Sets is strange enough.
There is also a problem of definition. The definition of the "="
operator for Maps in A.18.4(22/2, 23/2) implicitly assumes that each key
in the Left map is equivalent to at most one key in the Right map. This
assumption is false if "equivalence" of keys is not transitive; then
there may be many "equivalent" keys in the Right map that could be
matched against the same key in the Left map, so the definition of "="
for Maps is ambiguous.
To avoid this problem I think that the RM paragraphs that specify the
properties required or expected of the actual "<" functions should be
extended to say that the "incomparable" relation must be transitive.
This is A.18.6(56/2) and A.18.9(79/2).
-- Second problem: "=" for Element_Type in Sets
All generic packages for Maps and Sets take a generic formal "="
operator for the Element_Type. According to the Annotated LRM it is
meant to be used only in the "=" operator for Maps and Sets. The actual
"=" function is required to be deterministic, symmetric and reflexive.
This is all understandable and natural.
For Sets, however, I find it puzzling that no relationship is required
or expected between this "=" and the other formal functions
Equivalent_Elements, Hash and "<". For example, in the case of two
elements E1 and E2 in an Ordered_Set it would be very natural to require
that if E1 "=" E2 then we cannot also have E1 "<" E2 or E2 "<" E1, in
other words we should have E1 "equivalent" to E2.
One point where this may be significant is the definition of the "="
operator on Sets in A.18.7(20/2). The definition requires each element
of the Left set to be compared for "=" with the elements of the Right
set. If the actual "=" function is unrelated to the Hash function or to
the "<" operator for elements, I don't see how the "=" operator on Sets
can be implemented in less than O(N**2) complexity.
Moreover, the "=" operator for Sets as now defined can report two sets
S1 and S2 as "=" although S2 contains elements that are not "=" to any
elements of S1. This is because the definition is asymmetric (it
"traverses" S1 and "searches" S2), so two different elements of S1 can
be "=" to the same element of S2, leaving some elements of S2 unmatched
with elements of S1.
I think the definition of "=" for Sets expects the implementation to
compare only "equivalent" elements of the Left and Right Sets. If
"equivalence" is transitive this means that each element of Left can be
compared with "=" to exactly one element of Right (or, for Hashed_Sets,
to the elements in one bucket, if the implementation prefers to use "="
directly and not first check Equivalent_Elements).
I know that there are no RM requirements or advice on the complexity of
the "=" operator for Sets, but under the current rules I think that the
natural, fast implementation that only compares the equivalent elements
is wrong.
To solve this problem I think that A.18.6(3/2) should be extended to
also require or expect that two "=" elements are "equivalent".
For Hashed_Sets this means that if E1 "=" E2 then
Equivalent_Elements(E1,E2) must be True. A consequence is that Hash(E1)
= Hash(E2).
For Ordered_Sets this means that if E1 "=" E2 then both E1 "<" E2 and
E2 "<" E1 must be False.
-- Last words
Should I submit these comments to ada-comment@ada-auth.org? Or were the
issues already considered in the design of Ada.Containers? I would be
grateful for your advice.
--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
. @ .
next reply other threads:[~2007-01-23 12:56 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-01-23 12:56 Niklas Holsti [this message]
2007-01-23 15:45 ` Equivalent keys/elements in Ada.Containers Maps and Sets Matthew Heaney
2007-01-23 16:47 ` Matthew Heaney
2007-01-23 23:04 ` Niklas Holsti
2007-01-24 0:43 ` Randy Brukardt
2007-01-24 7:52 ` Niklas Holsti
2007-01-24 21:06 ` Randy Brukardt
2007-01-24 16:50 ` Matthew Heaney
2007-01-24 21:18 ` Randy Brukardt
2007-01-24 21:27 ` Adam Beneschan
2007-01-24 22:50 ` Matthew Heaney
2007-01-25 2:22 ` Adam Beneschan
2007-01-26 0:08 ` Georg Bauhaus
2007-01-24 20:56 ` Adam Beneschan
2007-01-24 20:57 ` Adam Beneschan
2007-01-24 22:56 ` Matthew Heaney
2007-01-25 0:44 ` Randy Brukardt
2007-01-24 16:11 ` Matthew Heaney
2007-01-24 18:12 ` Niklas Holsti
replies disabled
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox