From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level: 
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,1fb7f2283a6b04a0
X-Google-Attributes: gid103376,public
X-Google-ArrivalTime: 2003-10-30 14:17:27 PST
Path: 
 archiver1.google.com!news2.google.com!fu-berlin.de!uni-berlin.de!82-43-33-75.cable.ubr01.croy.blueyonder.co.UK!not-for-mail
From: Nick Roberts <nick.roberts@acm.org>
Newsgroups: comp.lang.ada
Subject: Re: Grace 0.51 released
Date: Thu, 30 Oct 2003 22:17:18 +0000
Message-ID: <bns2li$14nnjr$1@ID-25716.news.uni-berlin.de>
References: <u4qxuei06.fsf@nasa.gov>
 <bnkcj9$126vjr$1@ID-25716.news.uni-berlin.de>
 <x7v3cdbg2x1.fsf@smaug.pushface.org>
NNTP-Posting-Host: 82-43-33-75.cable.ubr01.croy.blueyonder.co.uk (82.43.33.75)
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Trace: news.uni-berlin.de 1067552242 38526587 82.43.33.75 (16 [25716])
User-Agent: Mozilla/5.0 (Windows; U; Win95; en-US;
 rv:1.5) Gecko/20031013 Thunderbird/0.3
X-Accept-Language: en-us, en
In-Reply-To: <x7v3cdbg2x1.fsf@smaug.pushface.org>
Xref: archiver1.google.com comp.lang.ada:1859
Date: 2003-10-30T22:17:18+00:00
List-Id: <comp.lang.ada>

Simon Wright wrote:

>> [Tenet implementation will use clustering]

> I would be fairly amazed if anyone produced such an implementation in
> a general library without needing it themselves (certainly I won't for
> the BCs, partly at least because I feel I don't understand the concept
> totally!)
> 
> I guess what you have in mind is some performance requirement; state
> that, maybe Charles or the BCs or whatever will do what you want,
> maybe not ..

Clustering is applicable to unbounded lists and unbounded maps (which I 
call 'lookups'). As with other implementations, I intend to implement these 
(in Tenet) by a (doubly) linked list and a linked tree respectively. 
However, rather than having just one data element per node, as with usually 
implementations, I will have a small array -- of fixed length -- for each 
node, so that several data elements can be stored in each node (in the 
array). This array is called a 'cluster array', and the nodes are called 
'clusters'.

The basic idea is that by varying the length of the cluster array, it is 
possible to adjust the trade off between higher memory efficiency (longer 
arrays) and speed of operations (shorter arrays). At one extreme, the array 
might have length 1; obviously this is the same as the usual 
implementation. At the other extreme, one might have a very big array length.

For linked lists, each cluster (node) will be able to store the number of 
data elements actually stored in its cluster array (from 1 to N, where N is 
the cluster array length). In this way, arbitrary insertion and deletion of 
elements can be achieved by the appropriate combination of inserting or 
deleting clusters (nodes), and inserting or deleting data elements within 
cluster arrays (by moving items and/or adjusting the dynamic length).

For lookups (maps), data elements are stored within cluster arrays sorted 
by key. Insertion requires finding the correct position in the array (by 
binary-chop search), and copying up the remainder. If the array is full, a 
new cluster (node) has to be created. Deletion works the reverse way. The 
tree is kept balanced using a standard AVL or RB algorithm.

I do indeed wish to use clustering for a particular application (which will 
be called TDL, but that's another story). However, I feel that it is a very 
general-purpose technique, suitable for most applications, and that it 
should enable applications to achieve the right balance between memory and 
speed efficiency.

Are there any existing container implementations which provide clustering?

-- 
Nick Roberts