From mboxrd@z Thu Jan  1 00:00:00 1970
Date: 1 Dec 92 19:25:17 GMT
From: dog.ee.lbl.gov!overload.lbl.gov!agate!spool.mu.edu!caen!destroyer!cs.ubc.ca!mprgate.mpr.ca!lichen!janzen@ucbvax.Berkeley.EDU  (Martin Janzen)
Subject: Re: Request for reuse tool info
Message-ID: <1992Dec1.192517.16082@mprgate.mpr.ca>
List-Id: <comp.lang.ada>

In article <1992Dec1.152321.29538@rti.rti.org>, wgr@rti.org (Bucky Ransdell) writes:
>I am investigating tools supporting software reuse.  I'm interested in both
>tools targeted specifically at reuse and software engineering environments
>that support it as part of the larger development process.  The key feature
>for my purposes is a repository for software assets supporting classification
>schemes for archiving, organizing, and retrieving reusable components.

One interesting approach would be to simply use a very fast text retriever.
I saw a demonstration recently of something called "Pat", from Open Text
Corporation (519-571-7111).  It accepts a large database of text in any
format ("no exceptions", they say), creates a huge, complex index, and
then allows you to retrieve all occurrences of _any_ string in the text --
from a long string to a single character -- within about two seconds; very
impressive!  It's capable of dealing with structured text (such as code?),
so that when a search results in a "hit", you should be able to have it
display the containing function, source file, or other appropriate
structure.  Since it was developed for use with the (580MB) Oxford English
Dictionary, it's able to deal with _huge_ amounts of text.  But as you can
imagine, updates to the index structure are quite slow, so it's best suited
to large volumes of seldom-modified text.  A code repository sounds like
a good application for this thing.
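
To make the idea concrete, here is a rough sketch of one well-known way to
get "any substring, fast" behaviour: a suffix array.  I don't know what
index structure "Pat" actually uses, so treat this as an assumption-laden
illustration, not a description of the product.  The names
build_suffix_array, find_all, and containing_line are made up for this
example, and "containing structure" is simplified to "containing line".

```python
def build_suffix_array(text):
    # Naive construction: sort every suffix start position by the suffix
    # it begins. Building is the expensive step; queries are cheap.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_all(text, sa, pattern):
    # Suffixes that start with `pattern` sit next to each other in the
    # sorted array, so two binary searches find the whole block.
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-character prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose m-character prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


def containing_line(text, pos):
    # Stand-in for "display the containing structure": return the line
    # that holds the character at offset `pos`.
    start = text.rfind("\n", 0, pos) + 1
    end = text.find("\n", pos)
    if end == -1:
        end = len(text)
    return text[start:end]


text = "procedure Push;\nprocedure Pop;\n"
sa = build_suffix_array(text)
hits = find_all(text, sa, "procedure")
lines = [containing_line(text, h) for h in hits]
```

Each query costs a number of string comparisons that grows with the
logarithm of the text size, while any change to the text means rebuilding
the sorted array.  That is the same trade-off as above: fast retrieval,
slow updates.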

I think that a full-text retrieval system would have a number of advantages.
It should eliminate much of the need for elaborate classification schemes,
keyword indexes, and so on.  It would enable programmers to search for
reusable components using "keys" that were not contemplated by the code
librarian who set up the repository.  Also, there would be no danger of
having the code and the classification and indexing information grow "out
of sync" with each other, since the index can be regenerated as often as
needed, directly from the code.
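
As a sketch of what "regenerated directly from the code" could look like:
walk the source tree, glue the files into one searchable text, and remember
where each file starts so that a hit can be mapped back to its source file.
Everything here is hypothetical (the function names, the choice of ".ads"
and ".adb" suffixes, the NUL separator); it assumes nothing about how "Pat"
ingests files, and the plain str.find call stands in for a real index.

```python
import os
from bisect import bisect_right


def build_corpus(root, suffixes=(".ads", ".adb")):
    # Concatenate every matching file under `root` into one string and
    # record the offset at which each file begins.
    parts, starts, names = [], [], []
    offset = 0
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            if not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                body = f.read()
            starts.append(offset)
            names.append(path)
            # A NUL separator keeps a match from spanning two files.
            parts.append(body + "\0")
            offset += len(body) + 1
    return "".join(parts), starts, names


def file_at(starts, names, pos):
    # Map a character offset in the corpus back to the file it came from.
    return names[bisect_right(starts, pos) - 1]
```

Since the corpus is rebuilt from the files themselves on every run, there is
no separate catalogue to fall out of step with the code.  For example,
file_at(starts, names, text.find("Push")) names the file containing the
first occurrence of "Push", provided there is one.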

The usual caveats:  This is based on a two-hour demonstration; I haven't
used "Pat" myself, much less built code repositories with it.  But the
remarkable speed of this thing offers some interesting new possibilities...

-- 
Martin Janzen                     janzen@mprgate.mpr.ca (134.87.131.13)
MPR Teltech Ltd.                  Phone: (604) 293-5309
8999 Nelson Way                   Fax: (604) 293-6100
Burnaby, BC, CANADA  V5A 4B5