Date: 1 Dec 92 19:25:17 GMT
From: dog.ee.lbl.gov!overload.lbl.gov!agate!spool.mu.edu!caen!destroyer!cs.ubc.ca!mprgate.mpr.ca!lichen!janzen@ucbvax.Berkeley.EDU (Martin Janzen)
Subject: Re: Request for reuse tool info
Message-ID: <1992Dec1.192517.16082@mprgate.mpr.ca>

In article <1992Dec1.152321.29538@rti.rti.org>, wgr@rti.org (Bucky Ransdell) writes:
>I am investigating tools supporting software reuse. I'm interested in both
>tools targeted specifically at reuse and software engineering environments
>that support it as part of the larger development process. The key feature
>for my purposes is a repository for software assets supporting classification
>schemes for archiving, organizing, and retrieving reusable components.

One interesting approach would be to simply use a very fast text
retriever. I saw a demonstration recently of something called "Pat",
from Open Text Corporation (519-571-7111). It accepts a large database
of text in any format ("no exceptions", they say), creates a huge,
complex index, and then allows you to retrieve all occurrences of _any_
string in the text -- from a long string to a single character -- within
about two seconds; very impressive!

It's capable of dealing with structured text (such as code?), so that
when a search results in a "hit", you should be able to have it display
the containing function, source file, or other appropriate structure.

Since it was developed for use with the (580MB) Oxford English
Dictionary, it's able to deal with _huge_ amounts of text. But as you
can imagine, updates to the index structure are quite slow, so it's best
suited to large volumes of seldom-modified text. A code repository
sounds like a good application for this thing.

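To make the idea concrete, here is a rough sketch of how an
"any substring" index over a code repository could work. This is NOT a
description of Pat's internals, which weren't covered in the demo; it
assumes a plain suffix array, one well-known structure that supports
this kind of lookup. The function names (build_corpus,
build_suffix_array, find_all, containing_file) and the sample files are
made up for illustration.

```python
from bisect import bisect_right


def build_corpus(files):
    """Concatenate {name: source} into one text, recording each file's start offset."""
    names, starts, parts, pos = [], [], [], 0
    for name, source in files.items():
        names.append(name)
        starts.append(pos)
        parts.append(source + "\0")  # NUL separator so a match cannot span two files
        pos += len(source) + 1
    return "".join(parts), names, starts


def build_suffix_array(text):
    """Return the start offsets of all suffixes of text, in sorted order."""
    # Naive construction: fine for a sketch, far too slow for a huge corpus.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_all(text, sa, pattern):
    """Return the sorted offsets of every occurrence of pattern in text."""
    if not pattern:
        return []
    m = len(pattern)
    # Lower bound: first suffix whose first m characters are >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    # Upper bound: first suffix whose first m characters are > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


def containing_file(names, starts, offset):
    """Map a hit offset back to the (hypothetical) source file that contains it."""
    return names[bisect_right(starts, offset) - 1]


# Hypothetical two-file repository, for illustration only.
files = {
    "stack.c": "int push(int x);\nint pop(void);\n",
    "queue.c": "int enqueue(int x);\nint pop_front(void);\n",
}
text, names, starts = build_corpus(files)
sa = build_suffix_array(text)
for hit in find_all(text, sa, "pop"):
    print(containing_file(names, starts, hit), hit)
```

Each lookup is a pair of binary searches over the sorted suffixes, so
the length of the search string matters very little, which fits the
"long string or single character" behaviour in the demo. The cost is
all in building the index, and a naive build like this one has to be
redone from scratch whenever the text changes.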
I think that a full-text retrieval system would have a number of
advantages. It should eliminate much of the need for elaborate
classification schemes, keyword indexes, and so on. It would enable
programmers to search for reusable components using "keys" that were not
contemplated by the code librarian who set up the repository. Also,
there would be no danger of having the code and the classification and
indexing information grow "out of sync" with each other, since the index
can be regenerated as often as needed, directly from the code.

The usual caveats: This is based on a two-hour demonstration; I haven't
used "Pat" myself, much less built code repositories with it. But the
remarkable speed of this thing offers some interesting new
possibilities...

-- 
Martin Janzen                janzen@mprgate.mpr.ca (134.87.131.13)
MPR Teltech Ltd.             Phone: (604) 293-5309
8999 Nelson Way              Fax:   (604) 293-6100
Burnaby, BC, CANADA  V5A 4B5