From: bobduff@world.std.com (Robert A Duff)
Subject: Re: 'first of strings returned from a function should be 1?
Date: 1997/07/28
References: <5rcaqi$le8$1@goanna.cs.rmit.edu.au> <01bc9a76$459c2250$4c8371a5@dhoossr>
Organization: The World Public Access UNIX, Brookline, MA
Newsgroups: comp.lang.ada

In article , Matthew Heaney wrote:
>In article <01bc9a76$459c2250$4c8371a5@dhoossr>, "David C. Hoos, Sr."
> wrote:
>
>>From my own experience, I know that it is easy to write a poor function
>>returning a string result of which the first subscript is not 1,
>
>Poor is a relative term.  For some abstractions, returning a string whose
>lower index is not 1 might make more sense.

Ada would be a better language if there were a (convenient) way to
specify that the lower bound of an array is fixed, and that the upper
bound is not.  And if the predefined type String used that feature to
specify that the lower bound of all Strings is 1, whereas the upper
bound is different for different strings.

The current situation is error-prone: almost all strings start at 1, so
if you have some code that accidentally makes that assumption, it will
work fine, most of the time, but then will fail in rare cases (e.g. if
somebody decides to pass a slice).  (Bugs that happen all the time are
easy to notice and fix during testing -- bugs that happen rarely are the
ones that leak out to your customers.)

For the vast majority of strings, what you care about is the characters
(and how many there are) -- not what the bounds are.
(That's why string comparison ignores the bounds, and just uses the
lengths.  And why Put_Line("Hello, world.") doesn't print out the
bounds!)

>In the example I gave,
>
>function To_Uppercase (S : String) return String;
>
>one could argue that the return value should have the same bounds as the
>input string S (which doesn't necessarily have 1 as the lower bound).

One could also argue that the result should be 1..whatever.  Or that the
bounds should be an implementation detail of the function, and the
caller should make sure to use 'First and 'Last on the result.  The
problem here is that it's not clear what the right answer is.  And you
can only tell what choice was made by looking at the body of that
routine.  Or else trust the comments (which probably don't even exist,
and might lie).

If all strings started at 1, then the bounds of the result of
To_Uppercase would *necessarily* match the bounds of S.

>   function Subsequence
>     (Sequence : Root_Sequence;
>      First    : Positive;
>      Last     : Positive) return Sequence_Item_Array;
>
>Surely, the array returned by this function would have First, not 1, as its
>lower bound.

I disagree.  This is the same argument given in the Ada 83 rationale,
for why the lower bound of S(5..10) is 5.  The problem is that once the
slice (or Subsequence) has been taken, you've just got a String (or
sequence).  The user of that String shouldn't have to know that it came
from a slicing operation.  (E.g. consider passing S(5..10) to a
procedure.)

IMHO, in:

    X: String := ...;
    ...
    Y: String := X(5..10);

Y is its own string, and its first character should be numbered 1, just
like most other strings.  We shouldn't have to know or care that it was
copied out of the middle of some other string.  (Consider also
"Y: String := F(...);", where F says "return X(5..10);".)

Note that if you fix the lower bound at 1, you have the nice property
that all strings of a given length have matching bounds.
This is useful for looping through a pair of strings, even if the loop
uses 'First rather than hard-coding 1.

Besides, it would be more efficient to fix the lower bound of strings at
1.  Suppose we have a program where the average string length (for
strings in the heap) is 16 characters.  The compiler needs to store *at
least* 8 bytes of dope with each string -- a 50% overhead.  And loops
need to spend extra time fetching the lower bound, which is almost
always 1.

- Bob

P.S. I don't know what the right answer is to the original question,
which I would paraphrase as "Given that Ada has this problem, should we
work around it by making sure all string-producing code produces strings
with lower bound 1, or by making sure that all string-consuming code
works correctly for any bounds?  Or both?"  All I can say is that the
choice should be made as globally as possible, and that it should be
documented.
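
For readers who want to see the slice case concretely, here is a minimal
sketch of the two workarounds named in the P.S.  It is an illustration
only, not code from the thread: Slide, Naive_First and Safe_First are
hypothetical names.  Slide is a producer-side fix that renumbers its
result from 1, Safe_First is a consumer that works for any bounds, and
Naive_First is the kind of consumer that quietly assumes 'First = 1.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Slice_Bounds is

   --  Hypothetical producer-side fix: return a copy of S with bounds
   --  1 .. S'Length.  Converting to a constrained array subtype
   --  slides the bounds.
   function Slide (S : String) return String is
      subtype Normalized is String (1 .. S'Length);
   begin
      return Normalized (S);
   end Slide;

   --  Consumer that wrongly assumes S'First = 1.
   function Naive_First (S : String) return Character is
   begin
      return S (1);
   end Naive_First;

   --  Consumer-side fix: index relative to S'First.
   --  (Assumes S is not empty.)
   function Safe_First (S : String) return Character is
   begin
      return S (S'First);
   end Safe_First;

   X : constant String := "Hello, world.";
   Y : constant String := X (5 .. 10);   --  bounds are 5 .. 10
   Z : constant String := Slide (Y);     --  bounds are 1 .. 6

begin
   Put_Line ("Y'First =" & Integer'Image (Y'First));
   Put_Line ("Z'First =" & Integer'Image (Z'First));

   --  Predefined "=" on strings compares lengths and characters,
   --  not bounds, so Y and Z are equal.
   Put_Line ("Y = Z: " & Boolean'Image (Y = Z));

   Put_Line ("Safe_First (Y) = " & Safe_First (Y));
   Put_Line ("Naive_First (Z) = " & Naive_First (Z));

   --  The rare failing case: a slice passed to code that assumes 1.
   begin
      Put_Line ("Naive_First (Y) = " & Naive_First (Y));
   exception
      when Constraint_Error =>
         Put_Line ("Naive_First (Y) raised Constraint_Error");
   end;
end Slice_Bounds;
```

Naive_First works on X and Z and fails only on the slice Y, which is the
"works most of the time" pattern described above.  Whichever fix a
project picks, the point from the P.S. stands: apply it globally and
document it.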