From mboxrd@z Thu Jan  1 00:00:00 1970
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,5c1c45943bf6a5bc
X-Google-Attributes: gid103376,public
From: bobduff@world.std.com (Robert A Duff)
Subject: Re: 'first of strings returned from a function should be 1?
Date: 1997/07/28
Message-ID: <EE1DJ2.LnI@world.std.com>#1/1
X-Deja-AN: 259816172
References: <5rcaqi$le8$1@goanna.cs.rmit.edu.au>
 <01bc9a76$459c2250$4c8371a5@dhoossr>
 <mheaney-ya023680002707971114240001@news.ni.net>
Organization: The World Public Access UNIX, Brookline, MA
Newsgroups: comp.lang.ada
Date: 1997-07-28T00:00:00+00:00
List-Id: <comp.lang.ada>


In article <mheaney-ya023680002707971114240001@news.ni.net>,
Matthew Heaney <mheaney@ni.net> wrote:
>In article <01bc9a76$459c2250$4c8371a5@dhoossr>, "David C. Hoos, Sr."
><david.c.hoos.sr@ada95.com> wrote:
>
>
>>From my own experience, I know that it is easy to write a poor function
>>returning a string result of which the first subscript is not 1, 
>
>Poor is a relative term.  For some abstractions, returning a string whose
>lower index is not 1 might make more sense.

Ada would be a better language if there were a (convenient) way to
specify that the lower bound of an array is fixed and the upper
bound is not, and if the predefined type String used that feature to
fix the lower bound of all Strings at 1 while letting the upper
bound differ from string to string.

The current situation is error-prone: almost all strings start at 1,
so if you have some code that accidentally makes that assumption, it
will work fine, most of the time, but then will fail in rare cases
(e.g. if somebody decides to pass a slice).  (Bugs that happen all
the time are easy to notice and fix during testing -- bugs that
happen rarely are the ones that leak out to your customers.)
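
A minimal sketch of that failure mode, assuming a hypothetical helper
First_Char that hard-codes index 1 (the names are made up for
illustration, and index checks are assumed to be enabled):

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Slice_Bug is

   --  Hypothetical helper that wrongly assumes S'First = 1.
   function First_Char (S : String) return Character is
   begin
      return S (1);
   end First_Char;

   X : constant String := "Hello, world.";

begin
   --  Works: X'First is 1.
   Put (First_Char (X));
   New_Line;

   --  Fails the index check: the slice has bounds 8 .. 12, so S (1)
   --  is out of range inside First_Char.
   Put (First_Char (X (8 .. 12)));
   New_Line;
exception
   when Constraint_Error =>
      Put_Line ("Constraint_Error: the slice starts at 8, not 1");
end Slice_Bug;
```

Writing S (S'First) instead of S (1) makes the helper work for any
bounds.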

For the vast majority of strings, what you care about is the
characters (and how many there are) -- not what the bounds are.
(That's why string comparison ignores the bounds, and just uses the
lengths.  And why Put_Line("Hello, world.") doesn't print out the
bounds!)
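
As a small illustration (a sketch, with made-up names), a slice with
bounds 8 .. 12 compares equal to a literal with bounds 1 .. 5, because
"=" on arrays looks only at the lengths and the components:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Compare_Demo is
   X : constant String := "Hello, world.";
   W : constant String := "world";         --  bounds 1 .. 5
begin
   --  X (8 .. 12) has bounds 8 .. 12, yet the comparison succeeds.
   if X (8 .. 12) = W then
      Put_Line ("equal, although the bounds differ");
   end if;
end Compare_Demo;
```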

>In the example I gave, 
>
>function To_Uppercase (S : String) return String;
>
>one could argue that the return value should have the same bounds as the
>input string S (which doesn't necessarily have 1 as the lower bound).

One could also argue that the result should be 1..whatever.  Or that
the bounds should be an implementation detail of the function, and
the caller should make sure to use 'First and 'Last on the result.
The problem here is that it's not clear what the right answer is.
And you can only tell what choice was made by looking at the body of
that routine.  Or else trust the comments (which probably don't even
exist, and might lie).  If all strings started at 1, then the bounds
of the result of To_Uppercase would *necessarily* match the bounds
of S.
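
To make the ambiguity concrete, here is a sketch of two bodies that
both fit the same profile (S : String) return String.  The names
Upper_Same_Bounds and Upper_From_One are hypothetical; nothing in the
profile tells the caller which one he got:

```ada
with Ada.Text_IO;             use Ada.Text_IO;
with Ada.Characters.Handling; use Ada.Characters.Handling;

procedure Upper_Bounds is

   --  Choice A: the result has the same bounds as S.
   function Upper_Same_Bounds (S : String) return String is
      Result : String (S'Range);
   begin
      for I in S'Range loop
         Result (I) := To_Upper (S (I));
      end loop;
      return Result;
   end Upper_Same_Bounds;

   --  Choice B: the result always has lower bound 1.
   function Upper_From_One (S : String) return String is
      Result : String (1 .. S'Length);
   begin
      for I in Result'Range loop
         Result (I) := To_Upper (S (S'First + I - 1));
      end loop;
      return Result;
   end Upper_From_One;

   X : constant String := "Hello, world.";
   A : constant String := Upper_Same_Bounds (X (8 .. 12));
   B : constant String := Upper_From_One (X (8 .. 12));

begin
   --  Same characters, different bounds.
   Put_Line (A & " starts at" & Integer'Image (A'First));
   Put_Line (B & " starts at" & Integer'Image (B'First));
end Upper_Bounds;
```

A caller that uses 'First and 'Last on the result works with either
body; a caller that indexes from 1 works only with the second.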

>   function Subsequence
>     (Sequence : Root_Sequence;
>      First           : Positive;
>      Last            : Positive) return Sequence_Item_Array;
>
>Surely, the array returned by this function would have First, not 1, as its
>lower bound.

I disagree.

This is the same argument given in the Ada 83 rationale, for why the
lower bound of S(5..10) is 5.  The problem is that once the slice (or
Subsequence) has been taken, you've just got a String (or sequence).
The user of that String shouldn't have to know that it came from a
slicing operation.  (E.g. consider passing S(5..10) to a procedure.)
IMHO, in:

    X: String := ...;
    ...
    Y: String := X(5..10);

Y is its own string, and its first character should be numbered 1,
just like most other strings.  We shouldn't have to know or care
that it was copied out of the middle of some other string.
(Consider also "Y: String := F(...);", where F says "return
X(5..10);".)
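
For reference, a sketch of what the language does today, plus one way
to get a 1-based copy by converting to a constrained subtype (the
subtype name From_One is made up for this example):

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Slice_Bounds is
   X : constant String := "Hello, world.";
   Y : constant String := X (5 .. 10);   --  Y keeps the slice's bounds

   --  Converting to a constrained subtype slides the bounds to 1 .. 6.
   subtype From_One is String (1 .. Y'Length);
   Z : constant String := From_One (Y);
begin
   Put_Line ("Y'First =" & Integer'Image (Y'First));   --  Y'First = 5
   Put_Line ("Z'First =" & Integer'Image (Z'First));   --  Z'First = 1
end Slice_Bounds;
```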

Note that if you fix the lower bound at 1, you have the nice
property that all strings of a given length have matching bounds.
This is useful for looping through a pair of strings, even if the
loop uses 'First rather than hard-coding 1.
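
A sketch of such a loop as it has to be written today, assuming a
hypothetical function Matches whose two parameters have equal lengths
but possibly different bounds:

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Pair_Loop is

   --  Hypothetical helper: counts the positions at which A and B hold
   --  the same character.  Assumes A'Length = B'Length.
   function Matches (A, B : String) return Natural is
      Count : Natural := 0;
   begin
      for I in A'Range loop
         --  B (I) alone would be wrong, or would raise
         --  Constraint_Error, whenever A'First /= B'First, so the
         --  index has to be rebased.
         if A (I) = B (I - A'First + B'First) then
            Count := Count + 1;
         end if;
      end loop;
      return Count;
   end Matches;

   X : constant String := "Hello, world.";

begin
   --  Two five-character slices with different bounds.
   Put_Line (Natural'Image (Matches (X (1 .. 5), X (9 .. 13))));
end Pair_Loop;
```

If all strings started at 1, the body of the loop could simply compare
A (I) with B (I).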

Besides, it would be more efficient to fix the lower bound of strings at
1.  Suppose we have a program where the average string length (for
strings in the heap) is 16 characters.  The compiler needs to store *at
least* 8 bytes of dope with each string -- a 50% overhead.  And loops
need to spend extra time fetching the lower bound, which is almost
always 1.

- Bob

P.S. I don't know what the right answer is to the original question,
which I would paraphrase as "Given that Ada has this problem, should
we work around it by making sure all string-producing code produces
strings with lower bound 1, or by making sure that all
string-consuming code works correctly for any bounds?  Or both?"
All I can say is that the choice should be made as globally as
possible, and that it should be documented.