From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level:
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	autolearn_force=no version=3.4.4
X-Google-Language: ENGLISH,ASCII-7-bit
X-Google-Thread: 103376,45b47ecb995e7a3
X-Google-Attributes: gid103376,public
X-Google-ArrivalTime: 2001-08-13 18:09:06 PST
Path: archiver1.google.com!newsfeed.google.com!newsfeed.stanford.edu!sn-xit-01!supernews.com!newshub2.rdc1.sfba.home.com!news.home.com!news1.rdc2.on.home.com.POSTED!not-for-mail
Message-ID: <3B787A30.F806DB00@home.com>
From: "Warren W. Gay VE3WWG"
X-Mailer: Mozilla 4.75 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
Newsgroups: comp.lang.ada
Subject: Re: Ada Idioms Progress Preview
References: <3B6F1B2F.4FC3C833@gsde.hou.us.ray.com> <5ee5b646.0108071819.6e84e33d@posting.google.com> <3_Xc7.45$NM5.84779@news.pacbell.net> <3B783712.88029BB8@worldnet.att.net>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Date: Tue, 14 Aug 2001 01:09:06 GMT
NNTP-Posting-Host: 24.141.193.224
X-Complaints-To: abuse@home.net
X-Trace: news1.rdc2.on.home.com 997751346 24.141.193.224 (Mon, 13 Aug 2001 18:09:06 PDT)
NNTP-Posting-Date: Mon, 13 Aug 2001 18:09:06 PDT
Organization: Excite@Home - The Leader in Broadband http://home.com/faster
Xref: archiver1.google.com comp.lang.ada:11881
Date: 2001-08-14T01:09:06+00:00
List-Id:

James Rogers wrote:
> Ole-Hjalmar Kristensen wrote:
> > One thing which can be said in favour of having a terminator character
> > is that it frees you from having to store the length explicitly. The
> > length of a string is usually different from the size of the array
> > used to store the string.
> > So, in a sense a C string is more self-describing than a plain Ada
> > string.
> > Of course, as soon as you call a procedure, you can use a slice, but
> > you still need the actual length to decide which slice.
> >
> > On the balance, I would rather have Ada strings.
>
> An old saying is "there is no free lunch". In other words, nothing
> comes for free. In the case of a C string, you do not explicitly
> carry around the length of a string. Instead, you rely on a convention
> stating that the logical end of the string is indicated by a null
> character.
>
> The C approach presents two very real costs:
>
> 1) You must serially read the string to find the terminating null
>    character. This operation is very expensive if you only need to
>    determine the length of the string.

To be honest, it works reasonably well for C/C++ because most strings
in a program tend to be short (of course this varies by application!).
It would be nice to have someone sample some open source packages
and come up with an average length, but I suspect that it would be
short enough. For _some_ C strings/applications, though, this could
be a significant overhead factor.

> 2) Sometimes the null character is omitted. Since C arrays are
>    unbounded, this causes your program to read beyond the end of
>    the string until it finds a null character. The resulting
>    length will be incorrect.

I don't really want to support C/C++, but we should be careful about
what is being said here. It's really only a problem if you _need_
a nul byte at the end. There are C programs that work with fixed-size
strings, like Ada, though this tends to be rarer (it is sometimes done
with embedded SQL/C). If you then need to pass the fixed string
to printf() or another function that expects a "C string", then yes,
this becomes a problem (just as it does for Ada supplying a
string to C).

>    When copying or editing a string
>    this problem will result in data corruption and undefined
>    behaviors.

Again, this is not necessarily true, but it does happen if the
C programmer is not careful. If the programmer instead uses strncpy(),
for example, where the maximum size of the destination array
is given, then this does not happen.
However, if you strncpy() the maximum number of characters, you don't
get a nul byte at the end. Novice C programmers often miss this
subtle point ;-)  One way to avoid this is to use the following
technique with strncpy():

    #include <string.h>

    #define BUF_LEN 8

    void func(const char *in_str_with_maybe_no_null) {
        char my_buf[BUF_LEN];

        /* copy at most BUF_LEN-1 chars, then force the final nul byte */
        strncpy(my_buf,in_str_with_maybe_no_null,BUF_LEN-1)[BUF_LEN-1] = 0;
        /* ... */
    }

This restricts the copy to BUF_LEN-1 characters + 1 guaranteed
nul byte. It works because the strncpy() function returns the (char *)
pointer to my_buf, so the statement ends with the assignment:

    my_buf[BUF_LEN-1] = 0;

You can do this in a much less cryptic way, but I have found it
useful in C programs, and it takes up less screen real-estate
this way ;-)

> Another less common cost occurs when copying C strings. The most
> efficient copy operation for C arrays is the memcpy function.
> This function allows you to copy blocks of memory efficiently.
> If you try to use memcpy to copy strings you will find some
> real problems. In those cases you want to copy the actual array
> of characters, not just the logical string contained in it.

Have you said this in reverse? Normally you don't want to copy
anything beyond the nul byte, unless you're copying fixed-length
arrays of characters without treating nul as a special marker.

> The problem is that the C sizeof operator does not report the
> correct size of arrays outside the immediate scope where they
> are declared.

OK, you're talking about passing an array into a C function when the
array is declared outside that function. Something like:

    void func2(char *str) {
        // what is the array size of str?
    }

    void func1() {
        char my_array[31];

        func2(my_array);
    }

This is a weakness, but if you know that func2() should work with
fixed-length arrays of a certain size, you can use:

    void func2(char str[31]) {
        // what is the array size of str?
    }

instead. Bear in mind that the 31 only documents the intent: the
compiler still treats str as a plain char * and does not check the
size. So I agree that this is feeble compared to the way Ada
passes array bounds information.
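
For what it's worth, the usual C workaround is to pass the size
along explicitly, taking it with sizeof at the one place where the
array is still visible as an array. A minimal sketch of that idea,
assuming two made-up helpers, copy_bounded() and func1_demo() (my
names for illustration, not from any library):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, which holds dst_size bytes.
   Truncates if src is too long and always leaves a nul byte at the end. */
char *copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    strncpy(dst, src, dst_size - 1);   /* at most dst_size-1 characters */
    dst[dst_size - 1] = '\0';          /* strncpy() may not have added one */
    return dst;
}

/* Hypothetical caller: my_array is a real array in this scope, so
   sizeof my_array is 31 here and can be handed to the callee. */
size_t func1_demo(const char *input)
{
    char my_array[31];

    copy_bounded(my_array, sizeof my_array, input);
    return strlen(my_array);           /* never more than 30 */
}
```

With this, func1_demo("hello") gives 5, and any input longer than
30 characters is cut off at 30. It is still the programmer who has
to keep the array and its size together, which is exactly the
bookkeeping Ada does for you.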
> Instead you will only get the size of the pointer
> to the first element of the array.

OK, it sounds like you're suggesting the following:

    void my_func(char *str) {
        int slen = sizeof str;  // which does not make sense
    }

But this is nonsense anyway - no self-respecting C programmer would do
this, because you are obviously asking for the size of the pointer ;-)

Be aware, though, that declaring the parameter as an array does not
change the result:

    void my_func(char str[31]) {
        int array_len = sizeof str; // still sizeof(char *): an array parameter is adjusted to a pointer
    }

so inside the function the array size has to come from somewhere else.

> Therefore, to efficiently
> copy C strings using memcpy you must provide a second "length"
> argument, which may not be readily available.
>
> Jim Rogers
> Colorado Springs, Colorado USA

I'm not sure what you're pointing to here, but to copy the string
"efficiently" you must have either the assurance of a nul byte (so that
strcpy() can stop copying when it hits it) or a specified length
(for memcpy()). You might need both if you use strncpy().

But if it is a "C string", then it does have a nul byte, and it is
efficient to copy (copying stops at the nul byte - memcpy() is not
used to implement strncpy()).

I don't want to defend C, but we should be correct about its
defects when we criticize C/C++. Otherwise, Ada programmers
lose respect ;-)

-- 
Warren W. Gay VE3WWG
http://members.home.net/ve3wwg