comp.lang.ada
From: "Warren W. Gay VE3WWG" <ve3wwg@home.com>
Subject: Re: Ada Idioms Progress Preview
Date: Tue, 14 Aug 2001 14:03:08 GMT
Message-ID: <3B792F9C.49A0FC7D@home.com>
In-Reply-To: 3B78C290.4DD088A8@worldnet.att.net

James Rogers wrote:
> "Warren W. Gay VE3WWG" wrote:
> > James Rogers wrote:
> > > An old saying is "there is no free lunch". In other words, nothing
> > > comes for free. In the case of a C string, you do not explicitly
> > > carry around the length of a string. Instead, you rely on a convention
> > > stating that the logical end of the string is indicated by a null
> > > character.
> > >
> > > The C approach presents two very real costs:
> > >
> > > 1) You must serially read the string to find the terminating null
> > >    character. This operation is very expensive if you only need to
> > >    determine the length of the string.
> >
> > To be honest, it works reasonably well for C/C++ because most strings in
> > a program tend to be short (of course this varies by application!) It would
> > be nice to have someone sample some Open Sourced packages and come
> > up with an average length, but I suspect that it would
> > be short enough. It is true for _some_ C strings/applications, that this
> > could be a significant overhead factor.
> 
> I would not like to make a claim about the "average" size of strings
> in C applications. I suspect the size varies quite a bit.

I was simply making a point, one which I am sure you understand. ;-)

> > > 2) Sometimes the null character is omitted. Since C arrays are
> > >    unbounded, this causes your program to read beyond the end of
> > >    the string until it finds a null character. The resulting
> > >    length will be incorrect.
> >
> > I'm not really wanting to support C/C++, but we should be careful
> > about what is being said here. It's really only a problem if you
> > _need_ a nul byte at the end. There are C programs that work with
> > fixed sized strings, like Ada, though this tends to be rarer (it
> > sometimes is done with embedded SQL/C). If you then need to pass
> > the fixed string to a printf() or other function that expects a
> > "C string", then yes, this then becomes a problem (just as it does
> > for Ada supplying a string for C).
> 
> The definition of a C string is a null terminated array of characters.
> C functions that do not require the null termination do not
> actually use strings. They merely use arrays of characters. This may
> seem like a subtle point, but it is critical. Functions expecting
> a C string for an argument absolutely rely on the existence of the
> nul byte at the end of the logical string.

Not every function, whether user-written or from the library, actually
requires a nul byte, though I understand the concept of your "definition
of a C string". The function strncpy() is one obvious example where a
nul byte in the source is not essential.
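
A minimal sketch of that point, assuming a hypothetical fixed-width
field of FIELD_LEN characters that carries no nul byte (the names
FIELD_LEN and field_to_cstr are made up for illustration). strncpy()
never reads past the count it is given, so the missing terminator in
the source is harmless as long as the caller adds one to the copy:

```c
#include <string.h>

#define FIELD_LEN 4   /* hypothetical fixed-width field, no nul stored */

/* Copy a fixed-width, possibly unterminated field into a C string.
   out must have room for FIELD_LEN + 1 bytes. */
void field_to_cstr(char *out, const char field[FIELD_LEN])
{
    strncpy(out, field, FIELD_LEN);  /* reads at most FIELD_LEN bytes */
    out[FIELD_LEN] = '\0';           /* terminator supplied by the caller */
}
```

A field holding 'A','B','C','D' with no terminator comes out as the
C string "ABCD".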

> > One way to avoid this is to use a technique with strncpy() :
> >
> > #define BUF_LEN 8
> >
> > void
> > func(const char *in_str_with_maybe_no_null) {
> >    char my_buf[BUF_LEN];
> >
> >    strncpy(my_buf,in_str_with_maybe_no_null,BUF_LEN-1)[BUF_LEN-1] = 0;
> >
> > This restricts the copy to BUF_LEN-1 characters + 1 guaranteed nul byte.
> > It works because strncpy() the function, returns the (char *) pointer to
> > my_buf, resulting in the final assignment:
> >
> >    my_buf[BUF_LEN-1] = 0;
> >
> > You can do this in a much less cryptic way, but I have found it useful
> > in C programs, and it takes up less screen real-estate this way ;-)
> 
> And this approach assumes that copying only BUF_LEN characters will
> result is valid data. Sometimes it will. Sometimes BUF_LEN may be
> too small, resulting in a truncated string.

Absolutely, but that was not my point. In fact you can just as easily
have a loss of data with Ada's array slicing. ;-)
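
For reference, the quoted idiom can be made into a complete function.
This is only a sketch: the name bounded_copy and the out parameter are
assumptions added here so the result can be inspected.

```c
#include <string.h>

#define BUF_LEN 8

/* Copies at most BUF_LEN-1 characters of in and always terminates out.
   Input longer than BUF_LEN-1 characters is silently truncated. */
void bounded_copy(char out[BUF_LEN], const char *in)
{
    /* strncpy() returns out, so the index applies to the destination. */
    strncpy(out, in, BUF_LEN - 1)[BUF_LEN - 1] = 0;
}
```

Passing "truncated string" leaves "truncat" in out, which is the data
loss being discussed; a short input such as "abc" is copied intact.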

> > > Another less common cost occurs when copying C strings. The most
> > > efficient copy operation for C arrays is the memcpy function.
> > > This function allows you to copy blocks of memory efficiently.
> > > If you try to use memcpy to copy strings you will find some
> > > real problems. In those cases you want to copy the actual array
> > > of characters, not just the logical string contained in it.
> >
> > Have you said this in reverse? Normally you don't want to copy
> > anything beyond the nul byte, unless you're copying fixed length
> > arrays of characters, without treating nul as a special marker.
> 
> No, I am saying that strncpy() is less efficient than memcpy.

This is very much dependent on how strncpy() is implemented. As a library
function, this could be implemented in assembly language for all
you know. Some CPU architectures (Z80 comes to mind) have single
instructions for dealing with this sort of thing. In these cases,
a limited strncpy() is definitely more efficient than memcpy(). The
single instruction is able to stop copying at the nul byte, but your
memcpy() is going to copy the whole array, whether it needs it or not.
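
One caveat worth checking against your own library: ISO C specifies
that strncpy() stops reading the source at the nul byte but then pads
the destination with nul bytes up to the count given, so it still
writes the full count. The sketch below (DST_LEN, count_fill and the
two function names are made up for illustration) fills a buffer with
'#' and counts how many fill bytes survive each kind of copy:

```c
#include <stddef.h>
#include <string.h>

#define DST_LEN 16

/* Count the bytes in buf that still hold the '#' fill value. */
static size_t count_fill(const char *buf, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (buf[i] == '#') {
            count++;
        }
    }
    return count;
}

size_t fill_left_after_strncpy(void)
{
    char dst[DST_LEN];
    memset(dst, '#', sizeof dst);
    strncpy(dst, "abc", sizeof dst);        /* 3 chars, then nul padding */
    return count_fill(dst, sizeof dst);
}

size_t fill_left_after_memcpy(void)
{
    char dst[DST_LEN];
    memset(dst, '#', sizeof dst);
    memcpy(dst, "abc", strlen("abc") + 1);  /* 3 chars plus the nul */
    return count_fill(dst, sizeof dst);
}
```

On a conforming implementation the strncpy() version leaves no fill
bytes, while the memcpy() version sized by strlen() leaves 12 of the
16 untouched.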

> It is possible to make an exact copy of a string using memcpy.
> In fact the entire character array will be copied, not just the
> data up to the null.

Yes, but what's the advantage to that? If my string is 3 bytes out
of a maximum of 64 bytes, why would I want to copy the remaining
61 bytes that I would ignore?
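
To make the 3-of-64 case concrete, here is a sketch that copies only
the logical string with memcpy(), sized by strlen(). MAX_LEN and
copy_logical are hypothetical names, and the source is assumed to be
nul terminated within MAX_LEN bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LEN 64

/* Copy the logical string in src, terminator included, and return the
   number of bytes copied. Bytes of dst beyond that are left alone. */
size_t copy_logical(char dst[MAX_LEN], const char src[MAX_LEN])
{
    size_t n = strlen(src) + 1;   /* characters plus the nul byte */
    assert(n <= MAX_LEN);
    memcpy(dst, src, n);
    return n;
}
```

For a source holding "abc" this copies 4 bytes rather than 64, at the
price of one strlen() pass over the source first.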

> No, I am suggesting:
> 
> void my_func( char str[])
>    int slen = sizeof str; // which makes sense within the declaration
>                           // scope of the actual parameter.

OK, I see your point now.
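
A small sketch of how I read that, with hypothetical names
(unused_bytes, demo): sizeof yields the array size only in the scope
where the array is declared, so the capacity is taken there and passed
along, because inside the callee the parameter is just a pointer.

```c
#include <stddef.h>
#include <string.h>

/* The callee cannot recover the array size from str, so the caller
   supplies it. Assumes str is nul terminated within capacity bytes. */
size_t unused_bytes(const char *str, size_t capacity)
{
    return capacity - (strlen(str) + 1);
}

size_t demo(void)
{
    char name[64] = "abc";
    /* sizeof name is 64 here because name is declared in this scope. */
    return unused_bytes(name, sizeof name);
}
```

Here demo() returns 60: 64 bytes of capacity less the 3 characters and
the nul byte.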
-- 
Warren W. Gay VE3WWG
http://members.home.net/ve3wwg


