From: "Dan'l Miller"
Newsgroups: comp.lang.ada
Subject: Re: What is a byte?
Date: Tue, 29 Jul 2014 05:26:25 -0700 (PDT)

On Tuesday, July 29, 2014 5:53:54 AM UTC-5, Jeffrey Carter wrote:
> On 07/28/2014 12:09 PM, Victor Porton wrote:
> > When I need to pass a byte to a C function, which Ada type should I use?
>
> "Byte" isn't a C concept. Generally what others call a byte is called "char" in
> C. So you should use whatever type in Interfaces.C(.*) corresponds to the C type
> used by the C function.
Jeff, you are factually incorrect there (which is why ITU uses the term octet for what you call a byte). Both C & C++ overload the term byte to mean something quite altered from the historical/customary definition, which is how they end up with 16-bit bytes on Prime 50 Series and 32-bit bytes on many modern DSPs. The main difference is: "It shall be possible to express the address of each individual byte of an object uniquely." In effect, this portion of the definition (which is dominant over "basic character set" in this era of UTF-16, UCS-2, UTF-32, and UCS-4 Unicode potentially being taken as the base character set) forces a byte to be the quantity of bits traversed by incrementing a char* by one. That is how bytes become 32-bit on many DSPs regardless of whether that DSP uses, say, an 8-bit character set. On such a DSP, an ASCII or UTF-8 string is 32-bit aligned with 0, 8, 16, or 24 bits of padding, typically 1, 2, or 3 ASCII nulls, which might be 1 or 2 more than needed for C's idiom of null-terminated strings when packing an 8-bit-character string into the DSP's 32-bit bytes, such as when interfacing the DSP to the outside world.

Victor, if you transliterate your specification to say "octet" wherever it says "byte" (following ITU's convention to end this perennial/chronic silly debate over how big a byte is in C), then the answer to your question becomes quite clear: use an integer of range {0, ..., 255} or {-128, ..., 127} and mask off the upper powers of 2 if any, because unsigned char will always be at least 8 bits on modern processors, which means that on some processors a byte might be 16- or 32-bit, but those extra powers of 2 to the left are simply ignored, just as they are in the underlying protocol. (No specification of any modern interoperable protocol uses the 2^8 and higher bits in a byte if they exist on some arcane hardware.)
Likewise for signed char, respecting two's complement's bias of sign-extension versus sign-magnitude's sign bit.
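
To make the masking advice concrete, here is a minimal sketch in C of what "mask off the upper powers of 2" might look like on the C side of the interface. The helper names to_octet and octet_to_signed are hypothetical, invented for illustration; only CHAR_BIT from <limits.h> is standard. The signed conversion assumes the octet is to be read as two's complement, which is an assumption about the protocol and not something the C implementation guarantees for its own signed char.

```c
#include <assert.h>
#include <limits.h>   /* CHAR_BIT: bits per C byte, guaranteed >= 8 */

/* Hypothetical helper: keep only the low 8 bits of a value, i.e. the
   octet a protocol sees, whatever the width of a C byte on this target. */
static unsigned to_octet(unsigned value)
{
    return value & 0xFFu;
}

/* Hypothetical helper: read an octet (0..255) as a two's-complement
   value in -128..127. The two's-complement reading is an assumption
   about the wire format, stated here and not taken from the C standard. */
static int octet_to_signed(unsigned octet)
{
    assert(octet <= 0xFFu);
    return (octet & 0x80u) ? (int)octet - 256 : (int)octet;
}
```

For example, to_octet(0x1234) yields 0x34, and octet_to_signed(0x80) yields -128. On the Ada side, the type that corresponds to C's unsigned char is Interfaces.C.unsigned_char, as Jeff suggests; the masking only matters on targets where CHAR_BIT is greater than 8.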