From: phil@severn.demon.co.uk
Subject: Re: Endian and Ada
Date: 1996/04/17
Message-ID: <3175299b.9317202@news.demon.co.uk>
References: <4kamb9$om2@flute.aix.calpoly.edu> <4kq2lg$89@newsbf02.news.aol.com>
Organization: My Home Office
Newsgroups: comp.lang.ada

ljmetzger@aol.com (LJMetzger) wrote:

> Any time you deal with target hardware that is not identical to the host
> hardware, not only do you have to deal with the Endian problem,
> but also the "which bit is lsb problem too".
>
> e.g. I use Verdix Ada 6.2.0 (e) on a Sun (SunOs 4.1.x ) to create
> a prom image for an AMD29050 embedded processor. The 29050
> communicates with hardware (e.g. A/D converters) from different
> vendors, and on some of the hardware bit 0 is lsb, and on others
> bit 0 is msb, hence the need for "Mr. Bitswap".

This is getting at the crux of the problem. I hope the following is not
too trivial an exposition for this group, but one must be clear about the
hardware situation to avoid 'endian', or more generally Data
Representation, pitfalls.

1) Hardware designers produced central processing units (CPUs) that work
in binary (there are a few exceptions). At the machine level you may only
work with a pool of Bits, each of which is either 'set' or 'not set'
(i.e. 1 or 0).

2) Storage elements have for a long time been available in packages
organised as 8-bit Bytes. That is to say, one cannot access single bits,
only one or more bytes.

3) The CPU can read bytes from memory and internally select individual
bits from them, or combine bytes to make a Word.

4) CPUs may be wired to read or write several bytes at a time (typically
4, the so-called 32-bit CPUs), and work on them internally as a single
entity.

5) These same CPUs have instructions which allow them to take 1, 2 or 4
bytes at a time from memory, these being termed Byte, Word or Double-Word
operations.

6) In order to work with data covering a wide range of values, one must
define a Data Representation whereby the data is mapped onto a number of
bits.

7) The Data Representations used by computer programmers involve terms
such as Floating Point, Integer (both Signed and Unsigned), Character,
etc. The compiler and operating system determine how these are mapped
onto the target machine's Byte, Word or Double-Word.

8) If we work on only one target machine type we may store and correctly
read back each data type without knowledge of how the data is mapped to
the hardware.

9) Suppose we write a program to store our data, say an Integer type, on
a floppy disk. Suppose our system opts to store it in 2 consecutive
bytes, with the most significant bits in the first byte and the least
significant bits in the second. The first byte would carry bits 2^15..2^8
and the second bits 2^7..2^0. This machine would probably adopt the
convention of presenting these data respectively to the 'Data0'..'Data7'
pins of the storage device. (A small sketch of such a write follows
below.)
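For illustration only, a minimal Ada 95 sketch of the 2-byte scheme in
point 9; the unit, type and file names are just invented for the example.
Extracting the most significant byte by arithmetic and writing it first
keeps the file layout independent of the host's internal byte order, and
a reader on any machine reassembles the value the same way.

   with Ada.Sequential_IO;

   procedure Write_Two_Bytes is
      type Byte    is mod 2**8;   --  one storage byte, bits 2^7..2^0
      type Word_16 is mod 2**16;  --  the value to be stored

      package Byte_IO is new Ada.Sequential_IO (Byte);
      use Byte_IO;

      F     : Byte_IO.File_Type;
      Value : constant Word_16 := 16#1234#;
   begin
      Create (F, Out_File, "value.dat");
      Write (F, Byte (Value / 256));    --  first byte: bits 2^15..2^8
      Write (F, Byte (Value mod 256));  --  second byte: bits 2^7..2^0
      Close (F);
      --  Read back, on any machine, as:
      --  Value = Word_16 (First_Byte) * 256 + Word_16 (Second_Byte)
   end Write_Two_Bytes;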
10) Then suppose this floppy disk is moved to a machine of a different
design, and/or one running a different operating system, and possibly
with a compiler from a different source. Without knowledge of the Data
Representation used we cannot know whether our Integer will be
reassembled from the 2 bytes in the correct or the reversed order.

11) Unfortunately the specification of the data representation becomes
rather opaque, and gets more so as it passes up the hierarchy from the
storage hardware spec, through the Basic Input/Output System (BIOS) spec,
Operating System and Compiler, to the Application Program.

12) Anyone dealing in data transfer between systems (and that includes
just carrying a floppy disk from a Sun to a PC) must check and allow for
the data representations on each. We all know that Unix uses just
line-feed to terminate a line of text, and this is virtually illegible on
a PC that expects line-feed *and* carriage-return. Conversion programs
exist to correct for this.

13) If the data representations are NOT identical then a conversion
method must be established on one or both machines.

14) Hence neither Ada nor any other compiler alone has control of the
'endian-ness' of the data.

15) I am not familiar with XDR (eXternal Data Representation), referred
to earlier in the thread, but if it exists in the form of an Input/Output
system call on each machine, one could ask for an Integer (say) to be
written in a standard way, which, when read by the complementary call on
a different machine, would be guaranteed to return the same value. In
other words each machine would have a library to convert the 'standard'
external data format to its internal format (byte-reversed or whatever).
A rough sketch of such a pair of routines is appended after the
signature.

16) Maybe Ada should insist on the availability of such a library and
*force* (at least by default) Input/Output to be written to such a spec.
Or is this achieved by running Ada on a machine with a standardised
operating system?

17) Vendors who make hardware that is internally like the standard XDR
will have a null conversion in their library and will gain a performance
advantage.

Phil Addison

--
-- Phil Addison. ------
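P.S. The sketch promised in point 15. It is purely illustrative, not a
real XDR binding, and every name in it is invented here. It fixes one
external byte order (most significant byte first) and converts to and
from it by arithmetic, so the host's internal order never enters into it.

   package External_Format is
      type Byte        is mod 2**8;
      type Byte_String is array (1 .. 4) of Byte;
      type Word_32     is mod 2**32;

      function To_External   (Value : Word_32)     return Byte_String;
      function From_External (Data  : Byte_String) return Word_32;
   end External_Format;

   package body External_Format is

      function To_External (Value : Word_32) return Byte_String is
      begin
         --  Always most significant byte first, whatever the host's
         --  internal order happens to be.
         return (1 => Byte (Value / 2**24),
                 2 => Byte ((Value / 2**16) mod 256),
                 3 => Byte ((Value / 2**8)  mod 256),
                 4 => Byte (Value mod 256));
      end To_External;

      function From_External (Data : Byte_String) return Word_32 is
      begin
         --  Reassemble in the same fixed order on every machine.
         return Word_32 (Data (1)) * 2**24 +
                Word_32 (Data (2)) * 2**16 +
                Word_32 (Data (3)) * 2**8  +
                Word_32 (Data (4));
      end From_External;

   end External_Format;

A machine whose internal layout already matches the external one could
implement both routines as straight moves, which is the performance
advantage mentioned in point 17.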