Re: Query on portable bit extraction

From: "Nick Roberts" <nickroberts@adaos.worldonline.co.uk>
Subject: Re: Query on portable bit extraction
Date: Sat, 27 Oct 2001 17:31:16 +0100
Message-ID: <9req9o$tlsg8$1@ID-25716.news.dfncis.de>
In-Reply-To: mailman.1004156810.28692.comp.lang.ada@ada.eu.org

To be frank, I think RM95 13.5.3 shows confusion as to what 'big-endian' and
'little-endian' really mean. The RM speaks in terms of storage elements, but
these Swiftian terms - although always informal - have really always
referred to the order of bytes (not bits).

A big-endian machine's (16-bit, 32-bit, and sometimes 64-bit) integers have
the most significant byte at the lower/lowest address; a little-endian
machine's integers have the least significant byte at the lower/lowest
address. I think the terms don't sensibly apply to machines whose
Storage_Unit is not a multiple of 8. I think those machines whose
Storage_Unit is 16, 32, or 64 can be treated as big-endian if they read
bytes from byte-oriented peripherals in the order most significant byte
first.
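
For anyone who wants to see which kind of machine they are on, here is a
minimal sketch. It assumes Storage_Unit = 8 and that the compiler lays out
the four-octet array at increasing addresses with no padding; the names
Show_Byte_Order and Octet_Array are mine, purely for illustration.

```ada
with Ada.Text_IO;
with Ada.Unchecked_Conversion;
with System;
with Interfaces; use Interfaces;

procedure Show_Byte_Order is

   --  Assumption: byte-addressed memory.
   pragma Assert (System.Storage_Unit = 8);

   --  Four octets; index 0 is the lowest-addressed one.
   type Octet_Array is array (0 .. 3) of Unsigned_8;
   for Octet_Array'Component_Size use 8;
   for Octet_Array'Size use 32;

   function To_Octets is
      new Ada.Unchecked_Conversion (Unsigned_32, Octet_Array);

   --  View a known 32-bit value as the bytes it occupies in memory.
   Octets : constant Octet_Array := To_Octets (16#0A0B_0C0D#);

begin
   if Octets (0) = 16#0A# then
      Ada.Text_IO.Put_Line
        ("big-endian: most significant byte at lowest address");
   elsif Octets (0) = 16#0D# then
      Ada.Text_IO.Put_Line
        ("little-endian: least significant byte at lowest address");
   else
      Ada.Text_IO.Put_Line ("some other byte order");
   end if;
end Show_Byte_Order;
```

Note that this inspects the order of bytes only; it says nothing about how
bits are numbered within a byte.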

The AARM (13.5.3 (5.b)) seems to imply that on a big-endian 32-bit machine
with byte memory addressing, bit 0 of byte 3 (the lowest addressed and most
significant byte) is the most significant bit (bit 31) of the word. This is
wrong! By every convention I've ever come across, bit 0 of byte 3 is always
interpreted as bit 24 of the word; it is bit 7 of byte 3 that is bit 31 of
the word. I believe the documentation of processor manufacturers will bear
this out. In other words, ALL architectures (both big-endian and
little-endian) are in fact Low_Order_First.
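
To spell out the arithmetic behind that convention: counting bytes by
significance (byte 0 least significant), word bit = 8 * byte + bit. A small
sketch, in which the names Bit_Numbering_Demo and Word_Bit are just my own
illustration:

```ada
with Interfaces; use Interfaces;

procedure Bit_Numbering_Demo is

   --  Byte_Index counts by significance: 0 is the least significant byte.
   --  Bit_In_Byte counts from the least significant bit of that byte.
   function Word_Bit
     (Byte_Index : Natural; Bit_In_Byte : Natural) return Natural is
   begin
      return 8 * Byte_Index + Bit_In_Byte;
   end Word_Bit;

   --  Bit 7 of byte 3 set, all other bits clear.
   Word : constant Unsigned_32 :=
     Shift_Left (Unsigned_32'(2#1000_0000#), 24);

begin
   pragma Assert (Word_Bit (3, 0) = 24);
   pragma Assert (Word_Bit (3, 7) = 31);
   pragma Assert (Word = 2**31);
   null;
end Bit_Numbering_Demo;
```

None of this depends on where byte 3 happens to sit in memory.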

AI95-133 confirms this thread's view that the Bit_Order attribute is purely
about how to interpret bit numbers in a record representation clause. As
such, I'm afraid I think it was misconceived. There's no need for it on any
machine, and it doesn't solve the problem that needs solving (that of big
and little endianness). Does anyone have an example where Bit_Order was
actually useful (or vital)?

Perhaps the next revision could introduce a representation attribute
Octet_Order, which would only apply to discrete types of a size which is a
multiple of 8 and greater than 8 (and would have nothing to do with record
representation clauses). The System package would define:

   type Octet_Order is (Low_Octet_First, High_Octet_First);

   Default_Octet_Order: constant Octet_Order;
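
Until something like that exists, the effect has to be programmed by hand.
A rough sketch of what I mean, using an ordinary user-level type as a
stand-in for the proposed one (Octet_Order_Demo, Octet_Array and
To_Unsigned_32 are hypothetical names, not anything in the RM):

```ada
with Interfaces; use Interfaces;

procedure Octet_Order_Demo is

   --  Hypothetical user-level stand-in for the proposed System.Octet_Order.
   type Octet_Order is (Low_Octet_First, High_Octet_First);

   --  Octets in the order they were received or stored.
   type Octet_Array is array (0 .. 3) of Unsigned_8;

   --  Assemble a 32-bit value from four octets given in the stated order.
   function To_Unsigned_32
     (Octets : Octet_Array; Order : Octet_Order) return Unsigned_32
   is
      Result : Unsigned_32 := 0;
   begin
      for I in Octets'Range loop
         case Order is
            when High_Octet_First =>
               --  Earlier octets are more significant.
               Result := Shift_Left (Result, 8) or Unsigned_32 (Octets (I));
            when Low_Octet_First =>
               --  Earlier octets are less significant.
               Result :=
                 Result or Shift_Left (Unsigned_32 (Octets (I)), 8 * I);
         end case;
      end loop;
      return Result;
   end To_Unsigned_32;

   Wire : constant Octet_Array := (16#12#, 16#34#, 16#56#, 16#78#);

begin
   pragma Assert (To_Unsigned_32 (Wire, High_Octet_First) = 16#1234_5678#);
   pragma Assert (To_Unsigned_32 (Wire, Low_Octet_First)  = 16#7856_3412#);
   null;
end Octet_Order_Demo;
```

Because it works on values with shifts rather than on storage layout, this
gives the same answers whatever the byte order of the host.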

Although I am pretty certain about my comments in this case, I seem to be
prone to egregious error at times, so please someone put me out of your
misery if I have committed another blunder.

Should I send a comment to ada-comment about this?

Another facility that I believe would be useful would be to be able to
specify multiple places for a component in a record representation clause.
Engineers (being engineers, poor devils ;-) often like to split up a field
returned by their equipment into lots of little pieces dotted around all
over the place.

E.g.:

   type Table_Descriptor is
      record
         Limit: Unsigned_20;
         Base: Unsigned_32;
         Typ: Descriptor_Type;
         Is_Segment: Boolean;
         DPL: Privilege_Level;
         Is_Present: Boolean;
         Is_32_Bit: Boolean;
         Is_Coarse: Boolean;
      end record;

   pragma Assert(System.Storage_Unit = 8);

   for Table_Descriptor use
      record
         Limit at all (0..7 => 0 range 0..7,
                        8..15 => 1 range 0..7,
                        16..19 => 6 range 0..3);
         Base at all (0..7 => 2 range 0..7,
                       8..15 => 3 range 0..7,
                       16..23 => 4 range 0..7,
                       24..31 => 7 range 0..7);
         Typ at 5 range 0..4;
         ...
      end record;

This suggested syntax may not be the best way to do it. Implementations
could impose a limit on the number of different places specifiable for one
component. As per the AARM, the storage place attributes would not apply to
discontiguous components.
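
For comparison, here is roughly what one has to write today: one
contiguous component per piece, plus functions to glue the pieces together.
This is only a sketch. It covers just Limit and Base, it assumes the default
bit order numbers bit 0 as the least significant bit of each byte, and all
the names (Raw_Descriptor, Limit_A, Rest_5, and so on) are invented for the
example.

```ada
with System;
with Interfaces; use Interfaces;

procedure Split_Field_Demo is

   pragma Assert (System.Storage_Unit = 8);

   type Unsigned_4  is mod 2**4;
   type Unsigned_20 is mod 2**20;

   --  One component per contiguous piece of each split field.
   type Raw_Descriptor is
      record
         Limit_A : Unsigned_8;   --  bits 0 .. 7 of Limit
         Limit_B : Unsigned_8;   --  bits 8 .. 15 of Limit
         Limit_C : Unsigned_4;   --  bits 16 .. 19 of Limit
         Base_A  : Unsigned_8;   --  bits 0 .. 7 of Base
         Base_B  : Unsigned_8;   --  bits 8 .. 15 of Base
         Base_C  : Unsigned_8;   --  bits 16 .. 23 of Base
         Base_D  : Unsigned_8;   --  bits 24 .. 31 of Base
         Rest_5  : Unsigned_8;   --  remaining fields in byte 5, not decoded
         Rest_6  : Unsigned_4;   --  remaining fields in byte 6, not decoded
      end record;

   --  Assumption: bit 0 is the least significant bit of each byte.
   for Raw_Descriptor use
      record
         Limit_A at 0 range 0 .. 7;
         Limit_B at 1 range 0 .. 7;
         Base_A  at 2 range 0 .. 7;
         Base_B  at 3 range 0 .. 7;
         Base_C  at 4 range 0 .. 7;
         Rest_5  at 5 range 0 .. 7;
         Limit_C at 6 range 0 .. 3;
         Rest_6  at 6 range 4 .. 7;
         Base_D  at 7 range 0 .. 7;
      end record;
   for Raw_Descriptor'Size use 64;

   --  Reassemble the 20-bit limit from its three pieces.
   function Limit (D : Raw_Descriptor) return Unsigned_20 is
   begin
      return Unsigned_20 (D.Limit_A)
           + Unsigned_20 (D.Limit_B) * 2**8
           + Unsigned_20 (D.Limit_C) * 2**16;
   end Limit;

   --  Reassemble the 32-bit base from its four pieces.
   function Base (D : Raw_Descriptor) return Unsigned_32 is
   begin
      return Unsigned_32 (D.Base_A)
          or Shift_Left (Unsigned_32 (D.Base_B), 8)
          or Shift_Left (Unsigned_32 (D.Base_C), 16)
          or Shift_Left (Unsigned_32 (D.Base_D), 24);
   end Base;

   D : constant Raw_Descriptor :=
     (Limit_A => 16#CD#, Limit_B => 16#AB#, Limit_C => 16#9#,
      Base_A  => 16#78#, Base_B  => 16#56#, Base_C  => 16#34#,
      Base_D  => 16#12#, Rest_5  => 0,      Rest_6  => 0);

begin
   pragma Assert (Limit (D) = 16#9ABCD#);
   pragma Assert (Base (D)  = 16#1234_5678#);
   null;
end Split_Field_Demo;
```

It works, but the reader has to reconstruct the real fields from the
accessor functions, which is exactly the clutter a multi-place component
clause would remove.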

--
Nick Roberts

"Steven Deller" <deller@smsail.com> wrote in message
news:mailman.1004156810.28692.comp.lang.ada@ada.eu.org...
> The issue with byte ordering actually has more to do with how you got
> the data.  If you get the data as a byte stream and define an array of
> bytes, then the only issue is bit order within each byte.  On the other
> hand, if you get the data 4-bytes at a time "in parallel", you will need
> to do byte swapping to get the fields to be adjacent.  If you get the
> data one-bit at a time, then no byte swapping is needed and fields will
> always be adjacent bits, regardless of the architecture.
>
> I'm *guessing* that for the problem at hand, you received the data byte
> by byte on a little-endian machine where you *defined* adjacent bits
> (across bytes) using little-endian definitions of what it means to cross
> a byte boundary.  If you now take that data 4-bytes at a time and send
> that to big-endian machine, you will have to do byte swapping (end for
> end) across the word to get fields that are adjacent (and if fields
> cross 4-byte boundaries, you will have to do word swapping).  Once you
> have done that, you need only count in the correct direction to see the
> fields.
>
> It's hard to describe in email, but you can probably figure it out if you
> realize that size of units when exchanging data determines what type of
> byte swapping is necessary (none, 2-byte pairs, 4-byte quads, 8-byte
> octets, etc).
>
> That is why this is a hard problem.  There is no *general* solution
> until you know your channel width.
>
> Regards,
> Steve
>
> > -----Original Message-----
> > From: comp.lang.ada-admin@ada.eu.org
> > [mailto:comp.lang.ada-admin@ada.eu.org] On Behalf Of Jeffrey Carter
> > Sent: Friday, October 26, 2001 8:07 PM
> > To: comp.lang.ada@ada.eu.org
> > Subject: Re: Query on portable bit extraction
> >
> >
> > I would recommend using a collection of bytes and ensuring
> > that the same bytes contain the same values on all platforms.
> > Then extract the desired parts of the desired bytes,
> > combining them as required.
> >
> > You can also, if you're sure the same bytes have the same
> > values, combine bytes into larger values using type
> > conversions and shifts or multiplications:
> >
> > T1 := Shift_Left (Unsigned_16 (Byte_21), 8) or Unsigned_16 (Byte_22);
> > T2 := Shift_Right (T1, 6) and 2#0111_1111#; -- YYY_YYYY
> >
> > Both work correctly regardless of endianness.
> >
> > --
> > Jeff Carter
> > "I wave my private parts at your aunties."
> > Monty Python & the Holy Grail
> > _______________________________________________
> > comp.lang.ada mailing list
> > comp.lang.ada@ada.eu.org
> > http://ada.eu.org/mailman/listinfo/comp.lang.ada
>