* Big-endian vs little-endian
From: Mike Werner @ 1999-01-29  0:00 UTC

Now, I know that this problem has been around for a long time. I am
also aware that it is basically a hardware-implementation problem. Or
at least that's what I've always been told.

I recently had a project for school that involved a binary data file
that needed to be read using sequential IO. As the data file was
created on the department's server and I was doing the project on my PC
at home, I ran into the endian problem. The text string and the two
enumeration types read in just fine - it was the two numeric fields
that were hosed. I did manage to get around the problem by creating my
own data file to test with - fortunately the instructor told us what
was in the file.

But I got to wondering - shouldn't things like that be standardized by
now? Or shouldn't there at least be a way for the compiler to deal with
such things? If such a thing does exist, then I apologize for dragging
this up and humbly request a pointer to where to find such info. If
not ... well, once I figure out what I'm doing maybe I'll tackle that
as a project some day.
--
Mike Werner  KA8YSD   |  "Where do you want to go today?"
ICQ# 12934898         |  "As far from Redmond as possible!"
'91 GS500E            |
Morgantown WV         |

-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GU d-@ s:+ a- C++>$ UL++ P+ L+++ E W++ N++ !o w--- O- !M V-- PS+ PE+ Y+ R+ !tv b+++(++++) DI+ D--- G e*>++ h! r++ y++++
------END GEEK CODE BLOCK------

* Re: Big-endian vs little-endian
From: Nick Roberts @ 1999-02-02  0:00 UTC

I can think of two possible solutions:

(a) declare a type derived from Interfaces.Integer_8/16/32 etc. (RM95
B.2), and then apply a Bit_Order representation clause (RM95 13.5.3) to
this type;

(b) use Text_IO instead of Sequential_IO, and input and output the data
in the form of text.

The advantage of (b) is that text is the most universal data format:
non-Ada programs will (almost always) be able to use the data (if
that's what you might ever require). The disadvantage is that the text
uses up more storage than its equivalent binary form. How much data do
you have?

The problem with (a) is that it isn't applicable to real types. Nor
will it work if your compiler is an Ada 83 (rather than 95) compiler.

Power to your ulna.

-------------------------------------------
Nick Roberts
-------------------------------------------

* Re: Big-endian vs little-endian
From: Mark A Biggar @ 1999-02-03  0:00 UTC

Nick Roberts wrote:
> (b) use Text_IO instead of Sequential_IO, and input and output the data in
> the form of text.
>
> The advantage of (b) is that text is the most universal data format: non-Ada
> programs will (almost always) be able to use the data (if that's what you
> might ever require). The disadvantage is that the text uses up more storage
> than its equivalent binary form. How much data do you have?

Umm... How many times have you actually coded this up both ways and
compared? Almost every time I have tried this, the text version of the
data was smaller than the binary version, especially if you have
variable-sized data. The only cases where the text was bigger involved
data that consisted of large amounts of high-precision floats, and even
then the text was only about twice as big. Even then, the advantages of
portability and human readability of the text format usually outweigh
the small space savings of binary data formats.

--
Mark Biggar
mark.a.biggar@lmco.com

* Re: Big-endian vs little-endian
From: Samuel T. Harris @ 1999-02-06  0:00 UTC

Mark A Biggar wrote:
>
> Nick Roberts wrote:
>
> > (b) use Text_IO instead of Sequential_IO, and input and output the data in
> > the form of text.
> >
> > The advantage of (b) is that text is the most universal data format: non-Ada
> > programs will (almost always) be able to use the data (if that's what you
> > might ever require). The disadvantage is that the text uses up more storage
> > than its equivalent binary form. How much data do you have?
>
> Umm... How many times have you actually coded this up both ways and
> compared? Almost every time I have tried this, the text version of the
> data was smaller than the binary version, especially if you have
> variable-sized data. The only cases where the text was bigger involved
> data that consisted of large amounts of high-precision floats, and even
> then the text was only about twice as big. Even then, the advantages of
> portability and human readability of the text format usually outweigh
> the small space savings of binary data formats.

When I was Technical Lead on a major Air Force command and control
system, our initial implementation used textual representations for all
the messaging between the distributed workstations and the central
server. This got us a working product much faster than dealing with
binary representations would have, since the workstation and central
server hardware were so unlike each other. It also made network
debugging easy with a simple sniffer/snooper (which was also a security
concern).

Since then, I have always advocated producing width, image, and value
functions for all important data types. In fact, I have generics which
produce these functions for arrays (trivial) and records (almost
trivial), so the overhead of developing these functions is
insignificant. And they do come in handy when a little text_io based
debugging instrumentation is needed. A simple put_line(image(whatever));
is always available.

--
Samuel T. Harris, Principal Engineer
Raytheon, Scientific and Technical Systems
"If you can make it, We can fake it!"

* Re: Big-endian vs little-endian
From: dennison @ 1999-02-08  0:00 UTC

In article <36BD02DB.737849EE@hso.link.com>,
  "Samuel T. Harris" <sam_harris@hso.link.com> wrote:

> also a security concern). Since then, I have always advocated
> producing width, image, and value functions for all important
> data types. In fact, I have generics which produce these functions
> for arrays (trivial) and records (almost trivial) so the overhead
> for developing these functions is insignificant. And they do come
> in handy when a little text_io based debugging instrumentation
> is needed. A simple put_line(image(whatever)); is always available.

Cool idea! But what method do you use to make generation of records
"almost trivial"? And how do you handle pointers?

T.E.D.

* Re: Big-endian vs little-endian
From: Samuel T. Harris @ 1999-02-08  0:00 UTC

dennison@telepath.com wrote:
>
> In article <36BD02DB.737849EE@hso.link.com>,
>   "Samuel T. Harris" <sam_harris@hso.link.com> wrote:
>
> > also a security concern). Since then, I have always advocated
> > producing width, image, and value functions for all important
> > data types. In fact, I have generics which produce these functions
> > for arrays (trivial) and records (almost trivial) so the overhead
> > for developing these functions is insignificant. And they do come
> > in handy when a little text_io based debugging instrumentation
> > is needed. A simple put_line(image(whatever)); is always available.
>
> Cool idea! But what method do you use to make generation of records "almost
> trivial"? And how do you handle pointers?
>

Glad you asked. The requirements are that all images look like Ada
aggregates and that format options are supported for the optional
"decorations", such as qualification and positional vs. named notation
(which I'll justify later in this message). Given those requirements,
the generics have the following needs.

A generic for producing width, image, and value functions for arrays
needs to know the type name as a string for qualification; the width,
image, and value functions for the index type (usually readily
available from the appropriate attributes); and the width, image, and
value functions for the component type (possibly made available by a
previous instantiation of one of these generics). I believe we can all
see the trivial nature of the width and image functions. The value
function is not so trivial, since it has to support an optional
qualification and deal with both named and positional notation. But
it's not too difficult once a little effort is put into it. Of course,
an initial version of these generics can be limited to positional
notation only, since this greatly simplifies the value function.

A generic for producing width, image, and value functions for records
is a little more troublesome. The array generic can directly use the
index to get the component. The record generic cannot. So you have to
provide extra "helper" subprograms in the form of field-level width,
image, and value functions. The record generic needs the type name as a
string for qualification; an enumerated type naming all the
discriminants and record fields; the width, image, and value functions
for this field-naming type; and a width and an image function which
take a record object and a field_name and produce the appropriate
result (similar to the component functions of the array generic). It is
the value operation which is different. It has to be a procedure taking
an in out record object and a field_name, with the objective of filling
the appropriate field from the given string. Each of the helper
subprograms uses a simple case statement to call the appropriate width,
image, or value for the field identified.

Variant records are no problem, since the generic functions run through
all the field names. It is up to the field-level helper subprogram to
either perform an action or do nothing for fields not in the variant in
use.

As far as pointers are concerned, they are output as an allocator by
the component (for arrays) or field-level (for records) subprogram. One
may envision a parallel to the image function, called debug, which
outputs the pointer itself in some appropriate format so the reader can
track the actual pointers themselves. This is useful not only for
debugging, but also when an array of pointers reuses the same pointer
in several slots. The allocator notation in the image does not reflect
that property. OTOH, outputting the pointer itself along with its
dereference does not keep the image compliant with Ada aggregate
notation, so I usually use a separate subprogram for this or provide
options to image to control the output format.

Keeping image compliant with Ada aggregate notation is important when
you consider code which has large aggregates initializing complex data
structures. With a conforming image/value function pair in place, you
can copy the aggregate text to a file and use the enclosing package
elaboration to do text_io on the file, initializing the data structure
from the file contents. While this will slow down elaboration, it does
allow you to change the data structure without recompiling and
relinking the program. I find this very powerful during development.
Once the initial data is tested and locked down, you can paste it back
into the declaration of the object and comment out the text_io code in
the package elaboration.

--
Samuel T. Harris, Principal Engineer
Raytheon, Scientific and Technical Systems
"If you can make it, We can fake it!"
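
A minimal sketch of the array case described above may help make it
concrete. It is not Samuel Harris's actual code: the names Array_Image,
Tail, Axis, Position and Position_Image are made up for illustration,
only positional notation is produced, and the width and value functions
and the optional type qualification are left out.

```ada
with Ada.Text_IO;

procedure Image_Demo is

   --  Hypothetical generic: builds an aggregate-style image of a
   --  constrained array from the image function of its components.
   generic
      type Index is (<>);
      type Element is private;
      type Vector is array (Index) of Element;
      with function Image (Item : Element) return String;
   function Array_Image (Item : Vector) return String;

   function Array_Image (Item : Vector) return String is

      --  Image of the components From .. Item'Last, comma separated.
      function Tail (From : Index) return String is
      begin
         if From = Item'Last then
            return Image (Item (From));
         else
            return Image (Item (From)) & ", " & Tail (Index'Succ (From));
         end if;
      end Tail;

   begin
      return "(" & Tail (Item'First) & ")";
   end Array_Image;

   type Axis is (X, Y, Z);
   type Position is array (Axis) of Integer;

   --  The component image comes straight from the 'Image attribute.
   function Position_Image is new Array_Image
     (Index   => Axis,
      Element => Integer,
      Vector  => Position,
      Image   => Integer'Image);

   P : constant Position := (1, 2, 3);

begin
   --  Prints a positional aggregate that is itself legal Ada source.
   Ada.Text_IO.Put_Line (Position_Image (P));
end Image_Demo;
```

Because the output is a legal positional aggregate, a matching value
function could in principle parse it back, which is the property the
message above relies on for loading initial data from a text file.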

* Re: Big-endian vs little-endian
From: Richard D Riehle @ 1999-02-04  0:00 UTC

In article <7982p9$nll$3@plug.news.pipex.net>,
	"Nick Roberts" <Nick.Roberts@dial.pipex.com> wrote:

>I can think of two possible solutions:
>
>(a)     [ snipped
>
>(b) use Text_IO instead of Sequential_IO, and input and output the data in
>the form of text.
>
>The advantage of (b) is that text is the most universal data format:

The Text_IO solution is especially useful when converting floating
point from one machine to floating point on another. For example, where
is the sign bit in a 32-bit VAX floating point number? You'd be
surprised!

We were converting VAX floating point to IBM mainframe floating point.
People came up with all sorts of algorithmic solutions. The best
solution was to write the VAX numbers to a text file and read the text
file back in on the IBM. No fuss. No muss. No algorithmic gymnastics.

Richard Riehle
richard@adaworks.com
http://www.adaworks.com
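
The text round trip described above can be sketched in a few lines. This
is only an illustration, not the code used on the VAX/IBM conversion:
the procedure name Float_Text_Demo and its variables are made up, and
the string stands in for the text file that would travel between the
two machines.

```ada
with Ada.Text_IO;

procedure Float_Text_Demo is

   Original : constant Float := 3.14159;

   --  'Image produces a decimal string; nothing in it depends on how
   --  the writing machine lays out floating point in memory.
   As_Text : constant String := Float'Image (Original);

   --  'Value parses the decimal string into whatever floating point
   --  format the reading machine uses.
   Restored : constant Float := Float'Value (As_Text);

begin
   Ada.Text_IO.Put_Line ("As text  :" & As_Text);
   Ada.Text_IO.Put_Line ("Restored :" & Float'Image (Restored));
end Float_Text_Demo;
```

One caveat: 'Image only emits Float'Digits significant digits, so the
value read back is the nearest representable number to that decimal
string rather than a guaranteed bit-for-bit copy.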

* Re: Big-endian vs little-endian
From: Mike Werner @ 1999-02-06  0:00 UTC

Nick Roberts wrote:
>
> I can think of two possible solutions:
>
> (a) declare a type derived from Interfaces.Integer_8/16/32 etc. (RM95 B.2),
> and then apply a Bit_Order representation clause (RM95 13.5.3) to this type;
>
> (b) use Text_IO instead of Sequential_IO, and input and output the data in
> the form of text.

I wish I could use (b) - unfortunately this program was for class and
we had to use the data file provided by the instructor. I looked at RM
13.5.3 as pointed out in (a) but really did not understand it. I'm very
new at Ada, and that LRM is quite a ways over my head. But I'll take a
stab and see if I've got the general idea. Here's the relevant data
structure:

   type Sys_type is (Zarya, Unity, PMA1, PMA2);
   type Subsys_type is (CDH, CT, ECLSS, EPS, GNC, SM);
   subtype Desc_type is String(1..256);
   subtype Dur_Min_Type is Integer;
   subtype Dur_Sec_type is Integer;
   type Apm_Rec is
      record
         Description : Desc_Type;
         System      : Sys_Type;
         Subsystem   : Subsys_Type;
         Dur_Min     : Dur_Min_Type;
         Dur_Sec     : Dur_Sec_Type;
      end record;

The problematic parts were the Apm_Rec.Dur_Min and the Apm_Rec.Dur_Sec -
all the others read in just fine. If I'm understanding all this, should
I have changed

   subtype Dur_Min_Type is Integer;
   subtype Dur_Sec_type is Integer;

to

   subtype Dur_Min_Type is Integer(S'Bit_Order=>Low_Order_First);
   subtype Dur_Sec_type is Integer(S'Bit_Order=>Low_Order_First);

or perhaps High_Order_First - haven't got everything handy at the
moment. But the main question is do I have the right syntax and usage?
Or am I completely off here?

Really the main reason I'm concerned with this is I much prefer to do
my assignments on my own computer as opposed to telnetting into the
school's server - that telnet connection lags badly enough that most
any task requires much patience. If it weren't for that, I probably
wouldn't worry about it. I can work around this for future projects
(hopefully) the same way I did for this project - I'm just looking for
an easier way.

And I do appreciate all the pointers so far - I'm reading the LRM as I
have the opportunity. Just the slight problem of most of it being well
beyond my knowledge level. But I'm working on it.
--
Mike Werner

* Re: Big-endian vs little-endian
From: Matthew Heaney @ 1999-02-07  0:00 UTC

Mike Werner <mwerner@wvu.edu> writes:

> Here's the relevant data structure:
>
>    type Sys_type is (Zarya, Unity, PMA1, PMA2);
>    type Subsys_type is (CDH, CT, ECLSS, EPS, GNC, SM);
>    subtype Desc_type is String(1..256);
>    subtype Dur_Min_Type is Integer;
>    subtype Dur_Sec_type is Integer;
>    type Apm_Rec is
>       record
>          Description : Desc_Type;
>          System      : Sys_Type;
>          Subsystem   : Subsys_Type;
>          Dur_Min     : Dur_Min_Type;
>          Dur_Sec     : Dur_Sec_Type;
>       end record;
>
> The problematic parts were the Apm_Rec.Dur_Min and the Apm_Rec.Dur_Sec -
> all the others read in just fine. If I'm understanding all this, should
> I have changed
>
>    subtype Dur_Min_Type is Integer;
>    subtype Dur_Sec_type is Integer;
>
> to
>
>    subtype Dur_Min_Type is Integer(S'Bit_Order=>Low_Order_First);
>    subtype Dur_Sec_type is Integer(S'Bit_Order=>Low_Order_First);
>
> or perhaps High_Order_First - haven't got everything handy at the
> moment. But the main question is do I have the right syntax and usage?
> Or am I completely off here?

Yes, you are completely off.

Don't bother with the Bit_Order attribute. No compiler vendors support
it.

That leads us to this:

   type Sys_type is (Zarya, Unity, PMA1, PMA2);
   type Subsys_type is (CDH, CT, ECLSS, EPS, GNC, SM);

   type Apm_Rec is
      record
         Description : String (1 .. 256);
         System      : Sys_Type;
         Subsystem   : Subsys_Type;
         Dur_Min     : Integer range 0 .. 59;
         Dur_Sec     : Integer range 0 .. 59;
      end record;

There are two advantages to this:

1) we can pack the last four fields in one longword

2) the integer types fit in 1 byte, so we don't have to worry about
   byte-swapping

I think you should now be able to write a standard rep clause for this
record ("for Apm_Rec use ..."), that will work for both big- and
little-endian machines. (Because the latter fields are 1 byte, and the
representation of one-byte data on either machine is the same.)

If you do decide to use 32-bit integers for Dur_Min and Dur_Sec, you
still don't have a rep clause problem. But you do have a byte-swapping
problem.
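
One way the suggested rep clause might look is sketched below. The
byte offsets, the type name Minute_Or_Second and the procedure name
Rep_Demo are assumptions made for illustration, the sketch assumes
System.Storage_Unit = 8, and this layout would not match a file that
was already written with 32-bit integers for the two durations.

```ada
with Ada.Text_IO;

procedure Rep_Demo is

   type Sys_Type is (Zarya, Unity, PMA1, PMA2);
   type Subsys_Type is (CDH, CT, ECLSS, EPS, GNC, SM);

   --  Hypothetical type: small enough to fit in a single byte.
   type Minute_Or_Second is range 0 .. 59;

   type Apm_Rec is record
      Description : String (1 .. 256);
      System      : Sys_Type;
      Subsystem   : Subsys_Type;
      Dur_Min     : Minute_Or_Second;
      Dur_Sec     : Minute_Or_Second;
   end record;

   --  Assumed layout: the string first, then one byte per field.
   --  No component is wider than a byte, so byte order cannot matter.
   for Apm_Rec use record
      Description at   0 range 0 .. 2047;
      System      at 256 range 0 .. 7;
      Subsystem   at 257 range 0 .. 7;
      Dur_Min     at 258 range 0 .. 7;
      Dur_Sec     at 259 range 0 .. 7;
   end record;

   for Apm_Rec'Size use 260 * 8;

begin
   --  Reports the record size in bits as fixed by the clauses above.
   Ada.Text_IO.Put_Line (Integer'Image (Apm_Rec'Size));
end Rep_Demo;
```

The design point is the one made in the message: with every numeric
field confined to one byte, the same clause describes the same bytes on
both big- and little-endian machines.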

* Re: Big-endian vs little-endian
From: Stephen Leake @ 1999-02-09  0:00 UTC

Mike Werner <mwerner@wvu.edu> writes:

> Here's the relevant data structure:
>
>    type Sys_type is (Zarya, Unity, PMA1, PMA2);
>    type Subsys_type is (CDH, CT, ECLSS, EPS, GNC, SM);
>    subtype Desc_type is String(1..256);
>    subtype Dur_Min_Type is Integer;
>    subtype Dur_Sec_type is Integer;
>    type Apm_Rec is
>       record
>          Description : Desc_Type;
>          System      : Sys_Type;
>          Subsystem   : Subsys_Type;
>          Dur_Min     : Dur_Min_Type;
>          Dur_Sec     : Dur_Sec_Type;
>       end record;
>
> The problematic parts were the Apm_Rec.Dur_Min and the Apm_Rec.Dur_Sec -
> all the others read in just fine.

You have a byte-endianness problem. System.Bit_Order addresses a
bit-endianness problem. They are similar, but different.

The best Ada solution is to use streams to read the binary file. You
have to define your own Integer type (you should do this anyway, to
make sure it is the same size as the school server's Integer type!).
Then you can define the stream Read and Write attributes for that type
to do byte swapping.

If you're not up to streams (quite understandable :), you can just use
Unchecked_Conversion. Assuming 32 bit integers, do something like:

   with Interfaces;
   with Ada.Unchecked_Conversion;

   type Network_4_Bytes is record
      Hi_Byte  : Interfaces.Unsigned_8;
      Byte_3   : Interfaces.Unsigned_8;
      Byte_2   : Interfaces.Unsigned_8;
      Low_Byte : Interfaces.Unsigned_8;
   end record;
   pragma Pack (Network_4_Bytes);
   for Network_4_Bytes'Size use 32; -- confirm size

   function To_Network is new Ada.Unchecked_Conversion
      (Source => Interfaces.Integer_32,
       Target => Network_4_Bytes);

Of course, to make your code portable to your school computer, you'll
have to hide this in a body, and set a compile-time flag to decide
whether to swap bytes or not. I define a package Endianness to handle
the compile-time flag.

Good luck!

-- Stephe
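
The Unchecked_Conversion approach above can be fleshed out into a
complete byte swap roughly as follows. This is a sketch under stated
assumptions, not Stephen Leake's code: it uses an array of four bytes
instead of the Network_4_Bytes record so the bytes can be reversed by
index, and the names Swap_Demo, Byte_Array, To_Bytes, To_Integer and
Swap are made up for illustration.

```ada
with Ada.Text_IO;
with Ada.Unchecked_Conversion;
with Interfaces;

procedure Swap_Demo is

   use type Interfaces.Integer_32;

   --  The four bytes of a 32-bit integer, in storage order.
   type Byte_Array is array (1 .. 4) of Interfaces.Unsigned_8;
   for Byte_Array'Size use 32;  --  confirm size, as in the message

   function To_Bytes is new Ada.Unchecked_Conversion
     (Source => Interfaces.Integer_32,
      Target => Byte_Array);

   function To_Integer is new Ada.Unchecked_Conversion
     (Source => Byte_Array,
      Target => Interfaces.Integer_32);

   --  Reverse the storage order of the four bytes.
   function Swap
     (Value : Interfaces.Integer_32) return Interfaces.Integer_32
   is
      B : constant Byte_Array := To_Bytes (Value);
   begin
      return To_Integer ((B (4), B (3), B (2), B (1)));
   end Swap;

   Raw : constant Interfaces.Integer_32 := 16#0102_0304#;

begin
   --  Show the value before and after swapping.
   Ada.Text_IO.Put_Line (Interfaces.Integer_32'Image (Raw));
   Ada.Text_IO.Put_Line (Interfaces.Integer_32'Image (Swap (Raw)));

   --  Swapping twice gives back the original value.
   if Swap (Swap (Raw)) = Raw then
      Ada.Text_IO.Put_Line ("round trip OK");
   end if;
end Swap_Demo;
```

In a real program, Swap would be applied to Dur_Min and Dur_Sec right
after each record is read, and only when the file was written on a
machine of the opposite byte order, which is where the compile-time
flag mentioned in the message comes in.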

* Re: Big-endian vs little-endian
From: Mike Werner @ 1999-02-10  0:00 UTC

Thanks for all the tips and pointers so far. I spoke with my instructor
today and it appears we may not be using binary data files for a while,
so I've got some time to try and figure out what some of the tips I've
received mean. ;) If I haven't figured it out by then, well I guess
I'll try the kludge I used last time. It wasn't pretty, but it worked.
My questions here were more out of puzzlement than anything else.
--
Mike Werner