From: "Dmitry A. Kazakov"
Newsgroups: comp.lang.ada
Subject: Re: Does Ada support endiannes?
Date: Wed, 4 Jan 2012 12:56:03 +0100
Organization: cbb software GmbH

On Tue, 3 Jan 2012 20:33:30 -0800, Charles H. Sampson wrote:

> Dmitry A. Kazakov wrote:
>
>> On Sun, 25 Dec 2011 13:42:45 -0800, Charles H. Sampson wrote:
>>
>>> I'm not clear on what you mean by "representation clauses ...
>>> tend to convert things as a whole". Representation clauses don't convert
>>> anything. They just specify how the bits are laid out in memory.
>>
>> But you want to use them for conversions. You somehow put raw data there
>> and then reinterpret that as an integral ADT object, usually an integer.
>
> I don't reinterpret the raw data. Based on some knowledge of the
> data the representation clause defines its interpretation. Until we get
> to that point, the data are, for me, just a string of bits. Very seldom
> is that interpretation an integer; most often it is a somewhat complex
> structure of non-homogeneous components, the very definition of a record.

String of bits (array) /= record.

>> What about a task object? Consider an agent-based application using some
>> protocol to transport agents (tasks) over the network. Would you
>> reconstruct an agent from bits and bytes using a representation clause?
>> (:-))
>
> I have no concept of a task object being transported over a
> network.

That is because you have no concept of an object at all. A task is an
object and an integer is also an object. Neither is a string of bits,
though a string of bits could be used to represent [encode] either.

>>> For example, to change
>>> the endianness of a two-byte word, I might write something like
>>>
>>>    Internal_Small_Word := 256 * Incoming.Byte_2 + Incoming.Byte_1;
>>>
>>> It's more than likely that I would use a representation clause for
>>> Internal_Small_Word also and just do two assignments:
>>>
>>>    Internal_Small_Word.Byte_1 := Incoming.Byte_2;
>>>    Internal_Small_Word.Byte_2 := Incoming.Byte_1;
>>>
>>> It won't surprise you when I say that I think the latter is much clearer
>>> as to intent and meaning than any arithmetic manipulating, including my
>>> former example.
>>
>> No. The first variant follows the mathematical definition of how an integer
>> of the range 0 .. 2**16 - 1 is encoded as a sequence of octets (integers
>> range 0 .. 2**8 - 1).
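
To make the quoted definition concrete, here is a minimal sketch, assuming
octets are modelled as Interfaces.Unsigned_8 and arrive as a two-element
array. The names Octet_Pair, Big_Endian_Value and Little_Endian_Value are
hypothetical and appear nowhere in the thread; the point is only that the
byte order shows up as the weight given to each octet, not as a memory
layout.

```ada
with Ada.Text_IO;
with Interfaces; use Interfaces;

procedure Decode_Word_Demo is

   --  Hypothetical type: two octets in the order they were received.
   type Octet_Pair is array (1 .. 2) of Unsigned_8;

   --  First octet is the most significant: value = 256 * o1 + o2.
   function Big_Endian_Value (Data : Octet_Pair) return Unsigned_16 is
   begin
      return 256 * Unsigned_16 (Data (1)) + Unsigned_16 (Data (2));
   end Big_Endian_Value;

   --  First octet is the least significant: value = o1 + 256 * o2.
   function Little_Endian_Value (Data : Octet_Pair) return Unsigned_16 is
   begin
      return Unsigned_16 (Data (1)) + 256 * Unsigned_16 (Data (2));
   end Little_Endian_Value;

   Incoming : constant Octet_Pair := (16#12#, 16#34#);

begin
   --  The same two octets give 16#1234# or 16#3412#, depending only on
   --  which definition of the encoding applies.
   Ada.Text_IO.Put_Line (Unsigned_16'Image (Big_Endian_Value (Incoming)));
   Ada.Text_IO.Put_Line (Unsigned_16'Image (Little_Endian_Value (Incoming)));
end Decode_Word_Demo;
```

Neither function depends on the endianness of the machine the program runs
on, since only arithmetic on values is involved.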
>
> And the second variant shows how to rearrange some of the incoming
> bits to get the same value in native machine form.

Value of what? Same as what? Where and how is it defined?

>> Your code uses obscure assignments to some fields of some record type,
>> whose connection to the mathematical definition of an encoded number is not
>> clear to the reader, as well as the correctness of the code, because the
>> order of bytes and gaps between them is specified somewhere else by a
>> representation clause.
>
> And your code uses obscure multiplications that are really a
> disguised form of shifts. If you're thinking shifts, why not use a
> shift function?

On the contrary, shifts are disguised multiplications. See the question
above. Any encoding of integer values *shall* in its definition use
arithmetical operations to reconstruct the value.

A more general statement is that any encoding of whatever object must in
its definition use the operations defined on the corresponding ADT. That
immediately follows from the definition of ADT in a typed language. You just
have no other means to create an object, again, if the language is typed.

> From a mathematical point of view, your multiplication code is
> treating two bytes of the incoming code as digits in a base 256
> representation. For a mathematician, that's clear. It can be confusing
> and surprising to a non-mathematician.

There is no other view, sorry. Encoding is *defined* as a translation into
an alphabet, which in the case of octets are symbols 0,1,..,255.

>> Note also that all examples you and others have so far provided, go no
>> further than swapping bytes and trivial bit fields extraction. This is too
>> little for real-life applications.
>
> The subject of this thread had to do with endianness.

So?

>> Consider a PT100 WAGO Modbus module. It delivers a WORD (two bytes) in
>> so-called Siemens format. 12 MSBs encode the value from -200 to 883 degrees
>> Celsius represented as 0400..7FF8.
>> Bigger and lesser values are cropped.
>> The 4 LSBs are the diagnostics bits (circuit open/close etc). You
>> could not handle this using a representation clause in one step. You will
>> have first to swap bytes to build a 16 bit integer and use it to obtain the
>> temperature by masking and shifting. Or, first reorder and extract 12 bits,
>> extract 4 bits, check, recode 12 bits into value. In both cases
>> arithmetical operations are unavoidable.
>
> I'm pretty sure I could do it to my satisfaction but I've got too
> many questions about your description to actually write any code. For
> example, are 200 and 883 in decimal or hex?

Decimal.

> What about endianness?

None, in the sense that different ModBus commands allow you to read
individual bytes and/or words at some ModBus addresses, whose meaning
depends on the command mode being used. Furthermore, bytes can be scattered,
i.e. lie on non-consecutive ModBus addresses. Welcome to the real world!

> Doesn't the value 7FF8 preclude using the 4 LSBs as diagnostic bits?

No.

> Regarding "You could not handle this using a representation clause
> in one step.", I don't know what "this" refers to. Even if I did, what
> is the importance of "one step"? Am I correct in assuming that "one
> step" means "one statement"?

If the representation clauses could have handled real-life protocols, then
you would need no other means in order to get values out of read data.
Which would mean that all happens in just one step: you read data into
memory and then take values from the data structure located at the
physically same address. That does not work except for quite elementary
cases.

>>> Data corruption is handled by whatever mechanism is in the data
>>> packets to guard against corruption.
>>
>> As I said, you need something to add, in order to check validity of the
>> values obtained from the conversion.
>> X'Valid is no help in the examples
>> like above, when values are saturated and/or additional bits or patterns
>> are used to indicate validness.
>>
>> Another example is chain codes. E.g. the following encoding of cardinal
>> numbers:
>>
>>       0 -> 2#0000_0000#
>>       1 -> 2#0000_0001#
>>       ...
>>     127 -> 2#0111_1111#
>>     128 -> 2#1000_0000# 2#0000_0001#
>>     129 -> 2#1000_0001# 2#0000_0001#
>>       ...
>>   16383 -> 2#1111_1111# 2#0111_1111#
>>   16384 -> 2#1000_0000# 2#1000_0000# 2#0000_0001#
>>       ...
>>
>> [Chain codes are used when the upper bound of the transported number is
>> unspecified and it is expected that lesser values are more frequent than
>> bigger ones. UTF-8 is a chain code.]
>>
>> Care to write a representation clause for the above?
>
> No, because they are not of a record type.

Why should they be?

> Give me some
> information about how the chains are laid out in memory and I should be
> able to come up with some useful type definitions.

They aren't laid out anywhere. It is your way of thinking of it. [Which is
inherently flawed, as I am trying to show you by these examples] The octets
are sent over some octet stream, that is all.

>>> Using representation clauses
>>> when the hardware of one machine doesn't match the hardware of another?
>>
>> All sorts of I/O: reading data from a hardware port, from a socket, from
>> dual-ported shared memory etc.
>
> I started to say once before that I think I see the primary
> difference between us but I never quite got it out. To me, data being
> input "over a wire" are a sequence of bits and I use record types with
> representation clauses as soon as I can to specify the meaning of those
> bits. You seem to see those data as a stream of bytes that contain
> integers in the range 0 .. 255 and you specify the meaning by doing
> arithmetic manipulations.

No, you are wrong. The transport protocol (wire) can be of any sort.
IFF that protocol is *serial*

   http://en.wikipedia.org/wiki/Serial_communication

THEN, per definition of serial, it is a stream of bits. There exist
zillions of protocols which aren't serial.

All this boils down to the definition of encoding. Remember, encoding is a
translation into a certain alphabet:

   http://en.wikipedia.org/wiki/Encoding

When the alphabet is {0,1}, you have a serial or binary protocol. There are
other alphabets, you know. Network protocols use streams of octets
{0,1,2,...,255}, or packets of octets. The typewriter uses letters. Morse
Code uses dot and dash, etc.

> I also believe in "programming with data" and
> it appears that you don't, or at least you don't to the degree that I
> do. Those are pretty fundamental differences between us from the
> get-go so it's not surprising that we end up with quite different looking
> code to accomplish the same task.

Well, if you believe in "programming with data," then there must be
"programming without data" in your world... Not in mine.

> If my characterization of your approach is correct, then how do
> you handle a stream of bytes when some of them represent characters?

That depends on:

1. the meaning of the word byte
2. the character type. We have many in Ada.
3. the encoding. There are many character encodings: UTF-n, UCS-n,
   EBCDIC, RADIX-50, ASCII, KOI-n, to name a few.

> Do you use 'Val?

If 1 = octet, 2 = Character, 3 = ASCII or Latin-1

> What do you do when some of the bytes, when combined
> properly, represent a "signed" integer (i.e. a value that could be
> negative)?

See 1. You first give the mathematical definition of byte, e.g.

1. Integer range 0..255
2. Integer range -128..127
3. Ordered set of bits {0,1}**8
4. Ordered set of hexadecimal digits {0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F}**2
...

based on how the communication hardware and the machine type chosen to
represent byte interact. Then you define the encoding, i.e.
a mapping from the values of the Ada character type to the symbols of the
alphabet above. E.g.:

   A -> -34
   B -> 56
   C -> 1
   ...

Then that mapping gets implemented. If 'Val is appropriate for that, then
why not use it? In real life, octet/Interfaces.Unsigned_8, Character, and
ASCII cover almost 90% of cases. For them, I customarily use Character'Val.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
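
P.S. For the chain code quoted earlier in this message, a decoder might be
sketched as follows. This is only an illustration under stated assumptions:
each octet carries 7 value bits, least significant group first, and an
octet below 128 ends the number. Octet_Array and Decode are hypothetical
names, and the sketch does no overflow checking (Unsigned_32 arithmetic
wraps silently).

```ada
with Ada.Text_IO;
with Interfaces; use Interfaces;

procedure Chain_Code_Demo is

   --  Hypothetical type: octets as taken from some octet stream.
   type Octet_Array is array (Positive range <>) of Unsigned_8;

   --  Decodes one number. Each octet contributes its lower 7 bits,
   --  weighted by 128 ** (position - 1). The first octet less than 128
   --  terminates the number.
   function Decode (Data : Octet_Array) return Unsigned_32 is
      Value  : Unsigned_32 := 0;
      Weight : Unsigned_32 := 1;
   begin
      for I in Data'Range loop
         Value := Value + Weight * Unsigned_32 (Data (I) mod 128);
         if Data (I) < 128 then
            return Value;
         end if;
         Weight := Weight * 128;
      end loop;
      --  Ran out of octets before seeing a terminating one.
      raise Constraint_Error with "truncated chain code";
   end Decode;

begin
   Ada.Text_IO.Put_Line (Unsigned_32'Image (Decode ((1 => 16#7F#))));
   Ada.Text_IO.Put_Line (Unsigned_32'Image (Decode ((16#80#, 16#01#))));
   Ada.Text_IO.Put_Line
     (Unsigned_32'Image (Decode ((16#80#, 16#80#, 16#01#))));
end Chain_Code_Demo;
```

The length of the encoded number is known only after its octets have been
inspected, which is why the decoding is written with arithmetic on values
rather than as a fixed layout.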