comp.lang.ada
* Encapsulating Ada.Direct_IO
@ 2010-11-17  4:44 Bryan
  2010-11-17  5:20 ` Adam Beneschan
                   ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Bryan @ 2010-11-17  4:44 UTC (permalink / raw)


I'm trying to port some code to Ada for dealing with Big5-encoded
files. I realize that I might be able to use Ada.Wide_Text_IO, but I'm
trying to learn Ada and understand the language better.  I'm still
working on wrapping my head around types and packages.  My original
code in C++ opens a file as binary and parses it byte by byte,
breaking it into Big5 characters depending on the byte codes.  I
thought I'd try to do something similar by encapsulating Ada.Direct_IO
into a package.  I'm not having much luck, however.

Spec file:
======================
with Ada.Direct_IO;
package Big5_Text_IO is
  type File_Type is limited private;
  procedure Close( File : in out File_Type );
private
  package Byte_IO is new Ada.Direct_IO(Character);
  type File_Type is new Byte_IO.File_Type;
end Big5_Text_IO;
======================

Body file
======================
package body Big5_Text_IO is
  procedure Close( File : in out File_Type ) is
  begin
    Byte_IO.Close(File);
  end Close;
end Big5_Text_IO;
======================

Test driver:
======================
with Big5_Text_IO;
with Ada.Text_IO;
procedure Big5_Test is
  Input_File : Big5_Text_IO.File_Type;
begin
  Ada.Text_IO.Put_Line("OK?");
end Big5_Test;
======================

If I leave out the Close method and remove the body file, I can build
the test driver with no issues.  Otherwise, I get the following from
GNAT:

======================
gcc -c big5_text_io.adb
big5_text_io.adb:6:23: expected private type "Ada.Direct_Io.File_Type"
from instance at big5_text_io.ads:11
big5_text_io.adb:6:23: found private type "Big5_Text_IO.File_Type"
defined at big5_text_io.ads:12
gnatmake: "big5_text_io.adb" compilation error
======================

I would *greatly* appreciate any tips on how I can better design my
package so that it can encapsulate Ada.Direct_IO or some other method
of binary I/O.  I looked at the GNAT source and I'm hoping I won't
have to emulate what they have done... it's a bit over my head at this
point.




* Re: Encapsulating Ada.Direct_IO
  2010-11-17  4:44 Encapsulating Ada.Direct_IO Bryan
@ 2010-11-17  5:20 ` Adam Beneschan
  2010-11-26 15:31   ` Bryan
  2010-11-17 12:25 ` Peter C. Chapin
  2010-11-17 22:32 ` Yannick Duchêne (Hibou57)
  2 siblings, 1 reply; 23+ messages in thread
From: Adam Beneschan @ 2010-11-17  5:20 UTC (permalink / raw)


On Nov 16, 8:44 pm, Bryan <brobinson....@gmail.com> wrote:
> I'm trying to port some code to Ada for dealing with Big5-encoded
> files. I realize that I might be able to use Ada.Wide_Text_IO, but I'm
> trying to learn Ada and understand the language better.  I'm still
> working on wrapping my head around types and packages.  My original
> code in C++ opens a file as binary and parses it byte by byte,
> breaking it into Big5 characters depending on the byte codes.  I
> thought I'd try to do something similar by encapsulating Ada.Direct_IO
> into a package.  I'm not having much luck, however.
>
> Spec file:
> ======================
> with Ada.Direct_IO;
> package Big5_Text_IO is
>   type File_Type is limited private;
>   procedure Close( File : in out File_Type );
> private
>   package Byte_IO is new Ada.Direct_IO(Character);
>   type File_Type is new Byte_IO.File_Type;
> end Big5_Text_IO;
> ======================
>
> Body file
> ======================
> package body Big5_Text_IO is
>   procedure Close( File : in out File_Type ) is
>   begin
>         Byte_IO.Close(File);
>   end Close;
> end Big5_Text_IO;
> ======================
>
> Test driver:
> ======================
> with Big5_Text_IO;
> with Ada.Text_IO;
> procedure Big5_Test is
>   Input_File : Big5_Text_IO.File_Type;
> begin
>   Ada.Text_IO.Put_Line("OK?");
> end Big5_Test;
> ======================
>
> If I leave out the Close method and remove the body file, I can build
> the test driver with no issues.  Otherwise, I get the following from
> GNAT:
>
> ======================
> gcc -c big5_text_io.adb
> big5_text_io.adb:6:23: expected private type "Ada.Direct_Io.File_Type"
> from instance at big5_text_io.ads:11
> big5_text_io.adb:6:23: found private type "Big5_Text_IO.File_Type"
> defined at big5_text_io.ads:12
> gnatmake: "big5_text_io.adb" compilation error
> ======================
>
> I would *greatly* appreciate any tips on how I can better design my
> package so that it can encapsulate Ada.Direct_IO or some other method
> of binary I/O.  I looked at the GNAT source and I'm hoping I won't
> have to emulate what they have done... it's a bit over my head at this
> point.

You're close.  Try changing

   Byte_IO.Close (File);

to

   Byte_IO.Close (Byte_IO.File_Type (File));

When you declare a derived type "type T2 is new T1", then T2 and T1
are not the same type, so you can't use an object of type T2 where
something of type T1 is expected.  But you can use a type conversion.

Note: I'm at home so I can't try this easily.  I seem to recall that
there were some issues using this paradigm with limited types
(including an incompatibility with earlier versions of the language),
but I don't recall the details and it's hard for me to look them up
right now.  If it turns out the type conversion doesn't work, then you
might have to make File_Type a record in the private part:

   type File_Type is record
      F : Byte_IO.File_Type;
   end record;

and then use File.F whenever you want to use a Byte_IO operation,
e.g.:

   Byte_IO.Close (File.F);
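
For completeness, here is a minimal sketch of the record-wrapper variant
written out as a full spec and body.  It is a sketch only: the Open
procedure is an illustrative addition that was not in the original
package, and the two units are shown in one listing although they would
normally live in big5_text_io.ads and big5_text_io.adb.

```ada
--  big5_text_io.ads
with Ada.Direct_IO;

package Big5_Text_IO is
   type File_Type is limited private;

   --  Open is a hypothetical addition, shown only to illustrate that
   --  every wrapped operation follows the same File.F pattern.
   procedure Open  (File : in out File_Type; Name : String);
   procedure Close (File : in out File_Type);
private
   package Byte_IO is new Ada.Direct_IO (Character);

   --  The full type is limited because its only component is limited.
   type File_Type is record
      F : Byte_IO.File_Type;
   end record;
end Big5_Text_IO;

--  big5_text_io.adb
package body Big5_Text_IO is

   procedure Open (File : in out File_Type; Name : String) is
   begin
      Byte_IO.Open (File.F, Byte_IO.In_File, Name);
   end Open;

   procedure Close (File : in out File_Type) is
   begin
      Byte_IO.Close (File.F);
   end Close;

end Big5_Text_IO;
```

With this layout no type conversion is needed in the body, at the cost
of writing File.F in each wrapper.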

Hope this helps,

                                  -- Adam




* Re: Encapsulating Ada.Direct_IO
  2010-11-17  4:44 Encapsulating Ada.Direct_IO Bryan
  2010-11-17  5:20 ` Adam Beneschan
@ 2010-11-17 12:25 ` Peter C. Chapin
  2010-11-18  1:16   ` Randy Brukardt
  2010-11-17 22:32 ` Yannick Duchêne (Hibou57)
  2 siblings, 1 reply; 23+ messages in thread
From: Peter C. Chapin @ 2010-11-17 12:25 UTC (permalink / raw)


On 2010-11-16 23:44, Bryan wrote:

> Spec file:
> ======================
> with Ada.Direct_IO;
> package Big5_Text_IO is
>   type File_Type is limited private;
>   procedure Close( File : in out File_Type );
> private
>   package Byte_IO is new Ada.Direct_IO(Character);
>   type File_Type is new Byte_IO.File_Type;
> end Big5_Text_IO;
> ======================

Direct_IO allows random access. There is nothing wrong with that, of
course, but if your intention is to read the file sequentially you might
prefer using Sequential_IO.

Something I wonder about (I don't have the answer) is whether it is
necessary to use a representation clause to force the size of the
objects being read to be 8 bits. I'm a little unclear whether the
standard requires Character to
be stored in a file in 8 bit units. That is, the language might treat
the type Character rather more abstractly than you want. Again I'm not
sure of this and I'd love to get some insights from others myself.

Thus you might want to do something like

package Big5_Text_IO is
  ...
  type Byte is mod 2**8;
  for Byte'Size use 8;
  ...
private
  package Byte_IO is new Ada.Direct_IO(Byte);
  ...
end Big5_Text_IO;

This approach has the advantage of creating a separate type to represent
the raw data from the file. Thus

C : Character;
B : Big5_Text_IO.Byte;

C := B;  -- Compile error. You haven't decoded B yet.
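
To make the idea concrete, here is a small self-contained sketch of
splitting a byte sequence into one-byte and two-byte characters using
such a Byte type.  The lead-byte range 16#81# .. 16#FE# is an assumption
about the Big5 variant in use, the sample data is made up, and the bytes
come from an in-memory array rather than a file to keep the example
short.

```ada
with Ada.Text_IO;

procedure Big5_Split_Demo is
   type Byte is mod 2**8;
   for Byte'Size use 8;

   type Byte_Array is array (Positive range <>) of Byte;

   --  Assumed lead-byte range; check it against the Big5 variant
   --  you actually need to support.
   function Is_Lead (B : Byte) return Boolean is
   begin
      return B in 16#81# .. 16#FE#;
   end Is_Lead;

   --  'A', one two-byte character (16#A4# 16#40#), then 'B'.
   Data  : constant Byte_Array := (16#41#, 16#A4#, 16#40#, 16#42#);

   --  Decoding a raw byte into a Character must be spelled out.
   First : constant Character := Character'Val (Data (Data'First));

   I     : Positive := Data'First;
   Count : Natural  := 0;
begin
   while I <= Data'Last loop
      if Is_Lead (Data (I)) and then I < Data'Last then
         I := I + 2;  --  lead byte plus trail byte
      else
         I := I + 1;  --  single-byte character
      end if;
      Count := Count + 1;
   end loop;

   pragma Assert (First = 'A');
   pragma Assert (Count = 3);
   Ada.Text_IO.Put_Line ("Characters:" & Natural'Image (Count));
end Big5_Split_Demo;
```

The explicit Character'Val call is the point: nothing converts a Byte
to a Character by accident.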

Peter




* Re: Encapsulating Ada.Direct_IO
  2010-11-17  4:44 Encapsulating Ada.Direct_IO Bryan
  2010-11-17  5:20 ` Adam Beneschan
  2010-11-17 12:25 ` Peter C. Chapin
@ 2010-11-17 22:32 ` Yannick Duchêne (Hibou57)
  2010-11-17 23:03   ` Adam Beneschan
  2 siblings, 1 reply; 23+ messages in thread
From: Yannick Duchêne (Hibou57) @ 2010-11-17 22:32 UTC (permalink / raw)


On Wed, 17 Nov 2010 05:44:49 +0100, Bryan <brobinson.eng@gmail.com>
wrote:
> Spec file:
> ...
> private
>   package Byte_IO is new Ada.Direct_IO(Character);
>   type File_Type is new Byte_IO.File_Type;
> end Big5_Text_IO;
> ======================
Why did you use a “type new”? Unless you have a specific reason, a
subtype is better here, as it avoids meaningless conversions everywhere.
If the type were defined in the public part, the type-new would make
sense, but here it is defined in the private part, so there is no need
for type-new unless you want to attach special invariants to it that are
not part of the invariants of your Byte_IO.File_Type.

When you begin with Ada, you often forget to make a deliberate choice
between type-new and subtype (probably because other languages have
nothing comparable). Always weigh type-new against subtype, and don't
systematically use type-new everywhere. Make choices with reasons.

-- 
If cats meow and make so many strange vocalizations, it is not for
the dogs.

“I am fluent in ASCII” [Warren 2010]




* Re: Encapsulating Ada.Direct_IO
  2010-11-17 22:32 ` Yannick Duchêne (Hibou57)
@ 2010-11-17 23:03   ` Adam Beneschan
  2010-11-17 23:11     ` Yannick Duchêne (Hibou57)
  0 siblings, 1 reply; 23+ messages in thread
From: Adam Beneschan @ 2010-11-17 23:03 UTC (permalink / raw)


On Nov 17, 2:32 pm, Yannick Duchêne (Hibou57)
<yannick_duch...@yahoo.fr> wrote:
> On Wed, 17 Nov 2010 05:44:49 +0100, Bryan <brobinson....@gmail.com>
> wrote:
> > Spec file:
> > ...
> > private
> >   package Byte_IO is new Ada.Direct_IO(Character);
> >   type File_Type is new Byte_IO.File_Type;
> > end Big5_Text_IO;
> > ======================
>
> Why did you use a “type new”? Unless you have a specific reason, a
> subtype is better here

File_Type was declared as a private type.  You cannot complete a
private type with a subtype.
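
As a sketch of the distinction (the package name Completion_Demo is made
up for illustration; the rest mirrors the spec from the original post):

```ada
with Ada.Direct_IO;

package Completion_Demo is
   type File_Type is limited private;
private
   package Byte_IO is new Ada.Direct_IO (Character);

   --  Not legal: a subtype declaration cannot complete a private type.
   --  subtype File_Type is Byte_IO.File_Type;

   --  Legal: the completion is a full type declaration, here a
   --  derived type.
   type File_Type is new Byte_IO.File_Type;
end Completion_Demo;
```

A record with a Byte_IO.File_Type component would be another legal
completion.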

                    -- Adam





* Re: Encapsulating Ada.Direct_IO
  2010-11-17 23:03   ` Adam Beneschan
@ 2010-11-17 23:11     ` Yannick Duchêne (Hibou57)
  0 siblings, 0 replies; 23+ messages in thread
From: Yannick Duchêne (Hibou57) @ 2010-11-17 23:11 UTC (permalink / raw)


On Thu, 18 Nov 2010 00:03:06 +0100, Adam Beneschan <adam@irvine.com>
wrote:
> File_Type was declared as a private type.  You cannot complete a
> private type with a subtype.
Oops, I had a bug, sorry.

-- 
If cats meow and make so many strange vocalizations, it is not for
the dogs.

“I am fluent in ASCII” [Warren 2010]




* Re: Encapsulating Ada.Direct_IO
  2010-11-17 12:25 ` Peter C. Chapin
@ 2010-11-18  1:16   ` Randy Brukardt
  2010-11-18  2:21     ` Peter C. Chapin
                       ` (4 more replies)
  0 siblings, 5 replies; 23+ messages in thread
From: Randy Brukardt @ 2010-11-18  1:16 UTC (permalink / raw)


"Peter C. Chapin" <pcc482719@gmail.com> wrote in message 
news:tM6dnaTzn6shVH7RRVn_vwA@giganews.com...
...
> Something I wonder about (I don't have the answer) is whether it is
> necessary to use a representation clause to force the size of the
> objects being read to be 8 bits. I'm a little unclear whether the
> standard requires Character to
> be stored in a file in 8 bit units. That is, the language might treat
> the type Character rather more abstractly than you want. Again I'm not
> sure of this and I'd love to get some insights from others myself.

Surely not. Not all machines have 8-bits as any sort of native type. For 
instance, the Unisys U2200 (a 36-bit machine, with 9-bit bytes) used 
Character'Size = 9. (It was great fun for the cross-compiler.)

                                      Randy.






* Re: Encapsulating Ada.Direct_IO
  2010-11-18  1:16   ` Randy Brukardt
@ 2010-11-18  2:21     ` Peter C. Chapin
  2010-11-18 16:36       ` Adam Beneschan
  2010-11-18  7:39     ` AdaMagica
                       ` (3 subsequent siblings)
  4 siblings, 1 reply; 23+ messages in thread
From: Peter C. Chapin @ 2010-11-18  2:21 UTC (permalink / raw)


On 2010-11-17 20:16, Randy Brukardt wrote:

> Surely not. Not all machines have 8-bits as any sort of native type. For 
> instance, the Unisys U2200 (a 36-bit machine, with 9-bit bytes) used 
> Character'Size = 9. (It was great fun for the cross-compiler.)

So in that case if you absolutely wanted to read 8 bit units from a file
(because the file is in some externally defined binary format that uses
8 bit units) it would be necessary to do something like:

type My_Byte is mod 2**8;
for My_Byte'Size use 8;   -- This is important.

package My_Byte_IO is new Ada.Sequential_IO(My_Byte);

... and then convert from My_Byte to Character only as appropriate
during the file decoding process.

True?

Peter




* Re: Encapsulating Ada.Direct_IO
  2010-11-18  1:16   ` Randy Brukardt
  2010-11-18  2:21     ` Peter C. Chapin
@ 2010-11-18  7:39     ` AdaMagica
  2010-11-18 18:38       ` Randy Brukardt
  2010-11-18  9:46     ` Maciej Sobczak
                       ` (2 subsequent siblings)
  4 siblings, 1 reply; 23+ messages in thread
From: AdaMagica @ 2010-11-18  7:39 UTC (permalink / raw)
  Cc: randy

On 18 Nov., 02:16, "Randy Brukardt" <ra...@rrsoftware.com> wrote:
> ... For
> instance, the Unisys U2200 (a 36-bit machine, with 9-bit bytes) used
> Character'Size = 9. (It was great fun for the cross-compiler.)
>
>                                       Randy.

Huh. How then is Character defined there? According to RM A.1(35),
Character has 256 positions, so Character'Size should be still 8.
Of course stand-alone objects would have X'Size = 9.

Note that Natural'Size = Integer'Size - 1.
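
A quick way to see these values on a given compiler is a sketch like the
following.  The actual numbers are implementation-dependent, so no
output is shown; the assertion encodes the Natural/Integer relationship
stated above and is expected to hold on compilers that follow the RM's
advice for 'Size of subtypes.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Size_Demo is
   C : constant Character := 'x';
begin
   --  Size of the subtype versus size of a stand-alone object.
   Put_Line ("Character'Size =" & Integer'Image (Character'Size));
   Put_Line ("C'Size         =" & Integer'Image (C'Size));

   --  Natural needs one bit less than Integer (no sign bit).
   Put_Line ("Natural'Size   =" & Integer'Image (Natural'Size));
   Put_Line ("Integer'Size   =" & Integer'Image (Integer'Size));
   pragma Assert (Natural'Size = Integer'Size - 1);
end Size_Demo;
```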




* Re: Encapsulating Ada.Direct_IO
  2010-11-18  1:16   ` Randy Brukardt
  2010-11-18  2:21     ` Peter C. Chapin
  2010-11-18  7:39     ` AdaMagica
@ 2010-11-18  9:46     ` Maciej Sobczak
  2010-11-18 16:31     ` Adam Beneschan
  2010-11-24 21:31     ` Warren
  4 siblings, 0 replies; 23+ messages in thread
From: Maciej Sobczak @ 2010-11-18  9:46 UTC (permalink / raw)


On 18 Nov, 02:16, "Randy Brukardt" <ra...@rrsoftware.com> wrote:

> Surely not. Not all machines have 8-bits as any sort of native type. For
> instance, the Unisys U2200 (a 36-bit machine, with 9-bit bytes) used
> Character'Size = 9.

You have used the present tense for "not all machines have" and the
past tense for "Unisys used". :-)

Now let's synchronize these statements - what is the actual *current*
status of this niche? Is it still relevant enough in the industry to
justify standardization effort at the level of programming languages
and development effort at the level of user programs?
What would happen if we just assumed 8 as a common denominator for the
granularity of storage for any type (in particular Character'Size = 8)
and just moved on?

--
Maciej Sobczak * http://www.inspirel.com




* Re: Encapsulating Ada.Direct_IO
  2010-11-18  1:16   ` Randy Brukardt
                       ` (2 preceding siblings ...)
  2010-11-18  9:46     ` Maciej Sobczak
@ 2010-11-18 16:31     ` Adam Beneschan
  2010-11-18 17:05       ` Dmitry A. Kazakov
  2010-11-18 18:45       ` Randy Brukardt
  2010-11-24 21:31     ` Warren
  4 siblings, 2 replies; 23+ messages in thread
From: Adam Beneschan @ 2010-11-18 16:31 UTC (permalink / raw)


On Nov 17, 5:16 pm, "Randy Brukardt" <ra...@rrsoftware.com> wrote:
> "Peter C. Chapin" <pcc482...@gmail.com> wrote in message
> news:tM6dnaTzn6shVH7RRVn_vwA@giganews.com...
> ...
>
> > Something I wonder about (I don't have the answer) is whether it is
> > necessary to use a representation clause to force the size of the
> > objects being read to be 8 bits. I'm a little unclear whether the
> > standard requires Character to
> > be stored in a file in 8 bit units. That is, the language might treat
> > the type Character rather more abstractly than you want. Again I'm not
> > sure of this and I'd love to get some insights from others myself.
>
> Surely not. Not all machines have 8-bits as any sort of native type. For
> instance, the Unisys U2200 (a 36-bit machine, with 9-bit bytes) used
> Character'Size = 9. (It was great fun for the cross-compiler.)

Yeah, I was going to say something about Honeywell 600 series, which
was also a 36-bit machine and used 9-bit bytes.  My dad worked for
Honeywell so I have some more useless knowledge about some of their
machines than others would.  Unlike the Unisys U2200, all the Honeywell
machines were dead by the time Ada was developed, so I don't think
anyone ever tried to write an Ada compiler for them.  Then there was a
DEC machine that used 36-bit words and represented strings by sticking
five 7-bit ASCII characters in each word, but I don't remember much
else.  But now I'm drifting...

I did notice, though, that Peter was talking about how data was stored
in a *file*.  In the Unisys U2200 system, how were files kept,
conceptually?  Were they streams of 8-bit bytes, or were they stored
as 36-bit words or 9-bit bytes or what?  I'd assume that for a text
file, when you read it in, four bytes would be stored in each word and
the high bit of each 9-bit byte would be zeroed---whether that zero
bit was actually stored on disk or not should be the OS's concern, not
the program's, and I'd presume that the OS would also have to handle
things correctly when text files are transferred from a different
machine.

What I'm leading up to, though, is that I think Peter's question is
too simple.  We're all spoiled in having to deal exclusively, or
almost exclusively, with machines with 8-bit byte addressability and
files that are unstructured sequences of 8-bit bytes.  But there are
other systems out there.  There are machines in use that are not
byte-addressable; the Analog Devices 21060 series comes to mind, which
uses 32-bit words whose bytes are not individually addressable.  When
reading from a file on that system, do you want each word to hold one
byte, or four?  Even in VAX/VMS, which does run on a machine with 8-bit
byte addressability, the OS is able to create files that have more
structure than just being sequences of bytes.  What would it mean to
instantiate Direct_IO(Character) on a file like that?  I don't think
the answer is trivial.  How did the Pick operating system treat files
conceptually?  How would Direct_IO work on one of that system's files?

Ada's designers have tried to design a language that could work on any
of those systems, and therefore I think the standard does not and
cannot answer Peter's question.  In fact, I'm not entirely sure that
his question is meaningful on platforms that don't use 8-bit bytes
and/or use files with some structure.  (It might have to be rephrased.)
In any event, I think that details like this are left up to the
implementation.  And if you were trying to do something like this on a
U2200, there is no Ada answer to the question, because you would have
to know more about the particular OS's file system and how the Ada
implementation interacts with it.

                                -- Adam




* Re: Encapsulating Ada.Direct_IO
  2010-11-18  2:21     ` Peter C. Chapin
@ 2010-11-18 16:36       ` Adam Beneschan
  2010-11-18 18:21         ` Peter C. Chapin
  0 siblings, 1 reply; 23+ messages in thread
From: Adam Beneschan @ 2010-11-18 16:36 UTC (permalink / raw)


On Nov 17, 6:21 pm, "Peter C. Chapin" <pcc482...@gmail.com> wrote:
> On 2010-11-17 20:16, Randy Brukardt wrote:
>
> > Surely not. Not all machines have 8-bits as any sort of native type. For
> > instance, the Unisys U2200 (a 36-bit machine, with 9-bit bytes) used
> > Character'Size = 9. (It was great fun for the cross-compiler.)
>
> So in that case if you absolutely wanted to read 8 bit units from a file
> (because the file is in some externally defined binary format that uses
> 8 bit units) it would be necessary to do something like:
>
> type My_Byte is mod 2**8;
> for My_Byte'Size use 8;   -- This is important.
>
> package My_Byte_IO is new Ada.Sequential_IO(My_Byte);
>
> ... and then convert from My_Byte to Character only as appropriate
> during the file decoding process.
>
> True?

I have a longish rant that touches on this, but the short answer is
that I don't think there's an Ada answer to your question.  If you had
a file defined as using 8-bit units, and that file got put onto a
system that uses 36-bit words, you'd need to know just what the OS is
going to do with it, and how the particular Ada implementation will
deal with files on that OS.  It may be that the native "read from
file" service on that OS will put each byte into a 9-bit byte and zero
the high bit.  I don't see how to avoid implementation-dependent code
in a case like this.

                                       -- Adam




* Re: Encapsulating Ada.Direct_IO
  2010-11-18 16:31     ` Adam Beneschan
@ 2010-11-18 17:05       ` Dmitry A. Kazakov
       [not found]         ` <ENidndoH8qoqjHvRnZ2dnUVZ_j-dnZ2d@earthlink.com>
  2010-11-18 18:45       ` Randy Brukardt
  1 sibling, 1 reply; 23+ messages in thread
From: Dmitry A. Kazakov @ 2010-11-18 17:05 UTC (permalink / raw)


On Thu, 18 Nov 2010 08:31:20 -0800 (PST), Adam Beneschan wrote:

> Then there was a
> DEC machine that used 36-bit words and represented strings by sticking
> five 7-bit ASCII characters in each word, but I don't remember much
> else.

RADIX-50?

http://en.wikipedia.org/wiki/DEC_Radix-50

It was also used on 8-bit machines to encode file names in the FILES-11
file system.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de




* Re: Encapsulating Ada.Direct_IO
  2010-11-18 16:36       ` Adam Beneschan
@ 2010-11-18 18:21         ` Peter C. Chapin
  2010-11-18 18:36           ` Randy Brukardt
  2010-11-18 19:48           ` Adam Beneschan
  0 siblings, 2 replies; 23+ messages in thread
From: Peter C. Chapin @ 2010-11-18 18:21 UTC (permalink / raw)


On 2010-11-18 11:36, Adam Beneschan wrote:

> I have a longish rant that touches on this, but the short answer is
> that I don't think there's an Ada answer to your question.  If you had
> a file defined as using 8-bit units, and that file got put onto a
> system that uses 36-bit words, you'd need to know just what the OS is
> going to do with it, and how the particular Ada implementation will
> deal with files on that OS.  It may be that the native "read from
> file" service on that OS will put each byte into a 9-bit byte and zero
> the high bit.  I don't see how to avoid implementation-dependent code
> in a case like this.

I understand what you are saying but it is less than satisfying. :)

I'm thinking about the very common case when one is trying to read a
file that has a format defined by some third party. For example the
specification of the format might say, "The first octet of the header
defines the message type and can be one of the following values... The
type field is followed by a 24 bit length field in big endian form. The
body of the message follows the length field, and finally a 32 bit CRC
follows the message body."

I want to write an Ada program that can read in a file like this and
process it. Are you saying that it's impossible to write such a program
in a portable manner?

What I've been doing is as I showed earlier... define my own "Byte" type
with Byte'Size set to 8 and then instantiate Sequential_IO. My program
then interprets the individual bytes as necessary. It seems to work with
GNAT.
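
As a sketch of the "interprets the individual bytes" step for a header
like the one described above: the fields can be assembled arithmetically
from single bytes, so the result does not depend on the host's byte
order.  The header contents and the names Header, Msg_Type and Length
are made up for illustration, and the bytes come from an in-memory array
rather than a Sequential_IO file to keep the example self-contained.  It
assumes the bytes have already arrived as 8-bit values, which is exactly
the part that is implementation-dependent.

```ada
with Ada.Text_IO;
with Interfaces; use Interfaces;

procedure Header_Demo is
   type Byte is mod 2**8;
   for Byte'Size use 8;

   type Byte_Array is array (Positive range <>) of Byte;

   --  Hypothetical header: one octet of message type followed by a
   --  24-bit length in big-endian order.
   Header : constant Byte_Array := (16#01#, 16#00#, 16#01#, 16#02#);

   Msg_Type : constant Byte := Header (1);

   --  Most significant byte first: shift each byte into place.
   Length : constant Unsigned_32 :=
     Shift_Left (Unsigned_32 (Header (2)), 16)
     or Shift_Left (Unsigned_32 (Header (3)), 8)
     or Unsigned_32 (Header (4));
begin
   pragma Assert (Msg_Type = 1);
   pragma Assert (Length = 258);  --  16#000102#
   Ada.Text_IO.Put_Line ("Length:" & Unsigned_32'Image (Length));
end Header_Demo;
```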

I suppose that C has the same issue, really. The C standard does not
promise that char is exactly 8 bits. If it isn't I'm not sure what
happens when you do

int Ch;

while ((Ch = fgetc(infile)) != EOF ) { ... }

I guess that's the same point you are making. Maybe the C standard talks
about this issue. Checking...

I just took a quick look at C99's description of fgetc and it says, "the
fgetc function returns the next character from the input stream." That
seems to beg the question, doesn't it?

Peter




* Re: Encapsulating Ada.Direct_IO
  2010-11-18 18:21         ` Peter C. Chapin
@ 2010-11-18 18:36           ` Randy Brukardt
  2010-11-18 19:48           ` Adam Beneschan
  1 sibling, 0 replies; 23+ messages in thread
From: Randy Brukardt @ 2010-11-18 18:36 UTC (permalink / raw)


"Peter C. Chapin" <pcc482719@gmail.com> wrote in message 
news:w6udneqaMqdi83jRRVn_vwA@giganews.com...
> On 2010-11-18 11:36, Adam Beneschan wrote:
>
>> I have a longish rant that touches on this, but the short answer is
>> that I don't think there's an Ada answer to your question.  If you had
>> a file defined as using 8-bit units, and that file got put onto a
>> system that uses 36-bit words, you'd need to know just what the OS is
>> going to do with it, and how the particular Ada implementation will
>> deal with files on that OS.  It may be that the native "read from
>> file" service on that OS will put each byte into a 9-bit byte and zero
>> the high bit.  I don't see how to avoid implementation-dependent code
>> in a case like this.
>
> I understand what you are saying but it is less than satisfying. :)
>
> I'm thinking about the very common case when one is trying to read a
> file that has a format defined by some third party. For example the
> specification of the format might say, "The first octet of the header
> defines the message type and can be one of the following values... The
> type field is followed by a 24 bit length field in big endian form. The
> body of the message follows the length field, and finally a 32 bit CRC
> follows the message body."
>
> I want to write an Ada program that can read in a file like this and
> process it. Are you saying that it's impossible to write such a program
> in a portable manner?
>
> What I've been doing is as I showed earlier... define my own "Byte" type
> with Byte'Size set to 8 and then instantiate Sequential_IO. My program
> then interprets the individual bytes as necessary. It seems to work with
> GNAT.
>
> I suppose that C has the same issue, really. The C standard does not
> promise that char is exactly 8 bits. If it isn't I'm not sure what
> happens when you do
>
> int Ch;
>
> while ((Ch = fgetc(infile)) != EOF ) { ... }
>
> I guess that's the same point you are making. Maybe the C standard talks
> about this issue. Checking...
>
> I just took a quick look at C99's description of fgetc and it says, "the
> fgetc function returns the next character from the input stream." That
> seems to beg the question, doesn't it?

Adam is right. What we did is the same thing as the C compiler.

When you imported files from the "real world" into that system, it added a 
zero bit to every byte. So there normally would not be any such thing as a 
file with 8-bit characters. The same thing happened to sockets.

The real trouble came when you did the reverse, as it then *dropped* the 
high bit if it thought that the files were text files. That would be bad 
news for truly binary files.

Ada can only define things portably that it has control over. If you read 
and write files on the same machine, then the results should be portable. 
Once you start communicating to other machines, there can be translation 
layers that mess things up.

The good news is that it is pretty rare that you will have to deal with any 
machines that have other than 8-bit bytes these days. So I wouldn't worry 
about those issues unless you happen to be working with Unisys. ;-)

                                             Randy. 






* Re: Encapsulating Ada.Direct_IO
  2010-11-18  7:39     ` AdaMagica
@ 2010-11-18 18:38       ` Randy Brukardt
  0 siblings, 0 replies; 23+ messages in thread
From: Randy Brukardt @ 2010-11-18 18:38 UTC (permalink / raw)


"AdaMagica" <christoph.grein@eurocopter.com> wrote in message 
news:946c87a9-208a-4206-b925-1c48ac621acd@a37g2000yqi.googlegroups.com...
>On 18 Nov., 02:16, "Randy Brukardt" <ra...@rrsoftware.com> wrote:
>> ... For
>> instance, the Unisys U2200 (a 36-bit machine, with 9-bit bytes) used
>> Character'Size = 9. (It was great fun for the cross-compiler.)
>
>Huh. How then is Character defined there? According to RM A.1(35),
>Character has 256 positions, so Character'Size should be still 8.
>Of course stand-alone objects would have X'Size = 9.
>
>Note that Natural'Size = Integer'Size - 1.

I suppose you are right, but Ada 95 Type'Size has no important meaning. 
(It's a terrible definition, IMHO.) What matters is what AdaCore calls 
Type'Object_Size, and that is what I was referring to. (Typically, 
specifying Type'Size will have some effect on Type'Object_Size, but exactly 
what that is will vary depending on the target.)

You could, I suppose, have packed characters into 8-bits in an array, but 
the code to access them would have been unspeakably bad. And there would 
have been no reason to do so anyway, since files and streams are 
automatically converted when crossing into that machine's domain.

                                  Randy.








* Re: Encapsulating Ada.Direct_IO
  2010-11-18 16:31     ` Adam Beneschan
  2010-11-18 17:05       ` Dmitry A. Kazakov
@ 2010-11-18 18:45       ` Randy Brukardt
  1 sibling, 0 replies; 23+ messages in thread
From: Randy Brukardt @ 2010-11-18 18:45 UTC (permalink / raw)


"Adam Beneschan" <adam@irvine.com> wrote in message 
news:8bb3a7c0-c473-4c35-bc6e-3920ce80e6a8@q36g2000vbi.googlegroups.com...
...
>I did notice, though, that Peter was talking about how data was stored
>in a *file*.  In the Unisys U2200 system, how were files kept,
>conceptually?  Were they streams of 8-bit bytes, or were they stored
>as 36-bit words or 9-bit bytes or what?  I'd assume that for a text
>file, when you read it in, four bytes would be stored in each word and
>the high bit of each 9-bit byte would be zeroed---whether that zero
>bit was actually stored on disk or not should be the OS's concern, not
>the program's, and I'd presume that the OS would also have to handle
>things correctly when text files are transferred from a different
>machine.

It stored text as streams of 9-bit characters. We were running on a Unix 
emulator, and there were a bunch of conversions when you communicated to the 
"normal" 8-bit world. When you think of it, it is amazing that they could 
get Unix code to run in such an environment. Just imagine all of the things 
C could do that would assume 8-bits. Us Ada people had it easy: the language 
was designed to allow use on such machines (although I think there were only 
a few such compilers built). We only ran into one language bug (for modular 
types) having to do with 1's complement math.

Perhaps this is all OBE (overtaken by events) these days, and Ada could be
simplified to assume 8-bit bytes only. But I doubt it would make that much
difference, so it probably wouldn't be worth the effort.

                               Randy.








* Re: Encapsulating Ada.Direct_IO
  2010-11-18 18:21         ` Peter C. Chapin
  2010-11-18 18:36           ` Randy Brukardt
@ 2010-11-18 19:48           ` Adam Beneschan
  2010-11-18 20:15             ` Dmitry A. Kazakov
  1 sibling, 1 reply; 23+ messages in thread
From: Adam Beneschan @ 2010-11-18 19:48 UTC (permalink / raw)


On Nov 18, 10:21 am, "Peter C. Chapin" <pcc482...@gmail.com> wrote:

> I'm thinking about the very common case when one is trying to read a
> file that has a format defined by some third party. For example the
> specification of the format might say, "The first octet of the header
> defines the message type and can be one of the following values... The
> type field is followed by a 24 bit length field in big endian form. The
> body of the message follows the length field, and finally a 32 bit CRC
> follows the message body."

The problem is that this *definition* is not sufficient to tell you
what an OS will stick in your memory buffer if you ask to read from
such a file.  You need additional OS-dependent information.

                               -- Adam




* Re: Encapsulating Ada.Direct_IO
  2010-11-18 19:48           ` Adam Beneschan
@ 2010-11-18 20:15             ` Dmitry A. Kazakov
  0 siblings, 0 replies; 23+ messages in thread
From: Dmitry A. Kazakov @ 2010-11-18 20:15 UTC (permalink / raw)


On Thu, 18 Nov 2010 11:48:49 -0800 (PST), Adam Beneschan wrote:

> On Nov 18, 10:21 am, "Peter C. Chapin" <pcc482...@gmail.com> wrote:
> 
>> I'm thinking about the very common case when one is trying to read a
>> file that has a format defined by some third party. For example the
>> specification of the format might say, "The first octet of the header
>> defines the message type and can be one of the following values... The
>> type field is followed by a 24 bit length field in big endian form. The
>> body of the message follows the length field, and finally a 32 bit CRC
>> follows the message body."
> 
> The problem is that this *definition* is not sufficient to tell you
> what an OS will stick in your memory buffer if you ask to read from
> such a file.

I don't think so. I presume that the context of the definition above is the
OS. It is a reasonable presumption because files from another OS (file
system) cannot be read at all unless they are converted into the format
supported by the OS's file system. Another presumption is that the
definition describes memory layout, rather than the media layout. The
latter is inaccessible anyway. So a stream of octets is what a file-reading
OS service delivers when Ada calls it. Of course, there is no guarantee
that Direct_IO would use this service and not some other service.
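
To make that presumption concrete, here is a minimal sketch, assuming the
file-reading service really does hand the header over as a sequence of
octets. The names Octet, Octet_Array, Length_24 and Length_Of are
hypothetical, and the sample bytes are made up for illustration; nothing
here says how Direct_IO itself would fill the buffer.

```ada
with Ada.Text_IO;

procedure Header_Demo is
   --  Hypothetical types: one octet, and a buffer of octets as delivered
   --  by whatever file-reading service the implementation uses.
   type Octet is mod 2 ** 8;
   type Octet_Array is array (Positive range <>) of Octet;
   type Length_24 is range 0 .. 2 ** 24 - 1;

   --  Assemble the 24-bit big-endian length that follows the type octet.
   function Length_Of (Header : Octet_Array) return Length_24 is
      F : constant Positive := Header'First;
   begin
      return Length_24 (Header (F + 1)) * 65_536
        + Length_24 (Header (F + 2)) * 256
        + Length_24 (Header (F + 3));
   end Length_Of;

   --  Made-up sample: message type 2, length 16#00012C# = 300.
   Sample : constant Octet_Array := (16#02#, 16#00#, 16#01#, 16#2C#);
begin
   Ada.Text_IO.Put_Line ("Type   =" & Octet'Image (Sample (Sample'First)));
   Ada.Text_IO.Put_Line ("Length =" & Length_24'Image (Length_Of (Sample)));
end Header_Demo;
```

The decoding is done arithmetically on individual octets, so it does not
depend on the machine's byte order or word size, only on the octets
arriving in file order.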

> You need additional OS-dependent information.

The Ada compiler vendor will likely document the services used for
Direct_IO, but there is no way to verify that using the representation
clauses and/or assertions.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de




* Re: Encapsulating Ada.Direct_IO
       [not found]         ` <ENidndoH8qoqjHvRnZ2dnUVZ_j-dnZ2d@earthlink.com>
@ 2010-11-19  8:24           ` Dmitry A. Kazakov
  2010-11-19 16:19             ` Adam Beneschan
  0 siblings, 1 reply; 23+ messages in thread
From: Dmitry A. Kazakov @ 2010-11-19  8:24 UTC (permalink / raw)


On Thu, 18 Nov 2010 21:57:10 -0800, Dennis Lee Bieber wrote:

> On Thu, 18 Nov 2010 18:05:49 +0100, "Dmitry A. Kazakov"
> <mailbox@dmitry-kazakov.de> declaimed the following in comp.lang.ada:
> 
>> On Thu, 18 Nov 2010 08:31:20 -0800 (PST), Adam Beneschan wrote:
>> 
>>> Then there was a
>>> DEC machine that used 36-bit words and represented strings by sticking
>>> five 7-bit ASCII characters in each word, but I don't remember much
>>> else.
>> 
>> RADIX-50?
>> 
>> http://en.wikipedia.org/wiki/DEC_Radix-50
>> 
> 	Why bother... 5 * 7bit => 35bits, easily fits into a 36bit word with
> one left over!

Because 3*7 = 21 >> 16. Under RSX-11, RT-11 they packed 3 file name
characters into one 16-bit word. With 64K address space that was important.
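
The arithmetic can be sketched as follows. This is only an illustration:
the 40-character ordering in Charset is the PDP-11 one as I recall it and
should be treated as an assumption, and the names Triple, Code_Of and Pack
are hypothetical. The point is that 40**3 = 64_000 fits below 2**16 =
65_536, so three characters fit in one 16-bit word.

```ada
with Ada.Text_IO;
with Interfaces;

procedure Radix50_Demo is
   use type Interfaces.Unsigned_16;

   subtype Triple is String (1 .. 3);

   --  Assumed character ordering; the code of a character is its
   --  position in this string minus one (space = 0, 'A' = 1, ...).
   Charset : constant String :=
     " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789";

   function Code_Of (C : Character) return Interfaces.Unsigned_16 is
   begin
      for I in Charset'Range loop
         if Charset (I) = C then
            return Interfaces.Unsigned_16 (I - 1);
         end if;
      end loop;
      raise Constraint_Error with "character not in the assumed set";
   end Code_Of;

   --  Treat the three characters as the digits of a base-40 number.
   function Pack (S : Triple) return Interfaces.Unsigned_16 is
   begin
      return (Code_Of (S (1)) * 40 + Code_Of (S (2))) * 40
        + Code_Of (S (3));
   end Pack;
begin
   --  "ABC" has codes 1, 2, 3: (1 * 40 + 2) * 40 + 3 = 1683.
   Ada.Text_IO.Put_Line (Interfaces.Unsigned_16'Image (Pack ("ABC")));
   --  "999" is the largest value: 39 * 1600 + 39 * 40 + 39 = 63999.
   Ada.Text_IO.Put_Line (Interfaces.Unsigned_16'Image (Pack ("999")));
end Radix50_Demo;
```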

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de




* Re: Encapsulating Ada.Direct_IO
  2010-11-19  8:24           ` Dmitry A. Kazakov
@ 2010-11-19 16:19             ` Adam Beneschan
  0 siblings, 0 replies; 23+ messages in thread
From: Adam Beneschan @ 2010-11-19 16:19 UTC (permalink / raw)


On Nov 19, 12:24 am, "Dmitry A. Kazakov" <mail...@dmitry-kazakov.de>
wrote:
> On Thu, 18 Nov 2010 21:57:10 -0800, Dennis Lee Bieber wrote:
> > On Thu, 18 Nov 2010 18:05:49 +0100, "Dmitry A. Kazakov"
> > <mail...@dmitry-kazakov.de> declaimed the following in comp.lang.ada:
>
> >> On Thu, 18 Nov 2010 08:31:20 -0800 (PST), Adam Beneschan wrote:
>
> >>> Then there was a
> >>> DEC machine that used 36-bit words and represented strings by sticking
> >>> five 7-bit ASCII characters in each word, but I don't remember much
> >>> else.
>
> >> RADIX-50?
>
> >>http://en.wikipedia.org/wiki/DEC_Radix-50
>
> >    Why bother... 5 * 7bit => 35bits, easily fits into a 36bit word with
> > one left over!
>
> Because 3*7 = 21 >> 16. Under RSX-11, RT-11 they packed 3 file name
> characters into one 16-bit word. With 64K address space that was important.

Yeah, I remember the -11.  Cute little machines.  The 36-bit one I was
referring to was, I think, DEC-10; my recollection is that the
instruction set contained instructions that would let you stuff
characters of any size (including 7 bits) into words until it couldn't
fit any more and then go to the next word.
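
For the 7-bit case, the effect can be sketched in Ada like this. It is a
rough model, not the actual instruction set: it assumes the five
characters are left-justified in the word with the spare bit at the
low-order end, and it assumes a compiler that accepts a modular type of
2**36. The names Word_36, Five_Chars, Pack and Unpack are hypothetical.

```ada
with Ada.Text_IO;

procedure Pack36_Demo is
   type Word_36 is mod 2 ** 36;
   subtype Five_Chars is String (1 .. 5);

   --  Pack five 7-bit codes into the top 35 bits; the low bit stays 0.
   function Pack (S : Five_Chars) return Word_36 is
      W : Word_36 := 0;
   begin
      for I in S'Range loop
         W := W * 128 + Word_36 (Character'Pos (S (I)) mod 128);
      end loop;
      return W * 2;
   end Pack;

   --  Recover character I by shifting right 36 - 7 * I bits
   --  (29 bits for the first character, 1 bit for the fifth).
   function Unpack (W : Word_36) return Five_Chars is
      S : Five_Chars;
   begin
      for I in S'Range loop
         S (I) := Character'Val (Natural ((W / 2 ** (36 - 7 * I)) mod 128));
      end loop;
      return S;
   end Unpack;
begin
   Ada.Text_IO.Put_Line (Unpack (Pack ("HELLO")));
end Pack36_Demo;
```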

                                   -- Adam




* Re: Encapsulating Ada.Direct_IO
  2010-11-18  1:16   ` Randy Brukardt
                       ` (3 preceding siblings ...)
  2010-11-18 16:31     ` Adam Beneschan
@ 2010-11-24 21:31     ` Warren
  4 siblings, 0 replies; 23+ messages in thread
From: Warren @ 2010-11-24 21:31 UTC (permalink / raw)


Randy Brukardt expounded in news:ic1uop$nhs$1@munin.nbi.dk:

> "Peter C. Chapin" <pcc482719@gmail.com> wrote in message 
> news:tM6dnaTzn6shVH7RRVn_vwA@giganews.com...
> ...
>> Something I wonder about (I don't have the answer) is if
>> it necessary to use a representation clause to force the
>> size of the objects being read to be 8 bits. I'm a little
>> unclear if the standard requires Character to be stored in
>> a file in 8 bit units. That is, the language might treat 
>> the type Character rather more abstractly than you want.
>> Again I'm not sure of this and I'd love to get some
>> insights from others myself. 
> 
> Surely not. Not all machines have 8-bits as any sort of
> native type. For instance, the Unisys U2200 (a 36-bit
> machine, with 9-bit bytes) used Character'Size = 9. (It was
> great fun for the cross-compiler.) 
> 
>                                       Randy.

Heh heh, the Honeywell Level 66 and DPS-8 machines were like 
that too. That extra bit in a byte sometimes came in handy. 
But that made porting to "normal platforms" tricky.

Warren




* Re: Encapsulating Ada.Direct_IO
  2010-11-17  5:20 ` Adam Beneschan
@ 2010-11-26 15:31   ` Bryan
  0 siblings, 0 replies; 23+ messages in thread
From: Bryan @ 2010-11-26 15:31 UTC (permalink / raw)


Wow, this topic sparked a lot of interesting conversation!  I thought
I would report that I do have the code working now thanks to the
advice in this thread.

Adam,

Thank you for catching my mistake; the casting did the trick.
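
For reference, a minimal sketch of the fix, assuming "the casting" means
an explicit type conversion of the derived File_Type back to
Byte_IO.File_Type before calling the parent's Close. The wrapper
procedure Big5_Demo is hypothetical and exists only so that the spec,
body and test driver from the original post fit in one compilable unit.

```ada
with Ada.Direct_IO;
with Ada.Text_IO;

procedure Big5_Demo is

   package Big5_Text_IO is
      type File_Type is limited private;
      procedure Close (File : in out File_Type);
   private
      package Byte_IO is new Ada.Direct_IO (Character);
      type File_Type is new Byte_IO.File_Type;
   end Big5_Text_IO;

   package body Big5_Text_IO is
      procedure Close (File : in out File_Type) is
      begin
         --  Big5_Text_IO.File_Type and Byte_IO.File_Type are distinct
         --  types, so convert to the parent type explicitly before
         --  calling the parent's Close.
         Byte_IO.Close (Byte_IO.File_Type (File));
      end Close;
   end Big5_Text_IO;

   Input_File : Big5_Text_IO.File_Type;
begin
   --  As in the original test driver, the file is only declared here.
   Ada.Text_IO.Put_Line ("OK?");
end Big5_Demo;
```

Without the conversion, the call passes a Big5_Text_IO.File_Type where
Byte_IO.Close expects its own File_Type, which is the mismatch GNAT
reported.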


Peter,

Thanks for the tips.  I will look into Sequential_IO; I somehow missed
it and thought that was what Direct_IO was for.  I need to go back and
review the different Ada file processing packages.
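
As a possible starting point for that review, here is a minimal sketch of
reading a file one element at a time with Ada.Sequential_IO instantiated
on Character. The scratch file name and the three sample byte values are
made up, and, as discussed elsewhere in this thread, whether one
Character corresponds to one 8-bit byte in the file is up to the
implementation.

```ada
with Ada.Sequential_IO;
with Ada.Text_IO;

procedure Seq_Read_Demo is
   package Byte_IO is new Ada.Sequential_IO (Character);

   Name  : constant String := "seq_read_demo.tmp";  --  hypothetical name
   File  : Byte_IO.File_Type;
   C     : Character;
   Count : Natural := 0;
begin
   --  Write a few arbitrary sample values so the example is self-contained.
   Byte_IO.Create (File, Byte_IO.Out_File, Name);
   Byte_IO.Write (File, Character'Val (16#A4#));
   Byte_IO.Write (File, Character'Val (16#40#));
   Byte_IO.Write (File, 'A');
   Byte_IO.Close (File);

   --  Read the file back sequentially, one Character per Read call.
   Byte_IO.Open (File, Byte_IO.In_File, Name);
   while not Byte_IO.End_Of_File (File) loop
      Byte_IO.Read (File, C);
      Count := Count + 1;
      Ada.Text_IO.Put_Line (Integer'Image (Character'Pos (C)));
   end loop;

   --  Delete closes the open file and removes the scratch file.
   Byte_IO.Delete (File);
   Ada.Text_IO.Put_Line ("Elements read:" & Natural'Image (Count));
end Seq_Read_Demo;
```

Sequential_IO only moves forward through the file, which matches parsing
a text file from start to end; Direct_IO adds positioning by index, which
that kind of parsing does not need.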




end of thread, other threads:[~2010-11-26 15:31 UTC | newest]

Thread overview: 23+ messages
2010-11-17  4:44 Encapsulating Ada.Direct_IO Bryan
2010-11-17  5:20 ` Adam Beneschan
2010-11-26 15:31   ` Bryan
2010-11-17 12:25 ` Peter C. Chapin
2010-11-18  1:16   ` Randy Brukardt
2010-11-18  2:21     ` Peter C. Chapin
2010-11-18 16:36       ` Adam Beneschan
2010-11-18 18:21         ` Peter C. Chapin
2010-11-18 18:36           ` Randy Brukardt
2010-11-18 19:48           ` Adam Beneschan
2010-11-18 20:15             ` Dmitry A. Kazakov
2010-11-18  7:39     ` AdaMagica
2010-11-18 18:38       ` Randy Brukardt
2010-11-18  9:46     ` Maciej Sobczak
2010-11-18 16:31     ` Adam Beneschan
2010-11-18 17:05       ` Dmitry A. Kazakov
     [not found]         ` <ENidndoH8qoqjHvRnZ2dnUVZ_j-dnZ2d@earthlink.com>
2010-11-19  8:24           ` Dmitry A. Kazakov
2010-11-19 16:19             ` Adam Beneschan
2010-11-18 18:45       ` Randy Brukardt
2010-11-24 21:31     ` Warren
2010-11-17 22:32 ` Yannick Duchêne (Hibou57)
2010-11-17 23:03   ` Adam Beneschan
2010-11-17 23:11     ` Yannick Duchêne (Hibou57)
