comp.lang.ada
From: Georg Bauhaus <see.reply.to@maps.futureapps.de>
Subject: Re: Q: Line_IO
Date: Tue, 01 Sep 2009 01:56:16 +0200
Message-ID: <4a9c6320$0$31347$9b4e6d93@newsspool4.arcor-online.net>
In-Reply-To: <1a4usf20z4mxa.1vct95fmrcs6h.dlg@40tude.net>

Dmitry A. Kazakov wrote:

> You could try not to concatenate:
> 
>    Stream_IO.Write (Stdout, To_Bytes (Item));
>    Stream_IO.Write (Stdout, To_Bytes (Separator_Sequence));
> 
> , which should be faster when Item is large.

Yes, though according to some measurements that have been made
in the recent past, "&" is faster for "normal" sized lines.
Other sizes did not produce stable results (on my machine at least).
A test case is in
<4a7bebaa$0$30224$9b4e6d93@newsspool1.arcor-online.net>
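For readers without access to that message, here is a rough sketch of
how such a comparison could be set up. It is not the referenced test
case. The names Concat_Vs_Two_Writes, Runs, Line, Sink and Report are
made up for illustration, the 80-character line and the repetition
count are assumptions, and To_Bytes here is a plain copying loop
rather than an unchecked conversion, so absolute numbers will differ
from those of Line_IO.

```ada
with Ada.Real_Time;          use Ada.Real_Time;
with Ada.Streams.Stream_IO;
with Ada.Text_IO;

procedure Concat_Vs_Two_Writes is

   use Ada.Streams;

   Runs      : constant := 100_000;                  --  assumed count
   Line      : constant String (1 .. 80) := (others => 'x');  --  assumed size
   Separator : constant String := (1 => ASCII.LF);

   Sink  : Stream_IO.File_Type;
   Start : Time;

   --  Copies characters into stream elements one by one
   --  (illustration only; no overlay, no Unchecked_Conversion).
   function To_Bytes (S : String) return Stream_Element_Array is
      Result : Stream_Element_Array (1 .. S'Length);
   begin
      for K in S'Range loop
         Result (Stream_Element_Offset (K - S'First + 1)) :=
           Stream_Element (Character'Pos (S (K)));
      end loop;
      return Result;
   end To_Bytes;

   --  Prints the time elapsed since Start was last set.
   procedure Report (Label : String) is
   begin
      Ada.Text_IO.Put_Line
        (Label & Duration'Image (To_Duration (Clock - Start)));
   end Report;

begin
   --  An empty Name asks for a temporary file.
   Stream_IO.Create (Sink, Stream_IO.Out_File);

   Start := Clock;
   for R in 1 .. Runs loop
      Stream_IO.Write (Sink, To_Bytes (Line & Separator));
   end loop;
   Report ("one write, ""&"":");

   Start := Clock;
   for R in 1 .. Runs loop
      Stream_IO.Write (Sink, To_Bytes (Line));
      Stream_IO.Write (Sink, To_Bytes (Separator));
   end loop;
   Report ("two writes:    ");

   Stream_IO.Close (Sink);
end Concat_Vs_Two_Writes;
```

Changing the length of Line is the obvious experiment. As noted
above, results for sizes other than "normal" lines were not stable
on my machine.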

However, the two calls are more general, so perhaps they
should replace the concatenation.  And they seem to make
using 'Address simpler, see below.

> Then there is a crazy way to convert congruent types without
> Unchecked_Conversion. I cannot tell whether it is actually faster:

Since 'Address is used for reading anyway, and since, yes,
it is faster, it could replace the unchecked conversion.
Is there a risk when the overlaid object is a subprogram
parameter rather than an object in a "better known" storage place?

New version below. If you want to see the difference between
Unchecked_Conversion and 'Address, rename either Print_1 (old)
to Put_Line or Print_2 (new, 'Address) to the same.

> P.S. The superimposed object shall not have initializers.

Does this apply to String parameters?
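
To make the question concrete, here is a minimal, self-contained
sketch of an overlay on a String parameter. Overlay_Demo and
First_Byte are hypothetical names and not part of Line_IO. The sketch
uses pragma Import (Ada, ...), which as far as I know is the usual
way to tell the compiler not to initialize the overlaid object. A
Stream_Element_Array normally has no default initialization, so the
pragma is a precaution here. Note the assumption that Item is passed
by reference: the language leaves it unspecified whether a String
parameter is passed by copy or by reference, so Item'Address need not
designate the caller's storage on every compiler.

```ada
with Ada.Streams; use Ada.Streams;
with Ada.Text_IO;

procedure Overlay_Demo is

   pragma Assert (Character'Size = Stream_Element'Size);

   --  Hypothetical helper: views the storage of a non-empty String
   --  as stream elements and returns the first one.
   function First_Byte (Item : String) return Stream_Element is
      subtype Index is Stream_Element_Offset range
        Stream_Element_Offset (Item'First)
        .. Stream_Element_Offset (Item'Last);
      Item_Bytes : Stream_Element_Array (Index);
      for Item_Bytes'Address use Item'Address;
      pragma Import (Ada, Item_Bytes);  --  no initialization of the overlay
   begin
      return Item_Bytes (Item_Bytes'First);
   end First_Byte;

begin
   --  Prints the stream element that overlays the character 'A'.
   Ada.Text_IO.Put_Line (Stream_Element'Image (First_Byte ("A")));
end Overlay_Demo;
```

Print_2 below does the same thing, only without the pragma.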

generic
   Separator_Sequence : in String;  --  ends a line
package Line_IO is

   pragma Elaborate_Body;

   --
   --  High(er) speed reading and writing of lines via Stream I/O.
   --  Made with Unix pipes in mind.
   --
   --  Assumptions:
   --  - Lines are separated by a sequence of characters.
   --  - Characters and stream elements can be used interchangeably.
   --  - Lines are not longer than internal buffer size.
   --
   --  I/O exceptions are propagated

   procedure Put_Line(Item : String);

   function Get_Line return String;

end Line_IO;


with Ada.Streams.Stream_IO;
with Ada.Unchecked_Conversion;

package body Line_IO is

   use Ada.Streams;

   Stdout : Stream_IO.File_Type;
   Stdin : Stream_IO.File_Type;

   -- writing

   procedure Print_1 (Item : String) is

      subtype Index is Stream_Element_Offset range
        Stream_Element_Offset(Item'First)
        .. Stream_Element_Offset(Item'Last + Separator_Sequence'Length);
      subtype XString is String (Item'First
        .. Item'Last + Separator_Sequence'Length);
      subtype XBytes is Stream_Element_Array (Index);
      function To_Bytes is new Ada.Unchecked_Conversion
        (Source => XString,
         Target => XBytes);
   begin
      Stream_IO.Write (Stdout, To_Bytes (Item & Separator_Sequence));
   end Print_1;

   -- Alternative:
   -- - call Stream_IO.Write twice, once for the string, then for the
   --   line separator (terminator)
   -- - specify 'Address; no Unchecked_Conversion is needed then

   -- We need the separator as a Stream_Element_Array. (Can we
   -- use 'Address on a generic formal object?  If so, then
   -- again, no Unchecked_Conversion is needed (advantage?))

   subtype Sep_String is String(Separator_Sequence'Range);
   subtype Sep_Bytes is Stream_Element_Array
     (Stream_Element_Offset(Separator_Sequence'First)
     .. Stream_Element_Offset(Separator_Sequence'Last));

   function To_Bytes is new Ada.Unchecked_Conversion
     (Source => Sep_String,
      Target => Sep_Bytes);

   Separator_Bytes : constant Stream_Element_Array :=
     To_Bytes(Separator_Sequence);

   procedure Print_2 (Item : String) is
      subtype Index is Stream_Element_Offset range
        Stream_Element_Offset(Item'First)
        .. Stream_Element_Offset(Item'Last);
      subtype XBytes is Stream_Element_Array (Index);
      Item_Bytes: XBytes;
      for Item_Bytes'Address use Item'Address;
   begin
      Stream_IO.Write (Stdout, Item_Bytes);
      Stream_IO.Write (Stdout, Separator_Bytes);
   end Print_2;

   procedure Put_Line (Item : String) renames Print_2;

   -- ----------------
   -- reading
   -- ----------------
   -- Types etc., status variables, and the buffer.  `Buffer` is at the
   -- same time an array of Character and an array of Stream_Element
   -- called `Bytes`.  They share the same address.  This setup makes the
   -- storage at the address either a String (when selecting result
   -- characters) or a Stream_Element_Array (when reading input bytes).

   BUFSIZ: constant := 8_192;
   pragma Assert(Character'Size = Stream_Element'Size);

   SL : constant Natural := Separator_Sequence'Length;

   subtype Extended_Buffer_Index is Positive range 1 .. BUFSIZ + SL;
   subtype Buffer_Index is Extended_Buffer_Index
     range Extended_Buffer_Index'First .. Extended_Buffer_Index'Last - SL;
   subtype Extended_Bytes_Index is Stream_Element_Offset
     range 1 .. Stream_Element_Offset(Extended_Buffer_Index'Last);
   subtype Bytes_Index is Extended_Bytes_Index
     range Extended_Bytes_Index'First
     .. (Extended_Bytes_Index'Last - Stream_Element_Offset(SL));

   subtype Buffer_Data is String(Extended_Buffer_Index);
   subtype Buffer_Bytes is Stream_Element_Array(Extended_Bytes_Index);

   Buffer : Buffer_Data;
   Bytes  : Buffer_Bytes;
   for Bytes'Address use Buffer'Address;

   Position : Natural; -- start of next substring
   Last     : Natural; -- last valid character in buffer


   function Get_Line return String is

      procedure Reload;
      --  move remaining characters to the start of `Buffer` and
      --  fill the following bytes if possible
      --  post: Position in 0 .. 1, and 0 should mean end of file
      --        Last is 0 or else the index of the last valid element
      --        in Buffer

      procedure Reload is
         Remaining : constant Natural := Buffer_Index'Last - Position + 1;
         Last_Index : Stream_Element_Offset;
      begin
         Buffer(1 .. Remaining) := Buffer(Position .. Buffer_Index'Last);

         Stream_IO.Read(Stdin,
           Item => Bytes(Stream_Element_Offset(Remaining) + 1
                         .. Bytes_Index'Last),
           Last => Last_Index);
         Last := Natural(Last_Index);
         Buffer(Last + 1 .. Last + SL) := Separator_Sequence;

         Position := Boolean'Pos(Last_Index > 0
           and then Buffer(1) /= ASCII.EOT   -- ^D
           and then Buffer(1) /= ASCII.SUB); -- ^Z

      end Reload;

      function Sep_Index return Natural;
      --  position of next Separator_Sequence
      pragma Inline(Sep_Index);

      function Sep_Index return Natural is
         K : Natural := Position;
      begin
         pragma Assert(K >= Buffer'First);
         pragma Assert(Buffer(Buffer_Index'Last + 1 .. Buffer'Last)
           = Separator_Sequence);

         while Buffer(K) /= Separator_Sequence(Separator_Sequence'First) loop
            K := K + 1;
         end loop;

         return K;
      end Sep_Index;

      Next_Separator : Natural;
   begin  -- Get_Line
      pragma Assert(Position = 0 or else Position in Extended_Buffer_Index);
      pragma Assert(Last = 0 or else Last in Buffer_Index);

      if Position = 0 then
         raise Stream_IO.End_Error;
      end if;

      Next_Separator := Sep_Index;

      if Next_Separator > Buffer_Index'Last then
         -- must be sentinel
         Reload;
         return Get_Line;
      end if;

      if Next_Separator <= Last then
         declare
            Limit : constant Natural := Natural'Max(0, Next_Separator - SL);
            -- there was trouble (Print) when Integer Limit could be
            -- negative (for 2-char SL and Next_Separator = 1)
            Result : constant String := Buffer(Position .. Limit);
         begin
            Position := Limit + SL + 1;
            return Result;
         end;
      else
         -- the separator is among the characters beyond `Last`
         declare
            Limit : constant Positive := Last;
            Result : constant String := Buffer(Position .. Limit);
         begin
            --  -- makes the spurious line go away
            --  -- But make sure that it isn't caused by Put_Line!
            if Position > Last then
               raise Stream_IO.End_Error;
            end if;
            Position := 0;  -- next call will raise End_Error
            return Result;
         end;
      end if;

      raise Program_Error;
   end Get_Line;


begin
   -- (see <ILmdnWHx29q5VMrZnZ2dnUVZ_sednZ2d@megapath.net> for names
   -- of standard I/O streams when using Janus Ada on Windows.)

   Stream_IO.Open (Stdout,
     Mode => Stream_IO.Out_File,
     Name => "/dev/stdout");
   Stream_IO.Open (Stdin,
     Mode => Stream_IO.In_File,
     Name => "/dev/stdin");

   -- make sure there is no line separator in `Buffer` other than the
   -- sentinel
   Buffer := Buffer_Data'(others => ASCII.NUL);
   Buffer(Buffer_Index'Last + 1 .. Buffer'Last) := Separator_Sequence;
   Position := Buffer_Index'Last + 1;
   -- See also `Get_Line.Reload.Remaining`
   Last := 0;
end Line_IO;


