comp.lang.ada
* Re: Q: Line_IO
       [not found] <4a9b045a$0$31875$9b4e6d93@newsspool3.arcor-online.net>
@ 2009-08-31  8:28 ` Martin
  2009-08-31 10:05   ` Georg Bauhaus
  2009-08-31 18:39 ` Dmitry A. Kazakov
  1 sibling, 1 reply; 15+ messages in thread
From: Martin @ 2009-08-31  8:28 UTC (permalink / raw)


On Aug 30, 11:59 pm, Georg Bauhaus <see.reply...@maps.futureapps.de>
wrote:
> Text_IO seems fairly slow when just reading lines of text.
> Here are two alternative I/O subprograms for Line I/O, in plain Ada,
> based on Stream_IO.   They seem to run significantly faster.
>
> However, there is one glitch and I can't find the cause:
> output always has one more line at the end, an empty one.
> Why?  If you have got a minute to look at this, you will
> also help us with getting faster programs at the Shootout.
> These read lines by the megabyte.
>
> generic
>    Separator_Sequence : in String;  --  ends a line
> package Line_IO is
>
>    pragma Elaborate_Body;
>
>    --
>    --  High(er) speed reading and writing of lines via Stream I/O.
>    --  Made with Unix pipes in mind.
>    --
>    --  Assumptions:
>    --  - Lines are separated by a sequence of characters.
>    --  - Characters and stream elements can be used interchangeably.
>    --  - Lines are not longer than internal buffer size.
>    --
>    --  I/O exceptions are propagated
>
>    procedure Print(Item : String);
>
>    function Getline return String;
>
> end Line_IO;
>
> with Ada.Streams.Stream_IO;
> with Ada.Unchecked_Conversion;
>
> package body Line_IO is
>
>    use Ada.Streams;
>
>    Stdout : Stream_IO.File_Type;
>    Stdin : Stream_IO.File_Type;
>
>    -- writing
>
>    procedure Print (Item : String) is
>
>       subtype Index is Stream_Element_Offset range
>         Stream_Element_Offset(Item'First)
>         .. Stream_Element_Offset(Item'Last + Separator_Sequence'Length);
>       subtype XString is String (Item'First
>         .. Item'Last + Separator_Sequence'Length);
>       subtype XBytes is Stream_Element_Array (Index);
>       function To_Bytes is new Ada.Unchecked_Conversion
>         (Source => XString,
>          Target => XBytes);
>    begin
>       Stream_IO.Write (Stdout, To_Bytes (Item & Separator_Sequence));
>    end Print;
>
>    -- ----------------
>    -- reading
>    -- ----------------
>    -- Types etc., status variables, and the buffer.  `Buffer` is at the
>    -- same time an array of Character and an array of Stream_Element
>    -- called `Bytes`.  They share the same address.  This setup makes the
>    -- storage at the address either a String (when selecting result
>    -- characters) or a Stream_Element_Array (when reading input bytes).
>
>    BUFSIZ: constant := 8_192;
>    pragma Assert(Character'Size = Stream_Element'Size);
>
>    SL : constant Natural := Separator_Sequence'Length;
>
>    subtype Extended_Buffer_Index is Positive range 1 .. BUFSIZ + SL;
>    subtype Buffer_Index is Extended_Buffer_Index
>      range Extended_Buffer_Index'First .. Extended_Buffer_Index'Last - SL;
>    subtype Extended_Bytes_Index is Stream_Element_Offset
>      range 1 .. Stream_Element_Offset(Extended_Buffer_Index'Last);
>    subtype Bytes_Index is Extended_Bytes_Index
>      range Extended_Bytes_Index'First
>      .. (Extended_Bytes_Index'Last - Stream_Element_Offset(SL));
>
>    subtype Buffer_Data is String(Extended_Buffer_Index);
>    subtype Buffer_Bytes is Stream_Element_Array(Extended_Bytes_Index);
>
>    Buffer : Buffer_Data;
>    Bytes  : Buffer_Bytes;
>    for Bytes'Address use Buffer'Address;
>
>    Position : Natural; -- start of next substring
>    Last     : Natural; -- last valid character in buffer
>
>    function Getline return String is
>
>       procedure Reload;
>       --  move remaining characters to the start of `Buffer` and
>       --  fill the following bytes if possible
>       --  post: Position in 0 .. 1, and 0 should mean end of file
>       --        Last is 0 or else the index of the last valid element in
>       --        Buffer
>
>       procedure Reload is
>          Remaining : constant Natural := Buffer_Index'Last - Position + 1;
>          Last_Index : Stream_Element_Offset;
>       begin
>          Buffer(1 .. Remaining) := Buffer(Position .. Buffer_Index'Last);
>
>          Stream_IO.Read(Stdin,
>            Item => Bytes(Stream_Element_Offset(Remaining) + 1 ..
>                          Bytes_Index'Last),
>            Last => Last_Index);
>          Last := Natural(Last_Index);
>          Buffer(Last + 1 .. Last + SL) := Separator_Sequence;
>
>          Position := Boolean'Pos(Last_Index > 0
>            and then Buffer(1) /= ASCII.EOT   -- ^D
>            and then Buffer(1) /= ASCII.SUB); -- ^Z
>
>       end Reload;
>
>       function Sep_Index return Natural;
>       --  position of next Separator_Sequence
>       pragma Inline(Sep_Index);
>
>       function Sep_Index return Natural is
>          K : Natural := Position;
>       begin
>          pragma Assert(K >= Buffer'First);
>          pragma Assert(Buffer(Buffer_Index'Last + 1 .. Buffer'Last)
>            = Separator_Sequence);
>
>          while Buffer(K) /= Separator_Sequence(1) loop
>             K := K + 1;
>          end loop;
>
>          return K;
>       end Sep_Index;
>
>       Next_Separator : Natural;
>    begin  -- Getline
>       pragma Assert(Position = 0 or else Position in Extended_Buffer_Index);
>       pragma Assert(Last = 0 or else Last in Buffer_Index);
>
>       if Position = 0 then
>          raise Stream_IO.End_Error;
>       end if;
>
>       Next_Separator := Sep_Index;
>
>       if Next_Separator > Buffer_Index'Last then
>          -- must be sentinel
>          Reload;
>          return Getline;
>       end if;
>
>       if Next_Separator <= Last then
>          declare
>             Limit : constant Natural := Natural'Max(0, Next_Separator - SL);
>             -- there was trouble (Print) when Integer Limit could be
>             -- negative (for 2-char SL and Next_Separator = 1)
>             Result : constant String := Buffer(Position .. Limit);
>          begin
>             Position := Limit + SL + 1;
>             return Result;
>          end;
>       else
>          -- the separator is among the characters beyond `Last`
>          declare
>             Limit : constant Positive := Last;
>             Result : constant String := Buffer(Position .. Limit);
>          begin
>             Position := 0;  -- next call will raise End_Error
>             return Result;
>          end;
>       end if;
>
>       raise Program_Error;
>    end Getline;
>
> begin
>    -- (see <ILmdnWHx29q5VMrZnZ2dnUVZ_sedn...@megapath.net> for names
>    -- of standard I/O streams when using Janus Ada on Windows.)
>
>    Stream_IO.Open (Stdout,
>      Mode => Stream_IO.Out_File,
>      Name => "/dev/stdout");
>    Stream_IO.Open (Stdin,
>      Mode => Stream_IO.In_File,
>      Name => "/dev/stdin");
>
>    -- make sure there is no line separator in `Buffer` other than the
>    -- sentinel
>    Buffer := Buffer_Data'(others => ASCII.NUL);
>    Buffer(Buffer_Index'Last + 1 .. Buffer'Last) := Separator_Sequence;
>    Position := Buffer_Index'Last + 1;
>    -- See also `Getline.Reload.Remaining`
>    Last := 0;
> end Line_IO;
>
> --
> -- A small test program.
> --
> with Line_IO;
> with Ada.Text_IO;
>
> procedure Test_Line_IO is
>    Want_Text_IO : constant Boolean := False;
>
>    -- pick the correct one for your input files
>    UnixLF  : constant String := String'(1 => ASCII.LF);
>    MacCR   : constant String := String'(1 => ASCII.CR);
>    OS2CRLF : constant String := String'(1 => ASCII.CR, 2 => ASCII.LF);
>
>    package LIO is new Line_IO(Separator_Sequence => UnixLF);
>
> begin
>    if Want_Text_IO then
>       loop
>          declare
>             A_Line : constant String := Ada.Text_IO.Get_Line;
>          begin
>             LIO.Print(A_Line);
>             null;
>             pragma Inspection_Point(A_Line);
>          end;
>       end loop;
>    else
>       loop
>          declare
>             A_Line : constant String := LIO.Getline;
>          begin
>             LIO.Print(A_Line);
>             null;
>             pragma Inspection_Point(A_Line);
>          end;
>       end loop;
>    end if;
>
> end Test_Line_IO;

Nice one...I'll try these out on Win32 and see what happens :-)

But surely "Put_Line" and "Get_Line" are preferable subprogram
names?...

Cheers
-- Martin




* Re: Q: Line_IO
  2009-08-31  8:28 ` Q: Line_IO Martin
@ 2009-08-31 10:05   ` Georg Bauhaus
  2009-08-31 15:33     ` Anh Vo
  0 siblings, 1 reply; 15+ messages in thread
From: Georg Bauhaus @ 2009-08-31 10:05 UTC (permalink / raw)


Martin wrote:

> Nice one...I'll try these out on Win32 and see what happens :-)

Thanks.  As is, the program will raise NAME_ERROR on Win32: It
still seems impossible to name the standard streams for
Stream_IO.Open on Win32?  The package has worked with (other) named
files, though.

> But surely "Put_Line" and "Get_Line" are preferable subprogram
> names?...

The names Put_Line (for Print) and Get_Line (for Getline)
could suggest that these are perfect replacements.
For Print, that is basically the case for standard output,
I think.
Getline, however, needs a little more care than Get_Line
when using it.  At the moment, at least.




* Re: Q: Line_IO
  2009-08-31 10:05   ` Georg Bauhaus
@ 2009-08-31 15:33     ` Anh Vo
  2009-08-31 16:52       ` Georg Bauhaus
  0 siblings, 1 reply; 15+ messages in thread
From: Anh Vo @ 2009-08-31 15:33 UTC (permalink / raw)


On Aug 31, 3:05 am, Georg Bauhaus <rm.dash-bauh...@futureapps.de>
wrote:
> Martin wrote:
>
> > Nice one...I'll try these out on Win32 and see what happens :-)
>
> Thanks.  As is, the program will raise NAME_ERROR on Win32: It
> still seems impossible to name the standard streams for
> Stream_IO.Open on Win32?  The package has worked with (other) named
> files, though.
>
> > But surely "Put_Line" and "Get_Line" are preferable subprogram
> > names?...
>
> The names Put_Line (for Print) and Get_Line (for Getline)
> could suggest that these are perfect replacements.
> For Print, that is basically the case for standard output,
> I think.
> Getline, however, needs a little more care than Get_Line
> when using it.  At the moment, at least.

I am curious how close this comes to GNAT.IO.Put_Line and
GNAT.IO.Get_Line. If it is close enough, I would say it is the best of
both worlds, speed and portability.

Anh Vo




* Re: Q: Line_IO
  2009-08-31 15:33     ` Anh Vo
@ 2009-08-31 16:52       ` Georg Bauhaus
  0 siblings, 0 replies; 15+ messages in thread
From: Georg Bauhaus @ 2009-08-31 16:52 UTC (permalink / raw)


Anh Vo wrote:

> I am curious how close when compared to GNAT.IO.Put_Line and
> GNAT.IO.Get_Line. If it is close enough, I would say it is the best of
> both worlds, speed and portability.

For reading, I don't know how to make a fair comparison with
GNAT.IO.Get_Line, because this Get_Line seems to ignore the end of
input. AFAICS, it is implemented using C's getchar() (the non-macro
version, IIUC), and it never compares the result of the imported
get_char <- getchar() against C's EOF.

GNAT.IO.Put_Line seems to be slow.
It, too, ends up calling C's putchar().
In fact, it appears to run many times more slowly than
Text_IO.Put_Line.

A few statistical results, sampled on one GNU/Linux machine.

$ gnatmake -g -O2 -gnatwa -gnatn test_line_io.adb

$ ./test_line_io < {250MB text file} > {some output file}


With Line_IO.Print and

-  Line_IO.Getline: ~3 seconds.
-  Text_IO.Get_Line: ~7.5 seconds.

With Ada.Text_IO.Put_Line and

-  Line_IO.Getline: ~21 seconds
-  Text_IO.Get_Line: ~27 seconds




* Re: Q: Line_IO
       [not found] <4a9b045a$0$31875$9b4e6d93@newsspool3.arcor-online.net>
  2009-08-31  8:28 ` Q: Line_IO Martin
@ 2009-08-31 18:39 ` Dmitry A. Kazakov
  2009-08-31 22:51   ` Robert A Duff
  2009-08-31 23:56   ` Georg Bauhaus
  1 sibling, 2 replies; 15+ messages in thread
From: Dmitry A. Kazakov @ 2009-08-31 18:39 UTC (permalink / raw)


On Mon, 31 Aug 2009 00:59:38 +0200, Georg Bauhaus wrote:

> Text_IO seems fairly slow when just reading lines of text.
> Here are two alternative I/O subprograms for Line I/O, in plain Ada,
> based on Stream_IO.   They seem to run significantly faster.

When you print you do:

   Stream_IO.Write (Stdout, To_Bytes (Item & Separator_Sequence));

You could try not to concatenate:

   Stream_IO.Write (Stdout, To_Bytes (Item));
   Stream_IO.Write (Stdout, To_Bytes (Separator_Sequence));

, which should be faster when Item is large.

Then there is a crazy way to convert congruent types without
Unchecked_Conversion. I cannot tell whether it is actually faster:

   procedure Print (Item : String) is
      subtype Index is Stream_Element_Offset range 1..Item'Length;
      subtype XBytes is Stream_Element_Array (Index);
      Alias : XBytes;
      for Alias'Address use Item'Address;
   begin
      Stream_IO.Write (Stdout, Alias);
      ...   

P.S. The superimposed object shall not have initializers.
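
A minimal, self-contained sketch of both suggestions together (two Write
calls instead of "&", and an address overlay instead of
Unchecked_Conversion). The names Two_Writes_Demo, Write_String, Output
and Separator are made up for illustration and are not part of Line_IO;
the sketch writes to a temporary file rather than to standard output, and
it assumes that Character and Stream_Element have the same size:

```ada
with Ada.Streams.Stream_IO;
with Ada.Text_IO;

procedure Two_Writes_Demo is
   use Ada.Streams;

   --  The overlay only makes sense if both element types have one size.
   pragma Assert (Character'Size = Stream_Element'Size);

   Output    : Stream_IO.File_Type;
   Separator : constant String := (1 => ASCII.LF);

   --  Hypothetical helper: views the storage of Item as stream elements
   --  and writes it without copying or converting.
   procedure Write_String (Item : String) is
      subtype Index is Stream_Element_Offset range 1 .. Item'Length;
      Alias : Stream_Element_Array (Index);   --  no initializer here
      for Alias'Address use Item'Address;
   begin
      Stream_IO.Write (Output, Alias);
   end Write_String;

begin
   --  An empty Name asks for a temporary file (RM A.8.2).
   Stream_IO.Create (Output, Stream_IO.Out_File, "");

   --  Two calls, no concatenation of line and separator.
   Write_String ("hello");
   Write_String (Separator);

   --  Report how many stream elements the file holds now.
   Ada.Text_IO.Put_Line (Stream_IO.Count'Image (Stream_IO.Size (Output)));
   Stream_IO.Close (Output);
end Two_Writes_Demo;
```

Whether the two calls beat the concatenation probably depends on line
length and on the run-time library, so it is worth measuring both.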

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de




* Re: Q: Line_IO
  2009-08-31 18:39 ` Dmitry A. Kazakov
@ 2009-08-31 22:51   ` Robert A Duff
  2009-09-01  0:35     ` Georg Bauhaus
  2009-08-31 23:56   ` Georg Bauhaus
  1 sibling, 1 reply; 15+ messages in thread
From: Robert A Duff @ 2009-08-31 22:51 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

> Then there is a crazy way to convert congruent types without
> Unchecked_Conversion. I cannot tell whether it is actually faster:
>
>    procedure Print (Item : String) is
>       subtype Index is Stream_Element_Offset range 1..Item'Length;
>       subtype XBytes is Stream_Element_Array (Index);
>       Alias : XBytes;
>       for Alias'Address use Item'Address;
>    begin
>       Stream_IO.Write (Stdout, Alias);
>       ...   
>
> P.S. The superimposed object shall not have initializers.

If it has default initialization, you can suppress it using:

    pragma Import (Ada, Alias);

See 13.3(12.c) and B.1(38,38.a).

- Bob




* Re: Q: Line_IO
  2009-08-31 18:39 ` Dmitry A. Kazakov
  2009-08-31 22:51   ` Robert A Duff
@ 2009-08-31 23:56   ` Georg Bauhaus
  2009-09-01  0:19     ` Georg Bauhaus
  2009-09-01  7:02     ` Ludovic Brenta
  1 sibling, 2 replies; 15+ messages in thread
From: Georg Bauhaus @ 2009-08-31 23:56 UTC (permalink / raw)


Dmitry A. Kazakov wrote:

> You could try not to concatenate:
> 
>    Stream_IO.Write (Stdout, To_Bytes (Item));
>    Stream_IO.Write (Stdout, To_Bytes (Separator_Sequence));
> 
> , which should be faster when Item is large.

Yes, though according to some measurements that have been made
in the recent past, "&" is faster for "normal" sized lines.
Other sizes did not produce stable results (on my machine at least).
A test case is in
<4a7bebaa$0$30224$9b4e6d93@newsspool1.arcor-online.net>

However, the two calls are more general, so perhaps they
should replace the concatenation.  And they seem to make
using 'Address simpler, see below.

> Then there is a crazy way to convert congruent types without
> Unchecked_Conversion. I cannot tell whether it is actually faster:

Since 'Address is used for reading anyway, and since, yes,
it is faster, it could replace the unchecked conversion.
Is there a risk with function parameters, not objects of
"better known" storage places?

New version below. If you want to see the difference between
Unchecked_Conversion and 'Address, rename either Print_1 (old)
to Put_Line or Print_2 (new, 'Address) to the same.

> P.S. The superimposed object shall not have initializers.

Does this apply to String parameters?

generic
   Separator_Sequence : in String;  --  ends a line
package Line_IO is

   pragma Elaborate_Body;

   --
   --  High(er) speed reading and writing of lines via Stream I/O.
   --  Made with Unix pipes in mind.
   --
   --  Assumptions:
   --  - Lines are separated by a sequence of characters.
   --  - Characters and stream elements can be used interchangeably.
   --  - Lines are not longer than internal buffer size.
   --
   --  I/O exceptions are propagated

   procedure Put_Line(Item : String);

   function Get_Line return String;

end Line_IO;


with Ada.Streams.Stream_IO;
with Ada.Unchecked_Conversion;

package body Line_IO is

   use Ada.Streams;

   Stdout : Stream_IO.File_Type;
   Stdin : Stream_IO.File_Type;

   -- writing

   procedure Print_1 (Item : String) is

      subtype Index is Stream_Element_Offset range
        Stream_Element_Offset(Item'First)
        .. Stream_Element_Offset(Item'Last + Separator_Sequence'Length);
      subtype XString is String (Item'First
        .. Item'Last + Separator_Sequence'Length);
      subtype XBytes is Stream_Element_Array (Index);
      function To_Bytes is new Ada.Unchecked_Conversion
        (Source => XString,
         Target => XBytes);
   begin
      Stream_IO.Write (Stdout, To_Bytes (Item & Separator_Sequence));
   end Print_1;

   -- Alternative:
   -- - call Stream_IO.Write twice, once for the string, then for the
   --   line separator (terminator)
   -- - specify 'Address; no Unchecked_Conversion is needed then

   -- We need the separator as a Stream_Element_Array. (Can we
   -- use 'Address on a generic formal object?  If so, then
   -- again, no Unchecked_Conversion is needed (advantage?))

   subtype Sep_String is String(Separator_Sequence'Range);
   subtype Sep_Bytes is Stream_Element_Array
     (Stream_Element_Offset(Separator_Sequence'First)
     .. Stream_Element_Offset(Separator_Sequence'Last));

   function To_Bytes is new Ada.Unchecked_Conversion
     (Source => Sep_String,
      Target => Sep_Bytes);

   Separator_Bytes : constant Stream_Element_Array :=
     To_Bytes(Separator_Sequence);

   procedure Print_2 (Item : String) is
      subtype Index is Stream_Element_Offset range
        Stream_Element_Offset(Item'First)
        .. Stream_Element_Offset(Item'Last);
      subtype XBytes is Stream_Element_Array (Index);
      Item_Bytes: XBytes;
      for Item_Bytes'Address use Item'Address;
   begin
      Stream_IO.Write (Stdout, Item_Bytes);
      Stream_IO.Write (Stdout, Separator_Bytes);
   end Print_2;

   procedure Put_Line (Item : String) renames Print_2;

   -- ----------------
   -- reading
   -- ----------------
   -- Types etc., status variables, and the buffer.  `Buffer` is at the
   -- same time an array of Character and an array of Stream_Element
   -- called `Bytes`.  They share the same address.  This setup makes the
   -- storage at the address either a String (when selecting result
   -- characters) or a Stream_Element_Array (when reading input bytes).

   BUFSIZ: constant := 8_192;
   pragma Assert(Character'Size = Stream_Element'Size);

   SL : constant Natural := Separator_Sequence'Length;

   subtype Extended_Buffer_Index is Positive range 1 .. BUFSIZ + SL;
   subtype Buffer_Index is Extended_Buffer_Index
     range Extended_Buffer_Index'First .. Extended_Buffer_Index'Last - SL;
   subtype Extended_Bytes_Index is Stream_Element_Offset
     range 1 .. Stream_Element_Offset(Extended_Buffer_Index'Last);
   subtype Bytes_Index is Extended_Bytes_Index
     range Extended_Bytes_Index'First
     .. (Extended_Bytes_Index'Last - Stream_Element_Offset(SL));

   subtype Buffer_Data is String(Extended_Buffer_Index);
   subtype Buffer_Bytes is Stream_Element_Array(Extended_Bytes_Index);

   Buffer : Buffer_Data;
   Bytes  : Buffer_Bytes;
   for Bytes'Address use Buffer'Address;

   Position : Natural; -- start of next substring
   Last     : Natural; -- last valid character in buffer


   function Get_Line return String is

      procedure Reload;
      --  move remaining characters to the start of `Buffer` and
      --  fill the following bytes if possible
      --  post: Position in 0 .. 1, and 0 should mean end of file
      --        Last is 0 or else the index of the last valid element in
      --        Buffer

      procedure Reload is
         Remaining : constant Natural := Buffer_Index'Last - Position + 1;
         Last_Index : Stream_Element_Offset;
      begin
         Buffer(1 .. Remaining) := Buffer(Position .. Buffer_Index'Last);

         Stream_IO.Read(Stdin,
           Item => Bytes(Stream_Element_Offset(Remaining) + 1 ..
                         Bytes_Index'Last),
           Last => Last_Index);
         Last := Natural(Last_Index);
         Buffer(Last + 1 .. Last + SL) := Separator_Sequence;

         Position := Boolean'Pos(Last_Index > 0
           and then Buffer(1) /= ASCII.EOT   -- ^D
           and then Buffer(1) /= ASCII.SUB); -- ^Z

      end Reload;

      function Sep_Index return Natural;
      --  position of next Separator_Sequence
      pragma Inline(Sep_Index);

      function Sep_Index return Natural is
         K : Natural := Position;
      begin
         pragma Assert(K >= Buffer'First);
         pragma Assert(Buffer(Buffer_Index'Last + 1 .. Buffer'Last)
           = Separator_Sequence);

         while Buffer(K) /= Separator_Sequence(1) loop
            K := K + 1;
         end loop;

         return K;
      end Sep_Index;

      Next_Separator : Natural;
   begin  -- Get_Line
      pragma Assert(Position = 0 or else Position in Extended_Buffer_Index);
      pragma Assert(Last = 0 or else Last in Buffer_Index);

      if Position = 0 then
         raise Stream_IO.End_Error;
      end if;

      Next_Separator := Sep_Index;

      if Next_Separator > Buffer_Index'Last then
         -- must be sentinel
         Reload;
         return Get_Line;
      end if;

      if Next_Separator <= Last then
         declare
            Limit : constant Natural := Natural'Max(0, Next_Separator - SL);
            -- there was trouble (Print) when Integer Limit could be
            -- negative (for 2-char SL and Next_Separator = 1)
            Result : constant String := Buffer(Position .. Limit);
         begin
            Position := Limit + SL + 1;
            return Result;
         end;
      else
         -- the separator is among the characters beyond `Last`
         declare
            Limit : constant Positive := Last;
            Result : constant String := Buffer(Position .. Limit);
         begin
            --  -- makes the spurious line go away
            --  -- But make sure that it isn't caused by Put_Line!
            if Position > Last then
               raise Stream_IO.End_Error;
            end if;
            Position := 0;  -- next call will raise End_Error
            return Result;
         end;
      end if;

      raise Program_Error;
   end Get_Line;


begin
   -- (see <ILmdnWHx29q5VMrZnZ2dnUVZ_sednZ2d@megapath.net> for names
   -- of standard I/O streams when using Janus Ada on Windows.)

   Stream_IO.Open (Stdout,
     Mode => Stream_IO.Out_File,
     Name => "/dev/stdout");
   Stream_IO.Open (Stdin,
     Mode => Stream_IO.In_File,
     Name => "/dev/stdin");

   -- make sure there is no line separator in `Buffer` other than the
   -- sentinel
   Buffer := Buffer_Data'(others => ASCII.NUL);
   Buffer(Buffer_Index'Last + 1 .. Buffer'Last) := Separator_Sequence;
   Position := Buffer_Index'Last + 1;
   -- See also `Get_Line.Reload.Remaining`
   Last := 0;
end Line_IO;
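
The sentinel idea behind Sep_Index can be tried in isolation. The sketch
below is not part of Line_IO: Sentinel_Scan_Demo and its local names are
made up, the separator is a single character, and the "buffer" is a
constant string. It only shows why the inner loop needs no bounds test,
and how testing Position against Last avoids an extra empty line at the
end:

```ada
with Ada.Text_IO;

procedure Sentinel_Scan_Demo is
   Separator : constant Character := ASCII.LF;

   --  Two lines of input; the last one has no trailing separator.
   Data : constant String := "ab" & Separator & "cd";

   --  One extra slot after the valid data holds the sentinel.
   Buffer : constant String (1 .. Data'Length + 1) := Data & Separator;

   Last     : constant Natural := Data'Length;  --  last valid character
   Position : Positive := 1;                    --  start of next line
   K        : Positive;
begin
   while Position <= Last loop
      K := Position;

      --  No index check needed: the sentinel stops the search.
      while Buffer (K) /= Separator loop
         K := K + 1;
      end loop;

      Ada.Text_IO.Put_Line ("[" & Buffer (Position .. K - 1) & "]");
      Position := K + 1;
   end loop;
end Sentinel_Scan_Demo;
```

This prints "[ab]" and then "[cd]". If Data ended in a separator, the
outer loop would stop once Position exceeded Last, so no empty final line
would be produced.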




* Re: Q: Line_IO
  2009-08-31 23:56   ` Georg Bauhaus
@ 2009-09-01  0:19     ` Georg Bauhaus
  2009-09-01  1:08       ` Robert A Duff
  2009-09-01  7:02     ` Ludovic Brenta
  1 sibling, 1 reply; 15+ messages in thread
From: Georg Bauhaus @ 2009-09-01  0:19 UTC (permalink / raw)


Georg Bauhaus wrote:
>    procedure Print_2 (Item : String) is
>       subtype Index is Stream_Element_Offset range
>         Stream_Element_Offset(Item'First)
>         .. Stream_Element_Offset(Item'Last);
>       subtype XBytes is Stream_Element_Array (Index);
>       Item_Bytes: XBytes;
>       for Item_Bytes'Address use Item'Address;
>    begin
>       Stream_IO.Write (Stdout, Item_Bytes);
>       Stream_IO.Write (Stdout, Separator_Bytes);
>    end Print_2;

*** line_io.ada	old
--- line_io.ada	new
***************
*** 78,79 ****
--- 78,80 ----
        Item_Bytes: XBytes;
+       pragma Import (Ada, Item_Bytes);
        for Item_Bytes'Address use Item'Address;




* Re: Q: Line_IO
  2009-08-31 22:51   ` Robert A Duff
@ 2009-09-01  0:35     ` Georg Bauhaus
  0 siblings, 0 replies; 15+ messages in thread
From: Georg Bauhaus @ 2009-09-01  0:35 UTC (permalink / raw)


Robert A Duff wrote:
> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

>>    procedure Print (Item : String) is
>>       subtype Index is Stream_Element_Offset range 1..Item'Length;
>>       subtype XBytes is Stream_Element_Array (Index);
>>       Alias : XBytes;
>>       for Alias'Address use Item'Address;
>>    begin
>>       Stream_IO.Write (Stdout, Alias);
>>       ...   
>>
>> P.S. The superimposed object shall not have initializers.
> 
> If it has default initialization, you can suppress it using:
> 
>     pragma Import (Ada, Alias);
> 
> See 13.3(12.c) and B.1(38,38.a).

I guess the "superimposed" object is Item, superimposed onto Alias?




* Re: Q: Line_IO
  2009-09-01  0:19     ` Georg Bauhaus
@ 2009-09-01  1:08       ` Robert A Duff
  0 siblings, 0 replies; 15+ messages in thread
From: Robert A Duff @ 2009-09-01  1:08 UTC (permalink / raw)


Georg Bauhaus <see.reply.to@maps.futureapps.de> writes:

> Georg Bauhaus wrote:
>>    procedure Print_2 (Item : String) is
>>       subtype Index is Stream_Element_Offset range
>>         Stream_Element_Offset(Item'First)
>>         .. Stream_Element_Offset(Item'Last);
>>       subtype XBytes is Stream_Element_Array (Index);
>>       Item_Bytes: XBytes;
>>       for Item_Bytes'Address use Item'Address;
>>    begin
>>       Stream_IO.Write (Stdout, Item_Bytes);
>>       Stream_IO.Write (Stdout, Separator_Bytes);
>>    end Print_2;
>
> *** line_io.ada	old
> --- line_io.ada	new
> ***************
> *** 78,79 ****
> --- 78,80 ----
>         Item_Bytes: XBytes;
> +       pragma Import (Ada, Item_Bytes);
>         for Item_Bytes'Address use Item'Address;

The Import is not strictly necessary, because Stream_Element_Array has
no default initialization.  But it's still good style -- it says, the
declaration of Item_Bytes is not creating a new object, it's just
overlaying an old one.

If Item_Bytes had default inits (e.g. if it were an array of access
values, which are default-initialized to null, or an array of records
with some defaulted components), then the Import would be necessary.
I think in that case, GNAT warns, because without the Import, the
default inits will overwrite Item, which is certainly not what you
want.
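
A small sketch of that second case, under the assumption that the record
and the Integer it overlays have the same size and alignment. The type
Cell and the objects Source and View are invented for the example:

```ada
with Ada.Text_IO;

procedure Import_Overlay_Demo is

   --  Hypothetical type with a defaulted component.
   type Cell is record
      Value : Integer := 0;
   end record;

   Source : aliased Integer := 42;

   --  View is not a new object, it overlays Source.  The Import
   --  suppresses the default initialization Value := 0, which would
   --  otherwise be stored into Source when View is elaborated.
   View : Cell;
   pragma Import (Ada, View);
   for View'Address use Source'Address;

begin
   Ada.Text_IO.Put_Line (Integer'Image (View.Value));
end Import_Overlay_Demo;
```

With the Import in place, View.Value should still show the 42 stored in
Source; removing the pragma is a quick way to see the overwrite (and,
with GNAT, probably the warning).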

- Bob




* Re: Q: Line_IO
  2009-08-31 23:56   ` Georg Bauhaus
  2009-09-01  0:19     ` Georg Bauhaus
@ 2009-09-01  7:02     ` Ludovic Brenta
  2009-09-01  9:55       ` Georg Bauhaus
                         ` (3 more replies)
  1 sibling, 4 replies; 15+ messages in thread
From: Ludovic Brenta @ 2009-09-01  7:02 UTC (permalink / raw)


Georg Bauhaus wrote on comp.lang.ada:
>    BUFSIZ: constant := 8_192;
[...]
>    SL : constant Natural := Separator_Sequence'Length;
>    subtype Extended_Buffer_Index is Positive range 1 .. BUFSIZ + SL;

Since BUFSIZ is obviously chosen as an integral number of hardware
memory pages, the extended_buffer uses two pages plus two bytes. How
about allocating a buffer of BUFSIZ bytes and using only BUFSIZ-SL for
the string and the remaining SL bytes for the terminator?
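
As a sketch of what I mean (declarations only; the names follow the
posted package, but Buffer_Layout_Demo itself and the fixed one-character
Separator are just for illustration):

```ada
with Ada.Text_IO;

procedure Buffer_Layout_Demo is
   BUFSIZ    : constant := 8_192;
   Separator : constant String := (1 => ASCII.LF);
   SL        : constant Natural := Separator'Length;

   --  The whole buffer is exactly BUFSIZ characters ...
   subtype Extended_Buffer_Index is Positive range 1 .. BUFSIZ;

   --  ... of which the last SL are reserved for the sentinel.
   subtype Buffer_Index is Extended_Buffer_Index range 1 .. BUFSIZ - SL;

   Buffer : String (Extended_Buffer_Index) := (others => ASCII.NUL);
begin
   --  Place the sentinel in the reserved tail of the buffer.
   Buffer (Buffer_Index'Last + 1 .. Buffer'Last) := Separator;

   Ada.Text_IO.Put_Line ("total:" & Integer'Image (Buffer'Length));
   Ada.Text_IO.Put_Line ("data :" & Integer'Image (Buffer_Index'Last));
end Buffer_Layout_Demo;
```

With a one-character separator this reports a total of 8192 and 8191
characters for data, so Stream_IO.Read would get slightly less than
BUFSIZ bytes per call.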

I realize that at this point we're down to nitpicking because the
program seems really good and fast now.

--
Ludovic Brenta.





* Re: Q: Line_IO
  2009-09-01  7:02     ` Ludovic Brenta
@ 2009-09-01  9:55       ` Georg Bauhaus
  2009-09-01 12:03       ` jonathan
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 15+ messages in thread
From: Georg Bauhaus @ 2009-09-01  9:55 UTC (permalink / raw)


Ludovic Brenta wrote:
> Georg Bauhaus wrote on comp.lang.ada:
>>    BUFSIZ: constant := 8_192;
> [...]
>>    SL : constant Natural := Separator_Sequence'Length;
>>    subtype Extended_Buffer_Index is Positive range 1 .. BUFSIZ + SL;
> 
> Since BUFSIZ is obviously chosen as an integral number of hardware
> memory pages, the extended_buffer uses two pages plus two bytes. How
> about allocating a buffer of BUFSIZ bytes and using only BUFSIZ-SL for
> the string and the remaining SL bytes for the terminator?

I had made the buffer have BUFSIZ + Separator_Sequence'Length
elements because Stream_IO.Read would then have BUFSIZ bytes
into which to store its data.  I was only guessing that this
would matter; indeed, it is somewhat faster than using BUFSIZ = 128.
But growing beyond 8192 did not have an effect.  I'll try others.

Another thing:
The names look a bit bulky, at least from a "molecular source
pattern" matching point of view.  But I can't produce better
short and meaningful names that are still Ada.  Is it enough
to add a few empty lines?  Ideas welcome.




* Re: Q: Line_IO
  2009-09-01  7:02     ` Ludovic Brenta
  2009-09-01  9:55       ` Georg Bauhaus
@ 2009-09-01 12:03       ` jonathan
       [not found]       ` <4a9e2c86$0$30235$9b4e6d93@newsspool1.arcor-online.net>
  2009-09-05 20:30       ` Georg Bauhaus
  3 siblings, 0 replies; 15+ messages in thread
From: jonathan @ 2009-09-01 12:03 UTC (permalink / raw)


On Sep 1, 8:02 am, Ludovic Brenta <ludo...@ludovic-brenta.org> wrote:
> Georg Bauhaus wrote on comp.lang.ada:
>
> >    BUFSIZ: constant := 8_192;
> [...]
> >    SL : constant Natural := Separator_Sequence'Length;
> >    subtype Extended_Buffer_Index is Positive range 1 .. BUFSIZ + SL;
>
> Since BUFSIZ is obviously chosen as an integral number of hardware
> memory pages, the extended_buffer uses two pages plus two bytes. How
> about allocating a buffer of BUFSIZ bytes and using only BUFSIZ-SL for
> the string and the remaining SL bytes for the terminator?
>
> I realize that at this point we're down to nitpicking because the
> program seems really good and fast now.
>
> --
> Ludovic Brenta.

A few benchmark timings:

I updated a version of knucleotide.adb with the new get_line.
IO overhead fell from 3.6 sec on my machine, to 1.2 sec.
It now reads and stores (half of) the 250 MB text file in about
the same time as my vim editor. Very nice result, especially
for the multitasking program, which can parallelize everything
except IO.

Jonathan





* Re: Q: Line_IO
       [not found]       ` <4a9e2c86$0$30235$9b4e6d93@newsspool1.arcor-online.net>
@ 2009-09-02  8:47         ` Georg Bauhaus
  0 siblings, 0 replies; 15+ messages in thread
From: Georg Bauhaus @ 2009-09-02  8:47 UTC (permalink / raw)


(I'll switch News readers back to Emacs, unless
I can find out how to add a QP piece of text
to a message in Thunderbird; sorry if the attachment
is inconvenient.)




* Re: Q: Line_IO
  2009-09-01  7:02     ` Ludovic Brenta
                         ` (2 preceding siblings ...)
       [not found]       ` <4a9e2c86$0$30235$9b4e6d93@newsspool1.arcor-online.net>
@ 2009-09-05 20:30       ` Georg Bauhaus
  3 siblings, 0 replies; 15+ messages in thread
From: Georg Bauhaus @ 2009-09-05 20:30 UTC (permalink / raw)


Ludovic Brenta wrote:

> I realize that at this point we're down to nitpicking because the
> program seems really good and fast now.

It is now somewhat more correct, too:
http://home.arcor.de/bauhaus/Ada/line_io.ada

It is probably worth noting that specifying Buffer'Alignment
was not a good idea: it slowed down Get_Line.
(And ObjectAda is happy with an alignment number like 8,
not BUFSIZ, anyway.)


