comp.lang.ada
From: Martin <martin.dowie@btopenworld.com>
Subject: Re: Q: Line_IO
Date: Mon, 31 Aug 2009 01:28:36 -0700 (PDT)
Message-ID: <7225bda9-8757-4c5c-bb44-b3be21a1e1f9@p36g2000vbn.googlegroups.com>
In-Reply-To: 4a9b045a$0$31875$9b4e6d93@newsspool3.arcor-online.net

On Aug 30, 11:59 pm, Georg Bauhaus <see.reply...@maps.futureapps.de>
wrote:
> Text_IO seems fairly slow when just reading lines of text.
> Here are two alternative I/O subprograms for Line I/O, in plain Ada,
> based on Stream_IO.   They seem to run significantly faster.
>
> However, there is one glitch and I can't find the cause:
> output always has one more line at the end, an empty one.
> Why?  If you have got a minute to look at this, you will
> also help us with getting faster programs at the Shootout.
> These read lines by the megabyte.
>
> generic
>    Separator_Sequence : in String;  --  ends a line
> package Line_IO is
>
>    pragma Elaborate_Body;
>
>    --
>    --  High(er) speed reading and writing of lines via Stream I/O.
>    --  Made with Unix pipes in mind.
>    --
>    --  Assumptions:
>    --  - Lines are separated by a sequence of characters.
>    --  - Characters and stream elements can be used interchangeably.
>    --  - Lines are not longer than internal buffer size.
>    --
>    --  I/O exceptions are propagated
>
>    procedure Print(Item : String);
>
>    function Getline return String;
>
> end Line_IO;
>
> with Ada.Streams.Stream_IO;
> with Ada.Unchecked_Conversion;
>
> package body Line_IO is
>
>    use Ada.Streams;
>
>    Stdout : Stream_IO.File_Type;
>    Stdin : Stream_IO.File_Type;
>
>    -- writing
>
>    procedure Print (Item : String) is
>
>       subtype Index is Stream_Element_Offset range
>         Stream_Element_Offset(Item'First)
>         .. Stream_Element_Offset(Item'Last + Separator_Sequence'Length);
>       subtype XString is String (Item'First
>         .. Item'Last + Separator_Sequence'Length);
>       subtype XBytes is Stream_Element_Array (Index);
>       function To_Bytes is new Ada.Unchecked_Conversion
>         (Source => XString,
>          Target => XBytes);
>    begin
>       Stream_IO.Write (Stdout, To_Bytes (Item & Separator_Sequence));
>    end Print;
>
>    -- ----------------
>    -- reading
>    -- ----------------
>    -- Types etc., status variables, and the buffer.  `Buffer` is at the
>    -- same time an array of Character and an array of Stream_Element
>    -- called `Bytes`.  They share the same address.  This setup makes the
>    -- storage at the address either a String (when selecting result
>    -- characters) or a Stream_Element_Array (when reading input bytes).
>
>    BUFSIZ: constant := 8_192;
>    pragma Assert(Character'Size = Stream_Element'Size);
>
>    SL : constant Natural := Separator_Sequence'Length;
>
>    subtype Extended_Buffer_Index is Positive range 1 .. BUFSIZ + SL;
>    subtype Buffer_Index is Extended_Buffer_Index
>      range Extended_Buffer_Index'First .. Extended_Buffer_Index'Last - SL;
>    subtype Extended_Bytes_Index is Stream_Element_Offset
>      range 1 .. Stream_Element_Offset(Extended_Buffer_Index'Last);
>    subtype Bytes_Index is Extended_Bytes_Index
>      range Extended_Bytes_Index'First
>      .. (Extended_Bytes_Index'Last - Stream_Element_Offset(SL));
>
>    subtype Buffer_Data is String(Extended_Buffer_Index);
>    subtype Buffer_Bytes is Stream_Element_Array(Extended_Bytes_Index);
>
>    Buffer : Buffer_Data;
>    Bytes  : Buffer_Bytes;
>    for Bytes'Address use Buffer'Address;
>
>    Position : Natural; -- start of next substring
>    Last     : Natural; -- last valid character in buffer
>
>    function Getline return String is
>
>       procedure Reload;
>       --  move remaining characters to the start of `Buffer` and
>       --  fill the following bytes if possible
>       --  post: Position in 0 .. 1, and 0 should mean end of file
>       --        Last is 0 or else the index of the last valid element in
>       --        Buffer
>
>       procedure Reload is
>          Remaining : constant Natural := Buffer_Index'Last - Position + 1;
>          Last_Index : Stream_Element_Offset;
>       begin
>          Buffer(1 .. Remaining) := Buffer(Position .. Buffer_Index'Last);
>
>          Stream_IO.Read(Stdin,
>            Item => Bytes(Stream_Element_Offset(Remaining) + 1
>                            .. Bytes_Index'Last),
>            Last => Last_Index);
>          Last := Natural(Last_Index);
>          Buffer(Last + 1 .. Last + SL) := Separator_Sequence;
>
>          Position := Boolean'Pos(Last_Index > 0
>            and then Buffer(1) /= ASCII.EOT   -- ^D
>            and then Buffer(1) /= ASCII.SUB); -- ^Z
>
>       end Reload;
>
>       function Sep_Index return Natural;
>       --  position of next Separator_Sequence
>       pragma Inline(Sep_Index);
>
>       function Sep_Index return Natural is
>          K : Natural := Position;
>       begin
>          pragma Assert(K >= Buffer'First);
>          pragma Assert(Buffer(Buffer_Index'Last + 1 .. Buffer'Last)
>            = Separator_Sequence);
>
>          while Buffer(K) /= Separator_Sequence(1) loop
>             K := K + 1;
>          end loop;
>
>          return K;
>       end Sep_Index;
>
>       Next_Separator : Natural;
>    begin  -- Getline
>       pragma Assert(Position = 0 or else Position in Extended_Buffer_Index);
>       pragma Assert(Last = 0 or else Last in Buffer_Index);
>
>       if Position = 0 then
>          raise Stream_IO.End_Error;
>       end if;
>
>       Next_Separator := Sep_Index;
>
>       if Next_Separator > Buffer_Index'Last then
>          -- must be sentinel
>          Reload;
>          return Getline;
>       end if;
>
>       if Next_Separator <= Last then
>          declare
>             Limit : constant Natural := Natural'Max(0, Next_Separator - SL);
>             -- there was trouble (Print) when Integer Limit could be
>             -- negative (for 2-char SL and Next_Separator = 1)
>             Result : constant String := Buffer(Position .. Limit);
>          begin
>             Position := Limit + SL + 1;
>             return Result;
>          end;
>       else
>          -- the separator is among the characters beyond `Last`
>          declare
>             Limit : constant Positive := Last;
>             Result : constant String := Buffer(Position .. Limit);
>          begin
>             Position := 0;  -- next call will raise End_Error
>             return Result;
>          end;
>       end if;
>
>       raise Program_Error;
>    end Getline;
>
> begin
>    -- (see <ILmdnWHx29q5VMrZnZ2dnUVZ_sedn...@megapath.net> for names
>    -- of standard I/O streams when using Janus Ada on Windows.)
>
>    Stream_IO.Open (Stdout,
>      Mode => Stream_IO.Out_File,
>      Name => "/dev/stdout");
>    Stream_IO.Open (Stdin,
>      Mode => Stream_IO.In_File,
>      Name => "/dev/stdin");
>
>    -- make sure there is no line separator in `Buffer` other than the
>    -- sentinel
>    Buffer := Buffer_Data'(others => ASCII.NUL);
>    Buffer(Buffer_Index'Last + 1 .. Buffer'Last) := Separator_Sequence;
>    Position := Buffer_Index'Last + 1;
>    -- See also `Getline.Reload.Remaining`
>    Last := 0;
> end Line_IO;
>
> --
> -- A small test program.
> --
> with Line_IO;
> with Ada.Text_IO;
>
> procedure Test_Line_IO is
>    Want_Text_IO : constant Boolean := False;
>
>    -- pick the correct one for your input files
>    UnixLF  : constant String := String'(1 => ASCII.LF);
>    MacCR   : constant String := String'(1 => ASCII.CR);
>    OS2CRLF : constant String := String'(1 => ASCII.CR, 2 => ASCII.LF);
>
>    package LIO is new Line_IO(Separator_Sequence => UnixLF);
>
> begin
>    if Want_Text_IO then
>       loop
>          declare
>             A_Line : constant String := Ada.Text_IO.Get_Line;
>          begin
>             LIO.Print(A_Line);
>             null;
>             pragma Inspection_Point(A_Line);
>          end;
>       end loop;
>    else
>       loop
>          declare
>             A_Line : constant String := LIO.Getline;
>          begin
>             LIO.Print(A_Line);
>             null;
>             pragma Inspection_Point(A_Line);
>          end;
>       end loop;
>    end if;
>
> end Test_Line_IO;

Nice one...I'll try these out on Win32 and see what happens :-)

But surely "Put_Line" and "Get_Line" are preferable subprogram
names?...
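
Without touching the package itself, a client could get those names with
renaming declarations. This is only an untested sketch: it assumes the
generic Line_IO quoted above compiles as posted, and Copy_Lines and LF are
names made up for the example.

```ada
with Ada.Streams.Stream_IO;
with Line_IO;  --  the generic package quoted above

procedure Copy_Lines is
   --  Hypothetical names for this sketch only.
   LF : constant String := String'(1 => ASCII.LF);

   package LIO is new Line_IO (Separator_Sequence => LF);

   --  Renaming declarations give the Text_IO-style names to the same
   --  subprograms; no wrapper bodies are needed.
   procedure Put_Line (Item : String) renames LIO.Print;
   function Get_Line return String renames LIO.Getline;
begin
   loop
      Put_Line (Get_Line);
   end loop;
exception
   when Ada.Streams.Stream_IO.End_Error =>
      null;  --  Getline raises End_Error once the input is exhausted
end Copy_Lines;
```

The behaviour is exactly that of Print and Getline as posted, including
the extra empty line at the end; only the names change.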

Cheers
-- Martin


