From: Martin
Newsgroups: comp.lang.ada
Subject: Re: Q: Line_IO
Date: Mon, 31 Aug 2009 01:28:36 -0700 (PDT)
Message-ID: <7225bda9-8757-4c5c-bb44-b3be21a1e1f9@p36g2000vbn.googlegroups.com>
References: <4a9b045a$0$31875$9b4e6d93@newsspool3.arcor-online.net>

On Aug 30, 11:59 pm, Georg Bauhaus wrote:
> Text_IO seems fairly slow when just reading lines of text.
> Here are two alternative I/O subprograms for Line I/O, in plain Ada,
> based on Stream_IO.  They seem to run significantly faster.
>
> However, there is one glitch and I can't find the cause:
> output always has one more line at the end, an empty one.
> Why?  If you have got a minute to look at this, you will
> also help us with getting faster programs at the Shootout.
> These read lines by the megabyte.
>
> generic
>    Separator_Sequence : in String;  --  ends a line
> package Line_IO is
>
>    pragma Elaborate_Body;
>
>    --
>    --  High(er) speed reading and writing of lines via Stream I/O.
>    --  Made with Unix pipes in mind.
>    --
>    --  Assumptions:
>    --  - Lines are separated by a sequence of characters.
>    --  - Characters and stream elements can be used interchangeably.
>    --  - Lines are not longer than internal buffer size.
>    --
>    --  I/O exceptions are propagated
>
>    procedure Print(Item : String);
>
>    function Getline return String;
>
> end Line_IO;
>
> with Ada.Streams.Stream_IO;
> with Ada.Unchecked_Conversion;
>
> package body Line_IO is
>
>    use Ada.Streams;
>
>    Stdout : Stream_IO.File_Type;
>    Stdin : Stream_IO.File_Type;
>
>    -- writing
>
>    procedure Print (Item : String) is
>
>       subtype Index is Stream_Element_Offset range
>         Stream_Element_Offset(Item'First)
>         .. Stream_Element_Offset(Item'Last + Separator_Sequence'Length);
>       subtype XString is String (Item'First
>         .. Item'Last + Separator_Sequence'Length);
>       subtype XBytes is Stream_Element_Array (Index);
>       function To_Bytes is new Ada.Unchecked_Conversion
>         (Source => XString,
>          Target => XBytes);
>    begin
>       Stream_IO.Write (Stdout, To_Bytes (Item & Separator_Sequence));
>    end Print;
>
>    -- ----------------
>    -- reading
>    -- ----------------
>    -- Types etc., status variables, and the buffer.  `Buffer` is at the
>    -- same time an array of Character and an array of Stream_Element
>    -- called `Bytes`.  They share the same address.
>    -- This setup makes the
>    -- storage at the address either a String (when selecting result
>    -- characters) or a Stream_Element_Array (when reading input bytes).
>
>    BUFSIZ: constant := 8_192;
>    pragma Assert(Character'Size = Stream_Element'Size);
>
>    SL : constant Natural := Separator_Sequence'Length;
>
>    subtype Extended_Buffer_Index is Positive range 1 .. BUFSIZ + SL;
>    subtype Buffer_Index is Extended_Buffer_Index
>      range Extended_Buffer_Index'First .. Extended_Buffer_Index'Last - SL;
>    subtype Extended_Bytes_Index is Stream_Element_Offset
>      range 1 .. Stream_Element_Offset(Extended_Buffer_Index'Last);
>    subtype Bytes_Index is Extended_Bytes_Index
>      range Extended_Bytes_Index'First
>      .. (Extended_Bytes_Index'Last - Stream_Element_Offset(SL));
>
>    subtype Buffer_Data is String(Extended_Buffer_Index);
>    subtype Buffer_Bytes is Stream_Element_Array(Extended_Bytes_Index);
>
>    Buffer : Buffer_Data;
>    Bytes  : Buffer_Bytes;
>    for Bytes'Address use Buffer'Address;
>
>    Position : Natural; -- start of next substring
>    Last     : Natural; -- last valid character in buffer
>
>    function Getline return String is
>
>       procedure Reload;
>       --  move remaining characters to the start of `Buffer` and
>       --  fill the following bytes if possible
>       --  post: Position in 0 .. 1, and 0 should mean end of file
>       --        Last is 0 or else the index of the last valid element in Buffer
>
>       procedure Reload is
>          Remaining : constant Natural := Buffer_Index'Last - Position + 1;
>          Last_Index : Stream_Element_Offset;
>       begin
>          Buffer(1 .. Remaining) := Buffer(Position ..
>             Buffer_Index'Last);
>
>          Stream_IO.Read(Stdin,
>            Item => Bytes(Stream_Element_Offset(Remaining) + 1 ..
>              Bytes_Index'Last),
>            Last => Last_Index);
>          Last := Natural(Last_Index);
>          Buffer(Last + 1 .. Last + SL) := Separator_Sequence;
>
>          Position := Boolean'Pos(Last_Index > 0
>            and then Buffer(1) /= ASCII.EOT   -- ^D
>            and then Buffer(1) /= ASCII.SUB); -- ^Z
>
>       end Reload;
>
>       function Sep_Index return Natural;
>       --  position of next Separator_Sequence
>       pragma Inline(Sep_Index);
>
>       function Sep_Index return Natural is
>          K : Natural := Position;
>       begin
>          pragma Assert(K >= Buffer'First);
>          pragma Assert(Buffer(Buffer_Index'Last + 1 .. Buffer'Last)
>            = Separator_Sequence);
>
>          while Buffer(K) /= Separator_Sequence(1) loop
>             K := K + 1;
>          end loop;
>
>          return K;
>       end Sep_Index;
>
>       Next_Separator : Natural;
>    begin  -- Getline
>       pragma Assert(Position = 0 or else Position in Extended_Buffer_Index);
>       pragma Assert(Last = 0 or else Last in Buffer_Index);
>
>       if Position = 0 then
>          raise Stream_IO.End_Error;
>       end if;
>
>       Next_Separator := Sep_Index;
>
>       if Next_Separator > Buffer_Index'Last then
>          -- must be sentinel
>          Reload;
>          return Getline;
>       end if;
>
>       if Next_Separator <= Last then
>          declare
>             Limit : constant Natural := Natural'Max(0, Next_Separator
>               - SL);
>             -- there was trouble (Print) when Integer Limit could be negative
>             -- (for 2-char SL and Next_Separator = 1)
>             Result : constant String := Buffer(Position .. Limit);
>          begin
>             Position := Limit + SL + 1;
>             return Result;
>          end;
>       else
>          -- the separator is among the characters beyond `Last`
>          declare
>             Limit : constant Positive := Last;
>             Result : constant String := Buffer(Position .. Limit);
>          begin
>             Position := 0;  -- next call will raise End_Error
>             return Result;
>          end;
>       end if;
>
>       raise Program_Error;
>    end Getline;
>
> begin
>    -- (see for names
>    -- of standard I/O streams when using Janus Ada on Windows.)
>
>    Stream_IO.Open (Stdout,
>      Mode => Stream_IO.Out_File,
>      Name => "/dev/stdout");
>    Stream_IO.Open (Stdin,
>      Mode => Stream_IO.In_File,
>      Name => "/dev/stdin");
>
>    -- make sure there is no line separator in `Buffer` other than the sentinel
>    Buffer := Buffer_Data'(others => ASCII.NUL);
>    Buffer(Buffer_Index'Last + 1 .. Buffer'Last) := Separator_Sequence;
>    Position := Buffer_Index'Last + 1;  -- See also `Getline.Reload.Remaining`
>    Last := 0;
> end Line_IO;
>
> --
> -- A small test program.
> --
> with Line_IO;
> with Ada.Text_IO;
>
> procedure Test_Line_IO is
>    Want_Text_IO : constant Boolean := False;
>
>    -- pick the correct one for your input files
>    UnixLF  : constant String := String'(1 => ASCII.LF);
>    MacCR   : constant String := String'(1 => ASCII.CR);
>    OS2CRLF : constant String := String'(1 => ASCII.CR, 2 => ASCII.LF);
>
>    package LIO is new Line_IO(Separator_Sequence => UnixLF);
>
> begin
>    if Want_Text_IO then
>       loop
>          declare
>             A_Line : constant String := Ada.Text_IO.Get_Line;
>          begin
>             LIO.Print(A_Line);
>             null;
>             pragma Inspection_Point(A_Line);
>          end;
>       end loop;
>    else
>       loop
>          declare
>             A_Line : constant String := LIO.Getline;
>          begin
>             LIO.Print(A_Line);
>             null;
>             pragma Inspection_Point(A_Line);
>          end;
>       end loop;
>    end if;
>
> end Test_Line_IO;

Nice one... I'll try these out on Win32 and see what happens :-)

But surely "Put_Line" and "Get_Line" are preferable subprogram names?...

Cheers
-- Martin