* Re: Q: Line_IO
  [not found] <4a9b045a$0$31875$9b4e6d93@newsspool3.arcor-online.net>
From: Martin @ 2009-08-31 8:28 UTC

On Aug 30, 11:59 pm, Georg Bauhaus <see.reply...@maps.futureapps.de> wrote:
> Text_IO seems fairly slow when just reading lines of text.
> Here are two alternative I/O subprograms for Line I/O, in plain Ada,
> based on Stream_IO. They seem to run significantly faster.
>
> However, there is one glitch and I can't find the cause:
> output always has one more line at the end, an empty one.
> Why? If you have got a minute to look at this, you will
> also help us with getting faster programs at the Shootout.
> These read lines by the megabyte.
>
> generic
>    Separator_Sequence : in String; -- ends a line
> package Line_IO is
>
>    pragma Elaborate_Body;
>
>    --
>    -- High(er) speed reading and writing of lines via Stream I/O.
>    -- Made with Unix pipes in mind.
>    --
>    -- Assumptions:
>    -- - Lines are separated by a sequence of characters.
>    -- - Characters and stream elements can be used interchangeably.
>    -- - Lines are not longer than internal buffer size.
>    --
>    -- I/O exceptions are propagated
>
>    procedure Print(Item : String);
>
>    function Getline return String;
>
> end Line_IO;
>
> with Ada.Streams.Stream_IO;
> with Ada.Unchecked_Conversion;
>
> package body Line_IO is
>
>    use Ada.Streams;
>
>    Stdout : Stream_IO.File_Type;
>    Stdin : Stream_IO.File_Type;
>
>    -- writing
>
>    procedure Print (Item : String) is
>
>       subtype Index is Stream_Element_Offset range
>         Stream_Element_Offset(Item'First)
>         .. Stream_Element_Offset(Item'Last + Separator_Sequence'Length);
>       subtype XString is String (Item'First
>         .. Item'Last + Separator_Sequence'Length);
>       subtype XBytes is Stream_Element_Array (Index);
>       function To_Bytes is new Ada.Unchecked_Conversion
>         (Source => XString,
>          Target => XBytes);
>    begin
>       Stream_IO.Write (Stdout, To_Bytes (Item & Separator_Sequence));
>    end Print;
>
>    -- ----------------
>    -- reading
>    -- ----------------
>    -- Types etc., status variables, and the buffer. `Buffer` is at the
>    -- same time an array of Character and an array of Stream_Element
>    -- called `Bytes`. They share the same address. This setup makes the
>    -- storage at the address either a String (when selecting result
>    -- characters) or a Stream_Element_Array (when reading input bytes).
>
>    BUFSIZ: constant := 8_192;
>    pragma Assert(Character'Size = Stream_Element'Size);
>
>    SL : constant Natural := Separator_Sequence'Length;
>
>    subtype Extended_Buffer_Index is Positive range 1 .. BUFSIZ + SL;
>    subtype Buffer_Index is Extended_Buffer_Index
>      range Extended_Buffer_Index'First .. Extended_Buffer_Index'Last - SL;
>    subtype Extended_Bytes_Index is Stream_Element_Offset
>      range 1 .. Stream_Element_Offset(Extended_Buffer_Index'Last);
>    subtype Bytes_Index is Extended_Bytes_Index
>      range Extended_Bytes_Index'First
>      .. (Extended_Bytes_Index'Last - Stream_Element_Offset(SL));
>
>    subtype Buffer_Data is String(Extended_Buffer_Index);
>    subtype Buffer_Bytes is Stream_Element_Array(Extended_Bytes_Index);
>
>    Buffer : Buffer_Data;
>    Bytes : Buffer_Bytes;
>    for Bytes'Address use Buffer'Address;
>
>    Position : Natural; -- start of next substring
>    Last : Natural; -- last valid character in buffer
>
>    function Getline return String is
>
>       procedure Reload;
>       -- move remaining characters to the start of `Buffer` and
>       -- fill the following bytes if possible
>       -- post: Position in 0 .. 1, and 0 should mean end of file
>       -- Last is 0 or else the index of the last valid element in Buffer
>
>       procedure Reload is
>          Remaining : constant Natural := Buffer_Index'Last - Position + 1;
>          Last_Index : Stream_Element_Offset;
>       begin
>          Buffer(1 .. Remaining) := Buffer(Position .. Buffer_Index'Last);
>
>          Stream_IO.Read(Stdin,
>            Item => Bytes(Stream_Element_Offset(Remaining) + 1 ..
>              Bytes_Index'Last),
>            Last => Last_Index);
>          Last := Natural(Last_Index);
>          Buffer(Last + 1 .. Last + SL) := Separator_Sequence;
>
>          Position := Boolean'Pos(Last_Index > 0
>            and then Buffer(1) /= ASCII.EOT -- ^D
>            and then Buffer(1) /= ASCII.SUB); -- ^Z
>
>       end Reload;
>
>       function Sep_Index return Natural;
>       -- position of next Separator_Sequence
>       pragma Inline(Sep_Index);
>
>       function Sep_Index return Natural is
>          K : Natural := Position;
>       begin
>          pragma Assert(K >= Buffer'First);
>          pragma Assert(Buffer(Buffer_Index'Last + 1 .. Buffer'Last)
>            = Separator_Sequence);
>
>          while Buffer(K) /= Separator_Sequence(1) loop
>             K := K + 1;
>          end loop;
>
>          return K;
>       end Sep_Index;
>
>       Next_Separator : Natural;
>    begin -- Getline
>       pragma Assert(Position = 0 or else Position in Extended_Buffer_Index);
>       pragma Assert(Last = 0 or else Last in Buffer_Index);
>
>       if Position = 0 then
>          raise Stream_IO.End_Error;
>       end if;
>
>       Next_Separator := Sep_Index;
>
>       if Next_Separator > Buffer_Index'Last then
>          -- must be sentinel
>          Reload;
>          return Getline;
>       end if;
>
>       if Next_Separator <= Last then
>          declare
>             Limit : constant Natural := Natural'Max(0, Next_Separator - SL);
>             -- there was trouble (Print) when Integer Limit could be negative
>             -- (for 2-char SL and Next_Separator = 1)
>             Result : constant String := Buffer(Position .. Limit);
>          begin
>             Position := Limit + SL + 1;
>             return Result;
>          end;
>       else
>          -- the separator is among the characters beyond `Last`
>          declare
>             Limit : constant Positive := Last;
>             Result : constant String := Buffer(Position .. Limit);
>          begin
>             Position := 0; -- next call will raise End_Error
>             return Result;
>          end;
>       end if;
>
>       raise Program_Error;
>    end Getline;
>
> begin
>    -- (see <ILmdnWHx29q5VMrZnZ2dnUVZ_sedn...@megapath.net> for names
>    -- of standard I/O streams when using Janus Ada on Windows.)
>
>    Stream_IO.Open (Stdout,
>      Mode => Stream_IO.Out_File,
>      Name => "/dev/stdout");
>    Stream_IO.Open (Stdin,
>      Mode => Stream_IO.In_File,
>      Name => "/dev/stdin");
>
>    -- make sure there is no line separator in `Buffer` other than the sentinel
>    Buffer := Buffer_Data'(others => ASCII.NUL);
>    Buffer(Buffer_Index'Last + 1 .. Buffer'Last) := Separator_Sequence;
>    Position := Buffer_Index'Last + 1; -- See also `Getline.Reload.Remaining`
>    Last := 0;
> end Line_IO;
>
> --
> -- A small test program.
> --
> with Line_IO;
> with Ada.Text_IO;
>
> procedure Test_Line_IO is
>    Want_Text_IO : constant Boolean := False;
>
>    -- pick the correct one for your input files
>    UnixLF : constant String := String'(1 => ASCII.LF);
>    MacCR : constant String := String'(1 => ASCII.CR);
>    OS2CRLF : constant String := String'(1 => ASCII.CR, 2 => ASCII.LF);
>
>    package LIO is new Line_IO(Separator_Sequence => UnixLF);
>
> begin
>    if Want_Text_IO then
>       loop
>          declare
>             A_Line : constant String := Ada.Text_IO.Get_Line;
>          begin
>             LIO.Print(A_Line);
>             null;
>             pragma Inspection_Point(A_Line);
>          end;
>       end loop;
>    else
>       loop
>          declare
>             A_Line : constant String := LIO.Getline;
>          begin
>             LIO.Print(A_Line);
>             null;
>             pragma Inspection_Point(A_Line);
>          end;
>       end loop;
>    end if;
>
> end Test_Line_IO;

Nice one...I'll try these out on Win32 and see what happens :-)

But surely "Put_Line" and "Get_Line" are preferable subprogram
names?...

Cheers
-- Martin
* Re: Q: Line_IO
From: Georg Bauhaus @ 2009-08-31 10:05 UTC

Martin wrote:

> Nice one...I'll try these out on Win32 and see what happens :-)

Thanks. As is, the program will raise NAME_ERROR on Win32: It
still seems impossible to name the standard streams for
Stream_IO.Open on Win32? The package has worked with (other) named
files, though.

> But surely "Put_Line" and "Get_Line" are preferable subprogram
> names?...

The names Put_Line (for Print) and Get_Line (for Getline)
could suggest that these are perfect replacements.
For Print, that is basically the case for standard output,
I think.
Getline, however, needs a little more care than Get_Line
when using it. At the moment, at least.
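
On the NAME_ERROR question above: one possibility that the thread does not try is to avoid Stream_IO.Open altogether and take a stream view of standard output through the standard package Ada.Text_IO.Text_Streams, which needs no file name. The following is only a sketch under assumptions: Character and Stream_Element are assumed to have the same size (as Line_IO already asserts), the run-time is assumed not to translate the bytes, and the procedure name Stdout_Stream_Sketch is made up for illustration. Whether this is as fast as Stream_IO has not been measured here.

```ada
with Ada.Streams;
with Ada.Text_IO;
with Ada.Text_IO.Text_Streams;

procedure Stdout_Stream_Sketch is

   use Ada.Streams;

   pragma Assert (Character'Size = Stream_Element'Size);

   --  Stream view of standard output; no name is passed to any Open.
   Out_Stream : constant Ada.Text_IO.Text_Streams.Stream_Access :=
     Ada.Text_IO.Text_Streams.Stream (Ada.Text_IO.Standard_Output);

   --  Hypothetical stand-in for Line_IO.Put_Line with an LF separator.
   procedure Put_Line (Item : String) is
      subtype XBytes is Stream_Element_Array
        (1 .. Stream_Element_Offset (Item'Length));
      Item_Bytes : XBytes;
      pragma Import (Ada, Item_Bytes);          --  overlay, no new object
      for Item_Bytes'Address use Item'Address;  --  same storage as Item
   begin
      --  Dispatching calls on the class-wide stream object.
      Write (Out_Stream.all, Item_Bytes);
      Write (Out_Stream.all,
             (1 => Stream_Element (Character'Pos (ASCII.LF))));
   end Put_Line;

begin
   Put_Line ("written through Text_Streams");
end Stdout_Stream_Sketch;
```

The same package offers a stream view of Ada.Text_IO.Standard_Input, so a Get_Line along the lines of Line_IO could, in principle, read from it with the stream's Read operation instead of Stream_IO.Read.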
* Re: Q: Line_IO
From: Anh Vo @ 2009-08-31 15:33 UTC

On Aug 31, 3:05 am, Georg Bauhaus <rm.dash-bauh...@futureapps.de> wrote:
> Martin wrote:
>
> > Nice one...I'll try these out on Win32 and see what happens :-)
>
> Thanks. As is, the program will raise NAME_ERROR on Win32: It
> still seems impossible to name the standard streams for
> Stream_IO.Open on Win32? The package has worked with (other) named
> files, though.
>
> > But surely "Put_Line" and "Get_Line" are preferable subprogram
> > names?...
>
> The names Put_Line (for Print) and Get_Line (for Getline)
> could suggest that these are perfect replacements.
> For Print, that is basically the case for standard output,
> I think.
> Getline, however, needs a little more care than Get_Line
> when using it. At the moment, at least.

I am curious how close it comes when compared to GNAT.IO.Put_Line and
GNAT.IO.Get_Line. If it is close enough, I would say it is the best of
both worlds, speed and portability.

Anh Vo
* Re: Q: Line_IO
From: Georg Bauhaus @ 2009-08-31 16:52 UTC

Anh Vo wrote:

> I am curious how close when compared to GNAT.IO.Put_Line and
> GNAT.IO.Get_Line. If it is close enough, I would say it is the best of
> both worlds, speed and portability.

For reading, I don't know how to compare GNAT.IO.Get_Line.
This Get_Line seems to ignore the end of input. AFAICS, it is
implemented using C's getchar(), non-macro-versions IIUC.
It never compares the result of the imported get_char <- getchar()
against C's EOF.

GNAT.IO.Put_Line seems to be slow. It, too, ends up calling
C's putchar(). In fact, it appears to be running many times
more slowly than Text_IO.Put_Line.

A few statistical results, sampled on one GNU/Linux machine.

$ gnatmake -g -O2 -gnatwa -gnatn test_line_io.adb
$ ./test_line_io < {250MB text file} > {some output file}

With Line_IO.Print and
 - Line_IO.Getline:   ~3 seconds
 - Text_IO.Get_Line:  ~7.5 seconds

With Ada.Text_IO.Put_Line and
 - Line_IO.Getline:   ~21 seconds
 - Text_IO.Get_Line:  ~27 seconds
* Re: Q: Line_IO
  [not found] <4a9b045a$0$31875$9b4e6d93@newsspool3.arcor-online.net>
From: Dmitry A. Kazakov @ 2009-08-31 18:39 UTC

On Mon, 31 Aug 2009 00:59:38 +0200, Georg Bauhaus wrote:

> Text_IO seems fairly slow when just reading lines of text.
> Here are two alternative I/O subprograms for Line I/O, in plain Ada,
> based on Stream_IO. They seem to run significantly faster.

When you print you do:

   Stream_IO.Write (Stdout, To_Bytes (Item & Separator_Sequence));

You could try not to concatenate:

   Stream_IO.Write (Stdout, To_Bytes (Item));
   Stream_IO.Write (Stdout, To_Bytes (Separator_Sequence));

, which should be faster when Item is large.

Then there is a crazy way to convert congruent types without
Unchecked_Conversion. I cannot tell whether it is actually faster:

   procedure Print (Item : String) is
      subtype Index is Stream_Element_Offset range 1..Item'Length;
      subtype XBytes is Stream_Element_Array (Index);
      Alias : XBytes;
      for Alias'Address use Item'Address;
   begin
      Stream_IO.Write (Stdout, Alias);
      ...

P.S. The superimposed object shall not have initializers.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de
* Re: Q: Line_IO
From: Robert A Duff @ 2009-08-31 22:51 UTC

"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

> Then there is a crazy way to convert congruent types without
> Unchecked_Conversion. I cannot tell whether it is actually faster:
>
>    procedure Print (Item : String) is
>       subtype Index is Stream_Element_Offset range 1..Item'Length;
>       subtype XBytes is Stream_Element_Array (Index);
>       Alias : XBytes;
>       for Alias'Address use Item'Address;
>    begin
>       Stream_IO.Write (Stdout, Alias);
>       ...
>
> P.S. The superimposed object shall not have initializers.

If it has default initialization, you can suppress it using:

   pragma Import (Ada, Alias);

See 13.3(12.c) and B.1(38,38.a).

- Bob
* Re: Q: Line_IO
From: Georg Bauhaus @ 2009-09-01 0:35 UTC

Robert A Duff wrote:
> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:

>>    procedure Print (Item : String) is
>>       subtype Index is Stream_Element_Offset range 1..Item'Length;
>>       subtype XBytes is Stream_Element_Array (Index);
>>       Alias : XBytes;
>>       for Alias'Address use Item'Address;
>>    begin
>>       Stream_IO.Write (Stdout, Alias);
>>       ...
>>
>> P.S. The superimposed object shall not have initializers.
>
> If it has default initialization, you can suppress it using:
>
>    pragma Import (Ada, Alias);
>
> See 13.3(12.c) and B.1(38,38.a).

I guess the "superimposed" object is Item, superimposed onto Alias?
* Re: Q: Line_IO
From: Georg Bauhaus @ 2009-08-31 23:56 UTC

Dmitry A. Kazakov wrote:

> You could try not to concatenate:
>
>    Stream_IO.Write (Stdout, To_Bytes (Item));
>    Stream_IO.Write (Stdout, To_Bytes (Separator_Sequence));
>
> , which should be faster when Item is large.

Yes, though according to some measurements that have been made
in the recent past, "&" is faster for "normal" sized lines.
Other sizes did not produce stable results (on my machine at least).
A test case is in
<4a7bebaa$0$30224$9b4e6d93@newsspool1.arcor-online.net>

However, the two calls are more general, so perhaps they should
replace the concatenation. And they seem to make using 'Address
simpler, see below.

> Then there is a crazy way to convert congruent types without
> Unchecked_Conversion. I cannot tell whether it is actually faster:

Since 'Address is used for reading anyway, and since, yes,
it is faster, it could replace the unchecked conversion.
Is there a risk with function parameters, not objects
of "better known" storage places?

New version below. If you want to see the difference between
Unchecked_Conversion and 'Address, rename either Print_1 (old)
to Put_Line or Print_2 (new, 'Address) to the same.

> P.S. The superimposed object shall not have initializers.

Does this apply to String parameters?

generic
   Separator_Sequence : in String; -- ends a line
package Line_IO is

   pragma Elaborate_Body;

   --
   -- High(er) speed reading and writing of lines via Stream I/O.
   -- Made with Unix pipes in mind.
   --
   -- Assumptions:
   -- - Lines are separated by a sequence of characters.
   -- - Characters and stream elements can be used interchangeably.
   -- - Lines are not longer than internal buffer size.
   --
   -- I/O exceptions are propagated

   procedure Put_Line(Item : String);

   function Get_Line return String;

end Line_IO;

with Ada.Streams.Stream_IO;
with Ada.Unchecked_Conversion;

package body Line_IO is

   use Ada.Streams;

   Stdout : Stream_IO.File_Type;
   Stdin : Stream_IO.File_Type;

   -- writing

   procedure Print_1 (Item : String) is

      subtype Index is Stream_Element_Offset range
        Stream_Element_Offset(Item'First)
        .. Stream_Element_Offset(Item'Last + Separator_Sequence'Length);
      subtype XString is String (Item'First
        .. Item'Last + Separator_Sequence'Length);
      subtype XBytes is Stream_Element_Array (Index);
      function To_Bytes is new Ada.Unchecked_Conversion
        (Source => XString,
         Target => XBytes);
   begin
      Stream_IO.Write (Stdout, To_Bytes (Item & Separator_Sequence));
   end Print_1;

   -- Alternative:
   -- - call Stream_IO.Write twice, once for the string, then for the
   --   line separator (terminator)
   -- - specify 'Address, no Unchecked_Conversion is needed then

   -- We need the separator as a Stream_Element_Array. (Can we
   -- use 'Address on a generic formal object? If so, then
   -- again, no Unchecked_Conversion is needed (advantage?))

   subtype Sep_String is String(Separator_Sequence'Range);
   subtype Sep_Bytes is Stream_Element_Array
     (Stream_Element_Offset(Separator_Sequence'First)
      .. Stream_Element_Offset(Separator_Sequence'Last));
   function To_Bytes is new Ada.Unchecked_Conversion
     (Source => Sep_String,
      Target => Sep_Bytes);
   Separator_Bytes : constant Stream_Element_Array :=
     To_Bytes(Separator_Sequence);

   procedure Print_2 (Item : String) is
      subtype Index is Stream_Element_Offset range
        Stream_Element_Offset(Item'First)
        .. Stream_Element_Offset(Item'Last);
      subtype XBytes is Stream_Element_Array (Index);
      Item_Bytes: XBytes;
      for Item_Bytes'Address use Item'Address;
   begin
      Stream_IO.Write (Stdout, Item_Bytes);
      Stream_IO.Write (Stdout, Separator_Bytes);
   end Print_2;

   procedure Put_Line (Item : String) renames Print_2;

   -- ----------------
   -- reading
   -- ----------------
   -- Types etc., status variables, and the buffer. `Buffer` is at the
   -- same time an array of Character and an array of Stream_Element
   -- called `Bytes`. They share the same address. This setup makes the
   -- storage at the address either a String (when selecting result
   -- characters) or a Stream_Element_Array (when reading input bytes).

   BUFSIZ: constant := 8_192;
   pragma Assert(Character'Size = Stream_Element'Size);

   SL : constant Natural := Separator_Sequence'Length;

   subtype Extended_Buffer_Index is Positive range 1 .. BUFSIZ + SL;
   subtype Buffer_Index is Extended_Buffer_Index
     range Extended_Buffer_Index'First .. Extended_Buffer_Index'Last - SL;
   subtype Extended_Bytes_Index is Stream_Element_Offset
     range 1 .. Stream_Element_Offset(Extended_Buffer_Index'Last);
   subtype Bytes_Index is Extended_Bytes_Index
     range Extended_Bytes_Index'First
     .. (Extended_Bytes_Index'Last - Stream_Element_Offset(SL));

   subtype Buffer_Data is String(Extended_Buffer_Index);
   subtype Buffer_Bytes is Stream_Element_Array(Extended_Bytes_Index);

   Buffer : Buffer_Data;
   Bytes : Buffer_Bytes;
   for Bytes'Address use Buffer'Address;

   Position : Natural; -- start of next substring
   Last : Natural; -- last valid character in buffer

   function Get_Line return String is

      procedure Reload;
      -- move remaining characters to the start of `Buffer` and
      -- fill the following bytes if possible
      -- post: Position in 0 .. 1, and 0 should mean end of file
      -- Last is 0 or else the index of the last valid element in Buffer

      procedure Reload is
         Remaining : constant Natural := Buffer_Index'Last - Position + 1;
         Last_Index : Stream_Element_Offset;
      begin
         Buffer(1 .. Remaining) := Buffer(Position .. Buffer_Index'Last);

         Stream_IO.Read(Stdin,
           Item => Bytes(Stream_Element_Offset(Remaining) + 1 ..
             Bytes_Index'Last),
           Last => Last_Index);
         Last := Natural(Last_Index);
         Buffer(Last + 1 .. Last + SL) := Separator_Sequence;

         Position := Boolean'Pos(Last_Index > 0
           and then Buffer(1) /= ASCII.EOT -- ^D
           and then Buffer(1) /= ASCII.SUB); -- ^Z

      end Reload;

      function Sep_Index return Natural;
      -- position of next Separator_Sequence
      pragma Inline(Sep_Index);

      function Sep_Index return Natural is
         K : Natural := Position;
      begin
         pragma Assert(K >= Buffer'First);
         pragma Assert(Buffer(Buffer_Index'Last + 1 .. Buffer'Last)
           = Separator_Sequence);

         while Buffer(K) /= Separator_Sequence(1) loop
            K := K + 1;
         end loop;

         return K;
      end Sep_Index;

      Next_Separator : Natural;
   begin -- Get_Line
      pragma Assert(Position = 0 or else Position in Extended_Buffer_Index);
      pragma Assert(Last = 0 or else Last in Buffer_Index);

      if Position = 0 then
         raise Stream_IO.End_Error;
      end if;

      Next_Separator := Sep_Index;

      if Next_Separator > Buffer_Index'Last then
         -- must be sentinel
         Reload;
         return Get_Line;
      end if;

      if Next_Separator <= Last then
         declare
            Limit : constant Natural := Natural'Max(0, Next_Separator - SL);
            -- there was trouble (Print) when Integer Limit could be negative
            -- (for 2-char SL and Next_Separator = 1)
            Result : constant String := Buffer(Position .. Limit);
         begin
            Position := Limit + SL + 1;
            return Result;
         end;
      else
         -- the separator is among the characters beyond `Last`
         declare
            Limit : constant Positive := Last;
            Result : constant String := Buffer(Position .. Limit);
         begin
            --
            -- makes the spurious line go away
            --
            -- But make sure that it isn't caused by Put_Line!
            if Position > Last then
               raise Stream_IO.End_Error;
            end if;
            Position := 0; -- next call will raise End_Error
            return Result;
         end;
      end if;

      raise Program_Error;
   end Get_Line;

begin
   -- (see <ILmdnWHx29q5VMrZnZ2dnUVZ_sednZ2d@megapath.net> for names
   -- of standard I/O streams when using Janus Ada on Windows.)

   Stream_IO.Open (Stdout,
     Mode => Stream_IO.Out_File,
     Name => "/dev/stdout");
   Stream_IO.Open (Stdin,
     Mode => Stream_IO.In_File,
     Name => "/dev/stdin");

   -- make sure there is no line separator in `Buffer` other than the sentinel
   Buffer := Buffer_Data'(others => ASCII.NUL);
   Buffer(Buffer_Index'Last + 1 .. Buffer'Last) := Separator_Sequence;
   Position := Buffer_Index'Last + 1; -- See also `Get_Line.Reload.Remaining`
   Last := 0;
end Line_IO;
* Re: Q: Line_IO
From: Georg Bauhaus @ 2009-09-01 0:19 UTC

Georg Bauhaus wrote:

>    procedure Print_2 (Item : String) is
>       subtype Index is Stream_Element_Offset range
>         Stream_Element_Offset(Item'First)
>         .. Stream_Element_Offset(Item'Last);
>       subtype XBytes is Stream_Element_Array (Index);
>       Item_Bytes: XBytes;
>       for Item_Bytes'Address use Item'Address;
>    begin
>       Stream_IO.Write (Stdout, Item_Bytes);
>       Stream_IO.Write (Stdout, Separator_Bytes);
>    end Print_2;

*** line_io.ada old
--- line_io.ada new
***************
*** 78,79 ****
--- 78,80 ----
        Item_Bytes: XBytes;
+       pragma Import (Ada, Item_Bytes);
        for Item_Bytes'Address use Item'Address;
* Re: Q: Line_IO
From: Robert A Duff @ 2009-09-01 1:08 UTC

Georg Bauhaus <see.reply.to@maps.futureapps.de> writes:

> Georg Bauhaus wrote:
>>    procedure Print_2 (Item : String) is
>>       subtype Index is Stream_Element_Offset range
>>         Stream_Element_Offset(Item'First)
>>         .. Stream_Element_Offset(Item'Last);
>>       subtype XBytes is Stream_Element_Array (Index);
>>       Item_Bytes: XBytes;
>>       for Item_Bytes'Address use Item'Address;
>>    begin
>>       Stream_IO.Write (Stdout, Item_Bytes);
>>       Stream_IO.Write (Stdout, Separator_Bytes);
>>    end Print_2;
>
> *** line_io.ada old
> --- line_io.ada new
> ***************
> *** 78,79 ****
> --- 78,80 ----
>         Item_Bytes: XBytes;
> +       pragma Import (Ada, Item_Bytes);
>         for Item_Bytes'Address use Item'Address;

The Import is not strictly necessary, because Stream_Element_Array
has no default initialization. But it's still good style -- it
says, the declaration of Item_Bytes is not creating a new object,
it's just overlaying an old one.

If Item_Bytes had default inits (e.g. if it were an array of access
values, which are default-initialized to null, or an array of records
with some defaulted components), then the Import would be necessary.
I think in that case, GNAT warns, because without the Import,
the default inits will overwrite Item, which is certainly not what
you want.

- Bob
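
To make the point about default initialization concrete, here is a small sketch of an overlay whose type does have a defaulted component. It is not code from the thread: the names Overlay_Import_Sketch, Cell, Cell_Array, Int_Array, Source and Overlay are hypothetical, and the sketch assumes that a record holding a single Integer is laid out like an Integer. With the pragma in place, the declaration of Overlay should leave Source untouched; without it, default initialization would be expected to write zeros over Source, as described above.

```ada
with Ada.Text_IO;

procedure Overlay_Import_Sketch is

   --  Hypothetical element type with a defaulted component, so that
   --  objects of Cell_Array are default-initialized when declared.
   type Cell is record
      Value : Integer := 0;
   end record;

   type Cell_Array is array (1 .. 4) of Cell;
   type Int_Array  is array (1 .. 4) of Integer;

   Source : aliased Int_Array := (10, 20, 30, 40);

   Overlay : Cell_Array;
   pragma Import (Ada, Overlay);            --  suppress default initialization
   for Overlay'Address use Source'Address;  --  overlay, not a new object

begin
   --  Assumption: Cell and Integer have the same size and representation.
   Ada.Text_IO.Put_Line (Integer'Image (Overlay (2).Value));
end Overlay_Import_Sketch;
```

Removing the pragma Import line and recompiling is a simple way to check whether the compiler at hand warns about the overlay, as suggested for GNAT above.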
* Re: Q: Line_IO
From: Ludovic Brenta @ 2009-09-01 7:02 UTC

Georg Bauhaus wrote on comp.lang.ada:

>    BUFSIZ: constant := 8_192;
[...]
>    SL : constant Natural := Separator_Sequence'Length;
>    subtype Extended_Buffer_Index is Positive range 1 .. BUFSIZ + SL;

Since BUFSIZ is obviously chosen as an integral number of hardware
memory pages, the extended_buffer uses two pages plus two bytes. How
about allocating a buffer of BUFSIZ bytes and using only BUFSIZ-SL for
the string and the remaining SL bytes for the terminator?

I realize that at this point we're down to nitpicking because the
program seems really good and fast now.

--
Ludovic Brenta.
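
The buffer layout suggested above could be declared roughly as follows. This is a sketch of the declarations only, reusing the names from Line_IO; the generic package name Buffer_Layout_Sketch is made up, and the rest of the package body is assumed to stay as posted. Whether the smaller data area makes any measurable difference is not claimed here.

```ada
generic
   Separator_Sequence : in String;  --  ends a line
package Buffer_Layout_Sketch is

   BUFSIZ : constant := 8_192;

   SL : constant Natural := Separator_Sequence'Length;

   --  The whole buffer, sentinel included, is exactly BUFSIZ characters.
   subtype Extended_Buffer_Index is Positive range 1 .. BUFSIZ;

   --  Input data may only occupy the first BUFSIZ - SL characters;
   --  the last SL characters are reserved for the sentinel separator.
   subtype Buffer_Index is Extended_Buffer_Index
     range Extended_Buffer_Index'First .. Extended_Buffer_Index'Last - SL;

   subtype Buffer_Data is String (Extended_Buffer_Index);

end Buffer_Layout_Sketch;
```

With this layout each Stream_IO.Read would be offered at most BUFSIZ - SL stream elements instead of BUFSIZ.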
* Re: Q: Line_IO
From: Georg Bauhaus @ 2009-09-01 9:55 UTC

Ludovic Brenta wrote:
> Georg Bauhaus wrote on comp.lang.ada:
>>    BUFSIZ: constant := 8_192;
> [...]
>>    SL : constant Natural := Separator_Sequence'Length;
>>    subtype Extended_Buffer_Index is Positive range 1 .. BUFSIZ + SL;
>
> Since BUFSIZ is obviously chosen as an integral number of hardware
> memory pages, the extended_buffer uses two pages plus two bytes. How
> about allocating a buffer of BUFSIZ bytes and using only BUFSIZ-SL for
> the string and the remaining SL bytes for the terminator?

I had made the buffer have BUFSIZ + Separator_Sequence'Length
elements because Stream_IO.Read would then have BUFSIZ bytes
into which to store its data. I was only guessing that this
would matter; indeed, it is somewhat faster than using
BUFSIZ = 128. But growing beyond 8192 did not have an effect.
I'll try others.

Another thing: The names look a bit bulky, at least from a
"molecular source pattern" matching point of view.
But I can't produce better short and meaningful names
that are still Ada. Is it enough to add a few empty
lines? Ideas welcome.
* Re: Q: Line_IO
From: jonathan @ 2009-09-01 12:03 UTC

On Sep 1, 8:02 am, Ludovic Brenta <ludo...@ludovic-brenta.org> wrote:
> Georg Bauhaus wrote on comp.lang.ada:
>
> >    BUFSIZ: constant := 8_192;
> [...]
> >    SL : constant Natural := Separator_Sequence'Length;
> >    subtype Extended_Buffer_Index is Positive range 1 .. BUFSIZ + SL;
>
> Since BUFSIZ is obviously chosen as an integral number of hardware
> memory pages, the extended_buffer uses two pages plus two bytes. How
> about allocating a buffer of BUFSIZ bytes and using only BUFSIZ-SL for
> the string and the remaining SL bytes for the terminator?
>
> I realize that at this point we're down to nitpicking because the
> program seems really good and fast now.
>
> --
> Ludovic Brenta.

A few benchmark timings:

I updated a version of knucleotide.adb with the new get_line.
IO overhead fell from 3.6 sec on my machine, to 1.2 sec.
It now reads and stores (half of) the 250 MB text file in about
the same time as my vim editor.

Very nice result, especially for the multitasking program, which
can parallelize everything except IO.

Jonathan
* Re: Q: Line_IO
  [not found] <4a9e2c86$0$30235$9b4e6d93@newsspool1.arcor-online.net>
From: Georg Bauhaus @ 2009-09-02 8:47 UTC

(I'll switch News readers back to Emacs, unless I can find out how
to add a QP piece of text to a message in Thunderbird;
sorry if the attachment is inconvenient.)
* Re: Q: Line_IO
From: Georg Bauhaus @ 2009-09-05 20:30 UTC

Ludovic Brenta wrote:

> I realize that at this point we're down to nitpicking because the
> program seems really good and fast now.

It is now somewhat more correct, too:
http://home.arcor.de/bauhaus/Ada/line_io.ada

It is probably worth noting that specifying Buffer'Alignment
was not a good idea; it slowed down Get_Line. (And ObjectAda is
happy with an alignment number like 8, not BUFSIZ, anyway.)