* How to speed up stream & record handling?
From: Karl Ran @ 2002-02-21 12:37 UTC
Hello,
I have a problem getting reasonable IO performance from an Ada
program (source is attached).
The environment looks like this:
OS: linux-2.4.16
CPU: Intel P3/700 MHz / BX chipset
compiler: gnat-3.14p
The program reads a packet of data (200 bytes) and converts the
header(2 bytes) to match the architecture. It will do it 100000 times.
>time ./slow_ada
real 0m8.545s
user 0m8.350s <- looks like it spends all the time in the Ada code!
sys 0m0.070s
That's 20 MByte / 8.5 seconds = 2.4 MByte/sec
It's too slow for the target application.
Does anyone have an idea how to speed up this program, other than
throwing more CPU power at the problem?
OK,
I know that the PC architecture is known for its bad IO performance,
but this 2.4 MByte/s does not seem to be related to the PC IO bottleneck...
Thanks,
Karl
with Ada.Text_IO;
with Ada.Streams.Stream_IO;
with System;
procedure Slow_Ada is
   type Unsigned_4 is mod 2 ** 4;
   for Unsigned_4'Size use 4;
   type Unsigned_8 is mod 2 ** 8;
   for Unsigned_8'Size use 8;
   type Unsigned_12 is mod 2 ** 12;
   for Unsigned_12'Size use 12;
   type Data_array is array( 1 .. 198 ) of Unsigned_8;
   -- Because a single byte is occupied by both part of B and all of C,
   -- we combine them into a record so we can define stream-oriented
   -- attributes such that they can be read and written from/to a stream
   -- properly, regardless of machine endianness.
   type B_And_C is
      record
         B : Unsigned_12;
         C : Unsigned_4;
      end record;
   -- Use this representation clause on a little-endian machine.
   for B_And_C use
      record at mod 1;
         B at 0 range 4 .. 15;
         C at 0 range 0 .. 3;
      end record;
   -- Use this representation clause on a big-endian machine.
   -- for B_and_C use
   --    record at mod 1;
   --       B at 0 range 0 .. 11;
   --       C at 0 range 12 .. 15;
   --    end record;
   for B_And_C'Size use 16;
   -- We now proceed to declare the stream-oriented attributes.
   procedure B_And_C_Read
     (Stream : access Ada.Streams.Root_Stream_Type'Class;
      Item : out B_And_C);
   procedure B_And_C_Write
     (Stream : access Ada.Streams.Root_Stream_Type'Class;
      Item : in B_And_C);
   for B_And_C'Read use B_And_C_Read;
   for B_And_C'Write use B_And_C_Write;
   type My_Record is
      record
         A : Data_array; -- 198 bytes
         BC : B_And_C; -- 2 bytes
      end record;
   type Byte_Array is array (Positive range <>) of Unsigned_8;
   -- This procedure reverses the order of the bytes in its argument.
   procedure Swap (The_Bytes : in out Byte_Array) is
      Temp : Unsigned_8;
   begin
      for B in 1 .. The_Bytes'Last / 2 loop
         Temp := The_Bytes (B);
         The_Bytes (B) := The_Bytes (The_Bytes'Last - B + 1);
         The_Bytes (The_Bytes'Last - B + 1) := Temp;
      end loop;
   end Swap;
   -- These procedures implement the stream-oriented attributes.
   procedure B_And_C_Read
     (Stream : access Ada.Streams.Root_Stream_Type'Class;
      Item : out B_And_C) is
      The_Bytes : Byte_Array (1 .. Item'Size / Unsigned_8'Size);
      for The_Bytes'Address use Item'Address;
      use type System.Bit_Order;
   begin
      Byte_Array'Read (Stream, The_Bytes);
      if System.Default_Bit_Order = System.Low_Order_First then
         Swap (The_Bytes);
      end if;
   end B_And_C_Read;
   procedure B_And_C_Write
     (Stream : access Ada.Streams.Root_Stream_Type'Class;
      Item : in B_And_C) is
      The_Bytes : Byte_Array (1 .. Item'Size / Unsigned_8'Size);
      for The_Bytes'Address use Item'Address;
      use type System.Bit_Order;
   begin
      if System.Default_Bit_Order = System.Low_Order_First then
         Swap (The_Bytes);
      end if;
      Byte_Array'Write (Stream, The_Bytes);
   end B_And_C_Write;
   Item : My_Record;
   File : Ada.Streams.Stream_IO.File_Type;
   Stream : Ada.Streams.Stream_IO.Stream_Access;
begin
   Ada.Streams.Stream_IO.Open
     (Name => "/dev/zero", --use any big file you like (20 MByte)
      File => File,
      Mode => Ada.Streams.Stream_IO.In_File);
   Stream := Ada.Streams.Stream_IO.Stream (File);
   for I in 1 .. 100000 loop
      My_Record'Read (Stream, Item);
   end loop;
   Ada.Streams.Stream_IO.Close (File);
end Slow_Ada;
* Re: How to speed up stream & record handling?
From: Martin Dowie @ 2002-02-21 14:17 UTC
"Karl Ran" <karlran1234@yahoo.com> wrote in message
news:e7ebd224.0202210437.1c7d0fbf@posting.google.com...
> I've a problem getting a reasonable IO performance from an Ada
> program (source is attached)
>
> The environment looks like this:
> OS: linux-2.4.16
> CPU: Intel P3/700 MHz / BX chipset
> compiler: gnat-3.14p
Have you tried varying compiler optimisations? i.e. -O3 instead of -O0?
The swap routine looks a little extravagant. You already know that
B_And_C is 2 bytes. Try:
   type Byte_Array is array (Positive range 1 .. 2) of Unsigned_8;
   -- This procedure reverses the order of the bytes in its argument.
   procedure Swap (
      The_Bytes : in out Byte_Array ) is
      Temp : Unsigned_8 := The_Bytes (1);
   begin
      The_Bytes (1) := The_Bytes (2);
      The_Bytes (2) := Temp;
   end Swap;
and see if that makes any difference.
You could also try inlining your swap routine.
Hope this helps!
Slainte!
Martin
* Re: How to speed up stream & record handling?
From: Jeffrey Carter @ 2002-02-21 17:34 UTC
It would be useful to know how fast you can read 20 MB in 200 byte
groups without any user-defined reads. It would also help to know what
optimization level you compiled with. By default (-O0), GNAT produces
completely unoptimized code. You need to use -O1 just to get the
equivalent of "no optimization" with most other compilers. Since you
need fast code, -O3 is probably a good idea.
Other than that, the only thing I can see is that Swap is very general.
Making it specific to a 2-byte array (a simple exchange) and inlining it
might speed things up a bit.
If it's still not fast enough, you might try suppressing all run-time
checks.
--
Jeffrey Carter
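The two suggestions made so far (a swap specialised to two bytes, and inlining it) can be sketched as a small stand-alone program. The names `Swap_Demo` and `Byte_Pair` are illustrative and do not come from the original post, and the build line in the comment is only one plausible way to combine optimisation, inlining and check suppression with GNAT.

```ada
--  A minimal sketch, assuming GNAT.  One possible build line:
--     gnatmake -O2 -gnatn -gnatp swap_demo.adb
--  (-gnatn enables pragma Inline, -gnatp suppresses run-time checks).
with Ada.Text_IO;

procedure Swap_Demo is
   type Unsigned_8 is mod 2 ** 8;
   --  Hypothetical fixed-size type: exactly the two header bytes.
   type Byte_Pair is array (1 .. 2) of Unsigned_8;

   procedure Swap (The_Bytes : in out Byte_Pair);
   pragma Inline (Swap);

   --  A plain exchange: no loop, no index arithmetic.
   procedure Swap (The_Bytes : in out Byte_Pair) is
      Temp : constant Unsigned_8 := The_Bytes (1);
   begin
      The_Bytes (1) := The_Bytes (2);
      The_Bytes (2) := Temp;
   end Swap;

   Header : Byte_Pair := (16#12#, 16#34#);
begin
   Swap (Header);
   --  Self-check: the two bytes must have changed places.
   if Header /= Byte_Pair'(16#34#, 16#12#) then
      raise Program_Error;
   end if;
   Ada.Text_IO.Put_Line ("header bytes exchanged");
end Swap_Demo;
```

Since the swap runs only once per 200-byte packet, this is unlikely to account for much of the 8 seconds on its own; timing the program with and without it is the only way to know.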
* Re: How to speed up stream & record handling?
From: Florian Weimer @ 2002-02-21 20:25 UTC
karlran1234@yahoo.com (Karl Ran) writes:
> type Data_array is array( 1 .. 198 ) of Unsigned_8;
> type My_Record is
> record
> A : Data_array; -- 200 bytes
> BC : B_And_C; -- 2 bytes
> end record;
> My_Record'Read (Stream, Item);
Data_array'Read (and thus My_Record'Read) calls Unsigned_8'Read for
each array element individually. Perhaps this burns a lot of CPU
cycles.
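One way to check this on a given compiler is to count how often the stream's `Read` primitive is invoked for a single `Data_Array'Read`. The sketch below is an assumption-laden illustration: `Counting_Stream` is a hypothetical stream type that hands out zero bytes, and declaring it inside a procedure relies on Ada 2005 or later. The count it prints depends on the compiler and its version, so no particular value is claimed here.

```ada
with Ada.Streams; use Ada.Streams;
with Ada.Text_IO;

procedure Count_Reads is
   type Unsigned_8 is mod 2 ** 8;
   type Data_Array is array (1 .. 198) of Unsigned_8;

   --  Hypothetical stream: supplies zeros and counts calls to Read.
   type Counting_Stream is new Root_Stream_Type with record
      Calls : Natural := 0;
   end record;

   overriding procedure Read
     (Stream : in out Counting_Stream;
      Item   : out Stream_Element_Array;
      Last   : out Stream_Element_Offset);

   overriding procedure Write
     (Stream : in out Counting_Stream;
      Item   : Stream_Element_Array);

   overriding procedure Read
     (Stream : in out Counting_Stream;
      Item   : out Stream_Element_Array;
      Last   : out Stream_Element_Offset) is
   begin
      Stream.Calls := Stream.Calls + 1;
      Item := (others => 0);   --  fill the whole request
      Last := Item'Last;
   end Read;

   overriding procedure Write
     (Stream : in out Counting_Stream;
      Item   : Stream_Element_Array) is
   begin
      null;   --  output is discarded; only reading is measured
   end Write;

   Source : aliased Counting_Stream;
   Data   : Data_Array;
begin
   --  Default array attribute: no user-defined 'Read is involved.
   Data_Array'Read (Source'Access, Data);
   if Source.Calls = 0 or else Data (1) /= 0 then
      raise Program_Error;
   end if;
   Ada.Text_IO.Put_Line
     ("calls to Read for one array:" & Natural'Image (Source.Calls));
end Count_Reads;
```

A count close to the number of array elements would support the explanation above; a count of one would mean the compiler already reads the array as a block.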
* Re: How to speed up stream & record handling?
From: tmoran @ 2002-02-21 23:59 UTC
> Data_array'Read (and thus My_Record'Read) calls Unsigned_8'Read for
> each array element individually. Perhaps this burns a lot of CPU
I suspect you're right. I ran the program as given on a 750MHz Windows
system and it took 7.7 seconds. It runs in .13 seconds with some
simple changes:
   type My_Record is
      record
         A : Data_array; -- 198 bytes
         -- BC : B_And_C; -- 2 bytes
         BC : Byte_Array(1 .. 2);
      end record;
   buffer : my_record;
   package bio is new ada.sequential_io(my_record);
   ...
   for I in 1 .. 100000 loop
      bio.read(f,buffer);
      swap(buffer.bc);
   end loop;
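The fragment above leaves out the file handling. A complete version of the same idea might look like the sketch below, which writes its own test data so that it does not depend on /dev/zero. The names `Seq_Demo`, `Packet` and `Packet_IO`, the scratch file name, and the record count of 1000 are all assumptions made for illustration.

```ada
with Ada.Sequential_IO;
with Ada.Text_IO;

procedure Seq_Demo is
   type Unsigned_8 is mod 2 ** 8;
   for Unsigned_8'Size use 8;
   type Byte_Array is array (Positive range <>) of Unsigned_8;

   --  Fixed-size packet: 198 data bytes plus the 2-byte field.
   type Packet is record
      A  : Byte_Array (1 .. 198);
      BC : Byte_Array (1 .. 2);
   end record;

   package Packet_IO is new Ada.Sequential_IO (Packet);

   procedure Swap (The_Bytes : in out Byte_Array) is
      Temp : constant Unsigned_8 := The_Bytes (The_Bytes'First);
   begin
      The_Bytes (The_Bytes'First) := The_Bytes (The_Bytes'Last);
      The_Bytes (The_Bytes'Last)  := Temp;
   end Swap;

   File_Name : constant String := "seq_demo.tmp";  --  hypothetical scratch file
   File      : Packet_IO.File_Type;
   Buffer    : Packet := (A => (others => 0), BC => (16#12#, 16#34#));
   Count     : Natural := 0;
begin
   --  Write some packets, then read them back one whole record at a time.
   Packet_IO.Create (File, Packet_IO.Out_File, File_Name);
   for I in 1 .. 1000 loop
      Packet_IO.Write (File, Buffer);
   end loop;
   Packet_IO.Reset (File, Packet_IO.In_File);

   while not Packet_IO.End_Of_File (File) loop
      Packet_IO.Read (File, Buffer);
      Swap (Buffer.BC);
      Count := Count + 1;
   end loop;
   Packet_IO.Delete (File);

   --  Self-check: every record came back and the last one was swapped.
   if Count /= 1000 or else Buffer.BC (1) /= 16#34# then
      raise Program_Error;
   end if;
   Ada.Text_IO.Put_Line ("packets read:" & Natural'Image (Count));
end Seq_Demo;
```

Sequential_IO moves each record as a unit, so no per-element stream attribute is called. The trade-off is that the file layout is whatever the compiler chooses for `Packet`, which is acceptable here only because the record is a flat run of bytes.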
* Re: How to speed up stream & record handling?
From: Karl Ran @ 2002-02-22 13:31 UTC
tmoran@acm.org wrote in message
news:<8Rfd8.41459$Cu.776895529@newssvr14.news.prodigy.com>...
>> Data_array'Read (and thus My_Record'Read) calls Unsigned_8'Read for
>> each array element individually. Perhaps this burns a lot of CPU
> I suspect you're right. I ran the program as given on a 750MHz Windows
> system and it took 7.7 seconds. It runs in .13 seconds with some
> simple changes:
Same here!
So, what's the problem with the streams?
Was the streams package not designed to handle high-bandwidth applications?
Does the compiler used matter?
How do I avoid calling 'Unsigned_8'Read' 198 times per packet while still
using streams?
Thanks,
Karl
* Re: How to speed up stream & record handling?
From: tmoran @ 2002-02-22 20:25 UTC
>How do I avoid calling 'Unsigned_8'Read' 200 times per packet while still
>using streams?
Go up a level. Define your own My_Record'Read. I made an "is new
Ada.Streams.Root_Stream_Type" (see below) and changed the code to:
   procedure My_Record_Read
     (Stream : access Ada.Streams.Root_Stream_Type'Class;
      Item : out My_Record) is
      use type Ada.Streams.Stream_Element_Offset;
      The_Data_Bytes : Ada.Streams.Stream_Element_Array (1 .. Data_Array'size/Unsigned_8'Size);
      for The_Data_Bytes'Address use Item.A'Address;
      Last : Ada.Streams.Stream_Element_Offset;
   begin
      Not_Slow.Read(Not_Slow.My_File_Type(Stream.all), The_Data_Bytes, Last);
      if Last /= The_Data_Bytes'last then
         null; -- what to do?
      end if;
      B_And_C'Read(Stream, Item.BC);
   end My_Record_Read;
   My_File : aliased Not_Slow.My_File_Type;
   ...
   Not_Slow.Open(My_File,
                 Mode => Ada.Streams.Stream_IO.In_File,
                 Name => "r:big");
   for I in 1 .. 100000 loop
      My_Record'Read (My_File'access, Item);
   end loop;
   Not_Slow.Close(My_File);
and it ran in 0.22 seconds. (Gnat 3.14p, -O2)
Of course the fixed record size makes Sequential_IO still a faster option.
with Ada.Streams,
     Ada.Streams.Stream_IO;
package Not_Slow is
   type My_File_Type is new Ada.Streams.Root_Stream_Type with private;
   procedure Open(Stream : in out My_File_Type;
                  Mode : in Ada.Streams.Stream_IO.File_Mode;
                  Name : in String;
                  Form : in String := "");
   procedure Close(Stream : in out My_File_Type);
   procedure Read(
      Stream : in out My_File_Type;
      Item : out Ada.Streams.Stream_Element_Array;
      Last : out Ada.Streams.Stream_Element_Offset);
   procedure Write(
      Stream : in out My_File_Type;
      Item : in Ada.Streams.Stream_Element_Array);
private
   type My_File_Type is new Ada.Streams.Root_Stream_Type with record
      The_File : Ada.Streams.Stream_IO.File_Type;
   end record;
end Not_Slow;
package body Not_Slow is
   procedure Open(Stream : in out My_File_Type;
                  Mode : in Ada.Streams.Stream_IO.File_Mode;
                  Name : in String;
                  Form : in String := "") is
   begin
      Ada.Streams.Stream_IO.Open(Stream.The_File, Mode, Name, Form);
   end Open;
   procedure Close(Stream : in out My_File_Type) is
   begin
      Ada.Streams.Stream_IO.Close(Stream.The_File);
   end Close;
   procedure Read(
      Stream : in out My_File_Type;
      Item : out Ada.Streams.Stream_Element_Array;
      Last : out Ada.Streams.Stream_Element_Offset) is
   begin
      Ada.Streams.Stream_IO.Read(Stream.The_File, Item, Last);
   end Read;
   procedure Write(
      Stream : in out My_File_Type;
      Item : in Ada.Streams.Stream_Element_Array) is
   begin
      Ada.Streams.Stream_IO.Write(Stream.The_File, Item);
   end Write;
end Not_Slow;
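The core of My_Record_Read is the overlay: a Stream_Element_Array placed at the address of the data array, so that one Read call fills all 198 bytes. Stripped of the custom stream type, the technique can be sketched as follows. `Block_Read_Demo`, the scratch file name and the 16#5A# fill value are assumptions for illustration, and the sketch also assumes that a stream element is 8 bits, as it is with GNAT on common targets.

```ada
with Ada.Streams;           use Ada.Streams;
with Ada.Streams.Stream_IO;
with Ada.Text_IO;

procedure Block_Read_Demo is
   package SIO renames Ada.Streams.Stream_IO;

   type Unsigned_8 is mod 2 ** 8;
   for Unsigned_8'Size use 8;
   type Data_Array is array (1 .. 198) of Unsigned_8;
   pragma Pack (Data_Array);

   File_Name : constant String := "block_read_demo.tmp";  --  hypothetical
   File      : SIO.File_Type;

   Data : Data_Array := (others => 0);
   pragma Volatile (Data);   --  Data is modified through the overlay below

   --  Raw occupies the same storage as Data (assumes 8-bit stream elements).
   Raw : Stream_Element_Array (1 .. Data_Array'Size / Stream_Element'Size);
   for Raw'Address use Data'Address;

   Last : Stream_Element_Offset;
begin
   --  Prepare 198 known bytes, then read them back with a single call.
   SIO.Create (File, SIO.Out_File, File_Name);
   SIO.Write (File, Stream_Element_Array'(1 .. 198 => 16#5A#));
   SIO.Reset (File, SIO.In_File);

   SIO.Read (File, Raw, Last);
   SIO.Delete (File);

   --  Self-check: the whole block arrived and is visible through Data.
   if Last /= Raw'Last
     or else Data (1) /= 16#5A#
     or else Data (198) /= 16#5A#
   then
      raise Program_Error;
   end if;
   Ada.Text_IO.Put_Line
     ("elements read in one call:" & Stream_Element_Offset'Image (Last));
end Block_Read_Demo;
```

Checking `Last` against `Raw'Last` covers the "what to do?" case left open above: a short read means the file ended inside a packet, and the caller has to decide whether that is an error.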
* Re: How to speed up stream & record handling?
From: Nick Roberts @ 2002-02-24 3:23 UTC
On 21 Feb 2002 04:37:57 -0800, karlran1234@yahoo.com (Karl Ran) strongly
typed:
>Hello,
>I've a problem getting a reasonable IO performance from an Ada
>program (source is attached)
>
>The environment looks like this:
>OS: linux-2.4.16
>CPU: Intel P3/700 MHz / BX chipset
>compiler: gnat-3.14p
>
>The program reads a packet of data (200 bytes) and converts the
>header(2 bytes) to match the architecture. It will do it 100000 times.
>...
My suggestion (you may not like it!) is: (a) if possible, ensure the
NIC/DMA dumps the data into a useful piece of memory (i.e. shared); (b)
preferably, use assembly.
>It's too slow for the target application.
Which is, please?
>I know that the PC architecture is known for its bad IO performance,
>but this 2.4 MByte/s does not seem to be related to the PC IO bottleneck...
I'm sorry but this is a jaw-dropper. Speak not of that which thou wot not
of, child.
My suggestion for improvements in your Ada code follows.
with Ada.Text_IO;
with Ada.Streams.Stream_IO;
procedure Slow_Ada is
   type Machine_Architecture is (Little_Endian, Big_Endian);
   Architecture: constant Machine_Architecture := Little_Endian; -- IA
   subtype Must_Be_True is Boolean range True..True;
   Check_Element_Size: constant Must_Be_True :=
      Ada.Streams.Stream_Element'Size = 8;
   subtype Data_Array is Ada.Streams.Stream_Element_Array(1..200);
   procedure Adjust_Data (Data: in out Data_Array) is
      Temp: Ada.Streams.Stream_Element; -- assuming this is a byte
   begin
      case Architecture is
         when Little_Endian =>
            Temp := Data(199);
            Data(199) := Data(200);
            Data(200) := Temp;
         when Big_Endian =>
            null;
      end case;
   end Adjust_Data;
   pragma Inline(Adjust_Data);
   Item: Data_Array;
   Last: Ada.Streams.Stream_Element_Offset;
   File: Ada.Streams.Stream_IO.File_Type;
begin
   Ada.Streams.Stream_IO.Open
     (Name => "/dev/zero", --use any big file you like (20 MByte)
      File => File,
      Mode => Ada.Streams.Stream_IO.In_File);
   -- Put, Put_Line and Read are fully qualified: there are no use clauses.
   Ada.Text_IO.Put("Starting...");
   for I in 1 .. 100000 loop
      --- test for end of file?
      Ada.Streams.Stream_IO.Read(File,Item,Last);
      -- test Last for /= 200?
      Adjust_Data(Item);
      -- do something with the data?
   end loop;
   Ada.Text_IO.Put_Line("Done.");
   Ada.Streams.Stream_IO.Close (File);
end Slow_Ada;
If a stream element isn't a byte, you'll need to adjust Adjust_Data (and
remove or change Check_Element_Size). At this (low) level, this sort of
twiddling will always be necessary.
(Also, note how my code does the job with the minimum of 'fuss'.)
Hope this helps.
--
Nick Roberts