From: Quarc <quarc2000@hotmail.com>
Subject: Re: Reading and writing a big file in Ada (GNAT) on Windows XP
Date: Sun, 06 May 2007 23:55:33 +0200
Message-ID: <f1lin1$1rj$1@yggdrasil.glocalnet.net>
In-Reply-To: <ie4a339a0qrr3g5cio7pugish192o8h61f@4ax.com>
Fionn Mac Cumhaill wrote:
> On Sat, 28 Apr 2007 19:12:33 GMT, "Jeffrey R. Carter"
> <jrcarter@acm.org> wrote:
>
>> Fionn Mac Cumhaill wrote:
>>> All it does is read lines in a loop from a text file with
>>> Ada.Text_IO.Get_Line, does minor modifications on about 80% of the
>>> lines that it reads, and writes the lines to an output file with
>>> Put_Line.
>>>
>>> The modifications consist of replacing a slice of text at the end of a
>>> line with another bit of text. The biggest slice is 10 characters, and
>>> the replacement slice is always smaller than the original slice. An
>>> occasional line of text is about 6000 characters long, but most are
>>> about 700 characters. Get_Line reads them into a String variable that
>>> is 10,000 characters long.
>>>
>>> The problem is that the input file has more than 10 million lines of
>>> text in it. The program works perfectly, but takes about 5 hours to
>>> run. The Cygwin version of wc can count the lines in the input file in
>>> less than one minute.
>>>
>>> Why is this so slow?
>>> Do I have an Ada problem, a GNAT problem, or a MinGW problem?
Sorry, but I don't have your full question here, so I am trying to figure
out your problem from the quote above. I would assume the big difference
between what you are doing and what wc does is that wc only reads the
file, and can therefore do it in one go.
I am speculating now, but I think one explanation would be that
you are reading, then writing, each line by itself. Depending on the OS
you are running on, and also on the binding to the OS, this could
potentially create a lot of seeks on the disk, which could explain
the long run time. One of two things happens, I think (or potentially
both):

1) You read 700 bytes from disk, then you write 700 bytes to disk.
   Assume an average seek time of 5 ms for each of the 10 million reads
   and each of the 10 million writes, and you get
   2 * 50 000 s = 100 000 s, or about 28 hours!
2) Instead of you causing the seeks between reads and writes, other
   processes are waiting for disk access in the OS as well, so quite
   often some other process gets to the front of the waiting queue and
   accesses the disk, causing the seeks.
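
The estimate in 1) can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch (written in Python for brevity, not Ada); the 5 ms seek time and the one-seek-per-read, one-seek-per-write model are the assumptions stated above, and the function name is made up for illustration.

```python
# Back-of-the-envelope estimate of time lost to disk seeks.
LINES = 10_000_000      # "more than 10 million lines" from the original question
SEEK_MS = 5             # assumed average seek time per disk access, in ms
SEEKS_PER_LINE = 2      # assumed: one seek for the read, one for the write


def seek_seconds(lines: int, seek_ms: float, seeks_per_line: int) -> float:
    """Total seconds spent seeking if every read and write costs one seek."""
    return lines * seeks_per_line * seek_ms / 1000.0


total_s = seek_seconds(LINES, SEEK_MS, SEEKS_PER_LINE)
total_h = total_s / 3600.0
print(f"{total_s:.0f} s = {total_h:.1f} hours")  # prints "100000 s = 27.8 hours"
```

This is a worst case: it assumes nothing is buffered or cached anywhere between the program and the disk.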
If one of the above is causing your problem, the only solution is to
read in larger chunks at once, and then write larger chunks of data as
well. I don't remember how to set that up, and it of course depends
on the compiler and OS, but hopefully you should be able to make sure you
don't read line by line from disk.
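
To illustrate the idea only, here is a rough sketch in Python rather than Ada, so it is not the GNAT mechanism itself. The program still handles one line at a time, but the file objects fetch and flush data in large blocks, so the disk sees a few big transfers instead of millions of small ones. The 4 MB buffer size, the function names, and the sample suffix replacement are all assumptions made up for this example.

```python
BUFFER_BYTES = 4 * 1024 * 1024  # assumed buffer size; tune for your system


def replace_tail(line: bytes, old: bytes = b"0123456789", new: bytes = b"abc") -> bytes:
    """Hypothetical edit: swap a trailing slice for a shorter one."""
    body = line.rstrip(b"\r\n")
    ending = line[len(body):]          # keep the original line terminator
    if body.endswith(old):
        body = body[: len(body) - len(old)] + new
    return body + ending


def rewrite_file(src_path, dst_path, edit_line, buffer_bytes=BUFFER_BYTES):
    """Copy src to dst line by line, using large read and write buffers.

    Returns the number of lines processed.
    """
    count = 0
    with open(src_path, "rb", buffering=buffer_bytes) as src, \
         open(dst_path, "wb", buffering=buffer_bytes) as dst:
        for line in src:               # served from the large read buffer
            dst.write(edit_line(line)) # collected in the large write buffer
            count += 1
    return count
```

A call would look like rewrite_file("input.txt", "output.txt", replace_tail), with both file names being placeholders. Whether this helps depends on how much buffering the runtime and the OS already do, so it is worth measuring before and after.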
regards
Peter Atterfjäll
(haven't been working professionally with Ada for 14 years now :-)