comp.lang.ada
From: Quarc <quarc2000@hotmail.com>
Subject: Re: Reading and writing a big file in Ada (GNAT) on Windows XP
Date: Sun, 06 May 2007 23:55:33 +0200
Date: 2007-05-06T21:52:33+00:00
Message-ID: <f1lin1$1rj$1@yggdrasil.glocalnet.net>
In-Reply-To: <ie4a339a0qrr3g5cio7pugish192o8h61f@4ax.com>

Fionn Mac Cumhaill wrote:
> On Sat, 28 Apr 2007 19:12:33 GMT, "Jeffrey R. Carter"
> <jrcarter@acm.org> wrote:
> 
>> Fionn Mac Cumhaill wrote:
>>> All it does is read lines in a loop from a text file with
>>> Ada.Text_IO.Get_Line, does minor modifications on about 80% of the
>>> lines that it reads, and writes the lines to an output file with
>>> Put_Line. 
>>>
>>> The modifications consist of replacing a slice of text at the end of a
>>> line with another bit of text. The biggest slice is 10 characters, and
>>> the replacement slice is always smaller than the original slice. An
>>> occasional line of text is about 6000 characters long, but most are
>>> about 700 characters.  Get_Line reads them into a String variable that
>>> is 10,000 characters long.
>>>
>>> The problem is that the input file has more than 10 million lines of
>>> text in it. The program works perfectly, but takes about 5 hours to
>>> run. The Cygwin version of wc can count the lines in the input file in
>>> less than one minute.
>>>
>>> Why is this so slow? 
>>> Do I have an Ada problem, a GNAT problem, or a MinGW problem?

Sorry, I don't have your full question here, so I am trying to work out
your problem from the quote above. I would assume the big difference
between what you are doing and what wc does is that wc only reads the
file, and can therefore do it in one go.

I am speculating now, but I think one explanation would be that you are
reading, then writing, each line by itself. Depending on the OS you are
running on, and also on the binding to the OS, this could create a lot
of seeks on the disk, which could explain the long run time. I think one
of two things happens (or possibly both):

1) You read 700 bytes from disk, then you write 700 bytes to disk.
Assume an average seek time of 5 ms per access: 10 million lines * 5 ms
= 50 000 s for the reads, and the same again for the writes, so
2 * 50 000 s = 100 000 s = about 28 hours!

2) Even if you are not causing the seeks yourself between reads, other
processes are waiting for disk access in the OS as well. Quite often
some other process gets to the front of the queue and accesses the disk,
which causes the seeks.

If one of the above is causing your problem, the only solution is to
read in larger chunks at once, and then write larger chunks of data as
well. I don't remember how to set that up, and it of course depends on
the compiler and OS, but hopefully you should be able to make sure you
don't read line by line from disk.
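
A minimal sketch of the "larger chunks" idea, assuming the standard
Ada.Streams.Stream_IO package is acceptable in place of Ada.Text_IO. The
file names, the 64 KiB chunk size, and the procedure name Chunked_Copy
are illustrative assumptions, not taken from the original program, and
whether this actually removes the bottleneck is untested here:

```ada
with Ada.Streams;           use Ada.Streams;
with Ada.Streams.Stream_IO;

--  Sketch only: copies "input.txt" to "output.txt" in large chunks.
--  File names and chunk size are assumptions chosen for illustration.
procedure Chunked_Copy is
   package SIO renames Ada.Streams.Stream_IO;

   Chunk_Size : constant := 64 * 1024;  --  64 KiB per read and write

   Input  : SIO.File_Type;
   Output : SIO.File_Type;
   Buffer : Stream_Element_Array (1 .. Chunk_Size);
   Last   : Stream_Element_Offset;
begin
   SIO.Open   (Input,  SIO.In_File,  "input.txt");
   SIO.Create (Output, SIO.Out_File, "output.txt");

   while not SIO.End_Of_File (Input) loop
      --  One large read; Last is the index of the last element filled.
      SIO.Read (Input, Buffer, Last);

      --  Per-line edits would be applied to Buffer (1 .. Last) here.

      --  One large write of exactly the elements that were read.
      SIO.Write (Output, Buffer (1 .. Last));
   end loop;

   SIO.Close (Input);
   SIO.Close (Output);
end Chunked_Copy;
```

As written this only copies the file. A real version would also have to
find the line terminators inside each chunk and carry a partial line over
to the next chunk before applying the end-of-line replacements.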

regards

Peter Atterfjäll
(haven't been working professionally with Ada for 14 years now :-)


