From: Fionn Mac Cumhaill
Newsgroups: comp.lang.ada
Subject: Re: Reading and writing a big file in Ada (GNAT) on Windows XP
Date: Tue, 01 May 2007 14:10:43 GMT

On Sun, 29 Apr 2007 21:46:29 GMT, Fionn Mac Cumhaill wrote:

>On Sat, 28 Apr 2007 19:12:33 GMT, "Jeffrey R. Carter"
> wrote:
>
>>Fionn Mac Cumhaill wrote:
>>>
>>> All it does is read lines in a loop from a text file with
>>> Ada.Text_IO.Get_Line, does minor modifications on about 80% of the
>>> lines that it reads, and writes the lines to an output file with
>>> Put_Line.
>>>
>>> The modifications consist of replacing a slice of text at the end of a
>>> line with another bit of text. The biggest slice is 10 characters, and
>>> the replacement slice is always smaller than the original slice. An
>>> occasional line of text is about 6000 characters long, but most are
>>> about 700 characters. Get_Line reads them into a String variable that
>>> is 10,000 characters long.
>>>
>>> The problem is that the input file has more than 10 million lines of
>>> text in it. The program works perfectly, but takes about 5 hours to
>>> run. The Cygwin version of wc can count the lines in the input file in
>>> less than one minute.
>>>
>>> Why is this so slow?
>>> Do I have an Ada problem, a GNAT problem, or a MinGW problem?
>>
>>It's hard to tell. Text_IO is quite heavy-weight compared to C text I/O,
>>but I'd be surprised if it made that much difference.
>>
>>This sounds as if it would be a pretty simple program; if so, you could
>>post it (or a reasonable facsimile, if there's a reason you can't post
>>the actual code) here, and we would be better able to help.
>
>I'm posting to cla from home. The program is at work. I'll post the
>code when I get back to work.

Well, I was too pressed for time to do any one-change-at-a-time
experimentation, so I can't say which change mattered most. I removed
the buffer initialization and changed the Index searches to use
Backward. I also removed some other uses of Index, as the slice I was
looking for was at a fixed location.

Run time fell from about 5 hours to 8 minutes.
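
For anyone finding this thread later, here is a rough sketch of what those
changes might look like. It is not the actual program, which was never
posted. It assumes the "buffer initialization" was a per-line fill of the
10,000-character buffer, and the names Rewrite_Tails, Old_Tail and
New_Tail and their values are made up for illustration. It reads standard
input and writes standard output.

```ada
with Ada.Text_IO;
with Ada.Strings;
with Ada.Strings.Fixed;

procedure Rewrite_Tails is
   use Ada.Text_IO;

   --  Hypothetical slice to replace and its (shorter) replacement.
   Old_Tail : constant String := "OLDSUFFIX";
   New_Tail : constant String := "NEW";

   --  Declared once and never filled with blanks: Get_Line reports how
   --  much of the buffer is valid through Last, so (assumption) a
   --  per-line "Line := (others => ' ');" would be wasted work.
   Line : String (1 .. 10_000);
   Last : Natural;
   Pos  : Natural;
begin
   while not End_Of_File loop
      Get_Line (Line, Last);

      --  The slice sits near the end of the line, so search backward
      --  from the end instead of scanning the whole line from the left.
      Pos := Ada.Strings.Fixed.Index
               (Source  => Line (1 .. Last),
                Pattern => Old_Tail,
                Going   => Ada.Strings.Backward);

      if Pos /= 0 then
         --  Rebuild the line around the replaced slice.
         Put_Line (Line (1 .. Pos - 1)
                   & New_Tail
                   & Line (Pos + Old_Tail'Length .. Last));
      else
         --  No match: copy the line through unchanged.
         Put_Line (Line (1 .. Last));
      end if;
   end loop;
end Rewrite_Tails;
```

Where the slice is known to sit at a fixed location, the Index call can
be dropped in favour of a direct comparison such as
Line (Last - Old_Tail'Length + 1 .. Last) = Old_Tail, guarded by a check
that Last >= Old_Tail'Length. Note that this form of Get_Line splits any
line longer than the buffer across two reads, so the sketch relies on the
longest line being well under 10,000 characters, as described above.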