From: Jeffrey Creem
Newsgroups: comp.lang.ada
Subject: Re: Reading and writing a big file in Ada (GNAT) on Windows XP
Date: Sat, 28 Apr 2007 08:41:26 -0400
References: <0hj5339mjmond132qhbn2o01unurs61lbj@4ax.com>

Fionn Mac Cumhaill wrote:
> I have written a very simple program that runs on a Windows XP
> machine. (Pentium D , I don't remember the clock speed, it is in
> middle of Pentium D clock speeds, 2 Gb memory) It is compiled with the
> MinGW GNAT 3.4.5 Ada compiler.
>
> All it does is read lines in a loop from a text file with
> Ada.Text_IO.Get_Line, does minor modifications on about 80% of the
> lines that it reads, and writes the lines to an output file with
> Put_Line.
>
> The modifications consist of replacing a slice of text at the end of a
> line with another bit of text. The biggest slice is 10 characters, and
> the replacement slice is always smaller than the original slice. An
> occasional line of text is about 6000 characters long, but most are
> about 700 characters. Get_Line reads them into a String variable that
> is 10,000 characters long.
>
> The problem is that the input file has more than 10 million lines of
> text in it. The program works perfectly, but takes about 5 hours to
> run. The Cygwin version of wc can count the lines in the input file in
> less than one minute.
>
> The program that produces the file pulls data from a SQL Server 7
> running on an 800 MHz Pentium III machine with 512 Mb and writes it to
> a file in 1-1/2 hours. Almost all (99.9%) of the lines are rows from a
> database table.
>
> Why is this so slow?
> Do I have an Ada problem, a GNAT problem, or a MinGW problem?

Is there any chance that the code is doing things like initializing the
string to blanks/nulls before each Get_Line, and/or scanning beyond the
value returned for Last when doing the replace?
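
To make the question concrete, here is a minimal sketch of a loop that
avoids both of those costs. It is not the original poster's code: the
file names and the Old_Tail/New_Tail strings are hypothetical, chosen
only to illustrate replacing a short slice at the end of a line. The
buffer is declared once and never re-initialized, and every read of it
is confined to Line (1 .. Last).

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Copy_Lines is
   --  Hypothetical names and values; none of these appear in the post.
   Input_Name  : constant String := "input.txt";
   Output_Name : constant String := "output.txt";
   Old_Tail    : constant String := "0000000000";  --  10 characters
   New_Tail    : constant String := "0";           --  shorter replacement

   Input  : File_Type;
   Output : File_Type;

   --  Declared once, outside the loop. It is deliberately left
   --  uninitialized: Get_Line overwrites the part that matters.
   Line : String (1 .. 10_000);
   Last : Natural;
begin
   Open (Input, In_File, Input_Name);
   Create (Output, Out_File, Output_Name);

   while not End_Of_File (Input) loop
      Get_Line (Input, Line, Last);

      --  Only Line (1 .. Last) is valid; anything past Last is stale
      --  data from earlier reads and should never be scanned.
      if Last >= Old_Tail'Length
        and then Line (Last - Old_Tail'Length + 1 .. Last) = Old_Tail
      then
         Put_Line (Output, Line (1 .. Last - Old_Tail'Length) & New_Tail);
      else
         Put_Line (Output, Line (1 .. Last));
      end if;
   end loop;

   Close (Input);
   Close (Output);
end Copy_Lines;
```

If the real program already looks like this, the time is probably going
somewhere else (for example in the I/O layer itself), and a profile or a
stripped-down copy-only run would be the next thing to try.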