From mboxrd@z Thu Jan 1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level:
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham autolearn_force=no version=3.4.4
X-Google-Thread: 103376,4fbd260da735f6f4
X-Google-Attributes: gid103376,public
X-Google-Language: ENGLISH,ASCII-7-bit
Path: g2news1.google.com!postnews.google.com!l77g2000hsb.googlegroups.com!not-for-mail
From: Adam Beneschan
Newsgroups: comp.lang.ada
Subject: Re: Reading and writing a big file in Ada (GNAT) on Windows XP
Date: 3 May 2007 13:27:28 -0700
Organization: http://groups.google.com
Message-ID: <1178224048.034635.39010@l77g2000hsb.googlegroups.com>
References: <0hj5339mjmond132qhbn2o01unurs61lbj@4ax.com> <1178091967.392381.282510@o5g2000hsb.googlegroups.com>
NNTP-Posting-Host: 66.126.103.122
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
X-Trace: posting.google.com 1178224048 30793 127.0.0.1 (3 May 2007 20:27:28 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: Thu, 3 May 2007 20:27:28 +0000 (UTC)
In-Reply-To:
User-Agent: G2/1.0
X-HTTP-UserAgent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.7.12-1.3.1,gzip(gfe),gzip(gfe)
Complaints-To: groups-abuse@google.com
Injection-Info: l77g2000hsb.googlegroups.com; posting-host=66.126.103.122; posting-account=cw1zeQwAAABOY2vF_g6V_9cdsyY_wV9w
Xref: g2news1.google.com comp.lang.ada:15483
Date: 2007-05-03T13:27:28-07:00
List-Id:

On May 2, 11:31 pm, Fionn Mac Cumhaill wrote:

> I did do further investigation; I made a copy of the now-working
> program and threw most of the program away, leaving only a very simple
> program which read the large input file, but made no changes and did
> no output. I added code to track the run time and put the buffer clear
> back in. It read the 10 million lines in just a little over five
> minutes.
> I then put Index back and used it to search the buffer for a
> short string that would never be found, searching forwards from the
> beginning of the input buffer. Bingo. Run time increased to a bit more
> than 1-1/2 hours.

It strikes me that Index is the kind of function that really ought to
be written in assembly language, at least partially. I notice that
the version of Linux that I'm using has a built-in function to search
memory for a substring; this is very descriptively called memmem() and
has the amusing profile

    void *memmem (const void *needle, size_t needlelen,
                  const void *haystack, size_t haystacklen);

according to the man page. But I assume this is written to use
registers optimally and take advantage of the REP instructions (on an
x86 or Pentium). I don't know how GNAT implements Index---I haven't
looked into it.

                                -- Adam
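
For anyone who wants to try memmem() from C, here is a minimal sketch, assuming glibc on Linux. One caution about the profile quoted above: glibc itself declares the function with the haystack first, as memmem(haystack, haystacklen, needle, needlelen), so the argument order in the man page text should not be copied as-is. The helper name find_bytes is hypothetical and is only there for illustration.

```c
#define _GNU_SOURCE   /* memmem() is a GNU extension; define before any include */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the byte offset of the first occurrence of
 * needle within haystack, or -1 if it does not occur.
 * glibc's memmem() takes the haystack first, then the needle.
 */
long find_bytes(const void *haystack, size_t haystack_len,
                const void *needle, size_t needle_len)
{
    assert(haystack != NULL && needle != NULL);

    const char *hit = memmem(haystack, haystack_len, needle, needle_len);

    /* memmem() returns NULL when the needle is absent. */
    return hit ? (long)(hit - (const char *)haystack) : -1L;
}
```

Calling find_bytes(buf, len, "xyz", 3) on a buffer that does not contain "xyz" gives -1, which is the always-failing search described in the quoted test. Whether this would beat GNAT's Index is not something the sketch can show; it would have to be measured.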