From mboxrd@z Thu Jan 1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on polar.synack.me
X-Spam-Level:
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham autolearn_force=no version=3.4.4
X-Google-Thread: 103376,4fbd260da735f6f4
X-Google-Attributes: gid103376,public
X-Google-Language: ENGLISH,ASCII-7-bit
Path: g2news1.google.com!news4.google.com!proxad.net!news.in2p3.fr!in2p3.fr!news.ecp.fr!news.jacob-sparre.dk!pnx.dk!not-for-mail
From: "Randy Brukardt"
Newsgroups: comp.lang.ada
Subject: Re: Reading and writing a big file in Ada (GNAT) on Windows XP
Date: Thu, 3 May 2007 18:01:10 -0500
Organization: Jacob's private Usenet server
Message-ID:
References: <0hj5339mjmond132qhbn2o01unurs61lbj@4ax.com> <1178091967.392381.282510@o5g2000hsb.googlegroups.com> <1178224048.034635.39010@l77g2000hsb.googlegroups.com>
NNTP-Posting-Host: static-69-95-181-76.mad.choiceone.net
X-Trace: jacob-sparre.dk 1178233152 12816 69.95.181.76 (3 May 2007 22:59:12 GMT)
X-Complaints-To: news@jacob-sparre.dk
NNTP-Posting-Date: Thu, 3 May 2007 22:59:12 +0000 (UTC)
X-Priority: 3
X-MSMail-Priority: Normal
X-Newsreader: Microsoft Outlook Express 6.00.2800.1807
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1807
Xref: g2news1.google.com comp.lang.ada:15485
Date: 2007-05-03T18:01:10-05:00
List-Id:

"Adam Beneschan" wrote in message
news:1178224048.034635.39010@l77g2000hsb.googlegroups.com...
...
> It strikes me that Index is the kind of function that really ought to
> be written in assembly language, at least partially. I notice that
> the version of Linux that I'm using has a built-in function to search
> memory for a substring; this is very descriptively called memmem() and
> has the amusing profile
>
>     void *memmem (const void *needle, size_t needlelen,
>                   const void *haystack, size_t haystacklen);
>
> according to the man page. But I assume this is written to use
> registers optimally and take advantage of the REP instructions (on an
> x86 or Pentium).
> I don't know how GNAT implements Index---I haven't
> looked into it.

The big expense in Index is the mapping set or function, not the actual
compare. For Janus/Ada, I had seen a similar problem (a big deal as Index
was used to look for spam patterns), and finally special-cased a number of
common cases (no mapping, single-character patterns, and so on). I also
spent a bit of time on the code generator, figuring that this sort of string
manipulation code is common enough that it might as well be generated well.
The updates helped a lot, although they still don't quite produce the single
instruction that is possible.

(OTOH, Intel used to recommend avoiding the block move and compare
instructions because they fouled up the pipeline and thus slowed the overall
execution. I don't know if that is still true, but if it is, there might be
less benefit to hand-coded assembler than you are thinking...)

                                    Randy.
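
The special-casing described above can be sketched roughly as follows, in C to match the memmem() prototype quoted earlier. This is only an illustration of the idea, not the Janus/Ada or GNAT implementation. The names `char_map`, `index_mapped`, and `index_of` are hypothetical, the mapping is assumed to be a 256-entry translation table (with NULL standing in for "no mapping"), and results are 0-based offsets with -1 for "not found" rather than Ada's 1-based result with 0 for "not found".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an Ada character mapping: a 256-entry
   translation table. NULL means "no mapping" (identity). */
typedef const unsigned char *char_map;

/* General path: every source character goes through the mapping before
   it is compared with the (unmapped) pattern character. */
static ptrdiff_t index_mapped(const char *src, size_t src_len,
                              const char *pat, size_t pat_len,
                              char_map map)
{
    for (size_t i = 0; i + pat_len <= src_len; i++) {
        size_t j = 0;
        while (j < pat_len &&
               map[(unsigned char)src[i + j]] == (unsigned char)pat[j])
            j++;
        if (j == pat_len)
            return (ptrdiff_t)i;
    }
    return -1;
}

/* Returns the 0-based offset of the first match, or -1 if there is none. */
ptrdiff_t index_of(const char *src, size_t src_len,
                   const char *pat, size_t pat_len, char_map map)
{
    /* Ada's Index propagates Pattern_Error for an empty pattern;
       this sketch simply asserts instead. */
    assert(pat_len > 0);

    if (pat_len > src_len)
        return -1;

    if (map == NULL) {
        if (pat_len == 1) {
            /* Special case: single-character pattern, no mapping. */
            const char *p = memchr(src, (unsigned char)pat[0], src_len);
            return p ? (ptrdiff_t)(p - src) : -1;
        }
        /* Special case: no mapping, so compare raw bytes directly. */
        for (size_t i = 0; i + pat_len <= src_len; i++) {
            if (src[i] == pat[0] && memcmp(src + i, pat, pat_len) == 0)
                return (ptrdiff_t)i;
        }
        return -1;
    }

    /* Fall back to the general, per-character mapped compare. */
    return index_mapped(src, src_len, pat, pat_len, map);
}
```

With no mapping, `index_of("Hello, World", 12, "World", 5, NULL)` takes the byte-compare path and never touches a translation table; only callers that actually supply a mapping pay for the per-character lookup. Whether `memchr` and `memcmp` beat a plain loop depends on the C library and the processor, which is the same caveat as the one about the block instructions.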