From: "John R. Strohm"
Newsgroups: comp.lang.ada
Subject: Re: Is ther any sense in *= and matrices?
Date: Tue, 10 Jun 2003 18:16:47 -0500
Organization: Airnews.net! at Internet America
References: <7visre2rwv.fsf@vlinux.voxelvision.no>

"Russ" <18k11tm001@sneakemail.com> wrote in message
news:bebbba07.0306101033.66054a83@posting.google.com...
> Ole-Hjalmar Kristensen wrote in message news:<7visre2rwv.fsf@vlinux.voxelvision.no>...
> > "John R. Strohm" writes:
> >
> > > "Russ" <18k11tm001@sneakemail.com> wrote in message
> > > news:bebbba07.0306082024.7cebb5df@posting.google.com...
> > > > tmoran@acm.org wrote in message news:...
> > > > > >Oh, really? I just did a test in C++ with 3x3 matrices. I added them
> > > > > >together 10,000,000 times using "+", then "+=". The "+=" version took
> > > > > >about 19 seconds, and the "+" version took about 55 seconds. That's
> > > > > Would you be so kind as to post your code, and what C++ compiler
> > > > > and what hardware you used? Your results seem quite different from
> > > > > other people's. My old 900MHz Windows 2K machine using MSVC++ 5.0
> > > > > took 4.38 and 3.28 seconds, a factor of 1.33, and using Gnat 3.15p
> > > > > on the same machine took 1.38 and 0.85 seconds, a ratio of 1.6.
> > > > > Clearly, there's something substantially different between our
> > > > > compilers/hardware/code.
> > > >
> > > > I'm using gcc 2.95.2 on a Sunblade 2000. I can't post the code, but it
> > > > is a pretty standard vector/matrix implementation in C++. Actually, it
> > > > is designed for very efficient indexing, perhaps at the expense of
> > > > slightly less efficient construction (it has a pointer for each row of
> > > > the matrix). That might explain part of the difference you are seeing,
> > > > but certainly not all. Perhaps your choice of C++ compiler is a factor
> > > > too.
> > >
> > > I think the point that Tom is trying to make is that your results are so far
> > > out of line with expected reality that he suspects that your optimizations
> > > for indexing may in fact be pessimizations.
> > >
> >
> > Yes, on modern hardware, having a pointer for each row of the matrix
> > usually leads to slower code. The reason is that the indexing
> > operation by multiply/add is actually faster than fetching a pointer
> > from an array and then adding, thereby incurring an extra memory fetch
> > and cache pollution. The above holds for large matrices, I haven't
> > investigated the effects on small matrices.
>
> Thanks for clarifying that. I wrote that code about 10 years ago when
> I was just starting to learn C++. It has other inefficiencies too. For
> example, I used offset pointers to achieve indexing that starts with 1
> rather than 0. This saved the "minus one" operation on each indexing,
> but it put the "minus one" operations into the constructor. It makes
> sense to do the "minus one" only once in the constructor, but it does
> slow down the constructor (which increases the cost of a temporary).

Sounds like you really ought to rewrite your code and see what happens to
your benchmark results.

> If I had it to do over again, I'd probably just settle for indexing
> that starts with zero. Ada allows offset indexing much more naturally
> than C or C++, of course, but that's another topic.
>
> I am certainly no efficiency expert, but I do believe that a basic
> step in achieving reasonable efficiency is to eliminate excessive
> generation of temporary objects.

It really depends on the temporary object in question.

For a 3x3 matrix multiply (or a 3x3 matrix add), on a machine with LOTS of
general-purpose registers, the optimum temporary object is nine CPU
registers. ZERO allocation cost, ZERO deallocation cost, ZERO impact.
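
To make that concrete, here is a rough sketch of the kind of 3x3 type I have in mind. This is NOT Russ's code (he said he can't post it); the name Mat3 and its layout are my own assumptions for illustration. The nine elements live inside the object itself, indexing is by multiply/add with no per-row pointer table, and the temporary that "+" produces is just a by-value struct with no heap allocation. Whether a given compiler actually keeps those nine values in registers depends on the compiler, the optimization level, and the target, so treat this as something to measure, not a guarantee.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-size 3x3 matrix: nine doubles stored contiguously
// inside the object. No heap allocation, no per-row pointer table.
struct Mat3 {
    std::array<double, 9> a{};  // value-initialized to all zeros

    // Zero-based, row-major indexing by multiply/add.
    double& operator()(std::size_t r, std::size_t c) {
        assert(r < 3 && c < 3);
        return a[3 * r + c];
    }
    double operator()(std::size_t r, std::size_t c) const {
        assert(r < 3 && c < 3);
        return a[3 * r + c];
    }

    // In-place add: updates *this and creates no temporary.
    Mat3& operator+=(const Mat3& rhs) {
        for (std::size_t i = 0; i < 9; ++i) {
            a[i] += rhs.a[i];
        }
        return *this;
    }
};

// "+" written in terms of "+=". The by-value parameter lhs is the
// temporary: a plain 72-byte struct, which an optimizer is free to
// keep in registers if the target has enough of them.
inline Mat3 operator+(Mat3 lhs, const Mat3& rhs) {
    lhs += rhs;
    return lhs;
}
```

With something like this, "z = x + y" and "x += y" do the same nine additions, and the only extra work in the "+" form is copying nine doubles. If a benchmark on a type like this still shows a 3:1 gap, I would look at the compiler and its flags before blaming the temporary.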