From: Yannick Duchêne (Hibou57)
Newsgroups: comp.lang.ada
Subject: Re: for S'Image use Func??
Date: Tue, 11 May 2010 20:48:50 +0200
Organization: Ada At Home

On Tue, 11 May 2010 19:15:07 +0200, Dmitry A. Kazakov wrote:

> Buffering = making copies. A copy is always an overhead. It pays off if
> you have asynchronous components (use them in parallel), or components
> with high switching overhead, or faster memory (caching, indexing etc.).
> If you don't have that, it is just a loss.

As you said, there is caching, and even the good old i486 had a cache.
If you iterate over the next X items of data and X is not too large (otherwise, you can iterate over multiple buffers), then you benefit from the cache, which is otherwise lost as soon as you call the method that reads one more item and loads its own state into the cache. There is nothing special here: nearly all machines have a cache.

Also, with a stream, you will need a buffer anyway if you want to look ahead in the data (a stream is one way only, always forward, never backward).

Since you were talking about checking whether a particular optimization is really efficient, I can assert that I have checked that buffering improves performance on DOS running on an i486 at 25 MHz. I remember assembly and Pascal programs getting noticeably better performance as soon as they relied on a buffer of at least 512 bytes. There was a limit above which increasing the buffer size did not improve performance any more (if I remember correctly, it was something like 2 KiB).

In another application I am currently working on, on Windows XP running on a faster machine (1 GHz CPU), better performance can be gained with buffer sizes above that limit: I have checked that the application spends about 30% less time executing with a 150 KiB buffer than with a rather small one. Giving it an even bigger buffer does not make much difference.

It seems to me that the faster the machine is, the larger you can usefully make the buffers.

Then, I feel you see copies where there are no more copies than with the way you suggest. Given this:

   Read one byte into a one-byte variable
   Read one byte into a one-byte variable
   Read one byte into a one-byte variable
   Read one byte into a one-byte variable
   Read one byte into a one-byte variable

and then that:

   Read five bytes into a five-byte buffer

Which one makes more copies than the other? The answer is: neither, they both copy exactly five bytes.
The second one is not making any more copies; it is just using a bigger variable to store multiple items at once. So the question then is "which is the best capacity for the buffer". The first answer is "it depends on memory" (I am not saying all memory may be used for that, just that there is a proportional relation), and the second answer is "check it yourself, playing with the buffer size and Ada.Calendar".

There is also another point: it is mostly better, *when possible*, to do

   Batch OP1.1, OP1.2, OP1.3, OP1.4
   Batch OP2.1, OP2.2, OP2.3, OP2.4

instead of

   OP1.1
   OP2.1
   OP1.2
   OP2.2
   etc.

The reason, here again, is the CPU cache. This is somewhat related to the ability of an application or an algorithm to be redesigned on top of parallelism. Here, instead of getting a benefit from simultaneous execution, we get a benefit from the CPU cache (and nearly all CPUs have a cache).

Ah, an occasion to say that the CPU cache also has another interesting effect people should know about: loop unrolling is most of the time counter-productive. If you enable the loop-unrolling "optimization" option of your compiler, please check that it is really relevant; don't just believe it (by the way, the code cache and the data cache do not apply exactly the same strategy, so don't infer code-cache performance beliefs from data-cache performance observations).

-- 
pragma Asset ? Is that true ? Waaww... great