* Ada and vectorization
@ 2002-06-16 9:56 Guillaume Foliard
2002-06-16 12:50 ` Dale Stanbrough
2002-06-17 23:47 ` Robert I. Eachus
0 siblings, 2 replies; 15+ messages in thread
From: Guillaume Foliard @ 2002-06-16 9:56 UTC (permalink / raw)
Hello,
I have started learning how to use Intel's SSE instruction set in Ada programs
with inline assembly. While reading the Intel documentation (1), I was
asking myself whether Ada could provide a clean way of vectorization through its
strongly typed approach. Would it be sensible, for the next Ada revision, to
create some new attributes for array types to explicitly hint to the compiler
that we want to use SIMD instructions?
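To make the idea concrete, here is a sketch of what such an array-type hint
might look like. The Alignment clause is standard Ada 95; the "Vectorize"
attribute is purely hypothetical, invented here for illustration:

```ada
--  A 4-wide single-precision vector type, aligned for 16-byte SIMD
--  loads.  The Alignment clause below is standard Ada 95.
type Float32_X4 is array (1 .. 4) of Float;
for Float32_X4'Alignment use 16;

--  Hypothetical future attribute (NOT real Ada): a hint that
--  operations on this type should be mapped to SIMD instructions.
--  for Float32_X4'Vectorize use True;
```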
Language lawyers' comments are definitely welcome. As SIMD in modern
general-purpose processors is widely available nowadays (SSE, SSE2, AltiVec,
etc.), IMHO it would be a mistake for Ada to ignore the performance
benefit this could bring.
(1)
http://www.intel.com/software/products/college/ia32/strmsimd/814down.htm
Guillaume Foliard
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Ada and vectorization
2002-06-16 9:56 Ada and vectorization Guillaume Foliard
@ 2002-06-16 12:50 ` Dale Stanbrough
2002-06-16 20:07 ` Matthias Kretschmer
2002-06-16 22:45 ` Ted Dennison
2002-06-17 23:47 ` Robert I. Eachus
1 sibling, 2 replies; 15+ messages in thread
From: Dale Stanbrough @ 2002-06-16 12:50 UTC (permalink / raw)
Guillaume Foliard wrote:
> Hello,
>
> I have started learning how to use Intel's SSE instruction set in Ada
> programs with inline assembly. While reading the Intel documentation (1),
> I was asking myself whether Ada could provide a clean way of vectorization
> through its strongly typed approach. Would it be sensible, for the next
> Ada revision, to create some new attributes for array types to explicitly
> hint to the compiler that we want to use SIMD instructions?
> Language lawyers' comments are definitely welcome. As SIMD in modern
> general-purpose processors is widely available nowadays (SSE, SSE2,
> AltiVec, etc.), IMHO it would be a mistake for Ada to ignore the
> performance benefit this could bring.
I think the best way to do this is via pragmas. There is one pragma -
Annotate - which would be perfect for the job.
I think Annotate is a GNAT-only thing - the real work would have
to be done with an ASIS-like tool.
Very much like the Fortran world, where the structured comments can
be ignored by ignorant compilers, and the program still behaves
correctly (if not as fast).
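As an illustration of this suggestion, an annotated loop might look like the
sketch below. pragma Annotate is GNAT-specific and compiles to nothing; the
"Vectorizer" and "Four_Wide" arguments are invented here, and an external
ASIS-based tool would have to interpret them:

```ada
type Float_Vector is array (Positive range <>) of Float;

procedure Scale (V : in out Float_Vector; K : in Float) is
begin
   --  GNAT itself ignores pragma Annotate; only an ASIS-like tool
   --  would read it.  Both arguments are hypothetical conventions.
   pragma Annotate (Vectorizer, Four_Wide);
   for I in V'Range loop
      V (I) := V (I) * K;
   end loop;
end Scale;
```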
Dale
* Re: Ada and vectorization
2002-06-16 12:50 ` Dale Stanbrough
@ 2002-06-16 20:07 ` Matthias Kretschmer
2002-06-16 22:38 ` Robert A Duff
2002-06-16 22:45 ` Ted Dennison
1 sibling, 1 reply; 15+ messages in thread
From: Matthias Kretschmer @ 2002-06-16 20:07 UTC (permalink / raw)
I think this job should be done by the compiler itself, without
changing the language:
a) I do not want to specify everywhere which feature to use.
b) Vectorisation could be useful in many places - don't we want to use it
anywhere we get a speed gain from the vector unit?
c) There are good examples showing there is no need - think of the x86
architecture and Intel's C compiler: it uses MMX/SSE/SSE2, if one lets it,
everywhere it makes sense, and the performance gain for cleanly written
programs is high.
d) Why bind this to variables? In one place it would be useful to use
vectorisation for a piece of code, and in another place, with the very same
variables, it would not - so explicitly enabling it for just one bunch of
array variables could be very inefficient compared to another approach. And
why not group variables together in records (e.g. someone using a record
with X, Y and Z Cartesian coordinates for a 3D representation)? One could
extend the feature to record types, but why?
e) On many architectures the vector unit is just a coprocessor, so the FPU
could calculate one part and the vector unit the other. I think we want to
let the optimizer decide how to use both units to get the best performance -
so why don't we let the optimizer decide when to use the vector unit, too?
GNAT uses the GNU Compiler Collection as its backend - in 3.1 this is
implemented - so it uses the same code generator, which is capable of using
MMX/SSE/3DNow! in some way (do not ask me how well - I just know that
POV-Ray is far faster after compiling it with the Intel C compiler...).
Dale Stanbrough wrote:
> Guillaume Foliard wrote:
>
>> Hello,
>>
>> I have started learning how to use Intel's SSE instruction set in Ada
>> programs with inline assembly. While reading the Intel documentation (1),
>> I was asking myself whether Ada could provide a clean way of vectorization
>> through its strongly typed approach. Would it be sensible, for the next
>> Ada revision, to create some new attributes for array types to explicitly
>> hint to the compiler that we want to use SIMD instructions?
>> Language lawyers' comments are definitely welcome. As SIMD in modern
>> general-purpose processors is widely available nowadays (SSE, SSE2,
>> AltiVec, etc.), IMHO it would be a mistake for Ada to ignore the
>> performance benefit this could bring.
>
>
> I think the best way to do this is via pragmas. There is one pragma -
> Annotate - which would be perfect for the job.
> I think Annotate is a GNAT-only thing - the real work would have
> to be done with an ASIS-like tool.
>
> Very much like the Fortran world, where the structured comments can
> be ignored by ignorant compilers, and the program still behaves
> correctly (if not as fast).
>
> Dale
--
Greetings
Matthias Kretschmer
* Re: Ada and vectorization
2002-06-16 20:07 ` Matthias Kretschmer
@ 2002-06-16 22:38 ` Robert A Duff
2002-06-18 8:24 ` Matthias Kretschmer
0 siblings, 1 reply; 15+ messages in thread
From: Robert A Duff @ 2002-06-16 22:38 UTC (permalink / raw)
Various early versions of the Ada 9X proposals had some explicit support
for vectorizing and the like. I don't remember the details. You could
look up the early versions if you're interested.
These were removed, not because there was anything wrong with them
technically in and of themselves, but because there was a general
feeling amongst reviewers (especially compiler writers) that there were
too many new features.
- Bob
* Re: Ada and vectorization
2002-06-16 12:50 ` Dale Stanbrough
2002-06-16 20:07 ` Matthias Kretschmer
@ 2002-06-16 22:45 ` Ted Dennison
1 sibling, 0 replies; 15+ messages in thread
From: Ted Dennison @ 2002-06-16 22:45 UTC (permalink / raw)
Dale Stanbrough wrote:
>>I have started learning how to use Intel's SSE instruction set in Ada
>>programs with inline assembly. While reading the Intel documentation (1),
>>I was asking myself whether Ada could provide a clean way of vectorization
>>through its strongly typed approach. Would it be sensible, for the next Ada revision, to
> I think the best way to do this is via pragmas. There is one pragma -
When I was reading about HPF, I remember thinking that the parallel
loops could be done just as easily in Ada with custom pragmas ("pragma
parallel (Loopname);"). I also remember thinking that a lot of the
optimization problems that we obsessed over in class (it was a compiler
optimization class) would be much simpler in Ada.
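A sketch of the custom pragma described here; "Parallel" is not a standard
Ada 95 pragma, but under RM 2.8 an implementation must ignore (with a
warning) any pragma it does not recognize, so the loop stays correct on
compilers without the extension, merely sequential:

```ada
--  "pragma Parallel" is hypothetical: a conforming compiler that does
--  not recognize the name must ignore the pragma with a warning
--  (Ada 95 RM 2.8), leaving the loop correct but unvectorized.
procedure Sum_Demo is
   type Float_Array is array (1 .. 1_000) of Float;
   Data  : Float_Array := (others => 1.0);
   Total : Float := 0.0;
begin
   pragma Parallel (Accumulate);
   Accumulate :
   for I in Data'Range loop
      Total := Total + Data (I);
   end loop Accumulate;
end Sum_Demo;
```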
* Re: Ada and vectorization
2002-06-16 9:56 Ada and vectorization Guillaume Foliard
2002-06-16 12:50 ` Dale Stanbrough
@ 2002-06-17 23:47 ` Robert I. Eachus
1 sibling, 0 replies; 15+ messages in thread
From: Robert I. Eachus @ 2002-06-17 23:47 UTC (permalink / raw)
Guillaume Foliard wrote:
> I have started learning how to use Intel's SSE instruction set in Ada
> programs with inline assembly. While reading the Intel
> documentation (1), I was asking myself whether Ada could provide a
> clean way of vectorization through its strongly typed approach.
> Would it be sensible, for the next Ada revision, to create some new
> attributes for array types to explicitly hint to the compiler that we
> want to use SIMD instructions? Language lawyers' comments are
> definitely welcome. As SIMD in modern general-purpose processors is
> widely available nowadays (SSE, SSE2, AltiVec, etc.), IMHO, it
> would be a mistake for Ada to ignore the performance benefit this
> could bring.
Let me answer this with two different hats on.
First language lawyer: You have to ask what restrictions imposed by the
language prevent the use of these features, then look at how to either
relax the restrictions or create language features which explicitly
bypass the restrictions. This has been done in Ada. For example:
Ada allows non-standard numeric types to allow for things like a
floating-point type with inaccurate divides, integer types that cannot
be used as array indices, etc.
If you don't need accuracy, you can compile and execute programs with
the strict mode of the Numeric Annex turned off. I am not quite that
crazy, but it could make sense for 3d display code. ;-)
See 11.6, Exceptions and Optimization (and that section can lead to a
really long thread...).
So if it takes something special to use an SIMD instruction set, the
language allows it. In practice all of the existing interesting SIMD
extensions can be mapped to standard integer, boolean, float, etc. types.
Now from a practical point of view: There are two problems with
designing language extensions to map to specific hardware. The first is
that software and language lifetimes are much greater than hardware
lifetimes. For example, you mention AMD's 3dNow! As it happens, there
are three versions of 3dNow!: the original version in the K6-2, the
extended version in the original Athlons, and the version in the Athlon
XP (and Morgan-core Duron) chips that is a superset of Intel's SSE.
The Intel situation is a little clearer, but even there, if you are doing
a decent (portable) programming job you have to deal with MMX-only
chips, those with SSE, and those with SSE2. It is much nicer to use the
right architecture switch and have the compiler produce efficient code
for your target architecture. (If you are really doing a good job, you
will isolate all the SIMD-dependent code into a few DLLs, and have the
installer choose the correct version of each for the current hardware.)
The second practical issue is much nastier. Two implementations of the
same identical ISA can have very different performance behavior. Worse,
two otherwise exclusive features can have nasty interactions in an
implementation. Let me take a simple example, MMX and 3dNow! Athlons
allow integers and floating-point values to share architectural
registers. Due to the large floating-point register renaming files this
is actually a nice feature. But if you reset mode bits, the programmer
usually cares which mode bits are used for which operations. The
solution is to generate an SFENCE instruction, which ensures that the
view of hardware registers and memory is globally consistent, even for
things which are otherwise weakly ordered. This instruction can have
almost no latency--or require thousands of clock cycles in the worst
cases. (For example a write may cause a TLB miss, and the part of the
memory table that needs to be read may not be in L1 or L2 cache.)
So what code should a compiler generate? The usual solution is to
consider both the average execution time and the variance when choosing
between two solutions. Would you rather that the compiler used sequence
A, with a minimum of 107 clocks and a maximum of 192, or sequence B with
a minimum of 100 clocks and a worst case of 1000? This often results in
not using MMX registers or SSE code where the potential savings is only
a few percent. If the user forces the compiler to use SSE in all cases,
the horrible sequences will be in there along with the good ones.
One last horrible problem with the innocent sounding name of store to
load forwarding. On modern processors, actual stores from registers to
either cache or main memory can take place hundreds of clock cycles
later than the beginning of the move instruction. Out of order
processors get around this by keeping track of pending writes of
renaming registers and if a load instruction for that data is
encountered, the load is turned into a no-op, and the register is
renamed as the target of the load.
But what if only part of the load data is coming from the store and the
rest is being read from cache or main memory? Most chips throw up their
hands and make the load instruction dependent on the store instruction
being retired. This is a nasty cost you don't want to run into. (There
are also other ways to run into store-to-load forwarding problems, but
that is another topic.) What if you have a 32-bit integer in an integer
register and want to combine it into a 64-bit or 128-bit SSE operand?
Uh-oh! Much better to avoid the store-to-load restrictions and the
SSE operations. Again, this is something where you expect (hope?) that
the compiler will get it right, and forcing the use of SSE can result in
very suboptimal code.
* Re: Ada and vectorization
2002-06-16 22:38 ` Robert A Duff
@ 2002-06-18 8:24 ` Matthias Kretschmer
2002-06-18 10:02 ` Dale Stanbrough
0 siblings, 1 reply; 15+ messages in thread
From: Matthias Kretschmer @ 2002-06-18 8:24 UTC (permalink / raw)
Robert A Duff wrote:
> Various early versions of the Ada 9X proposals had some explicit support
> for vectorizing and the like. I don't remember the details. You could
> look up the early versions if you're interested.
>
> These were removed, not because there was anything wrong with them
> technically in and of themselves, but because there was a general
> feeling amongst reviewers (especially compiler writers) that there were
> too many new features.
>
> - Bob
Oh, I didn't want to say this is wrong, but I think there is a better
solution - I do not want to care about where to use or not use this or
that feature of an architecture. And if the architectures change, what do
we do next - write some new pragmas so the compiler can optimize better for
the new architectures? I think that today's compilers - their optimizers -
are capable of finding the places where the vector unit, or whatever cool
feature of your CPU, can be used, so I want them to decide and leave me
alone to deal with the really important stuff. I do not want to know how
many clock cycles operation A takes in unit B of CPU C. Think of what you
would have to know just to get something done that other compilers do on
their own without bothering the programmer.
The logic for deciding when to use vectorization is out there; someone just
has to implement it in an Ada compiler. No need to change the language
itself. I do not know how good GCC 3.1 is at vectorizing instructions, but
there are some good examples of how to do this right (though I know of no
Ada compiler among them): Sun's C compiler and Intel's C compiler both do a
lot of vectorization on cleanly written code, if one asks them to.
--
Greetings
Matthias Kretschmer
* Re: Ada and vectorization
2002-06-18 8:24 ` Matthias Kretschmer
@ 2002-06-18 10:02 ` Dale Stanbrough
2002-06-18 16:21 ` Matthias Kretschmer
2002-06-18 17:46 ` Ted Dennison
0 siblings, 2 replies; 15+ messages in thread
From: Dale Stanbrough @ 2002-06-18 10:02 UTC (permalink / raw)
In article <aemqnr$grq$07$1@news.t-online.com>,
Matthias Kretschmer <schreib_mir_du_spacken@gmx.de> wrote:
> I think that today's compilers - their optimizers -
> are capable of finding the places where the vector unit, or whatever cool
> feature of your CPU, can be used, so I want them to decide and leave me
> alone to deal with the really important stuff. I do not want to know how
> many clock cycles operation A takes in unit B of CPU C. Think of what you
> would have to know just to get something done that other compilers do on
> their own without bothering the programmer.
It would be nice if we could let the compiler discover all of the
possible vectorisations. I've got no idea what the current state
of the art is in this respect; however, I would imagine that it would
still be -cheaper- to build a simple compiler that took hints or
directions from the programmer about possible vectorisation.
Does anyone have real info instead of my speculation?
Dale
* Re: Ada and vectorization
2002-06-18 10:02 ` Dale Stanbrough
@ 2002-06-18 16:21 ` Matthias Kretschmer
2002-06-18 19:13 ` Robert A Duff
2002-06-18 20:13 ` Guillaume Foliard
2002-06-18 17:46 ` Ted Dennison
1 sibling, 2 replies; 15+ messages in thread
From: Matthias Kretschmer @ 2002-06-18 16:21 UTC (permalink / raw)
Dale Stanbrough wrote:
> In article <aemqnr$grq$07$1@news.t-online.com>,
> Matthias Kretschmer <schreib_mir_du_spacken@gmx.de> wrote:
>
>> I think that today's compilers - their optimizers -
>> are capable of finding the places where the vector unit, or whatever
>> cool feature of your CPU, can be used, so I want them to decide and
>> leave me alone to deal with the really important stuff. I do not want
>> to know how many clock cycles operation A takes in unit B of CPU C.
>> Think of what you would have to know just to get something done that
>> other compilers do on their own without bothering the programmer.
>
> It would be nice if we could let the compiler discover all of the
> possible vectorisations. I've got no idea what the current state
> of the art is in this respect; however, I would imagine that it would
> still be -cheaper- to build a simple compiler that took hints or
> directions from the programmer about possible vectorisation.
Maybe cheaper, but let me cite Dijkstra: "Are you quite sure that all those
bells and whistles, all those wonderful facilities of your so-called
powerful programming languages belong to the solution set rather than to
the problem set?"
And this is the question we have to ask here, I think - and I am quite sure
that vectorization hints belong to the problem set...
And looking at the compiler-design people, they are doing a great job: what
a compiler does today is not comparable to what was possible (or available)
twenty years ago. There are definitely nice compiler implementations
available today that use all those nice features of your CPU - not all of
them, of course. But I don't think it is a solution to move all this logic
into the language, so that the programmer has to care about it; that just
makes the programmer do all the work over again (somehow reinventing the
wheel every time he writes code).
The other advantage is that old code can gain more performance without
changing one line of code, just by using a newer version of a compiler or
another compiler.
A language that makes every feature the architectures provide accessible
should, I think, be called an assembler, and has nothing to do with
abstracting away the programming of the underlying hardware. Do we really
want to implement every single feature CPU designers provide in the
language itself? Then we will have a very bloated, complex language which
will raise the difficulty of programming in it. And that is not the aim of
"higher programming languages". They should make things easy, or we could
all just use assembler. The reason why I personally use Ada is that it is
abstract - that I do not have to care about the hardware - and I think this
is the way it should be.
>
> Does anyone have real info instead of my speculation?
>
> Dale
--
Greetings
Matthias Kretschmer
* Re: Ada and vectorization
2002-06-18 10:02 ` Dale Stanbrough
2002-06-18 16:21 ` Matthias Kretschmer
@ 2002-06-18 17:46 ` Ted Dennison
1 sibling, 0 replies; 15+ messages in thread
From: Ted Dennison @ 2002-06-18 17:46 UTC (permalink / raw)
Dale Stanbrough <dstanbro@bigpond.net.au> wrote in message news:<dstanbro-16FC0C.20004918062002@news-server.bigpond.net.au>...
> It would be nice if we could let the compiler discover all of the
> possible vectorisations. I've got no idea what the current state
> of the art is in this respect; however, I would imagine that it would
> still be -cheaper- to build a simple compiler that took hints or
> directions from the programmer about possible vectorisation.
>
> Does anyone have real info instead of my speculation?
>
I took a graduate-level compiler optimization course a couple of years
ago that dealt almost entirely with this. Apparently most research
into compiler optimizations of this sort is done using Fortran,
as most of the folks who need that kind of number-crunching power have
Fortran code they want it done with.
Fortran's solution to this issue was to use the "hint" approach by
introducing new loop constructs (and a new dialect - HPF) for this. So
clearly they think it isn't feasible for normal Fortran. Actually, I
think the real issue may be that some operations will give different
results when done in parallel, and there has to be a way of saying
that's OK (or not OK) for your particular app.
So I really think the best way to do this in Ada would be via a pragma
on the loop name. The main drawback here is that if you put such code
through a compiler that doesn't support the pragma, then you may end
up with an incorrect calculation (in addition to a slower one).
* Re: Ada and vectorization
2002-06-18 16:21 ` Matthias Kretschmer
@ 2002-06-18 19:13 ` Robert A Duff
2002-06-18 20:12 ` Matthias Kretschmer
2002-06-18 20:13 ` Guillaume Foliard
1 sibling, 1 reply; 15+ messages in thread
From: Robert A Duff @ 2002-06-18 19:13 UTC (permalink / raw)
Matthias Kretschmer <schreib_mir_du_spacken@gmx.de> writes:
> maybe cheaper, but let me cite Dijkstra: "Are you quite sure that all those
> bells and whistles, all those wonderful facilities of your so-called
> powerful programming languages belong to the solution set rather than to
> the problem set?"
Buggy optimizers are part of my problem set, too.
You're probably right in this case, but surely in *some* cases, it is
appropriate to let the programmer give the compiler hints about how to
optimize. The compiler is still doing the error-prone part (deciding
whether the optimization is correct, and actually performing the
transformation). The programmer is merely suggesting that the
optimization is worthwhile.
- Bob
* Re: Ada and vectorization
2002-06-18 19:13 ` Robert A Duff
@ 2002-06-18 20:12 ` Matthias Kretschmer
2002-06-18 20:51 ` Guillaume Foliard
0 siblings, 1 reply; 15+ messages in thread
From: Matthias Kretschmer @ 2002-06-18 20:12 UTC (permalink / raw)
Robert A Duff wrote:
> Matthias Kretschmer <schreib_mir_du_spacken@gmx.de> writes:
>
>> maybe cheaper, but let me cite Dijkstra: "Are you quite sure that all
>> those bells and whistles, all those wonderful facilities of your
>> so-called powerful programming languages belong to the solution set
>> rather than to the problem set?"
>
> Buggy optimizers are part of my problem set, too.
Sure :) - but hopefully this won't happen. Then again, it could happen with
any piece of code in any language, if the optimizer/compiler is written
poorly.
>
> You're probably right in this case, but surely in *some* cases, it is
> appropriate to let the programmer give the compiler hints about how to
> optimize. The compiler is still doing the error-prone part (deciding
> whether the optimization is correct, and actually performing the
> transformation). The programmer is merely suggesting that the
> optimization is worthwhile.
>
> - Bob
Yes, but the compiler should use it even when the programmer isn't
suggesting it - look at pragma Inline (I do not know how GNAT, which I
currently use exclusively, handles this, but there are good examples, e.g.
Intel's C++ compiler, where not only the functions or procedures marked for
inlining get inlined for speed) - of course only where it is useful, which
means more performance (or whatever the speed-versus-size trade-off set by
the optimization flags dictates).
As suggested in this thread, using a pragma for loops only isn't enough, I
think (and it makes things complicated - bloating the language up), because
if you just think about something like:
a := a1*a2;
b := b1*b2;
c := c1*c2;
d := d1*d2;
wouldn't it be cool if this were vectorized? You may say: throw everything
into an array and then put it in a loop. But can't it happen that these a,
b, c and d aren't related, so that putting them together into one array
wouldn't be very wise?
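Staying in Ada, the four products could be grouped by hand into one array
operation that a vectorizing compiler might map to a single SIMD multiply;
as argued above, the grouping is artificial when a, b, c and d are
unrelated. All declarations in this sketch are illustrative placeholders:

```ada
procedure Demo is
   --  Placeholder operands standing in for the unrelated values.
   A1, A2, B1, B2, C1, C2, D1, D2 : Float := 1.0;
   A, B, C, D : Float;
   type F4 is array (1 .. 4) of Float;
   Lhs, Rhs, Result : F4;
begin
   Lhs := (A1, B1, C1, D1);
   Rhs := (A2, B2, C2, D2);
   --  A vectorizing compiler could turn this loop into one packed
   --  multiply instruction.
   for I in F4'Range loop
      Result (I) := Lhs (I) * Rhs (I);
   end loop;
   A := Result (1);  B := Result (2);
   C := Result (3);  D := Result (4);
end Demo;
```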
And these situations can be handled without the statements being placed
together in one procedure - think of the inter-procedural optimization
features of compilers like Sun's C compiler (sorry, I do not know much
about the available Ada compilers, so my examples are adapted from other
languages, but they do not depend on C itself), which are able to optimize
code and vectorize it where useful even if some fragments of the code are
written in other procedures/functions. This of course involves some
inlining, and the resulting code is quite useless if one wants to debug -
but who really cares how the code gets faster if one needs speed? :)
Btw., are there Ada compilers available (besides GCC 3.1 - yes, the backend
is capable of using the vector units of at least x86-based CPUs, as stated
on gcc.gnu.org) which currently use vectorization and/or inter-procedural
optimization?
--
Greetings
Matthias Kretschmer
* Re: Ada and vectorization
2002-06-18 16:21 ` Matthias Kretschmer
2002-06-18 19:13 ` Robert A Duff
@ 2002-06-18 20:13 ` Guillaume Foliard
1 sibling, 0 replies; 15+ messages in thread
From: Guillaume Foliard @ 2002-06-18 20:13 UTC (permalink / raw)
Matthias Kretschmer wrote:
> Having all features architectures provide accessable through a language I
> think should be called assembler and has nothing to do with abstraction of
> the programming of the underlying hardware. Do we really want to implement
> every single feature cpu-designers provide in the language itself?
SIMD does not seem like a CPU designer's pet feature to me, but rather a
different approach to problem solving...
> Then we
> will have some very bloated, complex language which will raise the
> difficult level of programming in this language. And this is not the aim
> of "higher programming languages". They should make it easy, or we all
> could just use assembler. The reason why I personally use Ada is, that it
> is abstract, not that I have to care about the hardware and I think this
> is the way it should be.
I agree with you on this. That's why I was initially wondering if we could
find a way to abstract "vectorization processing". After having read the
Intel document on Pentium 4 optimizations, I don't think compilers can
automagically perform really efficient vectorization, as true
vectorization implies algorithms different from the ones used in
sequential processing, and most importantly a different data layout. Can
modern compiler technology deal with data layout and automatically choose
the best one (switch between Arrays of Structs and Structs of Arrays, for
instance, an operation called "data swizzling" (1))? Okay, you may get
some optimized loops, but a sequential algorithm, by its very nature, may
not be easily transformed into an efficient parallel one by a compiler.
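The two layouts being contrasted can both be expressed in Ada 95; the type
names in this sketch are illustrative:

```ada
--  Array of Structs: the natural abstraction, but the X, Y and Z of
--  consecutive points are interleaved in memory, which is awkward for
--  4-wide SIMD loads.
type Point is record
   X, Y, Z : Float;
end record;
type Points_AoS is array (Positive range <>) of Point;

--  Struct of Arrays: the same data "swizzled" so that each component
--  is contiguous; four consecutive X values can be loaded in one go.
type Float_Array is array (Positive range <>) of Float;
type Points_SoA (N : Positive) is record
   X : Float_Array (1 .. N);
   Y : Float_Array (1 .. N);
   Z : Float_Array (1 .. N);
end record;
```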
If abstracting vectorization turns out to be infeasible, we should not shy
away from introducing some low-level features anyway. After all, there is
some quite low-level stuff in Annex B.2 of the Ada 95 RM (2). Bitwise
operations can really be helpful sometimes. So can SIMD instructions.
(1)
http://www.google.fr/search?hl=fr&q=Data+Swizzling&btnG=Recherche+Google&meta=
(2)
http://www.adahome.com/rm95/rm9x-B-02.html#6
* Re: Ada and vectorization
2002-06-18 20:12 ` Matthias Kretschmer
@ 2002-06-18 20:51 ` Guillaume Foliard
2002-06-19 4:28 ` Matthias Kretschmer
0 siblings, 1 reply; 15+ messages in thread
From: Guillaume Foliard @ 2002-06-18 20:51 UTC (permalink / raw)
Matthias Kretschmer wrote:
> As suggested in this thread, using a pragma for loops only isn't enough,
> I think (and it makes things complicated - bloating the language up),
> because if you just think about something like:
> a := a1*a2;
> b := b1*b2;
> c := c1*c2;
> d := d1*d2;
> wouldn't it be cool if this were vectorized? You may say: throw
> everything into an array and then put it in a loop. But can't it happen
> that these a, b, c and d aren't related, so that putting them together
> into one array wouldn't be very wise?
Even if they are not related from a semantic point of view, they are from a
computational point of view. For the sake of performance - if performance
matters, of course - why shouldn't we lay out data in an efficient manner?
This does not break the data abstraction, just the layout.
> Btw. are there Ada compilers available (beside gcc 3.1 - yes the backend
> is capable of using the vector units of at least x86-based cpus as stated
> on gcc.gnu.org) which currently use vectorization and/or inter-procedure
> optimization?
Just a clarification here: GCC 3.1 does not vectorize, it just uses the
vector unit in a scalar manner, as a faster x87 FPU.
Have you got any links discussing "inter-procedural optimization"?
* Re: Ada and vectorization
2002-06-18 20:51 ` Guillaume Foliard
@ 2002-06-19 4:28 ` Matthias Kretschmer
0 siblings, 0 replies; 15+ messages in thread
From: Matthias Kretschmer @ 2002-06-19 4:28 UTC (permalink / raw)
Guillaume Foliard wrote:
> Matthias Kretschmer wrote:
>
>> As suggested in this thread, using a pragma for loops only isn't
>> enough, I think (and it makes things complicated - bloating the
>> language up), because if you just think about something like:
>> a := a1*a2;
>> b := b1*b2;
>> c := c1*c2;
>> d := d1*d2;
>> wouldn't it be cool if this were vectorized? You may say: throw
>> everything into an array and then put it in a loop. But can't it happen
>> that these a, b, c and d aren't related, so that putting them together
>> into one array wouldn't be very wise?
>
> Even if they are not related from a semantic point of view, they are
> from a computational point of view. For the sake of performance - if
> performance matters, of course - why shouldn't we lay out data in an
> efficient manner? This does not break the data abstraction, just the
> layout.
Well, I consider this very ugly. I think - and just looking at some
compilers, I don't feel unwise or stupid for it - that the compiler should
care about how to rearrange the data so it runs fast. Do we always want to
read those nice optimization manuals for every new CPU that comes out? I do
not want to - for C I can just wait until a new version of icc is out, and
as if by magic the same code runs much faster on the new CPU (as it was
with the P4, and before that with the P3, and so on...).
>
>> Btw. are there Ada compilers available (beside gcc 3.1 - yes the backend
>> is capable of using the vector units of at least x86-based cpus as stated
>> on gcc.gnu.org) which currently use vectorization and/or inter-procedure
>> optimization?
>
> Just a precision here, GCC 3.1 does not vectorize, it just uses the vector
> unit in a scalar manner as a faster x87 FPU.
> Have you got any links talking about "inter-procedure optimization" ?
Ah, OK - then I got something wrong. But this is of course available in
other compilers...
For the last point, just look at the icc documents - AFAIK it uses the same
technique; I didn't find useful abstract information about this stuff :(
The Intel C manual itself contains a short abstract of what is done with
these optimizations enabled (btw., inter-procedural optimization works
across module borders - it would be nice to have it for Ada across package
borders, too).
Btw., referring to your other post: I know that it isn't really trivial to
transform a sequential program into a parallel one, but why should going
back one abstraction level be the right step? Maybe, if it becomes a
problem for compiler design, we should try to find some other solution,
even if we lose Ada on the way...
On the other hand, the parallelization of code is done in CPUs today - they
rearrange the code so it can be executed in parallel; nothing else has to
be done now in the compiler.
--
Greetings
Matthias Kretschmer
Thread overview: 15+ messages
2002-06-16 9:56 Ada and vectorization Guillaume Foliard
2002-06-16 12:50 ` Dale Stanbrough
2002-06-16 20:07 ` Matthias Kretschmer
2002-06-16 22:38 ` Robert A Duff
2002-06-18 8:24 ` Matthias Kretschmer
2002-06-18 10:02 ` Dale Stanbrough
2002-06-18 16:21 ` Matthias Kretschmer
2002-06-18 19:13 ` Robert A Duff
2002-06-18 20:12 ` Matthias Kretschmer
2002-06-18 20:51 ` Guillaume Foliard
2002-06-19 4:28 ` Matthias Kretschmer
2002-06-18 20:13 ` Guillaume Foliard
2002-06-18 17:46 ` Ted Dennison
2002-06-16 22:45 ` Ted Dennison
2002-06-17 23:47 ` Robert I. Eachus