* Tasking performance between Ada83 and Ada95
From: Mike Rose @ 1997-06-06 0:00 UTC

I am comparing tasking performance between Ada83 and Ada95 using the Tasking
Benchmarks written by Thomas Burger from the PAL.

The compilers I'm using are GNAT v3.07 for Ada95 and Alsys Adaworld v5.5.4
for Ada83. The operating system is HP-UX v10.10.

Each test was run with the creation of 500 tasks.

In comparing the results between the two compilers, I found that tasking
performance is much slower with GNAT than with Alsys: every test was at least
10 times slower, and some were much more.

Our software depends heavily on tasking. Is there any way to improve the
tasking performance with GNAT?

--
Mike Rose
NSWC/DD
Phone: 540-653-4753
Email: mrose@nswc.navy.mil
Disclaimer: The preceding message was brought to you via myself and in no way
reflects the ideas or wishes of the U.S. Navy or the DOD.
* Re: Tasking performance between Ada83 and Ada95
From: Robert Dewar @ 1997-06-07 0:00 UTC

Mike Rose says

<<In comparing the results between the two compilers, I found that the tasking
performance is much slower with GNAT than with Alsys, every test was at least
10 times slower and some were much more.

Our software depends heavily on tasking. Is there any way to improve the
tasking performance with GNAT ?
>>

You have to be careful to know exactly what you are comparing. In particular,
there is no question that Ada 95 does impose some additional semantic
constraints and features (e.g. requeue and ATC) that result in distributed
implementation costs. Often, for example, the proper comparison is between
a task in Ada 83 and a protected type in Ada 95.

You also have to decide carefully what features you are testing. In looking
at, for example, the timings on SGI between VADS and GNAT on the PIWG tasking
benchmarks, we certainly do not see a factor of 10 difference in performance,
and in some comparative benchmarks on timing performance for tasking, we see
GNAT running faster than an Ada 83 compiler doing similar things.

The other point is that you have to be very careful that you are in fact
looking at comparable situations. For example, comparing a GNAT compiler
where tasks are mapped to operating system threads with an Ada 83 compiler
where the tasking maps to a single process and is handled in user mode is
of course a completely meaningless comparison.

One answer to your question, if you are making this kind of apples/oranges
comparison, is to use a similar kernel (e.g. FSU threads on GNAT). We find
on many targets that the use of FSU threads is MUCH more efficient than the
use of operating system tasks.

A more focused reply is possible if you tell us exactly what is being
compared (what machines, what compilers, what thread packages).

Robert Dewar
Ada Core Technologies
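To make the apples/oranges point concrete, here is a minimal Python sketch (not Ada, and not the PAL benchmark) that creates the same number of tasks two ways: as OS-level threads, and as user-level tasks switched by a loop inside a single thread. The function names are made up for illustration, and whatever timings it prints say nothing about GNAT or Alsys; it only shows that the two models do different kinds of work per task, so raw numbers from one cannot be set against the other without knowing which model each compiler uses.

```python
import threading
import time
from collections import deque

TASK_COUNT = 500  # the task count mentioned in the original post


def run_os_threads(count):
    """Create `count` OS-level threads, each doing one trivial unit of work."""
    done = []
    lock = threading.Lock()

    def body(i):
        with lock:
            done.append(i)

    threads = [threading.Thread(target=body, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(done)


def run_user_level_tasks(count):
    """Run `count` generator-based tasks under a single-threaded round-robin loop."""
    done = []

    def body(i):
        yield  # one voluntary switch back to the scheduler loop
        done.append(i)

    ready = deque(body(i) for i in range(count))
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished, do not requeue it
        ready.append(task)
    return len(done)


def timed(fn, count):
    """Return (tasks completed, elapsed seconds) for one run of `fn`."""
    start = time.perf_counter()
    completed = fn(count)
    return completed, time.perf_counter() - start


if __name__ == "__main__":
    for fn in (run_os_threads, run_user_level_tasks):
        completed, seconds = timed(fn, TASK_COUNT)
        print(f"{fn.__name__}: {completed} tasks in {seconds:.4f}s")
```

Both functions complete the same number of tasks; only the mechanism underneath differs, which is exactly the thing a benchmark report needs to state.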
* Re: Tasking performance between Ada83 and Ada95
From: Edmond Walsh @ 1997-06-08 0:00 UTC

In article <dewar.865693453@merv>, Robert Dewar <dewar@merv.cs.nyu.edu> writes
>Mike Rose says
>
><<In comparing the results between the two compilers, I found that the tasking
>performance is much slower with GNAT than with Alsys, every test was at least
>10 times slower and some were much more.
>
>Our software depends heavily on tasking. Is there any way to improve the
>tasking performance with GNAT ?
>>>
>
>You have to be careful to know exactly what you are comparing. In particular,
>there is no question that Ada 95 does impose some additional semantic
>constraints and features (e.g. requeue and ATC) that result in distributed
>implementation costs. Often for example, the proper comparison is between
>a task in Ada 83 and a protected type in Ada 95.
>
>You also have to decide what features you are testing carefully. In looking
>at, for example, the timings on SGI between VADS and GNAT on the PIWG tasking
>benchmarks, we certainly do not see a factor of 10 difference in performance,
>and in some comparative benchmarks on timing performance for tasking, we see
>GNAT running faster than an Ada 83 compiler doing similar things.
>
>The other point is that you have to be very careful that you are in fact
>looking at comparable situations. For example, comparing a GNAT compiler
>where tasks are mapped to operating systems threads, with an Ada 83
>compiler where the tasking maps to a single process and is handled in
>user mode is of course a completely meaningless comparison.
>
>One answer to your question if you are making this kind of apples/oranges
>comparison is to use a similar kernel (e.g. FSU threads on GNAT). We find
>on many targets that the use of FSU threads is MUCH more efficient than the
>use of operating systems tasks.
>
>A more focussed reply is possible if you tell us exactly what is being
>compared (what machines, what compilers, what thread packages).
>
>Robert Dewar
>Ada Core Technologies
>

We had a similar problem when moving some Ada 83 code running on a
Harris NightHawk to Ada 95 (Gnat) on an SG. It took a lot of effort to
get the code running reasonably on the SG. The underlying problem was
the mapping of the Ada Tasks to Unix threads. The Harris (now
Concurrent) system was very good, reflecting the Real Time background of
Harris. (I was not involved in the porting, I was just an interested
observer.)
--
Edmond Walsh
* Re: Tasking performance between Ada83 and Ada95
From: Robert Dewar @ 1997-06-09 0:00 UTC

Edmond Walsh said

<<We had a similar problem when moving some Ada 83 code running on a
Harris NightHawk to Ada 95 (Gnat) on an SG. It took a lot of effort to
get the code running reasonably on the SG. The underlying problem was
the mapping of the Ada Tasks to Unix threads. The Harris (now
Concurrent) system was very good, reflecting the Real Time background of
Harris. (I was not involved in the porting, I was just an interested
observer.)>>

Of course, the mapping of Ada tasks to Unix threads is certainly a *good
thing* if you need to take advantage of the flexibility of this mapping.
For example, if you are using one of SGI's high end MP's, then you
definitely want this mapping. But there certainly is an efficiency
penalty to be paid.

Actually, from your post it is not quite clear what exactly you are referring
to by "lot of effort" and "underlying problem" here. Were there problems
other than efficiency? Sometimes, especially in Ada 83 programs, where the
dispatching semantics were not defined, programs make assumptions about the
dispatching that are non-portable. This is avoided in Ada 95 if you are using
a compiler that implements full Annex D semantics (true of the SGI compiler
for example), but that does not necessarily help the porting of legacy code.

Robert Dewar
Ada Core Technologies
* Re: Tasking performance between Ada83 and Ada95
From: Edmond Walsh @ 1997-06-15 0:00 UTC

In article <dewar.865870961@merv>, Robert Dewar <dewar@merv.cs.nyu.edu> writes
>Edmond Walsh said
>
><<We had a similar problem when moving some Ada 83 code running on a
>Harris NightHawk to Ada 95 (Gnat) on an SG. It took a lot of effort to
>get the code running reasonably on the SG. The underlying problem was
>the mapping of the Ada Tasks to Unix threads. The Harris (now
>Concurrent) system was very good, reflecting the Real Time background of
>Harris. (I was not involved in the porting, I was just an interested
>observer.)>>
>
>Of course, the mapping of Ada tasks to Unix threads is certainly a *good
>thing* if you need to take advantage of the flexibility of this mapping.
>For example, if you are using one of SGI's high end MP's, then you
>definitely want this mapping. But there certainly is an efficiency
>penalty to be paid.
>
>Actually from your post it is not quite clear what exactly you are referring
>to in "lot of effort" and "underlying problem" here. Were there problems
>other than efficiency? Sometimes, especially in Ada 83 programs, where the
>dispatching semantics were not defined, programs make assumptions about the
>dispatching that are non-portable. This is avoided in Ada 95 if you are using
>a compiler that implements full Annex D semantics (true of the SGI compiler
>for example), but that does not necessarily help the porting of legacy code.
>
>Robert Dewar
>Ada Core Technologies
>

Efficiency was a significant problem. The program ran correctly in Ada
95 on the SG Indy with each task being executed as a separate unix
process. However, this ran rather slowly because of the large overheads
in context switching unix processes.

In the original Nighthawk '83 version the two main components
(consisting of many tasks) each ran as a separate unix process, and the
individual tasks in the components were controlled by the Ada run time
executive. The blocking of one task due to inter (unix) process
communications did not block the entire component.

When this scheme was ported to the SG Indy it was found that the
blocking of a task due to inter (unix) process communications did block
the entire component. Working around this problem to achieve reasonable
run time and correct operation was what caused the trouble.
--
Edmond Walsh
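The blocking behaviour described for the SG Indy port can be reproduced in miniature. The following Python sketch assumes a purely user-level, single-threaded run-time executive; `io_task`, `compute_task` and `run_component` are hypothetical names, and `time.sleep` merely stands in for a blocking inter-process read. Because the executive cannot regain control during the blocking call, no other task in the component runs until it returns.

```python
import time
from collections import deque


def io_task(log):
    """A task that makes a blocking call before its first switch point."""
    log.append("io: start blocking call")
    time.sleep(0.01)  # stands in for a blocking inter-process read
    log.append("io: blocking call returned")
    yield
    log.append("io: done")


def compute_task(log, steps):
    """A task that does `steps` units of work, yielding after each one."""
    for n in range(steps):
        log.append(f"compute: step {n}")
        yield


def run_component(tasks):
    """Single-threaded round-robin executive for one component."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)  # runs until the task yields, finishes, or blocks
        except StopIteration:
            continue
        ready.append(task)


def demo():
    log = []
    run_component([io_task(log), compute_task(log, 3)])
    return log


if __name__ == "__main__":
    for line in demo():
        print(line)
```

In the log, the two "io" entries around the blocking call are always adjacent: the compute task gets no chance to run in between, which is the whole-component stall described above.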
* Re: Tasking performance between Ada83 and Ada95
From: Robert Dewar @ 1997-06-15 0:00 UTC

Edmond Walsh said

<<Efficiency was a significant problem. The program ran correctly in Ada
95 on the SG Indy with each task being executed as a separate unix
process. However this ran rather slowly because of the large overheads
in context switching unix processes.

In the original Nighthawk '83 version the two main components
(consisting of many tasks) each ran as a separate unix process and the
individual tasks in the components were controlled by the Ada run time
executive. The blocking of one task due to inter (unix) process
communications did not block the entire component.

When this scheme was ported to the SG Indy it was found that the
blocking of a task due to inter (unix) process communications did block
the entire component. Working around this problem to achieve reasonable
run time and correct operation was what caused the trouble.
>>

You mean separate unix thread, rather than separate unix process, I think.
At least there is certainly no need to use separate processes for each task.
Nevertheless that can indeed introduce extra overhead.

In the Ada 83 world on monoprocessors, the use of a special Ada exec for
task switching on top of an OS often made sense, and this is exactly what
we get when we port the FSU threads. The advantage of the FSU threads is
high efficiency and exact semantic accuracy.

However, these days, more and more work is done on multi-processors, and
then of course you have no choice if you want to distribute tasks across
processors other than to use the system level threads. Furthermore, the
efficiency hit from operating these threads on separate processors may
indeed be significant.

No obvious solution here ... that's why the best approach seems to be to
provide a choice of threads libraries on machines where it makes sense.
In version 3.10 of GNAT, we are providing that choice for Solaris and for
Linux. We may do it for additional ports as we go along.

So far we did not port FSU threads to SGI (one of the motives for doing so,
accuracy, does not apply, since the SGI threads implementation is exactly
correct for Ada, one of the advantages of having the vendor have an
interest and stake in Ada!) But the efficiency issue might still apply.

Note that on the SGI implementation of GNAT, there are controls over how
tasks are distributed among processes, and it is worthwhile tuning these
right, especially when using a multi-processor machine.
* Re: Tasking performance between Ada83 and Ada95
From: Tom Moran @ 1997-06-15 0:00 UTC

> Note that on the SGI implementation of GNAT, there are controls over how
> tasks are distributed among processes, and it is worth while tuning these
> right

Some years ago at a local (SF Bay Area) SIGADA meeting the speaker
discussed a system where you could map N Ada tasks to M threads. It was
probably on a Sun, but I could misremember.
* Re: Tasking performance between Ada83 and Ada95 1997-06-15 0:00 ` Robert Dewar 1997-06-15 0:00 ` Tom Moran @ 1997-06-16 0:00 ` Robert A Duff 1997-06-17 0:00 ` Robert Dewar 1997-06-22 0:00 ` Geert Bosch 2 siblings, 1 reply; 20+ messages in thread From: Robert A Duff @ 1997-06-16 0:00 UTC (permalink / raw) In article <dewar.866376446@merv>, Robert Dewar <dewar@merv.cs.nyu.edu> wrote: >However, these days, more and more work is done on multi-processors, and >then of course you have no choice if you want to distribute tasks across >processors other than to use the system level threads. Furthermore, the >efficiency hit from operating these threads on separate processors may >indeed be significant. Are you saying it's impossible to write an Ada run-time system that does parallelism without using operating system threads (on a Unix-like OS)? That's not true -- I've done it. I wrote an Ada 83 RTS that ran on a parallel shared-memory machine where the OS had no threads support. So we created one Unix process per CPU (you need a way to know how many CPUs there are). Almost all memory in all the processes was mapped to the same place. And then the task scheduler decided which task to run in which process according to the usual priority rules. Of course, non-blocking I/O required writing Text_IO (etc) in terms of system calls that don't block, but interrupt when the I/O is done. - Bob ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Tasking performance between Ada83 and Ada95
From: Robert Dewar @ 1997-06-17 0:00 UTC

<<Are you saying it's impossible to write an Ada run-time system that does
parallelism without using operating system threads (on a Unix-like OS)?
That's not true -- I've done it. I wrote an Ada 83 RTS that ran on a
parallel shared-memory machine where the OS had no threads support. So
we created one Unix process per CPU (you need a way to know how many
CPUs there are). Almost all memory in all the processes was mapped to
the same place. And then the task scheduler decided which task to run
in which process according to the usual priority rules. Of course,
non-blocking I/O required writing Text_IO (etc) in terms of system calls
that don't block, but interrupt when the I/O is done.>>

No one said that this is impossible. It is of course possible, but it does
not seem to be a particularly useful approach in practice on modern day
operating systems and hardware. Or, to put it another way, we have not found
even one customer interested in this kind of simulation. All our customers
using multi-processors definitely want Ada to use the underlying operating
system threads, since this gives them all kinds of capabilities, like
assigning threads to processors, dealing with thread hierarchies, special
kinds of inter-thread synchronization, debugging tools that know about the
threads, performance analyzers that know about the threads, etc.
* Re: Tasking performance between Ada83 and Ada95
From: Geert Bosch @ 1997-06-22 0:00 UTC

Robert Dewar <dewar@merv.cs.nyu.edu> wrote:

   However, these days, more and more work is done on multi-processors, and
   then of course you have no choice if you want to distribute tasks across
   processors other than to use the system level threads. Furthermore, the
   efficiency hit from operating these threads on separate processors may
   indeed be significant.

IMO the best solution would be to start X system level threads and
implement a user-level threads package on top of it. Of course
there will be a little extra need for locking, but on platforms
suitable for multi-processing there exist CPU-instructions that
make the implementation of fast locks possible.

The interesting thing of course is that you can vary the number of
system threads to customize the task model to the application.
When N is the number of processors, interesting values of X are:

  * 1, to limit the program to use 1 processor and no system
    context-switching
  * N, to achieve full multi-processing on a multi-processor
  * M, where M > N, to simulate a multi-processor with M processors
    on a system with N processors (N = 1 for a uni-processor).

This scheme could combine the advantages of user-level threads (fast
context switches, fast priority changes and correct Ada semantics)
with those of system-level threads (non-blocking system-calls and
multi-processing).

Regards,
   Geert
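The scheme proposed above (X system-level threads with a user-level threads package on top) can be sketched in Python. This is an illustration under stated assumptions, not any compiler's run-time: user-level tasks are modelled as generators, `run_tasks` and `user_task` are hypothetical names, and the "fast lock" is simply the one inside `queue.Queue`. In standard CPython the global interpreter lock also prevents the workers from executing Python code truly in parallel, so the sketch shows the structure of the scheme rather than its performance.

```python
import queue
import threading


def user_task(name, steps, finished, finished_lock):
    """A user-level task: a generator that yields at each switch point."""
    for _ in range(steps):
        yield
    with finished_lock:
        finished.append(name)


def run_tasks(tasks, system_threads):
    """Run generator tasks on `system_threads` OS threads (the X above)."""
    ready = queue.Queue()  # shared run queue, internally locked
    for task in tasks:
        ready.put(task)

    def worker():
        while True:
            try:
                task = ready.get_nowait()
            except queue.Empty:
                # Nothing runnable right now, so this worker retires. Any task
                # currently held by another worker is requeued and picked up
                # again by that worker, so no task is lost.
                return
            try:
                next(task)  # run the task up to its next switch point
            except StopIteration:
                continue  # task finished
            ready.put(task)  # still runnable: any worker may resume it

    workers = [threading.Thread(target=worker) for _ in range(system_threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()


def demo(system_threads, task_count=8):
    finished = []
    lock = threading.Lock()
    tasks = [user_task(f"task-{i}", 3, finished, lock) for i in range(task_count)]
    run_tasks(tasks, system_threads)
    return sorted(finished)


if __name__ == "__main__":
    for x in (1, 2, 4):
        print(x, "system threads:", demo(x))
```

Calling `demo(1)` corresponds to the X = 1 case in the list above, while larger values spread the same user-level tasks over more system threads without changing the tasks themselves.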
* Re: Tasking performance between Ada83 and Ada95
From: Robert Dewar @ 1997-06-23 0:00 UTC

Geert Bosch said

<<This scheme could combine the advantages of user-level threads (fast
context switches, fast priority changes and correct Ada semantics)
with those of system-level threads (non-blocking system-calls and
multi-processing).
>>

Quite a reasonable scheme, and in fact already implemented in some versions
of GNAT. For example, on the SGI, you have these two levels of support
of threads at the system level, and you can distribute tasks among
execution vehicles (the new SGI terminology for such gizmos) as you wish.

Presumably, though I have not looked in detail, the threads and fibres
of NT give a similar capability.

It is better if the two kinds of threads are implemented at a common level
and not entirely independently, since that places the abstractions at the
right level, and makes sure that such issues as priority are handled
consistently.
* Re: Tasking performance between Ada83 and Ada95 1997-06-22 0:00 ` Geert Bosch 1997-06-23 0:00 ` Robert Dewar @ 1997-06-23 0:00 ` Larry Kilgallen 1997-06-25 0:00 ` Fergus Henderson 1 sibling, 1 reply; 20+ messages in thread From: Larry Kilgallen @ 1997-06-23 0:00 UTC (permalink / raw) In article <5oir0v$mgu$1@gonzo.sun3.iaf.nl>, Geert Bosch <geert@gonzo.sun3.iaf.nl> writes: > IMO the best solution would be to start X system level threads and > implement a user-level threads package on top of it. Of course > there will be a little extra need for locking, but on platforms > suitable for multi-processing there exist CPU-instructions that > make the implementation of fast locks possible. That is the method used by Alpha VMS. The kernel thread primitives in fact are not documented for public consumption. The documented interface is the DECthreads library (which has a couple different APIs matching varying styles and standards). DECthreads creates the user-mode lightweight threads which then get scheduled onto some number of kernel threads (typically numbering on the order of the number of CPUs). Larry Kilgallen ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Tasking performance between Ada83 and Ada95
From: Fergus Henderson @ 1997-06-25 0:00 UTC

kilgallen@eisner.decus.org (Larry Kilgallen) writes:

>Geert Bosch <geert@gonzo.sun3.iaf.nl> writes:
>
>> IMO the best solution would be to start X system level threads and
>> implement a user-level threads package on top of it.
>
>That is the method used by Alpha VMS. The kernel thread primitives
>in fact are not documented for public consumption. The documented
>interface is the DECthreads library (which has a couple different
>APIs matching varying styles and standards). DECthreads creates the
>user-mode lightweight threads which then get scheduled onto some
>number of kernel threads (typically numbering on the order of the
>number of CPUs).

What happens with blocking I/O? Does DECthreads implement blocking
I/O operations using asynchronous kernel I/O operations, so that it
will avoid blocking a whole kernel thread when all that really needs
to be blocked is a user thread?

If not, then choosing the right number of kernel threads may be a bit
tricky -- you want one for each CPU, to get multiprocessing, but
you also need one for every user-level thread that might get blocked
on I/O, so that your background compute threads don't get paused
when your foreground threads are doing blocking I/O.

--
Fergus Henderson <fjh@cs.mu.oz.au>  |  "I have always known that the pursuit
WWW: <http://www.cs.mu.oz.au/~fjh>  |  of excellence is a lethal habit"
PGP: finger fjh@128.250.37.3        |     -- the last words of T. S. Garp.
* Re: Tasking performance between Ada83 and Ada95
From: Larry Kilgallen @ 1997-06-25 0:00 UTC

In article <5oq70f$4f@mulga.cs.mu.OZ.AU>, fjh@mundook.cs.mu.OZ.AU (Fergus Henderson) writes:
> kilgallen@eisner.decus.org (Larry Kilgallen) writes:
>
>>Geert Bosch <geert@gonzo.sun3.iaf.nl> writes:
>>
>>> IMO the best solution would be to start X system level threads and
>>> implement a user-level threads package on top of it.
>>
>>That is the method used by Alpha VMS. The kernel thread primitives
>>in fact are not documented for public consumption. The documented
>>interface is the DECthreads library (which has a couple different
>>APIs matching varying styles and standards). DECthreads creates the
>>user-mode lightweight threads which then get scheduled onto some
>>number of kernel threads (typically numbering on the order of the
>>number of CPUs).
>
> What happens with blocking I/O? Does DECthreads implement blocking
> I/O operations using asynchronous kernel I/O operations, so that it
> will avoid blocking a whole kernel thread when all that really needs
> to be blocked is a user thread?

Well, DECthreads does not actually implement the I/O operations itself,
or else they might be tying the DECthreads I/O support to a particular
language, and they might make the wrong choice :-)

DECthreads has its hooks into the System Service Dispatcher of the
base OS so that DECthreads will get an "upcall" invocation when I/O
is about to block. DECthreads then has the opportunity to switch
which lightweight thread is active on a given kernel thread before
the kernel thread "really" stalls. If DECthreads is smart enough
to choose a lightweight thread which is not blocked :-), there is
no stall.

That "upcall" mechanism can even work when there is only a single
kernel thread. In the future it might be enabled also on VAX, where
kernel threads are not available (moral equivalent of having a single
kernel thread), although that does not affect DEC Ada directly, since
on VAX DEC Ada does not use DECthreads.

On VAX, DEC Ada Pragma TIME_SLICE does a less timely job of preventing
stalled threads from blocking the whole process by using a timer AST
(timeliness under programmer control for a price (overhead)).

Both the AST mechanism and the upcall mechanism only work when the
wait is in user mode, since that is the only case in which the state
of the process is well-known. Most System Service waits are in user mode
these days, although those who used VMS V1 may remember programs from
which one could not escape via CTRL/Y or CTRL/C, often due to System
Services which waited in inner modes.

> If not, then choosing the right number of kernel threads may be a bit
> tricky -- you want one for each CPU, to get multiprocessing, but
> you also need one for every user-level thread that might get blocked
> on I/O, so that you background compute threads don't get paused
> when your foreground threads are doing blocking I/O.

But such an implementation would still get a marketing "checkoff"
for supporting "kernel threads" :-)

Larry Kilgallen
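The "upcall" idea described above can be sketched in a few lines of Python. This is not the DECthreads API or any VMS interface: `WouldBlock`, `run` and the task names are hypothetical, there is a single simulated kernel thread, and I/O completion is simulated by counting scheduler iterations rather than by real I/O. The point is only the control flow: a lightweight thread that is about to wait hands control back to the scheduler, which parks it and runs another lightweight thread that is not blocked.

```python
from collections import deque


class WouldBlock:
    """Yielded by a task that is about to wait for I/O (hypothetical upcall)."""

    def __init__(self, ticks):
        self.ticks = ticks  # scheduler iterations until the simulated I/O completes


def io_task(log):
    log.append("io: issue request")
    yield WouldBlock(ticks=2)  # the scheduler is told before the wait begins
    log.append("io: request complete")


def compute_task(log, steps):
    for n in range(steps):
        log.append(f"compute: step {n}")
        yield None  # ordinary switch point, task stays runnable


def run(tasks):
    """Schedule lightweight threads on one simulated kernel thread."""
    ready = deque(tasks)
    parked = []  # [remaining_ticks, task] pairs waiting on simulated I/O
    while ready or parked:
        # Advance simulated I/O and wake any task whose wait has finished.
        for entry in parked:
            entry[0] -= 1
        woken = [entry for entry in parked if entry[0] <= 0]
        parked = [entry for entry in parked if entry[0] > 0]
        ready.extend(task for _, task in woken)
        if not ready:
            continue  # every task is waiting; a real kernel thread would stall here
        task = ready.popleft()
        try:
            request = next(task)
        except StopIteration:
            continue  # task finished
        if isinstance(request, WouldBlock):
            parked.append([request.ticks, task])  # park it, run someone else
        else:
            ready.append(task)


def demo():
    log = []
    run([io_task(log), compute_task(log, 3)])
    return log


if __name__ == "__main__":
    for line in demo():
        print(line)
```

Here the compute task keeps making progress between "io: issue request" and "io: request complete", so the single kernel thread never stalls while any lightweight thread is runnable.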
* Re: Tasking performance between Ada83 and Ada95 1997-06-06 0:00 Tasking performance between Ada83 and Ada95 Mike Rose 1997-06-07 0:00 ` Robert Dewar @ 1997-06-07 0:00 ` Robert A Duff 1997-06-08 0:00 ` Robert Dewar 1997-06-07 0:00 ` jim hopper 2 siblings, 1 reply; 20+ messages in thread From: Robert A Duff @ 1997-06-07 0:00 UTC (permalink / raw) In article <1997Jun6.115223.7384@relay.nswc.navy.mil>, Mike Rose <user@machine.domain> wrote: >In comparing the results between the two compilers, I found that the tasking >performance is much slower with GNAT than with Alsys, every test was at least >10 times slower and some were much more. > >Our software depends heavily on tasking. Is there any way to improve the >tasking performance with GNAT ? Probably the problem is that GNAT uses the tasking (threads) of the underlying OS, and what you're measuring is the poor performance of those underlying systems. Often, the issue is that you're doing system calls to lock things and so forth -- and system calls take a l-o-o-o-n-g time. I don't think it's all that hard to make it fast -- change the RTS to do all its own task dispatching and whatnot, and never never do a system call in time-critical operations. But then you lose something: you have to jump through hoops to do asynchronous I/O, for example. Also, the O/S doesn't know about your tasks, so it won't know how to schedule your tasks in relation to tasks (threads) in other unrelated programs. If a page fault happens, and you're not using OS threads, your whole program will wait for disk I/O. A real-time program might not care about these things, however. - Bob ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Tasking performance between Ada83 and Ada95 1997-06-07 0:00 ` Robert A Duff @ 1997-06-08 0:00 ` Robert Dewar 1997-06-10 0:00 ` Jon S Anthony 1997-06-10 0:00 ` PascMartin 0 siblings, 2 replies; 20+ messages in thread From: Robert Dewar @ 1997-06-08 0:00 UTC (permalink / raw) Bob Duff said <<Probably the problem is that GNAT uses the tasking (threads) of the underlying OS, and what you're measuring is the poor performance of those underlying systems. Often, the issue is that you're doing system calls to lock things and so forth -- and system calls take a l-o-o-o-n-g time. I don't think it's all that hard to make it fast -- change the RTS to do all its own task dispatching and whatnot, and never never do a system call in time-critical operations. But then you lose something: you have to jump through hoops to do asynchronous I/O, for example. Also, the O/S doesn't know about your tasks, so it won't know how to schedule your tasks in relation to tasks (threads) in other unrelated programs. If a page fault happens, and you're not using OS threads, your whole program will wait for disk I/O. A real-time program might not care about these things, however.>> That's my guess too. I very much doubt that the Alsys World compiler on HPUX used system level threads, and I also guess that the version of GNAT on HPUX is probably using DCE threads. So the comparison is completely meaningless. What we are doing with several of the new distributions of GNAT is to provide a mechanism for selecting system level threads or our own thread package. The system level threads give full system concurrency and allow true multi-processing on an MP, but tend to be a lot slower, and also often less accurate with respect to Annex D requirements. Our own threads package (typically FSU threads), does not give full system concurrency, but is often much more efficient, and also more accurate with respect to Annex D requirements. 
This gives the best of both worlds, leaving the choice up to the user. Robert Dewar Ada Core Technologies ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Tasking performance between Ada83 and Ada95 1997-06-08 0:00 ` Robert Dewar @ 1997-06-10 0:00 ` Jon S Anthony 1997-06-10 0:00 ` PascMartin 1 sibling, 0 replies; 20+ messages in thread From: Jon S Anthony @ 1997-06-10 0:00 UTC (permalink / raw) In article <dewar.865813228@merv> dewar@merv.cs.nyu.edu (Robert Dewar) writes: > What we are doing with several of the new distributions of GNAT is to provide > a mechanism for selecting system level threads or our own thread package. > The system level threads give full system concurrency and allow true > multi-processing on an MP, but tend to be a lot slower, and also often > less accurate with respect to Annex D requirements. Our own threads package > (typically FSU threads), does not give full system concurrency, but is often > much more efficient, and also more accurate with respect to Annex D > requirements. This gives the best of both worlds, leaving the choice > up to the user. That's really excellent! Thanks, /Jon -- Jon Anthony Organon Motives, Inc. Belmont, MA 02178 617.484.3383 jsa@organon.com ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: Tasking performance between Ada83 and Ada95
From: PascMartin @ 1997-06-10 0:00 UTC

Robert Dewar wrote:

"That's my guess too. I very much doubt that the Alsys World compiler on
HPUX used system level threads, and I also guess that the version of GNAT
on HPUX is probably using DCE threads. So the comparison is completely
meaningless."

I don't agree. Let me elaborate.

First, a few comments:
- to my knowledge there is _no_ thread support in the HP-UX kernel
  (as of 9 months ago..).
- the DCE thread library is a user-level thread simulation (same caveat).
- the AdaWorld product uses a "proprietary" user-level thread simulation,
  built into the Ada runtime, and (obsessively) optimized for it.

So far, considering the respective features of these two runtimes, I see
no difference. Both are switching tasks at the same user level. Both will
have the same problems regarding blocking I/O, neither will ever benefit
from multiprocessing, etc..

Back to the point: can you compare a bicycle and a Ford Mustang? No, of
course.. Robert, you should ride a bicycle more often; for my part, I can
tell the difference when I use one or the other :-) (BTW, bicycling
improves breathing).

What I want to demonstrate is: if the benchmark reflects the user's need,
then the benchmark is good. It does not matter what products are compared,
as all Ada compilers implement the same thing: the Ada language. Don't
they? If there is a better runtime for GNAT on HP-UX, this is a good time
to disclose it. If there is none, the comparison seems valid to me.

It is true that Ada95 ATC is a major constraint for runtime developers. As
robustness is critical, especially in the first releases of a product, I
expect most Ada95 runtimes to take a hit compared to Ada83 ones (1).

Besides that, ATC is so complicated that most (sane) people will probably
avoid using it. My conclusion is: some new Ada95 features could have been
left aside; they are appealing to a minority, and appalling to the others.

To conclude, if some program has to be developed for the real world, and
Ada83 is good enough, why not select AdaWorld? When Ada95 (and GNAT) has
been well optimized, or when HP has released the 1 GHz HPPA workstation
(whichever comes first), it will be time to switch.

BTW, I must disclose that I have been working at Alsys for more years than
I want to admit..

(1) except for some implementation details. The ObjectAda runtime includes
a secondary stack design that is refreshingly efficient. Tucker Taft's
brainchild..

Pascal.
* Re: Tasking performance between Ada83 and Ada95 1997-06-10 0:00 ` PascMartin @ 1997-06-10 0:00 ` Robert Dewar 0 siblings, 0 replies; 20+ messages in thread From: Robert Dewar @ 1997-06-10 0:00 UTC (permalink / raw)

Pascal says

<<So far, considering respective features of these two runtimes, I see no difference. Both are switching tasks at the same user level. Both will have the same problems regarding blocking IOs, both will never take benefit of multiprocessing, etc.. >>

No, that's not the case; there is considerable extra overhead in the DCE case. I think the closest comparison would be with FSU threads optimized for no ATC, and no, this version is not yet publicly released. But actually we find most people prefer the version of tasking that links directly to system threads, even if it is less efficient.
* Re: Tasking performance between Ada83 and Ada95 1997-06-06 0:00 Tasking performance between Ada83 and Ada95 Mike Rose 1997-06-07 0:00 ` Robert Dewar 1997-06-07 0:00 ` Robert A Duff @ 1997-06-07 0:00 ` jim hopper 2 siblings, 0 replies; 20+ messages in thread From: jim hopper @ 1997-06-07 0:00 UTC (permalink / raw)

In article <1997Jun6.115223.7384@relay.nswc.navy.mil> mrose@nswc.navy.mil (Mike Rose) writes:

> I am checking the performance between Ada83 and Ada95 using the Tasking
> Benchmarks written by Thomas Burger from the PAL.
>
> The Compilers I'm using are for Ada95 - GNAT v3.07 and for
> Ada83 - Alsys Adaworld v5.5.4. The operating system is HP UX v10.10.
>
> Each test was run with the creation of 500 tasks.
>
> In comparing the results between the two compilers, I found that the tasking
> performance is much slower with GNAT than with Alsys, every test was at least
> 10 times slower and some were much more.
>
> Our software depends heavily on tasking. Is there any way to improve the
> tasking performance with GNAT ?