From: Charles Darcy
Newsgroups: comp.lang.ada
Subject: Profiling Gnat re. pthread_malloc, pthread_getspecific, system finalisation & tasking.
Date: Sat, 23 Jun 2001 17:34:33 +1000
Message-ID: <3B344689.8A79ED4F@mullum.com.au>

Hello,

I've profiled an Ada program, which I converted from C++, to learn why it performs worse than the C++ version (approximately 10 times slower). I'm using Gnat 3.13p and gprof on a Linux (Mandrake 8.0) machine. I've used inlining and O3 optimisation, and disabled all run-time checks.

The profile results (below) seem to indicate that pthread_malloc and pthread_getspecific consume a large portion of processing time, but these functions are a mystery to me. Is there any way to reduce the performance cost of these functions?

The other expensive subprograms seem related to tasking and controlled types, neither of which I use directly.
I do use the Booch components (maps, collections), which may make use of controlled types, but I am not creating and destroying these as part of my main processing, so I do not see why the system__finalization_* subprograms are such a factor. Similarly, as I do not use tasks, I don't understand why the system__tasking_* subprograms consume a significant portion of processor time.

(Note: I've renamed my subprograms to *my_subprogram*.)

Flat profile:

Each sample counts as 0.01 seconds.
    % cumulative     self                self    total
 time    seconds  seconds      calls  ms/call  ms/call  name
 8.29      79.07    79.07    4840668     0.02     0.04  *my_main_processing_subprogram*
 7.61     151.64    72.57                               pthread_malloc
 6.43     212.98    61.34                               pthread_getspecific
 5.32     263.76    50.78                               system__finalization_implementation__initialize__2
 5.09     312.35    48.59                               _free_internal
 4.20     352.44    40.09                               system__finalization_implementation__finalize_list
 3.62     386.97    34.53                               system__tasking__initialization__set_jmpbuf_address
 2.60     411.73    24.76                               system__tasking__initialization__task_unlock__2
 2.42     434.80    23.07    1786603     0.01     0.01  *my_subprogram*
 2.39     457.63    22.83                               pthread_mutex_lock
 2.37     480.24    22.61    1595032     0.01     0.04  *my_subprogram*
 2.17     500.90    20.66                               system__tasking__initialization__task_lock__2
 2.13     521.24    20.34                               system__soft_links__set_jmpbuf_address_soft
 1.94     539.72    18.48                               pthread_mutex_unlock
 1.87     557.56    17.84  557870939     0.00     0.00  *my_subprogram*
 1.72     573.92    16.36   61165604     0.00     0.00  *my_subprogram*
 1.68     589.94    16.02   71029260     0.00     0.00  *my_subprogram*
 1.66     605.78    15.84                               system__tasking__initialization__get_sec_stack_addr
 1.60     621.03    15.25                               system__finalization_implementation__adjust__2
 1.59     636.16    15.13   12025996     0.00     0.00  *my_subprogram*
 1.42     649.66    13.50                               system__tasking__initialization__get_jmpbuf_address
 1.34     662.40    12.74   84209013     0.00     0.00  *my_subprogram*
 1.23     674.13    11.73   84209013     0.00     0.00  *my_subprogram*
 1.20     685.60    11.47                               ada__strings__unbounded__adjust__2
 1.10     696.06    10.46   84215790     0.00     0.00  *my_subprogram*
 1.08     706.38    10.32                               system__secondary_stack__ss_allocate
 1.07     716.57    10.19                               __gnat_free
 1.07     726.76    10.19                               __gnat_malloc
 1.05     736.77    10.01   84217148     0.00     0.00  *my_subprogram*
 1.02     746.53     9.76  182182420     0.00     0.00  *my_subprogram*
 0.90     755.09     8.56                               system__soft_links__get_jmpbuf_address_soft
 0.86     763.31     8.22  288134396     0.00     0.00  *my_subprogram*
 0.85     771.37     8.06                               system__finalization_implementation__finalize
 0.80     779.02     7.65                               pthread_free
 0.78     786.45     7.43                               ada__strings__search__index__3
 0.74     793.53     7.08                               system__tasking__initialization__get_current_excep
 0.71     800.31     6.78   84209013     0.00           *my_subprogram*
 0.63     806.36     6.05                               malloc

I'm new to Ada, and afraid I may have inadvertently used the language in an improper fashion, resulting in poor performance. My thanks to anyone who can enlighten me as to what might be done to eliminate any unnecessary performance overhead.

regards,
Charles.
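
To see how much self time each run-time family accounts for, the non-application rows of the flat profile above can be summed by name prefix. The following is only a rough sketch in Python: the figures are copied from the profile, while the prefix grouping and the names `ROWS`, `PREFIXES` and `tally` are illustrative assumptions, not anything gprof itself reports.

```python
from collections import defaultdict

# (name, self seconds) copied from the run-time rows of the flat profile above.
# The *my_subprogram* rows are deliberately left out.
ROWS = [
    ("pthread_malloc", 72.57),
    ("pthread_getspecific", 61.34),
    ("system__finalization_implementation__initialize__2", 50.78),
    ("_free_internal", 48.59),
    ("system__finalization_implementation__finalize_list", 40.09),
    ("system__tasking__initialization__set_jmpbuf_address", 34.53),
    ("system__tasking__initialization__task_unlock__2", 24.76),
    ("pthread_mutex_lock", 22.83),
    ("system__tasking__initialization__task_lock__2", 20.66),
    ("system__soft_links__set_jmpbuf_address_soft", 20.34),
    ("pthread_mutex_unlock", 18.48),
    ("system__tasking__initialization__get_sec_stack_addr", 15.84),
    ("system__finalization_implementation__adjust__2", 15.25),
    ("system__tasking__initialization__get_jmpbuf_address", 13.50),
    ("ada__strings__unbounded__adjust__2", 11.47),
    ("system__secondary_stack__ss_allocate", 10.32),
    ("__gnat_free", 10.19),
    ("__gnat_malloc", 10.19),
    ("system__soft_links__get_jmpbuf_address_soft", 8.56),
    ("system__finalization_implementation__finalize", 8.06),
    ("pthread_free", 7.65),
    ("ada__strings__search__index__3", 7.43),
    ("system__tasking__initialization__get_current_excep", 7.08),
    ("malloc", 6.05),
]

# Assumed grouping: these prefixes are an arbitrary choice made for this sketch.
PREFIXES = (
    "pthread_",
    "system__finalization_",
    "system__tasking__",
    "system__soft_links__",
)


def tally(rows, prefixes=PREFIXES):
    """Sum self seconds per prefix; names matching no prefix go to 'other'."""
    totals = defaultdict(float)
    for name, seconds in rows:
        key = next((p for p in prefixes if name.startswith(p)), "other")
        totals[key] += seconds
    return dict(totals)


totals = tally(ROWS)
# Print the families from most to least expensive.
for key, seconds in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{key:<24}{seconds:8.2f}")
```

The totals cover only the rows listed in the profile excerpt, so they are a lower bound on what each family costs across the whole run.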