comp.lang.ada
 help / color / mirror / Atom feed
* Large number of tasks slows down my program (using debian) - any fix?
@ 2018-03-28 18:06 reinert
  2018-03-28 18:49 ` Dennis Lee Bieber
                   ` (3 more replies)
  0 siblings, 4 replies; 39+ messages in thread
From: reinert @ 2018-03-28 18:06 UTC (permalink / raw)


Hei,

I use gnat on debian (latest version). Now I tried to initiate a large number of (access) tasks ( "t := new atask_type" ). When I try to start, say, 500 tasks, my program slows down whithout any apparent reason (before the tasks do anything).
Any hope for a quick fix?

reinert

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-03-28 18:06 Large number of tasks slows down my program (using debian) - any fix? reinert
@ 2018-03-28 18:49 ` Dennis Lee Bieber
  2018-03-28 19:06 ` Paul Rubin
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 39+ messages in thread
From: Dennis Lee Bieber @ 2018-03-28 18:49 UTC (permalink / raw)


On Wed, 28 Mar 2018 11:06:29 -0700 (PDT), reinert <reinkor@gmail.com>
declaimed the following:

>I use gnat on debian (latest version). Now I tried to initiate a large number of (access) tasks ( "t := new atask_type" ). When I try to start, say, 500 tasks, my program slows down whithout any apparent reason (before the tasks do anything).

	Well -- you are creating, at the OS level, 500 processes each with an
associated stack (how large a stack is each receiving (I believe the common
GNAT runtime delegates task management to the underlying OS -- unless one
is running a custom runtime on bare-boards)? How much free memory did your
OS have to give to those stacks).

	What does "top" show? (#tasks, Memory Free, Swap Used, and top CPU
hogs)


-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/ 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-03-28 18:06 Large number of tasks slows down my program (using debian) - any fix? reinert
  2018-03-28 18:49 ` Dennis Lee Bieber
@ 2018-03-28 19:06 ` Paul Rubin
  2018-03-28 19:21 ` Dmitry A. Kazakov
  2018-03-29 22:33 ` Shark8
  3 siblings, 0 replies; 39+ messages in thread
From: Paul Rubin @ 2018-03-28 19:06 UTC (permalink / raw)


reinert <reinkor@gmail.com> writes:
> I use gnat on debian (latest version). Now I tried to initiate a large
> number of (access) tasks ( "t := new atask_type" ). When I try to
> start, say, 500 tasks, my program slows down whithout any apparent
> reason (before the tasks do anything).
> Any hope for a quick fix?

Gnat uses OS threads for tasks so they're not exactly lightweight,
though 500 isn't an enormous number on modern hardware.  If you're on a
tiny embedded platform or VPS then it might be cramped, but if you have
(say) 1GB of ram available then in the usual case where most of the
tasks are blocked on i/o most of the time, you should be ok.  If they're
all cpu-busy then of course things will be slow.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-03-28 18:06 Large number of tasks slows down my program (using debian) - any fix? reinert
  2018-03-28 18:49 ` Dennis Lee Bieber
  2018-03-28 19:06 ` Paul Rubin
@ 2018-03-28 19:21 ` Dmitry A. Kazakov
  2018-03-28 20:17   ` reinert
  2018-03-29 22:33 ` Shark8
  3 siblings, 1 reply; 39+ messages in thread
From: Dmitry A. Kazakov @ 2018-03-28 19:21 UTC (permalink / raw)


On 2018-03-28 20:06, reinert wrote:

> I use gnat on debian (latest version). Now I tried to initiate a large number of (access) tasks ( "t := new atask_type" ). When I try to start, say, 500 tasks, my program slows down whithout any apparent reason (before the tasks do anything).

500 tasks is way too many.

> Any hope for a quick fix?

Unlikely. Re-design: use a pool of working tasks, queue jobs to them. 
Use state machines instead of active objects. Switch from blocking I/O, 
to asynchronous/overlapped I/O. Use select socket instead of blocking 
sockets etc.

P.S. co-routines/passive tasks is a way to migrate from the craziness 
you have to a reasonable design without complete re-design.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-03-28 19:21 ` Dmitry A. Kazakov
@ 2018-03-28 20:17   ` reinert
  2018-03-29  8:46     ` reinert
  0 siblings, 1 reply; 39+ messages in thread
From: reinert @ 2018-03-28 20:17 UTC (permalink / raw)


Seems like "active components/objects" is not so simple as I hoped for.

Sometimes it seems active objects facilitate "effective (natural) representation". But there seems to be restrictions :-)

reinert


On Wednesday, March 28, 2018 at 9:21:14 PM UTC+2, Dmitry A. Kazakov wrote:
> On 2018-03-28 20:06, reinert wrote:
> 
> > I use gnat on debian (latest version). Now I tried to initiate a large number of (access) tasks ( "t := new atask_type" ). When I try to start, say, 500 tasks, my program slows down whithout any apparent reason (before the tasks do anything).
> 
> 500 tasks is way too many.
> 
> > Any hope for a quick fix?
> 
> Unlikely. Re-design: use a pool of working tasks, queue jobs to them. 
> Use state machines instead of active objects. Switch from blocking I/O, 
> to asynchronous/overlapped I/O. Use select socket instead of blocking 
> sockets etc.
> 
> P.S. co-routines/passive tasks is a way to migrate from the craziness 
> you have to a reasonable design without complete re-design.
> 
> -- 
> Regards,
> Dmitry A. Kazakov
> http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-03-28 20:17   ` reinert
@ 2018-03-29  8:46     ` reinert
  2018-03-29  9:18       ` Dmitry A. Kazakov
  2018-03-29 15:39       ` Jeffrey R. Carter
  0 siblings, 2 replies; 39+ messages in thread
From: reinert @ 2018-03-29  8:46 UTC (permalink / raw)


More from my investigations on trying to use many active objects (tasks):

It seems for me that the "main processor" (running the main/top program/task) is used to guide/watch the entries ("accep") in the many tasks and making my program to run slow.

Is it possible to make another processor to do this job?

reinert 


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-03-29  8:46     ` reinert
@ 2018-03-29  9:18       ` Dmitry A. Kazakov
  2018-03-29 15:39       ` Jeffrey R. Carter
  1 sibling, 0 replies; 39+ messages in thread
From: Dmitry A. Kazakov @ 2018-03-29  9:18 UTC (permalink / raw)


On 29/03/2018 10:46, reinert wrote:
> More from my investigations on trying to use many active objects (tasks):
> 
> It seems for me that the "main processor" (running the main/top program/task) is used to guide/watch the entries ("accep") in the many tasks and making my program to run slow.
> 
> Is it possible to make another processor to do this job?

I am not sure what you mean. There is Ada's distributed annex and you 
could use remote procedure call from there. You could also use any other 
art of communication, e.g. some custom networking protocol. In any case 
that will require re-design and it would be likely much slower.

I would suggest making objects passive, driven by an external task (from 
a pool of worker tasks). You will need a scheduler which to activate 
objects depending on the events and assigning free tasks to them. The 
objects will perform a small amount of actions and then return the 
control back keeping the state, e.g. awaiting for some event.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-03-29  8:46     ` reinert
  2018-03-29  9:18       ` Dmitry A. Kazakov
@ 2018-03-29 15:39       ` Jeffrey R. Carter
  2018-04-15  5:20         ` reinert
  1 sibling, 1 reply; 39+ messages in thread
From: Jeffrey R. Carter @ 2018-03-29 15:39 UTC (permalink / raw)


On 03/29/2018 10:46 AM, reinert wrote:
> More from my investigations on trying to use many active objects (tasks):

I've had systems that used many hundreds of tasks on Linux without problem. 
However, they weren't all created at once, and most of them spent most of their 
time blocked on entry calls. You haven't provided enough information for us to 
make reasonable suggestions.

-- 
Jeff Carter
"I snapped my chin down onto some guy's fist and
hit another one in the knee with my nose."
Play It Again, Sam
129


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-03-28 18:06 Large number of tasks slows down my program (using debian) - any fix? reinert
                   ` (2 preceding siblings ...)
  2018-03-28 19:21 ` Dmitry A. Kazakov
@ 2018-03-29 22:33 ` Shark8
  2018-03-30  9:04   ` Dmitry A. Kazakov
  3 siblings, 1 reply; 39+ messages in thread
From: Shark8 @ 2018-03-29 22:33 UTC (permalink / raw)


On Wednesday, March 28, 2018 at 12:06:31 PM UTC-6, reinert wrote:
> Hei,
> 
> I use gnat on debian (latest version). Now I tried to initiate a large number of (access) tasks ( "t := new atask_type" ). When I try to start, say, 500 tasks, my program slows down whithout any apparent reason (before the tasks do anything).
> Any hope for a quick fix?

Hm, you shouldn't be hitting the limit with so few tasks. (500 sounds like a lot, but most OSes you'll be using use an integer-ID; meaning something on the order of 2**32 or 2**64.)

If there's data-initialization going on that may be contributing to the issue.
It could also be problems with ACCESS-types, does the problem persist with non-access setups?

Lastly, take a look at your problem. Some problems are good with task-decomposition, others not so; those which are "embarrassingly parallel" are probably where you'll benefit the most from parallelizing. (e.g. The 'Exponentiation' trick.) -- Though Ada Tasks also lend themselves well to decomposing a sub-system, and that could be used to great effect, though this use isn't particularly a "parallel"-problem.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-03-29 22:33 ` Shark8
@ 2018-03-30  9:04   ` Dmitry A. Kazakov
  2018-03-30 20:46     ` Paul Rubin
  2018-04-02  6:23     ` alby.gamper
  0 siblings, 2 replies; 39+ messages in thread
From: Dmitry A. Kazakov @ 2018-03-30  9:04 UTC (permalink / raw)


On 2018-03-30 00:33, Shark8 wrote:

> Hm, you shouldn't be hitting the limit with so few tasks. (500 sounds like a lot, but most OSes you'll be using use an integer-ID; meaning something on the order of 2**32 or 2**64.)

It is a simple calculation. Assuming a context switch were around 
1000ns. x 500 tasks / 4 cores = 125ms if all tasks are busy. 125ms is a 
long time.

Another factor is scheduling. Assuming the time quant is 10ms (Windows 
default) and all task are doing pure calculations uninterrupted by any 
I/O, then, again: 10 x 500 / 4 = 1.25s [+125ms] is the time a task gets 
back to a core. 1.3s is an eternity.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-03-30  9:04   ` Dmitry A. Kazakov
@ 2018-03-30 20:46     ` Paul Rubin
  2018-03-31  0:09       ` Randy Brukardt
  2018-04-02  6:23     ` alby.gamper
  1 sibling, 1 reply; 39+ messages in thread
From: Paul Rubin @ 2018-03-30 20:46 UTC (permalink / raw)


"Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
> It is a simple calculation. Assuming a context switch were around
> 1000ns. x 500 tasks / 4 cores = 125ms if all tasks are busy. 125ms is
> a long time.

It's 125 microseconds or 1/8th of a millisecond, not that bad at all.  I
assume that's what you meant by ms.  I'm used to ms denoting
milliseconds while microseconds are written us.

Also keep in mind that 4-core Intel boxes typically have 8 hardware
threads ("hyperthreads" or logical cores) and there is basically no
context switch time between hyperthreads on a core.  They speed up total
cpu throughput because they allow another thread to run during the time
that the first thread is blocked on a hardware cache miss waiting for a
RAM operation.

> ...  Assuming the time quant is 10ms...: 10 x 500 / 4 = 1.25s [+125ms]
> is the time a task gets back to a core. 1.3s is an eternity.

Sure, if all 500 tasks are cpu-busy then things are in a dreadful state,
but usually the reason to use so many tasks is concurrent i/o, i.e. most
of the threads are blocked awaiting input.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-03-30 20:46     ` Paul Rubin
@ 2018-03-31  0:09       ` Randy Brukardt
  2018-03-31  6:00         ` Paul Rubin
  0 siblings, 1 reply; 39+ messages in thread
From: Randy Brukardt @ 2018-03-31  0:09 UTC (permalink / raw)


"Paul Rubin" <no.email@nospam.invalid> wrote in message 
news:87h8ox1ect.fsf@nightsong.com...
> "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> writes:
...
>> ...  Assuming the time quant is 10ms...: 10 x 500 / 4 = 1.25s [+125ms]
>> is the time a task gets back to a core. 1.3s is an eternity.
>
> Sure, if all 500 tasks are cpu-busy then things are in a dreadful state,
> but usually the reason to use so many tasks is concurrent i/o, i.e. most
> of the threads are blocked awaiting input.

If you know what you are doing, surely. Since the OP was experimenting, who 
knows?

Assuming that you don't want to wait for your favorite compiler to implement 
Ada 2020 (probably a few years yet, we have to figure out what Ada 2020 is 
first ;-), my recommendation for parallel execution is to use Brad Moore's 
Paraffin libraries. They take care of things like load balancing and 
chunking to make good use out of your CPU resources. One can't do that just 
by launching one task per iteration for the reasons outlined above.

                      Randy.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-03-31  0:09       ` Randy Brukardt
@ 2018-03-31  6:00         ` Paul Rubin
  2018-03-31  9:37           ` Jacob Sparre Andersen
  2018-04-02  3:35           ` Randy Brukardt
  0 siblings, 2 replies; 39+ messages in thread
From: Paul Rubin @ 2018-03-31  6:00 UTC (permalink / raw)


"Randy Brukardt" <randy@rrsoftware.com> writes:
> Assuming that you don't want to wait for your favorite compiler to
> implement Ada 2020... my recommendation for parallel execution is...

Will Ada 2020 have built-in stuff for parallelism?  I hadn't heard about
this before.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-03-31  6:00         ` Paul Rubin
@ 2018-03-31  9:37           ` Jacob Sparre Andersen
  2018-03-31 10:44             ` Dmitry A. Kazakov
  2018-04-02  3:35           ` Randy Brukardt
  1 sibling, 1 reply; 39+ messages in thread
From: Jacob Sparre Andersen @ 2018-03-31  9:37 UTC (permalink / raw)


Paul Rubin wrote:

> Will Ada 2020 have built-in stuff for parallelism?  I hadn't heard
> about this before.

It is very likely.  Quite a lot of work on the subject has been
presented at the last Ada-Europe conferences.

Greetings,

Jacob
-- 
"It ain't rocket science!" - Well, some of it is!


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-03-31  9:37           ` Jacob Sparre Andersen
@ 2018-03-31 10:44             ` Dmitry A. Kazakov
  0 siblings, 0 replies; 39+ messages in thread
From: Dmitry A. Kazakov @ 2018-03-31 10:44 UTC (permalink / raw)


On 2018-03-31 11:37, Jacob Sparre Andersen wrote:
> Paul Rubin wrote:
> 
>> Will Ada 2020 have built-in stuff for parallelism?  I hadn't heard
>> about this before.
> 
> It is very likely.  Quite a lot of work on the subject has been
> presented at the last Ada-Europe conferences.

I doubt that this kind of parallelism will help with active objects.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-03-31  6:00         ` Paul Rubin
  2018-03-31  9:37           ` Jacob Sparre Andersen
@ 2018-04-02  3:35           ` Randy Brukardt
  1 sibling, 0 replies; 39+ messages in thread
From: Randy Brukardt @ 2018-04-02  3:35 UTC (permalink / raw)


"Paul Rubin" <no.email@nospam.invalid> wrote in message 
news:87y3i82398.fsf@nightsong.com...
> "Randy Brukardt" <randy@rrsoftware.com> writes:
>> Assuming that you don't want to wait for your favorite compiler to
>> implement Ada 2020... my recommendation for parallel execution is...
>
> Will Ada 2020 have built-in stuff for parallelism?  I hadn't heard about
> this before.

The plan is for parallel blocks and loops, and a parallel reduction 
operation. There's more, but how much will get actually finished by early 
next year is obviously a question. The things I mentioned the are most 
likely.

                     Randy.



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-03-30  9:04   ` Dmitry A. Kazakov
  2018-03-30 20:46     ` Paul Rubin
@ 2018-04-02  6:23     ` alby.gamper
  2018-04-02  7:12       ` alby.gamper
  2018-04-05 14:07       ` Brad Moore
  1 sibling, 2 replies; 39+ messages in thread
From: alby.gamper @ 2018-04-02  6:23 UTC (permalink / raw)


On Friday, March 30, 2018 at 8:04:19 PM UTC+11, Dmitry A. Kazakov wrote:
> On 2018-03-30 00:33, Shark8 wrote:
> 
> > Hm, you shouldn't be hitting the limit with so few tasks. (500 sounds like a lot, but most OSes you'll be using use an integer-ID; meaning something on the order of 2**32 or 2**64.)
> 
> It is a simple calculation. Assuming a context switch were around 
> 1000ns. x 500 tasks / 4 cores = 125ms if all tasks are busy. 125ms is a 
> long time.
> 
> Another factor is scheduling. Assuming the time quant is 10ms (Windows 
> default) and all task are doing pure calculations uninterrupted by any 
> I/O, then, again: 10 x 500 / 4 = 1.25s [+125ms] is the time a task gets 
> back to a core. 1.3s is an eternity.
> 
> -- 
> Regards,
> Dmitry A. Kazakov
> http://www.dmitry-kazakov.de

Hi All
I've done a quick mock-up test which just shows the activation time of 500 tasks
This is the code, targeting windows (x64). Results show 40 & 45 ms for the two
cases (running on a Intel Core i7-2600 cpu running @ 3.4 GHz)

with Ada.Text_IO;                       use Ada.Text_IO;
procedure AdaTasking is

    type Bool is new Integer;
    type Large_Integer is mod 2 ** 64;

    task type Worker is
    end;

    task body Worker is
    begin
        null;
    end;

    task type WorkerWithEntry is
        entry Start;
        entry Stop;
    end;

    task body WorkerWithEntry is
    begin
        accept Start;
        null;
        accept Stop;
    end;

    function QueryPerformanceFrequency( lpFrequency : access Large_Integer) return Bool;
    pragma import (stdcall , QueryPerformanceFrequency , "QueryPerformanceFrequency");

    function QueryPerformanceCounter( lpPerformanceCount : access Large_Integer) return Bool;
    pragma import (stdcall , QueryPerformanceCounter , "QueryPerformanceCounter");

    Ok          : Bool := 0;
    Frequency   : aliased Large_Integer := 0;
    StartTime   : aliased Large_Integer := 0;
    EndTime     : aliased Large_Integer := 0;
    ElapsedTime : aliased Large_Integer := 0;
    
begin

    Ok := QueryPerformanceFrequency(Frequency'access);
    Frequency := Frequency / 1000; -- units is ms (ie milli seconds)
    
    Ok := QueryPerformanceCounter(StartTime'access);
	declare
        Worker_Pool : array (1..500) of Worker;
    begin
        for i in Worker_Pool'Range loop
            null;
        end loop;
    end;
    Ok := QueryPerformanceCounter(EndTime'access);
    ElapsedTime := (EndTime-StartTime) / Frequency;
    Put_Line("Worker          Elapsed Time = " & ElapsedTime'Image & " (ms)");

    Ok := QueryPerformanceCounter(StartTime'access);
	declare
        Worker_Pool : array (1..500) of WorkerWithEntry;
    begin
        for i in Worker_Pool'Range loop
            Worker_Pool(i).Start;
            Worker_Pool(i).Stop;
        end loop;
    end;
    Ok := QueryPerformanceCounter(EndTime'access);
    ElapsedTime := (EndTime-StartTime) / Frequency;
    Put_Line("WorkerWithEntry Elapsed Time = " & ElapsedTime'Image & " (ms)");

end;

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-04-02  6:23     ` alby.gamper
@ 2018-04-02  7:12       ` alby.gamper
  2018-04-05 14:07       ` Brad Moore
  1 sibling, 0 replies; 39+ messages in thread
From: alby.gamper @ 2018-04-02  7:12 UTC (permalink / raw)


On Monday, April 2, 2018 at 4:23:43 PM UTC+10, alby....@gmail.com wrote:
> On Friday, March 30, 2018 at 8:04:19 PM UTC+11, Dmitry A. Kazakov wrote:
> > On 2018-03-30 00:33, Shark8 wrote:
> > 
> > > Hm, you shouldn't be hitting the limit with so few tasks. (500 sounds like a lot, but most OSes you'll be using use an integer-ID; meaning something on the order of 2**32 or 2**64.)
> > 
> > It is a simple calculation. Assuming a context switch were around 
> > 1000ns. x 500 tasks / 4 cores = 125ms if all tasks are busy. 125ms is a 
> > long time.
> > 
> > Another factor is scheduling. Assuming the time quant is 10ms (Windows 
> > default) and all task are doing pure calculations uninterrupted by any 
> > I/O, then, again: 10 x 500 / 4 = 1.25s [+125ms] is the time a task gets 
> > back to a core. 1.3s is an eternity.
> > 
> > -- 
> > Regards,
> > Dmitry A. Kazakov
> > http://www.dmitry-kazakov.de
> 
> Hi All
> I've done a quick mock-up test which just shows the activation time of 500 tasks
> This is the code, targeting windows (x64). Results show 40 & 45 ms for the two
> cases (running on a Intel Core i7-2600 cpu running @ 3.4 GHz)
> 
> with Ada.Text_IO;                       use Ada.Text_IO;
> procedure AdaTasking is
> 
>     type Bool is new Integer;
>     type Large_Integer is mod 2 ** 64;
> 
>     task type Worker is
>     end;
> 
>     task body Worker is
>     begin
>         null;
>     end;
> 
>     task type WorkerWithEntry is
>         entry Start;
>         entry Stop;
>     end;
> 
>     task body WorkerWithEntry is
>     begin
>         accept Start;
>         null;
>         accept Stop;
>     end;
> 
>     function QueryPerformanceFrequency( lpFrequency : access Large_Integer) return Bool;
>     pragma import (stdcall , QueryPerformanceFrequency , "QueryPerformanceFrequency");
> 
>     function QueryPerformanceCounter( lpPerformanceCount : access Large_Integer) return Bool;
>     pragma import (stdcall , QueryPerformanceCounter , "QueryPerformanceCounter");
> 
>     Ok          : Bool := 0;
>     Frequency   : aliased Large_Integer := 0;
>     StartTime   : aliased Large_Integer := 0;
>     EndTime     : aliased Large_Integer := 0;
>     ElapsedTime : aliased Large_Integer := 0;
>     
> begin
> 
>     Ok := QueryPerformanceFrequency(Frequency'access);
>     Frequency := Frequency / 1000; -- units is ms (ie milli seconds)
>     
>     Ok := QueryPerformanceCounter(StartTime'access);
> 	declare
>         Worker_Pool : array (1..500) of Worker;
>     begin
>         for i in Worker_Pool'Range loop
>             null;
>         end loop;
>     end;
>     Ok := QueryPerformanceCounter(EndTime'access);
>     ElapsedTime := (EndTime-StartTime) / Frequency;
>     Put_Line("Worker          Elapsed Time = " & ElapsedTime'Image & " (ms)");
> 
>     Ok := QueryPerformanceCounter(StartTime'access);
> 	declare
>         Worker_Pool : array (1..500) of WorkerWithEntry;
>     begin
>         for i in Worker_Pool'Range loop
>             Worker_Pool(i).Start;
>             Worker_Pool(i).Stop;
>         end loop;
>     end;
>     Ok := QueryPerformanceCounter(EndTime'access);
>     ElapsedTime := (EndTime-StartTime) / Frequency;
>     Put_Line("WorkerWithEntry Elapsed Time = " & ElapsedTime'Image & " (ms)");
> 
> end;

Note: That rounding errors will occur due to the Large_integer divisions, Which
can be somewhat mitigated by scaling the result to us (ie micro seconds). This
can be done by dividing the frequency by 1,000,000 rather than 1,000.

The revised results are ~44,000 us and ~49000 us  


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-04-02  6:23     ` alby.gamper
  2018-04-02  7:12       ` alby.gamper
@ 2018-04-05 14:07       ` Brad Moore
  2018-04-05 15:09         ` Dmitry A. Kazakov
                           ` (4 more replies)
  1 sibling, 5 replies; 39+ messages in thread
From: Brad Moore @ 2018-04-05 14:07 UTC (permalink / raw)


Another data point is the entry I submitted to the Computer Language Benchmark Games site for the Thread-Ring benchmark.

The problem there is to create 503 tasks, and then use some sort of message passing to pass a "token" in a ring from from task to the next, 50_000_000 times, printing out the task number that ended up with the token at the end.

I used a protected object to pass a "token" between the tasks. I believe the reason why it works as fast as it did, is that an Ada protected object can have one task perform the action of the protected object on behalf of other tasks that are queued on an entry of the protected object.

Source code:

https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/threadring-gnat-6.html

The results comparing this version against other approaches in other languages can be seen here, where the entries are executed on a quad-core machine:

Note, the approaches by different languages are very different from each other,
so in some sense, this is like comparing apples and oranges.

https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/threadring.html

All the benchmarks can be accessed from the top level of the site
https://benchmarksgame-team.pages.debian.net/benchmarksgame/

Brad.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-04-05 14:07       ` Brad Moore
@ 2018-04-05 15:09         ` Dmitry A. Kazakov
  2018-04-07  4:16           ` Brad Moore
  2018-04-05 15:30         ` Jeffrey R. Carter
                           ` (3 subsequent siblings)
  4 siblings, 1 reply; 39+ messages in thread
From: Dmitry A. Kazakov @ 2018-04-05 15:09 UTC (permalink / raw)


On 05/04/2018 16:07, Brad Moore wrote:
> Another data point is the entry I submitted to the Computer Language Benchmark Games site for the Thread-Ring benchmark.
> 
> The problem there is to create 503 tasks, and then use some sort of message passing to pass a "token" in a ring from from task to the next, 50_000_000 times, printing out the task number that ended up with the token at the end.
> 
> I used a protected object to pass a "token" between the tasks. I believe the reason why it works as fast as it did, is that an Ada protected object can have one task perform the action of the protected object on behalf of other tasks that are queued on an entry of the protected object.

That is unfair! (:-)) With protected object and same token you can 
"pass" it at all available cores without blocking, at once, which IMO 
defeats the idea of the benchmark, but shows how cool are Ada's tasking 
primitives.

-- 
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-04-05 14:07       ` Brad Moore
  2018-04-05 15:09         ` Dmitry A. Kazakov
@ 2018-04-05 15:30         ` Jeffrey R. Carter
  2018-04-05 19:33           ` Spiros Bousbouras
  2018-04-05 19:44           ` Simon Wright
  2018-04-06  5:58         ` Benchmarks Game: Thread ring (Was: Large number of tasks slows down my program (using debian) - any fix?) Jacob Sparre Andersen
                           ` (2 subsequent siblings)
  4 siblings, 2 replies; 39+ messages in thread
From: Jeffrey R. Carter @ 2018-04-05 15:30 UTC (permalink / raw)


On 04/05/2018 04:07 PM, Brad Moore wrote:
> 
> The problem there is to create 503 tasks, and then use some sort of message passing to pass a "token" in a ring from from task to the next, 50_000_000 times, printing out the task number that ended up with the token at the end.

Isn't the answer always 291?

-- 
Jeff Carter
"Sir Lancelot saves Sir Gallahad from almost certain temptation."
Monty Python & the Holy Grail
69

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-04-05 15:30         ` Jeffrey R. Carter
@ 2018-04-05 19:33           ` Spiros Bousbouras
  2018-04-05 19:44           ` Simon Wright
  1 sibling, 0 replies; 39+ messages in thread
From: Spiros Bousbouras @ 2018-04-05 19:33 UTC (permalink / raw)


On Thu, 5 Apr 2018 17:30:38 +0200
"Jeffrey R. Carter" <spam.jrcarter.not@spam.not.acm.org> wrote:
> On 04/05/2018 04:07 PM, Brad Moore wrote:
> > 
> > The problem there is to create 503 tasks, and then use some sort of
> > message passing to pass a "token" in a ring from from task to the next,
> > 50_000_000 times, printing out the task number that ended up with the
> > token at the end.
> 
> Isn't the answer always 291?

It seems so to me but I think the point is to see how long it will take
rather than get an answer. Actually the fact that you can calculate the
answer easily by hand is a good thing because it gives you a quick test
to see if the programme or the implementation has a bug.

Now the ideal compiler would be one which can prove that the answer will
always be 291 and produce code which does nothing more that print this
answer !

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-04-05 15:30         ` Jeffrey R. Carter
  2018-04-05 19:33           ` Spiros Bousbouras
@ 2018-04-05 19:44           ` Simon Wright
  2018-04-05 20:25             ` Jeffrey R. Carter
  1 sibling, 1 reply; 39+ messages in thread
From: Simon Wright @ 2018-04-05 19:44 UTC (permalink / raw)


"Jeffrey R. Carter" <spam.jrcarter.not@spam.not.acm.org> writes:

> On 04/05/2018 04:07 PM, Brad Moore wrote:
>>
>> The problem there is to create 503 tasks, and then use some sort of
>> message passing to pass a "token" in a ring from from task to the
>> next, 50_000_000 times, printing out the task number that ended up
>> with the token at the end.
>
> Isn't the answer always 291?

In the three I looked at the answer was 292.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-04-05 19:44           ` Simon Wright
@ 2018-04-05 20:25             ` Jeffrey R. Carter
  0 siblings, 0 replies; 39+ messages in thread
From: Jeffrey R. Carter @ 2018-04-05 20:25 UTC (permalink / raw)


On 04/05/2018 09:44 PM, Simon Wright wrote:
> "Jeffrey R. Carter" <spam.jrcarter.not@spam.not.acm.org> writes:
> 
>> On 04/05/2018 04:07 PM, Brad Moore wrote:
>>>
>>> The problem there is to create 503 tasks, and then use some sort of
>>> message passing to pass a "token" in a ring from from task to the
>>> next, 50_000_000 times, printing out the task number that ended up
>>> with the token at the end.
>>
>> Isn't the answer always 291?
> 
> In the three I looked at the answer was 292.

I guess it depends on where the token starts. If it starts outside the ring and 
is 1st passed to task 1, the answer is 291. If it starts in task 1 and is 1st 
passed to task 2, the answer is 292. I think.

-- 
Jeff Carter
"Sir Lancelot saves Sir Gallahad from almost certain temptation."
Monty Python & the Holy Grail
69

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Benchmarks Game: Thread ring (Was: Large number of tasks slows down my program (using debian) - any fix?)
  2018-04-05 14:07       ` Brad Moore
  2018-04-05 15:09         ` Dmitry A. Kazakov
  2018-04-05 15:30         ` Jeffrey R. Carter
@ 2018-04-06  5:58         ` Jacob Sparre Andersen
  2018-04-07  4:28           ` Brad Moore
  2018-04-06 15:48         ` Large number of tasks slows down my program (using debian) - any fix? Jeffrey R. Carter
  2018-04-07 12:21         ` Simon Wright
  4 siblings, 1 reply; 39+ messages in thread
From: Jacob Sparre Andersen @ 2018-04-06  5:58 UTC (permalink / raw)


Brad Moore <bmoore.ada@gmail.com> writes:

> [...]  I believe the reason why it works as fast as it did, is that an
> Ada protected object can have one task perform the action of the
> protected object on behalf of other tasks that are queued on an entry
> of the protected object.

> The results comparing this version against other approaches in other
> languages can be seen here, where the entries are executed on a
> quad-core machine:

> https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/threadring.html

From the CPU load data, it looks like the Ada solution has four tasks
running full time, and not just a single one.

Greetings,

Jacob
-- 
"The point is that I am now a perfectly safe penguin!"

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-04-05 14:07       ` Brad Moore
                           ` (2 preceding siblings ...)
  2018-04-06  5:58         ` Benchmarks Game: Thread ring (Was: Large number of tasks slows down my program (using debian) - any fix?) Jacob Sparre Andersen
@ 2018-04-06 15:48         ` Jeffrey R. Carter
  2018-04-07  4:39           ` Brad Moore
  2018-04-07 12:21         ` Simon Wright
  4 siblings, 1 reply; 39+ messages in thread
From: Jeffrey R. Carter @ 2018-04-06 15:48 UTC (permalink / raw)


On 04/05/2018 04:07 PM, Brad Moore wrote:
> 
> https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/threadring-gnat-6.html

The code as presented here doesn't compile: Create_Lightweight_Thread is called 
before the body is elaborated.

Correcting that, the code ran in about 25.3 s on a 4 GHz Core i7 (8 virtual 
processors). About a factor of 1.9 from the results presented.

I then made what seemed to me like fairly minor changes to the code. To my 
surprise, the result ran in about 7.6 s. Applying the factor of 1.9 would put 
this result in 3rd place.

Functionally the 2 seem equivalent. Probably the most significant change is that 
the protected entry became a protected procedure.

function Execute_Threadring (Number_Of_Tokens : Positive) return Positive is
    subtype Token_Number is Natural range 0 .. Number_Of_Tokens;

    Threadring_Size : constant := 503;

    type Thread_Index is mod Threadring_Size;
    type Thread_Id is range 1 .. Threadring_Size;

    Next_Name : Thread_Id'Base := 1;

    type Thread_Info is record
       Name  : Thread_Id    := Thread_Id'First;
       Index : Thread_Index := Thread_Index'First;
    end record;

    function New_Thread return Thread_Info;

    function New_Thread return Thread_Info is
    begin
       return Thread : constant Thread_Info :=
          (Name  => Next_Name, Index => Thread_Index (Next_Name - 1) )
       do
          Next_Name := Next_Name + 1;
       end return;
    end New_Thread;

    type Thread_List is array (Thread_Index) of Thread_Info;

    protected Token_Passer is
       procedure Wait_For_Baton (Done : in out Boolean);

       function Get_Result return Thread_Id;
    private
       Token          : Token_Number := Number_Of_Tokens;
       Current_Thread : Thread_Index := 0;
       Result         : Thread_Id    := 1;
       Thread         : Thread_List   := (others => New_Thread);
    end Token_Passer;

    protected body Token_Passer is
       function Get_Result return Thread_Id is
       begin
          return Result;
       end Get_Result;

       procedure Wait_For_Baton (Done : in out Boolean) is
       begin
          if Token = 0 then
             Result := Thread (Current_Thread).Name;
             Done := True;
          else
             Token := Token - 1;
             Current_Thread := Thread (Current_Thread).Index + 1;
          end if;
       end Wait_For_Baton;
    end Token_Passer;

    task type OS_Thread;

    task body OS_Thread
    is
       All_Done : Boolean := False;
    begin
       Pass_All : loop
          Token_Passer.Wait_For_Baton (Done => All_Done);

          exit Pass_All when All_Done;
       end loop Pass_All;
    end OS_Thread;
begin
    -- Wait for workers to complete before returning result
    declare
       Number_Of_Workers : constant := 503;

       type Worker_List is array (1 .. Number_Of_Workers) of OS_Thread;

       pragma Warnings (Off, "*Worker_Pool* is not referenced");

       Worker_Pool : Worker_List;

       pragma Warnings (On, "*Worker_Pool* is not referenced");
    begin
       null;
    end;

    return Positive (Token_Passer.Get_Result);
end Execute_Threadring;


-- 
Jeff Carter
"It's all right, Taggart. Just a man and a horse being hung out there."
Blazing Saddles
34


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-04-05 15:09         ` Dmitry A. Kazakov
@ 2018-04-07  4:16           ` Brad Moore
  0 siblings, 0 replies; 39+ messages in thread
From: Brad Moore @ 2018-04-07  4:16 UTC (permalink / raw)


On Thursday, April 5, 2018 at 9:09:49 AM UTC-6, Dmitry A. Kazakov wrote:

> That is unfair! (:-)) With protected object and same token you can 
> "pass" it at all available cores without blocking, at once, which IMO 
> defeats the idea of the benchmark, but shows how cool are Ada's tasking 
> primitives.

I see your point, :-) but I still feel its fair in that there are 503 "pseudo threads" that each serially get access to the protected object in a ring fashion, each "pseudo"-thread in fact stores its reference inside the protected object when it "has" the baton. At time same time, there are also 503 real OS threads driving the thing, where the semantics of a protected object is that there is a queue for an entry, and the queue is serviced serially, even if the implementation of the queue happens to be efficient. Since there are only 503 threads and 50_000_000 iterations, each OS thread (Ada task) has to requeue on the protected entry many many times.

Brad

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Benchmarks Game: Thread ring (Was: Large number of tasks slows down my program (using debian) - any fix?)
  2018-04-06  5:58         ` Benchmarks Game: Thread ring (Was: Large number of tasks slows down my program (using debian) - any fix?) Jacob Sparre Andersen
@ 2018-04-07  4:28           ` Brad Moore
  0 siblings, 0 replies; 39+ messages in thread
From: Brad Moore @ 2018-04-07  4:28 UTC (permalink / raw)


On Thursday, April 5, 2018 at 11:59:00 PM UTC-6, Jacob Sparre Andersen wrote:
> Brad Moore <bmoore.ada@gmail.com> writes:
> 
> > [...]  I believe the reason why it works as fast as it did, is that an
> > Ada protected object can have one task perform the action of the
> > protected object on behalf of other tasks that are queued on an entry
> > of the protected object.
> 
> > The results comparing this version against other approaches in other
> > languages can be seen here, where the entries are executed on a
> > quad-core machine:
> 
> > https://benchmarksgame-team.pages.debian.net/benchmarksgame/performance/threadring.html
> 
> From the CPU load data, it looks like the Ada solution has four tasks
> running full time, and not just a single one.
> 

Yes, in fact this appears to be the only example that fully loads all the cores. Some of the language examples only involve 1 core, which to me suggests that what is really happening in those cases is something more like a coroutine, which I think is not really a fair comparison.

When I said one task can service the queue on behalf of other tasks, I was thinking that the implementation might be "passing" the baton between the "pseudo"-threads maybe 503 passes at a time, releasing the 503 Ada tasks that are waiting on the entry, then those 503 tasks would queue back up on the entry (not necessarily all 503 at a time), but however many do get in at a time (based on how many are queued on the entry when the entry guard is opened), they would effectively get released at approximately the same time, but still in a serial order. So, I would expect all cores to be fully loaded. I don't know if GNAT actually has the protected object implemented this way, I just know that that is a possible (and I think recommended) implementation.

Cheers,

Brad


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-04-06 15:48         ` Large number of tasks slows down my program (using debian) - any fix? Jeffrey R. Carter
@ 2018-04-07  4:39           ` Brad Moore
  2018-04-07  8:15             ` Jeffrey R. Carter
  0 siblings, 1 reply; 39+ messages in thread
From: Brad Moore @ 2018-04-07  4:39 UTC (permalink / raw)


On Friday, April 6, 2018 at 9:48:31 AM UTC-6, Jeffrey R. Carter wrote:
> On 04/05/2018 04:07 PM, Brad Moore wrote:
> > 
> > https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/threadring-gnat-6.html

Thanks for your comments, some interesting points and suggestions...

> 
> The code as presented here doesn't compile: Create_Lightweight_Thread is called 
> before the body is elaborated.

You are correct that technically the compiler would be right to complain here,
but I tried the code and compiled it exactly as presented on the website, and it compiles fine for me.

What version of compiler are you using?

On an aside,
This is the sort of error that Randy's compiler is good at catching. For my test, I was running GNAT GPL 2017.

Thread model: posix
gcc version 6.3.1 20170510 (for GNAT GPL 2017 20170515) (GCC)

> 
> Correcting that, the code ran in about 25.3 s on a 4 GHz Core i7 (8 virtual 
> processors). About a factor of 1.9 from the results presented.
> 
> I then made what seemed to me like fairly minor changes to the code. To my 
> surprise, the result ran in about 7.6 s. Applying the factor of 1.9 would put 
> this result in 3rd place.
> 
> Functionally the 2 seem equivalent. Probably the most significant change is that 
> the protected entry became a protected procedure.

I tried your version of the code with my version of the compiler, but the times for both versions were almost identical. It sounds like your compiler is quite different from the one I am using. I am running under Linux on an AMD quadcore system.

regards,

Brad Moore

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-04-07  4:39           ` Brad Moore
@ 2018-04-07  8:15             ` Jeffrey R. Carter
  2018-04-07 16:28               ` Brad Moore
  2018-04-07 16:51               ` Brad Moore
  0 siblings, 2 replies; 39+ messages in thread
From: Jeffrey R. Carter @ 2018-04-07  8:15 UTC (permalink / raw)


On 04/07/2018 06:39 AM, Brad Moore wrote:
> 
> You are correct that technically the compiler would be right to complain here,
> but I tried the code and compiled it exactly as presented on the website, and it compiles fine for me.
> 
> What version of compiler are you using?

Technically, it compiled with a warning that the call would raise Program_Error. 
I got the warning with a wavefront of GNAT Pro 18 under Linux. Running the 
program did result in the exception.

> I tried your version of the code with my version of the compiler, but the times for both versions were almost identical. It sounds like your compiler is quite different from the one I am using. I am running under Linux on an AMD quadcore system.

Interesting. I also tried my version with FSF GNAT 7.2.0 on a Core i7 with 4 
virtual processors under Linux and got times about 6.6-9.6 s, depending on 
options. Fastest seems to be my default set of options

-gnatan -gnato2 -O2 -fstack-check

and the slowest

-gnatp -gnat05

(compiler runs in Ada-12 mode by default).

-- 
Jeff Carter
"English bed-wetting types."
Monty Python & the Holy Grail
15

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-04-05 14:07       ` Brad Moore
                           ` (3 preceding siblings ...)
  2018-04-06 15:48         ` Large number of tasks slows down my program (using debian) - any fix? Jeffrey R. Carter
@ 2018-04-07 12:21         ` Simon Wright
  2018-04-07 16:57           ` Brad Moore
  4 siblings, 1 reply; 39+ messages in thread
From: Simon Wright @ 2018-04-07 12:21 UTC (permalink / raw)


Brad Moore <bmoore.ada@gmail.com> writes:

> Another data point is the entry I submitted to the Computer Language
> Benchmark Games site for the Thread-Ring benchmark.

It doesn't do so well on a Macbook Pro (2.9 GHz Intel Core i5)! it used
about 147% cpu (i.e. the 2 physical cores) and took 8 minutes.

Much the same results with -gnatp -O2 as with the switches reported on
the website, not that I know whether they'd be appropriate for this
machine. (I see that the source code comments say "Note that this
version is compiled with full Ada checks enabled, and optimization
turned off", which isn't the case as reported).

This was with GCC 8.0.1 20180309, which required
Create_Lightweight_Thread's body to be seen before use.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-04-07  8:15             ` Jeffrey R. Carter
@ 2018-04-07 16:28               ` Brad Moore
  2018-04-07 18:41                 ` Jeffrey R. Carter
  2018-04-08  0:06                 ` Robert I. Eachus
  2018-04-07 16:51               ` Brad Moore
  1 sibling, 2 replies; 39+ messages in thread
From: Brad Moore @ 2018-04-07 16:28 UTC (permalink / raw)


On Saturday, April 7, 2018 at 2:15:37 AM UTC-6, Jeffrey R. Carter wrote:
> On 04/07/2018 06:39 AM, Brad Moore wrote:
> > 
> > You are correct that technically the compiler would be right to complain here,
> > but I tried the code and compiled it exactly as presented on the website, and it compiles fine for me.
> > 
> > What version of compiler are you using?
> 
> Technically, it compiled with a warning that the call would raise Program_Error. 
> I got the warning with a wavefront of GNAT Pro 18 under Linux. Running the 
> program did result in the exception.
> 
> > I tried your version of the code with my version of the compiler, but the times for both versions were almost identical. It sounds like your compiler is quite different from the one I am using. I am running under Linux on an AMD quadcore system.
> 
> Interesting. I also tried my version with FSF GNAT 7.2.0 on a Core i7 with 4 
> virtual processors under Linux and got times about 6.6-9.6 s, depending on 
> options. Fastest seems to be my default set of options
> 
> -gnatan -gnato2 -O2 -fstack-check
> 
> and the slowest
> 
> -gnatp -gnat05
> 
> (compiler runs in Ada-12 mode by default).

Sorry, my two runs had identical times, because the sources were identical, I somehow copied the files incorrectly renaming yours to .old and mine to the new one in my test directory.

After fixing this, I do see a noticeable difference.

My version took 44.4 seconds with your suggested compiler options,
and your version took 11.1 seconds.

I notice also that I dont think there is a real need for 503 Ada tasks for this benchmark. There are conceptually 503 objects that "pass the baton" serially in a ring, each time the protected object is called.

So, I then thought, why not set the number of Ada worker tasks to 4 instead of 503, while keeping the number of baton passers at 503.

When I change that, the code improves to complete in 6.9 seconds.

Then I thought, why even have 4 workers. Why not just one? When I set the number of Ada tasks to 1, then there is even more improvement, the code completes in 1.6 seconds. With just 1 worker, why even have a protected object?
Why not just make it a procedure call? And so on..

After doing this though, I cannot help but think that somewhere along the way, I would have crossed the line of fairness and the intent of what the benchmark is attempting to test. I have also lost some of the demonstration of Ada tasking features. I liked the entry usage because waiting for an entry barrier to open seemed something like a runner waiting for the previous runner to pass him the baton. With a protected procedure, that handoff analogy seems to get lost.

If you think your version is a fair test though, it is an interesting comparison point, then I'd suggest you submit your version to the benchmark site, if you like. They are fairly picky in what they'd accept. There is a chance they'd reject it, but you never know.

Brad 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-04-07  8:15             ` Jeffrey R. Carter
  2018-04-07 16:28               ` Brad Moore
@ 2018-04-07 16:51               ` Brad Moore
  1 sibling, 0 replies; 39+ messages in thread
From: Brad Moore @ 2018-04-07 16:51 UTC (permalink / raw)


On Saturday, April 7, 2018 at 2:15:37 AM UTC-6, Jeffrey R. Carter wrote:
> > What version of compiler are you using?
> 
> Technically, it compiled with a warning that the call would raise Program_Error. 
> I got the warning with a wavefront of GNAT Pro 18 under Linux. Running the 
> program did result in the exception.
> 

I just tried my version of the code with the 7.2.0 FSF version of GNAT and it still compiles and runs. The checking in your compiler must be a more recent check added, that probably would get propagated into the FSF version at some point in the future.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-04-07 12:21         ` Simon Wright
@ 2018-04-07 16:57           ` Brad Moore
  0 siblings, 0 replies; 39+ messages in thread
From: Brad Moore @ 2018-04-07 16:57 UTC (permalink / raw)


On Saturday, April 7, 2018 at 6:21:22 AM UTC-6, Simon Wright wrote:
> Brad Moore <bmoore.ada@gmail.com> writes:
> 
> > Another data point is the entry I submitted to the Computer Language
> > Benchmark Games site for the Thread-Ring benchmark.
> 
> It doesn't do so well on a Macbook Pro (2.9 GHz Intel Core i5)! it used
> about 147% cpu (i.e. the 2 physical cores) and took 8 minutes.
> 
> Much the same results with -gnatp -O2 as with the switches reported on
> the website, not that I know whether they'd be appropriate for this
> machine. (I see that the source code comments say "Note that this
> version is compiled with full Ada checks enabled, and optimization
> turned off", which isn't the case as reported).
> 
> This was with GCC 8.0.1 20180309, which required
> Create_Lightweight_Thread's body to be seen before use.

At one point, I believe the optimisation comment was true, but at some later point I must have sent in an update which changed the compile options, but missed updating the comment. I should send in a new version with a fix to the elaboration problem, since presumably at some point they will move to a newer compiler version that has this check.

Thanks,

Brad

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-04-07 16:28               ` Brad Moore
@ 2018-04-07 18:41                 ` Jeffrey R. Carter
  2018-04-08  0:29                   ` Brad Moore
  2018-04-08  0:06                 ` Robert I. Eachus
  1 sibling, 1 reply; 39+ messages in thread
From: Jeffrey R. Carter @ 2018-04-07 18:41 UTC (permalink / raw)


On 04/07/2018 06:28 PM, Brad Moore wrote:
> 
> I notice also that I dont think there is a real need for 503 Ada tasks for this benchmark. There are conceptually 503 objects that "pass the baton" serially in a ring, each time the protected object is called.
> 
> So, I then thought, why not set the number of Ada worker tasks to 4 instead of 503, while keeping the number of baton passers at 503.
> 
> When I change that, the code improves to complete in 6.9 seconds.

The problem requirements include:

"create 503 linked pre-emptive threads"

so I suspect that would not be a valid implementation.

> If you think your version is a fair test though, it is an interesting comparison point, then I'd suggest you submit your version to the benchmark site, if you like. They are fairly picky in what they'd accept. There is a chance they'd reject it, but you never know.

I saw opportunities in your code to replace explicit actions with elaboration, 
which is usually a good thing. That got rid of the Start procedure, which made 
the barrier of the entry always True, which resulted in changing the entry to a 
procedure. I'm not sure whether it's a fair test or not.

In your version, the tasks queue up on the entry in some order, and queue 
themselves back on the entry again as soon as the calls complete, so the tasks 
are calling the entry repeatedly in the same order.

In my version, all the tasks are calling the protected procedure repeatedly, but 
we don't know if there's an order in which they execute the calls. So my version 
may be less like a ring than yours.

But I'm not sure your version is a fair test, either. There aren't tasks linked 
in a ring. There are just a bunch of tasks that manipulate data in a PO until 
they have done so the correct number of times. That doesn't seem to fulfill the 
spirit of the requirements.

-- 
Jeff Carter
"English bed-wetting types."
Monty Python & the Holy Grail
15


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-04-07 16:28               ` Brad Moore
  2018-04-07 18:41                 ` Jeffrey R. Carter
@ 2018-04-08  0:06                 ` Robert I. Eachus
  1 sibling, 0 replies; 39+ messages in thread
From: Robert I. Eachus @ 2018-04-08  0:06 UTC (permalink / raw)


On 4/7/2018 12:28 PM, Brad Moore wrote:
> Then I thought, why even have 4 workers. Why not just one? When I set the number of Ada tasks to 1, then there is even more improvement, the code completes in 1.6 seconds. With just 1 worker, why even have a protected object?

You are getting into the area of chip specific optimizations.  If you 
ran this on an AMD Zen chip, carefully assigning processor preferences, 
two would almost certainly be fastest.  Would 1 be better than 3?  No 
clue.  But three through eight should be about the same, then a drop, as 
you went out toward sixteen.  Use a six-core (12 thread) Zen and 
substitute 6 and 12 for 8 and 16 above.  Mobile Zen has different 
characteristics.

With Intel chips, you need to know whether it has Hyperthreading, and 
whether it is enabled.  You also need to know how many cores are 
present--and the sizes of the caches.

What is going on?  When a processor core (or thread) runs it needs the 
token in its L1 data cache.  The cache line is larger than the Packet 
being passed around, either 64 or 128 bytes on most modern processors. 
In addition, the move involves two cores, and may require ejecting a 
line from the target cache.  In other words, there is most of your 
processing time on this toy program, and on much bigger programs if you 
aren't careful.

Why can AMD Zen CPUs and Intel CPUs with Hyperthreading do better with 
two threads than one?  You arrange for the two logical processors to be 
on the same physical processor.  So the caches are shared, and no cache 
move is required.

If this was a real problem, and you needed days or months of CPU time, 
you optimize each thread for the cache space available, and break things 
up into independent threads, or threads that run well together, then 
assign them to the appropriate logical processors.


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-04-07 18:41                 ` Jeffrey R. Carter
@ 2018-04-08  0:29                   ` Brad Moore
  2018-04-08  8:25                     ` Jeffrey R. Carter
  0 siblings, 1 reply; 39+ messages in thread
From: Brad Moore @ 2018-04-08  0:29 UTC (permalink / raw)


On Saturday, April 7, 2018 at 12:41:13 PM UTC-6, Jeffrey R. Carter wrote:

> But I'm not sure your version is a fair test, either. There aren't tasks linked 
> in a ring. There are just a bunch of tasks that manipulate data in a PO until 
> they have done so the correct number of times. That doesn't seem to fulfill the 
> spirit of the requirements.

You are probably right, as echoed by Dmitri, its somewhat borderline.
There are other Ada versions farther down the list of Ada programs, I think that are closer to the original intent. 

See https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/threadring-gnat-4.html

or 

https://benchmarksgame-team.pages.debian.net/benchmarksgame/program/threadring-gnat-1.html

I think some of the other benchmarks at this site are better inter-language comparisons. This test is a bit too abstract, since what a language calls a thread is a pretty loose definition. But I think part of the value of this site is to illustrate unique language features, so in that regard I think it works ok, since protected objects are one of the main features of Ada.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-04-08  0:29                   ` Brad Moore
@ 2018-04-08  8:25                     ` Jeffrey R. Carter
  0 siblings, 0 replies; 39+ messages in thread
From: Jeffrey R. Carter @ 2018-04-08  8:25 UTC (permalink / raw)


On 04/08/2018 02:29 AM, Brad Moore wrote:
> 
> You are probably right, as echoed by Dmitri, its somewhat borderline.
> There are other Ada versions farther down the list of Ada programs, I think that are closer to the original intent.

The one I quickly wrote, in which tasks do actually pass a counter around, 
decrementing each time, until a task receives zero, takes about 5 mins. It 
doesn't have a PO, but does demonstrate a selective accept with terminate 
alternative. That's an interesting Ada tasking feature that you can't use with a 
PO: tasks just stop without having to be told they can.

-- 
Jeff Carter
"My name is Jim, but most people call me ... Jim."
Blazing Saddles
39

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Large number of tasks slows down my program (using debian) - any fix?
  2018-03-29 15:39       ` Jeffrey R. Carter
@ 2018-04-15  5:20         ` reinert
  0 siblings, 0 replies; 39+ messages in thread
From: reinert @ 2018-04-15  5:20 UTC (permalink / raw)


At long last, and after tears and sweat, I may have found the obvious solution. Not all tasks need to be active simultaneously :-) Clever use of protected objects seems to help.

reinert

On Thursday, March 29, 2018 at 5:39:53 PM UTC+2, Jeffrey R. Carter wrote:
> On 03/29/2018 10:46 AM, reinert wrote:
> > More from my investigations on trying to use many active objects (tasks):
> 
> I've had systems that used many hundreds of tasks on Linux without problem. 
> However, they weren't all created at once, and most of them spent most of their 
> time blocked on entry calls. You haven't provided enough information for us to 
> make reasonable suggestions.
> 
> -- 
> Jeff Carter
> "I snapped my chin down onto some guy's fist and
> hit another one in the knee with my nose."
> Play It Again, Sam
> 129


^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2018-04-15  5:20 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-28 18:06 Large number of tasks slows down my program (using debian) - any fix? reinert
2018-03-28 18:49 ` Dennis Lee Bieber
2018-03-28 19:06 ` Paul Rubin
2018-03-28 19:21 ` Dmitry A. Kazakov
2018-03-28 20:17   ` reinert
2018-03-29  8:46     ` reinert
2018-03-29  9:18       ` Dmitry A. Kazakov
2018-03-29 15:39       ` Jeffrey R. Carter
2018-04-15  5:20         ` reinert
2018-03-29 22:33 ` Shark8
2018-03-30  9:04   ` Dmitry A. Kazakov
2018-03-30 20:46     ` Paul Rubin
2018-03-31  0:09       ` Randy Brukardt
2018-03-31  6:00         ` Paul Rubin
2018-03-31  9:37           ` Jacob Sparre Andersen
2018-03-31 10:44             ` Dmitry A. Kazakov
2018-04-02  3:35           ` Randy Brukardt
2018-04-02  6:23     ` alby.gamper
2018-04-02  7:12       ` alby.gamper
2018-04-05 14:07       ` Brad Moore
2018-04-05 15:09         ` Dmitry A. Kazakov
2018-04-07  4:16           ` Brad Moore
2018-04-05 15:30         ` Jeffrey R. Carter
2018-04-05 19:33           ` Spiros Bousbouras
2018-04-05 19:44           ` Simon Wright
2018-04-05 20:25             ` Jeffrey R. Carter
2018-04-06  5:58         ` Benchmarks Game: Thread ring (Was: Large number of tasks slows down my program (using debian) - any fix?) Jacob Sparre Andersen
2018-04-07  4:28           ` Brad Moore
2018-04-06 15:48         ` Large number of tasks slows down my program (using debian) - any fix? Jeffrey R. Carter
2018-04-07  4:39           ` Brad Moore
2018-04-07  8:15             ` Jeffrey R. Carter
2018-04-07 16:28               ` Brad Moore
2018-04-07 18:41                 ` Jeffrey R. Carter
2018-04-08  0:29                   ` Brad Moore
2018-04-08  8:25                     ` Jeffrey R. Carter
2018-04-08  0:06                 ` Robert I. Eachus
2018-04-07 16:51               ` Brad Moore
2018-04-07 12:21         ` Simon Wright
2018-04-07 16:57           ` Brad Moore

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox