From: "Robert I. Eachus" <rieachus@comcast.net>
Subject: Re: Large number of tasks slows down my program (using debian) - any fix?
Date: Sat, 7 Apr 2018 20:06:50 -0400
Message-ID: <pabmep$1ej9$1@gioia.aioe.org>
In-Reply-To: c41f508c-9b42-422c-9f58-f29c0f611416@googlegroups.com
On 4/7/2018 12:28 PM, Brad Moore wrote:
> Then I thought, why even have 4 workers. Why not just one? When I set the number of Ada tasks to 1, then there is even more improvement, the code completes in 1.6 seconds. With just 1 worker, why even have a protected object?
You are getting into the area of chip-specific optimizations. If you
ran this on an eight-core (16-thread) AMD Zen chip, carefully assigning
processor affinities, two tasks would almost certainly be fastest. Would
one be better than three? No clue. But three through eight should be
about the same, followed by a drop as you go out toward sixteen. On a
six-core (12-thread) Zen, substitute 6 and 12 for 8 and 16 above. Mobile
Zen has different characteristics.
With Intel chips, you need to know whether the chip has Hyperthreading,
and whether it is enabled. You also need to know how many cores are
present, and the sizes of the caches.
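Since the original poster is on Debian, here is a minimal sketch of how one might query those facts from Linux sysfs. It assumes the kernel exposes `/sys/devices/system/cpu/smt/active` and the per-CPU `cache/index*/` files; the helper names (`read_text`, `smt_active`, `cache_sizes`) are made up for illustration, and the functions simply report "unknown" when a file is missing.

```python
import os
from pathlib import Path

# Assumption: Linux sysfs layout; other systems will just report "unknown".
CPU_ROOT = Path("/sys/devices/system/cpu")


def read_text(path):
    """Return the stripped contents of a file, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def smt_active(root=CPU_ROOT):
    """True/False if the kernel reports SMT (Hyperthreading) state, else None."""
    value = read_text(Path(root) / "smt" / "active")
    if value is None:
        return None
    return value == "1"


def cache_sizes(cpu=0, root=CPU_ROOT):
    """Map labels such as 'L1 Data' to the size string sysfs reports, e.g. '32K'."""
    sizes = {}
    cache_dir = Path(root) / f"cpu{cpu}" / "cache"
    if not cache_dir.is_dir():
        return sizes
    for index in sorted(cache_dir.glob("index*")):
        level = read_text(index / "level")
        kind = read_text(index / "type")
        size = read_text(index / "size")
        if level and kind and size:
            sizes[f"L{level} {kind}"] = size
    return sizes


if __name__ == "__main__":
    print("logical CPUs:", os.cpu_count())
    print("SMT active:  ", smt_active())
    print("caches (cpu0):", cache_sizes())
```

The `lscpu` command reports much the same information if you only need it once.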
What is going on? When a processor core (or hardware thread) runs a
task, it needs the token in its L1 data cache. The cache line, 64 or 128
bytes on most modern processors, is larger than the Packet being passed
around, so the whole line moves even though only the Packet is needed.
In addition, the move involves two cores, and may require evicting a
line from the target cache. In other words, that is where most of your
processing time goes in this toy program, and in much bigger programs if
you aren't careful.
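To put rough numbers on that, here is a small arithmetic sketch. The 8-byte packet size is a hypothetical figure, not taken from the benchmark, and the function names are invented for illustration. It only counts the cache lines a packet covers; real coherence traffic involves more than this.

```python
def lines_touched(packet_bytes, line_bytes=64, offset=0):
    """Number of cache lines covered by a packet that starts `offset` bytes
    into a cache line."""
    if packet_bytes <= 0 or line_bytes <= 0 or offset < 0:
        raise ValueError("sizes must be positive and offset non-negative")
    first = offset // line_bytes
    last = (offset + packet_bytes - 1) // line_bytes
    return last - first + 1


def bytes_moved_per_hop(packet_bytes, line_bytes=64, offset=0):
    """Bytes transferred between caches when the packet changes cores,
    counting whole cache lines only."""
    return lines_touched(packet_bytes, line_bytes, offset) * line_bytes


if __name__ == "__main__":
    packet = 8  # hypothetical packet size in bytes
    for line in (64, 128):
        moved = bytes_moved_per_hop(packet, line)
        print(f"{line}-byte lines: {moved} bytes moved for a {packet}-byte packet")
    # A packet that straddles a line boundary drags two lines along.
    print("straddling:", bytes_moved_per_hop(packet, 64, offset=60), "bytes")
```

With these assumed sizes, each hop between cores moves eight to sixteen times more data than the packet itself, and twice that again if the packet is not aligned within a single line.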
Why can AMD Zen CPUs and Intel CPUs with Hyperthreading do better with
two threads than one? You arrange for the two threads to run on the two
logical processors of the same physical core. The caches are then
shared, and no cache move is required.
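On Linux, that arrangement can be sketched roughly as follows. This assumes sysfs exposes `topology/thread_siblings_list` for each CPU and that `os.sched_setaffinity` is available (it is Linux-only); the helper names are hypothetical. It is in Python purely to show the placement idea. An Ada program would use its own affinity facilities, and Python threads will not show the speedup themselves.

```python
import os


def parse_cpu_list(text):
    """Parse a sysfs CPU list such as '0,8' or '0-3,8' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def smt_siblings(cpu=0):
    """Logical CPUs sharing a physical core with `cpu` (including `cpu` itself).
    Falls back to [cpu] when the topology file is not available."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list"
    try:
        with open(path) as handle:
            return parse_cpu_list(handle.read())
    except OSError:
        return [cpu]


def pin_caller(cpu):
    """Restrict the caller to one logical CPU. Returns False if unsupported
    or refused. pid 0 means 'the caller' to the underlying system call."""
    if not hasattr(os, "sched_setaffinity"):
        return False
    try:
        os.sched_setaffinity(0, {cpu})
    except OSError:
        return False
    return True


if __name__ == "__main__":
    siblings = smt_siblings(0)
    print("logical CPUs on the same core as cpu0:", siblings)
    # Worker A would call pin_caller(siblings[0]); worker B, if a second
    # sibling exists, would call pin_caller(siblings[1]).
```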
If this were a real problem, and you needed days or months of CPU time,
you would optimize each thread for the cache space available, break
things up into independent threads or threads that run well together,
and then assign them to the appropriate logical processors.