rts: enable parallel GC scan of large (32M+) allocation area

Authored by trofi on Aug 27 2016, 5:47 PM.



Parallel GC does not scan a large allocation area (-A)
efficiently, as it does not do work stealing from the nursery
by default.

That leads to a large imbalance when only one of the threads
overflows its allocation area: most GC threads finish
quickly (as there is not much to collect) and sit idle
while a single GC thread finishes scanning the
allocation area for that thread.

The patch enables work stealing (the equivalent of -qb0)
for an allocation area of -A32M or higher.

Tested on the highlighting-kate package from Trac #9221.

On an 8-core machine wall-clock time improves by around 5%.
On a 24-core VM the speedup is 20%.

Signed-off-by: Sergei Trofimovich <siarheit@google.com>

Test Plan

Measured wall time and GC parallelism on the highlighting-kate build.

Diff Detail

rGHC Glasgow Haskell Compiler
Lint OK
No Unit Test Coverage
Build Status
Buildable 10805
Build 12858: arc lint + arc unit
trofi updated this revision to Diff 8497. Aug 27 2016, 5:47 PM
trofi retitled this revision to rts: enable parallel GC scan of large (32M+) allocation area.
trofi updated this object.
trofi edited the test plan for this revision.
trofi added a reviewer: simonmar.
trofi updated the Trac tickets for this revision.
trofi updated this revision to Diff 8500. Aug 28 2016, 4:12 AM
trofi edited edge metadata.

Updated commit description about 24-core VM.

trofi updated this object. Aug 28 2016, 4:14 AM
trofi edited edge metadata.
simonmar requested changes to this revision. Aug 29 2016, 2:51 AM
simonmar edited edge metadata.

I wanted to do this slightly differently. With this patch, +RTS -qb1 -A32m will do something unexpected, and different from +RTS -A32m -qb1.

Instead, make parGcLoadBalancingGen == -1 mean "automatic, based on -A", and determine the correct setting in normaliseRtsOpts().

Obviously we also need some changes to the docs too.

This revision now requires changes to proceed. Aug 29 2016, 2:51 AM
trofi updated this revision to Diff 8508. Aug 29 2016, 4:13 AM
trofi edited edge metadata.

Moved the defaulting logic to `normaliseRtsOpts()`
as suggested by Simon. Tweaked the help output and
user guide to mention that the `-qb` default now
depends on `-A`.
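
For readers following the thread, the defaulting scheme Simon suggested can be sketched roughly as below. This is a minimal, self-contained illustration and not the code in the diff: `SketchFlags`, `sketch_normalise`, `LOAD_BALANCING_AUTO` and `LARGE_NURSERY_BYTES` are hypothetical names, and the fallback of generation 1 for small allocation areas is an assumption about the pre-existing default. The point it shows is that a sentinel value is resolved once, after all options are parsed, so the result does not depend on the order of `-qb` and `-A` on the command line.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the RTS flag fields discussed above. */
#define LOAD_BALANCING_AUTO  (-1)                  /* "-qb not given by the user" */
#define LARGE_NURSERY_BYTES  (32LL * 1024 * 1024)  /* the 32M threshold from the patch */

typedef struct {
    int64_t alloc_area_bytes;           /* value of -A, in bytes */
    int     par_gc_load_balancing_gen;  /* value of -qb, or LOAD_BALANCING_AUTO */
} SketchFlags;

/* Resolve the automatic setting after option parsing has finished.
 * An explicit -qb value is never overridden. */
static void sketch_normalise(SketchFlags *flags)
{
    if (flags->par_gc_load_balancing_gen != LOAD_BALANCING_AUTO) {
        return; /* the user chose a generation explicitly */
    }
    if (flags->alloc_area_bytes >= LARGE_NURSERY_BYTES) {
        /* Large nursery: allow work stealing from generation 0 upwards. */
        flags->par_gc_load_balancing_gen = 0;
    } else {
        /* Assumed previous default: balance only older generations. */
        flags->par_gc_load_balancing_gen = 1;
    }
}
```

With this shape, `+RTS -qb1 -A32m` and `+RTS -A32m -qb1` both end up with generation 1, because the explicit value is left untouched in either order.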

bgamari accepted this revision. Aug 29 2016, 2:18 PM
bgamari edited edge metadata.

Looks good to me.

452 ↗(On Diff #8508)

The second `default:` here is redundant.

trofi updated this revision to Diff 8516. Aug 29 2016, 4:09 PM
trofi edited edge metadata.

Dropped redundant 'default:' from docs.

trofi marked an inline comment as done. Aug 29 2016, 4:11 PM
simonmar accepted this revision. Aug 30 2016, 5:40 AM
simonmar edited edge metadata.

Yep. Thanks!

This revision is now accepted and ready to land. Aug 30 2016, 5:40 AM
This revision was automatically updated to reflect the committed changes.