Fix a bug in parallel GC synchronisation
Closed, Public

Authored by simonmar on Oct 28 2016, 10:49 AM.

Details

Summary

The problem boils down to global variables: in particular gc_threads[],
which was being modified by a subsequent GC before the previous GC had
finished with it. The fix is to not use global variables.
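
As a rough illustration of the idea in the summary (not the actual patch), the sketch below keeps per-GC state in a context owned by each collection instead of in a global array. The names `gc_cycle`, `gc_cycle_begin`, `gc_cycle_thread_state` and `gc_cycle_end` are hypothetical and do not exist in the RTS; the real structures differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-cycle context; NOT the real RTS data structure.
 * Each collection owns its own copy, so starting the next collection
 * cannot overwrite state that the previous one is still reading. */
typedef struct {
    unsigned n_threads;   /* GC threads taking part in this cycle */
    int *thread_state;    /* one slot per GC thread, owned by this cycle */
} gc_cycle;

/* Allocate a fresh context for one collection. Returns NULL on failure. */
gc_cycle *gc_cycle_begin(unsigned n_threads, int initial_state)
{
    assert(n_threads > 0);
    gc_cycle *c = malloc(sizeof *c);
    if (c == NULL) return NULL;
    c->thread_state = malloc(n_threads * sizeof *c->thread_state);
    if (c->thread_state == NULL) {
        free(c);
        return NULL;
    }
    c->n_threads = n_threads;
    for (unsigned i = 0; i < n_threads; i++) {
        c->thread_state[i] = initial_state;
    }
    return c;
}

/* Read the state of thread i as seen by this particular cycle. */
int gc_cycle_thread_state(const gc_cycle *c, unsigned i)
{
    assert(c != NULL && i < c->n_threads);
    return c->thread_state[i];
}

/* Release the context once every participant has finished with it. */
void gc_cycle_end(gc_cycle *c)
{
    if (c != NULL) {
        free(c->thread_state);
        free(c);
    }
}
```

With this shape, a second call to `gc_cycle_begin` returns a separate object, so a thread still holding the first context keeps seeing consistent values until `gc_cycle_end` is called on it.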

This was causing setnumcapabilities001 to fail (again!). It's an old
bug though.

Test Plan

Ran setnumcapabilities001 in a loop for a couple of hours. Before this
patch it had been failing after a few minutes. Not a very scientific
test, but it's the best I have.

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
simonmar updated this revision to Diff 9220. Oct 28 2016, 10:49 AM
simonmar retitled this revision to "Fix a bug in parallel GC synchronisation".
simonmar updated this object.
simonmar edited the test plan for this revision.
simonmar added reviewers: bgamari, fryguybob, niteria.
bgamari edited edge metadata. Oct 28 2016, 11:35 AM

Death to global state!

Good catch. One comment inline.

rts/Schedule.c, lines 1537–1540

It would be nice to attach a comment here explaining that it corresponds to the capabilities, and that true indicates that the capability is sitting idle during this parallel GC.
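
A comment of the kind requested might look roughly like the sketch below. The array name `idle_cap` and the helper `count_gc_participants` are assumptions made for illustration; the thread does not show the code at those lines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical example of the requested documentation.
 *
 * idle_cap[i] corresponds to capability i:
 *   true  => capability i sits idle during this parallel GC
 *   false => capability i takes part in this parallel GC
 *
 * The array is passed in by the caller rather than read from a global.
 */
unsigned count_gc_participants(const bool idle_cap[], unsigned n_caps)
{
    assert(idle_cap != NULL || n_caps == 0);
    unsigned participants = 0;
    for (unsigned i = 0; i < n_caps; i++) {
        if (!idle_cap[i]) {
            participants++;
        }
    }
    return participants;
}
```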

simonmar updated this revision to Diff 9221. Oct 28 2016, 2:12 PM
simonmar edited edge metadata.

add comment

This revision was automatically updated to reflect the committed changes.
fryguybob edited edge metadata. Nov 1 2016, 8:05 PM

Looks good to me.