This is a work in progress. There are many "TODO DOUG"s scattered through the
code, but nearly all of them are for small refactorings, improvements, or
reminders to write documentation.
Thank you in advance for reviewing such a large patch! If you have any ideas for
better names, or anything else, please do share. I think comments on the
design should go on the ticket.
Test coverage of this code is not very good, because few tests exercise the
parUpsweep code path. I intend to rectify this in another diff, as well as add
some unit and (maybe) property tests for ParUpsweep.
The main challenge remaining is to properly separate typechecking and
simplifying; see the comment in HscMain.finish (awkwardly, that comment is in a
previous diff). As is, because typechecking depends on simplification of all
dependencies (see GhcMake.hs:966), a module is only ever typechecked
after its dependencies have had their full simplified interfaces put in the home
package table, so there isn't currently any gain in parallelism from having
a separate typechecking state.
The separation of parsing is not usually particularly useful, but I do expect it
to improve compilations with -fno-code.
I have done only cursory benchmarking, all on a four-core machine. Benchmarking
is done with the script in the gist below, with ghcs built from this commit and
from cb6cf3c53fbfae8fc2907660b91cfc14da775865, both built with the build.mk in
the gist: https://gist.github.com/duog/2462bc4317b3aceb2bcf288be7e261af
The script times cabal installing lens and all its dependencies into a sandbox,
giving cabal "-j1" and ghc "-j4 +RTS -A128m -RTS". So it does include
cabal configuring all the packages, as well as copying and registering.
This commit:
real 2m13.584s
user 4m10.932s
sys 0m13.040s
82652 .cabal-sandbox
cb6cf3c:
real 2m44.675s
user 4m11.476s
sys 0m12.980s
82656 .cabal-sandbox
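About a 19% improvement in total (real) time, going by the figures above
(2m44.675s down to 2m13.584s); user and sys time are essentially unchanged. Note
that this test disables cabal's parallelism, so we can't expect anything like
that for a real cabal install lens.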
I will be doing some better benchmarking, including of -fno-code, and will update
this summary when that data is available. I expect the effect to vary widely
between packages; for example, the compile time of lens itself seems to be
unaffected by this commit.
- Commit message:
We now schedule work during a parallel upsweep very differently. There is still
one worker thread per module, but the workers are now scheduled through the
new logic in ParUpsweep. They will block after completing:
- parsing
- simplifying
Machinery is present to block after desugaring as well, but this is not yet
used. See the comment in HscMain.finish.
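
As a rough illustration of the blocking described above, here is a minimal
sketch in which each per-module worker gives up its scheduler slot at every
phase boundary and has to reacquire one before continuing. Everything in it is
hypothetical: the Phase type, runPhase, worker and runWorkers are made-up names,
a plain QSem stands in for the ParUpsweep scheduler, and the sketch yields after
every phase rather than only after parsing and simplifying. It is not how the
HscYield callbacks are actually wired up.

```haskell
import Control.Concurrent (forkFinally)
import Control.Concurrent.MVar
import Control.Concurrent.QSem
import Control.Exception (bracket_)
import Control.Monad (forM, forM_)

-- | Hypothetical phases; the names mirror this summary, not GHC's own types.
data Phase = Parse | Typecheck | Simplify
  deriving (Eq, Show, Enum, Bounded)

-- | Run one phase while holding a scheduler slot. The slot is released when
-- the phase finishes, which is the point where the worker "blocks" again.
runPhase :: QSem -> IO a -> IO a
runPhase slots = bracket_ (waitQSem slots) (signalQSem slots)

-- | One worker per module. Each finished phase is recorded in a shared log;
-- a real worker would do the actual compilation work here instead.
worker :: QSem -> MVar [(String, Phase)] -> String -> IO ()
worker slots logVar modName =
  forM_ [minBound .. maxBound] $ \phase ->
    runPhase slots $ modifyMVar_ logVar (pure . ((modName, phase) :))

-- | Start one worker thread per module, allow at most n of them to be inside
-- a phase at once, and return the log in the order phases completed.
runWorkers :: Int -> [String] -> IO [(String, Phase)]
runWorkers n mods = do
  slots <- newQSem n
  logVar <- newMVar []
  dones <- forM mods $ \m -> do
    done <- newEmptyMVar
    _ <- forkFinally (worker slots logVar m) (\_ -> putMVar done ())
    pure done
  mapM_ takeMVar dones
  reverse <$> readMVar logVar
```

Calling runWorkers 2 ["A", "B", "C"] runs three workers with at most two inside
a phase at any time. The interleaving between modules is not deterministic, but
each module's own phases always appear in order.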
The earlier changes to move hsc_HPT from a HomePackageTable to an
IORef PackageTable, and to add the HscYield callbacks, enable this blocking.
In addition to more fine-grained parallelism, the workers are scheduled by
priority, in an attempt to unblock as much work as possible. The estimate of
required work per task is currently very simple; see
GhcMake.calculateMakePriority.
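
To make the idea of scheduling by priority concrete, here is a small sketch of
one possible heuristic: rank each ready module by how many other modules
transitively depend on it, and run the highest-ranked one first. This is an
assumption made for illustration only. It is not the estimate computed by
GhcMake.calculateMakePriority, and ModName, DepGraph, priority and
scheduleOrder are hypothetical names.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set
import Data.List (sortOn)
import Data.Ord (Down (..))

-- | Hypothetical module name type, for illustration only.
type ModName = String

-- | Direct imports of each home module (module -> modules it imports).
type DepGraph = Map.Map ModName [ModName]

-- | Reverse edges: module -> modules that import it directly.
reverseDeps :: DepGraph -> Map.Map ModName [ModName]
reverseDeps g =
  Map.fromListWith (++) [(dep, [m]) | (m, deps) <- Map.toList g, dep <- deps]

-- | All modules that transitively depend on the given module.
transitiveDependents :: DepGraph -> ModName -> Set.Set ModName
transitiveDependents g start = go Set.empty (direct start)
  where
    rev = reverseDeps g
    direct m = Map.findWithDefault [] m rev
    go seen [] = seen
    go seen (m : rest)
      | m `Set.member` seen = go seen rest
      | otherwise = go (Set.insert m seen) (direct m ++ rest)

-- | Assumed priority: how many modules are (transitively) waiting on this one.
priority :: DepGraph -> ModName -> Int
priority g = Set.size . transitiveDependents g

-- | Order ready tasks so that the one unblocking the most work comes first.
-- sortOn is stable, so ties keep their original relative order.
scheduleOrder :: DepGraph -> [ModName] -> [ModName]
scheduleOrder g = sortOn (Down . priority g)
```

With a graph where A imports B and C, and both B and C import D, the module D
has three transitive dependents, so it sorts ahead of B or of an unrelated
module with none.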
There is some change in test output, because more modules will begin to compile
than before. See the change to T14075.stdout.
This code is not very well covered by the test suite, as many tests do not run
with -j >1, and so do not enter the parUpsweep code path. A future commit will
address this.