Merge non-moving garbage collector

Authored by bgamari on Wed, Oct 23, 1:01 PM.

Description

Merge non-moving garbage collector

This introduces a concurrent mark & sweep garbage collector to manage the old
generation. The concurrent nature of this collector typically results in
significantly reduced maximum and mean pause times in applications with large
working sets.

Due to the large and intricate nature of the change I have opted to
preserve the fully-buildable history, including merge commits, which is
described in the "Branch overview" section below.

Collector design

The full design of the collector implemented here is described in detail
in a technical note

B. Gamari. "A Concurrent Garbage Collector For the Glasgow Haskell
Compiler" (2018)

This document can be requested from @bgamari.
The basic heap structure used in this design is heavily inspired by

K. Ueno & A. Ohori. "A fully concurrent garbage collector for
functional programs on multicore processors." /ACM SIGPLAN Notices/
Vol. 51. No. 9 (presented at ICFP 2016)

This design is intended to allow both marking and sweeping
concurrent to execution of a multi-core mutator. Unlike the Ueno design,
which requires no global synchronization pauses, the collector
introduced here requires a stop-the-world pause at the beginning and end
of the mark phase.

To avoid heap fragmentation, the allocator consists of a number of
fixed-size /sub-allocators/. Each of these sub-allocators allocators into
its own set of /segments/, themselves allocated from the block
allocator. Each segment is broken into a set of fixed-size allocation
blocks (which back allocations) in addition to a bitmap (used to track
the liveness of blocks) and some additional metadata (used also used
to track liveness).

This heap structure enables collection via mark-and-sweep, which can be
performed concurrently via a snapshot-at-the-beginning scheme (although
concurrent collection is not implemented in this patch).

Implementation structure

The majority of the collector is implemented in a handful of files:

  • rts/Nonmoving.c is the heart of the beast. It implements the entry-point to the nonmoving collector (nonmoving_collect), as well as the allocator (nonmoving_allocate) and a number of utilities for manipulating the heap.
  • rts/NonmovingMark.c implements the mark queue functionality, update remembered set, and mark loop.
  • rts/NonmovingSweep.c implements the sweep loop.
  • rts/NonmovingScav.c implements the logic necessary to scavenge the nonmoving heap.

Branch overview

* wip/gc/opt-pause:
|   A variety of small optimisations to further reduce pause times.
|
* wip/gc/compact-nfdata:
|   Introduce support for compact regions into the non-moving
|\  collector
| \
|  \
| | * wip/gc/segment-header-to-bdescr:
| | |   Another optimization that we are considering, pushing
| | |   some segment metadata into the segment descriptor for
| | |   the sake of locality during mark
| | |
| * | wip/gc/shortcutting:
| | |   Support for indirection shortcutting and the selector optimization
| | |   in the non-moving heap.
| | |
* | | wip/gc/docs:
| |/    Work on implementation documentation.
| /
|/
* wip/gc/everything:
|   A roll-up of everything below.
|\
| \
| |\
| | \
| | * wip/gc/optimize:
| | |   A variety of optimizations, primarily to the mark loop.
| | |   Some of these are microoptimizations but a few are quite
| | |   significant. In particular, the prefetch patches have
| | |   produced a nontrivial improvement in mark performance.
| | |
| | * wip/gc/aging:
| | |   Enable support for aging in major collections.
| | |
| * | wip/gc/test:
| | |   Fix up the testsuite to more or less pass.
| | |
* | | wip/gc/instrumentation:
| | |   A variety of runtime instrumentation including statistics
| | /   support, the nonmoving census, and eventlog support.
| |/
| /
|/
* wip/gc/nonmoving-concurrent:
|   The concurrent write barriers.
|
* wip/gc/nonmoving-nonconcurrent:
|   The nonmoving collector without the write barriers necessary
|   for concurrent collection.
|
* wip/gc/preparation:
|   A merge of the various preparatory patches that aren't directly
|   implementing the GC.
|
|
* GHC HEAD
.
.
.

Merged Changes

This commit merges a very large number of changes. Only the first 50 are shown.
CommitAuthorDetailsCommitted
984745b074c1bgamari
nonmoving: Upper-bound time we hold SM_MUTEX for during sweep 
Oct 22
a69b28f4c33cbgamari
nonmoving: Don't do two passes over large and compact object lists 
Oct 22
91109404b7acbgamari
nonmoving: Trace GC preparation steps 
Oct 22
7c35d39bc468bgamari
rts: Mark nonmoving GC paths in moving collector as unlikely 
Oct 22
3a862703765bosa1/bgamari
rts: COMPACT_NFDATA support for the nonmoving collector 
Oct 22
22eee2bcc67abgamari
Merge branches 'wip/gc/segment-header-to-bdescr' and 'wip/gc/docs' into… 
Oct 22
116e4646f901bgamari
NonMoving: Add summarizing Note 
Oct 22
dd8d1b4928a9bgamari
NonMoving: Move next_free_snap to block descriptor 
Oct 22
6dcef5eedaeebgamari
NonMoving: Move block size to block descriptor 
Oct 22
c936a245e723bgamari
NonMoving: Introduce nonmovingSegmentLogBlockSize acccessor 
Oct 22
0f8fd3c6ad52osa1/bgamari
NonMoving: Implement -xns to disable selector optimization 
Oct 22
c72e84c6afeabgamari
NonMovingMark: Handle INDs left by shortcutting 
Oct 22
875861efae06osa1/bgamari
NonMoving: Implement selector optimisation 
Oct 22
246ce2af639cosa1/bgamari
NonMoving: Implement indirection shortcutting 
Oct 22
5b130b3d9d69bgamari
Merge branches 'wip/gc/optimize' and 'wip/gc/test' into wip/gc/everything 
Oct 22
8cab149bba91bgamari
testsuite: Mark length001 as failing under nonmoving ways 
Oct 22
25ae8f7d3a2cbgamari
testsuite: Don't run T7160 in nonmoving_thr ways 
Oct 22
99baff8c7ba2bgamari
testsuite: Don't run T9630 in nonmoving ways 
Oct 22
6abefce77dbebgamari
Skip ghc_heap_all test in nonmoving ways 
Oct 22
5ce853c8e4c1bgamari
ghc-heap: Skip heap_all test with debugged RTS 
Oct 22
6e97cc47d17ebgamari
testsuite: Skip T15892 in nonmoving_thr_ghc 
Oct 22
78ce35b96774bgamari
testsuite: bug1010 requires -c, which isn't supported by nonmoving 
Oct 22
4b91dd2566e4bgamari
testsuite: Ensure that threaded tests are run in nonmoving_thr 
Oct 22
097f4fd0e242bgamari
testsuite: Nonmoving collector doesn't support -G1 
Oct 22
01fd0242c9e8bgamari
testsuite: Don't run T15892 in nonmoving ways 
Oct 22
079879570bd6bgamari
testsuite: Add nonmoving_thr_ghc way 
Oct 22
b281e80be516bgamari
testsuite: Add nonmoving_thr way 
Oct 22
8e79e2a973c1bgamari
Unconditionally flush update remembered set during minor GC 
Oct 22
3bc172a41c43bgamari
NonMoving: Clean mut_list 
Oct 22
b967e470d806bgamari
NonMoving: Don't do major GC if one is already running 
Oct 22
53a1a27e51f9bgamari
NonMovingMark: Eliminate redundant check_in_nonmoving_heaps 
Oct 22
19bfe460cd70bgamari
NonMoving: Optimise allocator cache behavior 
Oct 22
56c5ebdc5d90bgamari
NonMoving: Prefetch segment header 
Oct 22
e6f6823f1eb5bgamari
NonMoving: Pre-fetch during mark 
Oct 22
e893877e3964bgamari
NonMoving: Fuse sweep preparation into mark prep 
Oct 22
0387df5bfd16bgamari
NonMoving: Inline nonmovingClearAllBitmaps 
Oct 22
786c52d25e94bgamari
NonMoving: Prefetch when clearing bitmaps 
Oct 22
dacf4cae179ebgamari
rts: Add prefetch macros 
Oct 22
e5eda61e768abgamari
NonMoving: Optimize bitmap search during allocation 
Oct 22
26d2d3316dcbbgamari
NonMovingMark: Optimize representation of mark queue 
Oct 22
d15ac82dcaf1bgamari
NonMoving: Allocate mark queues in larger block groups 
Oct 22
039d29068f0cbgamari
NonMoving: Eliminate integer division in nonmovingBlockCount 
Oct 22
8fffe12bf613bgamari
More comments for aging 
Oct 22
7b79e8b49bb5bgamari
Disable aging when doing deadlock detection GC 
Oct 22
13dd78ddb158bgamari
Nonmoving: Allow aging and refactor static objects logic 
Oct 22
6f1731812331bgamari
NonmovingCensus: Emit samples to eventlog 
Oct 22
0d31819ed27fbgamari
Allow census without live word count 
Oct 22
711837cc4af8bgamari
rts/Eventlog: More descriptive error message 
Oct 22
9f42cd81af3dghc-builders/bgamari
rts: Introduce non-moving heap census 
Oct 22
912e440e6f02ghc-builders/bgamari
rts: Tracing support for nonmoving collection events 
Oct 22