[RFC] StatProfile: Heap and black-hole sampling
This feeds the statistical profiler introduced in D1214 with samples taken on heap allocation and blackhole-blocking events.
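
For readers unfamiliar with allocation-driven sampling, here is a minimal sketch of the general idea, not the code in this patch: count allocated bytes and emit a sample, attributed to the allocating code address, each time a fixed interval is crossed. The `heap_sampler` type, its fields, and both functions are hypothetical names invented for this illustration; none of them exist in the GHC RTS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sampler state; none of these names come from the GHC RTS. */
typedef struct {
    size_t interval;      /* emit one sample per `interval` bytes allocated */
    size_t since_last;    /* bytes allocated since the last sample */
    size_t n_samples;     /* number of samples emitted so far */
    uintptr_t last_addr;  /* attribution address of the most recent sample */
} heap_sampler;

static void heap_sampler_init(heap_sampler *s, size_t interval)
{
    assert(interval > 0);
    s->interval = interval;
    s->since_last = 0;
    s->n_samples = 0;
    s->last_addr = 0;
}

/* Record an allocation of `bytes` attributed to code address `addr`.
 * Returns the number of samples this allocation produced. */
static size_t heap_sampler_alloc(heap_sampler *s, uintptr_t addr, size_t bytes)
{
    size_t total = s->since_last + bytes;
    size_t emitted = total / s->interval;

    s->since_last = total % s->interval;
    if (emitted > 0) {
        s->n_samples += emitted;
        s->last_addr = addr;
    }
    return emitted;
}
```

With a 1024-byte interval, a 512-byte allocation followed by a 600-byte one yields a single sample attributed to the second allocation, with 88 bytes carried over towards the next sample.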

Additional sampling strategies (e.g. based on time, CPU performance counters,
etc.) are left for future work.

The implementation of heap sampling is admittedly a bit hacky. It is based on
Peter Wortmann's c01384a26d7c9d22d26a760470bdb6379a2913ee. Lacking a better
idea, I follow Peter's lead and lay claim to R9 to pass the attribution address
to stg_gc_noregs.

Yeah, the R9 hack is certainly bad. When I originally wrote the code, stg_gc_noregs was a preprocessor macro, which made it easier in a number of ways.

Most critically, I just remembered that there's now a significant hole here: the code generator calls stg_gc_noregs directly (see compiler/codeGen/StgCmmHeap.hs), and those calls obviously wouldn't have R9 set. I think I had a patch to fix that, but I might not have committed it publicly, as I was hoping to find a better solution.
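
To make the hole concrete, here is a toy model in plain C. It is only an illustration under stated assumptions: a global variable stands in for R9, `gc_entry_model` stands in for the GC entry point, and every name below is hypothetical. The real code is Cmm, and this sketch assumes a register that nobody sets simply keeps its previous contents, whereas in practice it could hold anything.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the R9 register claimed by the patch; purely illustrative. */
static uintptr_t attribution_reg = 0;

/* Address recorded by the most recent sample. */
static uintptr_t last_sampled_addr = 0;

/* Models the GC entry point: it attributes the sample to whatever
 * address the attribution register currently holds. */
static void gc_entry_model(void)
{
    last_sampled_addr = attribution_reg;
}

/* A call site that follows the convention: set the register, then enter. */
static void call_site_with_attribution(uintptr_t own_addr)
{
    assert(own_addr != 0);  /* 0 is reserved here to mean "never set" */
    attribution_reg = own_addr;
    gc_entry_model();
}

/* A call site that enters directly, as code emitted by the code generator
 * would without a fix; the register keeps its previous contents. */
static void call_site_direct(void)
{
    gc_entry_model();
}
```

Calling `call_site_with_attribution(0x1000)` and then `call_site_direct()` leaves `last_sampled_addr` at `0x1000`, so in this model the second sample is silently attributed to the wrong call site.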

I always meant to check whether there was a better solution than this. When I originally wrote the code, stg_gc_noregs was a preprocessor macro, which made it easier.

Note: Punting off the review queue until we want to revive this and look at it more closely.

