WIP Add HeapView functionality

Authored by patrickdoc on Feb 1 2017, 4:22 AM.



This pulls parts of Joachim Breitner's ghc-heap-view library inside GHC.
The bits added are the C hooks into the RTS and a basic Haskell wrapper
to these C hooks. The main reason for these to be added to GHC proper
is that the code needs to be kept in sync with the closure types
defined by the RTS. It is expected that the version of HeapView shipped
with GHC will always work with that version of GHC and that extra
functionality can be layered on top with a library like ghc-heap-view
distributed via Hackage.

Test Plan


Diff Detail

rGHC Glasgow Haskell Compiler
Automatic diff as part of commit; lint not applicable.
Automatic diff as part of commit; unit tests not applicable.
There are a very large number of changes, so older changes are hidden. Show Older Changes
erikd retitled this revision from Add HeapView functionality to WIP Add HeapView functionality.Feb 2 2017, 2:32 PM
erikd edited edge metadata.
erikd updated this revision to Diff 10912.Feb 3 2017, 2:34 AM
  • Rename package to ghc-heap
bgamari requested changes to this revision.Feb 8 2017, 6:49 PM

Looks pretty good so far but not quite there yet.


Frankly I wonder whether Storable should really be used here. Partial instances make me a bit uneasy.


It would be great if these constructors had documentation attached to them. The closure type names aren't terribly descriptive to those unfamiliar with GHC's heap representation.


I'm not sure I like the name allPtrs here; afterall, as the documentation says, the things returned aren't any old pointer, they are closures.


Perhaps drop this?


s/Haskell word/the Haskell world/ I believe.


I'm rather confused; what is wrong with GHC.Arr.amap?


Hmmm indeed. One option would be for the runtime system to expose a boolean flag with this piece of information. We could do the same for tables-next-to-code.


What does this function do? Documentation is definitely needed as the type is quite unhelpful.


Is the plan here to extend this to cover the other easily accessible closure types (e.g. pointer and word arrays)?

8 ↗(On Diff #10912)

A comment describing the intended behavior and type of this function would be helpful.

This revision now requires changes to proceed.Feb 8 2017, 6:49 PM
erikd added inline comments.Feb 9 2017, 1:19 AM

Yep, intend to change that.


I'm hoping to test all the things here. Since submitting this I've figured out there's really quite a bit more work to do.

simonmar requested changes to this revision.Feb 11 2017, 3:30 AM

In general I support having this, however as it stands there are too many problems.

  • It needs to work for all configurations (TABLES_NEXT_TO_CODE, profiling)
  • There's a great deal of overlap with existing code in other places (the GHCi debugger, GHCi itself, and probably bits of the code generator). If this is being brought into the GHC tree we should at least make some effort to de-dup so that we don't have yet more dependencies on representations.

Needs a header comment and multi-slurp protection.


Why is there a strictness annotation on the Any? At the least it seems confusing, since we want to store thunks in a Box.


The version in GHCi.InfoTables is also less brittle, as it uses hsc2hs to pull the field offsets from the header files. I'm worried about having yet another place that needs to be kept consistent with the representation.


ScopedTypeVariables is a better way to do this


Why don't you use unpackClosure#?


Won't work unregisterised.


Yeah, this is a problem. What's wrong with #ifdef PROFILING?

8 ↗(On Diff #10912)

Please merge with unpackClosure#

nomeata added inline comments.Feb 11 2017, 9:23 AM

unpackClosure# does not follow pointers from thunks, and generally supports less closure types, so we cannot use it directly. Merging them would be an option, if we are allowed to change the output of unpackClosure#.


#ifdef PROFILING would only work if this is part of the rts code. Library code needs to work when linked agains either of the runtimes, so a static check is not good enough, isn’t it?

bgamari edited the summary of this revision. (Show Details)Feb 11 2017, 9:40 AM
erikd added inline comments.Feb 11 2017, 6:48 PM

Does it make sense to move GHCi.Infotables to main GHC tree so it can be used in both places?

simonmar added inline comments.Feb 13 2017, 2:54 AM

Sure, if heap-view is going to be part of GHC it would be better to merge.


We compile Haskell code separately for profiling (with -prof), so you can use #ifdef PROFILING in Haskell code just like in C code. We don't do it very often, but it works just fine. (provided we don't change any APIs conditionally, there's a dark corner here related to TemplateHaskell without -fexternal-interpreter).

erikd abandoned this revision.Mar 16 2017, 4:35 PM

There's a lot of work to be done on this and I can't see myself finding the time.

What's needed:

  • Make it work for the debug, profiling etc runtimes.
  • Make it work for closures with unlifted types.
In D3055#96358, @erikd wrote:

There's a lot of work to be done on this and I can't see myself finding the time.

Too bad, but understandable. The Haskell runtime just was not created with this in mind.

bgamari updated this revision to Diff 14741.Nov 20 2017, 7:33 PM

Rebase and fix a few issues

bgamari updated this revision to Diff 14748.Nov 21 2017, 10:38 AM

Kill submodule changes

Phyx requested changes to this revision.Dec 23 2017, 10:31 AM

pushing out of review queue, will need Hadrian integration.

This revision now requires changes to proceed.Dec 23 2017, 10:31 AM
patrickdoc added a subscriber: patrickdoc.

Deduplicate code and integrate ghc-heap

patrickdoc commandeered this revision.Mar 26 2018, 12:58 AM
patrickdoc added a reviewer: erikd.

I've updated the revision so that it builds under "quick" and removes much of the duplication. Heap.hsc is currently structured to make the origin of different functions clear. Most of the functions are taken straight from various places in GHC. I've pulled them all out of their original sources and dealt with the fallout of the changes. compiler/ghci/RtClosureInspect.hs required the most changes.

Currently all but one test in the testsuite pass (under "quick", profiling is still broken). Some problems/places for extra eyes are commented on inline.

397 ↗(On Diff #15853)

As mentioned in the comment, this compiles with the old version of unpackClosure, but is not correct. As a simple example, the old 'dat' is the non-pointer arguments, while the new 'dat' is the entire closure.

552 ↗(On Diff #15853)

This function is a problem for getting profiling to work. I've included the current definition from DebuggerUtils below the large comment. It's very similar, but I can't test profiling/tables_next_to_code at the moment due to a bug in the C code.

686 ↗(On Diff #15853)

This type is almost identical to HValue: https://phabricator.haskell.org/diffusion/GHC/browse/master/libraries/ghci/GHCi/RemoteTypes.hs;639e702b6129f501c539b158b982ed8489e3d09c$58. One difference is that this a data type while HValue is a newtype, but they are used similarly.

Using Box here causes a bit of type swizzling in RtClosureInspect.


MUT_VAR_DIRTY is never handled, which causes the lone failing test print028. This function should fill the ptrs array with pointers to associated closures. getClosureData assumes that there is a single pointer here, and I am not sure which is correct.


This causes a profiling build to fail with CCCS undeclared

bgamari added inline comments.Mar 26 2018, 9:25 AM

Indeed, I think adding the referee, as is done in the clean case, is correct.


I think this should probably be cap->r.rCCCS.

Great work!

Add support for more ways

  • Support Tables_Next_To_Code
  • Support Profiling

Both TNTC and Profiling should now be supported. The biggest difficulty was the interactions between prof and ghci. I pulled a bit more code out of libraries/ghci and into heap-view to reduce the number of places making CPP profiling checks and adjusting pointers, but it's possible that there are some broken interactions hiding. I ran the testsuite a couple times for each way, and only ran into some transient failures that didn't seem like they could have come from me (although they certainly could have).

As a longer term note, I went pretty light on the deduplication front. I think RtClosureInspect in particular has a fair amount of code that probably makes more sense in heap-view. I'm more concerned with making it work, but I think a slightly more feature-full version of the library could simplify chunks of the existing code.

Great work, @patrickdoc! I think we are getting close with this.

Have you tested this in the non-TNTC case?


Which Word does this refer to?


Hmm, that's rather suspicious. Any idea why?


Indeed it sounds like there is some memory unsafety afoot here.


Any reason why this shouldn't be called ConstrClosure? Not only would this be more consistent with the rest of the codebase but I keep reading Cons as the list cons constructor.


Hmm, judging from the Harbormaster output it looks like this breaks when the tree is built using ./validate, which doesn't build profiled libraries.

Thanks! I built unregisterized and tested, but before I got profiling working 100%. So I don't think I've covered the TNTC/Profiling matrix of options.

I have a solution for the unboxed/unlifted closures that just needs a bit more sketching out too.


I believe this is referring to the fact that the values returned by unpackClosure, (_, dat, pointers), overlap. dat is the raw words of the closure. In the case of a constructor, that would be header, pointers, and non-pointers. pointers on the other hand wraps the pointer arguments of the closure in a Box so that when GC or evaluation updates those arguments, the boxed pointer is also updated. So the raw word is just an address, but the value inside the box follows the changes to the argument.

(I'm not 100% on how the pointer updating magic works, but this is my impression from reading the code.)


My guess is that both this and the next one have to do with making sure that the pointer arguments get correctly tracked within the boxes. Something to the effect of resolving the pointer while it still in fact points to the expected closure. Again, mostly guessing here.


Yeah, Constr probably makes more sense. I've mostly just left names as they are for lack of better ideas, so I'm fine with anything.


I haven't used the testsuite much, so I have a bit of exploring to do. I'll fix this up with other changes to the tests.

patrickdoc updated this revision to Diff 16167.Apr 24 2018, 2:06 AM
  • Add unboxed unlifted support
  • Fix (not TNTC && PROF) bug

Ok, I've now tested both non-TNTC prof and non-prof, I even found a bug :P I also updated the comment about the Word, changed to ConstrClosure, and fixed the test profiling.

At this point, I think I would describe it as minimally complete. It could probably use a bit of refinement though. Places that could use eyes in particular:

  • The HasHeapRep class. I've never worked with kinds before, so advice appreciated.
  • The qualified import of InfoTableProf in Heap.hs. The profiled object file was missing when it was needed.
  • The tests cover as many closures as I could figure out how to make. I'm not sure how to generate the rest.

I forgot to make the tests compare on more fields, so I'll do that. There is also the memory stuff at the end of getClosureRaw to explore.
Aside from that I think it's mostly names/organizational stuff to improve.

  • Test more fields of closures for equality
  • Remove unnecessary evaluates
  • Expose all modules to fix Prof bug and match other libs

Just to make sure this doesn't fade away: bump.

Status: I think a review would be beneficial at this point. I'm not sure what is needed to wrap this diff up/where to go from here.

This is looking great! I've had this one lone comment hanging around from a review I started last week. I'll just leave submit it to give you something now.

I'll try to get time to do a more in-depth review tonight or tomorrow.


It would be good if this had a docstring, particularly because there are two other functions in this module with a very similar type.

bgamari accepted this revision.May 16 2018, 2:02 PM

I did another pass over this and as far as I can tell we have converged on something mergeable. There are likely subtleties that we will encounter in due course, but I think it's time to merge.

Thanks for your efforts on this, @patrickdoc and @erikd!

This revision was not accepted when it landed; it landed in state Needs Review.May 20 2018, 5:14 PM
Closed by commit rGHCec22f7ddc81b: Add HeapView functionality (authored by patrickdoc, committed by bgamari). · Explain Why
This revision was automatically updated to reflect the committed changes.