This pulls parts of Joachim Breitner's ghc-heap-view library inside GHC.
The bits added are the C hooks into the RTS and a basic Haskell wrapper
to these C hooks. The main reason for these to be added to GHC proper
is that the code needs to be kept in sync with the closure types
defined by the RTS. It is expected that the version of HeapView shipped
with GHC will always work with that version of GHC and that extra
functionality can be layered on top with a library like ghc-heap-view
distributed via Hackage.
This pulls parts of Joachim Breitner's ghc-heap-view library inside GHC.
- rGHC Glasgow Haskell Compiler
No Unit Test Coverage
- Build Status
Buildable 20469 Build 44841: [GHC] Linux/amd64: Continuous Integration Build 44840: [GHC] OSX/amd64: Continuous Integration Build 44839: [GHC] Windows/amd64: Continuous Integration Build 44838: arc lint + arc unit
Looks pretty good so far but not quite there yet.
Frankly I wonder whether Storable should really be used here. Partial instances make me a bit uneasy.
It would be great if these constructors had documentation attached to them. The closure type names aren't terribly descriptive to those unfamiliar with GHC's heap representation.
I'm not sure I like the name allPtrs here; afterall, as the documentation says, the things returned aren't any old pointer, they are closures.
Perhaps drop this?
s/Haskell word/the Haskell world/ I believe.
I'm rather confused; what is wrong with GHC.Arr.amap?
Hmmm indeed. One option would be for the runtime system to expose a boolean flag with this piece of information. We could do the same for tables-next-to-code.
What does this function do? Documentation is definitely needed as the type is quite unhelpful.
Is the plan here to extend this to cover the other easily accessible closure types (e.g. pointer and word arrays)?
|8 ↗||(On Diff #10912)|
A comment describing the intended behavior and type of this function would be helpful.
In general I support having this, however as it stands there are too many problems.
- It needs to work for all configurations (TABLES_NEXT_TO_CODE, profiling)
- There's a great deal of overlap with existing code in other places (the GHCi debugger, GHCi itself, and probably bits of the code generator). If this is being brought into the GHC tree we should at least make some effort to de-dup so that we don't have yet more dependencies on representations.
Needs a header comment and multi-slurp protection.
Why is there a strictness annotation on the Any? At the least it seems confusing, since we want to store thunks in a Box.
Could they be merged somehow?
The version in GHCi.InfoTables is also less brittle, as it uses hsc2hs to pull the field offsets from the header files. I'm worried about having yet another place that needs to be kept consistent with the representation.
ScopedTypeVariables is a better way to do this
Why don't you use unpackClosure#?
Won't work unregisterised.
Yeah, this is a problem. What's wrong with #ifdef PROFILING?
|8 ↗||(On Diff #10912)|
Please merge with unpackClosure#
unpackClosure# does not follow pointers from thunks, and generally supports less closure types, so we cannot use it directly. Merging them would be an option, if we are allowed to change the output of unpackClosure#.
#ifdef PROFILING would only work if this is part of the rts code. Library code needs to work when linked agains either of the runtimes, so a static check is not good enough, isn’t it?
Sure, if heap-view is going to be part of GHC it would be better to merge.
We compile Haskell code separately for profiling (with -prof), so you can use #ifdef PROFILING in Haskell code just like in C code. We don't do it very often, but it works just fine. (provided we don't change any APIs conditionally, there's a dark corner here related to TemplateHaskell without -fexternal-interpreter).
There's a lot of work to be done on this and I can't see myself finding the time.
- Make it work for the debug, profiling etc runtimes.
- Make it work for closures with unlifted types.
I've updated the revision so that it builds under "quick" and removes much of the duplication. Heap.hsc is currently structured to make the origin of different functions clear. Most of the functions are taken straight from various places in GHC. I've pulled them all out of their original sources and dealt with the fallout of the changes. compiler/ghci/RtClosureInspect.hs required the most changes.
Currently all but one test in the testsuite pass (under "quick", profiling is still broken). Some problems/places for extra eyes are commented on inline.
|397 ↗||(On Diff #15853)|
As mentioned in the comment, this compiles with the old version of unpackClosure, but is not correct. As a simple example, the old 'dat' is the non-pointer arguments, while the new 'dat' is the entire closure.
|552 ↗||(On Diff #15853)|
This function is a problem for getting profiling to work. I've included the current definition from DebuggerUtils below the large comment. It's very similar, but I can't test profiling/tables_next_to_code at the moment due to a bug in the C code.
|686 ↗||(On Diff #15853)|
This type is almost identical to HValue: https://phabricator.haskell.org/diffusion/GHC/browse/master/libraries/ghci/GHCi/RemoteTypes.hs;639e702b6129f501c539b158b982ed8489e3d09c$58. One difference is that this a data type while HValue is a newtype, but they are used similarly.
Using Box here causes a bit of type swizzling in RtClosureInspect.
MUT_VAR_DIRTY is never handled, which causes the lone failing test print028. This function should fill the ptrs array with pointers to associated closures. getClosureData assumes that there is a single pointer here, and I am not sure which is correct.
This causes a profiling build to fail with CCCS undeclared
Both TNTC and Profiling should now be supported. The biggest difficulty was the interactions between prof and ghci. I pulled a bit more code out of libraries/ghci and into heap-view to reduce the number of places making CPP profiling checks and adjusting pointers, but it's possible that there are some broken interactions hiding. I ran the testsuite a couple times for each way, and only ran into some transient failures that didn't seem like they could have come from me (although they certainly could have).
As a longer term note, I went pretty light on the deduplication front. I think RtClosureInspect in particular has a fair amount of code that probably makes more sense in heap-view. I'm more concerned with making it work, but I think a slightly more feature-full version of the library could simplify chunks of the existing code.
Great work, @patrickdoc! I think we are getting close with this.
Have you tested this in the non-TNTC case?
Which Word does this refer to?
Hmm, that's rather suspicious. Any idea why?
Indeed it sounds like there is some memory unsafety afoot here.
Any reason why this shouldn't be called ConstrClosure? Not only would this be more consistent with the rest of the codebase but I keep reading Cons as the list cons constructor.
Hmm, judging from the Harbormaster output it looks like this breaks when the tree is built using ./validate, which doesn't build profiled libraries.
Thanks! I built unregisterized and tested, but before I got profiling working 100%. So I don't think I've covered the TNTC/Profiling matrix of options.
I have a solution for the unboxed/unlifted closures that just needs a bit more sketching out too.
I believe this is referring to the fact that the values returned by unpackClosure, (_, dat, pointers), overlap. dat is the raw words of the closure. In the case of a constructor, that would be header, pointers, and non-pointers. pointers on the other hand wraps the pointer arguments of the closure in a Box so that when GC or evaluation updates those arguments, the boxed pointer is also updated. So the raw word is just an address, but the value inside the box follows the changes to the argument.
(I'm not 100% on how the pointer updating magic works, but this is my impression from reading the code.)
My guess is that both this and the next one have to do with making sure that the pointer arguments get correctly tracked within the boxes. Something to the effect of resolving the pointer while it still in fact points to the expected closure. Again, mostly guessing here.
Yeah, Constr probably makes more sense. I've mostly just left names as they are for lack of better ideas, so I'm fine with anything.
I haven't used the testsuite much, so I have a bit of exploring to do. I'll fix this up with other changes to the tests.
Ok, I've now tested both non-TNTC prof and non-prof, I even found a bug :P I also updated the comment about the Word, changed to ConstrClosure, and fixed the test profiling.
At this point, I think I would describe it as minimally complete. It could probably use a bit of refinement though. Places that could use eyes in particular:
- The HasHeapRep class. I've never worked with kinds before, so advice appreciated.
- The qualified import of InfoTableProf in Heap.hs. The profiled object file was missing when it was needed.
- The tests cover as many closures as I could figure out how to make. I'm not sure how to generate the rest.
I forgot to make the tests compare on more fields, so I'll do that. There is also the memory stuff at the end of getClosureRaw to explore.
Aside from that I think it's mostly names/organizational stuff to improve.
This is looking great! I've had this one lone comment hanging around from a review I started last week. I'll just leave submit it to give you something now.
I'll try to get time to do a more in-depth review tonight or tomorrow.
It would be good if this had a docstring, particularly because there are two other functions in this module with a very similar type.