[RFC] Statistical profiling infrastructure
Needs Revision (Public)

Authored by bgamari on Sep 4 2015, 5:39 AM.

Details

Summary

This adds infrastructure for a simple statistical profiler backed by GHC's event log.
The idea is to write debugging information describing the program's blocks
(including source note information) to the event log, as implemented in D1280,
and to collect samples during execution and emit them to the same log.

Test Plan

Test against soon-to-be-released tools.

bgamari updated this revision to Diff 4064 (Sep 4 2015, 5:39 AM).
bgamari retitled this revision to [RFC] Add a simple statistical profiler.
bgamari updated this object.
bgamari edited the test plan for this revision.
bgamari added a reviewer: scpmw.
bgamari added a subscriber: scpmw.
bgamari updated this revision to Diff 4065 (Sep 4 2015, 7:29 AM).
bgamari edited edge metadata.

Break out sampling into separate diff

bgamari updated this object (Sep 4 2015, 7:35 AM).
bgamari edited edge metadata.
bgamari retitled this revision from [RFC] Add a simple statistical profiler to [RFC] A simple statistical profiler (Sep 4 2015, 9:26 AM).
bgamari edited the test plan for this revision.
bgamari updated this object.
scpmw edited edge metadata (edited Sep 4 2015, 10:35 AM).

You are copying the .debug_ghc contents verbatim here? As you are probably aware, my code associated it with DWARF information at this point, also generating event log entries for all other DWARF it could find.

This doesn't have to be the best approach. For my work it seemed like a good idea to keep event logs as self-contained as possible so that I wouldn't have to keep the matching binaries around. However, this also led to significant startup lag and huge event logs, so we might just as well decide to offload all this work to the profile analysis tool (which is the "standard" approach here). In that scenario, we would not need to copy anything on RTS startup, but would instead just write a few messages documenting our runtime memory map.
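To make the memory-map option concrete: if the RTS only logged where each object was mapped, an analysis tool could turn a sampled address into an (object, offset) pair and look that offset up in the debug information of the binary on disk. The following is a minimal sketch under that assumption. `MapEntry`, `resolve`, and the example addresses are hypothetical and do not come from this diff or from GHC's event log format, and the sketch ignores the file offset of each mapping.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class MapEntry:
    """Hypothetical memory-map record: one loaded object and where it lives."""
    start: int  # first address of the mapping
    size: int   # length of the mapping in bytes
    path: str   # binary or shared object backing the mapping


def resolve(entries: List[MapEntry], addr: int) -> Optional[Tuple[str, int]]:
    """Map a sampled address to (object path, offset from the mapping start)."""
    ordered = sorted(entries, key=lambda e: e.start)
    starts = [e.start for e in ordered]
    # Index of the last mapping that starts at or before addr.
    i = bisect_right(starts, addr) - 1
    if i < 0:
        return None
    entry = ordered[i]
    offset = addr - entry.start
    # The address only belongs to the mapping if it falls inside its extent.
    return (entry.path, offset) if offset < entry.size else None


# Made-up example map: two objects at made-up addresses.
memory_map = [
    MapEntry(start=0x400000, size=0x20000, path="./Main"),
    MapEntry(start=0x7F0000000000, size=0x100000, path="libgmp.so"),
]

print(resolve(memory_map, 0x401234))
```

With this split, the event log stays small, but the analysis tool needs access to binaries that match the run.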

rts/eventlog/EventLog.c
1558

I might be going overboard here. The motivation is that different sample providers have very different output characteristics:

  • For typical "cycle" or "time" profiling, samples will repeat very rarely
  • "Heap allocation" profiling, on the other hand, will repeat a sample very often
  • And for "heap residency" profiling we consider every word a sample, so this is even more extreme

I have tried unsuccessfully in the past to get people excited about encoding samples this way, but I'm open to simplifications.
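The effect of repetition on encoding size can be illustrated by collapsing consecutive identical samples into (sample, count) pairs. This is only a sketch of the general idea, not the encoding implemented in this diff; `collapse_runs` and the sample values are made up for illustration.

```python
from itertools import groupby
from typing import Iterable, List, Tuple


def collapse_runs(samples: Iterable[int]) -> List[Tuple[int, int]]:
    """Collapse consecutive identical samples into (sample, count) pairs."""
    return [(addr, sum(1 for _ in run)) for addr, run in groupby(samples)]


# Hypothetical "cycle" samples: almost never repeat, so nothing is saved.
cycle_samples = [0x10, 0x24, 0x38, 0x4C]
# Hypothetical "heap allocation" samples: long runs of the same sample.
alloc_samples = [0x10] * 6 + [0x24] * 2

print(len(cycle_samples), "->", len(collapse_runs(cycle_samples)))
print(len(alloc_samples), "->", len(collapse_runs(alloc_samples)))
```

For the non-repeating input, every pair carries a count of 1, so a weighted encoding adds overhead there while paying off for the repetitive providers.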

scpmw added a comment (edited Sep 4 2015, 10:35 AM).
This comment has been deleted.

One thing I have wondered about is whether and when we want to capture more of the stack. @scpmw, have you thought about this at all?

In D1215#33864, @scpmw wrote:

You are copying the .debug_ghc contents verbatim here? As you are probably aware, my code associated it with DWARF information at this point, also generating event log entries for all other DWARF it could find.

Right, I agree that a self-contained event log would be quite nice. I intend to add event log entries for DWARF information; I'll try to start on this tonight.

This doesn't have to be the best approach. For my work it seemed like a good idea to keep event logs as self-contained as possible so that I wouldn't have to keep the matching binaries around. However, this also led to significant startup lag and huge event logs, so we might just as well decide to offload all this work to the profile analysis tool (which is the "standard" approach here). In that scenario, we would not need to copy anything on RTS startup, but would instead just write a few messages documenting our runtime memory map.

rts/eventlog/EventLog.c
1558

I have no objection to the logic as-is. It's straightforward enough and I think the original reasoning is quite valid.

@scpmw, did you ever consider simply emitting the information currently held in the .debug_ghc section in .debug_info, using the vendor-defined tag/attribute ranges?

I still think that the event log is the right place to collect the debug information and samples for analysis, but eliminating the .debug_ghc intermediate may simplify things.
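For context, DWARF reserves the ranges DW_TAG_lo_user..DW_TAG_hi_user and DW_AT_lo_user..DW_AT_hi_user for vendor extensions. The sketch below only checks whether a value falls in those ranges. The range bounds are taken from the DWARF specification; the two example GHC-style values are picked arbitrarily for illustration and are not the values GHC uses.

```python
# Bounds of the vendor-extension ranges as given in the DWARF specification.
DW_TAG_LO_USER, DW_TAG_HI_USER = 0x4080, 0xFFFF
DW_AT_LO_USER, DW_AT_HI_USER = 0x2000, 0x3FFF

# Standard (non-vendor) values, for comparison.
DW_TAG_SUBPROGRAM = 0x2E
DW_AT_NAME = 0x03

# Hypothetical vendor values, chosen arbitrarily for this example.
EXAMPLE_SRC_NOTE_TAG = 0x4A00
EXAMPLE_SPAN_FILE_ATTR = 0x2B00


def is_vendor_tag(tag: int) -> bool:
    """True if the tag lies in the range reserved for vendor extensions."""
    return DW_TAG_LO_USER <= tag <= DW_TAG_HI_USER


def is_vendor_attribute(attr: int) -> bool:
    """True if the attribute lies in the range reserved for vendor extensions."""
    return DW_AT_LO_USER <= attr <= DW_AT_HI_USER


print(is_vendor_tag(EXAMPLE_SRC_NOTE_TAG), is_vendor_tag(DW_TAG_SUBPROGRAM))
print(is_vendor_attribute(EXAMPLE_SPAN_FILE_ATTR), is_vendor_attribute(DW_AT_NAME))
```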

bgamari updated this revision to Diff 4325 (Sep 26 2015, 11:49 AM).
bgamari edited edge metadata.
  • Rewrite
bgamari updated this object (Sep 26 2015, 11:54 AM).
bgamari edited the test plan for this revision.
bgamari updated this revision to Diff 4635 (Oct 23 2015, 7:51 AM).

Fix word-size dependence

I wonder if it might be worthwhile to add an option to downsample the samples.

For reference: Gcd 4000 from nofib runs for 26 seconds and produces a 127 MByte event log.
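Those figures work out to roughly 4.9 MByte of event log per second of run time. One possible shape for a downsampling option is to keep every Nth sample and attach a weight of N so that totals stay approximately right. This is a sketch of that idea only; `downsample` and `keep_every` are hypothetical names, not options in this diff.

```python
from typing import Iterable, Iterator, Tuple


def downsample(samples: Iterable[int], keep_every: int) -> Iterator[Tuple[int, int]]:
    """Keep every `keep_every`-th sample, weighted to stand in for the dropped ones."""
    if keep_every < 1:
        raise ValueError("keep_every must be at least 1")
    for i, addr in enumerate(samples):
        if i % keep_every == 0:
            yield (addr, keep_every)


# Figures quoted above: a 127 MByte event log from a 26 second run.
log_mbyte, runtime_s = 127, 26
rate = log_mbyte / runtime_s
print(f"{rate:.1f} MByte/s")

# Ten made-up samples reduced to three weighted ones.
print(list(downsample(range(10), keep_every=4)))
```

The weights are approximate at the tail of the stream (here three samples of weight 4 stand in for ten), and a fixed stride can alias with periodic program behaviour, so random selection may be preferable in practice.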

bgamari updated this revision to Diff 4862 (Nov 1 2015, 11:05 AM).
bgamari marked 2 inline comments as done.

Rebase

bgamari retitled this revision from [RFC] A simple statistical profiler to [RFC] Statistical profiling infrastructure (Nov 23 2015, 11:24 AM).
austin requested changes to this revision (Mar 14 2016, 9:38 AM).
austin edited edge metadata.

Note: Punting off the review queue until we want to revive this and look at it more closely.

This revision now requires changes to proceed (Mar 14 2016, 9:38 AM).
austin resigned from this revision (Nov 6 2017, 10:10 PM).