Signals: Print backtrace on SIGUSR2
Closed, Public

Authored by bgamari on Sep 1 2015, 1:41 PM.

Details

Summary

This uses the backtrace support introduced in D1196 to print backtraces from
Haskell processes when they receive SIGUSR2.

Test Plan

Need to add a test.

bgamari updated this revision to Diff 4002. Sep 1 2015, 1:41 PM
bgamari retitled this revision to Signals: Print backtrace on SIGUSR2.
bgamari updated this object.
bgamari edited the test plan for this revision.
bgamari added reviewers: simonmar, scpmw, Tarrasch.
bgamari updated this revision to Diff 4007. Sep 1 2015, 3:34 PM
bgamari edited edge metadata.

Rebase

scpmw edited edge metadata. Edited Sep 2 2015, 4:04 PM

This actually manages to unwind through the signal handler? Wow.

Note however that the cleaner solution here might be to use a "sigaction" callback. That gives you the full register file of the interrupted thread, which you could use to seed the unwinding process.
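To make the "sigaction" suggestion concrete, here is a minimal sketch of a handler installed with SA_SIGINFO that reads the interrupted thread's program counter and stack pointer out of the ucontext argument. This is not the code in this revision: the names usr2_handler, install_usr2_handler, saved_pc and saved_sp are hypothetical, the register access assumes Linux/glibc on x86-64, and how those values would then be handed to the unwinder is left out.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* glibc exposes REG_RIP / REG_RSP only with this */
#endif
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <ucontext.h>

/* Hypothetical holders for the interrupted thread's registers.
 * Only sig_atomic_t is strictly guaranteed safe to write from a handler;
 * the uintptr_t stores are a simplification for illustration. */
static volatile sig_atomic_t usr2_seen = 0;
static volatile uintptr_t saved_pc = 0;
static volatile uintptr_t saved_sp = 0;

/* Three-argument handler form, selected by SA_SIGINFO below. */
static void usr2_handler(int sig, siginfo_t *info, void *ctx)
{
    ucontext_t *uc = (ucontext_t *)ctx;
    (void)sig;
    (void)info;
#if defined(__linux__) && defined(__x86_64__) && defined(REG_RIP)
    /* Register file of the thread at the moment the signal arrived. */
    saved_pc = (uintptr_t)uc->uc_mcontext.gregs[REG_RIP];
    saved_sp = (uintptr_t)uc->uc_mcontext.gregs[REG_RSP];
#else
    (void)uc;                  /* other platforms: layout differs, not handled here */
#endif
    usr2_seen = 1;
}

/* Returns 0 on success, -1 on failure (as sigaction does). */
static int install_usr2_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = usr2_handler;
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR2, &sa, NULL);
}
```

After calling install_usr2_handler(), sending the process SIGUSR2 (for example with kill -USR2 <pid>) runs the handler; saved_pc and saved_sp would then be the natural starting point for seeding an unwinder, rather than unwinding out of the handler's own frame.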

And while we're at it - this is also how you'd do "glorified timer" perf_events profiling: Use perf_event_open to get an FD for a hardware performance counter, then use fcntl with F_SETSIG to associate, say, SIGUSR2 with the counter. Result: The handler gets called with the register set of the thread whenever a hardware counter overflow occurs.
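A rough sketch of that perf_events setup, assuming Linux with the kernel headers installed. The helper names open_cycle_counter, route_overflow_to_signal and start_counter are hypothetical, the choice of a CPU-cycles counter is only an example, and perf_event_open may be refused depending on the system's perf_event_paranoid setting or sandboxing.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* glibc exposes F_SETSIG only with this */
#endif
#include <fcntl.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef F_SETSIG
#define F_SETSIG 10            /* Linux value, fallback if the macro was hidden */
#endif

/* Open a sampling CPU-cycles counter for the calling thread.
 * Returns an fd, or -1 with errno set. */
static int open_cycle_counter(unsigned long long sample_period)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof attr);
    attr.size = sizeof attr;
    attr.type = PERF_TYPE_HARDWARE;
    attr.config = PERF_COUNT_HW_CPU_CYCLES;
    attr.sample_period = sample_period;   /* overflow every N events */
    attr.disabled = 1;                    /* start stopped; see start_counter */
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;
    /* glibc has no wrapper, so go through syscall(2).
     * pid = 0, cpu = -1: this thread, on any CPU. */
    return (int)syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Ask the kernel to send `signo` to this process when `fd` signals I/O,
 * which for a perf fd means a counter overflow. Returns 0 or -1. */
static int route_overflow_to_signal(int fd, int signo)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1) return -1;
    if (fcntl(fd, F_SETFL, flags | O_ASYNC) == -1) return -1;
    if (fcntl(fd, F_SETSIG, signo) == -1) return -1;
    if (fcntl(fd, F_SETOWN, getpid()) == -1) return -1;
    return 0;
}

/* Reset and enable the counter once the handler is in place. */
static int start_counter(int fd)
{
    if (ioctl(fd, PERF_EVENT_IOC_RESET, 0) == -1) return -1;
    return ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
}
```

A caller would install a SA_SIGINFO handler for the chosen signal first, then call open_cycle_counter, route_overflow_to_signal(fd, SIGUSR2) and start_counter in that order. In a multi-threaded process, F_SETOWN_EX with F_OWNER_TID is probably the better choice so that the signal lands on the thread being measured.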

In D1197#33569, @scpmw wrote:

This actually manages to unwind through the signal handler? Wow.

Note however that the cleaner solution here might be to use a "sigaction" callback. That gives you the full register file of the interrupted thread, which you could use to seed the unwinding process.

Ahh, great! I'll give it a shot.

And while we're at it - this is also how you'd do "glorified timer" perf_events profiling: Use perf_event_open to get an FD for a hardware performance counter, then use fcntl with F_SETSIG to associate, say, SIGUSR2 with the counter. Result: The handler gets called with the register set of the thread whenever a hardware counter overflow occurs.

This sounds fantastic. At some point I'll have to take a look at the perf_event documentation (or at least what exists of it).

scpmw added a comment. Edited Sep 3 2015, 8:37 AM

Also might be a good spot to discuss this - is libdwfl's unwinder segfault-proof? Because especially for Haskell code it will be a bit hit-and-miss whether the backtrace succeeds. The code generator only updates the unwind information at the start of the block, so basically anything after an Sp update until the end of the block would be potential crash territory.

If yes, that would be another serious advantage over the RTS stack-walker, which would be much more fragile.

@scpmw, I haven't yet seen a crash, but that of course doesn't mean it can't happen. One simple test I can try is to start randomly throwing SIGUSR2 at the stage1 compiler as it's building stage2 and see whether it falls over. Certainly not proof that it can't segfault, but it will at least instill a bit more trust.

bgamari updated this revision to Diff 4079. Sep 4 2015, 4:48 PM
bgamari edited edge metadata.

Update

bgamari updated this revision to Diff 4439. Oct 7 2015, 2:14 AM
bgamari edited edge metadata.

Update

austin accepted this revision. Oct 16 2015, 4:08 PM
austin edited edge metadata.

Alright, go for it.

This revision is now accepted and ready to land. Oct 16 2015, 4:08 PM
bgamari updated this revision to Diff 4539. Oct 17 2015, 9:38 AM
bgamari edited edge metadata.

Update

This revision was automatically updated to reflect the committed changes.