Mark code related symbols as @function not @object
ClosedPublic

Authored by last_g on May 17 2018, 2:12 PM.

Details

Summary

This diff is a part of the bigger project which goal is to improve common profiling tools support (perf) for GHC binaries.

A similar job was already done and reverted in the past:

Reasoning:

Perf and similar tools build in memory symbol table from the .symtab section of the ELF file to
display human-readable function names instead of the addresses in the output. Perf uses only
two types of symbols: @function and @notype but GHC is not capable to produce any @function
symbols so the perf output is pretty useless (All the haskell symbols that you can see in perf now are @notype
internal symbols extracted by mistake/hack).

The changes:

  • mark code related symbols as @function
  • small hack to mark InfoTable symbols as code if TABLES_NEXT_TO_CODE is true

Limitations:

  • The perf symbolization support is not complete after this patch but I'm working on the second patch.
  • Constructor symbols are not supported. To fix that we can issue extra local symbols which mark code sections as code and will be only used for debug.
Test Plan

tests
any additional ideas?

Perf output on stock ghc 8.4.1:

     9.78%  FibbSlow  FibbSlow            [.] ckY_info
     9.59%  FibbSlow  FibbSlow            [.] cjqd_info
     7.17%  FibbSlow  FibbSlow            [.] c3sg_info
     6.62%  FibbSlow  FibbSlow            [.] c1X_info
     5.32%  FibbSlow  FibbSlow            [.] cjsX_info
     4.18%  FibbSlow  FibbSlow            [.] s3rN_info
     3.82%  FibbSlow  FibbSlow            [.] c2m_info
     3.68%  FibbSlow  FibbSlow            [.] cjlJ_info
     3.26%  FibbSlow  FibbSlow            [.] c3sb_info
     3.19%  FibbSlow  FibbSlow            [.] cjPQ_info
     3.05%  FibbSlow  FibbSlow            [.] cjQd_info
     2.97%  FibbSlow  FibbSlow            [.] cjAB_info
     2.78%  FibbSlow  FibbSlow            [.] cjzP_info
     2.40%  FibbSlow  FibbSlow            [.] cjOS_info
     2.38%  FibbSlow  FibbSlow            [.] s3rK_info
     2.27%  FibbSlow  FibbSlow            [.] cjq0_info
     2.18%  FibbSlow  FibbSlow            [.] cKQ_info
     2.13%  FibbSlow  FibbSlow            [.] cjSl_info
     1.99%  FibbSlow  FibbSlow            [.] s3rL_info
     1.98%  FibbSlow  FibbSlow            [.] c2cC_info
     1.80%  FibbSlow  FibbSlow            [.] s3rO_info
     1.37%  FibbSlow  FibbSlow            [.] c2f2_info
...

Perf output on patched ghc:

     7.97%  FibbSlow  FibbSlow            [.] c3rM_info
     6.75%  FibbSlow  FibbSlow            [.] 0x000000000032cfa8
     6.63%  FibbSlow  FibbSlow            [.] cifA_info
     4.98%  FibbSlow  FibbSlow            [.] integerzmgmp_GHCziIntegerziType_eqIntegerzh_info
     4.55%  FibbSlow  FibbSlow            [.] chXn_info
     4.52%  FibbSlow  FibbSlow            [.] c3rH_info
     4.45%  FibbSlow  FibbSlow            [.] chZB_info
     4.04%  FibbSlow  FibbSlow            [.] Main_fibbzuslow_info
     4.03%  FibbSlow  FibbSlow            [.] stg_ap_0_fast
     3.76%  FibbSlow  FibbSlow            [.] chXA_info
     3.67%  FibbSlow  FibbSlow            [.] cifu_info
     3.25%  FibbSlow  FibbSlow            [.] ci4r_info
     2.64%  FibbSlow  FibbSlow            [.] s3rf_info
     2.42%  FibbSlow  FibbSlow            [.] s3rg_info
     2.39%  FibbSlow  FibbSlow            [.] integerzmgmp_GHCziIntegerziType_eqInteger_info
     2.25%  FibbSlow  FibbSlow            [.] integerzmgmp_GHCziIntegerziType_minusInteger_info
     2.17%  FibbSlow  FibbSlow            [.] ghczmprim_GHCziClasses_zeze_info
     2.09%  FibbSlow  FibbSlow            [.] cicc_info
     2.03%  FibbSlow  FibbSlow            [.] 0x0000000000331e15
     2.02%  FibbSlow  FibbSlow            [.] s3ri_info
     1.91%  FibbSlow  FibbSlow            [.] 0x0000000000331bb8
     1.89%  FibbSlow  FibbSlow            [.] ci4N_info
...

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
last_g created this revision.May 17 2018, 2:12 PM
bgamari requested changes to this revision.May 20 2018, 10:08 AM
bgamari added inline comments.
compiler/nativeGen/X86/Ppr.hs
186

I'm having a bit of trouble following this comment. Do you think you could do a syntax/grammar pass over it?

198

Perhaps you could give a concrete example? I'm rather lost.

This revision now requires changes to proceed.May 20 2018, 10:08 AM

We ought to check what LLVM does and that whatever we do here doesn't break the LLVM backend.

compiler/nativeGen/X86/Ppr.hs
190

Yeah, the reasoning is a bit subtle so we'll want to expand this comment, and also include an example as @bgamari suggested above.

The reasoning goes like this, if I'm not mistaken:

  • The danger when we mark a symbol as @function is that the linker will redirect it to point to the PLT and use a JUMP_SLOT relocation when the symbol refers to something outside the current shared object. A PLT / JUMP_SLOT reference only works for symbols that we jump to, not for symbols representing data,, nor for info table symbol references which we expect to point directly to the info table.
  • GHC generates code that might refer to any info table symbol from the text segment, but that's OK, because those will be explicit GOT references generated by the code generator.
  • When we refer to info tables from the data segment, it's either
    • a FUN_STATIC/THUNK_STATIC local to this module
    • a con_info that could be from anywhere

So, the only info table symbols that we might refer to from the data segment of another shared object are con_info symbols, so those are the ones we need to exclude from getting the @function treatment.

last_g updated this revision to Diff 16868.Jun 12 2018, 4:07 PM

Addressing comments and making a code comment readable

last_g updated this revision to Diff 16869.Jun 12 2018, 4:11 PM

Comments

last_g updated this revision to Diff 16870.Jun 12 2018, 4:18 PM

Comments

last_g updated this revision to Diff 16871.Jun 12 2018, 4:27 PM

Make comment readable

last_g updated this revision to Diff 16872.Jun 12 2018, 4:35 PM

Include @simonmar comment into the code

@bgamari I've rewritten the comment please check
@simonmar thanks for the great explanation!

last_g marked 3 inline comments as done.Jun 14 2018, 8:51 AM
bgamari requested changes to this revision.Jun 17 2018, 11:20 AM

Alright, I suppose this is reasonable. However, we should make this logic and its comment much more discoverable. Let's make the comment a Note; I'm not entirely sure where to refer to it from yet.

Also, do you have any insight into why some symbols are apparently not being resolved now (e.g. the 0x000000000032cfa8 in the "after" output)? This is a very unfortunate regression (very nearly enough to compel me to object to this patch).

compiler/nativeGen/X86/Ppr.hs
176

Why not just use text here?

181

This should be a Note.

This revision now requires changes to proceed.Jun 17 2018, 11:20 AM
last_g updated this revision to Diff 17007.Jun 18 2018, 7:56 PM
last_g marked 2 inline comments as done.

Address the comments

I have an idea why it's happening and I plan to investigate it later.

The issue comes from the current perf symbolization algorithm.

The basic logic is (kind of) simple:

  1. Take all the @function symbols and put into a sorted list;
  2. the next steps are a hack to support handwritten assembly
  3. Take all the NOTYPE symbols with the size equals to 0 and put into the same ordered list;
  4. Run the symbols__fixup_end procedure which sets a symbol end address to be the beginning of the next symbol for every 0 sized symbol;
  5. If the last symbol is zero sized: set its size to be ~4k.

(The actual logic is more complicated because it also involves sections&map groups)

In GHC compiled binaries there are no @function symbols and most internal symbols are NOTYPE and 0 sized so we are ending up in the hack's code.

This logic effectively means that every address in our code space is attributed to some internal symbol (correct or not). Adding @function symbols with size directive stops this from happening.
As the first guess, those addresses can come from _con_info entries which we don't mark as @function but in that case, there should be no unknown addresses in D4730 but we have some there too.

bgamari accepted this revision.Jul 2 2018, 3:33 PM

@bgamari ping :)

Alright, given that we seem to have an understanding of why the regression is occurring I am okay with this. However, we should open a ticket to make sure that your explanation is recorded somewhere. Do you think you could do that?

This revision is now accepted and ready to land.Jul 2 2018, 3:33 PM
lelf added a subscriber: lelf.Jul 23 2018, 3:17 AM
last_g updated this revision to Diff 17619.Aug 9 2018, 11:09 AM

rebase attempt

lelf added a comment.Aug 9 2018, 12:16 PM

Will this render dtrace totally unusable?

@bgamari I filed a ticket (https://ghc.haskell.org/trac/ghc/ticket/15501). Can you merge this?

@lelf I don't see a reason why it should break dtrace. I'll test just to be sure. You can share your arguments :)

thomie updated the Trac tickets for this revision.Aug 13 2018, 2:06 AM

@lelf I ran two binaries built with stock ghc 8.4.2 and my patch on top of the master and very the same results from dtrace

https://gist.github.com/last-g/e7457f54e2efc726cc7b0975b3c800d3

last_g updated this revision to Diff 17679.Aug 16 2018, 6:59 AM

Rebase to fix submodule changes

@last_g, the patch looks reasonable. That being said, I don't want to merge it without seeing it pass on all of our tested platforms. It looks to me like you didn't push this to the staging area and therefore it wasn't run through CI. Can you update the diff, this time pushing to the staging area?

last_g updated this revision to Diff 17684.Aug 16 2018, 10:34 AM

Pushing to staging

There is one test failure not related to the changes (I reproduced on the parent commit)

lelf added a comment.Aug 17 2018, 11:48 AM

@lelf I ran two binaries built with stock ghc 8.4.2 and my patch on top of the master and very the same results from dtrace

https://gist.github.com/last-g/e7457f54e2efc726cc7b0975b3c800d3

This has nothing to do with it.
I mean dtracing function calls. (pid$target::…)

@lelf Yeah. It looks like this renders dtrace totally unusable (or more unusable). On 8.4.3 and master ghc crashes after getting some probes (https://ghc.haskell.org/trac/ghc/ticket/15543) but with this patch, it crashes immediately. Any clues? (I have almost no idea how dtrace works, how to debug it and how to debug ghc on MacOS, there is even no stack trace in lldb)

@bgamari will this be a blocker for merging the patch? I can limit changes only to Linux/x86 only.

@lelf Yeah. It looks like this renders dtrace totally unusable (or more unusable). On 8.4.3 and master ghc crashes after getting some probes (https://ghc.haskell.org/trac/ghc/ticket/15543) but with this patch, it crashes immediately. Any clues? (I have almost no idea how dtrace works, how to debug it and how to debug ghc on MacOS, there is even no stack trace in lldb)

Have you tried debugging on Linux? I believe that perf can use dtrace probe points.

@bgamari will this be a blocker for merging the patch? I can limit changes only to Linux/x86 only.

Hmm, this is a tricky one. If dtrace were completely functional I would say that it is absolutely a blocker. The fact that it's already somewhat broken makes the matter significantly more blurry, however.

@bgamari

Have you tried debugging on Linux? I believe that perf can use dtrace probe points.

No, perf is not working with dtrace and dtrace is only supported on MacOS.
I tested master vs this patch branches on System Tap and it works unstable in both cases. I wrote more details on the task (https://ghc.haskell.org/trac/ghc/ticket/15543).
The patched version looks a bit better (more details about calls) when attached to a running process.

Hmm, this is a tricky one. If dtrace were completely functional I would say that it is absolutely a blocker. The fact that it's already somewhat broken makes the matter significantly more blurry, however.

I double checked everything and this patch should not have any effect on MacOS X because the changes are applied only to the Elf platforms. I managed to compile binary with the patched version of the compiler that behaves exactly the same. Probably, the issue is nondeterministic.

@bgamari

Have you tried debugging on Linux? I believe that perf can use dtrace probe points.

No, perf is not working with dtrace and dtrace is only supported on MacOS.

To be clear, I was suggesting that perf can use the same USDT probe-points that dtrace uses, not that perf uses dtrace. See, for instance, Bredan Gregg's blog. However, I don't now how similar the implementation is.

I tested master vs this patch branches on System Tap and it works unstable in both cases. I wrote more details on the task (https://ghc.haskell.org/trac/ghc/ticket/15543).

Alright, it sounds like dtrace support is just plain broken. Quite sad but it is what it is.

The patched version looks a bit better (more details about calls) when attached to a running process.

Hmm, this is a tricky one. If dtrace were completely functional I would say that it is absolutely a blocker. The fact that it's already somewhat broken makes the matter significantly more blurry, however.

I double checked everything and this patch should not have any effect on MacOS X because the changes are applied only to the Elf platforms. I managed to compile binary with the patched version of the compiler that behaves exactly the same. Probably, the issue is nondeterministic.

Alright. I suppose we'll have to leave the remaining investigation of Trac #15501 for future work, assuming @simonmar doesn't mind this (arguable) regression.

last_g updated this revision to Diff 17784.Aug 24 2018, 10:39 AM

Rebase on remote/master
Should fix tests

last_g updated this revision to Diff 17785.Aug 24 2018, 10:42 AM

accidental squash

I don't have any strong opinions about dtrace (I've never used it). But if it's already broken, did anyone notice? Do we have a ticket for the existing breakage? Who would be affected by breaking it further?

last_g added a comment.EditedAug 28 2018, 10:24 AM

@bgamari @simonmar

As I mentioned before this patch actually has no effect on dtrace. My first conclusion was caused by the non-deterministic nature of the problem i.e. dtrace behavior may change after recompilation of the same file. But the process always crashes under dtrace.

@simonmar

It's broken at least since 8.2.
I created a ticket for the breakage (https://ghc.haskell.org/trac/ghc/ticket/15543) and I didn't find any similar report.

lelf added a comment.Sep 6 2018, 4:03 AM

@bgamari

Have you tried debugging on Linux? I believe that perf can use dtrace probe points.

No, perf is not working with dtrace and dtrace is only supported on MacOS.

And many BSDs and Solaris

I double checked everything and this patch should not have any effect on MacOS X because the changes are applied only to the Elf platforms.

How so?

@lelf I'm not sure if you're raising an exception to this patch or not. If you are, could you explain exactly what problem it could introduce? @last_g says the patch has no effect on dtrace, do you disagree?

@lelf

How so?

All the changes are located in the pprLabelType' function which is only called from pprTypeDecl and the call is guarded with osElfTarget (platformOS platform) . MacOS is not an ELF platform thus this patch doesn't affect MacOS.

I think we can go ahead with this. @bgamari would you like to push it?

lelf added a comment.Sep 7 2018, 6:22 PM

@lelf I'm not sure if you're raising an exception to this patch or not. If you are, could you explain exactly what problem it could introduce? @last_g says the patch has no effect on dtrace, do you disagree?

Of course it does! It changes symbols' “types”.
The patch shouldn't break anything, I misread it. But (as someone who uses dtrace, and yes — it works) I'd be very sad if it does. I'm trying to validate it on FreeBSD & SmartOS, but compiling GHC there is an exercise on its own.

As a general point, if dtrace is to be a first-class supported feature then it needs to have tests and documentation.

This revision was automatically updated to reflect the committed changes.