Mark code related symbols as @function not @object
AcceptedPublic

Authored by last_g on May 17 2018, 2:12 PM.

Details

Summary

This diff is a part of the bigger project which goal is to improve common profiling tools support (perf) for GHC binaries.

A similar job was already done and reverted in the past:

Reasoning:

Perf and similar tools build in memory symbol table from the .symtab section of the ELF file to
display human-readable function names instead of the addresses in the output. Perf uses only
two types of symbols: @function and @notype but GHC is not capable to produce any @function
symbols so the perf output is pretty useless (All the haskell symbols that you can see in perf now are @notype
internal symbols extracted by mistake/hack).

The changes:

  • mark code related symbols as @function
  • small hack to mark InfoTable symbols as code if TABLES_NEXT_TO_CODE is true

Limitations:

  • The perf symbolization support is not complete after this patch but I'm working on the second patch.
  • Constructor symbols are not supported. To fix that we can issue extra local symbols which mark code sections as code and will be only used for debug.
Test Plan

tests
any additional ideas?

Perf output on stock ghc 8.4.1:

     9.78%  FibbSlow  FibbSlow            [.] ckY_info
     9.59%  FibbSlow  FibbSlow            [.] cjqd_info
     7.17%  FibbSlow  FibbSlow            [.] c3sg_info
     6.62%  FibbSlow  FibbSlow            [.] c1X_info
     5.32%  FibbSlow  FibbSlow            [.] cjsX_info
     4.18%  FibbSlow  FibbSlow            [.] s3rN_info
     3.82%  FibbSlow  FibbSlow            [.] c2m_info
     3.68%  FibbSlow  FibbSlow            [.] cjlJ_info
     3.26%  FibbSlow  FibbSlow            [.] c3sb_info
     3.19%  FibbSlow  FibbSlow            [.] cjPQ_info
     3.05%  FibbSlow  FibbSlow            [.] cjQd_info
     2.97%  FibbSlow  FibbSlow            [.] cjAB_info
     2.78%  FibbSlow  FibbSlow            [.] cjzP_info
     2.40%  FibbSlow  FibbSlow            [.] cjOS_info
     2.38%  FibbSlow  FibbSlow            [.] s3rK_info
     2.27%  FibbSlow  FibbSlow            [.] cjq0_info
     2.18%  FibbSlow  FibbSlow            [.] cKQ_info
     2.13%  FibbSlow  FibbSlow            [.] cjSl_info
     1.99%  FibbSlow  FibbSlow            [.] s3rL_info
     1.98%  FibbSlow  FibbSlow            [.] c2cC_info
     1.80%  FibbSlow  FibbSlow            [.] s3rO_info
     1.37%  FibbSlow  FibbSlow            [.] c2f2_info
...

Perf output on patched ghc:

     7.97%  FibbSlow  FibbSlow            [.] c3rM_info
     6.75%  FibbSlow  FibbSlow            [.] 0x000000000032cfa8
     6.63%  FibbSlow  FibbSlow            [.] cifA_info
     4.98%  FibbSlow  FibbSlow            [.] integerzmgmp_GHCziIntegerziType_eqIntegerzh_info
     4.55%  FibbSlow  FibbSlow            [.] chXn_info
     4.52%  FibbSlow  FibbSlow            [.] c3rH_info
     4.45%  FibbSlow  FibbSlow            [.] chZB_info
     4.04%  FibbSlow  FibbSlow            [.] Main_fibbzuslow_info
     4.03%  FibbSlow  FibbSlow            [.] stg_ap_0_fast
     3.76%  FibbSlow  FibbSlow            [.] chXA_info
     3.67%  FibbSlow  FibbSlow            [.] cifu_info
     3.25%  FibbSlow  FibbSlow            [.] ci4r_info
     2.64%  FibbSlow  FibbSlow            [.] s3rf_info
     2.42%  FibbSlow  FibbSlow            [.] s3rg_info
     2.39%  FibbSlow  FibbSlow            [.] integerzmgmp_GHCziIntegerziType_eqInteger_info
     2.25%  FibbSlow  FibbSlow            [.] integerzmgmp_GHCziIntegerziType_minusInteger_info
     2.17%  FibbSlow  FibbSlow            [.] ghczmprim_GHCziClasses_zeze_info
     2.09%  FibbSlow  FibbSlow            [.] cicc_info
     2.03%  FibbSlow  FibbSlow            [.] 0x0000000000331e15
     2.02%  FibbSlow  FibbSlow            [.] s3ri_info
     1.91%  FibbSlow  FibbSlow            [.] 0x0000000000331bb8
     1.89%  FibbSlow  FibbSlow            [.] ci4N_info
...
last_g created this revision.May 17 2018, 2:12 PM
bgamari requested changes to this revision.May 20 2018, 10:08 AM
bgamari added inline comments.
compiler/nativeGen/X86/Ppr.hs
179

I'm having a bit of trouble following this comment. Do you think you could do a syntax/grammar pass over it?

191

Perhaps you could give a concrete example? I'm rather lost.

This revision now requires changes to proceed.May 20 2018, 10:08 AM

We ought to check what LLVM does and that whatever we do here doesn't break the LLVM backend.

compiler/nativeGen/X86/Ppr.hs
183

Yeah, the reasoning is a bit subtle so we'll want to expand this comment, and also include an example as @bgamari suggested above.

The reasoning goes like this, if I'm not mistaken:

  • The danger when we mark a symbol as @function is that the linker will redirect it to point to the PLT and use a JUMP_SLOT relocation when the symbol refers to something outside the current shared object. A PLT / JUMP_SLOT reference only works for symbols that we jump to, not for symbols representing data,, nor for info table symbol references which we expect to point directly to the info table.
  • GHC generates code that might refer to any info table symbol from the text segment, but that's OK, because those will be explicit GOT references generated by the code generator.
  • When we refer to info tables from the data segment, it's either
    • a FUN_STATIC/THUNK_STATIC local to this module
    • a con_info that could be from anywhere

So, the only info table symbols that we might refer to from the data segment of another shared object are con_info symbols, so those are the ones we need to exclude from getting the @function treatment.

last_g updated this revision to Diff 16868.Jun 12 2018, 4:07 PM

Addressing comments and making a code comment readable

last_g updated this revision to Diff 16869.Jun 12 2018, 4:11 PM

Comments

last_g updated this revision to Diff 16870.Jun 12 2018, 4:18 PM

Comments

last_g updated this revision to Diff 16871.Jun 12 2018, 4:27 PM

Make comment readable

last_g updated this revision to Diff 16872.Jun 12 2018, 4:35 PM

Include @simonmar comment into the code

@bgamari I've rewritten the comment please check
@simonmar thanks for the great explanation!

last_g marked 3 inline comments as done.Jun 14 2018, 8:51 AM
bgamari requested changes to this revision.Sun, Jun 17, 11:20 AM

Alright, I suppose this is reasonable. However, we should make this logic and its comment much more discoverable. Let's make the comment a Note; I'm not entirely sure where to refer to it from yet.

Also, do you have any insight into why some symbols are apparently not being resolved now (e.g. the 0x000000000032cfa8 in the "after" output)? This is a very unfortunate regression (very nearly enough to compel me to object to this patch).

compiler/nativeGen/X86/Ppr.hs
169

Why not just use text here?

174

This should be a Note.

This revision now requires changes to proceed.Sun, Jun 17, 11:20 AM
last_g updated this revision to Diff 17007.Mon, Jun 18, 7:56 PM
last_g marked 2 inline comments as done.

Address the comments

I have an idea why it's happening and I plan to investigate it later.

The issue comes from the current perf symbolization algorithm.

The basic logic is (kind of) simple:

  1. Take all the @function symbols and put into a sorted list;
  2. the next steps are a hack to support handwritten assembly
  3. Take all the NOTYPE symbols with the size equals to 0 and put into the same ordered list;
  4. Run the symbols__fixup_end procedure which sets a symbol end address to be the beginning of the next symbol for every 0 sized symbol;
  5. If the last symbol is zero sized: set its size to be ~4k.

(The actual logic is more complicated because it also involves sections&map groups)

In GHC compiled binaries there are no @function symbols and most internal symbols are NOTYPE and 0 sized so we are ending up in the hack's code.

This logic effectively means that every address in our code space is attributed to some internal symbol (correct or not). Adding @function symbols with size directive stops this from happening.
As the first guess, those addresses can come from _con_info entries which we don't mark as @function but in that case, there should be no unknown addresses in D4730 but we have some there too.

bgamari accepted this revision.Mon, Jul 2, 3:33 PM

@bgamari ping :)

Alright, given that we seem to have an understanding of why the regression is occurring I am okay with this. However, we should open a ticket to make sure that your explanation is recorded somewhere. Do you think you could do that?

This revision is now accepted and ready to land.Mon, Jul 2, 3:33 PM