Add local @function symbols for the code entries
Needs Revision | Public

Authored by last_g on May 24 2018, 9:45 AM.

Details

Summary

This patch is an alternative to D4713. It provides the same debugging capabilities as D4713, with its own pros and cons.
This patch adds extra local _entry symbols when tablesNextToCode is enabled.

Differences from D4713:

  • more labels in .symtab (not in .dynsym)
  • relocation safe (all @function symbols are local)
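To make the idea concrete, here is a minimal sketch of the kind of assembly lines such a patch could emit for one closure. It is not the code from this revision: entryName and entryLines are hypothetical helpers, the ordering of the lines is an assumption, and the only facts relied on are the ELF `.type sym, @function` directive and the rule that a symbol without `.globl` stays local.

```haskell
module Main where

import Data.List (stripPrefix)

-- | Hypothetical helper: derive "foo_entry" from "foo_info".
-- Returns Nothing when the label does not end in "_info".
entryName :: String -> Maybe String
entryName lbl =
  (++ "_entry") . reverse <$> stripPrefix (reverse "_info") (reverse lbl)

-- | Assembly lines for the code entry of one closure (illustrative only).
-- With tables-next-to-code the info label sits at the first instruction,
-- so the extra entry label is placed at the same address.
entryLines :: Bool -> String -> [String]
entryLines tablesNextToCode infoLbl =
  case (tablesNextToCode, entryName infoLbl) of
    (True, Just e) ->
      [ infoLbl ++ ":"
      , "\t.type " ++ e ++ ", @function"  -- no .globl, so the symbol stays local
      , e ++ ":"
      ]
    -- This sketch adds nothing in the other cases; it does not model
    -- the layout used without tables-next-to-code.
    _ -> [infoLbl ++ ":"]

main :: IO ()
main = mapM_ putStrLn (entryLines True "Main_fibbzuslow_info")
```

Because the extra symbol is never exported, it should only show up in .symtab, which matches the two differences listed above.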

This patch can also be used as a base for D4722.

As an option, we can merge this with D4713 and use one of the approaches depending on the possibility of relocation issues.

Test Plan

tests

Perf output on stock ghc

     9.78%  FibbSlow  FibbSlow            [.] ckY_info
     9.59%  FibbSlow  FibbSlow            [.] cjqd_info
     7.17%  FibbSlow  FibbSlow            [.] c3sg_info
     6.62%  FibbSlow  FibbSlow            [.] c1X_info
     5.32%  FibbSlow  FibbSlow            [.] cjsX_info
     4.18%  FibbSlow  FibbSlow            [.] s3rN_info
     3.82%  FibbSlow  FibbSlow            [.] c2m_info
     3.68%  FibbSlow  FibbSlow            [.] cjlJ_info
     3.26%  FibbSlow  FibbSlow            [.] c3sb_info
     3.19%  FibbSlow  FibbSlow            [.] cjPQ_info
     3.05%  FibbSlow  FibbSlow            [.] cjQd_info
     2.97%  FibbSlow  FibbSlow            [.] cjAB_info
     2.78%  FibbSlow  FibbSlow            [.] cjzP_info
     2.40%  FibbSlow  FibbSlow            [.] cjOS_info
     2.38%  FibbSlow  FibbSlow            [.] s3rK_info
     2.27%  FibbSlow  FibbSlow            [.] cjq0_info
     2.18%  FibbSlow  FibbSlow            [.] cKQ_info
     2.13%  FibbSlow  FibbSlow            [.] cjSl_info
     1.99%  FibbSlow  FibbSlow            [.] s3rL_info
     1.98%  FibbSlow  FibbSlow            [.] c2cC_info
     1.80%  FibbSlow  FibbSlow            [.] s3rO_info
     1.37%  FibbSlow  FibbSlow            [.] c2f2_info
...

Perf output on D4713

     7.97%  FibbSlow  FibbSlow            [.] c3rM_info
     6.75%  FibbSlow  FibbSlow            [.] 0x000000000032cfa8
     6.63%  FibbSlow  FibbSlow            [.] cifA_info
     4.98%  FibbSlow  FibbSlow            [.] integerzmgmp_GHCziIntegerziType_eqIntegerzh_info
     4.55%  FibbSlow  FibbSlow            [.] chXn_info
     4.52%  FibbSlow  FibbSlow            [.] c3rH_info
     4.45%  FibbSlow  FibbSlow            [.] chZB_info
     4.04%  FibbSlow  FibbSlow            [.] Main_fibbzuslow_info
     4.03%  FibbSlow  FibbSlow            [.] stg_ap_0_fast
     3.76%  FibbSlow  FibbSlow            [.] chXA_info
     3.67%  FibbSlow  FibbSlow            [.] cifu_info
     3.25%  FibbSlow  FibbSlow            [.] ci4r_info
     2.64%  FibbSlow  FibbSlow            [.] s3rf_info
     2.42%  FibbSlow  FibbSlow            [.] s3rg_info
     2.39%  FibbSlow  FibbSlow            [.] integerzmgmp_GHCziIntegerziType_eqInteger_info
     2.25%  FibbSlow  FibbSlow            [.] integerzmgmp_GHCziIntegerziType_minusInteger_info
     2.17%  FibbSlow  FibbSlow            [.] ghczmprim_GHCziClasses_zeze_info
     2.09%  FibbSlow  FibbSlow            [.] cicc_info
     2.03%  FibbSlow  FibbSlow            [.] 0x0000000000331e15
     2.02%  FibbSlow  FibbSlow            [.] s3ri_info
     1.91%  FibbSlow  FibbSlow            [.] 0x0000000000331bb8
     1.89%  FibbSlow  FibbSlow            [.] ci4N_info
...

Perf output on ghc with this patch

    10.41%  FibbSlow.n  FibbSlow.n          [.] stg_upd_frame_ret
     8.81%  FibbSlow.n  FibbSlow.n          [.] integerzmgmp_GHCziIntegerziType_eqIntegerzh_entry
     7.14%  FibbSlow.n  FibbSlow.n          [.] c3sg_info
     6.95%  FibbSlow.n  FibbSlow.n          [.] integerzmgmp_GHCziIntegerziType_eqInteger_entry
     5.61%  FibbSlow.n  FibbSlow.n          [.] cjpX_info
     4.25%  FibbSlow.n  FibbSlow.n          [.] Main_fibbzuslow_entry
     4.14%  FibbSlow.n  FibbSlow.n          [.] stg_ap_pp_ret
     4.06%  FibbSlow.n  FibbSlow.n          [.] cjPF_info
     3.82%  FibbSlow.n  FibbSlow.n          [.] cjPZ_info
     3.50%  FibbSlow.n  FibbSlow.n          [.] ghczmprim_GHCziClasses_zeze_entry
     3.45%  FibbSlow.n  FibbSlow.n          [.] cjzX_info
     3.32%  FibbSlow.n  FibbSlow.n          [.] cjAk_info
     3.09%  FibbSlow.n  FibbSlow.n          [.] c3sb_info
     2.84%  FibbSlow.n  FibbSlow.n          [.] cjzM_info
     2.38%  FibbSlow.n  FibbSlow.n          [.] s3rN_info
     2.25%  FibbSlow.n  FibbSlow.n          [.] s3rL_info
     2.19%  FibbSlow.n  FibbSlow.n          [.] s3rK_info
     2.01%  FibbSlow.n  FibbSlow.n          [.] 0x00000000000b10d8
     2.00%  FibbSlow.n  FibbSlow.n          [.] 0x00000000000b1116
     1.90%  FibbSlow.n  FibbSlow.n          [.] cjPK_info
     1.89%  FibbSlow.n  FibbSlow.n          [.] integerzmgmp_GHCziIntegerziType_minusInteger_entry
     1.84%  FibbSlow.n  FibbSlow.n          [.] stg_BLACKHOLE_entry
     1.79%  FibbSlow.n  FibbSlow.n          [.] s3rO_info
     1.35%  FibbSlow.n  FibbSlow.n          [.] cjPu_info
     1.31%  FibbSlow.n  FibbSlow.n          [.] cjRo_info
     1.20%  FibbSlow.n  FibbSlow.n          [.] base_GHCziNum_zm_entry
     0.72%  FibbSlow.n  FibbSlow.n          [.] c2d1_info
     0.67%  FibbSlow.n  FibbSlow.n          [.] base_GHCziNum_zp_entry
     0.60%  FibbSlow.n  FibbSlow.n          [.] ghczmprim_GHCziClasses_CZCEq_con_entry
     0.60%  FibbSlow.n  FibbSlow.n          [.] cjBJ_info
...
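The symbol names in the listings above are Z-encoded by GHC, which makes them hard to read at a glance. The following sketch decodes a subset of the encoding (only the escapes in the two tables below; zDecode is a hypothetical helper, not part of perf or GHC's tooling) so that the entries can be mapped back to source-level names.

```haskell
module Main where

-- | Escapes introduced by a lowercase 'z' (subset of GHC's Z-encoding).
lowerEscapes :: [(Char, Char)]
lowerEscapes =
  [ ('z', 'z'), ('m', '-'), ('i', '.'), ('u', '_'), ('h', '#')
  , ('e', '='), ('p', '+'), ('d', '$'), ('s', '/'), ('t', '*') ]

-- | Escapes introduced by an uppercase 'Z' (subset of GHC's Z-encoding).
upperEscapes :: [(Char, Char)]
upperEscapes = [ ('Z', 'Z'), ('C', ':'), ('L', '('), ('R', ')') ]

-- | Decode the escapes listed above; anything else is copied unchanged.
zDecode :: String -> String
zDecode ('z' : c : rest)
  | Just d <- lookup c lowerEscapes = d : zDecode rest
zDecode ('Z' : c : rest)
  | Just d <- lookup c upperEscapes = d : zDecode rest
zDecode (c : rest) = c : zDecode rest
zDecode [] = []

main :: IO ()
main = mapM_ (putStrLn . zDecode)
  [ "integerzmgmp_GHCziIntegerziType_eqIntegerzh_entry"
  , "Main_fibbzuslow_entry"
  , "ghczmprim_GHCziClasses_zeze_entry"
  ]
```

For example, ghczmprim_GHCziClasses_zeze_entry decodes to ghc-prim_GHC.Classes_==_entry, i.e. the entry code of (==) from GHC.Classes.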
last_g created this revision. May 24 2018, 9:45 AM
last_g edited the summary of this revision. May 24 2018, 9:47 AM
last_g retitled this revision from "Add local @function symbols for the code entries code" to "Add local @function symbols for the code entries".
angerman requested changes to this revision. May 26 2018, 5:42 AM
angerman added inline comments.
compiler/nativeGen/X86/Ppr.hs
193–197

I believe you'll need an osElfTarget guard here, similar to the one for pprType. As far as I know, Mach-O doesn't have the .type directive.

We use the NCG to produce code at least for macOS (Mach-O), Linux/BSD (ELF), and Windows (PE).
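A minimal sketch of the guard being asked for, written against plain strings rather than GHC's real pretty-printer. ObjFormat, isElf and functionTypeDirective are illustrative names, not identifiers from the compiler; isElf only stands in for an osElfTarget-style check.

```haskell
module Main where

-- | The three object formats named in the comment above (illustrative type).
data ObjFormat = ELF | MachO | PE
  deriving (Eq, Show)

-- | Stand-in for an osElfTarget-style check.
isElf :: ObjFormat -> Bool
isElf = (== ELF)

-- | Emit the .type directive only for ELF targets; emit nothing elsewhere.
functionTypeDirective :: ObjFormat -> String -> [String]
functionTypeDirective fmt lbl
  | isElf fmt = ["\t.type " ++ lbl ++ ", @function"]
  | otherwise = []

main :: IO ()
main = mapM_ report [ELF, MachO, PE]
  where
    report fmt =
      putStrLn (show fmt ++ ": " ++ show (functionTypeDirective fmt "foo_entry"))
```

Returning an empty list for the non-ELF cases keeps the caller free of format checks; whether the extra label itself should still be emitted on those platforms is a separate question that this sketch does not answer.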

This revision now requires changes to proceed. May 26 2018, 5:42 AM

Note that while we have the fix in LLVM, the result only works with the llvm-ng backend, not with the stock llvm backend due to the excessive use of aliases in the stock backend, which confuses LLVM's entry-point logic.

compiler/nativeGen/X86/Ppr.hs
103–111

With the extra entry label, we might be able to drop this $dsp-generating code on macOS.

E.g. see the logic we implemented in LLVM (https://reviews.llvm.org/D30770).

193–197

In combination with the above, you probably want an alternative guard for platformHasSubsectionsViaSymbols platform with

$$ (ppr lbl <> char ':')
$$ text ".alt_entry" <+> ppr functionLbl
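As a rough illustration of that suggestion, the sketch below mirrors the two lines with plain strings. altEntryLines and its Boolean flag are hypothetical stand-ins for the real pretty-printer code and for platformHasSubsectionsViaSymbols; the order of the two emitted lines simply follows the comment and has not been checked against an assembler.

```haskell
module Main where

-- | Emit the label, plus an .alt_entry line when the platform uses
-- subsections-via-symbols (the flag is a stand-in for the real check).
altEntryLines :: Bool -> String -> String -> [String]
altEntryLines hasSubsectionsViaSymbols lbl functionLbl
  | hasSubsectionsViaSymbols =
      [ lbl ++ ":"
      , "\t.alt_entry " ++ functionLbl  -- directive and symbol separated by a space
      ]
  | otherwise = [lbl ++ ":"]

main :: IO ()
main = mapM_ putStrLn (altEntryLines True "foo_info" "foo_entry")
```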