[Elf/arm] Thumb indicator bit only for STT_FUNC

Authored by angerman on Apr 10 2017, 12:28 AM.

Diff Detail

rGHC Glasgow Haskell Compiler
Automatic diff as part of commit; lint not applicable.
Automatic diff as part of commit; unit tests not applicable.
angerman created this revision. Apr 10 2017, 12:28 AM
trofi accepted this revision. Apr 10 2017, 2:28 AM
trofi added a subscriber: trofi.
trofi added inline comments.

Two questions here:

  • How about STT_GNU_IFUNC? (Perhaps we don't support them in the GHCi linker anyway.)
  • llvmFixupAsm rewrites @function types to @object types to avoid calls via the PLT. Is that an issue, or do we never generate Thumb code?
This revision is now accepted and ready to land. Apr 10 2017, 2:28 AM
angerman added inline comments. Apr 10 2017, 3:08 AM
  • I do not believe we support STT_GNU_IFUNC at all.
  • In rts/linker/SymbolExtras.c we find the following note:
  Note [The ARM/Thumb Story]

  Support for the ARM architecture is complicated by the fact that ARM has not
  one but several instruction encodings. The two relevant ones here are the original
  ARM encoding and Thumb, a more dense variant of ARM supporting only a subset
  of the instruction set.

  How the CPU decodes a particular instruction is determined by a mode bit. This
  mode bit is set on jump instructions, the value being determined by the low
  bit of the target address: An odd address means the target is a procedure
  encoded in the Thumb encoding whereas an even address means it's a traditional
  ARM procedure (the actual address jumped to is even regardless of the encoding bit).

  Interoperation between Thumb- and ARM-encoded object code (known as "interworking")
  is tricky. If the linker needs to link a call by an ARM object into Thumb code
  (or vice-versa) it will produce a jump island using makeArmSymbolExtra. This,
  however, is incompatible with GHC's tables-next-to-code since pointers
  fixed-up in this way will point to a bit of generated code, not a info
  table/Haskell closure like TNTC expects. For this reason, it is critical that
  GHC emit exclusively ARM or Thumb objects for all Haskell code.

  We still do, however, need to worry about calls to foreign code, hence the
  need for makeArmSymbolExtra.
  • Regarding the function/object rewrite, I am skeptical at best, and I am still trying to reproduce that scenario so that it can be fixed properly in LLVM. This is currently on hold because the sample project we had was flawed (https://reviews.llvm.org/D30812). I am deliberately running with the mangler disabled, hoping to run into the specific issue that required the function/object rewrite.
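The low-bit convention described in the quoted note can be sketched as follows. This is a minimal illustration in Python, not code from the RTS; the function name split_branch_target and the sample addresses are hypothetical.

```python
def split_branch_target(target):
    """Split an interworking branch target into (address, encoding).

    Bit 0 of the target selects the instruction encoding. The address
    actually jumped to always has bit 0 cleared.
    """
    encoding = "thumb" if target & 1 else "arm"
    return target & ~1, encoding


# Hypothetical addresses, for illustration only.
for target in (0x8001, 0x8000):
    address, encoding = split_branch_target(target)
    print(hex(target), "->", hex(address), encoding)
```

The convention only makes sense for code addresses. A data symbol that happens to sit at an odd address carries no encoding information in its low bit, which is the distinction this revision is about.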

This came up in the following case, when using the GHC interpreter.

To obtain the name of some object, we use a relative offset to the function, which
in this case happened to have the lowest bit set. The relocation then wrote back a
value for the symbol that had the lowest bit cleared (-295856 instead of -295855), and
thus we ended up pointing to a NUL byte, which is a perfectly legal string, yet the wrong one.

(lldb) mem read `(char*)(con_info+1)-295856`
0xa0aa9710: 00 67 68 63 2d 70 72 69 6d 3a 47 48 43 2e 54 79  .ghc-prim:GHC.Ty
0xa0aa9720: 70 65 73 2e 49 23 00 67 68 63 2d 70 72 69 6d 3a  pes.I#.ghc-prim:
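The off-by-one can be replayed with plain arithmetic on the dump above. The sketch below is a simplified model in Python, not the RTS's C code: the helper names are hypothetical, the base address is derived from the dump (0xa0aa9710 + 295856), and treating the name string as an STT_OBJECT symbol is an assumption. STT_OBJECT and STT_FUNC are the standard ELF symbol type values.

```python
STT_OBJECT = 1  # ELF symbol type: data object
STT_FUNC = 2    # ELF symbol type: function

# Bytes copied from the lldb dump, up to the second NUL.
DUMP_START = 0xA0AA9710
DUMP = bytes.fromhex(
    "00 67 68 63 2d 70 72 69 6d 3a 47 48 43 2e 54 79 "
    "70 65 73 2e 49 23 00"
)

# Value of (char*)(con_info+1), derived from the lldb expression.
BASE = DUMP_START + 295856


def strip_always(addr, sym_type):
    """Model of the bug: the low bit is cleared for every symbol."""
    return addr & ~1


def strip_func_only(addr, sym_type):
    """Model of the fix: only STT_FUNC symbols lose the low bit."""
    return addr & ~1 if sym_type == STT_FUNC else addr


def c_string_at(addr):
    """Read a NUL-terminated string out of the dumped bytes."""
    start = addr - DUMP_START
    end = DUMP.index(0, start)
    return DUMP[start:end].decode("ascii")


# Assumption: the name is a data symbol starting one byte into the dump.
name_addr = DUMP_START + 1

for resolve in (strip_always, strip_func_only):
    offset = resolve(name_addr, STT_OBJECT) - BASE
    print(resolve.__name__, offset, repr(c_string_at(BASE + offset)))
```

Under these assumptions the first model yields the offset -295856 and an empty string, while the second yields -295855 and the name ghc-prim:GHC.Types.I#.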
bgamari accepted this revision. Apr 10 2017, 8:39 PM

Nice catch!

This revision was automatically updated to reflect the committed changes.