Systools: read ELF section without calling readelf
ClosedPublic

Authored by hsyl20 on Oct 27 2015, 8:23 AM.

Details

Summary

This patch tackles two issues:

  1. GHC stores a "link info" string into a ELF section. Initially a section with type "note" was used but GHC didn't follow the ELF specification which specifies a record-based format for these sections. With D1375 we switched to a "progbits" section type for which there isn't any format constraint. This is an issue for D1242 which use GCC's --gc-sections which collects "unused" sections, such as our section containing link info... In this patch, we fall back to a section with type "note" but we respect the specified format.
  1. Reading back the ELF section was done by parsing the result of a call to "readelf". Calling readelf is problematic because the program may not be available or it may be renamed on some platforms (see D1326). Moreover we have no garanty that its output layout will stay the same in future releases of readelf. Finally we would need to fix the parsing to support "note" sections because of 1. Instead, this patch proposes to use Data.Binary.Get to directly read the "link info" note into its section. ELF has a specification, hence it should work on every conforming platform.

This patch "reverts" D1375, hence it supersedes D1432. It makes D1326 not necessary anymore.

Test Plan
  • recomp011 should pass (test that relinking is avoided when both "link info" match)
  • we should add a test for ELF objects with more than 0xff00 sections => added test "recomp015"
  • we should check that GAS generates 32-bit words with .int on every supported platform using ELF (or find a place where this is documented). harbomaster and I (@hsyl20) only tested on x86-64. On platforms where it is not true, it should make recomp011 fail. => tested to work on Linux/amd64, Solaris/i386 and OpenBSD/amd64

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
There are a very large number of changes, so older changes are hidden. Show Older Changes
bgamari edited edge metadata.Oct 27 2015, 9:14 AM

Perhaps we should move this to another source file? SysTools is already a bit of a grab-bag.

hsyl20 updated this revision to Diff 4705.Oct 27 2015, 9:20 AM
hsyl20 edited edge metadata.

Replace (<$>) with fmap

hsyl20 updated this revision to Diff 4707.Oct 27 2015, 9:34 AM
hsyl20 edited edge metadata.

Move readElfSection in its own module

hsyl20 updated this revision to Diff 4708.Oct 27 2015, 9:46 AM
hsyl20 edited edge metadata.

Replace match with fmap

Haven't read over the code carefully or compared it to the ELF spec but I am generally in favor of this approach. If it is so easy to do, then as you say, parsing the ELF format, which has a spec, must be more sensible than parsing the output of readelf, which does not. (Think of the trouble with parsing the output of nm in deriveConstants...)

The situation is a bit delicate though with the existence of D1326 and the timing of 7.10.3. It would be a little unfortunate to add a -pgmreadelf option and then immediately deprecate it in the next version of GHC, but there's also not lots of time to test this patch on exotic systems. Should we just assume that it will work on all ELF platforms? Maybe it's prudent to print a warning and then return Nothing in the event of parse failure, then nothing can really go wrong since this is currently only used to detect whether to avoid relinking the final executable.

Maybe it's prudent to print a warning and then return Nothing in the event of parse failure, then nothing can really go wrong since this is currently only used to detect whether to avoid relinking the final executable.

Oh, you already implemented this while I was writing my comment. Great.

(Think of the trouble with parsing the output of `nm` in `deriveConstants`...)

I have code to read ELF symbol information too ;-)

In D1381#39698, @hsyl20 wrote:
(Think of the trouble with parsing the output of `nm` in `deriveConstants`...)

I have code to read ELF symbol information too ;-)

Unfortunately that part of the code definitely needs to support all systems, and not just those which use the ELF object format. Probably once you add Mach-O and the Windows object format(s) it's not worth it to do the work ourselves.

hsyl20 updated this revision to Diff 4717.Oct 27 2015, 12:05 PM
hsyl20 edited edge metadata.

Rebase to origin/master

hsyl20 updated this revision to Diff 4924.Nov 4 2015, 12:33 PM
hsyl20 edited edge metadata.

Support ELF "note" section type and use them for "link info"

hsyl20 updated this object.Nov 4 2015, 1:17 PM
hsyl20 edited the test plan for this revision. (Show Details)
hsyl20 added a reviewer: olsner.
hsyl20 updated the Trac tickets for this revision.
hsyl20 updated this revision to Diff 4928.Nov 4 2015, 1:21 PM

Minor fixes

hsyl20 updated this revision to Diff 4932.Nov 4 2015, 3:51 PM
hsyl20 edited edge metadata.

Fix last commits

hsyl20 updated this revision to Diff 4933.Nov 4 2015, 4:05 PM
hsyl20 edited edge metadata.

Fix lint warnings

hsyl20 updated this revision to Diff 4937.Nov 4 2015, 6:34 PM
hsyl20 edited edge metadata.

Fix last lint issues

ony edited edge metadata.Nov 5 2015, 11:36 AM

It would be nice to have some external library which can be used by user programs as well. Thus I'd prefer to see it a separate project pulled in as a git submodule similar to other libraries user in GHC.

There is already http://hackage.haskell.org/package/elf but extracting a section and a note is relatively trivial so I don't think it is worth adding another submodule. However, if we have to manage symbol tables and the like in the future, it could be a solution.

thomie edited edge metadata.Nov 5 2015, 1:26 PM

Looks good in general. A few remarks:

  • Can you put a link to the ELF specification you based this on in compiler/main/Elf.hs? rts/Linker.c refers to a draft for a "next" version:
These extensions seem to be undocumented in version 4.1 of the ABI and only
appear in the drafts for the "next" version:
   https://refspecs.linuxfoundation.org/elf/gabi4+/contents.html

I also found this draft, dated 10 June 2013: http://www.sco.com/developers/gabi/latest/contents.html
Does the code work with either specification?

" If the number of sections is greater than or equal to SHN_LORESERVE (0xff00), e_shnum has the value SHN_UNDEF (0) and the actual number of section header table entries is contained in the sh_size field of the section header at index 0 (otherwise, the sh_size member of the initial entry contains 0)."

"If the section name string table section index is greater than or equal to SHN_LORESERVE (0xff00), this member has the value SHN_XINDEX (0xffff) and the actual index of the section name string table section is contained in the sh_link field of the section header at index 0. (Otherwise, the sh_link member of the initial entry contains 0.) "

If it isn't necessary to look at those fields, please explain so in the code.

  • You mentioned on irc: "It works on my x86-64 arch but I don't know for other arch". Is more testing needed? Harbormaster only tests x86-64 as well.
  • What happens when readElfNote is called with a sectionName that doesn't contain Notes but something else instead. I suppose bad things will happen? If there is no way to prevent this, maybe expand the docstring on readElfNote a little, saying you shouldn't do that.
compiler/main/DriverPipeline.hs
1678
  • What does "try" mean?
  • Why do you want .int to be 32-bit in the first place? Does it not work if you don't do that, and then use gwN to read the values back in readElfNoteBS. Please explain in the code.
  • Also put a link to a specification of the format you are using here (gas syntax).

I also found this (https://refspecs.linuxfoundation.org/elf/gabi4+/ch5.pheader.html#note_section), it might be relevant:

"In 64-bit objects (files with e_ident[EI_CLASS] equal to ELFCLASS64), each entry is an array of 8-byte words in the format of the target processor. In 32-bit objects (files with e_ident[EI_CLASS] equal to ELFCLASS32), each entry is an array of 4-byte words in the format of the target processor."

compiler/main/Elf.hs
190

Why not wordSize hdr and gwN hdr instead of allign 4 and gw32 hdr?

https://refspecs.linuxfoundation.org/elf/gabi4+/ch5.pheader.html#note_section mentions

In 64-bit objects (files with e_ident[EI_CLASS] equal to ELFCLASS64), each entry is an array of 8-byte words in the format of the target processor. In 32-bit objects (files with e_ident[EI_CLASS] equal to ELFCLASS32), each entry is an array of 4-byte words in the format of the target processor.

thomie requested changes to this revision.Nov 9 2015, 5:13 AM
thomie edited edge metadata.

Moving out of review queue.

This revision now requires changes to proceed.Nov 9 2015, 5:13 AM
hsyl20 updated this revision to Diff 4984.Nov 9 2015, 7:48 AM
hsyl20 edited edge metadata.

Refactoring and more comments.

Thanks for the review @thomie! I have tried to answer you on every point. Sorry for the delay, I had to delve a little bit into readelf code to understand how it was possible to support the "next" version of the spec for Note sections in a forward-compatible manner (answer: it is not possible). I will update the test plan too.

hsyl20 edited the test plan for this revision. (Show Details)Nov 9 2015, 7:55 AM
hsyl20 edited edge metadata.
hsyl20 marked 2 inline comments as done.Nov 9 2015, 8:24 AM
thomie added a subscriber: kgardas.

Nice improvements. I like the refactorings you did, and the Note [ELF specification] is excellent. All my questions were indeed answered.

I double checked the code against the spec (including all calculations, Elf64_Word representing 4 bytes is tricky), and I vouch for its correctness.

I'm not an expert of lazy IO

I'm not either, so a review by someone else might be a good idea. @bgamari or @austin or @rwbarton? (This patch unblocks D1242, which is slated for ghc-8)?

@kgardas: maybe you could check that recomp011 still passes on your platforms using this patch. See testplan. Thanks.

kgardas edited edge metadata.Nov 10 2015, 3:31 AM

@thomie: Solaris/i386 runs fine. Solaris/amd64 fails with:

=====> recomp011(normal) 1 of 1 [0, 0, 0] 
cd . && $MAKE -s --no-print-directory recomp011    </dev/null > recomp011.run.stdout 2> recomp011.run.stderr
Actual stdout output differs from expected:
Warning: missing newline at end of file ./recomp011.stdout.normalised
Warning: missing newline at end of file ./recomp011.run.stdout.normalised
--- ./recomp011.stdout.normalised       Tue Nov 10 10:25:19 2015
+++ ./recomp011.run.stdout.normalised   Tue Nov 10 10:25:19 2015
@@ -1,10 +1,6 @@
 [1 of 1] Compiling Main             ( Main.hs, Main.o )
 Linking Main ...
 42
-[1 of 1] Compiling Main             ( Main.hs, Main.o ) [B.hsinc changed]
-Linking Main ...
-43
-[1 of 1] Compiling Main             ( Main.hs, Main.o ) [A.hsinc changed]
-Linking Main ...
-4343
-4343
+42
+42
+42
*** unexpected failure for recomp011(normal)

Unexpected results from:
TEST="recomp011"

OpenBSD/amd64 would run fine if there would not be OpenBSD specific warnings:

+++ ./recomp011.run.stderr.normalised   Tue Nov 10 10:30:20 2015
@@ -0,0 +1,27 @@
+/home/karel/vcs/ghc-src/d1381-test/rts/dist/build/libHSrts.a(RtsFlags.o): In function `copyArg':
+
+rts/RtsFlags.c:1773:0:
+     warning: warning: strcpy() is almost always misused, please use strlcpy()
+/usr/local/lib/libgmp.so.9.0: warning: warning: vsprintf() is often misused, please use vsnprintf()
+/home/karel/vcs/ghc-src/d1381-test/rts/dist/build/libHSrts.a(RtsUtils.o): In function `showStgWord64':
+
+rts/RtsUtils.c:205:0:
+     warning: warning: sprintf() is often misused, please use snprintf()
+/home/karel/vcs/ghc-src/d1381-test/rts/dist/build/libHSrts.a(RtsFlags.o): In function `copyArg':
+
+rts/RtsFlags.c:1773:0:
+     warning: warning: strcpy() is almost always misused, please use strlcpy()
+/usr/local/lib/libgmp.so.9.0: warning: warning: vsprintf() is often misused, please use vsnprintf()
+/home/karel/vcs/ghc-src/d1381-test/rts/dist/build/libHSrts.a(RtsUtils.o): In function `showStgWord64':
+
+rts/RtsUtils.c:205:0:
+     warning: warning: sprintf() is often misused, please use snprintf()
+/home/karel/vcs/ghc-src/d1381-test/rts/dist/build/libHSrts.a(RtsFlags.o): In function `copyArg':
+
+rts/RtsFlags.c:1773:0:
+     warning: warning: strcpy() is almost always misused, please use strlcpy()
+/usr/local/lib/libgmp.so.9.0: warning: warning: vsprintf() is often misused, please use vsnprintf()
+/home/karel/vcs/ghc-src/d1381-test/rts/dist/build/libHSrts.a(RtsUtils.o): In function `showStgWord64':
+
+rts/RtsUtils.c:205:0:
+     warning: warning: sprintf() is often misused, please use snprintf()
\ No newline at end of file
*** unexpected failure for recomp011(normal)

Unexpected results from:
TEST="recomp011"

is there anything I shall test on Solaris/amd64 in order to fix that?

@kgardas: recomp011 actually already fails on Solaris/amd64 with HEAD, see http://haskell.inf.elte.hu/builders/solaris-amd64-head/391/22.html. It has been failing for at least a year: http://haskell.inf.elte.hu/builders/solaris-amd64-head/10/22.html, so we don't really know if this patch makes the problem worse or not.

It is a test for Trac #3589: "Recompilation checker doesn't take into account CPP headers."

The test itself looks ok to me. I don't why it doesn't pass on Solaris/amd64. The patch attached to the ticket doesn't seem to have any platform specific bits: https://git.haskell.org/ghc.git/commitdiff/3f34e0913efcc82dd90be56d04c5e57ec60d3677
I'm afraid this'll be a difficult one to crack.

@thomie: silly me! I've kept in head a need to verify that this test passes on builder, but then submitted too quickly. :-) Of course since it already fails this is no regression so for my platforms this D looks fine.

thomie accepted this revision.Nov 10 2015, 5:03 AM
thomie edited edge metadata.

@kgardas ok, hopefully you can debug that problem some other time.

So this patch has been tested to work on:

  • Linux/amd64
  • Solaris/i386
  • OpenBSD/amd64

Good enough for me.

we should add a test for ELF objects with more than 0xff00 sections

That would still be a good idea. @hsyl20: either add it to this patch, or create a new Diff later.

@austin, @bgamari: I would like this to go in, to unblock D1242.

This revision is now accepted and ready to land.Nov 10 2015, 5:03 AM
hsyl20 updated this revision to Diff 5000.Nov 10 2015, 6:28 AM
hsyl20 edited edge metadata.

Add test for huge number of ELF sections

hsyl20 updated this revision to Diff 5001.Nov 10 2015, 6:37 AM
hsyl20 edited edge metadata.

Remove some tabs...

hsyl20 edited the test plan for this revision. (Show Details)Nov 10 2015, 6:40 AM
hsyl20 edited edge metadata.
hsyl20 updated this revision to Diff 5002.Nov 10 2015, 7:53 AM

Avoid non portable loops in Makefile

bgamari accepted this revision.Nov 11 2015, 5:28 AM
bgamari edited edge metadata.

Looks good to me!

This revision was automatically updated to reflect the committed changes.