Make Windows linker more robust to unknown sections

Authored by Phyx on Oct 3 2015, 3:28 PM.

Description

The Windows Linker has 3 main parts that this patch changes.

  1. Identification and classification of sections
  2. Adding symbols to the symbol table
  3. Relocation of sections

Previously, section identification was done on a whitelist basis, and
exclusively by section name. This led to a cat-and-mouse game between
GCC and GHC: every time GCC added new sections, there was a good
chance GHC would break. Luckily this hasn't happened much in the
past, because the GCC versions GHC used were largely unchanged.

The new code instead treats all unknown sections as CODE or DATA
sections, and adjusts the classification based on each section's
Characteristics flags in the PE header. This removes the fragility
caused by changing section names. The one exception is the .ctors
section, which has no differentiating flag in the PE header, but which
we know must be treated as initialization data.
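
A minimal sketch of flag-based classification, not the code in this
patch. The flag values are the section flags from the PE/COFF
specification; the SectionKind enum, the function name and the exact
mapping from flags to kinds are assumptions made for illustration, and
the section name is assumed to be NUL-terminated already.

```c
#include <stdint.h>
#include <string.h>

/* Section flag values as defined by the PE/COFF specification. */
#define SCN_CNT_CODE               0x00000020u
#define SCN_CNT_INITIALIZED_DATA   0x00000040u
#define SCN_CNT_UNINITIALIZED_DATA 0x00000080u
#define SCN_MEM_WRITE              0x80000000u

/* Hypothetical kinds; these names are not taken from the RTS. */
typedef enum {
    KIND_CODE_OR_RODATA,
    KIND_RWDATA,
    KIND_INIT_ARRAY
} SectionKind;

/* Classify a section from its flags; only .ctors is matched by name. */
SectionKind classify_section(const char *name, uint32_t characteristics)
{
    /* .ctors has no distinguishing flag, so it is the one name check. */
    if (strcmp(name, ".ctors") == 0)
        return KIND_INIT_ARRAY;

    /* Assumption: writable or uninitialized sections are read-write data. */
    if (characteristics & (SCN_MEM_WRITE | SCN_CNT_UNINITIALIZED_DATA))
        return KIND_RWDATA;

    /* Everything else, including unknown names, is code or read-only data. */
    return KIND_CODE_OR_RODATA;
}
```

With this shape, a section with a name the linker has never seen still
gets a usable classification instead of being rejected.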

The check that sections are 4-byte aligned has been removed. The
reason is that debug sections are often 1-byte aligned but do have
relocations. To support relocation of .debug sections, this check
has to go. Crucially, the rest of the code does not seem to rely on
this assumption: further down the road we only check that there are
at least 4 bytes to realign.
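
To illustrate why a size check can stand in for the alignment check,
here is a sketch (not the code in this patch) of patching a 32-bit
value at an arbitrary, possibly unaligned, offset. The function name
and signature are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add delta to the 32-bit value stored at section[offset].
 * Returns false if fewer than 4 bytes remain at that offset. */
bool add_to_u32_at(uint8_t *section, size_t size, size_t offset,
                   uint32_t delta)
{
    uint32_t value;

    /* Only the remaining size matters, not the alignment of the offset. */
    if (offset > size || size - offset < sizeof value)
        return false;

    /* memcpy avoids an unaligned load/store through a uint32_t pointer. */
    memcpy(&value, section + offset, sizeof value);
    value += delta;
    memcpy(section + offset, &value, sizeof value);
    return true;
}
```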

The second loop iterates over all the symbols in the file and tries
to add them to the symbol table. Because the section classification
from step 1 is (currently) not available in this phase, we still have
to exclude the sections by hand. If we don't, we will load symbols
from sections we explicitly ignored in step 1. This whole part should
be rewritten to avoid this, but I didn't want to do it in this
commit.
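
A sketch of what excluding sections by hand amounts to: a name lookup
that has to be kept in sync with step 1. The function name is
hypothetical, and the list of names is passed in by the caller because
the actual names excluded by the patch are not listed here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if the section called name appears in the excluded list,
 * meaning symbols defined in it should not be added to the symbol table. */
bool section_is_excluded(const char *name,
                         const char *const *excluded, size_t n_excluded)
{
    for (size_t i = 0; i < n_excluded; i++) {
        if (strcmp(name, excluded[i]) == 0)
            return true;
    }
    return false;
}
```

Carrying the classification from step 1 over to this phase would make
such a second, name-based list unnecessary.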

Finally, the sections are relocated. For some reason the PE files
contain a Linux relocation constant, 0x0011. As far as I can tell,
this constant does not come from GHC (or I couldn't find where it is
being set). I believe this is probably a bug in GAS, but because the
constant is in the output we have to handle it. I am thus mapping it
to the constant I think it should be, 0x0003.
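
The mapping can be pictured as a small normalization step applied
before a relocation is dispatched on its type. This is only a sketch;
the macro and function names are hypothetical, and the patch may
handle the constant differently (for example as an extra case label).

```c
#include <stdint.h>

#define RELOC_UNEXPECTED 0x0011u /* constant observed in the object files */
#define RELOC_MAPPED_TO  0x0003u /* constant it is treated as */

/* Map the unexpected relocation type onto the intended one;
 * every other type is passed through unchanged. */
uint16_t normalize_reloc_type(uint16_t type)
{
    return type == RELOC_UNEXPECTED ? (uint16_t)RELOC_MAPPED_TO : type;
}
```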

Finally, static linking *should* work, but won't. At least not if you
want to statically link libgcc with exception support. Doing so would
require you to link not only libgcc and libstdc++ but also libmingwex.
The problem is that libmingwex also defines a lot of symbols that the
RTS automatically injects into the symbol table, presumably because
they are symbols that it needs, like coshf. These symbols are not in a
section that is declared with weak symbol support. So if we ever want
to get this working, we should either a) ask mingw to declare the
section as such, or b) treat all imported symbols as being weak,
though the latter doesn't seem like a good idea.

Test Plan:
Running ./validate for both x86 and x86_64

Also running the specific test case for Trac #10672

make TESTS="T10672_x86 T10672_x64"

Reviewed By: ezyang, thomie, austin

Differential Revision: https://phabricator.haskell.org/D1244

GHC Trac Issues: Trac #9907, Trac #10672, Trac #10563

Details

Committed
thomie, Oct 3 2015, 3:33 PM
Pushed
bgamari, Aug 25 2016, 1:39 PM
Reviewer
ezyang
Differential Revision
D1244: Make Windows linker more robust to unknown sections
Parents
rGHCDIFFd2fb53281da4: testsuite: Bump up haddock.base expected allocations