Implement split-sections support for windows.
ClosedPublic

Authored by Phyx on Mar 26 2017, 3:01 PM.

Details

Summary

Initial implementation of split-section on Windows.

This also corrects the section naming and uses the platform
convention of $ instead of . to separate sections.

Implementation is based on @awson's patches to binutils.

Binutils requires some extra help when compiling the libraries
for GHCi usage. We drop the -T and use implicit scripts to amend
the default linker script instead of replacing it.

Because of these very large GHCi object files, we need big-obj support,
which will be added by another patch.

Test Plan

./validate

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
Phyx added a comment.Mar 26 2017, 3:01 PM

This unfortunately increases the size of the GHCi libraries.
So much that memory can't be allocated to load them. Have to
figure that out first.

Phyx updated the Trac tickets for this revision.Mar 26 2017, 3:02 PM
Phyx updated this revision to Diff 11878.Mar 26 2017, 5:25 PM

correct base

Phyx added a comment.Mar 27 2017, 4:51 AM

Guess the issue is with big-obj. I'll revert that from every call and instead add it only to the packages that need it for now while I modify GHCi.

awson edited edge metadata.Mar 27 2017, 5:34 AM

Nope, big-obj has nothing to do with the size of the GHCi object files. The problem is that binutils is still broken here: native section merging is implemented only for "normal executables", not for the prelinked object files (linked "without relocations") used by GHCi.

Thus, either binutils should be patched or the correct linker script files should be used.

Phyx added a comment.Mar 27 2017, 5:42 AM

Isn't that why we have $1_$2_LD_SCRIPT, to apply the custom linker scripts when making the GHCi objects?

Looking at the output of objdump, there are only 5 sections left, so I thought the sections were merged fine.
I'll try using your patched scripts.

awson added a comment.Mar 27 2017, 6:22 AM

If only 5 sections remain then you have perhaps used the same linker script for both executables and GHCi prelinked object files, which is wrong. You should use different scripts for them: those with the .x extension for executables and those with the .xr extension for prelinked object files (I refer to the scripts I've posted on Trac).

But the scripts on Trac are tailored to the old section naming scheme. Those used for executables contain section merging logic for the new scheme (since binutils has supported it for quite a long time already), but the .xr scripts don't. This is what I wrote about in my previous comment. Thus you need to add it (things like *(SORT(.text$*)), *(SORT(.data$*)), *(SORT(.rdata$*))) to the .xr files.
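
For readers unfamiliar with the `$` convention, the effect of rules such as *(SORT(.text$*)) can be modelled roughly as follows. This is only an illustrative Python sketch, not part of the patch and not how ld is implemented; the function name merge_grouped_sections is hypothetical. It assumes the usual PE grouping rule: the part of a section name before `$` selects the output section, and contributions are ordered by name.

```python
from collections import defaultdict


def merge_grouped_sections(input_sections):
    """Map PE-style input section names to merged output sections.

    A name such as ".text$foo" is assigned to output section ".text".
    Names without "$" keep their full name as the output section, so an
    ELF-style name like ".rdata.str" is not merged into ".rdata".
    """
    merged = defaultdict(list)
    for name in input_sections:
        base, _sep, _suffix = name.partition("$")
        merged[base].append(name)
    # Plain ".text" first, then the "$"-suffixed pieces sorted by name,
    # mirroring a script line like: *(.text) *(SORT(.text$*))
    return {
        base: sorted(names, key=lambda n: ("$" in n, n))
        for base, names in merged.items()
    }


if __name__ == "__main__":
    names = [".text$b", ".text", ".text$a", ".rdata$str", ".rdata.str"]
    for out, parts in merge_grouped_sections(names).items():
        print(out, "<-", parts)
```

In this model a section named with `.` as the separator ends up in an output section of its own, which is the "leak" a script without the matching patterns would produce.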

awson added a comment.Mar 27 2017, 6:28 AM

The "old" and "new" schemes are meant with regard to your patches to GHC. Perhaps it would be better to name them "gcc-elf-like" and "visualc-like".

Phyx added a comment.EditedMar 27 2017, 6:35 AM
In D3383#97145, @awson wrote:

But the scripts on Trac are tailored to the old section naming scheme. Those used for executables contain section merging logic for the new scheme (since binutils has supported it for quite a long time already), but the .xr scripts don't. This is what I wrote about in my previous comment. Thus you need to add it (things like *(SORT(.text$*)), *(SORT(.data$*)), *(SORT(.rdata$*))) to the .xr files.

Aah, sorry, now I understand the extent of the changes. I checked the .x files but not the .xr ones. This makes sense now. Thanks!

Btw, if I can ask one last question about the string merging: do you know how to specify a COMDAT attribute for sections for the PE target? The documentation has it for ELF but not PE. I've currently used the .gnu.linkonce.<name> section naming scheme to do it, but it would be much nicer if I could do it via an attribute...

EDIT: Nvm :) I figured out how to do it properly with binutils.

dfeuer added a subscriber: dfeuer.Mar 29 2017, 1:27 AM
dfeuer added inline comments.
compiler/nativeGen/PprBase.hs
99

What's this catch-all about?

Phyx added inline comments.Mar 29 2017, 1:35 AM
compiler/nativeGen/PprBase.hs
99

What do you mean? The catch-all existed before. I just added a special case for Windows above it.
It's there because after handling XCOFF, Mach-O and PE, the only things left are ELF platforms.
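
The dispatch being discussed can be pictured with a small Python sketch. It is an illustration only, not the code in PprBase.hs: the function name, the format tags, and the choice to return the bare base name for XCOFF and Mach-O are assumptions made for the example. Only the `$` versus `.` separator comes from the revision summary.

```python
def split_section_name(base, symbol, object_format):
    """Build a per-symbol section name for a hypothetical format tag."""
    if object_format in ("xcoff", "macho"):
        # Assumption for this sketch: these formats are special-cased
        # and do not get a per-symbol suffix at all.
        return base
    if object_format == "pe":
        # Windows convention: "$" separates the group from the suffix.
        return f"{base}${symbol}"
    # Catch-all: everything left is treated as an ELF platform.
    return f"{base}.{symbol}"


if __name__ == "__main__":
    for fmt in ("pe", "elf", "macho"):
        print(fmt, split_section_name(".text", "foo_info", fmt))
```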

Phyx updated this revision to Diff 11945.Mar 31 2017, 5:22 PM
  • Finished split sections.
Phyx retitled this revision from WIP: Implement split-sections on windows. to Implement split-sections support for windows..Mar 31 2017, 5:24 PM
Phyx edited the summary of this revision.
Phyx updated this revision to Diff 11946.Mar 31 2017, 5:30 PM
  • revert section labels.
Phyx planned changes to this revision.Apr 1 2017, 4:04 AM

Harbormaster seems to have some test failures. Will take a look.

Phyx updated this revision to Diff 12087.Apr 11 2017, 3:34 PM
  • Finished split sections.
  • revert section labels.
  • Revert "Finished split sections."
  • Merge only sections with $
  • Fix linker scripts.
  • temporarily disable big-obj
Phyx updated this revision to Diff 12093.Apr 12 2017, 1:43 AM

Hopefully finally have a clean rebase.

bgamari accepted this revision.Apr 25 2017, 11:32 AM

Yay! This looks good to me.

rules/build-package-way.mk
151

The naming here is a bit odd; after all, it's possible that we will link objects on a non-ELF (e.g. Mach-O) platform using GNU ld. This is a bit of a nit though.

This revision is now accepted and ready to land.Apr 25 2017, 11:32 AM
Phyx added inline comments.Apr 25 2017, 11:36 AM
rules/build-package-way.mk
151

I'll fix it before merge, expand the summary and add a release note entry.

Phyx edited the summary of this revision.Apr 30 2017, 11:19 AM
Phyx marked 2 inline comments as done.Apr 30 2017, 11:24 AM
Phyx updated this revision to Diff 12320.Apr 30 2017, 11:28 AM
  • Finished split sections.
  • revert section labels.
  • Revert "Finished split sections."
  • Merge only sections with $
  • Fix linker scripts.
  • temporarily disable big-obj
  • Split: update for merge.
bgamari requested changes to this revision.May 1 2017, 10:19 AM

I believe @Phyx intends to continue iterating on this.

This revision now requires changes to proceed.May 1 2017, 10:19 AM
Phyx updated this revision to Diff 12365.May 2 2017, 4:28 PM
Phyx edited edge metadata.
  • split: add big-obj for ghci files.
Phyx added a comment.May 2 2017, 4:29 PM

That should be all. I'd appreciate it if you have some time for a review, @awson!

awson added a comment.EditedMay 3 2017, 2:44 AM

I don't quite understand the purpose of the last -Wa,-mbig-obj addition.

The general idea is that neither an exe/dll nor a prelinked object file (for GHCi consumption) should ever contain a large number of sections. These files are all generated by ld, which should merge all relevant sections from all incoming files according to the linker script it is fed with. ld automatically recognizes whether a COFF object is bigobj or not and needs no command-line hints for this.

The only type of artifact which could ever contain a large number of sections is an object file generated by GNU as, which should be supplied with -mbig-obj if a user wants it to handle a lot of sections, and as I've already mentioned elsewhere, in the current GHC we need this flag only once, when building the template-haskell package with split-sections enabled.

If you ever obtain an exe/dll or prelinked object file containing a large number of sections, then something is wrong with the linker scripts used; perhaps some type of section leaked through the merging logic.
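
To check whether an object file is a bigobj and how many sections it declares, the file header can be inspected directly. The Python sketch below is an assumption-laden illustration, not a tool from this patch: the function name is hypothetical, and the layout used (a bigobj header starting with the 16-bit values 0x0000 and 0xFFFF, a version of at least 2, and a 32-bit section count at byte offset 44) reflects my reading of the ANON_OBJECT_HEADER_BIGOBJ structure. It does not verify the class ID, so treat a positive result as a hint rather than proof.

```python
import struct


def coff_section_count(header: bytes):
    """Return (is_bigobj, number_of_sections) from the start of a COFF object.

    `header` should hold at least the first 56 bytes of the file for a
    bigobj, or the first 20 bytes for an ordinary COFF object.
    """
    sig1, sig2, version = struct.unpack_from("<HHH", header, 0)
    if sig1 == 0x0000 and sig2 == 0xFFFF and version >= 2:
        # Assumed bigobj layout: 32-bit NumberOfSections at offset 44.
        (count,) = struct.unpack_from("<I", header, 44)
        return True, count
    # Ordinary COFF header: 16-bit Machine, then 16-bit NumberOfSections.
    _machine, count = struct.unpack_from("<HH", header, 0)
    return False, count


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as f:
        print(coff_section_count(f.read(56)))
```

A prelinked GHCi object or an executable that reports thousands of sections would, following the reasoning above, point at the linker script rather than at a missing -mbig-obj.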

awson added a comment.May 3 2017, 3:12 AM

And this also reminds me of another problem which I mentioned earlier too.

Thus we should either solve it somehow or resort to the old --split-objs scheme for 32-bit GHC.

awson added a comment.EditedMay 7 2017, 2:28 AM

I think the problem is in non-merged string literals sections.

And I think LLVM codegen should be updated too (relevant code is in compiler/llvmGen/LlvmCodeGen/Data.hs, I believe).

compiler/nativeGen/PprBase.hs
126

I guess this is the culprit of your problems. Should be something like ".rdata$str". Without $ they remain non-merged.

driver/utils/merge_sections_pe.ld
17

I believe we don't need anything below.

rules/build-package-way.mk
142

Should be $1_$2_LD_SCRIPT_CMD = I believe. -Wa,-mbig-obj doesn't make any sense here, I think.

awson added inline comments.May 7 2017, 2:38 AM
driver/utils/merge_sections_pe.ld
17

AFAIUI, .pdata and .xdata merging can break linking of gcc-generated object code, see https://sourceware.org/bugzilla/show_bug.cgi?id=15041.

awson added inline comments.May 7 2017, 4:32 AM
rules/build-package-way.mk
142

And yeah, you have to add -Wa,-mbig-obj to ghc-options in libraries/template-haskell/template-haskell.cabal instead (and this would work for 64-bit build only as I've already mentioned).

awson added inline comments.May 7 2017, 4:38 AM
driver/utils/merge_sections_pe.ld
17

Ah, sorry, I was wrong here: since gcc is also invoked with -fdata-sections, it generates a lot of .xdata$... and .pdata$... sections, so we should retain this. Please disregard my comments above.

Phyx added a comment.May 7 2017, 4:56 AM
In D3383#100409, @awson wrote:

I don't quite understand the purpose of the last -Wa,-mbig-obj addition.

The general idea is that neither an exe/dll nor a prelinked object file (for GHCi consumption) should ever contain a large number of sections. These files are all generated by ld, which should merge all relevant sections from all incoming files according to the linker script it is fed with. ld automatically recognizes whether a COFF object is bigobj or not and needs no command-line hints for this.

The only type of artifact which could ever contain a large number of sections is an object file generated by GNU as, which should be supplied with -mbig-obj if a user wants it to handle a lot of sections, and as I've already mentioned elsewhere, in the current GHC we need this flag only once, when building the template-haskell package with split-sections enabled.

Yes, so the last -mbig-obj was added for this case. But instead of special-casing template-haskell I just added it to every invocation. I had assumed the big-obj support for TH was only needed when building the GHCi library for TH; I had forgotten that the GHCi libraries are just prelinked and so this has no effect. Sorry, I keep forgetting that.

Phyx added a comment.May 7 2017, 5:00 AM

Thanks @awson! I'll update the diff and take a look at the 32-bit build.

compiler/nativeGen/PprBase.hs
126

I handle the string merging in another diff where I have the constant merging implemented as well. Which is why I didn't add it here. I'll update the section name.

rules/build-package-way.mk
142

Indeed, sorry. Too many tickets at once and I keep forgetting that these are just prelinked :)

awson added a comment.May 7 2017, 7:50 AM

I tried to build the latest GHC with my corrections to this patch and found that things don't work because of some other, completely unrelated bug somewhere in the GHC driver(?).

The problem is that inplace/bin/ghc-stage1.exe, even when supplied with the -Wa,-mbig-obj option, doesn't pass -mbig-obj on to the assembler. Thus the template-haskell package ends up broken.

The most mysterious thing is that this happens only for the profiled library, i.e. when ghc is invoked with the -prof option added.

After I manually built Syntax.p_o with -mbig-obj supplied by hand to the assembler, I finally got a full release ghc-8.3.20170506 distribution, and it works.

awson added a comment.May 8 2017, 3:10 AM

I've looked at the big-obj+prof problem and found that the assembler is called with the correct options, but an extra phase is triggered in which ld -r is called to merge some auxiliary object files into the target object file.

And GNU ld is unable to generate extended COFF at all (I experimented with it a little bit, adding OUTPUT(pe-bigobj-x86-64) to the linker script used by ld when processing the files to merge, but ld completely ignores it).

awson added a comment.May 8 2017, 5:29 AM

Well, I've solved the problem, but this requires a patch to ld to enable pe-bigobj-x86-64 output for ld -r. I'll make a binutils ticket with the patch.

Phyx added a comment.May 8 2017, 1:47 PM
In D3383#100839, @awson wrote:

Well, I've solved the problem, but this requires a patch to ld to enable pe-bigobj-x86-64 output for ld -r. I'll make a binutils ticket with the patch.

@awson Thanks, you're awesome. I don't have commit rights to binutils yet (only gcc), but if you don't get a response I can probably help things along.

Phyx added a comment.May 8 2017, 1:48 PM
In D3383#100831, @awson wrote:

I've looked at the big-obj+prof problem and found that the assembler is called with the correct options, but an extra phase is triggered in which ld -r is called to merge some auxiliary object files into the target object file.

And GNU ld is unable to generate extended COFF at all (I experimented with it a little bit, adding OUTPUT(pe-bigobj-x86-64) to the linker script used by ld when processing the files to merge, but ld completely ignores it).

Hmm, that seems quite dodgy.

awson added a comment.EditedMay 9 2017, 9:34 AM

The patch is accepted and applied to mainline binutils.
Now we should amend template-haskell.cabal with something like

if os(windows) && arch(x86_64)
    ghc-options: -opta-Wa,-mbig-obj -optl-Wl,--oformat -optl-Wl,pe-bigobj-x86-64

and build GHC successfully!

OTOH, there is a workaround. Even with an old binutils we could use the section merging linker script when building the template-haskell package. This way the profiled library would contain ordinary PECOFF objects with all split sections merged back. But then we would have huge object files linked into every binary that uses Template Haskell and is built with profiling enabled.

Should I interpret the discussion here to mean that we should hold off on merging this?

Phyx added a comment.May 10 2017, 4:46 PM

@bgamari yes, I need to make some minor adjustment to the patch (will do so this weekend) and also we need an updated binutils for the 32-bit build.
I'll see if I can't get them to back port the patch to msys2 quicker.

@awson thanks for all the help! I'll make the changes this weekend, just a bit swamped atm :)

Phyx planned changes to this revision.May 10 2017, 4:47 PM
awson added a comment.EditedMay 11 2017, 4:04 AM

Not quite so: the updated binutils allows us to build 64-bit GHC (which is currently impossible because the template-haskell package is miscompiled).

The 32-bit build requires bigobj support for 32-bit COFF objects, which binutils does not have.

In fact the only place in the whole GHC codebase which requires bigobj support (if compiled with split-sections) is Language.Haskell.TH.Syntax module, thus for 32-bit case we should either somehow disable split-sections (and, perhaps, replace it with split-objs) for this particular module, or disable it for the whole template-haskell package, or use no split-sections for 32-bit GHC at all.

Another completely unexplored avenue is to use the LLVM toolchain, which has automatic bigobj support for both the 32-bit and 64-bit cases; it also (unlike the GNU toolchain) generates spec-conformant PECOFF modules. But simply using -fllvm won't help us, since in its current form it produces GNU assembly, which in turn is assembled by GNU as, closing the circle. To take advantage of the LLVM toolchain we would have to rework the GHC driver so that it produces object files with the LLVM tools directly, without using the GNU tools.

Phyx added a comment.May 16 2017, 5:26 PM
In D3383#101408, @awson wrote:

Not quite so: the updated binutils allows us to build 64-bit GHC (which is currently impossible because the template-haskell package is miscompiled).

The 32-bit build requires bigobj support for 32-bit COFF objects, which binutils does not have.

Ah, sorry, I misread. @bgamari, what do you think about committing this for 64-bit GHC for now? Or do you prefer to wait until both toolchains support it?

Another completely unexplored avenue is to use the LLVM toolchain, which has automatic bigobj support for both the 32-bit and 64-bit cases; it also (unlike the GNU toolchain) generates spec-conformant PECOFF modules. But simply using -fllvm won't help us, since in its current form it produces GNU assembly, which in turn is assembled by GNU as, closing the circle. To take advantage of the LLVM toolchain we would have to rework the GHC driver so that it produces object files with the LLVM tools directly, without using the GNU tools.

I've been meaning to look into this due to lld, which would be much faster than ld.

Ah, sorry, I misread. @bgamari, what do you think about committing this for 64-bit GHC for now? Or do you prefer to wait until both toolchains support it?

As long as we fail gracefully on 32-bit I am fine with committing it for 64-bit.

Phyx updated this revision to Diff 12807.Jun 11 2017, 6:53 AM
Phyx marked 9 inline comments as done.
  • rebase
  • split: disable x86 for now and updated llvm codegen.
Phyx updated this revision to Diff 12808.Jun 11 2017, 7:38 AM
  • Split: fix llvm codegen.
Phyx updated this revision to Diff 13049.Jul 7 2017, 2:03 PM
  • rebase
Phyx planned changes to this revision.Jul 7 2017, 2:47 PM

I'll have to verify that user-created GHCi object files aren't now needlessly big.
We use the linker script for the ones we create at compile time, but I wonder about those that cabal creates.

This revision was automatically updated to reflect the committed changes.