RFC: Always build GHCi libs
ClosedPublic

Authored by simonmar on Mar 8 2017, 7:35 AM.

Details

Summary

Since the introduction of -split-sections, using GHCi with the RTS
linker is really slow:

$ time (echo :quit | ./inplace/bin/ghc-stage2 --interactive -fexternal-interpreter)
GHCi, version 8.1.20170304: http://www.haskell.org/ghc/  :? for help
Prelude> Leaving GHCi.

real        0m3.793s

(When we use -fexternal-interpreter, GHCi uses the RTS linker by default;
you can make it use the system linker by adding -dynamic.)

Building the GHCi libs doesn't take much time or space in the GHC build,
but makes things much quicker for people using the RTS linker:

$ time (echo :quit | ./inplace/bin/ghc-stage2 --interactive -fexternal-interpreter)
GHCi, version 8.1.20170304: http://www.haskell.org/ghc/  :? for help
Prelude> Leaving GHCi.

real        0m0.285s

So I propose that we build and ship them unconditionally.
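
For scale, the two `real` figures above can be compared directly. The following is a minimal sketch that parses the bash `time` format and computes the ratio; the helper name `parse_real` is made up for this illustration, and the only data used are the two timings quoted in this summary.

```python
import re


def parse_real(value: str) -> float:
    """Convert a bash `time` duration such as '0m3.793s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# The two `real` figures quoted in the summary.
before = parse_real("0m3.793s")  # GHCi libs not built
after = parse_real("0m0.285s")   # GHCi libs built

speedup = before / after
saved = before - after
print(f"{speedup:.1f}x faster, {saved:.3f}s saved per GHCi start")
```

These are single runs on one machine, so the ratio is an order-of-magnitude indication rather than a benchmark.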

Test Plan

validate, perf tests above

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
simonmar created this revision. Mar 8 2017, 7:35 AM

If I understand correctly this would also increase the size of the unpacked binary distribution by a significant amount (maybe 20% by my estimates). Have you measured that yet?

I think this even builds a ghci library for the ghc package, right? I believe historically we didn't do it, because loading it into ghci is rare; and besides, the ghc library is not built with split sections anyway. The Cabal library is also shockingly large, and it is perhaps worth not building a ghci library for it either.

If necessary we could even compress the ghci .o files, since they're only intended to be consumed by the RTS and not by a normal linker...

simonmar planned changes to this revision. Mar 8 2017, 8:55 AM

Good point about the GHC package. The logic is here: https://phabricator.haskell.org/diffusion/GHC/browse/master/compiler/ghc.mk;8e053700f9357c1b9030c406130062795ae5015c$603-611

If we compressed the .o files then we wouldn't be able to mmap them in the linker, which would be a bit annoying.

I'll look into the overhead of shipping these in the bindist.

simonmar updated this revision to Diff 11637. Mar 8 2017, 9:08 AM
  • Simplify; the way I was doing this was wrong
  • Don't build a GHCi lib for the GHC package

It seems to be 80MB on disk, which, compared to an unpacked binary dist I have lying around (8.0.1), looks to be about 7%.

Phyx edited edge metadata. Mar 8 2017, 1:31 PM

Fair enough, though the behavior is a bit counterintuitive to me. If I understand correctly, this is saying that loading the .a is faster than loading the .so?
At least from the Windows point of view this seems odd, since DLLs are fully relocated and have no dangling symbols.

If I understand correctly this is saying that loading the .a is faster than loading the .so?

Not exactly...

  • Dynamically-linked GHCi and ghci -fexternal-interpreter -dynamic use the .so
  • Statically-linked GHCi and ghc -fexternal-interpreter use the .a, or the .o if it is available

In the second case, the .o is much faster to load than the .a, but we aren't building it by default.
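
The two cases above can be restated as a small decision function. This is only a simplified model of the description in this comment, not GHC's actual lookup code; `pick_library` and its parameters are hypothetical names used for illustration.

```python
def pick_library(dynamic: bool, available: set) -> str:
    """Return the kind of library file GHCi loads for a package.

    Simplified model of the two cases described in the comment;
    it is not GHC's real library lookup.
    """
    if dynamic:
        # Dynamically-linked GHCi, or -fexternal-interpreter -dynamic:
        # the system linker loads the shared object.
        return ".so"
    # Statically-linked GHCi, or -fexternal-interpreter without -dynamic:
    # the RTS linker uses the prebuilt GHCi object when it exists and
    # falls back to the static archive otherwise.
    return ".o" if ".o" in available else ".a"


# Before this change only the archive is shipped, so the slow path is taken.
print(pick_library(dynamic=False, available={".a", ".so"}))
# With the GHCi libs built, the RTS linker can take the fast path.
print(pick_library(dynamic=False, available={".a", ".o", ".so"}))
```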
bgamari accepted this revision. Mar 14 2017, 6:04 PM

Alright. Sounds reasonable to me. I'll merge to 8.2 as well.

This revision is now accepted and ready to land. Mar 14 2017, 6:04 PM
Phyx added a comment. Mar 15 2017, 4:12 AM
In the second case, the `.o` is much faster to load than the `.a`, but we aren't building it by default.

Ahh, that makes sense. Thanks!

I wonder if this is because we don't use the indexes in the archives (there is code to detect them, I just don't think we do anything with them).
If we did, we wouldn't have to scan all the object files in the archive to know what's there, which should be even quicker.
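
To make the idea concrete, here is a minimal sketch of the difference between walking every member header and consulting the index. It assumes the GNU-style `ar` layout (a first member named `/` holding a big-endian symbol count, member offsets, and NUL-terminated names); BSD-style archives use a different index format. It builds a toy archive in memory, and all function names are hypothetical; none of this is the RTS linker's code.

```python
import struct

AR_MAGIC = b"!<arch>\n"
HEADER_SIZE = 60  # name(16) mtime(12) uid(6) gid(6) mode(8) size(10) magic(2)


def make_member(name: bytes, payload: bytes) -> bytes:
    """Encode one member: 60-byte header plus payload, padded to even length."""
    header = b"%-16s%-12d%-6d%-6d%-8s%-10d`\n" % (name, 0, 0, 0, b"644", len(payload))
    assert len(header) == HEADER_SIZE
    return header + payload + (b"\n" if len(payload) % 2 else b"")


def build_archive(objects) -> bytes:
    """Build a toy archive from [(member_name, payload, [symbols])], index first."""
    symbols = [(sym, i) for i, (_, _, syms) in enumerate(objects) for sym in syms]
    strings = b"".join(sym + b"\x00" for sym, _ in symbols)
    index_len = 4 + 4 * len(symbols) + len(strings)
    # File offset of the first object member's header, after the index member.
    pos = len(AR_MAGIC) + HEADER_SIZE + index_len + (index_len % 2)
    offsets, body = [], b""
    for name, payload, _ in objects:
        offsets.append(pos)
        encoded = make_member(name, payload)
        body += encoded
        pos += len(encoded)
    index = struct.pack(">I", len(symbols))
    index += b"".join(struct.pack(">I", offsets[i]) for _, i in symbols)
    index += strings
    return AR_MAGIC + make_member(b"/", index) + body


def iter_members(data: bytes):
    """Slow path: yield (name, header_offset, payload) by walking every header."""
    assert data.startswith(AR_MAGIC)
    pos = len(AR_MAGIC)
    while pos + HEADER_SIZE <= len(data):
        header = data[pos:pos + HEADER_SIZE]
        name = header[:16].rstrip()
        size = int(header[48:58].decode().strip())
        start = pos + HEADER_SIZE
        yield name, pos, data[start:start + size]
        pos = start + size + (size % 2)


def read_symbol_index(data: bytes) -> dict:
    """Fast path: map symbol -> member header offset using only the first member."""
    name, _, payload = next(iter_members(data))
    if name != b"/":
        return {}  # no GNU-style index present
    (count,) = struct.unpack_from(">I", payload, 0)
    offsets = struct.unpack_from(">%dI" % count, payload, 4)
    names = payload[4 + 4 * count:].split(b"\x00")[:count]
    return dict(zip(names, offsets))


archive = build_archive([
    (b"a.o/", b"AAAAA", [b"foo"]),
    (b"b.o/", b"BB", [b"bar", b"baz"]),
])
index = read_symbol_index(archive)
members = {offset: name for name, offset, _ in iter_members(archive)}
print(members[index[b"baz"]])  # the member that defines `baz`
```

With the index, finding which member defines a symbol is one dictionary lookup plus one seek; without it, every member has to be visited and inspected.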

This revision was automatically updated to reflect the committed changes.