Hadrian: Add support for building Stage3
AbandonedPublic

Authored by mpickering on Dec 16 2018, 12:27 PM.

Details

Summary

This ticket enables the building of a stage3 compiler by making the build logic more consistent and predictable in Hadrian.

Two of the main changes are:

  1. In order to build anything at stageN we use the package database present at stageN. Fixing Trac #16069
  2. haddock and ghc-tags are built as stage1 executables (with the stage1 compiler) rather than as stage2 compiler. Fixing hadrian#661
  3. In order to build a stage3 compiler, you have to add the right packages to the packages variable.
Test Plan

Try to build _build/stage2/bin/ghc after modifying the build packages
in UserSettings.

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Branch
hadrian-stage3
Lint
Lint ErrorsExcuse: not
SeverityLocationCodeMessage
Errorhadrian/README.md:2MERGECONFLICT1Unresolved merge conflict
Unit
No Unit Test Coverage
Build Status
Buildable 25534
Build 64890: [GHC] Linux/amd64: Continuous Integration
Build 64889: [GHC] OSX/amd64: Continuous Integration
Build 64888: [GHC] Windows/amd64: Continuous Integration
Build 64887: arc lint + arc unit
There are a very large number of changes, so older changes are hidden. Show Older Changes
snowleopard added inline comments.Dec 16 2018, 2:03 PM
hadrian/src/UserSettings.hs
6–7

Did you mean to commit changes here or is this accidental?

hadrian/src/Utilities.hs
43

See my comment on relativePackageDbPath.

mpickering added inline comments.Dec 16 2018, 2:43 PM
hadrian/src/Packages.hs
196

I think probably not but it seemed more uniform to also rebuild it.

hadrian/src/Rules.hs
49

I was under the impression that the build targets were specified per flavour, so if packages Stage3 resolves to [] then nothing in stage3 will be built?

hadrian/src/UserSettings.hs
6–7

Yes, because the helper is meant to be used by people if they want to modify an existing packages function in order to also build stage3.

snowleopard added inline comments.Dec 16 2018, 3:36 PM
hadrian/src/Rules.hs
49

Ah, I see, that's a good point.

Do we want to build Stage3 selectively, e.g. control which packages we'd like to build with Stage2 compiler? I thought this was more of a "build the same things as with Stage1 GHC but using Stage2 GHC" consistency check.

hadrian/src/UserSettings.hs
6–7

I see. As I commented above, I'm not sure if this is a real use case. Do we really want to have fine-grained control over which packages should be built in Stage3?

mpickering added inline comments.Dec 16 2018, 3:40 PM
hadrian/src/Rules.hs
49

It's consistent with how all the other targets work so I don't see why not. At the moment the things that get built are precisely the things that the packages variable resolves to if you ignore the flag which stops stage2 things being built.

snowleopard added inline comments.Dec 16 2018, 3:47 PM
hadrian/src/Rules.hs
49

OK, sounds good. Let's keep everything configurable for the sake of uniformity.

hadrian/src/UserSettings.hs
55

I don't think we really need this, because users already have the ability to override default packages for any build stage, as described here:

https://github.com/ghc/ghc/blob/master/hadrian/doc/user-settings.md#packages

mpickering updated this revision to Diff 19164.Dec 16 2018, 4:15 PM
  • Define and use stageLibaries
  • Resolve some more stage confusion

Thanks Andrey, I managed to apply your suggestions and things appear to work smoothly still. I'll try a clean rebuild overnight.

hadrian/src/Rules/Generate.hs
192

It looks like this usage of rtsContext is a bit unusual anyway so I will set it to Stage1, as you suggest.

hadrian/src/Rules/Program.hs
93

Answer: Because it should have been stageLibaries stage where stageLibraries is a function I defined to compute the set of libraries we want to use for a certain stage.

hadrian/src/UserSettings.hs
55

It exists for two reasons.

  1. Discoverability, it's not obvious how to build a stage3 compiler or what that means.
  2. It is more composable to define package set modifiers rather than new package sets. In the example there is a coincidence that modifiedPackages is defined in terms of defaultPackages and we happen to be overriding the defaultFlavour which initially has packages = defaultPackages. It would be very easy to get unexpected results if for example I changed defaultFlavour to a package set which didn't use the defaultPackages.

Do you have another idea about how to make it obvious to users what they have to do in order to build the stage3?

Thanks Andrey, I managed to apply your suggestions and things appear to work smoothly still. I'll try a clean rebuild overnight.

Sounds great, thanks!

hadrian/src/UserSettings.hs
55

I don't have a strong opinion about how we should manage package sets in Hadrian. In fact, the current implementation does feel rather ad-hoc to me. If you come up with a more compositional approach that would be great!

It would be very easy to get unexpected results if for example I changed defaultFlavour to a package set which didn't use the defaultPackages.

The only file that Hadrian users are expected to modify is UserSettings. At least I think we should strive to this. I guess this rules out your scenario.

Do you have another idea about how to make it obvious to users what they have to do in order to build the stage3?

Well, I think we could just add a section dedicated to building Stage3 to the README and/or user-settings.md file. My understanding is that building Stage3 GHC is not a very common task, so maybe it doesn't need to be the first thing Hadrian users see when they open UserSettings.hs.

Still not ready. There is a problem with packageDbPath resolving to the wrong thing in some places. This stems from the fact that packageDbPath used to have a call to min in which clearly does the wrong thing and relies on the fact that we didn't used to build any libraries with the stage2 compiler. I haven't managed yet to work out how to precisely fix this though.

I think that stageLibraries also needs to be used in configurePackage to make sure you end up needing packages from the right stage. That occurs here: https://github.com/ghc/ghc/blob/ed69f8bd1fd4b84b62bb4f10eef7f76e537be07b/hadrian/src/Hadrian/Haskell/Cabal/Parse.hs#L116. Note that this is already wrong for stage 2 thereby preventing me from building Haddock >:(.

I think that stageLibraries also needs to be used in configurePackage to make sure you end up needing packages from the right stage. That occurs here: https://github.com/ghc/ghc/blob/ed69f8bd1fd4b84b62bb4f10eef7f76e537be07b/hadrian/src/Hadrian/Haskell/Cabal/Parse.hs#L116. Note that this is already wrong for stage 2 thereby preventing me from building Haddock >:(.

Don't think that is right still as when we configure a library in Stage2 we want to use the Stage2 libraries to build it right?

  • fix build

I think I fixed things up now. @alpmestan can you review?

I think I fixed things up now.

@mpickering Great! Could you please also update the docs? We need 1) a short paragraph about how to build Stage3 GHC in README.md, and 2) a section in doc/user-settings.md about using the new buildStage3 setting that you added, ideally with an example of modifying it.

mpickering planned changes to this revision.Dec 19 2018, 7:19 AM

Still broken in fact, see Trac #16069 for where the problem lies.

mpickering edited the summary of this revision. (Show Details)Dec 19 2018, 11:09 AM
  • Make things even more consistent
  • docs
  • cleanup
  • Remove tabs

I completed a stage3 build from a clean tree successfully with this patch and a modified UserSettings.hs. Another review @snowleopard would be good seeing as it also involved some more radical changes now.

snowleopard requested changes to this revision.EditedDec 19 2018, 5:14 PM

@mpickering Thank you. I've left a few more comments. The question about moving ghcTags and haddock to Stage1 is the biggest. I don't know whether this is the right solution in the long term, and why Make's behaviour is different. Please clarify.

hadrian/README.md
190

Link to the corresponding section?

hadrian/doc/user-settings.md
152

I find this function very confusing. It defines not only what happens in Stage3, but also what happens in other stages too.

Shall the users modify other settings? Judging by the name of the function, the answer is No -- this function is only for customising the Stage3 build. But then it needs to return only Action [Package], not the function Stage -> Action [Package]

If this function has more general purpose, then we should give it a more general name, e.g. customiseBuildPackages, and explain in the comments that it should be used whenever the user would like to modify the set of built packages, in any stage.

hadrian/src/Base.hs
94

This looks like a leftover. Let's drop it to avoid triggering git blame.

hadrian/src/Rules/Program.hs
37

Could you clarify this change? Is the above TODO still relevant?

91

I don't understand this comment: what does "this" refer to? What "should surely be stage"?

hadrian/src/Settings/Builders/GhcPkg.hs
35 ↗(On Diff #19196)

Why this change? The Make build system uses update --force.

hadrian/src/Settings/Default.hs
113

This change needs to be discussed: why are we deviating from the Make build system?

hadrian/src/Utilities.hs
40–41

Let's just drop depStage then.

This revision now requires changes to proceed.Dec 19 2018, 5:14 PM
mpickering updated this revision to Diff 19203.Dec 20 2018, 8:59 AM
mpickering marked 2 inline comments as done.
  • Andrey's suggestions
mpickering added inline comments.Dec 20 2018, 9:47 AM
hadrian/doc/user-settings.md
152

I don't understand your suggestion sorry. I wrote the function like this so you can build a stage3 compiler for any flavour by modifying the packages used to build a stage2 compiler.

If I write a function Action [Package] then I have to create a list of packages I want to build in stage3 and then a user has to write this function
themselves in potentially an inconsistent way with their stage2 packages.

hadrian/src/Rules/Program.hs
91

Stale comment.

hadrian/src/Settings/Builders/GhcPkg.hs
35 ↗(On Diff #19196)

I'll remove this change. The reason was because I wanted ghc-pkg to error if I was installing the same package twice in the same package database.

https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/packages.html#package-management-the-ghc-pkg-command

register does the same as update but doesn't allow you to register a package twice.

alpmestan requested changes to this revision.Dec 20 2018, 10:40 AM

This is looking pretty good. Besides Andrey's outstanding question, I made a suggestion and have one request to add a TODO.

Can you clarify how far this gets you? I don't think the flavour matters much, so I won't ask whether you've built with different flavours. You said you get as far as building a working stage 3 compiler, right? And if you make stagePackages return the whole series of packages that a bindist would ship with, does that work too? Maybe you can just borrow that list from stage2Packages (or alternatively stagePackages Stage2)?

hadrian/doc/user-settings.md
152

Couldn't it just be a Bool though? Either a setting or a field of Flavour. Or perhaps finalStage :: Stage, which would indicate which GHC we want to produce in the end (Stage1 for cross compilers, Stage2 for our usual distributions, Stage3 for Matthew). It seems to me that when that field would be Stage3, we could trigger this exact piece of code that you wrote to compute what we need for a stage 1/2/3 GHC.

hadrian/src/Packages.hs
196

Could you add a little TODO about possibly making this context-independent again, as we certainly expect the stage to be irrelevant for libffi

This revision now requires changes to proceed.Dec 20 2018, 10:40 AM

This is looking pretty good. Besides Andrey's outstanding question, I made a suggestion and have one request to add a TODO.

Can you clarify how far this gets you? I don't think the flavour matters much, so I won't ask whether you've built with different flavours. You said you get as far as building a working stage 3 compiler, right? And if you make stagePackages return the whole series of packages that a bindist would ship with, does that work too? Maybe you can just borrow that list from stage2Packages (or alternatively stagePackages Stage2)?

The configuration I tried was quickestFlavour with buildStage3 applied to packages quickestFlavour. So I think that is all the packages?

So I think that is all the packages?

There's a trivial way to find out, ls _build/stage2/lib/package.conf.d/.

snowleopard added inline comments.Dec 20 2018, 12:11 PM
hadrian/doc/user-settings.md
152

I don't understand your suggestion sorry. I wrote the function like this so you can build a stage3 compiler for any flavour by modifying the packages used to build a stage2 compiler.

@mpickering The function you wrote allows the user to do a lot more things than just that. For example, the user can modify the list of packages built in Stage2, but leave the Stage3 list empty. Instead of giving the user control over building Stage3 (as the function's name advertises), you give general-purpose functionality for modifying build package sets [1].

I agree with @alpmestan that having something like finalStage looks like a cleaner/simpler solution. It will be set to Stage2 by default, with a simple instruction for the user: change to Stage1 for cross compilers, or to Stage3 to build Stage3 GHC.

[1] As for general-purpose functionality for modifying build package sets, I agree that we can improve the current approach based on flavour's field packages. But I suggest we discuss the design in a separate ticket.

mpickering added a comment.EditedDec 20 2018, 12:12 PM
So I think that is all the packages?

There's a trivial way to find out, ls _build/stage2/lib/package.conf.d/.

array-0.5.2.0.conf        ghc-8.7.conf              integer-gmp-1.0.2.0.conf  stm-2.5.0.0.conf
base-4.12.0.0.conf        ghc-boot-8.7.conf         libiserv-8.7.conf         template-haskell-2.15.0.0.conf
binary-0.8.6.0.conf       ghc-boot-th-8.7.conf      mtl-2.2.2.conf            terminfo-0.4.1.2.conf
bytestring-0.10.9.0.conf  ghc-compact-0.1.0.0.conf  package.cache             text-1.2.3.1.conf
Cabal-2.5.0.0.conf        ghc-heap-8.7.conf         package.cache.lock        time-1.9.2.conf
containers-0.6.0.1.conf   ghci-8.7.conf             parsec-3.1.13.0.conf      transformers-0.5.5.0.conf
deepseq-1.4.4.0.conf      ghc-prim-0.5.3.conf       pretty-1.1.3.6.conf       unix-2.7.2.2.conf
directory-1.3.3.1.conf    haskeline-0.7.4.3.conf    process-1.6.3.0.conf      xhtml-3000.2.2.1.conf
filepath-1.4.2.1.conf     hpc-0.6.0.3.conf          rts-1.0.conf

Which is also the same as in the stage1 package.conf.d.

@mpickering Great! This look very encouraging.

mpickering updated this revision to Diff 19214.Dec 21 2018, 6:15 AM
  • Implement the finalStage build option
  • Note about libffiContext
  • Fix lint

@snowleopard I implement the finalStage solution that @alpmestan suggested. It turned out quite nicely and meant we could remove the stage1Only option.

I think the only question to resolve is whether it is ok to build haddock and ghcTags at stage1. How do you want to resolve this without leaving this patch in limbo?

alpmestan added inline comments.Dec 21 2018, 6:38 AM
hadrian/doc/cross-compile.md
19

Don't we want to replace this line with an instruction to set finalStage = Stage1 ?

hadrian/doc/user-settings.md
156

I think it wouldn't hurt to repeat here quickly something like "..., or to build a Stage1 compiler (e.g for a cross-compiler) by setting finalStage = Stage1.

hadrian/src/Rules.hs
48

Just a nitpick (you can leave it as is I won't mind), but [Stage0 .. finalStage] would probably be simpler.

50

What does this resolve to by default for Stage3?

Somewhat relatedly, shouldn't we refine the package list further? There are packages that we need "to build the next stage" and there are packages that should be built by the finalStage compiler only, the ones that'll ship with the binary distribution, no? It might be useful (not necessarily in this diff) to reflect that split and guarantee that we never build something that we don't need and that we build everything we must ship with.

hadrian/src/Settings/Default.hs
124

How about stage3Packages ?

hadrian/src/UserSettings.hs
58

Shouldn't it be Stage2 by default?

mpickering added inline comments.Dec 21 2018, 6:52 AM
hadrian/doc/cross-compile.md
19

I don't think so because I think the original implementation of stage1Only was broken so I'm not sure that it would work the same as in make.

hadrian/src/Rules.hs
48

I don't think that is the same as [Stage0 .. Stage0] = [Stage0] not [].

50

Yes that might be useful but a further refinement that I don't want to do in this diff. We can make a ticket after this is merged.

hadrian/src/Settings/Default.hs
124

What should that contain?

hadrian/src/UserSettings.hs
58

Woops yes.

mpickering updated this revision to Diff 19216.Dec 21 2018, 7:06 AM
  • Alp comments

I think the only question to resolve is whether it is ok to build haddock and ghcTags at stage1 How do you want to resolve this without leaving this patch in limbo?

I'm not sure what's the right approach.

@alpmestan @bgamari Do you have any suggestions? Any ideas what could go wrong if we build Haddock using Stage1 GHC instead of Stage2 GHC?

mpickering updated this revision to Diff 19230.Dec 22 2018, 6:38 AM
  • Define and use stageLibaries
  • Resolve some more stage confusion
  • fix build
  • Make things even more consistent
  • docs
  • Cleanup
  • Remove tabs
  • Andrey's suggestions
  • Implement the finalStage build option
  • Note about libffiContext
  • Fix lint
  • Alp comments
  • Use stage1 haddock

I rebased. Last night I tested that the default build works correctly, I fixed the binary-dist target so that it doesn't build haddock with Stage2 libraries and also verified that test works correctly.

Ready for merge as far as I'm concerned.

mpickering updated this revision to Diff 19231.Dec 22 2018, 6:41 AM
  • remove merge artifacts