Backend plugins
Needs Review · Public

Authored by angerman on Oct 8 2018, 9:25 AM.

Details

Summary

This diff extends the current Plugins mechanism to implement "backend plugins", which allow one to intercept the post-Core intermediate representations (STG/Cmm) and use them as starting points for implementing experimental GHC backends targeting other platforms.
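As a rough illustration of the proposed extension points, here is a self-contained sketch. The names `Stg`, `Cmm`, `BackendPlugin`, `stgPlugin`, and `cmmPlugin` are simplified stand-ins, not the actual definitions from this diff; the real extension points run in `Hsc` with a `HscEnv` in scope and operate on GHC's STG and Cmm types.

```haskell
-- Hypothetical sketch of backend plugin extension points.
-- All names here are placeholders, not GHC's real types.

type CommandLineOption = String

-- Stand-ins for GHC's STG and Cmm intermediate representations.
newtype Stg = Stg String deriving (Eq, Show)
newtype Cmm = Cmm String deriving (Eq, Show)

-- A backend plugin gets to observe (and rewrite) the STG and Cmm
-- produced for each module as it flows through the pipeline.
data BackendPlugin = BackendPlugin
  { stgPlugin :: [CommandLineOption] -> Stg -> IO Stg
  , cmmPlugin :: [CommandLineOption] -> Cmm -> IO Cmm
  }

-- The default plugin is the identity, so loading no backend plugin
-- leaves the pipeline unchanged.
defaultBackendPlugin :: BackendPlugin
defaultBackendPlugin = BackendPlugin
  { stgPlugin = \_opts stg -> pure stg
  , cmmPlugin = \_opts cmm -> pure cmm
  }
```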

TerrorJack created this revision. Oct 8 2018, 9:25 AM

What have you been able to implement with these extension points? The Hsc looks suspicious.

@mpickering The Hsc there indicates that a HscEnv is in scope and available for use.

As for the extension points: previously, the GHC API had limited ways to retrieve STG/Cmm. One needs to get Core via either plugins or hooks and then do the conversion oneself, instead of reusing the IRs passed through the pipeline. I implemented hooks for use in Asterius, and would like to do it with plugins instead.

The Cmm pipeline has multiple stages.
It would be good to give a reason why we allow people to hook into it at the place you suggest.

Also maybe consider the option to do non-raw cmm passes in plugins. It seems like a logical thing and shouldn't add much complexity.

compiler/main/Plugins.hs
83

Assuming my understanding is correct, I suggest slightly changing the phrasing.

This is called by HscMain with the STG code generated from Core before any other STG pass is run.

GHC does some optimizations on STG, so it's better to be explicit about whether this function is called before or after those kick in.

luite added a comment. Oct 8 2018, 2:27 PM

Is there a discussion somewhere else (other than the ghc-devs mailing list) about this? For example I'm still not sure why for the motivating example (custom backends) you prefer plugins over hooks. Perhaps I missed it.

I see two main differences between plugins and hooks:

  1. Plugins are exposed to the compiler user on the command line and can be used by library/package authors. For a custom backend, per-package plugins don't seem to offer any direct benefit.
  2. Plugins that operate on an IR use the "modify" principle: they return a modified IR without affecting the rest of the pipeline. Hooks "replace" existing functionality: it's the responsibility of the hook to call the default pipeline (or other hooks) if desired. For a custom backend, replacing looks like a far better fit to me.
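To make the modify/replace distinction concrete, here is a toy model (all names are hypothetical, not GHC's actual Plugin or Hooks types): plugins compose as IR-to-IR functions while the fixed pipeline keeps running, whereas a hook receives the default stage and may ignore it entirely.

```haskell
-- Toy model of the two extension styles; all names are made up.
newtype IR = IR [String] deriving (Eq, Show)

-- Modify principle: plugins compose, each one seeing the
-- previous plugin's output, and the pipeline continues as usual.
runPlugins :: [IR -> IR] -> IR -> IR
runPlugins plugins ir = foldl (flip ($)) ir plugins

-- Replace principle: a hook wraps an entire pipeline stage and
-- decides whether (and how) to call the default implementation.
type Stage = IR -> IR

withHook :: Maybe (Stage -> Stage) -> Stage -> Stage
withHook Nothing     def = def
withHook (Just hook) def = hook def

-- Example hook: ignore the default backend stage entirely,
-- as a custom backend skipping native code generation might.
skipNativeCodeGen :: Stage -> Stage
skipNativeCodeGen _default (IR xs) = IR ("custom-backend" : xs)
```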

As far as I know, Asterius currently intercepts Cmm but just lets the existing NCG run. Can this be prevented by this plugin infrastructure? Should it? In general this seems like something a backend author would want to decide.

Some higher level discussion would be helpful here for me. Perhaps extending plugins this way really is the best solution! But I'd like to understand which problem we're solving here and why this is the best way to do it.

I think the advantages of plugins you mentioned on ghc-devs, "availability", "composability" roughly correspond with my points 1. and 2. I don't understand how those are helpful for this specific use case. In particular "composability" seems like it could prevent backend authors from making certain choices they might want to make!

@luite There isn't a discussion other than the ghc-devs thread yet, I'll create a reddit thread to collect more opinions later.

The current plugin infrastructure has one method to completely alter the pipeline: frontend plugins. They use the same load-via-ghci mechanism as ordinary plugins and enable users to create new major modes for GHC. There was a time when Asterius implemented its main logic in a frontend plugin, and in practice it was rather inconvenient: one needs to create a ghc wrapper executable which intercepts the regular --make flags and replaces them with --frontend (this technique was first described in http://blog.ezyang.com/2017/02/how-to-integrate-ghc-api-programs-with-cabal/). This wrapper can then be fed to Cabal via --with-ghc=, and Cabal will happily call our fake GHC and go through compilation/linking for native code, while we do the work of targeting another platform. If you don't want to go through this level of indirection, you need to patch Cabal (or at least use Cabal hooks) to teach it not to do native linking.

One major advantage of adding backend plugins to the ordinary plugins mechanism is that you just need to add --ghc-options= at Cabal configure time to load the backend plugin, without creating a new GHC major mode and a GHC wrapper. This enables easy prototyping of any STG/Cmm-to-X compiler. People may still move to frontend plugins or hooks later and completely alter the pipeline.

Still, your reply makes a lot of sense. I can implement a "pipeline plugin" alongside the current "backend plugin" mechanism. The idea is to reorganize the runPhase behavior a little and allow plugins to decide which phase to go to next, or to create custom phases, so it becomes possible to skip generating native code completely. Do you think this idea makes sense?
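A minimal sketch of what such a pipeline plugin could look like, assuming made-up names throughout (`Phase`, `PipelinePlugin`, `runPipeline` are illustrations, not the real runPhase machinery): after each phase, the plugin may override the default successor, which lets a backend jump from Cmm to a custom phase and never reach native code generation.

```haskell
-- Hypothetical pipeline-plugin sketch; none of these names exist in GHC.
data Phase = CoreP | StgP | CmmP | NativeCodeGen | CustomP | Done
  deriving (Eq, Show)

-- The default phase ordering of the (simplified) pipeline.
defaultNext :: Phase -> Phase
defaultNext CoreP         = StgP
defaultNext StgP          = CmmP
defaultNext CmmP          = NativeCodeGen
defaultNext NativeCodeGen = Done
defaultNext CustomP       = Done
defaultNext Done          = Done

-- A pipeline plugin sees the phase that just finished and the default
-- successor, and may pick a different next phase.
type PipelinePlugin = Phase -> Phase -> Phase

-- Run the pipeline to completion, recording the phases visited.
runPipeline :: PipelinePlugin -> Phase -> [Phase]
runPipeline plugin = go
  where
    go Done = [Done]
    go p    = p : go (plugin p (defaultNext p))

-- A backend plugin that diverts to a custom phase after Cmm,
-- skipping native code generation entirely.
skipNative :: PipelinePlugin
skipNative CmmP _    = CustomP
skipNative _    next = next
```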

luite added a comment. Oct 9 2018, 12:29 AM

@luite There isn't a discussion other than the ghc-devs thread yet, I'll create a reddit thread to collect more opinions later.

I meant a GHC Trac ticket or wiki page. I don't think we really need more opinions at the moment. We need a goal and a plan first.

One major advantage of adding backend plugins to the ordinary plugins mechanism is that you just need to add --ghc-options= at Cabal configure time to load the backend plugin, without creating a new GHC major mode and a GHC wrapper. This enables easy prototyping of any STG/Cmm-to-X compiler. People may still move to frontend plugins or hooks later and completely alter the pipeline.

I think this is a bad idea. GHC plugin options are not the right way to do this; it's not a per-package setting. Besides, Cabal and ghc-pkg won't know anything about backend-specific output files: unless you manage to stuff all the wasm32 code into the existing .o and .a files, the result wouldn't be copied to the right place.

I don't think a frontend plugin is the right way to make a cross compiler either (since really we just want the --make frontend set up for our target). It's unfortunate that the GHC frontend is not in the ghc library. Maybe we could expose the library through the ghc-bin package or move the code to ghc.

Still, your reply makes a lot of sense. I can implement a "pipeline plugin" alongside the current "backend plugin" mechanism. The idea is to reorganize the runPhase behavior a little and allow plugins to decide which phase to go to next, or to create custom phases, so it becomes possible to skip generating native code completely. Do you think this idea makes sense?

It is already possible to skip generating native code: This is basically why the runPhase hook exists, I needed this for GHCJS. Why is a plugin better?

Besides, Cabal and ghc-pkg won't know anything about backend-specific output files: unless you manage to stuff all the wasm32 code into the existing .o and .a files, the result wouldn't be copied to the right place.

I think serializing compiled code and managing the custom "object files" is already beyond the scope of the current proposal. If we are to improve the current ghc/ghc-pkg logic to manage "custom objects", that's going to be a huge diff, so getting something simpler reviewed and landed first looks sensible to me.

I don't think a frontend plugin is the right way to make a cross compiler either (since really we just want the --make frontend set up for our target). It's unfortunate that the GHC frontend is not in the ghc library. Maybe we could expose the library through the ghc-bin package or move the code to ghc.

Agreed. A lot of the logic in ghc-bin should really be in the ghc library, but that change belongs in another differential.

It is already possible to skip generating native code: This is basically why the runPhase hook exists, I needed this for GHCJS. Why is a plugin better?

Plugins are easier to load: one can just supply some ghc flags and have them loaded via the ghci mechanism, instead of manually modifying DynFlags.

luite added inline comments. Oct 9 2018, 1:05 AM
compiler/main/Plugins.hs
123

This is a tad misleading. The target word size is really baked into the ghc library. In addition to targetPlatform in DynFlags being configured correctly, the ghcPrimIfaceHook is needed to get the right types for the primops if the target word size doesn't match the host's.

However, that opens a whole new can of worms with interactions with built-in names and rewrite rules, and I'm not sure this actually works correctly at all. I stopped using this for GHCJS for this reason.

There's still a fair bit of work to be done before the ghc library is completely target word size agnostic (and also a bit to prevent the host word size affecting results!), we shouldn't pretend otherwise.

TerrorJack added inline comments. Oct 9 2018, 1:17 AM
compiler/main/Plugins.hs
123

There's still a fair bit of work to be done before the ghc library is completely target word size agnostic (and also a bit to prevent the host word size affecting results!), we shouldn't pretend otherwise.

Thanks for the reminder. Do you think the current ghc-as-a-cabal-package workaround in GHCJS can be backported to GHC? That sounds like a reasonable goal before resuming work on backend plugins.

luite added inline comments. Oct 9 2018, 1:56 AM
compiler/main/Plugins.hs
123

Do you think the current ghc-as-a-cabal-package workaround in GHCJS can be backported to GHC?

The approach is basically statically configuring a ghc library for a 32 bit target and installing that under a different name. That can't really be upstreamed in this form.

angerman commandeered this revision. Oct 11 2018, 4:31 AM
angerman added a reviewer: TerrorJack.
angerman added a reviewer: angerman.