Extended the plugin system to run plugins on the parsed, typechecked representation, splices to be run and interfaces that are loaded. Pre-loading the plugins.
ClosedPublic

Authored by nboldi on Jan 23 2018, 8:06 PM.

Details

Summary

Extend GHC plugins to access parsed, type checked representation, interfaces that are loaded. And splices that are evaluated. The goal is to enable development tools to access the GHC representation in the pre-existing build environment.

See the full proposal here: https://ghc.haskell.org/trac/ghc/wiki/ExtendedPluginsProposal

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
nboldi created this revision.Jan 23 2018, 8:06 PM
nboldi edited the summary of this revision. (Show Details)Jan 23 2018, 8:09 PM
nboldi added a subscriber: ulysses4ever.

Does this work? Needs some documentation at least.

Why do you run a plugin immediately after loading the interface file? what is the usage for this?

This version does not work correctly in all cases, but I wanted to upload the first version of my solution for others to see. Sorry if it is not the right place.

I'm running the plugin when an interface file is loaded to provide information for tools that need additional information about the environment in which the source code is compiled. For example an automatic source code generator tool should know what names and typeclass instances are available in a given context.

mpickering requested changes to this revision.Jan 24 2018, 5:51 AM

No, it is definitely good to get this up here!

But an interface file is typechecked shortly after being loaded and then handled like any other module? It seems redundant somehow to have this level of granularity when modifying the interface file itself seems quite difficult to do correctly.

Similarly, the hook to keep the renamed source seems a bit complicated, couldn't the same be achieved by a command line flag?

This revision now requires changes to proceed.Jan 24 2018, 5:51 AM

Also, which cases exactly don't work yet and which ones do?

In this version the plugins are loaded too early, so OPTIONS_GHC pragmas cannot add new plugins. I nearly solved this issue and will upload a new version tomorrow.

What granularity do you suggest for the interface files?

The reason why I implemented that way is because the plugin's author could decide to keep the renamed tree based on the module being compiled. It can also depend on the arguments set by the user. For example, if the plugin only "activates" for one module, only the renamed syntax for that module is needed and the plugin author can implement this. I suppose that there is some performance penalty for keeping the renamed tree, that is why it is disabled by default. So keeping it for only one module could be better than keeping it for all of the modules.

nboldi updated this revision to Diff 15213.EditedJan 24 2018, 7:27 PM
  • Fixed the loading of plugins
  • Fixed lint warnings.
nboldi edited the summary of this revision. (Show Details)Jan 24 2018, 7:32 PM
nboldi updated the Trac tickets for this revision.
mpickering requested changes to this revision.Jan 25 2018, 5:38 AM

I think you perhaps chose the wrong base commit when uploading the new patch. There also seems to be some submodule updates which are not relevant.

I think you want to use

arc diff --update D4342 origin/master

I'll post more detailed comments when the right patch is uploaded.

compiler/main/HscMain.hs
474

Irrelevant?

This revision now requires changes to proceed.Jan 25 2018, 5:38 AM
nboldi updated this revision to Diff 15215.Jan 25 2018, 7:19 AM
  • Added GhcPrelude import to new hs-boot files, removed unnecessary imports.
nboldi updated this revision to Diff 15216.Jan 25 2018, 7:22 AM
  • Reversing unintended changes in Main.hs and libraries.
mpickering added inline comments.Jan 25 2018, 7:32 AM
libraries/transformers
1 ↗(On Diff #15216)

Still seems like these submodules are updated?

I think now I got the commit range right. I tried to get rid of the library changes, but I'm not experienced with git submodules. Can you give me some hint how to reverse them?

mpickering updated this revision to Diff 15217.Jan 25 2018, 7:48 AM
  • Remove submodule changes

I updated the diff and removed them for you. The commands I ran were:

git reset HEAD libraries/
git submodule update
git commit

You can get the latest revision if you want by using arc patch D4342, that will make a new branch with just this patch on.

This certainly looks good to me so far. A nice small clean patch that will have a lot of value.

Some things which are needed.

  1. Documentation in the user manual.
  2. A Note in main/Plugins.hs explaining about the different plugin types you have added (a condensed version of the wiki page)

I still don't understand what the motivation for the iface plugin is? Firstly, would it not be easier to run the plugin after typecheckIface has run so you running it on a ModDetails which has more information rather than a skeleton which is more difficult to safely modify. Second, what kind of information do you hope to extract from the interface file? A concrete example would go a long way to help me understand.

I can see the motivation for the rename plugin. Would a more general mechanism be a DynFlags modification function Module -> DynFlags -> DynFlags which could set/unset options on a per module basis, running at the start of the pipeline.

compiler/main/DynFlags.hs
882 ↗(On Diff #15217)

So the point of this field is to stop reloading a plugin in every module? Say it!

882 ↗(On Diff #15217)

Could make this into a data type rather than a tuple which would simplify some of the other code.

compiler/main/HscMain.hs
508

I think running a plugin should invalidate any SAFE guarantees for a module but I don't think anyone seriously uses SAFE.

708

Why do we call initializePlugins twice? Is it necessary to update both sets of dflags?

It probably also makes sense for completeness to have a renameResultAction plugin, do you agree?

I still don't understand what the motivation for the iface plugin is? Firstly, would it not be easier to run the plugin after typecheckIface has run so you running it on a ModDetails which has more information rather than a skeleton which is more difficult to safely modify. Second, what kind of information do you hope to extract from the interface file? A concrete example would go a long way to help me understand.

I think typecheckInterface is only called for home-package modules. I would like to be able to monitor all interface files loaded by GHC.

For a usage example, the refactoring tool I'm developing can rename identifiers. I this case it is important to check for name conflicts. For example if a function named map is imported, the tool should not allow the user to rename a function to map. By extracting the names from the interface files loaded, I can check if any module is imported that exports the name map.

I can see the motivation for the rename plugin. Would a more general mechanism be a DynFlags modification function Module -> DynFlags -> DynFlags which could set/unset options on a per module basis, running at the start of the pipeline.

I think that the more general version could be dangerous for a plugin author not familiar with the internal workings of GHC. Besides, I think there is no flag that instructs type checking to keep the renamed AST. I think needsRenamedSyntax is better for this task.

It probably also makes sense for completeness to have a renameResultAction plugin, do you agree?

Both the type checked and the renamed AST fragments can be found in TcGblEnv. Maybe typeCheckResultAction should be renamed to renameTypeCheckAction to reflect the fact that it gives access for both typechecked and renamed representation?

nboldi updated this revision to Diff 15225.Jan 26 2018, 5:36 AM
  • Created LoadedPlugin datatype. Added tests for modification using the plugin. Removed SAFE annotation when plugin is applied. Removed unnecessary initializePlugins call.
  • Written a note and extended the plugins section in user manual.
  • Fixing lint warnings.

I still don't understand what the motivation for the iface plugin is? Firstly, would it not be easier to run the plugin after typecheckIface has run so you running it on a ModDetails which has more information rather than a skeleton which is more difficult to safely modify. Second, what kind of information do you hope to extract from the interface file? A concrete example would go a long way to help me understand.

I think typecheckInterface is only called for home-package modules. I would like to be able to monitor all interface files loaded by GHC.

ok, I can see now that typecheckIface is only called much later in the compilation pipeline than the interface file is initially loaded. I still think it will be difficult to work with interface files like this but I can accept having this plugin.

For a usage example, the refactoring tool I'm developing can rename identifiers. I this case it is important to check for name conflicts. For example if a function named map is imported, the tool should not allow the user to rename a function to map. By extracting the names from the interface files loaded, I can check if any module is imported that exports the name map.

That's a good reason, write it in a comment!

I can see the motivation for the rename plugin. Would a more general mechanism be a DynFlags modification function Module -> DynFlags -> DynFlags which could set/unset options on a per module basis, running at the start of the pipeline.

I think that the more general version could be dangerous for a plugin author not familiar with the internal workings of GHC. Besides, I think there is no flag that instructs type checking to keep the renamed AST. I think needsRenamedSyntax is better for this task.

I don't accept this. If you're writing a plugin you have to know what you're doing anyway as you are making arbitrary changes to the source program. Adding a flag to control this isn't hard.

It probably also makes sense for completeness to have a renameResultAction plugin, do you agree?

Both the type checked and the renamed AST fragments can be found in TcGblEnv. Maybe typeCheckResultAction should be renamed to renameTypeCheckAction to reflect the fact that it gives access for both typechecked and renamed representation?

But if you want to modify the renamed representation before it is typechecked, this isn't sufficient? But saying that, is it even possible to modify the "renamed" source before it is typechecked, if I remember right, the two processes are intertwined.

@bgamari perhaps you could comment on the API to resolve our different opinions?

PS: Lots of people are excited about his patch! Thanks for implementing it. https://twitter.com/mpickering_/status/956615980935008256

nboldi marked 4 inline comments as done.Jan 26 2018, 5:42 AM
nboldi added inline comments.
compiler/main/HscMain.hs
474

I removed the parentheses to keep inside the 80 character limit. I can break the line instead if you prefer that.

708

The second call proved to be unnecessary, the cached flags are already merged.

mpickering added inline comments.Jan 26 2018, 5:46 AM
docs/users_guide/extending_ghc.rst
800

Really good documentation.

mpickering added inline comments.Jan 26 2018, 5:48 AM
compiler/main/HscMain.hs
474

I think it's clearer with the parentheses. I was just confused why the line had been touched and stared at it for a while to work out if there was any semantic difference.

nboldi marked an inline comment as done.Jan 26 2018, 7:12 AM
nboldi updated this revision to Diff 15226.Jan 26 2018, 7:20 AM
  • Re-adding parentheses and describing interface plugin usage
nboldi marked 3 inline comments as done.Jan 26 2018, 7:23 AM
bgamari edited reviewers, added: ezyang, angerman; removed: mpickering.Jan 26 2018, 9:57 AM
bgamari added subscribers: angerman, ezyang.

Adding @ezyang and @angerman, both of whom have thought about GHC's plugin interfaces in the past.

Add myself back on as a reviewer.

I can see the motivation for the rename plugin. Would a more general mechanism be a DynFlags modification function Module -> DynFlags -> DynFlags which could set/unset options on a per module basis, running at the start of the pipeline.

I think that the more general version could be dangerous for a plugin author not familiar with the internal workings of GHC. Besides, I think there is no flag that instructs type checking to keep the renamed AST. I think needsRenamedSyntax is better for this task.

I don't accept this. If you're writing a plugin you have to know what you're doing anyway as you are making arbitrary changes to the source program. Adding a flag to control this isn't hard.

So, let's go back to this issue.

I'm agreeable to replace needsRenamedSyntax with an inspector function renamedResultAction :: Maybe ([CommandLineOption] -> ModSummary -> RenamedSource -> Hsc ()). There is no point in modifying the representation, since the renaming and type checking is performed in one step. The behavior would be that if any used plugins contains a renamedResultAction than the renamed representation will be kept and it will be given to the plugins that have this field.

What is your opinion, @mpickering?

I think that's a better option as it is more uniform with the rest of the plugins. It should be said on the proposal as well.

nboldi updated this revision to Diff 15353.Feb 5 2018, 12:58 AM
  • Updated design of renamed action and user guide.
nboldi updated this revision to Diff 15354.Feb 5 2018, 1:28 AM
  • Removing unintended changes.

Sorry to be a bit late to the party 🎉

I believe this plugin approach is a bit different to what I initially wanted to accomplish; my initial
goal was to allow external plugins to be linked in into GHC, and have hooks across the driver
pipeline to swap out/augment phases and/or even reorder the phases. (E.g. the primary idea here
was to implement a different code gen as a plugin; or let a plugin terminate compilation and
accept some intermediate representation as the final result).

As such, I think these constitute two different types of plugins.

Sorry to be a bit late to the party 🎉

I believe this plugin approach is a bit different to what I initially wanted to accomplish; my initial
goal was to allow external plugins to be linked in into GHC, and have hooks across the driver
pipeline to swap out/augment phases and/or even reorder the phases. (E.g. the primary idea here
was to implement a different code gen as a plugin; or let a plugin terminate compilation and
accept some intermediate representation as the final result).

As such, I think these constitute two different types of plugins.

For the record, there is a proposal here. However, in general I'm not sure such a general plugin mechanism is desireable; IMHO applications needing such flexibility should just be using the GHC API and write their own frontend.

Sorry to be a bit late to the party 🎉

I believe this plugin approach is a bit different to what I initially wanted to accomplish; my initial
goal was to allow external plugins to be linked in into GHC, and have hooks across the driver
pipeline to swap out/augment phases and/or even reorder the phases. (E.g. the primary idea here
was to implement a different code gen as a plugin; or let a plugin terminate compilation and
accept some intermediate representation as the final result).

As such, I think these constitute two different types of plugins.

For the record, there is a proposal here. However, in general I'm not sure such a general plugin mechanism is desireable; IMHO applications needing such flexibility should just be using the GHC API and write their own frontend.

Can you explain why you think this? Tooling authors have repeatedly explained why it is difficult to use the GHC API. It makes no sense to have to rewrite the whole GHC frontend in order to extract some small piece of information from the end of typechecking.

Sorry to be a bit late to the party 🎉

I believe this plugin approach is a bit different to what I initially wanted to accomplish; my initial
goal was to allow external plugins to be linked in into GHC, and have hooks across the driver
pipeline to swap out/augment phases and/or even reorder the phases. (E.g. the primary idea here
was to implement a different code gen as a plugin; or let a plugin terminate compilation and
accept some intermediate representation as the final result).

As such, I think these constitute two different types of plugins.

For the record, there is a proposal here. However, in general I'm not sure such a general plugin mechanism is desireable; IMHO applications needing such flexibility should just be using the GHC API and write their own frontend.

Can you explain why you think this? Tooling authors have repeatedly explained why it is difficult to use the GHC API. It makes no sense to have to rewrite the whole GHC frontend in order to extract some small piece of information from the end of typechecking.

I believe @bgamari was referring to my type of plugins, not yours @mpickering :-)

mpickering accepted this revision.May 3 2018, 5:43 AM

What is this patch waiting on now the proposal has been accepted? Perhaps a rebase?

This revision is now accepted and ready to land.May 3 2018, 5:43 AM

Nothing in particular; I have just been perpetually behind in reviewing.

That being said, @m0ar has some provided interesting feedback on some limitations of the current design on the proposal. It seems that Hsc isn't quite strong enough for some valid uses of the interface.

bgamari accepted this revision.Jun 2 2018, 6:10 PM

Alright, looks good to me. Let's merge this.

This revision was automatically updated to reflect the committed changes.