Extend the Quasi Monad
Needs Revision · Public

Authored by angerman on May 23 2017, 9:45 PM.

Details

Summary

This adds File IO and Process IO commands to the Quasi monad. This
makes Template Haskell code more declarative and allows TH to read from
and write to files and processes on the build system when the
interpreter runs on a different host (e.g. when cross compiling).

Thank you for the detailed explanation. But I'm no clearer than before on my second question: why are the file/process operations you just added not subject to the same issues that qRunIO has? After all, they're also IO operations. Where specifically in the GHC codebase does a cross compiler make the decision to, e.g., look up executables on the build machine instead of the host machine when running qFindExecutables?

Depending on what kind of IO we do, we might be fine (if the IO doesn't touch processes or files, I do not (yet) see any issue with that kind of IO). When running ghc with -fexternal-interpreter, the qXXX are evaluated in GHCiQ (see libraries/ghci/GHCi/TH.hs), which, running on the host, has the capability to query the ghc instance on the build machine.

Thus we have two-way communication, where splitting qRunIO into separate file and process calls allows us to ask the ghc process on the build machine to provide us with the values for them.

Say ghc wants to compile $(qFindExecutables "git"). This is transmitted as a ResolvedBCO to the GHCSlave running on the host, which evaluates the splice in GHCiQ.
That in turn evaluates qFindExecutables by querying the ghc process on the build machine: it sends a FindExecutables message back. GHC then evaluates
Dir.findExecutables and returns the result to the GHCSlave on the host.

A somewhat high-level description of the communication is as follows:

ghc -> slave: send library X
ghc -> slave: link + load library X
ghc -> slave: send ResolvedBCO for splice
ghc -> slave: run BCO
ghc <- slave: ask for findExecutables
ghc -> slave: findExecutables result
ghc <- slave: send back splice result

Thus the additional qXXX functions allow the slave to decide how to handle each operation. qRunIO is true IO on the host, while the others
ask the ghc process to invoke the IO on the build machine on behalf of the slave.
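
As a rough illustration of the round trip described above, here is a small pure model. All type and function names (Request, Response, BuildEnv, answer, findExecutablesViaGhc) are hypothetical and chosen for this sketch only; they are not the actual message types in libraries/ghci, and the "build machine" is just a lookup table rather than real IO.

```haskell
-- Hypothetical messages the slave (on the host) sends back to ghc (on the
-- build machine). Only a FindExecutables message is named in the discussion;
-- ReadFileReq is an assumed second example.
data Request
  = FindExecutables String
  | ReadFileReq FilePath
  deriving (Eq, Show)

-- Hypothetical replies from ghc to the slave.
data Response
  = Paths [FilePath]
  | Contents String
  | Failed String
  deriving (Eq, Show)

-- Pure stand-in for the state of the build machine (an assumption that keeps
-- the sketch free of real file system access).
data BuildEnv = BuildEnv
  { buildExecutables :: [(String, [FilePath])]
  , buildFiles       :: [(FilePath, String)]
  }

-- ghc side: perform the operation on the build machine on behalf of the slave.
answer :: BuildEnv -> Request -> Response
answer env (FindExecutables name) =
  Paths (concat [ps | (n, ps) <- buildExecutables env, n == name])
answer env (ReadFileReq path) =
  maybe (Failed ("no such file: " ++ path)) Contents (lookup path (buildFiles env))

-- slave side: instead of running IO on the host, send a request over the
-- channel (modelled here as a plain function) and interpret the reply.
findExecutablesViaGhc :: (Request -> Response) -> String -> Either String [FilePath]
findExecutablesViaGhc send name =
  case send (FindExecutables name) of
    Paths ps -> Right ps
    other    -> Left ("unexpected response: " ++ show other)
```

The point of the model is that the slave never touches its own file system for these operations: the request is a first-order value, so it can be shipped to ghc and answered there.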

Hm, OK. I'm still a bit unclear where in libraries/ghci/GHCi/TH.hs this decision to query the build machine instead of the host one takes place, but I'll trust your word on this matter.

More importantly, I find the prospect of cramming a bunch of (fairly ad hoc) file/process IO operations into Quasi to be very unsettling. I know that Quasi is already a grab bag of assorted things, but this increases the API surface area by an extraordinary amount. Moreover, there doesn't appear to be any end in sight: what happens when a user needs even more operations from the directory/process library in Template Haskell? They'd either need to add even more Quasi class methods, or they'd need to completely reimplement their desired functions from scratch, but using Q operations instead of IO ones. Neither approach is very satisfying.

Instead, why not have functionality to toggle which machine to search for files on?

qWithMachine :: BuildOrHost -> Q a -> Q a

(API subject to bikeshedding.) That way, you can continue to use qRunIO as before for anything that queries files or processes, and you won't need to have a Quasi counterpart to every basic file/process op under the sun.

This sadly would not work, as we'd still have a far too generic IO action. With a cross compiler we do not have access to the same libraries we have on the host. qRunIO can run an arbitrary IO action and by extension call
any arbitrary function. This in turn requires those functions to be available on the build machine, which they are not necessarily, unless we start to build each and every library for the build machine (but the cross compiler can't do this)
and for the host. The ultimate plan is to eventually make ghc multi-target aware; once we have that (though this is *far* out of the scope of this diff), this could become feasible.

The key here is that we provide specific functions instead of a generic one. Only by doing this can we provide the special handling.

If we wanted to support arbitrary file or process IO through qRunIO, we'd need to hook into the rts (e.g. the approach taken in D3502).

Regarding the increase in API surface, this is a valid concern, and the goal should be to implement only the minimal necessary set from which anything else can be built. This is the option where you'd need to build everything
from Q operations. And yes, while this is not very satisfying, it is a tradeoff I'm willing to make and advocate for, as it allows for proper cross compilation support. With something like Backpack, this could even become less painful for
downstream consumers.

Alas, I was afraid it wouldn't be that simple.

Ah, I hadn't seen D3502. Well, so much for that idea :)

Well, either path towards short-term cross-compiler support for TH is going to involve some unsightly hacks, and I suppose this is at least a manageable hack. All I can do is grumble from the sidelines until we have proper multi-target awareness in GHC ;)

Just to be clear here, I'm not a big fan of blowing up Quasi either. This, however, seems to be the least bad option. I've come to find a certain benefit in being explicit in the qAction though, as it provides more information to ghc about the actual intent.

It would also allow a custom Quasi instance to be restricted to read-only file IO. And while we could provide default implementations, doing so at the definition site of the class would add additional dependencies to the template-haskell package, which I
believe we'd rather not get into.
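
To make the read-only point concrete, here is a minimal sketch under stated assumptions. FileQuasi, ReadOnly, qReadFile and qWriteFile are hypothetical names for illustration, not the class or methods in this diff or in template-haskell, and the file system is modelled as a pure lookup table.

```haskell
-- Hypothetical cut-down class with explicit file operations, standing in for
-- the extended Quasi class discussed here.
class Monad m => FileQuasi m where
  qReadFile  :: FilePath -> m String
  qWriteFile :: FilePath -> String -> m ()

-- A monad that can look files up in a fixed table but has no way to change it.
newtype ReadOnly a = ReadOnly
  { runReadOnly :: [(FilePath, String)] -> Either String a }

instance Functor ReadOnly where
  fmap f (ReadOnly g) = ReadOnly (fmap f . g)

instance Applicative ReadOnly where
  pure x = ReadOnly (const (Right x))
  ReadOnly f <*> ReadOnly g = ReadOnly (\fs -> f fs <*> g fs)

instance Monad ReadOnly where
  ReadOnly g >>= k = ReadOnly (\fs -> g fs >>= \a -> runReadOnly (k a) fs)

-- Reads are served from the table; writes are rejected by construction.
instance FileQuasi ReadOnly where
  qReadFile path = ReadOnly $ \fs ->
    maybe (Left ("no such file: " ++ path)) Right (lookup path fs)
  qWriteFile path _ = ReadOnly (const (Left ("write denied: " ++ path)))
```

Because each operation is its own class method, an instance can refuse one kind of operation while allowing another. With a single generic qRunIO the instance would only see an opaque IO action and could not tell a read from a write.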

angerman updated this revision to Diff 12708. May 27 2017, 2:08 AM
  • add time accessors
bgamari requested changes to this revision. May 29 2017, 11:20 PM

I agree that this is much better than the hooked base option we were looking at earlier. However, some documentation is in order.

libraries/template-haskell/Language/Haskell/TH/Syntax.hs
570

I know we haven't done a great job of documenting this module, but can we have a nice section with a Haddock comment explaining what these are and why they exist?

This revision now requires changes to proceed. May 29 2017, 11:20 PM
angerman updated this revision to Diff 12771. Jun 6 2017, 10:18 AM
angerman edited edge metadata.
  • Add AppendFile
  • Adds removeFile
bgamari requested changes to this revision. Edited Jun 8 2017, 1:50 PM

There are still things to be done here.

libraries/template-haskell/Language/Haskell/TH/Syntax.hs
570

I stand by this request :)

This revision now requires changes to proceed. Jun 8 2017, 1:50 PM

I know, I know. I'm just adding stuff as needed...

I'll try to clean this up once I'm back in sG by the end of the month :-/

angerman updated this revision to Diff 13029. Jul 4 2017, 10:48 PM
angerman edited edge metadata.
  • rebase onto master
bgamari requested changes to this revision. Jul 7 2017, 10:02 AM

Bump out of review queue while this is finished up.

This revision now requires changes to proceed. Jul 7 2017, 10:02 AM
angerman updated this revision to Diff 13083. Jul 9 2017, 8:46 PM
angerman edited edge metadata.
  • rebase

What is the status of this, @angerman? It seems to fail validation.

angerman updated this revision to Diff 13124. Jul 11 2017, 8:08 PM
  • rebase & relax time to 1.5
angerman updated this revision to Diff 13133. Jul 12 2017, 2:58 AM
  • rebase onto fixed master
angerman updated this revision to Diff 13146. Jul 12 2017, 9:26 PM
  • proper rebase
bgamari requested changes to this revision. Aug 18 2017, 7:38 AM

It looks like the output of the test needs to be updated.

This revision now requires changes to proceed. Aug 18 2017, 7:38 AM
angerman updated this revision to Diff 13771. Sep 7 2017, 12:55 AM
angerman edited edge metadata.
  • rebase; fix TH_Roles2
angerman updated this revision to Diff 13801. Sep 9 2017, 2:40 AM
  • rebase
angerman added a subscriber: luite. Sep 9 2017, 6:20 AM
simonmar requested changes to this revision. Sep 11 2017, 2:48 AM
simonmar added inline comments.
libraries/template-haskell/Language/Haskell/TH/Syntax.hs
110–137

I'd like to suggest an alternative approach that should be a bit more modular and extensible, and require fewer changes overall.

Let's define a datatype for the IO operations we want to perform on the build machine:

data BuildIO r where
  BuildIOReadFile :: FilePath -> BuildIO String
  BuildIOWriteFile :: FilePath -> String -> BuildIO ()
  ...

Now define a way to execute these in IO:

performBuildIO :: BuildIO r -> IO r
performBuildIO (BuildIOReadFile f) = readFile f
performBuildIO (BuildIOWriteFile f s) = writeFile f s
...

and in the Q monad we only need one new method:

class Quasi m where
  qBuildIO :: BuildIO r -> m r
  ...

and in the remote GHCi code we can serialise/deserialize BuildIO, so we just need one additional message. We can reuse performBuildIO to actually execute the IO on the build machine.

Does that sound reasonable?
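
For readers following along, the suggestion above can be sketched end to end as follows. This is an illustration under assumptions, not the code in this diff: BuildIO and performBuildIO follow the comment, while BuildQuasi, SomeBuildIO, encode, decode and serveBuildIO are hypothetical names, and the Show/Read-based wire format is a stand-in for whatever serialisation the remote GHCi code would actually use.

```haskell
{-# LANGUAGE GADTs #-}

-- The operations from the comment above; the result type is tracked by r.
data BuildIO r where
  BuildIOReadFile  :: FilePath -> BuildIO String
  BuildIOWriteFile :: FilePath -> String -> BuildIO ()

-- Execute an operation in IO (on whichever machine this runs).
performBuildIO :: BuildIO r -> IO r
performBuildIO (BuildIOReadFile f)    = readFile f
performBuildIO (BuildIOWriteFile f s) = writeFile f s

-- Hypothetical stand-in for the single new Quasi method.
class Monad m => BuildQuasi m where
  qBuildIO :: BuildIO r -> m r

instance BuildQuasi IO where
  qBuildIO = performBuildIO

-- Assumed wire format: a (tag, path, payload) triple rendered with show.
encode :: BuildIO r -> String
encode (BuildIOReadFile f)    = show ("read", f, "")
encode (BuildIOWriteFile f s) = show ("write", f, s)

-- The result type is not known after decoding, so hide it existentially.
data SomeBuildIO where
  SomeBuildIO :: Show r => BuildIO r -> SomeBuildIO

decode :: String -> Maybe SomeBuildIO
decode str = case reads str of
  [(("read", f, _), "")]  -> Just (SomeBuildIO (BuildIOReadFile f))
  [(("write", f, s), "")] -> Just (SomeBuildIO (BuildIOWriteFile f s))
  _                       -> Nothing

-- Build-machine side: one handler covers every operation.
serveBuildIO :: String -> IO (Maybe String)
serveBuildIO msg = case decode msg of
  Nothing               -> pure Nothing
  Just (SomeBuildIO op) -> Just . show <$> performBuildIO op
```

Adding another operation then means one new constructor plus one clause each in performBuildIO, encode and decode, with no new class method and no new message type.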

This revision now requires changes to proceed. Sep 11 2017, 2:48 AM
angerman added inline comments. Sep 11 2017, 7:21 AM
libraries/template-haskell/Language/Haskell/TH/Syntax.hs
110–137

Thanks for taking the time to review this. That does seem like a sensible idea indeed. I'll have to play with it for a bit. Will hopefully get around to it this week.

bgamari added inline comments. Sep 11 2017, 9:21 AM
libraries/template-haskell/Language/Haskell/TH/Syntax.hs
110–137

In principle we could even drop the ability to use liftIO and require that the user be explicit by introducing two lifting primitives,

class Quasi m where
  qHostIO :: BuildIO r -> m r
  qTargetIO :: BuildIO r -> m r

I suspect the breakage would be far more severe than we are willing to stomach, but on the bright side it would force TH users to consider how their code should behave in cross-compiled environments.

Regardless of what we do, we should take care to note the build/target distinction in the Haddocks for runIO and the MonadIO Q instance. Perhaps a reference-able Haddock section in Language.Haskell.TH.Syntax is in order.

angerman updated this revision to Diff 14426. Oct 19 2017, 9:48 PM
  • Rebase. Prior to adapting.
angerman updated this revision to Diff 14427. Oct 19 2017, 9:51 PM
  • Rebase. Again.
bgamari requested changes to this revision. Nov 5 2017, 9:33 AM

Requesting changes pending rework.

This revision now requires changes to proceed. Nov 5 2017, 9:33 AM
austin resigned from this revision. Nov 9 2017, 11:38 AM

Alright, let's do this. I'll rework this as suggested.

How is this going, @angerman?

angerman updated this revision to Diff 15465. Feb 15 2018, 1:58 AM
  • rebase onto master
bgamari requested changes to this revision. Mar 25 2018, 12:42 PM

Any update on this?

This revision now requires changes to proceed. Mar 25 2018, 12:42 PM

I guess I'll look into this soon. I'm currently stuck getting the last few kinks out of cross compilation with nix and TH without file/process IO; after that I'll hopefully be able to focus on TH with file/process IO and clean this up.