RFC: Continuation arguments
Needs RevisionPublic

Authored by alexbiehl on Apr 30 2018, 9:46 AM.

Details

Reviewers
bgamari
simonmar
Summary

This diff implements code generation for continuation arguments. For primops like catch#/with#/mask... which are defined in the runtime-system there is no mechanism for inlining them. This often leads to otherwise unnecessary allocations of closures.

This patch introduces the notion of continuation arguments to code generation, including:

  • A way to control CorePrep to not ANFize certain primops. That is, leaving arguments of the form State# s -> (# State# s, a #) in defined positions.
  • Teaching CoreToStg how to translate these to STG by extending STG language. Namely the GenStgArg type.
  • Inline primops and continuation in code generation.

This patch happily inlines catch#:

... 
I64[Sp - 24] = PicBaseReg + stg_catch_frame_info;
I64[Sp - 16] = %MO_UU_Conv_W32_W64(I32[I64[BaseReg + 872] + 28]);
I64[Sp - 8] = PicBaseReg + (Test.someHandler_closure+2);
...

No...

  • ... call to runtime system
  • ... allocation
  • ... copying of free variables

needed

Currently this is implemented only for catch# primop. Once we agree to merge this
I will bring in the rest.

P.S. Also StgCse and Unarise are broken for continuation arguments. I will fix that in the coming days.

[1] https://github.com/ghc/ghc/blob/ba5797937e575ce6119de6c07703e90dda2557e8/rts/Exception.cmm#L393

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Branch
pr/cont-primops
Lint
Lint WarningsExcuse: Sorry for the long lines
SeverityLocationCodeMessage
Warningcompiler/cmm/CLabel.hs:506TXT3Line Too Long
Warningcompiler/codeGen/StgCmmPrim.hs:2440TXT3Line Too Long
Warningcompiler/coreSyn/CorePrep.hs:889TXT3Line Too Long
Warningcompiler/coreSyn/CorePrep.hs:942TXT3Line Too Long
Warningcompiler/coreSyn/CorePrep.hs:954TXT3Line Too Long
Warningcompiler/coreSyn/CorePrep.hs:960TXT3Line Too Long
Warningcompiler/simplStg/StgCse.hs:187TXT3Line Too Long
Unit
No Unit Test Coverage
Build Status
Buildable 20487
Build 44913: [GHC] Linux/amd64: Continuous Integration
Build 44912: [GHC] OSX/amd64: Continuous Integration
Build 44911: [GHC] Windows/amd64: Continuous Integration
Build 44910: arc lint + arc unit
alexbiehl created this revision.Apr 30 2018, 9:46 AM
alexbiehl retitled this revision from This diff implements code generation for continuation arguments. to Continuation arguments.Apr 30 2018, 9:47 AM
alexbiehl edited the summary of this revision. (Show Details)
alexbiehl retitled this revision from Continuation arguments to RFC: Continuation arguments.
alexbiehl edited the summary of this revision. (Show Details)
alexbiehl edited the summary of this revision. (Show Details)Apr 30 2018, 9:50 AM
alexbiehl edited the summary of this revision. (Show Details)
alexbiehl edited the summary of this revision. (Show Details)Apr 30 2018, 3:03 PM
alexbiehl edited the summary of this revision. (Show Details)
alexbiehl edited the summary of this revision. (Show Details)
alexbiehl edited the summary of this revision. (Show Details)
alexbiehl edited the summary of this revision. (Show Details)Apr 30 2018, 3:05 PM
dfeuer added a subscriber: dfeuer.Apr 30 2018, 10:12 PM

This sounds really good for catch#, if I understand it correctly, but it needs a note with examples. If we currently allocate closures for both arguments to catch# when they have free variables (the usual case), that's quite bad.

@dfeuer I will provide documentation and examples once I get the Unarise and StgCse to behave.

The idea to this came from the discussions around https://phabricator.haskell.org/D4110 which introduces the with# primop.

Looks like a good start although a note would indeed help during review.

Please hit "request review". This stupid new Phabricator change Ben has been trying to reverse otherwise disables notifications to subscribers.

alexbiehl requested review of this revision.May 1 2018, 1:01 PM
simonmar requested changes to this revision.May 17 2018, 12:52 PM

Partial review - this looks really cool! Definitely needs polishing + comments though.

compiler/coreSyn/CorePrep.hs
849

Remove commented-out code?

1060

can you close up the whitespace here? I find this really hard to read.

compiler/stgSyn/StgSyn.hs
494

You'll need a case here for StgContArg

This revision now requires changes to proceed.May 17 2018, 12:52 PM
alexbiehl added a comment.EditedJun 2 2018, 10:52 AM

I picked work up this weekend. I still can't get it to work reliably:

When trying to compile GHC.IO.bracket I get:

ghc-stage2: panic! (the 'impossible' happened)
  (GHC version 8.5.20180427 for x86_64-apple-darwin):
	Ix{Int}.index: Index (3) out of range ((5,5))
CallStack (from -prof):
  CmmPipeline.setInfoTableStackMap (compiler/cmm/CmmPipeline.hs:129:13-66)
  CmmPipeline.tops (compiler/cmm/CmmPipeline.hs:44:33-58)
  HscMain.cmmPipeline (compiler/main/HscMain.hs:(1458,11)-(1466,25))
  HscMain.doCodeGen (compiler/main/HscMain.hs:(1427,1)-(1482,22))
  HscMain.StgCmm (compiler/main/HscMain.hs:(1341,29)-(1343,50))
  HscMain.cmmToRawCmm (compiler/main/HscMain.hs:1347:23-45)
  CodeOutput.NativeCodeGen (compiler/main/CodeOutput.hs:166:18-78)
  CodeOutput.OutputAsm (compiler/main/CodeOutput.hs:(164,37)-(166,78))
  HscMain.codeOutput (compiler/main/HscMain.hs:(1356,19)-(1357,67))
  HscMain.hscGenHardCode (compiler/main/HscMain.hs:(1298,1)-(1358,64))
  GHC.withCleanupSession (compiler/main/GHC.hs:(466,1)-(475,37))
  GHC.runGhc (compiler/main/GHC.hs:(441,1)-(446,26))
  GHC.defaultErrorHandler (compiler/main/GHC.hs:(381,1)-(413,7))

I suspect something is wrong with stack layouting but I have no idea where it is. The strange thing is, inlined catch# should behave exactly like update frames do - regarding ceremony around setup - but still this somehow breaks something.

hsyl20 added a subscriber: hsyl20.Jul 30 2018, 7:02 PM

I'm not familiar with this part of GHC but here are a few guesses...

I suspect something is wrong with stack layouting but I have no idea where it is. The strange thing is, inlined catch# should behave exactly like update frames do - regarding ceremony around setup - but still this somehow breaks something.

Maybe this will help: if instead of setting the update frame offset with withUpdFrameOff we increase it by the size of the catch frame, the error becomes: Ix{Int}.index: Index (3) out of range ((6,6)) (notice the range change)

compiler/codeGen/StgCmmPrim.hs
88

Shouldn't we substitute the binder in body with the real-world token applied to catch#?

Shouldn't we emitReturn code just like the other inline primops?

2440

It seems like some masking is missing compared to stg_catch#: see here