Zap coercions when not building with -dcore-lint
Needs ReviewPublic

Authored by bgamari on Jun 1 2018, 3:41 PM.

Details

Reviewers
goldfire
simonpj
Trac Issues
#8095
Summary

Coercions for even small programs can grow to be quite large (e.g. Trac #8095),
especially when type families are involved. For instance, the case of addition
of inductive naturals can build coercions quadratic in size of the summands.

In this patch we avoid this quadratic cost by eliminating coercions from Core
when they are not needed (e.g. when we aren't Core Linting). Through careful
representation we can achieve an asymptotic improvement in typechecking
performance in type-family-heavy programs.

See Note [Zapping Coercions] for details.

Thanks to @goldfire for his inspiration and guidance.

Performance

Below I've tabulated the runtimes (in seconds) of ghc -freduction-depth=0 -v0 compiling
the testcase generated by Types.hs attached to Trac #8095 (specifically the a a variant) with
GHC 8.2.2 and master with this patch:

# n     8.2.2    D4766
200      0.44    0.31
400      0.66    0.40
600      0.93    0.42
800      1.31    0.45
1000     1.77    0.55
1500     3.32    0.66
2000     5.94    0.84
2500     8.73    0.96
3000    12.80    1.11
3500    17.00    1.28
4000    20.64    1.42
5000    33.60    1.84

Looks pretty linear to me.

The results are even more dramatic when the test is run in the b a variant:

# n     8.2.2    D4766
200      1.82    0.32
400     10.40    0.39
600     32.37    0.45
800     81.69    0.49
1000   138.87    0.53

Here we see that compilation times remain nearly zero with this patch long after they become completely untenable under 8.2.2.

The story is pretty similar under the c a variant:

# n    8.2.2     D4766
200     1.23     0.35
400     4.32     0.39
600    11.95     0.42
800    25.72     0.51
1000   44.57     0.56
1500  148.11     0.65

Even the allocation numbers of some testsuite cases benefit pretty significantly:

T5030      -80.3%
T12227     -53.7%
T12545     -75.3%
Test Plan

Validate

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Branch
zap-coercions-2
Lint
Lint WarningsExcuse: It's okay
SeverityLocationCodeMessage
Warningcompiler/coreSyn/CoreFVs.hs:400TXT3Line Too Long
Warningcompiler/iface/IfaceEnv.hs:229TXT3Line Too Long
Warningcompiler/iface/TcIface.hs:1252TXT3Line Too Long
Warningcompiler/typecheck/TcFlatten.hs:1755TXT3Line Too Long
Warningcompiler/typecheck/TcFlatten.hs:1756TXT3Line Too Long
Warningcompiler/typecheck/TcFlatten.hs:1792TXT3Line Too Long
Warningcompiler/typecheck/TcFlatten.hs:1793TXT3Line Too Long
Warningcompiler/typecheck/TcHsType.hs:898TXT3Line Too Long
Warningcompiler/types/Coercion.hs:1111TXT3Line Too Long
Warningcompiler/types/Coercion.hs:1135TXT3Line Too Long
Warningcompiler/types/TyCoRep.hs:1604TXT3Line Too Long
Warningcompiler/types/TyCoRep.hs:1620TXT3Line Too Long
Warningcompiler/types/TyCoRep.hs:1660TXT3Line Too Long
Warningcompiler/types/TyCoRep.hs:1665TXT3Line Too Long
Warningcompiler/types/TyCoRep.hs:1979TXT3Line Too Long
Warningcompiler/types/TyCoRep.hs:2904TXT3Line Too Long
Warningcompiler/types/TyCoRep.hs:3583TXT3Line Too Long
Unit
No Unit Test Coverage
Build Status
Buildable 23015
Build 54360: [GHC] Linux/amd64: Continuous Integration
Build 54359: [GHC] OSX/amd64: Continuous Integration
Build 54358: [GHC] Windows/amd64: Continuous Integration
Build 54357: arc lint + arc unit
There are a very large number of changes, so older changes are hidden. Show Older Changes
bgamari added inline comments.Jun 8 2018, 8:32 PM
compiler/types/TyCoRep.hs
3640

Thankfully it is benign; contrary to what I suggested in person, coercion sizes are not accounted for in the expression sizes used by the simplifier.

bgamari marked 3 inline comments as done.Jun 8 2018, 8:40 PM

There is still a bit more to be done on the documentation side of things. However, this otherwise is looking pretty good. The only changes in the testsuite are double-digit-percentage improvements in allocations; not a single regression. The numbers reported in the Differential summary are measurements of the current state of the patch and reflect the asymptotic improvement that I would expect from the refactor.

bgamari edited the summary of this revision. (Show Details)Jun 8 2018, 8:47 PM
bgamari edited the summary of this revision. (Show Details)
bgamari edited the summary of this revision. (Show Details)
bgamari edited the summary of this revision. (Show Details)Jun 8 2018, 8:51 PM
bgamari edited the summary of this revision. (Show Details)
bgamari edited the summary of this revision. (Show Details)Jun 8 2018, 8:55 PM
bgamari edited the summary of this revision. (Show Details)
bgamari edited the summary of this revision. (Show Details)
bgamari edited the summary of this revision. (Show Details)Jun 8 2018, 8:57 PM
bgamari edited the summary of this revision. (Show Details)
bgamari edited the summary of this revision. (Show Details)
bgamari edited the summary of this revision. (Show Details)
bgamari edited the summary of this revision. (Show Details)Jun 8 2018, 9:01 PM
bgamari edited the summary of this revision. (Show Details)
bgamari edited the summary of this revision. (Show Details)
bgamari edited the summary of this revision. (Show Details)Jun 8 2018, 9:03 PM
bgamari edited the summary of this revision. (Show Details)Jun 8 2018, 9:21 PM
bgamari edited the summary of this revision. (Show Details)Jun 8 2018, 9:23 PM
bgamari edited the summary of this revision. (Show Details)Jun 8 2018, 9:29 PM
bgamari edited the summary of this revision. (Show Details)
alanz added a subscriber: alanz.Jun 9 2018, 9:41 AM
stites added a subscriber: stites.Jun 9 2018, 11:27 AM
bgamari updated this revision to Diff 16821.Jun 9 2018, 4:21 PM
  • Fix order of types
  • Eliminate some zapping
  • Reenable zapping in CoreOpt
  • Work on Note
  • More work on the note
  • A few fixes

Looks good

compiler/typecheck/TcFlatten.hs
1794

I wonder about defining

maybeZapCo :: Coercion -> TcType -> TcType -> Role -> TcM Coercion
-- Precondition: co :: t1 ~r t2
maybeZapCo co t1 t2 r
  = do { dflags <- getDynFlags
          ; if buidlCoercions dflags
           then return co
           else return (ZapCoercion. ....) }

This would localise the choice.

compiler/types/TyCoRep.hs
1677

Great! A 100% regression can't be hard to find :-)

1778

I'd prefer not to do so here, but rather to do it in optCoercion (see below). For a start you need unsafeDynFlags here.

1780

Why?

1786

I don't understand this. If we define

optCoercion co = zapCoercion co

are you really say that increases allocation??

1826

For all of these things, I wonder

  • What is an example?
  • Does it matter?
compiler/types/Type.hs
523

Do we really need this? It's a new complication. How much worse are things without it?

The only changes in the testsuite are double-digit-percentage improvements in allocations; not a single regression.

But on the phone you said that one benchmark gets 100% worse. It would be good to understand how that happens.

I took another detailed pass through. Mostly documentation requests... Thanks!

compiler/coreSyn/CoreOpt.hs
1026

Indeed it does. Specifically: why zap this particular coercion? Aren't there others, too?

compiler/iface/IfaceSyn.hs
1474

While the differences are nicely documented now, do we have a new "way" to run the testsuite? The testsuite generally runs with -dcore-lint (as it should). However, I think a slow run should now disable that flag so we can spot any bugs that arise from GHC's different behavior.

compiler/iface/IfaceType.hs
332

I find this terminology confusing: "closed free" seems contradictory.

How about "local" vs "free"?

Also, I believe that one list will always be empty. Why? Iface conversion happens for types/coercions (for pretty printing) or for whole programs. Covars can never be locally bound in a type/coercion, so the first list will be empty in that case. On the other hand, when converting a whole program, there are no free variables, so the second list will be empty in that case. We don't necessarily need to change this type, but it might be helpful to (test and) document this invariant.

336

It's Zapping coercions elsewhere (note the case). Also, tell me what module the Note is in, please.

516

This seems wrong, to me. Even though these are just coercion variables, couldn't a substed type appear in a covar's kind? This could appear with -fprint-explicit-coercions -fprint-explicit-kinds or some such. We should probably also update the IfaceCoVarCo case.

1753

Maybe MASSERT that the free vars are empty here? c.f. the panic in the IfaceFreeTyVar case.

compiler/iface/TcIface.hs
1206–1211

Why the lambda?

1252

This could also perhaps assert that the second list is empty. Doing so would also help poor readers understand why a field in IfaceZappedProv is ignored.

compiler/main/DynFlags.hs
1545

Tell me where the Note is, please.

1549

Just a reminder to remove this TODO (either by doing it or deciding not to -- I actually don't care much which)

compiler/typecheck/TcFlatten.hs
1725–1732

Tell me where the Note is, please (here and elsewhere in this file)

1751

Should we put the zapped coercion in the cache? I think so. We want this to be true: When shouldBuildCoercions is false, we never force norm_co.

compiler/typecheck/TcHsSyn.hs
1688 ↗(On Diff #16821)

This record update is now redundant.

compiler/typecheck/TcType.hs
963

I wonder if the best answer here is to have all type synonyms expanded in the fvs, always.

compiler/types/Coercion.hs
911

Why only ZappedProv? Is there a reason not to do this for other provenances?

990

Is there a reason we handle these cases here? They're not bad, but if these additions were inspired by, say, a performance regression, that would be good to document.

998

I don't think we can always do this. Perhaps one of the types involved isn't zonked. Or maybe we've zapped an unsafeCoerce.

1053

Similarly to the NthCo case: do we need this optimization? If so, what's the test case? Also, I'm not sure this will always succeed in bizarre corner cases (like unsafeCoerce)

2090

Does that actually work to seq a DVarSet? But maybe that's enough...

compiler/types/OptCoercion.hs
493

I don't think you need this case. The catch-all case is good enough.

509

This should almost certainly be prov', not prov.

521

Ditto here.

compiler/types/TyCoRep.hs
1432

It Zapping coercions elsewhere (note the case). This actually confused my Note-retrieval process!

1765

s/zapping/zapped/

1786

I, too, am quite curious here.

1826

While examples would be nice, I understand they may be hard to cook up. But I am interested in how these changes might manifest in different behavior, and understanding these seem necessary to make sure this patch is correct.

1838

This difference makes me worry that some instance will pass the termination check while zapping but wouldn't if we're not zapping. Maybe, just maybe, that's OK. But it sure is strange!

3563

Why this change?

compiler/types/Type.hs
523

This field seems always to be false. If it hasn't proved useful, please rip it out.

613

But this should never be true now, right?

2825

Tell me where this Note is, please.

bgamari marked 10 inline comments as done.Jun 12 2018, 1:34 PM

I'm in the process of doing a bit of refactoring on this; I've introduced some additional assertions and am seeing some free variables inconsistencies in the flattener.

compiler/iface/IfaceType.hs
332

Agreed; I am also not a fan of this language. Amended to local/free.

compiler/typecheck/TcFlatten.hs
1725–1732

Is it possible that the free variable set of

update_co $ mkSymCo `mkTcCoherenceLeftCo` kind_co)

is larger than fvs?

1794

Yes, this is an excellent suggestion.

compiler/typecheck/TcHsSyn.hs
1688 ↗(On Diff #16821)

Well, originally this False was a True. However, I think I'm just going to rip out the zapping-during-zonking support.

compiler/typecheck/TcType.hs
963

Hmm, yes, that is an interesting idea.

compiler/types/Coercion.hs
911

I wanted to start with a minimal change. However, I do intend on extending this to cover the other cases in a later patch.

2090

Good catch; indeed it's not. If DVarSet were backed by a Set then this would be sufficient but it's not; it's backed by an IntMap.

compiler/types/OptCoercion.hs
493

Very true. Silly me.

527–545

prov' here as well I suppose?

compiler/types/TyCoRep.hs
3563

Just deduplicating a repeated pattern.

compiler/types/Type.hs
523

We don't need it; in fact it's currently disabled. Enabling it introduces a few ~5% regressions.

613

That is correct.

bgamari updated this revision to Diff 16867.Jun 12 2018, 3:38 PM
bgamari marked 5 inline comments as done.
  • Harmless changes
  • Rework handling of DynFlags in mapCoercion/mapType
  • CoreLint: Check ZappedCo free variables
  • TcUnify: Unify ZappedCo fvs
  • Coercion: Fix promoted ZappedCo role
  • FamInstEnv: More efficient fvs
  • Type: more efficient FVs
  • More changes
  • Refactor as Prov
  • Zap coercions from interface files
  • Optimise mkSymCo of ZappedCo
  • Use mkUnivCo
  • Zap coercions in pushCoTyArg
  • Handle zapped coercions in mkNthCo
  • Handle zapped coercions in mkLRCo
  • Handle zapped coercions in mkSubCo
  • ZappedCos will never have free type variables
  • Deduplicate zapped coercion promoteCoercion
  • Don't re-zap a zapped coercion
  • First zap, then map
  • Fix long line
  • Don't filter for CoVars
  • Don't look to the coercion for its kind
  • Drop unused binders
  • Make zapping during zonking optional
  • Drop zapCoercionWithFVs
  • Note difference
  • Don't zap Refl coercions
  • Fix linting
  • Fix order of types
  • Eliminate some zapping
  • Reenable zapping in CoreOpt
  • Work on Note
  • More work on the note
  • A few fixes
  • Rip out zapping during zonking
  • Refactoring suggested by Simon
  • Add -ddrop-coercions for debugging
  • Fix seqProv implementation
  • Fix note reference
  • Rip out impossible case
  • Be more specific about location of Note
  • Fix comments
goldfire added inline comments.Jun 12 2018, 3:41 PM
compiler/typecheck/TcFlatten.hs
1725–1732

Yes, indeed it is! The fvs calculated here must equal the fvs of co (the trans of norm_co and final_co, below), but they do not include the kind_co or the update_co. Good catch. You'll need to account for that. It would be OK to just replace co with the zapped one below, calling mkCoherenceLeftCo and update_co as you normally would.

compiler/types/OptCoercion.hs
527–545

Yes indeed.

bgamari updated this revision to Diff 18264.Oct 8 2018, 4:40 PM
  • Kill unused binding
  • Fix free variables sets
bgamari updated this revision to Diff 18269.Oct 8 2018, 9:51 PM
bgamari marked an inline comment as done.

Rebase