Zap coercions when not building with -dcore-lint
Needs ReviewPublic

Authored by bgamari on Jun 1 2018, 3:41 PM.

Details

Reviewers
goldfire
simonpj
Trac Issues
#8095
Summary

Coercions for even small programs can grow to be quite large (e.g. Trac #8095),
especially when type families are involved. For instance, the case of addition
of inductive naturals can build coercions quadratic in size of the summands.

In this patch we avoid this quadratic cost by eliminating coercions from Core
when they are not needed (e.g. when we aren't Core Linting). Through careful
representation we can achieve an asymptotic improvement in typechecking
performance in type-family-heavy programs.

See Note [Zapping Coercions] for details.

Thanks to @goldfire for his inspiration and guidance.

Performance

Below I've tabulated the runtimes (in seconds) of ghc -freduction-depth=0 -v0 compiling
the testcase generated by Types.hs attached to Trac #8095 (specifically the a a variant) with
GHC 8.2.2 and master with this patch:

# n     8.2.2    D4766
200      0.44    0.31
400      0.66    0.40
600      0.93    0.42
800      1.31    0.45
1000     1.77    0.55
1500     3.32    0.66
2000     5.94    0.84
2500     8.73    0.96
3000    12.80    1.11
3500    17.00    1.28
4000    20.64    1.42
5000    33.60    1.84

Looks pretty linear to me.

The results are even more dramatic when the test is run in the b a variant:

# n     8.2.2    D4766
200      1.82    0.32
400     10.40    0.39
600     32.37    0.45
800     81.69    0.49
1000   138.87    0.53

Here we see that compilation times remain nearly zero with this patch long after they become completely untenable under 8.2.2.

The story is pretty similar under the c a variant:

# n    8.2.2     D4766
200     1.23     0.35
400     4.32     0.39
600    11.95     0.42
800    25.72     0.51
1000   44.57     0.56
1500  148.11     0.65

Even the allocation numbers of some testsuite cases benefit pretty significantly:

T5030      -80.3%
T12227     -53.7%
T12545     -75.3%
Test Plan

Validate

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Branch
zap-coercions
Lint
Lint WarningsExcuse: It's okay
SeverityLocationCodeMessage
Warningcompiler/coreSyn/CoreFVs.hs:396TXT3Line Too Long
Warningcompiler/iface/IfaceEnv.hs:228TXT3Line Too Long
Warningcompiler/iface/TcIface.hs:1244TXT3Line Too Long
Warningcompiler/typecheck/TcFlatten.hs:1723TXT3Line Too Long
Warningcompiler/typecheck/TcFlatten.hs:1758TXT3Line Too Long
Warningcompiler/types/Coercion.hs:1039TXT3Line Too Long
Warningcompiler/types/Coercion.hs:1062TXT3Line Too Long
Warningcompiler/types/TyCoRep.hs:1496TXT3Line Too Long
Warningcompiler/types/TyCoRep.hs:1512TXT3Line Too Long
Warningcompiler/types/TyCoRep.hs:1552TXT3Line Too Long
Warningcompiler/types/TyCoRep.hs:1557TXT3Line Too Long
Warningcompiler/types/TyCoRep.hs:1863TXT3Line Too Long
Warningcompiler/types/TyCoRep.hs:2718TXT3Line Too Long
Warningcompiler/types/TyCoRep.hs:3319TXT3Line Too Long
Unit
No Unit Test Coverage
Build Status
Buildable 21224
Build 47761: [GHC] Linux/amd64: Continuous Integration
Build 47760: [GHC] OSX/amd64: Continuous Integration
Build 47759: [GHC] Windows/amd64: Continuous Integration
Build 47758: arc lint + arc unit
There are a very large number of changes, so older changes are hidden. Show Older Changes
bgamari marked 31 inline comments as done.Jun 8 2018, 8:32 PM
bgamari added inline comments.
compiler/typecheck/TcFlatten.hs
1695

Rehashed in the Note.

1718–1719

I think the Note should adequately treat this now.

compiler/typecheck/TcType.hs
956

Still need to look at this.

compiler/types/OptCoercion.hs
91

TODO: Drop this.

compiler/types/TyCoRep.hs
944

Why not just use UnivCo and make a new UnivCoProvenance for ZappedCo? UnivCo is used in precisely the same way; e.g. PluginProv "foo" precisely says "I lack a detailed proof"

The patch now does precisely this.

1514

I need to finish writing this.

3375

Thankfully it is benign; contrary to what I suggested in person, coercion sizes are not accounted for in the expression sizes used by the simplifier.

bgamari marked 3 inline comments as done.Jun 8 2018, 8:40 PM

There is still a bit more to be done on the documentation side of things. However, this otherwise is looking pretty good. The only changes in the testsuite are double-digit-percentage improvements in allocations; not a single regression. The numbers reported in the Differential summary are measurements of the current state of the patch and reflect the asymptotic improvement that I would expect from the refactor.

bgamari edited the summary of this revision. (Show Details)Jun 8 2018, 8:47 PM
bgamari edited the summary of this revision. (Show Details)
bgamari edited the summary of this revision. (Show Details)
bgamari edited the summary of this revision. (Show Details)Jun 8 2018, 8:51 PM
bgamari edited the summary of this revision. (Show Details)
bgamari edited the summary of this revision. (Show Details)Jun 8 2018, 8:55 PM
bgamari edited the summary of this revision. (Show Details)
bgamari edited the summary of this revision. (Show Details)
bgamari edited the summary of this revision. (Show Details)Jun 8 2018, 8:57 PM
bgamari edited the summary of this revision. (Show Details)
bgamari edited the summary of this revision. (Show Details)
bgamari edited the summary of this revision. (Show Details)
bgamari edited the summary of this revision. (Show Details)Jun 8 2018, 9:01 PM
bgamari edited the summary of this revision. (Show Details)
bgamari edited the summary of this revision. (Show Details)
bgamari edited the summary of this revision. (Show Details)Jun 8 2018, 9:03 PM
bgamari edited the summary of this revision. (Show Details)Jun 8 2018, 9:21 PM
bgamari edited the summary of this revision. (Show Details)Jun 8 2018, 9:23 PM
bgamari edited the summary of this revision. (Show Details)Jun 8 2018, 9:29 PM
bgamari edited the summary of this revision. (Show Details)
alanz added a subscriber: alanz.Jun 9 2018, 9:41 AM
stites added a subscriber: stites.Jun 9 2018, 11:27 AM
bgamari updated this revision to Diff 16821.Jun 9 2018, 4:21 PM
  • Fix order of types
  • Eliminate some zapping
  • Reenable zapping in CoreOpt
  • Work on Note
  • More work on the note
  • A few fixes

Looks good

compiler/typecheck/TcFlatten.hs
1759

I wonder about defining

maybeZapCo :: Coercion -> TcType -> TcType -> Role -> TcM Coercion
-- Precondition: co :: t1 ~r t2
maybeZapCo co t1 t2 r
  = do { dflags <- getDynFlags
          ; if buidlCoercions dflags
           then return co
           else return (ZapCoercion. ....) }

This would localise the choice.

compiler/types/TyCoRep.hs
1484

Great! A 100% regression can't be hard to find :-)

1585

I'd prefer not to do so here, but rather to do it in optCoercion (see below). For a start you need unsafeDynFlags here.

1587

Why?

1593

I don't understand this. If we define

optCoercion co = zapCoercion co

are you really say that increases allocation??

1633

For all of these things, I wonder

  • What is an example?
  • Does it matter?
compiler/types/Type.hs
504

Do we really need this? It's a new complication. How much worse are things without it?

The only changes in the testsuite are double-digit-percentage improvements in allocations; not a single regression.

But on the phone you said that one benchmark gets 100% worse. It would be good to understand how that happens.

I took another detailed pass through. Mostly documentation requests... Thanks!

compiler/coreSyn/CoreOpt.hs
994

Indeed it does. Specifically: why zap this particular coercion? Aren't there others, too?

compiler/iface/IfaceSyn.hs
1464

While the differences are nicely documented now, do we have a new "way" to run the testsuite? The testsuite generally runs with -dcore-lint (as it should). However, I think a slow run should now disable that flag so we can spot any bugs that arise from GHC's different behavior.

compiler/iface/IfaceType.hs
314

I find this terminology confusing: "closed free" seems contradictory.

How about "local" vs "free"?

Also, I believe that one list will always be empty. Why? Iface conversion happens for types/coercions (for pretty printing) or for whole programs. Covars can never be locally bound in a type/coercion, so the first list will be empty in that case. On the other hand, when converting a whole program, there are no free variables, so the second list will be empty in that case. We don't necessarily need to change this type, but it might be helpful to (test and) document this invariant.

318

It's Zapping coercions elsewhere (note the case). Also, tell me what module the Note is in, please.

495

This seems wrong, to me. Even though these are just coercion variables, couldn't a substed type appear in a covar's kind? This could appear with -fprint-explicit-coercions -fprint-explicit-kinds or some such. We should probably also update the IfaceCoVarCo case.

1650

Maybe MASSERT that the free vars are empty here? c.f. the panic in the IfaceFreeTyVar case.

compiler/iface/TcIface.hs
1200–1205

Why the lambda?

1244

This could also perhaps assert that the second list is empty. Doing so would also help poor readers understand why a field in IfaceZappedProv is ignored.

compiler/main/DynFlags.hs
1515

Tell me where the Note is, please.

1519

Just a reminder to remove this TODO (either by doing it or deciding not to -- I actually don't care much which)

compiler/typecheck/TcFlatten.hs
1694–1698

Tell me where the Note is, please (here and elsewhere in this file)

1717

Should we put the zapped coercion in the cache? I think so. We want this to be true: When shouldBuildCoercions is false, we never force norm_co.

compiler/typecheck/TcHsSyn.hs
1687

This record update is now redundant.

compiler/typecheck/TcType.hs
963

I wonder if the best answer here is to have all type synonyms expanded in the fvs, always.

compiler/types/Coercion.hs
853

Why only ZappedProv? Is there a reason not to do this for other provenances?

915

Is there a reason we handle these cases here? They're not bad, but if these additions were inspired by, say, a performance regression, that would be good to document.

923

I don't think we can always do this. Perhaps one of the types involved isn't zonked. Or maybe we've zapped an unsafeCoerce.

999

Similarly to the NthCo case: do we need this optimization? If so, what's the test case? Also, I'm not sure this will always succeed in bizarre corner cases (like unsafeCoerce)

1829

Does that actually work to seq a DVarSet? But maybe that's enough...

compiler/types/OptCoercion.hs
429

I don't think you need this case. The catch-all case is good enough.

445

This should almost certainly be prov', not prov.

457

Ditto here.

compiler/types/TyCoRep.hs
1324

It Zapping coercions elsewhere (note the case). This actually confused my Note-retrieval process!

1572

s/zapping/zapped/

1593

I, too, am quite curious here.

1633

While examples would be nice, I understand they may be hard to cook up. But I am interested in how these changes might manifest in different behavior, and understanding these seem necessary to make sure this patch is correct.

1645

This difference makes me worry that some instance will pass the termination check while zapping but wouldn't if we're not zapping. Maybe, just maybe, that's OK. But it sure is strange!

3298

Why this change?

compiler/types/Type.hs
504

This field seems always to be false. If it hasn't proved useful, please rip it out.

583

But this should never be true now, right?

2436

Tell me where this Note is, please.

bgamari marked 10 inline comments as done.Jun 12 2018, 1:34 PM

I'm in the process of doing a bit of refactoring on this; I've introduced some additional assertions and am seeing some free variables inconsistencies in the flattener.

compiler/iface/IfaceType.hs
314

Agreed; I am also not a fan of this language. Amended to local/free.

compiler/typecheck/TcFlatten.hs
1694–1698

Is it possible that the free variable set of

update_co $ mkSymCo `mkTcCoherenceLeftCo` kind_co)

is larger than fvs?

1759

Yes, this is an excellent suggestion.

compiler/typecheck/TcHsSyn.hs
1687

Well, originally this False was a True. However, I think I'm just going to rip out the zapping-during-zonking support.

compiler/typecheck/TcType.hs
963

Hmm, yes, that is an interesting idea.

compiler/types/Coercion.hs
853

I wanted to start with a minimal change. However, I do intend on extending this to cover the other cases in a later patch.

1829

Good catch; indeed it's not. If DVarSet were backed by a Set then this would be sufficient but it's not; it's backed by an IntMap.

compiler/types/OptCoercion.hs
429

Very true. Silly me.

463

prov' here as well I suppose?

compiler/types/TyCoRep.hs
3298

Just deduplicating a repeated pattern.

compiler/types/Type.hs
504

We don't need it; in fact it's currently disabled. Enabling it introduces a few ~5% regressions.

583

That is correct.

bgamari updated this revision to Diff 16867.Jun 12 2018, 3:38 PM
bgamari marked 5 inline comments as done.
  • Harmless changes
  • Rework handling of DynFlags in mapCoercion/mapType
  • CoreLint: Check ZappedCo free variables
  • TcUnify: Unify ZappedCo fvs
  • Coercion: Fix promoted ZappedCo role
  • FamInstEnv: More efficient fvs
  • Type: more efficient FVs
  • More changes
  • Refactor as Prov
  • Zap coercions from interface files
  • Optimise mkSymCo of ZappedCo
  • Use mkUnivCo
  • Zap coercions in pushCoTyArg
  • Handle zapped coercions in mkNthCo
  • Handle zapped coercions in mkLRCo
  • Handle zapped coercions in mkSubCo
  • ZappedCos will never have free type variables
  • Deduplicate zapped coercion promoteCoercion
  • Don't re-zap a zapped coercion
  • First zap, then map
  • Fix long line
  • Don't filter for CoVars
  • Don't look to the coercion for its kind
  • Drop unused binders
  • Make zapping during zonking optional
  • Drop zapCoercionWithFVs
  • Note difference
  • Don't zap Refl coercions
  • Fix linting
  • Fix order of types
  • Eliminate some zapping
  • Reenable zapping in CoreOpt
  • Work on Note
  • More work on the note
  • A few fixes
  • Rip out zapping during zonking
  • Refactoring suggested by Simon
  • Add -ddrop-coercions for debugging
  • Fix seqProv implementation
  • Fix note reference
  • Rip out impossible case
  • Be more specific about location of Note
  • Fix comments
goldfire added inline comments.Jun 12 2018, 3:41 PM
compiler/typecheck/TcFlatten.hs
1694–1698

Yes, indeed it is! The fvs calculated here must equal the fvs of co (the trans of norm_co and final_co, below), but they do not include the kind_co or the update_co. Good catch. You'll need to account for that. It would be OK to just replace co with the zapped one below, calling mkCoherenceLeftCo and update_co as you normally would.

compiler/types/OptCoercion.hs
463

Yes indeed.