Towards using level numbers for generalisation
AbandonedPublic

Authored by goldfire on Nov 7 2018, 11:36 AM.

Details

Reviewers
simonpj
bgamari
Trac Issues
#15809
Summary

(posted on behalf of Simon PJ)

My original goal was (Trac Trac #15809) to move towards using level numbers
as the basis for deciding which type variables to generalise, rather
than searching for the free varaibles of the environment.

And indeed this patch does make such a step:

  • Augment quantifyTyVars to calculate the type variables to quantify using level numbers, and compare the result with the existing approach. That is; no change in behaviour, just a WARNing if the two approaches give different answers.
  • To do this I had to get the level number right when calling quantifyTyVars, and this entailed a bit of care, especially in the code for kind-checking type declarations.
  • However, on the way I was able to eliminate or simplify a number of calls to solveEqualities.

    Similarly I curbed the excessive use of scopeTyVars, which also invokes solveEqualities. There is only one left, in tcFamTyPats -- a gruesome function that is next on my target list.

This led me into deep dive into kind inference for type and class
declarations, which has been a soggy part of GHC for a long time.
Richard already did some good work recently in

commit 5e45ad10ffca1ad175b10f6ef3327e1ed8ba25f3
Date:   Thu Sep 13 09:56:02 2018 +0200
 
 Finish fix for #14880.
 
 The real change that fixes the ticket is described in
 Note [Naughty quantification candidates] in TcMType.

but I kept turning over stones. So this patch has ended up
with a pretty significant refactoring of that code too.

Specifically:

  • Major refactoring in the way we generalise the inferred kind of a TyCon, in kcTyClGroup. Indeed, I made it into a new top-level function, generaliseTcTyCon. Plus a new Note to explain it Note [Inferring kinds for type declarations].
  • We decided (Trac Trac #15592) not to treat class type variables specially when dealing with Inferred/Specified/Required for associated types. That simplifies things quite a bit. I also rewrote Note [Required, Specified, and Inferred for types]
  • Some refactoring in kcLHsQTyVarBndrs (the CUSK case); it is easier because of our new decision not to treat the class variables specially
  • I now deal with the "naughty quantification candidates" of the previous patch in candidateQTyVars, rather than in quantifyTyVars; see Note [Naughty quantification candidates] in TcMType.

    I also killed off closeOverKindsCQTvs in favour of the same strategy that we use for tyCoVarsOfType: namely, close over kinds at the occurrences.

    And candidateQTyVars no longer needs a gbl_tvs argument.
  • Passing the ContextKind, rather than the expected kind itself, to tc_hs_sig_type_and_gen makes it easy to allocate the expected result kind (when we are in inference mode) at the right level.
  • I moved all the error checks from tcTyClTyVars (which was a bizarre place for it) into generaliseTcTyCon and/or the CUSK case of kcLHsQTyVars. Now tcTyClTyVars is extremely simple.

Smaller things

  • kcLHsQTYVarBndrs had strange stuff about "bind unless scoped". I could not see any justification for this, so I simply removed it all. Much simpler now.
  • I rejigged the error message in checkValidTelescope; I think it's quite a bit better now.
  • checkValidType was not rejecting constraints in a kind signature forall (a :: Eq b => blah). blah2 That led to further errors when we then do an ambiguity check. So I make checkValidType reject it more aggressively.
  • I killed off quantifyConDecl, instead calling kindGeneralize directly.
  • I fixed an outright bug in tyCoVarsOfImplic, where we were not colleting the tyvar of the kind of the skolems

Some other oddments

  • Refactoring around the trace messages from reportUnsolved.
  • A bit of extra tc-tracing in TcHsSyn.commitFlexi
Test Plan

./validate

goldfire created this revision.Nov 7 2018, 11:36 AM

Out of time now; more later. (Note to self: I read through files before TcHsType, kcImplicitTKBndrs but not kcLHsQTyVars; also read through TcMType)

compiler/typecheck/TcHsType.hs
233

Presumably you moved this here so that the level numbers work out. But I think it was better beforehand -- the kind of this type surely can't mention local variables, can it? For example, what if we have [t| forall (r :: RuntimeRep) (a :: TYPE r). a |]? We want to get a skolem-escape error here, and normally it's the level numbers that help us. If we allow the expected kind to have a bumped level-number, then rubbish like that will be accepted.

I'd undo this change, but I imagine you had a good reason for doing it, so there may be something to debate here.

303

Ah. I see how passing in a ContextKind is useful. But I still think that the meta-var generation should be done outside the tcImplicitTKBndrs in tc_hs_sig_type_and_gen.

1432

No reference to this Note.

1749

This refactoring is making kcImplicitTKBndrs awfully like tcImplicitTKBndrs. Indeed, the only real difference is that the latter sorts the tyvars while the former doesn't. (There's also the latter's call to checkTvConstraints; see next comment.) So perhaps we have one function that does scoping and another that does scoping-and-sorting, where the second calls the first.

1758

This smells wrong to me (in the CUSK case, which newly goes via this code path). It doesn't bump the level, like tcImplicitTKBndrs does. That means that an outer tyvar might unify with one brought into scope here. I'm failing to come up with an example, essentially because type declarations are always top-level. So maybe it doesn't bite, but it goes against the general principle that bringing skolems into scope should always bump the level and create an implication constraint.

Actually, that last piece may be more telling: could you get into a situation with a panic and no skolem info without the call to checkTvConstraints?

Note: we don't want to call checkTvConstraints in the non-CUSK case, as that would reject class C a where type F (a :: k), as we need the kappa from the class head to unify with the k in the type family. So this is a bit delicate.

1762

This is awkward. Sometimes, new_tv is newFlexiKindedQTyVar, which also does a lookup. Maybe newFlexiKindedQTyVar should take a new_tv function which is either newSkolemTyVar or newTyVarTyVar as appropriate. This would make newFlexiKindedQTyVar nicely parallel with the existing newFlexiKindedTyVar.

compiler/typecheck/TcMType.hs
1212

This repetition of code can be easily refactored, no?

1362

This isn't part of the above Note. Should it be made into a fresh top-level Note?

1415

Should this be an ASSERT? More likely to be noticed.

More generally, why not just use the new technique? I suppose something is stopping you from committing to it.

1452

Is this the "note to tidy up"?

compiler/typecheck/TcSimplify.hs
2015–2018

... in TcMType

I should add that this patch validates.

Thanks! I've responded

compiler/typecheck/TcHsType.hs
233

I think the particular case of [t| <type> |] isnt' relevant; I don't think we ever convert that from a HsType to a Type. Do we? Indeed, searching for TcBrackCtxt suggests that it's entirely unused.

233

It's hard to find uses of AnyKind or OpenKind, which are the only times this matters. Especially the latter -- when iis it used?

But for AnyKind consider

data T = ... deriving (  T (forall a. blah) )

The fact that this is impredicative is not relevant here. We'll typecheck this with tcHsDeriv which calls tc_hs_sig_type_and_gen. The latter will push the level (to, say, level 1) and start checking (T ...) :: kappa[1]. When we encounter the forall we'll push the level again (to 2) and typecheck blah. And indeed now kappa[1] will not unify with the skolem a[2].

But it's crucial that kappa[1] has level 1, else it won't be touchable and we won't be able to unify it with T's result kind.

It's a bit like

f = e

When inferring the type of f, we

  • Push the level to (say) 1
  • Allocate a unification variable alpha[1] at the new level
  • Check that e :: alpha[1]
  • Generalise

In short: I think it's right as-is. Remember, tc_hs_sig_type_and_gen is NOT where we bind skolems (that happens if we hit a forall inside); it's where we get ready to generalise.

If that makes sense, how could we turn it into a Note?

303

I disagree, as noted with tc_hs_sig_type_and_gen.

1432

Yes I think it was a failed attempt; I'll delete it

1749

That's a great idea.

1758

general principle

Yes that's the principle, but we we want to bump once for all these variables: implicit (specified) and explicit. So I've carefully separated the bump level/solve-equalities stuff from the bring-tyvars-into scope stuff. That why all the scopeTyVars calls have gone, dramatically reducing the number of (entirely redundant) solveEqualities calls.

1758

I don't understand what you mean about checkTvConstraints.

1762

Ha! You are right!

I think we can simply do

kcImplicitTKBndrsSkol = kcImplicitTKBndrsX newFlexiKindedSkolemTyVar

and kill off newFlexiKindedQTyVar. Hurrah!

compiler/typecheck/TcMType.hs
1212

There are several fiddly differences, so I'm not sure it would be "easy"; and worse, the resulting abstraction might be more confusing that the repetition. Would you like to have a go?

1362

I think it's a vestige of history.

In that example, both have CUSKs, so no unification variables are insight.

If the class is non-CUSK, neither is the type; and it's all covered in detail by Note [Inferring kinds for type declarations] in TcTyClsDecls.

Do you agree?

1415

I'm not sure that all the warnings are gone yet. This patch is just getting ready to switch it on. When it's warning free I'll delete all the gbl-tyvar stuff.

1452

Actually I'm not sure that this case is ever need these days. ( I didn't change this code.)

compiler/typecheck/TcSimplify.hs
2015–2018

Actually it's now Note [Inferring kinds for type declarations] in TcTyClsDecls. I should change those out of date references.

One more thing. I've come across your Note [Use TyVarTvs in kind-checking pass] in TcTyClsDecls. I have not changed anything to do with this.

At first I thought it was on the same topic as Note [Inferring kinds for type declarations], but it isn't. Your note is just about the very special case of kcConDecl for a non-CUSK data type. Correct.

In which case I think the unique place that special magic must happen is in the kcExplicitTKBndrs (and kcImplictTKBndrs) for kcConDecl in the GADT case.

I think it's just there -- nowhere else has this odd behaviour. So the scope of the Note is very narrow. Do you agree?

goldfire updated this revision to Diff 18669.Nov 12 2018, 10:46 AM

More commits from Simon:

  • Comments and alpha-renaming
  • Comments only
  • Progress
  • Combine kcImplicitTKBndrs and tcImplicitTKBndrs
  • Get rid of kcLHsQTyVarBndrs
  • Start to eliminate tcFamTyPats
goldfire updated this revision to Diff 18768.Nov 19 2018, 8:28 AM

More updates from Simon.

  • Tc-tracing, and order of arguments only
  • WIP on using level numbers for generalisation
  • More progress on using level numbers for gen
  • Further work on TyCon generalisation
  • Further progress
  • More progress
  • More progress
  • More progress
  • Comemnts only
  • Finally done
  • Comments and alpha-renaming
  • Comments only
  • Progress
  • Combine kcImplicitTKBndrs and tcImplicitTKBndrs
  • Get rid of kcLHsQTyVarBndrs
  • Start to eliminate tcFamTyPats
  • More progress on tcFamTyPats
  • Wibbles
  • Much more progress on tcFamTyPats
  • Data family instances working, I think
  • Print tycon arity in -ddump-types
  • More progress with data instances
  • Wibbles
  • More progress on reporting unbound variables
  • Nearly there...
  • Better validity checks, simplification
  • Finally, validate-clean
  • Simplify typing of associated family instances

More later!

compiler/typecheck/TcDeriv.hs
786

Why this change? Should it be commented?

compiler/typecheck/TcHsType.hs
234

Did we decide to move this outside the bindImplicitTKBndrs call?

863

I'm not yet convinced we can drop this argument. We need it for associated type instances.

1001

Yes, this is an improvement.

1373

Where did this go? This is an important Note.

simonpj added inline comments.Nov 19 2018, 10:58 AM
compiler/typecheck/TcDeriv.hs
786

It's because the two might not be the same.

The tc and tc_args passed into deriveTyData come from deriveClause which in turn gets them from tyConFamInstSig_maybe, which in turn gets them from DataFamInstTyCon:

| DataFamInstTyCon          -- See Note [Data type families]
      (CoAxiom Unbranched) 
      TyCon   -- The family TyCon
      [Type]  -- Argument types (mentions the tyConTyVars of this TyCon)
              -- No shorter in length than the tyConTyVars of the family TyCon
              -- How could it be longer? See [Arity of data families] in FamInstEnv

Notice that the arg tys might not be the same as the tycon arity (= length tyConTyVars).

I agree this needs an explanation but first let's check you agree.

compiler/typecheck/TcHsType.hs
234

No -- remember that bindImplicitTKBndrs does not push level now!

1373

It went into Note [Inferring kinds for type declarations] in TcTyClsDecls. Look ok?

compiler/typecheck/TcMType.hs
1452

I'll add an ASSERT

goldfire updated this revision to Diff 18806.Nov 21 2018, 9:28 AM

More from SPJ.

  • Tc-tracing, and order of arguments only
  • WIP on using level numbers for generalisation
  • More progress on using level numbers for gen
  • Further work on TyCon generalisation
  • Further progress
  • More progress
  • More progress
  • More progress
  • Comemnts only
  • Finally done
  • Comments and alpha-renaming
  • Comments only
  • Progress
  • Combine kcImplicitTKBndrs and tcImplicitTKBndrs
  • Get rid of kcLHsQTyVarBndrs
  • Start to eliminate tcFamTyPats
  • More progress on tcFamTyPats
  • Wibbles
  • Much more progress on tcFamTyPats
  • Data family instances working, I think
  • Print tycon arity in -ddump-types
  • More progress with data instances
  • Wibbles
  • More progress on reporting unbound variables
  • Nearly there...
  • Better validity checks, simplification
  • Finally, validate-clean
  • Simplify typing of associated family instances
  • Wibbles
  • More wibbles
  • More progress in tcFamTyPats
  • Wibbles
  • Wibble, to fix build

Sharing comments for call Simon and I are having.

compiler/typecheck/Inst.hs
504

... in TcHsType

505

I think this is somewhat delicate.

In checkExpectedKindX we absolutely do not want to zonk. That's because the result returned from this is compared against exp_kind. (You will want to look at the code there to make sense of this.) Suppose the equality check passes only because of the zonk. Then the returned type's kind will be unzonked, failing Note [The tcType invariant].

In tcTyVar#handle_tyfams, we surely do want to zonk, in order to uphold Note [The tcType invariant].

In tcFamTyPats, it's not clear, as the comments for the function don't specify whether the result is zonked yet.

compiler/typecheck/TcDeriv.hs
784

Not related to this patch, but: does this work if we have, e.g., C :: (forall k. k -> Type) -> Constraint? splitFunTys may need to be splitPiTys, or we may need to check for the higher-kinded case and fail.

I tried it with this:

class C (a :: forall k. k -> Type)

data P :: forall k. k -> Type
  deriving C

That fails with

• Expected kind ‘k0’,
    but ‘C’ has kind ‘(forall k. k -> *) -> Constraint’
• In the data declaration for ‘P’

Why this error? Because of the treatment in newExpectedKind to produce a kappa for an AnyKind type, as provided in tcHsDeriv. That's a bit removed for my tastes.

But then, I got lost trying to figure out if this approach to AnyKind fails elsewhere. (Really, AnyKind means AnyMonoKind. Perhaps it should be renamed.) It doesn't... but only because expectedKindInCtxt never actually returns AnyKind, as far as I can tell. Trace the calls to tcHsSigType, which is the only call site of expectedKindInCtxt. I believe tcHsSigType will never see a UserTypeCtxt that leads to AnyKind. So I think some refactoring may help clarify around here.

I am worried about something like [t| C |], which should certainly work, regardless of C's intricate kind. But, of course, the contents of untyped brackets don't get type-checked, so we're OK... despite what expectedKindInCtxt suggests for ThBrackCtxt. Actually, ThBrackCtxt is never used, anywhere!

Looking for trouble elsewhere, I checked GhciCtxt. That is used only in validation, never in checking, so it's OK too. Same with TypeAppCtxt. Urgh.

786

Yes, I agree with this change. I'm surprised this hasn't bitten.

compiler/typecheck/TcHsType.hs
1526

I think this zonk is unnecessary. candidateQTyVarsOfKinds says that its input needn't be zonked.

1536

s/qtkvs/inferred/

Thanks for the help. Some rejoinders inline

compiler/typecheck/Inst.hs
504

done

505

I don't grok this yet. Once the patch is committed, might you add some clarifying notes?

compiler/typecheck/TcDeriv.hs
784

Richard, might you make a ticket for this if you think it's important?

compiler/typecheck/TcHsType.hs
1526

Good catch

1536

Done!

goldfire added inline comments.Nov 24 2018, 1:26 PM
compiler/typecheck/Inst.hs
505

It's not just a matter of commentary. I think the zonk is wrong. The caller can be held responsible for zonking if they think it necessary.

compiler/typecheck/TcDeriv.hs
784

I went to do this, but it's really this patch that makes the mess -- specifically, expectedKindInCtxt is defined over many arguments which it never sees. If you just want to push this through, though, we can always clean it up later.

compiler/typecheck/TcHsType.hs
1838

I'm not sure I love the reuse of ContextKind here. In tcHsSigType, it specifies an expected kind, something to look for. Here, it specified a default kind. These are different! In particular, in tcHsSigType, AnyKind accepts only monokinds. But, of course, we can have a polykind on a type variable here in tcHsQTyVarBndr (if the user has written a kind signature).

In addition, one of the three ContextKinds is never used here: OpenKind never gets seen here. I think it's cleaner to consider these cases separately.

simonpj added inline comments.Nov 24 2018, 3:21 PM
compiler/typecheck/Inst.hs
505

Fine! There are three calls to tcInstTyBinders:

  • In checkExpectedKindX
  • In tcTyFamInstEqnGuts (make sure you are looking at the most recent wip/T15809 branch
  • In the handle_tyfams case of tcTyVar

Would you venture an opinion about which of these need a zonk?

compiler/typecheck/TcDeriv.hs
784

I'm a bit lost. The only change I made here was to fix a lurking bug, as described above. What's this business about expectedKindInCtxt?

compiler/typecheck/TcHsType.hs
1838

Hmm. Previously it was a Bool, which is rather uninformative. And newExpectedKind does just what we want. So it was convenient. Whether it is worth defining a new two-constructor data type, and an equivalent of newExpectedKind I'm not sure. I'm inclined just to document it better.

goldfire added inline comments.Nov 24 2018, 10:25 PM
compiler/typecheck/Inst.hs
505

In tcTyFamInstEqnGuts: Yes, zonk. This is necessary because the rhs_kind gets passed to tcCheckLHsType, which then passes it to ..., which then passes it to checkExpectedKind, which inspects the type without zonking. This should be documented better.

In checkExpectedKindX: No, do not zonk. Examine checkExpectedKindX. Suppose act_kind is (alpha ~ Bool) => alpha and exp_kind is Bool. The tcInstTyBinders will end up unifying alpha := Bool. If we zonk, then the eqType act_kind' exp_kind check (line 1037) will be True. Yet, ty' might have type alpha, not Bool. We need to supply a "naked cast" to get the type of ty' to be Bool.

In handle_tyfams: Yes, zonk. This is necessary to uphold the tcType invariant.

All of this gets me thinking: what would you say to

newtype Zonked a = Zonked { forgetZonk :: a }

Then, various functions can accept Zonked TcType and such. This type would be fully concrete; anyone can inject and project at will. But doing this will give us more checked documentation around where we expect zonking.

compiler/typecheck/TcDeriv.hs
784

Ah. I thought expectedKindInCtxt was new to this patch. I see it isn't. I take back my claim.

compiler/typecheck/TcHsType.hs
1838

Documenting it better would work, I suppose, but I'm still not thrilled.

I saw bindExplicitTKBndrs_Q_Skol AnyKind somewhere and got worried that the AnyKind would mean that we invent a kappa for the kind of KindedTyVars and then unify said kappa with the user-written kind. This would be wrong, of course, because users can write polykinds. Of course, the ContextKind here is used only for UserTyVars. This is quite different from the use of ContextKind in tcHsSigType, where a kappa is introduced for AnyKind regardless of what the user has written.

So: my suggestion for a change here wasn't from my usual pedantry, but from actual confusion that arose while reading the code.

What is the status of this, @goldfire?

What is the status of this, @goldfire?

It's all in HEAD. So we can abandon this diff.

PS the commit is

commit 2257a86daa72db382eb927df12a718669d5491f8
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date:   Wed Nov 28 16:06:15 2018 +0000

    Taming the Kind Inference Monster
bgamari abandoned this revision.Jan 22 2019, 7:46 PM