Taming the Kind Inference Monster
Concern Raised2257a86daa72

Authored by simonpj on Nov 28 2018, 10:06 AM.

Description

Taming the Kind Inference Monster

My original goal was (Trac Trac #15809) to move towards using level numbers
as the basis for deciding which type variables to generalise, rather
than searching for the free varaibles of the environment. However
it has turned into a truly major refactoring of the kind inference
engine.

Let's deal with the level-numbers part first:

  • Augment quantifyTyVars to calculate the type variables to quantify using level numbers, and compare the result with the existing approach. That is; no change in behaviour, just a WARNing if the two approaches give different answers.
  • To do this I had to get the level number right when calling quantifyTyVars, and this entailed a bit of care, especially in the code for kind-checking type declarations.
  • However, on the way I was able to eliminate or simplify a number of calls to solveEqualities.

This work is incomplete: I'm not /using/ level numbers yet.
When I subsequently get rid of any remaining WARNings in
quantifyTyVars, that the level-number answers differ from
the current answers, then I can rip out the current
"free vars of the environment" stuff.

Anyway, this led me into deep dive into kind inference for type and
class declarations, which is an increasingly soggy part of GHC.
Richard already did some good work recently in

commit 5e45ad10ffca1ad175b10f6ef3327e1ed8ba25f3
Date:   Thu Sep 13 09:56:02 2018 +0200

 Finish fix for #14880.

 The real change that fixes the ticket is described in
 Note [Naughty quantification candidates] in TcMType.

but I kept turning over stones. So this patch has ended up
with a pretty significant refactoring of that code too.

Kind inference for types and classes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  • Major refactoring in the way we generalise the inferred kind of a TyCon, in kcTyClGroup. Indeed, I made it into a new top-level function, generaliseTcTyCon. Plus a new Note to explain it Note [Inferring kinds for type declarations].
  • We decided (Trac Trac #15592) not to treat class type variables specially when dealing with Inferred/Specified/Required for associated types. That simplifies things quite a bit. I also rewrote Note [Required, Specified, and Inferred for types]
  • Major refactoring of the crucial function kcLHsQTyVars: I split it into kcLHsQTyVars_Cusk and kcLHsQTyVars_NonCusk because the two are really quite different. The CUSK case is almost entirely rewritten, and is much easier because of our new decision not to treat the class variables specially
  • I moved all the error checks from tcTyClTyVars (which was a bizarre place for it) into generaliseTcTyCon and/or the CUSK case of kcLHsQTyVars. Now tcTyClTyVars is extremely simple.
  • I got rid of all the all the subtleties in tcImplicitTKBndrs. Indeed now there is no difference between tcImplicitTKBndrs and kcImplicitTKBndrs; there is now a single bindImplicitTKBndrs. Same for kc/tcExplicitTKBndrs. None of them monkey with level numbers, nor build implication constraints. scopeTyVars is gone entirely, as is kcLHsQTyVarBndrs. It's vastly simpler.

    I found I could get rid of kcLHsQTyVarBndrs entirely, in favour of the bnew bindExplicitTKBndrs.

Quantification
~~~~~~~~~~~~~~

  • I now deal with the "naughty quantification candidates" of the previous patch in candidateQTyVars, rather than in quantifyTyVars; see Note [Naughty quantification candidates] in TcMType.

    I also killed off closeOverKindsCQTvs in favour of the same strategy that we use for tyCoVarsOfType: namely, close over kinds at the occurrences.

    And candidateQTyVars no longer needs a gbl_tvs argument.
  • Passing the ContextKind, rather than the expected kind itself, to tc_hs_sig_type_and_gen makes it easy to allocate the expected result kind (when we are in inference mode) at the right level.

Type families
~~~~~~~~~~~~~~

  • I did a major rewrite of the impenetrable tcFamTyPats. The result is vastly more comprehensible.
  • I got rid of kcDataDefn entirely, quite a big function.
  • I re-did the way that checkConsistentFamInst works, so that it allows alpha-renaming of invisible arguments.
  • The interaction of kind signatures and family instances is tricky. Type families: see Note [Apparently-nullary families] Data families: see Note [Result kind signature for a data family instance] and Note [Eta-reduction for data families]
  • The consistent instantation of an associated type family is tricky. See Note [Checking consistent instantiation] and Note [Matching in the consistent-instantation check] in TcTyClsDecls. It's now checked in TcTyClsDecls because that is when we have the relevant info to hand.
  • I got tired of the compromises in etaExpandFamInst, so I did the job properly by adding a field cab_eta_tvs to CoAxBranch. See Coercion.etaExpandCoAxBranch.

tcInferApps and friends
~~~~~~~~~~~~~~~~~~~~~~~

  • I got rid of the mysterious and horrible ClsInstInfo argument to tcInferApps, checkExpectedKindX, and various checkValid functions. It was horrible!
  • I got rid of [Type] result of tcInferApps. This list was used only in tcFamTyPats, when checking the LHS of a type instance; and if there is a cast in the middle, the list is meaningless. So I made tcInferApps simpler, and moved the complexity (not much) to tcInferApps.

    Result: tcInferApps is now pretty comprehensible again.
  • I refactored the many function in TcMType that instantiate skolems.

Smaller things

  • I rejigged the error message in checkValidTelescope; I think it's quite a bit better now.
  • checkValidType was not rejecting constraints in a kind signature forall (a :: Eq b => blah). blah2 That led to further errors when we then do an ambiguity check. So I make checkValidType reject it more aggressively.
  • I killed off quantifyConDecl, instead calling kindGeneralize directly.
  • I fixed an outright bug in tyCoVarsOfImplic, where we were not colleting the tyvar of the kind of the skolems
  • Renamed ClsInstInfo to AssocInstInfo, and made it into its own data type
  • Some fiddling around with pretty-printing of family instances which was trickier than I thought. I wanted wildcards to print as plain "_" in user messages, although they each need a unique identity in the CoAxBranch.

Some other oddments

  • Refactoring around the trace messages from reportUnsolved.
  • A bit of extra tc-tracing in TcHsSyn.commitFlexi

This patch fixes a raft of bugs, and includes tests for them.

Details

Auditors
RyanGlScott
Committed
simonpjNov 29 2018, 11:27 AM
Parents
rGHC79d5427e1f9d: Hadrian: support dynamically linking ghc
Branches
Unknown
Tags
Unknown
RyanGlScott raised a concern with this commit.Nov 29 2018, 11:39 AM
RyanGlScott added a subscriber: RyanGlScott.
RyanGlScott added inline comments.
/testsuite/tests/indexed-types/should_fail/T14179.stderr
10

Apologies for not noticing this before it landed, but this feels like a regression in error message quality. What happened to those arguments to Foo2? In the test case, we have:

data instance Foo2 :: * -> *
data instance Foo2 :: * -> *

Did this patch get rid of the eta-expansion business for data families somehow?

This commit now has outstanding concerns.Nov 29 2018, 11:39 AM
RyanGlScott added inline comments.Nov 29 2018, 7:00 PM
/testsuite/tests/partial-sigs/should_compile/DataFamilyInstanceLHS.stderr
9

Similarly, my improvements in 014d6c1f08808c4dab6cba80efdc634527d91086 appear to have been washed away here.

16

I deliberately wanted to avoid this sort of thing (having a wildcard appear on the RHS) from happening in 014d6c1f08808c4dab6cba80efdc634527d91086.

simonpj added inline comments.Nov 30 2018, 7:10 AM
/testsuite/tests/indexed-types/should_fail/T14179.stderr
10

It's still there but more accurate.

Now it reports the instances exactly as you wrote them, in this case with no variables before the "::". If you has

data family T :: Type -> Type -> Type -> Type -> Type

data instance T Int where ...
data instance T Int where ...

then the conflicting instances will be reported as T Int vs T Int, not T Int a b c d vs T Int a b c d, which you dind't write.

I think it's quite good actually :-).

/testsuite/tests/partial-sigs/should_compile/DataFamilyInstanceLHS.stderr
9

This is debug printing, showing the ground truth about the axiom. That's important for debugging, and I deliberately made it so! I think the GHCI :info should be ok though?

16

Ah, that is true, and I thought I'd fixed that. Thanks. I wonder what happens if there are two parameters!!

I'm also not sure that it makes sense to show *both* "COERCION AXIOMS" and "FAMILY INSTANCES" in this debug print. What do you think?

RyanGlScott added inline comments.Nov 30 2018, 7:56 AM
/testsuite/tests/indexed-types/should_fail/T14179.stderr
10

Very well.

/testsuite/tests/partial-sigs/should_compile/DataFamilyInstanceLHS.stderr
9

OK. If you're happy with it, then I suppose I am too. I just wanted to make sure that you were aware of the change.

16

Ah, that is true, and I thought I'd fixed that. Thanks. I wonder what happens if there are two parameters!!

It uses multiple wildcards, distressingly enough. If you have:

{-# LANGUAGE GADTs #-}
{-# LANGUAGE TypeFamilies #-}
{-# OPTIONS_GHC -ddump-types #-}
module Bug where

data family Foo a b
data instance Foo _ _ where
  MkFoo :: Foo a b

Then compiling this yields:

TYPE CONSTRUCTORS
  type role Foo nominal nominal
  Foo{2} :: * -> * -> *
COERCION AXIOMS
  axiom Bug.D:R:Foo__0 :: Foo = Bug.R:Foo__
DATA CONSTRUCTORS
  MkFoo :: forall a b. Foo a b
FAMILY INSTANCES
  data instance Foo _ _ = Bug.R:Foo__ _ _ -- Defined at Bug2.hs:7:15
Dependent modules: []
Dependent packages: [base-4.12.0.0, ghc-prim-0.5.3,
                     integer-gmp-1.0.2.0]

I'm also not sure that it makes sense to show *both* "COERCION AXIOMS" and "FAMILY INSTANCES" in this debug print. What do you think?

As you've pointed out previously, this is purely debugging output, so I don't really have a strong preference here. I suppose there is a difference in the way coercion axioms and family instances are displayed now, since in the former, you eta reduce, but in the latter, you don't. But that's a rather small difference.

If we do end up removing one, I think I'd be in favor of keeping the one that eta expands its variables, if only because that's closer to what the user actually wrote.