Allow GeneralizedNewtypeDeriving for classes with associated type families
ClosedPublic

Authored by RyanGlScott on Oct 25 2016, 12:46 PM.

Details

Summary

This implements the ability to derive associated type family instances
for newtypes automatically using GeneralizedNewtypeDeriving. Refer to the
users' guide additions for how this works; I essentially follow the pattern
laid out in https://ghc.haskell.org/trac/ghc/ticket/8165#comment:18.

Fixes Trac #2721 and Trac #8165.

Test Plan

./validate

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
RyanGlScott retitled this revision from to Allow GeneralizedNewtypeDeriving for classes with associated type families.
RyanGlScott updated this object.
RyanGlScott edited the test plan for this revision. (Show Details)
RyanGlScott updated the Trac tickets for this revision.
RyanGlScott added inline comments.
compiler/typecheck/TcDeriv.hs
382

@mpickering, do you mind giving this Note a read-through to see if it makes sense? We discussed this on IRC a while back when working on Trac #12144 (turns out I needed this trick here as well), and I want to make sure others can follow along, since the details are a bit tricky.

goldfire edited edge metadata.Oct 25 2016, 2:22 PM

See comments below. Good Note about staging!

compiler/typecheck/TcDeriv.hs
233

Why is this change necessary? I would imagine that ditching the old behavior would lead to some failures where we didn't have them before, where a standalone-deriving instance allows GHC to simplify an "exotic" context to a mundane one.

EDIT: OK. Now I see why this is necessary. But it still looks like it could cause trouble.

EDIT: Ah. I do believe I understand. Because you're not calling simplifyInstanceContexts until a bit later, it's all OK. Plausible.

242

This is icky. You need the suspended computation only for the infer_specs, and yet you have to do this back-twisting work for the given_specs. Could you have two wrappers for genInst, one that deals with known constraints and one with unknown ones?

263

This is icky, too! At this point, we really only care about the ds_theta fields of the specs. Right?

Arrange for simplifyInstanceContexts just to spit out the [ThetaType] it finds. Then you can just use zipWithM ($) instead of apply_inst_infos.

If you remove the ds_theta fields from a DerivSpec, then you can also easily use zipWithM ($) to sort out the given_specs, above.

382

Note makes sense to me.

425

missing "have"

440

This is really continuation-returning style.

1317

Are quite sure that nullary classes can't end up here? If so, why?

1413

s/C a/(C a)/

1417

We don't know that will be the context, which is inferred. Actually, with the method-free C, you'd get no context.

1421

We could relax this constraint, actually. Here's the example:

class C a b where
  type F a
  -- other stuff

-- other stuff

newtype N = MkN τ deriving (C σ)

For GND to work at all, there must exist an instance C σ τ. It has some definition for type F σ = ρ. Just use the same one in the instance for C σ N! Type family instance are allowed to overlap, as long as the right-hand sides are coincident. That's precisely what would happen here. We'd need to handle (and document) the last-class-variable-mentioned case separately from the last-class-variable-not-mentioned case, but we needn't just give up.

compiler/typecheck/TcGenDeriv.hs
1670

Shouldn't in_scope also contain the tvs in cls_tys? I'm surprised ASSERTs in TyCoRep don't fire.

1673

The rhs_ty is eta-reduced. I think this will be wrong if eta-reduction is in play.

1716–1717

Why not just put this in Util?

testsuite/tests/deriving/should_compile/T8165.hs
5

Be sure also to test an eta-reduced newtype, like newtype N a = MkN (Either Int a).

goldfire requested changes to this revision.Oct 25 2016, 2:22 PM
goldfire edited edge metadata.
This revision now requires changes to proceed.Oct 25 2016, 2:22 PM
mpickering added inline comments.Oct 25 2016, 2:41 PM
compiler/typecheck/TcDeriv.hs
382

Good note.

simonpj accepted this revision.Oct 25 2016, 3:53 PM
simonpj edited edge metadata.

I trust you two on this. The basic idea seems sound.

Simon

RyanGlScott updated this revision to Diff 9153.Oct 25 2016, 3:57 PM
RyanGlScott edited edge metadata.
RyanGlScott marked 5 inline comments as done.
  • RAE's suggestions
RyanGlScott added inline comments.Oct 25 2016, 3:57 PM
compiler/typecheck/TcDeriv.hs
242

Could you have two wrappers for genInst, one that deals with known constraints and one with unknown ones?

I suppose I could have genInstInfer and genInstGiven, but the only difference would be whether you invoke apply_inst_infos on the first field of the triple or not. I'm not sure how that makes it less "icky".

263

If you remove the ds_theta fields from a DerivSpec, then you can also easily use zipWithM ($) to sort out the given_specs, above.

...is this not what apply_inst_infos is already doing?

apply_inst_infos :: [ThetaType -> TcM (InstInfo RdrName)]
                 -> [DerivSpec ThetaType] -> TcM [InstInfo RdrName]
apply_inst_infos = zipWithM (\f ds -> f (ds_theta ds))
1317

Yes. If you have a class with no type variables:

class Empty

Then you can't derive an Empty instance for a newtype:

newtype Foo = MkFoo Int deriving Empty
• ‘Empty’ is not a unary constraint, as expected by a deriving clause
• In the newtype declaration for ‘Foo’

And you can't standalone-derive it either:

deriving instance Empty Foo
• Expecting one fewer arguments to ‘Empty’
  Expected kind ‘* -> Constraint’, but ‘Empty’ has kind ‘Constraint’
• In the stand-alone deriving instance for ‘Empty Foo’

So you shouldn't be able to reach this code with a nullary class.

1417

We don't know that will be the context, which is inferred. Actually, with the method-free C, you'd get no context.

That's not true, though? GND always wraps the class around the representation type and simplifies, and C a x is what you get. (I even ran this program through GHC with -ddump-deriv, and it gives the same thing.)

If you didn't have the C a x context, then the associated type family instances's RHS (were it to make any sense in this example) wouldn't reduce.

1421

If I understand your proposal, you'd want this code:

class C a b where
  type F a

instance C Bool Int where
  type F Bool = ()

newtype N = MkN Int deriving (C Bool)

to generate the following code?

instance C Bool N where
  type F Bool = ()

I can see this working for monomorphic types, but what happens in this case?

newtype N2 a = MkN2 a deriving (C Bool)

Then you'd have something like:

instance C Bool a => C Bool (N2 a) where
  type F Bool = ??

We don't have the C Bool a instance information a priori, so I don't know what we'd fill in for ???.

compiler/typecheck/TcGenDeriv.hs
1670

Let me first state that I adopted this substitution code pretty much verbatim from mkCoerceClassMethEqn, which you wrote :)

It turns out that inst_tvs is somewhat misleadingly named, now that I went back and looked at where it comes from. It comes from the definition of mkNewTypeEqn:

dfun_tvs = tyCoVarsOfTypesWellScoped inst_tys
inst_tys = cls_tys ++ [inst_ty]

And dfun_tvs is what is referred to as inst_tvs here, so it turns out that "inst_tvs" actually contains the type variables in cls_tys too. We should probably give this another name... cls_and_inst_tvs?

1673

Hm... again, this is exactly what happens in mkCoerceClassMethEqn. Moreover, it seems to do the right thing—I added a test for newtype N a = MkN (Either Int a), which eta-reduces, and it works out fine.

I also tried printing out what rhs_ty is at this moment in time, and GHC told me it's Either Int a_a14A, which would seem to suggest it's not eta-reduced. Why, I cannot say... but if it works for mkCoerceClassMethEqn, it works for me :)

goldfire added inline comments.Oct 25 2016, 8:07 PM
compiler/typecheck/TcDeriv.hs
263

I was proposing something more radical. The ds_theta field isn't a great fit for a DerivSpec. I'm suggesting to remove it (and the type parameter to DerivSpec) and instead have the InferTheta and GivenTheta constructors for EarlyDerivSpec take extra fields instead.

genInst continues to take a DerivSpec, just as now. But there would be another function, defined right afterward genInstGivenTheta :: DerivSpec -> ThetaType -> TcM (InstInfo RdrName, ...) that calls genInst and does the application. I see what you mean about this being almost as icky, but by having the application done in a function defined with genInst -- instead of at the other end of the module -- makes it less troublesome (to me).

With this design, your apply_inst_infos gets type [ThetaType -> TcM (InstInfo RdrName)] -> [ThetaType] -> TcM [InstInfo RdrName] and really can be zipWithM ($). I agree with the decision to use "continuation-returning style" here, but hiding the entrails as much as possible seems like a step in the right direction.

1421

Bah. Good point.

compiler/typecheck/TcGenDeriv.hs
1670

Ah. OK. No, I don't think inst_tvs needs to be renamed. Just clarify the comment. I'm satisfied here. (And just because you copied it from me does not mean it's correct!)

1673

Seems the "already eta-reduced" bit is wrong then. Poking around the code verifies this claim. Remove that from the commentary.

RyanGlScott marked 17 inline comments as done.
RyanGlScott edited edge metadata.
  • Documentation issues
RyanGlScott added inline comments.Oct 28 2016, 10:12 AM
compiler/typecheck/TcDeriv.hs
263

Hm. I certainly could remove ds_theta from DerivSpec, but there's an issue in that DerivSpec also contains a ds_tvs field intended to bind all of the type variables in ds_theta. If ds_theta is removed, then ds_tvs might be scoping over variables that aren't mentioned anywhere else in the DerivSpec!

I suppose I could factor out both ds_tvs and ds_theta into another datatype:

data DerivSpecTheta theta = DST { dst_tvs :: [TyVar], dst_theta :: theta, dst_spec :: DerivSpec }

But now I'd have to plumb around the tyvars in the same way I have to plumb around the theta in tcDeriving. This would mean, for instance, that genInst would now return TcM (ThetaType -> [TyVar] -> TcM (InstInfo RdrName), ...).

I don't know if I like the sound of that, to be honest.

I'll relent on the refactoring point. Are there other outstanding issues?

compiler/typecheck/TcDeriv.hs
263

Perhaps you're right. In any case, this is a minor refactoring issue that can always be returned to later. Let's not let it hold up the rest of this good work.

RyanGlScott updated this revision to Diff 9310.Nov 5 2016, 1:21 PM
RyanGlScott marked 7 inline comments as done.
RyanGlScott edited edge metadata.

Rebase on top of master

I'll relent on the refactoring point. Are there other outstanding issues?

I believe I've addressed all of your concerns.

There was one minor thing I discovered recently. You brought up an interesting point in https://phabricator.haskell.org/D2636#inline-21759:

Actually, with the method-free C, you'd get no context.

As I noted in https://phabricator.haskell.org/D2636#inline-21769, even if you derived a class C with no methods or associated classes whatsoever with GND, GHC will still stick a C context onto the derived instance. That made me wonder: just under what circumstances would GHC deduce that a GND-derived context is redundant? To my horror, I found out even with a class like this:

{-# LANGUAGE GeneralizedNewtypeDeriving, TypeFamilies #-}
class C a where
  type T a

newtype Identity a = Identity a deriving C

If you compile this with -Wredundant-constraints, GHC will generate code like this:

instance C a => C (Identity a) where
  type T (Identity a) = T a

But will emit a warning!

• Redundant constraint: C a
• In the instance declaration for ‘C (Identity a)’

This definitely feels off, because the RHS of T (Identity a) (i.e., T a) definitely relies on a T a instance being available in order to reduce the type family. And the T a instance won't be available unless there's a C a instance! So I don't think the C a constraint is redundant.

But in any case, this might be an orthogonal issue. I've opened Trac #12810 to track this. (There's also the question of what should GND do if we're deriving a class with no methods or associated classes, but I'm not sure what the answer should be.)

This revision was automatically updated to reflect the committed changes.