Injective type families
ClosedPublic

Authored by jstolarek on Sep 9 2014, 9:10 AM.

Details

Summary

This ticket is about implementing injective type families. See Trac #6018 and https://ghc.haskell.org/trac/ghc/wiki/InjectiveTypeFamilies

Test Plan

./validate

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
There are a very large number of changes, so older changes are hidden. Show Older Changes
jstolarek updated this revision to Diff 2526.Mar 24 2015, 11:22 AM
jstolarek updated this revision to Diff 2527.Mar 24 2015, 11:34 AM
  • Update User's Guide

I managed to squash the bugs, or at least their symptoms. I'm not too confident about my implementation so I'd like to ask for a careful review of getInjectivityList - only six lines of code. See inline comment.

Another thing worth reviewing would be the User's Guide entry for injectivity. I feel the wording is awkward. Perhaps someone who is a native speaker could suggest how to improve it.

compiler/hsSyn/HsDecls.hs
681–686

I'm not too confident with this. From my investigation it looks that:

  • type variables in the RHS of injectivity condition are the same type variables as the ones declared in type family head (ie. they have the same uniques)
  • but this is not always the case for kind variables - sometimes they have the same uniques and sometimes not (see my comment from March 20).

As a solution I decided to rely on stableNameCmp for comparing names. The problem is that NameSet and friends are all based on instances of Ord and Eq and these are based on uniques. The result is an ugly quadratic algorithm operating on list. In practice it shouldn't be harmful but I feel that this might only be working around a bug somewhere else in the code.

docs/users_guide/glasgow_exts.xml
6951–6952

I feel this wording is very awkward. How to improve it?

jstolarek updated this revision to Diff 2629.Apr 2 2015, 10:56 AM

Rebased on today's HEAD. Patch validates with only 3 failures:

  • one expected failure in injectivity implementation
  • two LLVM failures that are mostl likely the result of too old LLVM version on my machine. Sadly, my Linux is too old and I can't upgrade to supported LLVM version
jstolarek updated this revision to Diff 2834.Apr 23 2015, 2:25 AM
  • Refactor FamInstEnv.findBranch to create apartnessCheck.
  • Fix comment typos
  • Proper fix for Trac #123
  • Fix Trac #128 by doing apartness check
  • Comments, code reorganization
  • Document Trac #129

Richard, some time ago you showed me this code that might cause infinite loop if my implementation does the wrong thing:

data T a = MkT (F a)
type family F a where
  F (T a) = a
  F (T Int) = Bool

I believe I have fixed it back then but I just discovered that the problem is still present, perhaps after recent rebase. The previous fix went into TcTyClsDecls.tcTyClGroup and was about adding lazy pattern match on rec_tyclss on this line:

; tyclss <- fixM $ \ ~rec_tyclss -> do

I vaguely recall discussing some alternative solution. I think it was more difficult to implement so we went with this one because it worked. Do you remember what that alternative solution was?

I vaguely recall discussing some alternative solution. I think it was more difficult to implement so we went with this one because it worked. Do you remember what that alternative solution was?

I haven't a clue -- sorry. But what's going wrong?

I vaguely recall discussing some alternative solution. I think it was more difficult to implement so we went with this one because it worked. Do you remember what that alternative solution was?

I haven't a clue -- sorry. But what's going wrong?

The observable problem is that compilation does not finish but freezes. Most likely reason is too eager evaluation inside a knot.

jstolarek updated this revision to Diff 2837.Apr 23 2015, 10:35 AM
  • Consider infinite unification in injectivity check
  • Test mutual recursion betwen TFs and data types

I just had a moment of enlightenment and realized what is the cause of compiler going into a loop: dropping overlapped branches in TcTyFamDecls.tcFamDecl1. Removing this line:

branches' <- dropDominatedAxioms tc_name kind branches

and renaming subsequent use of branches' to branches makes the problem go away. I looked at implementation of TcTyClsDecls.dropDominatedAxioms but I can't see where things go wrong. I can only guess it is somewhere in tcMatchTys called from isDominatedBy. Richard, do you see a way out of this? I'd rather not give up on dropping unreachable equations.

I just had a moment of enlightenment and realized what is the cause of compiler going into a loop: dropping overlapped branches in TcTyFamDecls.tcFamDecl1. Removing this line:

branches' <- dropDominatedAxioms tc_name kind branches

and renaming subsequent use of branches' to branches makes the problem go away. I looked at implementation of TcTyClsDecls.dropDominatedAxioms but I can't see where things go wrong. I can only guess it is somewhere in tcMatchTys called from isDominatedBy. Richard, do you see a way out of this? I'd rather not give up on dropping unreachable equations.

Urgh. This is a tough one to fix.

The problem is that, in much of TcTyClsDecls, the code is executed "inside the knot". That is, the TyCons that represent the mutually recursive types being defined are still unformed, and looking at them causes GHC to hang. There is much delicate dancing that this fact causes. In particular, you can't look at any result of zonkTcTypeToType, because that function looks at TyCons. (You can still call the function inside the knot because of laziness. You just can't examine the result.) (The result of zonkTcType is OK, though, because zonkTcType is very careful never to look at a TyCon.) The types in the branches are indeed zonked, so isDominatedBy examines zonked types, causing GHC to hang.

The only way I can see to fix this is to change tcTyFamInstEqn to take as input some representation of the type-space that has already been taken up. (That is, if an equation follows equations for Int and Maybe a, then we need to know that those points in the "type-space" are already claimed.) Then, you can do the domination check earlier. But you must be careful never to look at a TyCon! It all sounds far too complicated to be worth it, to me.

Perhaps Simon has another approach, but I don't think much effort should be spent in ironing out this bizarre corner case which should always be avoidable by programmers.

austin requested changes to this revision.May 10 2015, 6:24 PM

(Moving to Request Changes, since it seems this is getting there with a bit more work!)

This revision now requires changes to proceed.May 10 2015, 6:24 PM
jstolarek updated this revision to Diff 2945.May 15 2015, 6:55 AM
  • Function renaming
  • Implementation of U
  • Debugging U, adding tests
jstolarek planned changes to this revision.May 15 2015, 7:26 AM

I've run into some problems when implementing the algorithm U and injectivity check we described in the paper. Here's a definition that gets rejected though it should be accepted:

type family G2 a b = r | r -> a b
type instance G2 a    Bool = (a, a)
type instance G2 Bool b    = (b, Bool)

The implementation of algorithm U is done by Unify.tcUnifyTyWithTFs. I added some traces to see what's going on (not present in the uploaded code). It looks like I'm not constructing the substitution correctly. The expected result is [b |-> a, a |-> Bool]. In FamInstEnv.injectiveBranches I added a trace to display the created substitution and it looks like this:

[TvSubst In scope: InScope [] Type env: [alQ :-> a, alY :-> Bool]]

Clearly, that a in the RHS of first substitution is wrong. Help?

Also, I don't know how to freshen variables in the equation. I looked through the code and found TcMType.freshenTyVarBndrs. Should I use this?

I've run into some problems when implementing the algorithm U and injectivity check we described in the paper. Here's a definition that gets rejected though it should be accepted:

type family G2 a b = r | r -> a b
type instance G2 a    Bool = (a, a)
type instance G2 Bool b    = (b, Bool)

The implementation of algorithm U is done by Unify.tcUnifyTyWithTFs. I added some traces to see what's going on (not present in the uploaded code). It looks like I'm not constructing the substitution correctly. The expected result is [b |-> a, a |-> Bool].

No. It should be [b |-> Bool, a |-> Bool]. We don't want a to be in the range of the substitution.

In FamInstEnv.injectiveBranches I added a trace to display the created substitution and it looks like this:

[TvSubst In scope: InScope [] Type env: [alQ :-> a, alY :-> Bool]]

Clearly, that a in the RHS of first substitution is wrong. Help?

This substitution looks exactly like the substitution you said you wanted. (Note that the domain elements are Uniques -- the as that appear are utterly unrelated to the name of the type variable a.)

Also, I don't know how to freshen variables in the equation. I looked through the code and found TcMType.freshenTyVarBndrs. Should I use this?

I don't think you need to worry about freshening (except in the compare-equation-against-itself case). All FamInstEnv tyvars are already freshened. See FamInst.newFamInst.

No. It should be [b |-> Bool, a |-> Bool]. We don't want a to be in the range of the substitution.

Ok, so this is not a bug in the implementation but in my understanding of U. Once again, the equations:

type instance G2 a    Bool = (a, a)
type instance G2 Bool b    = (b, Bool)

RHS of second equation will be the first argument to U, RHS of first will be the second (counter-intuitive, I know, but that's how it is implemented and it shouldn't really matter). Here's how I see what should happen (numbers in parens refer to equation of U in the paper by which we reduce):

U( (b, Bool), (a, a) ) []

 = (5)

U ( (,), (,) ) [] >>= U (b, a) >>= U (Bool, a)

 = (6)

U (b, a) [] >>= U (Bool, a)

 = (3)

U (Bool, a) [ b |-> a ]

 = (4)

U (a, Bool) [ b |-> a ]

 = (3)

[ b |-> a, a |-> Bool ]

From you saying we want [ b |-> Bool, a |-> Bool ] I'm guessing we want an idempotent substitution. I'd guess the right thing to do now is use Unify.niFixTvSubst on the substitution created by U. Did I got things right here? (Except for the types - tcUnifyTyWithTFs returns Maybe TvSubst but niFixTvSubst want TvSubstEnv.)

I don't think you need to worry about freshening (except in the compare-equation-against-itself case).

Yes, it's the "except" part that gets me worried.

From you saying we want [ b |-> Bool, a |-> Bool ] I'm guessing we want an idempotent substitution.

Yes, and you're right about the algorithm in the paper not creating such a substitution. I've tweaked the paper; it should be better now.

I'd guess the right thing to do now is use Unify.niFixTvSubst on the substitution created by U. Did I got things right here? (Except for the types - tcUnifyTyWithTFs returns Maybe TvSubst but niFixTvSubst want TvSubstEnv.)

niFixTvSubst will probably work for you, yes. It's more efficient than what I wrote in the paper.

I don't think you need to worry about freshening (except in the compare-equation-against-itself case).

Yes, it's the "except" part that gets me worried.

I think freshenTyVars will be hard to call here. Use uniqAway somehow?

jstolarek updated this revision to Diff 2973.May 20 2015, 10:23 AM
  • Create an idempotent substitution with U
jstolarek updated this revision to Diff 2980.May 22 2015, 8:23 AM
  • Work in progress on implementing definition U
  • Error messages tweaks
jstolarek planned changes to this revision.May 22 2015, 9:26 AM

More problems. A quick recap: in our e-mail correspondence I reported problems with implementing a special case for "bare variable in the RHS" restriction. Id is the only function that, according to the paper, is allowed to have a bare variable in the RHS. Sadly, kind polymorphism trips things up. This definition is rejected

type family Id (a :: k) = (result :: k) | result -> a
type instance Id a = a

Id should be accepted as a special case, but internally Id has two arguments and there is no way
to tell which type variables are implicit kind arguments and which are explicit type arguments. So internally Id looks like this:

type instance Id k a = a

Since it looks like having two patterns it is rejected.

As a solution I decided to implement liberal "bare variable in the RHS" condition, which says that bare type variable in the RHS is permitted if all LHS patterns are bare variables. Sadly, this does not work if kinds are not polymorphic - input arguments are not kind variables but concrete types. For example if I say:

type family Fa (a :: k) (b :: k) = r | r -> k
type instance Fa a b = a

the LHS patterns are [*, a, b] and Fa is rejected as violating the injectivity annotation. I don't see a way out. Help?

For now, just reject Id too. Getting everything else working is plenty for now, and we can come back to it. I can live without Id.

jstolarek updated this revision to Diff 2981.May 22 2015, 9:35 AM
  • Fix pre-unification with constructors
  • WIP: Liberal "bare variable in RHS" restriction

Actually, we're in pretty good shape, as almost everything works like it should. Aside from mentioned problem with Id one thing that does not yet work is this:

type family F a = r | r -> a
type instance F a = Maybe (G a)

If G is not injective then F should be rejected but it is accepted. One simple way of implementing this would be to test equation against itself just like described in the paper. But that means horrible error messages so I would like to implement a separate check. This means traversing the Type on the RHS to find injective variables that don't appear on injective positions.

As a solution I decided to implement liberal "bare variable in the RHS" condition, which says that bare type variable in the RHS is permitted if all LHS patterns are bare variables.

I approve.

Sadly, this does not work if kinds are not polymorphic - input arguments are not kind variables but concrete types. For example if I say:

type family Fa (a :: k) (b :: k) = r | r -> k
type instance Fa a b = a

the LHS patterns are [*, a, b] and Fa is rejected as violating the injectivity annotation. I don't see a way out. Help?

Fa is not injective, so the behavior is correct: Fa (Fa Maybe IO) Bool ~ Fa Maybe IO by reduction. But then the injectivity condition would give us * ~ (* -> *), which is terrible. So the implementation is doing the right thing here.

As a solution I decided to implement liberal "bare variable in the RHS" condition, which says that bare type variable in the RHS is permitted if all LHS patterns are bare variables.

I approve.

Good. This means we need to update the paper. Are you planning on doing this, or should I do that tomorrow morning?

Sadly, this does not work if kinds are not polymorphic - input arguments are not kind variables but concrete types. For example if I say:

type family Fa (a :: k) (b :: k) = r | r -> k
type instance Fa a b = a

the LHS patterns are [*, a, b] and Fa is rejected as violating the injectivity annotation. I don't see a way out. Help?

Fa is not injective, so the behavior is correct: Fa (Fa Maybe IO) Bool ~ Fa Maybe IO by reduction. But then the injectivity condition would give us * ~ (* -> *), which is terrible. So the implementation is doing the right thing here.

Ah, I suspected this might be the case bit couldn't come up with a counter-example.

jstolarek updated this revision to Diff 3195.Jun 11 2015, 10:45 AM

This patch implements Definition 10 from the paper, so improvements now work in the presence of type families in the RHS. But the implementation is shabby and needs fixing. Most importantly I didn't know how to generate new unification variables inside a TcS monad. I only found functions for doing this in TcM. So I used tcInstTyVars and converted from TcM to TcS using wrapWarnTcS. I know this is very wrong. Simon, Richard, can you point me to a correct way of doing this? The implementation is in TcInteract on line 1532.

jstolarek added inline comments.Jun 11 2015, 10:46 AM
compiler/typecheck/TcInteract.hs
1532

This needs fixing.

jstolarek planned changes to this revision.Jun 11 2015, 10:47 AM

Most importantly I didn't know how to generate new unification variables inside a TcS monad. I only found functions for doing this in TcM. So I used tcInstTyVars and converted from TcM to TcS using wrapWarnTcS. I know this is very wrong. Simon, Richard, can you point me to a correct way of doing this? The implementation is in TcInteract on line 1532.

How about newFlexiTcSTy?

simon

Looks like instFlexiTcS is what I'm looking for. Thanks!

I think the only missing piece of the implementation is the extension to Core.

jstolarek updated this revision to Diff 3279.Jun 19 2015, 2:32 AM
jstolarek removed rGHC Glasgow Haskell Compiler as the repository for this revision.

Minor comment updates.

jstolarek updated this revision to Diff 3575.Jul 17 2015, 3:49 AM

Rebase against HEAD

There are some unexpected validation failures at the moment:

polykinds                 T6068 [bad stderr] (ghci)
polykinds                 T7128 [exit code non-0] (normal)
polykinds                 T7230 [stderr mismatch] (normal)
polykinds                 T7973 [exit code non-0] (normal)
typecheck/should_fail     tcfail220 [exit code 0] (normal)
bgamari requested changes to this revision.Jul 17 2015, 4:18 AM

@jstolarek, for the record Harbormaster actually appears to be failing to even build the compiler due to a kind mismatch in CoAxiom.

This revision now requires changes to proceed.Jul 17 2015, 4:18 AM

Thanks for drawing my attention to this - I tend to ignore harbormaster build failures because they always fail for this patch due to submodules. I suppose the problem might be with too old GHC version used by Harbormaster. The patch *should* build with 7.8 (I have not tested it yet) but most likely it should fail for GHC 7.6.

I spent time today trying to figure out what might be causing failures after a rebase. Sadly, with no result. I will need help with these I'm afraid.

I also can't figure out the cause of build failure with Harbormaster. Austin says that Harbormaster uses GHC 7.8.4 and I verified on my machine that the patch successfully builds with 7.8.

I also can't figure out the cause of build failure with Harbormaster. Austin says that Harbormaster uses GHC 7.8.4

I have reasons to believe Harbormaster is still on 7.6. D904 is still failing because of "Unsupported extension: RoleAnnotations", which was added in 7.8.

@jstolarek for what it's worth hopefully harbormaster's submodule woes will be resolved soon. Indeed I suspect that the Harbormaster build machine is still running 7.6.

@jstolarek, indeed the build machine was still running GHC 7.6. This has now been fixed although it looks like this changes fails validation for a new reason. Perhaps you could have a look?

jstolarek updated this revision to Diff 3689.Jul 28 2015, 7:16 AM
jstolarek removed rGHC Glasgow Haskell Compiler as the repository for this revision.
jstolarek set the repository for this revision to rGHC Glasgow Haskell Compiler.
jstolarek planned changes to this revision.Jul 29 2015, 3:13 AM
jstolarek updated this revision to Diff 3697.Jul 29 2015, 4:16 AM

Documentation updated.

jstolarek planned changes to this revision.Jul 29 2015, 4:16 AM

@jstolarek, any updates here?

At the moment I'm focusing on preparing a presentation on ITFs for the next week's Haskell Symposium so I'm not working on the code. But I hope the implementation gets finished Real Soon Now.

jstolarek updated this revision to Diff 4038.Sep 2 2015, 9:02 PM

Rebase against HEAD

This revision was automatically updated to reflect the committed changes.

@jstolarek, it seems that this has introduced a few testsuite failures,

Unexpected results from:
TEST="T9838 T10307 tcfail220 T2435"

Unexpected failures:
   ghc-api/annotations    T10307 [bad exit code] (normal)
   polykinds              T9838 [exit code non-0] (normal)
   rename/should_compile  T2435 [exit code non-0] (normal)
   typecheck/should_fail  tcfail220 [exit code 0] (normal)

Do you think you could take a look at these when you get a chance?

@jstolarek, actually it looks like @thomie has already worked around these (see D1208).

thomie added a comment.Sep 3 2015, 7:43 PM

@jstolarek: can you have a look at the following?

These tests are failing on Travis, starting from the commit for D202:

indexed-types/should_fail  T5439 [stderr mismatch] (normal)
th                         T6018th [stderr mismatch] (normal)
typecheck/should_compile   T5490 [exit code non-0] (normal)

The Travis build settings are in the file .travis.yml. Notably:

- echo 'DYNAMIC_GHC_PROGRAMS = NO' >> mk/validate.mk
- echo 'GhcLibWays = v' >> mk/validate.mk
- echo 'GhcStage2HcOpts += -DDEBUG' >> mk/validate.mk

Here are the test errors, from this log: https://s3.amazonaws.com/archive.travis-ci.org/jobs/78514905/log.txt

Compile failed (status 256) errors were:
ghc-stage2: panic! (the 'impossible' happened)
  (GHC version 7.11.20150903 for x86_64-unknown-linux):
	ASSERT failed! file compiler/typecheck/TcInteract.hs, line 1381

Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug

*** unexpected failure for T5490(normal)
Actual stderr output differs from expected:
--- ./th/T6018th.stderr.normalised	2015-09-03 04:58:35.965973269 +0000
+++ ./th/T6018th.comp.stderr.normalised	2015-09-03 04:58:35.969973302 +0000
@@ -1,4 +1,3 @@
-| r_0 -> a_1
 
 T6018th.hs:98:4:
     Type family equations violate injectivity annotation:
*** unexpected failure for T6018th(normal)
Actual stderr output differs from expected:
--- ./indexed-types/should_fail/T5439.stderr.normalised	2015-09-03 04:54:51.184120403 +0000
+++ ./indexed-types/should_fail/T5439.comp.stderr.normalised	2015-09-03 04:54:51.184120403 +0000
@@ -1,25 +1,5 @@
+ghc: panic! (the 'impossible' happened)
+  (GHC version 7.11.20150903 for x86_64-unknown-linux):
+	ASSERT failed! file compiler/typecheck/TcInteract.hs, line 1381
 
-T5439.hs:82:28:
-    Couldn't match type ‘Attempt (HElemOf rs)’
-                   with ‘Attempt (HHead (HDrop n0 l0)) -> Attempt (HElemOf l0)’
-    Expected type: f (Attempt (HNth n0 l0) -> Attempt (HElemOf l0))
-      Actual type: f (Attempt (WaitOpResult (WaitOps rs)))
-    Relevant bindings include
-      register :: Bool -> Peano n -> WaitOps (HDrop n rs) -> IO Bool
-        (bound at T5439.hs:64:9)
-      ev :: f (Attempt (WaitOpResult (WaitOps rs)))
-        (bound at T5439.hs:61:22)
-      ops :: WaitOps rs (bound at T5439.hs:61:18)
-      registerWaitOp :: WaitOps rs
-                        -> f (Attempt (WaitOpResult (WaitOps rs))) -> IO Bool
-        (bound at T5439.hs:61:3)
-    In the first argument of ‘complete’, namely ‘ev’
-    In the expression: complete ev
-
-T5439.hs:82:39:
-    Couldn't match expected type ‘Peano n0’
-                with actual type ‘Attempt α0’
-    In the second argument of ‘($)’, namely
-      ‘Failure (e :: SomeException)’
-    In the second argument of ‘($)’, namely
-      ‘inj $ Failure (e :: SomeException)’
\ No newline at end of file
+Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
\ No newline at end of file
*** unexpected failure for T5439(normal)

I'm surprised to see these - validate passed on my machine. Assert that trips T5439 and T5490 is actually in the code fragment implemented by @simonpj so it is very likely that he will have to fix this. T6018th should have a simple fix but I have no idea why that test passed on my machine and is failing here. I'll look into it later this evening.

alanz added a comment.Sep 3 2015, 8:22 PM

I am currently running validate on my machine to see what is happening with
the api-annotions one

@bgamari, where do these testsuite failures come from? Harbormaster seems to be happy and Travis reports different failures. I'm confused. Were they indeed fixed by @thomie?

@alanz, thanks for looking into this.

alanz added a comment.Sep 3 2015, 9:48 PM

My tests pass fine, phab's tests pass fine, and the travis ghc ones do too
wrt api-annotations.

So I am not sure there is actually a problem with

I can reproduce on my machine failures reported by @thomie. I fixed T6018th but I have to leave the other two to @simonpj

I have fixed the other two in HEAD

Thanks! But it also looks that one of the recent commits also broke another test:

Actual stderr output differs from expected:
--- ./indexed-types/should_fail/BadSock.stderr.normalised	2015-09-11 16:55:54.720859677 +0000
+++ ./indexed-types/should_fail/BadSock.comp.stderr.normalised	2015-09-11 16:55:54.720859677 +0000
@@ -1,5 +1,5 @@
+ghc: panic! (the 'impossible' happened)
+  (GHC version 7.11.20150911 for x86_64-unknown-linux):
+	ASSERT failed! file compiler/typecheck/TcFlatten.hs, line 1041

-BadSock.hs:30:5:
-    The type family ‘Readable’ should have 1 argument, but has been given none
-    In the equations for closed type family ‘Foo’
-    In the type family declaration for ‘Foo’
\ No newline at end of file
+Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug
\ No newline at end of file
*** unexpected failure for BadSock(normal)