Exhaustiveness check for EmptyCase (Trac #10746)
ClosedPublic

Authored by gkaracha on Apr 12 2016, 9:46 AM.

Details

Summary

Empty case expressions have strict semantics so the
problem boils down to inhabitation checking for the
type of the scrutinee. 3 main functions added:

  • pmTopNormaliseType_maybe for the normalisation of the scrutinee type
  • inhabitationCandidates for generating the possible patterns of the appropriate type
  • checkEmptyCase' to filter out the candidates that give rise to unsatisfiable constraints.

See Note [Checking EmptyCase Expressions] in Check
and Note [Type normalisation for EmptyCase] in FamInstEnv

Test Plan

validate

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
There are a very large number of changes, so older changes are hidden. Show Older Changes
gkaracha updated the Trac tickets for this revision.
dfeuer requested changes to this revision.Apr 12 2016, 11:11 AM
dfeuer edited edge metadata.

Argh! I forgot about families! I actually think you should reduce them, even at risk of falling down an infinite loop or otherwise exploding. Without this, empty case is completely useless for data families; there is no way to talk about results of data families except by applying them. I don't think the situation is necessarily quite as grim for type families, but I think they should be consistent. When I write an empty case, I'm very explicitly telling GHC that I want it to verify that there are no possibilities. If I didn't care all that much, I'd have written error "The impossible happened!" or some such instead. If the family gets stuck, I want the usual non-exhaustive warning.

This revision now requires changes to proceed.Apr 12 2016, 11:11 AM
In D2105#61458, @dfeuer wrote:

Argh! I forgot about families! I actually think you should reduce them, even at risk of falling down an infinite loop or otherwise exploding. Without this, empty case is completely useless for data families; there is no way to talk about results of data families except by applying them. I don't think the situation is necessarily quite as grim for type families, but I think they should be consistent. When I write an empty case, I'm very explicitly telling GHC that I want it to verify that there are no possibilities. If I didn't care all that much, I'd have written error "The impossible happened!" or some such instead. If the family gets stuck, I want the usual non-exhaustive warning.

I see. I am a bit busy at the moment but I will update the revision as soon as possible.

gkaracha updated this revision to Diff 7369.Apr 25 2016, 9:34 AM
gkaracha edited edge metadata.
  • Reduce type families and newtypes

It is not working properly yet. For a newtype like

newtype T a = MkT a

we get a coercion like (coT : forall a. T a ~ a), which means that
in Core, for any a, T a and a have the same type. Yet, in
source Haskell, for example T Bool and Bool are distinct types
so we do not want to issue for f

f :: T Bool -> ()
f = \case

non-exhaustive warnings about True or False, but rather about
MkT True and MkT False, respectively. This is not fixed yet.

Still a wip implementation. I can make it work, but I can not find an elegant way to do so. For this task we seem to need to

  • reduce type families & newtypes, yet
  • bound the search, and
  • keep the newtype data constructors when reducing.

That is, for the following function

data B = B1 | B2
newtype T a = MkT a

-- non-exhaustive
f :: T1 B -> z
f = \case

we want to report as missing cases MkT1 B1 and MkT1 B2, and not simply B1 and B2, because even though they are interchangeable in Core, in source Haskell they aren't (if the programmer pastes the missing cases, it won't typecheck).

Function topNormaliseTypeX_maybe does indeed reduce what we want, but uses the RecTcChecker in NormaliseStepper only for bounding newtypes and not type family reduction. Additionally, it does not keep track of the zero-cost data constructors which I need for pretty printing the missing cases.

I wrote topNormaliseTypeBoundedX_maybe to bound the type-family reduction (I have not yet adjusted it to keep track of the data cons) but I am hoping there is a cleaner way than this. @simonpj, @goldfire, do you have any pointers/suggestions?

simonpj edited edge metadata.Apr 25 2016, 5:39 PM

I'm a bit slow with all this. See my note on Trac #10746 (comment:25)

bgamari requested changes to this revision.May 11 2016, 6:35 AM
bgamari edited edge metadata.

Bumping out of the review queue while this is discussed.

This revision now requires changes to proceed.May 11 2016, 6:35 AM

@gkaracha, is there any chance that this will be finished for 8.2?

@gkaracha, is there any chance that this will be finished for 8.2?

Hi @bgamari, yes, I think this one is of the easy ones. To be honest I forgot about it after the discussion moved to Trac #10746 but I'll wrap it up for 8.2. I will let you know when the code is ready for reviewing.

gkaracha updated this revision to Diff 10521.Jan 17 2017, 6:28 PM
gkaracha edited edge metadata.

Exhaustiveness check for EmptyCase (Trac Trac #10746)

Summary:
Empty case expressions have strict semantics so the
problem boils down to inhabitation checking for the
type of the scrutinee. 3 main functions added:

  • pmTopNormaliseType_maybe for the normalisation of the scrutinee type
  • inhabitationCandidates for generating the possible patterns of the appropriate type
  • checkEmptyCase' to filter out the candidates that give rise to unsatisfiable constraints.

See Note [Checking EmptyCase Expressions] in Check
and Note [Type normalisation for EmptyCase] in FamInstEnv

I think the patch is now ready for review :)

PS. Harbormaster seems to fail due to perf/compiler/T12234, but it was broken already, as far as I can tell.

dfeuer requested changes to this revision.Jan 18 2017, 10:24 AM
dfeuer edited edge metadata.

I don't see any non-trivial newtype wrappers in the test cases. Could you add some? Some examples:

newtype Foo a = Foo (F a)  -- particularly consider cases where F is a GADT, type family, or data family
newtype Bar f = Bar (f X)  -- f may turn out to be a GADT or a data family
newtype Baz (f :: k -> *) (a :: k) = Baz (f a) -- try applying this to a GADT of kind (*->*)->* and a data family of kind *->*, say.
testsuite/tests/pmcheck/should_compile/EmptyCase003.hs
9

Is this TODO still an issue? If it will remain in place for now, can you expand it to explain what's unfinished?

This revision now requires changes to proceed.Jan 18 2017, 10:24 AM
In D2105#87124, @dfeuer wrote:

I don't see any non-trivial newtype wrappers in the test cases. Could you add some? Some examples:

newtype Foo a = Foo (F a)  -- particularly consider cases where F is a GADT, type family, or data family
newtype Bar f = Bar (f X)  -- f may turn out to be a GADT or a data family
newtype Baz (f :: k -> *) (a :: k) = Baz (f a) -- try applying this to a GADT of kind (*->*)->* and a data family of kind *->*, say.

Thanks @dfeuer, I'll add some tests and update tomorrow. The "TODO" comment was forgotten there, I'll just remove it. Also, @goldfire, I have not used data families that much myself so my examples will mostly be made-up. If you have any real examples you came across (e.g. from the singletons library) with data families and EmptyCase, I could add them as well.

gkaracha updated this revision to Diff 10552.Jan 19 2017, 8:07 PM
gkaracha edited edge metadata.
  • Deleted forgotten TODO in test pmcheck/EmptyCase003
  • Added some more integration tests
gkaracha retitled this revision from Address #10746 (check empty case expressions) to Exhaustiveness check for EmptyCase (Trac #10746).Jan 19 2017, 8:12 PM
gkaracha updated this object.
gkaracha edited edge metadata.

Most of what I understand so far looks good. My main remaining concern is with your overall structure for accumulating conclusions/messages about patterns in general, which isn't really specific to empty case issues.

compiler/deSugar/Check.hs
251–257

What's the difference between a redundant clause and a clause with inaccessible RHS? If this is explained in a note somewhere, that should be referenced here. It seems a tad strange to me to separate the redundant and inaccessible clauses (whatever those things mean) into separate lists, rather than keeping them in source code order. Would that be harder? If so, don't bother.

262

This feels a bit odd. Why not turn this into a Maybe and add a separate field for the type? Are we only interested in the type when we don't know the constructors? Using separate fields should also make it easier to interpret the code; having to look over here to see what Left and Right mean is not so great.

304

What makes this case so special?

let P = e1 in e2  ==== case e1 of ~P -> e2
let !P = e1 in e2 ==== case e1 of P -> e2

To the best of my knowledge, the only difference is that completeness checking for pattern bindings and patterns in lambdas is controlled by a different warning flag. I'd have expected everything else to be the same.

317

It seems really awkward to use the same result type for an individual pattern as for the whole check. This PmResult type feels a bit weird to me. I could be wrong, but I have the feeling you can reshape this to be simpler.

340

I don't see anything detecting that this was improperly called with an empty list.

407

Is there a reason not to push the return into the case branches here?

476

These sound like good reasons.

1201

If I'm not mistaken, runMany is nothing but

runMany :: (Applicative f, Foldable t, Monoid m) => (a -> f m) -> t a -> f m
runMany f = foldr (\x r -> liftA2 mappend (f x) r) (pure mempty)

given an appropriate Monoid instance for PartialResult. But I still have my doubts about your overall accumulation structure.

testsuite/tests/pmcheck/should_compile/EmptyCase010.hs
46

Glad to see a few more interesting newtypes!

I'm sorry but this patch feels vastly over-weight for what it does. Large slabs of new code, all to deal with a corner case. It just doesn't smell right at all.

I'm totally lost about why such an elaborate fix is needed. Would you like to explain, starting from scratch?

compiler/deSugar/Check.hs
458

I don't understand this at all.

  1. In Haskell case expressions are almost always strict. Only ones like case e of x -> rhs are lazy, with a variable (or lazy) pattern, and only one branch.
  1. Why does strictness or otherwise make such a monster difference to the question of exhaustiveness checking?

It can't be this hard!

dfeuer added inline comments.Jan 20 2017, 8:23 AM
compiler/deSugar/Check.hs
458

I was also skeptical initially, but I think I understand the complexity better now. There are (as I understand it) two basic problems that show up in empty case that don't show up elsewhere:

  1. Thanks to type families and data families, the pattern checker may not know, a priori, what type it's dealing with. So it may need to ask the type checker for more information, and needs to deal with situations where no more information is available.
  1. Thanks to the module system, users may not have access to the data constructor of a newtype they're matching on. Say we have
module A (HiddenVoid) where
import Data.Void

newtype HiddenVoid = HiddenVoidD Void
...

module B where
import A

absurdly :: HiddenVoid -> a
absurdly x = case x of
  -- We can't match on `HiddenVoidD` to reveal the `Void` because it's hidden from us!
  -- Uh oh!

So I think some special measures are probably actually justified.

I'm sorry but this patch feels vastly over-weight for what it does. Large slabs of new code, all to deal with a corner case. It just doesn't smell right at all.

I'm totally lost about why such an elaborate fix is needed. Would you like to explain, starting from scratch?

Hi Simon, I wouldn't say that there are large slabs of new code, I think the comments are the ones that make the patch look big. The only interesting/elaborate part is function pmTopNormaliseType_maybe, which is just a variant of topNormaliseType_maybe, which keeps track of the newtype constructors skipped. Function topNormaliseType_maybe normalizes with respect to FC and not source Haskell so I cannot use it and issue warnings with FC types (like representation constructors for data families, or to expose the internal type of a newtype).

For a good example, consider this:

data G a where
  MkG1 :: G Int
  MkG2 :: G Char

newtype T a = MkT (G a)

f :: T Char -> ()
f = \case

what would you issue as missing here? I see several options:

  • we just say that it's non-exhaustive without issuing any specific pattern missing (just _ missing)
  • we only print the missing type (like missing _ :: T Char)
  • we do the complete analysis and print the patterns we checked (MkT MkG2).

I went with the 3rd option, but for that I needed to

  1. keep the newtype data constructor MkT while normalizing the type
  2. keep the newtype type constructor T, because the fact that T a ~ G a is not true in source Haskell (but it's true concerning the operational semantics, the two behave the same)

So, I'd mostly attribute the size of the patch to my attempt to give good warnings.

Concerning the difference between lazy pattern matching and strict, they are in fact very different. For strict languages the biggest problem for exhaustiveness checking is that it corresponds to type inhabitation checking which is undecidable. Our 2015 checker did not have this problem because in Haskell we get inhabitation for free (undefined inhabits everything), or, if there is at least a pattern, we know which data type we are matching against, hence we can use our generate-and-test approach. But with EmptyCase we have no patterns, so we really need to check inhabitation (in the cases we can). Jacques and Jacques talked about the OCaml checker and this challenge last year at the ML workshop, you could take a look if you are interested:

https://www.math.nagoya-u.ac.jp/~garrigue/papers/gadtspm.pdf

I hope this explains a bit the need for a different approach. You can also take a look at the expected outputs in the tests I added, to get a better view of what kind of warnings I'd like us to have.

gkaracha updated this revision to Diff 10559.Jan 20 2017, 9:08 AM
  • Addressed David's comments
gkaracha marked 3 inline comments as done.Jan 20 2017, 9:19 AM
gkaracha added a subscriber: mpickering.
gkaracha added inline comments.
compiler/deSugar/Check.hs
251–257

The difference between redundant and with-inaccessible-rhs is explained in the wiki page (laziness section):

https://ghc.haskell.org/trac/ghc/wiki/PatternMatchCheck

I added a pointer in the code to it.

262

The reason I turned it into an either is because the type is useless in the general case (we already know it). In EmptyCase though, we may end up with a simplified type for the argument but still have no patterns to present as missing to the user. In these cases, I think it is informative at least to print the type of the missing too. E.g.

type family F a where
  F Bool = Int

f1 :: F Bool -> ()
f1 = \case -- this emits as missing: (_ :: Int)
           -- and not (_ :: F Bool)

or

type family G (n :: Nat) (a :: *) :: * where
  G Zero     a = a
  G (Succ n) a = G n a

f2 :: G (Succ (Succ (Succ (Succ Zero)))) Int -> ()
f2 = \case -- this emits as missing: (_ :: Int)
           -- and not (_ :: G (Succ (Succ (Succ (Succ Zero)))) Int)

Still, I agree that the either is maybe a bad choice, I will define it as a separate datatype with domain specific names.

304

checkSingle is not really special. The reason we have both checkSingle and checkMatches is that there are two entry points to the checker, due to the syntactic differences (see deSugar/Match.hs). As you can see, they both call pmcheckI internally, it's just two entry points.

317

If we don't use the same type we will have duplication, which there is no need for I think. The only checkSingle-specific properties we have are:

  • The list of redundant clauses is empty/singleton, and
  • The list of inaccessible clauses is empty/singleton.

Apart from that, the structure of the results is exactly the same with multiple matches, so I see no reason not to reuse the PmResult for both. Do you have a specific alternative in mind?

340

Ah, yes, good point, I added a check.

407

If you think it is more readable, I will. I just didn't want to write it twice.

1201

Ah, I am not sure what you mean here by the overall accumulation structure. I understand that if we ignore the COMPLETE pragma @mpickering is working on, on all other cases we just have a singleton list. Btw, the ListT component of PmM was added by @mpickering, as the first part of Phab:D2669, so I think he can explain better what the expected behavior is here.

testsuite/tests/pmcheck/should_compile/EmptyCase003.hs
9

was just forgotten there, I removed it.

testsuite/tests/pmcheck/should_compile/EmptyCase010.hs
46

Yeah, I think that our examples are sufficient now, thanks for pointing these cases out! :)

bgamari updated this revision to Diff 10657.Jan 23 2017, 9:51 PM
bgamari edited edge metadata.

Rebase

bgamari requested changes to this revision.Jan 27 2017, 3:32 PM
bgamari edited edge metadata.

One clarification needed, @gkaracha.

Also, could you rebase onto the current state of master?

compiler/types/FamInstEnv.hs
1267

Does this really strip off newtypes? It seems to me from the comment below and eq_src_ty that you retain newtypes here.

This revision now requires changes to proceed.Jan 27 2017, 3:32 PM
gkaracha updated this revision to Diff 10789.Jan 29 2017, 5:48 AM
gkaracha edited edge metadata.
gkaracha marked 3 inline comments as done.

rebase

gkaracha added inline comments.Jan 29 2017, 5:51 AM
compiler/types/FamInstEnv.hs
1267

Ah, yes. The first result does not strip off newtypes (it is for printing to the user, so newtypes are not equivalent to the underlying type). The first one strips off only type families and the last one is the which strips off everything.

bgamari requested changes to this revision.Feb 1 2017, 3:38 PM
bgamari edited edge metadata.
bgamari added inline comments.
compiler/deSugar/Check.hs
111

This pattern match is incomplete (missing Just _). Slightly ironic, given that this is the pattern match checker ;)

This revision now requires changes to proceed.Feb 1 2017, 3:38 PM
gkaracha updated this revision to Diff 10884.Feb 2 2017, 4:11 AM
gkaracha edited edge metadata.
  • Make getResult exhaustive
gkaracha marked an inline comment as done.Feb 2 2017, 4:13 AM
gkaracha added inline comments.
compiler/deSugar/Check.hs
111

Ooops, it is quite ironic indeed. Fixed :)

bgamari accepted this revision.Feb 2 2017, 12:20 PM
bgamari edited edge metadata.

Thanks George!

This revision was automatically updated to reflect the committed changes.