Implement improved error messages for ambiguous type variables (#10733)
ClosedPublic

Authored by KaneTW on Aug 27 2015, 6:35 PM.

Details

Summary

Improved error messages are only printed when the old message would be
"No instance for...", since they're not as helpful for "Could not deduce..."

No special test case as error messages are tested by other tests already.

Signed-off-by: David Kraeutmann <kane@kane.cx>

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
KaneTW updated this revision to Diff 3947.Aug 27 2015, 6:35 PM
KaneTW retitled this revision from to Implement improved error messages for ambiguous type variables (#10733).
KaneTW updated this object.
KaneTW edited the test plan for this revision. (Show Details)
bgamari added a subscriber: simonpj.

@simonpj, do you think you could take a quick look at this?

There are a lot of error message wibbles here. That's to be expected. @KaneTW, have you double-checked all of the changes against the source code of the test cases? I just want to guard against the possibility that this patch improves many error messages but makes a few completely wrong. I often find it necessary (but painful, admittedly) to look at the source code of each case with a changed error message (unless the change is truly boring).

Thanks for jumping in and doing this!

testsuite/tests/annotations/should_fail/annfail10.stderr
41

I know this was taken from the original ticket, but I find the "or" too far from the "either". Maybe use indentation to make this clearer? I've long decided that the error message is grammatically incorrect before I hit the "or". :)

You're right, at a cursory glance there are a few cases where the new message shows although it shouldn't (tcfail010 for example). I'll look through the rest.

I'll see what I can do about the error message.

simonpj edited edge metadata.Aug 31 2015, 9:44 AM
simonpj updated the Trac tickets for this revision.
bgamari requested changes to this revision.Sep 2 2015, 6:33 AM
bgamari edited edge metadata.

Requesting changes while test error messages are checked.

This revision now requires changes to proceed.Sep 2 2015, 6:33 AM
KaneTW updated this revision to Diff 4039.Sep 2 2015, 10:16 PM
KaneTW edited edge metadata.
  • Print old message when no unifiers and no givens exist

I eliminated one cause of incorrect "ambiguous type variable" warnings; when we have no unifiers and no givens, it doesn't matter that the type variable is ambiguous (i.e. a metavariable).

I'm looking through the rest one more time, but the messages seem to be correct.

@goldfire: Do you have any suggestions regarding the error message? I find it well readable with the current 2-space indentation, and I'm not sure what could make it clearer.

The current build might fail since I haven't merged my changed with D1176 yet.

KaneTW updated this revision to Diff 4041.Sep 2 2015, 11:55 PM
KaneTW edited edge metadata.
KaneTW updated this revision to Diff 4042.Sep 2 2015, 11:58 PM
KaneTW edited edge metadata.
  • Merge properly (diff 4039 tests got swallowed)

This rebase got messed up something fierce. Next diff should fix it in a moment.

KaneTW updated this revision to Diff 4043.Sep 3 2015, 12:23 AM
KaneTW edited edge metadata.

Restore tests that shouldn't have been reverted

KaneTW added a comment.Sep 3 2015, 1:10 AM

Jan's injective type families commit is causing tcfail220 to fail, but that's unrelated to this ticket.

erikd added a subscriber: erikd.Sep 10 2015, 7:55 PM

This looks ok to me. @bgamari @austin ?

simonpj edited edge metadata.Sep 11 2015, 4:26 AM

I'm just not yet convinced that this is an improvement. Compare

`
No instance for (Show r0) arising from a use of ‘show’
The type variable ‘r0’ is ambiguous
Potential instances:
  ....lots of instances...

with

Ambiguous type variable ‘r0’ arising from a use of ‘show’
caused by the lack of an instance ‘(Show r0)’
Either use a type annotation to specify what ‘r0’ should be
 based on the following instances
  ....lots of instances...

Is this really better? What does it mean to say that the ambiguous type variable was "caused by" the lack of an instance? What does it mean to add a type annotation "based on" these instances. Personally I prefer the current message!

That said, I think it could be improved. Perhaps more like this:

No instance for (Show r0) arising from a use of ‘show’
Probable cause: the type variable ‘r0’ is ambiguous
Potential instances:
  ....lots of instances...
Probable solution: use a type annotation to fix 'r0' 
   so that one of these instance is chosen

Would that serve your purposes?

I have to agree with @simonpj that the "caused by" is misleading. But I like the fact (in the new message) that the first line is about ambiguity.

What about this:

Ambiguous type variable ‘r0’
I must know this type to choose an instance for ‘(Show r0)’ arising from a use of ‘show’
Probable fix: add a type annotation
Relevant bindings:
  ...     -- these help the reader know where r0 comes from
Potential instances:
  instance Show Ordering -- Defined in ‘GHC.Show’
  instance Show Integer -- Defined in ‘GHC.Show’
  instance (Show a, Show b, Number a, Digit b) => Show (a :@ b)
    -- Defined at tcfail133.hs:11:54
  ...plus 25 others
  (use -fprint-potential-instances to see them all)

Well, I suppose that in the case of a single-parameter type class with an ambiguous type-variable argument, we might want to draw more attention to the type variable. But T5858 has:

Ambiguous type variables ‘t0’, ‘t1’ arising from a use of ‘infer’
caused by the lack of an instance ‘(InferOverloaded ([t0], [t1]))’

Now it seems altogether less clear; saying that there is no instance for InferOverlaoged ([t0],[t1]) seems more plausible.

I'm still a bit unconvinced, at least outside a pretty narrow use-case such as (C a0) where a0 is ambig.

Well, I suppose that in the case of a single-parameter type class with an ambiguous type-variable argument, we might want to draw more attention to the type variable. But T5858 has:

Ambiguous type variables ‘t0’, ‘t1’ arising from a use of ‘infer’
caused by the lack of an instance ‘(InferOverloaded ([t0], [t1]))’

Now it seems altogether less clear; saying that there is no instance for InferOverlaoged ([t0],[t1]) seems more plausible.

I would render that error as

Ambiguous type variables 't0', 't1'
I must know these types to choose an instance for ‘(InferOverloaded ([t0], [t1]))’ arising from a use of ‘infer’
Probable fix: add a type annotation
Relevant bindings:
  ...
Potential instances:
  ...

Looks fine to me.

I'm still a bit unconvinced, at least outside a pretty narrow use-case such as (C a0) where a0 is ambig.

But that case is very common! Even if we hit only that one case, I think this is a clear win.

So if there's an Eq (Tree a), with ambiguous a, and no Eq (Tree t) instance, would you complain about ambiguity, or just about the absence of an Eq (Tree a) instance? The latter seems more plausible to me.

Let's see: under what circumstances, exactly, do you think it's better to complain about an ambiguous type variable than about the lack of an instance?

So if there's an Eq (Tree a), with ambiguous a, and no Eq (Tree t) instance, would you complain about ambiguity, or just about the absence of an Eq (Tree a) instance? The latter seems more plausible to me.

But what if there are Eq (Tree Int) and Eq (Tree Bool) instances? Admittedly, this is a little unlikely.

Let's see: under what circumstances, exactly, do you think it's better to complain about an ambiguous type variable than about the lack of an instance?

Here's an easy condition: when the Wanted constraint unifies with an existing instance head. This covers the straightforward Show a0 case and your Eq (Tree a0) case. In the Eq (Tree a0) case: if there are no Eq (Tree ...) instances, then Eq (Tree a0) will not unify with any instance heads. We report a "no instance" error. If we do have my (unlikely) Eq (Tree Int) instance lying around, then we report an ambiguous variable.

Aren't the "potential instances" that are reported generated the same way? Then this is even simpler: highlight the ambiguous variable precisely when there are potential instances.

I believe this already happens. The emphasized message is chosen when there are potential instances.

In D1182#34704, @KaneTW wrote:

I believe this already happens. The emphasized message is chosen when there are potential instances.

Very good! That seems somewhat plausible. Perhaps add a Note to give the rationale? Currently there is just a cryptically-named boolean called "emphasis".

Then there is the question of wording. Saying that an ambiguous type variable is caused by the lack of an instance is clearly not right. What about:

Ambiguous type variable 'r0` prevents the constraint `(Show r0)` from being solved
These potential instances exist:
    ....
Probable fix: use a type annotation to specify what 'r0' should be.

That sounds good. I'll add a note and change the message then.

Ambiguous type variable 'r0` prevents the constraint `(Show r0)` from being solved
These potential instances exist:
    ....
Probable fix: use a type annotation to specify what 'r0' should be.

I like this, but would move the probable fix over the potential instances.

austin requested changes to this revision.Sep 21 2015, 7:16 PM
austin edited edge metadata.

(Putting in Request Changes state to move out of review queue so we know what state this is in. Thanks for the hard work David!)

This revision now requires changes to proceed.Sep 21 2015, 7:16 PM

I'm going to put the following note if that's sufficient/correct (can't submit a new patch yet as I haven't adjusted the tests and my battery's dying)

{- Note [Highlighting ambiguous type variables]
-----------------------------------------------
When we encounter ambiguous type variables (i.e. type variables
that remain metavariables after type inference), we need a few more
conditions before we can reason that *ambiguity* prevents constraints
from being solved:
  - We can't have any givens, as encountering a typeclass error 
    with given constraints just means we couldn't deduce
    a solution satisfying those constraints and as such couldn't
    bind the type variable to a known type.
  - If we don't have any unifiers, we don't even have potential
    instances from which an ambiguity could arise.
  - Lastly, I don't want to mess with error reporting for
    unknown runtime types so we just fall back to the old message there.
Once these conditions are satisfied, we can safely say that ambiguity prevents
the constraint from being solved. -}
KaneTW updated this revision to Diff 4301.Sep 24 2015, 2:11 AM
KaneTW edited edge metadata.

Adjust error message and explain when ambiguity should be highlighted

KaneTW updated this revision to Diff 4304.Sep 24 2015, 6:47 AM
KaneTW edited edge metadata.

Error message wibbles.

This has been stale for a while. Can someone take a look?

Great. I'm happy with this with the changes below addressed. It would be nice if there were an "accept with request changes" button in Phab.

Thanks for doing this, and sorry for letting it go stale!

compiler/typecheck/TcErrors.hs
1536

I agree with Simon's comment (buried somewhere above) that emphasis is a poor variable name here. lead_with_ambig? At least something that refers to ambiguity, not just emphasis in general.

1858–1860

Comment what that Bool is, please.

KaneTW updated this revision to Diff 4443.Oct 7 2015, 3:21 PM

Added goldfire's suggestions.

austin accepted this revision.Oct 7 2015, 8:27 PM
austin edited edge metadata.

You got it. Thanks!

This revision was automatically updated to reflect the committed changes.