Use IP based CallStack in error and undefined
ClosedPublic

Authored by gridaphobe on Apr 21 2015, 12:33 PM.

Details

Summary

This patch modifies error, undefined, and assertError to use implicit call-stacks to provide better error messages to users.

There are a few knock-on effects:

  • GHC.Classes.IP is now wired-in so it can be used in the wired-in types for error and undefined.
  • TysPrim.tyVarList has been replaced with a new function TysPrim.mkTemplateTyVars. tyVarList made it easy to introduce subtle bugs when you need tyvars of different kinds. The naive
tv1 = head $ tyVarList kind1
tv2 = head $ tyVarList kind2

would result in tv1 and tv2 sharing a Unique, thus substitutions would be applied incorrectly, treating tv1 and tv2 as the same tyvar. mkTemplateTyVars avoids this pitfall by taking a list of kinds and producing a single tyvar of each kind.

  • The types GHC.SrcLoc.SrcLoc and GHC.Stack.CallStack now live in ghc-prim.
  • The type GHC.Exception.ErrorCall has a new constructor ErrorCallWithLocation that takes two Strings instead of one, the 2nd one being arbitrary metadata about the error (but usually the call-stack). A bi-directional pattern synonym ErrorCall continues to provide the old API.

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
There are a very large number of changes, so older changes are hidden. Show Older Changes

OK I'm happy. Thanks.

hvr requested changes to this revision.Jul 10 2015, 5:40 PM
hvr added inline comments.
libraries/base/GHC/Exception.hs
181

If I'm understanding this one right, this will break existing code which relies on the single ErrorCall-field to contain only the original string that was passed as the argument to the error function. I assumed that we'd extend ErrorCall with a 2nd field to contain the CallStack as "out of band" information in order not to corrupt the original error message. Any reason we can't do that?

This revision now requires changes to proceed.Jul 10 2015, 5:40 PM
gridaphobe added inline comments.Jul 10 2015, 5:47 PM
libraries/base/GHC/Exception.hs
181

I chose this approach because I was concerned about breaking code that expected ErrorCall to have a single field.

I suppose it's easier to work around an extra field with CPP than trying to de-munge a String.

hvr added inline comments.Jul 10 2015, 6:06 PM
libraries/base/GHC/Exception.hs
181

well, especially code that expects ErrorCall to have a single field, will most likely also have assumptions about what that field contains :-)

gridaphobe added inline comments.Jul 10 2015, 6:42 PM
libraries/base/GHC/Exception.hs
181

Fair point, though I'll note that errorWithStackTrace is already munging the stack trace into the error message :)

Sticking with errorWithStackTrace for a moment, it would have to convert the CostCentreStack into a CallStack if we add the extra field to ErrorCall, but a CostCentre doesn't have a "file" field, so using the same formatter for CostCentreStacks and CallStacks will produce poor messages.

Perhaps https://ghc.haskell.org/trac/ghc/ticket/10068 would be a better place to change ErrorCall?

What is the status of the haddock changes that this seems to require?

bgamari added inline comments.Jul 13 2015, 9:29 AM
docs/users_guide/7.12.1-notes.xml
261

I'm not sure I like sending users to the release notes for documentation. Perhaps we want to rather send them to the "Special implicit parameters" included in D578? Might it make sense to actually provide a link?

@bgamari the haddock changes are at https://github.com/gridaphobe/haddock/commit/144dc9f69f5c0cd6a3a6cb503bdb33e61bd40110. I wish there were a way to include them in this patch because my patch to haddock depends on this patch, it won't compile against HEAD.

docs/users_guide/7.12.1-notes.xml
261

Good point, I'll link to the main docs instead.

gridaphobe updated this revision to Diff 3543.Jul 14 2015, 12:23 PM

Link to the "special implicit parameters" section from the release notes.

simonmar added inline comments.Jul 15 2015, 9:52 AM
libraries/base/GHC/Exception.hs
181

I'd rather not wire a particular representation for call stacks into ErrorCall. We have 3 ways to get call stacks.

Doing it this way will break code, as @hvr points out, but I think the right way to fix that is to provide a version of error that does not attach call site information. (better still, users that rely on this behaviour should be using their own exception types)

gridaphobe added inline comments.Jul 15 2015, 11:18 PM
libraries/base/GHC/Exception.hs
181

What if we redefine ErrorCall as

data ErrorCall = ErrorCall { errorCallMessage :: String
                           , errorCallStack   :: Maybe String
                           }

It's a bit ugly in my opinion, but it would allow us to include the callstack as "out of band" data as @hvr wants, without having to reconcile the different callstack representations (at least for the moment).

It would still be a breaking change, but at least the type system would help now.

hvr added inline comments.Jul 16 2015, 1:18 AM
libraries/base/GHC/Exception.hs
181

(minor nitpick: if errorCallStack = Just "" is non-sensical, we could avoid the Maybe-container, and just interpret "" as Nothing, i.e. no call-stack)

hvr added inline comments.Jul 16 2015, 1:29 AM
libraries/base/GHC/Exception.hs
181

oh, and iirc there was the suggestion to use PatternSynonyms to reduce the breakage, e.g. with

data ErrorCall = ErrorCallStack String (Maybe String)

pattern ErrorCall msg <- ErrorCallStack msg _

or something like that

simonmar added inline comments.Jul 16 2015, 5:55 AM
libraries/base/GHC/Exception.hs
181

Ok, the PatternSynonyms trick looks like it will help.

I would call it ErrorCallWithLocation.

bgamari requested changes to this revision.Jul 16 2015, 1:00 PM

@gridaphobe, I agree that this would be a good use for PatternSynonyms.

This revision now requires changes to proceed.Jul 16 2015, 1:00 PM
gridaphobe updated this revision to Diff 3573.Jul 16 2015, 10:09 PM

Ok, I've added the extra String field to ErrorCall, with a bidirectional pattern synonym ErrorCall that ignores the extra field to minimize API breakage. Unfortunately cabal still had to be patched because it uses an explicit import list

import Control.Exception (ErrorCall(..))

which does *not* import associated pattern synonyms.

As before, the new submodule commits are available on github.

PS I'm not sure how I feel about the fact that the pattern synonym isn't imported as a "constructor". Sure, it's a distinct language construct, but a big selling point of pattern synonyms is to minimize API breakage.

@gridaphobe, hopefully we'll have the Phab submodule issue sorted out soon.

Indeed it is unfortunate that changes are required in Cabal. It seems reasonable to propose that this be changed. It seems like pretending that patterns exist in the type's namespace in addition to the top-level namespace wouldn't be so bad.

PS I'm not sure how I feel about the fact that the pattern synonym isn't imported as a "constructor". Sure, it's a distinct language construct, but a big selling point of pattern synonyms is to minimize API breakage.

I have no strong opinion about the ErrorCall stuff; it's really a Core Libraries Committee matter.

But pattern synonyms should not be part of the T(...) exports! What would you say for

pattern P x = Tree (Just x)

would you put it in list or Maybe? What about this?

pattern Q x = (x, True)

Pattern synonyms inherently do not belong to one type in the way that data constructors do.

Anyway modulo getting agreement about ErrorCall API, it'd be great to get this landed.

Simon

Based on the comments above I see that: assert now lives in GHC.IO.Exception (where assertError was defined), and it also takes an implicit CallStack. I removed the code that rewrites assert to assertError completely, but we still need a mechanism to discard assertions e.g. when optimizations are turned on. I'm inclined to invert the rewriting mechanism, so assert will be rewritten to assertNoError when we want to ignore assertions, but I'm open to other suggestions.

Does that mean we can replace calls to the following CPP macros to assertError? If so, could we add an additional assertError2 (or with whatever name makes sense) that takes an additional custom message argument?

compiler/HsVersions.h:#define ASSERT(e)      if debugIsOn && not (e) then (assertPanic __FILE__ __LINE__) else
compiler/HsVersions.h:#define ASSERT2(e,msg) if debugIsOn && not (e) then (assertPprPanic __FILE__ __LINE__ (msg)) else

If so, I would like to continue a related differential I have open and replace ASSERT with assertError and ASSERT2 with assertError2.

libraries/base/GHC/IO/Exception.hs
355–356

What about adding an additional function: assertError2 :: (?callStack :: CallStack) => Bool -> String -> a -> a with a user defined string argument?

Does that mean we can replace calls to the following CPP macros to assertError? If so, could we add an additional assertError2 (or with whatever name makes sense) that takes an additional custom message argument?

compiler/HsVersions.h:#define ASSERT(e)      if debugIsOn && not (e) then (assertPanic __FILE__ __LINE__) else
compiler/HsVersions.h:#define ASSERT2(e,msg) if debugIsOn && not (e) then (assertPprPanic __FILE__ __LINE__ (msg)) else

If so, I would like to continue a related differential I have open and replace ASSERT with assertError and ASSERT2 with assertError2.

In theory assertPanic and assertPprPanic could take an implicit CallStack parameter like assertError. The problem is that GHC needs to be bootstrappable by previous versions of the compiler, so we can't use language features introduced in version N until at least version N+1. It's not clear whether the implicit call-stacks will end up in 7.10.2 (due to the API change), but if they are postponed to 7.12 we would have to wait at least until 7.14 to use them internally.

hvr added a comment.Jul 18 2015, 2:09 AM

The problem is that GHC needs to be bootstrappable by previous versions of the compiler, so we can't use language features introduced in version N until at least version N+1.

Fwiw, you can often use #if __GLASGOW_HASKELL__ > ... to fallback to backward-compat code for older GHCs during bootstrap...

Let's separate (a) error, call stacks etc, from (b) the use of ASSERT in GHC itself. Trying to do too much at once leads to delay. I think we know enough to do (a) don't we?

I agree that we probably do want to review what happens for assert as part of (a).

bgamari requested changes to this revision.Jul 21 2015, 7:35 AM

Marking as "needs changes" as there is clearly plenty to be done here.

This revision now requires changes to proceed.Jul 21 2015, 7:35 AM
In D861#29635, @bgamari wrote:

Marking as "needs changes" as there is clearly plenty to be done here.

I believe I've addressed all of the requested changes. There was some discussion of using implicit call-stacks internally, but that should be a separate ticket. The harbormaster build fails because I had to make changes to four submodules (I linked to the commits above).

@gridaphobe, oh dear; my apologies!

Could you change .gitmodules to point to the locations of your submodule repos so Harbormaster can validate this?

gridaphobe updated this revision to Diff 3622.Jul 21 2015, 1:01 PM
gridaphobe updated this object.

point .gitmodules to my forks of the submodules so harbormaster can validate.

gridaphobe updated this revision to Diff 3623.Jul 21 2015, 1:16 PM

merge in master and resolve a conflict in libraries/array and testsuite/tests/th/T7276a.stdout

gridaphobe updated this revision to Diff 3627.Jul 21 2015, 3:44 PM

appease ./validate after merge

@bgamari thanks for the suggestion. Good thing we got Harbormaster to run the build, because I completely forgot to ask about the perf tests (!)

T5321FD and T5321Fun both report a 20% increase in allocated bytes (~100MB). I believe that number is referring to total allocation over the course of the compilation, so I'm not really sure what to make of the increase. Both tests use undefined so we should expect some increase from the extra constraint solving and the core/asm for the CallStacks, but 20% seems quite a lot for one line out of 100..

It's a little less than 20% though. You can check GHC speed [1] for the current numbers, and compare them with yours.

tests/alloc/T5321FD 430782128 = 430756800 bytes
tests/alloc/T5321Fun 450815816 + 0.02% 450887704 bytes

https://perf.haskell.org/ghc/#revision/9ade087108afe2eec2698b6ce41146df02524810

bgamari requested changes to this revision.Jul 27 2015, 7:23 AM

It would be nice to have a better understanding of what this increase is due to. Can you check whether the Core produced for these testcases looks reasonable? How much effort is the simplifier expending on these (e.g. compare GHC runtime metrics of -O0 and -O1.

This revision now requires changes to proceed.Jul 27 2015, 7:23 AM
gridaphobe added a comment.EditedAug 2 2015, 4:46 PM

@bgamari it looks like the culprit is the fact that uNDEFINED_ID is no longer defined with pc_bottoming_Id0; it needs to be supplied a CallStack before diverging, so I used pc_bottoming_Id1 instead.

You can confirm this on master by using a non-wired-in undefined in T5321Fun (i.e. just re-define undefined locally). If GHC isn't explicitly told that undefined diverges immediately, it seems to do a lot more work, allocating ~70MB more.

I'm not sure what to do about this.

EDIT: Also, I examined the Core on my branch and noticed that all occurrences of undefined were saturated applications, so GHC *ought* to know that they would immediately diverge too.

Could you rebase the various submodules and this branch and update this Diff? I would be happy to have a look at the reason for the allocation change but at the moment it seems that applying this set is a bit nontrivial.

my guess is that if the program has a LOT of calls to error, the will now have an extra argument with location information; and that will generate more code for the simplifier to chew on. -dshow-passes may show this up.

Anyway, getting a little more insight would be good. Thanks Ben.

rwbarton requested changes to this revision.Aug 6 2015, 11:47 AM
rwbarton added a reviewer: rwbarton.
rwbarton added a subscriber: rwbarton.
rwbarton added inline comments.
testsuite/tests/simplCore/should_compile/T4930.stderr
4–5

This test output will now be quite sensitive to various sorts of changes. Maybe make the test use an infinite loop rather than error?

testsuite/tests/simplCore/should_compile/T4945.stdout
1–6 ↗(On Diff #3627)

Don't make accept this output, the test is fragile and has expect_broken. See Trac #4945.

bgamari added a comment.EditedAug 6 2015, 2:49 PM

I've been looking at the test against master that @gridaphobe describe, replacing undefined with undefined' = let a = a in a in T5321Fun.hs. Indeed heap allocations increase during compilation and I believe the problem may be tied to the fact that this test is compiled with -O0, where demand analysis doesn't see that,

a40 :: forall a. a
[GblId, Str=DmdType]
a40 = \ (@ a50) -> a40

is bottom (note the plain DmdType).

In D861#31146, @bgamari wrote:
I believe the problem may be tied to the fact that this test is compiled with `-O0`, where demand analysis doesn't see that,
a40 :: forall a. a
[GblId, Str=DmdType]
a40 = \ (@ a50) -> a40

is bottom (note the plain DmdType).

That sounds plausible. And clearly nothing to do with call-stacks, so I think we should get on and commit this patch.

But where do those let a = a in a things come from??? (Indeed without -O they are going to behave badly.)

@simonpj wrote,

But where do those let a = a in a things come from??? (Indeed without -O they are going to behave badly.)

This was a modification I made to the testcase in an attempt to reproduce something like what @gridaphobe observes with this patch. In D861#30734 he reports an increase in GHC's allocations while compiling T5321Fun. Moreover, he reports seeing a similar increase when the undefined in that testcase is replaced with a locally defined bottom (which lacks the demand analysis annotation that the wired-in undefined has).

After the various branches involved in this change have been rebased I can look more closely at the Core generated for this testcase with the IP-based callstacks. That being said, it seems very likely that any change in behavior that we see with this testcase is merely due to the fact that it is compiled with -O0.

gridaphobe updated this revision to Diff 3820.Aug 10 2015, 4:07 PM

I've rebased the diff onto the latest master and addressed @rwbarton's concerns.

Since the allocation increases for T5321Fun and T5321FD are unrelated to the call-stack machinery, I've bumped both expected values by roughly 70MB (the difference between using the wired-in undefined and a local definition on master).

There's one last thing that comes to mind. The SrcLoc and CallStack types are now public in base-4.8.1.0, and this patch moves them into ghc-prim so they'll be available earlier. As a result the base modules GHC.SrcLoc and GHC.Stack` no longer export the corresponding types. I can add re-exports for backwards-compatibility if desired, not sure what the general sentiment is there (Trac #10068 will probably end up mucking about with these types too, so maybe this isn't the right time/place to worry about it).

bgamari updated the Trac tickets for this revision.Aug 12 2015, 7:35 AM

@gridaphobe, I intend on merging this today but before I do do you suppose you have have a look over the Diff description and revise it as necessary.

gridaphobe updated this object.Aug 21 2015, 8:35 AM

@gridaphobe, for the record this is currently blocked on merge of the Cabal pull request. MIkhail has some questions that I think you might be best equipped to address.

bgamari requested changes to this revision.Aug 30 2015, 11:59 AM

@gridaphobe, I'm afraid one of the refactorings I committed last week conflicts with this. I don't have time to work through the conflicts at the moment. If you want to have a stab at rebasing (shouldn't be hard) then feel free. Otherwise I'll get to it this week.

This revision now requires changes to proceed.Aug 30 2015, 11:59 AM
gridaphobe updated this revision to Diff 3989.Aug 30 2015, 12:01 PM

rebase against master and fix conflicts

gridaphobe updated this revision to Diff 3990.Aug 30 2015, 1:43 PM

forgot to bump the commit for the Cabal submodule

This revision was automatically updated to reflect the committed changes.

@gridaphobe, I've merged this although there are a few outstanding test failures with the full make test suite. Could you have a look at these?