Generate Typeable info at definition sites
ClosedPublic

Authored by bgamari on Mar 26 2015, 9:49 AM.

Details

Summary

(Posted on behalf of Simon, who is on holiday.)

This patch implements the idea floated in Trac #9858, namely that we should
generate type-representation information at the data type declaration
site, rather than when solving a Typeable constraint.

However, this turned out quite a bit harder than I expected. I still
think it's the right thing to do, and it's done now, but it was quite
a struggle.

See particularly

  • Note [Grand plan for Typeable] in TcTypeable (which is a new module)
  • Note [The overall promotion story] in DataCon (clarifies existing stuff)

The most painful bit was that to generate Typeable instances (ie TyConRepName
bindings) for every TyCon is tricky for types in ghc-prim etc:

  • we need to have enough data types around to *define* a TyCon
  • many of these types are wired-in

Also, to minimise the code generated for each data type, I wanted to generate
pure data, not CAFs with unpackCString# stuff floating about.

Performance
~~~~~~~~~~~
Three perf/compiler tests start to allocate quite a bit more. This isn't surprising,
because they all allocate zillions of data types, with practically no other code,
esp T1969

T1969:  GHC allocates 30% more
T5642:  GHC allocates 14% more
T9872d: GHC allocates  5% more

I'm treating this as acceptable. The payoff comes in Typeable-heavy code.

Remaining to do
~~~~~~~~~~~~~~~

  • I think that "TyCon" and "Module" are over-generic names to use for the runtime type representations used in GHC.Typeable. Better might be "TrTyCon" and "TrModule". But I have not yet done this
  • Add more info the the "TyCon" e.g. source location where it was defined
  • Use the new "Module" type to help with Trac Trac #10068
  • It would be possible to generate TyConRepName (ie Typeable instances) selectively rather than all the time. We'd need to persist the information in interface files. Lacking a motivating reason I have not done this, but it would not be difficult.

Refactoring
~~~~~~~~~~~
As is so often the case, I ended up refactoring more than I intended.
In particular

  • In TyCon,
    • a type *family* (whether type or data) is repesented by a FamilyTyCon
    • a algebraic data type (including data/newtype instances) is represented by AlgTyCon This wasn't true before; a data family was represented as an AlgTyCon. There are some corresponding changes in IfaceSyn.

      Also get rid of the (unhelpfully named) tyConParent.
  • In TyCon define 'Promoted', isomorphic to Maybe, used when things are optionally promoted; and use it elsewhere in GHC.
  • Each TyCon, including promoted TyCons, contains its TyConRepName, if it has one. This is, in effect, the name of its Typeable instance.
  • I added PatSynId, DefMethId, and ReflectionId to the IdInfo.IdDetails type. They are used for debugging only, namely to suppress excessive output in -ddump-types.
  • Tidy up the generation of PrelInfo.knownKeyNames
  • Move newImplicitBinder from IfaceEnv to BuildTyCl
  • PrelNames.conName renamed to dcQual for consistency with varQual, tcQual
  • Move mkDefaultMethodIds, mkRecSelBinds from TcTyClsDecls to TcTyDecls
Test Plan

validate, examine performance

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
There are a very large number of changes, so older changes are hidden. Show Older Changes
goldfire added inline comments.Apr 1 2015, 11:14 PM
compiler/basicTypes/DataCon.hs
1096

Indeed.

compiler/coreSyn/CorePrep.hs
63 ↗(On Diff #2533)

Shouldn't that be 709?

compiler/deSugar/DsBinds.hs
989

I'm confused about EvTypeable. What does it store? It seems to be completely isomorphic to the type at hand, so I'm not sure what it's buying us. And, why does it store EvTerms internally, as opposed to EvTypeables?

991

Not "together with ty" anymore.

bgamari requested changes to this revision.Jun 30 2015, 12:06 PM
bgamari added a reviewer: bgamari.
bgamari added a subscriber: bgamari.

It seems there are validation issues.

This revision now requires changes to proceed.Jun 30 2015, 12:06 PM
austin edited edge metadata.Jun 30 2015, 12:07 PM

In any case, I'm not sure this is ready for merging, so putting it as 'Changes Required' is OK as to get it out of the queue.

Is there a haddock branch available which this is supposed to build with?

For the record, we discussed this Diff in last week's meeting and decided that @simonpj would rebase this and commit it as-is.

In D757#29981, @bgamari wrote:

For the record, we discussed this Diff in last week's meeting and decided that @simonpj would rebase this and commit it as-is.

I have no further objections if GHC HQ wants this patch to go in. But do please check out my source-line notes above. In particular, I think EvTypeable can be removed completely.

bgamari retitled this revision from Generate Typeble info at definition sites to Generate Typeable info at definition sites.Aug 25 2015, 3:51 AM
bgamari edited edge metadata.

Also, the types in ghc-prim:GHC.Types don't quite match the suggestions in Trac #10068,
Module should contain We should introduce a Package type in place of the module name field in Module

  • missing the PackageKey as suggested in Trac #10068. This would be a simple addition after the fact however.
compiler/coreSyn/CorePrep.hs
63 ↗(On Diff #2533)

Yes, I believe it should be.

compiler/deSugar/DsBinds.hs
73–74

Perhaps we want to just delete this line?

bgamari commandeered this revision.Aug 26 2015, 9:16 AM
bgamari edited reviewers, added: goldfire; removed: bgamari.

I've rebased this against master. Commandeering.

bgamari updated this revision to Diff 3929.Aug 26 2015, 9:17 AM
bgamari edited edge metadata.
  • Rebase against master
bgamari added a comment.EditedAug 26 2015, 9:31 AM

I've updated this Diff with an updated patch against master. The rebase was a bit tricky in places due to the large surface area of the patch. It seems, however, that I've messed up somewhere which I'll discuss in a future comment.

Here are some notes tersely describing how things have changed since the last iteration and why.

Notes

The patch also interacted a bit with the constraint tuple rework performed in
ffc21506894c7887d3620423aaf86bc6113a1071.

It also interacted quite severely with f6ab0f2d59, which eliminated TupleTyCon, unifying all ADTs into AltTyCon. This patch is a bit strange as AlgTyCon constructors are always, according to a comment, lifted and boxed, yet the TupleTyCon constructor of AlgTyConRhs can represent unboxed tuples. Simon confirmed that the comment is inaccurate. The trouble is that the Typeable patch wants to move the TyConRepName associated with a TyCon into AlgTyConParent.

This puts you in a tricky situation when you want to build a tuple TyCon (e.g. in mkTupleTyCon) as you need to provide a representation to build an AlgTyConParent, even if you are building an unboxed tuple (which are not representable, if I understand correctly).

To address this I've added a constructor to AlgTyConParent to handle the case of unboxed values. I've removed the tup_rep_name field from TyCon as it is already available via the AlgTyConParent. I've also renamed AlgTyConParent to AlgTyConFlav per Simon's suggestion. Consequently, mkTupleTyCon lost the TyConRepName argument as this is passed via the flavour.

tcRnDeclsi was affected by 3c44a46b35 as tcRnDecls now expects a boolean argument indicating whether the declarations are in a module with a where clause. It seemed reasonable to assume no where clause in the case tcRnDeclsi.

I was also a bit perplexed by the apparent overlap in intent in TyCon.promotableTyCon_maybe and DataCon.computeTyConPromotability. I added the obvious case for TupleTyCon for computeTyConPromotability.

There were also a few interactions with 7eceffb3dd1e9c99218630b94ba97da483cec32d, which renamed several of the helpers in TyCon.

Type-level literals

I wasn't entirely sure how to handle generating the representations for type-level symbols and naturals in Data.Typeable.Internal. there is some trickiness in the original Typeable patch.

Iavor had reworked Typeable's handling of type literals in 4854fcea in response to Trac #10348. I've done my best to undo the rework in this patch, which addresses Trac Trac #10348, without rolling back the fix to that bug (by generating a sub-goal for KnownNat or KnownSymbol). I did roll back Iavor's approach to representation generation, as it was incompatible with Simon's goal of representations being plain old data.

TcInteract

The class instance matching in TcInteract was a bit of a mess. The original Typeable patch performed a rather ambitious refactoring of this code which conflicted with "Make the matchable-given check happen first", 228ddb95ee. In the interest of remaining conservative I've try to apply only the Typeable changes here.

TcInteract was also complicated by "New handling of overlapping inst in Safe Haskell", 4fffbc34c0. This required coming up with some SafeOverlapping values, but otherwise this was relatively straightforward (assuming my guess at the overlapping condition is correct).

Finally, the change to the representation of LookupInstResult in a1275a762ec04c1159a required a bit of thought.

In sum, I think matchTypeableClass deserves a thorough review.

bgamari added a comment.EditedAug 26 2015, 9:37 AM

As noted above, while this does compile it does not yet work. I suspect something is amiss in the interface file generation

For example, consider these two modules,

{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE TypeFamilies #-}
module Hi where

class HelloWorld a where
    type MyItem a

data MyTy a = TyTy a
instance HelloWorld (MyTy a) where
    type MyItem (MyTy a) = a
{-# LANGUAGE NoImplicitPrelude #-}
{-# LANGUAGE TypeFamilies #-}
module Hi2 where

import Hi

data MyTy2 a = TyTy2 a

instance HelloWorld (MyTy2 a) where
    type MyItem (MyTy2 a) = a

If I compile them both in the same GHC session things work as expected,

$ rm Hi{2,}.{hi,o}
$ inplace/bin/ghc-stage1 Hi2.hs
[1 of 2] Compiling Hi               ( Hi.hs, Hi.o )
[2 of 2] Compiling Hi2              ( Hi2.hs, Hi2.o )

However, if I compile one and then the other (thereby requiring a roundtrip through the interface file), things fail,

$ rm Hi{2,}.{hi,o}
$ inplace/bin/ghc-stage1 Hi.hs
[1 of 1] Compiling Hi               ( Hi.hs, Hi.o )
$ inplace/bin/ghc-stage1 Hi2.hs
[2 of 2] Compiling Hi2              ( Hi2.hs, Hi2.o )

Hi2.hs:11:10: error:
    Class ‘HelloWorld’ does not have an associated type ‘MyItem’Nothing
    In the type instance declaration for ‘MyItem’
    In the instance declaration for ‘HelloWorld (MyTy2 a)’

This is first tickled in stage 2 build by users of IsList.

Strangely enough, the interface file seems to look reasonable,

$ inplace/bin/ghc-stage1 --show-iface Hi.hi
Magic: Wanted 33214052,
       got    33214052
Version: Wanted [7, 1, 1, 2, 0, 1, 5, 0, 8, 2, 6],
         got    [7, 1, 1, 2, 0, 1, 5, 0, 8, 2, 6]
Way: Wanted [],
     got    []
interface Hi [family instance module] 71120150826
  interface hash: 35ab2a046a0b761cde649dfc23bdf36f
  ABI hash: c2dce4fdfec89ed201854b72d30b8f52
  export-list hash: a662678f5930079fe8eb1abf1c3130aa
  orphan hash: 693e9af84d3dfcc71e640e005bdc5e2e
  flag hash: 18a77600461aa9a31511aec91feccc4b
  sig of: Nothing
  used TH splices: False
  where
exports:
  HelloWorld{MyItem}
  MyTy{TyTy}
module dependencies:
package dependencies:
orphans:
family instance modules:
34843196dce88f970da0467d193ff0c8
  $fHelloWorldMyTy :: HelloWorld (MyTy a)
  DFunId
5a4b855811a868b3534872338d3c4b1a
  $tc'TyTy :: TyCon
9d63d871976d88574ad24348d9df2f63
  $tcHelloWorld :: TyCon
7a22d73fd4cd22ec318069f3831eece4
  $tcMyTy :: TyCon
b9094bbbfabd3e3622f694039f989d3e
  $trModule :: Module
34843196dce88f970da0467d193ff0c8
  class HelloWorld a where
    type family MyItem a :: * open
c1fbe4e7fc7b6a01a305188eb823a0f1
  type family MyItem a :: * open
6c61cbfc7dcd0be12b73021dd050951e
  data MyTy a = TyTy a
    Promotable
055005ab597568b780cccd438dbaf560
  axiom TFCo:R:MyItemMyTy:: MyItem (MyTy a) = a
instance [safe] HelloWorld [MyTy] = $fHelloWorldMyTy
family instance MyItem [MyTy] = TFCo:R:MyItemMyTy
vectorised variables:
vectorised tycons:
vectorised reused tycons:
parallel variables:
parallel tycons:
trusted: safe
require own pkg trusted: False

One obvious place to mess this up would be in IfaceSyn.hs. Sadly the tags there look correct.

Actually, now that I compare to another GHC build it seems there are a few differences. The obvious difference is the new $tc and $tr symbols, but these are expected (that was the original point of the rework). The other difference is that my branch has an additional a family instance module of the form,

c1fbe4e7fc7b6a01a305188eb823a0f1
  type family MyItem a :: * open

I haven't yet thought enough about it to know whether it is expected.

It seems that the problem only affects type families, not data families.

compiler/iface/LoadIface.hs
363 ↗(On Diff #3929)

This file shouldn't have been touched.

bgamari planned changes to this revision.Aug 26 2015, 11:48 AM
bgamari marked 2 inline comments as done.

I've split this commit up a bit which should ease reviewing a bit.

@goldfire, I intend on addressing your comments as soon as I get things in working order again.

compiler/simplCore/CoreMonad.hs
891 ↗(On Diff #3929)

This also shouldn't have been touched.

bgamari updated this revision to Diff 3933.Aug 27 2015, 2:19 AM
bgamari edited edge metadata.
  • Rebase on top of master having split out some of the indepedent pieces of the patch
bgamari updated this revision to Diff 3934.Aug 27 2015, 2:58 AM
bgamari edited edge metadata.
  • Try another approach for type-level literals
bgamari marked 3 inline comments as done.Aug 27 2015, 5:19 AM

I have been merging those pieces of the patch that were easily split out to master separately. This Diff is what remains. I've also merged some changes by @simonpj addressing some of @goldfire's comments.

@goldfire, I intend on looking into your suggestion of dropping EvTypeable. I'm still feeling out the structure of the typechecker so I'm not entirely certain at the moment. I am, however, leaning towards it being necessary. See my Comment above. The issue is made even more unclear by the fact that evidence generation seems to have changed quite a bit since the original patch. @simonpj, perhaps you could explain if (or whether) EvTypeable is necessary?

compiler/deSugar/DsBinds.hs
994

I'm confused about EvTypeable. What does it store?

My apologies if this is obvious, but here's my attempt at answering this:

EvTypeable stores Typeable dictionaries for the various subexpressions of the type for which we are constructing the dictionary. For instance, to construct a Typeable dictionary for TyApp f a, we will need Typeable dictionaries for f and a. We generate these goals in TcInteract and use them to piece together the final dictionary in DsBinds.

And, why does it store EvTerms internally, as opposed to EvTypeables?

I'm not entirely sure what you mean here. As far as I can tell which EvTerms we need will depend upon what sort of type we are constructing a dictionary for.

compiler/prelude/PrelNames.hs
782

It probably deserves a comment at least. I'll add one.

801

Honestly I am also having a bit of trouble seeing why this function is necessary, even with the aid of Note [Grand plan for Typeable]. @simonpj, perhaps you could clarify?

compiler/prelude/TysPrim.hs
325

If I understand correctly, tc_name is the Name by which the TyCon will be known to GHC. rep_fs on the other hand is the name that will appear in the runtime representation that we generate.

Why don't we generate the latter from the Name? That's a good question. @simonpj, perhaps you could comment?

compiler/typecheck/TcInteract.hs
1917

While this sounds like it might be a nice learning opportunity for me, I'm afraid I have enough on my plate at the moment so I'll leave it for another patch.

@simonpj, is there any reason why this shouldn't happen eventually?

I should also mention that the type synonym issues still haven't been resolved. I'll buy a beer for anyone who spots the bug before I stumble upon it.

bgamari updated this revision to Diff 3935.Aug 27 2015, 5:26 AM
bgamari edited edge metadata.
  • CorePrep: Fix GHC version CPP guard
  • Fix things
  • Comments
  • Comments

Ahh, actually it seems that @simonpj already addressed @goldfire's concerns about EvTypeable in his amendments (which will be uploaded shortly).

bgamari updated this revision to Diff 3936.Aug 27 2015, 5:43 AM
bgamari edited edge metadata.

Address Richard's remarks

bgamari updated this revision to Diff 3937.Aug 27 2015, 5:46 AM
bgamari edited edge metadata.

Wibbles

goldfire edited edge metadata.Aug 27 2015, 8:05 AM
In D757#32648, @bgamari wrote:

Ahh, actually it seems that @simonpj already addressed @goldfire's concerns about EvTypeable in his amendments (which will be uploaded shortly).

I look forward to seeing it. :)

Your point about EvTypeables storing EvIds sometimes is accurate, I think, and it defeats my original argument. EvTypeable would be redundant for closed types, but it's not redundant for open ones. But I still think it should be simplified, making it clear that all EvTerms inside it are either EvTypeables or EvIds, by adding a new constructor to EvTypeable that stores Ids directly, instead of using the overly-heavy EvTerm.

compiler/typecheck/TcInteract.hs
1864–1910

I have not done a thorough review, but this does, at least, seem to be missing the case for a bare variable. It looks like with this implementation, we would be unable to prove Typeable a given Typeable a.

simonpj edited edge metadata.Aug 28 2015, 9:28 AM

Richard thanks for your comments. The ones in the code are all spot on (and I'll fix), except the one about EvTypeable. It's nearly isomorphic to Type, but not quite: it can contain EvIds that refer to lambda-bound dictiionaries. Just think of

f :: forrall a. Typeable a => [a] -> TypeRep
f xs = typeRep (Proxy :: Proxy [a])

Ben
I’ve pushed changes to the typeable-ben branch. It compiles fine now. You’ll need to apply this patch to haddock, which I did not do in the repo.

There are lots of testsuite wibbles that I have not look at… that’s the next task.

Two compiler perf-tests get worse, as expected.
Less expected is

perf/should_run  T5205 [stat not good enough] (normal)

That is more bytes allocated at runtime, which I was not expecting. Need to investigate.

Simon

simonpj@cam-05-unx:~/code/HEAD-2/utils/haddock$ git diff
diff --git a/haddock-api/src/Haddock/Convert.hs b/haddock-api/src/Haddock/Convert.hs
index 7c9040a..567f1a6 100644
--- a/haddock-api/src/Haddock/Convert.hs
+++ b/haddock-api/src/Haddock/Convert.hs
@@ -163,37 +163,31 @@ synifyTyCon coax tc
                                       , dd_derivs = Nothing }
            , tcdFVs = placeHolderNamesTc }
 
-  | isTypeFamilyTyCon tc
+  | isFamilyTyCon tc
   = case famTyConFlav_maybe tc of
       Just rhs ->
         let info = case rhs of
-              OpenSynFamilyTyCon -> return OpenTypeFamily
+              OpenSynFamilyTyCon -> OpenTypeFamily
+              DataFamilyTyCon {} -> DataFamily
               ClosedSynFamilyTyCon mb -> case mb of
                   Just (CoAxiom { co_ax_branches = branches })
-                          -> return $ ClosedTypeFamily $ Just $
+                          -> ClosedTypeFamily $ Just $
                                brListMap (noLoc . synifyAxBranch tc) branches
-                  Nothing -> return $ ClosedTypeFamily $ Just []
+                  Nothing -> ClosedTypeFamily $ Just []
               BuiltInSynFamTyCon {}
-                -> return $ ClosedTypeFamily $ Just []
+                -> ClosedTypeFamily $ Just []
               AbstractClosedSynFamilyTyCon {}
-                -> return $ ClosedTypeFamily Nothing
-        in info >>= \i ->
-           return (FamDecl
-                   (FamilyDecl { fdInfo = i
+                -> ClosedTypeFamily Nothing
+            kind_sig | isDataFamilyTyCon tc = Nothing  -- Always '"*'
+                     | otherwise            = Just (synifyKindSig (synTyConResKind tc))
+        in return (FamDecl
+                   (FamilyDecl { fdInfo = info
                                , fdLName = synifyName tc
                                , fdTyVars = synifyTyVars (tyConTyVars tc)
-                               , fdKindSig =
-                                 Just (synifyKindSig (synTyConResKind tc))
+                               , fdKindSig = kind_sig
                                }))
       Nothing -> Left "synifyTyCon: impossible open type synonym?"
 
-  | isDataFamilyTyCon tc
-  = --(why no "isOpenAlgTyCon"?)
-    case algTyConRhs tc of
-        DataFamilyTyCon -> return $
-          FamDecl (FamilyDecl DataFamily (synifyName tc) (synifyTyVars (tyConTyVars tc))
-                              Nothing) --always kind '*'
-        _ -> Left "synifyTyCon: impossible open data type?"
   | Just ty <- synTyConRhs_maybe tc
   = return $ SynDecl { tcdLName = synifyName tc
                      , tcdTyVars = synifyTyVars (tyConTyVars tc)

@simonpj, oh dear, it looks we duplicated a bit of work. I pretty much wrote the same patches. Sorry about that, I should have been more vocal.

A runtime change is certainly concerning. Hmm.

Take a look at Trac #10835, as well.

For the record, I'm currently trying to resolve an import cycle between ,

GHC.Prim          depends upon
GHC.Tuple         which to derive Typeable needs unpackString# from
GHC.CString       which needs isTrue# and Char from
GHC.Types         which needs types from
GHC.Prim
In D757#34335, @bgamari wrote:

For the record, I'm currently trying to resolve an import cycle between ,

Just to be clear, this shows up in the currently-committed branch T9859-typeable-ben? How does it manifest?

In particular, sh validate --fast works for me. What doesn't work for you?

See Trac Trac #10867 for the Right Way to unravel the loop, incidentally

In D757#34483, @simonpj wrote:

In particular, sh validate --fast works for me. What doesn't work for you?

See Trac Trac #10867 for the Right Way to unravel the loop, incidentally

Sorry for the late reply; internet access has been rather spotty while in the States this week. I've pushed the latest version of this work to the wip/T9858-typeable-ben2 branch.

In particular the error seen is,

Module imports form a cycle:
         module ‘GHC.CString’ (libraries/ghc-prim/GHC/CString.hs)
        imports ‘GHC.Types’ (libraries/ghc-prim/GHC/Types.hs)
  which imports ‘GHC.Tuple’ (libraries/ghc-prim/GHC/Tuple.hs)
  which imports ‘GHC.CString’ (libraries/ghc-prim/GHC/CString.hs)

while compiling ghc-prim.

austin requested changes to this revision.Oct 12 2015, 9:16 PM
austin edited edge metadata.

(Punting to Request Changes while @bgamari keeps going at this.)

This revision now requires changes to proceed.Oct 12 2015, 9:16 PM
bgamari updated this revision to Diff 4738.Oct 28 2015, 9:39 AM
bgamari edited edge metadata.

Rebase and fix treatment of data families

This will hopefully be the final version of this patch.

This revision was automatically updated to reflect the committed changes.

This merge was botched and, worse, subsequently merged. It was then reverted in bbaf76f949426c91d6abbbc5eced1f705530087b. I'm proposing for merge again in D1404.