Record Pattern Synonyms
AbandonedPublic

Authored by bgamari on Aug 17 2015, 7:30 AM.

Details

Summary

This patch implements an extension to pattern synonyms which allows user to specify pattern
synonyms using record syntax. Doing so generates appropriate selectors and update functions.

TODO

  • Fix panic when there is an partial record update.
  • The panic only occurs when the update is polymorphic.
  • Make sure selectors/updaters are only generated for the appropriate patterns
  • Increase code reuse between record update typechecking and pattern syn update checking (DONE: completely unified)
  • Fix interface file (currently an undefined in there)
  • Decide what the correct thing to do for explicitly bidirectional updates is (see below)
  • Generate proper update functions for explicitly bidirectional patterns
  • Fix lint errors
  • Add tests
  • Add proper annotation information in the parser
  • Fix core lint errors .. no idea currently about this one

TODO: After merge

These refactorings will muddy the diff so I will make these changes later.

  • Refactor PatSynDetails to HsConDetails? The two datatypes are isomorphic.
  • Refactor IdDetails to use ConLike? (Not a good ideas as RecordSelId takes a TyCon, not a DataCon.

Updating Explicitly Bidirectional Pattern Synonyms

Consider the following

pattern Silly{a} <- [a] where
  Silly a = [a, a]

f1 = a [5] -- 5

f2 = [5] {a = 6} -- currently [6,6]

Fixing Polymorphic Updates

They were fixed by adding these two lines in dsExpr. This might break record updates but will be easy to fix.

+ ; let req_wrap = mkWpTyApps (mkTyVarTys univ_tvs)

- , pat_wrap = idHsWrapper }
+, pat_wrap = req_wrap }

Mixed selectors error

Note [Mixed Record Field Updates]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Consider the following pattern synonym.

data MyRec = MyRec { foo :: Int, qux :: String }

pattern HisRec{f1, f2} = MyRec{foo = f1, qux=f2}

This allows updates such as the following

updater :: MyRec -> MyRec
updater a = a {f1 = 1 }

It would also make sense to allow the following update (which we reject).

updater a = a {f1 = 1, qux = "two" } ==? MyRec 1 "two"

This leads to confusing behaviour when the selectors in fact refer the same field.

updater a = a {f1 = 1, foo = 2} ==? ???

For this reason, we reject a mixture of pattern synonym and normal record selectors in the same update block. Although of course we still allow the following.

updater a = (a {f1 = 1}) {foo = 2}

> updater (MyRec 0 "str")
MyRec 2 "str"
Test Plan

./validate

Diff Detail

Lint
Lint Skipped
Unit
Unit Tests Skipped
There are a very large number of changes, so older changes are hidden. Show Older Changes
mpickering added inline comments.Oct 11 2015, 5:32 PM
compiler/hsSyn/HsBinds.hs
908

I promise I am not being deliberately obtuse here but I don't see how to implement it using only one name.

Could either of you be perhaps a bit more specific?

mpickering added inline comments.Oct 11 2015, 5:37 PM
compiler/hsSyn/HsBinds.hs
908

It also appears to me that both are used in multiple places. So I'm not sure what you mean when you say it's only used in one place, do you mean that the usage in the other places is wrong or unnecessary?

~/Documents/haskell/ghc:git grep recordPatSynId
compiler/hsSyn/HsBinds.hs:      recordPatSynId :: a     -- Selector name visible in rest of the file
compiler/hsSyn/HsUtils.hs:  = map recordPatSynId as ++ L bind_loc n : pss
compiler/rename/RnBinds.hs:                   do { checkDupRdrNames (map recordPatSynId vars)
compiler/rename/RnBinds.hs:                                  map (unLoc . recordPatSynId) names
compiler/typecheck/TcPatSyn.hs:       ; let field_labels = map (unLoc . recordPatSynId) rec_fields

~/Documents/haskell/ghc:git grep recordPatSynArg
compiler/hsSyn/HsBinds.hs:      , recordPatSynArg :: a  -- Filled in by renamer, the name used internally
compiler/rename/RnBinds.hs:                               , mkFVs (map (unLoc . recordPatSynArg) names)) }
compiler/typecheck/TcPatSyn.hs:                  -> (map (unLoc . recordPatSynArg) names, False)
compiler/typecheck/TcPatSyn.hs:                  -> (map (unLoc . recordPatSynArg) names, False)
compiler/typecheck/TcPatSyn.hs:              RecordPatSyn args     -> map recordPatSynArg args

I don't see how to implement it using only one name.

Ah, now I get it. Consider these two:

pattern P x y = ([x,True], [y,'v'])
pattern Q{ x, y } =([x,True], [y,'v'])

For P everything is clear. We have purely-local binders x::Bool and y::Char, which are bound by the pattern.

For Q we have that same situation. But we ALSO have top-level record selectors x :: ([Bool],[Char]) -> Bool and similarly y. That's why we want two names in RecordPatSynField.

We don't support this syntax, but it would make sense to support

pattern Q{ x=x1, y=y1 } = ([x1,True], [y1,'v'])

where we can use a different name for the local binder than we do for the record selector. Now it is even clearer why there are two names involved. Supporting the syntax doesn't seem worth it, but the underlying idea is the same.

So I'm happy with the two-name thing now, provided you add a Note with the definition of RecordPatSynField explaining why. Use the examples from this comment!

compiler/rename/RnBinds.hs
664

in what sense is recordPatSynArg free in the definition?

mpickering added inline comments.Oct 12 2015, 3:22 PM
compiler/typecheck/TcExpr.hs
716

I tried to refactor RecSelId to RecSelId [DataCon] Bool but this causes some pain when it comes to the interface files. Next try it is something like RecSelId (Either PatSyn TyCon) Bool with the aim for something better in the future.

mpickering updated this revision to Diff 4484.Oct 13 2015, 10:36 AM
  • cleanup wall
  • Use exposed tcRecSelBinds to typecheck selectors
  • refactor
  • Checkpoint: Using record syntax with constructor
  • checkpoint: polymorphic updates
  • Improve error messages
  • Add some tests which should fail
  • Add runnable test
  • Remove debug.trace import
  • Strip out field from PatSyn which was clearly bogus
  • Remove more cruft from RecordUpd
  • remove redundant import
  • Cleanup whitespace and traces
  • Remove deriving from hs-boot file
  • Add annotation information to parser
  • Update interface file
  • Refactor records updates to share pattern matching code
  • Cleanup
  • Fix core lint error.. this is obvious now looking at it
  • Fix 7.8 build
  • fix validate build
  • Try to fix build again..
  • Remove redundant import
  • Make CPP the right way around
  • Add CPP to ConLike
  • Fix two validate failures
  • Fix other validate failure, should be green now
  • Update tests
  • refactor
  • Remove some definitions from hs-boot file
  • Clarify ConLike module
  • Updates based on Richard's comments
  • Fix 7.8 validate
  • Get prov/req order the correct way around
  • Add two more tests
  • Make sure to add the correct constraints for the selector functions.
  • Add and update tests
  • Properly thread through prov/req constraints for record updates
  • Remove unused variable for validate
  • Cleanup some warnings
  • more cleanup
  • Don't use prov_wrap
  • Fix 7.8 validate
  • Get rid of prov_wrapper (rip)
  • update
  • fix validate
  • Fix validate
  • Add failing test - polymorphic req only
  • Update tests
  • Change behaviour to match record updates.
  • fixup
  • Revert coreSyn changes
  • Cleanup
  • Fix lint warnings
  • Fix bad rebase
  • PostTc the wrapper
  • lint
  • Remove PatSynWorkId and rename PatSynWrapId
  • Comments
  • Unify IdDetails
  • Refactor record selector generation
  • Remove non-existant test
mpickering updated this revision to Diff 4485.Oct 13 2015, 10:44 AM
  • Renaming
mpickering updated this revision to Diff 4488.Oct 13 2015, 4:38 PM
mpickering marked 33 inline comments as done.
  • renaming and comments
  • Comments & Cleanup
  • fix bad submodule merge
mpickering updated this revision to Diff 4489.Oct 13 2015, 4:47 PM
  • submodule merge
  • another bad submodule merge

I think I should have dealt with all of the outstanding comments now. The one lint error is 1 character too many and would be a paint to realign so I left it.

compiler/basicTypes/IdInfo.hs
117

I merged the two functions together. There is a test which checks for this already and the error message got much better.

compiler/rename/RnSource.hs
120

Note to self: the first pass puts all the names in tcg_rdr_env and the second into tcg_field_env.

compiler/typecheck/TcPatSyn.hs
385

I merged this code in with the old record selector generation code.

mpickering updated this revision to Diff 4491.Oct 13 2015, 7:10 PM
  • Refactor patsyn renaming
mpickering added a comment.EditedOct 13 2015, 7:12 PM

I actually missed two of Richard's comments but this should be nearly ready to merge once I add some documentation.

mpickering marked an inline comment as done.Oct 14 2015, 7:13 AM

I think that should be all the comment dealt with.

compiler/typecheck/TcExpr.hs
706

No but it fails with

+mixed-pat-syn-record-sels.hs:9:9:
+    No constructor has all these fields: ‘a’, ‘b’
+    In the expression: x {a = True, b = False}
+    In an equation for ‘foo’: foo x = x {a = True, b = False}

which I argue is desirable because it is desirable to treat pattern synonyms and data constructors uniformly.

743

Seems to be still useful if the previous line ever gets changed? The error would be much better.

820

Good observation. I made a note in PatSyn.

mpickering updated this revision to Diff 4493.Oct 14 2015, 7:18 AM
mpickering marked an inline comment as done.
  • slight refactor
  • comment
  • Correct origin
  • Add test for mixing pat syn record selectors

I think that should be all the comment dealt with.

compiler/rename/RnBinds.hs
665

Hmm, I'm not sure. I just followed the implementation of Prefix and Infix.

mpickering updated this revision to Diff 4494.Oct 14 2015, 8:29 AM
  • Remove used FVs stuff
simonpj requested changes to this revision.Oct 15 2015, 5:08 PM

I think that should be all the comment dealt with.

Excellent!

But, I hate to nag, but when we spoke at the Haskell exchange you did agree to write a specification didn't you?

I'd be quite content if the specification was part of the user manual; but at the moment there is no patch to the user manual at all, let alone a spec.

Simon

This revision now requires changes to proceed.Oct 15 2015, 5:08 PM

I think that should be all the comment dealt with.

Excellent!

But, I hate to nag, but when we spoke at the Haskell exchange you did agree to write a specification didn't you?

I'd be quite content if the specification was part of the user manual; but at the moment there is no patch to the user manual at all, let alone a spec.

Simon

Yes! The implementation is certainly the more interesting part, I will get to it soon. I have another patch D1325 which has generally improved documentation for this patch, D1258 and eventually the new signature stuff. I thought it made sense to do it all at once so that the whole section read better.

compiler/rename/RnBinds.hs
665

All the tests still pass removing all this free variables stuff so I'm not sure where it's used if anywhere.

Good progress; a few more things to do.

I have still not properly reviewed the update code, but I dont' think I'm going to have time to do that.

compiler/basicTypes/IdInfo.hs
112

No, no. We can't use an Either here -- too opaque at use sites. C.f. the ConLike module.

Either we could generalise ConLike, by parameterising it, so that's isomorphic to Either:

data ConLike a b = DataLike a | PatSynLike b

or we can declare a new sum type here

data RecSelParent = RecSelData TyCon | RecSelPatSyn PatSyn
125

This coment seems oprhaned (and was before your work). What does it refer to? Maybe the RecSelId?

compiler/rename/RnBinds.hs
691

emptyFVs??? With no comment??? Matthew you can't just arbitrarily delete code without understanding what it does! Saying that no test fails is not good enough!! Maybe we need another test.

In this case I think we need the free variables so that if the pattern uses view patterns, we will see the dependency between the pattern synonym and functions mentioned in the view pattern.

I'm afraid you'll have to put it back.

compiler/rename/RnSource.hs
1595

Please give a comment to explain what this function does

compiler/typecheck/TcPatSyn.hs
84

If you are going to use this style for rec_fields, use it for is_infix too, so that there are three let-bindings, for arg_names, is_infox, and rec_fields.. It's inconsistent otherwise

141

Same again. Actually now I see the code is fully dupliated, why not put it in an aux function returning a triple?

241

Close the fixM here. No need to have the gbl_env involved; it can happen after you have "tied the knot".

mpickering marked 5 inline comments as done.Oct 16 2015, 4:35 AM

Made the changes apart from refactoring the Either.

I am very scared of the rebase which is going to have to happen after D761 gets merged.

compiler/basicTypes/IdInfo.hs
112

I wanted this to be [ConLike] (or better NonEmpty ConLike) but I couldn't work out how to make the interface files work out so was left with this less nice solution.

125

There is a function isImplicitId in this module.

isImplicitId :: Id -> Bool
-- ^ 'isImplicitId' tells whether an 'Id's info is implied by other
-- declarations, so we don't need to put its signature in an interface
-- file, even if it's mentioned in some other interface unfolding.
isImplicitId id
  = case Var.idDetails id of
        FCallId {}       -> True
        ClassOpId {}     -> True
        PrimOpId {}      -> True
        DataConWorkId {} -> True
        DataConWrapId {} -> True
                -- These are implied by their type or class decl;
                -- remember that all type and class decls appear in the interface file.
                -- The dfun id is not an implicit Id; it must *not* be omitted, because
                -- it carries version info for the instance decl
        _               -> False
mpickering updated this revision to Diff 4511.Oct 16 2015, 4:37 AM
  • Revert "Remove used FVs stuff"
  • Refactoring
  • Move
  • Close fixM earlier
  • Comment

I am very scared of the rebase which is going to have to happen after D761 gets merged.

Yes, it's unfortunate that we have this impending pile of merge conflicts! Hopefully they should mostly be fairly mechanical to fix, though, from a quick skim of your diff. Let me know if I can do anything to help.

I don't know exactly how you are managing your branch, but I'd suggest merging master (once it includes D761) into your branch, rather than rebasing all your existing commits. With ORF I made the mistake of relying rather too heavily on rebasing, and ended up with a completely meaningless history.

I am very scared of the rebase which is going to have to happen after D761 gets merged.

Yes, it's unfortunate that we have this impending pile of merge conflicts! Hopefully they should mostly be fairly mechanical to fix, though, from a quick skim of your diff. Let me know if I can do anything to help.

I don't know exactly how you are managing your branch, but I'd suggest merging master (once it includes D761) into your branch, rather than rebasing all your existing commits. With ORF I made the mistake of relying rather too heavily on rebasing, and ended up with a completely meaningless history.

The commit history is already very messy. Phabricator will squash them all to a single commit anyway once it gets landed.

The bit I am worried about is the record selector generation. I don't know how this is implemented but if there class instances generated then there seems to be the possibility of generating orphan instances.

The only way to see will be to try and do the merge.

Actually before you finally land this, please squash it into one big commit. The "local commit history" on this branch is not at all informative,, because it records the journey rather the outcome.

The same may be true for you, Adam?

The bit I am worried about is the record selector generation. I don't know how this is implemented but if there class instances generated then there seems to be the possibility of generating orphan instances.

I wouldn't worry, D761 doesn't actually do anything to do with generating class instances, it just (!) modifies the renamer to allow duplicate field names. It'll be a while (i.e. probably not before 8.0) before I get a chance to revisit the instance generation for ORF, but then we'll have to figure out how it interacts with record pattern synonyms. My guess is that they will work rather like "virtual" record fields, which we want anyway.

Actually before you finally land this, please squash it into one big commit. The "local commit history" on this branch is not at all informative,, because it records the journey rather the outcome.

The same may be true for you, Adam?

Yes, I'm planning to do a squash merge.

I tried doing the merge in the last two days but didn't succeed. I won't have time to work on this anymore for the foreseeable future.

mpickering updated this revision to Diff 4547.Oct 17 2015, 6:46 PM
  • Refactoring Either to RecSelParent
  • Bad merge

I tried doing the merge in the last two days but didn't succeed. I won't have time to work on this anymore for the foreseeable future.

Is that a request for someone else to push this through? After all this work, it would be terrible not to cover the last mile. Thanks for that work!

Adam has provided a patch which fixes up the merge in one place I missed and couldn't fix. Thanks to Adam for this.

The second more serious problem is that it is not clear what the interaction between this feature and DuplicateRecordFields should be. Adam has pointed out that there are some assumptions that selectors have to be associated with a specific type constructor. (As we discovered in D1258). I don't know how to fix this so that the two features behave well together.

The second more serious problem is that it is not clear what the interaction between this feature and DuplicateRecordFields should be. Adam has pointed out that there are some assumptions that selectors have to be associated with a specific type constructor. (As we discovered in D1258). I don't know how to fix this so that the two features behave well together.

Would you like to give an example that illustrates the problem?

Would you like to give an example that illustrates the problem?

Consider the following module:

{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE PatternSynonyms #-}

module Main where

pattern S{a, b} = (a, b)
pattern T{a}    = Just a

main = do
  print S{ a = "fst", b = "snd" }
  print T{ a = "a" }

In principle, this ought to work, because there is no ambiguity. But at the moment it leads to a "multiple declarations of a" error. The problem is that pattern synonym record selectors don't do the same name mangling as normal datatypes when DuplicateRecordFields is enabled. They could, but this would require some work to track the field label and selector name separately.

In particular, we currently represent datatype selectors in the third component of AvailTC, but pattern synonym selectors are just represented as Avails (because they don't have a corresponding type constructor). Moreover, the GlobalRdrElt for a selector currently requires it to have a parent tycon.

mpickering updated this revision to Diff 4564.Oct 19 2015, 3:22 PM
  • Record pattern synonyms
  • Apply Adam's patch

The diff now should build against master and validate.

mpickering updated this revision to Diff 4566.Oct 19 2015, 4:32 PM
  • Remove tracing statements
mpickering updated this revision to Diff 4567.Oct 19 2015, 5:56 PM
  • Record pattern synonyms
  • Apply Adam's patch
  • Remove tracing statements
mpickering updated this revision to Diff 4568.Oct 19 2015, 6:55 PM
  • Remove tracing
mpickering updated this revision to Diff 4573.Oct 20 2015, 4:17 AM
  • Add missing tests
mpickering updated this revision to Diff 4574.Oct 20 2015, 6:39 AM
  • Remove tracing again..
mpickering updated this revision to Diff 4740.Oct 28 2015, 10:18 AM
mpickering removed rGHC Glasgow Haskell Compiler as the repository for this revision.

Add comment for HsWrapper.

bgamari added a comment.EditedOct 28 2015, 10:22 AM

Alright, I'm going to merge this. Thanks @mpickering!

There are, however, a few issues which we may want to address in the future (most of which reference inlines comments in this Diff),

  • @goldfire's comment in HsExpr.hs hasn't been addressed with a Note (if I'm not mistaken)
  • It's still not clear to me that the free variables produced in rnPatSynBind are correct
  • Perhaps in the future we want to refactor tc_single to use a function which extends the typechecking environment instead of using setGblEnv
  • @mpickering's suggested refactoring of RecSelId to RecSelId (Either PatSyn TyCon) Bool described in the comment in TcExpr.hs
  • There is a TODO in tc_patsyn_finish which should either be clarified so it can be considered actionable, removed, or just fixed
  • @goldfire's comment regarding the type variables in mkPatSynBuilderId doesn't appear to have a response
  • @goldfire's question on line 406 of TcPatSyn.hs is unanswered,

Is this output sensible? That is, is this case covered by a should_run test?

bgamari commandeered this revision.Oct 29 2015, 10:31 AM
bgamari edited reviewers, added: mpickering; removed: bgamari.

This was merged in rGHC2a74a64e8329.

bgamari abandoned this revision.Oct 29 2015, 10:31 AM