Add Bifoldable and Bitraversable to base
ClosedPublic

Authored by RyanGlScott on May 30 2016, 6:49 PM.

Details

Summary

This adds Data.Bifoldable and Data.Bitraversable from the
bifunctors package to base, completing the migration started in D336.
This is fairly straightforward, although there were a suprising amount of
reinternal organization in base that was needed for this to happen:

  • Data.Foldable, Data.Traversable, Data.Bifoldable, and Data.Bitraversable share some nonexported datatypes (e.g., StateL, StateR, Min, Max, etc.) to implement some instances. To avoid code duplication, I migrated this internal code to a new hidden module, Data.Functor.Utils (better naming suggestions welcome).
  • Data.Traversable and Data.Bitraversable also make use of an identity newtype, so I modified them to use Data.Functor.Identity.Identity. This has a ripple effect on several other modules, since I had to move instances around in order to avoid dependency cycles.

Fixes Trac #10448.

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
RyanGlScott retitled this revision from to Add Bifoldable and Bitraversable to base.May 30 2016, 6:49 PM
RyanGlScott updated this object.
RyanGlScott edited the test plan for this revision. (Show Details)
RyanGlScott added reviewers: ekmett, hvr, austin, bgamari.
RyanGlScott updated the Trac tickets for this revision.
hvr awarded a token.May 31 2016, 2:57 AM
hvr added a comment.EditedMay 31 2016, 3:02 AM

First of all, thanks for picking up this tedious task!

I haven't yet reviewed this detail, but there's a disturbing amount of {-# INLINE #-} pragmas; when we moved other modules into base we dropped most INLINE-pragmas which were in the original code. Are all these INLINE pragmas really needed? Wouldn't INLINEABLE suffice?

In D2284#66056, @hvr wrote:

I haven't yet reviewed this detail, but there's a disturbing amount of {-# INLINE #-} pragmas; when we moved other modules into base we dropped most INLINE-pragmas which were in the original code. Are all these INLINE pragmas really needed? Wouldn't INLINEABLE suffice?

To be honest, I copied the INLINE pragmas wholesale from bifunctors without giving them much thought. I wouldn't object to dropping them.

RyanGlScott updated this revision to Diff 7810.Jun 1 2016, 6:08 PM
  • Remove INLINE pragmas
RyanGlScott updated this revision to Diff 7835.Jun 3 2016, 3:06 PM

Rebase on top of master

I think the documentation could be made a bit more clear. Namely, the haddocks for all of the Foldable-like combinators seem to simply be copy-pasted from Foldable and therefore do not acknowledge that there are are in fact two elements being folded over for each bifunctorial thing.

I also rather worry that those INLINE pragmas might be necessary. @RyanGlScott, could you verify that, for instance, GHC decides to inline bisum in a small example program? Failure for these to inline would be a significant performance cliff.

I think the documentation could be made a bit more clear. Namely, the haddocks for all of the Foldable-like combinators seem to simply be copy-pasted from Foldable and therefore do not acknowledge that there are are in fact two elements being folded over for each bifunctorial thing.

Well sure, they're quite similar to their Foldable counterparts because literally the only thing that changed is the typeclass. Can you give an example documentation change that would make it clearer? Personally, I'm a bit adverse to saying that "two elements are being folded over", since that's misleading for Bifoldables like Const.

I also rather worry that those INLINE pragmas might be necessary. @RyanGlScott, could you verify that, for instance, GHC decides to inline bisum in a small example program? Failure for these to inline would be a significant performance cliff.

Admittedly, I'm not very experienced at determining when a function successfully inlines. Is there a particular combination of GHC flags that is useful for diagnosing this sort of thing?

RyanGlScott updated this revision to Diff 7874.Jun 6 2016, 11:33 AM

Rebase on top of master

Well sure, they're quite similar to their Foldable counterparts because literally the only thing that changed is the typeclass. Can you give an example documentation change that would make it clearer? Personally, I'm a bit adverse to saying that "two elements are being folded over", since that's misleading for Bifoldables like Const.

Alright, fair point. I'm afraid I don't know what in particular can be said clear this up and perhaps my concern is really just my unfamiliarity with these particular classes speaking.

In general it would be nice if our documentation would strive to be more accessible to those new to the language. You certainly don't need to be the one to make this happen, however.

I also rather worry that those INLINE pragmas might be necessary. @RyanGlScott, could you verify that, for instance, GHC decides to inline bisum in a small example program? Failure for these to inline would be a significant performance cliff.

Admittedly, I'm not very experienced at determining when a function successfully inlines. Is there a particular combination of GHC flags that is useful for diagnosing this sort of thing?

I would typically just compile a simple example program like,

import Data.Bifoldable

main = print $ bisum $ map (\i -> (i, i+1)) [0..10000]

and compile with -ddump-simpl, ensuring that the bisum call has been inlined away.

RyanGlScott planned changes to this revision.EditedJun 8 2016, 9:01 AM

Alright, fair point. I'm afraid I don't know what in particular can be said clear this up and perhaps my concern is really just my unfamiliarity with these particular classes speaking.

In general it would be nice if our documentation would strive to be more accessible to those new to the language. You certainly don't need to be the one to make this happen, however.

Well, literally the only difference between Foldable and Bifoldable is that the latter works on datatypes with at least two type parameters instead of one. Is that really worth mentioning in the documentation of every single function in Data.Bifoldable? I'd argue it isn't, since the documentation for the Bifoldable class itself mentions that it "identifies foldable structures with two different varieties of elements", which seems like a pretty good description to me.

However, I did check the Haddocks for base-4.9.0.0's Data.Foldable a minute ago, and there were some improvements made since base-4.8.0.0 in the form of examples (e.g., for_) and more detailed explanations for some functions (e.g., and). It's probably worthwhile for me to do another documentation pass over this Diff to make sure it's at least as informative as Data.Foldable and Data.Traversable.

I would typically just compile a simple example program like,

import Data.Bifoldable

main = print $ bisum $ map (\i -> (i, i+1)) [0..10000]

and compile with -ddump-simpl, ensuring that the bisum call has been inlined away.

Well, that example doesn't typecheck. I did compile this example:

module Main (main) where

import Data.Bifoldable
import Data.Monoid ((<>))

data BiList a b = Nil | Cons a b (BiList a b)
  deriving Show

instance Bifoldable BiList where
  bifoldMap f g Nil          = mempty
  bifoldMap f g (Cons a b l) = f a <> g b <> bifoldMap f g l

fromList :: [Int] -> BiList Int Int
fromList []     = Nil
fromList (x:xs) = Cons x (x+1) (fromList xs)

main :: IO ()
main = print $ bisum $ fromList [0..10000]

with -O2 -ddump-simpl, but the results are rather mysterious to me.

Well, that example doesn't typecheck.

Oops, yes, of course. Moreover, the example was significantly more complex than necessary to demonstrate inlining. Something as simple as bisum $ Right 5 would be perfectly sufficient. When compiled with -O this should reduce easily to simply 5.

with -O2 -ddump-simpl, but the results are rather mysterious to me.

The problem here is that the BiList is built using primitive recursion and is therefore not subject to fusion.

OK, I compiled this program:

module Main (main) where

import Data.Bifoldable

rightFive :: Either Int Int
rightFive = Right 5

main :: IO ()
main = print $ bisum rightFive

with ghc -O -ddump-simpl and got this:

[1 of 1] Compiling Main             ( Example.hs, Example.o )

==================== Tidy Core ====================
Result size of Tidy Core = {terms: 34, types: 38, coercions: 9}

-- RHS size: {terms: 2, types: 0, coercions: 0}
Main.$trModule2 :: GHC.Types.TrName
[GblId,
 Caf=NoCafRefs,
 Str=m1,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
         WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 30 20}]
Main.$trModule2 = GHC.Types.TrNameS "main"#

-- RHS size: {terms: 2, types: 0, coercions: 0}
Main.$trModule1 :: GHC.Types.TrName
[GblId,
 Caf=NoCafRefs,
 Str=m1,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
         WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 30 20}]
Main.$trModule1 = GHC.Types.TrNameS "Main"#

-- RHS size: {terms: 3, types: 0, coercions: 0}
Main.$trModule :: GHC.Types.Module
[GblId,
 Caf=NoCafRefs,
 Str=m,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
         WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 10 30}]
Main.$trModule = GHC.Types.Module Main.$trModule2 Main.$trModule1

-- RHS size: {terms: 9, types: 11, coercions: 0}
Main.main2 :: String
[GblId,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=False, ConLike=False,
         WorkFree=False, Expandable=False, Guidance=IF_ARGS [] 60 30}]
Main.main2 =
  case GHC.Show.$wshowSignedInt 0# 5# (GHC.Types.[] @ Char) of
  { (# ww5_a3bg, ww6_a3bh #) ->
  GHC.Types.: @ Char ww5_a3bg ww6_a3bh
  }

-- RHS size: {terms: 6, types: 2, coercions: 0}
Main.main1
  :: GHC.Prim.State# GHC.Prim.RealWorld
     -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)
[GblId,
 Arity=1,
 Str=<S,U>,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
         WorkFree=True, Expandable=True, Guidance=IF_ARGS [0] 40 0}]
Main.main1 =
  \ (eta_a2OG :: GHC.Prim.State# GHC.Prim.RealWorld) ->
    GHC.IO.Handle.Text.hPutStr2
      GHC.IO.Handle.FD.stdout Main.main2 GHC.Types.True eta_a2OG

-- RHS size: {terms: 1, types: 0, coercions: 3}
main :: IO ()
[GblId,
 Arity=1,
 Str=<S,U>,
 Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True,
         WorkFree=True, Expandable=True,
         Guidance=ALWAYS_IF(arity=0,unsat_ok=True,boring_ok=True)
         Tmpl= Main.main1
               `cast` (Sym (GHC.Types.N:IO[0] <()>_R)
                       :: ((GHC.Prim.State# GHC.Prim.RealWorld
                            -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)) :: *)
                          ~R#
                          (IO () :: *))}]
main =
  Main.main1
  `cast` (Sym (GHC.Types.N:IO[0] <()>_R)
          :: ((GHC.Prim.State# GHC.Prim.RealWorld
               -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)) :: *)
             ~R#
             (IO () :: *))

-- RHS size: {terms: 2, types: 1, coercions: 3}
Main.main3
  :: GHC.Prim.State# GHC.Prim.RealWorld
     -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)
[GblId,
 Arity=1,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True,
         WorkFree=True, Expandable=True, Guidance=IF_ARGS [] 20 60}]
Main.main3 =
  GHC.TopHandler.runMainIO1
    @ ()
    (Main.main1
     `cast` (Sym (GHC.Types.N:IO[0] <()>_R)
             :: ((GHC.Prim.State# GHC.Prim.RealWorld
                  -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)) :: *)
                ~R#
                (IO () :: *)))

-- RHS size: {terms: 1, types: 0, coercions: 3}
:Main.main :: IO ()
[GblId,
 Arity=1,
 Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True,
         WorkFree=True, Expandable=True,
         Guidance=ALWAYS_IF(arity=0,unsat_ok=True,boring_ok=True)
         Tmpl= Main.main3
               `cast` (Sym (GHC.Types.N:IO[0] <()>_R)
                       :: ((GHC.Prim.State# GHC.Prim.RealWorld
                            -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)) :: *)
                          ~R#
                          (IO () :: *))}]
:Main.main =
  Main.main3
  `cast` (Sym (GHC.Types.N:IO[0] <()>_R)
          :: ((GHC.Prim.State# GHC.Prim.RealWorld
               -> (# GHC.Prim.State# GHC.Prim.RealWorld, () #)) :: *)
             ~R#
             (IO () :: *))



Linking Example ...

which doesn't appear to mention bisum anywhere, so I think it inlined?

Indeed it did. Thanks, @RyanGlScott!

I agree that it might be nice to have a look through the current Foldable haddocks and replicate any improvements made there in this module.

RyanGlScott updated this revision to Diff 7916.Jun 9 2016, 10:09 AM
  • Documentation improvements

I added some more details to the Haddocks in Data.Bifoldable and Data.Bitraversable. In particular, I made an effort to emphasize that Bifoldable and Bitraversable operate over data structures with two varieties of elements, whereas Foldable and Traversable operate over one variety. How does it look?

bgamari accepted this revision.Jun 9 2016, 10:35 AM

Looks great! Thanks.

This revision is now accepted and ready to land.Jun 9 2016, 10:35 AM
This revision was automatically updated to reflect the committed changes.