- User Since
- Apr 27 2018, 11:24 AM (11 w, 2 d)
Mon, Jul 9
update annotations for coercions; rebase on master
Wed, Jul 4
- Overall changes so far:
- Removed Refl r ty and CoherenceCo
- Introduced Refl ty and GRefl r ty MCoercion
- Refl ty :: ty ~n ty. Note that Refl ty is always nominal.
- GRefl r ty MRefl :: ty ~r ty. If r == Nominal, use Refl.
- GRefl r ty (MCo co) :: ty ~r ty |>co.
- Replaced original Refl Nominal ty with Refl ty.
- Given g1 :: s ~r t, we used CoherenceCo g1 g2 to construct s |> g2 ~r t. It is now replaced with Sym (GRefl r s g2) ; g1. The same approach applies to s ~ t |> g3.
- It turns out that the explicit pattern match in homogenise_result in TcFlatten triggers a GHC optimization and improves performance. However, it is not useful on the master branch.
- Added a small regression for T9872d (the added number is the allocation of current master on T9872d).
- Added note about flatten_exact_fam_app_fully performance in TcFlatten.
- Performance summary
- This patch is intended to improve the overall performance of coercions.
- It performs better in all cases under perf/compiler, except T9872b (0.6%), T9872d (3.7%), and T14683 (0.02%).
- It fails T9872d, so we added a small regression for that test.
- It seems to perform better when compiling large packages, e.g. Cabal.
- Further analysis of the performance: a step-by-step replay of the refactor following Simon's suggestion.
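The constructor changes listed above can be sketched with a toy model. This is only an illustration, not GHC's actual TyCoRep code: the Type and Axiom definitions and the names mkGRefl, coherenceLeft, and coherenceRight are hypothetical, and the right-hand construction g1 ; GRefl r t g3 is an assumption inferred from the left-hand case.

```haskell
-- Toy model only: these are NOT GHC's real definitions.
data Role = Nominal | Representational | Phantom
  deriving (Eq, Show)

-- Hypothetical stand-in for types; just enough to name one.
newtype Type = TyVar String
  deriving (Eq, Show)

data MCoercion = MRefl | MCo Coercion
  deriving (Eq, Show)

data Coercion
  = Refl Type                  -- ty ~n ty, always nominal
  | GRefl Role Type MCoercion  -- ty ~r ty (MRefl) or ty ~r ty |> co (MCo co)
  | Sym Coercion
  | Trans Coercion Coercion    -- written g1 ; g2 in the notes
  | Axiom String               -- hypothetical opaque coercion for examples
  deriving (Eq, Show)

-- Smart constructor: if r == Nominal and there is no cast, use Refl.
mkGRefl :: Role -> Type -> MCoercion -> Coercion
mkGRefl Nominal ty MRefl = Refl ty
mkGRefl r       ty mco   = GRefl r ty mco

-- Given g1 :: s ~r t, build s |> g2 ~r t as Sym (GRefl r s g2) ; g1.
coherenceLeft :: Role -> Type -> Coercion -> Coercion -> Coercion
coherenceLeft r s g2 g1 = Sym (GRefl r s (MCo g2)) `Trans` g1

-- Assumed mirror image: given g1 :: s ~r t, build s ~r t |> g3.
coherenceRight :: Role -> Type -> Coercion -> Coercion -> Coercion
coherenceRight r t g3 g1 = g1 `Trans` GRefl r t (MCo g3)
```

In this sketch the coherence cases need no dedicated constructor: both are expressed with GRefl, Sym, and transitivity.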
add notes for GRefl in TyCoRep, performance issue in TcFlatten, and regression for T9872d
Mon, Jul 2
Over the last week, by comparing the -ddump-simpl output, Richard and I found a simple optimization.
In this last commit I refactored homogenise_result to pattern match explicitly on kind_co, distinguishing the case where kind_co is a GRefl co from the case where it is not. This improves performance notably. Specifically, because flatten_args_fast always returns a Refl co, GHC can use this information to go directly into the case where kind_co is a GRefl co, which saves a lot of allocation.
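The shape of that explicit pattern match can be sketched roughly as follows. This is a toy model, not the code in TcFlatten: the data types and the name homogeniseSketch are hypothetical, and it assumes that a reflexive kind_co lets the function return its inputs unchanged, while the general case casts the type and adjusts the coercion with Sym (GRefl r xi kind_co) ; co.

```haskell
-- Toy model only: these are NOT GHC's real definitions.
data Role = Nominal | Representational
  deriving (Eq, Show)

data Type = TyVar String | CastTy Type Coercion
  deriving (Eq, Show)

data MCoercion = MRefl | MCo Coercion
  deriving (Eq, Show)

data Coercion
  = Refl Type
  | GRefl Role Type MCoercion
  | Sym Coercion
  | Trans Coercion Coercion
  | Axiom String               -- hypothetical opaque coercion for examples
  deriving (Eq, Show)

-- Hypothetical stand-in for homogenise_result.
-- Matching on the constructor of kindCo up front means that a caller
-- known to pass a reflexive coercion can be specialised to the first
-- branches, so the cast and the extra coercion nodes are never built.
homogeniseSketch :: Role -> Type -> Coercion -> Coercion -> (Type, Coercion)
homogeniseSketch r xi co kindCo = case kindCo of
  Refl _          -> (xi, co)   -- reflexive: nothing to do
  GRefl _ _ MRefl -> (xi, co)   -- reflexive: nothing to do
  _               -> ( CastTy xi kindCo
                     , Sym (GRefl r xi (MCo kindCo)) `Trans` co )
```

The allocation saving comes from the reflexive branches returning their arguments as they are; only the fallback branch allocates new type and coercion nodes.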
Fri, Jun 29
try a small trick in homogenise_result
Tue, Jun 26
Sun, Jun 24
After some discussions with Richard, we made some optimizations and got rid of one performance failure.
Unfortunately we still have 3 performance failures, for T9872a, T9872b, and T9872c, with deviations of 6.7%, 5.7%, and 6.3% respectively.
(latest master itself has deviation 3.5%, 3%, 3.35% on those cases.)
Jun 15 2018
Current version adds the Nominal Reflexive Coercion Refl ty :: ty ~n ty.
Add Nominal Refl Coercion
Jun 12 2018
After several commits (some simplification of the smart constructors, saving bits whenever possible, etc.), we have better deviations of 8.8%, 8.9%, 8.1%, and 10.2%. At least a little bit of progress?
address most comments from Richard
Jun 9 2018
clean up; add more annotations
Jun 8 2018
fix problematic cases according to Simon's comment
Refactor Refl to GRefl
May 29 2018
- fix some comments
May 28 2018
May 16 2018
- update annotations for MCoercion