The enumFromTo and enumFromThenTo methods are already marked INLINE for
the Int and Word types. For Int32, Word32 and Int64 (as well as Int16,
Word16 and similar types), however, these two methods fall back to the
default implementations in the Enum typeclass, which are not marked INLINE.
The missing INLINE breaks stream fusion, which degrades performance for
Int32, Int64 and similar types. See Trac #15185.
This patch fixes that.
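As a rough illustration of the mechanism, the sketch below puts an INLINE pragma on a default method of a class. MyEnum, toI, fromI and rangeFromTo are hypothetical names standing in for Enum and its methods; this is not the actual GHC.Enum source, only a minimal example of the pragma placement.

```haskell
module Main (main) where

import Data.Int (Int64)

-- Hypothetical class standing in for Enum; not GHC's actual source.
class MyEnum a where
  toI   :: a -> Int
  fromI :: Int -> a

  -- Default method. Instances that do not override it use this
  -- definition, so the pragma below is what makes the unfolding
  -- available at their call sites.
  rangeFromTo :: a -> a -> [a]
  rangeFromTo x y = map fromI [toI x .. toI y]
  {-# INLINE rangeFromTo #-}

-- This instance relies on the default rangeFromTo.
instance MyEnum Int64 where
  toI   = fromIntegral
  fromI = fromIntegral

main :: IO ()
main = print (product (rangeFromTo 1 (20 :: Int64)))
```

Whether the list is actually fused away depends on the optimisation level and on the rewrite rules that apply, so inspecting the generated Core is the reliable check.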
For the benchmark program in Trac #15185:

    fact1 :: Integral t => t -> t
    fact1 n = product [1..n]

    fact2 :: Integral t => t -> t
    fact2 n = go 1 n
      where
        go acc 1 = acc
        go acc n = go (acc * n) (n - 1)

Without INLINE, we have:

    fact1 20/Int      mean 19.27 ns  ( +- 437.5 ps )
    fact1 20/Word     mean 19.85 ns  ( +- 350.5 ps )
    fact1 20/Int64    mean 156.8 ns  ( +- 3.221 ns )
    fact1 20/Word64   mean 836.9 ns  ( +- 15.94 ns )

There is a large gap between Int and Int64 (roughly 8x), as well as between
Word and Word64 (roughly 42x).
After marking these methods INLINE (in particular the default
implementations in the Enum class), we have:

    fact1 20/Int      mean 21.20 ns  ( +- 460.2 ps )
    fact1 20/Word     mean 19.72 ns  ( +- 446.8 ps )
    fact1 20/Int64    mean 22.35 ns  ( +- 531.1 ps )
    fact1 20/Word64   mean 632.1 ns  ( +- 16.81 ns )

The Int64 case now performs on par with the Int case. The variance
introduced by outliers is somewhat inflated, but the results are enough to
demonstrate the problem and the performance improvement.
We still cannot bring the Word64 case to numbers similar to Word. The
enumFromTo for Word64 (and for Word32 on 32-bit platforms) is
integralEnumFromTo, whose overhead is significant. However, INLINE does
yield some improvement (836.9 ns down to 632.1 ns).
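To make the source of that overhead more concrete, here is a minimal sketch assuming that integralEnumFromTo enumerates over Integer and converts each element back to the target type. viaIntegerFromTo is a hypothetical stand-in written for illustration, not the library function itself.

```haskell
module Main (main) where

import Data.Word (Word64)

-- Hypothetical stand-in for integralEnumFromTo, assuming it goes
-- through Integer: both bounds are widened with toInteger, the range
-- is enumerated at Integer, and every element is narrowed again with
-- fromInteger.
viaIntegerFromTo :: Integral a => a -> a -> [a]
viaIntegerFromTo n m = map fromInteger [toInteger n .. toInteger m]
{-# INLINE viaIntegerFromTo #-}

main :: IO ()
main = print (product (viaIntegerFromTo 1 (20 :: Word64)))
```

Under this assumption, each element pays for Integer arithmetic and a conversion, which would be consistent with Word64 remaining slower than Word even when the call is inlined.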
The enumFromTo method for Int16, Word16 and similar types will also
benefit from the INLINE pragma.
Signed-off-by: HE, Tao <firstname.lastname@example.org>