Return nBytes instead of nextAddr from utf8DecodeChar
ClosedPublic

Authored by thomie on Aug 27 2014, 6:22 AM.

Details

Summary

While researching D176, I came across the following simplification
opportunity:

Not all functions that call utf8DecodeChar actually need the address
of the next char. And some need the 'number of bytes' read. So returning
nBytes instead of nextAddr should save a few addition and subtraction
operations, and makes the code a bit simpler.

Test Plan

it validates

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Lint
Lint Skipped
Unit
Unit Tests Skipped
thomie updated this revision to Diff 444.Aug 27 2014, 6:22 AM
thomie retitled this revision from to Return nBytes instead of nextAddr from utf8DecodeChar.
thomie updated this object.
thomie edited the test plan for this revision. (Show Details)
thomie added reviewers: austin, simonmar.
austin updated this object.Aug 27 2014, 12:43 PM
austin edited edge metadata.
ezyang requested changes to this revision.Aug 27 2014, 12:50 PM
ezyang added a reviewer: ezyang.

Seems good except for the moved STRICT2.

compiler/utils/Encoding.hs
129

I don't think you want STRICT2 floated up here...

This revision now requires changes to proceed.Aug 27 2014, 12:50 PM
austin requested changes to this revision.Aug 27 2014, 1:00 PM
austin edited edge metadata.

LGTM, but...

compiler/utils/Encoding.hs
129

Yes, floating out STRICT2 is likely an error - in particular it means we may miss unboxing n in the loop body since it's no longer marked. (Or maybe not, but anyway, still probably wrong).

In fact with this proposed change I see no reason for STRICT2 at all, considering p is already unboxed with this change. For Maximum Speed we could also just pass n as unboxed in the body and rebox only on the return branch.

But personally, I'd just suggest marking the loop body as go p# !n and let GHC do the unboxing work. And can someone explain to me why we use STRICTn macros instead of just BangPatterns?

thomie updated this revision to Diff 490.Sep 2 2014, 12:49 PM
thomie edited edge metadata.

Rely on GHC for unboxing ptr in utf8DecodeString and countUTF8Chars, and
don't use STRICTn macros. The simplified core output is not changed, or
slightly improved. Originally the calculation of end did not get floated out of the inner loop.

thomie updated this object.Sep 2 2014, 12:55 PM
thomie edited edge metadata.
austin accepted this revision.Sep 2 2014, 6:07 PM
austin edited edge metadata.

LGTM, thanks!

austin closed this revision.Sep 16 2014, 7:58 AM
austin updated this revision to Diff 552.

Closed by commit rGHCcaf449e39f5e (authored by @thomie, committed by @austin).