Provide a faster implementation for the Read Integer instance
ClosedPublic

Authored by redneb on Feb 5 2015, 11:06 PM.

Details

Summary

The current implementation of the Read Integer instance has quadratic
complexity and thus performs badly on large inputs. This patch provides a
rather simple sub-quadratic algorithm. For small inputs, we use the old
algorithm (there is a small penalty for that). The gains for large
inputs can be dramatic: on my system, the time to perform

read (take 1000000 $ cycle "1234567890") :: Integer

drops from 65 seconds to less than a second.

Note that we already provide an ad-hoc instance for Show Integer, so this
patch essentially does the same thing for Read Integer.

Test Plan

Check that read :: String -> Integer returns correct results for inputs of various sizes.

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
redneb updated this revision to Diff 2197.Feb 5 2015, 11:06 PM
redneb retitled this revision from to Provide a faster implementation for the Read Integer instance.
redneb updated this object.
redneb edited the test plan for this revision. (Show Details)
redneb added a reviewer: austin.
redneb updated the Trac tickets for this revision.
ekmett added a subscriber: ekmett.Feb 5 2015, 11:15 PM

Doesn't it make sense to do this for Natural as well as Integer?

Well, the Read Natural instance currently relies on Read Integer, so it will benefit from this patch as well. But I can still add a RULE for Natural if you want.

austin accepted this revision.Feb 17 2015, 9:13 AM
austin edited edge metadata.

LGTM.

This revision is now accepted and ready to land.Feb 17 2015, 9:13 AM
redneb updated this revision to Diff 2252.Feb 17 2015, 6:41 PM
redneb edited edge metadata.

Extend the fast integer parser to support Natural

  • Comments only
  • Add support for Natural in the integer parser in Text.Read.Lex
  • Add specializations in Text.Read.Lex related to the new integer parser
hvr edited edge metadata.EditedFeb 18 2015, 1:32 AM

@redneb can we keep the Natural-support out of this for the meantime?

The boot file will make it only harder to de-cycle base, and I don't see any significant benefit for the cost

hvr requested changes to this revision.Feb 18 2015, 9:43 AM
hvr edited edge metadata.
This revision now requires changes to proceed.Feb 18 2015, 9:43 AM
redneb updated this revision to Diff 2262.Feb 18 2015, 10:47 AM
redneb edited edge metadata.

Drop support for Natural in Text.Read.Lex (for now)

The only disadvantage now is that the readIntP, readOct, readDecP, and readHexP parsers do not have a specialization for Natural, but I guess we can fix that later on.

hvr accepted this revision.Feb 22 2015, 11:23 AM
hvr edited edge metadata.

LGTM (as in "I couldn't find anything obviously wrong")

@redneb would you mind running a nofib comparison with and w/o D645, just to make sure there are no surprises?

This revision is now accepted and ready to land.Feb 22 2015, 11:23 AM
In D645#18867, @hvr wrote:

LGTM (as in "I couldn't find anything obviously wrong")

@redneb would you mind running a nofib comparison with and w/o D645, just to make sure there are no surprises?

Here you are: P51. The basis of comparison is the current master branch (i.e. 7.11): master vs master+D645.

This revision was automatically updated to reflect the committed changes.

This is great, but could you add the tests you ran to the GHC test suite also?