Replace hashing function for string keys implementation with xxhash
ClosedPublic

Authored by Phyx on Sep 1 2017, 3:33 PM.

Details

Summary

When doing profiling on startup time of ghci on Windows, both cold
and startup loading static LLVM libs, the profiler is showing a glaring
red spot on the division operation of the the hashStr function.

In fact profiling shows 14% of the time is spent hashing the keys.

So I am replacing the hash function with xxHash which is a very fast
non-crypto hash. It's faster than MurMurHash which node etc use.

It also passes SMHasher. I can provide if required the collected raw data.
But from analysis done on the keys, xxHash does not introduce more collisions
than before, the amount splits seem about the same and the distributions
among the buckets are slightly more uniform than before.

However the runtime dropped enough to remove the function completely from
the profiler's report.

There's also a noticeable improvement in responsiveness.

xxHash is BSD licensed and can be found https://github.com/Cyan4973/xxHash

Test Plan

./validate

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
Phyx created this revision.Sep 1 2017, 3:33 PM
bgamari accepted this revision.Sep 1 2017, 3:49 PM

This sounds quite reasonable to me. @simonmar, any objection?

This revision is now accepted and ready to land.Sep 1 2017, 3:49 PM

Out of curiosity, which hash table was it that was so hot?

Phyx added a comment.Sep 1 2017, 3:54 PM

The rts hashtable which stores symbols.

This revision was automatically updated to reflect the committed changes.
Phyx added inline comments.Sep 6 2017, 5:34 AM
rts/Hash.c
82

Hmm how did harbormaster build here but the later build failed.. Sorry this should be a define check. Could someone fix this for me? I won't be near a computer for a few days

paolo added a subscriber: paolo.Sep 6 2017, 6:42 PM