Improve performance of CallArity
The hot path contained a call to
v `elemUnVarSet` (neighbors g v)
and building the whole set of neighbors just to check whether v is in it
accounted for half the allocations of the test case of Trac #15164.
Introducing a non-allocating function hasLoopAt for this check shaves
off those allocations. This brings the total cost of Call Arity down
to 20% of time and 23% of allocations, according to a profiled run. Not
amazing, but still much better.
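
For illustration, a minimal self-contained sketch of the idea, not the
code in this patch. It assumes a graph stored as a list of complete and
complete-bipartite subgraphs over Int-valued variables. The names Graph,
Gen, Complete and CompleteBipartite are hypothetical stand-ins, and
Data.IntSet (from containers) replaces the compiler's own set type.

```haskell
import qualified Data.IntSet as IntSet
import Data.IntSet (IntSet)

-- Hypothetical stand-in: variables are plain Ints in this sketch.
type Var = Int

-- Assumed representation: a graph is a collection of generators.
data Gen
  = Complete IntSet                  -- all nodes in the set are adjacent, self-loops included
  | CompleteBipartite IntSet IntSet  -- every node of the first set is adjacent to every node of the second

newtype Graph = Graph [Gen]

-- Allocating version: builds the full neighbor set of v.
neighbors :: Graph -> Var -> IntSet
neighbors (Graph gens) v = IntSet.unions (concatMap go gens)
  where
    go (Complete s) = [s | v `IntSet.member` s]
    go (CompleteBipartite s1 s2) =
      [s2 | v `IntSet.member` s1] ++ [s1 | v `IntSet.member` s2]

-- Non-allocating version: answers "is v its own neighbor?" using only
-- membership tests, without constructing any intermediate set.
hasLoopAt :: Graph -> Var -> Bool
hasLoopAt (Graph gens) v = any go gens
  where
    go (Complete s) = v `IntSet.member` s
    go (CompleteBipartite s1 s2) =
      v `IntSet.member` s1 && v `IntSet.member` s2
```

Under these assumptions, hasLoopAt g v gives the same answer as
v `IntSet.member` neighbors g v: a generator contributes a set containing
v exactly when v lies in a complete set, or in both sides of a bipartite
one.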
Differential Revision: https://phabricator.haskell.org/D4718