The lexer hacks around unicode by squishing any character into a 'Word8'
and then storing the actual character in its state. This happens at
That is all well and good, but we ought to be careful that the characters we
retrieve via 'alexInputPrevChar' also fit this convention.
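
A minimal sketch of this convention, not the lexer's actual code: the names
'Input', 'squish', 'getByte' and 'inputPrevChar' and the class byte values are
hypothetical, chosen only for illustration. It assumes non-ASCII characters are
collapsed to one byte per character class, and that the previous character is
squished the same way before a left context gets to see it.

```haskell
import Data.Char (GeneralCategory (..), chr, generalCategory, isAscii, ord)
import Data.Word (Word8)

-- Hypothetical stand-in for the lexer state: the previously consumed
-- character (kept in full) and the remaining input.
data Input = Input
  { prevChar :: Char
  , rest     :: String
  }

-- Made-up class bytes for non-ASCII characters (not the lexer's real values).
-- The sketch assumes these control codes never occur in the input itself.
upperByte, lowerByte, digitByte, otherByte :: Word8
upperByte = 0x01
lowerByte = 0x02
digitByte = 0x03
otherByte = 0x04

-- Squish a character into a byte: ASCII passes through unchanged, anything
-- else collapses to the byte that stands for its character class.
squish :: Char -> Word8
squish c
  | isAscii c = fromIntegral (ord c)
  | otherwise = case generalCategory c of
      UppercaseLetter -> upperByte
      LowercaseLetter -> lowerByte
      DecimalNumber   -> digitByte
      _               -> otherByte

-- Hand out the squished byte, but remember the actual character in the state.
getByte :: Input -> Maybe (Word8, Input)
getByte (Input _ [])       = Nothing
getByte (Input _ (c : cs)) = Just (squish c, Input c cs)

-- Squish the remembered character too, so that a left context sees the same
-- bytes the rules themselves are matched against.
inputPrevChar :: Input -> Char
inputPrevChar = chr . fromIntegral . squish . prevChar
```

With this shape, a character set written in terms of the class bytes behaves
the same whether it is matched against the output of 'getByte' or against the
result of 'inputPrevChar'.
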
In fact, Trac #13986 exposes nicely what can go wrong: the regex in the left
context of the type application rule uses the '$idchar' character set
which relies on the unicode hack. However, a left context corresponds
to a call to 'alexInputPrevChar', and we end up passing full-blown
unicode characters to '$idchar', despite it not being equipped to deal