Fix for T14251 on ARM
ClosedPublic

Authored by kavon on Oct 23 2018, 5:13 PM.

Details

Summary

We now calculate the SSE register padding needed to fix the calling
convention in LLVM in a robust way: grouping them by whether
registers in that class overlap (with the same class overlapping itself).

My prior patch assumed that no matter the platform, physical
register Fx aliases with Dx, etc, for our calling convention.

This is unfortunately not the case for any platform except x86-64.

Test Plan

Only know how to test on x86-64, but it should be tested on ARM with:

make test WAYS=llvm && make test WAYS=optllvm

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
kavon created this revision.Oct 23 2018, 5:13 PM
kavon edited the test plan for this revision. (Show Details)Oct 23 2018, 5:15 PM
kavon updated the Trac tickets for this revision.
kavon updated the Trac tickets for this revision.Oct 23 2018, 5:17 PM
kavon updated the Trac tickets for this revision.Oct 23 2018, 5:54 PM
angerman added inline comments.Oct 23 2018, 8:13 PM
compiler/llvmGen/LlvmCodeGen/Base.hs
205

shouldn't we test against x86_64 here? I believe arm/aarch64 are registered platforms, am I mistaken?

bgamari requested changes to this revision.Oct 24 2018, 2:04 PM
bgamari added inline comments.
compiler/llvmGen/LlvmCodeGen/Base.hs
193

For what it's worth I would have just made these haddock comments on the type signature.

205

Agreed. In fact, in general I wish we could clearly demarcate this platform-specific logic (perhaps even moving it into a separate module entirely?); the mentions of SSE seem quite out of place in llvmFunArgs, which is really should be platform independent. In my mind XmmReg, YmmReg, and ZmmReg are wide registers of the abstract STG machine that we happen to map to SSE on amd64.

This revision now requires changes to proceed.Oct 24 2018, 2:04 PM
kavon updated this revision to Diff 18436.Oct 24 2018, 3:46 PM
  • naming change, SSE -> FPR
kavon added a comment.Oct 24 2018, 3:50 PM

Added inline comments & an updated revision.

compiler/llvmGen/LlvmCodeGen/Base.hs
193

Not sure what you mean here, since I think this already is a haddock comment. Do you want me to make it more succinct?

205

@angerman the code inside padLiveArgs is designed to handle any platform that uses physical registers specified by GHC to pass floating-point values, so no x86-64 test is needed (we use CmmUtils.regsOverlap for the platform-specific handling).

Thus, the function is specific to how LLVM assigns values to registers in a calling convention (left-to-right). So, if F2 is the only live floating-point register, we need to pretend F1 is live too, otherwise LLVM will put F2 in F1's register.

I added this test for unregisterized builds because I thought none of this should be needed in that case (not tested yet, but it should be safe to simply skip this patch for that type of build), since I imagine we do not use any specific physical registers on whichever platform we're on in that case.

@bgamari Yes I think the prior "SSE" naming convention was pretty confusing, though in its defense the Xmm/Ymm/Zmm naming in STG are x86-64 names, though not meant to be platform specific. I've gone ahead and changed all of the naming to FPR (floating-point register) instead.

bgamari accepted this revision.Oct 28 2018, 11:11 AM

Alright, looks good enough to merge. However, could you follow-up adding a Note explaining this whole thing more concretely? It would currently be very difficult for someone to reconstruct why this logic is needed.

compiler/llvmGen/LlvmCodeGen/Base.hs
193

I meant attach the haddock comments to the actual arguments. e.g.

-- | Add padding registers to the set of live registers to ensure LLVM's
-- calling convention doesn't use shadowed registers.
padLiveArgs :: DynFlags
            -> LiveGlobalRegs
               -- ^ live registers
            -> [(Bool, GlobalReg)]
               -- ^ An augmented list of live registers, where padding was
               -- added to the list of registers to ensure the calling convention is
               -- correctly used by LLVM.
               --
               -- Each global reg in the returned list is tagged with a bool, which
               -- indicates whether the global reg was added as padding, or was an original
               -- live register.
               --
               -- That is, True => padding, False => a real, live global register.
               --
               -- Also, the returned list is not sorted in any particular order.

I think this whole thing really deserves a long Note with an example.

This revision is now accepted and ready to land.Oct 28 2018, 11:11 AM
This revision was automatically updated to reflect the committed changes.

Alright, looks good enough to merge. However, could you follow-up adding a Note explaining this whole thing more concretely? It would currently be very difficult for someone to reconstruct why this logic is needed.

Sadly this still isn't quite right. We now see the padLiveArgs -- i > regNum ?? panic when building GHC using the LLVM backend.

https://circleci.com/gh/ghc/ghc/10749