Clean up opt and llc
Changes PlannedPublic

Authored by angerman on Mar 15 2017, 9:01 PM.

Details

Summary

The LLVM backend shells out to LLVMs opt and llc tools. This clean up introduces
a shared data structure to carry the arguments we pass to each tool so that corresponding
flags are next to each other. It drops the hard coded data layouts in favor of using -mtriple
and have LLVM infer them. Furthermore we add clang as a proper tool, so we don't rely on
assuming that clang is called clang on the PATH when using clang as the assembler.
Finally this diff also changes the type of optLevel from Int to Word, as we do not have
negative optimization levels.

Diff Detail

There are a very large number of changes, so older changes are hidden. Show Older Changes
angerman marked 2 inline comments as done.May 18 2017, 7:07 PM
angerman added inline comments.
compiler/llvmGen/LlvmCodeGen/Ppr.hs
89

@kavon we pass the target triple.

angerman updated this revision to Diff 12627.May 18 2017, 7:49 PM
angerman marked an inline comment as done.
  • non-negative optLevel and case
angerman updated this revision to Diff 12628.May 18 2017, 8:59 PM
  • Add ticket number.
angerman marked 2 inline comments as done and an inline comment as not done.May 18 2017, 9:13 PM
angerman added inline comments.
compiler/main/DriverPipeline.hs
1510

I've opened Trac #13724 to track this.

I prefer not to do triple checking here, to while/blacklist certain triples, as I see
this as rather fragile.

I've added the check that llc respects the opt_lc flags. Such that if required one
can pass -opt_lc=-O<N>.

1527
angerman marked 3 inline comments as done.May 18 2017, 9:14 PM
angerman edited the summary of this revision. (Show Details)May 18 2017, 9:21 PM
angerman edited the summary of this revision. (Show Details)May 18 2017, 9:23 PM

One question I have about this is why we need to know about clang at all? Can't we generally just call the "normal" C compiler, regardless of whether than happens to be gcc or clang.

compiler/main/DriverPipeline.hs
880

Sure, I don't doubt that it works now; I just worry what might change in the future. That being said, I don't consider the lack of such a test to be a show-stopper.

angerman marked 3 inline comments as done.May 18 2017, 10:32 PM
angerman added a subscriber: carter.

One question I have about this is why we need to know about clang at all? Can't we generally just call the "normal" C compiler, regardless of whether than happens to be gcc or clang.

We use clang as the assembler on macOS.

-- LLVM from version 3.0 onwards doesn't support the OS X system
-- assembler, so we use clang as the assembler instead. (#5636)
let whichAsProg | hscTarget dflags == HscLlvm &&
                  platformOS (targetPlatform dflags) == OSDarwin
                = return SysTools.runClang
                | otherwise = return SysTools.runAs

I am not sure we still need this though as the following is what my system reports.

λ as --version
Apple LLVM version 8.1.0 (clang-802.0.42)
Target: x86_64-apple-darwin16.5.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

The change about knowing clang is to ensure that runClang doesn't just assume that clang is called clang.
E.g. with the prefixed toolchain setup, my clang is called <target>-clang.

However, the question of getting rid of clang from GHC altogether would be better covered in a different diff
I believe.

compiler/main/DriverPipeline.hs
880

It might very well change in the future. I have high hopes that @carter or someone else will finally start getting back at simd stuff in GHC.
And hopefully provide sufficient test cases so that we might even be able to get rid of the avx mangling.

I do not see, how this diff might impact any future changes?

bgamari accepted this revision.May 18 2017, 10:39 PM

One question I have about this is why we need to know about clang at all? Can't we generally just call the "normal" C compiler, regardless of whether than happens to be gcc or clang.

We use clang as the assembler on macOS.

-- LLVM from version 3.0 onwards doesn't support the OS X system
-- assembler, so we use clang as the assembler instead. (#5636)
let whichAsProg | hscTarget dflags == HscLlvm &&
                  platformOS (targetPlatform dflags) == OSDarwin
                = return SysTools.runClang
                | otherwise = return SysTools.runAs

I am not sure we still need this though as the following is what my system reports.

Interesting.

λ as --version
Apple LLVM version 8.1.0 (clang-802.0.42)
Target: x86_64-apple-darwin16.5.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

The change about knowing clang is to ensure that runClang doesn't just assume that clang is called clang.
E.g. with the prefixed toolchain setup, my clang is called <target>-clang.

However, the question of getting rid of clang from GHC altogether would be better covered in a different diff
I believe.

Agreed. I was just wondering.

Other than the CI issues this looks fine to me.

compiler/main/DriverPipeline.hs
880

Right, I agree that the patch doesn't make things worse. I was just hoping to make things better. I'm okay with the patch as-is though.

This revision is now accepted and ready to land.May 18 2017, 10:39 PM
angerman updated this revision to Diff 12630.May 19 2017, 12:50 AM
angerman marked an inline comment as done.
angerman edited the summary of this revision. (Show Details)
  • Revert "Int -> Word"
kavon requested changes to this revision.May 19 2017, 4:15 AM

@angerman As far as I know, opt will not choose an appropriate datalayout for the given target triple... instead it will fall back on defaults that do not take the target into consideration [1]:

When constructing the data layout for a given target, LLVM starts with a default set of specifications which are then (possibly) overridden by the specifications in the datalayout keyword. The default specifications are given in this list:

This mismatch could lead to mis-optimizations of pointer arithmetic, for example. Until we're certain that this is not an issue, I would rather see the datalayout explicitly provided, at least when using opt.

[1] http://llvm.org/docs/LangRef.html#langref-datalayout

This revision now requires changes to proceed.May 19 2017, 4:15 AM
In D3352#102234, @kavon wrote:

@angerman As far as I know, opt will not choose an appropriate datalayout for the given target triple... instead it will fall back on defaults that do not take the target into consideration [1]:

When constructing the data layout for a given target, LLVM starts with a default set of specifications which are then (possibly) overridden by the specifications in the datalayout keyword. The default specifications are given in this list:

This mismatch could lead to mis-optimizations of pointer arithmetic, for example. Until we're certain that this is not an issue, I would rather see the datalayout explicitly provided, at least when using opt.

[1] http://llvm.org/docs/LangRef.html#langref-datalayout

I am strictly against keeping the datalayout stuff in the module. However, as the opt code, contrary to my understanding of the argument flags, in fact seems not to update the DL, if -mtriple is given, and datalayout is missing,

Can we agree on a separate file:

[("i386-unknown-windows", "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32")
,("i686-unknown-windows", "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32")
,("x86_64-unknown-windows", "e-m:w-i64:64-f80:128-n8:16:32:64-S128")
,("arm-unknown-linux-gnueabihf", "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64")
,("armv6-unknown-linux-gnueabihf", "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64")
,("armv7-unknown-linux-gnueabihf", "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64")
,("aarch64-unknown-linux-gnu", "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128")
,("i386-unknown-linux-gnu", "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128")
,("x86_64-unknown-linux-gnu", "e-m:e-i64:64-f80:128-n8:16:32:64-S128")
,("arm-unknown-linux-androideabi", "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64")
,("aarch64-unknown-linux-android", "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128")
,("arm-unknown-nto-qnx-eabi", "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64")
,("i386-apple-darwin", "e-m:o-p:32:32-f64:32:64-f80:128-n8:16:32-S128")
,("x86_64-apple-darwin", "e-m:o-i64:64-f80:128-n8:16:32:64-S128")
,("armv7-apple-ios", "e-m:o-p:32:32-f64:32:64-v64:32:64-v128:32:128-a:0:32-n32-S32")
,("arm64-apple-ios", "e-m:o-i64:64-i128:128-n32:64-S128")
,("i386-apple-ios", "e-m:o-p:32:32-f64:32:64-f80:128-n8:16:32-S128")
,("x86_64-apple-ios", "e-m:o-i64:64-f80:128-n8:16:32:64-S128")
]

that lives, say alongside the settings file. And passing this via -default-data-layout= to opt?

kavon added a comment.May 19 2017, 7:53 AM

Can we agree on a separate file:

Yes, I think it's cleaner to keep them in a file as suggested instead of hardcoding it in the module.

And passing this via -default-data-layout= to opt?

Sure, though I think the flag is is actually -data-layout=<layout-string>

kavon added a comment.EditedMay 19 2017, 8:06 AM
In D3352#102253, @kavon wrote:

And passing this via -default-data-layout= to opt?

Sure, though I think the flag is is actually -data-layout=<layout-string>

Oh, actually, it would help with debugging if we emit the datalayout & triple in the LLVM IR... otherwise, when you provide -keep-llvm-files to GHC, you would get a .ll file without a datalayout or target triple. If someone debugging that output forgets to paste in the datalayout before running an LLVM optimization pass, the output may differ.

I figure that, when using clang, you want to infer the data layout, so another option is to emit the data layout and target lines at the top as a comment so it can be easily uncommented when debugging.

In D3352#102254, @kavon wrote:
In D3352#102253, @kavon wrote:

And passing this via -default-data-layout= to opt?

Sure, though I think the flag is is actually -data-layout=<layout-string>

Actually, with llvm 4. seemingly it's

opt --version
LLVM (http://llvm.org/):
  LLVM version 4.0.0
  Optimized build.
  Default target: x86_64-apple-darwin16.5.0
  Host CPU: skylake
opt --help |grep data-layout
  -default-data-layout=<layout-string>            - data layout string to use if not specified by module

Yes, I want to get rid of all hard coded logic, because it keeps breaking. The data-layout and triple are wrong for most of apples platforms (again), and encoding the target in the IR file
results in annoying deprecation warnings during linking, as the targets encode the darwin version.

I used the following script, which I intend to commit into the tree to genearte the target/datalayout file:

WINDOWS_x86="i386-unknown-windows i686-unknown-windows x86_64-unknown-windows"
LINUX_ARM="arm-unknown-linux-gnueabihf armv6-unknown-linux-gnueabihf armv7-unknown-linux-gnueabihf aarch64-unknown-linux-gnu"
LINUX_x86="i386-unknown-linux-gnu x86_64-unknown-linux-gnu"
ANDROID="arm-unknown-linux-androideabi aarch64-unknown-linux-android"
QNX="arm-unknown-nto-qnx-eabi"
MACOS="i386-apple-darwin x86_64-apple-darwin"
IOS="armv7-apple-ios arm64-apple-ios i386-apple-ios x86_64-apple-ios"
TARGETS="${WINDOWS_x86} ${LINUX_ARM} ${LINUX_x86} ${ANDROID} ${QNX} ${MACOS} ${IOS}"

FST=1
for target in $TARGETS; do
    dl=$(clang --target=$target -S foo.c -emit-llvm -o -|grep datalayout |awk -F\  '{ print $4 }')
    if [ $FST -eq 1 ]; then
        echo "[(\"${target}\", $dl)"
        FST=0
    else
        echo ",(\"${target}\", $dl)"
    fi
done
echo "]"

Ideally we should not need this, and llvm would know about this all by itself. I'll try to see if the llvm developers can be convinced to auto inferring the datalayout if -mtriple is given, and no pre-existing data-layout is found.

@angerman sounds good to me, thanks for looking into this!

This also sounds like a reasonable compromise to me.

angerman updated this revision to Diff 12637.May 20 2017, 12:25 AM
  • cleanup
This revision is now accepted and ready to land.May 20 2017, 12:25 AM
angerman marked 2 inline comments as done.May 20 2017, 12:33 AM

@kavon is this agreeable to you now?

@bgamari you ok with this as well?

angerman updated this revision to Diff 12639.May 20 2017, 1:17 AM
  • Comments + Cleanup
kavon accepted this revision.May 20 2017, 9:31 AM

LGTM

Other than the one inline comment and the build failure, I think this is fine.

gen-data-layout.sh
68 ↗(On Diff #12639)

I suppose putting llvm-targets in the source tree root is fine, but I don't think this script belongs there. Let's put it in utils.

bgamari requested changes to this revision.May 20 2017, 11:55 AM
This revision now requires changes to proceed.May 20 2017, 11:55 AM
angerman updated this revision to Diff 12648.May 21 2017, 12:42 AM
  • Fix dfeaultDynFlags
  • move gen-data-layout.sh
This revision is now accepted and ready to land.May 21 2017, 12:42 AM
angerman updated this revision to Diff 12650.May 21 2017, 3:22 AM
  • fix comment.
angerman updated this revision to Diff 12652.May 21 2017, 5:41 AM
  • When a GNU eats an Apple
angerman updated this revision to Diff 12653.May 21 2017, 8:53 AM
  • Add llvm-targets
dfeuer added a subscriber: dfeuer.May 22 2017, 11:19 PM
dfeuer added inline comments.
compiler/main/DriverPipeline.hs
838

Is there a reason to calculate lengths here? Why not | not (null mcpu) and | not (null attrs)

862

So much nesting! Why does each string need its own list? Wouldn't this be the same?

intercalate "," $ mattr ++
   [ ["+sse42" | ...]
   , ["+sse2" | ...]
   , ...]
angerman updated this revision to Diff 12682.May 22 2017, 11:29 PM
  • bump allocation stat
angerman marked 2 inline comments as done.May 23 2017, 1:01 AM
angerman added inline comments.
compiler/main/DriverPipeline.hs
838

good idea!

862

I guess we can get rid of this. If someone ends up needing more that a single attribute for a flag, we'll either need to revert to this, or use words.

angerman updated this revision to Diff 12684.May 23 2017, 1:01 AM
angerman marked 2 inline comments as done.
  • Cleanup
angerman planned changes to this revision.Sun, May 28, 7:09 PM

Something in our opt -> llc -> mangler -> as pipeline results in bad relocations being generated on aarch64.

kavon added a comment.Mon, May 29, 5:25 AM

Something in our opt -> llc -> mangler -> as pipeline results in bad relocations being generated on aarch64.

If the bug isn't caused by this patch, we should create a new phab review for it.

In D3352#102743, @kavon wrote:

Something in our opt -> llc -> mangler -> as pipeline results in bad relocations being generated on aarch64.

If the bug isn't caused by this patch, we should create a new phab review for it.

This is precisely the issue, I did get around by using clang :-/. Dropping the mangler and assembler and letting llc
produce the obj directly seems to produce proper code. I however do not exactly know what the reason here is.

Also the aarch64 stage2 compiler segfaults for some yet unknown reason.

Just to add to that. I'm really keen in figuring out what is going wrong here, I believe it's something with -fPIC and some small memory model llvm infers.

angerman updated this revision to Diff 12716.Mon, May 29, 8:23 AM
  • minor adjustments, validation pending.
This revision is now accepted and ready to land.Mon, May 29, 8:23 AM
angerman updated this revision to Diff 12717.Mon, May 29, 9:18 PM
  • drop mtriple and data-layout, as they are now hardcoded in the ir file.
  • bring back rmodel.
angerman updated this revision to Diff 12724.Tue, May 30, 8:26 PM
  • always PIC
rwbarton requested changes to this revision.Wed, May 31, 8:04 AM

We should not just force PIC on all LLVM users.

This revision now requires changes to proceed.Wed, May 31, 8:04 AM

We should not just force PIC on all LLVM users.

No we should definitely not. We might want to control this via the llvm-targets file. The slight bit annoying part is, that if this the stage1 compiler doesn't set this properly
the libraries that ship with the cross compiler are broken and no matter what flags one passes to the cross compiler afterwards will change the libraries. This has been a
rather annoying bug to track down, as the build systems change tracking doesn't catch it.

angerman planned changes to this revision.Wed, May 31, 8:54 AM

I agree, PIC is not free and we shouldn't assume that the user needs it. I'm curious to know why this was necessary.

I agree, PIC is not free and we shouldn't assume that the user needs it. I'm curious to know why this was necessary.

-- Relocation models
-- we default to always "pic".
--
-- without it we run into the following two issues:
-- - unresolvable R_AARCH64_ADR_PREL_PG_HI21
--   relocation against symbol `free@@LIBC'
-- - creating a DT_TEXTREL in a shared object.
--

;-)

This certainly warrants some deeper investigation. The first one supposedly has something to do with a small memory model
that llvm is assuming. I'm not much a fan of PIC, unless needed, I do have a few ideas as to what to experiment with. As of
right now, and using this diff myself, producing PIC is the variant that works in all linking scenarios I have encountered so far.
As this has really only been a major issue with aarch64/android so far, I believe getting the default flag from the llvm-targets
file, might be a good option, as it would ensure that the libraries are built accordingly as well.