Support C structure as return value of ccall in cmm

Authored by Yuras on Sep 27 2014, 4:04 AM.


Trac Issues

This patch adds support for multiple return values from ccall in cmm to be used with C functions, that return structure by value. In that case ccall should be annotated with structure layout information, see test cases for example.

It should work for linux/x86, linux/x86_64, windows x86, windows/x64, mac os X x86_64
Not implemented for llvm backend yet.
Not sure what to do with PPC and SPARC backends

Test Plan


Diff Detail

rGHC Glasgow Haskell Compiler
Lint WarningsExcuse: links are long
Warningcompiler/nativeGen/X86/CodeGen.hs:2259TXT3Line Too Long
Warningcompiler/nativeGen/X86/CodeGen.hs:2260TXT3Line Too Long
Warningcompiler/nativeGen/X86/CodeGen.hs:2664TXT3Line Too Long
No Unit Test Coverage
Build Status
Buildable 1362
Build 1368: GHC Patch Validation (amd64/Linux)
Yuras updated this revision to Diff 657.Sep 27 2014, 4:04 AM
Yuras retitled this revision from to Support C structure as return value of ccall in cmm.
Yuras updated this object.
Yuras edited the test plan for this revision. (Show Details)
Yuras added reviewers: simonmar, austin.
Yuras set the repository for this revision to rGHC Glasgow Haskell Compiler.
Yuras added a project: GHC.
Yuras updated this revision to Diff 668.Sep 27 2014, 6:59 PM
Yuras edited edge metadata.

Add Notes re calling convention, few haddock comments, reorder functions, typo

Yuras updated this revision to Diff 669.Sep 27 2014, 7:03 PM

oops, I messed up with arc...

Yuras updated this revision to Diff 775.Oct 3 2014, 6:36 PM
Yuras updated this object.

Implement windows x86 support

Yuras updated this object.Oct 6 2014, 2:35 PM
Yuras planned changes to this revision.EditedOct 6 2014, 2:40 PM

I tested it on mac os X x86_64 -- just works. Mac os on x86 is probably dead, I don't have access to it.

But I discovered that "zed zone" is linux specific. Need to remove it's usage.

Also carter's comment in ghc IRC channel reminded me that allocated structure should be aligned too.

Does this support structures with non-standard packing? This is somewhat common in practice... at least the IPv6 imports on OSX use them, as do many Win32 headers.

Yuras added a comment.Oct 7 2014, 1:20 AM

Does this support structures with non-standard packing?

No, I assumed regular structures.

It is almost trivial to add (fully) packed structures support, but I don't want to deal with them right now.

Full support of non-standard packing is a bit harder, but still possible. It requires more complicated language to describe it's layout in memory.

BTW, on mac os x (at least X86_64) packed structures are returned on stack, so you should be able to emulate it with regular haskell FFI. Just pass the pointer to allocated area as the very first argument to the function. (I didn't try though.)

Yuras updated this revision to Diff 810.Oct 7 2014, 10:16 AM

Don't use red zone when

Yuras added a comment.Oct 7 2014, 10:22 AM

Sorry I incidentally pressed submit before the comment finished.
Not the code doesn't use red zone (which) is linux specific.

This version seems to be final and ready for review.

I dont think we should assume 32bit mac is dead, or at least there needs to be a real formal dialogue about that.

I'll see if i can build a 32bit copy of 7.8 locally. brb

Yuras updated this revision to Diff 831.Oct 8 2014, 2:00 PM

32bit mac os support

It is "theoretical" patch, based on spec and gcc output. I never tried it.

Yuras updated the Trac tickets for this revision.Oct 16 2014, 3:18 PM

Let's get the gang in here to review more closely.

tibbe edited edge metadata.Oct 18 2014, 3:14 AM

Thanks for working on this. I'm excited to have this support in GHC.

You'll need to update the manual as well. Include any limitations (e.g. no support for packing, etc).


Since all these cmmLintErr messages will be printed to users, lets try to be as clear and grammatically correct as possible. I'd write:

"only the ccall convention supports C structures"


multiple return values without C type annotation


It looks like you can just derive this equality.


Perhaps you could add some links to the OS manuals that describe how the return convention works on each platform? I see that further down you check for the OS and do different things depending on which OS it is.

Also, note any side effects on e.g. the stack that this function has.


When can this happen? If the type info is always required, can we make sure this function is always called with type info and remove the Maybe?


Perhaps add a comment summarizing what this series of instructions tries to achieve e.g.

In genCCall32' we allocate a memory area to return the result in. Here we pop the return values of the stack, one-by-one.


@simonmar is it OK to modify the stack pointer in the middle of a function?


I think this logic is worth a comment.


For each of these rather long local functions please add a comment (just like you would for a top-level function) explaining what they do.


Add comment.


Add comment explaining e.g. what a quartet is.


Add comment.


Use ByteOff instead of Int for clarity.


Use ByteOff instead of Int for clarity.


Use ByteOff instead of Int for clarity.


Instead of having an unknown value in this data type, can you use Maybe CAbiClass64?


Why not call this CAbiVanillaReg instead?


If you rename the above one, this could be CAbiSSEReg.


Use pprPanic instead and output the CmmType.


Same comment about documenting long local functions I gave above applies here too.

tibbe requested changes to this revision.Oct 18 2014, 3:14 AM
tibbe edited edge metadata.
This revision now requires changes to proceed.Oct 18 2014, 3:14 AM
Yuras added a comment.Oct 18 2014, 8:27 AM

@tibbe Thanks you for you comments!

You'll need to update the manual as well
Do you mean user manual? I don't think we have cmm described here, AFAIK cmm is totally internal.


No, because CmmType doesn't have Eq instance


That is a "usual" case -- primitive type as return value. We don't require annotation for that case. Probably we can always generate the annotation for autogenerated cmm, but there is a lot of manually written cmm, it would be silly to break it by requiring attonations.


Well, that is a piece of the original code, and I don't know *exactly* why it is doing what it is doing. I don't want to mislead reader with probably wrong comments. It seams to allocated temp storage, it doesn't pop anything. (Note: this code dials with primitive return value, not a structure.)


The original code did that


There is already a Note [Return C structure by value, 32 bit version] (I'll add a reference to it at the beginning of assign_code32 ) I agree that it worth a comment, but I don't think it worth *inline* comment. It is pretty clear what this code does -- it decides whether to assign from stack or from registers, and the Note described *how* this decision is made.
Please let me know if you disagree.


@tibbe I reused this piece of code verbatim


The names are defined in amd64 ABI spec (3.2.3 Parameter Passing). Wouldn't it confuse reader if we change names?

Yuras updated this revision to Diff 899.Oct 18 2014, 8:28 AM
Yuras edited edge metadata.

Address (most of) issues @tibbe raised

tibbe added a comment.Oct 19 2014, 4:25 AM

Regarding the manual, didn't you extend the foreign import syntax to allow users to specify a struct return type? If so that extension should be documented.

Yuras added a comment.Oct 19 2014, 4:39 AM
In D252#8304, @tibbe wrote:

Regarding the manual, didn't you extend the foreign import syntax to allow users to specify a struct return type?

Not yet. This patch only adds C structure support to cmm. I need to make myself familiar with desugaring and typecheking before continuing.

tibbe accepted this revision.Oct 19 2014, 4:53 AM
tibbe edited edge metadata.

LGTM. @simonmar most likely needs to take a look as well.

I"m noticing that

  1. the Notes piece doesn't remark on what the in register ABI threshold is for OS X
  2. no mention of BSD ABI.
  3. as you've mentioned, currently doesnt support LLVM, whic is problematic for ARM platforms (otoh, might be pretty easy to add the right LLVM support)
  4. doesn't seem to check that the (C?) stack has space before allocating

again, should this code check if theres space on the stack or not?


Whats the limit for OS X? (and the various BSDs?)


in the part after this line, if I'm reading it correctly, this code seems to unconditionally assume that theres space on the stack for the result. Am I reading it wrong?

@tibbe @simonmar, should there be a stack check before allocating there?

Yuras updated this revision to Diff 971.Oct 24 2014, 5:34 AM
Yuras edited edge metadata.

comments only

ezyang removed a subscriber: ezyang.Oct 27 2014, 2:05 PM
Yuras abandoned this revision.Feb 12 2015, 7:11 AM