Add 'memset' to Foreign.Marshal.Alloc.

Authored by ifesdjeen on Nov 8 2014, 12:24 PM.



Currently the memory allocated with malloc may potentially be dirty. In C this problem is usually solved by using memset.

This would be extremely useful for FFI / C interop, when a data structure is allocated within Haskell code. With memset, you can do something like

customMem <- malloc
_         <- memset (castPtr customMem) 0 #{size custom_t}

This will fill a block of allocated memory with zeroes.

For example, I've been working on FFI bindings for collectd, and when I was allocating memory, previously used within the process, it was dirty, so my CString
contained a value of

"custom name7e0700490, test_plugin_LTX_module_register): symbol nd"

Instead of just

"custom name"

After using memset and filling everything with zeroes, problem never appeared again.

This can be implemented in user applications, too, although it would be really nice to have it by default.

Diff Detail

Lint Skipped
Unit Tests Skipped
ifesdjeen updated this revision to Diff 1344.Nov 8 2014, 12:24 PM
ifesdjeen retitled this revision from to Add 'memset' to Foreign.Marshal.Alloc..
ifesdjeen updated this object.
ifesdjeen edited the test plan for this revision. (Show Details)
ifesdjeen set the repository for this revision to rGHC Glasgow Haskell Compiler.
ifesdjeen added a project: GHC.
ifesdjeen updated this object.
hvr added reviewers: austin, duncan.
hvr added a reviewer: simonmar.
hvr added a reviewer: ekmett.
hvr added a subscriber: hvr.Nov 8 2014, 12:42 PM

I wonder if a malloc0 :: CSize -> IO (Ptr a) wouldn't be convenient, as I assume in 99% of all cases you'd want the allocated memory to be zero-initialised rather than initaliased with a different octet (for which there may be in theory exist a more efficient lib/syscall than to call malloc+memset individually)

There's also a calloc: function contiguously allocates memory ... The allocated memory is filled with bytes of value zero.

Things like calloc or malloc + memset are used when structs with fixed-size strings are used (for example, char [64]). In most other cases, for example, when struct with pointer / primitive type fields created it is also initialised with values.

I like the idea of malloc0 (or calloc).

Sometimes I need to re-initialize the structure at pointer, though. For that, I still have to use memset. So I'd have it may be worth to have both.
Shall I prepare another patch for calloc or malloc0?

duncan edited edge metadata.Nov 9 2014, 12:51 PM
This comment was removed by duncan.
duncan requested changes to this revision.Nov 9 2014, 1:04 PM
duncan edited edge metadata.

So yeah, in principle fine, but needs various changes.

As for hvr's suggestion about a malloc + zero, I think that's a reasonable idea in principle.


This should live in Foreign.Marshal.Utils, next to copy/moveBytes not Foreign.Marshal.Alloc.


There's no reason to mention how it was allocated. We just have to say what it does. See the documentation for copyBytes and newBytes.


I think we should call it something like setBytes, fillBytes or copyByte to match the existing copyBytes moveBytes functions in Foreign.Marshal.Utils.


Why is this CInt? Shouldn't it be Word8 ? Also, this should be Int not CInt, again to match copyBytes and moveBytes.


Should be string.h not stdlib.h. See the memcpy and memmove decls in Foreign.Marshal.Utils.

This revision now requires changes to proceed.Nov 9 2014, 1:04 PM
hvr requested changes to this revision.Nov 9 2014, 1:19 PM
hvr added a reviewer: hvr.

I'm "requesting a change" to make sure we don't forget to submit a formal library addition proposal before landing

ekmett edited edge metadata.Nov 9 2014, 1:32 PM

From talking on #ghc, it sounds like the initialized allocs should be

calloc, callocBytes, callocArray and callocArray0 in Foreign.Marshal.Alloc and Foreign.Marshal.Array as appropriate.

duncan added a comment.Nov 9 2014, 1:32 PM

hvr, edwardk and I seemed to agree on calloc and callocBytes as names for a malloc + zero, following the C naming conversion in the Foreign.Marshal.Alloc module.

ekmett added a comment.Nov 9 2014, 1:34 PM

I agree with Duncan that memset belongs in Foreign.Marshal.Utils instead of Foreign.Marshal.Alloc

Perfect. I'm almost done with a patch that addresses all aforementioned problems, should be ready shortly.

Thank you so much for feedback!

ekmett added a comment.Nov 9 2014, 1:47 PM

This _should_ also get discussed on the libraries list, so folks know its in the wings and can pipe up with objections, even if it seems like an obvious win to me.

ifesdjeen added a comment.EditedNov 9 2014, 2:29 PM

I'm really sorry, I didn't prepare this patch with arc, therefore I couldn't find a way to update it, so had to create another revision (D465)

(Further comments and description is there).

Hope it's ok..

hvr added a comment.Nov 10 2014, 1:55 AM

@ifesdjeen you should simply resign this code-revision, and we should continue all discussions in D465 (you don't need to paste phabricator urls, just write D465 and it'll be recognized by Phabricator)

ifesdjeen abandoned this revision.Nov 10 2014, 1:59 AM

Thank you @hvr for explanation.

New revision is at D465

hvr added a comment.Nov 22 2014, 4:18 AM

@ifesdjeen btw, what became of calloc (see D461#11540). Was there a patch for that as well?

@hvr I thought I'd first figure out generally how contribution process works. Sorry didn't do both at same time.

Will work on the patch, hopefully can submit it today.

hvr added a comment.Nov 22 2014, 4:29 AM

@ifesdjeen Thanks!

@ekmett we'll need a libraries@ discussion for that as well, right? (which would mean it'll possibly be to late for GHC 7.10, but not for GHC 7.12 )

hvr added a comment.Nov 26 2014, 12:29 AM

Will work on the patch, hopefully can submit it today. is it progressing? :-)

(somewhat related:

@hvr Sorry it took me so long, been jammed with work :(

The revision is now available here: D527