Add new primops {resize,shrink}MutableByteArray#
ClosedPublic

Authored by hvr on Aug 9 2014, 1:56 PM.

Details

Summary

The two new primops with the type-signatures

resizeMutableByteArray# :: MutableByteArray# s -> Int#
                        -> State# s -> (# State# s, MutableByteArray# s #)

shrinkMutableByteArray# :: MutableByteArray# s -> Int#
                        -> State# s -> State# s

allow to resize MutableByteArray#s in-place (when possible), and are useful
for algorithms where memory is temporarily over-allocated. The motivating
use-case is for implementing integer backends, where the final target size of
the result is either N or N+1, and only known after the operation has been
performed.

A future commit will make sizeofMutableByteArray# a stateful operation, since
now the size of a MutableByteArray# may change over its lifetime (i.e before
it gets frozen or GCed)

Test Plan

./validate --slow

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Branch
master
Lint
Lint ErrorsExcuse: It wasn't me :-)
SeverityLocationCodeMessage
Errorcompiler/prelude/primops.txt.pp:26TXT2Tab Literal
Errorincludes/Cmm.h:48TXT2Tab Literal
Errorincludes/rts/storage/ClosureMacros.h:28TXT2Tab Literal
Unit
No Unit Test Coverage
Build Status
Buildable 352
Build 353: GHC Patch Validation (amd64/Linux)
hvr updated this revision to Diff 295.Aug 9 2014, 1:56 PM
hvr retitled this revision from to Add new primops {resize,shrink}MutableByteArray#.
hvr edited the test plan for this revision. (Show Details)
hvr added reviewers: simonmar, ezyang.
hvr updated this object.
hvr added inline comments.Aug 9 2014, 2:01 PM
rts/PrimOps.cmm
192–193

This white-space change slipped in by accident while fixing lint errors, please ignore

Yay! Build B353: Diff 295 (D133) has succeeded! Full logs available at F12235.

simonmar requested changes to this revision.Aug 11 2014, 3:23 AM
simonmar edited edge metadata.
simonmar added inline comments.
compiler/prelude/primops.txt.pp
1052

Say what happens if the new size is larger than the old - size: undefined or immediate crash?

1058–1062

Say what happens to the original array, and operations on it. I think you want to say that any operations on the original array are undefined after resizeMutableByteArray#.

includes/rts/storage/ClosureMacros.h
552–561

This probably works by accident because StgThunkHeader is the same size as StgArrWords. The reason we use StgThunkHeader in overwritingClosure is because only thunks are overwritten there, and the blackhole has the same size as the StgThunkHeader (it could be clearer, admittedly).

Suggestions:

  • make the offset argument relative to the start of the object, and to adjust it (add sizeofW(StgArrWords)) at your call site.
  • call LDV_recordDead() on the old closure here
  • call LDV_RECORD_CREATE() at the call site, after adjusting the size.
This revision now requires changes to proceed.Aug 11 2014, 3:23 AM
hvr updated this revision to Diff 328.Aug 12 2014, 4:16 AM
hvr edited edge metadata.

Tried to incorporate @simonmar's comments

Whoops, Build B398: Diff 328 (D133) has failed! Full logs available at F12479.

Whoops, Build B398: Diff 328 (D133) has failed! Full logs available at F12482.

austin accepted this revision.Aug 12 2014, 8:49 AM
austin edited edge metadata.

Looks OK, although @simonmar will give the final call. Perhaps you should de-tab ClosureMacros.h after you're done too.

hvr added a comment.Aug 12 2014, 9:11 AM
In D133#18, @austin wrote:

Looks OK, although @simonmar will give the final call. Perhaps you should de-tab ClosureMacros.h after you're done too.

Will do

simonmar accepted this revision.Aug 13 2014, 6:53 AM
simonmar edited edge metadata.

Ok, it looks good modulo the one comment below. Accepting so you can fix that and commit.

includes/rts/storage/ClosureMacros.h
561–562

Is this necessary? it makes the following LDV_recordDead conditional, which is probably wrong.

This revision is now accepted and ready to land.Aug 13 2014, 6:53 AM
tibbe added a subscriber: tibbe.Aug 13 2014, 6:54 AM

Since this adds some complexity to the MutableByteArray# story (i.e. we'll have to break backwards compatibility for the size operation) I'd like to see at least one benchmark that shows that this actually helps in a real use case (vs just copying to shrink).

compiler/prelude/primops.txt.pp
1065

Is it not technically possible to overwrite the existing MutableByteArray#? That would enforce the safety condition you have here.

rts/PrimOps.cmm
159

I don't think this will work. What if my program requires pinned byte arrays? I cannot use resize as it might give me back an unpinned on and on top of that, I can't tell whether that happened or not.

tibbe added a comment.Aug 13 2014, 7:26 AM

Here's what I think we should do for the sizeof primops:

  • Add getSizeofMutableByteArray# :: MutableByteArray# s -> State# s -> (# State# s Int# #). This is similar in spirit to numCapabilities and getNumCapabilities.
  • Add a deprecate pragma (or equivalent) for sizeofMutableByteArray#.
  • Add a note to sizeofMutableByteArray# saying that it's unsafe in the presence of calls to resize on the same MBA.
hvr added a comment.Aug 13 2014, 7:41 AM
In D133#26, @tibbe wrote:
  • Add getSizeofMutableByteArray# :: MutableByteArray# s -> State# s -> (# State# s Int# #). This is similar in spirit to numCapabilities and getNumCapabilities.

I like that as it gives a smooth upgrade path. Existing code (which probably won't be able use the new primops until they've been available for a few GHC releases) will happily continue to work with the old sizeofMutableBytaArray# for some time.

hvr added a comment.Aug 13 2014, 8:00 AM
In D133#24, @tibbe wrote:

I'd like to see at least one benchmark that shows that this actually helps in a real use case (vs just copying to shrink).

I can re-run P6 (= copying-only) vs P10 (= in-place shrink) if you like, but IMO those two nofib runs already show there's a gain for integer-gmp operations which often need to pessimistically over-allocate the initial result buffer by one Word (or even more sometimes), just to shrink it after the real size is known.

compiler/prelude/primops.txt.pp
1065

Sounds interesting, but I don't understand the RTS/GC well enough to assess the consequences here.

I'd be curious what @simonmar has to say about overwriting the original MBA eagerly.

includes/rts/storage/ClosureMacros.h
561–562

Now that you mention it, I think you're right. Since the caller is already obligated to call LDV_RECORD() manually, the caller can just refrain from calling overwritingClosureOfs() in the first place in case of size==offset.

rts/PrimOps.cmm
159

Actually, you could find out by using sameMutableByteArray#. We can, however, also add a variant for pinned MBAs, that re-allocates a pinned MBA if it's not possible to resize in-place.

tibbe added a comment.Aug 13 2014, 9:23 AM

I'm satisfied with the benchmark numbers you gave me.

rts/PrimOps.cmm
159

You will also need a pinned, aligned version.

hvr edited edge metadata.Aug 16 2014, 8:49 AM
hvr updated the Trac tickets for this revision.
hvr closed this revision.Aug 16 2014, 9:05 AM
hvr updated this revision to Diff 372.

Closed by commit rGHC246436f13739 (authored by @hvr).

ekmett added a subscriber: ekmett.Dec 17 2014, 1:08 AM