rts: fix trac issue #9314
AbandonedPublic

Authored by bgamari on Jun 12 2015, 2:23 PM.

Details

Reviewers
simonmar
hsyl20
austin
Trac Issues
#9314
Summary

We fill mmap'ed pages as much as possible instead of allocating one page
per object. Huge memory gains (174MB -> 84KB in the test case given in
trac). Startup performance should be better too because of the reduced
number of mmap calls.

Diff Detail

hsyl20 updated this revision to Diff 3218.Jun 12 2015, 2:23 PM
hsyl20 retitled this revision from to rts: fix trac issue #9314.
hsyl20 updated this object.
hsyl20 edited the test plan for this revision. (Show Details)
hsyl20 updated the Trac tickets for this revision.
hsyl20 updated this revision to Diff 3219.Jun 12 2015, 2:30 PM
hsyl20 edited edge metadata.
  • Fix: ROUND_UP is used outside of #ifdef USE_MMAP (e.g. line 3582)
hsyl20 updated this revision to Diff 3220.Jun 12 2015, 3:00 PM
  • Use consistent naming for objects (i.e. s/big/large)
hsyl20 updated this revision to Diff 3221.Jun 12 2015, 3:22 PM
  • fix typos
hsyl20 updated this revision to Diff 3223.Jun 12 2015, 4:18 PM
  • Share the same allocator for several archives

Loaded objects are never freed. Hence we don't really have to flush the
allocator at the end of loadArchive. By doing this, objects from different
archives may share the same pages and the memory usage should be lower

hsyl20 updated this revision to Diff 3226.Jun 12 2015, 5:28 PM
  • Fix wrong assumption of previous commit + pass LinkerUnload test
hsyl20 updated this revision to Diff 3231.Jun 12 2015, 7:24 PM
  • Fix potential leak and comments
simonmar edited edge metadata.Jun 13 2015, 2:26 PM

This will conflict with D975, so we have some merging to do.

hsyl20 updated this revision to Diff 3240.Jun 14 2015, 2:25 PM
hsyl20 edited edge metadata.
  • Fix potential memory leak

On the whole this looks quite nice.

What do others think about placing m32 into a new source file? It exposes a rather clean interface and Linker.c is already rather unwieldy due to its size.

On the more pedantic side of things, I noticed a few places where the use of whitespace is a bit inconsistent (in particular after commas in function call argument lists).

Other thoughts inline.

rts/Linker.c
2382

We tend to denote longer comments like this as "notes" and format them such that they are easily referenced and grepped for. See, for instance, Note [LoopBreader OccInfo] in compiler/basicTypes/BasicTypes.hs.

I think it would be quite helpful to turn this comment into a Note.

2384

It would be nice if this sentence were a bit more clear on precisely what is being counted here (e.g. the number of objects allocated into the page).

2399

See my comment on m32_flush.

2433

Perhaps we should abort here (see below)?

2439

The purpose of this function could be described a bit better. My understanding is this,

Release the allocator's reference to pages on the "filling" list. This should be called when it is believed that no more allocations will be needed from the allocator to ensure that empty pages waiting to be filled aren't unnecessarily held.

2458

Two spaces should be condensed to one.

2464

Do we really want to continue execution in this case? If I'm not mistaken the only cases where munmap might fail are those where it is passed a region which was never mmap'd. If this happens it seems to me that something has gone seriously wrong and we should likely bail.

2474

It would be easy to break up the argument list across multiple lines to stay under 80 characters.

2502

Usually I don't like raising a fuss over line length but this is rather long and could be easily improved if broken at ||.

hsyl20 updated this revision to Diff 3247.Jun 15 2015, 8:41 AM
  • Fix warnings and improve comments

I have fixed the warnings and taken into account comments from @bgamari.

Some remaining questions:

  1. Should we abort when munmap fails?
  1. @simonmar: does this patch fix the issue you wanted to fix with D975? The

memory improvement is so important that you may not have the issue you had in
the first place anymore. If it is not enough, we could combine both approaches:
mmap files without the 32-bit constraint and then m32_alloc and memcpy
only the sections we are interested in (not rounded-up to page size). I tend to
prefer this solution with the allocator to the one with the linear search in
maybeShareSection (D975). It is more generic as we don't really distinguish
sections allocated in different ways (except for the extra symbol hack) as it
is encapsulated in the allocator code and it is very easy to make thread-safe.

  1. Should we place the M32 allocator in a new source file? I have no strong

opinion about this as I don't know the conventions for GHC (I'm quite new here
;-)).

  1. About coding conventions: is there a guideline for C code in the GHC tree?

As noticed by @bgamari, the use of whitespace is inconsistent. Curly brace
positioning is too.

  1. I have used 8-byte alignment everywhere mmapForLinker has been replaced

with m32_alloc. I don't know if we can relax the alignment constraint for some
of the calls? It may spare a few bytes...

hsyl20 marked 8 inline comments as done.Jun 15 2015, 8:49 AM

@hsyl20 no, this doesn't fix the issue I'm addressing in D975. The problem is that the object file is large (say 300MB), but only about 80MB of that is actual code or data, the rest is symbols and debug info. If we map the whole object into the 1-2GB region, we quickly run out of space. In D975 I'm re-mapping the code/data from the object into the 1-2GB region.

I agree that maybeShareSection is a bit of a hack - I did this to avoid creating too many separate mappings when loading .a files. It would be fine to replace that with m32_alloc, and it shouldn't be too hard.

I like not having to memcpy all the memory for an object file into the 1-2GB region, we just mmap it - this is also useful because various tools (like perf) understand the mapping and can give symbol names. For .a files I think it's fine to memcpy the bits though.

I will manually merge this with D975

bgamari requested changes to this revision.Jul 5 2015, 12:07 PM
bgamari added a reviewer: bgamari.

I'm marking this as needs changes to bump it out of the review queue for now.

This revision now requires changes to proceed.Jul 5 2015, 12:07 PM

I have merged this diff into D975. The only thing remaining to do is to fix it up to work on OS X.

bgamari commandeered this revision.Aug 5 2015, 8:47 AM
bgamari edited reviewers, added: hsyl20; removed: bgamari.

Thanks @hsyl20! I'm going to mark this as abandoned to clean up the active Diff list.

bgamari abandoned this revision.Aug 5 2015, 8:47 AM