Re-initialise ObjectCode's info when re-mapping/re-allocating
Needs RevisionPublic

Authored by jrtc27 on Sep 18 2017, 2:38 AM.

Details

Summary

1c83fd814b12754be8af211a387cec906ca198b3 broke the SHORT_REL_BRANCH && NEED_SYMBOL_EXTRAS case, since USE_CONTIGUOUS_MMAP is defined; ocAllocateSymbolExtras re-mmaps with more space but then unmaps the old image, so all the cached pointers etc stored in oc->info by ocInit_ELF are now garbage.

There may be a smarter thing to do rather than a full Deinit + Init, but this at least seems to work for ELF with my experimental sparc64 NCG backend (which uses this case).

jrtc27 created this revision.Sep 18 2017, 2:38 AM

Note: I believe PowerPC is affected by this, but I have not actually tested that hypothesis.

bgamari requested changes to this revision.Sep 18 2017, 8:22 AM
bgamari added inline comments.
rts/linker/SymbolExtras.c
103

I'm a bit skeptical of this approach, but at very least this deserves a comment; at the moment I'm not entirely sure what this accomplishes.

This revision now requires changes to proceed.Sep 18 2017, 8:22 AM
simonmar added inline comments.Sep 18 2017, 12:45 PM
rts/linker/SymbolExtras.c
103

Yeah, this doesn't feel right to me either. I don't fully understand what broke, why it broke, and why this is the right fix. A bit more elaboration please? @angerman do you have any thoughts?

Sorry for the lack of detail, @angerman requested I post the diff so I did so quickly before leaving this morning.

https://phabricator.haskell.org/D3447 added an ocInit_ELF, which caches inside oc->info various pointers into the mapped/allocated image, but since these were yet to actually be used, nothing bad happened. Then https://phabricator.haskell.org/D3448 came along and started using them; the first usage encountered which crashes for me is the use of oc->info->symbolTables in ocGetNames_ELF, specifically in accessing symTab->symbols[j].elf_sym->st_shndx, as symTab points to the original location, but if oc->image has been changed by remapping/reallocating (and the old value unmapped or implicitly deallocated by realloc), this is no longer accessible. Any pointers inside oc->info and its nested structures which point inside the old image thus need to be adjusted to point inside the new image region. As I said in the original summary, this is clearly not the most efficient way to go about that, but it's the simplest.

angerman edited edge metadata.Sep 18 2017, 9:36 PM

I agree with @jrtc27 that this is a "fix" for the specific RTS_LINKER_USE_MMAP && USE_CONTIGUOUS_MMAP
case where we remap the image, to get the PLT/symbolExtras adjacent to the .text section.

As mentioned on IRC, I believe that the better solution here would be to start implementing towards the section
compacting (and as such include the PLT/symbolExtras in that as well).

The initial idea was to break code and data sections apart and mmap them separately, such that we can properly
mprotect them and achieve W^X which is required on form more restrictive platforms. The general consensus
seems to be that this is a feature others would like to see, if I recall the short "linker" debate from ICFP (@bgamari,
am I recalling this right?)

Right now the "stupid" approach that is implemented mmaps each section. And as such we spend at least one
page per section. The "compacting" approach would iterate over the sections to estimate the total size (plus the PLT
for code and GOT for data), and mmap allocate space only once for each type.

I would suggest start doing this on Object (.o) granularity first. I don't know how much additional logic would
be needed to perform this on Archive (.a) level.

What wasn't mentioned so far is, I believe, that the underlying issue is the relocation displacement range.

When relocations only allow a certain range, the splitting of the .text section and the PLT can result in
gaps larger than the possible relative displacement value. Which is the underlying reason for the
USE_CONTIGUOUS_MMAP, I believe.

bgamari added a comment.EditedSep 19 2017, 8:03 AM

The initial idea was to break code and data sections apart and mmap them separately, such that we can properly mprotect them and achieve W^X which is required on form more restrictive platforms. The general consensus seems to be that this is a feature others would like to see, if I recall the short "linker" debate from ICFP (@bgamari, am I recalling this right?)

Yes, I think this is quite important. The current lack of W^X protection in particular is rather frightening. I've been tempted to take this on a few times now but never quite got to it. For the record, this is tracked as Trac #14069.

austin resigned from this revision.Nov 9 2017, 5:39 PM