Fix #15758 by making the RTS parse response file arguments
Changes PlannedPublic

Authored by ckoparkar on Oct 29 2018, 9:49 AM.

Details

Summary

In base-4.12 we added some response file related utilities -- a
function 'getArgsWithResponseFiles' which enabled users to parse
command line arguments supplied via response files. However, RTS
options contained in such a file were not handled correctly.

When we pass options over the command line, they are first inspected
by the RTS. It separates out the options contained in +RTS...-RTS blocks
and updates the runtime configuration with the supplied values.
There are some other subtleties which must also be dealt with:

(1) If the program is compiled to ignore RTS options, and the user
tries to supply them at runtime, an error should be reported. A lot
variations on this are possible by setting -rtsopts to different
values[1].

(2) If a program accepts RTS options, they are filtered out from
the value returned by 'getArgs'.

(3) --RTS should disable further processing of arguments by the RTS.

And none of these things are done for the options supplied via
response files. Right now, 'getArgsWithResponseFiles'
processes the response files, instead of the RTS. It iterates over the
list returned by 'getArgs', and issues readFile's for appropriate
elements (the ones that start with '@'). This means that the RTS never
sees any of the options contained in these files, and hence is not
able to process them correctly.

This patch moves the response file processing logic into the
RTS, puts it behind a flag 'expand-response-files'.
If the flag is ON:

  • The RTS expands response files
  • getArgsWithResponseFiles and getArgs behave the same, and return the command line arguments after expanding response files and removing RTS flags.

Otherwise:

  • The RTS does not expand response files
  • getArgs works in the usual way (doesn't expand response files, and filters RTS flags passed over the command line).
  • getArgsWithResponseFiles expands response files, but doesn't process the RTS flags contained in a file.

[1] https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/runtime_control.html#setting-rts-options-on-the-command-line

Test Plan

./validate

ckoparkar created this revision.Oct 29 2018, 9:49 AM
ckoparkar planned changes to this revision.EditedOct 29 2018, 9:54 AM

This is still a WIP patch for option 2 mentioned in comment:2 in Trac #15732 (extending the RTS to parse response files). I submitted it here to get some early feedback. And there are a few things that must be dealt with:

(1) Should getArgs be different from getArgsWithResponseFiles?

a. This patch keeps these two functions distinct. I've created a separate argument vector rf_argv which contains arguments expanded from response files, in addition to the arguments stored in prog_argv. For example:

./foo a1 a2 @file1 a3

prog_argv = ["a1", "a2", "@file1", "a3"]
rf_argv   = ["a1", "a2", "b1", "b2", "b3", "a3"]

But, the program arguments are used in the runtime linker initialization phase. (See Note [runtime-linker-support] in 'rts/linker/Linker.c').
I am unclear on which arguments should it use (argv or rf_argv)). Perhaps it should depend on whether -no-expand-response-files is passed to GHC ?

b. Or, we can expand the arguments into the standard program argv. This would mean that in a program compiled with GHC > 8.8, getArgs can expand response file arguments; unless the program is compiled with -no-expand-response-files.
One obvious potential downside is that we would change the behavior of a widely used function like getArgs. Although we could argue this changes the behaviour for the better, it's still a big change.

Right now, I'm leaning towards an implementation that keeps them separate (a), but I'm way out of my element to judge which one is better.

And I'm not terribly happy with this patch. There is a lot of room for improvement, particularly the moveRtsOpts function. Also, having separate argument vectors means that every time we use one of these things, we have to decide which one to use.
I feel like it's adding way too much complexity.

(2) Implementing the C bits that parse response files is tricky. We have to be careful about escaping the appropriate things, stripping quotes, backslashes etc.
In this patch, I have copied the relevant bits from the GCC source code, and modified it to fit the GHC conventions.
But, we must make sure that we don't violate the GPL.

ckoparkar added inline comments.Oct 29 2018, 9:59 AM
libraries/base/GHC/ResponseFile.hs
63

If the program is compiled with -no-expand-response-files, and the user tries to use this function, should we generate a warning to indicate that response file arguments are not being parsed ?

rts/RtsFlags.c
96

This function requires way too many parameters. I'm trying to refactor this, but haven't found a good way so far.

2279

The build fails because this function isn't used anywhere.

testsuite/tests/rts/T15732/all.T
4

Should the comments go in .T files ? I've added them to the Makefile for now, where most of the action is.

ckoparkar requested review of this revision.Oct 29 2018, 10:00 AM
RyanGlScott resigned from this revision.Oct 30 2018, 9:08 AM

I'm afraid that I feel woefully underqualified to review this.

In this patch, I have copied the relevant bits from the GCC source code, and modified it to fit the GHC conventions.
But, we must make sure that we don't violate the GPL.

Yes, any implementation in the RTS should not be based on the gcc implementation.

For the semantics of getArgs and getArgsWithResponseFiles:

  • getArgs already strips out RTS flags from the arguments, so it should do the same thing in the presence of RTS flags in response files.
  • Obviously it can't modify the response files themselves, so it should return the arguments expanded
  • But that means getArgs would be implicitly expanding response files, and some uses might not want this
  • So, we have to make a compile-time choice to use a flag parser that understands response files

which raises the question of whether the default should be to understand response file arguments or not. Since this changes the behaviour of getArgs, I think we should be conservative and have the default to be to *not* have response files expanded. Possibly this could change in the future, it would need to be a GHC proposal.

So to summarise:

  • Let's have a compile-time flag to request response file expansion. Perhaps -rtsopts-response-files, or -expand-response-files
  • When the flag is on:
    • The RTS expands response files
    • getArgsWithResponseFiles and getArgs behave the same, and return the command-line arguments after expanding response files and removing RTS flags
  • When the flag is off, everything works as it does now:
    • The RTS does not expand response files
    • getArgs works as it does now
    • getArgsWithResponseFiles can expand response files.
ckoparkar planned changes to this revision.Nov 12 2018, 7:45 AM

Thanks @simonmar. I'll update this patch in a couple days.

  • When the flag is off, everything works as it does now:
    • getArgsWithResponseFiles can expand response files.

Should getArgsWithResponseFiles still remove the RTS options that it encounters while expanding response files?
Sure, the options won't have any effect on the RTS configuration itself, but might save some of the users from having to do that work.

In addition to the spec in your comment:
getArgs and getArgsWithResponseFiles never include any RTS options in the return value.
But the RTS options in response files only affect the configuration if -expand-response-files is on.

Should getArgsWithResponseFiles still remove the RTS options that it encounters while expanding response files?
Sure, the options won't have any effect on the RTS configuration itself, but might save some of the users from having to do that work.

No, because those options will not have been processed by the RTS. Why would users need to do that?

ckoparkar updated this revision to Diff 18858.Fri, Nov 23, 10:54 AM

Start updating semantics per @simonmar's comment (TODO: remove GCC bits)

No, because those options will not have been processed by the RTS. Why would users need to do that?

@simonmar Oh, I see what you mean. Filtering these out would actually make it seem that they were in fact processed by the RTS,
which would be incorrect.

I've updated this patch to follow the semantics in your comment. I'll rewrite the GCC bits and ping you for another round of review :)

ckoparkar edited the summary of this revision. (Show Details)Fri, Nov 23, 11:04 AM
ckoparkar edited the summary of this revision. (Show Details)

How is this going, @ckoparkar ?

ckoparkar planned changes to this revision.Mon, Nov 26, 5:26 PM

@bgamari Should be ready for review soon :) I'm still working on rewriting the GCC parts.

(I'll bump it out of the queue till then.)