#15363 Do some cleaning up of the testsuite driver
AbandonedPublic

Authored by lantti on Aug 25 2018, 2:30 PM.

Details

Reviewers
bgamari
tdammers
Trac Issues
#15363
Summary

An attempt to replace the external timeout scripts in the testsuite with code that can run directly in the testsuite driver process. Expected results were improved readability and possible speedup. As far as I see they were reached at least on Linux, but not really on Windows. Eventually Windows turned out to be more complex platform than I expected and there I merely rewrote the timeout.hs functionality in python. On Posix platforms, as far as I see, the functionality currently provided by the python subprocess module could fully replace the old timeout.py script.

lantti created this revision.Aug 25 2018, 2:30 PM
lantti retitled this revision from #15363 Bypass timeout scripts to #15363 Do some cleaning up of the testsuite driver.Aug 25 2018, 2:36 PM
lantti edited the summary of this revision. (Show Details)Aug 25 2018, 4:00 PM
lantti updated this revision to Diff 17806.Aug 25 2018, 4:22 PM

Include the first commit of the intended range, as it was accidently left out

lantti updated this revision to Diff 17807.Aug 25 2018, 4:33 PM

Fix a last moment edit that went on the wrong line

lantti updated this revision to Diff 17814.Aug 26 2018, 5:11 PM
  • 15363 Remember to send EOF after the input is written
lantti updated the Trac tickets for this revision.Aug 26 2018, 5:41 PM
tdammers requested changes to this revision.Aug 27 2018, 10:43 AM
tdammers added a subscriber: tdammers.

I have trouble understanding what this would buy us. 452 lines of Windows-specific code are a liability (most GHC devs are not on Windows, and aren't keen on maintaining nontrivial Windows code), so the benefit would have to be substantial to make it worthwhile.

I take it this improves testsuite performance on Windows? If so, can you provide some benchmarks / numbers?

And what exactly is it that makes the Windows-specific code more efficient than the Posix thing?

And finally, if this requires more than about a paragraph of explanation, then maybe a Trac ticket might be a good idea?

This revision now requires changes to proceed.Aug 27 2018, 10:43 AM
bgamari added a subscriber: Phyx.Aug 28 2018, 10:45 AM

@Phyx, perhaps you want to comment on this?

Phyx added a comment.EditedAug 28 2018, 2:51 PM

@Phyx, perhaps you want to comment on this?

Sure, but as before I remain unconvinced of the benefits of this change.
It doesn't currently run on Windows https://phabricator.haskell.org/harbormaster/build/52335/1/?l=0 so I can't comment on the performance differences.
Though I don't expect these to be significant, sure, 1 fork/exec is gone, but there are at least 6 or so left. The runtime of GHC/binutils/GCC will far outweigh the difference.

And on POSIX systems where process creation is lightweight I don't expect the change to make much of a difference either for that exact reason.
Comparing two random builds

this build:

Linux:
https://phabricator.haskell.org/harbormaster/build/52337/ ( 0:33:39 spent to go through )
macos:
https://phabricator.haskell.org/harbormaster/build/52336/ ( 0:34:34 spent to go through )

random build:

Linux:
https://phabricator.haskell.org/harbormaster/build/52383/ ( 0:33:13 spent to go through )
macosx:
https://phabricator.haskell.org/harbormaster/build/52382/ ( 0:33:19 spent to go through )

So in other words, as expected no difference.

I have trouble understanding what this would buy us. 452 lines of Windows-specific code are a liability (most GHC devs are not on Windows, and aren't keen on maintaining nontrivial Windows code), so the benefit would have to be substantial to make it worthwhile.

We had about ~470 lines of Windows specific code before, though SLOC is not a fair comparison here as the old code has more comments and we explicitly defined type signature for constants we imported using hsc2hs https://github.com/ghc/ghc/blob/master/testsuite/timeout/WinCBindings.hsc

So I would say this isn't less code, it's just more python code and less Haskell code. The bigger problem I have is that this inlines a bunch of constants from Windows headers and hardcodes them into the file as oppose to what we had before which used hsc2hs to read the constants.
This to me is a step back in maintainability. What I'm also iffy about is debugging. Debugging a Haskell application under tracing is significantly easier than having to debug an entire python interpreter that I have no idea how it works internally, so it's significantly harder to distinguish it's calls from my code's.

I take it this improves testsuite performance on Windows? If so, can you provide some benchmarks / numbers?

And what exactly is it that makes the Windows-specific code more efficient than the Posix thing?

The Windows specific code isn't about efficiency. It's about correctness. Windows has a significantly different process model than POSIX. And emulating this model on Windows at user-mode as msys does significantly complicates things, for instance how we wait for processes (and all child processes) to finish, how signals are handled, how pipes are handled (std handles are file handles etc). These differences are why we *need* this code to get the testsuite stable.

And finally, if this requires more than about a paragraph of explanation, then maybe a Trac ticket might be a good idea?

I'd also like to see rational of why this is a good thing. Right now the only reason I can see for this change is that "the rest of the testsuite" is written in python. Which is not a good enough reason for me, especially given that python distributions are a mess on Windows.

I'm not dead set against this change, I just don't see the benefits as of yet.

lantti added a comment.EditedAug 28 2018, 5:34 PM

Thank you for your comments everybody. I'll try to explain my motivations a bit and address the points raised. First of all my aim here is mainly readability, not performance. I found the current testsuite driver somewhat difficult to follow and the existence and mode of execution of the timeout scripts seemed like a hack that was no longer necessary as the python subprocess module had become capable of dealing with timeouts, process groups, etc. directly. Unfortunately it turned out that on Windows the process groups handling of subprocess, while functional on light loads, broke down quite quickly with heavier loads (as Phyx suspected on Trac). Nevertheless l thought it would still make the testsuite driver more readable if we would get rid of the timeout scripts, only this time rewriting the Windows part of timeout.hs functionality (the POSIX part is dead code at the moment as far as I see) as a python script that could be explicitly imported by the driver for Windows runs, instead of juggling separate executables through make and bash magic. I am also not completely convinced that this would be worth much, but to see how it would look like and what complications would arise I thought the easiest way was to just get up and write the patch.

In D5107#140592, @Phyx wrote:

Sure, but as before I remain unconvinced of the benefits of this change.
It doesn't currently run on Windows https://phabricator.haskell.org/harbormaster/build/52335/1/?l=0 so I can't comment on the performance differences.

This was a surprise to me and I still haven't quite figured out what is the difference between harbormasters and my own build environment. Classically, It Works On My Machine(tm). I can reproduce the error that harbormaster shows for example by running the msys version of python instead of the mingw64 version, but this is just hand waiving and at the moment I have no clue why the harbormaster build would fail.

Though I don't expect these to be significant, sure, 1 fork/exec is gone, but there are at least 6 or so left. The runtime of GHC/binutils/GCC will far outweigh the difference.
And on POSIX systems where process creation is lightweight I don't expect the change to make much of a difference either for that exact reason.

On POSIX there are two fork/execs (one for the bash script that is generated by make to start the new python interpreter and one for that interpreter itself) and the python interpreter startup time. On Windows I wouldn't expect any performance gains, perhaps rather a loss. Anyway I agree that the difference will hardly be noticeable. (Edit: Ah, sorry only one fork on POSIX too, the bash script uses exec)

So I would say this isn't less code, it's just more python code and less Haskell code. The bigger problem I have is that this inlines a bunch of constants from Windows headers and hardcodes them into the file as oppose to what we had before which used hsc2hs to read the constants.
This to me is a step back in maintainability. What I'm also iffy about is debugging. Debugging a Haskell application under tracing is significantly easier than having to debug an entire python interpreter that I have no idea how it works internally, so it's significantly harder to distinguish it's calls from my code's.

There is slightly more Windows specific code to maintain in this patch than in the current version. That extra comes from having to pass env and redirect IO ourselves as the call to subprocess.Popen is not there to do that anymore. I admit that the hardcoded Windows constants are bad, I'll see if I could be more clever about them. For debugging, I agree that debugging something from inside the python interpreter sounds nasty, but then again there are specific python debuggers to avoid doing that. Of course you'd need to trust them.

I'd also like to see rational of why this is a good thing. Right now the only reason I can see for this change is that "the rest of the testsuite" is written in python. Which is not a good enough reason for me, especially given that python distributions are a mess on Windows.

Seen that also this patch fails to run on harbormaster due to what I must assume is some mess with python versions, I would have to agree here as well. My motivation was simply to make the codebase a tiny bit more readable in a spot where I personally had to work a bit extra to understand it. I'm happy to abandon this idea if it is considered not interesting.

I suggest continuing the discussion on Trac 15363 rather than here, if only to preserve it for future generations.

The ticket was closed as 'wontfix', should this revision be marked as abandoned?

lantti abandoned this revision.Oct 16 2018, 7:59 PM

Yes. Sorry I'm still not familiar with the tools here.