Testsuite: run tests in /tmp after copying required files
ClosedPublic

Authored by thomie on Aug 29 2015, 9:28 AM.

Details

Summary

Major change to the testsuite driver.

For each TEST:

  • create a directory <testdir> inside /tmp.
  • link/copy all source files that the test needs into <testdir>.
  • run the test inside <testdir>.
  • delete testdir

By default only filenames that start with the name of the test are linked/copied into <testdir>. Test authors can use the extra_files setup function to specify other files that their test depends on.

Extra files for existing tests are tracked in testsuite/driver/extra_files.py.

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
There are a very large number of changes, so older changes are hidden. Show Older Changes
thomie updated this revision to Diff 3973.Aug 29 2015, 10:33 AM
thomie edited edge metadata.
This comment was removed by thomie.
thomie updated this revision to Diff 3974.Aug 29 2015, 10:38 AM
thomie edited edge metadata.
This comment was removed by thomie.
thomie updated this revision to Diff 3975.Aug 29 2015, 10:41 AM
thomie edited edge metadata.
This comment was removed by thomie.
thomie updated this revision to Diff 3977.Aug 29 2015, 10:47 AM
thomie edited edge metadata.

Update

This looks great. Another thought in the same vein: could we check that all needed files (including stdout/stderr comparison files) are tracked by git before running the test? (The idea being to prevent "Oops I forgot to add the .stdout file before pushing".)

thomie added a comment.EditedAug 29 2015, 2:29 PM

Here is an alternative implementation, which would require a bit less magic and no copying of source files:

  • create a testdir=out-<testname> as a subdirectory inside the source directory for each test
  • always call ghc -outputdir=<testdir>
  • change the Makefiles (run_command tests) to only write to the <testdir> as well (TODO. Check how many tests need to change)
  • add out-* to .gitignore, remove all the other entries
  • call git clean -x -n after a testrun to make sure run_command tests didn't write files outside of testdir (this last step is not 100% waterproof, since they could have deleted it themselves in the meantime).

I'm not sure which approach is better. The approach as outlined in the Summary (and partially implemented in this patch) should be more robust, but it requires copying/linking of a lot of files (slower?).

could we check that all needed files (including stdout/stderr comparison files) are tracked by git before running the test?

Sure. git clean -x -n could take care of that as well.

thomie updated this revision to Diff 4018.Sep 2 2015, 5:45 AM
thomie edited edge metadata.

Rebase

thomie updated this revision to Diff 4029.Sep 2 2015, 9:20 AM
thomie edited edge metadata.
thomie added a parent revision: D1186: Testsuite: refactoring only.

Rebase

thomie updated this revision to Diff 4030.Sep 2 2015, 9:30 AM
thomie edited edge metadata.

Does this remove the dependency again?

thomie updated this revision to Diff 4031.Sep 2 2015, 9:41 AM
thomie edited edge metadata.
thomie added a parent revision: D1186: Testsuite: refactoring only.

Now it should at least start building it.

austin awarded a token.Sep 6 2015, 4:58 PM
thomie added a comment.Sep 7 2015, 4:44 AM

@thoughtpolice: this patch isn't finished yet. But if could comment on my musings in https://phabricator.haskell.org/D1187#33028, that would be great.

austin added a comment.Sep 8 2015, 8:48 AM

ohmyglob

This looks great!

So I think the current approach is fine (and probably is more robust), provided it doesn't have a horrendous speed impact. Can you see if this change actually causes a noticeable lag in the result? It's questionable how much it might grow over time... If we pay only a few seconds up-front before the tests start running, that's probably OK tbqh (and it hopefully won't grow much with the amount of tests added over time).

How's this going @thomie?

thomie planned changes to this revision.Oct 6 2015, 9:53 AM

It's on my TODO list somewhere..

thomie updated this revision to Diff 7407.Apr 28 2016, 4:34 PM
thomie edited edge metadata.

This is ready for final review + testing

thomie updated this object.Apr 28 2016, 5:11 PM
thomie updated the Trac tickets for this revision.
thomie updated this object.
In D1187#34276, @austin wrote:

So I think the current approach is fine (and probably is more robust), provided it doesn't have a horrendous speed impact. Can you see if this change actually causes a noticeable lag in the result? It's questionable how much it might grow over time... If we pay only a few seconds up-front before the tests start running, that's probably OK tbqh (and it hopefully won't grow much with the amount of tests added over time).

The table below shows timing results of running make test and make slow, without actually running any non-Python code. Just reading .T files, setting up datastructrures, and copying/cleaning the test files (by commenting out most of the function do_test). This is on Linux with SSD, I didn't test on Windows (which doesn't have symbolic links).

make test
before CLEANUP=0: 4s
before CLEANUP=1: 16s
after  CLEANUP=1: 9s

make slow
before CLEANUP=0: 4s
before CLEANUP=1: 16s
after  CLEANUP=1: 13s

So for some reason cleaning was very slow before, and validate might actually become a few seconds faster with this change. I'm quite happy with these results.

thomie updated this revision to Diff 7425.Apr 30 2016, 6:26 AM

Delete old cleanup code

thomie updated this revision to Diff 7427.Apr 30 2016, 6:46 AM

Delete some more code

thomie added subscribers: Rufflewind, monoidal.

Adding @monoidal and @Rufflewind as reviewers, since they know Python pretty well, and worked on the testsuite driver before.

Rufflewind edited edge metadata.Apr 30 2016, 1:37 PM

How does this work on Windows where os.symlink is not defined?

How does this work on Windows where os.symlink is not defined?

If os.symlink isn't defined, then it doesn't work. But the solution is to copy the files on Windows instead of creating symlinks.

The trouble is that I don't have access to a Windows machine at the moment.

I can try testing this on Windows. @thomie.

Rufflewind added a comment.EditedMay 1 2016, 3:40 AM
In D1187#62774, @thomie wrote:

If os.symlink isn't defined, then it doesn't work. But the solution is to copy the files on Windows instead of creating symlinks.

Could try replacing os.symlink with link_or_copy_file = getattr(os, "symlink", shutil.copyfile)?

Edit: forgot the word "replacing"

Hrm, I tried running it on Windows and it seems to work fine, to my surprise. It seems that os.symlink should work as long as it's using the MSYS2 flavor of python at /usr/bin/python2 (as opposed to native Windows python commonly found in C:\Python27).

Apparently, MSYS2 by default "implements" symbolic links by copying the files, which leads to some fun behavior …

MSYS$ echo hello > src
MSYS$ ln -s src link
MSYS$ echo bye > src
MSYS$ cat link
hello

Of course, this is fine for the GHC test framework, but I can imagine this biting someone who expects POSIX-compliance.

(Still, using the link_or_copy_file that I suggested shouldn't hurt.)

bgamari requested changes to this revision.May 2 2016, 2:26 AM
bgamari edited edge metadata.

I agree that naming a function symlink that sometimes copies is a bit misleading. I think we should either make sure there is a very vocal note about this somewhere or create the alias that @Rufflewind suggests.

This revision now requires changes to proceed.May 2 2016, 2:26 AM
thomie updated this revision to Diff 7475.May 5 2016, 4:13 AM
thomie edited edge metadata.

Define link_or_copy_file and use it.

@Rufflewind: thanks. Did you run the complete testsuite, or just a few tests?
I would be pleasantly surprised if extra_files.py didn't miss any entries
for Windows specific tests.

In D1187#63147, @thomie wrote:

@Rufflewind: thanks. Did you run the complete testsuite, or just a few tests?
I would be pleasantly surprised if extra_files.py didn't miss any entries
for Windows specific tests.

Here's the full log (~2MB). There were a few errors.

Summary:

Unexpected results from:
TEST="PatBind break011 Timeout001 T11072msvc T9405 T7037 T11223_simple_duplicate_lib concprog001 T5205 T4029"

OVERALL SUMMARY for test run started at Thu May  5 20:26:29 2016 EDT
 2:17:42 spent to go through
    5106 total tests, which gave rise to
   14788 test cases, of which
    9671 were skipped

      69 had missing libraries
    4895 expected passes
     140 expected failures

       3 caused framework failures
       0 unexpected passes
       8 unexpected failures
       2 unexpected stat failures

Unexpected failures:
   /tmp/ghctest/BrUiOy/1/2/3/../../libraries/base/tests/System/Timeout001  Timeout001 [bad exit code] (normal)
   /tmp/ghctest/BrUiOy/1/2/3/./concurrent/prog001/concprog001              concprog001 [bad exit code] (threaded2)
   /tmp/ghctest/BrUiOy/1/2/3/./ghci.debugger/scripts/break011              break011 [bad stdout] (ghci)
   /tmp/ghctest/BrUiOy/1/2/3/./ghci/linking/dyn/T11072msvc                 T11072msvc [bad exit code] (normal)
   /tmp/ghctest/BrUiOy/1/2/3/./partial-sigs/should_compile/PatBind         PatBind [exit code non-0] (normal)
   /tmp/ghctest/BrUiOy/1/2/3/./rts/T11223/T11223_simple_duplicate_lib      T11223_simple_duplicate_lib [bad stderr] (normal)
   /tmp/ghctest/BrUiOy/1/2/3/./rts/T7037                                   T7037 [bad stdout] (normal)
   /tmp/ghctest/BrUiOy/1/2/3/./rts/T9405                                   T9405 [bad exit code] (normal)

Unexpected stat failures:
   /tmp/ghctest/BrUiOy/1/2/3/./perf/should_run/T5205   T5205 [stat too good] (normal)
   /tmp/ghctest/BrUiOy/1/2/3/./perf/space_leaks/T4029  T4029 [stat not good enough] (ghci)

Test framework failures:
   haddock.Cabal      (normal)
   haddock.base       (normal)
   haddock.compiler   (normal)
thomie updated this revision to Diff 7479.May 6 2016, 4:50 AM
thomie edited edge metadata.
  • Don't framework_fail on missing .t files when haddock not available.
  • Fix T11072msvc

Thanks for the test results @Rufflewind.

  • break011, PatBind, T7037, T9405: already reported in Trac:12004. Not related to this patch.
  • Timeout001, concprog001: timeouts. Doesn't seem related to this patch.
  • T11223_simple_duplicate_lib: fixed in 9dc34d31eb50d5aeb93d24cf9a3197f27ecb1687
bgamari accepted this revision.May 11 2016, 5:27 AM
bgamari edited edge metadata.

@thomie, so by my count all of the errors reported by @Rufflewind are accounted for, right?

If so I suppose this is ready to be merged.

Yes, it's ready.

I'll land it myself, but not before Sunday. I still need to update the wiki + send an email to ghc-devs, as it's a rather large change.

"thomie (Thomas Miedema)" <noreply@phabricator.haskell.org> writes:

Sure, thanks!

This revision was automatically updated to reflect the committed changes.

Apologies for not commenting on this before, I wasn't aware of it but I just ran into the changes while working on a new test. My initial thoughts are:

  • It's getting harder to just build a test and run it by hand, which I often want to do when making new tests. Now in addition to CLEANUP=0 I now have to find the weird path in /tmp and copy/paste it or type it.
  • Now CLEANUP=0 will leave stuff around in /tmp forever, whereas previously it would be cleaned by the next run of the test.
  • Why are the extra files specified in one file under testsuite/driver, rather than being with the rest of the metadata for each test? That just seems wrong.
  • It's getting harder to just build a test and run it by hand, which I often want to do when making new tests. Now in addition to CLEANUP=0 I now have to find the weird path in /tmp and copy/paste it or type it.

I'm aware of this, but I don't see a easy solution, and I don't find it a very big problem in practice. I just copy-paste the cd command (with the quotes).

A flag to run the test in the source directory could be added, but then you'd have to run git clean -x -d -f yourself to cleanup, which doesn't work outside of a git tree (maybe ok? I use git-new-workdir to create build trees, but some people might be using lndir instead).

Building in a subdirectory of the source directory doesn't help, because you'd still have to cd to it.

  • Now CLEANUP=0 will leave stuff around in /tmp forever, whereas previously it would be cleaned by the next run of the test.

That is by design. Running two make test commands at the same time should work (for example to investigate a validate test failure while the validate run hasn't completed yet), so the testsuite driver can not delete /tmp/ghc-test* before starting a test run.

A solution could be:

  • CLEANUP=0 always runs the tests in /tmp/ghc-test-no-cleanup
  • The testsuite driver deletes /tmp/ghc-test-no-cleanup before starting a new test run (regardless of CLEANUP setting).

One complication: /tmp/ghc-test-no-cleanup might be created by another user, so you could run into permission problems (someone already did with earlier versions of this patch). In case that happens, run the test in /tmp/ghc-test-no-cleanup2 or something.

What do you think?

  • Why are the extra files specified in one file under testsuite/driver, rather than being with the rest of the metadata for each test?

I did that to ease cherry-picking patches to ghc-8.0 branch, after consulting with Ben.

But I have a I have a .T file parser and pretty printer ready, to move all information from testsuite/driver/extra_files.py to the respective .T files (before the next branching probably).

For new tests you are encouraged to use the extra_files setup function. Example:

test('T5522', [only_ways(['optasm']), extra_files(['mc03.hs'])], multimod_compile, ...
simonmar added a comment.EditedJun 12 2016, 1:49 AM

(apologies if the previous comment sounded a bit wingey, the change is headed in the right direction, I was just slightly miffed that my workflows that I'd been using for years just stopped working and I had to figure out what had changed.)

Building in a subdirectory of the source directory doesn't help, because you'd still have to cd to it.

This seems like a good compromise to me. How about

  • make LOCAL=1 to use a local subdir, ./T1234-run
  • and make LOCAL=1 CLEANUP=1 would clean up a previous ./T1234-run

Thanks for clarifying about the extra_files setting.

I appreciate the feedback though. And I'll add you as a reviewer on future testsuite workflow changes.

make LOCAL=1 to use a local subdir, ./T1234-run

LOCAL=1 could be the default even. Except, some tests depend on files from ancestor directories:

../driver/extra_files.py:  'Defer02': ['../../typecheck/should_run/Defer01.hs'],
../driver/extra_files.py:  'T6106': ['../shell.hs'],
../driver/extra_files.py:  'break001': ['../Test2.hs'],
../driver/extra_files.py:  'break002': ['../Test2.hs'],
../driver/extra_files.py:  'break003': ['../Test3.hs'],
../driver/extra_files.py:  'break005': ['../QSort.hs'],
../driver/extra_files.py:  'break006': ['../Test3.hs'],
../driver/extra_files.py:  'break008': ['../Test3.hs'],
../driver/extra_files.py:  'break009': ['../Test6.hs'],
../driver/extra_files.py:  'break010': ['../Test6.hs'],
../driver/extra_files.py:  'break011': ['../Test7.hs'],
../driver/extra_files.py:  'break017': ['../QSort.hs'],
../driver/extra_files.py:  'break018': ['../mdo.hs'],
../driver/extra_files.py:  'break019': ['../Test2.hs'],
../driver/extra_files.py:  'break027': ['../QSort.hs'],
../driver/extra_files.py:  'dynbrk001': ['../QSort.hs'],
../driver/extra_files.py:  'dynbrk002': ['../QSort.hs'],
../driver/extra_files.py:  'dynbrk004': ['../mdo.hs'],
../driver/extra_files.py:  'getargs': ['../getargs.hs'],
../driver/extra_files.py:  'ghci026': ['../prog002'],
../driver/extra_files.py:  'ghci038': ['../shell.hs'],
../driver/extra_files.py:  'ghci058': ['../shell.hs'],
../driver/extra_files.py:  'haddock.Cabal': ['../../../../libraries/Cabal/Cabal/dist-install/haddock.t'],
../driver/extra_files.py:  'haddock.base': ['../../../../libraries/base/dist-install/haddock.t'],
../driver/extra_files.py:  'haddock.compiler': ['../../../../compiler/stage2/haddock.t'],
../driver/extra_files.py:  'hist001': ['../Test3.hs'],
../driver/extra_files.py:  'hpc001': ['../hpcrun.pl'],
../driver/extra_files.py:  'hpc_fork': ['../hpcrun.pl'],
../driver/extra_files.py:  'hpc_markup_multi_001': ['../Geometry.hs', '.hpc/', 'hpc_sample.tix'],
../driver/extra_files.py:  'hpc_markup_multi_002': ['../CSG.hs', '../Construct.hs', '../Data.hs', '../Eval.hs', '../Geometry.hs', '../Illumination.hs', '../Intersections.hs', '../Interval.hs', '../Main.hs', '../Misc.hs', '../Parse.hs', '../Pixmap.hs', '../Primitives.hs', '../RayTrace.hs', '../Surface.hs', '.hpc/', 'hpc_sample.tix'],
../driver/extra_files.py:  'hpc_markup_multi_003': ['../CSG.hs', '../Construct.hs', '../Data.hs', '../Eval.hs', '../Geometry.hs', '../Illumination.hs', '../Intersections.hs', '../Interval.hs', '../Main.hs', '../Misc.hs', '../Parse.hs', '../Pixmap.hs', '../Primitives.hs', '../RayTrace.hs', '../Surface.hs', '.hpc/', 'hpc_sample.tix'],
../driver/extra_files.py:  'hpc_raytrace': ['../hpcrun.pl', 'CSG.hs', 'Construct.hs', 'Data.hs', 'Eval.hs', 'Geometry.hs', 'Illumination.hs', 'Intersections.hs', 'Interval.hs', 'Main.hs', 'Misc.hs', 'Parse.hs', 'Primitives.hs', 'Surface.hs', 'galois.gml', 'galois.sample'],
../driver/extra_files.py:  'lazy-bs-alloc': ['../../numeric/should_run/arith011.stdout'],
../driver/extra_files.py:  'listCommand001': ['../Test3.hs'],
../driver/extra_files.py:  'print002': ['../Test.hs'],
../driver/extra_files.py:  'print003': ['../Test.hs'],
../driver/extra_files.py:  'print005': ['../QSort.hs'],
../driver/extra_files.py:  'print006': ['../Test.hs'],
../driver/extra_files.py:  'print007': ['../Test.hs'],
../driver/extra_files.py:  'print008': ['../Test.hs'],
../driver/extra_files.py:  'print010': ['../Test.hs'],
../driver/extra_files.py:  'print011': ['../Test.hs'],
../driver/extra_files.py:  'print012': ['../GADT.hs', '../Test.hs'],
../driver/extra_files.py:  'print013': ['../GADT.hs'],
../driver/extra_files.py:  'print014': ['../GADT.hs'],
../driver/extra_files.py:  'print016': ['../Test.hs'],
../driver/extra_files.py:  'print017': ['../Test.hs'],
../driver/extra_files.py:  'print018': ['../Test.hs'],
../driver/extra_files.py:  'print019': ['../Test.hs'],
../driver/extra_files.py:  'print020': ['../HappyTest.hs'],
../driver/extra_files.py:  'print023': ['../Test.hs'],
../driver/extra_files.py:  'print024': ['../Test.hs'],
../driver/extra_files.py:  'print034': ['../GADT.hs', '../Test.hs'],
../driver/extra_files.py:  'print035': ['../Unboxed.hs'],
../driver/extra_files.py:  'prog001': ['../shell.hs', 'A.hs', 'B.hs', 'C1.hs', 'D1.hs', 'D2.hs'],
../driver/extra_files.py:  'prog002': ['../shell.hs', 'A1.hs', 'A2.hs', 'B.hs', 'C.hs', 'D.hs'],
../driver/extra_files.py:  'prog003': ['../shell.hs', 'A.hs', 'B.hs', 'C.hs', 'D1.hs', 'D2.hs'],
../driver/extra_files.py:  'prog012': ['../shell.hs', 'Bar1.hs', 'Bar2.hs', 'Foo.hs', 'FooBar.hs'],
../driver/extra_files.py:  'testwsdeque': ['../../../rts/WSDeque.h'],
../driver/extra_files.py:  'tough': ['../hpcrun.pl'],
../driver/extra_files.py:  'tough2': ['../hpcrun.pl', 'subdir/'],

Those tests contain code like:

T6106.script:    :l ../shell.hs
testwsdeque:     compile_and_run, ['-I../../../rts'])

My solution for this was to create a sufficiently deep directory structure ( 1/2/3 ), and just copy those files over like all other files:

/tmp/ghctest-kssuJi/test   spaces/1/rts/WSDeque.h
/tmp/ghctest-kssuJi/test   spaces/1/2/3/testwsdeque/testwsdeque.c

So with LOCAL=1, you would still have to cd to some weird (local) path, and not much is gained.

Or I'm going to have to make changes to those tests. But it's a little fiddly. What would you change :l ../shell.hs to for example?

Ah, so now I understand what the 1/2/3 business is about.

The easiest way would be to support copying files from arbitrary locations in the source tree into the test dir. So we copy ../shell.hs to ./shell.hs, and then use it from there. This would also handle the ../../../rts/WSDeque.h case, and it looks like all the others could be handled this way too with some changes to the tests themselves. Sound OK?

I'd really like LOCAL=1 to be the default, I presume you would run validate with LOCAL=0?