Acquire all_tasks_mutex in forkProcess
ClosedPublic

Authored by edsko on Jul 11 2014, 6:01 AM.

Details

Summary

(for the same reason that we acquire all the other mutexes)

Test Plan

validate

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Lint
Lint Skipped
Unit
Unit Tests Skipped
edsko updated this revision to Diff 131.Jul 11 2014, 6:01 AM
edsko retitled this revision from to Acquire all_tasks_mutex in forkProcess.
edsko updated this object.
edsko edited the test plan for this revision. (Show Details)
edsko added reviewers: austin, simonmar, duncan.
simonmar accepted this revision.Jul 11 2014, 10:37 AM
simonmar edited edge metadata.

LGTM, nice catch

austin accepted this revision.Jul 11 2014, 10:40 AM
austin edited edge metadata.

LGTM.

Random brainstorming: I really wonder if we should be using some kind of analysis tool to help catch things like this.

For example, Clang has a dataflow-based thread safety analysis where you can specify things like lock X must be taken in the same scope Y must be taken, and even with an enforced ordering. This is done by adding __attribute__ pragmas to locks and their APIs, and you can be granular depending on the scope. I wonder if it would be worth investigating this separately.

This revision is now accepted and ready to land.Jul 11 2014, 10:40 AM

I have no experience with these tools but in theory that would be awesome :) Note that there has been a previous attempt at this:

http://www.haskell.org/pipermail/haskell-cafe/2013-January/105871.html

duncan accepted this revision.Jul 11 2014, 11:06 AM
duncan edited edge metadata.

Right. There are other global mutexes that we're not taking. How do we know which ones we need?

There are other tools that can track this stuff, like the user-space port of the Linux kernel's "lockdep" stuff. IIRC, there's also some lock-race tool that's part of the valgrind suite.

Yes, valgrind's Helgrind and lockdep should also be able to catch these issues too. Thread sanitizer (now in GCC) can do the same.

I believe in some cases these tools *might* report false positives (there are some variables we don't lock in read-cases I think, but I believe this is because their ordering will be consistent anyway on certain platforms like x86), but either way both of them should be able to spot cases like this.

lockdep in particular would be a good choice since it's very stable and requires no actual changes to GHC at all to work and get meaningful results (valgrind is a lot slower and more general, so it comes with more weight).

austin closed this revision.Jul 13 2014, 9:43 PM
austin updated this revision to Diff 150.

Closed by commit rGHC16403f0d182d (authored by @edsko, committed by @austin).