The existing thread affinity mechanism is limited to one specific mapping of RTS capabilities to hardware cores. This patch allows a user-specified mapping via a file in which each line is a thread affinity mask. Capabilities are mapped to masks by indexing into the lines, wrapping around if there are more capabilities than lines in the file. Each mask lists the allowed cores by number (zero-indexed), separated by spaces.
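To make the file format and the wrap-around rule concrete, here is a minimal Python sketch of the mapping described above. It is an illustration only, not the patch's RTS code: the function names `parse_affinity_masks` and `mask_for_capability` are hypothetical, the core numbers in the example are made up, and skipping blank lines is an assumption that the patch may not share.

```python
def parse_affinity_masks(text):
    """Parse one affinity mask per line.

    Each mask is a set of zero-indexed core numbers separated by spaces.
    """
    masks = []
    for line in text.splitlines():
        cores = {int(tok) for tok in line.split()}
        if cores:  # assumption: blank lines are skipped rather than rejected
            masks.append(cores)
    return masks


def mask_for_capability(masks, cap):
    """Pick the mask for capability number `cap`, wrapping around the file."""
    return masks[cap % len(masks)]


# Hypothetical two-line file: one mask per line.
example = "0 1 2 3\n36 37 38 39\n"
masks = parse_affinity_masks(example)

# With three capabilities and two lines, capability 2 wraps back to line 0.
for cap in range(3):
    print(cap, sorted(mask_for_capability(masks, cap)))
```

In this sketch, capabilities 0 and 2 share the first mask and capability 1 gets the second, which is the wrap-around behavior for the case of more capabilities than lines.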
I've been using this for the last three months, and it has been critical to getting consistent results on a large machine: in my case, 72 threads across two sockets.
This patch is not comprehensive and only applies to Linux. There may be better directions to take, but hopefully it will start the conversation, and through some discussion we can arrive at a good solution.
Things that still need to be addressed:
- Supporting other OSs.
- Tests for parsing.
- RTS flags for convenient common configurations.