- User Since
- Jun 8 2014, 1:59 AM (215 w, 11 h)
Fri, Jul 20
Remove irrelevant changes
Wed, Jul 18
@Phyx should have a look at this. Also, can you run a benchmark? It's easy to make things theoretically faster but practically slower.
Mon, Jul 16
I don't think noDuplicate# has any well-defined semantics that we can rely on, so I'm not entirely surprised that you found it difficult to use it to fix Lazy ST.
Just the benchmarks. Could you run the benchmarks in nofib please? See https://ghc.haskell.org/trac/ghc/wiki/Building/RunningNoFib
@bgamari The code is being executed in lots of tests (e.g. threaded1 runs with -debug), it just isn't printing anything because we don't enable +RTS -DG in any tests. We could do that, but it might be sensitive to optimisation and refactoring in library code.
Note that the dependent diff still needs some revisions, so we can't push this yet.
Note looks good. Thankyou!
It doesn't look like this compiles.
Fri, Jul 13
@bgamari thanks for the review - all done I think.
address comments; rebase
Ok, but we need to document the assumptions carefully here. I believe this is it:
Core looks good. Would you mind adding tests? (ok to do it in a separate diff)
Yes, looks good to me. I think I was wrong with my comment before - we removed the distinction between static and dynamic constructors (which is why you only see dynamic CONSTR_NOCAFs) - so we'll create a CONSTR_NOCAF for any constructor that has no pointer fields.
@bgamari what happened to the OSX build here?
Ah, I was actually going to revise this to try to make it more efficient. Not to worry, I should remember to set it to Plan Changes next time.
Thu, Jul 12
For atomicSwapMutVar# it looks like we don't have the right primitive in Cmm yet, namely %xchg. I suppose we could defer that for a separate diff?
Huh, I wonder how that happened :)
Wed, Jul 11
fix various minor wibbles
Tue, Jul 10
Rebase + update T1969 stats
Fri, Jul 6
Hey @niteria, do you plan to finish this or shall we take it over?
Sat, Jun 30
I have vague memories that I thought about this and didn't think it would be safe to remove the lock, but silly me I didn't leave a comment.
Wed, Jun 27
So the padding is not there for alignment, it's there for cache line separation. I requested changes because I think this diff might be a regression (but this is complicated, different CPUs have different cache line sizes, and it's likely quite hard to measure the difference)
Requesting changes to this bit :)
The purpose of the padding is to ensure that sync is on a different cache line from some of the other fields, otherwise one thread spinning on the lock will interfere with other threads modifying other fields of the struct.
Mon, Jun 25
My question on D4886 applies here too - why do we need a loop?
Seems like a sensible idea. What is the effect on code size?
A test would be nice.
Sat, Jun 23
Ah, that was pretty silly. I opened a GHC proposal for this, but I'm beginning to wonder if that was overkill. https://github.com/ghc-proposals/ghc-proposals/pull/149
Jun 22 2018
The commit title and summary don't really tell me what this does - can you say what the goal is here? Is there a ticket?
Jun 21 2018
That's a lot of bits flipped for no functional changes :) I also wonder whether this is worth it. What is it about the Mach prefix that bothers you?
I'll take a look. On Linux there is only the one test showing an improvement, no failures.
I think my symbol-ordering question was a red herring. There are three kinds of non-exported top-level definitions:
Jun 19 2018
Jun 18 2018
@snowleopard You would also need to give it an empty definition in non-DEBUG configurations.
Jun 15 2018
We could make findPtr exist for all ways, I think that would be the simplest fix.
Jun 14 2018
Wait, how does this even work? Is this file processed by CPP? If so, is it processed multiple times for different ways?
Yeah, I think that's right. As a word of advice: be very careful with this code, it tends to be a bug farm :)
Jun 13 2018
I'm good with this. If there is still a problem it will be good to find out, and if not, we prevent whatever it was from recurring.
Jun 12 2018
Does this actually make the build succeed? I thought there was a bigger problem, namely Trac #15237
Jun 11 2018
Hmm, I wonder what fixed it!
Jun 8 2018
@Abhiroop thanks for the clarification!
Jun 7 2018
Jun 6 2018
Nice, I thought we would have to do more than this, but it looks sufficient.
Jun 5 2018
Could we have a little context for this diff please? A descriptive summary would be really great. What problem are you solving? Do you have a test plan and results?
Ok by me, but maybe @simonpj knows a cleaner way to do it.
The cure seems a bit worse than the disease to me. In exchange for removing a thunk in some rare cases, we get an extra case every time we use evaluate.
Instead of this I think we should respect the CLEAN/DIRTY distinction in scavenge_mutable_list. It's really a bug that we don't do that.
Jun 4 2018
Yes, this is a good idea, I never liked the 0-suffix naming terminology.
Jun 2 2018
Ok by me if you can get it to validate.
Might be an idea to do a spot-check on the performance impact of doing this. We don't mind paying for assertions as long as they don't kill performance too much, then we put them behind +RTS -DS.
Jun 1 2018
May 28 2018
Could you try a benchmark or two to find out the benefit of this?