Fix processHeapClosureForDead CONSTR_NOCAF case:
ClosedPublic

Authored by osa1 on Tue, Jul 3, 4:23 AM.

Details

Summary

CONSTR_NOCAF was introduced with 55d535da10d as a replacement for
CONSTR_STATIC and CONSTR_NOCAF_STATIC, however, as explained in Note
[static constructors], we copy CONSTR_NOCAFs (which can also be seen in
evacuate) during GC, and they can become dead, like other CONSTR_X_Ys.
processHeapClosureForDead is updated to reflect this.

Test Plan

Validates on x86_64. Existing failures on i386.

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
osa1 created this revision.Tue, Jul 3, 4:23 AM
osa1 added a comment.EditedTue, Jul 3, 4:33 AM

This patch was previously submitted as D4567, but it got merged accidentally and then revereted and now I can't reopen it, so I'm submitting a new one.

Since D4567 Trac #7836 is submitted three times: Trac #15063, Trac #15087, Trac #15165.

@simonmar said in D4567:

Did you actually observe this going wrong? I imagine one of these could only
appear in a compact region, and I don't think we currently
processHeapClosureForDead() when we release a compact region. (we probably
should, that was an oversight)

To check this I added some print statements in the compiler and I can see that we're generating dynamic CONSTR_NOCAF closures. Print statements I've added:

diff --git a/compiler/cmm/SMRep.hs b/compiler/cmm/SMRep.hs
index 743631527e..a6b6338c70 100644
--- a/compiler/cmm/SMRep.hs
+++ b/compiler/cmm/SMRep.hs
@@ -437,7 +437,8 @@ rtsClosureType rep
       HeapRep _     2 0 Constr{} -> CONSTR_2_0
       HeapRep _     1 1 Constr{} -> CONSTR_1_1
       HeapRep _     0 2 Constr{} -> CONSTR_0_2
-      HeapRep _     0 _ Constr{} -> CONSTR_NOCAF
+      HeapRep True  0 _ c@Constr{} -> pprTrace "rtsClosureType" (text "Allocating a static CONSTR_NOCAF for" <+> ppr c) CONSTR_NOCAF
+      HeapRep False 0 _ c@Constr{} -> pprTrace "rtsClosureType" (text "Allocating a dynamic CONSTR_NOCAF for" <+> ppr c) CONSTR_NOCAF
            -- See Note [Static NoCaf constructors]
       HeapRep _     _ _ Constr{} -> CONSTR

In fact when booting GHC I only see dynamic CONSTR_NOCAFs, never a static CONSTR_NOCAF.

Secondly, I also checked the backtrace:

(gdb) bt
#0  0xf7d8be2c in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0xf7d8d4b3 in __GI_abort () at abort.c:89
#2  0x080ec0b6 in rtsFatalInternalErrorFn (s=0x813a674 "Invalid object in processHeapClosureForDead(): %d", ap=0xffffb374 "\a") at rts/RtsMessages.c:186
#3  0x080ec195 in barf (s=0x813a674 "Invalid object in processHeapClosureForDead(): %d") at rts/RtsMessages.c:48
#4  0x080f88ee in processHeapClosureForDead (c=0xf7c04938) at rts/LdvProfile.c:146
#5  processNurseryForDead () at rts/LdvProfile.c:189
#6  LdvCensusForDead (N=N@entry=1) at rts/LdvProfile.c:233
#7  0x080f8c13 in LdvCensusForDead (N=1) at rts/LdvProfile.c:224
#8  0x08101e75 in GarbageCollect (collect_gen=1, do_heap_census=false, gc_type=1, cap=0x815b600 <MainCapability>, idle_cap=0x816fa10) at rts/sm/GC.c:454
#9  0x080eeb5c in scheduleDoGC (pcap=0x0, task=0x6, force_major=132) at rts/Schedule.c:1797
#10 0x0804aaa5 in exitScheduler (wait_foreign=false) at rts/Schedule.c:2655
#11 0x080edb4d in hs_exit_ (wait_foreign=<optimized out>) at rts/RtsStartup.c:387
#12 0x080ee01e in shutdownHaskellAndExit (n=0, fastExit=0) at rts/RtsStartup.c:545
#13 0x0809791d in r70X_info ()
#14 0x00000000 in ?? ()

So we find the closure in a nursery.

I think it's clear that we're allocating CONSTR_NOCAFs on the heap and we need to deal with that in processHeapClosureForDead.

osa1 edited the test plan for this revision. (Show Details)Tue, Jul 3, 5:16 AM
bgamari accepted this revision.Thu, Jul 12, 5:03 PM

Looks like pretty compelling evidence to me. Thanks!

This revision is now accepted and ready to land.Thu, Jul 12, 5:03 PM
osa1 added a comment.Fri, Jul 13, 4:34 AM

@simonmar could you take a look at the patch and my explanation? Am I missing anything?

simonmar accepted this revision.Fri, Jul 13, 6:12 AM

Yes, looks good to me. I think I was wrong with my comment before - we removed the distinction between static and dynamic constructors (which is why you only see dynamic CONSTR_NOCAFs) - so we'll create a CONSTR_NOCAF for any constructor that has no pointer fields.

This revision was automatically updated to reflect the committed changes.