runtime: Remove write barriers during STW.

The GC assumes that there will be no asynchronous write barriers when
the world is stopped. This keeps the synchronization between write
barriers and the GC simple. However, currently, there are a few places
in runtime code where this assumption does not hold.
The GC stops the world by collecting all Ps, which stops all user Go
code, but small parts of the runtime can run without a P. For example,
the code that releases a P must still deschedule its G onto a runnable
queue before stopping. Similarly, when a G returns from a long-running
syscall, it must run code to reacquire a P.
Currently, this code can contain write barriers. This can lead to the
GC collecting reachable objects if something like the following
sequence of events happens:
1. GC stops the world by collecting all Ps.
2. G #1 returns from a syscall (for example), tries to install a
pointer to object X, and calls greyobject on X.
3. greyobject on G #1 marks X, but does not yet add it to a write
buffer. At this point, X is effectively black, not grey, even though
it may point to white objects.
4. GC reaches X through some other path and calls greyobject on X, but
greyobject does nothing because X is already marked.
5. GC completes.
6. greyobject on G #1 adds X to a work buffer, but it's too late.
7. Objects that were reachable only through X are incorrectly collected.
To fix this, we check the invariant that no asynchronous write
barriers happen when the world is stopped by checking that write
barriers always have a P, and modify all currently known sources of
these writes to disable the write barrier. In all modified cases this
is safe because the object in question will always be reachable via
some other path.

Some of the trace code was turned off, in particular the
code that traces returning from a syscall. The GC assumes
that as far as the heap is concerned the thread is stopped
when it is in a syscall. Upon returning the trace code
must not do any heap writes for the same reasons discussed
above.

Fixes #10098
Fixes #9953
Fixes #9951
Fixes #9884

May relate to #9610 #9771

Change-Id: Ic2e70b7caffa053e56156838eb8d89503e3c0c8a
Reviewed-on: https://go-review.googlesource.com/7504
Reviewed-by: Austin Clements <austin@google.com>
This commit is contained in:
Rick Hudson 2015-03-12 14:19:21 -04:00
parent ce9b512ccc
commit 41dbcc19ef
4 changed files with 122 additions and 34 deletions

View file

@ -83,6 +83,19 @@ func needwb() bool {
return gcphase == _GCmark || gcphase == _GCmarktermination || mheap_.shadow_enabled
}
// Write barrier calls must not happen during critical GC and scheduler
// related operations. In particular there are times when the GC assumes
// that the world is stopped but scheduler related code is still being
// executed, dealing with syscalls, dealing with putting gs on runnable
// queues and so forth. This code can not execute write barriers because
// the GC might drop them on the floor. Stopping the world involves removing
// the p associated with an m. We use the fact that m.p == nil to indicate
// that we are in one these critical section and throw if the write is of
// a pointer to a heap object.
// The p, m, and g pointers are the pointers that are used by the scheduler
// and need to be operated on without write barriers. We use
// the setPNoWriteBarrier, setMNoWriteBarrier and setGNowriteBarrier to
// avoid having to do the write barrier.
//go:nosplit
func writebarrierptr_nostore1(dst *uintptr, src uintptr) {
mp := acquirem()
@ -90,8 +103,11 @@ func writebarrierptr_nostore1(dst *uintptr, src uintptr) {
releasem(mp)
return
}
mp.inwb = true
systemstack(func() {
if mp.p == nil && memstats.enablegc && !mp.inwb && inheap(src) {
throw("writebarrierptr_nostore1 called with mp.p == nil")
}
mp.inwb = true
gcmarkwb_m(dst, src)
})
mp.inwb = false