runtime, syscall: reimplement AllThreadsSyscall using only signals.

In issue 50113, we see that a thread blocked in a system call can result
in a hang of AllThreadsSyscall. To resolve this, we must send a signal
to these threads to knock them out of the system call long enough to run
the per-thread syscall.

Stepping back, if we need to send signals anyway, it should be possible
to implement this entire mechanism on top of signals. This CL does so,
vastly simplifying the mechanism, both as a direct result of
newly-unnecessary code as well as some ancillary simplifications to make
things simpler to follow.

Major changes:

* The rest of the mechanism is moved to os_linux.go, with fields in mOS
  instead of m itself.
* 'Fixup' fields and functions are renamed to 'perThreadSyscall' so they
  are more precise about their purpose.
* Rather than getting passed a closure, doAllThreadsSyscall takes the
  syscall number and arguments. This avoids a lot of hairy behavior:
    * The closure may potentially only be live in fields in the M,
      hidden from the GC. Not necessary with no closure.
    * The need to loan out the race context. A direct RawSyscall6 call
      does not require any race context.
    * The closure previously conditionally panicked in strange
      locations, like a signal handler. Now we simply throw.
* All manual fixup synchronization with mPark, sysmon, templateThread,
  sigqueue, etc is gone. The core approach is much simpler:
  doAllThreadsSyscall sends a signal to every thread in allm, which
  executes the system call from the signal handler. We use (SIGRTMIN +
  1), aka SIGSETXID, the same signal used by glibc for this purpose. As
  such, we are careful to only handle this signal on non-cgo binaries.

Synchronization with thread creation is a key part of this CL. The
comment near the top of doAllThreadsSyscall describes the required
synchronization semantics and how they are achieved.

Note that current use of allocmLock protects the state mutations of allm
that are also protected by sched.lock. allocmLock is used instead of
sched.lock simply to avoid holding sched.lock for so long.

Fixes #50113

Change-Id: Ic7ea856dc66cf711731540a54996e08fc986ce84
Reviewed-on: https://go-review.googlesource.com/c/go/+/383434
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
This commit is contained in:
Michael Pratt 2022-02-04 17:15:28 -05:00
parent 0b321c9a7c
commit 0a5fae2a0e
35 changed files with 432 additions and 431 deletions

View file

@ -547,7 +547,6 @@ type m struct {
ncgo int32 // number of cgo calls currently in progress
cgoCallersUse uint32 // if non-zero, cgoCallers in use temporarily
cgoCallers *cgoCallers // cgo traceback if crashing in cgo call
doesPark bool // non-P running threads: sysmon and newmHandoff never use .park
park note
alllink *m // on allm
schedlink muintptr
@ -564,16 +563,6 @@ type m struct {
syscalltick uint32
freelink *m // on sched.freem
// mFixup is used to synchronize OS related m state
// (credentials etc) use mutex to access. To avoid deadlocks
// an atomic.Load() of used being zero in mDoFixupFn()
// guarantees fn is nil.
mFixup struct {
lock mutex
used uint32
fn func(bool) bool
}
// these are here because they are too large to be on the stack
// of low-level NOSPLIT functions.
libcall libcall
@ -817,10 +806,6 @@ type schedt struct {
sysmonwait uint32
sysmonnote note
// While true, sysmon not ready for mFixup calls.
// Accessed atomically.
sysmonStarting uint32
// safepointFn should be called on each P at the next GC
// safepoint if p.runSafePointFn is set.
safePointFn func(*p)
@ -838,8 +823,6 @@ type schedt struct {
// with the rest of the runtime.
sysmonlock mutex
_ uint32 // ensure timeToRun has 8-byte alignment
// timeToRun is a distribution of scheduling latencies, defined
// as the sum of time a G spends in the _Grunnable state before
// it transitions to _Grunning.
@ -856,7 +839,7 @@ const (
_SigPanic // if the signal is from the kernel, panic
_SigDefault // if the signal isn't explicitly requested, don't monitor it
_SigGoExit // cause all runtime procs to exit (only used on Plan 9).
_SigSetStack // add SA_ONSTACK to libc handler
_SigSetStack // Don't explicitly install handler, but add SA_ONSTACK to existing libc handler
_SigUnblock // always unblock; see blockableSig
_SigIgn // _SIG_DFL action is to ignore the signal
)