mirror of
https://github.com/golang/go.git
synced 2025-12-08 06:10:04 +00:00
runtime, syscall: reimplement AllThreadsSyscall using only signals.

In issue 50113, we see that a thread blocked in a system call can result
in a hang of AllThreadsSyscall. To resolve this, we must send a signal
to these threads to knock them out of the system call long enough to run
the per-thread syscall.

Stepping back, if we need to send signals anyway, it should be possible
to implement this entire mechanism on top of signals. This CL does so,
vastly simplifying the mechanism, both by deleting newly-unnecessary
code and through some ancillary cleanups that make things easier to
follow.

Major changes:

* The rest of the mechanism is moved to os_linux.go, with fields in mOS
  instead of m itself.
* 'Fixup' fields and functions are renamed to 'perThreadSyscall' so they
  are more precise about their purpose.
* Rather than getting passed a closure, doAllThreadsSyscall takes the
  syscall number and arguments. This avoids a lot of hairy behavior:
  * The closure may potentially only be live in fields in the M,
    hidden from the GC. Not necessary with no closure.
  * The need to loan out the race context. A direct RawSyscall6 call
    does not require any race context.
  * The closure previously conditionally panicked in strange
    locations, like a signal handler. Now we simply throw.
* All manual fixup synchronization with mPark, sysmon, templateThread,
  sigqueue, etc is gone. The core approach is much simpler:
  doAllThreadsSyscall sends a signal to every thread in allm, which
  executes the system call from the signal handler. We use (SIGRTMIN +
  1), aka SIGSETXID, the same signal used by glibc for this purpose. As
  such, we are careful to only handle this signal on non-cgo binaries.
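
The core approach described in the list above can be modeled in ordinary Go.
This is only a sketch, not the runtime code: goroutines stand in for OS
threads, a channel send stands in for signal delivery, and the names
perThreadArgs, thread, fakeSyscall, startThreads, and doAllThreads are
hypothetical. It shows the request travelling as plain data (a syscall number
plus arguments) rather than as a closure, and the coordinator checking that
every thread saw the same result.

```go
package main

import (
	"fmt"
	"sync"
)

// perThreadArgs is a hypothetical plain-data request: a syscall number
// and its arguments. No closure is needed to describe the work.
type perThreadArgs struct {
	trap, a1, a2, a3 uintptr
}

// thread models one OS thread. Sending on sig stands in for delivering
// the signal; res carries the per-thread result back.
type thread struct {
	sig chan perThreadArgs
	res chan uintptr
}

// fakeSyscall is a placeholder for the raw system call. It is
// deterministic so that every thread computes the same result.
func fakeSyscall(a perThreadArgs) uintptr {
	return a.trap + a.a1 + a.a2 + a.a3
}

// startThreads launches n model threads and returns them together with
// a stop function that shuts them down and waits for them to exit.
func startThreads(n int) ([]*thread, func()) {
	var wg sync.WaitGroup
	threads := make([]*thread, n)
	for i := range threads {
		t := &thread{sig: make(chan perThreadArgs), res: make(chan uintptr, 1)}
		threads[i] = t
		wg.Add(1)
		go func() {
			defer wg.Done()
			// The loop body plays the role of the signal handler.
			for a := range t.sig {
				t.res <- fakeSyscall(a)
			}
		}()
	}
	stop := func() {
		for _, t := range threads {
			close(t.sig)
		}
		wg.Wait()
	}
	return threads, stop
}

// doAllThreads "signals" every thread, waits for each one to run the
// call, and reports whether all results agreed. This model returns a
// boolean on mismatch instead of throwing.
func doAllThreads(threads []*thread, a perThreadArgs) (uintptr, bool) {
	for _, t := range threads {
		t.sig <- a
	}
	var first uintptr
	ok := true
	for i, t := range threads {
		r := <-t.res
		if i == 0 {
			first = r
		} else if r != first {
			ok = false
		}
	}
	return first, ok
}

func main() {
	threads, stop := startThreads(4)
	defer stop()
	r, ok := doAllThreads(threads, perThreadArgs{trap: 1, a1: 2, a2: 3, a3: 4})
	fmt.Println(r, ok)
}
```

Because the request is a value, nothing here needs to be kept alive for a
garbage collector or lent a race context, which is the simplification the
closure-free design is aiming for.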

Synchronization with thread creation is a key part of this CL. The
comment near the top of doAllThreadsSyscall describes the required
synchronization semantics and how they are achieved.

Note that the current use of allocmLock protects state mutations of allm
that are also protected by sched.lock. allocmLock is used instead of
sched.lock simply to avoid holding sched.lock for so long.

Fixes #50113
Change-Id: Ic7ea856dc66cf711731540a54996e08fc986ce84
Reviewed-on: https://go-review.googlesource.com/c/go/+/383434
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
parent 0b321c9a7c
commit 0a5fae2a0e
35 changed files with 432 additions and 431 deletions

@@ -547,7 +547,6 @@ type m struct {
 	ncgo          int32       // number of cgo calls currently in progress
 	cgoCallersUse uint32      // if non-zero, cgoCallers in use temporarily
 	cgoCallers    *cgoCallers // cgo traceback if crashing in cgo call
-	doesPark      bool        // non-P running threads: sysmon and newmHandoff never use .park
 	park          note
 	alllink       *m // on allm
 	schedlink     muintptr
@@ -564,16 +563,6 @@ type m struct {
 	syscalltick uint32
 	freelink    *m // on sched.freem
 
-	// mFixup is used to synchronize OS related m state
-	// (credentials etc) use mutex to access. To avoid deadlocks
-	// an atomic.Load() of used being zero in mDoFixupFn()
-	// guarantees fn is nil.
-	mFixup struct {
-		lock mutex
-		used uint32
-		fn   func(bool) bool
-	}
-
 	// these are here because they are too large to be on the stack
 	// of low-level NOSPLIT functions.
 	libcall libcall
@@ -817,10 +806,6 @@ type schedt struct {
 	sysmonwait uint32
 	sysmonnote note
 
-	// While true, sysmon not ready for mFixup calls.
-	// Accessed atomically.
-	sysmonStarting uint32
-
 	// safepointFn should be called on each P at the next GC
 	// safepoint if p.runSafePointFn is set.
 	safePointFn func(*p)
@@ -838,8 +823,6 @@ type schedt struct {
 	// with the rest of the runtime.
 	sysmonlock mutex
 
-	_ uint32 // ensure timeToRun has 8-byte alignment
-
 	// timeToRun is a distribution of scheduling latencies, defined
 	// as the sum of time a G spends in the _Grunnable state before
 	// it transitions to _Grunning.
@@ -856,7 +839,7 @@ const (
 	_SigPanic    // if the signal is from the kernel, panic
 	_SigDefault  // if the signal isn't explicitly requested, don't monitor it
 	_SigGoExit   // cause all runtime procs to exit (only used on Plan 9).
-	_SigSetStack // add SA_ONSTACK to libc handler
+	_SigSetStack // Don't explicitly install handler, but add SA_ONSTACK to existing libc handler
 	_SigUnblock  // always unblock; see blockableSig
 	_SigIgn      // _SIG_DFL action is to ignore the signal
 )