This reverts CL 481061.
Reason for revert: When built with C TSAN, x_cgo_getstackbound triggers
race detection on `g->stacklo` because the synchronization is in Go,
which isn't instrumented.
For #51676.
For #59294.
For #59678.
Change-Id: I38afcda9fcffd6537582a39a5214bc23dc147d47
Reviewed-on: https://go-review.googlesource.com/c/go/+/485275
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Previously we did not permit them as Go programs generated enormous
core dumps on macOS. However, according to an investigation in #59446,
they are OK now.
For #59446
Change-Id: I1d7a3f500a6bc525aa6de8dfa8a1d8dbb15feadc
Reviewed-on: https://go-review.googlesource.com/c/go/+/483015
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
This reapplies CL 392854, with the followup fixes in CL 479255,
CL 479915, and CL 481057 incorporated.
CL 392854, by doujiang24 <doujiang24@gmail.com>, speed up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 479255 is a followup fix for a small bug in ARM assembly code.
CL 479915 is another followup fix to address C to Go calls after
the C code uses some stack, but that CL is also buggy.
CL 481057, by Michael Knyszek, is a followup fix for a memory leak
bug of CL 479915.
[Original CL 392854 description]
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls.
So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call.
Instead, we only dropm while the C thread exits, so the extra M won't leak.
When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits.
When returning back to C:
Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C.
This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.
This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread.
For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
[CL 479915 description]
Currently, when C calls into Go the first time, we grab an M
using needm, which sets m.g0's stack bounds using the SP. We don't
know how big the stack is, so we simply assume 32K. Previously,
when the Go function returns to C, we drop the M, and the next
time C calls into Go, we put a new stack bound on the g0 based on
the current SP. After CL 392854, we don't drop the M, and the next
time C calls into Go, we reuse the same g0, without recomputing
the stack bounds. If the C code uses quite a bit of stack space
before calling into Go, the SP may be well below the 32K stack
bound we assumed, so the runtime thinks the g0 stack overflows.
This CL makes needm get a more accurate stack bound from
pthread. (In some platforms this may still be a guess as we don't
know exactly where we are in the C stack), but it is probably
better than simply assuming 32K.
Fixes#51676.
Fixes#59294.
Change-Id: I9bf1400106d5c08ce621d2ed1df3a2d9e3f55494
Reviewed-on: https://go-review.googlesource.com/c/go/+/481061
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: DeJiang Zhu (doujiang) <doujiang24@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
This reverts CL 392854.
Reason for revert: caused #59294, which was derived from google
internal tests. The attempted fix of #59294 caused more breakage.
Change-Id: I5a061561ac2740856b7ecc09725ac28bd30f8bba
Reviewed-on: https://go-review.googlesource.com/c/go/+/481060
Reviewed-by: Heschi Kreinick <heschi@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
This reverts CL 479915.
Reason for revert: breaks a lot google internal tests.
Change-Id: I13a9422e810af7ba58cbf4a7e6e55f4d8cc0ca51
Reviewed-on: https://go-review.googlesource.com/c/go/+/481055
Reviewed-by: Chressie Himpel <chressie@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Currently, when C calls into Go the first time, we grab an M
using needm, which sets m.g0's stack bounds using the SP. We don't
know how big the stack is, so we simply assume 32K. Previously,
when the Go function returns to C, we drop the M, and the next
time C calls into Go, we put a new stack bound on the g0 based on
the current SP. After CL 392854, we don't drop the M, and the next
time C calls into Go, we reuse the same g0, without recomputing
the stack bounds. If the C code uses quite a bit of stack space
before calling into Go, the SP may be well below the 32K stack
bound we assumed, so the runtime thinks the g0 stack overflows.
This CL makes needm get a more accurate stack bound from
pthread. (In some platforms this may still be a guess as we don't
know exactly where we are in the C stack), but it is probably
better than simply assuming 32K.
For #59294.
Change-Id: Ie52a8f931e0648d8753e4c1dbe45468b8748b527
Reviewed-on: https://go-review.googlesource.com/c/go/+/479915
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
If a panicking signal (e.g. SIGSEGV) happens on a g0 stack, we're
either in the runtime or running C code. Either way we cannot
recover and sigpanic will immediately throw. Further, injecting a
sigpanic could make the C stack unwinder and the debugger fail to
unwind the stack. So don't inject a sigpanic.
If we have cgo traceback and symbolizer attached, if it panics in
a C function ("CF" for the example below), previously it shows
something like
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x45f1ef]
runtime stack:
runtime.throw({0x485460?, 0x0?})
.../runtime/panic.go:1076 +0x5c fp=0x7ffd77f60f58 sp=0x7ffd77f60f28 pc=0x42e39c
runtime.sigpanic()
.../runtime/signal_unix.go:821 +0x3e9 fp=0x7ffd77f60fb8 sp=0x7ffd77f60f58 pc=0x442229
goroutine 1 [syscall]:
CF
/tmp/pp/c.c:6 pc=0x45f1ef
runtime.asmcgocall
.../runtime/asm_amd64.s:869 pc=0x458007
runtime.cgocall(0x45f1d0, 0xc000053f70)
.../runtime/cgocall.go:158 +0x51 fp=0xc000053f48 sp=0xc000053f10 pc=0x404551
main._Cfunc_CF()
_cgo_gotypes.go:39 +0x3f fp=0xc000053f70 sp=0xc000053f48 pc=0x45f0bf
Now it shows
SIGSEGV: segmentation violation
PC=0x45f1ef m=0 sigcode=1
signal arrived during cgo execution
goroutine 1 [syscall]:
CF
/tmp/pp/c.c:6 pc=0x45f1ef
runtime.asmcgocall
.../runtime/asm_amd64.s:869 pc=0x458007
runtime.cgocall(0x45f1d0, 0xc00004ef70)
.../runtime/cgocall.go:158 +0x51 fp=0xc00004ef48 sp=0xc00004ef10 pc=0x404551
main._Cfunc_CF()
_cgo_gotypes.go:39 +0x3f fp=0xc00004ef70 sp=0xc00004ef48 pc=0x45f0bf
I think the new one is reasonable.
For #57698.
Change-Id: I4f7af91761374e9b569dce4c7587499d4799137e
Reviewed-on: https://go-review.googlesource.com/c/go/+/462437
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls.
So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call.
Instead, we only dropm while the C thread exits, so the extra M won't leak.
When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specified value of the pthread key, only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits.
When returning back to C:
Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C.
This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.
This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread.
For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
Fixes#51676
Change-Id: I380702fe2f9b6b401b2d6f04b0aba990f4b9ee6c
GitHub-Last-Rev: 93dc64ad98
GitHub-Pull-Request: golang/go#51679
Reviewed-on: https://go-review.googlesource.com/c/go/+/392854
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: thepudds <thepudds1460@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Use a single writeErrStr function. Avoid using global variables.
Use a single version of some error messages rather than duplicating
the messages in OS-specific files.
Change-Id: If259fbe78faf797f0a21337d14472160ca03efa0
Reviewed-on: https://go-review.googlesource.com/c/go/+/447055
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
This change adds an API to the runtime for arenas. A later CL can
potentially export it as an experimental API, but for now, just the
runtime implementation will suffice.
The purpose of arenas is to improve efficiency, primarily by allowing
for an application to manually free memory, thereby delaying garbage
collection. It comes with other potential performance benefits, such as
better locality, a better allocation strategy, and better handling of
interior pointers by the GC.
This implementation is based on one by danscales@google.com with a few
significant differences:
* The implementation lives entirely in the runtime (all layers).
* Arena chunks are the minimum of 8 MiB or the heap arena size. This
choice is made because in practice 64 MiB appears to be way too large
of an area for most real-world use-cases.
* Arena chunks are not unmapped, instead they're placed on an evacuation
list and when there are no pointers left pointing into them, they're
allowed to be reused.
* Reusing partially-used arena chunks no longer tries to find one used
by the same P first; it just takes the first one available.
* In order to ensure worst-case fragmentation is never worse than 25%,
only types and slice backing stores whose sizes are 1/4th the size of
a chunk or less may be used. Previously larger sizes, up to the size
of the chunk, were allowed.
* ASAN, MSAN, and the race detector are fully supported.
* Sets arena chunks to fault that were deferred at the end of mark
termination (a non-public patch once did this; I don't see a reason
not to continue that).
For #51317.
Change-Id: I83b1693a17302554cb36b6daa4e9249a81b1644f
Reviewed-on: https://go-review.googlesource.com/c/go/+/423359
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
On Linux a signal sent using tgkill will have si_code == SI_TKILL,
not SI_USER. Treat the two cases the same. Add a Linux-specific test.
Change the test to use the C pause function rather than sleeping
for a second, as that achieves the same effect.
This is a roll forward of CL 431255 which was rolled back in CL 431715.
This new version skips flaky tests on more systems, and marks a new method
nosplit.
Change-Id: Ibf2d3e6fc43d63d0a71afa8fcca6a11fda03f291
Reviewed-on: https://go-review.googlesource.com/c/go/+/432136
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
On Linux a signal sent using tgkill will have si_code == SI_TKILL,
not SI_USER. Treat the two cases the same. Add a Linux-specific test.
Change the test to use the C pause function rather than sleeping
for a second, as that achieves the same effect.
Change-Id: I2a36646aecabcab9ec42ed9a048b07c2ff0a3987
Reviewed-on: https://go-review.googlesource.com/c/go/+/431255
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
This converts several unsynchronized reads (reads without holding
prof.signalLock) into atomic reads.
For #53821.
For #52912.
Change-Id: I421b96a22fbe26d699bcc21010c8a9e0f4efc276
Reviewed-on: https://go-review.googlesource.com/c/go/+/420196
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
sighandler has gp, the goroutine running when the signal arrived, and
gsignal, the goroutine executing the signal handler. The latter is
usually mp.gsignal, except in the case noted by the delayedSignal check.
Like previous CLs, cases where the getg() G is used only to access the M
are replaced with direct uses of mp.
Change-Id: I2dc7894da7004af17682712e07a0be5f9a235d81
Reviewed-on: https://go-review.googlesource.com/c/go/+/418580
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Pratt <mpratt@google.com>
* The gp argument to canpanic is always equivalent to getg(), so no need
to pass it at all.
* gp must not be nil or _g_.m would have crashed, so no need to check
for nil.
* Use acquirem to better reason about preemption.
Change-Id: Ic7dc8dc1e56ab4c1644965f6aeba16807cdb2df4
Reviewed-on: https://go-review.googlesource.com/c/go/+/418575
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Pratt <mpratt@google.com>
Cgo TSAN (not the Go race detector) intercepts signals and calls
the signal handler at a later time. When the signal handler is
called, the memory may have changed, but the signal context
remains old. As the signal context and the memory don't match, it
is unsafe to unwind the stack from the signal PC and SP. We have
to ignore the signal.
It is probably also not safe to do async preemption, which relies
on the signal PC, and inspects and even writes to the stack (for
call injection).
We also inspect the stack for fatal signals (e.g. SIGSEGV), but I
think they are not delayed. For other signals we don't inspect
the stack, so they are probably fine.
Fixes#27540.
Change-Id: I5c80a7512265b8ea4a91422954dbff32c6c3a0d0
Reviewed-on: https://go-review.googlesource.com/c/go/+/408218
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
This change adds support for vDSO for s390x architecture. This avoids the use of system calls in nanotime and walltime and accelerates them by factor 4-5.
Benchmarks:
100,000,000 x time.Now():
syscall fallback 13923ms 139.23 ns/op
vDSO enabled 2640ms 26.40 ns/op
Change-Id: Ic679fe31048379e59ccf83b400140f13c9d49696
GitHub-Last-Rev: 8f6e918a45
GitHub-Pull-Request: golang/go#49717
Reviewed-on: https://go-review.googlesource.com/c/go/+/365995
Run-TryBot: Paul Murphy <murp@ibm.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Jonathan Albrecht <jonathan.albrecht@ibm.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Bill O'Farrell <billotosyr@gmail.com>
This gives explicit names to the possible states of throwing (-1, 0, 1).
m.throwing is now one of:
throwTypeOff: not throwing, previously == 0
throwTypeUser: user throw, previously == -1
throwTypeRuntime: runtime throw, previously == 1
For runtime throws, we now always include frame metadata and system
goroutines regardless of GOTRACEBACK to aid in debugging the runtime.
For user throws, we no longer include frame metadata or runtime frames,
unless GOTRACEBACK=system or higher.
For #51485.
Change-Id: If252e2377a0b6385ce7756b937929be4273a56c0
Reviewed-on: https://go-review.googlesource.com/c/go/+/390421
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
No test because we already have a test in the syscall package.
The issue reports 1 failure per 100,000 iterations, which is rare enough
that our builders won't catch the problem.
Fixes#52226
Change-Id: I17633ff6cf676b6d575356186dce42cdacad0746
Reviewed-on: https://go-review.googlesource.com/c/go/+/400315
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
A future change to gofmt will rewrite
// Doc comment.
//go:foo
to
// Doc comment.
//
//go:foo
Apply that change preemptively to all comments (not necessarily just doc comments).
For #51082.
Change-Id: Iffe0285418d1e79d34526af3520b415a12203ca9
Reviewed-on: https://go-review.googlesource.com/c/go/+/384260
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
CL 383434 forgot to enable these paths for android, which is still linux
just not via GOOS.
Fixes#51213.
Change-Id: I102e53e8671403ded6edb4ba04789154d7a0730b
Reviewed-on: https://go-review.googlesource.com/c/go/+/385954
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
In issue 50113, we see that a thread blocked in a system call can result
in a hang of AllThreadsSyscall. To resolve this, we must send a signal
to these threads to knock them out of the system call long enough to run
the per-thread syscall.
Stepping back, if we need to send signals anyway, it should be possible
to implement this entire mechanism on top of signals. This CL does so,
vastly simplifying the mechanism, both as a direct result of
newly-unnecessary code as well as some ancillary simplifications to make
things simpler to follow.
Major changes:
* The rest of the mechanism is moved to os_linux.go, with fields in mOS
instead of m itself.
* 'Fixup' fields and functions are renamed to 'perThreadSyscall' so they
are more precise about their purpose.
* Rather than getting passed a closure, doAllThreadsSyscall takes the
syscall number and arguments. This avoids a lot of hairy behavior:
* The closure may potentially only be live in fields in the M,
hidden from the GC. Not necessary with no closure.
* The need to loan out the race context. A direct RawSyscall6 call
does not require any race context.
* The closure previously conditionally panicked in strange
locations, like a signal handler. Now we simply throw.
* All manual fixup synchronization with mPark, sysmon, templateThread,
sigqueue, etc is gone. The core approach is much simpler:
doAllThreadsSyscall sends a signal to every thread in allm, which
executes the system call from the signal handler. We use (SIGRTMIN +
1), aka SIGSETXID, the same signal used by glibc for this purpose. As
such, we are careful to only handle this signal on non-cgo binaries.
Synchronization with thread creation is a key part of this CL. The
comment near the top of doAllThreadsSyscall describes the required
synchronization semantics and how they are achieved.
Note that current use of allocmLock protects the state mutations of allm
that are also protected by sched.lock. allocmLock is used instead of
sched.lock simply to avoid holding sched.lock for so long.
Fixes#50113
Change-Id: Ic7ea856dc66cf711731540a54996e08fc986ce84
Reviewed-on: https://go-review.googlesource.com/c/go/+/383434
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
In c-archive mode, when we turn off profiling, we restore the
previous handler for SIGPROF, and ignore SIGPROF signals if no
handler was installed. So if a pending signal lands after we
remove the Go signal handler, it will not kill the program.
In the current code there is a small window, where we can still
receive signals but we are set to not handling the signal. If a
signal lands in this window (possibly on another thread), it will
see that we are not handling this signal and no previous handler
installed, and kill the program. To avoid this race, we set the
previous handler to SIG_IGN (ignoring the signal) when turning on
profiling. So when turning off profiling we'll ignore the signal
even if a stray signal lands in the small window.
Fixes#43828.
Change-Id: I304bc85a93ca0e63b0c0d8e902b097bfdc8e3f1d
Reviewed-on: https://go-review.googlesource.com/c/go/+/374074
Trust: Cherry Mui <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Fixes#49288
Change-Id: I7bfcbecbefa68871a3e556935a73f241fff44c0e
Reviewed-on: https://go-review.googlesource.com/c/go/+/360861
Trust: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
When these packages are released as part of Go 1.18,
Go 1.16 will no longer be supported, so we can remove
the +build tags in these files.
Ran go fix -fix=buildtag std cmd and then reverted the bootstrapDirs
as defined in src/cmd/dist/buildtool.go, which need to continue
to build with Go 1.4 for now.
Also reverted src/vendor and src/cmd/vendor, which will need
to be updated in their own repos first.
Manual changes in runtime/pprof/mprof_test.go to adjust line numbers.
For #41184.
Change-Id: Ic0f93f7091295b6abc76ed5cd6e6746e1280861e
Reviewed-on: https://go-review.googlesource.com/c/go/+/344955
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
CL 64070 removed lockOSThread from the cgocall path, but didn't update
the signal-in-cgo detection in sighandler. As a result, signals that
arrive during a cgo call are treated like they arrived during Go
execution, breaking the traceback.
Update the cgo detection to fix the backtrace.
Fixes#47522
Change-Id: I61d77ba6465f55e3e6187246d79675ba8467ec23
Reviewed-on: https://go-review.googlesource.com/c/go/+/339989
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
The sigprofNonGo and sigprofNonGoPC functions are only used on unix-like
platforms. In preparation for unix-specific changes to sigprofNonGo,
move it (plus its close relative) to a unix-specific file.
Updates #35057
Change-Id: I9c814127c58612ea9a9fbd28a992b04ace5c604d
Reviewed-on: https://go-review.googlesource.com/c/go/+/351790
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Trust: David Chase <drchase@google.com>
This patch reinstates a fix for PowerPC with regard to making VDSO calls
while receiving a signal, and subsequently crashing. The crash happens
because certain VDSO calls can modify the r30 register, which is where g
is stored. This change was reverted for PowerPC because r30 is supposed
to be a non-volatile register. This is true, but that only makes a
guarantee across function calls, but not "within" a function call. This
patch was seemingly fine before because the Linux kernel still had hand
rolled assembly VDSO function calls, however with a recent change to C
function calls it seems the compiler used can generate instructions
which temporarily clobber r30. This means that when we receive a signal
during one of these calls the value of r30 will not be the g as the
runtime expects, causing a segfault.
You can see from this assembly dump how the register is clobbered during
the call:
(the following is from a 5.13rc2 kernel)
```
Dump of assembler code for function __cvdso_clock_gettime_data:
0x00007ffff7ff0700 <+0>: cmplwi r4,15
0x00007ffff7ff0704 <+4>: bgt 0x7ffff7ff07f0 <__cvdso_clock_gettime_data+240>
0x00007ffff7ff0708 <+8>: li r9,1
0x00007ffff7ff070c <+12>: slw r9,r9,r4
0x00007ffff7ff0710 <+16>: andi. r10,r9,2179
0x00007ffff7ff0714 <+20>: beq 0x7ffff7ff0810 <__cvdso_clock_gettime_data+272>
0x00007ffff7ff0718 <+24>: rldicr r10,r4,4,59
0x00007ffff7ff071c <+28>: lis r9,32767
0x00007ffff7ff0720 <+32>: std r30,-16(r1)
0x00007ffff7ff0724 <+36>: std r31,-8(r1)
0x00007ffff7ff0728 <+40>: add r6,r3,r10
0x00007ffff7ff072c <+44>: ori r4,r9,65535
0x00007ffff7ff0730 <+48>: lwz r8,0(r3)
0x00007ffff7ff0734 <+52>: andi. r9,r8,1
0x00007ffff7ff0738 <+56>: bne 0x7ffff7ff07d0 <__cvdso_clock_gettime_data+208>
0x00007ffff7ff073c <+60>: lwsync
0x00007ffff7ff0740 <+64>: mftb r30 <---- RIGHT HERE
=> 0x00007ffff7ff0744 <+68>: ld r12,40(r6)
```
What I believe is happening is that the kernel changed the PowerPC VDSO
calls to use standard C calls instead of using hand rolled assembly. The
hand rolled assembly calls never touched r30, so this change was safe to
roll back. That does not seem to be the case anymore as on the 5.13rc2
kernel the compiler *is* generating assembly which modifies r30, making
this change again unsafe and causing a crash when the program receives a
signal during these calls (which will happen often due to async
preempt). This change happened here:
https://lwn.net/ml/linux-kernel/235e5571959cfa89ced081d7e838ed5ff38447d2.1601365870.git.christophe.leroy@csgroup.eu/.
I realize this was reverted due to unexplained hangs in PowerPC
builders, but I think we should reinstate this change and investigate
those issues separately:
f4ca3c1e0aFixes#46803
Change-Id: Ib18d7bbfc80a1a9cb558f0098878d41081324b52
GitHub-Last-Rev: c3002bcfca
GitHub-Pull-Request: golang/go#46767
Reviewed-on: https://go-review.googlesource.com/c/go/+/328110
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
Otherwise, in c-archive or c-shared mode, there is the chance of
getting a SIGPROF just after the signal handler is removed but before
profiling is disabled, in which case the program will die.
Fixes#46498
Change-Id: I5492beef45fec9fb9a7f58724356d6aedaf799ac
Reviewed-on: https://go-review.googlesource.com/c/go/+/329290
Trust: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
At this point all funcPC references are ABIInternal functions.
Replace with the intrinsics.
Change-Id: I3ba7e485c83017408749b53f92877d3727a75e27
Reviewed-on: https://go-review.googlesource.com/c/go/+/321954
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
There are a few assembly functions in the runtime that are marked
as ABIInternal, solely because funcPC can get the right address.
The functions themselves do not actually follow ABIInternal (or
irrelevant). Now we have internal/abi.FuncPCABI0, use that, and
un-mark the functions.
Also un-mark assembly functions that are only called in assembly.
For them, it only matters if the caller and callee are consistent.
Change-Id: I240e126ac13cb362f61ff8482057ee9f53c24097
Reviewed-on: https://go-review.googlesource.com/c/go/+/321950
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Make all our package sources use Go 1.17 gofmt format
(adding //go:build lines).
Part of //go:build change (#41184).
See https://golang.org/design/draft-gobuild
Change-Id: Ia0534360e4957e58cd9a18429c39d0e32a6addb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/294430
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
In the signal handler, we adjust gsingal's stack to the stack
where the signal is delivered. TSAN may deliver signals to the
g0 stack, so we have a special case for the g0 stack. However,
we don't have very good accuracy in determining the g0 stack's
bounds, as it is system allocated and we don't know where it is
exactly. If g0.stack.lo is too low, the condition may be
triggered incorrectly, where we thought the signal is delivered to
the g0 stack but it is actually not. In this case, as the stack
bounds is actually wrong, when the stack grows, it may go below
the (inaccurate) lower bound, causing "morestack on gsignal"
crash.
Check for g0 stack last to avoid this situation. There could still
be false positives, but for those cases we'll crash either way.
(If we could in some way determine the g0 stack bounds accurately,
this would not matter (but probably doesn't hurt).)
Fixes#43853.
Change-Id: I759717c5aa2b0deb83ffb23e57b7625a6b249ee8
Reviewed-on: https://go-review.googlesource.com/c/go/+/285772
Trust: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Under linux+cgo, OS threads are launched via pthread_create().
This abstraction, under linux, requires we avoid blocking
signals 32,33 and 34 indefinitely because they are needed to
reliably execute POSIX-semantics threading in glibc and/or musl.
When blocking signals the go runtime generally re-enables them
quickly. However, when a thread exits (under cgo, this is
via a return from mstart()), we avoid a deadlock in C-code by
not blocking these three signals.
Fixes#42494
Change-Id: I02dfb2480a1f97d11679e0c4b132b51bddbe4c14
Reviewed-on: https://go-review.googlesource.com/c/go/+/269799
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Trust: Tobias Klauser <tobias.klauser@gmail.com>
The iOS kernel has the same problem as the macOS kernel. Extend
the workaround of #41702 (CL 262438 and CL 262817) to iOS.
Updates #35851.
Change-Id: I7ccec00dc96643c08c5be8b385394856d0fa0f64
Reviewed-on: https://go-review.googlesource.com/c/go/+/275293
Trust: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Now that we have the G register saved, we can enable asynchronous
preemption for pure Go programs on darwin/arm64.
Updates #38485, #36365.
Change-Id: Ic654fa4dce369efe289b38d59cf1a184b358fe9e
Reviewed-on: https://go-review.googlesource.com/c/go/+/265120
Trust: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Otherwise, if a signal occurs just after we allocated the M,
we can deadlock if the signal handler needs to allocate an M
itself.
Fixes#42207
Change-Id: I76f44547f419e8b1c14cbf49bf602c6e645d8c14
Reviewed-on: https://go-review.googlesource.com/c/go/+/265759
Trust: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
On amd64 and 386, we have a very roundabout way of remembering that we
need to dropm on return that currently involves saving a zero to
needm's argument slot and later bringing it back. Just store the zero.
This also makes amd64 and 386 more consistent with cgocallback on all
other platforms: rather than saving the old M to the G stack, they now
save it to a named slot on the G0 stack.
The needm function no longer needs a dummy argument to get the SP, so
we drop that.
Change-Id: I7e84bb4a5ff9552de70dcf41d8accf02310535e7
Reviewed-on: https://go-review.googlesource.com/c/go/+/263268
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Fixes#41702Fixes#42023
Change-Id: If07f40b1d73b8f276ee28ffb8b7214175e56c24d
Reviewed-on: https://go-review.googlesource.com/c/go/+/262817
Trust: Ian Lance Taylor <iant@golang.org>
Trust: Bryan C. Mills <bcmills@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>