Commit graph

8 commits

Author SHA1 Message Date
doujiang24
ccad8a9f9c runtime/cgo: store M for C-created thread in pthread key
This reapplies CL 392854, with the followup fixes in CL 479255,
CL 479915, and CL 481057 incorporated.

CL 392854, by doujiang24 <doujiang24@gmail.com>, speed up C to Go
calls by binding the M to the C thread. See below for its
description.
CL 479255 is a followup fix for a small bug in ARM assembly code.
CL 479915 is another followup fix to address C to Go calls after
the C code uses some stack, but that CL is also buggy.
CL 481057, by Michael Knyszek, is a followup fix for a memory leak
bug of CL 479915.

[Original CL 392854 description]

In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls.
So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call.
Instead, we only dropm while the C thread exits, so the extra M won't leak.

When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specified value of the pthread key,  only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits.

When returning back to C:
Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C.

This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.

This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread.

For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz

[CL 479915 description]

Currently, when C calls into Go the first time, we grab an M
using needm, which sets m.g0's stack bounds using the SP. We don't
know how big the stack is, so we simply assume 32K. Previously,
when the Go function returns to C, we drop the M, and the next
time C calls into Go, we put a new stack bound on the g0 based on
the current SP. After CL 392854, we don't drop the M, and the next
time C calls into Go, we reuse the same g0, without recomputing
the stack bounds. If the C code uses quite a bit of stack space
before calling into Go, the SP may be well below the 32K stack
bound we assumed, so the runtime thinks the g0 stack overflows.

This CL makes needm get a more accurate stack bound from
pthread. (In some platforms this may still be a guess as we don't
know exactly where we are in the C stack), but it is probably
better than simply assuming 32K.

Fixes #51676.
Fixes #59294.

Change-Id: I9bf1400106d5c08ce621d2ed1df3a2d9e3f55494
Reviewed-on: https://go-review.googlesource.com/c/go/+/481061
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: DeJiang Zhu (doujiang) <doujiang24@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-03 18:34:11 +00:00
Cherry Mui
bfe3c678ab Revert "runtime/cgo: store M for C-created thread in pthread key"
This reverts CL 392854.

Reason for revert: caused #59294, which was derived from google
internal tests. The attempted fix of #59294 caused more breakage.

Change-Id: I5a061561ac2740856b7ecc09725ac28bd30f8bba
Reviewed-on: https://go-review.googlesource.com/c/go/+/481060
Reviewed-by: Heschi Kreinick <heschi@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-03-31 19:26:35 +00:00
doujiang24
ef0dedce87 runtime/cgo: store M for C-created thread in pthread key
In a C thread, it's necessary to acquire an extra M by using needm while invoking a Go function from C. But, needm and dropm are heavy costs due to the signal-related syscalls.
So, we change to not dropm while returning back to C, which means binding the extra M to the C thread until it exits, to avoid needm and dropm on each C to Go call.
Instead, we only dropm while the C thread exits, so the extra M won't leak.

When invoking a Go function from C:
Allocate a pthread variable using pthread_key_create, only once per shared object, and register a thread-exit-time destructor.
And store the g0 of the current m into the thread-specified value of the pthread key,  only once per C thread, so that the destructor will put the extra M back onto the extra M list while the C thread exits.

When returning back to C:
Skip dropm in cgocallback, when the pthread variable has been created, so that the extra M will be reused the next time invoke a Go function from C.

This is purely a performance optimization. The old version, in which needm & dropm happen on each cgo call, is still correct too, and we have to keep the old version on systems with cgo but without pthreads, like Windows.

This optimization is significant, and the specific value depends on the OS system and CPU, but in general, it can be considered as 10x faster, for a simple Go function call from a C thread.

For the newly added BenchmarkCGoInCThread, some benchmark results:
1. it's 28x faster, from 3395 ns/op to 121 ns/op, in darwin OS & Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
2. it's 6.5x faster, from 1495 ns/op to 230 ns/op, in Linux OS & Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz

Fixes #51676

Change-Id: I380702fe2f9b6b401b2d6f04b0aba990f4b9ee6c
GitHub-Last-Rev: 93dc64ad98
GitHub-Pull-Request: golang/go#51679
Reviewed-on: https://go-review.googlesource.com/c/go/+/392854
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: thepudds <thepudds1460@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-03-24 16:00:24 +00:00
Clément Chigot
125d187f9a cmd/dist, misc/cgo: enable tests for aix/ppc64
Some cgo tests aren't yet available on aix/ppc64.
-shared and -static don't work as expected and will be fixed latter.

Updates #30565

Change-Id: Ic59cabe685cb1cbdf89a8d1d1a1d2c4b0e8ef442
Reviewed-on: https://go-review.googlesource.com/c/go/+/164018
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-03-20 14:38:44 +00:00
Emmanuel Odeke
53fd522c0d all: make copyright headers consistent with one space after period
Follows suit with https://go-review.googlesource.com/#/c/20111.

Generated by running
$ grep -R 'Go Authors.  All' * | cut -d":" -f1 | while read F;do perl -pi -e 's/Go
Authors.  All/Go Authors. All/g' $F;done

The code in cmd/internal/unvendor wasn't changed.

Fixes #15213

Change-Id: I4f235cee0a62ec435f9e8540a1ec08ae03b1a75f
Reviewed-on: https://go-review.googlesource.com/21819
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-02 13:43:18 +00:00
Aram Hăvărneanu
2230e9d24b misc/cgo: add various solaris build lines
Change-Id: Ifd9ac7f5300232fb83c6350a787b5803adb96b48
Reviewed-on: https://go-review.googlesource.com/8263
Reviewed-by: Minux Ma <minux@golang.org>
2015-05-06 11:38:08 +00:00
Aulus Egnatius Varialus
2b44b36487 cgo: enable cgo on dragonfly
Enable cgo for dragonfly/386 and dragonfly/amd64.

R=golang-dev, jsing, iant, bradfitz
CC=golang-dev
https://golang.org/cl/13247046
2013-09-04 15:19:21 -07:00
Russ Cox
6c976393ae runtime: allow cgo callbacks on non-Go threads
Fixes #4435.

R=golang-dev, iant, alex.brainman, minux.ma, dvyukov
CC=golang-dev
https://golang.org/cl/7304104
2013-02-20 17:48:23 -05:00