Commit graph

132 commits

Author SHA1 Message Date
Keith Randall
5a75d6a08e cmd/compile: optimize non-empty-interface type conversions
When doing i.(T) for non-empty-interface i and concrete type T,
there's no need to read the type out of the itab. Just compare the
itab to the itab we expect for that interface/type pair.

Also optimize type switches by putting the type hash of the
concrete type in the itab. That way we don't need to load the
type pointer out of the itab.

Update #18492

Change-Id: I49e280a21e5687e771db5b8a56b685291ac168ce
Reviewed-on: https://go-review.googlesource.com/34810
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: David Chase <drchase@google.com>
2017-02-13 18:16:31 +00:00
Russ Cox
45c6f59e1f runtime: use two-level list for semaphore address search in semaRoot
If there are many goroutines contending for two different locks
and both locks hash to the same semaRoot, the scans to find the
goroutines for a particular lock can end up being O(n), making
n lock acquisitions quadratic.

As long as only one actively-used lock hashes to each semaRoot
there's no problem, since the list operations in that case are O(1).
But when the second actively-used lock hits the same semaRoot,
then scans for entries with for a given lock have to scan over the
entries for the other lock.

Fix this problem by changing the semaRoot to hold only one sudog
per unique address. In the running example, this drops the length of
that list from O(n) to 2. Then attach other goroutines waiting on the
same address to a separate list headed by the sudog in the semaRoot list.
Those "same address list" operations are still O(1), so now the
example from above works much better.

There is still an assumption here that in real programs you don't have
many many goroutines queueing up on many many distinct addresses.
If we end up with that problem, we can replace the top-level list with
a treap.

Fixes #17953.

Change-Id: I78c5b1a5053845275ab31686038aa4f6db5720b2
Reviewed-on: https://go-review.googlesource.com/36792
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-02-12 15:54:16 +00:00
Ian Lance Taylor
87ad863f35 runtime: use atomic ops for fwdSig, make sigtable immutable
The fwdSig array is accessed by the signal handler, which may run in
parallel with other threads manipulating it via the os/signal package.
Use atomic accesses to ensure that there are no problems.

Move the _SigHandling flag out of the sigtable array. This makes sigtable
immutable and safe to read from the signal handler.

Change-Id: Icfa407518c4ebe1da38580920ced764898dfc9ad
Reviewed-on: https://go-review.googlesource.com/36321
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-02-08 04:14:41 +00:00
Michael Matloob
62956897c1 runtime: add definitions for SetGoroutineLabels and Do
This change defines runtime/pprof.SetGoroutineLabels and runtime/pprof.Do, which
are used to set profiler labels on goroutines. The change defines functions
in the runtime for setting and getting profile labels, and sets and unsets
profile labels when goroutines are created and deleted. The change also adds
the package runtime/internal/proflabel, which defines the structure the runtime
uses to store profile labels.

Change-Id: I747a4400141f89b6e8160dab6aa94ca9f0d4c94d
Reviewed-on: https://go-review.googlesource.com/34198
Run-TryBot: Michael Matloob <matloob@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-on: https://go-review.googlesource.com/35010
2017-02-06 20:29:37 +00:00
Elias Naur
78074f6850 runtime: handle SIGPIPE in c-archive and c-shared programs
Before this CL, Go programs in c-archive or c-shared buildmodes
would not handle SIGPIPE. That leads to surprising behaviour where
writes on a closed pipe or socket would raise SIGPIPE and terminate
the program. This CL changes the Go runtime to handle
SIGPIPE regardless of buildmode. In addition, SIGPIPE from non-Go
code is forwarded.

This is a refinement of CL 32796 that fixes the case where a non-default
handler for SIGPIPE is installed by the host C program.

Fixes #17393

Change-Id: Ia41186e52c1ac209d0a594bae9904166ae7df7de
Reviewed-on: https://go-review.googlesource.com/35960
Run-TryBot: Elias Naur <elias.naur@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2017-02-03 20:07:36 +00:00
Lion Yang
a2b615d527 crypto: detect BMI usability on AMD64 for sha1 and sha256
The existing implementations on AMD64 only detects AVX2 usability,
when they also contains BMI (bit-manipulation instructions).
These instructions crash the running program as 'unknown instructions'
on the architecture, e.g. i3-4000M, which supports AVX2 but not
support BMI.

This change added the detections for BMI1 and BMI2 to AMD64 runtime with
two flags as the result, `support_bmi1` and `support_bmi2`,
in runtime/runtime2.go. It also completed the condition to run AVX2 version
in packages crypto/sha1 and crypto/sha256.

Fixes #18512

Change-Id: I917bf0de365237740999de3e049d2e8f2a4385ad
Reviewed-on: https://go-review.googlesource.com/34850
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2017-01-05 15:37:37 +00:00
Michael Hudson-Doyle
1ec64e9b63 cmd/compile, runtime: a different approach to duplicate itabs
golang.org/issue/17594 was caused by additab being called more than once for
an itab. golang.org/cl/32131 fixed that by making the itabs local symbols,
but that in turn causes golang.org/issue/18252 because now there are now
multiple itab symbols in a process for a given (type,interface) pair and
different code paths can end up referring to different itabs which breaks
lots of reflection stuff. So this makes itabs global again and just takes
care to only call additab once for each itab.

Fixes #18252

Change-Id: I781a193e2f8dd80af145a3a971f6a25537f633ea
Reviewed-on: https://go-review.googlesource.com/34173
Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2016-12-19 01:31:59 +00:00
Austin Clements
61db2e4efa runtime: cross-reference _func type better
It takes me several minutes every time I want to find where the linker
writes out the _func structures. Add some comments to make this
easier.

Change-Id: Ic75ce2786ca4b25726babe3c4fe9cd30c85c34e2
Reviewed-on: https://go-review.googlesource.com/34390
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2016-12-16 17:03:25 +00:00
Austin Clements
70c107c68d runtime: add deletion barriers on gobuf.ctxt
gobuf.ctxt is set to nil from many places in assembly code and these
assignments require write barriers with the hybrid barrier.

Conveniently, in most of these places ctxt should already be nil, in
which case we don't need the barrier. This commit changes these places
to assert that ctxt is already nil.

gogo is more complicated, since ctxt may not already be nil. For gogo,
we manually perform the write barrier if ctxt is not nil.

Updates #17503.

Change-Id: I9d75e27c75a1b7f8b715ad112fc5d45ffa856d30
Reviewed-on: https://go-review.googlesource.com/31764
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2016-10-28 20:48:02 +00:00
Austin Clements
8044b77a57 runtime: eliminate write barriers from dropg
Currently this contains no write barriers because it's writing nil
pointers, but with the hybrid barrier, even these will produce write
barriers. However, since these are *gs and *ms, they don't need write
barriers, so we can simply eliminate them.

Updates #17503.

Change-Id: Ib188a60492c5cfb352814bf9b2bcb2941fb7d6c0
Reviewed-on: https://go-review.googlesource.com/31570
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-10-28 20:05:39 +00:00
Peter Weinberger
ca922b6d36 runtime: Profile goroutines holding contended mutexes.
runtime.SetMutexProfileFraction(n int) will capture 1/n-th of stack
traces of goroutines holding contended mutexes if n > 0. From runtime/pprof,
pprot.Lookup("mutex").WriteTo writes the accumulated
stack traces to w (in essentially the same format that blocking
profiling uses).

Change-Id: Ie0b54fa4226853d99aa42c14cb529ae586a8335a
Reviewed-on: https://go-review.googlesource.com/29650
Reviewed-by: Austin Clements <austin@google.com>
2016-10-28 11:47:16 +00:00
Carlos Eduardo Seo
aaa6b53524 runtime: insufficient padding in the p structure
The current padding in the 'p' struct is hardcoded at 64 bytes. It should be the
cache line size. On ppc64x, the current value is only okay because sys.CacheLineSize
is wrong at 64 bytes. This change fixes that by making the padding equal to the
cache line size. It also fixes the cache line size for ppc64/ppc64le to 128 bytes.

Fixes #16477

Change-Id: Ib7ec5195685116eb11ba312a064f41920373d4a3
Reviewed-on: https://go-review.googlesource.com/25370
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Minux Ma <minux@golang.org>
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-29 23:22:51 +00:00
Ian Lance Taylor
f29ec7d74a runtime: remove unused type sigtabtt
The type sigtabtt was introduced by an automated tool in
https://golang.org/cl/167550043. It was the Go version of the C type
SigTab. However, when the C code using SigTab was converted to Go in
https://golang.org/cl/168500044 it was rewritten to use a different Go
type, sigTabT, rather than sigtabtt (the difference being that sigTabT
uses string where sigtabtt uses *int8 from the C type char*). So this is
just a dreg from the conversion that was never actually used.

Change-Id: I2ec6eb4b25613bf5e5ad1dbba1f4b5ff20f80f55
Reviewed-on: https://go-review.googlesource.com/27691
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-08-25 03:51:24 +00:00
Russ Cox
7fdec6216c build: enable framepointer mode by default
This has a minor performance cost, but far less than is being gained by SSA.
As an experiment, enable it during the Go 1.7 beta.
Having frame pointers on by default makes Linux's perf, Intel VTune,
and other profilers much more useful, because it lets them gather a
stack trace efficiently on profiling events.
(It doesn't help us that much, since when we walk the stack we usually
need to look up PC-specific information as well.)

Fixes #15840.

Change-Id: I4efd38412a0de4a9c87b1b6e5d11c301e63f1a2a
Reviewed-on: https://go-review.googlesource.com/23451
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-26 19:02:00 +00:00
Dmitry Vyukov
caa2147532 runtime: per-P contexts for race detector
Race runtime also needs local malloc caches and currently uses
a mix of per-OS-thread and per-goroutine caches. This leads to
increased memory consumption. But more importantly cache of
synchronization objects is per-goroutine and we don't always
have goroutine context when feeing memory in GC. As the result
synchronization object descriptors leak (more precisely, they
can be reused if another synchronization object is recreated
at the same address, but it does not always help). For example,
the added BenchmarkSyncLeak has effectively runaway memory
consumption (based on a real long running server).

This change updates race runtime with support for per-P contexts.
BenchmarkSyncLeak now stabilizes at ~1GB memory consumption.

Long term, this will allow us to remove race runtime dependency
on glibc (as malloc is the main cornerstone).

I've also implemented a different scheme to pass P context to
race runtime: scheduler notified race runtime about association
between G and P by calling procwire(g, p)/procunwire(g, p).
But it turned out to be very messy as we have lots of places
where the association changes (e.g. syscalls). So I dropped it
in favor of the current scheme: race runtime asks scheduler
about the current P.

Fixes #14533

Change-Id: Iad10d2f816a44affae1b9fed446b3580eafd8c69
Reviewed-on: https://go-review.googlesource.com/19970
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-05-03 11:00:43 +00:00
Ian Lance Taylor
5f9a870bf1 cmd/cgo, runtime, runtime/cgo: use cgo context function
Add support for the context function set by runtime.SetCgoTraceback.
The context function was added in CL 17761, without support.
This CL is the support.

This CL has not been tested for real C code, as a working context
function for C code requires unwind support that does not seem to exist.
I wanted to get the CL out before the freeze.

I apologize for the length of this CL.  It's mostly plumbing, but
unfortunately the plumbing is processor-specific.

Change-Id: I8ce11a0de9b3dafcc29efd2649d776e93bff0e90
Reviewed-on: https://go-review.googlesource.com/22508
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-29 22:07:36 +00:00
Austin Clements
2a889b9d93 runtime: make stack re-scan O(# dirty stacks)
Currently the stack re-scan during mark termination is O(# stacks)
because we enqueue a root marking job for every goroutine. It takes
~34ns to process this root marking job for a valid (clean) stack, so
at around 300k goroutines we exceed the 10ms pause goal. A non-trivial
portion of this time is spent simply taking the cache miss to check
the gcscanvalid flag, so simply optimizing the path that handles clean
stacks can only improve this so much.

Fix this by keeping an explicit list of goroutines with dirty stacks
that need to be rescanned. When a goroutine first transitions to
running after a stack scan and marks its stack dirty, it adds itself
to this list. We enqueue root marking jobs only for the goroutines in
this list, so this improves stack re-scanning asymptotically by
completely eliminating time spent on clean goroutines.

This reduces mark termination time for 500k idle goroutines from 15ms
to 238µs. Overall performance effect is negligible.

name \ 95%ile-time/markTerm     old           new         delta
IdleGs/gs:500000/gomaxprocs:12  15000µs ± 0%  238µs ± 5%  -98.41% (p=0.000 n=10+10)

name              old time/op  new time/op  delta
XBenchGarbage-12  2.30ms ± 3%  2.29ms ± 1%  -0.43%  (p=0.049 n=17+18)

name                      old time/op    new time/op    delta
BinaryTree17-12              2.57s ± 3%     2.59s ± 2%    ~     (p=0.141 n=19+20)
Fannkuch11-12                2.09s ± 0%     2.10s ± 1%  +0.53%  (p=0.000 n=19+19)
FmtFprintfEmpty-12          45.3ns ± 3%    45.2ns ± 2%    ~     (p=0.845 n=20+20)
FmtFprintfString-12          129ns ± 0%     127ns ± 0%  -1.55%  (p=0.000 n=16+16)
FmtFprintfInt-12             123ns ± 0%     119ns ± 1%  -3.24%  (p=0.000 n=19+19)
FmtFprintfIntInt-12          195ns ± 1%     189ns ± 1%  -3.11%  (p=0.000 n=17+17)
FmtFprintfPrefixedInt-12     193ns ± 1%     187ns ± 1%  -3.06%  (p=0.000 n=19+19)
FmtFprintfFloat-12           254ns ± 0%     255ns ± 1%  +0.35%  (p=0.001 n=14+17)
FmtManyArgs-12               781ns ± 0%     770ns ± 0%  -1.48%  (p=0.000 n=16+19)
GobDecode-12                7.00ms ± 1%    6.98ms ± 1%    ~     (p=0.563 n=19+19)
GobEncode-12                5.91ms ± 1%    5.92ms ± 0%    ~     (p=0.118 n=19+18)
Gzip-12                      219ms ± 1%     215ms ± 1%  -1.81%  (p=0.000 n=18+18)
Gunzip-12                   37.2ms ± 0%    37.4ms ± 0%  +0.45%  (p=0.000 n=17+19)
HTTPClientServer-12         76.9µs ± 3%    77.5µs ± 2%  +0.81%  (p=0.030 n=20+19)
JSONEncode-12               15.0ms ± 0%    14.8ms ± 1%  -0.88%  (p=0.001 n=15+19)
JSONDecode-12               50.6ms ± 0%    53.2ms ± 2%  +5.07%  (p=0.000 n=17+19)
Mandelbrot200-12            4.05ms ± 0%    4.05ms ± 1%    ~     (p=0.581 n=16+17)
GoParse-12                  3.34ms ± 1%    3.30ms ± 1%  -1.21%  (p=0.000 n=15+20)
RegexpMatchEasy0_32-12      69.6ns ± 1%    69.8ns ± 2%    ~     (p=0.566 n=19+19)
RegexpMatchEasy0_1K-12       238ns ± 1%     236ns ± 0%  -0.91%  (p=0.000 n=17+13)
RegexpMatchEasy1_32-12      69.8ns ± 1%    70.0ns ± 1%  +0.23%  (p=0.026 n=17+16)
RegexpMatchEasy1_1K-12       371ns ± 1%     363ns ± 1%  -2.07%  (p=0.000 n=19+19)
RegexpMatchMedium_32-12      107ns ± 2%     106ns ± 1%  -0.51%  (p=0.031 n=18+20)
RegexpMatchMedium_1K-12     33.0µs ± 0%    32.9µs ± 0%  -0.30%  (p=0.004 n=16+16)
RegexpMatchHard_32-12       1.70µs ± 0%    1.70µs ± 0%  +0.45%  (p=0.000 n=16+17)
RegexpMatchHard_1K-12       51.1µs ± 2%    51.4µs ± 1%  +0.53%  (p=0.000 n=17+19)
Revcomp-12                   378ms ± 1%     385ms ± 1%  +1.92%  (p=0.000 n=19+18)
Template-12                 64.3ms ± 2%    65.0ms ± 2%  +1.09%  (p=0.001 n=19+19)
TimeParse-12                 315ns ± 1%     317ns ± 2%    ~     (p=0.108 n=18+20)
TimeFormat-12                360ns ± 1%     337ns ± 0%  -6.30%  (p=0.000 n=18+13)
[Geo mean]                  51.8µs         51.6µs       -0.48%

Change-Id: Icf8994671476840e3998236e15407a505d4c760c
Reviewed-on: https://go-review.googlesource.com/20700
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-26 23:40:13 +00:00
Austin Clements
1a2cf91f5e runtime: split gfree list into with-stacks and without-stacks
Currently all free Gs are added to one list. Split this into two
lists: one for free Gs with cached stacks and one for Gs without
cached stacks.

This lets us preferentially allocate Gs that already have a stack, but
more importantly, it sets us up to free cached G stacks concurrently.

Change-Id: Idbe486f708997e1c9d166662995283f02d1eeb3c
Reviewed-on: https://go-review.googlesource.com/20664
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-26 23:39:51 +00:00
Dmitry Vyukov
a3703618ea runtime: use per-goroutine sequence numbers in tracer
Currently tracer uses global sequencer and it introduces
significant slowdown on parallel machines (up to 10x).
Replace the global sequencer with per-goroutine sequencer.

If we assign per-goroutine sequence numbers to only 3 types
of events (start, unblock and syscall exit), it is enough to
restore consistent partial ordering of all events. Even these
events don't need sequence numbers all the time (if goroutine
starts on the same P where it was unblocked, then start does
not need sequence number).
The burden of restoring the order is put on trace parser.
Details of the algorithm are described in the comments.

On http benchmark with GOMAXPROCS=48:
no tracing: 5026 ns/op
tracing: 27803 ns/op (+453%)
with this change: 6369 ns/op (+26%, mostly for traceback)

Also trace size is reduced by ~22%. Average event size before: 4.63
bytes/event, after: 3.62 bytes/event.

Besides running trace tests, I've also tested with manually broken
cputicks (random skew for each event, per-P skew and episodic random skew).
In all cases broken timestamps were detected and no test failures.

Change-Id: I078bde421ccc386a66f6c2051ab207bcd5613efa
Reviewed-on: https://go-review.googlesource.com/21512
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-23 15:57:05 +00:00
Jeremy Jackins
02b8e6978a runtime: find a home for orphaned comments
These comments were left behind after runtime.h was converted
from C to Go. I examined the original code and tried to move these
to the places that the most sense.

Change-Id: I8769d60234c0113d682f9de3bd8d6c34c450c188
Reviewed-on: https://go-review.googlesource.com/21969
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-04-14 18:34:09 +00:00
Ian Lance Taylor
ea306ae625 runtime: support symbolic backtrace of C code in a cgo crash
The new function runtime.SetCgoTraceback may be used to register stack
traceback and symbolizer functions, written in C, to do a stack
traceback from cgo code.

There is a sample implementation of runtime.SetCgoSymbolizer at
github.com/ianlancetaylor/cgosymbolizer.  Just importing that package is
sufficient to get symbolic C backtraces.

Currently only supported on linux/amd64.

Change-Id: If96ee2eb41c6c7379d407b9561b87557bfe47341
Reviewed-on: https://go-review.googlesource.com/17761
Reviewed-by: Austin Clements <austin@google.com>
2016-04-01 04:13:44 +00:00
Keith Randall
4b209dbf0b runtime: don't use REP;MOVSB if CPUID doesn't say it is fast
Only use REP;MOVSB if:
 1) The CPUID flag says it is fast, and
 2) The pointers are unaligned
Otherwise, use REP;MOVSQ.

Update #14630

Change-Id: I946b28b87880c08e5eed1ce2945016466c89db66
Reviewed-on: https://go-review.googlesource.com/21300
Reviewed-by: Nigel Tao <nigeltao@golang.org>
2016-03-31 02:54:10 +00:00
Michel Lespinasse
f00bbd5f81 cmd/compile: emit itabs and itablinks
See #14874

This change tells the compiler to emit itab and itablink symbols in
situations where they could be useful; however the compiled code does
not actually make use of the new symbols yet.

Change-Id: I0db3e6ec0cb1f3b7cebd4c60229e4a48372fe586
Reviewed-on: https://go-review.googlesource.com/20888
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Michel Lespinasse <walken@google.com>
2016-03-29 02:18:03 +00:00
Austin Clements
857d0b48db runtime: document sudog
Change-Id: I85c0bcf02842cc32dbc9bfdcea27efe871173574
Reviewed-on: https://go-review.googlesource.com/20774
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2016-03-17 18:57:10 +00:00
Matthew Dempsky
ac74e5debc cmd/compile: stop constructing sudog type
The compiler doesn't care about the runtime's sudog type. Stop
constructing it.

Change-Id: If1885fe30b2e215a08d17662eab5ea6d81fe58ab
Reviewed-on: https://go-review.googlesource.com/20797
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2016-03-17 18:02:40 +00:00
Austin Clements
276b177771 runtime: make shrinkstack concurrent-safe
Currently shinkstack is only safe during STW because it adjusts
channel-related stack pointers and moves send/receive stack slots
without synchronizing with the channel code. Make it safe to use when
the world isn't stopped by:

1) Locking all channels the G is blocked on while adjusting the sudogs
   and copying the area of the stack that may contain send/receive
   slots.

2) For any stack frames that may contain send/receive slot, using an
   atomic CAS to adjust pointers to prevent races between adjusting a
   pointer in a receive slot and a concurrent send writing to that
   receive slot.

In principle, the synchronization could be finer-grained. For example,
we considered synchronizing around the sudogs, which would allow
channel operations involving other Gs to continue if the G being
shrunk was far enough down the send/receive queue. However, using the
channel lock means no additional locks are necessary in the channel
code. Furthermore, the stack shrinking code holds the channel lock for
a very short time (much less than the time required to shrink the
stack).

This does not yet make stack shrinking concurrent; it merely makes
doing so safe.

This has negligible effect on the go1 and garbage benchmarks.

For #12967.

Change-Id: Ia49df3a8a7be4b36e365aac4155a2416b94b988c
Reviewed-on: https://go-review.googlesource.com/20042
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
2016-03-16 20:13:20 +00:00
Austin Clements
db72b41bcd runtime: protect sudog.elem with hchan.lock
Currently sudog.elem is never accessed concurrently, so in several
cases we drop the channel lock just before reading/writing the
sent/received value from/to sudog.elem. However, concurrent stack
shrinking is going to have to adjust sudog.elem to point to the new
stack, which means it needs a way to synchronize with accesses to
sudog.elem. Hence, add sudog.elem to the fields protected by
hchan.lock and scoot the unlocks down past the uses of sudog.elem.

While we're here, better document the channel synchronization rules.

For #12967.

Change-Id: I3ad0ca71f0a74b0716c261aef21b2f7f13f74917
Reviewed-on: https://go-review.googlesource.com/20040
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-16 20:13:15 +00:00
Austin Clements
005140a77e runtime: put g.waiting list in lock order
Currently the g.waiting list created by a select is in poll order.
However, nothing depends on this, and we're going to need access to
the channel lock order in other places shortly, so modify select to
put the waiting list in channel lock order.

For #12967.

Change-Id: If0d38816216ecbb37a36624d9b25dd96e0a775ec
Reviewed-on: https://go-review.googlesource.com/20037
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
2016-03-16 20:13:07 +00:00
Austin Clements
e4a95b6343 runtime: record channel in sudog
Given a G, there's currently no way to find the channel it's blocking
on. We'll need this information to fix a (probably theoretical) bug in
select and to implement concurrent stack shrinking, so record the
channel in the sudog.

For #12967.

Change-Id: If8fb63a140f1d07175818824d08c0ebeec2bdf66
Reviewed-on: https://go-review.googlesource.com/20035
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-16 20:13:02 +00:00
Wedson Almeida Filho
8e7072ca83 sync: new Cond implementation
Change Cond implementation to use a notification list such that waiters
can first register for a notification, release the lock, then actually
wait. Signalers never have to park anymore.

This is intended to address an issue in the previous implementation
where Broadcast could fail to signal all waiters.

Results of the existing benchmark are below.

                                          Original          New  Diff
BenchmarkCond1-48        2000000               745 ns/op    755 +1.3%
BenchmarkCond2-48        1000000              1545 ns/op   1532 -0.8%
BenchmarkCond4-48         300000              3833 ns/op   3896 +1.6%
BenchmarkCond8-48         200000             10049 ns/op  10257 +2.1%
BenchmarkCond16-48        100000             21123 ns/op  21236 +0.5%
BenchmarkCond32-48         30000             40393 ns/op  41097 +1.7%

Fixes #14064

Change-Id: I083466d61593a791a034df61f5305adfb8f1c7f9
Reviewed-on: https://go-review.googlesource.com/18892
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Caleb Spare <cespare@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-15 22:01:20 +00:00
Austin Clements
b1a6e07919 runtime: document the G states
In particular, write down the rules for stack ownership because the
details of this are about to get very important with concurrent stack
shrinking. (Interestingly, the details don't actually change, but
anything that's currently skating on thin ice is likely to fall
through.)

Fox #12967.

Change-Id: I561e2610e864295e9faba07717a934aabefcaab9
Reviewed-on: https://go-review.googlesource.com/20034
Reviewed-by: Rick Hudson <rlh@golang.org>
2016-03-14 16:30:16 +00:00
Brad Fitzpatrick
5fea2ccc77 all: single space after period.
The tree's pretty inconsistent about single space vs double space
after a period in documentation. Make it consistently a single space,
per earlier decisions. This means contributors won't be confused by
misleading precedence.

This CL doesn't use go/doc to parse. It only addresses // comments.
It was generated with:

$ perl -i -npe 's,^(\s*// .+[a-z]\.)  +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.)  +([A-Z])')
$ go test go/doc -update

Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7
Reviewed-on: https://go-review.googlesource.com/20022
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dave Day <djd@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-03-02 00:13:47 +00:00
Austin Clements
cbe849fc38 runtime: eliminate unused _Genqueue state
_Genqueue and _Gscanenqueue were introduced as part of the GC quiesce
code. The quiesce code was removed by 197aa9e, but these states and
some associated code stuck around. Remove them.

Change-Id: I69df81881602d4a431556513dac2959668d27c20
Reviewed-on: https://go-review.googlesource.com/19638
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-25 23:37:32 +00:00
Matthew Dempsky
9877900c8c Revert "cmd/compile: move hiter, hmap, and scase definitions into builtin.go"
This reverts commit f28bbb776a.

Change-Id: I82fb81dcff3ddcaefef72949f1ef3a41bcd22301
Reviewed-on: https://go-review.googlesource.com/19849
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2016-02-23 19:42:52 +00:00
Matthew Dempsky
a4b833940d runtime: move machport into darwin's mOS
It's not needed on other OSes.

Change-Id: Ia6b13510585392a7062374806527d33876beba2a
Reviewed-on: https://go-review.googlesource.com/19818
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2016-02-22 21:15:50 +00:00
Matthew Dempsky
f28bbb776a cmd/compile: move hiter, hmap, and scase definitions into builtin.go
Also eliminates per-maptype hiter and hmap types, since they're not
really needed anyway.  Update packages reflect and runtime
accordingly.

Reduces golang.org/x/tools/cmd/godoc's text segment by ~170kB:

   text	   data	    bss	    dec	    hex	filename
13085702	 140640	 151520	13377862	 cc2146	godoc.before
12915382	 140640	 151520	13207542	 c987f6	godoc.after

Updates #6853.

Change-Id: I948b2bc1f22d477c1756204996b4e3e1fb568d81
Reviewed-on: https://go-review.googlesource.com/16610
Reviewed-by: Keith Randall <khr@golang.org>
2016-02-22 07:42:37 +00:00
Austin Clements
09940b92a0 runtime: make p.gcBgMarkWorker a guintptr
Currently p.gcBgMarkWorker is a *g. Change it to a guintptr. This
eliminates a write barrier during the subtle mark worker parking dance
(which isn't known to be causing problems, but may).

Change-Id: Ibf12c05ac910820448059e69a68e5b882c993ed8
Reviewed-on: https://go-review.googlesource.com/18970
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2016-01-27 02:23:09 +00:00
Russ Cox
fac8202c3f runtime: make NumGoroutine and Stack agree not to include system goroutines
[Repeat of CL 18343 with build fixes.]

Before, NumGoroutine counted system goroutines and Stack (usually) didn't show them,
which was inconsistent and confusing.

To resolve which way they should be consistent, it seems like

	package main
	import "runtime"
	func main() { println(runtime.NumGoroutine()) }

should print 1 regardless of internal runtime details. Make it so.

Fixes #11706.

Change-Id: If26749fec06aa0ff84311f7941b88d140552e81d
Reviewed-on: https://go-review.googlesource.com/18432
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Russ Cox <rsc@golang.org>
2016-01-13 01:46:01 +00:00
Russ Cox
6da608206c Revert "runtime: make NumGoroutine and Stack agree not to include system goroutines"
This reverts commit c5bafc8281.

Change-Id: Ie7030c978c6263b9e996d5aa0e490086796df26d
Reviewed-on: https://go-review.googlesource.com/18431
Reviewed-by: Russ Cox <rsc@golang.org>
2016-01-08 15:31:15 +00:00
Russ Cox
c5bafc8281 runtime: make NumGoroutine and Stack agree not to include system goroutines
Before, NumGoroutine counted system goroutines and Stack (usually) didn't show them,
which was inconsistent and confusing.

To resolve which way they should be consistent, it seems like

	package main
	import "runtime"
	func main() { println(runtime.NumGoroutine()) }

should print 1 regardless of internal runtime details. Make it so.

Fixes #11706.

Change-Id: I6bfe26a901de517728192cfb26a5568c4ef4fe47
Reviewed-on: https://go-review.googlesource.com/18343
Reviewed-by: Austin Clements <austin@google.com>
2016-01-08 15:25:00 +00:00
Ian Lance Taylor
a7cad52e04 runtime: preserve signal stack when calling Go on C thread
When calling a Go function on a C thread, if the C thread already has an
alternate signal stack, use that signal stack instead of installing a
new one.

Update #9896.

Change-Id: I62aa3a6a4a1dc4040fca050757299c8e6736987c
Reviewed-on: https://go-review.googlesource.com/18108
Reviewed-by: Russ Cox <rsc@golang.org>
2016-01-06 23:11:42 +00:00
Ian Lance Taylor
a7d2b4d7ce runtime: disable a signal by restoring the original disposition
Fixes #13034.
Fixes #13042.
Update #9896.

Change-Id: I189f381090223dd07086848aac2d69d2c00d80c4
Reviewed-on: https://go-review.googlesource.com/18062
Reviewed-by: Russ Cox <rsc@golang.org>
2016-01-05 00:25:50 +00:00
Austin Clements
d446ba99a4 runtime: document stack barrier synchronization rules
Change-Id: I545e53561f37bceabd26d814d272cecc3ff19847
Reviewed-on: https://go-review.googlesource.com/18024
Reviewed-by: Russ Cox <rsc@golang.org>
2015-12-18 17:08:52 +00:00
Russ Cox
0abf443513 runtime: remove incorrect TODO added in CL 16035
I've already turned away one attempt to remove this field.
As the comment above the struct says, many tools know the layout.
The field cannot simply be removed.

It was one thing to remove the fields name, but the TODO should
not have been added.

Change-Id: If40eacf0eb35835082055e129e2b88333a0731b9
Reviewed-on: https://go-review.googlesource.com/17741
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-12-16 20:23:15 +00:00
Dmitry Vyukov
fb6f8a96f2 runtime: remove unnecessary wakeups of worker threads
Currently we wake up new worker threads whenever we pass
through the scheduler with nmspinning==0. This leads to
lots of unnecessary thread wake ups.
Instead let only spinning threads wake up new spinning threads.

For the following program:

package main
import "runtime"
func main() {
	for i := 0; i < 1e7; i++ {
		runtime.Gosched()
	}
}

Before:
$ time ./test
real	0m4.278s
user	0m7.634s
sys	0m1.423s

$ strace -c ./test
% time     seconds  usecs/call     calls    errors syscall
 99.93    9.314936           3   2685009     17536 futex

After:
$ time ./test
real	0m1.200s
user	0m1.181s
sys	0m0.024s

$ strace -c ./test
% time     seconds  usecs/call     calls    errors syscall
  3.11    0.000049          25         2           futex

Fixes #13527

Change-Id: Ia1f5bf8a896dcc25d8b04beb1f4317aa9ff16f74
Reviewed-on: https://go-review.googlesource.com/17540
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-12-11 11:31:12 +00:00
Elias Naur
a7383fc467 runtime: use a proper type, sigset, for m.sigmask
Replace the cross platform but unsafe [4]uintptr type with a OS
specific type, sigset. Most OSes already define sigset, and this
change defines a suitable sigset for the OSes that don't (darwin,
openbsd). The OSes that don't use m.sigmask (windows, plan9, nacl)
now defines sigset as the empty type, struct{}.

The gain is strongly typed access to m.sigmask, saving a dynamic
size sanity check and unsafe.Pointer casting. Also, some storage is
saved for each M, since [4]uinptr was conservative for most OSes.

The cost is that OSes that don't need m.sigmask has to define sigset.

completes ./all.bash with GOOS linux, on amd64
completes ./make.bash with GOOSes openbsd, android, plan9, windows,
darwin, solaris, netbsd, freebsd, dragonfly, all amd64.

With GOOS=nacl ./make.bash failed with a seemingly unrelated error.

[Replay of CL 16942 by Elias Naur.]

Change-Id: I98f144d626033ae5318576115ed635415ac71b2c
Reviewed-on: https://go-review.googlesource.com/17033
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
2015-11-24 17:16:47 +00:00
Michael Hudson-Doyle
239273d963 runtime: mark {g,m,p}uintptr methods as nosplit
These are methods that are "obviously" going to get inlined -- until you build
with -l, when they can trigger a stack split at a bad time.

Fixes #11482

Change-Id: Ia065c385978a2e7fe9f587811991d088c4d68325
Reviewed-on: https://go-review.googlesource.com/17165
Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-24 02:33:04 +00:00
Russ Cox
5af2be8604 Revert "runtime: use a proper type, sigset, for m.sigmask"
This reverts commit 7db77271e4.

Change-Id: I6d8855eb05ca331025dc49a5533c6da4d1fa4e84
Reviewed-on: https://go-review.googlesource.com/17030
Reviewed-by: Russ Cox <rsc@golang.org>
2015-11-18 17:18:20 +00:00
Elias Naur
7db77271e4 runtime: use a proper type, sigset, for m.sigmask
Replace the cross platform but unsafe [4]uintptr type with a OS
specific type, sigset. Most OSes already define sigset, and this
change defines a suitable sigset for the OSes that don't (darwin,
openbsd). The OSes that don't use m.sigmask (windows, plan9, nacl)
now defines sigset as the empty type, struct{}.

The gain is strongly typed access to m.sigmask, saving a dynamic
size sanity check and unsafe.Pointer casting. Also, some storage is
saved for each M, since [4]uinptr was conservative for most OSes.

The cost is that OSes that don't need m.sigmask has to define sigset.

completes ./all.bash with GOOS linux, on amd64
completes ./make.bash with GOOSes openbsd, android, plan9, windows,
darwin, solaris, netbsd, freebsd, dragonfly, all amd64.

With GOOS=nacl ./make.bash failed with a seemingly unrelated error.

R=go1.7

Change-Id: Ib460379f063eb83d393e1c5efe7333a643c1595e
Reviewed-on: https://go-review.googlesource.com/16942
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-11-17 21:23:06 +00:00
Michael Hudson-Doyle
7af0839e11 cmd/go, runtime: always use position-independent code to invoke vsyscall helper on linux/386
golang.org/cl/16346 changed the runtime on linux/386 to invoke the vsyscall
helper via a PIC sequence (CALL 0x10(GS)) when dynamically linking. But it's
actually quite easy to make that code sequence work all the time, so do that,
and remove the ugly machinery that passed the buildmode from the go tool to the
assembly.

This means enlarging m.tls so that we can safely access 0x10(GS) (GS is set to
&m.tls + 4, so 0x10(GS) accesses m_tls[5]).

Change-Id: I1345c34029b149cb5f25320bf19a3cdd73a056fa
Reviewed-on: https://go-review.googlesource.com/16796
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-11-15 06:42:19 +00:00