Stowage/go - Remotebranch.eu

Stowage/go

mirror of https://github.com/golang/go.git synced 2025-12-08 06:10:04 +00:00

Author	SHA1	Message	Date
Michael Anthony Knyszek	39a5ee52b9	runtime: decouple consistent stats from mcache and allow P-less update This change modifies the consistent stats implementation to keep the per-P sequence counter on each P instead of each mcache. A valid mcache is not available everywhere that we want to call e.g. allocSpan, as per issue #42339. By decoupling these two, we can add a mechanism to allow contexts without a P to update stats consistently. In this CL, we achieve that with a mutex. In practice, it will be very rare for an M to update these stats without a P. Furthermore, the stats reader also only needs to hold the mutex across the update to "gen" since once that changes, writers are free to continue updating the new stats generation. Contention could thus only arise between writers without a P, and as mentioned earlier, those should be rare. A nice side-effect of this change is that the consistent stats acquire and release API becomes simpler. Fixes #42339. Change-Id: Ied74ab256f69abd54b550394c8ad7c4c40a5fe34 Reviewed-on: https://go-review.googlesource.com/c/go/+/267158 Run-TryBot: Michael Knyszek <mknyszek@google.com> Trust: Michael Knyszek <mknyszek@google.com> Reviewed-by: Michael Pratt <mpratt@google.com>	2020-11-02 21:21:46 +00:00
Michael Pratt	e1faebe7b4	runtime: manage gcBgMarkWorkers with a global pool Background mark workers perform per-P marking work. Currently each worker is assigned a P at creation time. The worker "attaches" to the P via p.gcBgMarkWorker, making itself (usually) available to findRunnableGCWorker for scheduling GC work. While running gcMarkDone, the worker "detaches" from the P (by clearing p.gcBgMarkWorker), since it may park for other reasons and should not be scheduled by findRunnableGCWorker. Unfortunately, this design is complex and difficult to reason about. We simplify things by changing the design to eliminate the hard P attachment. Rather than workers always performing work from the same P, workers perform work for whichever P they find themselves on. On park, the workers are placed in a pool of free workers, which each P's findRunnableGCWorker can use to run a worker for its P. Now if a worker parks in gcMarkDone, a P may simply use another worker from the pool to complete its own work. The P's GC worker mode is used to communicate the mode to run to the selected worker. It is also used to emit the appropriate worker EvGoStart tracepoint. This is a slight change, as this G may be preempted (e.g., in gcMarkDone). When it is rescheduled, the trace viewer will show it as a normal goroutine again. It is currently a bit difficult to connect to the original worker tracepoint, as the viewer does not display the goid for the original worker (though the data is in the trace file). Change-Id: Id7bd3a364dc18a4d2b1c99c4dc4810fae1293c1b Reviewed-on: https://go-review.googlesource.com/c/go/+/262348 Run-TryBot: Michael Pratt <mpratt@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Trust: Michael Pratt <mpratt@google.com>	2020-10-30 15:25:49 +00:00
Michael Pratt	fc116b69e2	runtime: try to elide timer stealing if P has no timers Following golang.org/cl/259578, findrunnable still must touch every other P in checkTimers in order to look for timers to steal. This scales poorly with GOMAXPROCS and potentially performs poorly by pulling remote Ps into cache. Add timerpMask, a bitmask that tracks whether each P may have any timers on its timer heap. Ideally we would update this field on any timer add / remove to always keep it up to date. Unfortunately, updating a shared global structure is antithetical to sharding timers by P, and doing so approximately doubles the cost of addtimer / deltimer in microbenchmarks. Instead we only (potentially) clear the mask when the P goes idle. This covers the best case of avoiding looking at a P _at all_ when it is idle and has no timers. See the comment on updateTimerPMask for more details on the trade-off. Future CLs may be able to expand cases we can avoid looking at the timers. Note that the addition of idlepMask to p.init is a no-op. The zero value of the mask is the correct init value so it is not necessary, but it is included for clarity. Benchmark results from WakeupParallel/syscall/pair/race/1ms (see golang.org/cl/228577). Note that these are on top of golang.org/cl/259578: name old msec new msec delta Perf-task-clock-8 244 ± 4% 246 ± 4% ~ (p=0.841 n=5+5) Perf-task-clock-16 247 ±11% 252 ± 4% ~ (p=1.000 n=5+5) Perf-task-clock-32 270 ± 1% 268 ± 2% ~ (p=0.548 n=5+5) Perf-task-clock-64 302 ± 3% 296 ± 1% ~ (p=0.222 n=5+5) Perf-task-clock-128 358 ± 3% 352 ± 2% ~ (p=0.310 n=5+5) Perf-task-clock-256 483 ± 3% 458 ± 1% -5.16% (p=0.008 n=5+5) Perf-task-clock-512 663 ± 1% 612 ± 4% -7.61% (p=0.008 n=5+5) Perf-task-clock-1024 1.06k ± 1% 0.95k ± 2% -10.24% (p=0.008 n=5+5) Updates #28808 Updates #18237 Change-Id: I4239cd89f21ad16dfbbef58d81981da48acd0605 Reviewed-on: https://go-review.googlesource.com/c/go/+/264477 Run-TryBot: Michael Pratt <mpratt@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Trust: Michael Pratt <mpratt@google.com>	2020-10-28 17:54:13 +00:00
Ian Lance Taylor	b4b0144652	runtime: don't always adjust timers Some programs have a lot of timers that they adjust both forward and backward in time. This can cause a large number of timerModifiedEarlier timers. In practice these timers are used for I/O deadlines and are rarely reached. The effect is that the runtime spends a lot of time in adjusttimers making sure that there are no timerModifiedEarlier timers, but the effort is wasted because none of the adjusted timers are near the top of the timer heap anyhow. Avoid much of this extra work by keeping track of the earliest known timerModifiedEarlier timer. This lets us skip adjusttimers if we know that none of the timers will be ready to run anyhow. We will still eventually run it, when we reach the deadline of the earliest known timerModifiedEarlier, although in practice that timer has likely been removed. When we do run adjusttimers, we will reset all of the timerModifiedEarlier timers, and clear our notion of when we need to run adjusttimers again. This effect should be to significantly reduce the number of times we walk through the timer list in adjusttimers. Fixes #41699 Change-Id: I38eb2be611fb34e3017bb33d0a9ed40d75fb414f Reviewed-on: https://go-review.googlesource.com/c/go/+/258303 Trust: Ian Lance Taylor <iant@golang.org> Trust: Emmanuel Odeke <emmanuel@orijtech.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>	2020-10-27 22:57:36 +00:00
Austin Clements	bda37a0b8a	runtime: tidy compileCallback This makes a few minor cleanups and simplifications to compileCallback. Change-Id: Ibebf4b5ed66fb68bba7c84129c127cd4d8a691fe Reviewed-on: https://go-review.googlesource.com/c/go/+/263269 Trust: Austin Clements <austin@google.com> Trust: Alex Brainman <alex.brainman@gmail.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Alex Brainman <alex.brainman@gmail.com>	2020-10-26 14:50:35 +00:00
Andrew G. Morgan	d1b1145cac	syscall: support POSIX semantics for Linux syscalls This change adds two new methods for invoking system calls under Linux: syscall.AllThreadsSyscall() and syscall.AllThreadsSyscall6(). These system call wrappers ensure that all OSThreads mirror a common system call. The wrappers serialize execution of the runtime to ensure no race conditions where any Go code observes a non-atomic OS state change. As such, the syscalls have higher runtime overhead than regular system calls, and only need to be used where such thread (or 'm' in the parlance of the runtime sources) consistency is required. The new support is used to enable these functions under Linux: syscall.Setegid(), syscall.Seteuid(), syscall.Setgroups(), syscall.Setgid(), syscall.Setregid(), syscall.Setreuid(), syscall.Setresgid(), syscall.Setresuid() and syscall.Setuid(). They work identically to their glibc counterparts. Extensive discussion of the background issue addressed in this patch can be found here: https://github.com/golang/go/issues/1435 In the case where cgo is used, the C runtime can launch pthreads that are not managed by the Go runtime. As such, the added syscall.AllThreadsSyscall() return ENOTSUP when cgo is enabled. However, for the 9 syscall.Set() functions listed above, when cgo is active, these functions redirect to invoke their C.set() equivalents in glibc, which wraps the raw system calls with a nptl:setxid fixup mechanism. This achieves POSIX semantics for these functions in the combined Go and C runtime. As a side note, the glibc/nptl:setxid support (2019-11-30) does not extend to all security related system calls under Linux so using native Go (CGO_ENABLED=0) and these AllThreadsSyscall()s, where needed, will yield more well defined/consistent behavior over all threads of a Go program. That is, using the syscall.AllThreadsSyscall*() wrappers for things like setting state through SYS_PRCTL and SYS_CAPSET etc. Fixes #1435 Change-Id: Ib1a3e16b9180f64223196a32fc0f9dce14d9105c Reviewed-on: https://go-review.googlesource.com/c/go/+/210639 Trust: Emmanuel Odeke <emm.odeke@gmail.com> Trust: Ian Lance Taylor <iant@golang.org> Trust: Michael Pratt <mpratt@google.com> Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com> Reviewed-by: Michael Pratt <mpratt@google.com> Reviewed-by: Austin Clements <austin@google.com>	2020-10-23 20:53:14 +00:00
Michael Pratt	4a2cc73f87	runtime: don't attempt to steal from idle Ps Work stealing is a scalability bottleneck in the scheduler. Since each P has a work queue, work stealing must look at every P to determine if there is any work. The number of Ps scales linearly with GOMAXPROCS (i.e., the number of Ps _is_ GOMAXPROCS), thus this work scales linearly with GOMAXPROCS. Work stealing is a later attempt by a P to find work before it goes idle. Since the P has no work of its own, extra costs here tend not to directly affect application-level benchmarks. Where they show up is extra CPU usage by the process as a whole. These costs get particularly expensive for applications that transition between blocked and running frequently. Long term, we need a more scalable approach in general, but for now we can make a simple observation: idle Ps ([1]) cannot possibly have anything in their runq, so we need not bother checking at all. We track idle Ps via a new global bitmap, updated in pidleput/pidleget. This is already a slow path (requires sched.lock), so we don't expect high contention there. Using a single bitmap avoids the need to touch every P to read p.status. Currently, the bitmap approach is not significantly better than reading p.status. However, in a future CL I'd like to apply a similiar optimization to timers. Once done, findrunnable would not touch most Ps at all (in mostly idle programs), which will avoid memory latency to pull those Ps into cache. When reading this bitmap, we are racing with Ps going in and out of idle, so there are a few cases to consider: 1. _Prunning -> _Pidle: Running P goes idle after we check the bitmap. In this case, we will try to steal (and find nothing) so there is no harm. 2. _Pidle -> _Prunning while spinning: A P that starts running may queue new work that we miss. This is OK: (a) that P cannot go back to sleep without completing its work, and (b) more fundamentally, we will recheck after we drop our P. 3. _Pidle -> _Prunning after spinning: After spinning, we really can miss work from a newly woken P. (a) above still applies here as well, but this is also the same delicate dance case described in findrunnable: if nothing is spinning anymore, the other P will unpark a thread to run the work it submits. Benchmark results from WakeupParallel/syscall/pair/race/1ms (see golang.org/cl/228577): name old msec new msec delta Perf-task-clock-8 250 ± 1% 247 ± 4% ~ (p=0.690 n=5+5) Perf-task-clock-16 258 ± 2% 259 ± 2% ~ (p=0.841 n=5+5) Perf-task-clock-32 284 ± 2% 270 ± 4% -4.94% (p=0.032 n=5+5) Perf-task-clock-64 326 ± 3% 303 ± 2% -6.92% (p=0.008 n=5+5) Perf-task-clock-128 407 ± 2% 363 ± 5% -10.69% (p=0.008 n=5+5) Perf-task-clock-256 561 ± 1% 481 ± 1% -14.20% (p=0.016 n=4+5) Perf-task-clock-512 840 ± 5% 683 ± 2% -18.70% (p=0.008 n=5+5) Perf-task-clock-1024 1.38k ±14% 1.07k ± 2% -21.85% (p=0.008 n=5+5) [1] "Idle Ps" here refers to _Pidle Ps in the sched.pidle list. In other contexts, Ps may temporarily transition through _Pidle (e.g., in handoffp); those Ps may have work. Updates #28808 Updates #18237 Change-Id: Ieeb958bd72e7d8fb375b0b1f414e8d7378b14e29 Reviewed-on: https://go-review.googlesource.com/c/go/+/259578 Run-TryBot: Michael Pratt <mpratt@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Austin Clements <austin@google.com> Trust: Michael Pratt <mpratt@google.com>	2020-10-23 14:18:27 +00:00
Jeremy Faller	91e4d2d57b	[dev.link] Merge branch 'master' into dev.link 2 conflicts, that make sense. src/cmd/internal/obj/objfile.go src/cmd/link/internal/loader/loader.go Change-Id: Ib224e2d248cb568fa1e888af79dd908b2f5e05ff	2020-09-30 18:00:58 -04:00
Cherry Zhang	a413908dd0	all: add GOOS=ios Introduce GOOS=ios for iOS systems. GOOS=ios matches "darwin" build tag, like GOOS=android matches "linux" and GOOS=illumos matches "solaris". Only ios/arm64 is supported (ios/amd64 is not). GOOS=ios and GOOS=darwin remain essentially the same at this point. They will diverge at later time, to differentiate macOS and iOS. Uses of GOOS=="darwin" are changed to (GOOS=="darwin" \|\| GOOS=="ios"), except if it clearly means macOS (e.g. GOOS=="darwin" && GOARCH=="amd64"), it remains GOOS=="darwin". Updates #38485. Change-Id: I4faacdc1008f42434599efb3c3ad90763a83b67c Reviewed-on: https://go-review.googlesource.com/c/go/+/254740 Trust: Cherry Zhang <cherryyz@google.com> Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2020-09-23 18:12:59 +00:00
Michael Anthony Knyszek	eb3c6a93c3	runtime: disable stack shrinking in activeStackChans race window Currently activeStackChans is set before a goroutine blocks on a channel operation in an unlockf passed to gopark. The trouble is that the unlockf is called after the G's status is changed, and the G's status is what is used by a concurrent mark worker (calling suspendG) to determine that a G has successfully been suspended. In this window between the status change and unlockf, the mark worker could try to shrink the G's stack, and in particular observe that activeStackChans is false. This observation will cause the mark worker to not synchronize with concurrent channel operations when it should, and so updating pointers in the sudog for the blocked goroutine (which may point to the goroutine's stack) races with channel operations which may also manipulate the pointer (read it, dereference it, update it, etc.). Fix the problem by adding a new atomically-updated flag to the g struct called parkingOnChan, which is non-zero in the race window above. Then, in isShrinkStackSafe, check if parkingOnChan is zero. The race is resolved like so: * Blocking G sets parkingOnChan, then changes status in gopark. * Mark worker successfully suspends blocking G. * If the mark worker observes parkingOnChan is non-zero when checking isShrinkStackSafe, then it's not safe to shrink (we're in the race window). * If the mark worker observes parkingOnChan as zero, then because the mark worker observed the G status change, it can be sure that gopark's unlockf completed, and gp.activeStackChans will be correct. The risk of this change is low, since although it reduces the number of places that stack shrinking is allowed, the window here is incredibly small. Essentially, every place that it might crash now is replaced with no shrink. This change adds a test, but the race window is so small that it's hard to trigger without a well-placed sleep in park_m. Also, this change fixes stackGrowRecursive in proc_test.go to actually allocate a 128-byte stack frame. It turns out the compiler was destructuring the "pad" field and only allocating one uint64 on the stack. Fixes #40641. Change-Id: I7dfbe7d460f6972b8956116b137bc13bc24464e8 Reviewed-on: https://go-review.googlesource.com/c/go/+/247050 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> Trust: Michael Knyszek <mknyszek@google.com>	2020-09-21 14:34:33 +00:00
Cherry Zhang	3ab22052fb	[dev.link] all: merge branch 'master' into dev.link Clean merge. Change-Id: Ib773b0bc00fd99d494f9331c3613bcc8285e48e3	2020-09-11 12:07:44 -04:00
Keith Randall	5c2c6d3fbf	runtime: framepointers are no longer an experiment - hard code them I think they are no longer experimental status. Might as well promote them to permanent. Change-Id: Id1259601b3dd2061dd60df86ee48080bfb575d2f Reviewed-on: https://go-review.googlesource.com/c/go/+/249857 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>	2020-08-27 21:15:47 +00:00
Keith Randall	d9a6bdf7ef	cmd/compile: don't allow go:notinheap on the heap or stack Right now we just prevent such types from being on the heap. This CL makes it so they cannot appear on the stack either. The distinction between heap and stack is pretty vague at the language level (e.g. it is affected by -N), and we don't need the flexibility anyway. Once go:notinheap types cannot be in either place, we don't need to consider pointers to such types to be pointers, at least according to the garbage collector and stack copying. (This is the big win of this CL, in my opinion.) The distinction between HasPointers and HasHeapPointer no longer exists. There is only HasPointers. This CL is cleanup before possible use of go:notinheap to fix #40954. Update #13386 Change-Id: Ibd895aadf001c0385078a6d4809c3f374991231a Reviewed-on: https://go-review.googlesource.com/c/go/+/249917 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>	2020-08-25 01:46:05 +00:00
Cherry Zhang	0ef562592f	[dev.link] all: merge branch 'master' into dev.link Change-Id: Ic66b5138f3ecd9e9a48d7ab05782297c06e4a5b5	2020-08-21 14:18:06 -04:00
Matthew Dempsky	30a68bfb80	runtime: add "success" field to sudog The current wakeup protocol for channel communications is that the second goroutine sets gp.param to the sudog when a value is successfully communicated over the channel, and to nil when the wakeup is due to closing the channel. Setting nil to indicate channel closure works okay for chansend and chanrecv, because they're only communicating with one channel, so they know it must be the channel that was closed. However, it means selectgo has to re-poll all of the channels to figure out which one was closed. This commit adds a "success" field to sudog, and changes the wakeup protocol to always set gp.param to sg, and to use sg.success to indicate successful communication vs channel closure. While here, this also reorganizes the chansend code slightly so that the sudog is still released to the pool if the send blocks and then is awoken because the channel closed. Updates #40410. Change-Id: I6cd9a20ebf9febe370a15af1b8afe24c5539efc6 Reviewed-on: https://go-review.googlesource.com/c/go/+/245019 Run-TryBot: Matthew Dempsky <mdempsky@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> Reviewed-by: Keith Randall <khr@golang.org>	2020-08-18 20:05:33 +00:00
Jeremy Faller	26407b2212	[dev.link] cmd/{compile,link}: remove pcdata tables from pclntab_old Move the pctables out of pclntab_old. Creates a new generator symbol, runtime.pctab, which holds all the deduplicated pctables. Also, tightens up some of the types in runtime. Darwin, cmd/compile statistics: alloc/op Pclntab_GC 26.4MB ± 0% 13.8MB ± 0% allocs/op Pclntab_GC 89.9k ± 0% 86.4k ± 0% liveB Pclntab_GC 25.5M ± 0% 24.2M ± 0% No significant change in binary size. Change-Id: I1560fd4421f8a210f8d4b508fbc54e1780e338f9 Reviewed-on: https://go-review.googlesource.com/c/go/+/248332 Run-TryBot: Jeremy Faller <jeremy@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2020-08-18 14:03:43 +00:00
Tobias Klauser	4149493443	runtime: move startupRandomData declaration to os_linux.go startupRandomData is only used in sysauxv and getRandomData on linux, thus move it closer to where it is used. Also adjust its godoc comment. Change-Id: Ice51d579ec33436adbfdf247caf4ba00bae865e0 Reviewed-on: https://go-review.googlesource.com/c/go/+/248761 Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2020-08-18 08:47:48 +00:00
Jeremy Faller	9ae8f71c94	[dev.link] cmd/link: stop renumbering files for pclntab generation Creates two new symbols: runtime.cutab, and runtime.filetab, and strips the filenames out of runtime.pclntab_old. All stats are for cmd/compile. Time: Pclntab_GC 48.2ms ± 3% 45.5ms ± 9% -5.47% (p=0.004 n=9+9) Alloc/op: Pclntab_GC 30.0MB ± 0% 29.5MB ± 0% -1.88% (p=0.000 n=10+10) Allocs/op: Pclntab_GC 90.4k ± 0% 73.1k ± 0% -19.11% (p=0.000 n=10+10) live-B: Pclntab_GC 29.1M ± 0% 29.2M ± 0% +0.10% (p=0.000 n=10+10) binary sizes: NEW: 18565600 OLD: 18532768 The size differences in the binary are caused by the increased size of the Func objects, and (less likely) some extra alignment padding needed as a result. This is probably the maximum increase in size we'll size from the pclntab reworking. Change-Id: Idd95a9b159fea46f7701cfe6506813b88257fbea Reviewed-on: https://go-review.googlesource.com/c/go/+/246497 Run-TryBot: Jeremy Faller <jeremy@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Than McIntosh <thanm@google.com> Reviewed-by: Austin Clements <austin@google.com>	2020-08-12 17:14:51 +00:00
Jeremy Faller	e4e1c6a7af	[dev.link] add compilation unit index to func Not used yet, but add the compilation unit for a function to func. Change-Id: I7c43fa9f1da044ca63bab030062519771b9f4418 Reviewed-on: https://go-review.googlesource.com/c/go/+/244547 Run-TryBot: Jeremy Faller <jeremy@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2020-08-03 17:58:03 +00:00
Michael Anthony Knyszek	c847589ad0	runtime: synchronize StartTrace and StopTrace with sysmon Currently sysmon is not stopped when the world is stopped, which is in general a difficult thing to do. The result of this is that when tracing starts and the value of trace.enabled changes, it's possible for sysmon to fail to emit an event when it really should. This leads to traces which the execution trace parser deems inconsistent. Fix this by putting all of sysmon's work behind a new lock sysmonlock. StartTrace and StopTrace both acquire this lock after stopping the world but before performing any work in order to ensure sysmon sees the required state change in tracing. This change is expected to slow down StartTrace and StopTrace, but will help ensure consistent traces are generated. Updates #29707. Fixes #38794. Change-Id: I64c58e7c3fd173cd5281ffc208d6db24ff6c0284 Reviewed-on: https://go-review.googlesource.com/c/go/+/234617 Run-TryBot: Michael Knyszek <mknyszek@google.com> Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com> Reviewed-by: Michael Pratt <mpratt@google.com>	2020-05-21 14:48:50 +00:00
Aaron Patterson	94e61ab94d	runtime/runtime2: pack the sudog struct This commit moves the isSelect bool below the ticket uint32. The boolean was consuming 8 bytes of the struct. The uint32 was also consuming 8 bytes, so we can pack isSelect below the uint32 and save 8 bytes. This reduces the sudog struct from 96 bytes to 88 bytes. Change-Id: If555cdaf2f5eaa125e2590fc4d113dbc99750738 GitHub-Last-Rev: `d63b4e086b` GitHub-Pull-Request: golang/go#36552 Reviewed-on: https://go-review.googlesource.com/c/go/+/214677 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2020-05-07 04:05:18 +00:00
geedchin	01a9cf8487	runtime: correct waitReasonForceGGIdle to waitResonForceGCIdle Change-Id: I211db915ce2e98555c58f4320ca58e91536f8f3d GitHub-Last-Rev: `40a7430f88` GitHub-Pull-Request: golang/go#38852 Reviewed-on: https://go-review.googlesource.com/c/go/+/232037 Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2020-05-05 02:38:39 +00:00
Austin Clements	3633d2c545	runtime: perform debug call injection on a new goroutine Currently, when a debugger injects a call, that call happens on the goroutine where the debugger injected it. However, this requires significant runtime complexity that we're about to remove. To prepare for this, this CL switches to a different approach that leaves the interrupted goroutine parked and runs the debug call on a new goroutine. When the debug call returns, it resumes the original goroutine. This should be essentially transparent to debuggers. It follows the exact same call injection protocol and ensures the whole protocol executes indivisibly on a single OS thread. The only difference is that the current G and stack now change part way through the protocol. For #36365. Change-Id: I68463bfd73cbee06cfc49999606410a59dd8f653 Reviewed-on: https://go-review.googlesource.com/c/go/+/229299 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2020-04-29 21:29:11 +00:00
Dan Scales	0a820007e7	runtime: static lock ranking for the runtime (enabled by GOEXPERIMENT) I took some of the infrastructure from Austin's lock logging CR https://go-review.googlesource.com/c/go/+/192704 (with deadlock detection from the logs), and developed a setup to give static lock ranking for runtime locks. Static lock ranking establishes a documented total ordering among locks, and then reports an error if the total order is violated. This can happen if a deadlock happens (by acquiring a sequence of locks in different orders), or if just one side of a possible deadlock happens. Lock ordering deadlocks cannot happen as long as the lock ordering is followed. Along the way, I found a deadlock involving the new timer code, which Ian fixed via https://go-review.googlesource.com/c/go/+/207348, as well as two other potential deadlocks. See the constants at the top of runtime/lockrank.go to show the static lock ranking that I ended up with, along with some comments. This is great documentation of the current intended lock ordering when acquiring multiple locks in the runtime. I also added an array lockPartialOrder[] which shows and enforces the current partial ordering among locks (which is embedded within the total ordering). This is more specific about the dependencies among locks. I don't try to check the ranking within a lock class with multiple locks that can be acquired at the same time (i.e. check the ranking when multiple hchan locks are acquired). Currently, I am doing a lockInit() call to set the lock rank of most locks. Any lock that is not otherwise initialized is assumed to be a leaf lock (a very high rank lock), so that eliminates the need to do anything for a bunch of locks (including all architecture-dependent locks). For two locks, root.lock and notifyList.lock (only in the runtime/sema.go file), it is not as easy to do lock initialization, so instead, I am passing the lock rank with the lock calls. For Windows compilation, I needed to increase the StackGuard size from 896 to 928 because of the new lock-rank checking functions. Checking of the static lock ranking is enabled by setting GOEXPERIMENT=staticlockranking before doing a run. To make sure that the static lock ranking code has no overhead in memory or CPU when not enabled by GOEXPERIMENT, I changed 'go build/install' so that it defines a build tag (with the same name) whenever any experiment has been baked into the toolchain (by checking Expstring()). This allows me to avoid increasing the size of the 'mutex' type when static lock ranking is not enabled. Fixes #38029 Change-Id: I154217ff307c47051f8dae9c2a03b53081acd83a Reviewed-on: https://go-review.googlesource.com/c/go/+/207619 Reviewed-by: Dan Scales <danscales@google.com> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Dan Scales <danscales@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-04-07 21:51:03 +00:00
Cherry Zhang	0c0e8f224d	runtime: don't send preemption signal if there is a signal pending If multiple threads call preemptone to preempt the same M, it may send many signals to the same M such that it hardly make progress, causing live-lock problem. Only send a signal if there isn't already one pending. Fixes #37741. Change-Id: Id94adb0b95acbd18b23abe637a8dcd81ab41b452 Reviewed-on: https://go-review.googlesource.com/c/go/+/223737 Run-TryBot: Cherry Zhang <cherryyz@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2020-03-18 16:00:44 +00:00
Ian Lance Taylor	3093959ee1	runtime: remove mcache field from m Having an mcache field in both m and p is confusing, so remove it from m. Always use mcache field from p. Use new variable mcache0 during bootstrap. Change-Id: If2cba9f8bb131d911d512b61fd883a86cf62cc98 Reviewed-on: https://go-review.googlesource.com/c/go/+/205239 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2020-02-24 16:39:52 +00:00
Ian Lance Taylor	895b7c85ad	runtime: don't skip checkTimers if we would clear deleted timers The timers code used to have a problem: if code started and stopped a lot of timers, as would happen with, for example, lots of calls to context.WithTimeout, then it would steadily use memory holding timers that had stopped but not been removed from the timer heap. That problem was fixed by CL 214299, which would remove all deleted timers whenever they got to be more than 1/4 of the total number of timers on the heap. The timers code had a different problem: if there were some idle P's, the running P's would have lock contention trying to steal their timers. That problem was fixed by CL 214185, which only acquired the timer lock if the next timer was ready to run or there were some timers to adjust. Unfortunately, CL 214185 partially undid 214299, in that we could now accumulate an increasing number of deleted timers while there were no timers ready to run. This CL restores the 214299 behavior, by checking whether there are lots of deleted timers without acquiring the lock. This is a performance issue to consider for the 1.14 release. Change-Id: I13c980efdcc2a46eb84882750c39e3f7c5b2e7c3 Reviewed-on: https://go-review.googlesource.com/c/go/+/215722 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2020-01-22 18:10:42 +00:00
Ian Lance Taylor	cfe3cd903f	runtime: keep P's first timer when in new atomically accessed field This reduces lock contention when only a few P's are running and checking for whether they need to run timers on the sleeping P's. Without this change the running P's would get lock contention while looking at the sleeping P's timers. With this change a single atomic load suffices to determine whether there are any ready timers. Change-Id: Ie843782bd56df49867a01ecf19c47498ec827452 Reviewed-on: https://go-review.googlesource.com/c/go/+/214185 Run-TryBot: Ian Lance Taylor <iant@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: David Chase <drchase@google.com>	2020-01-14 19:54:20 +00:00
Ian Lance Taylor	641e61db57	runtime: don't let P's timer heap get clogged with deleted timers Whenever more than 1/4 of the timers on a P's heap are deleted, remove them from the heap. Change-Id: Iff63ed3d04e6f33ffc5c834f77f645c52c007e52 Reviewed-on: https://go-review.googlesource.com/c/go/+/214299 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>	2020-01-10 23:03:06 +00:00
Austin Clements	91bb1d734e	runtime: move m.thread to mOS This field is only used on Windows. Change-Id: I12d4df09261f8e7ad54c2abd7beda669af28c8e7 Reviewed-on: https://go-review.googlesource.com/c/go/+/207778 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Reviewed-by: Alex Brainman <alex.brainman@gmail.com>	2019-11-20 17:13:36 +00:00
Michael Anthony Knyszek	a2cd2bd55d	runtime: add per-p page allocation cache This change adds a per-p free page cache which the page allocator may allocate out of without a lock. The change also introduces a completely lockless page allocator fast path. Although the cache contains at most 64 pages (and usually less), the vast majority (85%+) of page allocations are exactly 1 page in size. Updates #35112. Change-Id: I170bf0a9375873e7e3230845eb1df7e5cf741b78 Reviewed-on: https://go-review.googlesource.com/c/go/+/195701 Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: Austin Clements <austin@google.com>	2019-11-08 18:00:54 +00:00
Michael Anthony Knyszek	4517c02f28	runtime: add per-p mspan cache This change adds a per-p mspan object cache similar to the sudog cache. Unfortunately this cache can't quite operate like the sudog cache, since it is used in contexts where write barriers are disallowed (i.e. allocation codepaths), so rather than managing an array and a slice, it's just an array and a length. A little bit more unsafe, but avoids any write barriers. The purpose of this change is to reduce the number of operations which require the heap lock in allocation, paving the way for a lockless fast path. Updates #35112. Change-Id: I32cfdcd8528fb7be985640e4f3a13cb98ffb7865 Reviewed-on: https://go-review.googlesource.com/c/go/+/196642 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2019-11-08 17:01:32 +00:00
Dan Scales	7dcd343ed6	runtime: ensure that Goexit cannot be aborted by a recursive panic/recover When we do a successful recover of a panic, we resume normal execution by returning from the frame that had the deferred call that did the recover (after executing any remaining deferred calls in that frame). However, suppose we have called runtime.Goexit and there is a panic during one of the deferred calls run by the Goexit. Further assume that there is a deferred call in the frame of the Goexit or a parent frame that does a recover. Then the recovery process will actually resume normal execution above the Goexit frame and hence abort the Goexit. We will not terminate the thread as expected, but continue running in the frame above the Goexit. To fix this, we explicitly create a _panic object for a Goexit call. We then change the "abort" behavior for Goexits, but not panics. After a recovery, if the top-level panic is actually a Goexit that is marked to be aborted, then we return to the Goexit defer-processing loop, so that the Goexit is not actually aborted. Actual code changes are just panic.go, runtime2.go, and funcid.go. Adjusted the test related to the new Goexit behavior (TestRecoverBeforePanicAfterGoexit) and added several new tests of aborted panics (whose behavior has not changed). Fixes #29226 Change-Id: Ib13cb0074f5acc2567a28db7ca6912cfc47eecb5 Reviewed-on: https://go-review.googlesource.com/c/go/+/200081 Run-TryBot: Dan Scales <danscales@google.com> Reviewed-by: Keith Randall <khr@golang.org>	2019-11-04 16:32:38 +00:00
Austin Clements	177a36a5dc	runtime: implement async scheduler preemption This adds signal-based preemption to preemptone. Since STW and forEachP ultimately use preemptone, this also makes these work with async preemption. This also makes freezetheworld more robust so tracebacks from fatal panics should be far less likely to report "goroutine running on other thread; stack unavailable". For #10958, #24543. (This doesn't fix it yet because asynchronous preemption only works on POSIX platforms on 386 and amd64 right now.) Change-Id: If776181dd5a9b3026a7b89a1b5266521b95a5f61 Reviewed-on: https://go-review.googlesource.com/c/go/+/201762 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-11-02 21:51:21 +00:00
Austin Clements	62e53b7922	runtime: use signals to preempt Gs for suspendG This adds support for pausing a running G by sending a signal to its M. The main complication is that we want to target a G, but can only send a signal to an M. Hence, the protocol we use is to simply mark the G for preemption (which we already do) and send the M a "wake up and look around" signal. The signal checks if it's running a G with a preemption request and stops it if so in the same way that stack check preemptions stop Gs. Since the preemption may fail (the G could be moved or the signal could arrive at an unsafe point), we keep a count of the number of received preemption signals. This lets stopG detect if its request failed and should be retried without an explicit channel back to suspendG. For #10958, #24543. Change-Id: I3e1538d5ea5200aeb434374abb5d5fdc56107e53 Reviewed-on: https://go-review.googlesource.com/c/go/+/201760 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-11-02 21:51:18 +00:00
Dan Scales	1f3339f441	runtime: fix dumpgoroutine() to deal with open-coded defers _defer.fn can be nil, so we need to add a check when dumping _defer.fn.fn. Fixes #35172 Change-Id: Ic1138be5ec9dce915a87467cfa51ff83acc6e3a9 Reviewed-on: https://go-review.googlesource.com/c/go/+/203697 Run-TryBot: Dan Scales <danscales@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2019-10-28 01:00:20 +00:00
Austin Clements	6058603471	runtime: only shrink stacks at synchronous safe points We're about to introduce asynchronous safe points, where we won't have precise pointer maps for all stack frames. That's okay for scanning the stack (conservatively), but not for shrinking the stack. Hence, this CL prepares for this by only shrinking the stack as part of the stack scan if the goroutine is stopped at a synchronous safe point. Otherwise, it queues up the stack shrink for the next synchronous safe point. We already have one condition under which we can't shrink the stack for very similar reasons: syscalls. Currently, we just give up on shrinking the stack if it's in a syscall. But with this mechanism, we defer that stack shrink until the next synchronous safe point. For #10958, #24543. Change-Id: Ifa1dec6f33fdf30f9067be2ce3f7ab8a7f62ce38 Reviewed-on: https://go-review.googlesource.com/c/go/+/201438 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-10-25 23:25:35 +00:00
Austin Clements	36a432f27b	runtime: make copystack/sudog synchronization more explicit When we copy a stack of a goroutine blocked in a channel operation, we have to be very careful because other goroutines may be writing to that goroutine's stack. To handle this, stack copying acquires the locks for the channels a goroutine is waiting on. One complication is that stack growth may happen while a goroutine holds these locks, in which case stack copying must not acquire these locks because that would self-deadlock. Currently, stack growth never acquires these locks because stack growth only happens when a goroutine is running, which means it's either not blocking on a channel or it's holding the channel locks already. Stack shrinking always acquires these locks because shrinking happens asynchronously, so the goroutine is never running, so there are either no locks or they've been released by the goroutine. However, we're about to change when stack shrinking can happen, which is going to break the current rules. Rather than find a new way to derive whether to acquire these locks or not, this CL simply adds a flag to the g struct that indicates that stack copying should acquire channel locks. This flag is set while the goroutine is blocked on a channel op. For #10958, #24543. Change-Id: Ia2ac8831b1bfda98d39bb30285e144c4f7eaf9ab Reviewed-on: https://go-review.googlesource.com/c/go/+/172982 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>	2019-10-25 23:25:33 +00:00
Austin Clements	8c5861576a	runtime: remove g.gcscanvalid Currently, gcscanvalid is used to resolve a race between attempts to scan a stack. Now that there's a clear owner of the stack scan operation, there's no longer any danger of racing or attempting to scan a stack more than once, so this CL eliminates gcscanvalid. I double-checked my reasoning by first adding a throw if gcscanvalid was set in scanstack and verifying that all.bash still passed. For #10958, #24543. Fixes #24363. Change-Id: I76794a5fcda325ed7cfc2b545e2a839b8b3bc713 Reviewed-on: https://go-review.googlesource.com/c/go/+/201139 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-10-25 23:25:32 +00:00
Austin Clements	1b79afe460	runtime: remove old stack scanning code This removes scang and preemptscan, since the stack scanning code now uses suspendG. For #10958, #24543. Change-Id: Ic868bf5d6dcce40662a82cb27bb996cb74d0720e Reviewed-on: https://go-review.googlesource.com/c/go/+/201138 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-10-25 23:25:31 +00:00
Austin Clements	3f834114ab	runtime: add general suspendG/resumeG Currently, the process of suspending a goroutine is tied to stack scanning. In preparation for non-cooperative preemption, this CL abstracts this into general purpose suspendG/resumeG functions. suspendG and resumeG closely follow the existing scang and restartg functions with one exception: the addition of a _Gpreempted status. Currently, preemption tasks (stack scanning) are carried out by the target goroutine if it's in _Grunning. In this new approach, the task is always carried out by the goroutine that called suspendG. Thus, we need a reliable way to drive the target goroutine out of _Grunning until the requesting goroutine is ready to resume it. The new _Gpreempted state provides the handshake: when a runnable goroutine responds to a preemption request, it now parks itself and enters _Gpreempted. The requesting goroutine races to put it in _Gwaiting, which gives it ownership, but also the responsibility to start it again. This CL adds several TODOs about improving the synchronization on the G status. The existing code already has these problems; we're just taking note of them. The next CL will remove the now-dead scang and preemptscan. For #10958, #24543. Change-Id: I16dbf87bea9d50399cc86719c156f48e67198f16 Reviewed-on: https://go-review.googlesource.com/c/go/+/201137 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-10-25 23:25:28 +00:00
Austin Clements	fc8eb264bb	runtime: ensure _Grunning Gs have a valid g.m and g.m.p We already claim on the documentation for _Grunning that this is case, but execute transitions to _Grunning before assigning g.m. Fix this and make the documentation even more explicit. For #10958, #24543, but also a good cleanup. Change-Id: I1eb0108e7762f55cfb0282aca624af1c0a15fe56 Reviewed-on: https://go-review.googlesource.com/c/go/+/201440 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com>	2019-10-25 16:59:39 +00:00
Dan Scales	be64a19d99	cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata Generate inline code at defer time to save the args of defer calls to unique (autotmp) stack slots, and generate inline code at exit time to check which defer calls were made and make the associated function/method/interface calls. We remember that a particular defer statement was reached by storing in the deferBits variable (always stored on the stack). At exit time, we check the bits of the deferBits variable to determine which defer function calls to make (in reverse order). These low-cost defers are only used for functions where no defers appear in loops. In addition, we don't do these low-cost defers if there are too many defer statements or too many exits in a function (to limit code increase). When a function uses open-coded defers, we produce extra FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and for each defer, the stack slots where the closure and associated args have been stored. The funcdata also includes the location of the deferBits variable. Therefore, for panics, we can use this funcdata to determine exactly which defers are active, and call the appropriate functions/methods/closures with the correct arguments for each active defer. In order to unwind the stack correctly after a recover(), we need to add an extra code segment to functions with open-coded defers that simply calls deferreturn() and returns. This segment is not reachable by the normal function, but is returned to by the runtime during recovery. We set the liveness information of this deferreturn() to be the same as the liveness at the first function call during the last defer exit code (so all return values and all stack slots needed by the defer calls will be live). I needed to increase the stackguard constant from 880 to 896, because of a small amount of new code in deferreturn(). The -N flag disables open-coded defers. '-d defer' prints out the kind of defer being used at each defer statement (heap-allocated, stack-allocated, or open-coded). Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ] With normal (stack-allocated) defers only: 35.4 ns/op With open-coded defers: 5.6 ns/op Cost of function call alone (remove defer keyword): 4.4 ns/op Text size increase (including funcdata) for go binary without/with open-coded defers: 0.09% The average size increase (including funcdata) for only the functions that use open-coded defers is 1.1%. The cost of a panic followed by a recover got noticeably slower, since panic processing now requires a scan of the stack for open-coded defer frames. This scan is required, even if no frames are using open-coded defers: Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ] Without open-coded defers: 62.0 ns/op With open-coded defers: 255 ns/op A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers: CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ] Without open-coded defers: 443 ns/op With open-coded defers: 347 ns/op Updates #14939 (defer performance) Updates #34481 (design doc) Change-Id: I63b1a60d1ebf28126f55ee9fd7ecffe9cb23d1ff Reviewed-on: https://go-review.googlesource.com/c/go/+/202340 Reviewed-by: Austin Clements <austin@google.com>	2019-10-24 13:54:11 +00:00
Ian Lance Taylor	ab3f1a23b6	runtime: add race detector support for new timers Since the new timers run on g0, which does not have a race context, we add a race context field to the P, and use that for timer functions. This works since all timer functions are in the standard library. Updates #27707 Change-Id: I8a5b727b4ddc8ca6fc60eb6d6f5e9819245e395b Reviewed-on: https://go-review.googlesource.com/c/go/+/171882 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>	2019-10-23 07:43:18 +00:00
Ian Lance Taylor	48eb79ec21	runtime: add new modtimer function This adds a new field to P, adjustTimers, that tells the P that one of its existing timers was modified to be earlier, and that it therefore needs to resort them. Updates #27707 Change-Id: I4c5f5b51ed116f1d898d3f87cdddfa1b552337f8 Reviewed-on: https://go-review.googlesource.com/c/go/+/171832 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>	2019-10-22 19:35:04 +00:00
Ian Lance Taylor	06ac26279c	runtime: initial scheduler changes for timers on P's Add support to the main scheduler loop for handling timers on P's. This is not used yet, as timers are not yet put on P's. Updates #6239 Updates #27707 Change-Id: I6a359df408629f333a9232142ce19e8be8496dae Reviewed-on: https://go-review.googlesource.com/c/go/+/171826 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>	2019-10-21 17:23:42 +00:00
Bryan C. Mills	b76e6f8825	Revert "cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata" This reverts CL 190098. Reason for revert: broke several builders. Change-Id: I69161352f9ded02537d8815f259c4d391edd9220 Reviewed-on: https://go-review.googlesource.com/c/go/+/201519 Run-TryBot: Bryan C. Mills <bcmills@google.com> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Dan Scales <danscales@google.com>	2019-10-16 20:59:53 +00:00
Dan Scales	dad616375f	cmd/compile, cmd/link, runtime: make defers low-cost through inline code and extra funcdata Generate inline code at defer time to save the args of defer calls to unique (autotmp) stack slots, and generate inline code at exit time to check which defer calls were made and make the associated function/method/interface calls. We remember that a particular defer statement was reached by storing in the deferBits variable (always stored on the stack). At exit time, we check the bits of the deferBits variable to determine which defer function calls to make (in reverse order). These low-cost defers are only used for functions where no defers appear in loops. In addition, we don't do these low-cost defers if there are too many defer statements or too many exits in a function (to limit code increase). When a function uses open-coded defers, we produce extra FUNCDATA_OpenCodedDeferInfo information that specifies the number of defers, and for each defer, the stack slots where the closure and associated args have been stored. The funcdata also includes the location of the deferBits variable. Therefore, for panics, we can use this funcdata to determine exactly which defers are active, and call the appropriate functions/methods/closures with the correct arguments for each active defer. In order to unwind the stack correctly after a recover(), we need to add an extra code segment to functions with open-coded defers that simply calls deferreturn() and returns. This segment is not reachable by the normal function, but is returned to by the runtime during recovery. We set the liveness information of this deferreturn() to be the same as the liveness at the first function call during the last defer exit code (so all return values and all stack slots needed by the defer calls will be live). I needed to increase the stackguard constant from 880 to 896, because of a small amount of new code in deferreturn(). The -N flag disables open-coded defers. '-d defer' prints out the kind of defer being used at each defer statement (heap-allocated, stack-allocated, or open-coded). Cost of defer statement [ go test -run NONE -bench BenchmarkDefer$ runtime ] With normal (stack-allocated) defers only: 35.4 ns/op With open-coded defers: 5.6 ns/op Cost of function call alone (remove defer keyword): 4.4 ns/op Text size increase (including funcdata) for go cmd without/with open-coded defers: 0.09% The average size increase (including funcdata) for only the functions that use open-coded defers is 1.1%. The cost of a panic followed by a recover got noticeably slower, since panic processing now requires a scan of the stack for open-coded defer frames. This scan is required, even if no frames are using open-coded defers: Cost of panic and recover [ go test -run NONE -bench BenchmarkPanicRecover runtime ] Without open-coded defers: 62.0 ns/op With open-coded defers: 255 ns/op A CGO Go-to-C-to-Go benchmark got noticeably faster because of open-coded defers: CGO Go-to-C-to-Go benchmark [cd misc/cgo/test; go test -run NONE -bench BenchmarkCGoCallback ] Without open-coded defers: 443 ns/op With open-coded defers: 347 ns/op Updates #14939 (defer performance) Updates #34481 (design doc) Change-Id: I51a389860b9676cfa1b84722f5fb84d3c4ee9e28 Reviewed-on: https://go-review.googlesource.com/c/go/+/190098 Reviewed-by: Austin Clements <austin@google.com>	2019-10-16 18:27:16 +00:00
Brad Fitzpatrick	03ef105dae	all: remove nacl (part 3, more amd64p32) Part 1: CL 199499 (GOOS nacl) Part 2: CL 200077 (amd64p32 files, toolchain) Part 3: stuff that arguably should've been part of Part 2, but I forgot one of my grep patterns when splitting the original CL up into two parts. This one might also have interesting stuff to resurrect for any future x32 ABI support. Updates #30439 Change-Id: I2b4143374a253a003666f3c69e776b7e456bdb9c Reviewed-on: https://go-review.googlesource.com/c/go/+/200318 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2019-10-10 22:38:38 +00:00
Sean Chen	39ab8db914	runtime: update runtime2.go itab comments about sync struct `cmd/compile/internal/gc/reflect.go:/^func.dumptypestructs` was modified many times, now is `cmd/compile/internal/gc/reflect.go:/^func.dumptabs` Change-Id: Ie949a5bee7878c998591468a04f67a8a70c61da7 GitHub-Last-Rev: `9ecc26985e` GitHub-Pull-Request: golang/go#34489 Reviewed-on: https://go-review.googlesource.com/c/go/+/197037 Reviewed-by: Keith Randall <khr@golang.org>	2019-09-24 05:13:20 +00:00

1 2 3 4 5 ...

261 commits