Commit graph

103 commits

Author SHA1 Message Date
Russ Cox
80c98fa901 runtime/trace: record event sequence numbers explicitly
Nearly all the flaky failures we've seen in trace tests have been
due to the use of time stamps to determine relative event ordering.
This is tricky for many reasons, including:
 - different cores might not have exactly synchronized clocks
 - VMs are worse than real hardware
 - non-x86 chips have different timer resolution than x86 chips
 - on fast systems two events can end up with the same time stamp

Stop trying to make time reliable. It's clearly not going to be for Go 1.5.
Instead, record an explicit event sequence number for ordering.
Using our own counter solves all of the above problems.

The trace still contains time stamps, of course. The sequence number
is just used for ordering.

Should alleviate #10554 somewhat. Then tickDiv can be chosen to
be a useful time unit instead of having to be exact for ordering.

Separating ordering and time stamps lets the trace parser diagnose
systems where the time stamp order and actual order do not match
for one reason or another. This CL adds that check to the end of
trace.Parse, after all other sequence order-based checking.
If that error is found, we skip the test instead of failing it.
Putting the check in trace.Parse means that cmd/trace will pick
up the same check, refusing to display a trace where the time stamps
do not match actual ordering.
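
The ordering rule is easy to state in code. A minimal sketch of sorting
by sequence number and then sanity-checking the time stamps afterwards
(the event fields here are illustrative, not the real trace format):

    package main

    import (
        "fmt"
        "sort"
    )

    // Illustrative event shape; real trace events carry more fields.
    type event struct {
        seq int64 // explicit sequence number, authoritative for ordering
        ts  int64 // time stamp, kept for display only
    }

    func main() {
        evs := []event{{seq: 2, ts: 100}, {seq: 1, ts: 100}, {seq: 3, ts: 99}}

        // Order strictly by sequence number; equal or skewed time
        // stamps no longer matter for correctness.
        sort.Slice(evs, func(i, j int) bool { return evs[i].seq < evs[j].seq })

        // Post-check in the spirit of trace.Parse: flag time stamps
        // that run backwards relative to the true (sequence) order.
        for i := 1; i < len(evs); i++ {
            if evs[i].ts < evs[i-1].ts {
                fmt.Printf("time stamps out of order at seq %d\n", evs[i].seq)
            }
        }
    }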

Using net/http's BenchmarkClientServerParallel4 on various CPU counts,
not tracing vs tracing:

name                      old time/op    new time/op    delta
ClientServerParallel4       50.4µs ± 4%    80.2µs ± 4%  +59.06%        (p=0.000 n=10+10)
ClientServerParallel4-2     33.1µs ± 7%    57.8µs ± 5%  +74.53%        (p=0.000 n=10+10)
ClientServerParallel4-4     18.5µs ± 4%    32.6µs ± 3%  +75.77%        (p=0.000 n=10+10)
ClientServerParallel4-6     12.9µs ± 5%    24.4µs ± 2%  +89.33%        (p=0.000 n=10+10)
ClientServerParallel4-8     11.4µs ± 6%    21.0µs ± 3%  +83.40%        (p=0.000 n=10+10)
ClientServerParallel4-12    14.4µs ± 4%    23.8µs ± 4%  +65.67%        (p=0.000 n=10+10)

Fixes #10512.

Change-Id: I173eecf8191e86feefd728a5aad25bf1bc094b12
Reviewed-on: https://go-review.googlesource.com/12579
Reviewed-by: Austin Clements <austin@google.com>
2015-07-29 22:32:14 +00:00
Austin Clements
87f97c73d3 runtime: avoid race between SIGPROF traceback and stack barriers
The following sequence of events can lead to the runtime attempting an
out-of-bounds access on a stack barrier slice:

1. A SIGPROF comes in on a thread while the G on that thread is in
   _Gsyscall. The sigprof handler calls gentraceback, which saves a
   local copy of the G's stkbar slice. Currently the G has no stack
   barriers, so this slice is empty.

2. On another thread, the GC concurrently scans the stack of the
   goroutine being profiled (it considers it stopped because it's in
   _Gsyscall) and installs stack barriers.

3. Back on the sigprof thread, gentraceback comes across a stack
   barrier in the stack and attempts to look it up in its (zero
   length) copy of G's old stkbar slice, which causes an out-of-bounds
   access.

This commit fixes this by adding a simple cas spin to synchronize the
SIGPROF handler with stack barrier insertion.
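
The synchronization itself is a one-word cas spin; a sketch of its
shape (not the runtime's actual stkbar locking):

    package main

    import (
        "runtime"
        "sync/atomic"
    )

    // stackLock sketches a cas spin lock: 0 = unlocked, 1 = locked.
    type stackLock struct{ state uint32 }

    func (l *stackLock) lock() {
        for !atomic.CompareAndSwapUint32(&l.state, 0, 1) {
            runtime.Gosched() // spin until the other side is done
        }
    }

    func (l *stackLock) unlock() { atomic.StoreUint32(&l.state, 0) }

    func main() {
        var l stackLock
        l.lock()   // e.g. gentraceback about to read the stkbar slice
        l.unlock() // released before barrier insertion may proceed
    }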

In general I would prefer that this synchronization be done through
the G status, since that's how stack scans are otherwise synchronized,
but adding a new lock is a much smaller change and G statuses are full
of subtlety.

Fixes #11863.

Change-Id: Ie89614a6238bb9c6a5b1190499b0b48ec759eaf7
Reviewed-on: https://go-review.googlesource.com/12748
Reviewed-by: Russ Cox <rsc@golang.org>
2015-07-29 19:31:46 +00:00
Austin Clements
e42413cecc runtime: fix saved PC/SP after safe-point function in syscall
Running a safe-point function on syscall entry uses systemstack() and
hence clobbers g.sched.pc and g.sched.sp. Fix this by re-saving them
after the systemstack, just like in the other uses of systemstack in
reentersyscall.

Change-Id: I47868a53eba24d81919fda56ef6bbcf72f1f922e
Reviewed-on: https://go-review.googlesource.com/12125
Reviewed-by: Russ Cox <rsc@golang.org>
2015-07-15 21:09:16 +00:00
Austin Clements
edfc979725 runtime: run safe-point function before entering _Psyscall
Currently, we run a P's safe-point function immediately after entering
_Psyscall state. This is unsafe, since as soon as we put the P in
_Psyscall, we no longer control the P and another M may claim it.
We'll still run the safe-point function only once (because doing so
races on an atomic), but the P may no longer be at a safe-point when
we do so.

In particular, this means that the use of forEachP to dispose all P's
gcw caches is unsafe. A P may enter a syscall, run the safe-point
function, and dispose the P's gcw cache concurrently with another M
claiming the P and attempting to use its gcw cache. If this happens,
we may empty the gcw's workbuf after putting it on
work.{full,partial}, or add pointers to it after putting it in
work.empty. This will cause an assertion failure when we later pop the
workbuf from the list and its object count is inconsistent with the
list we got it from.

Fix this by running the safe-point function just before putting the P
in _Psyscall.
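
The fix is purely about ordering: do the safe-point work while the P is
still privately owned, and only then publish the state that lets
another M claim it. A schematic with illustrative names:

    package main

    import (
        "fmt"
        "sync/atomic"
    )

    const (
        pRunning uint32 = iota
        pSyscall
    )

    type proc struct {
        status    uint32
        safePoint func(*proc) // pending safe-point function, if any
    }

    // entersyscall sketch: run the safe-point function *before* the
    // status change; once status is pSyscall another M may take the P.
    func entersyscall(p *proc) {
        if fn := p.safePoint; fn != nil {
            fn(p) // still safe: we own the P
        }
        atomic.StoreUint32(&p.status, pSyscall) // P is now up for grabs
    }

    func main() {
        p := &proc{safePoint: func(*proc) { fmt.Println("disposed gcw cache") }}
        entersyscall(p)
    }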

Related to #11640. This probably fixes this issue, but while I'm able
to show that we can enter a bad safe-point state as a result of this,
I can't reproduce that specific failure.

Change-Id: I6989c8ca7ef2a4a941ae1931e9a0748cbbb59434
Reviewed-on: https://go-review.googlesource.com/12124
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-07-15 21:09:07 +00:00
Brad Fitzpatrick
2ae77376f7 all: link to https instead of http
The one in misc/makerelease/makerelease.go is particularly bad and
probably warrants rotating our keys.

I didn't update old weekly notes, and reverted some changes involving
test code for now, since we're late in the Go 1.5 freeze. Otherwise,
the rest are all auto-generated changes, and all manually reviewed.

Change-Id: Ia2753576ab5d64826a167d259f48a2f50508792d
Reviewed-on: https://go-review.googlesource.com/12048
Reviewed-by: Rob Pike <r@golang.org>
2015-07-11 14:36:33 +00:00
Austin Clements
4b2774f5ea runtime: make sysmon-triggered GC concurrent
sysmon triggers a GC if there has been no GC for two minutes.
Currently, this is a STW GC. There is no reason for this to be STW, so
make it concurrent.

Fixes #10261.

Change-Id: I92f3ac37272d5c2a31480ff1fa897ebad08775a9
Reviewed-on: https://go-review.googlesource.com/11955
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-07-09 05:53:21 +00:00
Russ Cox
2028077899 runtime: randomize scheduling in -race mode
Basic randomization of goroutine scheduling for -race mode.
It is probably possible to do much better (there's a paper linked
in the issue that I haven't read, for example), but this suffices
to introduce at least some unpredictability into the scheduling order.
The goal here is to have _something_ for Go 1.5, so that we don't
start hitting more of these scheduling order-dependent bugs
if we change the scheduler order again in Go 1.6.
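
The flavor of the change, reduced to a toy: under a race-mode flag,
perturb the order in which runnable work is picked (the real change
hooks into the scheduler, not a slice):

    package main

    import (
        "fmt"
        "math/rand"
    )

    func main() {
        runq := []string{"g1", "g2", "g3", "g4"}

        // Under a hypothetical race-mode flag, shuffle the run order so
        // tests cannot come to depend on one particular scheduling.
        raceMode := true
        if raceMode {
            rand.Shuffle(len(runq), func(i, j int) {
                runq[i], runq[j] = runq[j], runq[i]
            })
        }
        fmt.Println(runq)
    }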

For #11372.

Change-Id: Idf1154123fbd5b7a1ee4d339e93f97635cc2bacb
Reviewed-on: https://go-review.googlesource.com/11795
Reviewed-by: Austin Clements <austin@google.com>
2015-07-07 21:27:38 +00:00
Austin Clements
840965f8d7 runtime: always clear stack barriers on G exit
Currently the runtime fails to clear a G's stack barriers in gfput if
the G's stack allocation is _FixedStack bytes. This causes the runtime
to panic if the following sequence of events happens:

1) The runtime installs stack barriers on a G.

2) The G exits by calling runtime.Goexit. Since this does not
   necessarily return through the stack barriers installed on the G,
   there may still be untriggered stack barriers left on the G's stack,
   recorded in g.stkbar.

3) The runtime calls gfput to add the exiting G to the free pool. If
   the G's stack allocation is _FixedStack bytes, we fail to clear
   g.stkbar.

4) A new G starts and allocates the G that was just added to the free
   pool.

5) The new G begins to execute and overwrites the stack slots that had
   stack barriers in them.

6) The garbage collector enters mark termination, attempts to remove
   stack barriers from the new G, and finds that they've been
   overwritten.

Fix this by clearing the stack barriers in gfput in the case where it
reuses the stack.
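
The bug class is general: when an object goes back on a free list,
per-use bookkeeping has to be reset on every path, including the one
that keeps the old allocation. A stripped-down sketch:

    package main

    import "fmt"

    type g struct {
        stack  []byte
        stkbar []uintptr // recorded stack barriers; must not survive reuse
    }

    var gfree []*g

    // gfput sketch: even on the fast path that keeps the existing stack
    // allocation, the stack-barrier records are cleared before pooling.
    func gfput(gp *g) {
        gp.stkbar = gp.stkbar[:0] // the reset the original code missed
        gfree = append(gfree, gp)
    }

    func main() {
        gp := &g{stack: make([]byte, 2048), stkbar: []uintptr{0xdead}}
        gfput(gp)
        fmt.Println(len(gfree[0].stkbar)) // 0: no stale barriers left
    }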

Fixes #11256.

Change-Id: I377c44258900e6bcc2d4b3451845814a8eeb2bcf
Reviewed-on: https://go-review.googlesource.com/11461
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-06-29 15:02:30 +00:00
Alex Brainman
9d968cb47b runtime: rename cgocall_errno and asmcgocall_errno into cgocall and asmcgocall
Change-Id: I5917bea8bb35b0e725dcc56a68f3a70137cfc180
Reviewed-on: https://go-review.googlesource.com/9387
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-06-19 01:47:11 +00:00
Dmitry Vyukov
e72f5f67a1 runtime: fix tracing of syscallexit
There were two issues.
1. Delayed EvGoSysExit could have been emitted during TraceStart,
while it had not yet emitted EvGoInSyscall.
2. Delayed EvGoSysExit could have been emitted during next tracing session.

Fixes #10476
Fixes #11262

Change-Id: Iab68eb31cf38eb6eb6eee427f49c5ca0865a8c64
Reviewed-on: https://go-review.googlesource.com/9132
Reviewed-by: Russ Cox <rsc@golang.org>
2015-06-18 13:59:55 +00:00
Alex Brainman
2858b73843 runtime: remove cgocall and asmcgocall
In preparation for renaming cgocall_errno into cgocall and
asmcgocall_errno into asmcgocall in the following CL.
rsc requested CL 9387 to be split into two parts. This is the first part.

Change-Id: I7434f0e4b44dd37017540695834bfcb1eebf0b2f
Reviewed-on: https://go-review.googlesource.com/11166
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-06-18 04:42:53 +00:00
Russ Cox
cfa3eda587 runtime: fix race in scanvalid assertion
Change-Id: I389b2e10fe667eaa55f87b71b1e004994694d4a3
Reviewed-on: https://go-review.googlesource.com/11173
Reviewed-by: Austin Clements <austin@google.com>
2015-06-17 20:12:37 +00:00
Russ Cox
3c60e6e8cf runtime: fix races in stack scan
This fixes a hang during runtime.TestTraceStress.
It also fixes double-scan of stacks, which leads to
stack barrier installation failures.

Both of these have shown up as flaky failures on the dashboard.

Fixes #10941.

Change-Id: Ia2a5991ce2c9f43ba06ae1c7032f7c898dc990e0
Reviewed-on: https://go-review.googlesource.com/11089
Reviewed-by: Austin Clements <austin@google.com>
2015-06-17 17:56:26 +00:00
Ainar Garipov
7f9f70e5b6 all: fix misprints in comments
These were found by grepping the comments from the go code and feeding
the output to aspell.

Change-Id: Id734d6c8d1938ec3c36bd94a4dbbad577e3ad395
Reviewed-on: https://go-review.googlesource.com/10941
Reviewed-by: Aamir Khan <syst3m.w0rm@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2015-06-11 14:18:57 +00:00
Russ Cox
3ffcbb633e runtime: default GOMAXPROCS to NumCPU(), not 1
See golang.org/s/go15gomaxprocs for details.

Change-Id: I8de5df34fa01d31d78f0194ec78a2474c281243c
Reviewed-on: https://go-review.googlesource.com/10668
Reviewed-by: Rob Pike <r@golang.org>
2015-06-05 04:38:04 +00:00
Austin Clements
3f6e69aca5 runtime: steal space for stack barrier tracking from stack
The stack barrier code will need a bookkeeping structure to keep track
of the overwritten return PCs. This commit introduces and allocates
this structure, but does not yet use the structure.

We don't want to allocate space for this structure during garbage
collection, so this commit allocates it along with the allocation of
the corresponding stack. However, we can't do a regular allocation in
newstack because mallocgc may itself grow the stack (which would lead
to a recursive allocation). Hence, this commit makes the bookkeeping
structure part of the stack allocation itself by stealing the
necessary space from the top of the stack allocation. Since the size
of this bookkeeping structure is logarithmic in the size of the stack,
this has minimal impact on stack behavior.
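
Why the cost is small: barriers sit at exponentially spaced frames, so
the stolen bookkeeping needs only about log2(stack size) entries. Rough
arithmetic, with an assumed record size:

    package main

    import (
        "fmt"
        "math/bits"
    )

    func main() {
        const stackSize = 32 << 10 // a 32 KB stack allocation
        const entrySize = 16       // assumed bytes per barrier record

        // One record per power-of-two band of the stack.
        records := bits.Len(uint(stackSize))
        stolen := records * entrySize

        fmt.Printf("%d records, %d bytes stolen of %d (%.2f%%)\n",
            records, stolen, stackSize, 100*float64(stolen)/stackSize)
    }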

Change-Id: Ia14408be06aafa9ca4867f4e70bddb3fe0e96665
Reviewed-on: https://go-review.googlesource.com/10313
Reviewed-by: Russ Cox <rsc@golang.org>
2015-06-02 19:57:57 +00:00
Austin Clements
e610c25df0 runtime: decouple stack bounds and stack allocation size
Currently the runtime assumes that the allocation for the stack is
exactly [stack.lo, stack.hi). We're about to steal a small part of
this allocation for per-stack GC metadata. To prepare for this, this
commit adds a field to the G for the allocated size of the stack.
With this change, stack.lo and stack.hi continue to act as the true
bounds on the stack, but are no longer also used as the bounds on the
stack allocation.

(I also tried this the other way around, where stack.lo and stack.hi
remained the allocation bounds and I introduced a new top of stack.
However, there are far more places that assume stack.hi is the true
top of the stack than there are places that assume it's the top of the
allocation.)

Change-Id: Ifa9d956753be53d286d09cbc73d47fb34a18c0c6
Reviewed-on: https://go-review.googlesource.com/10312
Reviewed-by: Russ Cox <rsc@golang.org>
2015-06-02 19:57:50 +00:00
Elias Naur
84cfba17c2 runtime: don't always unblock all signals
Ian proposed an improved way of handling signals masks in Go, motivated
by a problem where the Android java runtime expects certain signals to
be blocked for all JVM threads. Discussion here

https://groups.google.com/forum/#!topic/golang-dev/_TSCkQHJt6g

Ian's text is used in the following:

A Go program always needs to have the synchronous signals enabled.
These are the signals for which _SigPanic is set in sigtable, namely
SIGSEGV, SIGBUS, SIGFPE.

A Go program that uses the os/signal package, and calls signal.Notify,
needs to have at least one thread which is not blocking that signal,
but it doesn't matter much which one.

Unix programs do not change signal mask across execve.  They inherit
signal masks across fork.  The shell uses this fact to some extent;
for example, the job control signals (SIGTTIN, SIGTTOU, SIGTSTP) are
blocked for commands run due to backquote quoting or $().

Our current position on signal masks was not thought out.  We wandered
into it step by step, e.g., http://golang.org/cl/7323067.

This CL does the following:

Introduce a new platform hook, msigsave, that saves the signal mask of
the current thread to m.sigsave.

Call msigsave from needm and newm.

In minit, set up the signal mask from m.sigsave and unblock the
essential synchronous signals, and SIGILL, SIGTRAP, SIGPROF, SIGSTKFLT
(for systems that have it).

In unminit, restore the signal mask from m.sigsave.

The first time that os/signal.Notify is called, start a new thread whose
only purpose is to update its signal mask to make sure signals for
signal.Notify are unblocked on at least one thread.

The effect on Go programs will be that if they are invoked with some
non-synchronous signals blocked, those signals will normally be
ignored.  Previously, those signals would mostly be ignored.  A change
in behaviour will occur for programs started with any of these signals
blocked, if they receive the signal: SIGHUP, SIGINT, SIGQUIT, SIGABRT,
SIGTERM.  Previously those signals would always cause a crash (unless
using the os/signal package); with this change, they will be ignored
if the program is started with the signal blocked (and does not use
the os/signal package).
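
From user code the contract is unchanged: a program that wants, say,
SIGINT or SIGTERM delivered still just calls signal.Notify, and that
call is what guarantees at least one thread leaves those signals
unblocked:

    package main

    import (
        "fmt"
        "os"
        "os/signal"
        "syscall"
    )

    func main() {
        c := make(chan os.Signal, 1)
        // Registering interest ensures the signals are deliverable even
        // if the process was started with them blocked.
        signal.Notify(c, os.Interrupt, syscall.SIGTERM)

        fmt.Println("waiting for a signal...")
        fmt.Println("got:", <-c)
    }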

./all.bash completes successfully on linux/amd64.

OpenBSD is missing the implementation.

Change-Id: I188098ba7eb85eae4c14861269cc466f2aa40e8c
Reviewed-on: https://go-review.googlesource.com/10173
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2015-05-22 20:24:08 +00:00
Rick Hudson
197aa9e64d runtime: remove unused quiesce code
This is dead code. If you want to quiesce the system the
preferred way is to use forEachP(func(*p){}).

Change-Id: Ic7677a5dd55e3639b99e78ddeb2c71dd1dd091fa
Reviewed-on: https://go-review.googlesource.com/10267
Reviewed-by: Austin Clements <austin@google.com>
2015-05-20 17:56:44 +00:00
Rick Hudson
913db7685e runtime: run background mark helpers only if work is available
Prior to this CL whenever the GC marking was enabled and
a P was looking for work we supplied a G to help
the GC do its marking tasks. Once this G finished all
the marking available it would release the P to find another
available G. In the case where there was no work, the P would drop
into findrunnable, which would execute the mark helper G; the helper
would immediately return, and the P would drop into findrunnable again,
repeating the process. Since the P was always given a G to run, it never
blocked. This CL first checks whether the GC mark helper G has available
work and, if not, the P immediately falls through to its blocking logic.

Fixes #10901

Change-Id: I94ac9646866ba64b7892af358888bc9950de23b5
Reviewed-on: https://go-review.googlesource.com/10189
Reviewed-by: Austin Clements <austin@google.com>
2015-05-19 15:57:50 +00:00
Austin Clements
f0dd002895 runtime: use separate count and note for forEachP
Currently, forEachP reuses the stopwait and stopnote fields from
stopTheWorld to track how many Ps have not responded to the safe-point
request and to sleep until all Ps have responded.

It was assumed this was safe because both stopTheWorld and forEachP
must occur under worldsema and hence stopwait and stopnote cannot
be used for both purposes simultaneously and callers could always
determine the appropriate use based on sched.gcwaiting (which is only
set by stopTheWorld). However, this is not the case, since it's
possible for there to be a window between when an M observes that
gcwaiting is set and when it checks stopwait during which stopwait
could have changed meanings. When this happens, the M decrements
stopwait and may wakeup stopnote, but does not otherwise participate
in the forEachP protocol. As a result, stopwait is decremented too
many times, so it may reach zero before all Ps have run the safe-point
function, causing forEachP to wake up early. It will then either
observe that some P has not run the safe-point function and panic with
"P did not run fn", or the remaining P (or Ps) will run the safe-point
function before it wakes up and it will observe that stopwait is
negative and panic with "not stopped".

Fix this problem by giving forEachP its own safePointWait and
safePointNote fields.

One known sequence of events that can cause this race is as
follows. It involves three actors:

G1 is running on M1 on P1. P1 has an empty run queue.

G2/M2 is in a blocked syscall and has lost its P. (The details of this
don't matter, it just needs to be in a position where it needs to grab
an idle P.)

GC just started on G3/M3/P3. (These aren't very involved, they just
have to be separate from the other G's, M's, and P's.)

1. GC calls stopTheWorld(), which sets sched.gcwaiting to 1.

Now G1/M1 begins to enter a syscall:

2. G1/M1 invokes reentersyscall, which sets the P1's status to
   _Psyscall.

3. G1/M1's reentersyscall observes gcwaiting != 0 and calls
   entersyscall_gcwait.

4. G1/M1's entersyscall_gcwait blocks acquiring sched.lock.

Back on GC:

5. stopTheWorld cas's P1's status to _Pgcstop, does other stuff, and
   returns.

6. GC does stuff and then calls startTheWorld().

7. startTheWorld() calls procresize(), which sets P1's status to
   _Pidle and puts P1 on the idle list.

Now G2/M2 returns from its syscall and takes over P1:

8. G2/M2 returns from its blocked syscall and gets P1 from the idle
   list.

9. G2/M2 acquires P1, which sets P1's status to _Prunning.

10. G2/M2 starts a new syscall and invokes reentersyscall, which sets
    P1's status to _Psyscall.

Back on G1/M1:

11. G1/M1 finally acquires sched.lock in entersyscall_gcwait.

At this point, G1/M1 still thinks it's running on P1. P1's status is
_Psyscall, which is consistent with what G1/M1 is doing, but it's
_Psyscall because *G2/M2* put it into _Psyscall, not G1/M1. This is
basically an ABA race on P1's status.

Because forEachP currently shares stopwait with stopTheWorld, G1/M1's
entersyscall_gcwait observes the non-zero stopwait set by forEachP,
but mistakes it for a stopTheWorld. It cas's P1's status from
_Psyscall (set by G2/M2) to _Pgcstop and proceeds to decrement
stopwait one more time than forEachP was expecting.

Fixes #10618. (See the issue for details on why the above race is safe
when forEachP is not involved.)

Prior to this commit, the command
  stress ./runtime.test -test.run TestFutexsleep\|TestGoroutineProfile
would reliably fail after a few hundred runs. With this commit, it
ran for over 2 million runs and never crashed.

Change-Id: I9a91ea20035b34b6e5f07ef135b144115f281f30
Reviewed-on: https://go-review.googlesource.com/10157
Reviewed-by: Russ Cox <rsc@golang.org>
2015-05-18 14:55:47 +00:00
Austin Clements
277acca286 runtime: hold worldsema while starting the world
Currently, startTheWorld releases worldsema before starting the
world. Since startTheWorld can change gomaxprocs after allowing Ps to
run, this means that gomaxprocs can change while another P holds
worldsema.

Unfortunately, the garbage collector and forEachP assume that holding
worldsema protects against changes in gomaxprocs (which it *almost*
does). In particular, this is causing somewhat frequent "P did not run
fn" crashes in forEachP in the runtime tests because gomaxprocs is
changing between the several loops that forEachP does over all the Ps.

Fix this by only releasing worldsema after the world is started.

This relates to issue #10618. forEachP still fails under stress
testing, but much less frequently.

Change-Id: I085d627b70cca9ebe9af28fe73b9872f1bb224ff
Reviewed-on: https://go-review.googlesource.com/10156
Reviewed-by: Russ Cox <rsc@golang.org>
2015-05-18 14:55:37 +00:00
Austin Clements
9c44a41dd5 runtime: disallow preemption during startTheWorld
Currently, startTheWorld clears preemptoff for the current M before
starting the world. A few callers increment m.locks around
startTheWorld, presumably to prevent preemption any time during
starting the world. This is almost certainly pointless (none of the
other callers do this), but there's no harm in making startTheWorld
keep preemption disabled until it's all done, which definitely lets us
drop these m.locks manipulations.

Change-Id: I8a93658abd0c72276c9bafa3d2c7848a65b4691a
Reviewed-on: https://go-review.googlesource.com/10155
Reviewed-by: Russ Cox <rsc@golang.org>
2015-05-18 14:55:31 +00:00
Austin Clements
a1da255aa0 runtime: factor stoptheworld/starttheworld pattern
There are several steps to stopping and starting the world and
currently they're open-coded in several places. The garbage collector
is the only thing that needs to stop and start the world in a
non-trivial pattern. Replace all other uses with calls to higher-level
functions that implement the entire pattern necessary to stop and
start the world.

This is a pure refactoring and should not change any code semantics.
In the following commits, we'll make changes that are easier to do
with this abstraction in place.

This commit renames the old starttheworld to startTheWorldWithSema.
This is a slight misnomer right now because the callers release
worldsema just before calling this. However, a later commit will swap
these and I don't want to think of another name in the meantime.

Change-Id: I5dc97f87b44fb98963c49c777d7053653974c911
Reviewed-on: https://go-review.googlesource.com/10154
Reviewed-by: Russ Cox <rsc@golang.org>
2015-05-18 14:55:25 +00:00
Austin Clements
a0fc306023 runtime: eliminate runqvictims and a copy from runqsteal
Currently, runqsteal steals Gs from another P into an intermediate
buffer and then copies those Gs into the current P's run queue. This
intermediate buffer itself was moved from the stack to the P in commit
c4fe503 to eliminate the cost of zeroing it on every steal.

This commit follows up c4fe503 by stealing directly into the current
P's run queue, which eliminates the copy and the need for the
intermediate buffer. The update to the tail pointer is only committed
once the entire steal operation has succeeded, so the semantics of
stealing do not change.

Change-Id: Icdd7a0eb82668980bf42c0154b51eef6419fdd51
Reviewed-on: https://go-review.googlesource.com/9998
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
2015-05-17 01:08:42 +00:00
Rick Hudson
c4fe503119 runtime: reduce thrashing of gs between ps
One important use case is a pipeline computation that passes values
from one Goroutine to the next and then exits or is placed in a
wait state. If GOMAXPROCS > 1 a Goroutine running on P1 will enable
another Goroutine and then immediately make P1 available to execute
it. We need to prevent other Ps from stealing the G that P1 is about
to execute. Otherwise the Gs can thrash between Ps causing unneeded
synchronization and slowing down throughput.

Fix this by changing the stealing logic so that when a P attempts to
steal the only G on some other P's run queue, it will pause
momentarily to allow the victim P to schedule the G.

As part of optimizing stealing we also use a per-P victim queue to
move stolen Gs. This eliminates the zeroing of a stack-local victim
queue, which turned out to be expensive.

This CL is a necessary but not sufficient prerequisite to changing
the default value of GOMAXPROCS to something > 1 which is another
CL/discussion.

For highly serialized programs, such as GoroutineRing below this can
make a large difference. For larger and more parallel programs such
as the x/benchmarks there is no noticeable detriment.

~/work/code/src/rsc.io/benchstat/benchstat old.txt new.txt
name                old mean              new mean              delta
GoroutineRing       30.2µs × (0.98,1.01)  30.1µs × (0.97,1.04)     ~    (p=0.941)
GoroutineRing-2      113µs × (0.91,1.07)    30µs × (0.98,1.03)  -73.17% (p=0.004)
GoroutineRing-4      144µs × (0.98,1.02)    32µs × (0.98,1.01)  -77.69% (p=0.000)
GoroutineRingBuf    32.7µs × (0.97,1.03)  32.5µs × (0.97,1.02)     ~    (p=0.795)
GoroutineRingBuf-2   120µs × (0.92,1.08)    33µs × (1.00,1.00)  -72.48% (p=0.004)
GoroutineRingBuf-4   138µs × (0.92,1.06)    33µs × (1.00,1.00)  -76.21% (p=0.003)

The bench benchmarks show little impact.
name          old       new
garbage   7032879   7011696
httpold     25509     25301
splayold  1022073   1019499
jsonold  28230624  28081433

Change-Id: I228c48fed8d85c9bbef16a7edc53ab7898506f50
Reviewed-on: https://go-review.googlesource.com/9872
Reviewed-by: Austin Clements <austin@google.com>
2015-05-13 12:55:24 +00:00
Austin Clements
350fd548b3 runtime: don't run runq tests on the system stack
Running these tests on the system stack is problematic because they
allocate Ps, which are large enough to overflow the system stack if
they are stack-allocated. It used to be necessary to run these tests
on the system stack because they were written in C, but since this is
no longer the case, we can fix this problem by simply not running the
tests on the system stack.

This also means we no longer need the hack in one of these tests that
forces the allocated Ps to escape to the heap, so eliminate that as
well.

Change-Id: I9064f5f8fd7f7b446ff39a22a70b172cfcb2dc57
Reviewed-on: https://go-review.googlesource.com/9923
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
2015-05-12 19:58:08 +00:00
Russ Cox
1635ab7dfe runtime: remove wbshadow mode
The write barrier shadow heap was very useful for
developing the write barriers initially, but it's no longer used,
clunky, and dragging the rest of the implementation down.

The gccheckmark mode will find bugs due to missed barriers
when they result in missed marks; wbshadow mode found the
missed barriers more aggressively, but it required an entire
separate copy of the heap. The gccheckmark mode requires
no extra memory, making it more useful in practice.

Compared to previous CL:
name                   old mean              new mean              delta
BinaryTree17            5.91s × (0.96,1.06)   5.72s × (0.97,1.03)  -3.12% (p=0.000)
Fannkuch11              4.32s × (1.00,1.00)   4.36s × (1.00,1.00)  +0.91% (p=0.000)
FmtFprintfEmpty        89.0ns × (0.93,1.10)  86.6ns × (0.96,1.11)    ~    (p=0.077)
FmtFprintfString        298ns × (0.98,1.06)   283ns × (0.99,1.04)  -4.90% (p=0.000)
FmtFprintfInt           286ns × (0.98,1.03)   283ns × (0.98,1.04)  -1.09% (p=0.032)
FmtFprintfIntInt        498ns × (0.97,1.06)   480ns × (0.99,1.02)  -3.65% (p=0.000)
FmtFprintfPrefixedInt   408ns × (0.98,1.02)   396ns × (0.99,1.01)  -3.00% (p=0.000)
FmtFprintfFloat         587ns × (0.98,1.01)   562ns × (0.99,1.01)  -4.34% (p=0.000)
FmtManyArgs            1.94µs × (0.99,1.02)  1.89µs × (0.99,1.01)  -2.85% (p=0.000)
GobDecode              15.8ms × (0.98,1.03)  15.7ms × (0.99,1.02)    ~    (p=0.251)
GobEncode              12.0ms × (0.96,1.09)  11.8ms × (0.98,1.03)  -1.87% (p=0.024)
Gzip                    648ms × (0.99,1.01)   647ms × (0.99,1.01)    ~    (p=0.688)
Gunzip                  143ms × (1.00,1.01)   143ms × (1.00,1.01)    ~    (p=0.203)
HTTPClientServer       90.3µs × (0.98,1.01)  89.1µs × (0.99,1.02)  -1.30% (p=0.000)
JSONEncode             31.6ms × (0.99,1.01)  31.7ms × (0.98,1.02)    ~    (p=0.219)
JSONDecode              107ms × (1.00,1.01)   111ms × (0.99,1.01)  +3.58% (p=0.000)
Mandelbrot200          6.03ms × (1.00,1.01)  6.01ms × (1.00,1.00)    ~    (p=0.077)
GoParse                6.53ms × (0.99,1.03)  6.54ms × (0.99,1.02)    ~    (p=0.585)
RegexpMatchEasy0_32     161ns × (1.00,1.01)   161ns × (0.98,1.05)    ~    (p=0.948)
RegexpMatchEasy0_1K     541ns × (0.99,1.01)   559ns × (0.98,1.01)  +3.32% (p=0.000)
RegexpMatchEasy1_32     138ns × (1.00,1.00)   137ns × (0.99,1.01)  -0.55% (p=0.001)
RegexpMatchEasy1_1K     887ns × (0.99,1.01)   878ns × (0.99,1.01)  -0.98% (p=0.000)
RegexpMatchMedium_32    253ns × (0.99,1.01)   252ns × (0.99,1.01)  -0.39% (p=0.001)
RegexpMatchMedium_1K   72.8µs × (1.00,1.00)  72.7µs × (1.00,1.00)    ~    (p=0.485)
RegexpMatchHard_32     3.85µs × (1.00,1.01)  3.85µs × (1.00,1.01)    ~    (p=0.283)
RegexpMatchHard_1K      117µs × (1.00,1.01)   117µs × (1.00,1.00)    ~    (p=0.175)
Revcomp                 922ms × (0.97,1.08)   903ms × (0.98,1.05)  -2.15% (p=0.021)
Template                126ms × (0.99,1.01)   126ms × (0.99,1.01)    ~    (p=0.943)
TimeParse               628ns × (0.99,1.01)   634ns × (0.99,1.01)  +0.92% (p=0.000)
TimeFormat              668ns × (0.99,1.01)   698ns × (0.98,1.03)  +4.53% (p=0.000)

It's nice that the microbenchmarks are the ones helped the most,
because those were the ones hurt the most by the conversion from
4-bit to 2-bit heap bitmaps. This CL brings the overall effect of that
process to (compared to CL 9706 patch set 1):

name                   old mean              new mean              delta
BinaryTree17            5.87s × (0.94,1.09)   5.72s × (0.97,1.03)  -2.57% (p=0.011)
Fannkuch11              4.32s × (1.00,1.00)   4.36s × (1.00,1.00)  +0.87% (p=0.000)
FmtFprintfEmpty        89.1ns × (0.95,1.16)  86.6ns × (0.96,1.11)    ~    (p=0.090)
FmtFprintfString        283ns × (0.98,1.02)   283ns × (0.99,1.04)    ~    (p=0.681)
FmtFprintfInt           284ns × (0.98,1.04)   283ns × (0.98,1.04)    ~    (p=0.620)
FmtFprintfIntInt        486ns × (0.98,1.03)   480ns × (0.99,1.02)  -1.27% (p=0.002)
FmtFprintfPrefixedInt   400ns × (0.99,1.02)   396ns × (0.99,1.01)  -0.84% (p=0.001)
FmtFprintfFloat         566ns × (0.99,1.01)   562ns × (0.99,1.01)  -0.80% (p=0.000)
FmtManyArgs            1.91µs × (0.99,1.02)  1.89µs × (0.99,1.01)  -1.10% (p=0.000)
GobDecode              15.5ms × (0.98,1.05)  15.7ms × (0.99,1.02)  +1.55% (p=0.005)
GobEncode              11.9ms × (0.97,1.03)  11.8ms × (0.98,1.03)  -0.97% (p=0.048)
Gzip                    648ms × (0.99,1.01)   647ms × (0.99,1.01)    ~    (p=0.627)
Gunzip                  143ms × (1.00,1.00)   143ms × (1.00,1.01)    ~    (p=0.482)
HTTPClientServer       89.2µs × (0.99,1.02)  89.1µs × (0.99,1.02)    ~    (p=0.740)
JSONEncode             32.3ms × (0.97,1.06)  31.7ms × (0.98,1.02)  -1.95% (p=0.002)
JSONDecode              106ms × (0.99,1.01)   111ms × (0.99,1.01)  +4.22% (p=0.000)
Mandelbrot200          6.02ms × (1.00,1.00)  6.01ms × (1.00,1.00)    ~    (p=0.417)
GoParse                6.57ms × (0.97,1.06)  6.54ms × (0.99,1.02)    ~    (p=0.404)
RegexpMatchEasy0_32     162ns × (1.00,1.00)   161ns × (0.98,1.05)    ~    (p=0.088)
RegexpMatchEasy0_1K     561ns × (0.99,1.02)   559ns × (0.98,1.01)  -0.47% (p=0.034)
RegexpMatchEasy1_32     145ns × (0.95,1.04)   137ns × (0.99,1.01)  -5.56% (p=0.000)
RegexpMatchEasy1_1K     864ns × (0.99,1.04)   878ns × (0.99,1.01)  +1.57% (p=0.000)
RegexpMatchMedium_32    255ns × (0.99,1.04)   252ns × (0.99,1.01)  -1.43% (p=0.001)
RegexpMatchMedium_1K   73.9µs × (0.98,1.04)  72.7µs × (1.00,1.00)  -1.55% (p=0.004)
RegexpMatchHard_32     3.92µs × (0.98,1.04)  3.85µs × (1.00,1.01)  -1.80% (p=0.003)
RegexpMatchHard_1K      120µs × (0.98,1.04)   117µs × (1.00,1.00)  -2.13% (p=0.001)
Revcomp                 936ms × (0.95,1.08)   903ms × (0.98,1.05)  -3.58% (p=0.002)
Template                130ms × (0.98,1.04)   126ms × (0.99,1.01)  -2.98% (p=0.000)
TimeParse               638ns × (0.98,1.05)   634ns × (0.99,1.01)    ~    (p=0.198)
TimeFormat              674ns × (0.99,1.01)   698ns × (0.98,1.03)  +3.69% (p=0.000)

Change-Id: Ia0e9b50b1d75a3c0c7556184cd966305574fe07c
Reviewed-on: https://go-review.googlesource.com/9706
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-05-11 14:55:11 +00:00
Daniel Morsing
db6f88a84b runtime: enable profiling on g0
Since we now have stack information for code running on the
systemstack, we can traceback over it. To make cpu profiles useful,
add a case in gentraceback to jump over systemstack switches.

Fixes #10609.

Change-Id: I21f47fcc802c07c5d4a1ada56374314e388a6dc7
Reviewed-on: https://go-review.googlesource.com/9506
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2015-05-11 08:44:30 +00:00
Michael Hudson-Doyle
fa896733b5 runtime: check consistency of all module data objects
Current code just checks the consistency (that the functab is correctly
sorted by PC, etc) of the moduledata object that the runtime belongs to.
Change to check all of them.
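
The invariant is simple once phrased per module: every module's
function table must be sorted by entry PC. A sketch over an
illustrative stand-in for moduledata:

    package main

    import (
        "fmt"
        "sort"
    )

    type module struct {
        name    string
        funcPCs []uintptr // entry PCs, required to be sorted
    }

    func checkModules(mods []*module) error {
        for _, m := range mods {
            ok := sort.SliceIsSorted(m.funcPCs, func(i, j int) bool {
                return m.funcPCs[i] < m.funcPCs[j]
            })
            if !ok {
                return fmt.Errorf("module %s: functab not sorted by PC", m.name)
            }
        }
        return nil
    }

    func main() {
        mods := []*module{
            {name: "main", funcPCs: []uintptr{0x1000, 0x1400, 0x2000}},
            {name: "extra", funcPCs: []uintptr{0x9000, 0x8000}}, // out of order
        }
        fmt.Println(checkModules(mods))
    }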

Change-Id: I544a44c5de7445fff87d3cdb4840ff04c5e2bf75
Reviewed-on: https://go-review.googlesource.com/9773
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-05-07 15:06:08 +00:00
Keith Randall
5a828cfcde runtime: let freezetheworld work even when gomaxprocs=1
Freezetheworld still has stuff to do when gomaxprocs=1.
In particular, signals can come in on other Ms (like the GC M, say)
and the single user M is still running.

Fixes #10546

Change-Id: I2f07f17d1c81e93cf905df2cb087112d436ca7e7
Reviewed-on: https://go-review.googlesource.com/9551
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2015-05-05 15:11:10 +00:00
Keith Randall
a55b131393 cmd/dist, runtime: Make stack guard larger for non-optimized builds
Kind of a hack, but makes the non-optimized builds pass.

Fixes #10079

Change-Id: I26f41c546867f8f3f16d953dc043e784768f2aff
Reviewed-on: https://go-review.googlesource.com/9552
Reviewed-by: Russ Cox <rsc@golang.org>
2015-05-01 15:41:55 +00:00
David Chase
7fbb1b36c3 cmd/internal/gc: improve flow of input params to output params
This includes the following information in the per-function summary:

outK = paramJ   encoded in outK bits for paramJ
outK = *paramJ  encoded in outK bits for paramJ
heap = paramJ   EscHeap
heap = *paramJ  EscContentEscapes

Note that (currently) if the address of a parameter is taken and
returned, necessarily a heap allocation occurred to contain that
reference, and the heap can never refer to stack, therefore the
parameter and everything downstream from it escapes to the heap.
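
For a concrete feel of "out = param" versus "heap = param", consider
ordinary code like the following; building with -gcflags=-m reports
which arguments leak to the result or to the heap (exact wording varies
by compiler version):

    package main

    type T struct{ x int }

    // identity: the parameter flows to the result (out = param); whether
    // the argument escapes then depends on what the caller does with it.
    func identity(p *T) *T { return p }

    var sink *T

    // keep: the parameter flows to a global (heap = param), so whatever
    // it points to must be heap allocated.
    func keep(p *T) { sink = p }

    func main() {
        a := T{x: 1}
        _ = identity(&a) // result unused: &a can stay on the stack
        b := T{x: 2}
        keep(&b) // retained globally: b escapes to the heap
    }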

The per-function summary information now has a tuneable number of bits
(2 is probably noticeably better than 1, 3 is likely overkill, but it
is now easy to check and the -m debugging output includes information
that allows you to figure out if more would be better.)

A new test was added to check pointer flow through struct-typed and
*struct-typed parameters and returns; some of these are sensitive to
the number of summary bits, and ought to yield better results with a
more competent escape analysis algorithm.  Another new test checks
(some) correctness with array parameters, results, and operations.

The old analysis inferred a piece of plan9 runtime was non-escaping by
counteracting overconservative analysis with buggy analysis; with the
bug fixed, the result was too conservative (and it's not easy to fix
in this framework) so the source code was tweaked to get the desired
result.  A test was added against the discovered bug.

The escape analysis was further improved splitting the "level" into
3 parts, one tracking the conventional "level" and the other two
computing the highest-level-suffix-from-copy, which is used to
generally model the cancelling effect of indirection applied to
address-of.

With the improved escape analysis enabled, it was necessary to
modify one of the runtime tests because it now attempts to allocate
too much on the (small, fixed-size) G0 (system) stack and this
failed the test.

Compiling src/std after touching src/runtime/*.go with -m logging
turned on shows 420 fewer heap allocation sites (10538 vs 10968).

Profiling allocations in src/html/template with
for i in {1..5} ;
  do go tool 6g -memprofile=mastx.${i}.prof  -memprofilerate=1 *.go;
  go tool pprof -alloc_objects -text  mastx.${i}.prof ;
done

showed a 15% reduction in allocations performed by the compiler.

Update #3753
Update #4720
Fixes #10466

Change-Id: I0fd97d5f5ac527b45f49e2218d158a6e89951432
Reviewed-on: https://go-review.googlesource.com/8202
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-05-01 13:47:20 +00:00
Austin Clements
33e0f3d853 runtime: fix some out of date comments and typos
Change-Id: I061057414c722c5a0f03c709528afc8554114db6
Reviewed-on: https://go-review.googlesource.com/9367
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-27 20:08:38 +00:00
Austin Clements
1b01910c06 runtime: rename gcController.findRunnable to findRunnableGCWorker
This avoids confusion with the main findrunnable in the scheduler.

Change-Id: I8cf40657557a8610a2fe5a2f74598518256ca7f0
Reviewed-on: https://go-review.googlesource.com/9305
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-27 19:26:42 +00:00
Austin Clements
bb6320535d runtime: replace STW for enabling write barriers with ragged barrier
Currently, we use a full stop-the-world around enabling write
barriers. This is to ensure that all Gs have enabled write barriers
before any blackening occurs (either in gcBgMarkWorker() or in
gcAssistAlloc()).

However, there's no need to bring the whole world to a synchronous
stop to ensure this. This change replaces the STW with a ragged
barrier that ensures each P has individually observed that write
barriers should be enabled before GC performs any blackening.

Change-Id: If2f129a6a55bd8bdd4308067af2b739f3fb41955
Reviewed-on: https://go-review.googlesource.com/8207
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-27 19:26:37 +00:00
Austin Clements
57afa76471 runtime: add ragged global barrier function
This adds forEachP, which performs a general-purpose ragged global
barrier. forEachP takes a callback and invokes it for every P at a GC
safe point.

Ps that are idle or in a syscall are considered to be at a continuous
safe point. forEachP ensures that these Ps do not change state by
forcing all syscall Ps into idle and holding the sched.lock.

To ensure that Ps do not enter syscall or idle without running the
safe-point function, this adds checks for a pending callback every
place there is currently a gcwaiting check.

We'll use forEachP to replace the STW around enabling the write
barrier and to replace the current asynchronous per-M wbuf cache with
a cooperatively managed per-P gcWork cache.
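
A ragged barrier in miniature: the initiator counts the workers that
still owe a visit, each one runs the callback at its own next safe
point and decrements the count, and the initiator sleeps until the
count drains. A sketch using goroutines and a channel in place of Ps
and the runtime's note:

    package main

    import (
        "fmt"
        "sync/atomic"
    )

    // forEachWorker asks every worker to run fn at its next safe point
    // and returns only once all of them have done so.
    func forEachWorker(workers []chan func(), fn func(id int)) {
        remaining := int32(len(workers))
        done := make(chan struct{})
        for i, w := range workers {
            i := i
            w <- func() { // runs at the worker's own safe point
                fn(i)
                if atomic.AddInt32(&remaining, -1) == 0 {
                    close(done)
                }
            }
        }
        <-done // the "note" the initiator sleeps on
    }

    func main() {
        workers := make([]chan func(), 3)
        for i := range workers {
            ch := make(chan func(), 1)
            workers[i] = ch
            go func() { (<-ch)() }() // worker reduced to one safe point
        }
        forEachWorker(workers, func(id int) {
            fmt.Println("safe-point fn ran on worker", id)
        })
    }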

Change-Id: Ie944f8ce1fead7c79bf271d2f42fcd61a41bb3cc
Reviewed-on: https://go-review.googlesource.com/8206
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-27 19:26:33 +00:00
Austin Clements
b0b1a66052 runtime: reset spinning in mspinning if work was ready()ed
This fixes a bug where the runtime ready()s a goroutine while setting
up a new M that's initially marked as spinning, causing the scheduler
to later panic when it finds work in the run queue of a P associated
with a spinning M. Specifically, the sequence of events that can lead
to this is:

1) sysmon calls handoffp to hand off a P stolen from a syscall.

2) handoffp sees no pending work on the P, so it calls startm with
   spinning set.

3) startm calls newm, which in turn calls allocm to allocate a new M.

4) allocm "borrows" the P we're handing off in order to do allocation
   and performs this allocation.

5) This allocation may assist the garbage collector, and this assist
   may detect the end of concurrent mark and ready() the main GC
   goroutine to signal this.

6) This ready()ing puts the GC goroutine on the run queue of the
   borrowed P.

7) newm starts the OS thread, which runs mstart and subsequently
   mstart1, which marks the M spinning because startm was called with
   spinning set.

8) mstart1 enters the scheduler, which panics because there's work on
   the run queue, but the M is marked spinning.

To fix this, before marking the M spinning in step 7, add a check to
see if work has been added to the P's run queue. If this is the case,
undo the spinning instead.

Fixes #10573.

Change-Id: I4670495ae00582144a55ce88c45ae71de597cfa5
Reviewed-on: https://go-review.googlesource.com/9332
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
2015-04-27 12:49:54 +00:00
Austin Clements
2a46f55b35 runtime: panic when idling a P with runnable Gs
This adds a check that we never put a P on the idle list when it has
work on its local run queue.

Change-Id: Ifcfab750de60c335148a7f513d4eef17be03b6a7
Reviewed-on: https://go-review.googlesource.com/9324
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
2015-04-27 12:49:49 +00:00
Austin Clements
0e6a6c510f runtime: simplify process for starting GC goroutine
Currently, when allocation reaches the GC trigger, the runtime uses
readyExecute to start the GC goroutine immediately rather than wait
for the scheduler to get around to the GC goroutine while the mutator
continues to grow the heap.

Now that the scheduler runs the most recently readied goroutine when a
goroutine yields its time slice, this rigmarole is no longer
necessary. The runtime can simply ready the GC goroutine and yield
from the readying goroutine.

Change-Id: I3b4ebadd2a72a923b1389f7598f82973dd5c8710
Reviewed-on: https://go-review.googlesource.com/9292
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
2015-04-24 15:13:05 +00:00
Austin Clements
e870f06c3f runtime: yield time slice to most recently readied G
Currently, when the runtime ready()s a G, it adds it to the end of the
current P's run queue and continues running. If there are many other
things in the run queue, this can result in a significant delay before
the ready()d G actually runs and can hurt fairness when other Gs in
the run queue are CPU hogs. For example, if there are three Gs sharing
a P, one of which is a CPU hog that never voluntarily gives up the P
and the other two of which are doing small amounts of work and
communicating back and forth on an unbuffered channel, the two
communicating Gs will get very little CPU time.

Change this so that when G1 ready()s G2 and then blocks, the scheduler
immediately hands off the remainder of G1's time slice to G2. In the
above example, the two communicating Gs will now act as a unit and
together get half of the CPU time, while the CPU hog gets the other
half of the CPU time.

This fixes the problem demonstrated by the ping-pong benchmark added
in the previous commit:

benchmark                old ns/op     new ns/op     delta
BenchmarkPingPongHog     684287        825           -99.88%
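
The shape of that ping-pong benchmark, roughly (not the exact code in
the repo): two goroutines bounce a token on an unbuffered channel while
a third spins, all sharing one P:

    package main

    import (
        "fmt"
        "runtime"
        "time"
    )

    func main() {
        runtime.GOMAXPROCS(1)

        stop := make(chan struct{})
        go func() { // the CPU hog sharing the single P
            for {
                select {
                case <-stop:
                    return
                default:
                }
            }
        }()

        ping, pong := make(chan int), make(chan int)
        go func() {
            for v := range ping {
                pong <- v + 1
            }
        }()

        start := time.Now()
        const rounds = 1000
        for i := 0; i < rounds; i++ {
            ping <- i
            <-pong
        }
        close(stop)
        fmt.Println("per round:", time.Since(start)/rounds)
    }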

On the x/benchmarks suite, this change improves the performance of
garbage by ~6% (for GOMAXPROCS=1 and 4), and json by 28% and 36% for
GOMAXPROCS=1 and 4. It has negligible effect on heap size.

This has no effect on the go1 benchmark suite since those benchmarks
are mostly single-threaded.

Change-Id: I858a08eaa78f702ea98a5fac99d28a4ac91d339f
Reviewed-on: https://go-review.googlesource.com/9289
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-24 15:12:52 +00:00
Austin Clements
e5e52f4f2c runtime: factor checking if P run queue is empty
There are a variety of places where we check if a P's run queue is
empty. This test is about to get slightly more complicated, so factor
it out into a new function, runqempty. This function is inlinable, so
this has no effect on performance.
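
The factored-out test, in spirit; the field names mirror the runtime's
ring-buffer run queue, but this is only a sketch and omits details
added later (such as the runnext slot):

    package main

    import "fmt"

    // p sketches the scheduler's per-P run queue: a fixed ring buffer
    // indexed by monotonically increasing head and tail counters.
    type p struct {
        runqhead uint32
        runqtail uint32
        runq     [256]uintptr
    }

    // runqempty reports whether pp has nothing in its local run queue.
    // Keeping it this small is what lets the compiler inline it.
    func runqempty(pp *p) bool {
        return pp.runqhead == pp.runqtail
    }

    func main() {
        pp := &p{}
        fmt.Println(runqempty(pp)) // true
        pp.runq[pp.runqtail%uint32(len(pp.runq))] = 1
        pp.runqtail++
        fmt.Println(runqempty(pp)) // false
    }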

Change-Id: If4a0b01ffbd004937de90d8d686f6ded4aad2c6b
Reviewed-on: https://go-review.googlesource.com/9287
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-24 15:12:42 +00:00
Austin Clements
e0c3d85f08 runtime: fix background marking at 25% utilization
Currently, in accordance with the GC pacing proposal, we schedule
background marking with a goal of achieving 25% utilization *total*
between mutator assists and background marking. This is stricter than
was set out in the Go 1.5 proposal, which suggests that the garbage
collector can use 25% just for itself and anything the mutator does to
help out is on top of that. It also has several technical
drawbacks. Because mutator assist time is constantly changing and we
can't have instantaneous information on background marking time, it
effectively requires hitting a moving target based on out-of-date
information. This works out in the long run, but works poorly for
short GC cycles and on short time scales. Also, this requires
time-multiplexing all Ps between the mutator and background GC since
the goal utilization of background GC constantly fluctuates. This
results in a complicated scheduling algorithm, poor affinity, and
extra overheads from context switching.

This change modifies the way we schedule and run background marking so
that background marking always consumes 25% of GOMAXPROCS and mutator
assist is in addition to this. This enables a much more robust
scheduling algorithm where we pre-determine the number of Ps we should
dedicate to background marking as well as the utilization goal for a
single floating "remainder" mark worker.
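
The arithmetic behind "dedicated plus one fractional worker": with a
25% goal, any GOMAXPROCS that is not a multiple of four leaves a
remainder for a single partially-utilized worker to absorb. A worked
example with illustrative names (the real pacer rounds a little
differently):

    package main

    import "fmt"

    func main() {
        const goal = 0.25 // background marking's share of the CPU

        for _, procs := range []int{1, 2, 4, 6, 8} {
            target := goal * float64(procs)
            dedicated := int(target)                  // whole Ps devoted to marking
            fractional := target - float64(dedicated) // goal for the floating worker
            fmt.Printf("GOMAXPROCS=%d: %d dedicated, fractional goal %.2f\n",
                procs, dedicated, fractional)
        }
    }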

Change-Id: I187fa4c03ab6fe78012a84d95975167299eb9168
Reviewed-on: https://go-review.googlesource.com/9013
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-21 15:35:50 +00:00
Austin Clements
8d03acce54 runtime: multi-threaded, utilization-scheduled background mark
Currently, the concurrent mark phase is performed by the main GC
goroutine. Prior to the previous commit enabling preemption, this
caused marking to always consume 1/GOMAXPROCS of the available CPU
time. If GOMAXPROCS=1, this meant background GC would consume 100% of
the CPU (effectively a STW). If GOMAXPROCS>4, background GC would use
less than the goal of 25%. If GOMAXPROCS=4, background GC would use
the goal 25%, but if the mutator wasn't using the remaining 75%,
background marking wouldn't take advantage of the idle time. Enabling
preemption in the previous commit made GC miss CPU targets in
completely different ways, but set us up to bring everything back in
line.

This change replaces the fixed GC goroutine with per-P background mark
goroutines. Once started, these goroutines don't go in the standard
run queues; instead, they are scheduled specially such that the time
spent in mutator assists and the background mark goroutines totals 25%
of the CPU time available to the program. Furthermore, this lets
background marking take advantage of idle Ps, which significantly
boosts GC performance for applications that under-utilize the CPU.

This requires also changing how time is reported for gctrace, so this
change splits the concurrent mark CPU time into assist/background/idle
scanning.

This also requires increasing the size of the StackRecord slice used
in a GoroutineProfile test.

Change-Id: I0936ff907d2cee6cb687a208f2df47e8988e3157
Reviewed-on: https://go-review.googlesource.com/8850
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-21 15:35:32 +00:00
Russ Cox
181e26b9fa runtime: replace func-based write barrier skipping with type-based
This CL revises CL 7504 to use explicitly uintptr types for the
struct fields that are going to be updated sometimes without
write barriers. The result is that the fields are now updated *always*
without write barriers.

This approach has two important properties:

1) Now the GC never looks at the field, so if the missing reference
could cause a problem, it will do so all the time, not just when the
write barrier is missed at just the right moment.

2) Now a write barrier never happens for the field, avoiding the
(correct) detection of inconsistent write barriers when GODEBUG=wbshadow=1.
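
The technique in isolation: declare the word as uintptr so the GC never
treats it as a pointer and no write barrier is emitted for stores to
it; that is only safe when something else keeps the object alive. A
sketch (unsafe is used purely for illustration):

    package main

    import (
        "fmt"
        "unsafe"
    )

    type buf struct{ n int }

    type holder struct {
        owner *buf    // pointer-typed: scanned by the GC, write barrier applies
        cache uintptr // uintptr-typed: invisible to the GC, no write barrier
    }

    func main() {
        b := &buf{n: 42}
        h := &holder{}
        h.owner = b                          // this reference keeps b alive
        h.cache = uintptr(unsafe.Pointer(b)) // shadow word, safe only while owner does
        fmt.Println(h.owner.n, h.cache != 0)
    }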

Change-Id: Iebd3962c727c0046495cc08914a8dc0808460e0e
Reviewed-on: https://go-review.googlesource.com/9019
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2015-04-20 20:20:09 +00:00
Ian Lance Taylor
725aa3451a runtime: no deadlock error if buildmode=c-archive or c-shared
Change-Id: I4ee6dac32bd3759aabdfdc92b235282785fbcca9
Reviewed-on: https://go-review.googlesource.com/9083
Reviewed-by: David Crawshaw <crawshaw@golang.org>
2015-04-20 17:31:44 +00:00
Austin Clements
c1c667542c runtime: fix dangling pointer in readyExecute
readyExecute passes a closure to mcall that captures an argument to
readyExecute. Since mcall is marked noescape, this closure lives on
the stack of the calling goroutine. However, the closure puts the
calling goroutine on the run queue (and switches to a new
goroutine). If the calling goroutine gets scheduled before the mcall
returns, this stack-allocated closure will become invalid while it's
still executing. One consequence of this we've observed is that the
captured gp variable can get overwritten before the call to
execute(gp), causing execute(gp) to segfault.

Fix this by passing the currently captured gp variable through a field
in the calling goroutine's g struct so that the func is no longer a
closure.

To prevent problems like this in the future, this change also removes
the go:noescape annotation from mcall. Due to a compiler bug, this
will currently cause a func closure passed to mcall to be implicitly
allocated rather than refusing the implicit allocation. However, this
is okay because there are no other closures passed to mcall right now
and the compiler bug will be fixed shortly.

Fixes #10428.

Change-Id: I49b48b85de5643323b89e9eaa4df63854e968c32
Reviewed-on: https://go-review.googlesource.com/8866
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-17 17:59:14 +00:00
Austin Clements
a23a341e10 runtime: make time slice a const
A G will be preempted if it runs for 10ms without blocking. Currently
this constant is hard-coded in retake. Move it to a global const.
We'll use the time slice length in scheduling background GC.

Change-Id: I79a979948af2fad3afe5df9d4af4062f166554b7
Reviewed-on: https://go-review.googlesource.com/8838
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-14 22:06:32 +00:00
Austin Clements
4b956ae317 runtime: start concurrent GC promptly when we reach its trigger
Currently, when allocation reaches the concurrent GC trigger size, we
start the concurrent collector by ready'ing its G. This simply puts it
on the end of the P's run queue, which means we may not actually start
GC for some time as the current G continues to run and then the P
drains other Gs already on its run queue. Since the mutator can
continue to allocate, the heap can potentially be much larger than we
intended by the time GC actually starts. Furthermore, how much larger
is difficult to predict since it depends on the scheduler.

Fix this by preempting the current G and switching directly to the
concurrent GC G as soon as we reach the trigger heap size.

On the garbage benchmark from the benchmarks subrepo with
GOMAXPROCS=4, this reduces the time from triggering the GC to the
beginning of sweep termination by 10 to 30 milliseconds, which reduces
allocation after the trigger by up to 10MB (a large fraction of the
64MB live heap the benchmark tries to maintain).

One other known source of delay before we "really" start GC is the
sweep finalization performed before sweep termination. This has
similar negative effects on heap size and predictability, but is an
orthogonal problem. This change adds a TODO for this.

Change-Id: I8bae98cb43685c1bf353ff55868e4647e3743c47
Reviewed-on: https://go-review.googlesource.com/8513
Reviewed-by: Rick Hudson <rlh@golang.org>
2015-04-10 18:22:52 +00:00
Austin Clements
7c37249639 runtime: make test for freezetheworld more precise
exitsyscallfast checks for freezetheworld, but does so only by
checking if stopwait is positive. This can also happen during
stoptheworld, which is harmless, but confusing. Shortly, it will be
important that we get to the p.status cas even if stopwait is set.

Hence, make this test more specific so it only triggers with
freezetheworld and not other uses of stopwait.

Change-Id: Ibb722cd8360c3ed5a9654482519e3ceb87a8274d
Reviewed-on: https://go-review.googlesource.com/8205
Reviewed-by: Russ Cox <rsc@golang.org>
2015-04-10 18:02:55 +00:00