Stowage/go - Remotebranch.eu

mirror of https://github.com/golang/go.git synced 2025-12-08 06:10:04 +00:00

Author	SHA1	Message	Date
Austin Clements	dc0f0ab70f	runtime: don't count manually-managed spans from heap_{inuse,sys} Currently, manually-managed spans are included in memstats.heap_inuse and memstats.heap_sys, but when we export these stats to the user, we subtract out how much has been allocated for stack spans from both. This works for now because stacks are the only manually-managed spans we have. However, we're about to use manually-managed spans for more things that don't necessarily have obvious stats we can use to adjust the user-presented numbers. Prepare for this by changing the accounting so manually-managed spans don't count toward heap_inuse or heap_sys. This makes these fields align with the fields presented to the user and means we don't have to track more statistics just so we can adjust these statistics. For #19325. Change-Id: I5cb35527fd65587ff23339276ba2c3969e2ad98f Reviewed-on: https://go-review.googlesource.com/38577 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-04-13 18:20:38 +00:00
Austin Clements	407c56ae9f	runtime: generalize {alloc,free}Stack to {alloc,free}Manual We're going to start using manually-managed spans for GC workbufs, so rename the allocate/free methods and pass in a pointer to the stats to use instead of using the stack stats directly. For #19325. Change-Id: I37df0147ae5a8e1f3cb37d59c8e57a1fcc6f2980 Reviewed-on: https://go-review.googlesource.com/38576 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-04-13 18:20:35 +00:00
Austin Clements	ab9db51e1c	runtime: rename mspan.stackfreelist -> manualFreeList We're going to use this free list for other types of manually-managed memory in the heap. For #19325. Change-Id: Ib7e682295133eabfddf3a84f44db43d937bfdd9c Reviewed-on: https://go-review.googlesource.com/38575 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-04-13 18:20:33 +00:00
Austin Clements	8fbaa4f70b	runtime: rename _MSpanStack -> _MSpanManual We're about to generalize _MSpanStack to be used for other forms of in-heap manual memory management in the runtime. This is an automated rename of _MSpanStack to _MSpanManual plus some comment fix-ups. For #19325. Change-Id: I1e20a57bb3b87a0d324382f92a3e294ffc767395 Reviewed-on: https://go-review.googlesource.com/38574 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-04-13 18:20:30 +00:00
Austin Clements	6c6f455f88	runtime: consolidate changes to arena_used Changing mheap_.arena_used requires several steps that are currently repeated multiple times in mheap_.sysAlloc. Consolidate these into a single function. In the future, this will also make it easier to add other auxiliary VM structures. Change-Id: Ie68837d2612e1f4ba4904acb1b6b832b15431d56 Reviewed-on: https://go-review.googlesource.com/40151 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-04-11 01:35:47 +00:00
Daniel Martí	4e7724b2db	runtime: remove unused parameter from bestFitTreap This code was added recently, and it doesn't seem like the parameter will be useful in the near future. Change-Id: I5d64dadb6820c159b588262ab90df2461b5fdf04 Reviewed-on: https://go-review.googlesource.com/39692 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-04-06 17:20:43 +00:00
Austin Clements	9741f0275c	runtime: initialize more fields of stack spans Stack spans don't internally use many of the fields of the mspan, which means things like the size class and element size get left over from whatever last used the mspan. This can lead to confusing crashes and debugging. Zero these fields or initialize them to something reasonable. This also lets us simplify some code that currently has to distinguish between heap and stack spans. Change-Id: I9bd114e76c147bb32de497045b932f8bf1988bbf Reviewed-on: https://go-review.googlesource.com/38573 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-04-05 19:17:41 +00:00
Austin Clements	44ed88a5a7	runtime: track the number of active sweepone calls sweepone returns ^uintptr(0) when there are no more spans to start sweeping, but there may be spans being swept concurrently at the time and there's currently no efficient way to tell when the sweeper is done sweeping all the spans. We'll need this for concurrent runtime.GC(), so add a count of the number of active sweepone calls to make it possible to block until sweeping is truly done. This is also useful for more accurately printing the gcpacertrace, since that should be printed after all of the sweeping stats are in (currently we can print it slightly too early). For #18216. Change-Id: I06e6240c9e7b40aca6fd7b788bb6962107c10a0f Reviewed-on: https://go-review.googlesource.com/37716 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-03-31 01:15:18 +00:00
Austin Clements	786eb5b754	runtime: make debug.FreeOSMemory call runtime.GC() Currently freeOSMemory calls gcStart directly, but we really just want it to behave like runtime.GC() and then perform a scavenge, so make it call runtime.GC() rather than gcStart. For #18216. Change-Id: I548ec007afc788e87d383532a443a10d92105937 Reviewed-on: https://go-review.googlesource.com/37518 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-03-31 01:15:10 +00:00
Austin Clements	29be3f1999	runtime: generalize GC trigger Currently the GC triggering condition is an awkward combination of the gcMode (whether or not it's gcBackgroundMode) and a boolean "forceTrigger" flag. Replace this with a new gcTrigger type that represents the range of transition predicates we need. This has several advantages: 1. We can remove the awkward logic that affects the trigger behavior based on the gcMode. Now gcMode purely controls whether to run a STW GC or not and the gcTrigger controls whether this is a forced GC that cannot be consolidated with other GC cycles. 2. We can lift the time-based triggering logic in sysmon to just another type of GC trigger and move the logic to the trigger test. 3. This sets us up to have a cycle count-based trigger, which we'll use to make runtime.GC trigger concurrent GC with the desired consolidation properties. For #18216. Change-Id: If9cd49349579a548800f5022ae47b8128004bbfc Reviewed-on: https://go-review.googlesource.com/37516 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-03-31 01:15:06 +00:00
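The idea in the commit above, a single trigger value whose predicate decides whether a GC cycle should start, can be sketched as follows. This is an illustration only: the names `triggerKind`, `trigger`, `gcState`, and their fields are hypothetical and are not taken from the runtime source.

```go
package main

import "fmt"

// triggerKind is a hypothetical enumeration of reasons to start a GC cycle.
type triggerKind int

const (
	triggerAlways triggerKind = iota // forced cycle: start unconditionally
	triggerHeap                      // live heap reached the trigger size
	triggerTime                      // too long since the last cycle
)

// trigger bundles the kind with the data its predicate needs.
type trigger struct {
	kind triggerKind
	now  int64 // current time in ns; only used by triggerTime
}

// gcState is a stand-in for the collector state the predicates inspect.
type gcState struct {
	heapLive    uint64 // bytes currently live
	heapTrigger uint64 // heap size at which a cycle should start
	lastGC      int64  // time of the last cycle in ns (0 = never)
	forcePeriod int64  // maximum ns allowed between cycles
}

// test reports whether the transition predicate for t holds.
func (t trigger) test(s *gcState) bool {
	switch t.kind {
	case triggerAlways:
		return true
	case triggerHeap:
		return s.heapLive >= s.heapTrigger
	case triggerTime:
		return s.lastGC != 0 && t.now-s.lastGC > s.forcePeriod
	}
	return false
}

func main() {
	s := &gcState{heapLive: 10, heapTrigger: 20, lastGC: 100, forcePeriod: 50}
	fmt.Println(trigger{kind: triggerHeap}.test(s))
	fmt.Println(trigger{kind: triggerTime, now: 200}.test(s))
}
```

Keeping the predicate in one place is what lets callers such as a periodic monitor construct a trigger and ask the same question as the allocator does.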
Rick Hudson	6e9ec14186	runtime: redo insert/remove of large spans
Currently for spans of up to 1 MByte (128 pages) we maintain an array indexed by the number of pages in the span. This is efficient both in terms of space as well as time to insert or remove a span of a particular size. Unfortunately for spans larger than 1 MByte we currently place them on a separate linked list. This results in O(n) behavior. Now that we are seeing heaps approaching 100 GBytes n is large enough to be noticed in real programs.
This change replaces the linked list now used with a balanced binary tree structure called a treap. A treap is a probabilistically balanced tree offering O(logN) behavior for inserting and removing spans.
To verify that this approach will work we start with noting that only spans with sizes > 1MByte will be put into the treap. This means that to support 1 TByte a treap will need at most 1 million nodes and can ideally be held in a treap with a depth of 20. Experiments with adding and removing randomly sized spans from the treap seem to result in treaps with depths of about twice the ideal or 40. A petabyte would require a tree of only twice that depth again, so this algorithm should last well into the future.
Fixes #19393.
Go1 benchmarks indicate this is basically an overall wash.
Tue Mar 28 21:29:21 EDT 2017
name                      old time/op    new time/op    delta
BinaryTree17-4            2.42s ± 1%     2.42s ± 1%     ~        (p=0.980 n=21+21)
Fannkuch11-4              3.00s ± 1%     3.18s ± 4%     +6.10%   (p=0.000 n=22+24)
FmtFprintfEmpty-4         40.5ns ± 1%    40.3ns ± 3%    ~        (p=0.692 n=22+25)
FmtFprintfString-4        65.9ns ± 3%    64.6ns ± 1%    -1.98%   (p=0.000 n=24+23)
FmtFprintfInt-4           69.6ns ± 1%    68.0ns ± 7%    -2.30%   (p=0.001 n=21+22)
FmtFprintfIntInt-4        102ns ± 2%     99ns ± 1%      -3.07%   (p=0.000 n=23+23)
FmtFprintfPrefixedInt-4   126ns ± 0%     125ns ± 0%     -0.79%   (p=0.000 n=19+17)
FmtFprintfFloat-4         206ns ± 2%     205ns ± 1%     ~        (p=0.671 n=23+21)
FmtManyArgs-4             441ns ± 1%     445ns ± 1%     +0.88%   (p=0.000 n=22+23)
GobDecode-4               5.73ms ± 1%    5.86ms ± 1%    +2.37%   (p=0.000 n=23+22)
GobEncode-4               4.51ms ± 1%    4.89ms ± 1%    +8.32%   (p=0.000 n=22+22)
Gzip-4                    197ms ± 0%     202ms ± 1%     +2.75%   (p=0.000 n=23+24)
Gunzip-4                  32.9ms ± 8%    32.7ms ± 2%    ~        (p=0.466 n=23+24)
HTTPClientServer-4        57.3µs ± 1%    56.7µs ± 1%    -0.94%   (p=0.000 n=21+22)
JSONEncode-4              13.8ms ± 1%    13.9ms ± 2%    +1.14%   (p=0.000 n=22+23)
JSONDecode-4              47.4ms ± 1%    48.1ms ± 1%    +1.49%   (p=0.000 n=23+23)
Mandelbrot200-4           3.92ms ± 0%    3.92ms ± 1%    +0.21%   (p=0.000 n=22+22)
GoParse-4                 2.89ms ± 1%    2.87ms ± 1%    -0.68%   (p=0.000 n=21+22)
RegexpMatchEasy0_32-4     73.6ns ± 1%    72.0ns ± 2%    -2.15%   (p=0.000 n=21+22)
RegexpMatchEasy0_1K-4     173ns ± 1%     173ns ± 1%     ~        (p=0.847 n=22+24)
RegexpMatchEasy1_32-4     71.9ns ± 1%    69.8ns ± 1%    -2.99%   (p=0.000 n=23+20)
RegexpMatchEasy1_1K-4     314ns ± 1%     308ns ± 1%     -1.91%   (p=0.000 n=22+23)
RegexpMatchMedium_32-4    106ns ± 0%     105ns ± 1%     -0.58%   (p=0.000 n=19+21)
RegexpMatchMedium_1K-4    34.3µs ± 1%    34.3µs ± 1%    ~        (p=0.871 n=23+22)
RegexpMatchHard_32-4      1.67µs ± 1%    1.67µs ± 7%    ~        (p=0.224 n=22+23)
RegexpMatchHard_1K-4      51.5µs ± 1%    50.4µs ± 1%    -1.99%   (p=0.000 n=22+23)
Revcomp-4                 383ms ± 1%     415ms ± 0%     +8.51%   (p=0.000 n=22+22)
Template-4                51.5ms ± 1%    51.5ms ± 1%    ~        (p=0.555 n=20+23)
TimeParse-4               279ns ± 2%     277ns ± 1%     -0.95%   (p=0.000 n=24+22)
TimeFormat-4              294ns ± 1%     296ns ± 1%     +0.58%   (p=0.003 n=24+23)
[Geo mean]                43.7µs         43.8µs         +0.32%
name                      old speed      new speed      delta
GobDecode-4               134MB/s ± 1%   131MB/s ± 1%   -2.32%   (p=0.000 n=23+22)
GobEncode-4               170MB/s ± 1%   157MB/s ± 1%   -7.68%   (p=0.000 n=22+22)
Gzip-4                    98.7MB/s ± 0%  96.1MB/s ± 1%  -2.68%   (p=0.000 n=23+24)
Gunzip-4                  590MB/s ± 7%   593MB/s ± 2%   ~        (p=0.466 n=23+24)
JSONEncode-4              141MB/s ± 1%   139MB/s ± 2%   -1.13%   (p=0.000 n=22+23)
JSONDecode-4              40.9MB/s ± 1%  40.3MB/s ± 0%  -1.47%   (p=0.000 n=23+23)
GoParse-4                 20.1MB/s ± 1%  20.2MB/s ± 1%  +0.69%   (p=0.000 n=21+22)
RegexpMatchEasy0_32-4     435MB/s ± 1%   444MB/s ± 2%   +2.21%   (p=0.000 n=21+22)
RegexpMatchEasy0_1K-4     5.89GB/s ± 1%  5.89GB/s ± 1%  ~        (p=0.439 n=22+24)
RegexpMatchEasy1_32-4     445MB/s ± 1%   459MB/s ± 1%   +3.06%   (p=0.000 n=23+20)
RegexpMatchEasy1_1K-4     3.26GB/s ± 1%  3.32GB/s ± 1%  +1.97%   (p=0.000 n=22+23)
RegexpMatchMedium_32-4    9.40MB/s ± 1%  9.44MB/s ± 1%  +0.43%   (p=0.000 n=23+21)
RegexpMatchMedium_1K-4    29.8MB/s ± 1%  29.8MB/s ± 1%  ~        (p=0.826 n=23+22)
RegexpMatchHard_32-4      19.1MB/s ± 1%  19.1MB/s ± 7%  ~        (p=0.233 n=22+23)
RegexpMatchHard_1K-4      19.9MB/s ± 1%  20.3MB/s ± 1%  +2.03%   (p=0.000 n=22+23)
Revcomp-4                 664MB/s ± 1%   612MB/s ± 0%   -7.85%   (p=0.000 n=22+22)
Template-4                37.6MB/s ± 1%  37.7MB/s ± 1%  ~        (p=0.558 n=20+23)
[Geo mean]                134MB/s        133MB/s        -0.76%
Tue Mar 28 22:16:54 EDT 2017
Change-Id: I4a4f5c2b53d3fb85ef76c98522d3ed5cf8ae5b7e Reviewed-on: https://go-review.googlesource.com/38732 Reviewed-by: Russ Cox <rsc@golang.org>	2017-03-29 14:18:24 +00:00
Austin Clements	df6025bc0d	runtime: disallow malloc or panic in scavenge Mallocs and panics in the scavenge path are particularly nasty because they're likely to silently self-deadlock on the mheap.lock. Avoid sinking lots of time into debugging these issues in the future by turning these into immediate throws. Change-Id: Ib36fdda33bc90b21c32432b03561630c1f3c69bc Reviewed-on: https://go-review.googlesource.com/38293 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-03-19 22:42:28 +00:00
Austin Clements	2ef88f7fcf	runtime: lock-free fast path for mark bits allocation Currently we acquire a global lock for every newMarkBits call. This is unfortunate since every span sweep operation calls newMarkBits. However, most allocations are simply linear allocations from the current arena. Take advantage of this to add a lock-free fast path for allocating from the current arena. With this change, the global lock only protects the lists of arenas, not the free offset in the current arena. Change-Id: I6cf6182af8492c8bfc21276114c77275fe3d7826 Reviewed-on: https://go-review.googlesource.com/34595 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-03-06 18:40:26 +00:00
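The lock-free fast path described in commit 2ef88f7fcf amounts to bump allocation with an atomic add. The sketch below shows only that general technique using `sync/atomic`; the `arena` type, its 1024-byte buffer, and `tryAlloc` are assumptions made for illustration. A real allocator would fall back to a locked slow path that installs a fresh arena when `tryAlloc` fails.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// arena is a hypothetical fixed-size block handed out linearly.
type arena struct {
	free uintptr // offset of the first unused byte; accessed atomically
	buf  [1024]byte
}

// tryAlloc reserves n bytes without taking a lock. It returns the offset of
// the reservation, or false if the arena cannot satisfy the request.
func (a *arena) tryAlloc(n uintptr) (uintptr, bool) {
	// Cheap pre-check so a full arena does not keep growing a.free.
	if atomic.LoadUintptr(&a.free)+n > uintptr(len(a.buf)) {
		return 0, false
	}
	// The atomic add hands each caller a disjoint [end-n, end) range.
	end := atomic.AddUintptr(&a.free, n)
	if end > uintptr(len(a.buf)) {
		return 0, false // lost a race for the last bytes
	}
	return end - n, true
}

func main() {
	var a arena
	var wg sync.WaitGroup
	var failures int32
	for g := 0; g < 8; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 8; i++ {
				if _, ok := a.tryAlloc(16); !ok {
					atomic.AddInt32(&failures, 1)
				}
			}
		}()
	}
	wg.Wait()
	fmt.Println("bytes used:", atomic.LoadUintptr(&a.free), "failures:", failures)
}
```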
Austin Clements	6c4a8d195b	runtime: don't hold global gcBitsArenas lock over allocation Currently, newArena holds the gcBitsArenas lock across allocating memory from the OS for a new gcBits arena. This is a global lock and allocating physical memory can be expensive, so this has the potential to cause high lock contention, especially since every single span sweep operation calls newArena (via newMarkBits). Improve the situation by temporarily dropping the lock across allocation. This means the caller now has to revalidate its assumptions after the lock is dropped, so this also factors out that code path and reinvokes it after the lock is acquired. Change-Id: I1113200a954ab4aad16b5071512583cfac744bdc Reviewed-on: https://go-review.googlesource.com/34594 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-03-06 18:40:23 +00:00
Eitan Adler	789c5255a4	all: remove the the duplicate words Change-Id: I6343c162e27e2e492547c96f1fc504909b1c03c0 Reviewed-on: https://go-review.googlesource.com/37793 Reviewed-by: Daniel Martí <mvdan@mvdan.cc> Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2017-03-06 04:39:12 +00:00
Austin Clements	4a7cf960c3	runtime: make ReadMemStats STW for < 25µs Currently ReadMemStats stops the world for ~1.7 ms/GB of heap because it collects statistics from every single span. For large heaps, this can be quite costly. This is particularly unfortunate because many production infrastructures call this function regularly to collect and report statistics. Fix this by tracking the necessary cumulative statistics in the mcaches. ReadMemStats still has to stop the world to stabilize these statistics, but there are only O(GOMAXPROCS) mcaches to collect statistics from, so this pause is only 25µs even at GOMAXPROCS=100. Fixes #13613. Change-Id: I3c0a4e14833f4760dab675efc1916e73b4c0032a Reviewed-on: https://go-review.googlesource.com/34937 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-03-04 02:56:37 +00:00
Austin Clements	77f64c50db	runtime: clarify access to mheap_.busy There are two accesses to mheap_.busy that are guarded by checks against len(mheap_.free). This works because both lists are (and must be) the same length, but it makes the code less clear. Change these to use len(mheap_.busy) so the access more clearly parallels the check. Fixes #18944. Change-Id: I9bacbd3663988df351ed4396ae9018bc71018311 Reviewed-on: https://go-review.googlesource.com/36354 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2017-03-03 17:02:18 +00:00
Keith Randall	7ba36f4adb	runtime: compute size classes statically No point in computing this info on startup. Compute it at build time. This lets us spend more time computing & checking the size classes. Improve the div magic for rounding to the start of an object. We can now use 32-bit multiplies & shifts, which should help 32-bit platforms. The static data is <1KB. The actual size classes are not changed by this CL. Change-Id: I6450cec7d1b2b4ad31fd3f945f504ed2ec6570e7 Reviewed-on: https://go-review.googlesource.com/32219 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2016-10-30 03:48:49 +00:00
Austin Clements	87e48c5afd	runtime, cmd/compile: rename memclr -> memclrNoHeapPointers Since barrier-less memclr is only safe in very narrow circumstances, this commit renames memclr to avoid accidentally calling memclr on typed memory. This can cause subtle, non-deterministic bugs, so it's worth some effort to prevent. In the near term, this will also prevent bugs creeping in from any concurrent CLs that add calls to memclr; if this happens, whichever patch hits master second will fail to compile. This also adds the other new memclr variants to the compiler's builtin.go to minimize the churn on that binary blob. We'll use these in future commits. Updates #17503. Change-Id: I00eead049f5bd35ca107ea525966831f3d1ed9ca Reviewed-on: https://go-review.googlesource.com/31369 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-28 18:20:33 +00:00
Austin Clements	ae3bb4a537	runtime: make fixalloc zero allocations on reuse Currently fixalloc does not zero memory it reuses. This is dangerous with the hybrid barrier if the type may contain heap pointers, since it may cause us to observe a dead heap pointer on reuse. It's also error-prone since it's the only allocator that doesn't zero on allocation (mallocgc of course zeroes, but so do persistentalloc and sysAlloc). It's also largely pointless: for mcache, the caller immediately memclrs the allocation; and the two specials types are tiny so there's no real cost to zeroing them. Change fixalloc to zero allocations by default. The only type we don't zero by default is mspan. This actually requires that the span's sweepgen survive across freeing and reallocating a span. If we were to zero it, the following race would be possible:
1. The current sweepgen is 2. Span s is on the unswept list.
2. Direct sweeping sweeps span s, finds it's all free, and releases s to the fixalloc.
3. Thread 1 allocates s from fixalloc. Suppose this zeros s, including s.sweepgen.
4. Thread 1 calls s.init, which sets s.state to _MSpanDead.
5. On thread 2, background sweeping comes across span s in allspans and cas's s.sweepgen from 0 (sg-2) to 1 (sg-1). Now it thinks it owns it for sweeping.
6. Thread 1 continues initializing s. Everything breaks.
I would like to fix this because it's obviously confusing, but it's a subtle enough problem that I'm leaving it alone for now. The solution may be to skip sweepgen 0, but then we have to think about wrap-around much more carefully. Updates #17503. Change-Id: Ie08691feed3abbb06a31381b94beb0a2e36a0613 Reviewed-on: https://go-review.googlesource.com/31368 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-28 18:20:23 +00:00
Austin Clements	f4dcc9b29b	runtime: make _MSpanDead be the zero state Currently the zero value of an mspan is in the "in use" state. This seems like a bad idea in general. But it's going to wreak havoc when we make fixalloc zero allocations: even "freed" mspan objects are still on the allspans list and still get looked at by the garbage collector. Hence, if we leave the mspan states the way they are, allocating a span that reuses old memory will temporarily pass that span (which is visible to GC!) through the "in use" state, which can cause "unswept span" panics. Fix all of this by making the zero state "dead". Updates #17503. Change-Id: I77c7ac06e297af4b9e6258bc091c37abe102acc3 Reviewed-on: https://go-review.googlesource.com/31367 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-28 18:20:13 +00:00
Austin Clements	575b1dda4e	runtime: eliminate allspans snapshot Now that sweeping and span marking use the sweep list, there's no need for the work.spans snapshot of the allspans list. This change eliminates the few remaining uses of it, which are either dead code or can use allspans directly, and removes work.spans and its support functions. Change-Id: Id5388b42b1e68e8baee853d8eafb8bb4ff95bb43 Reviewed-on: https://go-review.googlesource.com/30537 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-25 22:33:02 +00:00
Austin Clements	f9497a6747	runtime: make sweep time proportional to in-use spans Currently sweeping walks the list of all spans, which means the work in sweeping is proportional to the maximum number of spans ever used. If the heap was once large but is now small, this causes an amortization failure: on a small heap, GCs happen frequently, but a full sweep still has to happen in each GC cycle, which means we spent a lot of time in sweeping. Fix this by creating a separate list consisting of just the in-use spans to be swept, so sweeping is proportional to the number of in-use spans (which is proportional to the live heap). Specifically, we create two lists: a list of unswept in-use spans and a list of swept in-use spans. At the start of the sweep cycle, the swept list becomes the unswept list and the new swept list is empty. Allocating a new in-use span adds it to the swept list. Sweeping moves spans from the unswept list to the swept list. This fixes the amortization problem because a shrinking heap moves spans off the unswept list without adding them to the swept list, reducing the time required by the next sweep cycle. Updates #9265. This fix eliminates almost all of the time spent in sweepone; however, markrootSpans has essentially the same bug, so now the test program from this issue spends all of its time in markrootSpans. No significant effect on other benchmarks. Change-Id: Ib382e82790aad907da1c127e62b3ab45d7a4ac1e Reviewed-on: https://go-review.googlesource.com/30535 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-25 22:32:57 +00:00
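The two-list scheme from commit f9497a6747 can be modelled in a few lines. This is a toy model under stated assumptions: `span`, `sweepLists`, and the `inUse` flag (standing in for "sweeping found live objects") are hypothetical, and each cycle is assumed to finish sweeping before the next one starts.

```go
package main

import "fmt"

// span is a stand-in for a heap span; inUse models "still holds live objects".
type span struct {
	id    int
	inUse bool
}

// sweepLists holds only in-use spans, split by whether they were swept this cycle.
type sweepLists struct {
	unswept, swept []*span
}

// startCycle makes last cycle's swept list the new unswept list.
func (l *sweepLists) startCycle() {
	l.unswept, l.swept = l.swept, nil
}

// alloc registers a newly in-use span; it needs no sweeping this cycle.
func (l *sweepLists) alloc(s *span) {
	s.inUse = true
	l.swept = append(l.swept, s)
}

// sweepOne sweeps one span and reports whether there was one to sweep.
// Spans that turned out to be free are dropped from both lists.
func (l *sweepLists) sweepOne() bool {
	n := len(l.unswept)
	if n == 0 {
		return false
	}
	s := l.unswept[n-1]
	l.unswept = l.unswept[:n-1]
	if s.inUse {
		l.swept = append(l.swept, s)
	}
	return true
}

// sweepAll runs a full cycle and returns how many spans it had to visit.
func (l *sweepLists) sweepAll() int {
	l.startCycle()
	n := 0
	for l.sweepOne() {
		n++
	}
	return n
}

func main() {
	l := &sweepLists{}
	spans := []*span{{id: 1}, {id: 2}, {id: 3}, {id: 4}}
	for _, s := range spans {
		l.alloc(s)
	}
	for _, s := range spans[1:] {
		s.inUse = false // the heap shrinks: three spans become free
	}
	fmt.Println(l.sweepAll(), l.sweepAll())
}
```

The first cycle still visits every span that was in use, but the second visits only the survivor, which is the amortization fix the message describes.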
Austin Clements	5915ce6674	runtime: use len(h.spans) to indicate mapped region Currently we set the len and cap of h.spans to the full reserved region of the address space and track the actual mapped region separately in h.spans_mapped. Since we have both the len and cap at our disposal, change things so len(h.spans) tracks how much of the spans array is mapped and eliminate h.spans_mapped. This simplifies mheap and means we'll get nice "index out of bounds" exceptions if we do try to go off the end of the spans rather than a SIGSEGV. Change-Id: I8ed9a1a9a844d90e9fd2e269add4704623dbdfe6 Reviewed-on: https://go-review.googlesource.com/30533 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-25 22:32:51 +00:00
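The len/cap trick in commit 5915ce6674 relies on ordinary Go slice semantics: capacity can stand for the reserved region and length for the usable prefix, so an index past the usable part fails a bounds check instead of touching unmapped memory. A small sketch with ordinary heap slices follows; the `lookup` helper and the sizes are made up for illustration.

```go
package main

import "fmt"

// lookup returns table[i], converting the bounds-check panic into an error
// so the behavior is easy to observe.
func lookup(table []int, i int) (v int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("lookup(%d): %v", i, r)
		}
	}()
	return table[i], nil
}

func main() {
	reserved := make([]int, 0, 8) // cap models the full reserved region
	table := reserved[:4]         // len models the part that is usable so far

	_, err := lookup(table, 6)
	fmt.Println("before growing:", err)

	table = table[:7] // "map" more by extending len within cap
	_, err = lookup(table, 6)
	fmt.Println("after growing:", err)
}
```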
Austin Clements	6b0f668044	runtime: consolidate h_spans and mheap_.spans Like h_allspans and mheap_.allspans, these were two ways of referring to the spans array from when the runtime was split between C and Go. Clean this up by making mheap_.spans a slice and eliminating h_spans. Change-Id: I3aa7038d53c3a4252050aa33e468c48dfed0b70e Reviewed-on: https://go-review.googlesource.com/30532 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-25 22:32:48 +00:00
Austin Clements	66e849b168	runtime: eliminate mheap.nspan and use range loops This was necessary in the C days when allspans was an mspan**, but now that allspans is a Go slice, this is redundant with len(allspans) and we can use range loops over allspans. Change-Id: Ie1dc39611e574e29a896e01690582933f4c5be7e Reviewed-on: https://go-review.googlesource.com/30531 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-25 22:32:45 +00:00
Austin Clements	4d6207790b	runtime: consolidate h_allspans and mheap_.allspans These are two ways to refer to the allspans array that hark back to when the runtime was split between C and Go. Clean this up by making mheap_.allspans a slice and eliminating h_allspans. Change-Id: Ic9360d040cf3eb590b5dfbab0b82e8ace8525610 Reviewed-on: https://go-review.googlesource.com/30530 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-25 22:32:42 +00:00
Austin Clements	1bc6be6423	runtime: mark several types go:notinheap This covers basically all sysAlloc'd, persistentalloc'd, and fixalloc'd types. Change-Id: I0487c887c2a0ade5e33d4c4c12d837e97468e66b Reviewed-on: https://go-review.googlesource.com/30941 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-15 17:58:20 +00:00
Austin Clements	991a85c889	runtime: make mSpanList more go:notinheap-friendly Currently mspan links to its previous mspan using a *mspan field that points to the previous span's next field. This simplifies some of the list manipulation code, but is going to make it very hard to convince the compiler that mspan list manipulations don't need write barriers. Fix this by using a more traditional ("boring") linked list that uses a simple mspan pointer to the previous mspan. This complicates some of the list manipulation slightly, but it will let us eliminate all write barriers from the mspan list manipulation code by marking mspan go:notinheap. Change-Id: I0d0b212db5f20002435d2a0ed2efc8aa0364b905 Reviewed-on: https://go-review.googlesource.com/30940 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-15 17:58:17 +00:00
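The "boring" list that commit 991a85c889 switches to is a conventional doubly linked list in which `prev` points at the previous element itself. A minimal sketch follows; the `span` and `spanList` types are hypothetical, and the explicit head and tail checks are the slight extra complication the message mentions.

```go
package main

import "fmt"

// span is a list element with plain pointers to its neighbors.
type span struct {
	id         int
	next, prev *span
}

// spanList tracks both ends of the list.
type spanList struct {
	first, last *span
}

// insert pushes s onto the front of the list.
func (l *spanList) insert(s *span) {
	s.next, s.prev = l.first, nil
	if l.first != nil {
		l.first.prev = s
	} else {
		l.last = s // list was empty
	}
	l.first = s
}

// remove unlinks s, which must currently be on l.
func (l *spanList) remove(s *span) {
	if s.prev != nil {
		s.prev.next = s.next
	} else {
		l.first = s.next // s was the head
	}
	if s.next != nil {
		s.next.prev = s.prev
	} else {
		l.last = s.prev // s was the tail
	}
	s.next, s.prev = nil, nil
}

func main() {
	var l spanList
	a, b, c := &span{id: 1}, &span{id: 2}, &span{id: 3}
	l.insert(a)
	l.insert(b)
	l.insert(c)
	l.remove(b)
	for s := l.first; s != nil; s = s.next {
		fmt.Println(s.id)
	}
}
```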
Austin Clements	38f1df66ff	runtime: make gcDumpObject useful on stack frames gcDumpObject is often used on a stack pointer (for example, when checkmark finds an unmarked object on the stack), but since stack spans don't have an elemsize, it doesn't print any of the memory from the frame. Make it at least slightly more useful by printing everything between obj and obj+off (inclusive). While we're here, also print out the span state. Change-Id: I51be064ea8791b4a365865bfdc7afa7b5aaecfbd Reviewed-on: https://go-review.googlesource.com/30142 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-03 21:59:54 +00:00
Austin Clements	6879dbde4e	runtime: introduce a type for span states Currently span states are untyped constants and the field is just a uint8. Make this more type-safe by introducing a type for the span state. Change-Id: I369bf59fe6e8234475f4921611424fceb7d0a6de Reviewed-on: https://go-review.googlesource.com/30141 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-10-03 21:59:45 +00:00
Austin Clements	6dda7b2f5f	runtime: don't hard-code physical page size Now that the runtime fetches the true physical page size from the OS, make the physical page size used by heap growth a variable instead of a constant. This isn't used in any performance-critical paths, so it shouldn't be an issue. sys.PhysPageSize is also renamed to sys.DefaultPhysPageSize to make it clear that it's not necessarily the true page size. There are no uses of this constant any more, but we'll keep it around for now. Updates #12480 and #10180. Change-Id: I6c23b9df860db309c38c8287a703c53817754f03 Reviewed-on: https://go-review.googlesource.com/25022 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2016-09-06 21:05:53 +00:00
Austin Clements	3de7dbb191	runtime: fix check for vacuous page boundary rounding again The previous fix for this, commit `336dad2a`, had everything right in the commit message, but reversed the test in the code. Fix the test in the code. This reversal effectively disabled the scavenger on large page systems except in the rare cases where this code was originally wrong, which is why it didn't obviously show up in testing. Fixes #16644. Again. :( Change-Id: I27cce4aea13de217197db4b628f17860f27ce83e Reviewed-on: https://go-review.googlesource.com/27402 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2016-08-19 20:16:43 +00:00
Austin Clements	336dad2a07	runtime: fix check for vacuous page boundary rounding sysUnused (e.g., madvise MADV_FREE) is only sensible to call on physical page boundaries, so scavengelist rounds in the bounds of the region being released to the nearest physical page boundaries. However, if the region is smaller than a physical page and neither the start nor end fall on a boundary, then rounding the start up to a page boundary and the end down to a page boundary will result in end < start. Currently, we only give up on the region if start == end, so if we encounter end < start, we'll call madvise with a negative length and the madvise will fail. Issue #16644 gives a concrete example of this: start = 0x1285ac000 end = 0x1285ae000 (1 8K page) This leads to the rounded values start = 0x1285b0000 end = 0x1285a0000 which leads to len = -65536. Fix this by giving up on the region if end <= start, not just if end == start. Fixes #16644. Change-Id: I8300db492dbadc82ac1ad878318b36bcb7c39524 Reviewed-on: https://go-review.googlesource.com/27230 Reviewed-by: Keith Randall <khr@golang.org>	2016-08-17 14:04:16 +00:00
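The arithmetic in commit 336dad2a07 can be reproduced directly. The sketch below assumes a 64 kB physical page, which is what the rounded values in the message imply; `roundRegion` is a hypothetical helper, not the runtime's function.

```go
package main

import "fmt"

// roundRegion rounds start up and end down to multiples of pageSize, which
// must be a power of two. ok is false when no whole page lies inside the
// region, including the case where the rounded end is below the rounded start.
func roundRegion(start, end, pageSize uint64) (rs, re uint64, ok bool) {
	rs = (start + pageSize - 1) &^ (pageSize - 1)
	re = end &^ (pageSize - 1)
	return rs, re, re > rs // give up when end <= start, not only when equal
}

func main() {
	const physPage = 64 << 10 // assumed 64 kB physical page
	rs, re, ok := roundRegion(0x1285ac000, 0x1285ae000, physPage)
	fmt.Printf("start=%#x end=%#x len=%d ok=%v\n", rs, re, int64(re)-int64(rs), ok)
	// prints: start=0x1285b0000 end=0x1285a0000 len=-65536 ok=false
}
```

Testing only for `re == rs` would let this region through with a negative length, which is the failure the commit fixes.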
Austin Clements	f407ca9288	runtime: support smaller physical pages than PhysPageSize Most operations need an upper bound on the physical page size, which is what sys.PhysPageSize is for (this is checked at runtime init on Linux). However, a few operations need a lower bound on the physical page size. Introduce a "minPhysPageSize" constant to act as this lower bound and use it where it makes sense: 1) In addrspace_free, we have to query each page in the given range. Currently we increment by the upper bound on the physical page size, which means we may skip over pages if the true size is smaller. Worse, we currently pass a result buffer that only has enough room for one page. If there are actually multiple pages in the range passed to mincore, the kernel will overflow this buffer. Fix these problems by incrementing by the lower-bound on the physical page size and by passing "1" for the length, which the kernel will round up to the true physical page size. 2) In the write barrier, the bad pointer check tests for pointers to the first physical page, which are presumably small integers masquerading as pointers. However, if physical pages are smaller than we think, we may have legitimate pointers below sys.PhysPageSize. Hence, use minPhysPageSize for this test since pointers should never fall below that. In particular, this applies to ARM64 and MIPS. The runtime is configured to use 64kB pages on ARM64, but by default Linux uses 4kB pages. Similarly, the runtime assumes 16kB pages on MIPS, but both 4kB and 16kB kernel configurations are common. This also applies to ARM on systems where the runtime is recompiled to deal with a larger page size. It is also a step toward making the runtime use only a dynamically-queried page size. 
Change-Id: I1fdfd18f6e7cbca170cc100354b9faa22fde8a69 Reviewed-on: https://go-review.googlesource.com/25020 Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Cherry Zhang <cherryyz@google.com> Run-TryBot: Austin Clements <austin@google.com>	2016-07-20 18:28:43 +00:00
Austin Clements	64770f642f	runtime: use conventional shift style for gcBitsChunkBytes The convention for writing something like "64 kB" is 64<<10, since this is easier to read than 1<<16. Update gcBitsChunkBytes to follow this convention. Change-Id: I5b5a3f726dcf482051ba5b1814db247ff3b8bb2f Reviewed-on: https://go-review.googlesource.com/23132 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-05-16 18:28:38 +00:00
Elias Naur	e6ec82067a	runtime: use entire address space on 32 bit In issue #13992, Russ mentioned that the heap bitmap footprint was halved but that the bitmap size calculation hadn't been updated. This presents the opportunity to either halve the bitmap size or double the addressable virtual space. This CL doubles the addressable virtual space. On 32 bit this can be tweaked further to allow the bitmap to cover the entire 4GB virtual address space, removing a failure mode if the kernel hands out memory with a too low address. First, fix the calculation and double _MaxArena32 to cover 4GB virtual memory space with the same bitmap size (256 MB). Then, allow the fallback mode for the initial memory reservation on 32 bit (or 64 bit with too little available virtual memory) to not include space for the arena. mheap.sysAlloc will automatically reserve additional space when the existing arena is full. Finally, set arena_start to 0 in 32 bit mode, so that any address is acceptable for subsequent (additional) reservations. Before, the bitmap was always located just before arena_start, so fix the two places relying on that assumption: Point the otherwise unused mheap.bitmap to one byte after the end of the bitmap, and use it for bitmap addressing instead of arena_start. With arena_start set to 0 on 32 bit, the cgoInRange check is no longer a sufficient check for Go pointers. Introduce and call inHeapOrStack to check whether a pointer is to the Go heap or stack. While we're here, remove sysReserveHigh which seems to be unused. Fixes #13992 Change-Id: I592b513148a50b9d3967b5c5d94b86b3ec39acc2 Reviewed-on: https://go-review.googlesource.com/20471 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-05-07 03:04:39 +00:00
Austin Clements	3e2462387f	[dev.garbage] runtime: eliminate mspan.start This converts all remaining uses of mspan.start to instead use mspan.base(). In many cases, this actually reduces the complexity of the code. Change-Id: If113840e00d3345a6cf979637f6a152e6344aee7 Reviewed-on: https://go-review.googlesource.com/22590 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com>	2016-04-29 03:53:17 +00:00
Austin Clements	b7adc41fba	[dev.garbage] runtime: use s.base() everywhere it makes sense Currently we have lots of (s.start << _PageShift) and variants. We now have an s.base() function that returns this. It's faster and more readable, so use it. Change-Id: I888060a9dae15ea75ca8cc1c2b31c905e71b452b Reviewed-on: https://go-review.googlesource.com/22559 Reviewed-by: Rick Hudson <rlh@golang.org> Run-TryBot: Austin Clements <austin@google.com>	2016-04-29 03:53:14 +00:00
Rick Hudson	2fb75ea6c6	[dev.garbage] runtime: use sys.Ctz64 intrinsic Our compilers now provide intrinsics including sys.Ctz64 that support CTZ (count trailing zero) instructions. This CL replaces the Go versions of CTZ with the compiler intrinsic. Count trailing zeros (CTZ) finds the least significant 1 in a word and returns the number of less significant 0s in the word. Allocation uses the bitmap created by the garbage collector to locate an unmarked object. The logic takes a word of the bitmap, complements it, and then caches it. It then uses CTZ to locate an available unmarked object. It then shifts marked bits out of the bitmap word, preparing it for the next search. Once all the unmarked objects are used in the cached word the bitmap gets another word and repeats the process. Change-Id: Id2fc42d1d4b9893efaa2e1bd01896985b7e42f82 Reviewed-on: https://go-review.googlesource.com/21366 Reviewed-by: Austin Clements <austin@google.com>	2016-04-29 00:00:50 +00:00
Rick Hudson	2063d5d903	[dev.garbage] runtime: restructure alloc and mark bits Two changes are included here that are dependent on each other. The first is that allocBits and gcmarkBits are changed to a *uint8 which points to the first byte of that span's mark and alloc bits. Several places were altered to perform pointer arithmetic to locate the byte corresponding to an object in the span. The actual bit corresponding to an object is indexed in the byte by using the lower three bits of the object's index. The second change avoids the redundant calculation of an object's index. The index is returned from heapBitsForObject and then used by the functions indexing allocBits and gcmarkBits. Finally we no longer allocate the gc bits in the span structures. Instead we use an arena based allocation scheme that allows for a more compact bit map as well as recycling and bulk clearing of the mark bits. Change-Id: If4d04b2021c092ec39a4caef5937a8182c64dfef Reviewed-on: https://go-review.googlesource.com/20705 Reviewed-by: Austin Clements <austin@google.com>	2016-04-29 00:00:47 +00:00
Rick Hudson	23aeb34df1	[dev.garbage] Merge remote-tracking branch 'origin/master' into HEAD Change-Id: I282fd9ce9db435dfd35e882a9502ab1abc185297	2016-04-27 18:46:52 -04:00
Rick Hudson	f8d0d4fd59	[dev.garbage] runtime: cleanup and optimize span.base() Prior to this CL the base of a span was calculated in various places using shifts or calls to base(). This CL now always calls base() which has been optimized to calculate the base of the span when the span is initialized and store that value in the span structure. Change-Id: I661f2bfa21e3748a249cdf049ef9062db6e78100 Reviewed-on: https://go-review.googlesource.com/20703 Reviewed-by: Austin Clements <austin@google.com>	2016-04-27 21:54:59 +00:00
Rick Hudson	8dda1c4c08	[dev.garbage] runtime: remove heapBitsSweepSpan Prior to this CL the sweep phase was responsible for locating all objects that were about to be freed and calling a function to process the object. This was done by the function heapBitsSweepSpan. Part of processing included calls to tracefree and msanfree as well as counting how many objects were freed. The calls to tracefree and msanfree have been moved into the gcmalloc routine and are called when the object is about to be reallocated. The counting of free objects has been optimized using an array based popcnt algorithm, and if all the objects in a span are free then the span is freed. Similarly the code to locate the next free object has been optimized to use an array based ctz (count trailing zero). Various hot paths in the allocation logic have been optimized. At this point the garbage benchmark is within 3% of the 1.6 release. Change-Id: I00643c442e2ada1685c010c3447e4ea8537d2dfa Reviewed-on: https://go-review.googlesource.com/20201 Reviewed-by: Austin Clements <austin@google.com>	2016-04-27 21:54:57 +00:00
Rick Hudson	4093481523	[dev.garbage] runtime: add bit and cache ctz64 (count trailing zero) Add to each span a 64 bit cache (allocCache) of the allocBits at freeindex. allocCache is shifted such that the lowest bit corresponds to the bit at freeindex. allocBits uses a 0 to indicate an object is free, whereas allocCache uses a 1 to indicate an object is free. This facilitates ctz64 (count trailing zero) which counts the number of 0s trailing the least significant 1. This is also the index of the least significant 1. Each span maintains a freeindex indicating the boundary between allocated objects and unallocated objects. allocCache is shifted as freeindex is incremented such that the low bit in allocCache corresponds to the bit at freeindex in the allocBits array. Currently ctz64 is written in Go using a for loop so it is not very efficient. Use of the hardware instruction will follow. With this in mind comparisons of the garbage benchmark are as follows. 1.6 release 2.8 seconds dev:garbage branch 3.1 seconds. Profiling shows the Go implementation of ctz64 takes up 1% of the total time. Change-Id: If084ed9c3b1eda9f3c6ab2e794625cb870b8167f Reviewed-on: https://go-review.googlesource.com/20200 Reviewed-by: Austin Clements <austin@google.com>	2016-04-27 21:54:54 +00:00
Rick Hudson	e4ac2d4acc	[dev.garbage] runtime: replace ref with allocCount This is a renaming of the field ref to the more appropriate allocCount. The field holds the number of objects in the span that are currently allocated. Some throw strings were adjusted to more accurately convey the meaning of allocCount. Change-Id: I10daf44e3e9cc24a10912638c7de3c1984ef8efe Reviewed-on: https://go-review.googlesource.com/19518 Reviewed-by: Austin Clements <austin@google.com>	2016-04-27 21:54:49 +00:00
Rick Hudson	3479b065d4	[dev.garbage] runtime: allocate directly from GC mark bits Instead of building a freelist from the mark bits generated by the GC this CL allocates directly from the mark bits. The approach moves the mark bits from the pointer/no pointer heap structures into their own per span data structures. The mark/allocation vectors consist of a single mark bit per object. Two vectors are maintained, one for allocation and one for the GC's mark phase. During the GC cycle's sweep phase the interpretation of the vectors is swapped. The mark vector becomes the allocation vector and the old allocation vector is cleared and becomes the mark vector that the next GC cycle will use. Marked entries in the allocation vector indicate that the object is not free. Each allocation vector maintains a boundary between areas of the span already allocated from and areas not yet allocated from. As objects are allocated this boundary is moved until it reaches the end of the span. At this point further allocations will be done from another span. Since we no longer sweep a span inspecting each freed object, maintaining the pointer/scalar bits in the heapBitMap is now the responsibility of the routines doing the actual allocation. This CL is functionally complete and ready for performance tuning. Change-Id: I336e0fc21eef1066e0b68c7067cc71b9f3d50e04 Reviewed-on: https://go-review.googlesource.com/19470 Reviewed-by: Austin Clements <austin@google.com>	2016-04-27 21:54:47 +00:00
Rick Hudson	aed861038f	[dev.garbage] runtime: add stackfreelist The freelist for normal objects and the freelist for stacks share the same mspan field for holding the list head but are operated on by different code sequences. This overloading complicates the use of bit vectors for allocation of normal objects. This change refactors the use of the stackfreelist out from the use of freelist. Change-Id: I5b155b5b8a1fcd8e24c12ee1eb0800ad9b6b4fa0 Reviewed-on: https://go-review.googlesource.com/19315 Reviewed-by: Austin Clements <austin@google.com>	2016-04-27 21:54:39 +00:00
Rick Hudson	2ac8bdc52a	[dev.garbage] runtime: bitmap allocation data structs The bitmap allocation data structure prototypes. Before this is released these underlying data structures need to be more performant but the signatures of helper functions utilizing these structures will remain stable. Change-Id: I5ace12f2fb512a7038a52bbde2bfb7e98783bcbe Reviewed-on: https://go-review.googlesource.com/19221 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2016-04-27 21:54:35 +00:00
Austin Clements	2cdcb6f829	runtime: scavenge memory on physical page-aligned boundaries Currently the scavenger marks memory unused in multiples of the allocator page size (8K). This is safe as long as the true physical page size is 4K (or 8K), as it is on many platforms. However, on ARM64, PPC64x, and MIPS64, the physical page size is larger than 8K, so if we attempt to mark memory unused, the kernel will round the boundaries of the region out to all pages covered by the requested region, and we'll release a larger region of memory than intended. As a result, the scavenger is currently disabled on these platforms. Fix this by first rounding the region to be marked unused in to multiples of the physical page size, so that when we ask the kernel to mark it unused, it releases exactly the requested region. Fixes #9993. Change-Id: I96d5fdc2f77f9d69abadcea29bcfe55e68288cb1 Reviewed-on: https://go-review.googlesource.com/22066 Reviewed-by: Rick Hudson <rlh@golang.org>	2016-04-16 21:42:43 +00:00
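The rounding described in commit 2cdcb6f829 can be illustrated with a small sketch. This is not the runtime's code: the helper name `roundInward` and the example sizes are assumptions chosen for illustration, and the physical page size is assumed to be a power of two. The idea is to round the start of a free region up and its end down to physical page boundaries, so the kernel never releases memory outside the requested region.

```go
package main

import "fmt"

// roundInward (hypothetical helper) shrinks [start, end) to the largest
// sub-range whose boundaries are multiples of physPageSize.
// physPageSize must be a power of two. It reports false when no whole
// physical page fits inside the range.
// Note: start+mask may overflow for addresses at the very top of the
// address space; this sketch does not guard against that.
func roundInward(start, end, physPageSize uintptr) (uintptr, uintptr, bool) {
	mask := physPageSize - 1
	s := (start + mask) &^ mask // round start up to a page boundary
	e := end &^ mask            // round end down to a page boundary
	if s >= e {
		return 0, 0, false
	}
	return s, e, true
}

func main() {
	const allocPage = 8 << 10 // allocator page size named in the commit message
	const physPage = 64 << 10 // example physical page size (assumption)

	// A free run of 20 allocator pages starting at allocator page 3.
	start := uintptr(3 * allocPage)
	end := start + 20*allocPage

	if s, e, ok := roundInward(start, end, physPage); ok {
		fmt.Printf("release [%#x, %#x)\n", s, e)
	} else {
		fmt.Println("nothing to release")
	}
}
```

Rounding inward trades a little unreleased memory at the edges of the region for the guarantee that pages still in use are never handed back to the kernel.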

98 commits
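The allocCache scheme described in commits 4093481523 and 2fb75ea6c6 can be sketched as follows. This is a toy model under stated assumptions, not the runtime's implementation: the `span` type, `newSpan`, and `nextFree` are hypothetical names, the span is limited to 64 objects so a single cache word suffices (the commit messages describe refilling the cache with another word), and `math/bits.TrailingZeros64` stands in for the `sys.Ctz64` intrinsic.

```go
package main

import (
	"fmt"
	"math/bits"
)

// span is a toy stand-in for the runtime's mspan (assumption: at most
// 64 objects, so one cache word covers the whole span).
type span struct {
	nelems     uint   // number of objects in the span
	freeindex  uint   // objects below this index have already been scanned
	allocCache uint64 // bit i set means object freeindex+i is free
}

// newSpan builds a span from allocBits, where a 1 bit means "allocated".
// The cache is the complement, so a 1 bit means "free", as in the
// commit message.
func newSpan(allocBits uint64, nelems uint) *span {
	return &span{nelems: nelems, allocCache: ^allocBits}
}

// nextFree returns the index of the next free object, or false when
// the span has no free objects left.
func (s *span) nextFree() (uint, bool) {
	tz := uint(bits.TrailingZeros64(s.allocCache)) // 64 if the cache is empty
	idx := s.freeindex + tz
	if tz == 64 || idx >= s.nelems {
		s.freeindex = s.nelems
		return 0, false
	}
	// Shift out the skipped (allocated) bits and the bit just claimed,
	// so the low bit of the cache again corresponds to freeindex.
	s.allocCache >>= tz + 1
	s.freeindex = idx + 1
	return idx, true
}

func main() {
	// Objects 0, 2, 4, 5, and 7 are allocated (0xB5 = 1011 0101 in binary).
	s := newSpan(0xB5, 10)
	for {
		idx, ok := s.nextFree()
		if !ok {
			break
		}
		fmt.Println("allocated object", idx)
	}
}
```

Inverting the bitmap once, when the cache is filled, is what lets a single count-trailing-zeros operation find the next free slot without a loop over individual bits.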