Stowage/go - Remotebranch.eu

mirror of https://github.com/golang/go.git synced 2025-12-08 06:10:04 +00:00

Author	SHA1	Message	Date
Michael Anthony Knyszek	2f99e889f0	runtime: de-duplicate coalescing code Currently the code surrounding coalescing is duplicated between merging with the span before the span considered for coalescing and merging with the span after. This change factors out the shared portions of these codepaths into a local closure which acts as a helper. Change-Id: I7919fbed3f9a833eafb324a21a4beaa81f2eaa91 Reviewed-on: https://go-review.googlesource.com/c/158077 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2019-01-17 17:05:37 +00:00
Michael Anthony Knyszek	79ac638e41	runtime: refactor coalescing into its own method The coalescing process is complex and in a follow-up change we'll need to do it in more than one place, so this change factors out the coalescing code in freeSpanLocked into a method on mheap. Change-Id: Ia266b6cb1157c1b8d3d8a4287b42fbcc032bbf3a Reviewed-on: https://go-review.googlesource.com/c/157838 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2019-01-17 16:59:49 +00:00
Michael Anthony Knyszek	4b3f04c63b	runtime: make mTreap iterator bidirectional This change makes mTreap's iterator type, treapIter, bidirectional instead of unidirectional. This change helps support moving the find operation on a treap to return an iterator instead of a treapNode, in order to hide the details of the treap when accessing elements. For #28479. Change-Id: I5dbea4fd4fb9bede6e81bfd089f2368886f98943 Reviewed-on: https://go-review.googlesource.com/c/156918 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2019-01-10 18:15:48 +00:00
Michael Anthony Knyszek	064842450b	runtime: allocate from free and scav fairly This change modifies the behavior of span allocations to no longer prefer the free treap over the scavenged treap. While there is an additional cost to allocating out of the scavenged treap, the current behavior of preferring the unscavenged spans can lead to unbounded growth of a program's virtual memory footprint. In small programs (low # of Ps, low resident set size, low allocation rate) this behavior isn't really apparent and is difficult to reproduce. However, in relatively large, long-running programs we see this unbounded growth in free spans, and an unbounded amount of heap growths. It still remains unclear how this policy change actually ends up increasing the number of heap growths over time, but switching the policy back to best-fit does indeed solve the problem. Change-Id: Ibb88d24f9ef6766baaa7f12b411974cc03341e7b Reviewed-on: https://go-review.googlesource.com/c/148979 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-12-17 23:28:36 +00:00
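The best-fit policy described in 064842450b can be sketched as follows. This is only an illustration under stated assumptions: the two treaps are stood in for by sorted slices of span sizes (in pages), the names `find` and `pick` are hypothetical, and breaking ties in favor of the unscavenged span is an assumption rather than something the commit message states.

```go
package main

import (
	"fmt"
	"sort"
)

// find returns the index of the smallest span with at least npages pages
// in sizes (sorted ascending), or -1 if no span is large enough.
// It stands in for a treap lookup; the real runtime uses mTreap.
func find(sizes []int, npages int) int {
	i := sort.SearchInts(sizes, npages)
	if i == len(sizes) {
		return -1
	}
	return i
}

// pick chooses the span closest in size to npages from either collection,
// without preferring free spans over scavenged ones except on a tie
// (the tie-break is an assumption made for this sketch).
func pick(free, scav []int, npages int) (size int, fromScav, ok bool) {
	f, s := find(free, npages), find(scav, npages)
	switch {
	case f < 0 && s < 0:
		return 0, false, false
	case s < 0 || (f >= 0 && free[f] <= scav[s]):
		return free[f], false, true
	default:
		return scav[s], true, true
	}
}

func main() {
	free := []int{4, 16} // unscavenged span sizes, in pages
	scav := []int{8}     // scavenged span sizes, in pages
	for _, n := range []int{3, 6, 32} {
		size, fromScav, ok := pick(free, scav, n)
		fmt.Println(n, size, fromScav, ok)
	}
}
```

With a free-first policy, the request for 6 pages would take the 16-page free span and leave the 8-page scavenged span unused; the best-fit choice takes the 8-page span instead.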
Michael Anthony Knyszek	3651476075	runtime: add iterator abstraction for mTreap This change adds the treapIter type which provides an iterator abstraction for walking over an mTreap. In particular, the mTreap type now has iter() and rev() for iterating both forwards (smallest to largest) and backwards (largest to smallest). It also has an erase() method for erasing elements at the iterator's current position. For #28479. While the expectation is that this change will slow down Go programs, the impact on Go1 and Garbage is negligible. Go1: https://perf.golang.org/search?q=upload:20181214.6 Garbage: https://perf.golang.org/search?q=upload:20181214.11 Change-Id: I60dbebbbe73cbbe7b78d45d2093cec12cc0bc649 Reviewed-on: https://go-review.googlesource.com/c/151537 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-12-17 23:28:18 +00:00
Austin Clements	b00a6d8bfe	runtime: eliminate mheap.busy* lists The old whole-page reclaimer was the only thing that used the busy span lists. Remove them so nothing uses them any more. Change-Id: I4007dd2be08b9ef41bfdb0c387215c73c392cc4c Reviewed-on: https://go-review.googlesource.com/c/138960 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>	2018-11-15 19:27:13 +00:00
Austin Clements	5333550bdc	runtime: implement efficient page reclaimer When we attempt to allocate an N page span (either for a large allocation or when an mcentral runs dry), we first try to sweep spans to release N pages. Currently, this can be extremely expensive: sweeping a span to emptiness is the hardest thing to ask for and the sweeper generally doesn't know where to even look for potentially fruitful results. Since this is on the critical path of many allocations, this is unfortunate. This CL changes how we reclaim empty spans. Instead of trying lots of spans and hoping for the best, it uses the newly introduced span marks to efficiently find empty spans. The span marks (and in-use bits) are in a dense bitmap, so these spans can be found with an efficient sequential memory scan. This approach can scan for unmarked spans at about 300 GB/ms and can free unmarked spans at about 32 MB/ms. We could probably significantly improve the rate at which it can free unmarked spans, but that's a separate issue. Like the current reclaimer, this is still linear in the number of spans that are swept, but the constant factor is now so vanishingly small that it doesn't matter. The benchmark in #18155 demonstrates both significant page reclaiming delays, and object reclaiming delays. With "-retain-count=20000000 -preallocate=true -loop-count=3", the benchmark demonstrates several page reclaiming delays on the order of 40ms. After this change, the page reclaims are insignificant. The longest sweeps are still ~150ms, but are object reclaiming delays. We'll address those in the next several CLs. Updates #18155. Fixes #21378 by completely replacing the logic that had that bug. Change-Id: Iad80eec11d7fc262d02c8f0761ac6998425c4064 Reviewed-on: https://go-review.googlesource.com/c/138959 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-11-15 19:27:11 +00:00
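The bitmap scan that 5333550bdc describes can be illustrated with a small sketch. It assumes two parallel byte slices indexed by page number, one bit per page: `inUse` has a bit set where an in-use span starts, and `marks` has the same bit set if any object on that span was marked. The function name `reclaimable` and this exact layout are assumptions for illustration, not the runtime's actual code.

```go
package main

import (
	"fmt"
	"math/bits"
)

// reclaimable returns the page numbers at which an in-use span starts
// but on which no object was marked. It processes eight pages per byte,
// which is what makes a sequential scan over a dense bitmap cheap.
func reclaimable(inUse, marks []uint8) []int {
	var pages []int
	for i := range inUse {
		// Bits set in inUse but clear in marks: in-use, unmarked spans.
		free := inUse[i] &^ marks[i]
		for free != 0 {
			bit := bits.TrailingZeros8(free)
			pages = append(pages, i*8+bit)
			free &= free - 1 // clear the lowest set bit
		}
	}
	return pages
}

func main() {
	inUse := []uint8{0b00010101, 0b10000000}
	marks := []uint8{0b00000100, 0b00000000}
	fmt.Println(reclaimable(inUse, marks))
}
```

Bytes in which every in-use span is marked yield zero after the `&^` and are skipped with a single comparison, so the cost is dominated by memory bandwidth rather than by per-span work.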
Austin Clements	ba1698e963	runtime: mark span when marking any object on the span This adds a mark bit for each span that is set if any objects on the span are marked. This will be used for sweeping. For #18155. The impact of this is negligible for most benchmarks, and < 1% for GC-heavy benchmarks. name old time/op new time/op delta Garbage/benchmem-MB=64-12 2.18ms ± 0% 2.20ms ± 1% +0.88% (p=0.000 n=16+18) (https://perf.golang.org/search?q=upload:20180928.1) name old time/op new time/op delta BinaryTree17-12 2.68s ± 1% 2.68s ± 1% ~ (p=0.707 n=17+19) Fannkuch11-12 2.28s ± 0% 2.39s ± 0% +4.95% (p=0.000 n=19+18) FmtFprintfEmpty-12 40.3ns ± 4% 39.4ns ± 2% -2.27% (p=0.000 n=17+18) FmtFprintfString-12 67.9ns ± 1% 68.3ns ± 1% +0.55% (p=0.000 n=18+19) FmtFprintfInt-12 75.7ns ± 1% 76.1ns ± 1% +0.44% (p=0.005 n=18+19) FmtFprintfIntInt-12 123ns ± 1% 121ns ± 1% -1.00% (p=0.000 n=18+18) FmtFprintfPrefixedInt-12 150ns ± 0% 148ns ± 0% -1.33% (p=0.000 n=16+13) FmtFprintfFloat-12 208ns ± 0% 204ns ± 0% -1.92% (p=0.000 n=13+17) FmtManyArgs-12 501ns ± 1% 498ns ± 0% -0.55% (p=0.000 n=19+17) GobDecode-12 6.24ms ± 0% 6.25ms ± 1% ~ (p=0.113 n=20+19) GobEncode-12 5.33ms ± 0% 5.29ms ± 1% -0.72% (p=0.000 n=20+18) Gzip-12 220ms ± 1% 218ms ± 1% -1.02% (p=0.000 n=19+19) Gunzip-12 35.5ms ± 0% 35.7ms ± 0% +0.45% (p=0.000 n=16+18) HTTPClientServer-12 77.9µs ± 1% 77.7µs ± 1% -0.30% (p=0.047 n=20+19) JSONEncode-12 8.82ms ± 0% 8.93ms ± 0% +1.20% (p=0.000 n=18+17) JSONDecode-12 47.3ms ± 0% 47.0ms ± 0% -0.49% (p=0.000 n=17+18) Mandelbrot200-12 3.69ms ± 0% 3.68ms ± 0% -0.25% (p=0.000 n=19+18) GoParse-12 3.13ms ± 1% 3.13ms ± 1% ~ (p=0.640 n=20+20) RegexpMatchEasy0_32-12 76.2ns ± 1% 76.2ns ± 1% ~ (p=0.818 n=20+19) RegexpMatchEasy0_1K-12 226ns ± 0% 226ns ± 0% -0.22% (p=0.001 n=17+18) RegexpMatchEasy1_32-12 71.9ns ± 1% 72.0ns ± 1% ~ (p=0.653 n=18+18) RegexpMatchEasy1_1K-12 355ns ± 1% 356ns ± 1% ~ (p=0.160 n=18+19) RegexpMatchMedium_32-12 106ns ± 1% 106ns ± 1% ~ (p=0.325 n=17+20) RegexpMatchMedium_1K-12 
31.1µs ± 2% 31.2µs ± 0% +0.59% (p=0.007 n=19+15) RegexpMatchHard_32-12 1.54µs ± 2% 1.53µs ± 2% -0.78% (p=0.021 n=17+18) RegexpMatchHard_1K-12 46.0µs ± 1% 45.9µs ± 1% -0.31% (p=0.025 n=17+19) Revcomp-12 391ms ± 1% 394ms ± 2% +0.80% (p=0.000 n=17+19) Template-12 59.9ms ± 1% 59.9ms ± 1% ~ (p=0.428 n=20+19) TimeParse-12 304ns ± 1% 312ns ± 0% +2.88% (p=0.000 n=20+17) TimeFormat-12 318ns ± 0% 326ns ± 0% +2.64% (p=0.000 n=20+17) (https://perf.golang.org/search?q=upload:20180928.2) Change-Id: I336b9bf054113580a24103192904c8c76593e90e Reviewed-on: https://go-review.googlesource.com/c/138958 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>	2018-11-15 19:27:09 +00:00
Austin Clements	69e666e4f7	runtime: record in-use spans in a page-indexed bitmap This adds a bitmap indexed by page number that marks the starts of in-use spans. This will be used to quickly find in-use spans with no marked objects for sweeping. For #18155. Change-Id: Icee56f029cde502447193e136fa54a74c74326dd Reviewed-on: https://go-review.googlesource.com/c/138957 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com>	2018-11-15 19:27:06 +00:00
Austin Clements	e500ffd88c	runtime: track all heap arenas in a slice Currently, there's no efficient way to iterate over the Go heap. We're going to need this for fast free page sweeping, so this CL adds a slice of all allocated heap arenas. This will also be useful for generational GC. For #18155. Change-Id: I58d126cfb9c3f61b3125d80b74ccb1b2169efbcc Reviewed-on: https://go-review.googlesource.com/c/138076 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-11-15 19:14:10 +00:00
Michael Anthony Knyszek	3a7a56cc70	runtime: gofmt all improperly formatted code This change fixes incorrect formatting in mheap.go (the result of my previous heap scavenging changes) and map_test.go. Change-Id: I2963687504abdc4f0cdf2f0c558174b3bc0ed2df Reviewed-on: https://go-review.googlesource.com/c/148977 Run-TryBot: Michael Knyszek <mknyszek@google.com> Reviewed-by: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-11-11 16:09:05 +00:00
Michael Anthony Knyszek	06be7cbf3c	runtime: stop unnecessary span scavenges on free This change fixes a bug wherein freeing a scavenged span that didn't coalesce with any neighboring spans would result in that span getting scavenged again. This case may actually be a common occurrence because "freeing" span trimmings and newly-grown spans end up using the same codepath. On systems where madvise is relatively expensive, this can have a large performance impact. This change also cleans up some of this logic in freeSpanLocked since a number of factors made the coalescing code somewhat difficult to reason about with respect to scavenging. Notably, the way the needsScavenge boolean is handled could be better expressed and the inverted conditions (e.g. !after.released) can make things even more confusing. Fixes #28595. Change-Id: I75228dba70b6596b90853020b7c24fbe7ab937cf Reviewed-on: https://go-review.googlesource.com/c/147559 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-11-09 20:57:57 +00:00
Michael Anthony Knyszek	44dcb5cb61	runtime: clean up MSpan* MCache* MCentral* in docs This change cleans up references to MSpan, MCache, and MCentral in the docs via a bunch of sed invocations to better reflect the Go names for the equivalent structures (i.e. mspan, mcache, mcentral) and their methods (i.e. MSpan_Sweep -> mspan.sweep). Change-Id: Ie911ac975a24bd25200a273086dd835ab78b1711 Reviewed-on: https://go-review.googlesource.com/c/147557 Reviewed-by: Austin Clements <austin@google.com> Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-11-05 22:56:22 +00:00
Michael Anthony Knyszek	2ae8bf7054	runtime: fix stale comments about mheap and mspan As of `07e738e` all spans are allocated out of a treap, and not just large spans or spans for large objects. Also, now we have a separate treap for spans that have been scavenged. Change-Id: I9c2cb7b6798fc536bbd34835da2e888224fd7ed4 Reviewed-on: https://go-review.googlesource.com/c/142958 Reviewed-by: Austin Clements <austin@google.com>	2018-11-05 19:30:42 +00:00
Michael Anthony Knyszek	c803ffc67d	runtime: scavenge large spans before heap growth This change scavenges the largest spans before growing the heap for physical pages to "make up" for the newly-mapped space which, presumably, will be touched. In theory, this approach to scavenging helps reduce the RSS of an application by marking fragments in memory as reclaimable to the OS more eagerly than before. In practice this may not necessarily be true, depending on how sysUnused is implemented for each platform. Fixes #14045. Change-Id: Iab60790be05935865fc71f793cb9323ab00a18bd Reviewed-on: https://go-review.googlesource.com/c/139719 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-10-30 15:41:55 +00:00
Michael Anthony Knyszek	db82a1bc12	runtime: sysUsed spans after trimming Currently, we mark a whole span as sysUsed before trimming, but this unnecessarily tells the OS that the trimmed section from the span is used when it may have been scavenged, if s was scavenged. Overall, this just makes invocations of sysUsed a little more fine-grained. It does come with the caveat that now heap_released needs to be managed a little more carefully in allocSpanLocked. In this case, we choose to (like before this change) negate any effect the span has on heap_released before trimming, then add it back if the trimmed part is scavengable. For #14045. Change-Id: Ifa384d989611398bfad3ca39d3bb595a5962a3ea Reviewed-on: https://go-review.googlesource.com/c/140198 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-10-30 15:28:24 +00:00
Michael Anthony Knyszek	78bb91cbd3	runtime: remove npreleased in favor of boolean This change removes npreleased from mspan since spans may now either be scavenged or not scavenged; how many of its pages were actually scavenged doesn't matter. It saves some space in mspan overhead too, as the boolean fits into what would otherwise be struct padding. For #14045. Change-Id: I63f25a4d98658f5fe21c6a466fc38c59bfc5d0f5 Reviewed-on: https://go-review.googlesource.com/c/139737 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-10-30 15:28:01 +00:00
Michael Anthony Knyszek	b46bf0240c	runtime: separate scavenged spans This change adds a new treap to mheap which contains scavenged (i.e. its physical pages were returned to the OS) spans. As of this change, spans may no longer be partially scavenged. For #14045. Change-Id: I0d428a255c6d3f710b9214b378f841b997df0993 Reviewed-on: https://go-review.googlesource.com/c/139298 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-10-30 15:27:51 +00:00
Michael Anthony Knyszek	07e738ec32	runtime: use only treaps for tracking spans Currently, mheap tracks spans in both mSpanLists and mTreaps, but mSpanLists, while they tend to be smaller, complicate the implementation. Here we simplify the implementation by removing free and busy from mheap and renaming freelarge -> free and busylarge -> busy. This change also slightly changes the reclamation policy. Previously, for allocations under 1MB we would attempt to find a small span of the right size. Now, we just try to find any number of spans totaling the right size. This may increase heap fragmentation, but that will be dealt with using virtual memory tricks in follow-up CLs. For #14045. Garbage-heavy benchmarks show very little change, except what appears to be a decrease in STW times and peak RSS. name old STW-ns/GC new STW-ns/GC delta Garbage/benchmem-MB=64-8 263k ±64% 217k ±24% -17.66% (p=0.028 n=25+23) name old STW-ns/op new STW-ns/op delta Garbage/benchmem-MB=64-8 9.39k ±65% 7.80k ±24% -16.88% (p=0.037 n=25+23) name old peak-RSS-bytes new peak-RSS-bytes delta Garbage/benchmem-MB=64-8 281M ± 0% 249M ± 4% -11.40% (p=0.000 n=19+18) https://perf.golang.org/search?q=upload:20181005.1 Go1 benchmarks perform roughly the same, the most notable regression being the JSON encode/decode benchmark which worsens by ~2%. 
name old time/op new time/op delta BinaryTree17-8 3.02s ± 2% 2.99s ± 2% -1.18% (p=0.000 n=25+24) Fannkuch11-8 3.05s ± 1% 3.02s ± 2% -1.20% (p=0.000 n=25+25) FmtFprintfEmpty-8 43.6ns ± 5% 43.4ns ± 3% ~ (p=0.528 n=25+25) FmtFprintfString-8 74.9ns ± 3% 73.4ns ± 1% -2.03% (p=0.001 n=25+24) FmtFprintfInt-8 79.3ns ± 3% 77.9ns ± 1% -1.73% (p=0.003 n=25+25) FmtFprintfIntInt-8 119ns ± 6% 116ns ± 0% -2.68% (p=0.000 n=25+18) FmtFprintfPrefixedInt-8 134ns ± 4% 132ns ± 1% -1.52% (p=0.004 n=25+25) FmtFprintfFloat-8 240ns ± 1% 241ns ± 1% ~ (p=0.403 n=24+23) FmtManyArgs-8 543ns ± 1% 537ns ± 1% -1.00% (p=0.000 n=25+25) GobDecode-8 6.88ms ± 1% 6.92ms ± 4% ~ (p=0.088 n=24+22) GobEncode-8 5.92ms ± 1% 5.93ms ± 1% ~ (p=0.898 n=25+24) Gzip-8 267ms ± 2% 266ms ± 2% ~ (p=0.213 n=25+24) Gunzip-8 35.4ms ± 1% 35.6ms ± 1% +0.70% (p=0.000 n=25+25) HTTPClientServer-8 104µs ± 2% 104µs ± 2% ~ (p=0.686 n=25+25) JSONEncode-8 9.67ms ± 1% 9.80ms ± 4% +1.32% (p=0.000 n=25+25) JSONDecode-8 47.7ms ± 1% 48.8ms ± 5% +2.33% (p=0.000 n=25+25) Mandelbrot200-8 4.87ms ± 1% 4.91ms ± 1% +0.79% (p=0.000 n=25+25) GoParse-8 3.59ms ± 4% 3.55ms ± 1% ~ (p=0.199 n=25+24) RegexpMatchEasy0_32-8 90.3ns ± 1% 89.9ns ± 1% -0.47% (p=0.000 n=25+21) RegexpMatchEasy0_1K-8 204ns ± 1% 204ns ± 1% ~ (p=0.914 n=25+24) RegexpMatchEasy1_32-8 84.9ns ± 0% 84.6ns ± 1% -0.36% (p=0.000 n=24+25) RegexpMatchEasy1_1K-8 350ns ± 1% 348ns ± 3% -0.59% (p=0.007 n=25+25) RegexpMatchMedium_32-8 122ns ± 1% 121ns ± 0% -1.08% (p=0.000 n=25+18) RegexpMatchMedium_1K-8 36.1µs ± 1% 34.6µs ± 1% -4.02% (p=0.000 n=25+25) RegexpMatchHard_32-8 1.69µs ± 2% 1.65µs ± 1% -2.38% (p=0.000 n=25+25) RegexpMatchHard_1K-8 50.8µs ± 1% 49.4µs ± 1% -2.69% (p=0.000 n=25+24) Revcomp-8 453ms ± 2% 449ms ± 3% -0.74% (p=0.022 n=25+24) Template-8 63.2ms ± 2% 63.4ms ± 1% ~ (p=0.127 n=25+24) TimeParse-8 313ns ± 1% 315ns ± 3% ~ (p=0.924 n=24+25) TimeFormat-8 294ns ± 1% 292ns ± 2% -0.65% (p=0.004 n=23+24) [Geo mean] 49.9µs 49.6µs -0.65% name old speed new speed delta GobDecode-8 112MB/s 
± 1% 110MB/s ± 4% -1.00% (p=0.036 n=24+24) GobEncode-8 130MB/s ± 1% 129MB/s ± 1% ~ (p=0.894 n=25+24) Gzip-8 72.7MB/s ± 2% 73.0MB/s ± 2% ~ (p=0.208 n=25+24) Gunzip-8 549MB/s ± 1% 545MB/s ± 1% -0.70% (p=0.000 n=25+25) JSONEncode-8 201MB/s ± 1% 198MB/s ± 3% -1.29% (p=0.000 n=25+25) JSONDecode-8 40.7MB/s ± 1% 39.8MB/s ± 5% -2.23% (p=0.000 n=25+25) GoParse-8 16.2MB/s ± 4% 16.3MB/s ± 1% ~ (p=0.211 n=25+24) RegexpMatchEasy0_32-8 354MB/s ± 1% 356MB/s ± 1% +0.47% (p=0.000 n=25+21) RegexpMatchEasy0_1K-8 5.00GB/s ± 0% 4.99GB/s ± 1% ~ (p=0.588 n=24+24) RegexpMatchEasy1_32-8 377MB/s ± 1% 378MB/s ± 1% +0.39% (p=0.000 n=25+25) RegexpMatchEasy1_1K-8 2.92GB/s ± 1% 2.94GB/s ± 3% +0.65% (p=0.008 n=25+25) RegexpMatchMedium_32-8 8.14MB/s ± 1% 8.22MB/s ± 1% +0.98% (p=0.000 n=25+24) RegexpMatchMedium_1K-8 28.4MB/s ± 1% 29.6MB/s ± 1% +4.19% (p=0.000 n=25+25) RegexpMatchHard_32-8 18.9MB/s ± 2% 19.4MB/s ± 1% +2.43% (p=0.000 n=25+25) RegexpMatchHard_1K-8 20.2MB/s ± 1% 20.7MB/s ± 1% +2.76% (p=0.000 n=25+24) Revcomp-8 561MB/s ± 2% 566MB/s ± 3% +0.75% (p=0.021 n=25+24) Template-8 30.7MB/s ± 2% 30.6MB/s ± 1% ~ (p=0.131 n=25+24) [Geo mean] 120MB/s 121MB/s +0.48% https://perf.golang.org/search?q=upload:20181004.6 Change-Id: I97f9fee34577961a116a8ddd445c6272253f0f95 Reviewed-on: https://go-review.googlesource.com/c/139837 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-10-17 16:26:10 +00:00
Michael Anthony Knyszek	e508a5f072	runtime: de-duplicate span scavenging Currently, span scavenging was done nearly identically in two different locations. This change deduplicates that into one shared routine. For #14045. Change-Id: I15006b2c9af0e70b7a9eae9abb4168d3adca3860 Reviewed-on: https://go-review.googlesource.com/c/139297 Run-TryBot: Michael Knyszek <mknyszek@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>	2018-10-17 16:25:42 +00:00
Austin Clements	3f86d7cc67	runtime: tidy mheap.freeSpan freeSpan currently takes a mysterious "acct int32" argument. This is really just a boolean and actually just needs to match the "large" argument to alloc in order to balance out accounting. To make this clearer, replace acct with a "large bool" argument that must match the call to mheap.alloc. Change-Id: Ibc81faefdf9f0583114e1953fcfb362e9c3c76de Reviewed-on: https://go-review.googlesource.com/c/138655 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-10-09 16:43:18 +00:00
Igor Zhilianin	04dc1b2443	all: fix a bunch of misspellings Change-Id: I94cebca86706e072fbe3be782d3edbe0e22b9432 GitHub-Last-Rev: `8e15a40545` GitHub-Pull-Request: golang/go#28067 Reviewed-on: https://go-review.googlesource.com/c/140437 Run-TryBot: Ian Lance Taylor <iant@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-10-08 03:12:03 +00:00
Austin Clements	415e948eae	runtime: improve mheap.alloc doc and let compiler check system stack The alloc_m documentation refers to concepts that don't exist (and maybe never did?). alloc_m is also not the API entry point to span allocation. Hence, rewrite the documentation for alloc and alloc_m. While we're here, document why alloc_m must run on the system stack and replace alloc_m's hand-implemented system stack check with a go:systemstack annotation. Change-Id: I30e263d8e53c2774a6614e1b44df5464838cef09 Reviewed-on: https://go-review.googlesource.com/c/139459 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-10-05 16:05:17 +00:00
Keith Randall	cbafcc55e8	cmd/compile,runtime: implement stack objects Rework how the compiler+runtime handles stack-allocated variables whose address is taken. Direct references to such variables work as before. References through pointers, however, use a new mechanism. The new mechanism is more precise than the old "ambiguously live" mechanism. It computes liveness at runtime based on the actual references among objects on the stack. Each function records all of its address-taken objects in a FUNCDATA. These are called "stack objects". The runtime then uses that information while scanning a stack to find all of the stack objects on a stack. It then does a mark phase on the stack objects, using all the pointers found on the stack (and ancillary structures, like defer records) as the root set. Only stack objects which are found to be live during this mark phase will be scanned and thus retain any heap objects they point to. A subsequent CL will remove all the "ambiguously live" logic from the compiler, so that the stack object tracing will be required. For this CL, the stack tracing is all redundant with the current ambiguously live logic. Update #22350 Change-Id: Ide19f1f71a5b6ec8c4d54f8f66f0e9a98344772f Reviewed-on: https://go-review.googlesource.com/c/134155 Reviewed-by: Austin Clements <austin@google.com>	2018-10-03 19:52:49 +00:00
Austin Clements	873bd47dfb	runtime: flush mcaches lazily Currently, all mcaches are flushed during STW mark termination as a root marking job. This is currently necessary because all spans must be out of these caches before sweeping begins to avoid races with allocation and to ensure the spans are in the state expected by sweeping. We do it as a root marking job because mcache flushing is somewhat expensive and O(GOMAXPROCS) and this parallelizes the work across the Ps. However, it's also the last remaining root marking job performed during mark termination. This CL moves mcache flushing out of mark termination and performs it lazily. We keep track of the last sweepgen at which each mcache was flushed and as each P is woken from STW, it observes that its mcache is out-of-date and flushes it. This introduces a complication for spans cached in stale mcaches. These may now be observed by background or proportional sweeping or when attempting to add a finalizer, but aren't in a stable state. For example, they are likely to be on the wrong mcentral list. To fix this, this CL extends the sweepgen protocol to also capture whether a span is cached and, if so, whether or not its cache is stale. This protocol blocks asynchronous sweeping from touching cached spans and makes it the responsibility of mcache flushing to sweep the flushed spans. This eliminates the last mark termination root marking job, which means we can now eliminate that entire infrastructure. Updates #26903. This implements lazy mcache flushing. Change-Id: Iadda7aabe540b2026cffc5195da7be37d5b4125e Reviewed-on: https://go-review.googlesource.com/c/134783 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-10-02 20:35:35 +00:00
Austin Clements	d398dbdfc3	runtime: eliminate gcBlackenPromptly mode Now that there is no mark 2 phase, gcBlackenPromptly is no longer used. Updates #26903. This is a follow-up to eliminating mark 2. Change-Id: Ib9c534f21b36b8416fcf3cab667f186167b827f8 Reviewed-on: https://go-review.googlesource.com/c/134319 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-10-02 20:35:21 +00:00
Austin Clements	5a8c11ce3e	runtime: rename _MSpan* constants to mSpan* We already aliased mSpanInUse to _MSpanInUse. The dual constants are getting annoying, so fix all of these to use the mSpan* naming convention. This was done automatically with: sed -i -re 's/_?MSpan(Dead\|InUse\|Manual\|Free)/mSpan\1/g' *.go plus deleting the existing definition of mSpanInUse. Change-Id: I09979d9d491d06c10689cea625dc57faa9cc6767 Reviewed-on: https://go-review.googlesource.com/137875 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>	2018-09-26 20:51:07 +00:00
Martin Möhrmann	961eb13b67	runtime: replace sys.CacheLineSize by corresponding internal/cpu const and vars sys here is runtime/internal/sys. Replace uses of sys.CacheLineSize for padding by cpu.CacheLinePad or cpu.CacheLinePadSize. Replace other uses of sys.CacheLineSize by cpu.CacheLineSize. Remove now unused sys.CacheLineSize. Updates #25203 Change-Id: I1daf410fe8f6c0493471c2ceccb9ca0a5a75ed8f Reviewed-on: https://go-review.googlesource.com/126601 Run-TryBot: Martin Möhrmann <moehrmann@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2018-08-24 18:28:25 +00:00
Austin Clements	ec25210564	runtime: support a two-level arena map Currently, the heap arena map is a single, large array that covers every possible arena frame in the entire address space. This is practical up to about 48 bits of address space with 64 MB arenas. However, there are two problems with this: 1. mips64, ppc64, and s390x support full 64-bit address spaces (though on Linux only s390x has kernel support for 64-bit address spaces). On these platforms, it would be good to support these larger address spaces. 2. On Windows, processes are charged for untouched memory, so for processes with small heaps, the mostly-untouched 32 MB arena map plus a 64 MB arena are significant overhead. Hence, it would be good to reduce both the arena map size and the arena size, but with a single-level arena, these are inversely proportional. This CL adds support for a two-level arena map. Arena frame numbers are now divided into arenaL1Bits of L1 index and arenaL2Bits of L2 index. At the moment, arenaL1Bits is always 0, so we effectively have a single level map. We do a few things so that this has no cost beyond the current single-level map: 1. We embed the L2 array directly in mheap, so if there's a single entry in the L2 array, the representation is identical to the current representation and there's no extra level of indirection. 2. Hot code that accesses the arena map is structured so that it optimizes to nearly the same machine code as it does currently. 3. We make some small tweaks to hot code paths and to the inliner itself to keep some important functions inlined despite their now-larger ASTs. In particular, this is necessary for heapBitsForAddr and heapBits.next. Possibly as a result of some of the tweaks, this actually slightly improves the performance of the x/benchmarks garbage benchmark: name old time/op new time/op delta Garbage/benchmem-MB=64-12 2.28ms ± 1% 2.26ms ± 1% -1.07% (p=0.000 n=17+19) (https://perf.golang.org/search?q=upload:20180223.2) For #23900. Change-Id: If5164e0961754f97eb9eca58f837f36d759505ff Reviewed-on: https://go-review.googlesource.com/96779 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-23 21:59:50 +00:00
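The split of an arena frame number into an L1 and an L2 index, as described in ec25210564, might look roughly like the sketch below. The bit widths are hypothetical (the commit message says arenaL1Bits is currently always 0), and the types `arenaMap` and `heapArena` and the helpers `split`, `set`, and `get` are illustrative names. Unlike the commit, which embeds the array directly in mheap to avoid indirection in the single-level case, this sketch always allocates second-level arrays lazily.

```go
package main

import "fmt"

const (
	arenaL1Bits = 6  // hypothetical width of the L1 index
	arenaL2Bits = 16 // hypothetical width of the L2 index
)

// heapArena stands in for per-arena metadata.
type heapArena struct{ id uint64 }

// arenaMap is a two-level map from arena frame number to metadata.
// Second-level arrays are allocated only when first needed, so a small
// heap touches only a small part of the map.
type arenaMap struct {
	l1 [1 << arenaL1Bits]*[1 << arenaL2Bits]*heapArena
}

// split divides a frame number into its L1 (high bits) and L2 (low bits) parts.
func split(frame uint64) (l1, l2 uint64) {
	return frame >> arenaL2Bits, frame & (1<<arenaL2Bits - 1)
}

// set records metadata for frame, allocating the L2 array on demand.
// frame must be below 1<<(arenaL1Bits+arenaL2Bits).
func (m *arenaMap) set(frame uint64, a *heapArena) {
	i, j := split(frame)
	if m.l1[i] == nil {
		m.l1[i] = new([1 << arenaL2Bits]*heapArena)
	}
	m.l1[i][j] = a
}

// get returns the metadata for frame, or nil if none was recorded.
func (m *arenaMap) get(frame uint64) *heapArena {
	i, j := split(frame)
	if l2 := m.l1[i]; l2 != nil {
		return l2[j]
	}
	return nil
}

func main() {
	var m arenaMap
	m.set(0x12345, &heapArena{id: 7})
	l1, l2 := split(0x12345)
	fmt.Println(l1, l2, m.get(0x12345).id, m.get(0x54321) == nil)
}
```

With arenaL1Bits set to 0, `split` would always return an L1 index of 0, which is why the commit can treat the single-level layout as a special case of the two-level one.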
Austin Clements	33b76920ec	runtime: rename "arena index" to "arena map" There are too many places where I want to talk about "indexing into the arena index". Make this less awkward and ambiguous by calling it the "arena map" instead. Change-Id: I726b0667bb2139dbc006175a0ec09a871cdf73f9 Reviewed-on: https://go-review.googlesource.com/96777 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-23 21:59:48 +00:00
Jerrin Shaji George	5b3cd56038	runtime: fix a few typos in comments Change-Id: I07a1eb02ffc621c5696b49491181300bf411f822 Reviewed-on: https://go-review.googlesource.com/96475 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-02-23 00:17:20 +00:00
Austin Clements	ea8d7a370d	runtime: clarify address space limit constants and comments Now that we support the full non-contiguous virtual address space of amd64 hardware, some of the comments and constants related to this are out of date. This renames memLimitBits to heapAddrBits because 1<<memLimitBits is no longer the limit of the address space and rewrites the comment to focus first on hardware limits (which span OSes) and then discuss kernel limits. Second, this eliminates the memLimit constant because there's no longer a meaningful "highest possible heap pointer value" on amd64. Updates #23862. Change-Id: I44b32033d2deb6b69248fb8dda14fc0e65c47f11 Reviewed-on: https://go-review.googlesource.com/95498 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-21 20:32:36 +00:00
Austin Clements	ed1959c6e6	runtime: offset the heap arena index by 2^47 on amd64 On amd64, the virtual address space, when interpreted as signed values, is [-2^47, 2^47). Currently, we only support heap addresses in the "positive" half of this, [0, 2^47). This suffices for linux/amd64 and windows/amd64, but solaris/amd64 can map user addresses in the negative part of this range. Specifically, addresses 0xFFFF8000'00000000 to 0xFFFFFD80'00000000 are part of user space. This leads to "memory allocated by OS not in usable address space" panic, since we don't map heap arena index space for these addresses. Fix this by offsetting addresses when computing arena indexes so that arena entry 0 corresponds to address -2^47 on amd64. We already map enough arena space for 2^48 heap addresses on 64-bit (because arm64's virtual address space is [0, 2^48)), so we don't need to grow any structures to support this. A different approach would be to simply mask out the top 16 bits. However, there are two advantages to the offset approach: 1) invalid heap addresses continue to naturally map to invalid arena indexes so we don't need extra checks and 2) it perturbs the mapping of addresses to arena indexes more, which helps check that we don't accidentally compute incorrect arena indexes somewhere that happen to be right most of the time. Several comments and constant names are now somewhat misleading. We'll fix that in the next CL. This CL is the core change to the arena indexing. Fixes #23862. Change-Id: Idb8e299fded04593a286b01a9582da6ddbac2f9a Reviewed-on: https://go-review.googlesource.com/95497 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-21 20:32:35 +00:00
Austin Clements	e9db7b9dd1	runtime: abstract indexing of arena index Accessing the arena index is about to get slightly more complicated. Abstract this away into a set of functions for going back and forth between addresses and arena slice indexes. For #23862. Change-Id: I0b20e74ef47a07b78ed0cf0a6128afe6f6e40f4b Reviewed-on: https://go-review.googlesource.com/95496 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-21 20:32:34 +00:00
Ryuma Yoshida	8fc25b531b	all: remove duplicate word "the" Change-Id: Ia5908e94a6bd362099ca3c63f6ffb7e94457131d GitHub-Last-Rev: `545a40571a` GitHub-Pull-Request: golang/go#23942 Reviewed-on: https://go-review.googlesource.com/95435 Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>	2018-02-20 16:45:55 +00:00
Austin Clements	51ae88ee2f	runtime: remove non-reserved heap logic Currently large sysReserve calls on some OSes don't actually reserve the memory, but just check that it can be reserved. This was important when we called sysReserve to "reserve" many gigabytes for the heap up front, but now that we map memory in small increments as we need it, this complication is no longer necessary. This has one curious side benefit: currently, on Linux, allocations that are large enough to be rejected by mmap wind up freezing the application for a long time before it panics. This happens because sysReserve doesn't reserve the memory, so sysMap calls mmap_fixed, which calls mmap, which fails because the mapping is too large. However, mmap_fixed doesn't inspect why mmap fails, so it falls back to probing every page in the desired region individually with mincore before performing an (otherwise dangerous) MAP_FIXED mapping, which will also fail. This takes a long time for a large region. Now this logic is gone, so the mmap failure leads to an immediate panic. Updates #10460. Change-Id: I8efe88c611871cdb14f99fadd09db83e0161ca2e Reviewed-on: https://go-review.googlesource.com/85888 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:24 +00:00
Austin Clements	2b415549b8	runtime: use sparse mappings for the heap This replaces the contiguous heap arena mapping with a potentially sparse mapping that can support heap mappings anywhere in the address space. This has several advantages over the current approach: * There is no longer any limit on the size of the Go heap. (Currently it's limited to 512GB.) Hence, this fixes #10460. * It eliminates many failure modes of heap initialization and growing. In particular it eliminates any possibility of panicking with an address space conflict. This can happen for many reasons and even causes a low but steady rate of TSAN test failures because of conflicts with the TSAN runtime. See #16936 and #11993. * It eliminates the notion of "non-reserved" heap, which was added because creating huge address space reservations (particularly on 64-bit) led to huge process VSIZE. This was at best confusing and at worst conflicted badly with ulimit -v. However, the non-reserved heap logic is complicated, can race with other mappings in non-pure Go binaries (e.g., #18976), and requires that the entire heap be either reserved or non-reserved. We currently maintain the latter property, but it's quite difficult to convince yourself of that, and hence difficult to keep correct. This logic is still present, but will be removed in the next CL. * It fixes problems on 32-bit where skipping over parts of the address space leads to mapping huge (and never-to-be-used) metadata structures. See #19831. This also completely rewrites and significantly simplifies mheap.sysAlloc, which has been a source of many bugs. E.g., #21044, #20259, #18651, and #13143 (and maybe #23222). This change also makes it possible to allocate individual objects larger than 512GB. As a result, a few tests that expected huge allocations to fail needed to be changed to make even larger allocations. 
However, at the moment attempting to allocate a humongous object may cause the program to freeze for several minutes on Linux as we fall back to probing every page with addrspace_free. That logic (and this failure mode) will be removed in the next CL. Fixes #10460. Fixes #22204 (since it rewrites the code involved). This slightly slows down compilebench and the x/benchmarks garbage benchmark. name old time/op new time/op delta Template 184ms ± 1% 185ms ± 1% ~ (p=0.065 n=10+9) Unicode 86.9ms ± 3% 86.3ms ± 1% ~ (p=0.631 n=10+10) GoTypes 599ms ± 0% 602ms ± 0% +0.56% (p=0.000 n=10+9) Compiler 2.87s ± 1% 2.89s ± 1% +0.51% (p=0.002 n=9+10) SSA 7.29s ± 1% 7.25s ± 1% ~ (p=0.182 n=10+9) Flate 118ms ± 2% 118ms ± 1% ~ (p=0.113 n=9+9) GoParser 147ms ± 1% 148ms ± 1% +1.07% (p=0.003 n=9+10) Reflect 401ms ± 1% 404ms ± 1% +0.71% (p=0.003 n=10+9) Tar 175ms ± 1% 175ms ± 1% ~ (p=0.604 n=9+10) XML 209ms ± 1% 210ms ± 1% ~ (p=0.052 n=10+10) (https://perf.golang.org/search?q=upload:20171231.4) name old time/op new time/op delta Garbage/benchmem-MB=64-12 2.23ms ± 1% 2.25ms ± 1% +0.84% (p=0.000 n=19+19) (https://perf.golang.org/search?q=upload:20171231.3) Relative to the start of the sparse heap changes (starting at and including "runtime: fix various contiguous bitmap assumptions"), overall slowdown is roughly 1% on GC-intensive benchmarks: name old time/op new time/op delta Template 183ms ± 1% 185ms ± 1% +1.32% (p=0.000 n=9+9) Unicode 84.9ms ± 2% 86.3ms ± 1% +1.65% (p=0.000 n=9+10) GoTypes 595ms ± 1% 602ms ± 0% +1.19% (p=0.000 n=9+9) Compiler 2.86s ± 0% 2.89s ± 1% +0.91% (p=0.000 n=9+10) SSA 7.19s ± 0% 7.25s ± 1% +0.75% (p=0.000 n=8+9) Flate 117ms ± 1% 118ms ± 1% +1.10% (p=0.000 n=10+9) GoParser 146ms ± 2% 148ms ± 1% +1.48% (p=0.002 n=10+10) Reflect 398ms ± 1% 404ms ± 1% +1.51% (p=0.000 n=10+9) Tar 173ms ± 1% 175ms ± 1% +1.17% (p=0.000 n=10+10) XML 208ms ± 1% 210ms ± 1% +0.62% (p=0.011 n=10+10) [Geo mean] 369ms 373ms +1.17% (https://perf.golang.org/search?q=upload:20180101.2) name old 
time/op new time/op delta Garbage/benchmem-MB=64-12 2.22ms ± 1% 2.25ms ± 1% +1.51% (p=0.000 n=20+19) (https://perf.golang.org/search?q=upload:20180101.3) Change-Id: I5daf4cfec24b252e5a57001f0a6c03f22479d0f0 Reviewed-on: https://go-review.googlesource.com/85887 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:23 +00:00
Austin Clements	d6e8218581	runtime: make span map sparse This splits the span map into separate chunks for every 64MB of the heap. The span map chunks now live in the same indirect structure as the bitmap. Updates #10460. This causes a slight improvement in compilebench and the x/benchmarks garbage benchmark. I'm not sure why it improves performance. name old time/op new time/op delta Template 185ms ± 1% 184ms ± 1% ~ (p=0.315 n=9+10) Unicode 86.9ms ± 1% 86.9ms ± 3% ~ (p=0.356 n=9+10) GoTypes 602ms ± 1% 599ms ± 0% -0.59% (p=0.002 n=9+10) Compiler 2.89s ± 0% 2.87s ± 1% -0.50% (p=0.003 n=9+9) SSA 7.25s ± 0% 7.29s ± 1% ~ (p=0.400 n=9+10) Flate 118ms ± 1% 118ms ± 2% ~ (p=0.065 n=10+9) GoParser 147ms ± 2% 147ms ± 1% ~ (p=0.549 n=10+9) Reflect 403ms ± 1% 401ms ± 1% -0.47% (p=0.035 n=9+10) Tar 176ms ± 1% 175ms ± 1% -0.59% (p=0.013 n=10+9) XML 211ms ± 1% 209ms ± 1% -0.83% (p=0.011 n=10+10) (https://perf.golang.org/search?q=upload:20171231.1) name old time/op new time/op delta Garbage/benchmem-MB=64-12 2.24ms ± 1% 2.23ms ± 1% -0.36% (p=0.001 n=20+19) (https://perf.golang.org/search?q=upload:20171231.2) Change-Id: I2563f8704ab9812434947faf293c5327f9b0d07a Reviewed-on: https://go-review.googlesource.com/85885 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:20 +00:00
Austin Clements	0de5324d61	runtime: abstract remaining mheap.spans access This abstracts the remaining direct accesses to mheap.spans into new mheap.setSpan and mheap.setSpans methods. For #10460. Change-Id: Id1db8bc5e34a77a9221032aa2e62d05322707364 Reviewed-on: https://go-review.googlesource.com/85884 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:19 +00:00
Austin Clements	c0392d2e7f	runtime: make the heap bitmap sparse This splits the heap bitmap into separate chunks for every 64MB of the heap and introduces an index mapping from virtual address to metadata. It modifies the heapBits abstraction to use this two-level structure. Finally, it modifies heapBitsSetType to unroll the bitmap into the object itself and then copy it out if the bitmap would span discontiguous bitmap chunks. This is a step toward supporting general sparse heaps, which will eliminate address space conflict failures as well as the limit on the heap size. It's also advantageous for 32-bit. 32-bit already supports discontiguous heaps by always starting the arena at address 0. However, as a result, with a contiguous bitmap, if the kernel chooses a high address (near 2GB) for a heap mapping, the runtime is forced to map up to 128MB of heap bitmap. Now the runtime can map sections of the bitmap for just the parts of the address space used by the heap. Updates #10460. This slightly slows down the x/garbage and compilebench benchmarks. However, I think the slowdown is acceptably small. 
name old time/op new time/op delta Template 178ms ± 1% 180ms ± 1% +0.78% (p=0.029 n=10+10) Unicode 85.7ms ± 2% 86.5ms ± 2% ~ (p=0.089 n=10+10) GoTypes 594ms ± 0% 599ms ± 1% +0.70% (p=0.000 n=9+9) Compiler 2.86s ± 0% 2.87s ± 0% +0.40% (p=0.001 n=9+9) SSA 7.23s ± 2% 7.29s ± 2% +0.94% (p=0.029 n=10+10) Flate 116ms ± 1% 117ms ± 1% +0.99% (p=0.000 n=9+9) GoParser 146ms ± 1% 146ms ± 0% ~ (p=0.193 n=10+7) Reflect 399ms ± 0% 403ms ± 1% +0.89% (p=0.001 n=10+10) Tar 173ms ± 1% 174ms ± 1% +0.91% (p=0.013 n=10+9) XML 208ms ± 1% 210ms ± 1% +0.93% (p=0.000 n=10+10) [Geo mean] 368ms 371ms +0.79% name old time/op new time/op delta Garbage/benchmem-MB=64-12 2.17ms ± 1% 2.21ms ± 1% +2.15% (p=0.000 n=20+20) Change-Id: I037fd283221976f4f61249119d6b97b100bcbc66 Reviewed-on: https://go-review.googlesource.com/85883 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:18 +00:00
Austin Clements	29e9c4d4a4	runtime: lay out heap bitmap forward in memory Currently the heap bitmap is laid in reverse order in memory relative to the heap itself. This was originally done out of "excessive cleverness" so that computing a bitmap pointer could load only the arena_start field and so that heaps could be more contiguous by growing the arena and the bitmap out from a common center point. However, this appears to have no actual performance benefit, it complicates nearly every use of the bitmap, and it makes already confusing code more confusing. Furthermore, it's still possible to use a single field (the new bitmap_delta) for the bitmap pointer computation by employing slightly different excessive cleverness. Hence, this CL puts the bitmap into forward order. This is a (very) updated version of CL 9404. Change-Id: I743587cc626c4ecd81e660658bad85b54584108c Reviewed-on: https://go-review.googlesource.com/85881 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:16 +00:00
Austin Clements	4de468621a	runtime: use spanOf* more widely The logic in the spanOf* functions is open-coded in a lot of places right now. Replace these with calls to the spanOf* functions. Change-Id: I3cc996aceb9a529b60fea7ec6fef22008c012978 Reviewed-on: https://go-review.googlesource.com/85880 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:15 +00:00
Austin Clements	a90f9a00ca	runtime: consolidate mheap.lookup* and spanOf* I think we'd forgotten about the mheap.lookup APIs when we introduced spanOf, but, at any rate, the spanOf functions are used far more widely at this point, so this CL eliminates the mheap.lookup* functions in favor of spanOf*. Change-Id: I15facd0856e238bb75d990e838a092b5bef5bdfc Reviewed-on: https://go-review.googlesource.com/85879 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:14 +00:00
Austin Clements	058bb7ea27	runtime: split object finding out of heapBitsForObject heapBitsForObject does two things: it finds the base of the object and it creates the heapBits for the base of the object. There are several places where we just care about the base of the object. Furthermore, greyobject only needs the heapBits in the checkmark path and can easily compute them only when needed. Once we eliminate passing the heap bits to greyobject, almost all uses of heapBitsForObject don't need the heap bits. Hence, this splits heapBitsForObject into findObject and heapBitsForAddr (the latter already exists), removes the hbits argument to greyobject, and replaces all heapBitsForObject calls with calls to findObject. In addition to making things cleaner overall, heapBitsForAddr is going to get more expensive shortly, so it's important that we don't do it needlessly. Note that there's an interesting performance pitfall here. I had originally moved findObject to mheap.go, since it made more sense there. However, that leads to a ~2% slow down and a whopping 11% increase in L1 icache misses on both the x/garbage and compilebench benchmarks. This suggests we may want to be more principled about this, but, for now, let's just leave findObject in mbitmap.go. (I tried to make findObject small enough to inline by splitting out the error case, but, sadly, wasn't quite able to get it under the inlining budget.) Change-Id: I7bcb92f383ade565d22a9f2494e4c66fd513fb10 Reviewed-on: https://go-review.googlesource.com/85878 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:13 +00:00
Austin Clements	41e6abdc61	runtime: replace mlookup and findObject with heapBitsForObject These functions all serve essentially the same purpose. mlookup is used in only one place and findObject in only three. Use heapBitsForObject instead, which is the most optimized implementation. (This may seem slightly silly because none of these uses care about the heap bits, but we're about to split up the functionality of heapBitsForObject anyway. At that point, findObject will rise from the ashes.) Change-Id: I906468c972be095dd23cf2404a7d4434e802f250 Reviewed-on: https://go-review.googlesource.com/85877 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>	2018-02-15 21:12:12 +00:00
Austin Clements	164e1b8477	runtime: eliminate remaining recordspan write barriers recordspan has two remaining write barriers from writing to the pointer to the backing store of h.allspans. However, h.allspans is always backed by off-heap memory, so let the compiler know this. Unfortunately, this isn't quite as clean as most go:notinheap uses because we can't directly name the backing store of a slice, but we can get it done with some judicious casting. For #22460. Change-Id: I296f92fa41cf2cb6ae572b35749af23967533877 Reviewed-on: https://go-review.googlesource.com/73414 Reviewed-by: Rick Hudson <rlh@golang.org>	2017-10-29 20:22:00 +00:00
Keith Randall	97d17fcfd1	runtime: force the type of specialfinalizer into DWARF The core dump reader wants to know the layout of this type. No variable has this type, so it wasn't previously dumped to DWARF output. Change-Id: I982040b81bff202976743edc7fe53247533a9d81 Reviewed-on: https://go-review.googlesource.com/68312 Reviewed-by: Austin Clements <austin@google.com>	2017-10-05 20:03:42 +00:00
Kunpei Sakai	5a986eca86	all: fix article typos a -> an Change-Id: I7362bdc199e83073a712be657f5d9ba16df3077e Reviewed-on: https://go-review.googlesource.com/63850 Reviewed-by: Rob Pike <r@golang.org>	2017-09-15 02:39:16 +00:00
Daniel Martí	59413d34c9	all: unindent some big chunks of code Found with mvdan.cc/unindent. Prioritized the ones with the biggest wins for now. Change-Id: I2b032e45cdd559fc9ed5b1ee4c4de42c4c92e07b Reviewed-on: https://go-review.googlesource.com/56470 Run-TryBot: Daniel Martí <mvdan@mvdan.cc> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Ian Lance Taylor <iant@golang.org>	2017-08-18 06:59:48 +00:00
Austin Clements	53f2d53450	runtime: document concurrency of mheap.spans We use lock-free reads from mheap.spans, but the safety of these is somewhat subtle. Document this. Change-Id: I928c893232176135308e38bed788d5f84ff11533 Reviewed-on: https://go-review.googlesource.com/54310 Reviewed-by: Rick Hudson <rlh@golang.org>	2017-08-09 16:06:23 +00:00
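
The arena indexing described in commits ed1959c6e6 (offset by 2^47) and ec25210564 (two-level arena map) can be sketched roughly as below. This is only an illustration, not the runtime's code. The constant values are assumptions chosen for the example: arenaL1Bits is set to 6 here so that the split is visible, whereas the commit message says it is currently always 0. The names arenaBaseOffset, arenaBytesLog2, arenaIndex, splitIndex, and validIndex are hypothetical helpers for this sketch.

```go
package main

import "fmt"

// Illustrative constants only; the values are assumptions for this sketch.
const (
	heapAddrBits    = 48              // address bits covered by the arena map
	arenaBytesLog2  = 26              // log2 of a 64 MB arena
	arenaBaseOffset = uint64(1) << 47 // arena entry 0 corresponds to address -2^47
	arenaL1Bits     = 6               // hypothetical; the commit says this is currently 0
	arenaL2Bits     = heapAddrBits - arenaBytesLog2 - arenaL1Bits
)

// arenaIndex maps an address to an arena frame number. Adding the offset
// (with unsigned wraparound) makes the "negative" half of the amd64 address
// space land at the low indexes.
func arenaIndex(addr uint64) uint64 {
	return (addr + arenaBaseOffset) >> arenaBytesLog2
}

// validIndex reports whether idx falls inside the arena map. Addresses
// outside [-2^47, 2^47) produce indexes that fail this check without any
// extra masking.
func validIndex(idx uint64) bool {
	return idx < 1<<(heapAddrBits-arenaBytesLog2)
}

// splitIndex divides an arena frame number into its L1 and L2 parts.
func splitIndex(idx uint64) (l1, l2 uint64) {
	return idx >> arenaL2Bits, idx & (1<<arenaL2Bits - 1)
}

func main() {
	addrs := []uint64{
		0xFFFF800000000000, // -2^47, lowest address in the range
		0,                  // start of the "positive" half
		1<<47 - 1,          // highest address in the range
		1 << 47,            // just past the range
	}
	for _, a := range addrs {
		idx := arenaIndex(a)
		l1, l2 := splitIndex(idx)
		fmt.Printf("addr=%#x idx=%d valid=%v l1=%d l2=%d\n", a, idx, validIndex(idx), l1, l2)
	}
}
```

With these assumed constants, address 0xFFFF800000000000 maps to index 0, while address 1<<47 maps to index 1<<22, which is outside the map. This matches the first advantage the commit message gives for offsetting over masking: invalid heap addresses naturally map to invalid arena indexes.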

160 commits