Commit graph

237 commits

Austin Clements
4e1bf8ed38 runtime: add GC testing helpers for regabi signature fuzzer
This CL adds a set of helper functions for testing GC interactions.
These are intended for use in the regabi signature fuzzer, but are
generally useful for GC tests, so we make them generally available to
runtime tests.

These provide:

1. An easy way to force stack movement, for testing stack copying.

2. A simple and robust way to check the reachability of a set of
pointers.

3. A way to check what general category of memory a pointer points to,
mostly so tests can make sure they're testing what they mean to.

For #40724, but generally useful.
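
For illustration only (this is not one of the helpers added here), a common
way a test can force stack movement is to recurse deeply enough that the
goroutine's stack must grow, which copies it to a new allocation:

    // useStack burns roughly n KiB of stack, forcing growth (and therefore a
    // stack copy) if the current stack is too small. Illustrative only.
    func useStack(n int) byte {
        var buf [1024]byte
        if n <= 1 {
            return buf[0]
        }
        return useStack(n-1) + buf[0]
    }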

Change-Id: I15d33ccb3f5a792c0472a19c2cc9a8b4a9356a66
Reviewed-on: https://go-review.googlesource.com/c/go/+/305330
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
2021-03-29 21:50:16 +00:00
Austin Clements
1ef114d12c runtime: abstract specials list iteration
The specials processing loop in mspan.sweep is about to get more
complicated and I'm too allergic to list manipulation to open code
more of it there.

Change-Id: I767a0889739da85fb2878fc06a5c55b73bf2ba7d
Reviewed-on: https://go-review.googlesource.com/c/go/+/305551
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2021-03-29 21:50:14 +00:00
Michael Anthony Knyszek
bd6aeca968 runtime: prepare arenas for use incrementally
This change moves the call of sysMap from (*mheap).sysAlloc into
(*mheap).grow, so we only sysMap what we're going to use in the near
future (thanks to the curArena mechanism). The purpose of this change is
to better support systems with strict overcommit rules which generally
accept reserved memory but not prepared memory (see malloc.go for exact
descriptions of these states).

This move requires changing linearAlloc to only optionally map memory.
In one case, with mheap.heapArenaAlloc, we do want it to map memory. But
now in the other case, with mheap.arena, we don't, because we want grow
to take care of it.

The risk with this change is we may make more syscalls than before on
systems with 64 MiB arenas, but because heap growth is relatively rare
this is unlikely to be a noticeable issue. We also bound the number of
syscalls made by only extending curArena (and thus mapping) by
pallocChunkPages*pageSize, which is 4 MiB.

Fixes #42612.

Change-Id: I736df696afe78ddb1a747a896caa0db8726027e5
Reviewed-on: https://go-review.googlesource.com/c/go/+/270537
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2021-03-15 20:20:51 +00:00
Matthew Dempsky
4662029264 runtime: simplify divmagic for span calculations
It's both simpler and faster to just unconditionally do two 32-bit
multiplies rather than a bunch of branching to try to avoid them.
This is safe thanks to the tight bounds derived in [1] and verified by
mksizeclasses.go.

Benchstat results below for compilebench benchmarks on my P920. See
also [2] for micro benchmarks comparing the new functions against the
originals (as well as several basic attempts at optimizing them).

name                      old time/op       new time/op       delta
Template                        295ms ± 3%        290ms ± 1%  -1.95%  (p=0.000 n=20+20)
Unicode                         113ms ± 3%        110ms ± 2%  -2.32%  (p=0.000 n=21+17)
GoTypes                         1.78s ± 1%        1.76s ± 1%  -1.23%  (p=0.000 n=21+20)
Compiler                        119ms ± 2%        117ms ± 4%  -1.53%  (p=0.007 n=20+20)
SSA                             14.3s ± 1%        13.8s ± 1%  -3.12%  (p=0.000 n=17+20)
Flate                           173ms ± 2%        170ms ± 1%  -1.64%  (p=0.000 n=20+19)
GoParser                        278ms ± 2%        273ms ± 2%  -1.92%  (p=0.000 n=20+19)
Reflect                         686ms ± 3%        671ms ± 3%  -2.18%  (p=0.000 n=19+20)
Tar                             255ms ± 2%        248ms ± 2%  -2.90%  (p=0.000 n=20+20)
XML                             335ms ± 3%        327ms ± 2%  -2.34%  (p=0.000 n=20+20)
LinkCompiler                    799ms ± 1%        799ms ± 1%    ~     (p=0.925 n=20+20)
ExternalLinkCompiler            1.90s ± 1%        1.90s ± 0%    ~     (p=0.327 n=20+20)
LinkWithoutDebugCompiler        385ms ± 1%        386ms ± 1%    ~     (p=0.251 n=18+20)
[Geo mean]                      512ms             504ms       -1.61%

name                      old user-time/op  new user-time/op  delta
Template                        286ms ± 4%        282ms ± 4%  -1.42%  (p=0.025 n=21+20)
Unicode                         104ms ± 9%        102ms ±14%    ~     (p=0.294 n=21+20)
GoTypes                         1.75s ± 3%        1.72s ± 2%  -1.36%  (p=0.000 n=21+20)
Compiler                        109ms ±11%        108ms ± 8%    ~     (p=0.187 n=21+19)
SSA                             14.0s ± 1%        13.5s ± 2%  -3.25%  (p=0.000 n=16+20)
Flate                           166ms ± 4%        164ms ± 4%  -1.34%  (p=0.032 n=19+19)
GoParser                        268ms ± 4%        263ms ± 4%  -1.71%  (p=0.011 n=18+20)
Reflect                         666ms ± 3%        654ms ± 4%  -1.77%  (p=0.002 n=18+20)
Tar                             245ms ± 5%        236ms ± 6%  -3.34%  (p=0.000 n=20+20)
XML                             320ms ± 4%        314ms ± 3%  -2.01%  (p=0.001 n=19+18)
LinkCompiler                    744ms ± 4%        747ms ± 3%    ~     (p=0.627 n=20+19)
ExternalLinkCompiler            1.71s ± 3%        1.72s ± 2%    ~     (p=0.149 n=20+20)
LinkWithoutDebugCompiler        345ms ± 6%        342ms ± 8%    ~     (p=0.355 n=20+20)
[Geo mean]                      484ms             477ms       -1.50%

[1] Daniel Lemire, Owen Kaser, Nathan Kurz. 2019. "Faster Remainder by
Direct Computation: Applications to Compilers and Software Libraries."
https://arxiv.org/abs/1902.01961

[2] https://github.com/mdempsky/benchdivmagic
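
As a rough sketch of the trick (not the runtime's actual code; names here
are illustrative): for a fixed divisor d, precompute m = ceil(2^32/d), and
then both the quotient and the remainder fall out of 32-bit multiplies:

    // divMagic returns ceil(2^32/d) for a divisor d > 0.
    func divMagic(d uint32) uint32 {
        return uint32((uint64(1)<<32 + uint64(d) - 1) / uint64(d))
    }

    // divRem computes x/d and x%d using the magic value m = divMagic(d).
    // Per [1], this is exact as long as x and d are small enough, which is
    // what mksizeclasses.go verifies for the size classes.
    func divRem(x, d, m uint32) (q, r uint32) {
        q = uint32((uint64(x) * uint64(m)) >> 32) // high 32 bits give the quotient
        lo := uint32(uint64(x) * uint64(m))       // low 32 bits encode the fraction
        r = uint32((uint64(lo) * uint64(d)) >> 32)
        return
    }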

Change-Id: Ie4d214e7a908b0d979c878f2d404bd56bdf374f6
Reviewed-on: https://go-review.googlesource.com/c/go/+/300994
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Trust: Matthew Dempsky <mdempsky@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2021-03-12 18:02:59 +00:00
Joel Sing
5ca43acdb3 runtime: allow physical page aligned stacks to be allocated
Add a physPageAlignedStack boolean which, if set, results in over-allocation
by a physical page; the allocation is rounded up to physical page alignment
and the unused memory surrounding it is freed again.

OpenBSD/octeon has 16KB physical pages and requires stacks to be physical page
aligned in order for them to be remapped as MAP_STACK. This change allows Go
to work on this platform.

Based on a suggestion from mknyszek in issue #41008.

Updates #40995
Fixes #41008
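
The arithmetic involved is roughly the following (an illustrative sketch,
not the actual stackalloc changes): over-allocate by one physical page,
round the base up, and the head and tail slack around the aligned region is
what gets freed back:

    // trimToPhysPage reports the physical-page-aligned base inside an
    // over-allocated region raw of length size+physPageSize, plus the unused
    // head and tail byte counts that can be released again.
    // physPageSize must be a power of two.
    func trimToPhysPage(raw, size, physPageSize uintptr) (base, head, tail uintptr) {
        base = (raw + physPageSize - 1) &^ (physPageSize - 1) // round up
        head = base - raw                                     // slack before the stack
        tail = physPageSize - head                            // slack after the stack
        return
    }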

Change-Id: Ia5d652292b515916db473043b41f6030094461d8
Reviewed-on: https://go-review.googlesource.com/c/go/+/266919
Trust: Joel Sing <joel@sing.id.au>
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2020-11-04 06:14:02 +00:00
Michael Anthony Knyszek
39a5ee52b9 runtime: decouple consistent stats from mcache and allow P-less update
This change modifies the consistent stats implementation to keep the
per-P sequence counter on each P instead of each mcache. A valid mcache
is not available everywhere that we want to call e.g. allocSpan, as per
issue #42339. By decoupling these two, we can add a mechanism to allow
contexts without a P to update stats consistently.

In this CL, we achieve that with a mutex. In practice, it will be very
rare for an M to update these stats without a P. Furthermore, the stats
reader also only needs to hold the mutex across the update to "gen"
since once that changes, writers are free to continue updating the new
stats generation. Contention could thus only arise between writers
without a P, and as mentioned earlier, those should be rare.

A nice side-effect of this change is that the consistent stats acquire
and release API becomes simpler.

Fixes #42339.

Change-Id: Ied74ab256f69abd54b550394c8ad7c4c40a5fe34
Reviewed-on: https://go-review.googlesource.com/c/go/+/267158
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-11-02 21:21:46 +00:00
Michael Anthony Knyszek
ac766e3718 runtime: make getMCache inlineable
This change moves the responsibility of throwing if an mcache is not
available to the caller, because the inlining cost of throw is set very
high in the compiler. Even if it were reduced to the cost of a usual
function call, it would still be too expensive, so just move it out.

This choice also makes sense in the context of #42339 since we're going
to have to handle the case where we don't have an mcache to update stats
in a few contexts anyhow.

Also, add getMCache to the list of functions that should be inlined to
prevent future regressions.

getMCache is called on the allocation fast path, and because it's not
inlined it actually causes a significant regression (~10%) in some
microbenchmarks.

Fixes #42305.
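
The shape of the change, much simplified (the types here are stand-ins, not
the runtime's): the getter stays tiny enough to inline and returns nil
instead of throwing, so the expensive-to-inline throw lives at each call
site:

    type pStub struct{ cache *cacheStub } // stand-in for p
    type cacheStub struct{}               // stand-in for mcache

    // getCache is small enough to inline; callers check for nil and throw.
    func getCache(pp *pStub) *cacheStub {
        if pp == nil {
            return nil
        }
        return pp.cache
    }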

Change-Id: I64ac5e4f26b730bd4435ea1069a4a50f55411ced
Reviewed-on: https://go-review.googlesource.com/c/go/+/267157
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2020-11-02 21:10:41 +00:00
Michael Pratt
9393b5bae5 runtime: add heap lock assertions
Some functions that required holding the heap lock _or_ world stop have
been simplified to require only the heap lock. This is conceptually
simpler, and taking the heap lock during a world stop is guaranteed not to
contend. This was only done on functions already called on the
systemstack to avoid too many extra systemstack calls in GC.

Updates #40677

Change-Id: I15aa1dadcdd1a81aac3d2a9ecad6e7d0377befdc
Reviewed-on: https://go-review.googlesource.com/c/go/+/250262
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Pratt <mpratt@google.com>
2020-10-30 20:21:14 +00:00
Michael Anthony Knyszek
f77a9025f1 runtime: replace some memstats with consistent stats
This change replaces stacks_inuse, gcWorkBufInUse and
gcProgPtrScalarBitsInUse with their corresponding consistent stats. It
also adds checks to make sure the rest of the sharded stats line up with
existing stats in updatememstats.

Change-Id: I17d0bd181aedb5c55e09c8dff18cef5b2a3a14e3
Reviewed-on: https://go-review.googlesource.com/c/go/+/247038
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:28:20 +00:00
Michael Anthony Knyszek
fe7ff71185 runtime: add consistent heap statistics
This change adds a global set of heap statistics which are similar
to existing memory statistics. The purpose of these new statistics
is to be able to read them and get a consistent result without stopping
the world. The goal is to eventually replace as many of the existing
memstats statistics with the sharded ones as possible.

The consistent memory statistics use a tailor-made synchronization
mechanism to allow writers (allocators) to proceed with minimal
synchronization by using a sequence counter and a global generation
counter to determine which set of statistics to update. Readers
increment the global generation counter to effectively grab a snapshot
of the statistics, and then iterate over all Ps using the sequence
counter to ensure that they may safely read the snapshotted statistics.
To keep statistics fresh, the reader also has a responsibility to merge
sets of statistics.

These consistent statistics are computed, but otherwise unused for now.
Upcoming changes will integrate them with the rest of the codebase and
will begin to phase out existing statistics.
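
A greatly simplified sketch of the write side (field names and sizes here
are made up; the real code keeps three generations of per-P deltas): a
writer bumps its sequence counter to odd, writes into whichever generation
it observes, and bumps back to even, so a reader can flip the generation
and then wait for every sequence counter to be even:

    package sketch

    import "sync/atomic"

    type statWriter struct {
        seq   uint32       // odd while a write is in progress
        stats [3][8]uint64 // hypothetical: three generations of eight counters
    }

    func (w *statWriter) add(gen *uint32, which int, v uint64) {
        atomic.AddUint32(&w.seq, 1)     // now odd: write in progress
        g := atomic.LoadUint32(gen) % 3 // pick the current generation
        w.stats[g][which] += v
        atomic.AddUint32(&w.seq, 1)     // even again: write complete
    }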

Change-Id: I637a11f2439e2049d7dccb8650c5d82500733ca5
Reviewed-on: https://go-review.googlesource.com/c/go/+/247037
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:28:14 +00:00
Michael Anthony Knyszek
c5dea8f387 runtime: remove memstats.heap_idle
This statistic is updated in many places but for MemStats may be
computed from existing statistics. Specifically, by definition,
heap_idle = heap_sys - heap_inuse, since heap_sys is all memory allocated
from the OS for use in the heap minus memory used for non-heap purposes,
and heap_idle is almost the same (it explicitly includes memory that
*could* be used for non-heap purposes) except that it doesn't include
memory that's actually used to hold heap objects.

Although it has some utility as a sanity check, it complicates
accounting and we want fewer, orthogonal statistics for upcoming metrics
changes, so just drop it.

Change-Id: I40af54a38e335f43249f6e218f35088bfd4380d1
Reviewed-on: https://go-review.googlesource.com/c/go/+/246974
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:10:12 +00:00
Michael Anthony Knyszek
ad863ba32a runtime: break down memstats.gc_sys
This change breaks apart gc_sys into three distinct pieces. Two of those
pieces come from heap_sys since they're allocated from the page heap. The
rest comes from memory mapped from e.g.
persistentalloc which better fits the purpose of a sysMemStat. Also,
rename gc_sys to gcMiscSys.

Change-Id: I098789170052511e7b31edbcdc9a53e5c24573f7
Reviewed-on: https://go-review.googlesource.com/c/go/+/246973
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:10:04 +00:00
Michael Anthony Knyszek
8ebc58452a runtime: delineate which memstats are system stats with a type
This change modifies the type of several mstats fields to be a new type:
sysMemStat. This type has the same structure as the fields used to have.

The purpose of this change is to make it very clear which stats may be
used in various functions for accounting (usually the platform-specific
sys* functions, but there are others). Currently there's an implicit
understanding that the *uint64 value passed to these functions is some
kind of statistic whose value is atomically managed. This understanding
isn't inherently problematic, but we're about to change how some stats
(which currently use mSysStatInc and mSysStatDec) work, so we want to
make it very clear what the various requirements are around "sysStat".

This change also removes mSysStatInc and mSysStatDec in favor of a
method on sysMemStat. Note that those two functions were originally
written the way they were because atomic 64-bit adds required a valid G
on ARM, but this hasn't been the case for a very long time (since
golang.org/cl/14204, but even before then it wasn't clear if mutexes
required a valid G anymore). Today we implement 64-bit adds on ARM with
a spinlock table.
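
In shape (simplified relative to the real type in mstats.go), sysMemStat is
just a uint64 whose only mutation path is an atomic add, which makes the
accounting requirement explicit in the type system:

    package sketch

    import "sync/atomic"

    // sysMemStat models the idea: a 64-bit OS-memory statistic that may only
    // be read and updated atomically.
    type sysMemStat uint64

    func (s *sysMemStat) load() uint64 {
        return atomic.LoadUint64((*uint64)(s))
    }

    func (s *sysMemStat) add(n int64) {
        atomic.AddUint64((*uint64)(s), uint64(n)) // negative n wraps, i.e. subtracts
    }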

Change-Id: I4e9b37cf14afc2ae20cf736e874eb0064af086d7
Reviewed-on: https://go-review.googlesource.com/c/go/+/246971
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:09:41 +00:00
Michael Anthony Knyszek
dc02578ac8 runtime: make the span allocation purpose more explicit
This change modifies mheap's span allocation API to have each caller
declare a purpose, defined as a new enum called spanAllocType.

The purpose behind this change is two-fold:
1. Tight control over who gets to allocate heap memory is, generally
   speaking, a good thing. Every codepath that allocates heap memory
   places additional implicit restrictions on the allocator. A notable
   example of a restriction is work bufs coming from heap memory: write
   barriers are not allowed in allocation paths because then we could
   have a situation where the allocator calls into the allocator.
2. Memory statistic updating is explicit. Instead of passing an opaque
   pointer for statistic updating, which places restrictions on how that
   statistic may be updated, we use the spanAllocType to determine which
   statistic to update and how.

We also take this opportunity to group all the statistic updating code
together, which should make the accounting code a little easier to
follow.
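
A sketch of the shape (the constant and field names here are illustrative,
not necessarily this CL's exact set): each caller passes a spanAllocType,
and one switch decides which statistic the span's bytes are charged to:

    type spanAllocType uint8

    const (
        spanAllocHeap spanAllocType = iota // heap objects
        spanAllocStack                     // goroutine stacks
        spanAllocWorkBuf                   // GC work buffers
        spanAllocPtrScalarBits             // GC pointer/scalar bitmaps
    )

    type sysStats struct{ heapInuse, stacksInuse, gcMiscInuse uint64 }

    // chargeSpan accounts an allocated span's bytes by purpose, in one place.
    func chargeSpan(stats *sysStats, typ spanAllocType, bytes uint64) {
        switch typ {
        case spanAllocHeap:
            stats.heapInuse += bytes
        case spanAllocStack:
            stats.stacksInuse += bytes
        default: // work bufs, pointer/scalar bits, etc.
            stats.gcMiscInuse += bytes
        }
    }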

Change-Id: Ic0b0898959ba2a776f67122f0e36c9d7d60e3085
Reviewed-on: https://go-review.googlesource.com/c/go/+/246970
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 17:27:14 +00:00
Michael Anthony Knyszek
d677899e90 runtime: flush local_scan directly and more often
Now that local_scan is the last mcache-based statistic that is flushed
by purgecachedstats, and heap_scan and gcController.revise may be
interacted with concurrently, we don't need to flush heap_scan at
arbitrary locations where the heap is locked, and we don't need
purgecachedstats and cachestats anymore. Instead, we can flush
local_scan at the same time we update heap_live in refill, so the two
updates may share the same revise call.

Clean up unused functions, remove code that would cause the heap to get
locked in allocSpan when it didn't need to be (other than to flush
local_scan), and flush local_scan explicitly in a few important places.
Notably we need to flush local_scan whenever we flush the other stats,
but it doesn't need to be donated anywhere, so have releaseAll do the
flushing. Also, we need to flush local_scan before we set heap_scan at
the end of a GC, which was previously handled by cachestats. Just do so
explicitly -- it's not much code and it becomes a lot more clear why we
need to do so.

Change-Id: I35ac081784df7744d515479896a41d530653692d
Reviewed-on: https://go-review.googlesource.com/c/go/+/246968
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 17:26:40 +00:00
Michael Anthony Knyszek
cca3d1e553 runtime: don't flush local_tinyallocs
This change makes local_tinyallocs work like the rest of the malloc
stats and doesn't flush local_tinyallocs, instead making that the
source-of-truth.

Change-Id: I3e6cb5f1b3d086e432ce7d456895511a48e3617a
Reviewed-on: https://go-review.googlesource.com/c/go/+/246967
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 17:26:30 +00:00
Michael Anthony Knyszek
e63716bc76 runtime: make nlargealloc and largealloc mcache fields
This change makes nlargealloc and largealloc into mcache fields just
like nlargefree and largefree. These local fields become the new
source-of-truth. This change also moves the accounting for these fields
out of allocSpan (which is an inappropriate place for it -- this
accounting generally happens much closer to the point of allocation) and
into largeAlloc. This move is partially possible now that we can call
gcController.revise at that point.

Furthermore, this change moves largeAlloc into mcache.go and makes it a
method of mcache. While there's a little bit of a mismatch here because
largeAlloc barely interacts with the mcache, it helps solidify the
mcache as the first allocation layer and provides a clear place to
aggregate and manage statistics.

Change-Id: I37b5e648710733bb4c04430b71e96700e438587a
Reviewed-on: https://go-review.googlesource.com/c/go/+/246965
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 17:26:16 +00:00
Michael Anthony Knyszek
42019613df runtime: make distributed/local malloc stats the source-of-truth
This change makes it so that various local malloc stats (excluding
heap_scan and local_tinyallocs) are no longer written first to mheap
fields but are instead accessed directly from each mcache.

This change is part of a move toward having stats be distributed, and
cleaning up some old code related to the stats.

Note that because there's no central source-of-truth, when an mcache
dies, it must donate its stats to another mcache. It's always safe to
donate to the mcache for the 0th P, so do that.

Change-Id: I2556093dbc27357cb9621c9b97671f3c00aa1173
Reviewed-on: https://go-review.googlesource.com/c/go/+/246964
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 17:26:08 +00:00
Michael Anthony Knyszek
8cc280aa72 runtime: define and enforce synchronization on heap_scan
Currently heap_scan is mostly protected by the heap lock, but
gcControllerState.revise sometimes accesses it without a lock. In an
effort to make gcControllerState.revise callable from more contexts (and
have its synchronization guarantees actually respected), make heap_scan
atomically read from and written to, unless the world is stopped.

Note that we don't update gcControllerState.revise's erroneous doc
comment here because this change isn't about revise's guarantees, just
about heap_scan. The comment is updated in a later change.

Change-Id: Iddbbeb954767c704c2bd1d221f36e6c4fc9948a6
Reviewed-on: https://go-review.googlesource.com/c/go/+/246960
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 17:25:40 +00:00
lihaowei
7fbd8c75c6 all: fix spelling mistakes
Change-Id: I7d512281d8442d306594b57b5deaecd132b5ea9e
GitHub-Last-Rev: 251e1d6857
GitHub-Pull-Request: golang/go#40793
Reviewed-on: https://go-review.googlesource.com/c/go/+/248441
Reviewed-by: Dave Cheney <dave@cheney.net>
2020-08-18 03:28:52 +00:00
Michael Anthony Knyszek
e6d0bd2b89 runtime: clean up old mcentral code
This change deletes the old mcentral implementation from the code base
and the newMCentralImpl feature flag along with it.

Updates #37487.

Change-Id: Ibca8f722665f0865051f649ffe699cbdbfdcfcf2
Reviewed-on: https://go-review.googlesource.com/c/go/+/221184
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-08-17 20:06:49 +00:00
Michael Anthony Knyszek
260dff3ca3 runtime: clean up old markrootSpans
This change removes the old markrootSpans implementation and deletes the
feature flag.

Updates #37487.

Change-Id: Idb5a2559abcc3be5a7da6f2ccce1a86e1d7634e3
Reviewed-on: https://go-review.googlesource.com/c/go/+/221183
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2020-08-17 20:06:41 +00:00
Austin Clements
d19fedd180 runtime: move checkmarks to a separate bitmap
Currently, the GC stores the object marks for checkmarks mode in the
heap bitmap using a rather complex encoding: for one word objects, the
checkmark is stored in the pointer/scalar bit since one word objects
must be pointers; for larger objects, the checkmark is stored in what
would be the scan/dead bit for the second word of the object. This
encoding made more sense when the runtime used the first scan/dead bit
as the regular mark bit, but we moved away from that long ago.

This encoding and overloading of the heap bitmap bits causes a great
deal of complexity in many parts of the allocator and garbage
collector and leads to some subtle bugs like #15903.

This CL moves the checkmarks mark bits into their own per-arena bitmap
and reclaims the second scan/dead bit as a regular scan/dead bit.

I tested this by enabling doubleCheck mode in heapBitsSetType and
running in both regular and GODEBUG=gccheckmark=1 mode.

Fixes #15903.

No performance degradation. (Very slight improvement on a few
benchmarks, but it's probably just noise.)

name                                old time/op            new time/op            delta
BiogoIgor                                      16.6s ± 1%             16.4s ± 1%  -0.94%  (p=0.000 n=25+24)
BiogoKrishna                                   19.2s ± 3%             19.2s ± 3%    ~     (p=0.638 n=23+25)
BleveIndexBatch100                             6.12s ± 5%             6.17s ± 4%    ~     (p=0.170 n=25+25)
CompileTemplate                                206ms ± 1%             205ms ± 1%  -0.43%  (p=0.005 n=24+24)
CompileUnicode                                82.2ms ± 2%            81.5ms ± 2%  -0.95%  (p=0.001 n=22+22)
CompileGoTypes                                 755ms ± 3%             754ms ± 4%    ~     (p=0.715 n=25+25)
CompileCompiler                                3.73s ± 1%             3.73s ± 1%    ~     (p=0.445 n=25+24)
CompileSSA                                     8.67s ± 1%             8.66s ± 1%    ~     (p=0.836 n=24+22)
CompileFlate                                   134ms ± 2%             133ms ± 1%  -0.66%  (p=0.001 n=24+23)
CompileGoParser                                164ms ± 1%             163ms ± 1%  -0.85%  (p=0.000 n=24+24)
CompileReflect                                 466ms ± 5%             466ms ± 3%    ~     (p=0.863 n=25+25)
CompileTar                                     182ms ± 1%             182ms ± 1%  -0.31%  (p=0.048 n=24+24)
CompileXML                                     249ms ± 1%             248ms ± 1%  -0.32%  (p=0.031 n=21+25)
CompileStdCmd                                  10.3s ± 1%             10.3s ± 1%    ~     (p=0.459 n=23+23)
FoglemanFauxGLRenderRotateBoat                 8.66s ± 1%             8.62s ± 1%  -0.47%  (p=0.000 n=23+24)
FoglemanPathTraceRenderGopherIter1             20.3s ± 3%             20.2s ± 2%    ~     (p=0.893 n=25+25)
GopherLuaKNucleotide                           29.7s ± 1%             29.8s ± 2%    ~     (p=0.421 n=24+25)
MarkdownRenderXHTML                            246ms ± 1%             247ms ± 1%    ~     (p=0.558 n=25+24)
Tile38WithinCircle100kmRequest                 779µs ± 4%             779µs ± 3%    ~     (p=0.954 n=25+25)
Tile38IntersectsCircle100kmRequest            1.02ms ± 3%            1.01ms ± 4%    ~     (p=0.658 n=25+25)
Tile38KNearestLimit100Request                  984µs ± 4%             986µs ± 4%    ~     (p=0.627 n=24+25)
[Geo mean]                                     552ms                  551ms       -0.19%

https://perf.golang.org/search?q=upload:20200723.6

Change-Id: Ic703f26a83fb034941dc6f4788fc997d56890dec
Reviewed-on: https://go-review.googlesource.com/c/go/+/244539
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
2020-08-17 14:31:20 +00:00
Cherry Zhang
c53b2bdb35 runtime: add a barrier after a new span is allocated
When copying a stack, we
1. allocate a new stack,
2. adjust pointers pointing to the old stack to pointing to the
   new stack.

If the GC is running on another thread concurrently, on a machine
with weak memory model, the GC could observe the adjusted pointer
(e.g. through gp._defer which could be a special heap-to-stack
pointer), but not observe the publish of the new stack span. In
this case, the GC will see the adjusted pointer pointing to an
unallocated span and throw. Fix this by adding a publication
barrier between allocating the span and adjusting pointers.

One testcase for this is TestDeferHeapAndStack in long mode. It
fails reliably on linux-mips64le-mengzhuo builder without the fix,
and passes reliably after the fix.

Fixes #35541.
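
The general pattern, outside the runtime, looks like the following (using a
plain atomic store as a stand-in for the runtime's publication barrier):
fully initialize the new object first, then publish the pointer with an
operation that orders the initialization before the publish:

    package sketch

    import "sync/atomic"

    type span struct{ ready bool }

    var current atomic.Pointer[span]

    // allocAndPublish initializes a new span completely and only then makes
    // it visible; a reader that loads the pointer is guaranteed to also
    // observe the initialization, even on weakly ordered machines.
    func allocAndPublish() *span {
        s := new(span)
        s.ready = true   // initialize before publishing
        current.Store(s) // publish; acts like a publication barrier here
        return s
    }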

Change-Id: I82b09b824fdf14be7336a9ee853f56dec1b13b90
Reviewed-on: https://go-review.googlesource.com/c/go/+/234478
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2020-05-21 14:31:36 +00:00
Michael Anthony Knyszek
796786cd0c runtime: make maxOffAddr reflect the actual address space upper bound
Currently maxOffAddr is defined in terms of the whole 64-bit address
space, assuming that it's all supported, by using ^uintptr(0) as the
maximal address in the offset space. In reality, the maximal address in
the offset space is (1<<heapAddrBits)-1 because we don't have more than
that actually available to us on a given platform.

On most platforms this is fine, because arenaBaseOffset is just
connecting two segments of address space, but on AIX we use it as an
actual offset for the starting address of the available address space,
which is limited. This means using ^uintptr(0) as the maximal address in
the offset address space causes wrap-around, especially when we just
want to represent a range approximately like [addr, infinity), which
today we do by using maxOffAddr.

To fix this, we define maxOffAddr more appropriately, in terms of
(1<<heapAddrBits)-1.

This change also redefines arenaBaseOffset to not be the negation of the
virtual address corresponding to address zero in the virtual address
space, but instead directly as the virtual address corresponding to
zero. This matches the existing documentation more closely and makes the
logic around arenaBaseOffset decidedly simpler, especially when trying
to reason about its use on AIX.

Fixes #38966.

Change-Id: I1336e5036a39de846f64cc2d253e8536dee57611
Reviewed-on: https://go-review.googlesource.com/c/go/+/233497
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-05-14 16:20:19 +00:00
Michael Anthony Knyszek
55ec5182d7 runtime: remove scavAddr in favor of address ranges
This change removes the concept of s.scavAddr in favor of explicitly
reserving and unreserving address ranges. s.scavAddr has several
problems with raciness that can cause the scavenger to miss updates, or
move it back unnecessarily, forcing future scavenge calls to iterate
over already-searched address space.

This change achieves this by replacing scavAddr with a second addrRanges
which is cloned from s.inUse at the end of each sweep phase. Ranges from
this second addrRanges are then reserved by scavengers (with the
reservation size proportional to the heap size) who are then able to
safely iterate over those ranges without worry of another scavenger
coming in.

Fixes #35788.

Change-Id: Ief01ae170384174875118742f6c26b2a41cbb66d
Reviewed-on: https://go-review.googlesource.com/c/go/+/208378
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2020-05-08 16:24:40 +00:00
Michael Anthony Knyszek
14ae846f54 runtime: avoid overflow in (*mheap).grow
Currently when checking if we can grow the heap into the current arena,
we do an addition which may overflow. This is particularly likely on
32-bit systems.

Avoid this situation by explicitly checking for overflow, and adding in
some comments about when overflow is possible, when it isn't, and why.

For #35954.
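
The check itself is the usual wrap-around idiom (a sketch, not the exact
code in (*mheap).grow):

    // fitsBelow reports whether [base, base+size) lies below limit, guarding
    // against base+size wrapping around on 32-bit systems.
    func fitsBelow(base, size, limit uintptr) bool {
        if base+size < base {
            return false // the addition overflowed
        }
        return base+size <= limit
    }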

Change-Id: I2d4ecbb1ccbd43da55979cc721f0cd8d1757add2
Reviewed-on: https://go-review.googlesource.com/c/go/+/231337
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-05-07 21:39:50 +00:00
Michael Anthony Knyszek
a13691966a runtime: add new mcentral implementation
Currently mcentral is implemented as a couple of linked lists of spans
protected by a lock. Unfortunately this design leads to significant lock
contention.

The span ownership model is also confusing and complicated. In-use spans
jump between being owned by multiple sources, generally some combination
of a gcSweepBuf, a concurrent sweeper, an mcentral or an mcache.

So first to address contention, this change replaces those linked lists
with gcSweepBufs which have an atomic fast path. Then, we change up the
ownership model: a span may be simultaneously owned only by an mcentral
and the page reclaimer. Otherwise, an mcentral (which now consists of
sweep bufs), a sweeper, or an mcache is the sole owner of a span at
any given time. This dramatically simplifies reasoning about span
ownership in the runtime.

As a result of this new ownership model, sweeping is now driven by
walking over the mcentrals rather than having its own global list of
spans. Because we no longer have a global list and we traditionally
haven't used the mcentrals for large object spans, we no longer have
anywhere to put large objects. So, this change also makes it so that we
keep large object spans in the appropriate mcentral lists.

In terms of the static lock ranking, we add the spanSet spine locks in
pretty much the same place as the mcentral locks, since they have the
potential to be manipulated both on the allocation and sweep paths, like
the mcentral locks.

This new implementation is turned on by default via a feature flag
called go115NewMCentralImpl.

Benchmark results for 1 KiB allocation throughput (5 runs each):

name \ MiB/s  go113       go114       gotip       gotip+this-patch
AllocKiB-1    1.71k ± 1%  1.68k ± 1%  1.59k ± 2%      1.71k ± 1%
AllocKiB-2    2.46k ± 1%  2.51k ± 1%  2.54k ± 1%      2.93k ± 1%
AllocKiB-4    4.27k ± 1%  4.41k ± 2%  4.33k ± 1%      5.01k ± 2%
AllocKiB-8    4.38k ± 3%  5.24k ± 1%  5.46k ± 1%      8.23k ± 1%
AllocKiB-12   4.38k ± 3%  4.49k ± 1%  5.10k ± 1%     10.04k ± 0%
AllocKiB-16   4.31k ± 1%  4.14k ± 3%  4.22k ± 0%     10.42k ± 0%
AllocKiB-20   4.26k ± 1%  3.98k ± 1%  4.09k ± 1%     10.46k ± 3%
AllocKiB-24   4.20k ± 1%  3.97k ± 1%  4.06k ± 1%     10.74k ± 1%
AllocKiB-28   4.15k ± 0%  4.00k ± 0%  4.20k ± 0%     10.76k ± 1%

Fixes #37487.

Change-Id: I92d47355acacf9af2c41bf080c08a8c1638ba210
Reviewed-on: https://go-review.googlesource.com/c/go/+/221182
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2020-04-27 18:19:26 +00:00
Michael Anthony Knyszek
eacdf76b93 runtime: add bitmap-based markrootSpans implementation
Currently markrootSpans, the scanning routine which scans span specials
(particularly finalizers) as roots, uses sweepSpans to shard work and
find spans to mark.

However, as part of a future CL to change span ownership and how
mcentral works, we want to avoid having markrootSpans use the sweep bufs
to find specials, so in this change we introduce a new mechanism.

Much like for the page reclaimer, we set up a per-page bitmap where the
first page for a span is marked if the span contains any specials, and
unmarked if it has no specials. This bitmap is updated by addspecial,
removespecial, and during sweeping.

markrootSpans then shards this bitmap into mark work and markers iterate
over the bitmap looking for spans with specials to mark. Unlike the page
reclaimer, we don't need to use the pageInUse bits because having a
special implies that a span is in-use.

While in terms of computational complexity this design is technically
worse, because it needs to iterate over the mapped heap, in practice
this iteration is very fast (we can skip over large swathes of the heap
very quickly) and we only look at spans that have any specials at all,
rather than having to touch each span.

This new implementation of markrootSpans is behind a feature flag called
go115NewMarkrootSpans.

Updates #37487.
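
A sketch of the bitmap manipulation involved (sizes and names are
illustrative): one bit per page, set and cleared as specials come and go,
and markers skip whole empty words while scanning:

    package sketch

    import "math/bits"

    // specialsBits holds one "span has specials" bit per page of a chunk
    // (256 pages per chunk in this sketch).
    type specialsBits [4]uint64

    func (b *specialsBits) set(page uint)   { b[page/64] |= 1 << (page % 64) }
    func (b *specialsBits) clear(page uint) { b[page/64] &^= 1 << (page % 64) }

    // forEach visits every marked page, skipping empty words quickly.
    func (b *specialsBits) forEach(visit func(page uint)) {
        for i, w := range b {
            for w != 0 {
                j := uint(bits.TrailingZeros64(w))
                visit(uint(i)*64 + j)
                w &^= 1 << j
            }
        }
    }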

Change-Id: I8ea07b6c11059f6d412fe419e0ab512d989377b8
Reviewed-on: https://go-review.googlesource.com/c/go/+/221178
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2020-04-21 22:50:51 +00:00
Michael Pratt
300ff5d8ac runtime: allow proflock and mheap.speciallock above globalAlloc.mutex
During schedinit, these may occur in:

mProf_Malloc
  stkbucket
    newBucket
      persistentalloc
        persistentalloc1

mProf_Malloc
  setprofilebucket
    fixalloc.alloc
      persistentalloc
        persistentalloc1

These seem to be legitimate lock orderings.

Additionally, mheap.speciallock had a defined rank, but it was never
actually used. That is fixed now.

Updates #38474

Change-Id: I0f6e981852eac66dafb72159f426476509620a65
Reviewed-on: https://go-review.googlesource.com/c/go/+/228786
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dan Scales <danscales@google.com>
2020-04-21 20:22:06 +00:00
Dan Scales
0a820007e7 runtime: static lock ranking for the runtime (enabled by GOEXPERIMENT)
I took some of the infrastructure from Austin's lock logging CR
https://go-review.googlesource.com/c/go/+/192704 (with deadlock
detection from the logs), and developed a setup to give static lock
ranking for runtime locks.

Static lock ranking establishes a documented total ordering among locks,
and then reports an error if the total order is violated. This can
happen if a deadlock happens (by acquiring a sequence of locks in
different orders), or if just one side of a possible deadlock happens.
Lock ordering deadlocks cannot happen as long as the lock ordering is
followed.

Along the way, I found a deadlock involving the new timer code, which Ian fixed
via https://go-review.googlesource.com/c/go/+/207348, as well as two other
potential deadlocks.

See the constants at the top of runtime/lockrank.go to show the static
lock ranking that I ended up with, along with some comments. This is
great documentation of the current intended lock ordering when acquiring
multiple locks in the runtime.

I also added an array lockPartialOrder[] which shows and enforces the
current partial ordering among locks (which is embedded within the total
ordering). This is more specific about the dependencies among locks.

I don't try to check the ranking within a lock class with multiple locks
that can be acquired at the same time (i.e. check the ranking when
multiple hchan locks are acquired).

Currently, I am doing a lockInit() call to set the lock rank of most
locks. Any lock that is not otherwise initialized is assumed to be a
leaf lock (a very high rank lock), so that eliminates the need to do
anything for a bunch of locks (including all architecture-dependent
locks). For two locks, root.lock and notifyList.lock (only in the
runtime/sema.go file), it is not as easy to do lock initialization, so
instead, I am passing the lock rank with the lock calls.

For Windows compilation, I needed to increase the StackGuard size from
896 to 928 because of the new lock-rank checking functions.

Checking of the static lock ranking is enabled by setting
GOEXPERIMENT=staticlockranking before doing a run.

To make sure that the static lock ranking code has no overhead in memory
or CPU when not enabled by GOEXPERIMENT, I changed 'go build/install' so
that it defines a build tag (with the same name) whenever any experiment
has been baked into the toolchain (by checking Expstring()). This allows
me to avoid increasing the size of the 'mutex' type when static lock
ranking is not enabled.

Fixes #38029

Change-Id: I154217ff307c47051f8dae9c2a03b53081acd83a
Reviewed-on: https://go-review.googlesource.com/c/go/+/207619
Reviewed-by: Dan Scales <danscales@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dan Scales <danscales@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-04-07 21:51:03 +00:00
Ian Lance Taylor
3093959ee1 runtime: remove mcache field from m
Having an mcache field in both m and p is confusing, so remove it from m.
Always use the mcache field from p. Use the new variable mcache0 during bootstrap.

Change-Id: If2cba9f8bb131d911d512b61fd883a86cf62cc98
Reviewed-on: https://go-review.googlesource.com/c/go/+/205239
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2020-02-24 16:39:52 +00:00
Michael Anthony Knyszek
8ac98e7b3f runtime: add scavtrace debug flag and remove scavenge info from gctrace
Currently, scavenging information is printed if the gctrace debug
variable is >0. Scavenging information is also printed naively, for
every page scavenged, resulting in a lot of noise when the typical
expectation for GC trace is one line per GC.

This change adds a new GODEBUG flag called scavtrace which prints
scavenge information roughly once per GC cycle and removes any scavenge
information from gctrace. The exception is debug.FreeOSMemory, which may
force an additional line to be printed.

Fixes #32952.

Change-Id: I4177dcb85fe3f9653fd74297ea93c97c389c1811
Reviewed-on: https://go-review.googlesource.com/c/go/+/212640
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-01-09 18:00:06 +00:00
Dan Scales
f266cce676 runtime: avoid potential deadlock when tracing memory code
In reclaimChunk, the runtime is calling traceGCSweepDone() while holding the mheap
lock. traceGCSweepDone() can call traceEvent() and traceFlush(). These functions
not only can get various trace locks, but they may also do memory allocations
(runtime.newobject) that may end up getting the mheap lock. So, there may be
either a self-deadlock or a possible deadlock between multiple threads.

It seems better to release the mheap lock before calling traceGCSweepDone(). It is
fine to release the lock, since the operations to get the index of the chunk of
work to do are atomic. We already release the lock to call sweep, so there is no
new behavior for any of the callers of reclaimChunk.

With this change, mheap is a leaf lock (no other lock is ever acquired while it
is held).

Testing: besides normal all.bash, also ran all.bash with --long enabled, since
it does longer tests of runtime/trace.

Change-Id: I4f8cb66c24bb8d424f24d6c2305b4b8387409248
Reviewed-on: https://go-review.googlesource.com/c/go/+/207846
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2020-01-07 00:05:43 +00:00
Michael Anthony Knyszek
4e3d58009a runtime: reset scavenge address in scavengeAll
Currently scavengeAll (which is called by debug.FreeOSMemory) doesn't
reset the scavenge address before scavenging, meaning it could miss
large portions of the heap. Fix this by resetting the address before
scavenging, which will ensure it is able to walk over the entire heap.

Fixes #35858.

Change-Id: I4a7408050b8e134318ff94428f98cb96a1795aa9
Reviewed-on: https://go-review.googlesource.com/c/go/+/208960
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-11-27 15:06:55 +00:00
Ville Skyttä
440f7d6404 all: fix a bunch of misspellings
Change-Id: I5b909df0fd048cd66c5a27fca1b06466d3bcaac7
GitHub-Last-Rev: 778c5d2131
GitHub-Pull-Request: golang/go#35624
Reviewed-on: https://go-review.googlesource.com/c/go/+/207421
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2019-11-15 21:04:43 +00:00
Michael Anthony Knyszek
a2cd2bd55d runtime: add per-p page allocation cache
This change adds a per-p free page cache which the page allocator may
allocate out of without a lock. The change also introduces a completely
lockless page allocator fast path.

Although the cache contains at most 64 pages (and usually less), the
vast majority (85%+) of page allocations are exactly 1 page in size.

Updates #35112.
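
In rough shape (a sketch; the real pageCache also tracks scavenged bits),
the cache is a base address plus a 64-bit bitmap of free pages, so a 1-page
allocation is a TrailingZeros and a bit clear, with no lock:

    package sketch

    import "math/bits"

    const pageSize = 8192

    // pageCache models a per-P cache of up to 64 free pages.
    type pageCache struct {
        base  uintptr // base address of the 64-page region
        cache uint64  // bit i set => page i is free
    }

    // allocPage hands out one free page, or reports that the cache is empty
    // and the caller must fall back to the locked slow path.
    func (c *pageCache) allocPage() (uintptr, bool) {
        if c.cache == 0 {
            return 0, false
        }
        i := bits.TrailingZeros64(c.cache)
        c.cache &^= 1 << uint(i)
        return c.base + uintptr(i)*pageSize, true
    }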

Change-Id: I170bf0a9375873e7e3230845eb1df7e5cf741b78
Reviewed-on: https://go-review.googlesource.com/c/go/+/195701
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2019-11-08 18:00:54 +00:00
Michael Anthony Knyszek
4517c02f28 runtime: add per-p mspan cache
This change adds a per-p mspan object cache similar to the sudog cache.
Unfortunately this cache can't quite operate like the sudog cache, since
it is used in contexts where write barriers are disallowed (i.e.
allocation codepaths), so rather than managing an array and a slice,
it's just an array and a length. A little bit more unsafe, but avoids
any write barriers.

The purpose of this change is to reduce the number of operations which
require the heap lock in allocation, paving the way for a lockless fast
path.

Updates #35112.
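
In shape (illustrative; mspan is replaced by a stub type here), the cache is
a fixed array plus a count used as a small stack, with no slice header to
manipulate:

    type mspanStub struct{} // stand-in for mspan, which is manually managed

    type spanCache struct {
        buf [128]*mspanStub
        len int
    }

    // push stores a span if there's room; otherwise the caller falls back to
    // the slow path.
    func (c *spanCache) push(s *mspanStub) bool {
        if c.len == len(c.buf) {
            return false
        }
        c.buf[c.len] = s
        c.len++
        return true
    }

    // pop returns a cached span, or nil if the cache is empty.
    func (c *spanCache) pop() *mspanStub {
        if c.len == 0 {
            return nil
        }
        c.len--
        s := c.buf[c.len]
        c.buf[c.len] = nil
        return s
    }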

Change-Id: I32cfdcd8528fb7be985640e4f3a13cb98ffb7865
Reviewed-on: https://go-review.googlesource.com/c/go/+/196642
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-11-08 17:01:32 +00:00
Michael Anthony Knyszek
a762221bea runtime: rearrange mheap_.alloc* into allocSpan
This change combines the functionality of allocSpanLocked, allocManual,
and alloc_m into a new method called allocSpan. While these methods'
abstraction boundaries are OK when the heap lock is held throughout,
they start to break down when we want finer-grained locking in the page
allocator.

allocSpan does just that, and only locks the heap when it absolutely has
to. Piggy-backing off of work in previous CLs to make more of span
initialization lockless, this change makes span initialization entirely
lockless as part of the reorganization.

Ultimately this change will enable us to add a lockless fast path to
allocSpan.

Updates #35112.

Change-Id: I99875939d75fb4e958a67ac99e4a7cda44f06864
Reviewed-on: https://go-review.googlesource.com/c/go/+/196641
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-11-08 17:01:18 +00:00
Michael Anthony Knyszek
dac936a4ab runtime: make more page sweeper operations atomic
This change makes it so that allocation and free related page sweeper
metadata operations (e.g. pageInUse and pagesInUse) are atomic rather
than protected by the heap lock. This will help in reducing the length
of the critical path with the heap lock held in future changes.

Updates #35112.

Change-Id: Ie82bff024204dd17c4c671af63350a7a41add354
Reviewed-on: https://go-review.googlesource.com/c/go/+/196640
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-11-08 17:00:57 +00:00
Michael Anthony Knyszek
7f574e476a runtime: remove unnecessary large parameter to mheap_.alloc
mheap_.alloc currently accepts both a spanClass and a "large" parameter
indicating whether the allocation is large. These are redundant, since
spanClass.sizeclass() == 0 is an equivalent way to determine this and is
already used in mheap_.alloc. There are no places in the runtime where
the size class could be non-zero and large == true.

Updates #35112.

Change-Id: Ie66facf8f0faca6f4cd3d20a8ac4bc259e11823d
Reviewed-on: https://go-review.googlesource.com/c/go/+/196639
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-11-08 16:44:33 +00:00
Michael Anthony Knyszek
ffb5646fe0 runtime: define maximum supported physical page and huge page sizes
This change defines a maximum supported physical and huge page size in
the runtime based on the new page allocator's implementation, and uses
them where appropriate.

Furthermore, if the system exceeds the maximum supported huge page
size, we simply ignore it silently.

It also fixes a huge-page-related test which is only triggered by a
condition which is definitely wrong.

Finally, it adds a few TODOs related to code clean-up and supporting
larger huge page sizes.

Updates #35112.
Fixes #35431.

Change-Id: Ie4348afb6bf047cce2c1433576d1514720d8230f
Reviewed-on: https://go-review.googlesource.com/c/go/+/205937
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-11-08 16:35:48 +00:00
Michael Anthony Knyszek
ae4534e659 runtime: ensure heap memstats are updated atomically
For the most part, heap memstats are already updated atomically when
passed down to OS-level memory functions (e.g. sysMap). Elsewhere,
however, they're updated with the heap lock.

In order to facilitate holding the heap lock for less time during
allocation paths, this change more consistently makes the update of
these statistics atomic by calling mSysStat{Inc,Dec} appropriately
instead of simply adding or subtracting. It also ensures these values
are loaded atomically.

Furthermore, an undocumented but safe update condition for these
memstats is during STW, at which point using atomics is unnecessary.
This change also documents this condition in mstats.go.

Updates #35112.

Change-Id: I87d0b6c27b98c88099acd2563ea23f8da1239b66
Reviewed-on: https://go-review.googlesource.com/c/go/+/196638
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-11-08 16:21:04 +00:00
Michael Anthony Knyszek
814c5058bb runtime: remove useless heap_objects accounting
This change removes useless additional heap_objects accounting for large
objects. heap_objects is computed from scratch at ReadMemStats time
(which stops the world) by using nlargealloc and nlargefree, so mutating
heap_objects turns out to be pointless.

As a result, the "large" parameter on "mheap_.freeSpan" is no longer
necessary and so this change cleans that up too.

Change-Id: I7d6b486d9b57c018e3db46221d81b55fe4c1b021
Reviewed-on: https://go-review.googlesource.com/c/go/+/196637
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-11-08 16:20:27 +00:00
Michael Anthony Knyszek
4208dbef16 runtime: make allocNeedsZero lock-free
In preparation for a lockless fast path in the page allocator, this
change makes it so that checking if an allocation needs to be zeroed may
be done atomically.

Unfortunately, this means there is a CAS-loop to ensure monotonicity of
the zeroedBase value in heapArena. This CAS-loop exits either when it
succeeds or when an allocator acquiring memory further on in the arena
wins the race. The
CAS-loop should have a relatively small amount of contention because of
this monotonicity, though it would be ideal if we could just have
CAS-ers with the greatest value always win. The CAS-loop is unnecessary
in the steady-state, but should bring some start-up performance gains as
it's likely cheaper than the additional zeroing required, especially for
large allocations.

For very large allocations that span arenas, the CAS-loop should be
completely uncontended for most of the arenas it touches, it may only
encounter contention on the first and last arena.

Updates #35112.
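
The core of the CAS-loop is the usual monotonic-maximum idiom (a sketch,
not the runtime's exact code):

    package sketch

    import "sync/atomic"

    // advance moves *zeroedBase forward to at least want, never backwards.
    // If another allocator has already advanced it further, we simply stop.
    func advance(zeroedBase *uintptr, want uintptr) {
        for {
            old := atomic.LoadUintptr(zeroedBase)
            if old >= want {
                return // an allocator further on in the arena already won
            }
            if atomic.CompareAndSwapUintptr(zeroedBase, old, want) {
                return // we won
            }
            // CAS failed: another allocator raced with us; reload and retry.
        }
    }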

Change-Id: If3d19198b33f1b1387b71e1ce5902d39a5c0f98e
Reviewed-on: https://go-review.googlesource.com/c/go/+/203859
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-11-08 16:20:17 +00:00
Michael Anthony Knyszek
33dfd3529b runtime: remove old page allocator
This change removes the old page allocator from the runtime.

Updates #35112.

Change-Id: Ib20e1c030f869b6318cd6f4288a9befdbae1b771
Reviewed-on: https://go-review.googlesource.com/c/go/+/195700
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-11-08 00:07:43 +00:00
Michael Anthony Knyszek
a120cc8b36 runtime: compute whether a span needs zeroing in the new page allocator
This change adds the allocNeedZero method to mheap which uses the new
heapArena field zeroedBase to determine whether a new allocation needs
zeroing. The purpose of this work is to avoid zeroing memory that is
fresh from the OS in the context of the new allocator, where we no
longer have the concept of a free span to track this information.

The new field in heapArena, zeroedBase, is small, which runs counter to
the advice in the doc comment for heapArena. Since heapArenas are
already not a multiple of the system page size, this advice seems stale,
and we're OK with using an extra physical page for a heapArena. So, this
change also deletes the comment with that advice.

Updates #35112.

Change-Id: I688cd9fd3c57a98a6d43c45cf699543ce16697e2
Reviewed-on: https://go-review.googlesource.com/c/go/+/203858
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-11-07 20:52:05 +00:00
Michael Anthony Knyszek
689f6f77f0 runtime: integrate new page allocator into runtime
This change integrates all the bits and pieces of the new page allocator
into the runtime, behind a global constant.

Updates #35112.

Change-Id: I6696bde7bab098a498ab37ed2a2caad2a05d30ec
Reviewed-on: https://go-review.googlesource.com/c/go/+/201764
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-11-07 20:14:02 +00:00
Michael Anthony Knyszek
21445b091e runtime: make the scavenger self-paced
Currently the runtime background scavenger is paced externally,
controlled by a collection of variables which together describe a line
that we'd like to stay under.

However, the line to stay under is computed as a function of the number
of free and unscavenged huge pages in the heap at the end of the last
GC. Aside from this number being inaccurate (which is still acceptable),
the scavenging system also makes an order-of-magnitude assumption as to
how expensive scavenging a single page actually is.

This change simplifies the scavenger in preparation for making it
operate on bitmaps. It makes it so that the scavenger paces itself, by
measuring the amount of time it takes to scavenge a single page. The
scavenging methods on mheap already avoid breaking huge pages, so if we
scavenge a real huge page, then we'll have paced correctly, otherwise
we'll sleep for longer to avoid using more than scavengePercent wall
clock time.

Unfortunately, all this involves measuring time, which is quite tricky.
Currently we don't directly account for long process sleeps or OS-level
context switches (which is quite difficult to do in general), but we do
account for Go scheduler overhead and variations in it by maintaining an
EWMA of the ratio of time spent scavenging to the time spent sleeping.
This ratio, as well as the sleep time, are bounded in order to deal with
the aforementioned OS-related anomalies.

Updates #35112.
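
The core pacing computation is small (a sketch; the real pacer additionally
applies the bounded EWMA of the ratio of time spent scavenging to time
spent sleeping described above, to absorb scheduler overhead and OS
anomalies):

    package sketch

    import "time"

    const scavengePercent = 1 // spend ~1% of wall-clock time scavenging

    // sleepFor returns how long to sleep after a scavenge step that took
    // critTime so that scavenging stays near scavengePercent of wall-clock
    // time overall.
    func sleepFor(critTime time.Duration) time.Duration {
        return critTime * (100 - scavengePercent) / scavengePercent
    }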

Change-Id: Ieca8b088fdfca2bebb06bcde25ef14a42fd5216b
Reviewed-on: https://go-review.googlesource.com/c/go/+/201763
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-11-07 20:12:18 +00:00
Michael Anthony Knyszek
383b447e0d runtime: clean up power-of-two rounding code with align functions
This change renames the "round" function to the more appropriately named
"alignUp" which rounds an integer up to the next multiple of a power of
two.

This change also adds the alignDown function, which is almost like
alignUp but rounds down to the previous multiple of a power of two.

With these two functions, we also go and replace manual rounding code
with it where we can.
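
These are the standard power-of-two rounding idioms the two helpers are
based on (a must be a power of two):

    // alignUp rounds n up to the next multiple of a.
    func alignUp(n, a uintptr) uintptr {
        return (n + a - 1) &^ (a - 1)
    }

    // alignDown rounds n down to the previous multiple of a.
    func alignDown(n, a uintptr) uintptr {
        return n &^ (a - 1)
    }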

Change-Id: Ie1487366280484dcb2662972b01b4f7135f72fec
Reviewed-on: https://go-review.googlesource.com/c/go/+/190618
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2019-11-04 23:41:34 +00:00