Commit graph

221 commits

Author SHA1 Message Date
Michael Anthony Knyszek
b1aadd034c runtime: emit STW events for all pauses, not just those for the GC
Currently STW events are only emitted for GC STWs. There's little reason
why the trace can't contain events for every STW: they're rare so don't
take up much space in the trace, yet being able to see when the world
was stopped is often critical to debugging certain latency issues,
especially when they stem from user-level APIs.

This change adds new "kinds" to the EvGCSTWStart event, renames the
GCSTW events to just "STW," and lets the parser deal with unknown STW
kinds for future backwards compatibility.

But, this change must break trace compatibility, so it bumps the trace
version to Go 1.21.

This change also includes a small cleanup in the trace command, which
previously checked for STW events when deciding whether user tasks
overlapped with a GC. Looking at the source, I don't see a way for STW
events to ever enter the stream that that code looks at, so that
condition has been deleted.

Change-Id: I9a5dc144092c53e92eb6950e9a5504a790ac00cf
Reviewed-on: https://go-review.googlesource.com/c/go/+/494495
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2023-05-19 17:06:45 +00:00
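
As an illustration of the commit above (not part of the CL itself), here is a minimal sketch of how one of these user-level pauses now shows up in an execution trace: runtime.ReadMemStats briefly stops the world, and with this change that pause is recorded as an STW event with its own kind instead of going unrecorded.

    package main

    import (
        "log"
        "os"
        "runtime"
        "runtime/trace"
    )

    func main() {
        f, err := os.Create("trace.out")
        if err != nil {
            log.Fatal(err)
        }
        defer f.Close()
        if err := trace.Start(f); err != nil {
            log.Fatal(err)
        }
        defer trace.Stop()

        // ReadMemStats stops the world briefly; the resulting pause now
        // appears in the trace as an STW event with a non-GC kind.
        var ms runtime.MemStats
        runtime.ReadMemStats(&ms)
    }

Inspecting trace.out with `go tool trace` should then show the pause.
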
Sven Anderson
251daf46fb runtime: implement Pinner API for object pinning
Some C APIs require the use of structures that contain pointers to
buffers (iovec, io_uring, ...).  The pointer passing rules would
require that these buffers are allocated in C memory and to process
this data with Go libraries it would need to be copied.

In order to provide a zero-copy way to use these C APIs, this CL
implements a Pinner API that allows Go objects to be pinned, which
guarantees that the garbage collector does not move these objects
while they are pinned.  This allows the pointer passing rules to be
relaxed so that pinned pointers can be stored in C-allocated memory or
contained in Go memory that is passed to C functions.

The Pin() method accepts pointers to objects of any type and
unsafe.Pointer.  Slices and arrays can be pinned by calling Pin()
with the pointer to the first element.  Pinning of maps is not
supported.

If the GC collects an unreachable Pinner that still holds pinned
objects, it panics.  If Pin() is called with other non-pointer types,
it panics as well.

Performance considerations: This change has no impact on execution
time on existing code, because checks are only done in code paths
that would otherwise panic.  The memory footprint on existing code is
one pointer per memory span.

Fixes: #46787

Signed-off-by: Sven Anderson <sven@anderson.de>
Change-Id: I110031fe789b92277ae45a9455624687bd1c54f2
Reviewed-on: https://go-review.googlesource.com/c/go/+/367296
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2023-05-19 14:59:14 +00:00
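
A minimal sketch of the Pin/Unpin usage described in the commit above (the actual C call and error handling are omitted):

    package main

    import "runtime"

    func main() {
        buf := make([]byte, 64)

        var p runtime.Pinner
        // Slices and arrays are pinned via a pointer to their first element.
        p.Pin(&buf[0])

        // While pinned, &buf[0] may be stored in C-allocated memory (e.g. an
        // iovec) or passed to C inside other Go memory; the GC will not move
        // the buffer.
        // ... call into the C API here ...

        p.Unpin()
    }
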
Michael Anthony Knyszek
e97bd776f9 runtime: make the memory limit heap goal headroom proportional
Currently if GOGC=off and GOMEMLIMIT is set, then the synchronous
scavenger is likely to work fairly often to maintain the limit, since
the heap goal goes right up to the edge of the memory limit (minus a
fixed 1 MiB of headroom).

If the application's allocation rate is high, and page-level
fragmentation is high, then most allocations will scavenge.

This change mitigates this problem by adding a proportional component
to the constant headroom added to the memory-limit-based heap goal. This
means the runtime will have much more headroom before fragmentation
forces memory to be eagerly scavenged.

The proportional headroom in this case is 3%, or ~30 MiB for a 1 GiB
heap. This technically will increase GC frequency in the GOGC=off case
by a tiny amount, but will likely have a positive impact on both
allocation throughput and latency that outweighs this difference.

I wrote a small program to reproduce this issue and confirmed that the
issue is resolved by this patch:

https://github.com/golang/go/issues/57069#issuecomment-1551746565

This value of 3% is chosen as it seems to be an inflection point in this
particular small program. 2% still resulted in quite a bit of eager
scavenging work. I confirmed this results in a GC frequency increase of
about 3%.

This choice is still somewhat arbitrary because the program is
arbitrary, so perhaps worth revisiting in the future. Still, it should
help a good number of programs.

Fixes #57069.

Change-Id: Icb9829db0dfefb4fe42a0cabc5aa8d35970dd7d5
Reviewed-on: https://go-review.googlesource.com/c/go/+/460375
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-19 13:53:21 +00:00
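
The configuration this commit is concerned with can be set up either via environment variables (GOGC=off GOMEMLIMIT=1GiB) or programmatically; a small sketch:

    package main

    import "runtime/debug"

    func main() {
        // Disable the GOGC-based goal and let the memory limit drive pacing.
        debug.SetGCPercent(-1)
        debug.SetMemoryLimit(1 << 30) // 1 GiB

        // With this change the memory-limit-based heap goal keeps ~3%
        // (about 30 MiB here) of headroom below the limit, rather than a
        // fixed 1 MiB, before allocations start scavenging eagerly.
        // ... allocation-heavy workload ...
    }
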
Cherry Mui
be4fe08b57 reflect: do not escape Value.Type
Types are either static (for compiler-created types) or heap
allocated and always reachable (for reflection-created types, held
in the central map). So there is no need to escape types.

With CL 408826 reflect.Value does not always escape. Some functions
that escape Value.typ would make the Value escape without this CL.

A special case had to be added for the inliner to keep (*Value).Type
inlineable.

Change-Id: I7c14d35fd26328347b509a06eb5bd1534d40775f
Reviewed-on: https://go-review.googlesource.com/c/go/+/413474
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-12 21:11:51 +00:00
Felix Geisendörfer
c3db9af3a6 runtime: remove unused skip arg from fpTracebackPCs
This was accidentally left behind when moving the logic to set the skip
sentinel in pcBuf to the caller.

Change-Id: Id7565f6ea4df6b32cf18b99c700bca322998d182
Reviewed-on: https://go-review.googlesource.com/c/go/+/489095
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
2023-05-12 13:51:15 +00:00
Michael Pratt
96add980ad Revert "runtime: rename getcallerfp to getfp"
This reverts CL 481617.

Reason for revert: breaks test build on Windows

Change-Id: Ifc1a323b0cc521e7a5a1f7de7b3da667f5fee375
Reviewed-on: https://go-review.googlesource.com/c/go/+/494377
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-11 17:15:06 +00:00
Felix Geisendörfer
bc9e21351c runtime: rename getcallerfp to getfp
The previous name was wrong due to the mistaken assumption that calling
f->g->getcallerpc and f->g->getcallersp would respectively return the
pc/sp at g. However, they are actually referring to their caller's
caller, i.e. f.

Rename getcallerfp to getfp in order to stay consistent with this
naming convention.

Also see discussion on CL 463835.

For #16638

Change-Id: I07990645da78819efd3db92f643326652ee516f8
Reviewed-on: https://go-review.googlesource.com/c/go/+/481617
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-11 15:18:52 +00:00
David Chase
2e93fe0a9f runtime: move per-type types to internal/abi
Change-Id: I1f031f0f83a94bebe41d3978a91a903dc5bcda66
Reviewed-on: https://go-review.googlesource.com/c/go/+/489276
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-11 13:45:40 +00:00
David Chase
bdc6ae579a internal/abi: refactor (basic) type struct into one definition
This touches a lot of files, which is bad, but it is also good,
since there are N copies of this information commoned into one.

The new files in internal/abi are copied from the end of the stack;
ultimately this will all end up being used.

Change-Id: Ia252c0055aaa72ca569411ef9f9e96e3d610889e
Reviewed-on: https://go-review.googlesource.com/c/go/+/462995
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2023-05-05 14:59:28 +00:00
Felix Geisendörfer
6dca1a29ab runtime: add test for systemstack frame pointer adjustment
Add TestSystemstackFramePointerAdjust as a regression test for CL
489015.

By turning stackPoisonCopy into a var instead of const and introducing
the ShrinkStackAndVerifyFramePointers() helper function, we are able to
trigger the exact combination of events that can crash traceEvent() if
systemstack restores a frame pointer that is pointing into the old
stack.

Updates #59692

Change-Id: I60fc6940638077e3b60a81d923b5f5b4f6d8a44c
Reviewed-on: https://go-review.googlesource.com/c/go/+/489115
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
2023-05-03 14:34:04 +00:00
Michael Anthony Knyszek
15c1276246 runtime: bring back minHeapIdx in scavenge index
The scavenge index currently doesn't guard against overflow, and CL
436395 removed the minHeapIdx optimization that allows the chunk scan to
skip scanning chunks that haven't been mapped for the heap, and are only
available as a consequence of chunks' mapped region being rounded out to
a page on both ends.

Because the 0'th chunk is never mapped, minHeapIdx effectively prevents
overflow, fixing the iOS breakage.

This change also refactors growth and initialization a little bit to
decouple them from pageAlloc and share code across platforms.

Change-Id: If7fc3245aa81cf99451bf8468458da31986a9b0a
Reviewed-on: https://go-review.googlesource.com/c/go/+/486695
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2023-04-20 20:08:25 +00:00
Michael Anthony Knyszek
8fa9e3beee runtime: manage huge pages explicitly
This change makes it so that on Linux the Go runtime explicitly marks
page heap memory as either available to be backed by hugepages or not
using heuristics based on density.

The motivation behind this change is twofold:
1. In default Linux configurations, khugepaged can recoalesce hugepages
   even after the scavenger breaks them up, resulting in significant
   overheads for applications with small heaps when those heaps shrink.
2. The Go runtime already has some heuristics about this, but those
   heuristics appear to have bit-rotted and result in haphazard
   hugepage management. Unlucky (but otherwise fairly dense) regions of
   memory end up not backed by huge pages while sparse regions end up
   accidentally marked MADV_HUGEPAGE and are not later broken up by the
   scavenger, because it already got the memory it needed from more
   dense sections (this is more likely to happen with small heaps that
   go idle).

In this change, the runtime uses a new policy:

1. Mark all new memory MADV_HUGEPAGE.
2. Track whether each page chunk (4 MiB) became dense during the GC
   cycle. Mark those MADV_HUGEPAGE, and hide them from the scavenger.
3. If a chunk is not dense for 1 full GC cycle, make it visible to the
   scavenger.
4. The scavenger marks a chunk MADV_NOHUGEPAGE before it scavenges it.

This policy is intended to try to back memory that is a good candidate
for huge pages (high occupancy) with huge pages, and give memory that is
not (low occupancy) to the scavenger. Occupancy is defined not just by
occupancy at any instant of time, but also occupancy in the near future.
It's generally true that by the end of a GC cycle the heap gets quite
dense (from the perspective of the page allocator).

Because we want scavenging and huge page management to happen together
(the right time to MADV_NOHUGEPAGE is just before scavenging in order to
break up huge pages and keep them that way) and the cost of applying
MADV_HUGEPAGE and MADV_NOHUGEPAGE is somewhat high, the scavenger avoids
releasing memory in dense page chunks. All this together means the
scavenger will now more generally release memory on a ~1 GC cycle delay.

Notably this has implications for scavenging to maintain the memory
limit and the runtime/debug.FreeOSMemory API. This change makes it so
that in these cases all memory is visible to the scavenger regardless of
sparseness and delays the page allocator in re-marking this memory with
MADV_NOHUGEPAGE for around 1 GC cycle to mitigate churn.

The end result of this change should be little-to-no performance
difference for dense heaps (MADV_HUGEPAGE works a lot like the default
unmarked state) but should allow the scavenger to more effectively take
back fragments of huge pages. The main risk here is churn, because
MADV_HUGEPAGE usually forces the kernel to immediately back memory with
a huge page. That's the reason for the large amount of hysteresis (1
full GC cycle) and why the definition of high density is 96% occupancy.

Fixes #55328.

Change-Id: I8da7998f1a31b498a9cc9bc662c1ae1a6bf64630
Reviewed-on: https://go-review.googlesource.com/c/go/+/436395
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-19 14:30:00 +00:00
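
The kernel-facing side of the policy above boils down to madvise calls on page chunks. The following standalone, Linux-only sketch uses golang.org/x/sys/unix; the helper and its signature are illustrative, not the runtime's code:

    package hugepage

    import "golang.org/x/sys/unix"

    const chunkBytes = 4 << 20 // page chunk size from the commit message

    // advise applies the policy described above to one chunk: chunks about
    // to be scavenged are marked MADV_NOHUGEPAGE so the kernel does not
    // immediately re-back them with huge pages, and chunks that became
    // dense (>= 96% occupancy) during the GC cycle are marked MADV_HUGEPAGE.
    func advise(chunk []byte, occupancy float64, aboutToScavenge bool) error {
        switch {
        case aboutToScavenge:
            return unix.Madvise(chunk, unix.MADV_NOHUGEPAGE)
        case occupancy >= 0.96:
            return unix.Madvise(chunk, unix.MADV_HUGEPAGE)
        default:
            return nil
        }
    }
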
Felix Geisendörfer
25f9af6661 runtime: add a benchmark of FPCallers
This allows comparing frame pointer unwinding against the default
unwinder as shown below.

goos: linux
goarch: amd64
pkg: runtime
cpu: Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz
                    │ callers.txt │
                    │   sec/op    │
Callers/cached-32     1.254µ ± 0%
FPCallers/cached-32   24.99n ± 0%

For #16638

Change-Id: I4dd05f82254726152ef4a5d5beceab33641e9d2b
Reviewed-on: https://go-review.googlesource.com/c/go/+/475795
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2023-03-30 19:18:18 +00:00
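
FPCallers itself is only exported to the runtime's own tests, but the conventional-unwinder side of the comparison can be reproduced with an ordinary benchmark in a _test.go file; a rough sketch:

    package traceback

    import (
        "runtime"
        "testing"
    )

    func BenchmarkCallers(b *testing.B) {
        pcs := make([]uintptr, 32)
        runtime.Callers(1, pcs) // warm any unwind caches before timing
        b.ResetTimer()
        for i := 0; i < b.N; i++ {
            runtime.Callers(1, pcs)
        }
    }
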
Austin Clements
9eba17ff90 runtime: for deep stacks, print both the top 50 and bottom 50 frames
This is relatively easy using the new traceback iterator.

Ancestor tracebacks are now limited to 50 frames. We could keep that
at 100, but the fact that it used 100 before seemed arbitrary and
unnecessary.

Fixes #7181
Updates #54466

Change-Id: If693045881d84848f17e568df275a5105b6f1cb0
Reviewed-on: https://go-review.googlesource.com/c/go/+/475960
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2023-03-21 19:14:14 +00:00
Austin Clements
3e360b035f runtime: new API for filling PC traceback buffers
Currently, filling PC traceback buffers is one of the jobs of
gentraceback. This moves it into a new function, tracebackPCs, with a
simple API built around unwinder, and changes all callers to use this
new API.

Updates #54466.

Change-Id: Id2038bded81bf533a5a4e71178a7c014904d938c
Reviewed-on: https://go-review.googlesource.com/c/go/+/468300
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
2023-03-10 17:59:37 +00:00
Nick Ripley
51225f6fc6 runtime: record parent goroutine ID, and print it in stack traces
Fixes #38651

Change-Id: Id46d684ee80e208c018791a06c26f304670ed159
Reviewed-on: https://go-review.googlesource.com/c/go/+/435337
Run-TryBot: Nick Ripley <nick.ripley@datadoghq.com>
Reviewed-by: Ethan Reesor <ethan.reesor@gmail.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-02-21 17:35:22 +00:00
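
With this change, the "created by" line in a goroutine's traceback also names the goroutine that created it (output resembling "created by main.main in goroutine 1"; exact wording may differ). A tiny program to see it:

    package main

    import "time"

    func main() {
        go func() {
            // The traceback for this panic now records which goroutine
            // created the panicking one.
            panic("boom")
        }()
        time.Sleep(time.Second)
    }
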
Keith Randall
d6171c9be2 runtime: fix conflict between lfstack and checkptr
lfstack does very unsafe things. In particular, it will not
work with nodes that live on the heap. In normal use by the runtime,
nodes do not live on the heap (lfstack is only used for gc work bufs).
But the lfstack test does use heap objects. It goes through some hoops
to prevent
premature deallocation, but those hoops are not enough to convince
-d=checkptr that everything is ok.

Instead, allocate the test objects outside the heap, like the runtime
does for all of its lfstack usage. Remove the lifetime workaround
from the test.

Reported in https://groups.google.com/g/golang-nuts/c/psjrUV2ZKyI

Change-Id: If611105eab6c823a4d6c105938ce145ed731781d
Reviewed-on: https://go-review.googlesource.com/c/go/+/448899
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
2022-11-17 23:12:04 +00:00
Austin Clements
e72da1c15d runtime: skip TestArenaCollision on failed reservation
If TestArenaCollision cannot reserve the address range it expects to
reserve, it currently fails somewhat mysteriously. Detect this case
and skip the test. This could lead to test rot if we wind up always
skipping this test, but it's not clear that there's a better answer.
If the test does fail, we now also log what it thinks it reserved so
the failure message is more useful in debugging any issues.

Fixes #49415
Fixes #54597

Change-Id: I05cf27258c1c0a7a3ac8d147f36bf8890820d59b
Reviewed-on: https://go-review.googlesource.com/c/go/+/446877
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2022-11-01 17:06:23 +00:00
Michael Pratt
f2656f20ea cmd/compile,cmd/link,runtime: add start line numbers to func metadata
This adds the function "start line number" to runtime._func and
runtime.inlinedCall objects. The "start line number" is the line number
of the func keyword or TEXT directive for assembly.

Subtracting the start line number from a PC's line number provides the
relative line offset of that PC from the start of the function. This
helps with source stability by allowing code above the function to move
without invalidating samples within the function.

Encoding start line rather than relative lines directly is convenient
because the pprof format already contains a start line field.

This CL uses a straightforward encoding of explicitly including a start
line field in every _func and inlinedCall. It is possible that we could
compress this further in the future. e.g., functions with a prologue
usually have <line of PC 0> == <start line>. In runtime.test, 95% of
functions have <line of PC 0> == <start line>.

According to bent, this is geomean +0.83% binary size vs master and
-0.31% binary size vs 1.19.

Note that //line directives can change the file and line numbers
arbitrarily. The encoded start line is as adjusted by //line directives.
Since this can change in the middle of a function, `line - start line`
offset calculations may not be meaningful if //line directives are in
use.

For #55022.

Change-Id: Iaabbc6dd4f85ffdda294266ef982ae838cc692f6
Reviewed-on: https://go-review.googlesource.com/c/go/+/429638
Run-TryBot: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-10-14 14:47:12 +00:00
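
Since the pprof format already carries a start line per function, the relative offset described above can be computed from a profile using the github.com/google/pprof/profile package; a sketch (the profile file name is hypothetical):

    package main

    import (
        "fmt"
        "log"
        "os"

        "github.com/google/pprof/profile"
    )

    func main() {
        f, err := os.Open("cpu.pprof")
        if err != nil {
            log.Fatal(err)
        }
        defer f.Close()

        p, err := profile.Parse(f)
        if err != nil {
            log.Fatal(err)
        }
        for _, loc := range p.Location {
            for _, ln := range loc.Line {
                if ln.Function == nil || ln.Function.StartLine == 0 {
                    continue
                }
                // Offset of this sample's line from the function's start
                // line; stable even if code above the function moves.
                fmt.Printf("%s +%d\n", ln.Function.Name, ln.Line-ln.Function.StartLine)
            }
        }
    }
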
Michael Anthony Knyszek
e0d01b8467 arena: add experimental arena package
This change adds the arena package and a function to reflect for
allocating from an arena via reflection, but all the new API is placed
behind a GOEXPERIMENT.

For #51317.

Change-Id: I026d46294e26ab386d74625108c19a0024fbcedc
Reviewed-on: https://go-review.googlesource.com/c/go/+/423361
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-10-12 20:23:36 +00:00
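
With GOEXPERIMENT=arenas set at build time, the experimental API can be exercised like this sketch (the API sits behind the experiment flag and may change):

    //go:build goexperiment.arenas

    package main

    import (
        "arena"
        "fmt"
    )

    type point struct{ x, y int }

    func main() {
        a := arena.NewArena()
        defer a.Free() // manually free everything allocated from the arena

        p := arena.New[point](a)
        p.x, p.y = 1, 2

        s := arena.MakeSlice[int](a, 0, 4)
        s = append(s, p.x+p.y)
        fmt.Println(*p, s)
    }
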
Michael Anthony Knyszek
7866538d25 runtime: add safe arena support to the runtime
This change adds an API to the runtime for arenas. A later CL can
potentially export it as an experimental API, but for now, just the
runtime implementation will suffice.

The purpose of arenas is to improve efficiency, primarily by allowing
an application to manually free memory, thereby delaying garbage
collection. It comes with other potential performance benefits, such as
better locality, a better allocation strategy, and better handling of
interior pointers by the GC.

This implementation is based on one by danscales@google.com with a few
significant differences:
* The implementation lives entirely in the runtime (all layers).
* Arena chunks are the minimum of 8 MiB or the heap arena size. This
  choice is made because in practice 64 MiB appears to be way too large
  of an area for most real-world use-cases.
* Arena chunks are not unmapped; instead, they're placed on an evacuation
  list and when there are no pointers left pointing into them, they're
  allowed to be reused.
* Reusing partially-used arena chunks no longer tries to find one used
  by the same P first; it just takes the first one available.
* In order to ensure worst-case fragmentation is never worse than 25%,
  only types and slice backing stores whose sizes are 1/4th the size of
  a chunk or less may be used. Previously larger sizes, up to the size
  of the chunk, were allowed.
* ASAN, MSAN, and the race detector are fully supported.
* Sets arena chunks to fault that were deferred at the end of mark
  termination (a non-public patch once did this; I don't see a reason
  not to continue that).

For #51317.

Change-Id: I83b1693a17302554cb36b6daa4e9249a81b1644f
Reviewed-on: https://go-review.googlesource.com/c/go/+/423359
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-10-12 20:23:30 +00:00
Michael Anthony Knyszek
63ceff95fa runtime/metrics: add /sync/mutex/wait/total:seconds metric
This change adds a metric to the runtime/metrics package which tracks
total mutex wait time for sync.Mutex and sync.RWMutex. The purpose of
this metric is to be able to quickly get an idea of the total mutex wait
time.

The implementation of this metric piggybacks off of the existing G
runnable tracking infrastructure, as well as the wait reason set on a G
when it goes into _Gwaiting.

Fixes #49881.

Change-Id: I4691abf64ac3574bec69b4d7d4428b1573130517
Reviewed-on: https://go-review.googlesource.com/c/go/+/427618
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-09-16 16:33:08 +00:00
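
Reading the new metric from application code is straightforward; a small sketch:

    package main

    import (
        "fmt"
        "runtime/metrics"
    )

    func main() {
        samples := []metrics.Sample{{Name: "/sync/mutex/wait/total:seconds"}}
        metrics.Read(samples)
        if samples[0].Value.Kind() == metrics.KindFloat64 {
            fmt.Printf("total mutex wait: %.3fs\n", samples[0].Value.Float64())
        }
    }
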
Michael Anthony Knyszek
87eda2a782 runtime: shrink time histogram buckets
There are lots of useless buckets with too much precision. Introduce a
minimum level of precision with a minimum bucket bit. This cuts down on
the size of a time histogram dramatically (~3x). Also, pick a smaller
sub-bucket count; we don't need 6% precision.

Also, rename super-buckets to buckets to more closely line up with HDR
histogram literature.

Change-Id: I199449650e4b34f2a6dca3cf1d8edb071c6655c0
Reviewed-on: https://go-review.googlesource.com/c/go/+/427615
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2022-09-16 16:32:01 +00:00
Leonard Wang
bd5595d7fa runtime: refactor finalizer goroutine status
Use an atomic.Uint32 to represent the state of the finalizer goroutine.
fingStatus will only be changed to fingWake in a non-fingWait state,
so it is safe to set the fingRunningFinalizer status in runfinq.

name            old time/op  new time/op  delta
Finalizer-8      592µs ± 4%   561µs ± 1%  -5.22%  (p=0.000 n=10+10)
FinalizerRun-8   694ns ± 6%   675ns ± 7%    ~     (p=0.059 n=9+8)

Change-Id: I7e4da30cec98ce99f7d8cf4c97f933a8a2d1cae1
Reviewed-on: https://go-review.googlesource.com/c/go/+/400134
Reviewed-by: Joedian Reid <joedian@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Michael Pratt <mpratt@google.com>
2022-09-05 08:28:34 +00:00
cuiweixie
fa0e3bffb4 runtime: convert semaRoot.nwait to atomic type
For #53821

Change-Id: I686fe81268f70acc6a4c3e6b1d3ed0e07bb0d61c
Reviewed-on: https://go-review.googlesource.com/c/go/+/425775
Run-TryBot: xie cui <523516579@qq.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
2022-08-31 15:33:00 +00:00
Cuong Manh Le
a719a78c1b runtime: add and use runtime/internal/sys.NotInHeap
Updates #46731

Change-Id: Ic2208c8bb639aa1e390be0d62e2bd799ecf20654
Reviewed-on: https://go-review.googlesource.com/c/go/+/421878
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2022-08-19 00:29:18 +00:00
Michael Pratt
b04e4637db runtime: convert timeHistogram to atomic types
I've dropped the note that sched.timeToRun is protected by sched.lock,
as it does not seem to be true.

For #53821.

Change-Id: I03f8dc6ca0bcd4ccf3ec113010a0aa39c6f7d6ef
Reviewed-on: https://go-review.googlesource.com/c/go/+/419449
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Pratt <mpratt@google.com>
2022-08-12 01:53:11 +00:00
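
This and the neighboring "convert ... to atomic type" CLs all follow the same pattern. A simplified sketch using the equivalent sync/atomic types (the runtime uses runtime/internal/atomic, and the field layout here is illustrative rather than the real one):

    package stats

    import "sync/atomic"

    // Before: plain uint64 fields updated with atomic.Xadd64/Load64 plus a
    // comment naming the lock that supposedly protects them. After: typed
    // atomics, so every access is atomic by construction.
    type timeHistogram struct {
        underflow atomic.Uint64
        counts    [64]atomic.Uint64
    }

    func (h *timeHistogram) record(bucket int) {
        if bucket < 0 {
            h.underflow.Add(1)
            return
        }
        h.counts[bucket].Add(1)
    }
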
Michael Pratt
b8f4847d6f runtime: convert gcController.globalsScan to atomic type
For #53821.

Change-Id: I92bd33e355c868ae229395fd9c98fdb10768d03d
Reviewed-on: https://go-review.googlesource.com/c/go/+/417779
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2022-08-08 14:11:40 +00:00
Michael Pratt
02fb9b8ca9 runtime: convert gcController.maxStackScan to atomic type
For #53821.

Change-Id: I1bd23cdbc371011ec2331fb0a37482ecf99a063b
Reviewed-on: https://go-review.googlesource.com/c/go/+/417778
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-08 14:11:32 +00:00
Michael Pratt
6e9925c4f7 runtime: convert gcController.heapScan to atomic type
For #53821.

Change-Id: I64d3f53c89a579d93056906304e4c05fc35cd9b3
Reviewed-on: https://go-review.googlesource.com/c/go/+/417776
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-08 14:11:19 +00:00
Michael Pratt
3a9281ff61 runtime: convert gcController.heapLive to atomic type
Atomic operations are used even during STW for consistency.

For #53821.

Change-Id: Ibe7afe5cf893b1288ce24fc96b7691b1f81754ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/417775
Run-TryBot: Michael Pratt <mpratt@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-08 14:11:09 +00:00
Michael Pratt
29b9a328d2 runtime: trivial replacements of g in remaining files
Rename g variables to gp for consistency.

Change-Id: I09ecdc7e8439637bc0e32f9c5f96f515e6436362
Reviewed-on: https://go-review.googlesource.com/c/go/+/418591
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
2022-08-02 18:52:33 +00:00
Michael Pratt
5e8d261918 runtime: rename _p_ to pp
_g_, _p_, and _m_ are primarily vestiges of the C version of the
runtime, while today we prefer Go-style variable names (generally gp,
pp, and mp).

This change replaces all remaining uses of _p_ with pp. These are all
trivial replacements (i.e., no conflicts). That said, there are several
functions that refer to two different Ps at once. There the naming
convention is generally that pp refers to the local P, and p2 refers to
the other P we are accessing.

Change-Id: I205b801be839216972e7644b1fbeacdbf2612859
Reviewed-on: https://go-review.googlesource.com/c/go/+/306674
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-02 18:51:06 +00:00
Michael Pratt
d6481d5b96 runtime: add race annotations to metricsSema
metricsSema protects the metrics map. The map implementation is race
instrumented regardless of which package it is called from.

semacquire/semrelease are not automatically race instrumented, so we can
trigger race false positives without manually annotating our lock
acquire and release.

See similar instrumentation on trace.shutdownSema and reflectOffs.lock.

Fixes #53542.

Change-Id: Ia3fd239ac860e037d09c7cb9c4ad267391e70705
Reviewed-on: https://go-review.googlesource.com/c/go/+/414517
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-06-29 16:30:19 +00:00
Michael Anthony Knyszek
7bad61554e runtime: write much more direct test for semaphore waiter scalability
This test originally existed as two tests in test/locklinear.go, but
this checked against actual locks and was flaky. The test was checking
a property of a deep part of the runtime but from a much higher level,
and it's easy for nondeterminism due to scheduling to completely mess
that up, especially on an oversubscribed system.

That test was then moved to the sync package with a more rigorous
testing methodology, but it could still flake pretty easily.

Finally, this CL makes semtable more testable, exports it in
export_test.go, then writes a very direct scalability test for exactly
the situation the original test described. As far as I can tell, this is
much, much more stable, because it's single-threaded and is just
checking exactly the algorithm we need to check.

Don't bother trying to bring in a test that checks for O(log n) behavior
on the other kind of iteration. It'll be perpetually flaky because the
underlying data structure is a treap, so it's only _expected_ to be
O(log n), but it's very easy for it to get unlucky without a large
number of iterations that's too much for a simple test.

Fixes #53381.

Change-Id: Ia1cd2d2b0e36d552d5a8ae137077260a16016602
Reviewed-on: https://go-review.googlesource.com/c/go/+/412875
Reviewed-by: Michael Pratt <mpratt@google.com>
2022-06-16 21:25:35 +00:00
Michael Anthony Knyszek
d7941030c9 runtime: only use CPU time from the current window in the GC CPU limiter
Currently the GC CPU limiter consumes CPU time from a few pools, but
because the events that flush to those pools may overlap, rather than be
strictly contained within, the update window for the GC CPU limiter, the
limiter's accounting is ultimately sloppy.

This sloppiness complicates accounting for idle time more completely,
and makes reasoning about the transient behavior of the GC CPU limiter
much more difficult.

To remedy this, this CL adds a field to the P struct that tracks the
start time of any in-flight event the limiter might care about, along
with information about the nature of that event. This timestamp is
managed atomically so that the GC CPU limiter can come in and perform a
read of the partial CPU time consumed by a given event. The limiter also
updates the timestamp so that only what's left over is flushed by the
event itself when it completes.

The end result of this change is that, since the GC CPU limiter is aware
of all past completed events, and all in-flight events, it can much more
accurately collect the CPU time of events since the last update. There's
still the possibility for skew, but any leftover time will be captured
in the following update, and the magnitude of this leftover time is
effectively bounded by the update period of the GC CPU limiter, which is
much easier to consider.

One caveat of managing this timestamp-type combo atomically is that they
need to be packed in 64 bits. So, this CL gives up the top 3 bits of the
timestamp and places the type information there. What this means is we
effectively have only a 61-bit resolution timestamp. This is fine when
the top 3 bits are the same between calls to nanotime, but becomes a
problem on boundaries when those 3 bits change. These cases may cause
hiccups in the GC CPU limiter by not accounting for some source of CPU
time correctly, but with 61 bits of resolution this should be extremely
rare. The rate of update is on the order of milliseconds, so at worst
the runtime will be off of any given measurement by only a few
CPU-milliseconds (and this is directly bounded by the rate of update).
We're probably more inaccurate from the fact that we don't measure real
CPU time but only approximate it.

For #52890.

Change-Id: I347f30ac9e2ba6061806c21dfe0193ef2ab3bbe9
Reviewed-on: https://go-review.googlesource.com/c/go/+/410120
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-06-03 20:16:15 +00:00
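
A sketch of the 61-bit-timestamp-plus-3-bit-type packing described above (names are illustrative, not the runtime's):

    package limiter

    import "sync/atomic"

    const (
        typeBits = 3
        timeBits = 64 - typeBits
        timeMask = uint64(1)<<timeBits - 1
    )

    // limiterEvent holds the in-flight event's type in the top 3 bits and
    // its start time (a truncated nanotime) in the low 61 bits.
    type limiterEvent struct {
        packed atomic.Uint64
    }

    func pack(typ uint8, start int64) uint64 {
        return uint64(typ)<<timeBits | uint64(start)&timeMask
    }

    // consume reads the partial CPU time accrued by the in-flight event and
    // advances its start time to now, so the event flushes only the
    // remainder when it completes.
    func (e *limiterEvent) consume(now int64) (typ uint8, elapsed int64) {
        for {
            old := e.packed.Load()
            typ = uint8(old >> timeBits)
            elapsed = now&int64(timeMask) - int64(old&timeMask)
            if e.packed.CompareAndSwap(old, pack(typ, now)) {
                return typ, elapsed
            }
        }
    }
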
Michael Anthony Knyszek
1f0ef6bec7 runtime: cancel mark and scavenge assists if the limiter is enabled
This change forces mark and scavenge assists to be cancelled early if
the limiter is enabled. This avoids goroutines getting stuck in really
long assists if the limiter happens to be disabled when they first come
into the assist. This can get especially bad for mark assists, which, in
dire situations, can end up "owing" the GC a really significant debt.

For #52890.

Change-Id: I4bfaa76b8de3e167d49d2ffd8bc2127b87ea566a
Reviewed-on: https://go-review.googlesource.com/c/go/+/408816
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-27 17:30:15 +00:00
Michael Anthony Knyszek
2eb8b6eec6 runtime: make CPU limiter assist time much less error-prone
At the expense of performance (having to update another atomic counter)
this change makes CPU limiter assist time much less error-prone to
manage. There are currently a number of issues with respect to how
scavenge assist time is treated, and this change resolves those by just
having the limiter maintain its own internal pool that's drained on each
update.

While we're here, clear the measured assist time each cycle, which was
the impetus for the change.

Change-Id: I84c513a9f012b4007362a33cddb742c5779782b7
Reviewed-on: https://go-review.googlesource.com/c/go/+/404304
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-05-13 16:02:20 +00:00
Keith Randall
016d755213 runtime: measure stack usage; start stacks larger if needed
Measure the average stack size used by goroutines at every GC. When
starting a new goroutine, allocate an initial goroutine stack of that
average size. Intuition is that we'll waste at most 2x in stack space
because only half the goroutines can be below average. In turn, we
avoid some of the early stack growth / copying needed in the average
case.

More details in the design doc at: https://docs.google.com/document/d/1YDlGIdVTPnmUiTAavlZxBI1d9pwGQgZT7IKFKlIXohQ/edit?usp=sharing

name        old time/op  new time/op  delta
Issue18138  95.3µs ± 0%  67.3µs ±13%  -29.35%  (p=0.000 n=9+10)

Fixes #18138

Change-Id: Iba34d22ed04279da7e718bbd569bbf2734922eaa
Reviewed-on: https://go-review.googlesource.com/c/go/+/345889
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2022-05-12 22:32:42 +00:00
Michael Anthony Knyszek
e7508598bb runtime: use Escape instead of escape in export_test.go
I landed the bottom CL of my stack without rebasing or retrying trybots,
but in the rebase "escape" was removed in favor of "Escape."

Change-Id: Icdc4d8de8b6ebc782215f2836cd191377cc211df
Reviewed-on: https://go-review.googlesource.com/c/go/+/403755
Reviewed-by: Bryan Mills <bcmills@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-05-03 15:51:30 +00:00
Michael Anthony Knyszek
91f863013e runtime: redesign scavenging algorithm
Currently the runtime's scavenging algorithm involves running from the
top of the heap address space to the bottom (or as far as it gets) once
per GC cycle. Once it treads some ground, it doesn't tread it again
until the next GC cycle.

This works just fine for the background scavenger, for heap-growth
scavenging, and for debug.FreeOSMemory. However, it breaks down in the
face of a memory limit for small heaps in the tens of MiB. Basically,
because the scavenger never retreads old ground, it's completely
oblivious to new memory it could scavenge, and that it really *should*
in the face of a memory limit.

Also, every time some thread goes to scavenge in the runtime, it
reserves what could be a considerable amount of address space, hiding it
from other scavengers.

This change modifies and simplifies the implementation overall. It's
less code with complexities that are much better encapsulated. The
current implementation iterates optimistically over the address space
looking for memory to scavenge, keeping track of what it last saw. The
new implementation does the same, but instead of directly iterating over
pages, it iterates over chunks. It maintains an index of chunks (as a
bitmap over the address space) that indicate which chunks may contain
scavenge work. The page allocator populates this index, while scavengers
consume it and iterate over it optimistically.

This has two key benefits:
1. Scavenging is much simpler: find a candidate chunk, and check it,
   essentially just using the scavengeOne fast path. There's no need for
   the complexity of iterating beyond one chunk, because the index is
   lock-free and already maintains that information.
2. If pages are freed to the page allocator (always guaranteed to be
   unscavenged), the page allocator immediately notifies all scavengers
   of the new source of work, avoiding the hiding issues of the old
   implementation.

One downside of the new implementation, however, is that it's
potentially more expensive to find pages to scavenge. In the past, if
a single page would become free high up in the address space, the
runtime's scavengers would ignore it. Now that scavengers won't, one or
more scavengers may need to iterate potentially across the whole heap to
find the next source of work. For the background scavenger, this just
means a potentially less reactive scavenger -- overall it should still
use the same amount of CPU. It means worse overheads for memory limit
scavenging, but that's not exactly something with a baseline yet.

In practice, this shouldn't be too bad, hopefully since the chunk index
is extremely compact. For a 48-bit address space, the index is only 8
MiB in size at worst, but even just one physical page in the index is
able to support up to 128 GiB heaps, provided they aren't terribly
sparse. On 32-bit platforms, the index is only 128 bytes in size.

For #48409.

Change-Id: I72b7e74365046b18c64a6417224c5d85511194fb
Reviewed-on: https://go-review.googlesource.com/c/go/+/399474
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-03 15:13:53 +00:00
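
A sketch of the lock-free chunk index described above: one bit per chunk meaning "this chunk may contain scavengable pages", set by the page allocator when it frees pages and consumed by scavengers. The names are illustrative, not the runtime's implementation:

    package scavenge

    import (
        "math/bits"
        "sync/atomic"
    )

    type scavengeIndex struct {
        chunks []atomic.Uint64 // one bit per chunk of the address space
    }

    // mark records that chunk ci may contain unscavenged free pages.
    func (s *scavengeIndex) mark(ci int) {
        word := &s.chunks[ci/64]
        bit := uint64(1) << (uint(ci) % 64)
        for {
            old := word.Load()
            if old&bit != 0 || word.CompareAndSwap(old, old|bit) {
                return
            }
        }
    }

    // next returns a candidate chunk at or after start, clearing its bit,
    // or -1 if none remain; the caller then checks that chunk directly,
    // much like the scavengeOne fast path.
    func (s *scavengeIndex) next(start int) int {
        for i := start / 64; i < len(s.chunks); i++ {
            for {
                old := s.chunks[i].Load()
                if old == 0 {
                    break
                }
                bit := uint(bits.TrailingZeros64(old))
                if s.chunks[i].CompareAndSwap(old, old&^(uint64(1)<<bit)) {
                    return i*64 + int(bit)
                }
            }
        }
        return -1
    }
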
Michael Anthony Knyszek
7e4bc74119 runtime: set the heap goal from the memory limit
This change makes the memory limit functional by including it in the
heap goal calculation. Specifically, we derive a heap goal from the
memory limit, and compare that to the GOGC-based goal. If the goal based
on the memory limit is lower, we prefer that.

To derive the memory limit goal, the heap goal calculation now takes
a few additional parameters as input. As a result, the heap goal, in the
presence of a memory limit, may change dynamically. The consequences of
this are that different parts of the runtime can have different views of
the heap goal; this is OK. What's important is that all of the runtime
is able to observe the correct heap goal for the moment it's doing
something that affects it, like anything that should trigger a GC cycle.

On the topic of triggering a GC cycle, this change also allows any
manually managed memory allocation from the page heap to trigger a GC.
So, specifically workbufs, unrolled GC scan programs, and goroutine
stacks. The reason for this is that now non-heap memory can affect the
trigger or the heap goal.

Most sources of non-heap memory only change slowly, like GC pointer
bitmaps, or change in response to explicit function calls like
GOMAXPROCS. Note also that unrolled GC scan programs and workbufs are
really only relevant during a GC cycle anyway, so they won't actually
ever trigger a GC. Our primary target here is goroutine stacks.

Goroutine stacks can increase quickly, and this is currently totally
independent of the GC cycle. Thus, if for example a goroutine begins to
recurse suddenly and deeply, then even though the heap goal and trigger
react, we might not notice until it's too late. As a result, we need to
trigger a GC cycle.

We do this trigger in allocManual instead of in stackalloc because it's
far more general. We ultimately care about memory that's mapped
read/write and not returned to the OS, which is much more the domain of
the page heap than the stack allocator. Furthermore, there may be new
sources of memory manual allocation in the future (e.g. arenas) that
need to trigger a GC if necessary. As such, I'm inclined to leave the
trigger in allocManual as an extra defensive measure.

It's worth noting that because goroutine stacks do not behave quite as
predictably as other non-heap memory, there is the potential for the
heap goal to swing wildly. Fortunately, goroutine stacks that haven't
been set up to shrink by the last GC cycle will not shrink until after
the next one. This reduces the amount of possible churn in the heap goal
because it means that shrinkage only happens once per goroutine, per GC
cycle. After all the goroutines that should shrink did, then goroutine
stacks will only grow. The shrink mechanism is analogous to sweeping,
which is incremental and thus tends toward a steady amount of heap
memory used. As a result, in practice, I expect this to be a non-issue.

Note that if the memory limit is not set, this change should be a no-op.

For #48409.

Change-Id: Ie06d10175e5e36f9fb6450e26ed8acd3d30c681c
Reviewed-on: https://go-review.googlesource.com/c/go/+/394221
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2022-05-03 15:13:35 +00:00
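
In very rough terms, the goal selection described above takes the minimum of two goals. A deliberately simplified sketch with illustrative names (the real calculation takes more inputs and handles headroom differently):

    package pacer

    // heapGoal picks the smaller of the GOGC-based goal and the goal
    // derived from the memory limit. All inputs are in bytes; nonHeap
    // covers mapped runtime memory that isn't heap (stacks, workbufs, ...).
    func heapGoal(gcPercent int64, heapMarked, memoryLimit, nonHeap, headroom uint64) uint64 {
        goal := ^uint64(0) // effectively no goal when GOGC=off
        if gcPercent >= 0 {
            goal = heapMarked + heapMarked*uint64(gcPercent)/100
        }
        if memoryLimit > nonHeap+headroom {
            if limitGoal := memoryLimit - nonHeap - headroom; limitGoal < goal {
                goal = limitGoal
            }
        }
        return goal
    }
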
Michael Anthony Knyszek
129dcb7226 runtime: check the heap goal and trigger dynamically
As it stands, the heap goal and the trigger are set once by
gcController.commit, and then read out of gcController. However with the
coming memory limit we need the GC to be able to respond to changes in
non-heap memory. The simplest way of achieving this is to compute the
heap goal and its associated trigger dynamically.

In order to make this easier to implement, the GC trigger is now based
on the heap goal, as opposed to the status quo of computing both
simultaneously. In many cases we just want the heap goal anyway, not
both, but we definitely need the goal to compute the trigger, because
the trigger's bounds are entirely based on the goal (the initial runway
is not). A consequence of this is that we can't rely on the trigger to
enforce a minimum heap size anymore, and we need to lift that up
directly to the goal. Specifically, we need to lift up any part of the
calculation that *could* put the trigger ahead of the goal. Luckily this
is just the heap minimum and minimum sweep distance. In the first case,
the pacer may behave slightly differently, as the heap minimum is no
longer the minimum trigger, but the actual minimum heap goal. In the
second case it should be the same, as we ensure the additional runway
for sweeping is added to both the goal *and* the trigger, as before, by
computing that in gcControllerState.commit.

There's also another place we update the heap goal: if a GC starts and
we triggered beyond the goal, we always ensure there's some runway.
That calculation uses the current trigger, which violates the rule that
the goal should not be based on the trigger. Notice, however, that using the
precomputed trigger for this isn't even quite correct: due to a bug, or
something else, we might trigger a GC beyond the precomputed trigger.

So this change also adds a "triggered" field to gcControllerState that
tracks the point at which a GC actually triggered. This is independent
of the precomputed trigger, so it's fine for the heap goal calculation
to rely on it. It also turns out, there's more than just that one place
where we really should be using the actual trigger point, so this change
fixes those up too.

Also, because the heap minimum is set by the goal and not the trigger,
the maximum trigger calculation now happens *after* the goal is set, so
the maximum trigger actually does what I originally intended (and what
the comment says): at small heaps, the pacer picks 95% of the runway as
the maximum trigger. Currently, the pacer picks a small trigger based
on a not-yet-rounded-up heap goal, so the trigger gets rounded up to the
goal, and as per the "ensure there's some runway" check, the runway ends
up always being 64 KiB. That check is supposed to be for exceptional
circumstances, not the status quo. There's a test introduced in the last
CL that needs to be updated to accommodate this slight change in
behavior.

So, this all sounds like a lot that changed, but what we're talking about
here are really, really tight corner cases that arise from situations
outside of our control, like pathologically bad behavior on the part of
an OS or CPU. Even in these corner cases, it's very unlikely that users
will notice any difference at all. What's more important, I think, is
that the pacer behaves more closely to what all the comments describe,
and what the original intent was.

Another note: at first, one might think that computing the heap goal and
trigger dynamically introduces some raciness, but not in this CL: the heap
goal and trigger are completely static.

Allocation outside of a GC cycle may now be a bit slower than before, as
the GC trigger check is now significantly more complex. However, note
that this executes basically just as often as gcController.revise, and
that makes up a vanishingly small part of any CPU profile. The next
CL cleans up the floating point multiplications on this path
nonetheless, just to be safe.

For #48409.

Change-Id: I280f5ad607a86756d33fb8449ad08555cbee93f9
Reviewed-on: https://go-review.googlesource.com/c/go/+/397014
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-03 15:12:52 +00:00
Michael Anthony Knyszek
473c99643f runtime: rewrite pacer max trigger calculation
Currently the maximum trigger calculation is totally incorrect with
respect to the comment above it and its intent. This change rectifies
this mistake.

For #48409.

Change-Id: Ifef647040a8bdd304dd327695f5f315796a61a74
Reviewed-on: https://go-review.googlesource.com/c/go/+/398834
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-03 15:12:45 +00:00
Michael Anthony Knyszek
375d696ddf runtime: move inconsistent memstats into gcController
Fundamentally, all of these memstats exist to serve the runtime in
managing memory. For the sake of simpler testing, couple these stats
more tightly with the GC.

This CL was mostly done automatically. The fields had to be moved
manually, but the references to the fields were updated via

    gofmt -w -r 'memstats.<field> -> gcController.<field>' *.go

For #48409.

Change-Id: Ic036e875c98138d9a11e1c35f8c61b784c376134
Reviewed-on: https://go-review.googlesource.com/c/go/+/397678
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-03 15:12:38 +00:00
Michael Anthony Knyszek
4649a43903 runtime: track how much memory is mapped in the Ready state
This change adds a field to memstats called mappedReady that tracks how
much memory is in the Ready state at any given time. In essence, it's
the total memory usage by the Go runtime (with one exception which is
documented). Essentially, all memory mapped read/write that has either
been paged in or soon will be.

To make tracking this not involve the many different stats that track
mapped memory, we track this statistic at a very low level. The downside
of tracking this statistic at such a low level is that it managed to
catch lots of situations where the runtime wasn't fully accounting for
memory. This change rectifies these situations by always accounting for
memory that's mapped in some way (i.e. always passing a sysMemStat to a
mem.go function), with *two* exceptions.

Rectifying these situations means also having the memory mapped during
testing being accounted for, so that tests (i.e. ReadMemStats) that
ultimately check mappedReady continue to work correctly without special
exceptions. We choose to simply account for this memory in other_sys.

Let's talk about the exceptions. The first is that the arenas array, used
for finding heap arena metadata from an address, is mapped as read/write in
one large chunk. It's tens of MiB in size. On systems with demand
paging, we assume that the whole thing isn't paged in at once (after
all, it maps to the whole address space, and it's exceedingly difficult
with today's technology to even approach having as much physical memory as
the total address space). On systems where we have to commit memory
manually, we use a two-level structure.

Now, the reason why this is an exception is that we have no mechanism
to track what memory is paged in, and we can't just account for the
entire thing, because that would *look* like an enormous overhead.
Furthermore, this structure is on a few really, really critical paths in
the runtime, so doing more explicit tracking isn't really an option. So,
we explicitly don't and call sysAllocOS to map this memory.

The second exception is that we call sysFree with no accounting to clean
up address space reservations, or otherwise to throw out mappings we
don't care about. In this case, also drop down to a lower level and call
sysFreeOS to explicitly avoid accounting.

The third exception is debuglog allocations. That is purely a debugging
facility and ideally we want it to have as small an impact on the
runtime as possible. If we include it in mappedReady calculations, it
could cause GC pacing shifts in future CLs, especially if one increases
the debuglog buffer sizes as a one-off.

As of this CL, these are the only three places in the runtime that would
pass nil for a stat to any of the functions in mem.go. As a result, this
CL makes sysMemStats mandatory to facilitate better accounting in the
future. It's now much easier to grep and find out where accounting is
explicitly elided, because one doesn't have to follow the trail of
sysMemStat nil pointer values, and can just look at the function name.

For #48409.

Change-Id: I274eb467fc2603881717482214fddc47c9eaf218
Reviewed-on: https://go-review.googlesource.com/c/go/+/393402
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-05-03 15:12:21 +00:00
Michael Anthony Knyszek
986a31053d runtime: add a non-functional memory limit to the pacer
Nothing much to see here, just some plumbing to make latter CLs smaller
and clearer.

For #48409.

Change-Id: Ide23812d5553e0b6eea5616c277d1a760afb4ed0
Reviewed-on: https://go-review.googlesource.com/c/go/+/393401
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2022-05-03 15:12:04 +00:00
Michael Anthony Knyszek
0feebe6eb5 runtime: add byte count parser for GOMEMLIMIT
This change adds a parser for the GOMEMLIMIT environment variable's
input. This environment variable accepts a number followed by an
optional prefix expressing the unit. Acceptable units include
B, KiB, MiB, GiB, TiB, where *iB is a power-of-two byte unit.

For #48409.

Change-Id: I6a3b4c02b175bfcf9c4debee6118cf5dda93bb6f
Reviewed-on: https://go-review.googlesource.com/c/go/+/393400
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-05-03 15:11:55 +00:00
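
The runtime's parser is internal, but the accepted format is easy to sketch in user code (suffixes are checked longest-first so that "B" doesn't shadow "KiB" and friends):

    package main

    import (
        "fmt"
        "strconv"
        "strings"
    )

    func parseByteCount(s string) (int64, error) {
        units := []struct {
            suffix string
            mult   int64
        }{
            {"TiB", 1 << 40}, {"GiB", 1 << 30}, {"MiB", 1 << 20}, {"KiB", 1 << 10}, {"B", 1},
        }
        mult := int64(1)
        for _, u := range units {
            if strings.HasSuffix(s, u.suffix) {
                s, mult = strings.TrimSuffix(s, u.suffix), u.mult
                break
            }
        }
        n, err := strconv.ParseInt(strings.TrimSpace(s), 10, 64)
        if err != nil {
            return 0, err
        }
        return n * mult, nil
    }

    func main() {
        n, err := parseByteCount("512MiB")
        fmt.Println(n, err) // 536870912 <nil>
    }
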
Michael Knyszek
01359b4681 runtime: add GC CPU utilization limiter
This change adds a GC CPU utilization limiter to the GC. It disables
assists to ensure GC CPU utilization remains under 50%. It uses a leaky
bucket mechanism that will only fill if GC CPU utilization exceeds 50%.
Once the bucket begins to overflow, GC assists are limited until the
bucket empties, at the risk of GC overshoot. The limiter is primarily
updated by assists. The scheduler may also update it, but only if the
GC is on and a few milliseconds have passed since the last update. This
second case exists to ensure that if the limiter is on, and no assists
are happening, we're still updating the limiter regularly.

The purpose of this limiter is to mitigate GC death spirals, opting to
use more memory instead.

This change turns the limiter on always. In practice, 50% overall GC CPU
utilization is very difficult to hit unless you're trying; even the most
allocation-heavy applications with complex heaps still need to do
something with that memory. Note that small GOGC values (i.e.
single-digit, or low teens) are more likely to trigger the limiter,
which means the GOGC tradeoff may no longer be respected. Even so, it
should still be relatively rare.

This change also introduces the feature flag for code to support the
memory limit feature.

For #48409.

Change-Id: Ia30f914e683e491a00900fd27868446c65e5d3c2
Reviewed-on: https://go-review.googlesource.com/c/go/+/353989
Reviewed-by: Michael Pratt <mpratt@google.com>
2022-05-03 15:11:42 +00:00
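
The leaky-bucket idea reduces to something like the following sketch (illustrative, single-threaded, and not the runtime's limiter): time where GC CPU exceeds the 50% target fills the bucket, mutator time drains it, and assists are switched off while it is full until it drains again. The bucket capacity is chosen at initialization.

    package limiter

    type gcCPULimiter struct {
        fill, capacity int64 // nanoseconds of excess GC CPU time
        enabled        bool  // when true, GC assists are disabled
    }

    // accumulate records GC and mutator CPU time for the last update window.
    func (l *gcCPULimiter) accumulate(gcCPU, mutatorCPU int64) {
        // At exactly 50% utilization gcCPU == mutatorCPU and the fill level
        // is unchanged; above that the bucket fills, below it drains.
        l.fill += gcCPU - mutatorCPU
        if l.fill < 0 {
            l.fill = 0
        }
        if l.fill >= l.capacity {
            l.fill = l.capacity
            l.enabled = true // limit assists, at the risk of GC overshoot
        } else if l.fill == 0 {
            l.enabled = false // bucket has drained; assists allowed again
        }
    }
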
Austin Clements
123e27170a runtime: clean up escaping in tests
There are several tests in the runtime that need to force various
things to escape to the heap. This CL centralizes this functionality
into runtime.Escape, defined in export_test.

Change-Id: I2de2519661603ad46c372877a9c93efef8e7a857
Reviewed-on: https://go-review.googlesource.com/c/go/+/402178
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Austin Clements <austin@google.com>
Auto-Submit: Austin Clements <austin@google.com>
2022-04-28 18:28:44 +00:00