Commit graph

48 commits

Roland Shoemaker
6669aa3b14 runtime: randomize heap base address
During initialization, allow randomizing the heap base address by
generating a random uint64 and using its bits to randomize various
portions of the heap base address.

We use the following method to randomize the base address:

* We first generate a random heapArenaBytes-aligned address that we use
  for generating the hints.
* On the first call to mheap.grow, we then generate a random
  PallocChunkBytes-aligned offset into the mmap'd heap region, which we
  use as the base for the heap region.
* We then mark a random number of pages within the page allocator as
  allocated.

Our final randomized "heap base address" becomes the first byte of
the first available page returned by the page allocator. This results
in an address with at least heapAddrBits-gc.PageShift-1 bits of
entropy.
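
As a rough, self-contained sketch of the idea (the constants and the
exact bit slicing below are illustrative, not the runtime's actual
code):

        package main

        import (
                "fmt"
                "math/rand/v2"
        )

        // Illustrative stand-ins for the real runtime constants.
        const (
                heapArenaBytes   = 64 << 20 // 64 MiB arenas (typical on 64-bit)
                pallocChunkBytes = 4 << 20  // 4 MiB page-allocator chunks
                pageSize         = 8 << 10  // 8 KiB runtime pages
        )

        func main() {
                r := rand.Uint64()

                // High bits: a heapArenaBytes-aligned hint address, clamped to a
                // plausible user-space range.
                hint := (r &^ (heapArenaBytes - 1)) & ((1 << 47) - 1)

                // Middle bits: a pallocChunkBytes-aligned offset into the mmap'd
                // heap region. Low bits: a few whole pages to mark as allocated.
                chunkOff := ((r >> 20) % 16) * pallocChunkBytes
                pageOff := ((r >> 30) % 8) * pageSize

                fmt.Printf("hint %#x, chunk offset %#x, page offset %#x\n",
                        hint, chunkOff, pageOff)
        }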

Fixes #27583

Change-Id: Ideb4450a5ff747a132f702d563d2a516dec91a88
Reviewed-on: https://go-review.googlesource.com/c/go/+/674835
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-07-24 09:59:23 -07:00
Michael Anthony Knyszek
528bafa049 runtime: move sizeclass defs to new package internal/runtime/gc
We will want to reference these definitions from new generator programs,
and this is a good opportunity to cleanup all these old C-style names.

Change-Id: Ifb06f0afc381e2697e7877f038eca786610c96de
Reviewed-on: https://go-review.googlesource.com/c/go/+/655275
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2025-04-23 08:00:33 -07:00
Lénaïc Huard
52eaed6633 runtime: decorate anonymous memory mappings
Leverage the prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, ...) API to name
the anonymous memory areas.

This API was introduced in Linux 5.17 to decorate the anonymous
memory areas shown in /proc/<pid>/maps.

This is already used by glibc. See:
* https://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/malloc.c;h=27dfd1eb907f4615b70c70237c42c552bb4f26a8;hb=HEAD#l2434
* https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/setvmaname.c;h=ea93a5ffbebc9e5a7e32a297138f465724b4725f;hb=HEAD#l63
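
For reference, a minimal user-space sketch of the call itself
(Linux-only; the runtime issues the prctl internally, and the
nameAnonMapping helper and "example: demo" label below are
hypothetical, with the PR_SET_VMA* values taken from the kernel's
uapi headers):

        package main

        import (
                "syscall"
                "unsafe"
        )

        // Values from <linux/prctl.h>.
        const (
                prSetVMA         = 0x53564d41 // PR_SET_VMA
                prSetVMAAnonName = 0          // PR_SET_VMA_ANON_NAME
        )

        // nameAnonMapping labels the anonymous mapping [addr, addr+length) so
        // that it shows up as "[anon: <name>]" in /proc/<pid>/maps (Linux 5.17+).
        func nameAnonMapping(addr, length uintptr, name string) error {
                cname := append([]byte(name), 0) // NUL-terminated for the kernel
                _, _, errno := syscall.Syscall6(syscall.SYS_PRCTL,
                        prSetVMA, prSetVMAAnonName,
                        addr, length, uintptr(unsafe.Pointer(&cname[0])), 0)
                if errno != 0 {
                        return errno
                }
                return nil
        }

        func main() {
                mem, err := syscall.Mmap(-1, 0, 1<<20,
                        syscall.PROT_READ|syscall.PROT_WRITE,
                        syscall.MAP_ANON|syscall.MAP_PRIVATE)
                if err != nil {
                        panic(err)
                }
                _ = nameAnonMapping(uintptr(unsafe.Pointer(&mem[0])), 1<<20, "example: demo")
        }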

This can be useful when investigating the memory consumption of a
multi-language program.
For a 100% Go program, the pprof profiler can be used to profile the
program's memory consumption, but pprof is only aware of what happens
within the Go world.

In a multi-language program, there may be doubt about whether suspicious
extra memory consumption comes from the Go part or from the native part.

With this change, the following Go program:

        package main

        import (
                "fmt"
                "log"
                "os"
        )

        /*
        #include <stdlib.h>

        void f(void)
        {
          (void)malloc(1024*1024*1024);
        }
        */
        import "C"

        func main() {
                C.f()

                data, err := os.ReadFile("/proc/self/maps")
                if err != nil {
                        log.Fatal(err)
                }
                fmt.Println(string(data))
        }

produces this output:

        $ GLIBC_TUNABLES=glibc.mem.decorate_maps=1 ~/doc/devel/open-source/go/bin/go run .
        00400000-00402000 r--p 00000000 00:21 28451768                           /home/lenaic/.cache/go-build/9f/9f25a17baed5a80d03eb080a2ce2a5ff49c17f9a56e28330f0474a2bb74a30a0-d/test_vma_name
        00402000-004a4000 r-xp 00002000 00:21 28451768                           /home/lenaic/.cache/go-build/9f/9f25a17baed5a80d03eb080a2ce2a5ff49c17f9a56e28330f0474a2bb74a30a0-d/test_vma_name
        004a4000-00574000 r--p 000a4000 00:21 28451768                           /home/lenaic/.cache/go-build/9f/9f25a17baed5a80d03eb080a2ce2a5ff49c17f9a56e28330f0474a2bb74a30a0-d/test_vma_name
        00574000-00575000 r--p 00173000 00:21 28451768                           /home/lenaic/.cache/go-build/9f/9f25a17baed5a80d03eb080a2ce2a5ff49c17f9a56e28330f0474a2bb74a30a0-d/test_vma_name
        00575000-00580000 rw-p 00174000 00:21 28451768                           /home/lenaic/.cache/go-build/9f/9f25a17baed5a80d03eb080a2ce2a5ff49c17f9a56e28330f0474a2bb74a30a0-d/test_vma_name
        00580000-005a4000 rw-p 00000000 00:00 0
        2e075000-2e096000 rw-p 00000000 00:00 0                                  [heap]
        c000000000-c000400000 rw-p 00000000 00:00 0                              [anon: Go: heap]
        c000400000-c004000000 ---p 00000000 00:00 0                              [anon: Go: heap reservation]
        777f40000000-777f40021000 rw-p 00000000 00:00 0                          [anon: glibc: malloc arena]
        777f40021000-777f44000000 ---p 00000000 00:00 0
        777f44000000-777f44021000 rw-p 00000000 00:00 0                          [anon: glibc: malloc arena]
        777f44021000-777f48000000 ---p 00000000 00:00 0
        777f48000000-777f48021000 rw-p 00000000 00:00 0                          [anon: glibc: malloc arena]
        777f48021000-777f4c000000 ---p 00000000 00:00 0
        777f4c000000-777f4c021000 rw-p 00000000 00:00 0                          [anon: glibc: malloc arena]
        777f4c021000-777f50000000 ---p 00000000 00:00 0
        777f50000000-777f50021000 rw-p 00000000 00:00 0                          [anon: glibc: malloc arena]
        777f50021000-777f54000000 ---p 00000000 00:00 0
        777f55afb000-777f55afc000 ---p 00000000 00:00 0
        777f55afc000-777f562fc000 rw-p 00000000 00:00 0                          [anon: glibc: pthread stack: 216378]
        777f562fc000-777f562fd000 ---p 00000000 00:00 0
        777f562fd000-777f56afd000 rw-p 00000000 00:00 0                          [anon: glibc: pthread stack: 216377]
        777f56afd000-777f56afe000 ---p 00000000 00:00 0
        777f56afe000-777f572fe000 rw-p 00000000 00:00 0                          [anon: glibc: pthread stack: 216376]
        777f572fe000-777f572ff000 ---p 00000000 00:00 0
        777f572ff000-777f57aff000 rw-p 00000000 00:00 0                          [anon: glibc: pthread stack: 216375]
        777f57aff000-777f57b00000 ---p 00000000 00:00 0
        777f57b00000-777f58300000 rw-p 00000000 00:00 0                          [anon: glibc: pthread stack: 216374]
        777f58300000-777f58400000 rw-p 00000000 00:00 0                          [anon: Go: page alloc index]
        777f58400000-777f5a400000 rw-p 00000000 00:00 0                          [anon: Go: heap index]
        777f5a400000-777f6a580000 ---p 00000000 00:00 0                          [anon: Go: scavenge index]
        777f6a580000-777f6a581000 rw-p 00000000 00:00 0                          [anon: Go: scavenge index]
        777f6a581000-777f7a400000 ---p 00000000 00:00 0                          [anon: Go: scavenge index]
        777f7a400000-777f8a580000 ---p 00000000 00:00 0                          [anon: Go: page summary]
        777f8a580000-777f8a581000 rw-p 00000000 00:00 0                          [anon: Go: page alloc]
        777f8a581000-777f9c430000 ---p 00000000 00:00 0                          [anon: Go: page summary]
        777f9c430000-777f9c431000 rw-p 00000000 00:00 0                          [anon: Go: page alloc]
        777f9c431000-777f9e806000 ---p 00000000 00:00 0                          [anon: Go: page summary]
        777f9e806000-777f9e807000 rw-p 00000000 00:00 0                          [anon: Go: page alloc]
        777f9e807000-777f9ec00000 ---p 00000000 00:00 0                          [anon: Go: page summary]
        777f9ec36000-777f9ecb6000 rw-p 00000000 00:00 0                          [anon: Go: immortal metadata]
        777f9ecb6000-777f9ecc6000 rw-p 00000000 00:00 0                          [anon: Go: gc bits]
        777f9ecc6000-777f9ecd6000 rw-p 00000000 00:00 0                          [anon: Go: allspans array]
        777f9ecd6000-777f9ece7000 rw-p 00000000 00:00 0                          [anon: Go: immortal metadata]
        777f9ece7000-777f9ed67000 ---p 00000000 00:00 0                          [anon: Go: page summary]
        777f9ed67000-777f9ed68000 rw-p 00000000 00:00 0                          [anon: Go: page alloc]
        777f9ed68000-777f9ede7000 ---p 00000000 00:00 0                          [anon: Go: page summary]
        777f9ede7000-777f9ee07000 rw-p 00000000 00:00 0                          [anon: Go: page alloc]
        777f9ee07000-777f9ee0a000 rw-p 00000000 00:00 0                          [anon: glibc: loader malloc]
        777f9ee0a000-777f9ee2e000 r--p 00000000 00:21 48158213                   /usr/lib/libc.so.6
        777f9ee2e000-777f9ef9f000 r-xp 00024000 00:21 48158213                   /usr/lib/libc.so.6
        777f9ef9f000-777f9efee000 r--p 00195000 00:21 48158213                   /usr/lib/libc.so.6
        777f9efee000-777f9eff2000 r--p 001e3000 00:21 48158213                   /usr/lib/libc.so.6
        777f9eff2000-777f9eff4000 rw-p 001e7000 00:21 48158213                   /usr/lib/libc.so.6
        777f9eff4000-777f9effc000 rw-p 00000000 00:00 0
        777f9effc000-777f9effe000 rw-p 00000000 00:00 0                          [anon: glibc: loader malloc]
        777f9f00a000-777f9f04a000 rw-p 00000000 00:00 0                          [anon: Go: immortal metadata]
        777f9f04a000-777f9f04c000 r--p 00000000 00:00 0                          [vvar]
        777f9f04c000-777f9f04e000 r--p 00000000 00:00 0                          [vvar_vclock]
        777f9f04e000-777f9f050000 r-xp 00000000 00:00 0                          [vdso]
        777f9f050000-777f9f051000 r--p 00000000 00:21 48158204                   /usr/lib/ld-linux-x86-64.so.2
        777f9f051000-777f9f07a000 r-xp 00001000 00:21 48158204                   /usr/lib/ld-linux-x86-64.so.2
        777f9f07a000-777f9f085000 r--p 0002a000 00:21 48158204                   /usr/lib/ld-linux-x86-64.so.2
        777f9f085000-777f9f087000 r--p 00034000 00:21 48158204                   /usr/lib/ld-linux-x86-64.so.2
        777f9f087000-777f9f088000 rw-p 00036000 00:21 48158204                   /usr/lib/ld-linux-x86-64.so.2
        777f9f088000-777f9f089000 rw-p 00000000 00:00 0
        7ffc7bfa7000-7ffc7bfc8000 rw-p 00000000 00:00 0                          [stack]
        ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0                  [vsyscall]

The anonymous memory areas are now labelled so that we can see which
ones have been allocated by the Go runtime versus which ones have been
allocated by glibc.

Fixes #71546

Change-Id: I304e8b4dd7f2477a6da794fd44e9a7a5354e4bf4
Reviewed-on: https://go-review.googlesource.com/c/go/+/646095
Auto-Submit: Alan Donovan <adonovan@google.com>
Commit-Queue: Alan Donovan <adonovan@google.com>
Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
2025-03-04 11:22:33 -08:00
Andy Pan
4c2b1e0feb runtime: migrate internal/atomic to internal/runtime
For #65355

Change-Id: I65dd090fb99de9b231af2112c5ccb0eb635db2be
Reviewed-on: https://go-review.googlesource.com/c/go/+/560155
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Ibrahim Bazoka <ibrahimbazoka729@gmail.com>
Auto-Submit: Emmanuel Odeke <emmanuel@orijtech.com>
2024-03-25 19:53:03 +00:00
apocelipes
c8c46e746b runtime: use built-in clear to simplify code
Change-Id: Icb6d9ca996b4119d8636d9f7f6a56e510d74d059
GitHub-Last-Rev: 08178e8ff7
GitHub-Pull-Request: golang/go#66188
Reviewed-on: https://go-review.googlesource.com/c/go/+/569979
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: qiulaidongfeng <2645477756@qq.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2024-03-08 16:28:44 +00:00
Michael Anthony Knyszek
5f08b44799 runtime: call enableMetadataHugePages and its callees on the systemstack
These functions acquire the heap lock. If they're not called on the
systemstack, a stack growth could cause a self-deadlock since stack
growth may allocate memory from the page heap.
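
Sketch of the shape of the fix (systemstack and
enableMetadataHugePages below are stand-in stubs for the runtime
functions of the same names, so this compiles outside the runtime):

        package main

        // systemstack stands in for the runtime helper that runs fn on the
        // per-M system stack, where the goroutine stack can never grow.
        func systemstack(fn func()) { fn() }

        // enableMetadataHugePages stands in for the real function; it takes the
        // heap lock, so a stack growth inside it could allocate from the page
        // heap and self-deadlock.
        func enableMetadataHugePages() {}

        func main() {
                // The fix: enter via the system stack so no stack growth (and hence
                // no page-heap allocation) can occur while the heap lock is held.
                systemstack(enableMetadataHugePages)
        }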

This has been a problem for a while. If this is what's plaguing the
ppc64 port right now, it's very surprising (and probably just
coincidental) that it's showing up now.

For #64050.
For #64062.
Fixes #64067.

Change-Id: I2b95dc134d17be63b9fe8f7a3370fe5b5438682f
Reviewed-on: https://go-review.googlesource.com/c/go/+/541635
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
2023-11-13 14:11:13 +00:00
qiulaidongfeng
b5f87b5407 runtime: use max/min func
Change-Id: I3f0b7209621b39cee69566a5cc95e4343b4f1f20
GitHub-Last-Rev: af9dbbe69a
GitHub-Pull-Request: golang/go#63321
Reviewed-on: https://go-review.googlesource.com/c/go/+/531916
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2023-10-24 20:28:25 +00:00
Michael Anthony Knyszek
9f9bb26880 runtime: avoid MADV_HUGEPAGE for heap memory
Currently the runtime marks all new memory as MADV_HUGEPAGE on Linux and
manages its hugepage eligibility status. Unfortunately, the default
THP behavior on most Linux distros is that MADV_HUGEPAGE blocks while
the kernel eagerly reclaims and compacts memory to allocate a hugepage.

This direct reclaim and compaction is unbounded, and may result in
significant application thread stalls. In really bad cases, this can
exceed 100s of ms or even seconds.

Really all we want is to undo MADV_NOHUGEPAGE marks and let the default
Linux paging behavior take over, but the only way to unmark a region as
MADV_NOHUGEPAGE is to also mark it MADV_HUGEPAGE.

The overall strategy of trying to keep hugepages for the heap unbroken
however is sound. So instead let's use the new shiny MADV_COLLAPSE if it
exists.

MADV_COLLAPSE makes a best-effort synchronous attempt at collapsing the
physical memory backing a memory region into a hugepage. We'll use
MADV_COLLAPSE where we would've used MADV_HUGEPAGE, and stop using
MADV_NOHUGEPAGE altogether.
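
For illustration, outside the runtime a collapse request on an
already-touched region might be issued like this (Linux 6.1+ only;
MADV_COLLAPSE is written out locally since older syscall constants
predate it):

        package main

        import "syscall"

        // MADV_COLLAPSE from the Linux uapi headers (Linux 6.1+).
        const madvCollapse = 0x19

        func main() {
                // Map and touch a region the usual way...
                mem, err := syscall.Mmap(-1, 0, 4<<20,
                        syscall.PROT_READ|syscall.PROT_WRITE,
                        syscall.MAP_ANON|syscall.MAP_PRIVATE)
                if err != nil {
                        panic(err)
                }
                for i := 0; i < len(mem); i += 4096 {
                        mem[i] = 1
                }
                // ...then ask the kernel for a best-effort, synchronous attempt to
                // back it with huge pages. Failure is advisory only.
                _ = syscall.Madvise(mem, madvCollapse)
        }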

Because MADV_COLLAPSE is synchronous, it's also important to not
re-collapse huge pages if the huge pages are likely part of some large
allocation. Although in many cases it's advantageous to back these
allocations with hugepages because they're contiguous, eagerly
collapsing every hugepage means having to page in at least part of the
large allocation.

However, because we won't use MADV_NOHUGEPAGE anymore, we'll no longer
handle the fact that khugepaged might come in and back some memory we
returned to the OS with a hugepage. I've come to the conclusion that
this is basically unavoidable without a new madvise flag and that it's
just not a good default. If this change lands, advice about Linux huge
page settings will be added to the GC guide.

Verified that this change doesn't regress Sweet, at least not on my
machine with:

/sys/kernel/mm/transparent_hugepage/enabled [always or madvise]
/sys/kernel/mm/transparent_hugepage/defrag [madvise]
/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none [0 or 511]

Unfortunately, this workaround means that we only get forced hugepages
on Linux 6.1+.

Fixes #61718.

Change-Id: I7f4a7ba397847de29f800a99f9cb66cb2720a533
Reviewed-on: https://go-review.googlesource.com/c/go/+/516795
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
2023-08-22 19:05:10 +00:00
Michael Anthony Knyszek
a3e90dc377 runtime: add eager scavenging details to GODEBUG=scavtrace=1
Also, clean up atomics on released-per-cycle while we're here.

For #57069.

Change-Id: I14026e8281f01dea1e8c8de6aa8944712b7b24d9
Reviewed-on: https://go-review.googlesource.com/c/go/+/495916
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-19 13:38:43 +00:00
Michael Anthony Knyszek
15c1276246 runtime: bring back minHeapIdx in scavenge index
The scavenge index currently doesn't guard against overflow, and CL
436395 removed the minHeapIdx optimization that allows the chunk scan to
skip scanning chunks that haven't been mapped for the heap, and are only
available as a consequence of chunks' mapped region being rounded out to
a page on both ends.

Because the 0'th chunk is never mapped, minHeapIdx effectively prevents
overflow, fixing the iOS breakage.

This change also refactors growth and initialization a little bit to
decouple it from pageAlloc a bit and share code across platforms.

Change-Id: If7fc3245aa81cf99451bf8468458da31986a9b0a
Reviewed-on: https://go-review.googlesource.com/c/go/+/486695
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2023-04-20 20:08:25 +00:00
Michael Anthony Knyszek
8588a3fa08 runtime: initialize scavengeIndex fields properly
Currently these fields are uninitialized, causing failures on aix-ppc64,
which has a slightly oddly-defined address space compared to the rest.

Change-Id: I2aa46731174154dce86c2074bd0b00eef955d86d
Reviewed-on: https://go-review.googlesource.com/c/go/+/486655
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Bypass: Michael Knyszek <mknyszek@google.com>
2023-04-20 15:25:16 +00:00
Michael Anthony Knyszek
8fa9e3beee runtime: manage huge pages explicitly
This change makes it so that on Linux the Go runtime explicitly marks
page heap memory as either available to be backed by hugepages or not,
using heuristics based on density.

The motivation behind this change is twofold:
1. In default Linux configurations, khugepaged can recoalesce hugepages
   even after the scavenger breaks them up, resulting in significant
   overheads for small heaps when their heaps shrink.
2. The Go runtime already has some heuristics about this, but those
   heuristics appear to have bit-rotted and result in haphazard
   hugepage management. Unlucky (but otherwise fairly dense) regions of
   memory end up not backed by huge pages while sparse regions end up
   accidentally marked MADV_HUGEPAGE and are not later broken up by the
   scavenger, because it already got the memory it needed from more
   dense sections (this is more likely to happen with small heaps that
   go idle).

In this change, the runtime uses a new policy:

1. Mark all new memory MADV_HUGEPAGE.
2. Track whether each page chunk (4 MiB) became dense during the GC
   cycle. Mark those MADV_HUGEPAGE, and hide them from the scavenger.
3. If a chunk is not dense for 1 full GC cycle, make it visible to the
   scavenger.
4. The scavenger marks a chunk MADV_NOHUGEPAGE before it scavenges it.

This policy is intended to try and back memory that is a good candidate
for huge pages (high occupancy) with huge pages, and give memory that is
not (low occupancy) to the scavenger. Occupancy is defined not just by
occupancy at any instant of time, but also occupancy in the near future.
It's generally true that by the end of a GC cycle the heap gets quite
dense (from the perspective of the page allocator).

Because we want scavenging and huge page management to happen together
(the right time to MADV_NOHUGEPAGE is just before scavenging in order to
break up huge pages and keep them that way) and the cost of applying
MADV_HUGEPAGE and MADV_NOHUGEPAGE is somewhat high, the scavenger avoids
releasing memory in dense page chunks. All this together means the
scavenger will now more generally release memory on a ~1 GC cycle delay.

Notably this has implications for scavenging to maintain the memory
limit and the runtime/debug.FreeOSMemory API. This change makes it so
that in these cases all memory is visible to the scavenger regardless of
sparseness and delays the page allocator in re-marking this memory with
MADV_NOHUGEPAGE for around 1 GC cycle to mitigate churn.

The end result of this change should be little-to-no performance
difference for dense heaps (MADV_HUGEPAGE works a lot like the default
unmarked state) but should allow the scavenger to more effectively take
back fragments of huge pages. The main risk here is churn, because
MADV_HUGEPAGE usually forces the kernel to immediately back memory with
a huge page. That's the reason for the large amount of hysteresis (1
full GC cycle) and why the definition of high density is 96% occupancy.
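
A toy model of the density rule (the types, field names, and
bookkeeping below are illustrative; only the 96% threshold and the
once-per-cycle decision mirror the description above):

        package main

        import "fmt"

        const (
                pagesPerChunk  = 512                      // 4 MiB chunk / 8 KiB pages
                denseThreshold = pagesPerChunk * 96 / 100 // "dense" = >= 96% occupied
        )

        type chunk struct {
                peakPagesInUse int  // highest occupancy seen this GC cycle
                dense          bool // keep MADV_HUGEPAGE, hide from the scavenger
        }

        // endOfGCCycle applies the policy: chunks that became dense this cycle
        // keep huge pages; chunks that stayed sparse become visible to the
        // scavenger, which marks them MADV_NOHUGEPAGE before scavenging.
        func endOfGCCycle(chunks []*chunk) {
                for _, c := range chunks {
                        c.dense = c.peakPagesInUse >= denseThreshold
                        c.peakPagesInUse = 0 // reset tracking for the next cycle
                }
        }

        func main() {
                cs := []*chunk{{peakPagesInUse: 500}, {peakPagesInUse: 100}}
                endOfGCCycle(cs)
                fmt.Println(cs[0].dense, cs[1].dense) // true false
        }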

Fixes #55328.

Change-Id: I8da7998f1a31b498a9cc9bc662c1ae1a6bf64630
Reviewed-on: https://go-review.googlesource.com/c/go/+/436395
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-19 14:30:00 +00:00
Michael Anthony Knyszek
1f9d80e331 runtime: disable huge pages for GC metadata for small heaps
For #55328.

Change-Id: I8792161f09906c08d506cc0ace9d07e76ec6baa6
Reviewed-on: https://go-review.googlesource.com/c/go/+/460316
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-19 14:27:27 +00:00
Oleksandr Redko
1a09d57de5 runtime: correct typos
- Fix typo in throw error message for arena.
- Correct typos in assembly and Go comments.
- Fix log message in TestTraceCPUProfile.

Change-Id: I874c9e8cd46394448b6717bc6021aa3ecf319d16
GitHub-Last-Rev: d27fad4d3c
GitHub-Pull-Request: golang/go#58375
Reviewed-on: https://go-review.googlesource.com/c/go/+/465975
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-02-08 14:52:12 +00:00
cui fliter
069d1fc9e2 runtime: fix a few function names on comments
Change-Id: I4be0b1e612dcc21ca6bb7d4395f1c0aa52480759
GitHub-Last-Rev: 032480c4c9
GitHub-Pull-Request: golang/go#55993
Reviewed-on: https://go-review.googlesource.com/c/go/+/437518
Reviewed-by: hopehook <hopehook@golangcn.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: hopehook <hopehook@golangcn.org>
2022-10-26 02:39:39 +00:00
Michael Anthony Knyszek
b7c28f484d runtime/metrics: add CPU stats
This change adds a breakdown of estimated CPU usage by time. These
estimates are not based on real on-CPU counters, so each metric has a
disclaimer explaining so. They can, however, be more reasonably
compared to a total CPU time metric that this change also adds.
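
For example, the new estimates can be read via the runtime/metrics
package; enumerating names by prefix avoids hard-coding a metric set
that may vary between releases:

        package main

        import (
                "fmt"
                "runtime/metrics"
                "strings"
        )

        func main() {
                // Collect everything under /cpu/classes/, where the estimated
                // CPU-time breakdown lives.
                var samples []metrics.Sample
                for _, d := range metrics.All() {
                        if strings.HasPrefix(d.Name, "/cpu/classes/") {
                                samples = append(samples, metrics.Sample{Name: d.Name})
                        }
                }
                metrics.Read(samples)
                for _, s := range samples {
                        if s.Value.Kind() == metrics.KindFloat64 {
                                fmt.Printf("%-50s %f\n", s.Name, s.Value.Float64())
                        }
                }
        }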

Fixes #47216.

Change-Id: I125006526be9f8e0d609200e193da5a78d9935be
Reviewed-on: https://go-review.googlesource.com/c/go/+/404307
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Josh MacDonald <jmacd@lightstep.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-09-16 16:32:20 +00:00
Michael Anthony Knyszek
e28cc362a8 runtime: remove alignment padding in mheap and pageAlloc
All subfields use atomic types to ensure alignment, so there's no more
need for these fields.

Change-Id: Iada4253f352a074073ce603f1f6b07cbd5b7c58a
Reviewed-on: https://go-review.googlesource.com/c/go/+/429220
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2022-09-08 16:06:00 +00:00
Michael Anthony Knyszek
1c59199c91 runtime: remove atomic store requirement on pageAlloc.chunks
pageAlloc.chunks used to require an atomic store when growing the heap
because the scavenger would look at the list without locking the heap
lock. However, the scavenger doesn't do that anymore, and it looks like
nothing really does at all.

This change updates the comment and makes the store non-atomic.

Change-Id: Ib452d147861060f9f6e74e2d98ee111cf89ce8f5
Reviewed-on: https://go-review.googlesource.com/c/go/+/429219
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2022-09-08 16:05:56 +00:00
bqyang
bc0a033266 runtime: fix comment typo in mpagealloc.go
leve --> level

Change-Id: Ia5ff46c79c4dda2df426ec75d69e8fcede909b47
GitHub-Last-Rev: e57cad22d9
GitHub-Pull-Request: golang/go#54788
Reviewed-on: https://go-review.googlesource.com/c/go/+/426974
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Auto-Submit: Keith Randall <khr@google.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Reviewed-by: Keith Randall <khr@google.com>
2022-08-31 16:27:09 +00:00
Cuong Manh Le
f324355d1f runtime: remove pageAlloc.scav padding for atomic field alignment
CL 404096 makes atomic.Int64 8-byte aligned everywhere.

Change-Id: I5a676f646260d6391bb071f9376cbdb1553e6e6f
Reviewed-on: https://go-review.googlesource.com/c/go/+/424925
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2022-08-19 15:53:47 +00:00
Matthew Dempsky
ccb798741b runtime: change maxSearchAddr into a helper function
This avoids a dependency on the compiler statically initializing
maxSearchAddr, which is necessary so we can disable the (overly
aggressive and spec non-conforming) optimizations in cmd/compile and
gccgo.

Updates #51913.

Change-Id: I424e62c81c722bb179ed8d2d8e188274a1aeb7b6
Reviewed-on: https://go-review.googlesource.com/c/go/+/396194
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-11 03:28:01 +00:00
Michael Anthony Knyszek
91f863013e runtime: redesign scavenging algorithm
Currently the runtime's scavenging algorithm involves running from the
top of the heap address space to the bottom (or as far as it gets) once
per GC cycle. Once it treads some ground, it doesn't tread it again
until the next GC cycle.

This works just fine for the background scavenger, for heap-growth
scavenging, and for debug.FreeOSMemory. However, it breaks down in the
face of a memory limit for small heaps in the tens of MiB. Basically,
because the scavenger never retreads old ground, it's completely
oblivious to new memory it could scavenge, and really *should* scavenge
in the face of a memory limit.

Also, every time some thread goes to scavenge in the runtime, it
reserves what could be a considerable amount of address space, hiding it
from other scavengers.

This change modifies and simplifies the implementation overall. It's
less code with complexities that are much better encapsulated. The
current implementation iterates optimistically over the address space
looking for memory to scavenge, keeping track of what it last saw. The
new implementation does the same, but instead of directly iterating over
pages, it iterates over chunks. It maintains an index of chunks (as a
bitmap over the address space) that indicate which chunks may contain
scavenge work. The page allocator populates this index, while scavengers
consume it and iterate over it optimistically.

This has two key benefits:
1. Scavenging is much simpler: find a candidate chunk, and check it,
   essentially just using the scavengeOne fast path. There's no need for
   the complexity of iterating beyond one chunk, because the index is
   lock-free and already maintains that information.
2. If pages are freed to the page allocator (always guaranteed to be
   unscavenged), the page allocator immediately notifies all scavengers
   of the new source of work, avoiding the hiding issues of the old
   implementation.

One downside of the new implementation, however, is that it's
potentially more expensive to find pages to scavenge. In the past, if
a single page would become free high up in the address space, the
runtime's scavengers would ignore it. Now that scavengers won't, one or
more scavengers may need to iterate potentially across the whole heap to
find the next source of work. For the background scavenger, this just
means a potentially less reactive scavenger -- overall it should still
use the same amount of CPU. It means worse overheads for memory limit
scavenging, but that's not exactly something with a baseline yet.

In practice, this shouldn't be too bad, hopefully since the chunk index
is extremely compact. For a 48-bit address space, the index is only 8
MiB in size at worst, but even just one physical page in the index is
able to support up to 128 GiB heaps, provided they aren't terribly
sparse. On 32-bit platforms, the index is only 128 bytes in size.
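
A toy model of the index (one atomic bit per chunk; the names and
structure below are illustrative and the real index is more involved):

        package main

        import (
                "fmt"
                "math/bits"
                "sync/atomic"
        )

        // scavIndex models the idea: the page allocator sets a chunk's bit when
        // pages are freed into it, and scavengers claim bits as they search.
        type scavIndex struct {
                words []atomic.Uint64
        }

        func newScavIndex(nChunks int) *scavIndex {
                return &scavIndex{words: make([]atomic.Uint64, (nChunks+63)/64)}
        }

        // mark records that chunk ci may contain scavengable pages.
        func (s *scavIndex) mark(ci int) {
                w := &s.words[ci/64]
                for {
                        old := w.Load()
                        if w.CompareAndSwap(old, old|1<<(ci%64)) {
                                return
                        }
                }
        }

        // next claims and returns a candidate chunk, or -1 if there is no work.
        func (s *scavIndex) next() int {
                for wi := range s.words {
                        w := s.words[wi].Load()
                        for w != 0 {
                                tz := bits.TrailingZeros64(w)
                                if s.words[wi].CompareAndSwap(w, w&^(1<<tz)) {
                                        return wi*64 + tz
                                }
                                w = s.words[wi].Load()
                        }
                }
                return -1
        }

        func main() {
                idx := newScavIndex(256)
                idx.mark(3)
                idx.mark(130)
                fmt.Println(idx.next(), idx.next(), idx.next()) // 3 130 -1
        }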

For #48409.

Change-Id: I72b7e74365046b18c64a6417224c5d85511194fb
Reviewed-on: https://go-review.googlesource.com/c/go/+/399474
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-03 15:13:53 +00:00
Michael Anthony Knyszek
b4d81147d8 runtime: make the scavenger and allocator respect the memory limit
This change does everything necessary to make the memory allocator and
the scavenger respect the memory limit. In particular, it:

- Adds a second goal for the background scavenge that's based on the
  memory limit, setting a target 5% below the limit to make sure it's
  working hard when the application is close to it.
- Makes span allocation assist the scavenger if the next allocation is
  about to put total memory use above the memory limit.
- Measures any scavenge assist time and adds it to GC assist time for
  the sake of GC CPU limiting, to avoid a death spiral as a result of
  scavenging too much.

All of these changes have a relatively small impact, but each is
intimately related and thus benefit from being done together.
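
A back-of-the-envelope sketch of the limit-based goal (the names and
exact accounting are illustrative; only the 5%-below-the-limit target
comes from the description above):

        package main

        import "fmt"

        // scavengeGoal: keep retained memory about 5% below the limit so the
        // scavenger is already working hard as the application approaches it.
        // It returns how many bytes should be returned to the OS right now.
        func scavengeGoal(memoryLimit, heapRetained uint64) uint64 {
                target := memoryLimit - memoryLimit/20 // 95% of the limit
                if heapRetained <= target {
                        return 0
                }
                return heapRetained - target
        }

        func main() {
                // e.g. a 1 GiB limit with 1000 MiB currently retained:
                fmt.Println(scavengeGoal(1<<30, 1000<<20)) // about 27 MiB of work
        }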

For #48409.

Change-Id: I35517a752f74dd12a151dd620f102c77e095d3e8
Reviewed-on: https://go-review.googlesource.com/c/go/+/397017
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-03 15:13:45 +00:00
Michael Anthony Knyszek
4649a43903 runtime: track how much memory is mapped in the Ready state
This change adds a field to memstats called mappedReady that tracks how
much memory is in the Ready state at any given time. In essence, it's
the total memory usage by the Go runtime (with one exception which is
documented). Essentially, it is all memory mapped read/write that has
either been paged in or soon will be.

To make tracking this not involve the many different stats that track
mapped memory, we track this statistic at a very low level. The downside
of tracking this statistic at such a low level is that it managed to
catch lots of situations where the runtime wasn't fully accounting for
memory. This change rectifies these situations by always accounting for
memory that's mapped in some way (i.e. always passing a sysMemStat to a
mem.go function), with *three* exceptions.

Rectifying these situations means also having the memory mapped during
testing being accounted for, so that tests (i.e. ReadMemStats) that
ultimately check mappedReady continue to work correctly without special
exceptions. We choose to simply account for this memory in other_sys.

Let's talk about the exceptions. The first is that the arenas array,
used for finding heap arena metadata from an address, is mapped as
read/write in one large chunk. It's tens of MiB in size. On systems
with demand
paging, we assume that the whole thing isn't paged in at once (after
all, it maps to the whole address space, and it's exceedingly difficult
with today's technology to even broach having as much physical memory as
the total address space). On systems where we have to commit memory
manually, we use a two-level structure.

Now, the reason why this is an exception is because we have no mechanism
to track what memory is paged in, and we can't just account for the
entire thing, because that would *look* like an enormous overhead.
Furthermore, this structure is on a few really, really critical paths in
the runtime, so doing more explicit tracking isn't really an option. So,
we explicitly don't and call sysAllocOS to map this memory.

The second exception is that we call sysFree with no accounting to clean
up address space reservations, or otherwise to throw out mappings we
don't care about. In this case, we also drop down to a lower level and call
sysFreeOS to explicitly avoid accounting.

The third exception is debuglog allocations. That is purely a debugging
facility and ideally we want it to have as small an impact on the
runtime as possible. If we include it in mappedReady calculations, it
could cause GC pacing shifts in future CLs, especially if one increases
the debuglog buffer sizes as a one-off.

As of this CL, these are the only three places in the runtime that would
pass nil for a stat to any of the functions in mem.go. As a result, this
CL makes sysMemStats mandatory to facilitate better accounting in the
future. It's now much easier to grep and find out where accounting is
explicitly elided, because one doesn't have to follow the trail of
sysMemStat nil pointer values, and can just look at the function name.

For #48409.

Change-Id: I274eb467fc2603881717482214fddc47c9eaf218
Reviewed-on: https://go-review.googlesource.com/c/go/+/393402
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-05-03 15:12:21 +00:00
Michael Anthony Knyszek
4f543b59c5 runtime: don't hold the heap lock while scavenging
This change modifies the scavenger to no longer hold the heap lock while
actively scavenging pages. To achieve this, the change also:
* Reverses the locking behavior of the (*pageAlloc).scavenge API, to
  only acquire the heap lock when necessary.
* Introduces a new lock on the scavenger-related fields in a pageAlloc
  so that access to those fields doesn't require the heap lock. There
  are a few places in the scavenge path, notably reservation, that
  requires synchronization. The heap lock is far too heavy handed for
  this case.
* Changes the scavenger to mark pages that are actively being scavenged
  as allocated, and "frees" them back to the page allocator the usual
  way.
* Lifts the heap-growth scavenging code out of mheap.grow, where the
  heap lock is held, and into allocSpan, just after the lock is
  released. Releasing the lock during mheap.grow is not feasible if we
  want to ensure that allocation always makes progress (post-growth,
  another allocator could come in and take all that space, forcing the
  goroutine that just grew the heap to do so again).

This change means that the scavenger now must do more work for each
scavenge, but it is also now much more scalable. Although in theory it's
not great that it always takes the locked paths in the page allocator, it
takes advantage of some properties of the allocator:
* Most of the time, the scavenger will be working with one page at a
  time. The page allocator's locked path is optimized for this case.
* On the allocation path, it doesn't need to do the find operation at
  all; it can go straight to setting bits for the range and updating the
  summary structure.

Change-Id: Ie941d5e7c05dcc96476795c63fef74bcafc2a0f1
Reviewed-on: https://go-review.googlesource.com/c/go/+/353974
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2021-11-05 17:46:27 +00:00
Dan Kortschak
8d2a9c32a2 all: remove incorrectly repeated words in comments
Change-Id: Icbf36e1cd8311b40d18177464e7c41dd8cb1c65b
Reviewed-on: https://go-review.googlesource.com/c/go/+/340350
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Carlos Amedee <carlos@golang.org>
2021-09-16 23:57:40 +00:00
Ian Lance Taylor
0e8a72b62e runtime: check for sysAlloc failures in pageAlloc
Change-Id: I78c5744bb01988f1f599569703d83fd21542ac7a
Reviewed-on: https://go-review.googlesource.com/c/go/+/305911
Trust: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2021-03-31 03:52:40 +00:00
Michael Pratt
9393b5bae5 runtime: add heap lock assertions
Some functions that required holding the heap lock _or_ a world stop have
been simplified to simply require the heap lock. This is conceptually
simpler and taking the heap lock during world stop is guaranteed to not
contend. This was only done on functions already called on the
systemstack to avoid too many extra systemstack calls in GC.

Updates #40677

Change-Id: I15aa1dadcdd1a81aac3d2a9ecad6e7d0377befdc
Reviewed-on: https://go-review.googlesource.com/c/go/+/250262
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Pratt <mpratt@google.com>
2020-10-30 20:21:14 +00:00
Michael Anthony Knyszek
8ebc58452a runtime: delineate which memstats are system stats with a type
This change modifies the type of several mstats fields to be a new type:
sysMemStat. This type has the same structure as the fields used to have.

The purpose of this change is to make it very clear which stats may be
used in various functions for accounting (usually the platform-specific
sys* functions, but there are others). Currently there's an implicit
understanding that the *uint64 value passed to these functions is some
kind of statistic whose value is atomically managed. This understanding
isn't inherently problematic, but we're about to change how some stats
(which currently use mSysStatInc and mSysStatDec) work, so we want to
make it very clear what the various requirements are around "sysStat".

This change also removes mSysStatInc and mSysStatDec in favor of a
method on sysMemStat. Note that those two functions were originally
written the way they were because atomic 64-bit adds required a valid G
on ARM, but this hasn't been the case for a very long time (since
golang.org/cl/14204, but even before then it wasn't clear if mutexes
required a valid G anymore). Today we implement 64-bit adds on ARM with
a spinlock table.
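
A condensed sketch of the new type's shape (runtime-only annotations
and the real overflow diagnostics are omitted):

        package main

        import (
                "fmt"
                "sync/atomic"
        )

        // sysMemStat is an atomically-updated memory statistic that the sys*
        // memory routines are allowed to account against.
        type sysMemStat uint64

        func (s *sysMemStat) load() uint64 {
                return atomic.LoadUint64((*uint64)(s))
        }

        // add replaces the old free-standing mSysStatInc/mSysStatDec helpers;
        // negative n subtracts.
        func (s *sysMemStat) add(n int64) {
                val := atomic.AddUint64((*uint64)(s), uint64(n))
                if int64(val) < 0 {
                        panic("sysMemStat went negative")
                }
        }

        func main() {
                var stat sysMemStat
                stat.add(4096)
                stat.add(-1024)
                fmt.Println(stat.load()) // 3072
        }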

Change-Id: I4e9b37cf14afc2ae20cf736e874eb0064af086d7
Reviewed-on: https://go-review.googlesource.com/c/go/+/246971
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:09:41 +00:00
Michael Pratt
ad64272724 runtime: rename pageAlloc receiver
The history of pageAlloc using 's' as a receiver is lost to the depths
of time (perhaps it used to be called summary?), but it doesn't make
much sense anymore. Rename it to 'p'.

Generated with:

$ cd src/runtime
$ grep -R -b "func (s \*pageAlloc" . | awk -F : '{ print $1 ":#" $2+6 }' | xargs -n 1 -I {} env GOROOT=$(pwd)/../../ gorename -offset {} -to p -v
$ grep -R -b "func (s \*pageAlloc" . | awk -F : '{ print $1 ":#" $2+6 }' | xargs -n 1 -I {} env GOROOT=$(pwd)/../../ GOARCH=386 gorename -offset {} -to p -v
$ GOROOT=$(pwd)/../../ gorename -offset mpagecache.go:#2397 -to p -v

($2+6 to advance past "func (".)

Plus manual comment fixups.

Change-Id: I2d521a1cbf6ebe2ef6aae92e654bfc33c63d1aa9
Reviewed-on: https://go-review.googlesource.com/c/go/+/250517
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2020-10-23 16:54:35 +00:00
Michael Anthony Knyszek
34835df048 runtime: fix ReadMemStatsSlow's and CheckScavengedBits' chunk iteration
Both ReadMemStatsSlow and CheckScavengedBits iterate over the page
allocator's chunks but don't actually check if they exist. During the
development process the chunks index became sparse, so now this was a
possibility. If the runtime tests' heap is sparse we might end up
segfaulting in either one of these functions, though this will generally
be very rare.

The pattern here to return nil for a nonexistent chunk is also useful
elsewhere, so this change introduces tryChunkOf which won't throw, but
might return nil. It also updates the documentation of chunkOf.
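
Roughly, the shape of the new helper (a simplified, self-contained
model; the real pageAlloc uses different index types and sizes):

        package main

        import "fmt"

        type pallocData struct{ scavenged [8]uint64 } // placeholder payload

        // pageIndex models the two-level, sparse chunks index: second-level
        // arrays exist only for address ranges the heap has grown into.
        type pageIndex struct {
                chunks [64]*[64]pallocData
        }

        // chunkOf assumes the chunk exists; it faults on a nil second level.
        func (p *pageIndex) chunkOf(ci int) *pallocData {
                return &p.chunks[ci/64][ci%64]
        }

        // tryChunkOf is the safe variant: it returns nil for chunks that were
        // never mapped, so iterating callers (like ReadMemStatsSlow) can skip
        // them instead of faulting.
        func (p *pageIndex) tryChunkOf(ci int) *pallocData {
                l2 := p.chunks[ci/64]
                if l2 == nil {
                        return nil
                }
                return &l2[ci%64]
        }

        func main() {
                var p pageIndex
                p.chunks[1] = new([64]pallocData) // only one heap region exists
                fmt.Println(p.tryChunkOf(0) == nil, p.tryChunkOf(64) != nil) // true true
        }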

Fixes #41296.

Change-Id: Id5ae0ca3234480de1724fdf2e3677eeedcf76fa0
Reviewed-on: https://go-review.googlesource.com/c/go/+/253777
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-09-09 17:48:56 +00:00
Michael Anthony Knyszek
b56791cdea runtime: validate candidate searchAddr in pageAlloc.find
Currently pageAlloc.find attempts to find a better estimate for the
first free page in the heap, even if the space it's looking for isn't
necessarily going to be the first free page in the heap (e.g. if npages
>= 2). However, in doing so it has the potential to return a searchAddr
candidate that doesn't actually correspond to mapped memory, but this
candidate might still be adopted. As a result, pageAlloc.alloc's fast
path may look at unmapped summary memory and segfault. This case is rare
on most operating systems since the heap is kept fairly contiguous, so
the chance that the candidate searchAddr discovered is unmapped is
fairly low. Even so, this is totally possible and outside the user's
control when it happens (in fact, it's likely to happen consistently for
a given user on a given system).

Fix this problem by ensuring that our candidate always points to mapped
memory. We do this by looking at mheap's arenas structure first. If it
turns out our candidate doesn't correspond to mapped memory, then we
look at inUse to round up the searchAddr to the next mapped address.

While we're here, clean up some documentation related to searchAddr.

Fixes #40191.

Change-Id: I759efec78987e4a8fde466ae45aabbaa3d9d4214
Reviewed-on: https://go-review.googlesource.com/c/go/+/242680
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-07-31 14:35:39 +00:00
Michael Anthony Knyszek
796786cd0c runtime: make maxOffAddr reflect the actual address space upper bound
Currently maxOffAddr is defined in terms of the whole 64-bit address
space, assuming that it's all supported, by using ^uintptr(0) as the
maximal address in the offset space. In reality, the maximal address in
the offset space is (1<<heapAddrBits)-1 because we don't have more than
that actually available to us on a given platform.

On most platforms this is fine, because arenaBaseOffset is just
connecting two segments of address space, but on AIX we use it as an
actual offset for the starting address of the available address space,
which is limited. This means using ^uintptr(0) as the maximal address in
the offset address space causes wrap-around, especially when we just
want to represent a range approximately like [addr, infinity), which
today we do by using maxOffAddr.

To fix this, we define maxOffAddr more appropriately, in terms of
(1<<heapAddrBits)-1.

This change also redefines arenaBaseOffset to not be the negation of the
virtual address corresponding to address zero in the virtual address
space, but instead directly as the virtual address corresponding to
zero. This matches the existing documentation more closely and makes the
logic around arenaBaseOffset decidedly simpler, especially when trying
to reason about its use on AIX.

Fixes #38966.

Change-Id: I1336e5036a39de846f64cc2d253e8536dee57611
Reviewed-on: https://go-review.googlesource.com/c/go/+/233497
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-05-14 16:20:19 +00:00
Michael Anthony Knyszek
ea5f9b666c runtime: use offAddr in more parts of the runtime
This change uses the new offAddr type in more parts of the runtime where
we've been implicitly switching from the default address space to a
contiguous view. The purpose of offAddr is to represent addresses in the
contiguous view of the address space, and to make direct computations
between real addresses and offset addresses impossible. This change thus
improves readability in the runtime.

Updates #35788.

Change-Id: I4e1c5fed3ed68aa12f49a42b82eb3f46aba82fc1
Reviewed-on: https://go-review.googlesource.com/c/go/+/230718
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2020-05-08 16:32:03 +00:00
Michael Anthony Knyszek
d69509ff99 runtime: make addrRange[s] operate on offset addresses
Currently addrRange and addrRanges operate on real addresses. That is,
the addresses they manipulate don't include arenaBaseOffset. When added
to an address, arenaBaseOffset makes the address space appear contiguous
on platforms where the address space is segmented. While this is
generally OK because even those platforms which have a segmented address
space usually don't give addresses in a different segment, today it
causes a mismatch between the scavenger and the rest of the page
allocator. The scavenger scavenges from the highest addresses first, but
only via real address, whereas the page allocator allocates memory in
offset address order.

So this change makes addrRange and addrRanges, i.e. what the scavenger
operates on, use offset addresses. However, lots of the page allocator
relies on an addrRange containing real addresses.

To make this transition less error-prone, this change introduces a new
type, offAddr, whose purpose is to make offset addresses a distinct
type, so any attempt to trivially mix real and offset addresses will
trigger a compilation error.
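
A minimal sketch of the distinct-type idea (simplified; the real
offAddr also carries the arenaBaseOffset arithmetic):

        package main

        import "fmt"

        // offAddr wraps an address in the contiguous "offset" view of the
        // address space. Because it is a distinct struct type, raw uintptr
        // addresses cannot slip into offset-address arithmetic without an
        // explicit conversion; accidental mixing is a compile error.
        type offAddr struct {
                a uintptr // conceptually, address + arenaBaseOffset
        }

        func (l offAddr) lessThan(r offAddr) bool { return l.a < r.a }
        func (l offAddr) add(n uintptr) offAddr   { return offAddr{l.a + n} }

        func main() {
                base := offAddr{0x1000}
                limit := base.add(0x2000)
                fmt.Println(base.lessThan(limit)) // true

                // var real uintptr = 0x1500
                // base.lessThan(real) // does not compile: uintptr is not offAddr
        }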

This change doesn't attempt to use offAddr in all of the runtime; a
follow-up change will look for and catch remaining uses of an offset
address which doesn't use the type.

Updates #35788.

Change-Id: I991d891ac8ace8339ca180daafdf6b261a4d43d1
Reviewed-on: https://go-review.googlesource.com/c/go/+/230717
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2020-05-08 16:31:00 +00:00
Michael Anthony Knyszek
dba1205b2f runtime: avoid re-scanning scavenged and untouched memory
Currently the scavenger will reset to the top of the heap every GC. This
means if it scavenges a bunch of memory which doesn't get used again,
it's going to keep re-scanning that memory on subsequent cycles. This
problem is especially bad when it comes to heap spikes: suppose an
application's heap spikes to 2x its steady-state size. The scavenger
will run over the top half of that heap even if the heap shrinks, for
the rest of the application's lifetime.

To fix this, we maintain two numbers: a "free" high watermark, which
represents the highest address freed to the page allocator in that
cycle, and a "scavenged" low watermark, which represents how low of an
address the scavenger got to when scavenging. If the "free" watermark
exceeds the "scavenged" watermark, then we pick the "free" watermark as
the new "top of the heap" for the scavenger when starting the next
scavenger cycle. Otherwise, we have the scavenger pick up where it left
off.
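
The decision itself is small; as a sketch (names are illustrative):

        package main

        import "fmt"

        // nextScavengeStart: if something was freed above where the last pass
        // stopped, restart from that free high watermark; otherwise resume
        // where the scavenger left off.
        func nextScavengeStart(freeHWM, scavengedLWM uintptr) uintptr {
                if freeHWM > scavengedLWM {
                        return freeHWM
                }
                return scavengedLWM
        }

        func main() {
                fmt.Printf("%#x\n", nextScavengeStart(0x800000, 0x400000)) // restart higher
                fmt.Printf("%#x\n", nextScavengeStart(0x200000, 0x400000)) // resume
        }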

With this mechanism, we only ever re-scan scavenged memory if a random
page gets freed very high up in the heap address space while most of the
action is happening in the lower parts. This case should be exceedingly
unlikely because the page reclaimer walks over the heap from low address
to high addresses, and we use a first-fit address-ordered allocation
policy.

Updates #35788.

Change-Id: Id335603b526ce3a0eb79ef286d1a4e876abc9cab
Reviewed-on: https://go-review.googlesource.com/c/go/+/218997
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
2020-05-08 16:25:31 +00:00
Michael Anthony Knyszek
55ec5182d7 runtime: remove scavAddr in favor of address ranges
This change removes the concept of s.scavAddr in favor of explicitly
reserving and unreserving address ranges. s.scavAddr has several
problems with raciness that can cause the scavenger to miss updates, or
move it back unnecessarily, forcing future scavenge calls to iterate
over already-searched address space.

This change achieves this by replacing scavAddr with a second addrRanges
which is cloned from s.inUse at the end of each sweep phase. Ranges from
this second addrRanges are then reserved by scavengers (with the
reservation size proportional to the heap size) who are then able to
safely iterate over those ranges without worry of another scavenger
coming in.

Fixes #35788.

Change-Id: Ief01ae170384174875118742f6c26b2a41cbb66d
Reviewed-on: https://go-review.googlesource.com/c/go/+/208378
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2020-05-08 16:24:40 +00:00
Michael Anthony Knyszek
89e13c88e4 runtime: check the correct sanity condition in the page allocator
Currently there are a few sanity checks in the page allocator which
should fail immediately, but because they check for a negative number
on a uint, they are actually dead code.

If there's a bug in the page allocator which would cause the sanity
check to fail, this could cause memory corruption by returning an
invalid address (more precisely, one might either see a segfault, or
span overlap).

This change fixes these sanity checks to check the correct condition.
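
The pattern of the bug in miniature (illustrative function and values;
the real checks live in the page allocator's allocation path):

        package main

        import "fmt"

        func allocRange(start, end uint) {
                // Dead code: end-start is unsigned, so it is never negative even
                // when end < start wraps around to a huge value.
                if end-start < 0 {
                        panic("unreachable: never fires")
                }
                // Correct check: compare the operands before subtracting.
                if end < start {
                        panic("bad alloc range")
                }
                fmt.Println("pages:", end-start)
        }

        func main() {
                allocRange(16, 64)
        }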

Fixes #38130.

Change-Id: Ia19786cece783d39f26df24dec8788833a6a3f21
Reviewed-on: https://go-review.googlesource.com/c/go/+/226297
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-03-30 14:30:38 +00:00
Michael Anthony Knyszek
e7f9e17b79 runtime: ensure that searchAddr always refers to inUse memory
This change formalizes an assumption made by the page allocator, which
is that (*pageAlloc).searchAddr should never refer to memory that is not
represented by (*pageAlloc).inUse. The portion of address space covered
by (*pageAlloc).inUse reflects the parts of the summary arrays which are
guaranteed to be mapped, and so looking at any summary which is not
reflected there may cause a segfault.

In fact, this can happen today. This change thus also removes a
micro-optimization which is the only case which may cause
(*pageAlloc).searchAddr to point outside of any region covered by
(*pageAlloc).inUse, and adds a test verifying that the current segfault
can no longer occur.

Change-Id: I98b534f0ffba8656d3bd6d782f6fc22549ddf1c2
Reviewed-on: https://go-review.googlesource.com/c/go/+/216697
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-01-28 22:08:43 +00:00
Michael Anthony Knyszek
8ac98e7b3f runtime: add scavtrace debug flag and remove scavenge info from gctrace
Currently, scavenging information is printed if the gctrace debug
variable is >0. Scavenging information is also printed naively, for
every page scavenged, resulting in a lot of noise when the typical
expectation for GC trace is one line per GC.

This change adds a new GODEBUG flag called scavtrace which prints
scavenge information roughly once per GC cycle and removes any scavenge
information from gctrace. The exception is debug.FreeOSMemory, which may
force an additional line to be printed.

Fixes #32952.

Change-Id: I4177dcb85fe3f9653fd74297ea93c97c389c1811
Reviewed-on: https://go-review.googlesource.com/c/go/+/212640
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-01-09 18:00:06 +00:00
Michael Anthony Knyszek
1b1fbb3192 runtime: use inUse ranges to map in summary memory only as needed
Prior to this change, if the heap was very discontiguous (such as in
TestArenaCollision) it's possible we could map a large amount of memory
as R/W and commit it. We would use only the start and end to track what
should be mapped, and we would extend that mapping as needed to
accommodate a potentially fragmented address space.

After this change, we only map exactly the part of the summary arrays
that we need by using the inUse ranges from the previous change. This
reduces the GCSys footprint of TestArenaCollision from 300 MiB to 18
MiB.

Because summaries are no longer mapped contiguously, this means the
scavenger can no longer iterate directly. This change also updates the
scavenger to borrow ranges out of inUse and iterate over only the
parts of the heap which are actually currently in use. This is both an
optimization and necessary for correctness.

Fixes #35514.

Change-Id: I96bf0c73ed0d2d89a00202ece7b9d089a53bac90
Reviewed-on: https://go-review.googlesource.com/c/go/+/207758
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-12-11 19:51:34 +00:00
Michael Anthony Knyszek
9d78e75a0a runtime: track ranges of address space which are owned by the heap
This change adds a new inUse field to the allocator which tracks ranges
of addresses that are owned by the heap. It is updated on each heap
growth.

These ranges are tracked in an array which is kept sorted. In practice
this array shouldn't exceed its initial allocation except in rare cases
and thus should be small (ideally exactly 1 element in size).
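
A toy sketch of the structure (a small sorted slice of ranges with
neighbor coalescing; the runtime version also manages its own backing
storage):

        package main

        import (
                "fmt"
                "sort"
        )

        type addrRange struct{ base, limit uintptr } // [base, limit)

        // add inserts r into the sorted set, merging with the preceding range
        // when they touch, which is the common case because heap growth is
        // mostly contiguous.
        func add(ranges []addrRange, r addrRange) []addrRange {
                i := sort.Search(len(ranges), func(i int) bool { return ranges[i].base >= r.base })
                if i > 0 && ranges[i-1].limit == r.base {
                        ranges[i-1].limit = r.limit // extend the previous range
                        return ranges
                }
                ranges = append(ranges, addrRange{})
                copy(ranges[i+1:], ranges[i:])
                ranges[i] = r
                return ranges
        }

        func main() {
                var inUse []addrRange
                inUse = add(inUse, addrRange{0x100000, 0x200000})
                inUse = add(inUse, addrRange{0x200000, 0x300000}) // coalesces
                inUse = add(inUse, addrRange{0x800000, 0x900000}) // discontiguous growth
                fmt.Println(len(inUse)) // 2
        }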

In a hypothetical worst-case scenario wherein we have a 1 TiB heap and 4
MiB arenas (note that the address ranges will never be at a smaller
granularity than an arena, since arenas are always allocated
contiguously), inUse would use at most 4 MiB of memory if the heap
mappings were completely discontiguous (highly unlikely) with an
additional 2 MiB leaked from previous allocations. Furthermore, the
copies that are done to keep the inUse array sorted will copy at most 4
MiB of memory in such a scenario, which, assuming a conservative copying
rate of 5 GiB/s, amounts to about 800µs.

However, note that in practice:
1) Most 64-bit platforms have 64 MiB arenas.
2) The copies should incur little-to-no page faults, meaning a copy rate
   closer to 25-50 GiB/s is expected.
3) Go heaps are almost always mostly contiguous.

Updates #35514.

Change-Id: I3ad07f1c2b5b9340acf59ecc3b9ae09e884814fe
Reviewed-on: https://go-review.googlesource.com/c/go/+/207757
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2019-12-11 19:37:19 +00:00
Michael Anthony Knyszek
acf3ff2e8a runtime: convert page allocator bitmap to sparse array
Currently the page allocator bitmap is implemented as a single giant
memory mapping which is reserved at init time and committed as needed.
This causes problems on systems that don't handle large uncommitted
mappings well, or institute low virtual address space defaults as a
memory limiting mechanism.

This change modifies the implementation of the page allocator bitmap
away from a directly-mapped set of bytes to a sparse array in the same
vein as mheap.arenas. This will hurt performance a little but the biggest
gains are from the lockless allocation possible with the page allocator,
so the impact of this extra layer of indirection should be minimal.

In fact, this is exactly what we see:
    https://perf.golang.org/search?q=upload:20191125.5

This reduces the amount of mapped (PROT_NONE) memory needed on systems
with 48-bit address spaces to ~600 MiB down from almost 9 GiB. The bulk
of this remaining memory is used by the summaries.

Go processes with 32-bit address spaces now always commit to 128 KiB of
memory for the bitmap. Previously it would only commit the pages in the
bitmap which represented the range of addresses (lowest address to
highest address, even if there are unused regions in that range) used by
the heap.

Updates #35568.
Updates #35451.

Change-Id: I0ff10380156568642b80c366001eefd0a4e6c762
Reviewed-on: https://go-review.googlesource.com/c/go/+/207497
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-12-03 17:35:06 +00:00
Michael Anthony Knyszek
e1ddf0507c runtime: count scavenged bits for new allocation for new page allocator
This change makes it so that the new page allocator returns the number
of pages that are scavenged in a new allocation so that mheap can update
memstats appropriately.

The accounting could be embedded into pageAlloc, but that would make
the new allocator more difficult to test.

Updates #35112.

Change-Id: I0f94f563d7af2458e6d534f589d2e7dd6af26d12
Reviewed-on: https://go-review.googlesource.com/c/go/+/195698
Reviewed-by: Austin Clements <austin@google.com>
2019-11-07 19:14:38 +00:00
Michael Anthony Knyszek
73317080e1 runtime: add scavenging code for new page allocator
This change adds a scavenger for the new page allocator along with
tests. The scavenger walks over the heap backwards once per GC, looking
for memory to scavenge. It walks across the heap without any lock held,
searching optimistically. If it finds what appears to be a scavenging
candidate it acquires the heap lock and attempts to verify it. Upon
verification it then scavenges.

Notably, unlike the old scavenger, it doesn't show any preference for
huge pages and instead follows a more strict last-page-first policy.

Updates #35112.

Change-Id: I0621ef73c999a471843eab2d1307ae5679dd18d6
Reviewed-on: https://go-review.googlesource.com/c/go/+/195697
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-11-07 19:14:27 +00:00
Michael Anthony Knyszek
39e8cb0faa runtime: add new page allocator core
This change adds a new bitmap-based allocator to the runtime with tests.
It does not yet integrate the page allocator into the runtime and thus
this change is almost purely additive.

Updates #35112.

Change-Id: Ic3d024c28abee8be8797d3918116a80f901cc2bf
Reviewed-on: https://go-review.googlesource.com/c/go/+/190622
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-11-07 19:11:26 +00:00
Michael Anthony Knyszek
cec01395c5 runtime: add packed bitmap summaries
This change adds the concept of summaries and of summarizing a set of
pallocBits, a core concept in the new page allocator. These summaries
are really just three integers packed into a uint64. This change also
adds tests and a benchmark for generating these summaries.
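
A sketch of the packing (three 21-bit counts in one uint64; the real
encoding also reserves a sentinel for a completely free region):

        package main

        import "fmt"

        // A summary describes a run of pages: how many free pages lead the
        // region (start), the largest run of free pages anywhere in it (max),
        // and how many free pages trail it (end). Each fits in 21 bits, so all
        // three pack into one uint64.
        const summaryBits = 21

        func packSummary(start, max, end uint) uint64 {
                return uint64(start) | uint64(max)<<summaryBits | uint64(end)<<(2*summaryBits)
        }

        func unpackSummary(s uint64) (start, max, end uint) {
                const mask = 1<<summaryBits - 1
                return uint(s & mask), uint(s >> summaryBits & mask), uint(s >> (2 * summaryBits) & mask)
        }

        func main() {
                s := packSummary(12, 800, 3)
                fmt.Println(unpackSummary(s)) // 12 800 3
        }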

Updates #35112.

Change-Id: I69686316086c820c792b7a54235859c2105e5fee
Reviewed-on: https://go-review.googlesource.com/c/go/+/190621
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2019-11-07 17:45:15 +00:00
Michael Anthony Knyszek
14849f0fa5 runtime: add new page allocator constants and description
This change is the first of a series of changes which replace the
current page allocator (which is based on the contents of mgclarge.go
and some of mheap.go) with one based on free/used bitmaps.

It adds in the key constants for the page allocator as well as a comment
describing the implementation.

Updates #35112.

Change-Id: I839d3a07f46842ad379701d27aa691885afdba63
Reviewed-on: https://go-review.googlesource.com/c/go/+/190619
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-11-07 16:20:25 +00:00