Commit graph

266 commits

Michael Anthony Knyszek
1f0ef6bec7 runtime: cancel mark and scavenge assists if the limiter is enabled
This change forces mark and scavenge assists to be cancelled early if
the limiter is enabled. This avoids goroutines getting stuck in really
long assists when the limiter happened to be disabled as they first came
into the assist but was enabled partway through. This can get especially
bad for mark assists, which, in dire situations, can end up "owing" the
GC a really significant debt.
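
To illustrate the pattern (a sketch only; limiterEnabled and the assist
loop below are made-up stand-ins, not this CL's code), the assist
re-checks the limiter on each iteration and bails out early:

	// Sketch: an assist loop that re-checks the CPU limiter each
	// iteration and cancels the assist early once the limiter is
	// enabled.
	func assist(limiterEnabled func() bool, workLeft *int) {
		for *workLeft > 0 {
			if limiterEnabled() {
				// Bail out: the limiter is throttling GC work,
				// so don't leave this goroutine stuck paying
				// off a large debt.
				return
			}
			*workLeft-- // do one unit of assist work
		}
	}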

For #52890.

Change-Id: I4bfaa76b8de3e167d49d2ffd8bc2127b87ea566a
Reviewed-on: https://go-review.googlesource.com/c/go/+/408816
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-27 17:30:15 +00:00
Michael Anthony Knyszek
b58067013e runtime: allocate physical-page-aligned memory differently
Currently, physical-page-aligned allocations for stacks (where the
physical page size is greater than the runtime page size) first
overallocates some memory, then frees the unaligned portions back to the
heap.

However, because allocating via h.pages.alloc causes scavenged bits to
get cleared, we need to account for that memory correctly in heapFree
and heapReleased. Currently that is not the case, leading to throws at
runtime.

Trying to get that accounting right is complicated, because information
about exactly which pages were scavenged needs to get plumbed up.
Instead, find the oversized region first, and then only allocate the
aligned part. This avoids any accounting issues.

However, this does come with some performance cost, because we don't
update searchAddr (which is safe; it just means the next allocation
may have to search harder) and, for simplicity, we skip the fast path
that h.pages.alloc has.
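
A simplified sketch of the new approach (illustrative; find and alloc
below are hypothetical stand-ins for the page allocator's internals):

	// Sketch: locate an oversized free region first, then allocate
	// only its physical-page-aligned portion, so no unaligned edges
	// ever need to be freed back (and no scavenged bits get cleared
	// unaccounted).
	func allocAligned(find func(npages uintptr) (base uintptr),
		alloc func(base, npages uintptr),
		npages, pageSize, physPageSize uintptr) uintptr {

		slack := physPageSize / pageSize // extra pages to guarantee alignment
		base := find(npages + slack)     // find, but don't allocate yet
		aligned := (base + physPageSize - 1) &^ (physPageSize - 1)
		alloc(aligned, npages) // allocate only the aligned range
		return aligned
	}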

Fixes #52682.

Change-Id: Iefa68317584d73b187634979d730eb30db770bb6
Reviewed-on: https://go-review.googlesource.com/c/go/+/407502
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-05-20 21:54:20 +00:00
Michael Anthony Knyszek
2eb8b6eec6 runtime: make CPU limiter assist time much less error-prone
At the expense of performance (having to update another atomic counter),
this change makes CPU limiter assist time much less error-prone to
manage. There are currently a number of issues with respect to how
scavenge assist time is treated, and this change resolves those by just
having the limiter maintain its own internal pool that's drained on each
update.

While we're here, clear the measured assist time each cycle, which was
the impetus for the change.
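
A minimal sketch of the internal pool (illustrative only, using
sync/atomic; the limiter's real bookkeeping differs in detail):

	// Sketch: assist time accumulates in an atomic pool owned by the
	// limiter and is drained wholesale on each update, so callers
	// never have to manage partial deltas.
	type limiter struct {
		assistTimePool atomic.Int64 // assist nanoseconds not yet accounted
	}

	func (l *limiter) accumulate(ns int64) { l.assistTimePool.Add(ns) }

	func (l *limiter) update() {
		ns := l.assistTimePool.Swap(0) // drain the pool
		_ = ns                         // ... fold into the limiter's accounting
	}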

Change-Id: I84c513a9f012b4007362a33cddb742c5779782b7
Reviewed-on: https://go-review.googlesource.com/c/go/+/404304
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-05-13 16:02:20 +00:00
Michael Anthony Knyszek
91f863013e runtime: redesign scavenging algorithm
Currently the runtime's scavenging algorithm involves running from the
top of the heap address space to the bottom (or as far as it gets) once
per GC cycle. Once it treads some ground, it doesn't tread it again
until the next GC cycle.

This works just fine for the background scavenger, for heap-growth
scavenging, and for debug.FreeOSMemory. However, it breaks down in the
face of a memory limit for small heaps in the tens of MiB. Basically,
because the scavenger never retreads old ground, it's completely
oblivious to new memory it could scavenge, and that it really *should*
in the face of a memory limit.

Also, every time some thread goes to scavenge in the runtime, it
reserves what could be a considerable amount of address space, hiding it
from other scavengers.

This change modifies and simplifies the implementation overall. It's
less code with complexities that are much better encapsulated. The
current implementation iterates optimistically over the address space
looking for memory to scavenge, keeping track of what it last saw. The
new implementation does the same, but instead of directly iterating over
pages, it iterates over chunks. It maintains an index of chunks (as a
bitmap over the address space) that indicates which chunks may contain
scavenge work. The page allocator populates this index, while scavengers
consume it and iterate over it optimistically.

This has two key benefits:
1. Scavenging is much simpler: find a candidate chunk, and check it,
   essentially just using the scavengeOne fast path. There's no need for
   the complexity of iterating beyond one chunk, because the index is
   lock-free and already maintains that information.
2. If pages are freed to the page allocator (always guaranteed to be
   unscavenged), the page allocator immediately notifies all scavengers
   of the new source of work, avoiding the hiding issues of the old
   implementation.

One downside of the new implementation, however, is that it's
potentially more expensive to find pages to scavenge. In the past, if
a single page would become free high up in the address space, the
runtime's scavengers would ignore it. Now that scavengers won't, one or
more scavengers may need to iterate potentially across the whole heap to
find the next source of work. For the background scavenger, this just
means a potentially less reactive scavenger -- overall it should still
use the same amount of CPU. It means worse overheads for memory limit
scavenging, but there's no established baseline for that yet.

In practice, this hopefully shouldn't be too bad, since the chunk index
is extremely compact. For a 48-bit address space, the index is only 8
MiB in size at worst, but even just one physical page in the index is
able to support up to 128 GiB heaps, provided they aren't terribly
sparse. On 32-bit platforms, the index is only 128 bytes in size.
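
A toy sketch of the chunk index (illustrative; the real structure and
its synchronization differ):

	// Sketch: one bit per chunk. The page allocator sets a chunk's bit
	// when unscavenged pages are freed into it; scavengers scan for
	// set bits to find candidate chunks.
	type scavengeIndex struct {
		bits []atomic.Uint64 // 1 bit per chunk of the address space
	}

	func (s *scavengeIndex) mark(ci int) {
		w := &s.bits[ci/64]
		for {
			old := w.Load()
			if w.CompareAndSwap(old, old|uint64(1)<<(ci%64)) {
				return
			}
		}
	}

	func (s *scavengeIndex) find(start int) int {
		for ci := start; ci < len(s.bits)*64; ci++ {
			if s.bits[ci/64].Load()&(uint64(1)<<(ci%64)) != 0 {
				return ci // candidate chunk that may contain work
			}
		}
		return -1
	}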

For #48409.

Change-Id: I72b7e74365046b18c64a6417224c5d85511194fb
Reviewed-on: https://go-review.googlesource.com/c/go/+/399474
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-03 15:13:53 +00:00
Michael Anthony Knyszek
b4d81147d8 runtime: make the scavenger and allocator respect the memory limit
This change does everything necessary to make the memory allocator and
the scavenger respect the memory limit. In particular, it:

- Adds a second goal for the background scavenge that's based on the
  memory limit, setting a target 5% below the limit to make sure it's
  working hard when the application is close to it.
- Makes span allocation assist the scavenger if the next allocation is
  about to put total memory use above the memory limit.
- Measures any scavenge assist time and adds it to GC assist time for
  the sake of GC CPU limiting, to avoid a death spiral as a result of
  scavenging too much.

All of these changes have a relatively small impact, but each is
intimately related and thus benefit from being done together.
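
For instance, the scavenger's limit-based goal from the first bullet
can be pictured as (a sketch; memoryLimit and retainedGoal are
illustrative names):

	// Sketch: target retained memory 5% below the limit so the
	// scavenger is already working hard as the application nears it.
	retainedGoal := memoryLimit - memoryLimit/20 // 95% of the limit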

For #48409.

Change-Id: I35517a752f74dd12a151dd620f102c77e095d3e8
Reviewed-on: https://go-review.googlesource.com/c/go/+/397017
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-03 15:13:45 +00:00
Michael Anthony Knyszek
375d696ddf runtime: move inconsistent memstats into gcController
Fundamentally, all of these memstats exist to serve the runtime in
managing memory. For the sake of simpler testing, couple these stats
more tightly with the GC.

This CL was mostly done automatically. The fields had to be moved
manually, but the references to the fields were updated via

    gofmt -w -r 'memstats.<field> -> gcController.<field>' *.go

For #48409.

Change-Id: Ic036e875c98138d9a11e1c35f8c61b784c376134
Reviewed-on: https://go-review.googlesource.com/c/go/+/397678
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-03 15:12:38 +00:00
Michael Anthony Knyszek
d36d5bd3c1 runtime: clean up inconsistent heap stats
The inconsistent heaps stats in memstats are a bit messy. Primarily,
heap_sys is non-orthogonal with heap_released and heap_inuse. In later
CLs, we're going to want heap_sys-heap_released-heap_inuse, so clean
this up by replacing heap_sys with an orthogonal metric: heapFree.
heapFree represents page heap memory that is free but not released.

I think this change also simplifies a lot of reasoning about these
stats; it's much clearer what they mean, and to obtain HeapSys for
memstats, we no longer need to do the strange subtraction from heap_sys
when allocating specifically non-heap memory from the page heap.

Because we're removing heap_sys, we need to replace it with a sysMemStat
for mem.go functions. In this case, heap_released is the most
appropriate, because we increase it anyway (again, non-orthogonality).
Given that, it makes sense for heap_inuse, heap_released, and heapFree
to become more uniform, and to just represent them all as sysMemStats.

While we're here and messing with the types of heap_inuse and
heap_released, let's also fix their names (and last_heap_inuse's name)
up to the more modern Go convention of camelCase.
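
With heapFree defined this way, the page-heap stats form a simple
orthogonal identity (as described above):

	// HeapSys decomposes exactly into three disjoint pieces:
	//   HeapSys = heapInUse + heapFree + heapReleased
	// so the quantity later CLs want -- mapped, free, unreleased page
	// heap memory -- is just heapFree by itself.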

For #48409.

Change-Id: I87fcbf143b3e36b065c7faf9aa888d86bd11710b
Reviewed-on: https://go-review.googlesource.com/c/go/+/397677
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-03 15:12:31 +00:00
Michael Anthony Knyszek
4649a43903 runtime: track how much memory is mapped in the Ready state
This change adds a field to memstats called mappedReady that tracks how
much memory is in the Ready state at any given time. In essence, it's
the total memory usage by the Go runtime (with a few exceptions,
documented below): essentially, all memory mapped read/write that has
either been paged in or soon will be.

To make tracking this not involve the many different stats that track
mapped memory, we track this statistic at a very low level. The downside
of tracking this statistic at such a low level is that it managed to
catch lots of situations where the runtime wasn't fully accounting for
memory. This change rectifies these situations by always accounting for
memory that's mapped in some way (i.e. always passing a sysMemStat to a
mem.go function), with *three* exceptions.

Rectifying these situations means also having the memory mapped during
testing being accounted for, so that tests (i.e. ReadMemStats) that
ultimately check mappedReady continue to work correctly without special
exceptions. We choose to simply account for this memory in other_sys.

Let's talk about the exceptions. The first is that the arenas array,
used for finding heap arena metadata from an address, is mapped
read/write in one large chunk. It's tens of MiB in size. On systems with
demand paging, we assume that the whole thing isn't paged in at once
(after all, it maps to the whole address space, and it's exceedingly
difficult with today's technology to even approach having as much
physical memory as the total address space). On systems where we have to
commit memory manually, we use a two-level structure.

Now, the reason this is an exception is that we have no mechanism
to track what memory is paged in, and we can't just account for the
entire thing, because that would *look* like an enormous overhead.
Furthermore, this structure is on a few really, really critical paths in
the runtime, so doing more explicit tracking isn't really an option. So,
we explicitly don't account for it, and call sysAllocOS to map this memory.

The second exception is that we call sysFree with no accounting to clean
up address space reservations, or otherwise to throw out mappings we
don't care about. In this case, we also drop down to a lower level and
call sysFreeOS to explicitly avoid accounting.

The third exception is debuglog allocations. That is purely a debugging
facility and ideally we want it to have as small an impact on the
runtime as possible. If we included it in mappedReady calculations, it
could cause GC pacing shifts in future CLs, especially if one increases
the debuglog buffer sizes as a one-off.

As of this CL, these are the only three places in the runtime that would
pass nil for a stat to any of the functions in mem.go. As a result, this
CL makes the sysMemStat argument mandatory, to facilitate better
accounting in the future. It's now much easier to grep and find out where
accounting is
explicitly elided, because one doesn't have to follow the trail of
sysMemStat nil pointer values, and can just look at the function name.
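
The resulting convention can be sketched as (illustrative; sysAlloc and
sysAllocOS follow the naming above, but the signatures here are
simplified):

	// Sketch: accounted mappings must pass a stat; the few deliberate
	// exceptions call the *OS variant so their elision is greppable.
	v := sysAlloc(n, &memstats.other_sys) // accounted: feeds mappedReady
	w := sysAllocOS(n)                    // explicitly unaccounted (e.g. arenas index)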

For #48409.

Change-Id: I274eb467fc2603881717482214fddc47c9eaf218
Reviewed-on: https://go-review.googlesource.com/c/go/+/393402
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-05-03 15:12:21 +00:00
Michael Anthony Knyszek
79db59ded9 runtime: make alloc count metrics truly monotonic
Right now we export alloc count metrics via the runtime/metrics package
and mark them as monotonic, but that's not actually true. As an
optimization, the runtime assumes a span is always fully allocated
before being uncached, and updates the accounting as such. In the rare
case that it's wrong, the span has enough information to back out what
did not get allocated.

This change uses 16 bits of padding in the mspan to house another field
that records the number of mspan slots filled just as the mspan is
cached. This information is enough to get an exact count, allowing us
to make the metrics truly monotonic.
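
A simplified sketch of the bookkeeping (field names are illustrative):

	// Sketch: snapshot how many slots are filled when the span is
	// cached; on uncache, the number of new allocations is the exact
	// difference, instead of assuming the span filled up completely.
	type span struct {
		allocCount       uint16 // slots currently filled
		allocCountCached uint16 // snapshot taken when the span was cached
	}

	func uncache(s *span, totalAllocs *uint64) {
		*totalAllocs += uint64(s.allocCount - s.allocCountCached)
		s.allocCountCached = 0
	}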

Change-Id: Iaff3ca43f8745dc1bbb0232372423e014b89b920
Reviewed-on: https://go-review.googlesource.com/c/go/+/377516
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-04-26 22:08:00 +00:00
Russ Cox
19309779ac all: gofmt main repo
[This CL is part of a sequence implementing the proposal #51082.
The design doc is at https://go.dev/s/godocfmt-design.]

Run the updated gofmt, which reformats doc comments,
on the main repository. Vendored files are excluded.

For #51082.

Change-Id: I7332f099b60f716295fb34719c98c04eb1a85407
Reviewed-on: https://go-review.googlesource.com/c/go/+/384268
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2022-04-11 16:34:30 +00:00
Russ Cox
9839668b56 all: separate doc comment from //go: directives
A future change to gofmt will rewrite

	// Doc comment.
	//go:foo

to

	// Doc comment.
	//
	//go:foo

Apply that change preemptively to all comments (not necessarily just doc comments).

For #51082.

Change-Id: Iffe0285418d1e79d34526af3520b415a12203ca9
Reviewed-on: https://go-review.googlesource.com/c/go/+/384260
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-04-05 17:54:15 +00:00
Russ Cox
7d87ccc860 all: fix various doc comment formatting nits
A run of lines that are indented with any number of spaces or tabs
format as a <pre> block. This commit fixes various doc comments
that format badly according to that (standard) rule.

For example, consider:

	// - List item.
	//   Second line.
	// - Another item.

Because the - lines are unindented, this is actually two paragraphs
separated by a one-line <pre> block. This CL rewrites it to:

	//  - List item.
	//    Second line.
	//  - Another item.

Today, that will format as a single <pre> block.
In a future release, we hope to format it as a bulleted list.

Various other minor fixes as well, all in preparation for reformatting.

For #51082.

Change-Id: I95cf06040d4186830e571cd50148be3bf8daf189
Reviewed-on: https://go-review.googlesource.com/c/go/+/384257
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-04-01 18:18:01 +00:00
Michael Anthony Knyszek
4f543b59c5 runtime: don't hold the heap lock while scavenging
This change modifies the scavenger to no longer hold the heap lock while
actively scavenging pages. To achieve this, the change also:
* Reverses the locking behavior of the (*pageAlloc).scavenge API, to
  only acquire the heap lock when necessary.
* Introduces a new lock on the scavenger-related fields in a pageAlloc
  so that access to those fields doesn't require the heap lock. There
  are a few places in the scavenge path, notably reservation, that
  require synchronization. The heap lock is far too heavy-handed for
  this case.
* Changes the scavenger to mark pages that are actively being scavenged
  as allocated, and "free" them back to the page allocator the usual
  way.
* Lifts the heap-growth scavenging code out of mheap.grow, where the
  heap lock is held, and into allocSpan, just after the lock is
  released. Releasing the lock during mheap.grow is not feasible if we
  want to ensure that allocation always makes progress (post-growth,
  another allocator could come in and take all that space, forcing the
  goroutine that just grew the heap to do so again).

This change means that the scavenger now must do more work for each
scavenge, but it is also now much more scalable. Although in theory it's
not great that it always takes the locked paths in the page allocator, it
takes advantage of some properties of the allocator:
* Most of the time, the scavenger will be working with one page at a
  time. The page allocator's locked path is optimized for this case.
* On the allocation path, it doesn't need to do the find operation at
  all; it can go straight to setting bits for the range and updating the
  summary structure.
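
The overall flow can be sketched as (pseudocode over hypothetical
helpers; the runtime's actual locking is more involved):

	// Sketch: pages are marked allocated under the heap lock, the
	// expensive release happens with the lock dropped, and the pages
	// are then freed back as scavenged.
	lock(&mheapLock)
	base, npages := reserveScavengeCandidate() // mark pages allocated
	unlock(&mheapLock)

	sysUnused(base, npages*pageSize) // release memory; heap lock NOT held

	lock(&mheapLock)
	freeScavenged(base, npages) // "free" pages back, marked scavenged
	unlock(&mheapLock)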

Change-Id: Ie941d5e7c05dcc96476795c63fef74bcafc2a0f1
Reviewed-on: https://go-review.googlesource.com/c/go/+/353974
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2021-11-05 17:46:27 +00:00
Michael Anthony Knyszek
fc5e8cd6c9 runtime: update and access scavengeGoal atomically
The first step toward acquiring the heap lock less frequently in the
scavenger.

Change-Id: Idc69fd8602be2c83268c155951230d60e20b42fe
Reviewed-on: https://go-review.googlesource.com/c/go/+/353973
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2021-11-04 20:01:11 +00:00
fanzha02
6f327f7b88 runtime, syscall: add calls to asan functions
Add explicit address sanitizer instrumentation to the runtime and
syscall packages. The compiler does not instrument the runtime
package. It does instrument the syscall package, but we need to add
a couple of cases that it can't see.

Referring to the implementation of the asan malloc runtime library,
this patch also allocates extra memory as a redzone around the
returned memory region, and marks the redzone as unaddressable to
detect overflows or underflows.

Updates #44853.

Change-Id: I2753d1cc1296935a66bf521e31ce91e35fcdf798
Reviewed-on: https://go-review.googlesource.com/c/go/+/298614
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: fannie zhang <Fannie.Zhang@arm.com>
2021-11-02 05:35:11 +00:00
Michael Anthony Knyszek
d8fc7f785e runtime: clean up allocation zeroing
Currently, the runtime zeroes allocations in several ways. First, small
object spans are always zeroed if they come from mheap, and their slots
are zeroed later in mallocgc if needed. Second, large object spans
(objects that have their own spans) plumb the need for zeroing down into
mheap. Third, large objects that have no pointers have their zeroing
delayed until after preemption is reenabled, but before returning in
mallocgc.

All of this has two consequences:
1. Spans for small objects that come from mheap are sometimes
   unnecessarily zeroed, even if the mallocgc call that created them
   doesn't need the object slot to be zeroed.
2. This is all messy and difficult to reason about.

This CL simplifies this code, resolving both (1) and (2). First, it
recognizes that zeroing in mheap is unnecessary for small object spans;
mallocgc and its callees in mcache and mcentral, by design, are *always*
able to deal with non-zeroed spans. They must, for they deal with
recycled spans all the time. Once this fact is made clear, the only
remaining use of zeroing in mheap is for large objects.

As a result, this CL lifts mheap zeroing for large objects into
mallocgc, to parallel all the other codepaths in mallocgc. This makes
the large object allocation code less surprising.

Next, this CL sets the flag for the delayed zeroing explicitly in the one
case where it matters, and inverts and renames the flag from isZeroed to
delayZeroing.

Finally, it adds a check to make sure that only pointer-free allocations
take the delayed zeroing codepath, as an extra safety measure.
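
A condensed sketch of the resulting mallocgc flow for large objects
(simplified; delayZeroing matches the renamed flag above, the other
helpers are stand-ins):

	// Sketch: only large, pointer-free objects defer zeroing until
	// preemption is re-enabled; everything else is zeroed up front.
	delayZeroing := false
	if needzero {
		if noscan && large {
			delayZeroing = true // safe: the GC never scans this object
		} else {
			memclrNoHeapPointers(x, size)
		}
	}
	releasem(mp) // re-enable preemption
	if delayZeroing {
		if !noscan {
			throw("delayed zeroing on data that may contain pointers")
		}
		memclrNoHeapPointersChunked(size, x) // preemptible, chunked zeroing
	}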

Benchmark results: https://perf.golang.org/search?q=upload:20211028.8

Inspired by tapir.liu@gmail.com's CL 343470.

Change-Id: I7e1296adc19ce8a02c8d93a0a5082aefb2673e8f
Reviewed-on: https://go-review.googlesource.com/c/go/+/359477
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
2021-10-29 17:44:15 +00:00
Michael Anthony Knyszek
3aecb3a8f7 runtime: fix sweep termination condition
Currently, there is a chance that the sweep termination condition could
flap, causing e.g. runtime.GC to return once all sweep work has been
drained but before it has actually completed. CL 307915 and CL 307916
attempted to fix this problem, but it is still possible that
mheap_.sweepDrained is marked before any outstanding sweepers are
accounted for in mheap_.sweepers, leaving a window in which a thread
could observe isSweepDone as true before it actually was (and after some
time it would revert to false, then true again, depending on the number
of outstanding sweepers at that point).

This change fixes the sweep termination condition by merging
mheap_.sweepers and mheap_.sweepDrained into a single atomic value.

This value is updated such that a new potential sweeper will increment
the outstanding sweeper count iff there are still outstanding spans to be
swept without an outstanding sweeper to pick them up. This design
simplifies the sweep termination condition into a single atomic load and
comparison and ensures the condition never flaps.
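
One way to picture the merged value (a sketch; the actual encoding in
the CL may differ):

	// Sketch: a single atomic word holds a "drained" bit plus a count
	// of outstanding sweepers, so "is sweeping done" is one load and
	// one comparison, and can never flap.
	const sweepDrainedBit = 1 << 31

	var sweepState atomic.Uint32 // drained bit | active sweeper count

	func isSweepDone() bool {
		return sweepState.Load() == sweepDrainedBit
	}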

Updates #46500.
Fixes #45315.

Change-Id: I6d69aff156b8d48428c4cc8cfdbf28be346dbf04
Reviewed-on: https://go-review.googlesource.com/c/go/+/333389
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2021-10-29 17:12:47 +00:00
Michael Anthony Knyszek
016d5eea11 runtime: retype mheap.reclaimCredit as atomic.Uintptr
[git-generate]
cd src/runtime
mv export_test.go export.go
GOROOT=$(dirname $(dirname $PWD)) rf '
  add mheap.reclaimCredit \
// reclaimCredit is spare credit for extra pages swept. Since \
// the page reclaimer works in large chunks, it may reclaim \
// more than requested. Any spare pages released go to this \
// credit pool. \
reclaimCredit_ atomic.Uintptr
  ex {
    import "runtime/internal/atomic"

    var t mheap
    var v, w uintptr
    var d uintptr

    t.reclaimCredit -> t.reclaimCredit_.Load()
    t.reclaimCredit = v -> t.reclaimCredit_.Store(v)
    atomic.Loaduintptr(&t.reclaimCredit) -> t.reclaimCredit_.Load()
    atomic.LoadAcquintptr(&t.reclaimCredit) -> t.reclaimCredit_.LoadAcquire()
    atomic.Storeuintptr(&t.reclaimCredit, v) -> t.reclaimCredit_.Store(v)
    atomic.StoreReluintptr(&t.reclaimCredit, v) -> t.reclaimCredit_.StoreRelease(v)
    atomic.Casuintptr(&t.reclaimCredit, v, w) -> t.reclaimCredit_.CompareAndSwap(v, w)
    atomic.Xchguintptr(&t.reclaimCredit, v) -> t.reclaimCredit_.Swap(v)
    atomic.Xadduintptr(&t.reclaimCredit, d) -> t.reclaimCredit_.Add(d)
  }
  rm mheap.reclaimCredit
  mv mheap.reclaimCredit_ mheap.reclaimCredit
'
mv export.go export_test.go

Change-Id: I2c567781a28f5d8c2275ff18f2cf605b82f22d09
Reviewed-on: https://go-review.googlesource.com/c/go/+/356712
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2021-10-20 20:39:36 +00:00
Michael Anthony Knyszek
a91e976fd2 runtime: retype mheap.reclaimIndex as atomic.Uint64
[git-generate]
cd src/runtime
mv export_test.go export.go
GOROOT=$(dirname $(dirname $PWD)) rf '
  add mheap.reclaimIndex \
// reclaimIndex is the page index in allArenas of next page to \
// reclaim. Specifically, it refers to page (i % \
// pagesPerArena) of arena allArenas[i / pagesPerArena]. \
// \
// If this is >= 1<<63, the page reclaimer is done scanning \
// the page marks. \
reclaimIndex_ atomic.Uint64
  ex {
    import "runtime/internal/atomic"

    var t mheap
    var v, w uint64
    var d int64

    t.reclaimIndex -> t.reclaimIndex_.Load()
    t.reclaimIndex = v -> t.reclaimIndex_.Store(v)
    atomic.Load64(&t.reclaimIndex) -> t.reclaimIndex_.Load()
    atomic.LoadAcq64(&t.reclaimIndex) -> t.reclaimIndex_.LoadAcquire()
    atomic.Store64(&t.reclaimIndex, v) -> t.reclaimIndex_.Store(v)
    atomic.StoreRel64(&t.reclaimIndex, v) -> t.reclaimIndex_.StoreRelease(v)
    atomic.Cas64(&t.reclaimIndex, v, w) -> t.reclaimIndex_.CompareAndSwap(v, w)
    atomic.Xchg64(&t.reclaimIndex, v) -> t.reclaimIndex_.Swap(v)
    atomic.Xadd64(&t.reclaimIndex, d) -> t.reclaimIndex_.Add(d)
  }
  rm mheap.reclaimIndex
  mv mheap.reclaimIndex_ mheap.reclaimIndex
'
mv export.go export_test.go

Change-Id: I1d619e3ac032285b5f7eb6c563a5188c8e36d089
Reviewed-on: https://go-review.googlesource.com/c/go/+/356711
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
2021-10-20 20:39:33 +00:00
Michael Anthony Knyszek
1dff8f0a05 runtime: retype mheap.pagesSweptBasis as atomic.Uint64
[git-generate]
cd src/runtime
mv export_test.go export.go
GOROOT=$(dirname $(dirname $PWD)) rf '
  add mheap.pagesSweptBasis pagesSweptBasis_ atomic.Uint64 // pagesSwept to use as the origin of the sweep ratio
  ex {
    import "runtime/internal/atomic"

    var t mheap
    var v, w uint64
    var d int64

    t.pagesSweptBasis -> t.pagesSweptBasis_.Load()
    t.pagesSweptBasis = v -> t.pagesSweptBasis_.Store(v)
    atomic.Load64(&t.pagesSweptBasis) -> t.pagesSweptBasis_.Load()
    atomic.LoadAcq64(&t.pagesSweptBasis) -> t.pagesSweptBasis_.LoadAcquire()
    atomic.Store64(&t.pagesSweptBasis, v) -> t.pagesSweptBasis_.Store(v)
    atomic.StoreRel64(&t.pagesSweptBasis, v) -> t.pagesSweptBasis_.StoreRelease(v)
    atomic.Cas64(&t.pagesSweptBasis, v, w) -> t.pagesSweptBasis_.CompareAndSwap(v, w)
    atomic.Xchg64(&t.pagesSweptBasis, v) -> t.pagesSweptBasis_.Swap(v)
    atomic.Xadd64(&t.pagesSweptBasis, d) -> t.pagesSweptBasis_.Add(d)
  }
  rm mheap.pagesSweptBasis
  mv mheap.pagesSweptBasis_ mheap.pagesSweptBasis
'
mv export.go export_test.go

Change-Id: Id9438184b9bd06d96894c02376385bad45dee154
Reviewed-on: https://go-review.googlesource.com/c/go/+/356710
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
2021-10-20 20:39:29 +00:00
Michael Anthony Knyszek
e90492882a runtime: retype mheap.pagesSwept as atomic.Uint64
[git-generate]
cd src/runtime
mv export_test.go export.go
GOROOT=$(dirname $(dirname $PWD)) rf '
  add mheap.pagesSwept pagesSwept_ atomic.Uint64 // pages swept this cycle
  ex {
    import "runtime/internal/atomic"

    var t mheap
    var v, w uint64
    var d int64

    t.pagesSwept -> t.pagesSwept_.Load()
    t.pagesSwept = v -> t.pagesSwept_.Store(v)
    atomic.Load64(&t.pagesSwept) -> t.pagesSwept_.Load()
    atomic.LoadAcq64(&t.pagesSwept) -> t.pagesSwept_.LoadAcquire()
    atomic.Store64(&t.pagesSwept, v) -> t.pagesSwept_.Store(v)
    atomic.StoreRel64(&t.pagesSwept, v) -> t.pagesSwept_.StoreRelease(v)
    atomic.Cas64(&t.pagesSwept, v, w) -> t.pagesSwept_.CompareAndSwap(v, w)
    atomic.Xchg64(&t.pagesSwept, v) -> t.pagesSwept_.Swap(v)
    atomic.Xadd64(&t.pagesSwept, d) -> t.pagesSwept_.Add(d)
  }
  rm mheap.pagesSwept
  mv mheap.pagesSwept_ mheap.pagesSwept
'
mv export.go export_test.go

Change-Id: Ife99893d90a339655f604bc3a64ee3decec645ea
Reviewed-on: https://go-review.googlesource.com/c/go/+/356709
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2021-10-20 20:39:25 +00:00
Michael Anthony Knyszek
d419a80bc7 runtime: retype mheap.pagesInUse as atomic.Uint64
[git-generate]
cd src/runtime
mv export_test.go export.go
GOROOT=$(dirname $(dirname $PWD)) rf '
  add mheap.pagesInUse \
// Proportional sweep \
// \
// These parameters represent a linear function from gcController.heapLive \
// to page sweep count. The proportional sweep system works to \
// stay in the black by keeping the current page sweep count \
// above this line at the current gcController.heapLive. \
// \
// The line has slope sweepPagesPerByte and passes through a \
// basis point at (sweepHeapLiveBasis, pagesSweptBasis). At \
// any given time, the system is at (gcController.heapLive, \
// pagesSwept) in this space. \
// \
// It is important that the line pass through a point we \
// control rather than simply starting at a 0,0 origin \
// because that lets us adjust sweep pacing at any time while \
// accounting for current progress. If we could only adjust \
// the slope, it would create a discontinuity in debt if any \
// progress has already been made. \
pagesInUse_ atomic.Uint64 // pages of spans in stats mSpanInUse
  ex {
    import "runtime/internal/atomic"

    var t mheap
    var v, w uint64
    var d int64

    t.pagesInUse -> t.pagesInUse_.Load()
    t.pagesInUse = v -> t.pagesInUse_.Store(v)
    atomic.Load64(&t.pagesInUse) -> t.pagesInUse_.Load()
    atomic.LoadAcq64(&t.pagesInUse) -> t.pagesInUse_.LoadAcquire()
    atomic.Store64(&t.pagesInUse, v) -> t.pagesInUse_.Store(v)
    atomic.StoreRel64(&t.pagesInUse, v) -> t.pagesInUse_.StoreRelease(v)
    atomic.Cas64(&t.pagesInUse, v, w) -> t.pagesInUse_.CompareAndSwap(v, w)
    atomic.Xchg64(&t.pagesInUse, v) -> t.pagesInUse_.Swap(v)
    atomic.Xadd64(&t.pagesInUse, d) -> t.pagesInUse_.Add(d)
  }
  rm mheap.pagesInUse
  mv mheap.pagesInUse_ mheap.pagesInUse
'
mv export.go export_test.go

Change-Id: I495d188683dba0778518563c46755b5ad43be298
Reviewed-on: https://go-review.googlesource.com/c/go/+/356549
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2021-10-20 20:39:19 +00:00
Michael Anthony Knyszek
9c58e399a4 [dev.typeparams] runtime: fix import sort order [generated]
[git-generate]
cd src/runtime
goimports -w *.go

Change-Id: I1387af0f2fd1a213dc2f4c122e83a8db0fcb15f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/329189
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2021-06-17 20:42:23 +00:00
Michael Anthony Knyszek
6d85891b29 [dev.typeparams] runtime: replace uses of runtime/internal/sys.PtrSize with internal/goarch.PtrSize [generated]
[git-generate]
cd src/runtime/internal/math
gofmt -w -r "sys.PtrSize -> goarch.PtrSize" .
goimports -w *.go
cd ../..
gofmt -w -r "sys.PtrSize -> goarch.PtrSize" .
goimports -w *.go

Change-Id: I43491cdd54d2e06d4d04152b3d213851b7d6d423
Reviewed-on: https://go-review.googlesource.com/c/go/+/328337
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2021-06-17 18:54:48 +00:00
David Chase
8a4b7294af cmd/compile: fix possible nil deref added in CL 270943
In the event allocSpan returned a nil, this would crash.
Cleaned up the code and comments slightly, too.

Change-Id: I6231d4b4c14218e6956b4a97a205adc3206f59ec
Reviewed-on: https://go-review.googlesource.com/c/go/+/316429
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2021-05-03 18:17:42 +00:00
David Chase
0bbfc5c31e runtime: break up large calls to memclrNoHeapPointers to allow preemption
If something "huge" is allocated, and the zeroing is trivial (no pointers
involved) then zero it by chunks in a loop so that preemption can occur,
not all in a single non-preemptible call.

Benchmarking suggests that 256K is the best chunk size.
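
A sketch of the chunked loop described above (simplified; the real code
lives on the large-allocation path in mallocgc):

	// Sketch: zero a huge pointer-free region in 256 KiB chunks so
	// preemption can occur between chunks, instead of one long
	// non-preemptible memclr call.
	func memclrChunked(p, size uintptr) {
		const chunkBytes = 256 << 10 // best size per benchmarks
		for size > 0 {
			n := uintptr(chunkBytes)
			if n > size {
				n = size
			}
			memclrNoHeapPointers(unsafe.Pointer(p), n)
			p, size = p+n, size-n
			// preemption point between chunks
		}
	}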

Updates #42642.

Change-Id: I94015e467eaa098c59870e479d6d83bc88efbfb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/270943
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2021-04-30 19:41:02 +00:00
Michael Anthony Knyszek
f2d5bd1ad3 runtime: move internal GC statistics from memstats to gcController
This change moves certain important but internal-only GC statistics from
memstats into gcController. These statistics are mainly used in pacing
the GC, so it makes sense to keep them in the pacer's state.

This CL was mostly generated via

rf '
    ex . {
	memstats.gc_trigger -> gcController.trigger
	memstats.triggerRatio -> gcController.triggerRatio
	memstats.heap_marked -> gcController.heapMarked
	memstats.heap_live -> gcController.heapLive
	memstats.heap_scan -> gcController.heapScan
    }
'

except for a few special cases, like updating names in comments and when
these fields are used within gcControllerState methods (at which point
they're accessed through the receiver).

For #44167.

Change-Id: I6bd1602585aeeb80818ded24c07d8e6fec992b93
Reviewed-on: https://go-review.googlesource.com/c/go/+/306598
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2021-04-13 23:42:29 +00:00
Austin Clements
1b736b3c19 runtime: consolidate "is sweep done" conditions
The runtime currently has two different notions of sweep completion:

1. All spans are either swept or have begun sweeping.

2. The sweeper has *finished* sweeping all spans.

Having both is confusing (it doesn't help that the documentation is
often unclear or wrong). Condition 2 is stronger, and the slight
theoretical optimization that condition 1 would permit is never actually
useful. Hence, this CL consolidates both conditions down to condition 2.

Updates #45315.

Change-Id: I55c84d767d74eb31a004a5619eaba2e351162332
Reviewed-on: https://go-review.googlesource.com/c/go/+/307916
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2021-04-12 19:22:52 +00:00
Austin Clements
a25a77aed2 runtime: block sweep completion on all sweep paths
The runtime currently has two different notions of sweep completion:

1. All spans are either swept or have begun sweeping.

2. The sweeper has *finished* sweeping all spans.

Most things depend on condition 1. Notably, GC correctness depends on
condition 1, but since all sweep operations are non-preemptible, the STW
at the beginning of GC forces condition 1 to become condition 2.

runtime.GC(), however, depends on condition 2, since the intent is to
complete a full GC cycle, and also to update the heap profile (which
can only be done after sweeping is complete).

However, the way we compute condition 2 is racy right now and may in
fact only indicate condition 1. Specifically, sweepone blocks
condition 2 until all sweepone calls are done, but there are many
other ways to enter the sweeper that don't block this. Hence, sweepone
may see that there are no more spans in the sweep list, observe that
it's the last sweepone, and declare sweeping done while some other
sweeper is still working on a span.

Fix this by making sure every entry to the sweeper participates in the
protocol that blocks condition 2. To make sure we get this right, this
CL introduces a type to track sweep blocking and (lightly) enforces
span sweep ownership via the type system. This has the nice
side-effect of abstracting the pattern of acquiring sweep ownership
that's currently repeated in many different places.
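
The ownership idea can be sketched as (illustrative; the CL's actual
types carry more state):

	// Sketch: acquiring sweep ownership yields a distinct type, so
	// only code holding a sweepLocked value may sweep the span, and
	// every entry into the sweeper runs the same blocking protocol.
	type sweepLocked struct{ s *span }

	func tryAcquire(s *span) (sweepLocked, bool) {
		if !blockSweepCompletion() { // register as an outstanding sweeper
			return sweepLocked{}, false
		}
		return sweepLocked{s}, true
	}

	func (l sweepLocked) sweep() {
		// ... sweeping is only reachable with ownership ...
		unblockSweepCompletion() // permit sweep termination when done
	}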

Fixes #45315.

Change-Id: I7fab30170c5ae14c8b2f10998628735b8be6d901
Reviewed-on: https://go-review.googlesource.com/c/go/+/307915
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2021-04-12 19:22:50 +00:00
Austin Clements
4e1bf8ed38 runtime: add GC testing helpers for regabi signature fuzzer
This CL adds a set of helper functions for testing GC interactions.
These are intended for use in the regabi signature fuzzer, but are
generally useful for GC tests, so we make them generally available to
runtime tests.

These provide:

1. An easy way to force stack movement, for testing stack copying.

2. A simple and robust way to check the reachability of a set of
pointers.

3. A way to check what general category of memory a pointer points to,
mostly so tests can make sure they're testing what they mean to.
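
Usage might look roughly like this (a hypothetical sketch; the helper
names below are invented for illustration, not the CL's exported API):

	// Hypothetical sketch of a test using the three kinds of helpers
	// described above.
	forceStackMoveOnNextCall() // (1) exercise stack copying
	deepCall()

	p := new(int)
	reachable := checkReachable(unsafe.Pointer(p)) // (2) reachability probe
	_ = reachable

	class := pointerClass(unsafe.Pointer(p)) // (3) heap/stack/data category
	_ = class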

For #40724, but generally useful.

Change-Id: I15d33ccb3f5a792c0472a19c2cc9a8b4a9356a66
Reviewed-on: https://go-review.googlesource.com/c/go/+/305330
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
2021-03-29 21:50:16 +00:00
Austin Clements
1ef114d12c runtime: abstract specials list iteration
The specials processing loop in mspan.sweep is about to get more
complicated and I'm too allergic to list manipulation to open code
more of it there.

Change-Id: I767a0889739da85fb2878fc06a5c55b73bf2ba7d
Reviewed-on: https://go-review.googlesource.com/c/go/+/305551
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2021-03-29 21:50:14 +00:00
Michael Anthony Knyszek
bd6aeca968 runtime: prepare arenas for use incrementally
This change moves the call of sysMap from (*mheap).sysAlloc into
(*mheap).grow, so we only sysMap what we're going to use in the near
future (thanks to the curArena mechanism). The purpose of this change is
to better support systems with strict overcommit rules which generally
accept reserved memory but not prepared memory (see malloc.go for exact
descriptions of these states).

This move requires changing linearAlloc to only optionally map memory.
In one case, with mheap.heapArenaAlloc, we do want it to map memory. But
now in the other case, with mheap.arena, we don't, because we want grow
to take care of it.

The risk with this change is that we may make more syscalls than before
on systems with 64 MiB arenas, but because heap growth is relatively rare
this is unlikely to be a noticeable issue. We also bound the number of
syscalls made by only extending curArena (and thus mapping) by
pallocChunkPages*pageSize, which is 4 MiB.

Fixes #42612.

Change-Id: I736df696afe78ddb1a747a896caa0db8726027e5
Reviewed-on: https://go-review.googlesource.com/c/go/+/270537
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2021-03-15 20:20:51 +00:00
Matthew Dempsky
4662029264 runtime: simplify divmagic for span calculations
It's both simpler and faster to just unconditionally do two 32-bit
multiplies rather than a bunch of branching to try to avoid them.
This is safe thanks to the tight bounds derived in [1] and verified
by mksizeclasses.go.
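
In sketch form (simplified; divMul below is a per-size-class magic
constant whose bounds mksizeclasses.go verifies):

	// Sketch: object index within a span via one 32-bit magic multiply
	// and a shift -- no branches, valid for every offset a span can
	// produce.
	func objIndex(offset uintptr, divMul uint32) uintptr {
		return uintptr((uint64(offset) * uint64(divMul)) >> 32)
	}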

Benchstat results below for compilebench benchmarks on my P920. See
also [2] for micro benchmarks comparing the new functions against the
originals (as well as several basic attempts at optimizing them).

name                      old time/op       new time/op       delta
Template                        295ms ± 3%        290ms ± 1%  -1.95%  (p=0.000 n=20+20)
Unicode                         113ms ± 3%        110ms ± 2%  -2.32%  (p=0.000 n=21+17)
GoTypes                         1.78s ± 1%        1.76s ± 1%  -1.23%  (p=0.000 n=21+20)
Compiler                        119ms ± 2%        117ms ± 4%  -1.53%  (p=0.007 n=20+20)
SSA                             14.3s ± 1%        13.8s ± 1%  -3.12%  (p=0.000 n=17+20)
Flate                           173ms ± 2%        170ms ± 1%  -1.64%  (p=0.000 n=20+19)
GoParser                        278ms ± 2%        273ms ± 2%  -1.92%  (p=0.000 n=20+19)
Reflect                         686ms ± 3%        671ms ± 3%  -2.18%  (p=0.000 n=19+20)
Tar                             255ms ± 2%        248ms ± 2%  -2.90%  (p=0.000 n=20+20)
XML                             335ms ± 3%        327ms ± 2%  -2.34%  (p=0.000 n=20+20)
LinkCompiler                    799ms ± 1%        799ms ± 1%    ~     (p=0.925 n=20+20)
ExternalLinkCompiler            1.90s ± 1%        1.90s ± 0%    ~     (p=0.327 n=20+20)
LinkWithoutDebugCompiler        385ms ± 1%        386ms ± 1%    ~     (p=0.251 n=18+20)
[Geo mean]                      512ms             504ms       -1.61%

name                      old user-time/op  new user-time/op  delta
Template                        286ms ± 4%        282ms ± 4%  -1.42%  (p=0.025 n=21+20)
Unicode                         104ms ± 9%        102ms ±14%    ~     (p=0.294 n=21+20)
GoTypes                         1.75s ± 3%        1.72s ± 2%  -1.36%  (p=0.000 n=21+20)
Compiler                        109ms ±11%        108ms ± 8%    ~     (p=0.187 n=21+19)
SSA                             14.0s ± 1%        13.5s ± 2%  -3.25%  (p=0.000 n=16+20)
Flate                           166ms ± 4%        164ms ± 4%  -1.34%  (p=0.032 n=19+19)
GoParser                        268ms ± 4%        263ms ± 4%  -1.71%  (p=0.011 n=18+20)
Reflect                         666ms ± 3%        654ms ± 4%  -1.77%  (p=0.002 n=18+20)
Tar                             245ms ± 5%        236ms ± 6%  -3.34%  (p=0.000 n=20+20)
XML                             320ms ± 4%        314ms ± 3%  -2.01%  (p=0.001 n=19+18)
LinkCompiler                    744ms ± 4%        747ms ± 3%    ~     (p=0.627 n=20+19)
ExternalLinkCompiler            1.71s ± 3%        1.72s ± 2%    ~     (p=0.149 n=20+20)
LinkWithoutDebugCompiler        345ms ± 6%        342ms ± 8%    ~     (p=0.355 n=20+20)
[Geo mean]                      484ms             477ms       -1.50%

[1] Daniel Lemire, Owen Kaser, Nathan Kurz. 2019. "Faster Remainder by
Direct Computation: Applications to Compilers and Software Libraries."
https://arxiv.org/abs/1902.01961

[2] https://github.com/mdempsky/benchdivmagic

Change-Id: Ie4d214e7a908b0d979c878f2d404bd56bdf374f6
Reviewed-on: https://go-review.googlesource.com/c/go/+/300994
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Trust: Matthew Dempsky <mdempsky@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2021-03-12 18:02:59 +00:00
Joel Sing
5ca43acdb3 runtime: allow physical page aligned stacks to be allocated
Add a physPageAlignedStack boolean which, if set, results in over-allocation
by a physical page, with the allocation rounded to physical page alignment
and the unused memory surrounding the allocation freed again.

OpenBSD/octeon has 16KB physical pages and requires stacks to be physical page
aligned in order for them to be remapped as MAP_STACK. This change allows Go
to work on this platform.

Based on a suggestion from mknyszek in issue #41008.

Updates #40995
Fixes #41008

Change-Id: Ia5d652292b515916db473043b41f6030094461d8
Reviewed-on: https://go-review.googlesource.com/c/go/+/266919
Trust: Joel Sing <joel@sing.id.au>
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2020-11-04 06:14:02 +00:00
Michael Anthony Knyszek
39a5ee52b9 runtime: decouple consistent stats from mcache and allow P-less update
This change modifies the consistent stats implementation to keep the
per-P sequence counter on each P instead of each mcache. A valid mcache
is not available everywhere that we want to call e.g. allocSpan, as per
issue #42339. By decoupling these two, we can add a mechanism to allow
contexts without a P to update stats consistently.

In this CL, we achieve that with a mutex. In practice, it will be very
rare for an M to update these stats without a P. Furthermore, the stats
reader also only needs to hold the mutex across the update to "gen"
since once that changes, writers are free to continue updating the new
stats generation. Contention could thus only arise between writers
without a P, and as mentioned earlier, those should be rare.

A nice side-effect of this change is that the consistent stats acquire
and release API becomes simpler.
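
The P-less path can be sketched as (illustrative; the real code differs
in detail):

	// Sketch: writers with a P use that P's sequence counter; writers
	// without a P serialize on a mutex, which the reader only needs to
	// hold across the generation switch.
	func addStats(pp *p, d heapStatsDelta) {
		if pp != nil {
			pp.statsSeq.Add(1) // odd: update in progress
			pp.stats.merge(&d)
			pp.statsSeq.Add(1) // even: update done
			return
		}
		lock(&noPStatsLock) // rare: no P available
		noPStats.merge(&d)
		unlock(&noPStatsLock)
	}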

Fixes #42339.

Change-Id: Ied74ab256f69abd54b550394c8ad7c4c40a5fe34
Reviewed-on: https://go-review.googlesource.com/c/go/+/267158
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-11-02 21:21:46 +00:00
Michael Anthony Knyszek
ac766e3718 runtime: make getMCache inlineable
This change moves the responsibility of throwing if an mcache is not
available to the caller, because the inlining cost of throw is set very
high in the compiler. Even if it were reduced down to the cost of a usual
function call, it would still be too expensive, so just move it out.

This choice also makes sense in the context of #42339 since we're going
to have to handle the case where we don't have an mcache to update stats
in a few contexts anyhow.

Also, add getMCache to the list of functions that should be inlined to
prevent future regressions.

getMCache is called on the allocation fast path and, because it's not
inlined, actually causes a significant regression (~10%) in some
microbenchmarks.
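
The resulting call pattern is roughly (a sketch; signature simplified):

	// Sketch: getMCache stays small enough to inline; the cold throw
	// lives at each call site instead.
	c := getMCache()
	if c == nil {
		throw("mcache not available")
	}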

Fixes #42305.

Change-Id: I64ac5e4f26b730bd4435ea1069a4a50f55411ced
Reviewed-on: https://go-review.googlesource.com/c/go/+/267157
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2020-11-02 21:10:41 +00:00
Michael Pratt
9393b5bae5 runtime: add heap lock assertions
Some functions that required holding the heap lock _or_ a world stop have
been simplified to simply require the heap lock. This is conceptually
simpler and taking the heap lock during world stop is guaranteed to not
contend. This was only done on functions already called on the
systemstack to avoid too many extra systemstack calls in GC.

Updates #40677

Change-Id: I15aa1dadcdd1a81aac3d2a9ecad6e7d0377befdc
Reviewed-on: https://go-review.googlesource.com/c/go/+/250262
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Pratt <mpratt@google.com>
2020-10-30 20:21:14 +00:00
Michael Anthony Knyszek
f77a9025f1 runtime: replace some memstats with consistent stats
This change replaces stacks_inuse, gcWorkBufInUse and
gcProgPtrScalarBitsInUse with their corresponding consistent stats. It
also adds checks to make sure the rest of the sharded stats line up with
existing stats in updatememstats.

Change-Id: I17d0bd181aedb5c55e09c8dff18cef5b2a3a14e3
Reviewed-on: https://go-review.googlesource.com/c/go/+/247038
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:28:20 +00:00
Michael Anthony Knyszek
fe7ff71185 runtime: add consistent heap statistics
This change adds a global set of heap statistics which are similar
to existing memory statistics. The purpose of these new statistics
is to be able to read them and get a consistent result without stopping
the world. The goal is to eventually replace as many of the existing
memstats statistics with the sharded ones as possible.

The consistent memory statistics use a tailor-made synchronization
mechanism to allow writers (allocators) to proceed with minimal
synchronization by using a sequence counter and a global generation
counter to determine which set of statistics to update. Readers
increment the global generation counter to effectively grab a snapshot
of the statistics, and then iterate over all Ps using the sequence
counter to ensure that they may safely read the snapshotted statistics.
To keep statistics fresh, the reader also has a responsibility to merge
sets of statistics.
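
In outline (a heavily simplified sketch of the mechanism; the runtime's
actual protocol has more machinery):

	// Sketch: writers bracket updates to the current generation's
	// shard with an even/odd sequence counter; the reader advances the
	// generation, then waits for each P's counter to go even before
	// reading the retired generation's shard.
	func write(pp *p, d delta) {
		pp.seq.Add(1) // odd: writer active
		pp.stats[gen.Load()%3].merge(&d)
		pp.seq.Add(1) // even: writer done
	}

	func read(out *delta) {
		g := gen.Add(1) % 3 // writers now target the new generation
		for _, pp := range allp {
			for pp.seq.Load()%2 != 0 {
				// wait out any in-flight writer on this P
			}
			out.merge(&pp.stats[(g+2)%3]) // safely read the old shard
		}
	}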

These consistent statistics are computed, but otherwise unused for now.
Upcoming changes will integrate them with the rest of the codebase and
will begin to phase out existing statistics.

Change-Id: I637a11f2439e2049d7dccb8650c5d82500733ca5
Reviewed-on: https://go-review.googlesource.com/c/go/+/247037
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:28:14 +00:00
Michael Anthony Knyszek
c5dea8f387 runtime: remove memstats.heap_idle
This statistic is updated in many places but for MemStats may be
computed from existing statistics. Specifically by definition
heap_idle = heap_sys - heap_inuse since heap_sys is all memory allocated
from the OS for use in the heap minus memory used for non-heap purposes.
heap_idle is almost the same (since it explicitly includes memory that
*could* be used for non-heap purposes) but also doesn't include memory
that's actually used to hold heap objects.

Although it has some utility as a sanity check, it complicates
accounting and we want fewer, orthogonal statistics for upcoming metrics
changes, so just drop it.

Change-Id: I40af54a38e335f43249f6e218f35088bfd4380d1
Reviewed-on: https://go-review.googlesource.com/c/go/+/246974
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:10:12 +00:00
Michael Anthony Knyszek
ad863ba32a runtime: break down memstats.gc_sys
This change breaks apart gc_sys into three distinct pieces. Two of those
pieces come from heap_sys, since they're allocated from the page heap.
The rest comes from memory mapped via e.g. persistentalloc, which better
fits the purpose of a sysMemStat. Also, rename gc_sys to gcMiscSys.

Change-Id: I098789170052511e7b31edbcdc9a53e5c24573f7
Reviewed-on: https://go-review.googlesource.com/c/go/+/246973
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:10:04 +00:00
Michael Anthony Knyszek
8ebc58452a runtime: delineate which memstats are system stats with a type
This change modifies the type of several mstats fields to be a new type:
sysMemStat. This type has the same structure as the fields used to have.

The purpose of this change is to make it very clear which stats may be
used in various functions for accounting (usually the platform-specific
sys* functions, but there are others). Currently there's an implicit
understanding that the *uint64 value passed to these functions is some
kind of statistic whose value is atomically managed. This understanding
isn't inherently problematic, but we're about to change how some stats
(which currently use mSysStatInc and mSysStatDec) work, so we want to
make it very clear what the various requirements are around "sysStat".

This change also removes mSysStatInc and mSysStatDec in favor of a
method on sysMemStat. Note that those two functions were originally
written the way they were because atomic 64-bit adds required a valid G
on ARM, but this hasn't been the case for a very long time (since
golang.org/cl/14204, but even before then it wasn't clear if mutexes
required a valid G anymore). Today we implement 64-bit adds on ARM with
a spinlock table.
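
A minimal sketch of the type (close to the description above):

	// Sketch: a distinct type makes "this value is an atomically
	// managed sys-memory stat" explicit at every API boundary.
	type sysMemStat uint64

	func (s *sysMemStat) load() uint64 { return atomic.Load64((*uint64)(s)) }

	func (s *sysMemStat) add(n int64) {
		atomic.Xadd64((*uint64)(s), n) // replaces mSysStatInc/mSysStatDec
	}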

Change-Id: I4e9b37cf14afc2ae20cf736e874eb0064af086d7
Reviewed-on: https://go-review.googlesource.com/c/go/+/246971
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:09:41 +00:00
Michael Anthony Knyszek
dc02578ac8 runtime: make the span allocation purpose more explicit
This change modifies mheap's span allocation API to have each caller
declare a purpose, defined as a new enum called spanAllocType.

The purpose behind this change is two-fold:
1. Tight control over who gets to allocate heap memory is, generally
   speaking, a good thing. Every codepath that allocates heap memory
   places additional implicit restrictions on the allocator. A notable
   example of a restriction is work bufs coming from heap memory: write
   barriers are not allowed in allocation paths because then we could
   have a situation where the allocator calls into the allocator.
2. Memory statistic updating is explicit. Instead of passing an opaque
   pointer for statistic updating, which places restrictions on how that
   statistic may be updated, we use the spanAllocType to determine which
   statistic to update and how.

We also take this opportunity to group all the statistic updating code
together, which should make the accounting code a little easier to
follow.
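
In sketch form (the enum values here are illustrative):

	// Sketch: every span allocation declares a purpose, and statistic
	// updates switch on it in one place.
	type spanAllocType uint8

	const (
		spanAllocHeap    spanAllocType = iota // heap objects
		spanAllocStack                        // goroutine stacks
		spanAllocWorkBuf                      // GC work bufs
	)

	func recordAlloc(typ spanAllocType, bytes uintptr) {
		switch typ {
		case spanAllocHeap:
			// update heap stats
		case spanAllocStack:
			// update stack stats
		case spanAllocWorkBuf:
			// update GC work buf stats
		}
	}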

Change-Id: Ic0b0898959ba2a776f67122f0e36c9d7d60e3085
Reviewed-on: https://go-review.googlesource.com/c/go/+/246970
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 17:27:14 +00:00
Michael Anthony Knyszek
d677899e90 runtime: flush local_scan directly and more often
Now that local_scan is the last mcache-based statistic that is flushed
by purgecachedstats, and heap_scan and gcController.revise may be
interacted with concurrently, we don't need to flush heap_scan at
arbitrary locations where the heap is locked, and we don't need
purgecachedstats and cachestats anymore. Instead, we can flush
local_scan at the same time we update heap_live in refill, so the two
updates may share the same revise call.

Clean up unused functions, remove code that would cause the heap to get
locked in allocSpan when it didn't need to be (other than to flush
local_scan), and flush local_scan explicitly in a few important places.
Notably we need to flush local_scan whenever we flush the other stats,
but it doesn't need to be donated anywhere, so have releaseAll do the
flushing. Also, we need to flush local_scan before we set heap_scan at
the end of a GC, which was previously handled by cachestats. Just do so
explicitly -- it's not much code and it becomes a lot more clear why we
need to do so.

Change-Id: I35ac081784df7744d515479896a41d530653692d
Reviewed-on: https://go-review.googlesource.com/c/go/+/246968
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 17:26:40 +00:00
Michael Anthony Knyszek
cca3d1e553 runtime: don't flush local_tinyallocs
This change makes local_tinyallocs work like the rest of the malloc
stats: local_tinyallocs is no longer flushed, and instead becomes the
source of truth.

Change-Id: I3e6cb5f1b3d086e432ce7d456895511a48e3617a
Reviewed-on: https://go-review.googlesource.com/c/go/+/246967
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 17:26:30 +00:00
Michael Anthony Knyszek
e63716bc76 runtime: make nlargealloc and largealloc mcache fields
This change makes nlargealloc and largealloc into mcache fields just
like nlargefree and largefree. These local fields become the new
source-of-truth. This change also moves the accounting for these fields
out of allocSpan (which is an inappropriate place for it -- this
accounting generally happens much closer to the point of allocation) and
into largeAlloc. This move is partially possible now that we can call
gcController.revise at that point.

Furthermore, this change moves largeAlloc into mcache.go and makes it a
method of mcache. While there's a little bit of a mismatch here because
largeAlloc barely interacts with the mcache, it helps solidify the
mcache as the first allocation layer and provides a clear place to
aggregate and manage statistics.

Change-Id: I37b5e648710733bb4c04430b71e96700e438587a
Reviewed-on: https://go-review.googlesource.com/c/go/+/246965
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 17:26:16 +00:00
Michael Anthony Knyszek
42019613df runtime: make distributed/local malloc stats the source-of-truth
This change makes it so that various local malloc stats (excluding
heap_scan and local_tinyallocs) are no longer written first to mheap
fields but are instead accessed directly from each mcache.

This change is part of a move toward having stats be distributed, and
cleaning up some old code related to the stats.

Note that because there's no central source-of-truth, when an mcache
dies, it must donate its stats to another mcache. It's always safe to
donate to the mcache for the 0th P, so do that.

Change-Id: I2556093dbc27357cb9621c9b97671f3c00aa1173
Reviewed-on: https://go-review.googlesource.com/c/go/+/246964
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 17:26:08 +00:00
Michael Anthony Knyszek
8cc280aa72 runtime: define and enforce synchronization on heap_scan
Currently heap_scan is mostly protected by the heap lock, but
gcControllerState.revise sometimes accesses it without a lock. In an
effort to make gcControllerState.revise callable from more contexts (and
have its synchronization guarantees actually respected), make heap_scan
atomically read from and written to, unless the world is stopped.

Note that we don't update gcControllerState.revise's erroneous doc
comment here because this change isn't about revise's guarantees, just
about heap_scan. The comment is updated in a later change.

Change-Id: Iddbbeb954767c704c2bd1d221f36e6c4fc9948a6
Reviewed-on: https://go-review.googlesource.com/c/go/+/246960
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 17:25:40 +00:00
lihaowei
7fbd8c75c6 all: fix spelling mistakes
Change-Id: I7d512281d8442d306594b57b5deaecd132b5ea9e
GitHub-Last-Rev: 251e1d6857
GitHub-Pull-Request: golang/go#40793
Reviewed-on: https://go-review.googlesource.com/c/go/+/248441
Reviewed-by: Dave Cheney <dave@cheney.net>
2020-08-18 03:28:52 +00:00
Michael Anthony Knyszek
e6d0bd2b89 runtime: clean up old mcentral code
This change deletes the old mcentral implementation from the code base
and the newMCentralImpl feature flag along with it.

Updates #37487.

Change-Id: Ibca8f722665f0865051f649ffe699cbdbfdcfcf2
Reviewed-on: https://go-review.googlesource.com/c/go/+/221184
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-08-17 20:06:49 +00:00