Commit graph

290 commits

Michael Anthony Knyszek
a3e90dc377 runtime: add eager scavenging details to GODEBUG=scavtrace=1
Also, clean up atomics on released-per-cycle while we're here.

For #57069.

Change-Id: I14026e8281f01dea1e8c8de6aa8944712b7b24d9
Reviewed-on: https://go-review.googlesource.com/c/go/+/495916
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-05-19 13:38:43 +00:00
Michael Anthony Knyszek
8992bb19ad runtime: replace trace.enabled with traceEnabled
[git-generate]
cd src/runtime
grep -l 'trace\.enabled' *.go | grep -v "trace.go" | xargs sed -i 's/trace\.enabled/traceEnabled()/g'

Change-Id: I14c7821c1134690b18c8abc0edd27abcdabcad72
Reviewed-on: https://go-review.googlesource.com/c/go/+/494181
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2023-05-11 21:27:08 +00:00
Michael Anthony Knyszek
8fa9e3beee runtime: manage huge pages explicitly
This change makes it so that on Linux the Go runtime explicitly marks
page heap memory as either available to be backed by hugepages or not
using heuristics based on density.

The motivation behind this change is twofold:
1. In default Linux configurations, khugepaged can recoalesce hugepages
   even after the scavenger breaks them up, resulting in significant
   overheads for small heaps when they shrink.
2. The Go runtime already has some heuristics about this, but those
   heuristics appear to have bit-rotted and result in haphazard
   hugepage management. Unlucky (but otherwise fairly dense) regions of
   memory end up not backed by huge pages while sparse regions end up
   accidentally marked MADV_HUGEPAGE and are not later broken up by the
   scavenger, because it already got the memory it needed from more
   dense sections (this is more likely to happen with small heaps that
   go idle).

In this change, the runtime uses a new policy:

1. Mark all new memory MADV_HUGEPAGE.
2. Track whether each page chunk (4 MiB) became dense during the GC
   cycle. Mark those MADV_HUGEPAGE, and hide them from the scavenger.
3. If a chunk is not dense for 1 full GC cycle, make it visible to the
   scavenger.
4. The scavenger marks a chunk MADV_NOHUGEPAGE before it scavenges it.

This policy is intended to try and back memory that is a good candidate
for huge pages (high occupancy) with huge pages, and give memory that is
not (low occupancy) to the scavenger. Occupancy is defined not just by
occupancy at any instant of time, but also occupancy in the near future.
It's generally true that by the end of a GC cycle the heap gets quite
dense (from the perspective of the page allocator).

Because we want scavenging and huge page management to happen together
(the right time to MADV_NOHUGEPAGE is just before scavenging in order to
break up huge pages and keep them that way) and the cost of applying
MADV_HUGEPAGE and MADV_NOHUGEPAGE is somewhat high, the scavenger avoids
releasing memory in dense page chunks. All this together means the
scavenger will now more generally release memory on a ~1 GC cycle delay.

Notably this has implications for scavenging to maintain the memory
limit and the runtime/debug.FreeOSMemory API. This change makes it so
that in these cases all memory is visible to the scavenger regardless of
sparseness and delays the page allocator in re-marking this memory with
MADV_NOHUGEPAGE for around 1 GC cycle to mitigate churn.

The end result of this change should be little-to-no performance
difference for dense heaps (MADV_HUGEPAGE works a lot like the default
unmarked state) but should allow the scavenger to more effectively take
back fragments of huge pages. The main risk here is churn, because
MADV_HUGEPAGE usually forces the kernel to immediately back memory with
a huge page. That's the reason for the large amount of hysteresis (1
full GC cycle) and why the definition of high density is 96% occupancy.
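
A minimal sketch of the hysteresis this policy describes, assuming a
per-chunk density record (the 4 MiB chunk size, the 96% threshold, and the
one-cycle delay come from this message; the type and methods below are
illustrative, not the runtime's actual metadata):

	package hugepagedemo

	// chunkState captures the hysteresis described above: a chunk stays
	// hidden from the scavenger for one full GC cycle after it was last
	// observed dense.
	type chunkState struct {
		denseThisCycle bool // reached the 96% occupancy threshold this cycle
		denseLastCycle bool // was dense during the previous cycle
	}

	// scavengable reports whether the scavenger may break up and release
	// pages from this chunk (marking it MADV_NOHUGEPAGE first).
	func (c *chunkState) scavengable() bool {
		return !c.denseThisCycle && !c.denseLastCycle
	}

	// endGCCycle rolls the density observation forward at the end of a cycle.
	func (c *chunkState) endGCCycle() {
		c.denseLastCycle = c.denseThisCycle
		c.denseThisCycle = false
	}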

Fixes #55328.

Change-Id: I8da7998f1a31b498a9cc9bc662c1ae1a6bf64630
Reviewed-on: https://go-review.googlesource.com/c/go/+/436395
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-19 14:30:00 +00:00
Michael Anthony Knyszek
1f9d80e331 runtime: disable huge pages for GC metadata for small heaps
For #55328.

Change-Id: I8792161f09906c08d506cc0ace9d07e76ec6baa6
Reviewed-on: https://go-review.googlesource.com/c/go/+/460316
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-04-19 14:27:27 +00:00
Oleksandr Redko
1a09d57de5 runtime: correct typos
- Fix typo in throw error message for arena.
- Correct typos in assembly and Go comments.
- Fix log message in TestTraceCPUProfile.

Change-Id: I874c9e8cd46394448b6717bc6021aa3ecf319d16
GitHub-Last-Rev: d27fad4d3c
GitHub-Pull-Request: golang/go#58375
Reviewed-on: https://go-review.googlesource.com/c/go/+/465975
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-02-08 14:52:12 +00:00
Keith Randall
79edd1d19d runtime: remove go119MemoryLimitSupport flag
Change-Id: I207480d991c6242a1610795605c5ec6a3b3c59de
Reviewed-on: https://go-review.googlesource.com/c/go/+/463225
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2023-01-28 20:46:29 +00:00
Michael Knyszek
e4435cb844 runtime: add page tracer
This change adds a new GODEBUG flag called pagetrace that writes a
low-overhead trace of how pages of memory are managed by the Go runtime.

The page tracer is kept behind a GOEXPERIMENT flag due to a potential
security risk for setuid binaries.

Change-Id: I6f4a2447d02693c25214400846a5d2832ad6e5c0
Reviewed-on: https://go-review.googlesource.com/c/go/+/444157
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-18 03:45:30 +00:00
Cherry Mui
febe7b8e2a runtime: make GC see object as allocated after it is initialized
When the GC is scanning some memory (possibly conservatively),
finding a pointer, while concurrently another goroutine is
allocating an object at the same address as the found pointer, the
GC may see the pointer before the object and/or the heap bits are
initialized. This may cause the GC to see bad pointers and
possibly crash.

To prevent this, we make it so that the scanner can only see the
object as allocated after the object and the heap bits are
initialized. Currently the allocator uses freeindex to find the
next available slot, and that code is coupled with updating the
free index to a new slot past it. The scanner also uses the
freeindex to determine if an object is allocated. This is somewhat
racy. This CL makes the scanner use a different field, which is
only updated after the object initialization (and a memory
barrier).
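
A hedged sketch of the publish-after-initialize pattern described here,
assuming a separate scan-visible index (field and type names are
illustrative, not the actual mspan layout):

	package allocdemo

	import "sync/atomic"

	type span struct {
		// freeIndexAlloc is advanced by the allocator as it hands out slots.
		freeIndexAlloc uint32
		// freeIndexScan is advanced only after the object and its heap bits
		// are fully initialized, so a (possibly conservative) scanner never
		// treats a half-built slot as an allocated object.
		freeIndexScan atomic.Uint32
	}

	func (s *span) allocate(initialize func(slot uint32)) uint32 {
		slot := s.freeIndexAlloc
		s.freeIndexAlloc++
		initialize(slot) // write the object contents and heap bits first
		// The atomic store is the publication point (and memory barrier):
		// only now may the scanner consider this slot allocated.
		s.freeIndexScan.Store(slot + 1)
		return slot
	}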

Fixes #54596.

Change-Id: I2a57a226369926e7192c253dd0d21d3faf22297c
Reviewed-on: https://go-review.googlesource.com/c/go/+/449017
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-11-15 02:55:24 +00:00
Michael Anthony Knyszek
7866538d25 runtime: add safe arena support to the runtime
This change adds an API to the runtime for arenas. A later CL can
potentially export it as an experimental API, but for now, just the
runtime implementation will suffice.

The purpose of arenas is to improve efficiency, primarily by allowing
for an application to manually free memory, thereby delaying garbage
collection. It comes with other potential performance benefits, such as
better locality, a better allocation strategy, and better handling of
interior pointers by the GC.

This implementation is based on one by danscales@google.com with a few
significant differences:
* The implementation lives entirely in the runtime (all layers).
* Arena chunks are the minimum of 8 MiB or the heap arena size. This
  choice is made because in practice 64 MiB appears to be way too large
  of an area for most real-world use-cases.
* Arena chunks are not unmapped, instead they're placed on an evacuation
  list and when there are no pointers left pointing into them, they're
  allowed to be reused.
* Reusing partially-used arena chunks no longer tries to find one used
  by the same P first; it just takes the first one available.
* In order to ensure worst-case fragmentation is never worse than 25%,
  only types and slice backing stores whose sizes are 1/4th the size of
  a chunk or less may be used. Previously larger sizes, up to the size
  of the chunk, were allowed.
* ASAN, MSAN, and the race detector are fully supported.
* Sets arena chunks to fault that were deferred at the end of mark
  termination (a non-public patch once did this; I don't see a reason
  not to continue that).

For #51317.

Change-Id: I83b1693a17302554cb36b6daa4e9249a81b1644f
Reviewed-on: https://go-review.googlesource.com/c/go/+/423359
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-10-12 20:23:30 +00:00
Michael Anthony Knyszek
4c383951b9 runtime: make (*mheap).sysAlloc more general
This change makes (*mheap).sysAlloc take an explicit list of hints and a
boolean as to whether or not any newly-created heapArenas should be
registered in the full arena list.

This is a refactoring in preparation for arenas.

For #51317.

Change-Id: I0584a033fce3fcb60c5d0bc033d5fb8bd23b2378
Reviewed-on: https://go-review.googlesource.com/c/go/+/432078
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-10-12 20:23:24 +00:00
Michael Anthony Knyszek
69fc74f3ee runtime: factor out mheap span initialization
This change refactors mheap span initialization. It should just be a
no-op and prepares for adding support for arenas.

For #51317.

Change-Id: Ie6f877ca10f86d26e7b6c4857b223589a351e253
Reviewed-on: https://go-review.googlesource.com/c/go/+/423364
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-10-12 20:23:11 +00:00
Than McIntosh
506e690a26 runtime: mark arenaIdx.l1 and arenaIdx.l2 methods as nosplit
Mark the "l1" and "l2" methods on "arenaIdx" with //go:nosplit, since
these methods are called from a nosplit context (for example, from
"spanOf").

Fixes #56044.
Updates #21314.

Change-Id: I48c7aa756b59a13162c89ef21066f83371ae50f1
Reviewed-on: https://go-review.googlesource.com/c/go/+/441859
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-10-10 19:56:54 +00:00
Keith Randall
f1b7b2fc52 runtime: make mSpanStateBox accessors nosplit
get, at least, is called from typedmemclr which must not be interruptible.
These were previously nosplit by accident before CL 424395 (the only
call they had was an intrinsic, so they were leaf functions, so they had
no prologue). After CL 424395 they contained a call (in noinline builds),
thus had a prologue, thus had a suspension point.

I have no idea how we might test this.

This is another motivating use case for having a nosplitrec directive
in the runtime.

Fixes #55156
Fixes #54779
Fixes #54906
Fixes #54907

Change-Id: I851d733d71bda7172c4c96e027657e22b499ee00
Reviewed-on: https://go-review.googlesource.com/c/go/+/431919
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-09-19 21:57:04 +00:00
Michael Anthony Knyszek
b7c28f484d runtime/metrics: add CPU stats
This change adds a breakdown for estimated CPU usage by time. These
estimates are not based on real on-CPU counters, so each metric has a
disclaimer explaining so. They can, however, be more reasonably
compared to a total CPU time metric that this change also adds.

Fixes #47216.

Change-Id: I125006526be9f8e0d609200e193da5a78d9935be
Reviewed-on: https://go-review.googlesource.com/c/go/+/404307
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Josh MacDonald <jmacd@lightstep.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-09-16 16:32:20 +00:00
Michael Anthony Knyszek
5a37965495 runtime: make mheap.pagesInUse an atomic.Uintptr
This change addresses an old TODO: pagesInUse was a uint64 because a
uintptr would have complicated alignment within mheap. Now that we're
using atomic types as much as possible, that's no longer a concern, so
switch to a Uintptr. This likely will improve performance a tiny
bit on 32-bit platforms, but really it's mostly cleanup.

Change-Id: Ie705799a111ccad977fc1f43de8b50cf611be303
Reviewed-on: https://go-review.googlesource.com/c/go/+/429221
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-09-08 16:06:01 +00:00
Michael Anthony Knyszek
e28cc362a8 runtime: remove alignment padding in mheap and pageAlloc
All subfields use atomic types to ensure alignment, so there's no more
need for these fields.

Change-Id: Iada4253f352a074073ce603f1f6b07cbd5b7c58a
Reviewed-on: https://go-review.googlesource.com/c/go/+/429220
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
2022-09-08 16:06:00 +00:00
Ludi Rehak
f15761b50b runtime: fix formula for computing number of padding bytes
In order to prevent false sharing of cache lines, structs are
padded with some number of bytes. These bytes are unused, serving
only to make the size of the struct a multiple of the size of the
cache line.

The current calculation of how much to pad is an overestimation,
when the struct size is already a multiple of the cache line size
without padding. For these cases, no padding is necessary, and
the size of the inner pad field should be 0. The bug is that the
pad field is sized to a whole 'nother cache line, wasting space.

Here is the current formula that can never return 0:
cpu.CacheLinePadSize - unsafe.Sizeof(myStruct{})%cpu.CacheLinePadSize

This change simply mods that calculation by cpu.CacheLinePadSize,
so that 0 will be returned instead of cpu.CacheLinePadSize.
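
A compilable sketch of the before/after formulas, assuming a 64-byte cache
line and a stand-in struct (cacheLinePadSize and payload below are local
stand-ins for cpu.CacheLinePadSize and the real padded structs):

	package paddemo

	import "unsafe"

	const cacheLinePadSize = 64 // stand-in for cpu.CacheLinePadSize

	type payload struct{ a, b, c, d, e, f, g, h uint64 } // 64 bytes: already a multiple

	// Old formula: can never be 0, so an already-aligned struct wastes a
	// whole extra cache line of padding (oldPad == 64 here).
	const oldPad = cacheLinePadSize - unsafe.Sizeof(payload{})%cacheLinePadSize

	// New formula: the trailing "% cacheLinePadSize" maps a full-line result
	// back to 0 when no padding is needed (newPad == 0 here).
	const newPad = (cacheLinePadSize - unsafe.Sizeof(payload{})%cacheLinePadSize) % cacheLinePadSize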

Change-Id: I26a2b287171bf47a3b9121873b2722f728381b5e
Reviewed-on: https://go-review.googlesource.com/c/go/+/414214
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-19 16:04:12 +00:00
Cuong Manh Le
a719a78c1b runtime: add and use runtime/internal/sys.NotInHeap
Updates #46731

Change-Id: Ic2208c8bb639aa1e390be0d62e2bd799ecf20654
Reviewed-on: https://go-review.googlesource.com/c/go/+/421878
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
2022-08-19 00:29:18 +00:00
Cuong Manh Le
014f0e8205 runtime: convert mSpanStateBox.s to atomic type
Updates #53821

Change-Id: I02f31a7a8295deb3e840565412abf10ff776c2c3
Reviewed-on: https://go-review.googlesource.com/c/go/+/424395
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2022-08-17 16:26:24 +00:00
Keith Randall
6a9c674a09 runtime: redo heap bitmap
[this is a retry of CL 407035 + its revert CL 422395. The content is unchanged]

Use just 1 bit per word to record the ptr/nonptr bitmap.
Use word-sized operations to manipulate the bitmap, so we can operate
on up to 64 ptr/nonptr bits at a time.

Use a separate bitmap, one bit per word of the ptr/nonptr bitmap,
to encode a no-more-pointers signal. Since we can check 64 ptr/nonptr
bits at once, knowing the exact last pointer location is not necessary.

As a followon CL, we should make the gcdata bitmap an array of
uintptr instead of an array of byte, so we can load 64 bits of it at once.
Similarly for the processing of gc programs.
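
A rough sketch of consuming 64 ptr/nonptr bits at a time from a
1-bit-per-word bitmap, assuming a 64-bit platform (the function is
illustrative, not the runtime's actual scan loop):

	package bitmapdemo

	import "math/bits"

	// scanPtrWord visits the pointer slots described by one 64-bit chunk of
	// the ptr/nonptr bitmap. base is the address of the first heap word the
	// chunk describes; visit receives the address of each pointer word.
	func scanPtrWord(base uintptr, ptrBits uint64, visit func(addr uintptr)) {
		const ptrSize = 8 // 64-bit platform assumed
		for ptrBits != 0 {
			i := bits.TrailingZeros64(ptrBits)
			visit(base + uintptr(i)*ptrSize)
			ptrBits &= ptrBits - 1 // clear the lowest set bit
		}
	}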

Change-Id: Ica5eb622f5b87e647be64f471d67b02732ef8be6
Reviewed-on: https://go-review.googlesource.com/c/go/+/422634
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
2022-08-16 20:39:36 +00:00
Keith Randall
cb13022a24 runtime: ensure that we don't scan noscan objects
We claim to not maintain pointer bits for noscan objects. But in fact
we do, since whenever we switch a page from scannable to noscan, we
call heapBits.initSpan which zeroes the heap bits.

Switch to ensure that we never scan noscan objects. This ensures that
we don't depend on the ptrbits for noscan objects. That fixes a bug
in the 1-bit bitmap CL which depended on that fact.

Change-Id: I4e66f582605b53732f8fca310c1f6bd2892963cb
Reviewed-on: https://go-review.googlesource.com/c/go/+/422435
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-09 22:28:42 +00:00
Keith Randall
ad0287f496 Revert "runtime: redo heap bitmap"
This reverts commit b589208c8c.

Reason for revert: Bug somewhere in this code, causing wasm and maybe linux/386 to fail.

Change-Id: I5e1e501d839584e0219271bb937e94348f83c11f
Reviewed-on: https://go-review.googlesource.com/c/go/+/422395
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-08-09 16:10:10 +00:00
Keith Randall
b589208c8c runtime: redo heap bitmap
Use just 1 bit per word to record the ptr/nonptr bitmap.
Use word-sized operations to manipulate the bitmap, so we can operate
on up to 64 ptr/nonptr bits at a time.

Use a separate bitmap, one bit per word of the ptr/nonptr bitmap,
to encode a no-more-pointers signal. Since we can check 64 ptr/nonptr
bits at once, knowing the exact last pointer location is not necessary.

This cleans up the bitmap implementation significantly, which will
hopefully make it faster. TODO: measure

As a followon CL, we should make the gcdata bitmap an array of
uintptr instead of an array of byte, so we can load 64 bits of it at once.
Similarly for the processing of gc programs.

Change-Id: I18151b1876d9543599800dec51e2a1b19df97d49
Reviewed-on: https://go-review.googlesource.com/c/go/+/407035
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
2022-08-08 16:57:33 +00:00
Michael Anthony Knyszek
d7941030c9 runtime: only use CPU time from the current window in the GC CPU limiter
Currently the GC CPU limiter consumes CPU time from a few pools, but
because the events that flush to those pools may overlap, rather than be
strictly contained within, the update window for the GC CPU limiter, the
limiter's accounting is ultimately sloppy.

This sloppiness complicates accounting for idle time more completely,
and makes reasoning about the transient behavior of the GC CPU limiter
much more difficult.

To remedy this, this CL adds a field to the P struct that tracks the
start time of any in-flight event the limiter might care about, along
with information about the nature of that event. This timestamp is
managed atomically so that the GC CPU limiter can come in and perform a
read of the partial CPU time consumed by a given event. The limiter also
updates the timestamp so that only what's left over is flushed by the
event itself when it completes.

The end result of this change is that, since the GC CPU limiter is aware
of all past completed events, and all in-flight events, it can much more
accurately collect the CPU time of events since the last update. There's
still the possibility for skew, but any leftover time will be captured
in the following update, and the magnitude of this leftover time is
effectively bounded by the update period of the GC CPU limiter, which is
much easier to consider.

One caveat of managing this timestamp-type combo atomically is that they
need to be packed in 64 bits. So, this CL gives up the top 3 bits of the
timestamp and places the type information there. What this means is we
effectively have only a 61-bit resolution timestamp. This is fine when
the top 3 bits are the same between calls to nanotime, but becomes a
problem on boundaries when those 3 bits change. These cases may cause
hiccups in the GC CPU limiter by not accounting for some source of CPU
time correctly, but with 61 bits of resolution this should be extremely
rare. The rate of update is on the order of milliseconds, so at worst
the runtime will be off of any given measurement by only a few
CPU-milliseconds (and this is directly bounded by the rate of update).
We're probably more inaccurate from the fact that we don't measure real
CPU time but only approximate it.
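
A compilable sketch of the 64-bit packing described here: 3 event-type bits
at the top of the word and a 61-bit timestamp below (the names and exact
layout are illustrative):

	package limiterdemo

	const (
		typeBits = 3
		timeBits = 64 - typeBits
		timeMask = uint64(1)<<timeBits - 1
	)

	// pack stores the event type in the top 3 bits and the low 61 bits of a
	// nanotime-style timestamp in the rest.
	func pack(eventType uint8, now int64) uint64 {
		return uint64(eventType)<<timeBits | uint64(now)&timeMask
	}

	// unpack recovers the event type and the truncated timestamp. The
	// timestamp is only meaningful while the real clock's top 3 bits haven't
	// changed, which is the boundary case the message above describes.
	func unpack(v uint64) (eventType uint8, now int64) {
		return uint8(v >> timeBits), int64(v & timeMask)
	}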

For #52890.

Change-Id: I347f30ac9e2ba6061806c21dfe0193ef2ab3bbe9
Reviewed-on: https://go-review.googlesource.com/c/go/+/410120
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-06-03 20:16:15 +00:00
Michael Anthony Knyszek
1f0ef6bec7 runtime: cancel mark and scavenge assists if the limiter is enabled
This change forces mark and scavenge assists to be cancelled early if
the limiter is enabled. This avoids goroutines getting stuck in really
long assists if the limiter happens to be disabled when they first come
into the assist. This can get especially bad for mark assists, which, in
dire situations, can end up "owing" the GC a really significant debt.

For #52890.

Change-Id: I4bfaa76b8de3e167d49d2ffd8bc2127b87ea566a
Reviewed-on: https://go-review.googlesource.com/c/go/+/408816
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-27 17:30:15 +00:00
Michael Anthony Knyszek
b58067013e runtime: allocate physical-page-aligned memory differently
Currently, physical-page-aligned allocations for stacks (where the
physical page size is greater than the runtime page size) first
overallocates some memory, then frees the unaligned portions back to the
heap.

However, because allocating via h.pages.alloc causes scavenged bits to
get cleared, we need to account for that memory correctly in heapFree
and heapReleased. Currently that is not the case, leading to throws at
runtime.

Trying to get that accounting right is complicated, because information
about exactly which pages were scavenged needs to get plumbed up.
Instead, find the oversized region first, and then only allocate the
aligned part. This avoids any accounting issues.

However, this does come with some performance cost, because we don't
update searchAddr (which is safe, it just means the next allocation
potentially must look harder) and we skip the fast path that
h.pages.alloc has for simplicity.
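
A small sketch of carving the aligned portion out of an oversized region,
which is the core of the new approach (alignUp and alignedPortion are local
helpers, not runtime exports):

	package aligndemo

	// alignUp rounds n up to a multiple of align, which must be a power of two.
	func alignUp(n, align uintptr) uintptr {
		return (n + align - 1) &^ (align - 1)
	}

	// alignedPortion returns the physPageSize-aligned region of length size
	// inside an oversized reservation [base, base+size+physPageSize). Only
	// this portion is then allocated, so no scavenged/unscavenged accounting
	// for the unaligned leftovers is needed.
	func alignedPortion(base, size, physPageSize uintptr) (start, end uintptr) {
		start = alignUp(base, physPageSize)
		return start, start + size
	}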

Fixes #52682.

Change-Id: Iefa68317584d73b187634979d730eb30db770bb6
Reviewed-on: https://go-review.googlesource.com/c/go/+/407502
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-05-20 21:54:20 +00:00
Michael Anthony Knyszek
2eb8b6eec6 runtime: make CPU limiter assist time much less error-prone
At the expense of performance (having to update another atomic counter)
this change makes CPU limiter assist time much less error-prone to
manage. There are currently a number of issues with respect to how
scavenge assist time is treated, and this change resolves those by just
having the limiter maintain its own internal pool that's drained on each
update.

While we're here, clear the measured assist time each cycle, which was
the impetus for the change.

Change-Id: I84c513a9f012b4007362a33cddb742c5779782b7
Reviewed-on: https://go-review.googlesource.com/c/go/+/404304
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-05-13 16:02:20 +00:00
Michael Anthony Knyszek
91f863013e runtime: redesign scavenging algorithm
Currently the runtime's scavenging algorithm involves running from the
top of the heap address space to the bottom (or as far as it gets) once
per GC cycle. Once it treads some ground, it doesn't tread it again
until the next GC cycle.

This works just fine for the background scavenger, for heap-growth
scavenging, and for debug.FreeOSMemory. However, it breaks down in the
face of a memory limit for small heaps in the tens of MiB. Basically,
because the scavenger never retreads old ground, it's completely
oblivious to new memory it could scavenge, and that it really *should*
in the face of a memory limit.

Also, every time some thread goes to scavenge in the runtime, it
reserves what could be a considerable amount of address space, hiding it
from other scavengers.

This change modifies and simplifies the implementation overall. It's
less code with complexities that are much better encapsulated. The
current implementation iterates optimistically over the address space
looking for memory to scavenge, keeping track of what it last saw. The
new implementation does the same, but instead of directly iterating over
pages, it iterates over chunks. It maintains an index of chunks (as a
bitmap over the address space) that indicate which chunks may contain
scavenge work. The page allocator populates this index, while scavengers
consume it and iterate over it optimistically.

This has two key benefits:
1. Scavenging is much simpler: find a candidate chunk, and check it,
   essentially just using the scavengeOne fast path. There's no need for
   the complexity of iterating beyond one chunk, because the index is
   lock-free and already maintains that information.
2. If pages are freed to the page allocator (always guaranteed to be
   unscavenged), the page allocator immediately notifies all scavengers
   of the new source of work, avoiding the hiding issues of the old
   implementation.

One downside of the new implementation, however, is that it's
potentially more expensive to find pages to scavenge. In the past, if
a single page would become free high up in the address space, the
runtime's scavengers would ignore it. Now that scavengers won't, one or
more scavengers may need to iterate potentially across the whole heap to
find the next source of work. For the background scavenger, this just
means a potentially less reactive scavenger -- overall it should still
use the same amount of CPU. It means worse overheads for memory limit
scavenging, but that's not exactly something with a baseline yet.

In practice, this hopefully shouldn't be too bad, since the chunk index
is extremely compact. For a 48-bit address space, the index is only 8
MiB in size at worst, but even just one physical page in the index is
able to support up to 128 GiB heaps, provided they aren't terribly
sparse. On 32-bit platforms, the index is only 128 bytes in size.
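
A minimal sketch of a lock-free chunk index as a bitmap over the address
space, as described above (the types and chunk numbering are illustrative;
the real index is more involved):

	package scavindexdemo

	import "sync/atomic"

	// chunkIndex has one bit per chunk; a set bit means the chunk may
	// contain scavengable pages. Publication and consumption are lock-free.
	type chunkIndex struct {
		words []atomic.Uint64
	}

	// markDirty is called by the page allocator when it frees pages into
	// chunk i, immediately advertising the new work to all scavengers.
	func (x *chunkIndex) markDirty(i int) {
		w := &x.words[i/64]
		bit := uint64(1) << (uint(i) % 64)
		for {
			old := w.Load()
			if old&bit != 0 || w.CompareAndSwap(old, old|bit) {
				return
			}
		}
	}

	// nextDirty returns the first chunk at or after start that may contain
	// scavenge work, or -1 if the scavenger must wait for more frees.
	func (x *chunkIndex) nextDirty(start int) int {
		for i := start; i < len(x.words)*64; i++ {
			if x.words[i/64].Load()&(uint64(1)<<(uint(i)%64)) != 0 {
				return i
			}
		}
		return -1
	}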

For #48409.

Change-Id: I72b7e74365046b18c64a6417224c5d85511194fb
Reviewed-on: https://go-review.googlesource.com/c/go/+/399474
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-03 15:13:53 +00:00
Michael Anthony Knyszek
b4d81147d8 runtime: make the scavenger and allocator respect the memory limit
This change does everything necessary to make the memory allocator and
the scavenger respect the memory limit. In particular, it:

- Adds a second goal for the background scavenge that's based on the
  memory limit, setting a target 5% below the limit to make sure it's
  working hard when the application is close to it.
- Makes span allocation assist the scavenger if the next allocation is
  about to put total memory use above the memory limit.
- Measures any scavenge assist time and adds it to GC assist time for
  the sake of GC CPU limiting, to avoid a death spiral as a result of
  scavenging too much.

All of these changes have a relatively small impact, but each is
intimately related and thus benefits from being done together.
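
A tiny sketch of the scavenge goal from the first point above (the 5%
headroom comes from this message; the function and its input are
illustrative):

	package memlimitdemo

	// scavengeGoal is the retained-memory target the background scavenger
	// works toward when a memory limit is set: 5% below the limit, so it is
	// already working hard before the application actually hits the limit.
	func scavengeGoal(memoryLimit int64) int64 {
		return memoryLimit - memoryLimit/20
	}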

For #48409.

Change-Id: I35517a752f74dd12a151dd620f102c77e095d3e8
Reviewed-on: https://go-review.googlesource.com/c/go/+/397017
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-03 15:13:45 +00:00
Michael Anthony Knyszek
375d696ddf runtime: move inconsistent memstats into gcController
Fundamentally, all of these memstats exist to serve the runtime in
managing memory. For the sake of simpler testing, couple these stats
more tightly with the GC.

This CL was mostly done automatically. The fields had to be moved
manually, but the references to the fields were updated via

    gofmt -w -r 'memstats.<field> -> gcController.<field>' *.go

For #48409.

Change-Id: Ic036e875c98138d9a11e1c35f8c61b784c376134
Reviewed-on: https://go-review.googlesource.com/c/go/+/397678
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-03 15:12:38 +00:00
Michael Anthony Knyszek
d36d5bd3c1 runtime: clean up inconsistent heap stats
The inconsistent heaps stats in memstats are a bit messy. Primarily,
heap_sys is non-orthogonal with heap_released and heap_inuse. In later
CLs, we're going to want heap_sys-heap_released-heap_inuse, so clean
this up by replacing heap_sys with an orthogonal metric: heapFree.
heapFree represents page heap memory that is free but not released.

I think this change also simplifies a lot of reasoning about these
stats; it's much clearer what they mean, and to obtain HeapSys for
memstats, we no longer need to do the strange subtraction from heap_sys
when allocating specifically non-heap memory from the page heap.

Because we're removing heap_sys, we need to replace it with a sysMemStat
for mem.go functions. In this case, heap_released is the most
appropriate because we increase it anyway (again, non-orthogonality). In
which case, it makes sense for heap_inuse, heap_released, and heapFree
to become more uniform, and to just represent them all as sysMemStats.

While we're here and messing with the types of heap_inuse and
heap_released, let's also fix their names (and last_heap_inuse's name)
up to the more modern Go convention of camelCase.
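
A compilable sketch of the orthogonal split this describes: heapInUse,
heapFree, and heapReleased partition page-heap memory, and the old heap_sys
aggregate becomes a derived quantity (the types and field names are
illustrative):

	package statsdemo

	// heapStats splits page-heap memory into three non-overlapping buckets.
	type heapStats struct {
		heapInUse    uint64 // memory in in-use spans
		heapFree     uint64 // free but not yet released to the OS
		heapReleased uint64 // released (scavenged) back to the OS
	}

	// heapSys reconstructs the old non-orthogonal aggregate for memstats:
	// everything the page heap has mapped, regardless of state.
	func (s heapStats) heapSys() uint64 {
		return s.heapInUse + s.heapFree + s.heapReleased
	}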

For #48409.

Change-Id: I87fcbf143b3e36b065c7faf9aa888d86bd11710b
Reviewed-on: https://go-review.googlesource.com/c/go/+/397677
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-03 15:12:31 +00:00
Michael Anthony Knyszek
4649a43903 runtime: track how much memory is mapped in the Ready state
This change adds a field to memstats called mappedReady that tracks how
much memory is in the Ready state at any given time. In essence, it's
the total memory usage by the Go runtime (with one exception, which is
documented): all memory mapped read/write that has either been paged in
or soon will be.

To make tracking this not involve the many different stats that track
mapped memory, we track this statistic at a very low level. The downside
of tracking this statistic at such a low level is that it managed to
catch lots of situations where the runtime wasn't fully accounting for
memory. This change rectifies these situations by always accounting for
memory that's mapped in some way (i.e. always passing a sysMemStat to a
mem.go function), with *three* exceptions.

Rectifying these situations means also having the memory mapped during
testing being accounted for, so that tests (i.e. ReadMemStats) that
ultimately check mappedReady continue to work correctly without special
exceptions. We choose to simply account for this memory in other_sys.

Let's talk about the exceptions. The first is that the arenas array for
finding heap arena metadata from an address is mapped as read/write in
one large chunk. It's tens of MiB in size. On systems with demand
paging, we assume that the whole thing isn't paged in at once (after
all, it maps to the whole address space, and it's exceedingly difficult
with today's technology to even approach having as much physical memory as
the total address space). On systems where we have to commit memory
manually, we use a two-level structure.

Now, the reason why this is an exception is because we have no mechanism
to track what memory is paged in, and we can't just account for the
entire thing, because that would *look* like an enormous overhead.
Furthermore, this structure is on a few really, really critical paths in
the runtime, so doing more explicit tracking isn't really an option. So,
we explicitly don't and call sysAllocOS to map this memory.

The second exception is that we call sysFree with no accounting to clean
up address space reservations, or otherwise to throw out mappings we
don't care about. In this case, also drop down to a lower level and call
sysFreeOS to explicitly avoid accounting.

The third exception is debuglog allocations. That is purely a debugging
facility and ideally we want it to have as small an impact on the
runtime as possible. If we include it in mappedReady calculations, it
could cause GC pacing shifts in future CLs, especially if one increases
the debuglog buffer sizes as a one-off.

As of this CL, these are the only three places in the runtime that would
pass nil for a stat to any of the functions in mem.go. As a result, this
CL makes sysMemStats mandatory to facilitate better accounting in the
future. It's now much easier to grep and find out where accounting is
explicitly elided, because one doesn't have to follow the trail of
sysMemStat nil pointer values, and can just look at the function name.
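
A hedged sketch of the mandatory-stat idea: mem.go-style helpers take a
non-nil accounting sink, and the only way to skip accounting is to call a
lower-level *OS variant explicitly (the types and names are illustrative):

	package memacctdemo

	import "sync/atomic"

	// sysMemStat is an accounting sink updated whenever memory changes state.
	type sysMemStat struct{ v atomic.Uint64 }

	func (s *sysMemStat) add(n int64) { s.v.Add(uint64(n)) } // n may be negative

	// sysMapAccounted stands in for the accounted mapping path: stat must be
	// non-nil, so elided accounting has to be spelled out by calling
	// sysMapOS directly instead of passing nil here.
	func sysMapAccounted(n uintptr, stat *sysMemStat) {
		stat.add(int64(n))
		sysMapOS(n)
	}

	func sysMapOS(n uintptr) { /* the unaccounted, low-level mapping */ }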

For #48409.

Change-Id: I274eb467fc2603881717482214fddc47c9eaf218
Reviewed-on: https://go-review.googlesource.com/c/go/+/393402
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-05-03 15:12:21 +00:00
Michael Anthony Knyszek
79db59ded9 runtime: make alloc count metrics truly monotonic
Right now we export alloc count metrics via the runtime/metrics package
and mark them as monotonic, but that's not actually true. As an
optimization, the runtime assumes a span is always fully allocated
before being uncached, and updates the accounting as such. In the rare
case that it's wrong, the span has enough information to back out what
did not get allocated.

This change uses 16 bits of padding in the mspan to house another field
that represents the number of mspan slots filled just as the mspan is
cached. This information is enough to get an exact count, allowing us
to make the metrics truly monotonic.
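
A small sketch of the snapshot-on-cache bookkeeping described here (the
field names are illustrative, not the actual mspan fields):

	package allocountdemo

	// spanCounts carries the extra 16-bit snapshot taken when a span is cached.
	type spanCounts struct {
		nelems           uint16 // total object slots in the span
		allocCount       uint16 // slots allocated right now
		allocCountCached uint16 // allocCount recorded at the moment of caching
	}

	// allocsWhileCached is the exact number of allocations to credit when the
	// span is uncached, instead of optimistically assuming the span filled up
	// (which would be nelems - allocCountCached).
	func (s *spanCounts) allocsWhileCached() uint16 {
		return s.allocCount - s.allocCountCached
	}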

Change-Id: Iaff3ca43f8745dc1bbb0232372423e014b89b920
Reviewed-on: https://go-review.googlesource.com/c/go/+/377516
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-04-26 22:08:00 +00:00
Russ Cox
19309779ac all: gofmt main repo
[This CL is part of a sequence implementing the proposal #51082.
The design doc is at https://go.dev/s/godocfmt-design.]

Run the updated gofmt, which reformats doc comments,
on the main repository. Vendored files are excluded.

For #51082.

Change-Id: I7332f099b60f716295fb34719c98c04eb1a85407
Reviewed-on: https://go-review.googlesource.com/c/go/+/384268
Reviewed-by: Jonathan Amsterdam <jba@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2022-04-11 16:34:30 +00:00
Russ Cox
9839668b56 all: separate doc comment from //go: directives
A future change to gofmt will rewrite

	// Doc comment.
	//go:foo

to

	// Doc comment.
	//
	//go:foo

Apply that change preemptively to all comments (not necessarily just doc comments).

For #51082.

Change-Id: Iffe0285418d1e79d34526af3520b415a12203ca9
Reviewed-on: https://go-review.googlesource.com/c/go/+/384260
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-04-05 17:54:15 +00:00
Russ Cox
7d87ccc860 all: fix various doc comment formatting nits
A run of lines that are indented with any number of spaces or tabs
format as a <pre> block. This commit fixes various doc comments
that format badly according to that (standard) rule.

For example, consider:

	// - List item.
	//   Second line.
	// - Another item.

Because the - lines are unindented, this is actually two paragraphs
separated by a one-line <pre> block. This CL rewrites it to:

	//  - List item.
	//    Second line.
	//  - Another item.

Today, that will format as a single <pre> block.
In a future release, we hope to format it as a bulleted list.

Various other minor fixes as well, all in preparation for reformatting.

For #51082.

Change-Id: I95cf06040d4186830e571cd50148be3bf8daf189
Reviewed-on: https://go-review.googlesource.com/c/go/+/384257
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-04-01 18:18:01 +00:00
Michael Anthony Knyszek
4f543b59c5 runtime: don't hold the heap lock while scavenging
This change modifies the scavenger to no longer hold the heap lock while
actively scavenging pages. To achieve this, the change also:
* Reverses the locking behavior of the (*pageAlloc).scavenge API, to
  only acquire the heap lock when necessary.
* Introduces a new lock on the scavenger-related fields in a pageAlloc
  so that access to those fields doesn't require the heap lock. There
  are a few places in the scavenge path, notably reservation, that
  requires synchronization. The heap lock is far too heavy handed for
  this case.
* Changes the scavenger to mark pages that are actively being scavenged
  as allocated, and "frees" them back to the page allocator the usual
  way.
* Lifts the heap-growth scavenging code out of mheap.grow, where the
  heap lock is held, and into allocSpan, just after the lock is
  released. Releasing the lock during mheap.grow is not feasible if we
  want to ensure that allocation always makes progress (post-growth,
  another allocator could come in and take all that space, forcing the
  goroutine that just grew the heap to do so again).

This change means that the scavenger now must do more work for each
scavenge, but it is also now much more scalable. Although in theory
always taking the locked paths in the page allocator isn't great, the
change takes advantage of some properties of the allocator:
* Most of the time, the scavenger will be working with one page at a
  time. The page allocator's locked path is optimized for this case.
* On the allocation path, it doesn't need to do the find operation at
  all; it can go straight to setting bits for the range and updating the
  summary structure.

Change-Id: Ie941d5e7c05dcc96476795c63fef74bcafc2a0f1
Reviewed-on: https://go-review.googlesource.com/c/go/+/353974
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2021-11-05 17:46:27 +00:00
Michael Anthony Knyszek
fc5e8cd6c9 runtime: update and access scavengeGoal atomically
The first step toward acquiring the heap lock less frequently in the
scavenger.

Change-Id: Idc69fd8602be2c83268c155951230d60e20b42fe
Reviewed-on: https://go-review.googlesource.com/c/go/+/353973
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2021-11-04 20:01:11 +00:00
fanzha02
6f327f7b88 runtime, syscall: add calls to asan functions
Add explicit address sanitizer instrumentation to the runtime and
syscall packages. The compiler does not instrument the runtime
package. It does instrument the syscall package, but we need to add
a couple of cases that it can't see.

Following the implementation of the asan malloc runtime library,
this patch also allocates extra memory as a redzone around the
returned memory region, and marks the redzone as unaddressable to
detect overflows or underflows.
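
A heavily hedged sketch of the redzone idea only (redZoneBytes, asanPoison,
asanUnpoison, and rawAlloc are hypothetical stand-ins, not the actual
runtime hooks or sizing rules):

	package asandemo

	const redZoneBytes = 16 // hypothetical fixed size; the real size depends on the allocation

	// asanPoison and asanUnpoison are stand-ins for the runtime's ASan hooks.
	func asanPoison(addr, n uintptr)   {}
	func asanUnpoison(addr, n uintptr) {}

	// allocWithRedZone over-allocates, hands back only the user portion, and
	// poisons the trailing redzone so out-of-bounds accesses are reported.
	func allocWithRedZone(userSize uintptr, rawAlloc func(uintptr) uintptr) uintptr {
		base := rawAlloc(userSize + redZoneBytes)
		asanUnpoison(base, userSize)
		asanPoison(base+userSize, redZoneBytes)
		return base
	}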

Updates #44853.

Change-Id: I2753d1cc1296935a66bf521e31ce91e35fcdf798
Reviewed-on: https://go-review.googlesource.com/c/go/+/298614
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: fannie zhang <Fannie.Zhang@arm.com>
2021-11-02 05:35:11 +00:00
Michael Anthony Knyszek
d8fc7f785e runtime: clean up allocation zeroing
Currently, the runtime zeroes allocations in several ways. First, small
object spans are always zeroed if they come from mheap, and their slots
are zeroed later in mallocgc if needed. Second, large object spans
(objects that have their own spans) plumb the need for zeroing down into
mheap. Thirdly, large objects that have no pointers have their zeroing
delayed until after preemption is reenabled, but before returning in
mallocgc.

All of this has two consequences:
1. Spans for small objects that come from mheap are sometimes
   unnecessarily zeroed, even if the mallocgc call that created them
   doesn't need the object slot to be zeroed.
2. This is all messy and difficult to reason about.

This CL simplifies this code, resolving both (1) and (2). First, it
recognizes that zeroing in mheap is unnecessary for small object spans;
mallocgc and its callees in mcache and mcentral, by design, are *always*
able to deal with non-zeroed spans. They must, for they deal with
recycled spans all the time. Once this fact is made clear, the only
remaining use of zeroing in mheap is for large objects.

As a result, this CL lifts mheap zeroing for large objects into
mallocgc, to parallel all the other codepaths in mallocgc. This makes
the large object allocation code less surprising.

Next, this CL sets the flag for the delayed zeroing explicitly in the one
case where it matters, and inverts and renames the flag from isZeroed to
delayZeroing.

Finally, it adds a check to make sure that only pointer-free allocations
take the delayed zeroing codepath, as an extra safety measure.
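
A hedged sketch of the resulting control flow in one place (allocLargeSpan,
zero, and the preemption comment are illustrative; only the delayZeroing
flag and the pointer-free restriction come from this message):

	package zerodemo

	// mallocLargeSketch shows the shape of the large-object path: zero before
	// the object is published unless the allocation is pointer-free, in which
	// case zeroing is delayed until after preemption is re-enabled.
	func mallocLargeSketch(size uintptr, needZero, noScan bool) []byte {
		mem := allocLargeSpan(size) // may hand back non-zeroed, recycled memory

		delayZeroing := needZero && noScan // only pointer-free data may be delayed
		if needZero && !delayZeroing {
			zero(mem) // must be clear before the GC can see pointers in it
		}

		// ... preemption is re-enabled here ...

		if delayZeroing {
			zero(mem) // safe: no pointers, so the GC never inspects the contents
		}
		return mem
	}

	func allocLargeSpan(size uintptr) []byte { return make([]byte, size) } // stand-in

	func zero(b []byte) {
		for i := range b {
			b[i] = 0
		}
	}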

Benchmark results: https://perf.golang.org/search?q=upload:20211028.8

Inspired by tapir.liu@gmail.com's CL 343470.

Change-Id: I7e1296adc19ce8a02c8d93a0a5082aefb2673e8f
Reviewed-on: https://go-review.googlesource.com/c/go/+/359477
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
2021-10-29 17:44:15 +00:00
Michael Anthony Knyszek
3aecb3a8f7 runtime: fix sweep termination condition
Currently, there is a chance that the sweep termination condition could
flap, causing e.g. runtime.GC to return before all sweep work has not
only been drained, but also completed. CL 307915 and CL 307916 attempted
to fix this problem, but it is still possible that mheap_.sweepDrained is
marked before any outstanding sweepers are accounted for in
mheap_.sweepers, leaving a window in which a thread could observe
isSweepDone as true before it actually was (and after some time it would
revert to false, then true again, depending on the number of outstanding
sweepers at that point).

This change fixes the sweep termination condition by merging
mheap_.sweepers and mheap_.sweepDrained into a single atomic value.

This value is updated such that a new potential sweeper will increment
the outstanding sweeper count iff there are still outstanding spans to be
swept without an outstanding sweeper to pick them up. This design
simplifies the sweep termination condition into a single atomic load and
comparison and ensures the condition never flaps.
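
A compilable sketch of merging a drained flag and a sweeper count into one
atomic word, so the done check is a single load-and-compare (the bit layout
and names are illustrative):

	package sweepdemo

	import "sync/atomic"

	const drainedBit = 1 << 31 // top bit: no more spans left to sweep

	// sweepState packs the drained flag with the outstanding-sweeper count.
	type sweepState struct{ state atomic.Uint32 }

	// begin registers a new sweeper iff there may still be spans to pick up.
	func (a *sweepState) begin() bool {
		for {
			s := a.state.Load()
			if s&drainedBit != 0 {
				return false // drained: nothing left, don't become a sweeper
			}
			if a.state.CompareAndSwap(s, s+1) {
				return true
			}
		}
	}

	// end retires a sweeper.
	func (a *sweepState) end() { a.state.Add(^uint32(0)) }

	// markDrained records that the sweep list is empty.
	func (a *sweepState) markDrained() {
		for {
			s := a.state.Load()
			if a.state.CompareAndSwap(s, s|drainedBit) {
				return
			}
		}
	}

	// isDone is a single atomic load: drained and zero outstanding sweepers.
	// Once true, begin refuses new sweepers, so the condition cannot flap.
	func (a *sweepState) isDone() bool {
		return a.state.Load() == drainedBit
	}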

Updates #46500.
Fixes #45315.

Change-Id: I6d69aff156b8d48428c4cc8cfdbf28be346dbf04
Reviewed-on: https://go-review.googlesource.com/c/go/+/333389
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2021-10-29 17:12:47 +00:00
Michael Anthony Knyszek
016d5eea11 runtime: retype mheap.reclaimCredit as atomic.Uintptr
[git-generate]
cd src/runtime
mv export_test.go export.go
GOROOT=$(dirname $(dirname $PWD)) rf '
  add mheap.reclaimCredit \
// reclaimCredit is spare credit for extra pages swept. Since \
// the page reclaimer works in large chunks, it may reclaim \
// more than requested. Any spare pages released go to this \
// credit pool. \
reclaimCredit_ atomic.Uintptr
  ex {
    import "runtime/internal/atomic"

    var t mheap
    var v, w uintptr
    var d uintptr

    t.reclaimCredit -> t.reclaimCredit_.Load()
    t.reclaimCredit = v -> t.reclaimCredit_.Store(v)
    atomic.Loaduintptr(&t.reclaimCredit) -> t.reclaimCredit_.Load()
    atomic.LoadAcquintptr(&t.reclaimCredit) -> t.reclaimCredit_.LoadAcquire()
    atomic.Storeuintptr(&t.reclaimCredit, v) -> t.reclaimCredit_.Store(v)
    atomic.StoreReluintptr(&t.reclaimCredit, v) -> t.reclaimCredit_.StoreRelease(v)
    atomic.Casuintptr(&t.reclaimCredit, v, w) -> t.reclaimCredit_.CompareAndSwap(v, w)
    atomic.Xchguintptr(&t.reclaimCredit, v) -> t.reclaimCredit_.Swap(v)
    atomic.Xadduintptr(&t.reclaimCredit, d) -> t.reclaimCredit_.Add(d)
  }
  rm mheap.reclaimCredit
  mv mheap.reclaimCredit_ mheap.reclaimCredit
'
mv export.go export_test.go

Change-Id: I2c567781a28f5d8c2275ff18f2cf605b82f22d09
Reviewed-on: https://go-review.googlesource.com/c/go/+/356712
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2021-10-20 20:39:36 +00:00
Michael Anthony Knyszek
a91e976fd2 runtime: retype mheap.reclaimIndex as atomic.Uint64
[git-generate]
cd src/runtime
mv export_test.go export.go
GOROOT=$(dirname $(dirname $PWD)) rf '
  add mheap.reclaimIndex \
// reclaimIndex is the page index in allArenas of next page to \
// reclaim. Specifically, it refers to page (i % \
// pagesPerArena) of arena allArenas[i / pagesPerArena]. \
// \
// If this is >= 1<<63, the page reclaimer is done scanning \
// the page marks. \
reclaimIndex_ atomic.Uint64
  ex {
    import "runtime/internal/atomic"

    var t mheap
    var v, w uint64
    var d int64

    t.reclaimIndex -> t.reclaimIndex_.Load()
    t.reclaimIndex = v -> t.reclaimIndex_.Store(v)
    atomic.Load64(&t.reclaimIndex) -> t.reclaimIndex_.Load()
    atomic.LoadAcq64(&t.reclaimIndex) -> t.reclaimIndex_.LoadAcquire()
    atomic.Store64(&t.reclaimIndex, v) -> t.reclaimIndex_.Store(v)
    atomic.StoreRel64(&t.reclaimIndex, v) -> t.reclaimIndex_.StoreRelease(v)
    atomic.Cas64(&t.reclaimIndex, v, w) -> t.reclaimIndex_.CompareAndSwap(v, w)
    atomic.Xchg64(&t.reclaimIndex, v) -> t.reclaimIndex_.Swap(v)
    atomic.Xadd64(&t.reclaimIndex, d) -> t.reclaimIndex_.Add(d)
  }
  rm mheap.reclaimIndex
  mv mheap.reclaimIndex_ mheap.reclaimIndex
'
mv export.go export_test.go

Change-Id: I1d619e3ac032285b5f7eb6c563a5188c8e36d089
Reviewed-on: https://go-review.googlesource.com/c/go/+/356711
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
2021-10-20 20:39:33 +00:00
Michael Anthony Knyszek
1dff8f0a05 runtime: retype mheap.pagesSweptBasis as atomic.Uint64
[git-generate]
cd src/runtime
mv export_test.go export.go
GOROOT=$(dirname $(dirname $PWD)) rf '
  add mheap.pagesSweptBasis pagesSweptBasis_ atomic.Uint64 // pagesSwept to use as the origin of the sweep ratio
  ex {
    import "runtime/internal/atomic"

    var t mheap
    var v, w uint64
    var d int64

    t.pagesSweptBasis -> t.pagesSweptBasis_.Load()
    t.pagesSweptBasis = v -> t.pagesSweptBasis_.Store(v)
    atomic.Load64(&t.pagesSweptBasis) -> t.pagesSweptBasis_.Load()
    atomic.LoadAcq64(&t.pagesSweptBasis) -> t.pagesSweptBasis_.LoadAcquire()
    atomic.Store64(&t.pagesSweptBasis, v) -> t.pagesSweptBasis_.Store(v)
    atomic.StoreRel64(&t.pagesSweptBasis, v) -> t.pagesSweptBasis_.StoreRelease(v)
    atomic.Cas64(&t.pagesSweptBasis, v, w) -> t.pagesSweptBasis_.CompareAndSwap(v, w)
    atomic.Xchg64(&t.pagesSweptBasis, v) -> t.pagesSweptBasis_.Swap(v)
    atomic.Xadd64(&t.pagesSweptBasis, d) -> t.pagesSweptBasis_.Add(d)
  }
  rm mheap.pagesSweptBasis
  mv mheap.pagesSweptBasis_ mheap.pagesSweptBasis
'
mv export.go export_test.go

Change-Id: Id9438184b9bd06d96894c02376385bad45dee154
Reviewed-on: https://go-review.googlesource.com/c/go/+/356710
Reviewed-by: Austin Clements <austin@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
2021-10-20 20:39:29 +00:00
Michael Anthony Knyszek
e90492882a runtime: retype mheap.pagesSwept as atomic.Uint64
[git-generate]
cd src/runtime
mv export_test.go export.go
GOROOT=$(dirname $(dirname $PWD)) rf '
  add mheap.pagesSwept pagesSwept_ atomic.Uint64 // pages swept this cycle
  ex {
    import "runtime/internal/atomic"

    var t mheap
    var v, w uint64
    var d int64

    t.pagesSwept -> t.pagesSwept_.Load()
    t.pagesSwept = v -> t.pagesSwept_.Store(v)
    atomic.Load64(&t.pagesSwept) -> t.pagesSwept_.Load()
    atomic.LoadAcq64(&t.pagesSwept) -> t.pagesSwept_.LoadAcquire()
    atomic.Store64(&t.pagesSwept, v) -> t.pagesSwept_.Store(v)
    atomic.StoreRel64(&t.pagesSwept, v) -> t.pagesSwept_.StoreRelease(v)
    atomic.Cas64(&t.pagesSwept, v, w) -> t.pagesSwept_.CompareAndSwap(v, w)
    atomic.Xchg64(&t.pagesSwept, v) -> t.pagesSwept_.Swap(v)
    atomic.Xadd64(&t.pagesSwept, d) -> t.pagesSwept_.Add(d)
  }
  rm mheap.pagesSwept
  mv mheap.pagesSwept_ mheap.pagesSwept
'
mv export.go export_test.go

Change-Id: Ife99893d90a339655f604bc3a64ee3decec645ea
Reviewed-on: https://go-review.googlesource.com/c/go/+/356709
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2021-10-20 20:39:25 +00:00
Michael Anthony Knyszek
d419a80bc7 runtime: retype mheap.pagesInUse as atomic.Uint64
[git-generate]
cd src/runtime
mv export_test.go export.go
GOROOT=$(dirname $(dirname $PWD)) rf '
  add mheap.pagesInUse \
// Proportional sweep \
// \
// These parameters represent a linear function from gcController.heapLive \
// to page sweep count. The proportional sweep system works to \
// stay in the black by keeping the current page sweep count \
// above this line at the current gcController.heapLive. \
// \
// The line has slope sweepPagesPerByte and passes through a \
// basis point at (sweepHeapLiveBasis, pagesSweptBasis). At \
// any given time, the system is at (gcController.heapLive, \
// pagesSwept) in this space. \
// \
// It is important that the line pass through a point we \
// control rather than simply starting at a 0,0 origin \
// because that lets us adjust sweep pacing at any time while \
// accounting for current progress. If we could only adjust \
// the slope, it would create a discontinuity in debt if any \
// progress has already been made. \
pagesInUse_ atomic.Uint64 // pages of spans in stats mSpanInUse
  ex {
    import "runtime/internal/atomic"

    var t mheap
    var v, w uint64
    var d int64

    t.pagesInUse -> t.pagesInUse_.Load()
    t.pagesInUse = v -> t.pagesInUse_.Store(v)
    atomic.Load64(&t.pagesInUse) -> t.pagesInUse_.Load()
    atomic.LoadAcq64(&t.pagesInUse) -> t.pagesInUse_.LoadAcquire()
    atomic.Store64(&t.pagesInUse, v) -> t.pagesInUse_.Store(v)
    atomic.StoreRel64(&t.pagesInUse, v) -> t.pagesInUse_.StoreRelease(v)
    atomic.Cas64(&t.pagesInUse, v, w) -> t.pagesInUse_.CompareAndSwap(v, w)
    atomic.Xchg64(&t.pagesInUse, v) -> t.pagesInUse_.Swap(v)
    atomic.Xadd64(&t.pagesInUse, d) -> t.pagesInUse_.Add(d)
  }
  rm mheap.pagesInUse
  mv mheap.pagesInUse_ mheap.pagesInUse
'
mv export.go export_test.go

Change-Id: I495d188683dba0778518563c46755b5ad43be298
Reviewed-on: https://go-review.googlesource.com/c/go/+/356549
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2021-10-20 20:39:19 +00:00
Michael Anthony Knyszek
9c58e399a4 [dev.typeparams] runtime: fix import sort order [generated]
[git-generate]
cd src/runtime
goimports -w *.go

Change-Id: I1387af0f2fd1a213dc2f4c122e83a8db0fcb15f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/329189
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2021-06-17 20:42:23 +00:00
Michael Anthony Knyszek
6d85891b29 [dev.typeparams] runtime: replace uses of runtime/internal/sys.PtrSize with internal/goarch.PtrSize [generated]
[git-generate]
cd src/runtime/internal/math
gofmt -w -r "sys.PtrSize -> goarch.PtrSize" .
goimports -w *.go
cd ../..
gofmt -w -r "sys.PtrSize -> goarch.PtrSize" .
goimports -w *.go

Change-Id: I43491cdd54d2e06d4d04152b3d213851b7d6d423
Reviewed-on: https://go-review.googlesource.com/c/go/+/328337
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
2021-06-17 18:54:48 +00:00
David Chase
8a4b7294af cmd/compile: fix possible nil deref added in CL 270943
In the event allocSpan returned a nil, this would crash.
Cleaned up the code and comments slightly, too.

Change-Id: I6231d4b4c14218e6956b4a97a205adc3206f59ec
Reviewed-on: https://go-review.googlesource.com/c/go/+/316429
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2021-05-03 18:17:42 +00:00
David Chase
0bbfc5c31e runtime: break up large calls to memclrNoHeapPointers to allow preemption
If something "huge" is allocated, and the zeroing is trivial (no pointers
involved), then zero it by chunks in a loop so that preemption can occur,
not all in a single non-preemptible call.

Benchmarking suggests that 256K is the best chunk size.
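
A hedged sketch of the chunked-clear shape (the 256 KiB chunk size is from
this message; the zeroing loop and maybePreempt stand in for
memclrNoHeapPointers and the runtime's preemption check):

	package memclrdemo

	const chunkBytes = 256 << 10 // 256 KiB per chunk, per the benchmarking above

	// clearChunked zeroes b in chunks, giving the scheduler a chance to
	// preempt between chunks instead of one long non-preemptible clear.
	func clearChunked(b []byte, maybePreempt func()) {
		for len(b) > 0 {
			n := len(b)
			if n > chunkBytes {
				n = chunkBytes
			}
			for i := range b[:n] { // stand-in for the real memclrNoHeapPointers
				b[i] = 0
			}
			b = b[n:]
			maybePreempt() // a safepoint between chunks
		}
	}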

Updates #42642.

Change-Id: I94015e467eaa098c59870e479d6d83bc88efbfb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/270943
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2021-04-30 19:41:02 +00:00