Commit graph

26 commits

Author SHA1 Message Date
Michael Anthony Knyszek
b7c28f484d runtime/metrics: add CPU stats
This changes adds a breakdown for estimated CPU usage by time. These
estimates are not based on real on-CPU counters, so each metric has a
disclaimer explaining so. They can, however, be more reasonably
compared to a total CPU time metric that this change also adds.

Fixes #47216.

Change-Id: I125006526be9f8e0d609200e193da5a78d9935be
Reviewed-on: https://go-review.googlesource.com/c/go/+/404307
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Josh MacDonald <jmacd@lightstep.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-09-16 16:32:20 +00:00
Michael Anthony Knyszek
87eda2a782 runtime: shrink time histogram buckets
There are lots of useless buckets with too much precision. Introduce a
minimum level of precision with a minimum bucket bit. This cuts down on
the size of a time histogram dramatically (~3x). Also, pick a smaller
sub bucket count; we don't need 6% precision.

Also, rename super-buckets to buckets to more closely line up with HDR
histogram literature.

Change-Id: I199449650e4b34f2a6dca3cf1d8edb071c6655c0
Reviewed-on: https://go-review.googlesource.com/c/go/+/427615
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2022-09-16 16:32:01 +00:00
Michael Pratt
b04e4637db runtime: convert timeHistogram to atomic types
I've dropped the note that sched.timeToRun is protected by sched.lock,
as it does not seem to be true.

For #53821.

Change-Id: I03f8dc6ca0bcd4ccf3ec113010a0aa39c6f7d6ef
Reviewed-on: https://go-review.googlesource.com/c/go/+/419449
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Pratt <mpratt@google.com>
2022-08-12 01:53:11 +00:00
Michael Pratt
d6481d5b96 runtime: add race annotations to metricsSema
metricsSema protects the metrics map. The map implementation is race
instrumented regardless of which package is it called from.

semacquire/semrelease are not automatically race instrumented, so we can
trigger race false positives without manually annotating our lock
acquire and release.

See similar instrumentation on trace.shutdownSema and reflectOffs.lock.

Fixes #53542.

Change-Id: Ia3fd239ac860e037d09c7cb9c4ad267391e70705
Reviewed-on: https://go-review.googlesource.com/c/go/+/414517
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
2022-06-29 16:30:19 +00:00
Michael Anthony Knyszek
cfccb5cb7c runtime/metrics: add the last GC cycle that had the limiter enabled
This metric exports the the last GC cycle index that the GC limiter was
enabled. This metric is useful for debugging and identifying the root
cause of OOMs, especially when SetMemoryLimit is in use.

For #48409.

Change-Id: Ic6383b19e88058366a74f6ede1683b8ffb30a69c
Reviewed-on: https://go-review.googlesource.com/c/go/+/403614
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-05-13 20:45:19 +00:00
Michael Anthony Knyszek
9bd6e2776f runtime/metrics: add the number of Go-to-C calls
For #47216.

Change-Id: I1c2cd518e6ff510cc3ac8d8f72fd52eadcabc16c
Reviewed-on: https://go-review.googlesource.com/c/go/+/404306
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-05-13 20:45:09 +00:00
Michael Anthony Knyszek
ece6ac4d4d runtime/metrics: add gomaxprocs metric
For #47216.

Change-Id: Ib2d48c4583570a2dae9510a52d4c6ffc20161b31
Reviewed-on: https://go-review.googlesource.com/c/go/+/404305
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
2022-05-13 16:02:59 +00:00
Keith Randall
016d755213 runtime: measure stack usage; start stacks larger if needed
Measure the average stack size used by goroutines at every GC. When
starting a new goroutine, allocate an initial goroutine stack of that
average size. Intuition is that we'll waste at most 2x in stack space
because only half the goroutines can be below average. In turn, we
avoid some of the early stack growth / copying needed in the average
case.

More details in the design doc at: https://docs.google.com/document/d/1YDlGIdVTPnmUiTAavlZxBI1d9pwGQgZT7IKFKlIXohQ/edit?usp=sharing

name        old time/op  new time/op  delta
Issue18138  95.3µs ± 0%  67.3µs ±13%  -29.35%  (p=0.000 n=9+10)

Fixes #18138

Change-Id: Iba34d22ed04279da7e718bbd569bbf2734922eaa
Reviewed-on: https://go-review.googlesource.com/c/go/+/345889
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
2022-05-12 22:32:42 +00:00
Michael Anthony Knyszek
7c404d59db runtime: store consistent total allocation stats as uint64
Currently the consistent total allocation stats are managed as uintptrs,
which means they can easily overflow on 32-bit systems. Fix this by
storing these stats as uint64s. This will cause some minor performance
degradation on 32-bit systems, but there really isn't a way around this,
and it affects the correctness of the metrics we export.

Fixes #52680.

Change-Id: I7e6ca44047d46b4bd91c6f87c2d29f730e0d6191
Reviewed-on: https://go-review.googlesource.com/c/go/+/403758
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2022-05-03 19:58:15 +00:00
Michael Anthony Knyszek
129dcb7226 runtime: check the heap goal and trigger dynamically
As it stands, the heap goal and the trigger are set once by
gcController.commit, and then read out of gcController. However with the
coming memory limit we need the GC to be able to respond to changes in
non-heap memory. The simplest way of achieving this is to compute the
heap goal and its associated trigger dynamically.

In order to make this easier to implement, the GC trigger is now based
on the heap goal, as opposed to the status quo of computing both
simultaneously. In many cases we just want the heap goal anyway, not
both, but we definitely need the goal to compute the trigger, because
the trigger's bounds are entirely based on the goal (the initial runway
is not). A consequence of this is that we can't rely on the trigger to
enforce a minimum heap size anymore, and we need to lift that up
directly to the goal. Specifically, we need to lift up any part of the
calculation that *could* put the trigger ahead of the goal. Luckily this
is just the heap minimum and minimum sweep distance. In the first case,
the pacer may behave slightly differently, as the heap minimum is no
longer the minimum trigger, but the actual minimum heap goal. In the
second case it should be the same, as we ensure the additional runway
for sweeping is added to both the goal *and* the trigger, as before, by
computing that in gcControllerState.commit.

There's also another place we update the heap goal: if a GC starts and
we triggered beyond the goal, we always ensure there's some runway.
That calculation uses the current trigger, which violates the rule of
keeping the goal based on the trigger. Notice, however, that using the
precomputed trigger for this isn't even quite correct: due to a bug, or
something else, we might trigger a GC beyond the precomputed trigger.

So this change also adds a "triggered" field to gcControllerState that
tracks the point at which a GC actually triggered. This is independent
of the precomputed trigger, so it's fine for the heap goal calculation
to rely on it. It also turns out, there's more than just that one place
where we really should be using the actual trigger point, so this change
fixes those up too.

Also, because the heap minimum is set by the goal and not the trigger,
the maximum trigger calculation now happens *after* the goal is set, so
the maximum trigger actually does what I originally intended (and what
the comment says): at small heaps, the pacer picks 95% of the runway as
the maximum trigger. Currently, the pacer picks a small trigger based
on a not-yet-rounded-up heap goal, so the trigger gets rounded up to the
goal, and as per the "ensure there's some runway" check, the runway ends
up at always being 64 KiB. That check is supposed to be for exceptional
circumstances, not the status quo. There's a test introduced in the last
CL that needs to be updated to accomodate this slight change in
behavior.

So, this all sounds like a lot that changed, but what we're talking about
here are really, really tight corner cases that arise from situations
outside of our control, like pathologically bad behavior on the part of
an OS or CPU. Even in these corner cases, it's very unlikely that users
will notice any difference at all. What's more important, I think, is
that the pacer behaves more closely to what all the comments describe,
and what the original intent was.

Another note: at first, one might think that computing the heap goal and
trigger dynamically introduces some raciness, but not in this CL: the heap
goal and trigger are completely static.

Allocation outside of a GC cycle may now be a bit slower than before, as
the GC trigger check is now significantly more complex. However, note
that this executes basically just as often as gcController.revise, and
that makes up for a vanishingly small part of any CPU profile. The next
CL cleans up the floating point multiplications on this path
nonetheless, just to be safe.

For #48409.

Change-Id: I280f5ad607a86756d33fb8449ad08555cbee93f9
Reviewed-on: https://go-review.googlesource.com/c/go/+/397014
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-05-03 15:12:52 +00:00
Michael Anthony Knyszek
897baae953 runtime/metrics: add additional allocation metrics
This change adds four additional metrics to the runtime/metrics package
to fill in a few gaps with runtime.MemStats that were overlooked. The
biggest one is TotalAlloc, which is impossible to find with the
runtime/metrics package, but also add a few others for convenience and
clarity. For instance, the total number of objects allocated and freed
are technically available via allocs-by-size and frees-by-size, but it's
onerous to get them (one needs to sum the sample counts in the
histograms).

The four additional metrics are:
- /gc/heap/allocs:bytes   -- total bytes allocated (TotalAlloc)
- /gc/heap/allocs:objects -- total objects allocated (Mallocs - [tiny])
- /gc/heap/frees:bytes    -- total bytes frees (TotalAlloc-HeapAlloc)
- /gc/heap/frees:objects  -- total objects freed (Frees - [tiny])

This change also updates the descriptions of allocs-by-size and
frees-by-size to be more precise.

Change-Id: Iec8c1797a584491e3484b198f2e7f325b68954a7
Reviewed-on: https://go-review.googlesource.com/c/go/+/312431
Reviewed-by: Michael Pratt <mpratt@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2021-04-29 21:54:05 +00:00
Michael Anthony Knyszek
0b9ca4d907 runtime/metrics: add tiny allocs metric
Currently tiny allocations are not represented in either MemStats or
runtime/metrics, but they're represented in MemStats (indirectly) via
Mallocs. Add them to runtime/metrics by first merging
memstats.tinyallocs into consistentHeapStats (just for simplicity; it's
monotonic so metrics would still be self-consistent if we just read it
atomically) and then adding /gc/heap/tiny/allocs:objects to the list of
supported metrics.

Change-Id: Ie478006ab942a3e877b4a79065ffa43569722f3d
Reviewed-on: https://go-review.googlesource.com/c/go/+/312909
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2021-04-27 13:59:22 +00:00
Michael Pratt
bedfeed54a runtime,runtime/metrics: add metric to track scheduling latencies
This change adds a metric to track scheduling latencies, defined as the
cumulative amount of time a goroutine spends being runnable before
running again. The metric is an approximations and samples instead of
trying to record every goroutine scheduling latency.

This change was primarily authored by mknyszek@google.com.

Change-Id: Ie0be7e6e7be421572eb2317d3dd8dd6f3d6aa152
Reviewed-on: https://go-review.googlesource.com/c/go/+/308933
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2021-04-23 13:48:10 +00:00
Michael Anthony Knyszek
3eaf75c13a runtime: move next_gc and last_next_gc into gcControllerState
This change moves next_gc and last_next_gc into gcControllerState under
the names heapGoal and lastHeapGoal respectively. These are
fundamentally GC pacer related values, and so it makes sense for them to
live here.

Partially generated by

rf '
    ex . {
	memstats.next_gc -> gcController.heapGoal
	memstats.last_next_gc -> gcController.lastHeapGoal
    }
'

except for updates to comments and gcControllerState methods, where
they're accessed through the receiver, and trace-related renames of
NextGC -> HeapGoal, while we're here.

For #44167.

Change-Id: I1e871ad78a57b01be8d9f71bd662530c84853bed
Reviewed-on: https://go-review.googlesource.com/c/go/+/306603
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
2021-04-14 14:03:30 +00:00
John Bampton
6ba4a300d8 docs: fix spelling
Change-Id: Ib689e5793d9cb372e759c4f34af71f004010c822
GitHub-Last-Rev: d63798388e
GitHub-Pull-Request: golang/go#44259
Reviewed-on: https://go-review.googlesource.com/c/go/+/291949
Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: Matthew Dempsky <mdempsky@google.com>
Trust: Robert Griesemer <gri@golang.org>
2021-02-24 04:11:43 +00:00
Michael Anthony Knyszek
32afcc9436 runtime/metrics: change unit on *-by-size metrics to match bucket unit
This change modifies the *-by-size metrics' units to be based off the
bucket's unit (bytes) as opposed to the unit of the counts (objects).
This convention is more in-line with distributions in other metrics
systems.

Change-Id: Id3b68a09f52f0e1ff9f4346f613ae1cbd9f52f73
Reviewed-on: https://go-review.googlesource.com/c/go/+/282352
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
2021-01-08 16:28:15 +00:00
Michael Anthony Knyszek
ae97717133 runtime,runtime/metrics: use explicit histogram boundaries
This change modifies the semantics of
runtime/metrics.Float64Histogram.Buckets to remove implicit buckets to
that extend to positive and negative infinity and instead defines all
bucket boundaries as explicitly listed.

Bucket boundaries remain the same as before except
/gc/heap/allocs-by-size:objects and /gc/heap/frees-by-size:objects no
longer have a bucket that extends to negative infinity.

This change simplifies the Float64Histogram API, making it both easier
to understand and easier to use.

Also, add a test for allocs-by-size and frees-by-size that checks them
against MemStats.

Fixes #43443.

Change-Id: I5620f15bd084562dadf288f733c4a8cace21910c
Reviewed-on: https://go-review.googlesource.com/c/go/+/281238
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
2021-01-08 03:43:44 +00:00
Michael Anthony Knyszek
b116404444 runtime: shift timeHistogram buckets and allow negative durations
Today, timeHistogram, when copied, has the wrong set of counts for the
bucket that should represent (-inf, 0), when in fact it contains [0, 1).
In essence, the buckets are all shifted over by one from where they're
supposed to be.

But this also means that the existence of the overflow bucket is wrong:
the top bucket is supposed to extend to infinity, and what we're really
missing is an underflow bucket to represent the range (-inf, 0).

We could just always zero this bucket and continue ignoring negative
durations, but that likely isn't prudent.

timeHistogram is intended to be used with differences in nanotime, but
depending on how a platform is implemented (or due to a bug in that
platform) it's possible to get a negative duration without having done
anything wrong. We should just be resilient to that and be able to
detect it.

So this change removes the overflow bucket and replaces it with an
underflow bucket, and timeHistogram no longer panics when faced with a
negative duration.

Fixes #43328.
Fixes #43329.

Change-Id: If336425d7d080fd37bf071e18746800e22d38108
Reviewed-on: https://go-review.googlesource.com/c/go/+/279468
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-12-23 17:31:18 +00:00
Michael Anthony Knyszek
8db7e2fecd runtime: fix allocs-by-size and frees-by-size buckets
Currently these two metrics are reported incorrectly, going by the
documentation in the runtime/metrics package. We just copy in the
size-class-based values from the runtime wholesale, but those implicitly
have an inclusive upper-bound and exclusive lower-bound (e.g. 48-byte
size class contains objects in the size range (32, 48]) but the API
declares inclusive lower-bounds and exclusive upper-bounds.

Also, the bottom bucket representing (-inf, 1) should always be empty.
Extend the consistency check to verify this.

Updates #43329.

Change-Id: I11b5b062a34e13405ab662d15334bda91f779775
Reviewed-on: https://go-review.googlesource.com/c/go/+/279467
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-12-23 17:31:08 +00:00
Michael Anthony Knyszek
80c6b92ecb runtime,runtime/metrics: export goroutine count as a metric
For #37112.

Change-Id: I994dfe848605b95ef6aec24f53869e929247e987
Reviewed-on: https://go-review.googlesource.com/c/go/+/247049
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 21:48:03 +00:00
Michael Anthony Knyszek
d39a89fd58 runtime,runtime/metrics: add metric for distribution of GC pauses
For #37112.

Change-Id: Ibb0425c9c582ae3da3b2662d5bbe830d7df9079c
Reviewed-on: https://go-review.googlesource.com/c/go/+/247047
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 21:47:54 +00:00
Michael Anthony Knyszek
36c5edd8d9 runtime: add timeHistogram type
This change adds a concurrent HDR time histogram to the runtime with
tests. It also adds a function to generate boundaries for use by the
metrics package.

For #37112.

Change-Id: Ifbef8ddce8e3a965a0dcd58ccd4915c282ae2098
Reviewed-on: https://go-review.googlesource.com/c/go/+/247046
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 21:47:49 +00:00
Michael Anthony Knyszek
8e2370bf7f runtime,runtime/metrics: add object size distribution metrics
This change adds metrics for the distribution of objects allocated and
freed by size, mirroring MemStats' BySize field.

For #37112.

Change-Id: Ibaf1812da93598b37265ec97abc6669c1a5efcbf
Reviewed-on: https://go-review.googlesource.com/c/go/+/247045
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 21:47:43 +00:00
Michael Anthony Knyszek
a8b28ebc87 runtime,runtime/metrics: add heap goal and GC cycle metrics
This change adds three new metrics: the heap goal, GC cycle count, and
forced GC count. These metrics are identical to their MemStats
counterparts.

For #37112.

Change-Id: I5a5e8dd550c0d646e5dcdbdf38274895e27cdd88
Reviewed-on: https://go-review.googlesource.com/c/go/+/247044
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:29:24 +00:00
Michael Anthony Knyszek
07c3f65d53 runtime,runtime/metrics: add heap object count metric
For #37112.

Change-Id: Idd3dd5c84215ddd1ab05c2e76e848aa0a4d40fb0
Reviewed-on: https://go-review.googlesource.com/c/go/+/247043
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2020-10-26 18:29:18 +00:00
Michael Anthony Knyszek
b08dfbaa43 runtime,runtime/metrics: add memory metrics
This change adds support for a variety of runtime memory metrics and
contains the base implementation of Read for the runtime/metrics
package, which lives in the runtime.

It also adds testing infrastructure for the metrics package, and a bunch
of format and documentation tests.

For #37112.

Change-Id: I16a2c4781eeeb2de0abcb045c15105f1210e2d8a
Reviewed-on: https://go-review.googlesource.com/c/go/+/247041
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Trust: Michael Knyszek <mknyszek@google.com>
2020-10-26 18:29:05 +00:00