Commit graph

184 commits

Author SHA1 Message Date
Michael Anthony Knyszek
eb96f8a574 runtime: scavenge on growth instead of inline with allocation
Inline scavenging causes significant performance regressions in tail
latency for k8s and has relatively little benefit for RSS footprint.

We disabled inline scavenging in Go 1.12.5 (CL 174102) as well, but
we thought other changes in Go 1.13 had mitigated the issues with
inline scavenging. Apparently we were wrong.

This CL switches back to only doing foreground scavenging on heap
growth, rather than doing it when allocation tries to allocate from
scavenged space.

Fixes #32828.

Change-Id: I1f5df44046091f0b4f89fec73c2cde98bf9448cb
Reviewed-on: https://go-review.googlesource.com/c/go/+/183857
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2019-09-25 22:20:36 +00:00
Austin Clements
f18109d7e3 runtime: grow the heap incrementally
Currently, we map and grow the heap a whole arena (64MB) at a time.
Unfortunately, in order to fix #32828, we need to switch from
scavenging inline with allocation back to scavenging on heap growth,
but heap-growth scavenging happens in large jumps because we grow the
heap in large jumps.

In order to prepare for better heap-growth scavenging, this CL
separates mapping more space for the heap from actually "growing" it
(tracking the new space with spans). Instead, growing the heap keeps
track of the "current arena" it's growing into. It track that with new
spans as needed, and only maps more arena space when the current arena
is inadequate. The effect to the user is the same, but this will let
us scavenge on much smaller increments of heap growth.

There are two slight subtleties to this change (see the sketch after the list):

1. If an allocation requires mapping a new arena and that new arena
   isn't contiguous with the current arena, we don't want to lose the
   unused space in the current arena, so we have to immediately track
   that with a span.

2. The mapped space must be accounted as released and idle, even
   though it isn't actually tracked in a span.
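
A minimal sketch of this bookkeeping (the type names, sizes, and stub
sysReserve are illustrative assumptions, not the runtime's actual code):

    package heapgrow

    const arenaSize = 64 << 20 // arena mapping granularity (64MB)

    type heap struct {
        curArena struct{ base, end uintptr } // "current arena" being grown into
    }

    // sysReserve stands in for the runtime's mapping call; mmap elided.
    func sysReserve(n uintptr) uintptr { return 0 }

    // grow makes nBytes of heap available, mapping a fresh arena only
    // when the current one is exhausted.
    func (h *heap) grow(nBytes uintptr) uintptr {
        if h.curArena.base+nBytes > h.curArena.end {
            // Subtlety 1: track the old arena's unused tail with a span
            // before abandoning it (elided). Subtlety 2: account the new
            // mapping as released and idle even though no span tracks it.
            h.curArena.base = sysReserve(arenaSize)
            h.curArena.end = h.curArena.base + arenaSize
        }
        v := h.curArena.base
        h.curArena.base += nBytes
        return v
    }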

For #32828, this makes heap-growth scavenging far more effective,
especially at small heap sizes. For example, this change is
necessary for TestPhysicalMemoryUtilization to pass once we remove
inline scavenging.

Change-Id: I300e74a0534062467e4ce91cdc3508e5ef9aa73a
Reviewed-on: https://go-review.googlesource.com/c/go/+/189957
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2019-09-25 22:17:21 +00:00
Michael Knyszek
8c3040d768 runtime: call sysHugePage less often
Currently when we coalesce memory we make a sysHugePage call
(MADV_HUGEPAGE) to ensure freed and coalesced huge pages are treated as
such so the scavenger's assumptions about performance are more in line
with reality.

Unfortunately we do it way too often because we do it if there was any
change to the huge page count for the span we're coalescing into, not
taking into account that it could coalesce with its neighbors and not
actually create a new huge page.

This change makes it so that it only calls sysHugePage if the original
huge page counts between the span to be coalesced into and its neighbors
do not add up (i.e. a new huge page was created due to alignment). Calls
to sysHugePage will now happen much less frequently, as intended.
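
A sketch of the new condition, with illustrative names (the real check
compares the huge page counts of the span and its neighbors before and
after coalescing):

    package coalesce

    // needsHugePageAdvice reports whether coalescing created a new huge
    // page: the merged span covers more huge pages than its parts did
    // separately, e.g. because the merge fixed up alignment.
    func needsHugePageAdvice(partHugePages []uintptr, mergedHugePages uintptr) bool {
        var sum uintptr
        for _, hp := range partHugePages {
            sum += hp
        }
        return mergedHugePages > sum
    }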

Updates #32828.

Change-Id: Ia175919cb79b730a658250425f97189e27d7fda3
Reviewed-on: https://go-review.googlesource.com/c/go/+/186926
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-07-30 18:53:01 +00:00
Michael Anthony Knyszek
a41ebe6e25 runtime: add physHugePageShift
This change adds physHugePageShift which is defined such that
1 << physHugePageShift == physHugePageSize. The purpose of this variable
is to avoid doing expensive divisions in key functions, such as
(*mspan).hugePages.

This change also does a sweep of any place we might do a division or mod
operation with physHugePageSize and turns it into bit shifts and other
bitwise operations.

Finally, this change adds a check to mallocinit which ensures that
physHugePageSize is always a power of two. osinit might choose to ignore
non-powers-of-two for the value and replace it with zero, but mallocinit
will fail if it's not a power of two (or zero). It also derives
physHugePageShift from physHugePageSize.
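
A sketch of the resulting arithmetic (the 2MB default is an assumption;
the real value comes from the OS at osinit time):

    package hugepage

    var (
        physHugePageSize  uintptr = 2 << 20 // assumed; reported by the OS
        physHugePageShift uintptr
    )

    // initHugePageShift mirrors the mallocinit check described above:
    // reject non-powers-of-two, then derive the shift.
    func initHugePageShift() {
        if physHugePageSize&(physHugePageSize-1) != 0 {
            panic("physHugePageSize must be a power of two (or zero)")
        }
        for physHugePageSize != 0 && uintptr(1)<<physHugePageShift != physHugePageSize {
            physHugePageShift++
        }
    }

    // A division becomes a shift, and a mod becomes a mask.
    func hugePagesIn(nbytes uintptr) uintptr  { return nbytes >> physHugePageShift }
    func hugePageOffset(addr uintptr) uintptr { return addr & (physHugePageSize - 1) }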

This change helps improve the performance of most applications because
of how often (*mspan).hugePages is called.

Updates #32828.

Change-Id: I1a6db113d52d563f59ae8fd4f0e130858859e68f
Reviewed-on: https://go-review.googlesource.com/c/go/+/186598
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-07-30 18:44:52 +00:00
Michael Anthony Knyszek
7ed7669c0d runtime: ensure mheap lock stack growth invariant is maintained
Currently there's an invariant in the runtime wherein the heap lock
can only be acquired on the system stack, otherwise a self-deadlock
could occur if the stack grows while the lock is held.

This invariant is upheld and documented in a number of situations (e.g.
allocManual, freeManual) but there are other places where the invariant
is either not maintained at all, which risks self-deadlock (e.g.
setGCPercent, gcResetMarkState, allocmcache) or is maintained but
undocumented (e.g. gcSweep, readGCStats_m).

This change adds go:systemstack to any function that acquires the heap
lock or adds a systemstack(func() { ... }) around the critical section,
where appropriate. It also documents the invariant on (*mheap).lock
directly and updates repetitive documentation to refer to that comment.
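
A sketch of the second pattern (the lock and systemstack are stand-ins;
in the runtime, systemstack switches to the fixed-size g0 stack, so no
stack growth can happen inside the closure):

    package heaplock

    import "sync"

    var heapLock sync.Mutex // stand-in for mheap_.lock

    // systemstack stands in for the runtime helper that runs fn on the
    // system (g0) stack.
    func systemstack(fn func()) { fn() }

    // update runs a heap-lock critical section on the system stack, so
    // the goroutine's stack cannot grow, and thus cannot self-deadlock
    // on the heap lock, while the lock is held.
    func update(critical func()) {
        systemstack(func() {
            heapLock.Lock()
            critical()
            heapLock.Unlock()
        })
    }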

Fixes #32105.

Change-Id: I702b1290709c118b837389c78efde25c51a2cafb
Reviewed-on: https://go-review.googlesource.com/c/go/+/177857
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2019-05-24 15:34:57 +00:00
Michael Anthony Knyszek
4e7bef84c1 runtime: mark newly-mapped memory as scavenged
On most platforms newly-mapped memory is untouched, meaning the pages
backing the region haven't been faulted in yet. However, we mark this
memory as unscavenged, which means the background scavenger
aggressively "returns" this memory to the OS if the heap is small.

The only platform where newly-mapped memory is actually unscavenged (and
counts toward the application's RSS) is on Windows, since
(*mheap).sysAlloc commits the reservation. Instead of making a special
case for Windows, I change the requirements a bit for a sysReserve'd
region. It must now be both sysMap'd and sysUsed'd, with sysMap being a
no-op on Windows. Comments about memory allocation have been updated to
include a more up-to-date mental model of which states a region of memory
may be in (at a very low level) and how to transition between these
states.

Now this means we can correctly mark newly-mapped heap memory as
scavenged on every platform, reducing the load on the background
scavenger early on in the application for small heaps. As a result,
heap-growth scavenging is no longer necessary, since any actual RSS
growth will be accounted for on the allocation codepath.

Finally, this change also cleans up grow a little bit to avoid
pretending that it's freeing an in-use span and just does the necessary
operations directly.

Fixes #32012.
Fixes #31966.
Updates #26473.

Change-Id: Ie06061eb638162e0560cdeb0b8993d94cfb4d290
Reviewed-on: https://go-review.googlesource.com/c/go/+/177097
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-05-16 22:00:47 +00:00
Michael Anthony Knyszek
ff70494b75 runtime: split spans when scavenging if it's more than we need
This change makes it so that during scavenging we split spans when the
span we have next for scavenging is larger than the amount of work we
have left to do.

The purpose of this change is to improve the worst-case behavior of the
scavenger: currently, if the scavenger only has a little bit of work to
do but sees a very large free span, it will scavenge the whole thing,
spending a lot of time to get way ahead of the scavenge pacing for no
reason.

With this change the scavenger should follow the pacing more closely,
but may still over-scavenge by up to a physical huge page since the
splitting behavior avoids breaking up huge pages in free spans.
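
A sketch of the split policy under the stated huge-page constraint
(names are illustrative; hugePageSize is assumed to be a power of two):

    package scavenge

    // splitForScavenge returns how much of a span to scavenge: just
    // enough to cover the remaining work, rounded up to a huge-page
    // boundary so free huge pages are not broken up. The round-up is
    // why we may still over-scavenge by up to one huge page.
    func splitForScavenge(spanBytes, work, hugePageSize uintptr) uintptr {
        if spanBytes <= work {
            return spanBytes // scavenge the whole span
        }
        cut := (work + hugePageSize - 1) &^ (hugePageSize - 1)
        if cut > spanBytes {
            cut = spanBytes
        }
        return cut
    }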

This change is also the culmination of the scavenging improvements, so
we also include benchmark results for this series (starting from
"runtime: merge all treaps into one implementation" until this patch).

This patch stack results in average and peak RSS reductions (up to 11%
and 7% respectively) for some benchmarks, with mostly minimal
performance degradation (3-4% for some benchmarks, ~0% geomean). Each of
these benchmarks was executed with GODEBUG=madvdontneed=1 on Linux; the
performance degradation is even smaller when MADV_FREE may be used, but
the impact on RSS is much harder to measure. Applications that generally
maintain a steady heap size for the most part show no change in
application performance.

These benchmarks are taken from an experimental benchmarking suite
representing a variety of open-source Go packages, the raw results may
be found here:

https://perf.golang.org/search?q=upload:20190509.1

For #30333.

Change-Id: I618a48534d2d6ce5f656bb66825e3c383ab1ffba
Reviewed-on: https://go-review.googlesource.com/c/go/+/175797
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2019-05-09 16:23:09 +00:00
Michael Anthony Knyszek
fe67ea32bf runtime: add background scavenger
This change adds a background scavenging goroutine whose pacing is
determined when the heap goal changes. The scavenger is paced to use
at most 1% of the mutator's time for most systems. Furthermore, the
scavenger's pacing is computed based on the estimated number of
scavengable huge pages to take advantage of optimizations provided by
the OS.
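
The 1% target can be met with a simple work/sleep ratio; this is a
sketch of the idea, not the runtime's exact pacing computation:

    package scavenge

    import "time"

    // scavengeSleep returns how long to sleep after spending critTime
    // scavenging: sleeping ~100x the work time keeps utilization near
    // critTime/(critTime+sleep) ≈ 1%.
    func scavengeSleep(critTime time.Duration) time.Duration {
        return critTime * 100
    }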

The purpose of this scavenger is to deal with a shrinking heap: if the
heap goal is falling over time, the scavenger should kick in and start
returning free pages from the heap to the OS.

Also, now that we have a pacing system, the credit system used by
scavengeLocked has become redundant. Replace it with a mechanism which
only scavenges on the allocation path if it makes sense to do so with
respect to the new pacing system.

Fixes #30333.

Change-Id: I6203f8dc84affb26c3ab04528889dd9663530edc
Reviewed-on: https://go-review.googlesource.com/c/go/+/142960
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-05-09 16:21:43 +00:00
Michael Anthony Knyszek
eaa1c87b00 runtime: remove periodic scavenging
This change removes the periodic scavenger which goes over every span
in the heap and scavenges it if it hasn't been used for 5 minutes. It
should no longer be necessary if we have background scavenging
(follow-up).

For #30333.

Change-Id: Ic3a1a4e85409dc25719ba4593a3b60273a4c71e0
Reviewed-on: https://go-review.googlesource.com/c/go/+/143157
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-05-09 16:20:52 +00:00
Michael Anthony Knyszek
31c4e09915 runtime: ensure free and unscavenged spans may be backed by huge pages
This change adds a new sysHugePage function to provide the equivalent of
Linux's madvise(MADV_HUGEPAGE) support to the runtime. It then uses
sysHugePage to mark a newly-coalesced free span as backable by huge
pages to make the freeHugePages approximation a bit more accurate.
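
For reference, a user-space approximation of what sysHugePage does on
Linux (the runtime issues the madvise syscall directly; x/sys/unix is
used here only to keep the sketch self-contained):

    package hugeadvise

    import "golang.org/x/sys/unix"

    // adviseHugePage marks mem as eligible for transparent huge pages.
    func adviseHugePage(mem []byte) error {
        return unix.Madvise(mem, unix.MADV_HUGEPAGE)
    }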

The problem being solved here is that if a large free span is composed
of many small spans which were coalesced together, then there's a chance
that they have had madvise(MADV_NOHUGEPAGE) called on them at some point,
which makes freeHugePages less accurate.

For #30333.

Change-Id: Idd4b02567619fc8d45647d9abd18da42f96f0522
Reviewed-on: https://go-review.googlesource.com/c/go/+/173338
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-05-06 21:15:01 +00:00
Michael Anthony Knyszek
5c15ed64de runtime: split spans during allocation without treap removal
Now that the treap is first-fit, we can make a nice optimization.
Mainly, since we know that span splitting doesn't modify the relative
position of a span in a treap, we can actually modify a span in-place
on the treap. The only caveat is that we need to update the relevant
metadata.

To enable this optimization, this change introduces a mutate method on
the iterator which takes a callback that is passed the iterator's span.
The method records some properties of the span before it calls into the
callback and then uses those records to see what changed and update
treap metadata appropriately.
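
A sketch of the mutate hook (treap fields are pared down to what the
example needs; metadata repair is elided):

    package treapmutate

    type mspan struct{ base, npages uintptr }

    type treapNode struct {
        span     *mspan
        maxPages uintptr // largest span in this subtree
        parent   *treapNode
    }

    type treapIter struct{ t *treapNode }

    // mutate lets the callback shrink the span in place; in an
    // address-ordered treap a split can't change the node's position,
    // so only cached metadata like maxPages needs fixing afterwards.
    func (i treapIter) mutate(fn func(*mspan)) {
        npages := i.t.span.npages
        fn(i.t.span)
        if i.t.span.npages != npages {
            for n := i.t; n != nil; n = n.parent {
                recomputeMaxPages(n) // repair cached metadata up the path
            }
        }
    }

    func recomputeMaxPages(n *treapNode) { /* from span and children; elided */ }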

Change-Id: I74f7d2ee172800828434ba0194d3d78d3942acf2
Reviewed-on: https://go-review.googlesource.com/c/go/+/174879
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
2019-05-06 21:14:52 +00:00
Michael Anthony Knyszek
f4a5ae5594 runtime: track the number of free unscavenged huge pages
This change tracks the number of potential free and unscavenged huge
pages which will be used to inform the rate at which scavenging should
occur.

For #30333.

Change-Id: I47663e5ffb64cac44ffa10db158486783f707479
Reviewed-on: https://go-review.googlesource.com/c/go/+/170860
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-05-06 20:59:20 +00:00
Michael Anthony Knyszek
a62b5723be runtime: scavenge huge spans first
This change adds two new treap iteration types: one for large
unscavenged spans (containing at least one huge page) and one for small
unscavenged spans. This allows us to scavenge the huge spans first by
first iterating over the large ones, then the small ones.

Also, since we now depend on physHugePageSize being a power of two,
ensure that that's the case when it's retrieved from the OS.

For #30333.

Change-Id: I51662740205ad5e4905404a0856f5f2b2d2a5680
Reviewed-on: https://go-review.googlesource.com/c/go/+/174399
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-05-06 20:57:39 +00:00
Michael Anthony Knyszek
fa8470a8cd runtime: make treap iteration more efficient
This change introduces a treapIterFilter type which represents the
power set of states described by a treapIterType.

This change then adds a treapIterFilter field to each treap node
indicating the types of spans that live in that subtree. The field is
maintained via the same mechanism used to maintain maxPages. This allows
pred, succ, start, and end to be judicious about which subtrees they
visit, ensuring that iteration avoids traversing irrelevant territory.
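
A sketch of the filter encoding (treapIterScav/treapIterHuge are the
kinds of type bits involved; the exact set in the runtime may differ):

    package treapfilter

    type treapIterType uint8

    const (
        treapIterScav treapIterType = 1 << iota // span is scavenged
        treapIterHuge                           // span holds >= 1 huge page
        treapIterBits = iota                    // number of type bits
    )

    // treapIterFilter has one bit per concrete treapIterType value, so a
    // single word represents the power set of types. A subtree's filter
    // is the OR of its nodes' bits, letting lookups skip subtrees whose
    // filter doesn't intersect the one being searched for.
    type treapIterFilter uint32

    // treapFilter sets the bit of every concrete type matching
    // (mask, match), so a match test is one AND.
    func treapFilter(mask, match treapIterType) treapIterFilter {
        var f treapIterFilter
        for t := treapIterType(0); t < 1<<treapIterBits; t++ {
            if t&mask == match {
                f |= 1 << t
            }
        }
        return f
    }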

Without this change, repeated scavenging attempts can end up being N^2
as the scavenger walks over what it already scavenged before finding new
spans available for scavenging.

Finally, this change also only scavenges a span once it is removed from
the treap. There was always an invariant that spans owned by the treap
may not be mutated in-place, but with this change violating that
invariant can cause issues with scavenging.

For #30333.

Change-Id: I8040b997e21c94a8d3d9c8c6accfe23cebe0c3d3
Reviewed-on: https://go-review.googlesource.com/c/go/+/174878
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-05-06 20:50:16 +00:00
Michael Anthony Knyszek
9baa4301cf runtime: merge all treaps into one implementation
This change modifies the treap implementation to support holding all
spans in a single treap, instead of keeping them all in separate treaps.

This improves ergonomics for nearly all treap-related callsites.
With that said, iteration is now more expensive, but it never occurs on
the fast path, only on scavenging-related paths.

This change opens up the opportunity for further optimizations, such as
splitting spans without treap removal (taking treap removal off the span
allocator's critical path) as well as improvements to treap iteration
(building linked lists for each iteration type and managing them on
insert/removal, since those operations should be less frequent).

For #30333.

Change-Id: I3dac97afd3682a37fda09ae8656a770e1369d0a9
Reviewed-on: https://go-review.googlesource.com/c/go/+/174398
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-05-06 20:19:44 +00:00
Iskander Sharipov
12e63226b9 all: remove commented-out print statements
Those print statements are not good debug helpers
and only clutter the code.

Change-Id: Ifbf450a04e6fa538af68e6352c016728edb4119a
Reviewed-on: https://go-review.googlesource.com/c/go/+/160537
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2019-05-05 08:09:30 +00:00
Michael Anthony Knyszek
40036a99a0 runtime: change the span allocation policy to first-fit
This change modifies the treap implementation to be address-ordered
instead of size-ordered, and further augments it so it may be used for
allocation. It then modifies the find method to implement a first-fit
allocation policy.
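
A sketch of first-fit search on the address-ordered treap, assuming
each node caches the largest span size in its subtree (as the real
treap does with its metadata):

    package treapfind

    type mspan struct{ base, npages uintptr }

    type treapNode struct {
        span        *mspan
        maxPages    uintptr // largest span size in this subtree
        left, right *treapNode
    }

    // findFirstFit returns the lowest-addressed span with at least
    // npages pages: in-order position equals address order, so always
    // prefer a fitting left subtree, then the node, then the right.
    func findFirstFit(t *treapNode, npages uintptr) *treapNode {
        for t != nil && t.maxPages >= npages {
            if t.left != nil && t.left.maxPages >= npages {
                t = t.left
                continue
            }
            if t.span.npages >= npages {
                return t
            }
            t = t.right
        }
        return nil
    }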

This change to the treap implementation consequently makes it so that
spans are scavenged in highest-address-first order without any
additional changes to the scavenging code. Because the treap itself is
now address ordered, and the scavenging code iterates over it in
reverse, the highest address is now chosen instead of the largest span.

This change also renames the now wrongly-named "scavengeLargest" method
on mheap to just "scavengeLocked" and also fixes up logic in that method
which made assumptions about size.

For #30333.

Change-Id: I94b6f3209211cc1bfdc8cdaea04152a232cfbbb4
Reviewed-on: https://go-review.googlesource.com/c/go/+/164101
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-05-01 14:50:40 +00:00
Michael Anthony Knyszek
7b33b6274f runtime: introduce treapForSpan to reduce code duplication
Currently which treap a span should be inserted into/removed from is
checked by looking at the span's properties. This logic is repeated in
four places. As this logic gets more complex, it makes sense to
de-duplicate this, so introduce treapForSpan instead which captures this
logic by returning the appropriate treap for the span.
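
The helper is essentially a two-way switch on the span's scavenged bit;
a sketch with pared-down types:

    package heapsketch

    type mspan struct{ scavenged bool }

    type mTreap struct{} // treap of spans; contents elided

    type mheap struct {
        free mTreap // unscavenged spans
        scav mTreap // scavenged spans
    }

    // treapForSpan returns the treap the span belongs in, capturing in
    // one place the check that was previously repeated at four callsites.
    func (h *mheap) treapForSpan(s *mspan) *mTreap {
        if s.scavenged {
            return &h.scav
        }
        return &h.free
    }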

For #30333.

Change-Id: I4bd933d93dc50c5fc7c7c7f56ceb95194dcbfbcc
Reviewed-on: https://go-review.googlesource.com/c/go/+/170857
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-04-10 22:01:12 +00:00
Michael Anthony Knyszek
2ab6d0172e runtime: throw if scavenge necessary during coalescing
Currently, when coalescing, if two adjacent spans are scavenged, we
subtract their sizes from memstats and re-scavenge the new combined
span. This is wasteful however, since the realignment semantics make
this case of having to re-scavenge impossible.

In realign() inside of coalesce(), there was also a bug: on systems
where physPageSize > pageSize, we wouldn't realign because a condition
had the wrong sign. This wasteful re-scavenging has been masking this
bug this whole time. So, this change fixes that first.

Then this change gets rid of the needsScavenge logic and instead checks
explicitly for the possibility of unscavenged pages near the physical
page boundary. If the possibility exists, it throws. The intent of
throwing here is to catch changes to the runtime which cause this
invariant to no longer hold, at which point it would likely be
appropriate to scavenge the additional pages (and only the additional
pages) at that point.

Change-Id: I185e3d7b53e36e90cf9ace5fa297a9e8008d75f3
Reviewed-on: https://go-review.googlesource.com/c/go/+/158377
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-04-10 22:00:03 +00:00
Michael Anthony Knyszek
7bb8fc1033 runtime: use iterator instead of raw node for treap find
Right now the mTreap structure exposes the treapNode structure through
only one interface: find. There's no reason (performance or otherwise)
for exposing this, and we get a cleaner abstraction through the
iterators this way. This change also makes it easier to make changes to
the mTreap implementation without violating its interface.

Change-Id: I5ef86b8ac81a47d05d8404df65af9ec5f419dc40
Reviewed-on: https://go-review.googlesource.com/c/go/+/164098
Reviewed-by: Austin Clements <austin@google.com>
2019-04-08 15:41:43 +00:00
Michael Anthony Knyszek
23d4c6cdd6 runtime: merge codepaths in scavengeLargest
This change just makes the code in scavengeLargest easier to reason
about by reducing the number of exit points to the method. It should
still be correct either way because the condition checked at the end
(released > nbytes) will always be false if we return, but this just
makes the code a little easier to maintain.

Change-Id: If60da7696aca3fab3b5ddfc795d600d87c988238
Reviewed-on: https://go-review.googlesource.com/c/go/+/160617
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-04-08 15:36:03 +00:00
Michael Anthony Knyszek
faf187fb8e runtime: add credit system for scavenging
When scavenging small amounts it's possible we over-scavenge by a
significant margin since we choose to scavenge the largest spans first.
This over-scavenging is never accounted for.

With this change, we add a scavenge credit pool, similar to the reclaim
credit pool. Any time scavenging triggered by RSS growth starts up, it
checks if it can cash in some credit first. If after using all the
credit it still needs to scavenge, then any extra it does it adds back
into the credit pool.
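
A sketch of the credit flow (a package-level variable stands in for the
mheap field, and locking is omitted):

    package scavenge

    var scavengeCredit uintptr // bytes of scavenging already done in excess

    // scavengeWithCredit cashes in available credit before scavenging,
    // and banks any overshoot from scavenging whole large spans.
    func scavengeWithCredit(want uintptr, scavenge func(uintptr) uintptr) {
        take := scavengeCredit
        if take > want {
            take = want
        }
        scavengeCredit -= take
        want -= take
        if want > 0 {
            if released := scavenge(want); released > want {
                scavengeCredit += released - want
            }
        }
    }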

This change mitigates the performance impact of golang.org/cl/159500 on
the Garbage benchmark. On Go1 it suggests some improvements, but most of
that is within the realm of noise (Revcomp seems very sensitive to
GC-related changes, both positively and negatively).

Garbage: https://perf.golang.org/search?q=upload:20190131.5
Go1:     https://perf.golang.org/search?q=upload:20190131.4

Performance change with both changes:

Garbage: https://perf.golang.org/search?q=upload:20190131.7
Go1:     https://perf.golang.org/search?q=upload:20190131.6

Change-Id: I87bd3c183e71656fdafef94714194b9fdbb77aa2
Reviewed-on: https://go-review.googlesource.com/c/160297
Reviewed-by: Austin Clements <austin@google.com>
2019-01-31 16:55:43 +00:00
Michael Anthony Knyszek
8e093e7a1c runtime: scavenge memory upon allocating from scavenged memory
Because scavenged and unscavenged spans no longer coalesce, memory that
is freed no longer has a high likelihood of being re-scavenged. As a
result, if an application is allocating at a fast rate, it may work fast
enough to undo all the scavenging work performed by the runtime's
current scavenging mechanisms. This behavior is exacerbated by the
global best-fit allocation policy the runtime uses, since scavenged
spans are just as likely to be chosen as unscavenged spans on average.

To remedy that, we treat each allocation of scavenged space as a heap
growth, and scavenge other memory to make up for the allocation.

This change makes performance of the runtime slightly worse, as now
we're scavenging more often during allocation. The regression is
particularly obvious with the garbage benchmark (3%) but most of the Go1
benchmarks are within the margin of noise. A follow-up change should
help.

Garbage: https://perf.golang.org/search?q=upload:20190131.3
Go1:     https://perf.golang.org/search?q=upload:20190131.2

Updates #14045.

Change-Id: I44a7e6586eca33b5f97b6d40418db53a8a7ae715
Reviewed-on: https://go-review.googlesource.com/c/159500
Reviewed-by: Austin Clements <austin@google.com>
2019-01-31 16:54:41 +00:00
Michael Anthony Knyszek
6e9f664b9a runtime: don't coalesce scavenged spans with unscavenged spans
As a result of changes earlier in Go 1.12, the scavenger became much
more aggressive. In particular, when scavenged and unscavenged spans
coalesced, they would always become scavenged. This resulted in most
spans becoming scavenged over time. While this is good for keeping the
RSS of the program low, it also causes many more undue page faults and
many more calls to madvise.

For most applications, the impact of this was negligible. But for
applications that repeatedly grow and shrink the heap by large amounts,
the overhead can be significant. The overhead was especially obvious on
older versions of Linux where MADV_FREE isn't available and
MADV_DONTNEED must be used.

This change makes it so that scavenged spans will never coalesce with
unscavenged spans. This results in fewer page faults overall. Aside
from this, the expected impact of this change is more heap growths on
average, as span allocations will be less likely to be fulfilled. To
mitigate this slightly, this change also coalesces spans eagerly after
scavenging, to at least ensure that all scavenged spans and all
unscavenged spans are coalesced with each other.

Also, this change adds additional logic in the case where two adjacent
spans cannot coalesce. In this case, on platforms where the physical
page size is larger than the runtime's page size, we realign the
boundary between the two adjacent spans to a physical page boundary. The
advantage of this approach is that "unscavengable" spans, that is, spans
which cannot be scavenged because they don't cover at least a single
physical page, are grown to a size where they have a higher likelihood of
being discovered by the runtime's scavenging mechanisms when they border
a scavenged span. This helps prevent the runtime from accruing pockets
of "unscavengable" memory in between scavenged spans, preventing them
from coalescing.
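
The realignment itself is ordinary power-of-two rounding; this sketch
shows the round-up case only (which side of the boundary keeps the
partial physical page depends on which neighbor is scavenged, a detail
elided here):

    package realign

    // realignToPhysPage rounds a span boundary up to the next physical
    // page boundary. physPageSize must be a power of two.
    func realignToPhysPage(boundary, physPageSize uintptr) uintptr {
        return (boundary + physPageSize - 1) &^ (physPageSize - 1)
    }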

We specifically choose to apply this logic to all spans because it
simplifies the code, even though it isn't strictly necessary. The
expectation is that this change will result in a slight loss in
performance on platforms where the physical page size is larger than the
runtime page size.

Update #14045.

Change-Id: I64fd43eac1d6de6f51d7a2ecb72670f10bb12589
Reviewed-on: https://go-review.googlesource.com/c/158078
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-01-17 17:07:23 +00:00
Michael Anthony Knyszek
2f99e889f0 runtime: de-duplicate coalescing code
Currently the code surrounding coalescing is duplicated between merging
with the span before the span considered for coalescing and merging with
the span after. This change factors out the shared portions of these
codepaths into a local closure which acts as a helper.

Change-Id: I7919fbed3f9a833eafb324a21a4beaa81f2eaa91
Reviewed-on: https://go-review.googlesource.com/c/158077
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-01-17 17:05:37 +00:00
Michael Anthony Knyszek
79ac638e41 runtime: refactor coalescing into its own method
The coalescing process is complex and in a follow-up change we'll need
to do it in more than one place, so this change factors out the
coalescing code in freeSpanLocked into a method on mheap.

Change-Id: Ia266b6cb1157c1b8d3d8a4287b42fbcc032bbf3a
Reviewed-on: https://go-review.googlesource.com/c/157838
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2019-01-17 16:59:49 +00:00
Michael Anthony Knyszek
4b3f04c63b runtime: make mTreap iterator bidirectional
This change makes mTreap's iterator type, treapIter, bidirectional
instead of unidirectional. This change helps support moving the find
operation on a treap to return an iterator instead of a treapNode, in
order to hide the details of the treap when accessing elements.

For #28479.

Change-Id: I5dbea4fd4fb9bede6e81bfd089f2368886f98943
Reviewed-on: https://go-review.googlesource.com/c/156918
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2019-01-10 18:15:48 +00:00
Michael Anthony Knyszek
064842450b runtime: allocate from free and scav fairly
This change modifies the behavior of span allocations to no longer
prefer the free treap over the scavenged treap.

While there is an additional cost to allocating out of the scavenged
treap, the current behavior of preferring the unscavenged spans can
lead to unbounded growth of a program's virtual memory footprint.

In small programs (low # of Ps, low resident set size, low allocation
rate) this behavior isn't really apparent and is difficult to
reproduce.

However, in relatively large, long-running programs we see this
unbounded growth in free spans, and an unbounded amount of heap
growths.

It still remains unclear how this policy change actually ends up
increasing the number of heap growths over time, but switching the
policy back to best-fit does indeed solve the problem.

Change-Id: Ibb88d24f9ef6766baaa7f12b411974cc03341e7b
Reviewed-on: https://go-review.googlesource.com/c/148979
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-12-17 23:28:36 +00:00
Michael Anthony Knyszek
3651476075 runtime: add iterator abstraction for mTreap
This change adds the treapIter type which provides an iterator
abstraction for walking over an mTreap. In particular, the mTreap type
now has iter() and rev() for iterating both forwards (smallest to
largest) and backwards (largest to smallest). It also has an erase()
method for erasing elements at the iterator's current position.

For #28479.

While the expectation is that this change will slow down Go programs,
the impact on Go1 and Garbage is negligible.

Go1:     https://perf.golang.org/search?q=upload:20181214.6
Garbage: https://perf.golang.org/search?q=upload:20181214.11

Change-Id: I60dbebbbe73cbbe7b78d45d2093cec12cc0bc649
Reviewed-on: https://go-review.googlesource.com/c/151537
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
2018-12-17 23:28:18 +00:00
Austin Clements
b00a6d8bfe runtime: eliminate mheap.busy* lists
The old whole-page reclaimer was the only thing that used the busy
span lists. Remove them so nothing uses them any more.

Change-Id: I4007dd2be08b9ef41bfdb0c387215c73c392cc4c
Reviewed-on: https://go-review.googlesource.com/c/138960
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2018-11-15 19:27:13 +00:00
Austin Clements
5333550bdc runtime: implement efficient page reclaimer
When we attempt to allocate an N page span (either for a large
allocation or when an mcentral runs dry), we first try to sweep spans
to release N pages. Currently, this can be extremely expensive:
sweeping a span to emptiness is the hardest thing to ask for and the
sweeper generally doesn't know where to even look for potentially
fruitful results. Since this is on the critical path of many
allocations, this is unfortunate.

This CL changes how we reclaim empty spans. Instead of trying lots of
spans and hoping for the best, it uses the newly introduced span marks
to efficiently find empty spans. The span marks (and in-use bits) are
in a dense bitmap, so these spans can be found with an efficient
sequential memory scan. This approach can scan for unmarked spans at
about 300 GB/ms and can free unmarked spans at about 32 MB/ms. We
could probably significantly improve the rate at which it can free
unmarked spans, but that's a separate issue.

Like the current reclaimer, this is still linear in the number of
spans that are swept, but the constant factor is now so vanishingly
small that it doesn't matter.
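
A sketch of why the scan is fast: with in-use and mark bits packed one
per page in parallel bitmaps (an assumed layout, simplified from the
runtime's), "in use and unmarked" falls out of a word-at-a-time pass:

    package reclaim

    // findUnmarkedSpans returns the page indexes of span starts that are
    // in use but have no marked objects, i.e. candidates for freeing.
    func findUnmarkedSpans(inUse, marked []uint64) []int {
        var pages []int
        for w := range inUse {
            if free := inUse[w] &^ marked[w]; free != 0 {
                for b := uint(0); b < 64; b++ {
                    if free&(1<<b) != 0 {
                        pages = append(pages, w*64+int(b))
                    }
                }
            }
        }
        return pages
    }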

The benchmark in #18155 demonstrates both significant page reclaiming
delays, and object reclaiming delays. With "-retain-count=20000000
-preallocate=true -loop-count=3", the benchmark demonstrates several
page reclaiming delays on the order of 40ms. After this change, the
page reclaims are insignificant. The longest sweeps are still ~150ms,
but are object reclaiming delays. We'll address those in the next
several CLs.

Updates #18155.

Fixes #21378 by completely replacing the logic that had that bug.

Change-Id: Iad80eec11d7fc262d02c8f0761ac6998425c4064
Reviewed-on: https://go-review.googlesource.com/c/138959
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2018-11-15 19:27:11 +00:00
Austin Clements
ba1698e963 runtime: mark span when marking any object on the span
This adds a mark bit for each span that is set if any objects on the
span are marked. This will be used for sweeping.

For #18155.

The impact of this is negligible for most benchmarks, and < 1% for
GC-heavy benchmarks.

name                       old time/op  new time/op  delta
Garbage/benchmem-MB=64-12  2.18ms ± 0%  2.20ms ± 1%  +0.88%  (p=0.000 n=16+18)

(https://perf.golang.org/search?q=upload:20180928.1)

name                      old time/op    new time/op    delta
BinaryTree17-12              2.68s ± 1%     2.68s ± 1%    ~     (p=0.707 n=17+19)
Fannkuch11-12                2.28s ± 0%     2.39s ± 0%  +4.95%  (p=0.000 n=19+18)
FmtFprintfEmpty-12          40.3ns ± 4%    39.4ns ± 2%  -2.27%  (p=0.000 n=17+18)
FmtFprintfString-12         67.9ns ± 1%    68.3ns ± 1%  +0.55%  (p=0.000 n=18+19)
FmtFprintfInt-12            75.7ns ± 1%    76.1ns ± 1%  +0.44%  (p=0.005 n=18+19)
FmtFprintfIntInt-12          123ns ± 1%     121ns ± 1%  -1.00%  (p=0.000 n=18+18)
FmtFprintfPrefixedInt-12     150ns ± 0%     148ns ± 0%  -1.33%  (p=0.000 n=16+13)
FmtFprintfFloat-12           208ns ± 0%     204ns ± 0%  -1.92%  (p=0.000 n=13+17)
FmtManyArgs-12               501ns ± 1%     498ns ± 0%  -0.55%  (p=0.000 n=19+17)
GobDecode-12                6.24ms ± 0%    6.25ms ± 1%    ~     (p=0.113 n=20+19)
GobEncode-12                5.33ms ± 0%    5.29ms ± 1%  -0.72%  (p=0.000 n=20+18)
Gzip-12                      220ms ± 1%     218ms ± 1%  -1.02%  (p=0.000 n=19+19)
Gunzip-12                   35.5ms ± 0%    35.7ms ± 0%  +0.45%  (p=0.000 n=16+18)
HTTPClientServer-12         77.9µs ± 1%    77.7µs ± 1%  -0.30%  (p=0.047 n=20+19)
JSONEncode-12               8.82ms ± 0%    8.93ms ± 0%  +1.20%  (p=0.000 n=18+17)
JSONDecode-12               47.3ms ± 0%    47.0ms ± 0%  -0.49%  (p=0.000 n=17+18)
Mandelbrot200-12            3.69ms ± 0%    3.68ms ± 0%  -0.25%  (p=0.000 n=19+18)
GoParse-12                  3.13ms ± 1%    3.13ms ± 1%    ~     (p=0.640 n=20+20)
RegexpMatchEasy0_32-12      76.2ns ± 1%    76.2ns ± 1%    ~     (p=0.818 n=20+19)
RegexpMatchEasy0_1K-12       226ns ± 0%     226ns ± 0%  -0.22%  (p=0.001 n=17+18)
RegexpMatchEasy1_32-12      71.9ns ± 1%    72.0ns ± 1%    ~     (p=0.653 n=18+18)
RegexpMatchEasy1_1K-12       355ns ± 1%     356ns ± 1%    ~     (p=0.160 n=18+19)
RegexpMatchMedium_32-12      106ns ± 1%     106ns ± 1%    ~     (p=0.325 n=17+20)
RegexpMatchMedium_1K-12     31.1µs ± 2%    31.2µs ± 0%  +0.59%  (p=0.007 n=19+15)
RegexpMatchHard_32-12       1.54µs ± 2%    1.53µs ± 2%  -0.78%  (p=0.021 n=17+18)
RegexpMatchHard_1K-12       46.0µs ± 1%    45.9µs ± 1%  -0.31%  (p=0.025 n=17+19)
Revcomp-12                   391ms ± 1%     394ms ± 2%  +0.80%  (p=0.000 n=17+19)
Template-12                 59.9ms ± 1%    59.9ms ± 1%    ~     (p=0.428 n=20+19)
TimeParse-12                 304ns ± 1%     312ns ± 0%  +2.88%  (p=0.000 n=20+17)
TimeFormat-12                318ns ± 0%     326ns ± 0%  +2.64%  (p=0.000 n=20+17)

(https://perf.golang.org/search?q=upload:20180928.2)

Change-Id: I336b9bf054113580a24103192904c8c76593e90e
Reviewed-on: https://go-review.googlesource.com/c/138958
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2018-11-15 19:27:09 +00:00
Austin Clements
69e666e4f7 runtime: record in-use spans in a page-indexed bitmap
This adds a bitmap indexed by page number that marks the starts of
in-use spans. This will be used to quickly find in-use spans with no
marked objects for sweeping.

For #18155.

Change-Id: Icee56f029cde502447193e136fa54a74c74326dd
Reviewed-on: https://go-review.googlesource.com/c/138957
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2018-11-15 19:27:06 +00:00
Austin Clements
e500ffd88c runtime: track all heap arenas in a slice
Currently, there's no efficient way to iterate over the Go heap. We're
going to need this for fast free page sweeping, so this CL adds a
slice of all allocated heap arenas. This will also be useful for
generational GC.

For #18155.

Change-Id: I58d126cfb9c3f61b3125d80b74ccb1b2169efbcc
Reviewed-on: https://go-review.googlesource.com/c/138076
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2018-11-15 19:14:10 +00:00
Michael Anthony Knyszek
3a7a56cc70 runtime: gofmt all improperly formatted code
This change fixes incorrect formatting in mheap.go (the result of my
previous heap scavenging changes) and map_test.go.

Change-Id: I2963687504abdc4f0cdf2f0c558174b3bc0ed2df
Reviewed-on: https://go-review.googlesource.com/c/148977
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-11-11 16:09:05 +00:00
Michael Anthony Knyszek
06be7cbf3c runtime: stop unnecessary span scavenges on free
This change fixes a bug wherein freeing a scavenged span that didn't
coalesce with any neighboring spans would result in that span getting
scavenged again. This case may actually be a common occurrence because
"freeing" span trimmings and newly-grown spans end up using the same
codepath. On systems where madvise is relatively expensive, this can
have a large performance impact.

This change also cleans up some of this logic in freeSpanLocked since
a number of factors made the coalescing code somewhat difficult to
reason about with respect to scavenging. Notably, the way the
needsScavenge boolean is handled could be better expressed and the
inverted conditions (e.g. !after.released) can make things even more
confusing.

Fixes #28595.

Change-Id: I75228dba70b6596b90853020b7c24fbe7ab937cf
Reviewed-on: https://go-review.googlesource.com/c/147559
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2018-11-09 20:57:57 +00:00
Michael Anthony Knyszek
44dcb5cb61 runtime: clean up MSpan* MCache* MCentral* in docs
This change cleans up references to MSpan, MCache, and MCentral in the
docs via a bunch of sed invocations to better reflect the Go names for
the equivalent structures (i.e. mspan, mcache, mcentral) and their
methods (i.e. MSpan_Sweep -> mspan.sweep).

Change-Id: Ie911ac975a24bd25200a273086dd835ab78b1711
Reviewed-on: https://go-review.googlesource.com/c/147557
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-11-05 22:56:22 +00:00
Michael Anthony Knyszek
2ae8bf7054 runtime: fix stale comments about mheap and mspan
As of 07e738e all spans are allocated out of a treap, and not just
large spans or spans for large objects. Also, now we have a separate
treap for spans that have been scavenged.

Change-Id: I9c2cb7b6798fc536bbd34835da2e888224fd7ed4
Reviewed-on: https://go-review.googlesource.com/c/142958
Reviewed-by: Austin Clements <austin@google.com>
2018-11-05 19:30:42 +00:00
Michael Anthony Knyszek
c803ffc67d runtime: scavenge large spans before heap growth
This change scavenges the largest spans before growing the heap for
physical pages to "make up" for the newly-mapped space which,
presumably, will be touched.

In theory, this approach to scavenging helps reduce the RSS of an
application by marking fragments in memory as reclaimable to the OS
more eagerly than before. In practice this may not necessarily be
true, depending on how sysUnused is implemented for each platform.

Fixes #14045.

Change-Id: Iab60790be05935865fc71f793cb9323ab00a18bd
Reviewed-on: https://go-review.googlesource.com/c/139719
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-10-30 15:41:55 +00:00
Michael Anthony Knyszek
db82a1bc12 runtime: sysUsed spans after trimming
Currently, we mark a whole span as sysUsed before trimming, but if the
span s was scavenged, this unnecessarily tells the OS that the trimmed
section of the span is in use when it may in fact still be scavenged. Overall,
this just makes invocations of sysUsed a little more fine-grained.

It does come with the caveat that now heap_released needs to be managed
a little more carefully in allocSpanLocked. In this case, we choose to
(like before this change) negate any effect the span has on
heap_released before trimming, then add it back if the trimmed part is
scavengable.

For #14045.

Change-Id: Ifa384d989611398bfad3ca39d3bb595a5962a3ea
Reviewed-on: https://go-review.googlesource.com/c/140198
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-10-30 15:28:24 +00:00
Michael Anthony Knyszek
78bb91cbd3 runtime: remove npreleased in favor of boolean
This change removes npreleased from mspan since spans may now either be
scavenged or not scavenged; how many of its pages were actually scavenged
doesn't matter. It saves some space in mspan overhead too, as the boolean
fits into what would otherwise be struct padding.

For #14045.

Change-Id: I63f25a4d98658f5fe21c6a466fc38c59bfc5d0f5
Reviewed-on: https://go-review.googlesource.com/c/139737
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-10-30 15:28:01 +00:00
Michael Anthony Knyszek
b46bf0240c runtime: separate scavenged spans
This change adds a new treap to mheap which contains scavenged (i.e.
its physical pages were returned to the OS) spans.

As of this change, spans may no longer be partially scavenged.

For #14045.

Change-Id: I0d428a255c6d3f710b9214b378f841b997df0993
Reviewed-on: https://go-review.googlesource.com/c/139298
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-10-30 15:27:51 +00:00
Michael Anthony Knyszek
07e738ec32 runtime: use only treaps for tracking spans
Currently, mheap tracks spans in both mSpanLists and mTreaps, but
mSpanLists, while they tend to be smaller, complicate the
implementation. Here we simplify the implementation by removing
free and busy from mheap and renaming freelarge -> free and busylarge
-> busy.

This change also slightly changes the reclamation policy. Previously,
for allocations under 1MB we would attempt to find a small span of the
right size. Now, we just try to find any number of spans totaling the
right size. This may increase heap fragmentation, but that will be dealt
with using virtual memory tricks in follow-up CLs.

For #14045.

Garbage-heavy benchmarks show very little change, except what appears
to be a decrease in STW times and peak RSS.

name                      old STW-ns/GC       new STW-ns/GC       delta
Garbage/benchmem-MB=64-8           263k ±64%           217k ±24%  -17.66%  (p=0.028 n=25+23)

name                      old STW-ns/op       new STW-ns/op       delta
Garbage/benchmem-MB=64-8          9.39k ±65%          7.80k ±24%  -16.88%  (p=0.037 n=25+23)

name                      old peak-RSS-bytes  new peak-RSS-bytes  delta
Garbage/benchmem-MB=64-8           281M ± 0%           249M ± 4%  -11.40%  (p=0.000 n=19+18)

https://perf.golang.org/search?q=upload:20181005.1

Go1 benchmarks perform roughly the same, the most notable regression
being the JSON encode/decode benchmark, which worsens by ~2%.

name                     old time/op    new time/op    delta
BinaryTree17-8              3.02s ± 2%     2.99s ± 2%  -1.18%  (p=0.000 n=25+24)
Fannkuch11-8                3.05s ± 1%     3.02s ± 2%  -1.20%  (p=0.000 n=25+25)
FmtFprintfEmpty-8          43.6ns ± 5%    43.4ns ± 3%    ~     (p=0.528 n=25+25)
FmtFprintfString-8         74.9ns ± 3%    73.4ns ± 1%  -2.03%  (p=0.001 n=25+24)
FmtFprintfInt-8            79.3ns ± 3%    77.9ns ± 1%  -1.73%  (p=0.003 n=25+25)
FmtFprintfIntInt-8          119ns ± 6%     116ns ± 0%  -2.68%  (p=0.000 n=25+18)
FmtFprintfPrefixedInt-8     134ns ± 4%     132ns ± 1%  -1.52%  (p=0.004 n=25+25)
FmtFprintfFloat-8           240ns ± 1%     241ns ± 1%    ~     (p=0.403 n=24+23)
FmtManyArgs-8               543ns ± 1%     537ns ± 1%  -1.00%  (p=0.000 n=25+25)
GobDecode-8                6.88ms ± 1%    6.92ms ± 4%    ~     (p=0.088 n=24+22)
GobEncode-8                5.92ms ± 1%    5.93ms ± 1%    ~     (p=0.898 n=25+24)
Gzip-8                      267ms ± 2%     266ms ± 2%    ~     (p=0.213 n=25+24)
Gunzip-8                   35.4ms ± 1%    35.6ms ± 1%  +0.70%  (p=0.000 n=25+25)
HTTPClientServer-8          104µs ± 2%     104µs ± 2%    ~     (p=0.686 n=25+25)
JSONEncode-8               9.67ms ± 1%    9.80ms ± 4%  +1.32%  (p=0.000 n=25+25)
JSONDecode-8               47.7ms ± 1%    48.8ms ± 5%  +2.33%  (p=0.000 n=25+25)
Mandelbrot200-8            4.87ms ± 1%    4.91ms ± 1%  +0.79%  (p=0.000 n=25+25)
GoParse-8                  3.59ms ± 4%    3.55ms ± 1%    ~     (p=0.199 n=25+24)
RegexpMatchEasy0_32-8      90.3ns ± 1%    89.9ns ± 1%  -0.47%  (p=0.000 n=25+21)
RegexpMatchEasy0_1K-8       204ns ± 1%     204ns ± 1%    ~     (p=0.914 n=25+24)
RegexpMatchEasy1_32-8      84.9ns ± 0%    84.6ns ± 1%  -0.36%  (p=0.000 n=24+25)
RegexpMatchEasy1_1K-8       350ns ± 1%     348ns ± 3%  -0.59%  (p=0.007 n=25+25)
RegexpMatchMedium_32-8      122ns ± 1%     121ns ± 0%  -1.08%  (p=0.000 n=25+18)
RegexpMatchMedium_1K-8     36.1µs ± 1%    34.6µs ± 1%  -4.02%  (p=0.000 n=25+25)
RegexpMatchHard_32-8       1.69µs ± 2%    1.65µs ± 1%  -2.38%  (p=0.000 n=25+25)
RegexpMatchHard_1K-8       50.8µs ± 1%    49.4µs ± 1%  -2.69%  (p=0.000 n=25+24)
Revcomp-8                   453ms ± 2%     449ms ± 3%  -0.74%  (p=0.022 n=25+24)
Template-8                 63.2ms ± 2%    63.4ms ± 1%    ~     (p=0.127 n=25+24)
TimeParse-8                 313ns ± 1%     315ns ± 3%    ~     (p=0.924 n=24+25)
TimeFormat-8                294ns ± 1%     292ns ± 2%  -0.65%  (p=0.004 n=23+24)
[Geo mean]                 49.9µs         49.6µs       -0.65%

name                     old speed      new speed      delta
GobDecode-8               112MB/s ± 1%   110MB/s ± 4%  -1.00%  (p=0.036 n=24+24)
GobEncode-8               130MB/s ± 1%   129MB/s ± 1%    ~     (p=0.894 n=25+24)
Gzip-8                   72.7MB/s ± 2%  73.0MB/s ± 2%    ~     (p=0.208 n=25+24)
Gunzip-8                  549MB/s ± 1%   545MB/s ± 1%  -0.70%  (p=0.000 n=25+25)
JSONEncode-8              201MB/s ± 1%   198MB/s ± 3%  -1.29%  (p=0.000 n=25+25)
JSONDecode-8             40.7MB/s ± 1%  39.8MB/s ± 5%  -2.23%  (p=0.000 n=25+25)
GoParse-8                16.2MB/s ± 4%  16.3MB/s ± 1%    ~     (p=0.211 n=25+24)
RegexpMatchEasy0_32-8     354MB/s ± 1%   356MB/s ± 1%  +0.47%  (p=0.000 n=25+21)
RegexpMatchEasy0_1K-8    5.00GB/s ± 0%  4.99GB/s ± 1%    ~     (p=0.588 n=24+24)
RegexpMatchEasy1_32-8     377MB/s ± 1%   378MB/s ± 1%  +0.39%  (p=0.000 n=25+25)
RegexpMatchEasy1_1K-8    2.92GB/s ± 1%  2.94GB/s ± 3%  +0.65%  (p=0.008 n=25+25)
RegexpMatchMedium_32-8   8.14MB/s ± 1%  8.22MB/s ± 1%  +0.98%  (p=0.000 n=25+24)
RegexpMatchMedium_1K-8   28.4MB/s ± 1%  29.6MB/s ± 1%  +4.19%  (p=0.000 n=25+25)
RegexpMatchHard_32-8     18.9MB/s ± 2%  19.4MB/s ± 1%  +2.43%  (p=0.000 n=25+25)
RegexpMatchHard_1K-8     20.2MB/s ± 1%  20.7MB/s ± 1%  +2.76%  (p=0.000 n=25+24)
Revcomp-8                 561MB/s ± 2%   566MB/s ± 3%  +0.75%  (p=0.021 n=25+24)
Template-8               30.7MB/s ± 2%  30.6MB/s ± 1%    ~     (p=0.131 n=25+24)
[Geo mean]                120MB/s        121MB/s       +0.48%

https://perf.golang.org/search?q=upload:20181004.6

Change-Id: I97f9fee34577961a116a8ddd445c6272253f0f95
Reviewed-on: https://go-review.googlesource.com/c/139837
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-10-17 16:26:10 +00:00
Michael Anthony Knyszek
e508a5f072 runtime: de-duplicate span scavenging
Currently, span scavenging is done nearly identically in two different
locations. This change deduplicates that into one shared routine.

For #14045.

Change-Id: I15006b2c9af0e70b7a9eae9abb4168d3adca3860
Reviewed-on: https://go-review.googlesource.com/c/139297
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
2018-10-17 16:25:42 +00:00
Austin Clements
3f86d7cc67 runtime: tidy mheap.freeSpan
freeSpan currently takes a mysterious "acct int32" argument. This is
really just a boolean and actually just needs to match the "large"
argument to alloc in order to balance out accounting.

To make this clearer, replace acct with a "large bool" argument that
must match the call to mheap.alloc.

Change-Id: Ibc81faefdf9f0583114e1953fcfb362e9c3c76de
Reviewed-on: https://go-review.googlesource.com/c/138655
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2018-10-09 16:43:18 +00:00
Igor Zhilianin
04dc1b2443 all: fix a bunch of misspellings
Change-Id: I94cebca86706e072fbe3be782d3edbe0e22b9432
GitHub-Last-Rev: 8e15a40545
GitHub-Pull-Request: golang/go#28067
Reviewed-on: https://go-review.googlesource.com/c/140437
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
2018-10-08 03:12:03 +00:00
Austin Clements
415e948eae runtime: improve mheap.alloc doc and let compiler check system stack
The alloc_m documentation refers to concepts that don't exist (and
maybe never did?). alloc_m is also not the API entry point to span
allocation.

Hence, rewrite the documentation for alloc and alloc_m. While we're
here, document why alloc_m must run on the system stack and replace
alloc_m's hand-implemented system stack check with a go:systemstack
annotation.

Change-Id: I30e263d8e53c2774a6614e1b44df5464838cef09
Reviewed-on: https://go-review.googlesource.com/c/139459
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2018-10-05 16:05:17 +00:00
Keith Randall
cbafcc55e8 cmd/compile,runtime: implement stack objects
Rework how the compiler+runtime handles stack-allocated variables
whose address is taken.

Direct references to such variables work as before. References through
pointers, however, use a new mechanism. The new mechanism is more
precise than the old "ambiguously live" mechanism. It computes liveness
at runtime based on the actual references among objects on the stack.

Each function records all of its address-taken objects in a FUNCDATA.
These are called "stack objects". The runtime then uses that
information while scanning a stack to find all of the stack objects on
a stack. It then does a mark phase on the stack objects, using all the
pointers found on the stack (and ancillary structures, like defer
records) as the root set. Only stack objects which are found to be
live during this mark phase will be scanned and thus retain any heap
objects they point to.

A subsequent CL will remove all the "ambiguously live" logic from
the compiler, so that the stack object tracing will be required.
For this CL, the stack tracing is all redundant with the current
ambiguously live logic.

Update #22350

Change-Id: Ide19f1f71a5b6ec8c4d54f8f66f0e9a98344772f
Reviewed-on: https://go-review.googlesource.com/c/134155
Reviewed-by: Austin Clements <austin@google.com>
2018-10-03 19:52:49 +00:00
Austin Clements
873bd47dfb runtime: flush mcaches lazily
Currently, all mcaches are flushed during STW mark termination as a
root marking job. This is currently necessary because all spans must
be out of these caches before sweeping begins to avoid races with
allocation and to ensure the spans are in the state expected by
sweeping. We do it as a root marking job because mcache flushing is
somewhat expensive and O(GOMAXPROCS) and this parallelizes the work
across the Ps. However, it's also the last remaining root marking job
performed during mark termination.

This CL moves mcache flushing out of mark termination and performs it
lazily. We keep track of the last sweepgen at which each mcache was
flushed and as each P is woken from STW, it observes that its mcache
is out-of-date and flushes it.
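
A sketch of the wake-up check (the field and method names follow the
commit's description, but this is not the runtime's exact code):

    package mcache

    // mcacheSketch records the sweep generation at which the cache was
    // last flushed.
    type mcacheSketch struct {
        flushGen uint32
    }

    // prepareForSweep flushes the cache if it is out of date: a P calls
    // this as it is woken from STW.
    func (c *mcacheSketch) prepareForSweep(sweepgen uint32) {
        if c.flushGen == sweepgen {
            return // already flushed this cycle
        }
        // Return cached spans to their mcentrals, sweeping them per the
        // extended sweepgen protocol (elided).
        c.flushGen = sweepgen
    }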

This introduces a complication for spans cached in stale mcaches. These
may now be observed by background or proportional sweeping or when
attempting to add a finalizer, but aren't in a stable state. For
example, they are likely to be on the wrong mcentral list. To fix
this, this CL extends the sweepgen protocol to also capture whether a
span is cached and, if so, whether or not its cache is stale. This
protocol blocks asynchronous sweeping from touching cached spans and
makes it the responsibility of mcache flushing to sweep the flushed
spans.

This eliminates the last mark termination root marking job, which
means we can now eliminate that entire infrastructure.

Updates #26903. This implements lazy mcache flushing.

Change-Id: Iadda7aabe540b2026cffc5195da7be37d5b4125e
Reviewed-on: https://go-review.googlesource.com/c/134783
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2018-10-02 20:35:35 +00:00
Austin Clements
d398dbdfc3 runtime: eliminate gcBlackenPromptly mode
Now that there is no mark 2 phase, gcBlackenPromptly is no longer
used.

Updates #26903. This is a follow-up to eliminating mark 2.

Change-Id: Ib9c534f21b36b8416fcf3cab667f186167b827f8
Reviewed-on: https://go-review.googlesource.com/c/134319
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
2018-10-02 20:35:21 +00:00