// Copyright 2009 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

// Malloc profiling.
// Patterned after tcmalloc's algorithms; shorter code.

package runtime

import (
	"internal/abi"
	"internal/profilerecord"
	"internal/runtime/atomic"
	"runtime/internal/sys"
	"unsafe"
)

// NOTE(rsc): Everything here could use cas if contention became an issue.
runtime: split mprof locks
The profiles for memory allocations, sync.Mutex contention, and general
blocking store their data in a shared hash table. The bookkeeping work
at the end of a garbage collection cycle involves maintenance on each
memory allocation record. Previously, a single lock guarded access to
the hash table and the contents of all records. When a program has
allocated memory at a large number of unique call stacks, the
maintenance following every garbage collection can hold that lock for
several milliseconds. That can prevent progress on all other goroutines
by delaying acquirep's call to mcache.prepareForSweep, which needs the
lock in mProf_Free to report when a profiled allocation is no longer in
use. With no user goroutines making progress, it is in effect a
multi-millisecond GC-related stop-the-world pause.
Split the lock so the call to mProf_Flush no longer delays each P's call
to mProf_Free: mProf_Free uses a lock on the memory records' N+1 cycle,
and mProf_Flush uses locks on the memory records' accumulator and their
N cycle. mProf_Malloc also no longer competes with mProf_Flush, as it
uses a lock on the memory records' N+2 cycle. The profiles for
sync.Mutex contention and general blocking now share a separate lock,
and another lock guards insertions to the shared hash table (uncommon in
the steady-state). Consumers of each type of profile take the matching
accumulator lock, so will observe consistent count and magnitude values
for each record.
For #45894
Change-Id: I615ff80618d10e71025423daa64b0b7f9dc57daa
Reviewed-on: https://go-review.googlesource.com/c/go/+/399956
Reviewed-by: Carlos Amedee <carlos@golang.org>
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-04-01 12:56:49 -07:00
var (
	// profInsertLock protects changes to the start of all *bucket linked lists
	profInsertLock mutex
	// profBlockLock protects the contents of every blockRecord struct
	profBlockLock mutex
	// profMemActiveLock protects the active field of every memRecord struct
	profMemActiveLock mutex
	// profMemFutureLock is a set of locks that protect the respective elements
	// of the future array of every memRecord struct
	profMemFutureLock [len(memRecord{}.future)]mutex
)
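
// A minimal sketch of the lock split, written as a standalone program rather
// than runtime code. It assumes one lock per slot of a three-element ring plus
// one lock for the published totals. The names counts, profile, malloc, free,
// and flush are hypothetical, and sync.Mutex stands in for the runtime's own
// mutex type. The point it illustrates: for a given cycle c, malloc, free, and
// flush each touch a different slot, so they take different locks.

```go
package main

import (
	"fmt"
	"sync"
)

// counts is a hypothetical stand-in for the per-cycle counters.
type counts struct{ allocs, frees int }

// profile is a hypothetical stand-in for one memory record plus its locks.
type profile struct {
	activeMu sync.Mutex // guards active
	active   counts
	futureMu [3]sync.Mutex // futureMu[i] guards future[i]
	future   [3]counts
}

// malloc accounts an allocation to cycle c+2, holding only that slot's lock.
func (p *profile) malloc(c int) {
	i := (c + 2) % len(p.future)
	p.futureMu[i].Lock()
	p.future[i].allocs++
	p.futureMu[i].Unlock()
}

// free accounts a sweep free to cycle c+1, holding only that slot's lock.
func (p *profile) free(c int) {
	i := (c + 1) % len(p.future)
	p.futureMu[i].Lock()
	p.future[i].frees++
	p.futureMu[i].Unlock()
}

// flush folds cycle c into the published totals and clears the slot.
func (p *profile) flush(c int) {
	i := c % len(p.future)
	p.activeMu.Lock()
	p.futureMu[i].Lock()
	p.active.allocs += p.future[i].allocs
	p.active.frees += p.future[i].frees
	p.future[i] = counts{}
	p.futureMu[i].Unlock()
	p.activeMu.Unlock()
}

func main() {
	var p profile
	var wg sync.WaitGroup
	const c = 3
	for g := 0; g < 4; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for n := 0; n < 1000; n++ {
				p.malloc(c) // slot (c+2)%3
				p.free(c)   // slot (c+1)%3
				p.flush(c)  // slot c%3, a different lock from the two calls above
			}
		}()
	}
	wg.Wait()
	p.flush(c + 1) // publish the frees
	p.flush(c + 2) // publish the allocations
	fmt.Println(p.active)
}
```

// In this sketch a reader of the published totals would take activeMu, which
// mirrors the commit message's point that consumers take the accumulator lock.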

// All memory allocations are local and do not escape outside of the profiler.
// The profiler is forbidden from referring to garbage-collected memory.

const (
	// profile types
	memProfile bucketType = 1 + iota
	blockProfile
	mutexProfile

	// size of bucket hash table
	buckHashSize = 179999

	// maxSkip is to account for deferred inline expansion
	// when using frame pointer unwinding. We record the stack
	// with "physical" frame pointers but handle skipping "logical"
	// frames at some point after collecting the stack. So
	// we need extra space in order to avoid getting fewer than the
	// desired maximum number of frames after expansion.
	// This should be at least as large as the largest skip value
	// used for profiling; otherwise stacks may be truncated inconsistently
	maxSkip = 5

	// maxProfStackDepth is the highest valid value for debug.profstackdepth.
	// It's used for the bucket.stk func.
	// TODO(fg): can we get rid of this?
	maxProfStackDepth = 1024
)

type bucketType int

// A bucket holds per-call-stack profiling information.
// The representation is a bit sleazy, inherited from C.
// This struct defines the bucket header. It is followed in
// memory by the stack words and then the actual record
// data, either a memRecord or a blockRecord.
//
// Per-call-stack profiling information.
// Lookup by hashing call stack into a linked-list hash table.
//
// None of the fields in this bucket header are modified after
// creation, including its next and allnext links.
//
// No heap pointers.
type bucket struct {
	_       sys.NotInHeap
	next    *bucket
	allnext *bucket
	typ     bucketType // memProfile, blockProfile, or mutexProfile
	hash    uintptr
	size    uintptr
	nstk    uintptr
}

// A memRecord is the bucket data for a bucket of type memProfile,
// part of the memory profile.
type memRecord struct {
	// The following complex 3-stage scheme of stats accumulation
	// is required to obtain a consistent picture of mallocs and frees
	// for some point in time.
	// The problem is that mallocs come in real time, while frees
	// come only after a GC during concurrent sweeping. So if we would
	// naively count them, we would get a skew toward mallocs.
	//
	// Hence, we delay information to get consistent snapshots as
	// of mark termination. Allocations count toward the next mark
	// termination's snapshot, while sweep frees count toward the
	// previous mark termination's snapshot:
	//
	//               MT          MT          MT          MT
	//              .·|         .·|         .·|         .·|
	//           .·˙  |      .·˙  |      .·˙  |      .·˙  |
	//        .·˙     |   .·˙     |   .·˙     |   .·˙     |
	//     .·˙        |.·˙        |.·˙        |.·˙        |
	//
	//        alloc → ▲ ← free
	//                ┠┅┅┅┅┅┅┅┅┅┅┅P
	//        C+2     →    C+1    →  C
	//
	//                    alloc → ▲ ← free
	//                            ┠┅┅┅┅┅┅┅┅┅┅┅P
	//                    C+2     →    C+1    →  C
	//
	// Since we can't publish a consistent snapshot until all of
	// the sweep frees are accounted for, we wait until the next
	// mark termination ("MT" above) to publish the previous mark
	// termination's snapshot ("P" above). To do this, allocation
	// and free events are accounted to *future* heap profile
	// cycles ("C+n" above) and we only publish a cycle once all
	// of the events from that cycle must be done. Specifically:
	//
	// Mallocs are accounted to cycle C+2.
	// Explicit frees are accounted to cycle C+2.
	// GC frees (done during sweeping) are accounted to cycle C+1.
	//
	// After mark termination, we increment the global heap
	// profile cycle counter and accumulate the stats from cycle C
	// into the active profile.

	// active is the currently published profile. A profiling
	// cycle can be accumulated into active once it's complete.
	active memRecordCycle

	// future records the profile events we're counting for cycles
	// that have not yet been published. This is a ring buffer
	// indexed by the global heap profile cycle C and stores
	// cycles C, C+1, and C+2. Unlike active, these counts are
	// only for a single cycle; they are not cumulative across
	// cycles.
	//
	// We store cycle C here because there's a window between when
	// C becomes the active cycle and when we've flushed it to
	// active.
	future [3]memRecordCycle
}
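
// The delayed publication described in the memRecord comment can be traced
// with a small standalone simulation. This is only a sketch under simplifying
// assumptions: there is no locking, the cycle counter is a plain int advanced
// by hand at each "mark termination", and the names cycleCounts, record,
// malloc, sweepFree, and flush are hypothetical. It shows that allocations made
// before a mark termination and the sweep frees that follow it land in the same
// ring slot and are published together one mark termination later.

```go
package main

import "fmt"

// cycleCounts is a simplified stand-in for the per-cycle counters.
type cycleCounts struct {
	allocs, frees int
}

// record holds one published total plus a three-slot ring of per-cycle counts.
type record struct {
	active cycleCounts
	future [3]cycleCounts
}

// malloc accounts an allocation to cycle c+2.
func (r *record) malloc(c int) { r.future[(c+2)%len(r.future)].allocs++ }

// sweepFree accounts a GC free to cycle c+1.
func (r *record) sweepFree(c int) { r.future[(c+1)%len(r.future)].frees++ }

// flush folds cycle c into the published profile and clears the slot for reuse.
func (r *record) flush(c int) {
	slot := &r.future[c%len(r.future)]
	r.active.allocs += slot.allocs
	r.active.frees += slot.frees
	*slot = cycleCounts{}
}

func main() {
	var r record
	c := 0
	r.malloc(c) // counted toward cycle 2
	r.malloc(c) // counted toward cycle 2

	c++ // first mark termination: the cycle counter becomes 1
	r.flush(c)
	fmt.Println(r.active) // {0 0}: nothing is published yet

	r.sweepFree(c) // sweeping after the first mark termination: cycle 2

	c++ // second mark termination: the cycle counter becomes 2
	r.flush(c)
	fmt.Println(r.active) // {2 1}: allocations and the matching free appear together
}
```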

// A memRecordCycle holds the allocation and free counts, and the
// corresponding byte totals, for a memRecord.
type memRecordCycle struct {
	allocs, frees           uintptr
	alloc_bytes, free_bytes uintptr
}

// add accumulates b into a. It does not zero b.
func (a *memRecordCycle) add(b *memRecordCycle) {
	a.allocs += b.allocs
	a.frees += b.frees
	a.alloc_bytes += b.alloc_bytes
	a.free_bytes += b.free_bytes
}

// A blockRecord is the bucket data for a bucket of type blockProfile,
// which is used in blocking and mutex profiles.
type blockRecord struct {
	count  float64
	cycles int64
}

var (
	mbuckets atomic.UnsafePointer // *bucket, memory profile buckets
	bbuckets atomic.UnsafePointer // *bucket, blocking profile buckets
	xbuckets atomic.UnsafePointer // *bucket, mutex profile buckets
	buckhash atomic.UnsafePointer // *buckhashArray

	mProfCycle mProfCycleHolder
)

type buckhashArray [buckHashSize]atomic.UnsafePointer // *bucket

const mProfCycleWrap = uint32(len(memRecord{}.future)) * (2 << 24)

// mProfCycleHolder holds the global heap profile cycle number (wrapped at
// mProfCycleWrap, stored starting at bit 1), and a flag (stored at bit 0) to
// indicate whether future[cycle] in all buckets has been queued to flush into
// the active profile.
type mProfCycleHolder struct {
	value atomic.Uint32
}

// read returns the current cycle count.
func (c *mProfCycleHolder) read() (cycle uint32) {
	v := c.value.Load()
	cycle = v >> 1
	return cycle
}

// setFlushed sets the flushed flag. It returns the current cycle count and the
// previous value of the flushed flag.
func (c *mProfCycleHolder) setFlushed() (cycle uint32, alreadyFlushed bool) {
	for {
		prev := c.value.Load()
		cycle = prev >> 1
		alreadyFlushed = (prev & 0x1) != 0
		next := prev | 0x1
		if c.value.CompareAndSwap(prev, next) {
			return cycle, alreadyFlushed
		}
	}
}

// increment increases the cycle count by one, wrapping the value at
// mProfCycleWrap. It clears the flushed flag.
func (c *mProfCycleHolder) increment() {
	// We explicitly wrap mProfCycle rather than depending on
	// uint wraparound because the memRecord.future ring does not
	// itself wrap at a power of two.
	for {
		prev := c.value.Load()
		cycle := prev >> 1
		cycle = (cycle + 1) % mProfCycleWrap
		next := cycle << 1
		if c.value.CompareAndSwap(prev, next) {
			break
		}
	}
}
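
// The packed layout used by mProfCycleHolder (cycle number from bit 1 upward,
// flushed flag in bit 0) can be tried outside the runtime with the public
// sync/atomic package. The sketch below is an assumption-laden illustration,
// not the runtime's code: holder and wrap are hypothetical names, and
// sync/atomic.Uint32 (Go 1.19 or later) stands in for the internal atomic
// type. It also shows why the wrap point is a multiple of the ring length 3:
// the slot index cycle%3 stays continuous when the counter wraps to zero.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// wrap is a hypothetical wrap point; it is a multiple of the ring length 3.
const wrap = 3 * (2 << 24)

// holder packs a cycle number (bits 1 and up) and a flushed flag (bit 0).
type holder struct{ value atomic.Uint32 }

// read returns the cycle number, discarding the flag bit.
func (h *holder) read() uint32 { return h.value.Load() >> 1 }

// setFlushed sets the flag and reports the cycle and the flag's previous value.
func (h *holder) setFlushed() (cycle uint32, already bool) {
	for {
		prev := h.value.Load()
		if h.value.CompareAndSwap(prev, prev|1) {
			return prev >> 1, prev&1 != 0
		}
	}
}

// increment advances the cycle modulo wrap and clears the flag.
func (h *holder) increment() {
	for {
		prev := h.value.Load()
		next := (((prev >> 1) + 1) % wrap) << 1
		if h.value.CompareAndSwap(prev, next) {
			return
		}
	}
}

func main() {
	var h holder
	h.increment()
	c, was := h.setFlushed()
	fmt.Println(c, was) // 1 false
	c, was = h.setFlushed()
	fmt.Println(c, was) // 1 true
	h.increment()
	_, was = h.setFlushed()
	fmt.Println(h.read(), was) // 2 false: increment cleared the flag

	// At the wrap point the ring index moves from 2 to 0 without a jump.
	h.value.Store((wrap - 1) << 1)
	before := h.read() % 3
	h.increment()
	fmt.Println(before, h.read()%3) // 2 0
}
```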

// newBucket allocates a bucket with the given type and number of stack entries.
func newBucket(typ bucketType, nstk int) *bucket {
	size := unsafe.Sizeof(bucket{}) + uintptr(nstk)*unsafe.Sizeof(uintptr(0))
	switch typ {
	default:
		throw("invalid profile bucket type")
	case memProfile:
		size += unsafe.Sizeof(memRecord{})
	case blockProfile, mutexProfile:
		size += unsafe.Sizeof(blockRecord{})
	}

	b := (*bucket)(persistentalloc(size, 0, &memstats.buckhash_sys))
	b.typ = typ
	b.nstk = uintptr(nstk)
	return b
}

// stk returns the slice in b holding the stack. The caller can assume that the
// backing array is immutable.
func (b *bucket) stk() []uintptr {
	stk := (*[maxProfStackDepth]uintptr)(add(unsafe.Pointer(b), unsafe.Sizeof(*b)))
	if b.nstk > maxProfStackDepth {
		// prove that slicing works; otherwise a failure requires a P
		throw("bad profile stack count")
	}
	return stk[:b.nstk:b.nstk]
}

// mp returns the memRecord associated with the memProfile bucket b.
func (b *bucket) mp() *memRecord {
	if b.typ != memProfile {
		throw("bad use of bucket.mp")
	}
	data := add(unsafe.Pointer(b), unsafe.Sizeof(*b)+b.nstk*unsafe.Sizeof(uintptr(0)))
	return (*memRecord)(data)
}

// bp returns the blockRecord associated with the blockProfile bucket b.
func (b *bucket) bp() *blockRecord {
	if b.typ != blockProfile && b.typ != mutexProfile {
		throw("bad use of bucket.bp")
	}
	data := add(unsafe.Pointer(b), unsafe.Sizeof(*b)+b.nstk*unsafe.Sizeof(uintptr(0)))
	return (*blockRecord)(data)
}
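
// The accessors stk, mp, and bp all rely on the same layout: a fixed header,
// then nstk stack words, then the record. The offset arithmetic can be checked
// in isolation with the sketch below. It is a hypothetical illustration only:
// header is a stand-in with roughly the same fields as bucket, and the helpers
// stackOffset and recordOffset do not exist in the runtime. No pointer
// arithmetic is performed; only the byte offsets are computed.

```go
package main

import (
	"fmt"
	"unsafe"
)

// wordSize is the size in bytes of one stack word (a uintptr).
const wordSize = unsafe.Sizeof(uintptr(0))

// header is a hypothetical stand-in for the bucket header.
type header struct {
	next, allnext *header
	typ           int
	hash, size    uintptr
	nstk          uintptr
}

// stackOffset returns the byte offset of the first stack word,
// which starts right after the header.
func stackOffset() uintptr { return unsafe.Sizeof(header{}) }

// recordOffset returns the byte offset of the record that follows
// nstk stack words.
func recordOffset(nstk uintptr) uintptr {
	return unsafe.Sizeof(header{}) + nstk*wordSize
}

func main() {
	// With no stack words the record starts where the stack would.
	fmt.Println(recordOffset(0) == stackOffset()) // true
	// Each extra stack word pushes the record one word further out.
	fmt.Println(recordOffset(4)-recordOffset(0) == 4*wordSize) // true
}
```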

// Return the bucket for stk[0:nstk], allocating a new bucket if needed.
func stkbucket(typ bucketType, size uintptr, stk []uintptr, alloc bool) *bucket {
	bh := (*buckhashArray)(buckhash.Load())
	if bh == nil {
		lock(&profInsertLock)
		// check again under the lock
		bh = (*buckhashArray)(buckhash.Load())
		if bh == nil {
			bh = (*buckhashArray)(sysAlloc(unsafe.Sizeof(buckhashArray{}), &memstats.buckhash_sys))
			if bh == nil {
				throw("runtime: cannot allocate memory")
			}
			buckhash.StoreNoWB(unsafe.Pointer(bh))
		}
		unlock(&profInsertLock)
	}

	// Hash stack.
	var h uintptr
	for _, pc := range stk {
		h += pc
		h += h << 10
		h ^= h >> 6
	}
	// hash in size
	h += size
	h += h << 10
	h ^= h >> 6
	// finalize
	h += h << 3
	h ^= h >> 11

	i := int(h % buckHashSize)
	// first check optimistically, without the lock
	for b := (*bucket)(bh[i].Load()); b != nil; b = b.next {
		if b.typ == typ && b.hash == h && b.size == size && eqslice(b.stk(), stk) {
			return b
		}
	}

	if !alloc {
		return nil
	}

	lock(&profInsertLock)
	// check again under the insertion lock
	for b := (*bucket)(bh[i].Load()); b != nil; b = b.next {
		if b.typ == typ && b.hash == h && b.size == size && eqslice(b.stk(), stk) {
			unlock(&profInsertLock)
			return b
		}
	}
2014-09-01 18:51:12 -04:00
|
|
|
// Create new bucket.
|
|
|
|
|
b := newBucket(typ, len(stk))
|
|
|
|
|
copy(b.stk(), stk)
|
2014-09-01 00:06:26 -04:00
|
|
|
b.hash = h
|
|
|
|
|
b.size = size
|
2022-04-01 12:56:49 -07:00

	var allnext *atomic.UnsafePointer
2014-09-01 18:51:12 -04:00
	if typ == memProfile {
2022-04-01 12:56:49 -07:00
		allnext = &mbuckets
2016-09-22 09:48:30 -04:00
	} else if typ == mutexProfile {
2022-04-01 12:56:49 -07:00
		allnext = &xbuckets
2014-09-01 00:06:26 -04:00
	} else {
2022-04-01 12:56:49 -07:00
		allnext = &bbuckets
2014-09-01 00:06:26 -04:00
	}
2022-04-01 12:56:49 -07:00

	b.next = (*bucket)(bh[i].Load())
	b.allnext = (*bucket)(allnext.Load())

	bh[i].StoreNoWB(unsafe.Pointer(b))
	allnext.StoreNoWB(unsafe.Pointer(b))

	unlock(&profInsertLock)
2014-09-01 00:06:26 -04:00
	return b
}

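The insertion path above follows a double-checked pattern: the chain is searched again under profInsertLock, and the new bucket is published only after it is fully initialized. The sketch below illustrates that pattern outside the runtime, using the public sync and sync/atomic packages. The names node, table, and lookupOrInsert are hypothetical, and the lock-free first lookup is an assumption inferred from the "check again" comment rather than something shown in this part of the file.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// node is a hypothetical stand-in for the runtime's bucket type.
type node struct {
	key  string
	next *node // never modified after the node is published
}

// table is a fixed-size hash table whose chains can be read without a lock.
type table struct {
	insertLock sync.Mutex // plays the role of profInsertLock
	heads      [8]atomic.Pointer[node]
}

// hash is a simple illustrative string hash, not the runtime's.
func hash(key string) uint32 {
	var h uint32
	for i := 0; i < len(key); i++ {
		h = h*31 + uint32(key[i])
	}
	return h
}

// find walks chain i without taking any lock.
func (t *table) find(i uint32, key string) *node {
	for n := t.heads[i].Load(); n != nil; n = n.next {
		if n.key == key {
			return n
		}
	}
	return nil
}

// lookupOrInsert returns the node for key, creating it if needed.
func (t *table) lookupOrInsert(key string) *node {
	i := hash(key) % uint32(len(t.heads))
	// Fast path: lock-free lookup (assumed; see the lead-in).
	if n := t.find(i, key); n != nil {
		return n
	}
	t.insertLock.Lock()
	defer t.insertLock.Unlock()
	// Check again under the insertion lock: another goroutine may have
	// inserted the same key between the first lookup and the lock.
	if n := t.find(i, key); n != nil {
		return n
	}
	// Link the new node to the old head first, then publish it.
	n := &node{key: key, next: t.heads[i].Load()}
	t.heads[i].Store(n)
	return n
}

func main() {
	var t table
	var wg sync.WaitGroup
	results := make([]*node, 16)
	for g := range results {
		wg.Add(1)
		go func(g int) {
			defer wg.Done()
			results[g] = t.lookupOrInsert("stack-A")
		}(g)
	}
	wg.Wait()
	for _, n := range results {
		if n != results[0] {
			panic("duplicate node created")
		}
	}
	fmt.Println("all goroutines share one node")
}
```

Because nodes are only ever prepended and never removed, a reader that loaded an older head still sees a valid chain, which is why only writers need the lock in this sketch.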
2014-09-01 18:51:12 -04:00
func eqslice(x, y []uintptr) bool {
	if len(x) != len(y) {
		return false
	}
	for i, xi := range x {
		if xi != y[i] {
			return false
		}
	}
	return true
}
2014-09-01 00:06:26 -04:00

2017-03-01 13:58:22 -05:00
// mProf_NextCycle publishes the next heap profile cycle and creates a
// fresh heap profile cycle. This operation is fast and can be done
// during STW. The caller must call mProf_Flush before calling
// mProf_NextCycle again.
//
// This is called by mark termination during STW so allocations and
// frees after the world is started again count towards a new heap
// profiling cycle.
func mProf_NextCycle() {
2022-04-01 12:56:49 -07:00
	mProfCycle.increment()
2014-09-01 00:06:26 -04:00
}

2017-03-01 13:58:22 -05:00
// mProf_Flush flushes the events from the current heap profiling
// cycle into the active profile. After this it is safe to start a new
// heap profiling cycle with mProf_NextCycle.
//
// This is called by GC after mark termination starts the world. In
// contrast with mProf_NextCycle, this is somewhat expensive, but safe
// to do concurrently.
func mProf_Flush() {
2022-04-01 12:56:49 -07:00
	cycle, alreadyFlushed := mProfCycle.setFlushed()
	if alreadyFlushed {
		return
	}

	index := cycle % uint32(len(memRecord{}.future))
	lock(&profMemActiveLock)
	lock(&profMemFutureLock[index])
	mProf_FlushLocked(index)
	unlock(&profMemFutureLock[index])
	unlock(&profMemActiveLock)
}

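mProf_Flush relies on setFlushed to report whether the current cycle was already flushed, so a repeated call returns early. The definition of mProfCycle is not shown in this part of the file; the sketch below is one way such a holder could work, packing a "flushed" flag into the low bit of an atomic counter. The type cycleHolder and its layout are assumptions for illustration only, and wraparound of the counter is deliberately left out.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// cycleHolder is a hypothetical cycle counter. The low bit of value records
// whether the current cycle has been flushed; the remaining bits hold the
// cycle number.
type cycleHolder struct {
	value atomic.Uint32
}

// read returns the current cycle number.
func (c *cycleHolder) read() uint32 {
	return c.value.Load() >> 1
}

// setFlushed marks the current cycle as flushed. It returns the cycle number
// and whether the flag was already set before this call.
func (c *cycleHolder) setFlushed() (cycle uint32, alreadyFlushed bool) {
	for {
		prev := c.value.Load()
		cycle = prev >> 1
		alreadyFlushed = prev&1 != 0
		if c.value.CompareAndSwap(prev, prev|1) {
			return cycle, alreadyFlushed
		}
	}
}

// increment advances to the next cycle and clears the flushed flag.
func (c *cycleHolder) increment() {
	for {
		prev := c.value.Load()
		next := ((prev >> 1) + 1) << 1
		if c.value.CompareAndSwap(prev, next) {
			return
		}
	}
}

func main() {
	var h cycleHolder
	h.increment()
	cycle, already := h.setFlushed()
	fmt.Println(cycle, already) // first flush of cycle 1
	cycle, already = h.setFlushed()
	fmt.Println(cycle, already) // second flush of cycle 1 is a no-op
}
```

Keeping the flag and the counter in one word lets a single compare-and-swap decide which caller performs the flush, without taking any of the profiling locks.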
// mProf_FlushLocked flushes the events from the heap profiling cycle at index
// into the active profile. The caller must hold the lock for the active profile
// (profMemActiveLock) and for the profiling cycle at index
// (profMemFutureLock[index]).
func mProf_FlushLocked(index uint32) {
	assertLockHeld(&profMemActiveLock)
	assertLockHeld(&profMemFutureLock[index])
	head := (*bucket)(mbuckets.Load())
	for b := head; b != nil; b = b.allnext {
2017-03-01 13:58:22 -05:00
		mp := b.mp()

		// Flush cycle C into the published profile and clear
		// it for reuse.
2022-04-01 12:56:49 -07:00
		mpc := &mp.future[index]
2017-03-01 13:58:22 -05:00
		mp.active.add(mpc)
		*mpc = memRecordCycle{}
	}
}

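The loop in mProf_FlushLocked folds one pending cycle into the published totals and then zeroes it for reuse. A minimal sketch of that fold-and-clear step follows, with locking omitted. The types cycleCounts and record are hypothetical stand-ins for memRecordCycle and memRecord; the frees and freeBytes fields and the three-slot array are assumptions, since only allocs and alloc_bytes appear in this part of the file.

```go
package main

import "fmt"

// cycleCounts is a hypothetical stand-in for memRecordCycle.
type cycleCounts struct {
	allocs, frees         uintptr
	allocBytes, freeBytes uintptr
}

// add accumulates b into a.
func (a *cycleCounts) add(b *cycleCounts) {
	a.allocs += b.allocs
	a.frees += b.frees
	a.allocBytes += b.allocBytes
	a.freeBytes += b.freeBytes
}

// record is a hypothetical stand-in for memRecord: published totals plus a
// small ring of not-yet-published cycles.
type record struct {
	active cycleCounts
	future [3]cycleCounts
}

// flushSlot folds slot index of every record into its active totals and
// clears the slot so it can be reused by a later cycle.
func flushSlot(recs []*record, index int) {
	for _, r := range recs {
		c := &r.future[index]
		r.active.add(c)
		*c = cycleCounts{}
	}
}

func main() {
	r := &record{}
	r.future[1] = cycleCounts{allocs: 2, allocBytes: 64}
	flushSlot([]*record{r}, 1)
	flushSlot([]*record{r}, 1) // the slot is now empty, so this adds nothing
	fmt.Printf("%+v\n", r.active)
}
```

Clearing the slot right after adding it is what makes a second flush of the same index harmless.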
2017-02-23 21:50:19 -05:00
// mProf_PostSweep records that all sweep frees for this GC cycle have
// completed. This has the effect of publishing the heap profile
// snapshot as of the last mark termination without advancing the heap
// profile cycle.
func mProf_PostSweep() {
	// Flush cycle C+1 to the active profile so everything as of
	// the last mark termination becomes visible. *Don't* advance
	// the cycle, since we're still accumulating allocs in cycle
	// C+2, which have to become C+1 in the next mark termination
	// and so on.
2022-04-01 12:56:49 -07:00
	cycle := mProfCycle.read() + 1

	index := cycle % uint32(len(memRecord{}.future))
	lock(&profMemActiveLock)
	lock(&profMemFutureLock[index])
	mProf_FlushLocked(index)
	unlock(&profMemFutureLock[index])
	unlock(&profMemActiveLock)
2017-02-23 21:50:19 -05:00
}

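The index arithmetic used by mProf_Flush (cycle C), mProf_PostSweep (C+1), and mProf_Malloc (C+2) can be checked with a small worked example. The sketch assumes the future array has three slots, which is what the N, N+1, N+2 wording in the commit message suggests but is not stated in this part of the file; futureSlots and slots are hypothetical names.

```go
package main

import "fmt"

// futureSlots is the assumed length of memRecord.future.
const futureSlots = 3

// slots returns the ring index used at a given cycle by the flush path (C),
// the free and post-sweep path (C+1), and the malloc path (C+2).
func slots(cycle uint32) (flush, free, malloc uint32) {
	return cycle % futureSlots, (cycle + 1) % futureSlots, (cycle + 2) % futureSlots
}

func main() {
	for c := uint32(0); c < 4; c++ {
		flush, free, malloc := slots(c)
		fmt.Printf("cycle %d: flush=%d free=%d malloc=%d\n", c, flush, free, malloc)
	}
	// Output:
	// cycle 0: flush=0 free=1 malloc=2
	// cycle 1: flush=1 free=2 malloc=0
	// cycle 2: flush=2 free=0 malloc=1
	// cycle 3: flush=0 free=1 malloc=2
}
```

Within one cycle the three paths touch three different slots, so each takes a different profMemFutureLock entry. The slot that receives mallocs at cycle C is the one that receives frees at C+1 and is flushed at C+2.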
2014-09-01 00:06:26 -04:00
// Called by malloc to record a profiled block.
2024-03-29 19:59:47 +01:00
func mProf_Malloc(mp *m, p unsafe.Pointer, size uintptr) {
2024-05-17 15:07:07 +02:00
	if mp.profStack == nil {
		// mp.profStack is nil if we happen to sample an allocation during the
		// initialization of mp. This case is rare, so we just ignore such
		// allocations. Change MemProfileRate to 1 if you need to reproduce such
		// cases for testing purposes.
		return
	}
	// Only use the part of mp.profStack we need and ignore the extra space
	// reserved for delayed inline expansion with frame pointer unwinding.
2024-04-27 13:41:05 +02:00
	nstk := callers(4, mp.profStack[:debug.profstackdepth])
2022-04-01 12:56:49 -07:00
	index := (mProfCycle.read() + 2) % uint32(len(memRecord{}.future))

2024-03-29 19:59:47 +01:00
	b := stkbucket(memProfile, size, mp.profStack[:nstk], true)
	mr := b.mp()
	mpc := &mr.future[index]
2022-04-01 12:56:49 -07:00

	lock(&profMemFutureLock[index])
2017-03-01 13:58:22 -05:00
	mpc.allocs++
	mpc.alloc_bytes += size
2022-04-01 12:56:49 -07:00
	unlock(&profMemFutureLock[index])
2014-09-01 00:06:26 -04:00

2022-04-01 12:56:49 -07:00
	// Setprofilebucket locks a bunch of other mutexes, so we call it outside of
	// the profiler locks. This reduces potential contention and chances of
	// deadlocks. Since the object must be alive during the call to
	// mProf_Malloc, it's fine to do this non-atomically.
[dev.cc] runtime: delete scalararg, ptrarg; rename onM to systemstack
Scalararg and ptrarg are not "signal safe".
Go code filling them out can be interrupted by a signal,
and then the signal handler runs, and if it also ends up
in Go code that uses scalararg or ptrarg, now the old
values have been smashed.
For the pieces of code that do need to run in a signal handler,
we introduced onM_signalok, which is really just onM
except that the _signalok is meant to convey that the caller
asserts that scalararg and ptrarg will be restored to their old
values after the call (instead of the usual behavior, zeroing them).
Scalararg and ptrarg are also untyped and therefore error-prone.
Go code can always pass a closure instead of using scalararg
and ptrarg; they were only really necessary for C code.
And there's no more C code.
For all these reasons, delete scalararg and ptrarg, converting
the few remaining references to use closures.
Once those are gone, there is no need for a distinction between
onM and onM_signalok, so replace both with a single function
equivalent to the current onM_signalok (that is, it can be called
on any of the curg, g0, and gsignal stacks).
The name onM and the phrase 'm stack' are misnomers,
because on most systems an M has two system stacks:
the main thread stack and the signal handling stack.
Correct the misnomer by naming the replacement function systemstack.
Fix a few references to "M stack" in code.
The main motivation for this change is to eliminate scalararg/ptrarg.
Rick and I have already seen them cause problems because
the calling sequence m.ptrarg[0] = p is a heap pointer assignment,
so it gets a write barrier. The write barrier also uses onM, so it has
all the same problems as if it were being invoked by a signal handler.
We worked around this by saving and restoring the old values
and by calling onM_signalok, but there's no point in keeping this nice
home for bugs around any longer.
This CL also changes funcline to return the file name as a result
instead of filling in a passed-in *string. (The *string signature is
left over from when the code was written in and called from C.)
That's arguably an unrelated change, except that once I had done
the ptrarg/scalararg/onM cleanup I started getting false positives
about the *string argument escaping (not allowed in package runtime).
The compiler is wrong, but the easiest fix is to write the code like
Go code instead of like C code. I am a bit worried that the compiler
is wrong because of some use of uninitialized memory in the escape
analysis. If that's the reason, it will go away when we convert the
compiler to Go. (And if not, we'll debug it the next time.)
LGTM=khr
R=r, khr
CC=austin, golang-codereviews, iant, rlh
https://golang.org/cl/174950043
2014-11-12 14:54:31 -05:00
	systemstack(func() {
2014-11-11 17:05:02 -05:00
		setprofilebucket(p, b)
	})
2014-09-04 00:54:06 -04:00
}
2014-09-01 18:51:12 -04:00
|
|
|
|
|
|
|
|
// Called when freeing a profiled block.
|
2015-11-03 20:00:21 +01:00
|
|
|
func mProf_Free(b *bucket, size uintptr) {
|
runtime: split mprof locks
The profiles for memory allocations, sync.Mutex contention, and general
blocking store their data in a shared hash table. The bookkeeping work
at the end of a garbage collection cycle involves maintenance on each
memory allocation record. Previously, a single lock guarded access to
the hash table and the contents of all records. When a program has
allocated memory at a large number of unique call stacks, the
maintenance following every garbage collection can hold that lock for
several milliseconds. That can prevent progress on all other goroutines
by delaying acquirep's call to mcache.prepareForSweep, which needs the
lock in mProf_Free to report when a profiled allocation is no longer in
use. With no user goroutines making progress, it is in effect a
multi-millisecond GC-related stop-the-world pause.
Split the lock so the call to mProf_Flush no longer delays each P's call
to mProf_Free: mProf_Free uses a lock on the memory records' N+1 cycle,
and mProf_Flush uses locks on the memory records' accumulator and their
N cycle. mProf_Malloc also no longer competes with mProf_Flush, as it
uses a lock on the memory records' N+2 cycle. The profiles for
sync.Mutex contention and general blocking now share a separate lock,
and another lock guards insertions to the shared hash table (uncommon in
the steady-state). Consumers of each type of profile take the matching
accumulator lock, so will observe consistent count and magnitude values
for each record.
For #45894
Change-Id: I615ff80618d10e71025423daa64b0b7f9dc57daa
Reviewed-on: https://go-review.googlesource.com/c/go/+/399956
Reviewed-by: Carlos Amedee <carlos@golang.org>
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-04-01 12:56:49 -07:00
|
|
|
index := (mProfCycle.read() + 1) % uint32(len(memRecord{}.future))
|
|
|
|
|
|
2014-09-01 18:51:12 -04:00
|
|
|
mp := b.mp()
|
2022-04-01 12:56:49 -07:00
|
|
|
mpc := &mp.future[index]
|
|
|
|
|
|
|
|
|
|
lock(&profMemFutureLock[index])
|
2017-03-01 13:58:22 -05:00
|
|
|
mpc.frees++
|
|
|
|
|
mpc.free_bytes += size
|
2022-04-01 12:56:49 -07:00
|
|
|
unlock(&profMemFutureLock[index])
|
2014-09-01 00:06:26 -04:00
|
|
|
}
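The cycle indexing used by mProf_Free, and described in the "split mprof locks" commit message, can be sketched in isolation. This is a minimal illustration, not the runtime's code: the types record and cycleCounts and the helpers freeIndex and mallocIndex are hypothetical names, and the ring length of 3 is an assumption about the size of memRecord.future.

```go
package main

import "fmt"

// cycleCounts is a hypothetical stand-in for one per-cycle counter slot.
type cycleCounts struct {
	frees     uintptr
	freeBytes uintptr
}

// record is a hypothetical stand-in for memRecord. The ring length of 3
// is an assumption made for this sketch.
type record struct {
	future [3]cycleCounts
}

// freeIndex picks the slot for the N+1 cycle, as mProf_Free does.
func freeIndex(cycle uint32) uint32 {
	return (cycle + 1) % uint32(len(record{}.future))
}

// mallocIndex picks the slot for the N+2 cycle, which the commit message
// says mProf_Malloc uses.
func mallocIndex(cycle uint32) uint32 {
	return (cycle + 2) % uint32(len(record{}.future))
}

func main() {
	var r record
	for cycle := uint32(0); cycle < 4; cycle++ {
		i := freeIndex(cycle)
		// Record one 64-byte free against the N+1 slot.
		r.future[i].frees++
		r.future[i].freeBytes += 64
		fmt.Println("cycle", cycle, "free slot", i, "malloc slot", mallocIndex(cycle))
	}
}
```

Because frees and mallocs for the same cycle land in different slots, each slot can be guarded by its own lock, which is the point of the per-index profMemFutureLock array.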
|
|
|
|
|
|
2014-09-01 18:51:12 -04:00
|
|
|
var blockprofilerate uint64 // in CPU ticks
|
2014-09-01 00:06:26 -04:00
|
|
|
|
2014-09-01 18:51:12 -04:00
|
|
|
// SetBlockProfileRate controls the fraction of goroutine blocking events
|
2016-03-01 23:21:55 +00:00
|
|
|
// that are reported in the blocking profile. The profiler aims to sample
|
2014-09-01 18:51:12 -04:00
|
|
|
// an average of one blocking event per rate nanoseconds spent blocked.
|
|
|
|
|
//
|
|
|
|
|
// To include every blocking event in the profile, pass rate = 1.
|
|
|
|
|
// To turn off profiling entirely, pass rate <= 0.
|
|
|
|
|
func SetBlockProfileRate(rate int) {
|
|
|
|
|
var r int64
|
|
|
|
|
if rate <= 0 {
|
|
|
|
|
r = 0 // disable profiling
|
2014-10-20 15:48:42 -07:00
|
|
|
} else if rate == 1 {
|
|
|
|
|
r = 1 // profile everything
|
2014-09-01 18:51:12 -04:00
|
|
|
} else {
|
2014-09-01 00:06:26 -04:00
|
|
|
// convert ns to cycles, use float64 to prevent overflow during multiplication
|
runtime: improve tickspersecond
Currently tickspersecond forces a 100 millisecond sleep the first time
it's called. This isn't great for profiling short-lived programs, since
both CPU profiling and block profiling might call into it.
100 milliseconds is a long time, but it's chosen to try and capture a
decent estimate of the conversion on platforms with coarse-granularity
clocks. If the granularity is 15 ms, it'll only be 15% off at worst.
Let's try a different strategy. First, let's require 5 milliseconds of
time to have elapsed at a minimum. This should be plenty on platforms
with nanosecond time granularity from the system clock, provided the
caller of tickspersecond intends to use it for calculating durations,
not timestamps. Next, grab a timestamp as close to process start as
possible, so that we can cover some of those 5 milliseconds just during
runtime start.
Finally, this function is only ever called from normal goroutine
contexts. Let's do a regular goroutine sleep instead of a thread-level
sleep under a runtime lock, which has all sorts of nasty effects on
preemption.
While we're here, let's also rename tickspersecond to ticksPerSecond.
Also, let's write down some explicit rules of thumb on when to use this
function. Clocks are hard, and using this for timestamp conversion is
likely to make lining up those timestamps with other clocks on the
system difficult if not impossible.
Note that while this improves ticksPerSecond on platforms with good
clocks, we still end up with a pretty coarse sleep on platforms with
coarse clocks, and a pretty coarse result. On these platforms, keep the
minimum required elapsed time at 100 ms. There's not much we can do
about these platforms except spin and try to catch the clock boundary,
but at 10+ ms of granularity, that might be a lot of spinning.
Fixes #63103.
Fixes #63078.
Change-Id: Ic32a4ba70a03bdf5c13cb80c2669c4064aa4cca2
Reviewed-on: https://go-review.googlesource.com/c/go/+/538898
Auto-Submit: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
2023-11-01 20:00:27 +00:00
|
|
|
r = int64(float64(rate) * float64(ticksPerSecond()) / (1000 * 1000 * 1000))
|
2014-09-01 18:51:12 -04:00
|
|
|
if r == 0 {
|
2014-09-01 00:06:26 -04:00
|
|
|
r = 1
|
2014-09-01 18:51:12 -04:00
|
|
|
}
|
2014-09-01 00:06:26 -04:00
|
|
|
}
|
2014-09-01 18:51:12 -04:00
|
|
|
|
2015-11-02 14:09:24 -05:00
|
|
|
atomic.Store64(&blockprofilerate, uint64(r))
|
2014-09-01 00:06:26 -04:00
|
|
|
}
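The nanosecond-to-tick conversion in SetBlockProfileRate can be tried outside the runtime. This is a sketch under stated assumptions: ticksFromNanos is a hypothetical helper, and the ticksPerSec parameter stands in for the runtime-internal ticksPerSecond(), which user code cannot call.

```go
package main

import "fmt"

// ticksFromNanos mirrors the branches of SetBlockProfileRate.
// ticksPerSec is a hypothetical stand-in for the runtime's ticksPerSecond().
func ticksFromNanos(rateNanos int, ticksPerSec int64) int64 {
	switch {
	case rateNanos <= 0:
		return 0 // disable profiling
	case rateNanos == 1:
		return 1 // profile everything
	}
	// Multiply in float64 so a large rate cannot overflow int64.
	r := int64(float64(rateNanos) * float64(ticksPerSec) / (1000 * 1000 * 1000))
	if r == 0 {
		// A nonzero rate must never round down to "disabled".
		r = 1
	}
	return r
}

func main() {
	const ticksPerSec = 3_000_000_000 // assumed 3 GHz tick counter
	for _, rate := range []int{-1, 1, 1000, 1_000_000} {
		fmt.Println("rate(ns):", rate, "ticks:", ticksFromNanos(rate, ticksPerSec))
	}
}
```

The clamp to 1 matters on slow tick sources: without it, a small positive rate would truncate to 0 and silently turn the profiler off.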
|
|
|
|
|
|
2014-09-01 18:51:12 -04:00
|
|
|
func blockevent(cycles int64, skip int) {
|
|
|
|
|
if cycles <= 0 {
|
2014-10-20 15:48:42 -07:00
|
|
|
cycles = 1
|
2014-09-01 18:51:12 -04:00
|
|
|
}
|
2021-02-26 14:41:19 +01:00
|
|
|
|
|
|
|
|
rate := int64(atomic.Load64(&blockprofilerate))
|
|
|
|
|
if blocksampled(cycles, rate) {
|
|
|
|
|
saveblockevent(cycles, rate, skip+1, blockProfile)
|
2016-09-22 09:48:30 -04:00
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2021-02-26 14:41:19 +01:00
|
|
|
// blocksampled returns true for all events where cycles >= rate. Shorter
|
|
|
|
|
// events have a cycles/rate random chance of returning true.
|
|
|
|
|
func blocksampled(cycles, rate int64) bool {
|
math/rand, math/rand/v2: use ChaCha8 for global rand
Move ChaCha8 code into internal/chacha8rand and use it to implement
runtime.rand, which is used for the unseeded global source for
both math/rand and math/rand/v2. This also affects the calculation of
the start point for iteration over very very large maps (when the
32-bit fastrand is not big enough).
The benefit is that misuse of the global random number generators
in math/rand and math/rand/v2 in contexts where non-predictable
randomness is important for security reasons is no longer a
security problem, removing a common mistake among programmers
who are unaware of the different kinds of randomness.
The cost is an extra 304 bytes per thread stored in the m struct
plus 2-3ns more per random uint64 due to the more sophisticated
algorithm. Using PCG looks like it would cost about the same,
although I haven't benchmarked that.
Before this, the math/rand and math/rand/v2 global generator
was wyrand (https://github.com/wangyi-fudan/wyhash).
For math/rand, using wyrand instead of the Mitchell/Reeds/Thompson
ALFG was justifiable, since the latter was not any better.
But for math/rand/v2, the global generator really should be
at least as good as one of the well-studied, specific algorithms
provided directly by the package, and it's not.
(Wyrand is still reasonable for scheduling and cache decisions.)
Good randomness does have a cost: about twice wyrand.
Also rationalize the various runtime rand references.
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ bbb48afeb7.amd64 │ 5cf807d1ea.amd64 │
│ sec/op │ sec/op vs base │
ChaCha8-32 1.862n ± 2% 1.861n ± 2% ~ (p=0.825 n=20)
PCG_DXSM-32 1.471n ± 1% 1.460n ± 2% ~ (p=0.153 n=20)
SourceUint64-32 1.636n ± 2% 1.582n ± 1% -3.30% (p=0.000 n=20)
GlobalInt64-32 2.087n ± 1% 3.663n ± 1% +75.54% (p=0.000 n=20)
GlobalInt64Parallel-32 0.1042n ± 1% 0.2026n ± 1% +94.48% (p=0.000 n=20)
GlobalUint64-32 2.263n ± 2% 3.724n ± 1% +64.57% (p=0.000 n=20)
GlobalUint64Parallel-32 0.1019n ± 1% 0.1973n ± 1% +93.67% (p=0.000 n=20)
Int64-32 1.771n ± 1% 1.774n ± 1% ~ (p=0.449 n=20)
Uint64-32 1.863n ± 2% 1.866n ± 1% ~ (p=0.364 n=20)
GlobalIntN1000-32 3.134n ± 3% 4.730n ± 2% +50.95% (p=0.000 n=20)
IntN1000-32 2.489n ± 1% 2.489n ± 1% ~ (p=0.683 n=20)
Int64N1000-32 2.521n ± 1% 2.516n ± 1% ~ (p=0.394 n=20)
Int64N1e8-32 2.479n ± 1% 2.478n ± 2% ~ (p=0.743 n=20)
Int64N1e9-32 2.530n ± 2% 2.514n ± 2% ~ (p=0.193 n=20)
Int64N2e9-32 2.501n ± 1% 2.494n ± 1% ~ (p=0.616 n=20)
Int64N1e18-32 3.227n ± 1% 3.205n ± 1% ~ (p=0.101 n=20)
Int64N2e18-32 3.647n ± 1% 3.599n ± 1% ~ (p=0.019 n=20)
Int64N4e18-32 5.135n ± 1% 5.069n ± 2% ~ (p=0.034 n=20)
Int32N1000-32 2.657n ± 1% 2.637n ± 1% ~ (p=0.180 n=20)
Int32N1e8-32 2.636n ± 1% 2.636n ± 1% ~ (p=0.763 n=20)
Int32N1e9-32 2.660n ± 2% 2.638n ± 1% ~ (p=0.358 n=20)
Int32N2e9-32 2.662n ± 2% 2.618n ± 2% ~ (p=0.064 n=20)
Float32-32 2.272n ± 2% 2.239n ± 2% ~ (p=0.194 n=20)
Float64-32 2.272n ± 1% 2.286n ± 2% ~ (p=0.763 n=20)
ExpFloat64-32 3.762n ± 1% 3.744n ± 1% ~ (p=0.171 n=20)
NormFloat64-32 3.706n ± 1% 3.655n ± 2% ~ (p=0.066 n=20)
Perm3-32 32.93n ± 3% 34.62n ± 1% +5.13% (p=0.000 n=20)
Perm30-32 202.9n ± 1% 204.0n ± 1% ~ (p=0.482 n=20)
Perm30ViaShuffle-32 115.0n ± 1% 114.9n ± 1% ~ (p=0.358 n=20)
ShuffleOverhead-32 112.8n ± 1% 112.7n ± 1% ~ (p=0.692 n=20)
Concurrent-32 2.107n ± 0% 3.725n ± 1% +76.75% (p=0.000 n=20)
goos: darwin
goarch: arm64
pkg: math/rand/v2
│ bbb48afeb7.arm64 │ 5cf807d1ea.arm64 │
│ sec/op │ sec/op vs base │
ChaCha8-8 2.480n ± 0% 2.429n ± 0% -2.04% (p=0.000 n=20)
PCG_DXSM-8 2.531n ± 0% 2.530n ± 0% ~ (p=0.877 n=20)
SourceUint64-8 2.534n ± 0% 2.533n ± 0% ~ (p=0.732 n=20)
GlobalInt64-8 2.172n ± 1% 4.794n ± 0% +120.67% (p=0.000 n=20)
GlobalInt64Parallel-8 0.4320n ± 0% 0.9605n ± 0% +122.32% (p=0.000 n=20)
GlobalUint64-8 2.182n ± 0% 4.770n ± 0% +118.58% (p=0.000 n=20)
GlobalUint64Parallel-8 0.4307n ± 0% 0.9583n ± 0% +122.51% (p=0.000 n=20)
Int64-8 4.107n ± 0% 4.104n ± 0% ~ (p=0.416 n=20)
Uint64-8 4.080n ± 0% 4.080n ± 0% ~ (p=0.052 n=20)
GlobalIntN1000-8 2.814n ± 2% 5.643n ± 0% +100.50% (p=0.000 n=20)
IntN1000-8 4.141n ± 0% 4.139n ± 0% ~ (p=0.140 n=20)
Int64N1000-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.313 n=20)
Int64N1e8-8 4.140n ± 0% 4.139n ± 0% ~ (p=0.103 n=20)
Int64N1e9-8 4.139n ± 0% 4.140n ± 0% ~ (p=0.761 n=20)
Int64N2e9-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.636 n=20)
Int64N1e18-8 5.266n ± 0% 5.326n ± 1% +1.14% (p=0.001 n=20)
Int64N2e18-8 6.052n ± 0% 6.167n ± 0% +1.90% (p=0.000 n=20)
Int64N4e18-8 8.826n ± 0% 9.051n ± 0% +2.55% (p=0.000 n=20)
Int32N1000-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20)
Int32N1e8-8 4.126n ± 0% 4.131n ± 0% +0.12% (p=0.000 n=20)
Int32N1e9-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20)
Int32N2e9-8 4.132n ± 0% 4.131n ± 0% ~ (p=0.017 n=20)
Float32-8 4.109n ± 0% 4.105n ± 0% ~ (p=0.379 n=20)
Float64-8 4.107n ± 0% 4.106n ± 0% ~ (p=0.867 n=20)
ExpFloat64-8 5.339n ± 0% 5.383n ± 0% +0.82% (p=0.000 n=20)
NormFloat64-8 5.735n ± 0% 5.737n ± 1% ~ (p=0.856 n=20)
Perm3-8 26.65n ± 0% 26.80n ± 1% +0.58% (p=0.000 n=20)
Perm30-8 194.8n ± 1% 197.0n ± 0% +1.18% (p=0.000 n=20)
Perm30ViaShuffle-8 156.6n ± 0% 157.6n ± 1% +0.61% (p=0.000 n=20)
ShuffleOverhead-8 124.9n ± 0% 125.5n ± 0% +0.52% (p=0.000 n=20)
Concurrent-8 2.434n ± 3% 5.066n ± 0% +108.09% (p=0.000 n=20)
goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ bbb48afeb7.386 │ 5cf807d1ea.386 │
│ sec/op │ sec/op vs base │
ChaCha8-32 11.295n ± 1% 4.748n ± 2% -57.96% (p=0.000 n=20)
PCG_DXSM-32 7.693n ± 1% 7.738n ± 2% ~ (p=0.542 n=20)
SourceUint64-32 7.658n ± 2% 7.622n ± 2% ~ (p=0.344 n=20)
GlobalInt64-32 3.473n ± 2% 7.526n ± 2% +116.73% (p=0.000 n=20)
GlobalInt64Parallel-32 0.3198n ± 0% 0.5444n ± 0% +70.22% (p=0.000 n=20)
GlobalUint64-32 3.612n ± 0% 7.575n ± 1% +109.69% (p=0.000 n=20)
GlobalUint64Parallel-32 0.3168n ± 0% 0.5403n ± 0% +70.51% (p=0.000 n=20)
Int64-32 7.673n ± 2% 7.789n ± 1% ~ (p=0.122 n=20)
Uint64-32 7.773n ± 1% 7.827n ± 2% ~ (p=0.920 n=20)
GlobalIntN1000-32 6.268n ± 1% 9.581n ± 1% +52.87% (p=0.000 n=20)
IntN1000-32 10.33n ± 2% 10.45n ± 1% ~ (p=0.233 n=20)
Int64N1000-32 10.98n ± 2% 11.01n ± 1% ~ (p=0.401 n=20)
Int64N1e8-32 11.19n ± 2% 10.97n ± 1% ~ (p=0.033 n=20)
Int64N1e9-32 11.06n ± 1% 11.08n ± 1% ~ (p=0.498 n=20)
Int64N2e9-32 11.10n ± 1% 11.01n ± 2% ~ (p=0.995 n=20)
Int64N1e18-32 15.23n ± 2% 15.04n ± 1% ~ (p=0.973 n=20)
Int64N2e18-32 15.89n ± 1% 15.85n ± 1% ~ (p=0.409 n=20)
Int64N4e18-32 18.96n ± 2% 19.34n ± 2% ~ (p=0.048 n=20)
Int32N1000-32 10.46n ± 2% 10.44n ± 2% ~ (p=0.480 n=20)
Int32N1e8-32 10.46n ± 2% 10.49n ± 2% ~ (p=0.951 n=20)
Int32N1e9-32 10.28n ± 2% 10.26n ± 1% ~ (p=0.431 n=20)
Int32N2e9-32 10.50n ± 2% 10.44n ± 2% ~ (p=0.249 n=20)
Float32-32 13.80n ± 2% 13.80n ± 2% ~ (p=0.751 n=20)
Float64-32 23.55n ± 2% 23.87n ± 0% ~ (p=0.408 n=20)
ExpFloat64-32 15.36n ± 1% 15.29n ± 2% ~ (p=0.316 n=20)
NormFloat64-32 13.57n ± 1% 13.79n ± 1% +1.66% (p=0.005 n=20)
Perm3-32 45.70n ± 2% 46.99n ± 2% +2.81% (p=0.001 n=20)
Perm30-32 399.0n ± 1% 403.8n ± 1% +1.19% (p=0.006 n=20)
Perm30ViaShuffle-32 349.0n ± 1% 350.4n ± 1% ~ (p=0.909 n=20)
ShuffleOverhead-32 322.3n ± 1% 323.8n ± 1% ~ (p=0.410 n=20)
Concurrent-32 3.331n ± 1% 7.312n ± 1% +119.50% (p=0.000 n=20)
For #61716.
Change-Id: Ibdddeed85c34d9ae397289dc899e04d4845f9ed2
Reviewed-on: https://go-review.googlesource.com/c/go/+/516860
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-08-06 13:26:28 +10:00
|
|
|
if rate <= 0 || (rate > cycles && cheaprand64()%rate > cycles) {
|
2016-09-22 09:48:30 -04:00
|
|
|
return false
|
2014-09-01 18:51:12 -04:00
|
|
|
}
|
2016-09-22 09:48:30 -04:00
|
|
|
return true
|
|
|
|
|
}
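The sampling rule in blocksampled can be exercised with an injectable random source. This is a sketch, not the runtime's code: sampled is a hypothetical name, and the rnd parameter stands in for cheaprand64, which is assumed here to return a non-negative value.

```go
package main

import (
	"fmt"
	"math/rand"
)

// sampled applies the same condition as blocksampled. rnd is a hypothetical
// stand-in for the runtime's cheaprand64 and must return a non-negative value.
func sampled(cycles, rate int64, rnd func() int64) bool {
	if rate <= 0 || (rate > cycles && rnd()%rate > cycles) {
		return false
	}
	return true
}

func main() {
	const (
		rate   = 1000
		cycles = 250
		trials = 100000
	)
	hits := 0
	for i := 0; i < trials; i++ {
		if sampled(cycles, rate, rand.Int63) {
			hits++
		}
	}
	// For events shorter than rate, the hit fraction should be close to
	// cycles/rate; events with cycles >= rate are always kept.
	fmt.Printf("kept %d of %d short events\n", hits, trials)
	fmt.Println("long event kept:", sampled(5000, rate, rand.Int63))
}
```

Passing the random source as a function keeps the rule deterministic under test while leaving the condition itself identical to the one in the source.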
|
|
|
|
|
|
2023-10-06 13:02:40 -04:00
|
|
|
// saveblockevent records a profile event of the type specified by which.
|
|
|
|
|
// cycles is the quantity associated with this event and rate is the sampling rate,
|
|
|
|
|
// used to adjust the cycles value in the manner determined by the profile type.
|
|
|
|
|
// skip is the number of frames to omit from the traceback associated with the event.
|
|
|
|
|
// The traceback will be recorded from the stack of the goroutine associated with the current m.
|
|
|
|
|
// skip should be positive if this event is recorded from the current stack
|
|
|
|
|
// (e.g. when this is not called from a system stack)
|
2021-02-26 14:41:19 +01:00
|
|
|
func saveblockevent(cycles, rate int64, skip int, which bucketType) {
|
2024-04-27 13:41:05 +02:00
|
|
|
if debug.profstackdepth == 0 {
|
|
|
|
|
// profstackdepth is set to 0 by the user, so mp.profStack is nil and we
|
|
|
|
|
// can't record a stack trace.
|
|
|
|
|
return
|
|
|
|
|
}
|
2024-05-19 15:21:53 +02:00
|
|
|
if skip > maxSkip {
|
|
|
|
|
print("requested skip=", skip)
|
|
|
|
|
throw("invalid skip value")
|
|
|
|
|
}
|
2024-03-29 19:59:47 +01:00
|
|
|
gp := getg()
|
|
|
|
|
mp := acquirem() // we must not be preempted while accessing profstack
|
2024-04-27 13:41:05 +02:00
|
|
|
|
2023-10-06 13:02:40 -04:00
|
|
|
nstk := 1
|
|
|
|
|
if tracefpunwindoff() || gp.m.hasCgoOnStack() {
|
|
|
|
|
mp.profStack[0] = logicalStackSentinel
|
|
|
|
|
if gp.m.curg == nil || gp.m.curg == gp {
|
|
|
|
|
nstk = callers(skip, mp.profStack[1:])
|
|
|
|
|
} else {
|
|
|
|
|
nstk = gcallers(gp.m.curg, skip, mp.profStack[1:])
|
|
|
|
|
}
|
2014-09-01 18:51:12 -04:00
|
|
|
} else {
|
2023-10-06 13:02:40 -04:00
|
|
|
mp.profStack[0] = uintptr(skip)
|
|
|
|
|
if gp.m.curg == nil || gp.m.curg == gp {
|
|
|
|
|
if skip > 0 {
|
|
|
|
|
// We skip one fewer frame than the provided value for frame
|
|
|
|
|
// pointer unwinding because the skip value includes the current
|
|
|
|
|
// frame, whereas the saved frame pointer will give us the
|
|
|
|
|
// caller's return address first (so, not including
|
|
|
|
|
// saveblockevent)
|
|
|
|
|
mp.profStack[0] -= 1
|
|
|
|
|
}
|
|
|
|
|
nstk += fpTracebackPCs(unsafe.Pointer(getfp()), mp.profStack[1:])
|
|
|
|
|
} else {
|
|
|
|
|
mp.profStack[1] = gp.m.curg.sched.pc
|
|
|
|
|
nstk += 1 + fpTracebackPCs(unsafe.Pointer(gp.m.curg.sched.bp), mp.profStack[2:])
|
|
|
|
|
}
|
2014-09-01 18:51:12 -04:00
|
|
|
}
|
2023-11-21 16:03:54 +00:00
|
|
|
|
2024-03-29 19:59:47 +01:00
|
|
|
saveBlockEventStack(cycles, rate, mp.profStack[:nstk], which)
|
|
|
|
|
releasem(mp)
|
2023-11-21 16:03:54 +00:00
|
|
|
}
|
|
|
|
|
|
runtime: double-link list of waiting Ms
When an M unlocks a contended mutex, it needs to consult a list of the
Ms that had to wait during its critical section. This allows the M to
attribute the appropriate amount of blame to the unlocking call stack.
Mirroring the implementation for users' sync.Mutex contention (via
sudog), we can (in a future commit) use the time that the head and tail
of the wait list started waiting, and the number of waiters, to estimate
the sum of the Ms' delays.
When an M acquires the mutex, it needs to remove itself from the list of
waiters. Since the futex-based lock implementation leaves the OS in
control of the order of M wakeups, we need to be prepared for quickly
(constant time) removing any M from the list.
First, have each M add itself to a singly-linked wait list when it finds
that its lock call will need to sleep. This case is safe against
live-lock, since any delay to one M adding itself to the list would be
due to another M making durable progress.
Second, have the M that holds the lock (either right before releasing,
or right after acquiring) update metadata on the list of waiting Ms to
double-link the list and maintain a tail pointer and waiter count. That
work is amortized-constant: we'll avoid contended locks becoming
proportionally more contended and undergoing performance collapse.
For #66999
Change-Id: If75cdea915afb59ccec47294e0b52c466aac8736
Reviewed-on: https://go-review.googlesource.com/c/go/+/585637
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
2024-05-13 13:00:52 -07:00
|
|
|
// mWaitList is part of the M struct, and holds the list of Ms that are waiting
|
|
|
|
|
// for a particular runtime.mutex.
|
|
|
|
|
//
|
|
|
|
|
// When an M is unable to immediately obtain a mutex, it notes the current time
|
|
|
|
|
// and it adds itself to the list of Ms waiting for the mutex. It does that via
|
|
|
|
|
// this struct's next field, forming a singly-linked list with the mutex's key
|
|
|
|
|
// field pointing to the head of the list.
|
|
|
|
|
//
|
|
|
|
|
// Immediately before releasing the mutex, the previous holder calculates how
|
|
|
|
|
// much delay it caused for the Ms that had to wait. First, it sets the prev
|
|
|
|
|
// links of each node in the list -- starting at the head and continuing until
|
|
|
|
|
// it finds the portion of the list that is already doubly linked. That part of
|
|
|
|
|
// the list also has correct values for the tail pointer and the waiters count,
|
|
|
|
|
// which we'll apply to the head of the wait list. This is amortized-constant
|
|
|
|
|
// work, though it takes place within the critical section of the contended
|
|
|
|
|
// mutex.
|
|
|
|
|
//
|
|
|
|
|
// Having found the head and tail nodes and a correct waiters count, the
|
2024-05-29 16:36:36 +00:00
|
|
|
// unlocking M can read and update those two nodes' acquireTimes fields and thus
|
2024-05-13 13:00:52 -07:00
|
|
|
// take responsibility for (an estimate of) the entire list's delay since the
|
|
|
|
|
// last unlock call.
|
|
|
|
|
//
|
|
|
|
|
// Finally, the M that is then able to acquire the mutex needs to remove itself
|
|
|
|
|
// from the list of waiters. This is simpler than with many lock-free linked
|
|
|
|
|
// lists, since deletion here is guarded by the mutex itself. If the M's prev
|
|
|
|
|
// field isn't set and also isn't at the head of the list, it does the same
|
|
|
|
|
// amortized-constant double-linking as in unlock, enabling quick deletion
|
|
|
|
|
// regardless of where the M is in the list. Note that with lock_sema.go the
|
|
|
|
|
// runtime controls the order of thread wakeups (it's a LIFO stack), but with
|
|
|
|
|
// lock_futex.go the OS can wake an arbitrary thread.
|
|
|
|
|
type mWaitList struct {
|
2024-05-29 16:36:36 +00:00
|
|
|
acquireTimes timePair // start of current wait (set by us, updated by others during unlock)
|
2024-05-13 13:00:52 -07:00
|
|
|
next muintptr // next m waiting for lock (set by us, cleared by another during unlock)
|
|
|
|
|
prev muintptr // previous m waiting for lock (an amortized hint, set by another during unlock)
|
|
|
|
|
tail muintptr // final m waiting for lock (an amortized hint, set by others during unlock)
|
|
|
|
|
waiters int32 // length of waiting m list (an amortized hint, set by another during unlock)
|
|
|
|
|
}
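The amortized double-linking described in the mWaitList comment can be sketched with ordinary pointers. This is an illustration only and not the runtime's fixMutexWaitList: the waiter type and the push and fixList helpers are hypothetical, and plain pointers replace muintptr.

```go
package main

import "fmt"

// waiter is a hypothetical stand-in for an M's mWaitList fields.
type waiter struct {
	id      int
	next    *waiter // set when the waiter enqueues itself
	prev    *waiter // hint, filled in lazily by fixList
	tail    *waiter // hint, valid on a node that was once the head
	waiters int     // hint, list length from this node to the tail
}

// push adds w at the head of a singly-linked list and returns the new head.
func push(head, w *waiter) *waiter {
	w.next = head
	return w
}

// fixList walks from the head, setting prev links, until it reaches a node
// whose tail is already cached. It then copies the tail and the total count
// onto the head. Each node is fully visited only once across calls.
func fixList(head *waiter) {
	if head == nil {
		return
	}
	node, count := head, 0
	var tail *waiter
	for {
		if node.tail != nil {
			// Already-fixed portion: reuse its cached tail and count.
			tail = node.tail
			count += node.waiters
			break
		}
		count++
		if node.next == nil {
			tail = node
			break
		}
		node.next.prev = node
		node = node.next
	}
	head.tail = tail
	head.waiters = count
}

func main() {
	var head *waiter
	for id := 1; id <= 3; id++ {
		head = push(head, &waiter{id: id})
	}
	fixList(head)
	fmt.Println("head", head.id, "tail", head.tail.id, "waiters", head.waiters)

	head = push(head, &waiter{id: 4})
	fixList(head) // stops after one step, at the previously fixed head
	fmt.Println("head", head.id, "tail", head.tail.id, "waiters", head.waiters)
}
```

The second fixList call touches only the new head and its successor, which is what makes the work amortized-constant even as the list grows.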
|
|
|
|
|
|
2024-05-29 16:36:36 +00:00
|
|
|
type timePair struct {
|
|
|
|
|
nanotime int64
|
|
|
|
|
cputicks int64
|
|
|
|
|
}
|
|
|
|
|
|
2024-05-13 13:00:52 -07:00
|
|
|
// clearLinks resets the fields related to the M's position in the list of Ms
|
2024-05-29 16:36:36 +00:00
|
|
|
// waiting for a mutex. It leaves acquireTimes intact, since this M may still be
|
|
|
|
|
// waiting and may have had its acquireTimes updated by an unlock2 call.
|
2024-05-13 13:00:52 -07:00
|
|
|
//
|
|
|
|
|
// In lock_sema.go, the previous owner of the mutex dequeues an M and then wakes
|
|
|
|
|
// it; with semaphore-based sleep, it's important that each M receives only one
|
|
|
|
|
// wakeup for each time they sleep. If the dequeued M fails to obtain the lock,
|
|
|
|
|
// it will need to sleep again -- and may have a different position in the list.
|
|
|
|
|
//
|
|
|
|
|
// With lock_futex.go, each thread is responsible for removing itself from the
|
|
|
|
|
// list, upon securing ownership of the mutex.
|
|
|
|
|
//
|
|
|
|
|
// Called while stack splitting is disabled in lock2.
|
|
|
|
|
//
|
|
|
|
|
//go:nosplit
|
|
|
|
|
func (l *mWaitList) clearLinks() {
|
|
|
|
|
l.next = 0
|
|
|
|
|
l.prev = 0
|
|
|
|
|
l.tail = 0
|
|
|
|
|
l.waiters = 0
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// verifyMutexWaitList instructs fixMutexWaitList to confirm that the mutex wait
|
|
|
|
|
// list invariants are intact. Operations on the list are typically
|
|
|
|
|
// amortized-constant; but when active, these extra checks require visiting
|
|
|
|
|
// every other M that is waiting for the lock.
|
|
|
|
|
const verifyMutexWaitList = false
|
|
|
|
|
|
|
|
|
|
// fixMutexWaitList restores the invariants of the linked list of Ms waiting for
|
|
|
|
|
// a particular mutex.
|
|
|
|
|
//
|
|
|
|
|
// It takes as an argument the pointer bits of the mutex's key. (The caller is
|
|
|
|
|
// responsible for clearing flag values.)
|
|
|
|
|
//
|
|
|
|
|
// On return, the list will be doubly-linked, and the head of the list (if not
|
|
|
|
|
// nil) will point to an M where mWaitList.tail points to the end of the linked
|
|
|
|
|
// list and where mWaitList.waiters is the number of Ms in the list.
|
|
|
|
|
//
|
|
|
|
|
// The caller must hold the mutex that the Ms of the list are waiting to
|
|
|
|
|
// acquire.
|
|
|
|
|
//
|
|
|
|
|
// Called while stack splitting is disabled in lock2.
|
|
|
|
|
//
|
|
|
|
|
//go:nosplit
|
|
|
|
|
func fixMutexWaitList(head muintptr) {
|
|
|
|
|
if head == 0 {
|
|
|
|
|
return
|
|
|
|
|
}
|
|
|
|
|
hp := head.ptr()
|
|
|
|
|
node := hp
|
|
|
|
|
|
|
|
|
|
var waiters int32
|
|
|
|
|
var tail *m
|
|
|
|
|
for {
|
|
|
|
|
// For amortized-constant cost, stop searching once we reach part of the
|
|
|
|
|
// list that's been visited before. Identify it by the presence of a
|
|
|
|
|
// tail pointer.
|
|
|
|
|
if node.mWaitList.tail.ptr() != nil {
|
|
|
|
|
tail = node.mWaitList.tail.ptr()
|
|
|
|
|
waiters += node.mWaitList.waiters
|
|
|
|
|
break
|
|
|
|
|
}
|
|
|
|
|
waiters++
|
|
|
|
|
|
|
|
|
|
next := node.mWaitList.next.ptr()
|
|
|
|
|
if next == nil {
|
|
|
|
|
break
|
|
|
|
|
}
|
|
|
|
|
next.mWaitList.prev.set(node)
|
|
|
|
|
|
|
|
|
|
node = next
|
|
|
|
|
}
|
|
|
|
|
if tail == nil {
|
|
|
|
|
tail = node
|
|
|
|
|
}
|
|
|
|
|
hp.mWaitList.tail.set(tail)
|
|
|
|
|
hp.mWaitList.waiters = waiters
|
|
|
|
|
|
|
|
|
|
if verifyMutexWaitList {
|
|
|
|
|
var revisit int32
|
|
|
|
|
var reTail *m
|
|
|
|
|
for node := hp; node != nil; node = node.mWaitList.next.ptr() {
|
|
|
|
|
revisit++
|
|
|
|
|
reTail = node
|
|
|
|
|
}
|
|
|
|
|
if revisit != waiters {
|
|
|
|
|
throw("miscounted mutex waiters")
|
|
|
|
|
}
|
|
|
|
|
if reTail != tail {
|
|
|
|
|
throw("incorrect mutex wait list tail")
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
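
// The amortized fix-up performed by fixMutexWaitList can be sketched outside
// the runtime. This is a minimal illustration, not the runtime's code: the
// waiter type and the push and fix helpers are hypothetical stand-ins that use
// ordinary pointers in place of m and muintptr. New waiters are pushed at the
// front with only a next link; fix walks from the head until it reaches a node
// that already carries a tail pointer, so each node is newly visited only once.

```go
package main

import "fmt"

// waiter is a hypothetical stand-in for an M and its mWaitList fields.
type waiter struct {
	id      int
	next    *waiter
	prev    *waiter
	tail    *waiter // meaningful only on a node that has been the head
	waiters int     // meaningful only on a node that has been the head
}

// push adds w at the front of the list using only the next link,
// the way a sleeping waiter would enqueue itself.
func push(head, w *waiter) *waiter {
	w.next = head
	return w
}

// fix double-links the list and stores the tail and waiter count on head.
// It returns how many nodes it had to look at.
func fix(head *waiter) (visited int) {
	if head == nil {
		return 0
	}
	node := head
	count := 0
	var tail *waiter
	for {
		visited++
		// A tail pointer marks the part of the list fixed by an earlier call.
		if node.tail != nil {
			tail = node.tail
			count += node.waiters
			break
		}
		count++
		next := node.next
		if next == nil {
			break
		}
		next.prev = node
		node = next
	}
	if tail == nil {
		tail = node
	}
	head.tail = tail
	head.waiters = count
	return visited
}

func main() {
	var head *waiter
	for id := 1; id <= 3; id++ {
		head = push(head, &waiter{id: id})
	}
	visited := fix(head)
	fmt.Println("visited:", visited, "waiters:", head.waiters, "tail:", head.tail.id)

	// Two more waiters arrive; the second fix stops at the old head.
	for id := 4; id <= 5; id++ {
		head = push(head, &waiter{id: id})
	}
	visited = fix(head)
	fmt.Println("visited:", visited, "waiters:", head.waiters, "tail:", head.tail.id)
}
```

// In this sketch the second call looks at the two new nodes plus the old head,
// rather than all five nodes, which is the property the comment above calls
// amortized-constant.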

// removeMutexWaitList removes mp from the list of Ms waiting for a particular
// mutex. It relies on (and keeps up to date) the invariants that
// fixMutexWaitList establishes and repairs.
//
// It modifies the nodes that are to remain in the list. It returns the value to
// assign as the head of the list, with the caller responsible for ensuring that
// the (atomic, contended) head assignment worked and subsequently clearing the
// list-related fields of mp.
//
// The only change it makes to mp is to clear the tail field -- so a subsequent
// call to fixMutexWaitList will be able to re-establish the prev link from its
// next node (just in time for another removeMutexWaitList call to clear it
// again).
//
// The caller must hold the mutex that the Ms of the list are waiting to
// acquire.
//
// Called while stack splitting is disabled in lock2.
//
//go:nosplit
func removeMutexWaitList(head muintptr, mp *m) muintptr {
	if head == 0 {
		return 0
	}
	hp := head.ptr()
	tail := hp.mWaitList.tail
	waiters := hp.mWaitList.waiters
	headTimes := hp.mWaitList.acquireTimes
	tailTimes := hp.mWaitList.tail.ptr().mWaitList.acquireTimes

	mp.mWaitList.tail = 0

	if head.ptr() == mp {
		// mp is the head
		if mp.mWaitList.prev.ptr() != nil {
			throw("removeMutexWaitList node at head of list, but has prev field set")
		}
		head = mp.mWaitList.next
	} else {
		// mp is not the head
		if mp.mWaitList.prev.ptr() == nil {
			throw("removeMutexWaitList node not in list (not at head, no prev pointer)")
		}
		mp.mWaitList.prev.ptr().mWaitList.next = mp.mWaitList.next
		if tail.ptr() == mp {
			// mp is the tail
			if mp.mWaitList.next.ptr() != nil {
				throw("removeMutexWaitList node at tail of list, but has next field set")
			}
			tail = mp.mWaitList.prev
		} else {
			if mp.mWaitList.next.ptr() == nil {
				throw("removeMutexWaitList node in body of list, but without next field set")
			}
			mp.mWaitList.next.ptr().mWaitList.prev = mp.mWaitList.prev
		}
	}

	// head and tail nodes are responsible for having current versions of
	// certain metadata
	if hp := head.ptr(); hp != nil {
		hp.mWaitList.prev = 0
		hp.mWaitList.tail = tail
		hp.mWaitList.waiters = waiters - 1
		hp.mWaitList.acquireTimes = headTimes
	}
	if tp := tail.ptr(); tp != nil {
		tp.mWaitList.acquireTimes = tailTimes
	}
	return head
}
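
// The three removal cases (head, tail, body) can be exercised in a small
// standalone sketch. The waiter type and the link, remove, and ids helpers are
// hypothetical and use plain pointers; the sanity checks that the runtime
// performs with throw are omitted, and acquireTimes is left out. The point is
// only that removal touches a fixed number of nodes and keeps the tail and
// count on whichever node ends up as head.

```go
package main

import "fmt"

// waiter is a hypothetical stand-in for an M and its mWaitList fields.
type waiter struct {
	id         int
	next, prev *waiter
	tail       *waiter
	waiters    int
}

// link builds a list that already satisfies the invariants:
// doubly linked, with tail and count stored on the head.
func link(ws ...*waiter) *waiter {
	if len(ws) == 0 {
		return nil
	}
	for i, w := range ws {
		if i > 0 {
			w.prev = ws[i-1]
		}
		if i+1 < len(ws) {
			w.next = ws[i+1]
		}
	}
	ws[0].tail = ws[len(ws)-1]
	ws[0].waiters = len(ws)
	return ws[0]
}

// remove unlinks w without walking the list and returns the new head.
// It assumes w really is in the list headed by head.
func remove(head, w *waiter) *waiter {
	if head == nil {
		return nil
	}
	tail, n := head.tail, head.waiters
	w.tail = nil
	if head == w {
		head = w.next
	} else {
		w.prev.next = w.next
		if tail == w {
			tail = w.prev
		} else {
			w.next.prev = w.prev
		}
	}
	// The (possibly new) head carries the current tail and count.
	if head != nil {
		head.prev = nil
		head.tail = tail
		head.waiters = n - 1
	}
	return head
}

// ids lists the node ids from head to tail.
func ids(head *waiter) []int {
	var out []int
	for n := head; n != nil; n = n.next {
		out = append(out, n.id)
	}
	return out
}

func main() {
	a, b, c, d := &waiter{id: 1}, &waiter{id: 2}, &waiter{id: 3}, &waiter{id: 4}
	head := link(a, b, c, d)

	head = remove(head, c) // body node
	fmt.Println(ids(head), head.waiters, head.tail.id)

	head = remove(head, d) // tail node
	fmt.Println(ids(head), head.waiters, head.tail.id)

	head = remove(head, a) // head node
	fmt.Println(ids(head), head.waiters, head.tail.id)
}
```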

// claimMutexWaitTime advances the acquireTimes of the list of waiting Ms at
// head to now, returning an estimate of the total wait time claimed by that
// action.
func claimMutexWaitTime(now timePair, head muintptr) timePair {
	fixMutexWaitList(head)
	hp := head.ptr()
	if hp == nil {
		return timePair{}
	}
	tp := hp.mWaitList.tail.ptr()
	waiters := hp.mWaitList.waiters
	headTimes := hp.mWaitList.acquireTimes
	tailTimes := tp.mWaitList.acquireTimes

	var dt timePair
	dt.nanotime = now.nanotime - headTimes.nanotime
	dt.cputicks = now.cputicks - headTimes.cputicks
	if waiters > 1 {
		dt.nanotime = int64(waiters) * (dt.nanotime + now.nanotime - tailTimes.nanotime) / 2
		dt.cputicks = int64(waiters) * (dt.cputicks + now.cputicks - tailTimes.cputicks) / 2
	}

	// When removeMutexWaitList removes a head or tail node, it's responsible
	// for applying these changes to the new head or tail.
	hp.mWaitList.acquireTimes = now
	tp.mWaitList.acquireTimes = now

	return dt
}
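
// The estimate above multiplies the waiter count by the average of the head's
// and the tail's wait. A worked example for a single clock, as a sketch: the
// estimate function below is a hypothetical helper that repeats the same
// arithmetic on plain int64 values instead of timePair. When the waiters
// started at evenly spaced times the estimate equals the exact sum; otherwise
// it is only an approximation.

```go
package main

import "fmt"

// estimate repeats the arithmetic of claimMutexWaitTime for one clock.
// headStart and tailStart are when the head and tail waiters began waiting.
func estimate(now, headStart, tailStart int64, waiters int32) int64 {
	dt := now - headStart
	if waiters > 1 {
		// count * (head wait + tail wait) / 2
		dt = int64(waiters) * (dt + now - tailStart) / 2
	}
	return dt
}

func main() {
	// Three waiters started at t=100, 200, 300 and it is now t=1000.
	// The exact total is 900 + 800 + 700 = 2400.
	fmt.Println(estimate(1000, 300, 100, 3)) // prints 2400

	// Unevenly spaced starts (100, 150, 300): the exact total is 2450,
	// while the estimate still only sees the two end points.
	fmt.Println(estimate(1000, 300, 100, 3) - (900 + 850 + 700)) // prints -50

	// A single waiter is charged exactly its own wait.
	fmt.Println(estimate(1000, 300, 300, 1)) // prints 700
}
```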

// mLockProfile is part of the M struct to hold information relating to mutex
// contention delay attributed to this M.
//
// Adding records to the process-wide mutex contention profile involves
// acquiring mutexes, so the M uses this to buffer a single contention event
// until it can safely transfer it to the shared profile.
//
// When the M unlocks its last mutex, it transfers the local buffer into the
// profile. As part of that step, it also transfers any "additional contention"
// time to the profile. Any lock contention that it experiences while adding
// samples to the profile will be recorded later as "additional contention" and
// not include a call stack, to avoid an echo.
type mLockProfile struct {
	waitTime   atomic.Int64 // total nanoseconds spent waiting in runtime.lockWithRank
	stack      []uintptr    // unlock stack that caused delay in other Ms' runtime.lockWithRank
	cycles     int64        // cycles attributable to "stack"
	cyclesLost int64        // contention for which we weren't able to record a call stack
	disabled   bool         // attribute all time to "lost"
}

// recordUnlock considers the current unlock call (which caused a total of dt
// delay in other Ms) for later inclusion in the mutex contention profile. If
// this M holds no other locks, it transfers the buffered contention record to
// the mutex contention profile.
//
// From unlock2, we might not be holding a p in this code.
//
//go:nowritebarrierrec
func (prof *mLockProfile) recordUnlock(dt timePair) {
	if dt != (timePair{}) {
		// We could make a point of clearing out the local storage right before
		// this, to have a slightly better chance of being able to see the call
		// stack if the program has several (nested) contended locks. If apps
		// are seeing a lot of _LostContendedRuntimeLock samples, maybe that'll
		// be a worthwhile change.
		prof.proposeUnlock(dt)
	}
	if getg().m.locks == 1 && prof.cycles != 0 {
		prof.store()
	}
}

func (prof *mLockProfile) proposeUnlock(dt timePair) {
	if nanos := dt.nanotime; nanos > 0 {
		prof.waitTime.Add(nanos)
	}

	cycles := dt.cputicks
	if cycles <= 0 {
		return
	}

	rate := int64(atomic.Load64(&mutexprofilerate))
	if rate <= 0 || int64(cheaprand())%rate != 0 {
		return
	}

	if prof.disabled {
		// We're experiencing contention while attempting to report contention.
		// Make a note of its magnitude, but don't allow it to be the sole cause
		// of another contention report.
		prof.cyclesLost += cycles
		return
	}

	if prev := prof.cycles; prev > 0 {
		// We can only store one call stack for runtime-internal lock contention
		// on this M, and we've already got one. Decide which should stay, and
		// add the other to the report for runtime._LostContendedRuntimeLock.
math/rand, math/rand/v2: use ChaCha8 for global rand
Move ChaCha8 code into internal/chacha8rand and use it to implement
runtime.rand, which is used for the unseeded global source for
both math/rand and math/rand/v2. This also affects the calculation of
the start point for iteration over very very large maps (when the
32-bit fastrand is not big enough).
The benefit is that misuse of the global random number generators
in math/rand and math/rand/v2 in contexts where non-predictable
randomness is important for security reasons is no longer a
security problem, removing a common mistake among programmers
who are unaware of the different kinds of randomness.
The cost is an extra 304 bytes per thread stored in the m struct
plus 2-3ns more per random uint64 due to the more sophisticated
algorithm. Using PCG looks like it would cost about the same,
although I haven't benchmarked that.
Before this, the math/rand and math/rand/v2 global generator
was wyrand (https://github.com/wangyi-fudan/wyhash).
For math/rand, using wyrand instead of the Mitchell/Reeds/Thompson
ALFG was justifiable, since the latter was not any better.
But for math/rand/v2, the global generator really should be
at least as good as one of the well-studied, specific algorithms
provided directly by the package, and it's not.
(Wyrand is still reasonable for scheduling and cache decisions.)
Good randomness does have a cost: about twice wyrand.
Also rationalize the various runtime rand references.
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ bbb48afeb7.amd64 │ 5cf807d1ea.amd64 │
│ sec/op │ sec/op vs base │
ChaCha8-32 1.862n ± 2% 1.861n ± 2% ~ (p=0.825 n=20)
PCG_DXSM-32 1.471n ± 1% 1.460n ± 2% ~ (p=0.153 n=20)
SourceUint64-32 1.636n ± 2% 1.582n ± 1% -3.30% (p=0.000 n=20)
GlobalInt64-32 2.087n ± 1% 3.663n ± 1% +75.54% (p=0.000 n=20)
GlobalInt64Parallel-32 0.1042n ± 1% 0.2026n ± 1% +94.48% (p=0.000 n=20)
GlobalUint64-32 2.263n ± 2% 3.724n ± 1% +64.57% (p=0.000 n=20)
GlobalUint64Parallel-32 0.1019n ± 1% 0.1973n ± 1% +93.67% (p=0.000 n=20)
Int64-32 1.771n ± 1% 1.774n ± 1% ~ (p=0.449 n=20)
Uint64-32 1.863n ± 2% 1.866n ± 1% ~ (p=0.364 n=20)
GlobalIntN1000-32 3.134n ± 3% 4.730n ± 2% +50.95% (p=0.000 n=20)
IntN1000-32 2.489n ± 1% 2.489n ± 1% ~ (p=0.683 n=20)
Int64N1000-32 2.521n ± 1% 2.516n ± 1% ~ (p=0.394 n=20)
Int64N1e8-32 2.479n ± 1% 2.478n ± 2% ~ (p=0.743 n=20)
Int64N1e9-32 2.530n ± 2% 2.514n ± 2% ~ (p=0.193 n=20)
Int64N2e9-32 2.501n ± 1% 2.494n ± 1% ~ (p=0.616 n=20)
Int64N1e18-32 3.227n ± 1% 3.205n ± 1% ~ (p=0.101 n=20)
Int64N2e18-32 3.647n ± 1% 3.599n ± 1% ~ (p=0.019 n=20)
Int64N4e18-32 5.135n ± 1% 5.069n ± 2% ~ (p=0.034 n=20)
Int32N1000-32 2.657n ± 1% 2.637n ± 1% ~ (p=0.180 n=20)
Int32N1e8-32 2.636n ± 1% 2.636n ± 1% ~ (p=0.763 n=20)
Int32N1e9-32 2.660n ± 2% 2.638n ± 1% ~ (p=0.358 n=20)
Int32N2e9-32 2.662n ± 2% 2.618n ± 2% ~ (p=0.064 n=20)
Float32-32 2.272n ± 2% 2.239n ± 2% ~ (p=0.194 n=20)
Float64-32 2.272n ± 1% 2.286n ± 2% ~ (p=0.763 n=20)
ExpFloat64-32 3.762n ± 1% 3.744n ± 1% ~ (p=0.171 n=20)
NormFloat64-32 3.706n ± 1% 3.655n ± 2% ~ (p=0.066 n=20)
Perm3-32 32.93n ± 3% 34.62n ± 1% +5.13% (p=0.000 n=20)
Perm30-32 202.9n ± 1% 204.0n ± 1% ~ (p=0.482 n=20)
Perm30ViaShuffle-32 115.0n ± 1% 114.9n ± 1% ~ (p=0.358 n=20)
ShuffleOverhead-32 112.8n ± 1% 112.7n ± 1% ~ (p=0.692 n=20)
Concurrent-32 2.107n ± 0% 3.725n ± 1% +76.75% (p=0.000 n=20)
goos: darwin
goarch: arm64
pkg: math/rand/v2
│ bbb48afeb7.arm64 │ 5cf807d1ea.arm64 │
│ sec/op │ sec/op vs base │
ChaCha8-8 2.480n ± 0% 2.429n ± 0% -2.04% (p=0.000 n=20)
PCG_DXSM-8 2.531n ± 0% 2.530n ± 0% ~ (p=0.877 n=20)
SourceUint64-8 2.534n ± 0% 2.533n ± 0% ~ (p=0.732 n=20)
GlobalInt64-8 2.172n ± 1% 4.794n ± 0% +120.67% (p=0.000 n=20)
GlobalInt64Parallel-8 0.4320n ± 0% 0.9605n ± 0% +122.32% (p=0.000 n=20)
GlobalUint64-8 2.182n ± 0% 4.770n ± 0% +118.58% (p=0.000 n=20)
GlobalUint64Parallel-8 0.4307n ± 0% 0.9583n ± 0% +122.51% (p=0.000 n=20)
Int64-8 4.107n ± 0% 4.104n ± 0% ~ (p=0.416 n=20)
Uint64-8 4.080n ± 0% 4.080n ± 0% ~ (p=0.052 n=20)
GlobalIntN1000-8 2.814n ± 2% 5.643n ± 0% +100.50% (p=0.000 n=20)
IntN1000-8 4.141n ± 0% 4.139n ± 0% ~ (p=0.140 n=20)
Int64N1000-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.313 n=20)
Int64N1e8-8 4.140n ± 0% 4.139n ± 0% ~ (p=0.103 n=20)
Int64N1e9-8 4.139n ± 0% 4.140n ± 0% ~ (p=0.761 n=20)
Int64N2e9-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.636 n=20)
Int64N1e18-8 5.266n ± 0% 5.326n ± 1% +1.14% (p=0.001 n=20)
Int64N2e18-8 6.052n ± 0% 6.167n ± 0% +1.90% (p=0.000 n=20)
Int64N4e18-8 8.826n ± 0% 9.051n ± 0% +2.55% (p=0.000 n=20)
Int32N1000-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20)
Int32N1e8-8 4.126n ± 0% 4.131n ± 0% +0.12% (p=0.000 n=20)
Int32N1e9-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20)
Int32N2e9-8 4.132n ± 0% 4.131n ± 0% ~ (p=0.017 n=20)
Float32-8 4.109n ± 0% 4.105n ± 0% ~ (p=0.379 n=20)
Float64-8 4.107n ± 0% 4.106n ± 0% ~ (p=0.867 n=20)
ExpFloat64-8 5.339n ± 0% 5.383n ± 0% +0.82% (p=0.000 n=20)
NormFloat64-8 5.735n ± 0% 5.737n ± 1% ~ (p=0.856 n=20)
Perm3-8 26.65n ± 0% 26.80n ± 1% +0.58% (p=0.000 n=20)
Perm30-8 194.8n ± 1% 197.0n ± 0% +1.18% (p=0.000 n=20)
Perm30ViaShuffle-8 156.6n ± 0% 157.6n ± 1% +0.61% (p=0.000 n=20)
ShuffleOverhead-8 124.9n ± 0% 125.5n ± 0% +0.52% (p=0.000 n=20)
Concurrent-8 2.434n ± 3% 5.066n ± 0% +108.09% (p=0.000 n=20)
goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ bbb48afeb7.386 │ 5cf807d1ea.386 │
│ sec/op │ sec/op vs base │
ChaCha8-32 11.295n ± 1% 4.748n ± 2% -57.96% (p=0.000 n=20)
PCG_DXSM-32 7.693n ± 1% 7.738n ± 2% ~ (p=0.542 n=20)
SourceUint64-32 7.658n ± 2% 7.622n ± 2% ~ (p=0.344 n=20)
GlobalInt64-32 3.473n ± 2% 7.526n ± 2% +116.73% (p=0.000 n=20)
GlobalInt64Parallel-32 0.3198n ± 0% 0.5444n ± 0% +70.22% (p=0.000 n=20)
GlobalUint64-32 3.612n ± 0% 7.575n ± 1% +109.69% (p=0.000 n=20)
GlobalUint64Parallel-32 0.3168n ± 0% 0.5403n ± 0% +70.51% (p=0.000 n=20)
Int64-32 7.673n ± 2% 7.789n ± 1% ~ (p=0.122 n=20)
Uint64-32 7.773n ± 1% 7.827n ± 2% ~ (p=0.920 n=20)
GlobalIntN1000-32 6.268n ± 1% 9.581n ± 1% +52.87% (p=0.000 n=20)
IntN1000-32 10.33n ± 2% 10.45n ± 1% ~ (p=0.233 n=20)
Int64N1000-32 10.98n ± 2% 11.01n ± 1% ~ (p=0.401 n=20)
Int64N1e8-32 11.19n ± 2% 10.97n ± 1% ~ (p=0.033 n=20)
Int64N1e9-32 11.06n ± 1% 11.08n ± 1% ~ (p=0.498 n=20)
Int64N2e9-32 11.10n ± 1% 11.01n ± 2% ~ (p=0.995 n=20)
Int64N1e18-32 15.23n ± 2% 15.04n ± 1% ~ (p=0.973 n=20)
Int64N2e18-32 15.89n ± 1% 15.85n ± 1% ~ (p=0.409 n=20)
Int64N4e18-32 18.96n ± 2% 19.34n ± 2% ~ (p=0.048 n=20)
Int32N1000-32 10.46n ± 2% 10.44n ± 2% ~ (p=0.480 n=20)
Int32N1e8-32 10.46n ± 2% 10.49n ± 2% ~ (p=0.951 n=20)
Int32N1e9-32 10.28n ± 2% 10.26n ± 1% ~ (p=0.431 n=20)
Int32N2e9-32 10.50n ± 2% 10.44n ± 2% ~ (p=0.249 n=20)
Float32-32 13.80n ± 2% 13.80n ± 2% ~ (p=0.751 n=20)
Float64-32 23.55n ± 2% 23.87n ± 0% ~ (p=0.408 n=20)
ExpFloat64-32 15.36n ± 1% 15.29n ± 2% ~ (p=0.316 n=20)
NormFloat64-32 13.57n ± 1% 13.79n ± 1% +1.66% (p=0.005 n=20)
Perm3-32 45.70n ± 2% 46.99n ± 2% +2.81% (p=0.001 n=20)
Perm30-32 399.0n ± 1% 403.8n ± 1% +1.19% (p=0.006 n=20)
Perm30ViaShuffle-32 349.0n ± 1% 350.4n ± 1% ~ (p=0.909 n=20)
ShuffleOverhead-32 322.3n ± 1% 323.8n ± 1% ~ (p=0.410 n=20)
Concurrent-32 3.331n ± 1% 7.312n ± 1% +119.50% (p=0.000 n=20)
For #61716.
Change-Id: Ibdddeed85c34d9ae397289dc899e04d4845f9ed2
Reviewed-on: https://go-review.googlesource.com/c/go/+/516860
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-08-06 13:26:28 +10:00
		prevScore := uint64(cheaprand64()) % uint64(prev)
		thisScore := uint64(cheaprand64()) % uint64(cycles)
		if prevScore > thisScore {
			prof.cyclesLost += cycles
			return
		} else {
			prof.cyclesLost += prev
		}
	}
	prof.cycles = cycles
	prof.captureStack()
}

func (prof *mLockProfile) captureStack() {
	if debug.profstackdepth == 0 {
		// profstackdepth is set to 0 by the user, so mp.profStack is nil and we
		// can't record a stack trace.
		return
	}

	skip := 4 // runtime.(*mLockProfile).proposeUnlock runtime.(*mLockProfile).recordUnlock runtime.unlock2 runtime.unlockWithRank
	if staticLockRanking {
		// When static lock ranking is enabled, we'll always be on the system
		// stack at this point. There will be a runtime.unlockWithRank.func1
		// frame, and if the call to runtime.unlock took place on a user stack
		// then there'll also be a runtime.systemstack frame. To keep stack
		// traces somewhat consistent whether or not static lock ranking is
		// enabled, we'd like to skip those. But it's hard to tell how long
		// we've been on the system stack so accept an extra frame in that case,
		// with a leaf of "runtime.unlockWithRank runtime.unlock" instead of
		// "runtime.unlock".
		skip += 1 // runtime.unlockWithRank.func1
	}

	prof.stack[0] = logicalStackSentinel

	var nstk int
	gp := getg()
	sp := getcallersp()
	pc := getcallerpc()
	systemstack(func() {
		var u unwinder
		u.initAt(pc, sp, 0, gp, unwindSilentErrors|unwindJumpStack)
		nstk = 1 + tracebackPCs(&u, skip, prof.stack[1:])
	})
	if nstk < len(prof.stack) {
		prof.stack[nstk] = 0
	}
}

func (prof *mLockProfile) store() {
	// Report any contention we experience within this function as "lost"; it's
	// important that the act of reporting a contention event not lead to a
	// reportable contention event. This also means we can use prof.stack
	// without copying, since it won't change during this function.
	mp := acquirem()
	prof.disabled = true

	nstk := int(debug.profstackdepth)
	for i := 0; i < nstk; i++ {
		if pc := prof.stack[i]; pc == 0 {
			nstk = i
			break
		}
	}

	cycles, lost := prof.cycles, prof.cyclesLost
	prof.cycles, prof.cyclesLost = 0, 0

	rate := int64(atomic.Load64(&mutexprofilerate))
	saveBlockEventStack(cycles, rate, prof.stack[:nstk], mutexProfile)
	if lost > 0 {
		lostStk := [...]uintptr{
			logicalStackSentinel,
			abi.FuncPCABIInternal(_LostContendedRuntimeLock) + sys.PCQuantum,
		}
		saveBlockEventStack(lost, rate, lostStk[:], mutexProfile)
	}

	prof.disabled = false
	releasem(mp)
}

func saveBlockEventStack(cycles, rate int64, stk []uintptr, which bucketType) {
	b := stkbucket(which, 0, stk, true)
|
runtime: split mprof locks
The profiles for memory allocations, sync.Mutex contention, and general
blocking store their data in a shared hash table. The bookkeeping work
at the end of a garbage collection cycle involves maintenance on each
memory allocation record. Previously, a single lock guarded access to
the hash table and the contents of all records. When a program has
allocated memory at a large number of unique call stacks, the
maintenance following every garbage collection can hold that lock for
several milliseconds. That can prevent progress on all other goroutines
by delaying acquirep's call to mcache.prepareForSweep, which needs the
lock in mProf_Free to report when a profiled allocation is no longer in
use. With no user goroutines making progress, it is in effect a
multi-millisecond GC-related stop-the-world pause.
Split the lock so the call to mProf_Flush no longer delays each P's call
to mProf_Free: mProf_Free uses a lock on the memory records' N+1 cycle,
and mProf_Flush uses locks on the memory records' accumulator and their
N cycle. mProf_Malloc also no longer competes with mProf_Flush, as it
uses a lock on the memory records' N+2 cycle. The profiles for
sync.Mutex contention and general blocking now share a separate lock,
and another lock guards insertions to the shared hash table (uncommon in
the steady-state). Consumers of each type of profile take the matching
accumulator lock, so will observe consistent count and magnitude values
for each record.
For #45894
Change-Id: I615ff80618d10e71025423daa64b0b7f9dc57daa
Reviewed-on: https://go-review.googlesource.com/c/go/+/399956
Reviewed-by: Carlos Amedee <carlos@golang.org>
Run-TryBot: Rhys Hiltner <rhys@justin.tv>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
2022-04-01 12:56:49 -07:00
|
|
|
bp := b.bp()
|
2021-02-26 14:41:19 +01:00
|
|
|
|
2022-04-01 12:56:49 -07:00
|
|
|
lock(&profBlockLock)
|
2022-10-12 10:05:51 -04:00
|
|
|
// We want to up-scale the count and cycles according to the
|
|
|
|
|
// probability that the event was sampled. For block profile events,
|
|
|
|
|
// the sample probability is 1 if cycles >= rate, and cycles / rate
|
|
|
|
|
// otherwise. For mutex profile events, the sample probability is 1 / rate.
|
|
|
|
|
// We scale the events by 1 / (probability the event was sampled).
|
2021-02-26 14:41:19 +01:00
|
|
|
if which == blockProfile && cycles < rate {
|
|
|
|
|
// Remove sampling bias, see discussion on http://golang.org/cl/299991.
|
2022-04-01 12:56:49 -07:00
|
|
|
bp.count += float64(rate) / float64(cycles)
|
|
|
|
|
bp.cycles += rate
|
2022-10-12 10:05:51 -04:00
|
|
|
} else if which == mutexProfile {
|
|
|
|
|
bp.count += float64(rate)
|
|
|
|
|
bp.cycles += rate * cycles
|
2021-02-26 14:41:19 +01:00
|
|
|
} else {
|
2022-04-01 12:56:49 -07:00
|
|
|
bp.count++
|
|
|
|
|
bp.cycles += cycles
|
2021-02-26 14:41:19 +01:00
|
|
|
}
|
2022-04-01 12:56:49 -07:00
|
|
|
unlock(&profBlockLock)
|
2014-09-01 00:06:26 -04:00
|
|
|
}
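The scaling rule described in the comment above can be sketched outside the runtime as follows. This is a minimal illustration, not the runtime's code: `profileKind`, `blockKind`, `mutexKind`, and `scaleEvent` are hypothetical names, and the sketch assumes `cycles > 0` whenever the first branch is taken.

```go
package main

import "fmt"

// profileKind is a hypothetical stand-in for the runtime's bucketType.
type profileKind int

const (
	blockKind profileKind = iota
	mutexKind
)

// scaleEvent mirrors the three branches of saveBlockEventStack. It returns
// the amounts that would be added to a bucket's count and cycles fields.
// Assumption: cycles > 0 whenever the first branch is taken.
func scaleEvent(kind profileKind, cycles, rate int64) (count float64, total int64) {
	switch {
	case kind == blockKind && cycles < rate:
		// Sampled with probability cycles/rate, so weight by rate/cycles.
		return float64(rate) / float64(cycles), rate
	case kind == mutexKind:
		// Sampled with probability 1/rate, so weight by rate.
		return float64(rate), rate * cycles
	default:
		// Block events with cycles >= rate are always sampled.
		return 1, cycles
	}
}

func main() {
	fmt.Println(scaleEvent(blockKind, 250, 1000))  // short block event
	fmt.Println(scaleEvent(blockKind, 2000, 1000)) // long block event
	fmt.Println(scaleEvent(mutexKind, 250, 5))     // mutex event
}
```

In every branch the contribution is the observed event divided by its sampling probability, so the accumulated totals estimate what an unsampled profile would have recorded.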
|
|
|
|
|
|
2016-09-22 09:48:30 -04:00
|
|
|
var mutexprofilerate uint64 // fraction sampled
|
|
|
|
|
|
|
|
|
|
// SetMutexProfileFraction controls the fraction of mutex contention events
|
|
|
|
|
// that are reported in the mutex profile. On average 1/rate events are
|
|
|
|
|
// reported. The previous rate is returned.
|
|
|
|
|
//
|
|
|
|
|
// To turn off profiling entirely, pass rate 0.
|
2018-04-19 12:24:53 -04:00
|
|
|
// To just read the current rate, pass rate < 0.
|
2016-09-22 09:48:30 -04:00
|
|
|
// (For rate > 1 the details of sampling may change.)
|
|
|
|
|
func SetMutexProfileFraction(rate int) int {
|
|
|
|
|
if rate < 0 {
|
|
|
|
|
return int(mutexprofilerate)
|
|
|
|
|
}
|
|
|
|
|
old := mutexprofilerate
|
|
|
|
|
atomic.Store64(&mutexprofilerate, uint64(rate))
|
|
|
|
|
return int(old)
|
|
|
|
|
}
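From user code, the three behaviors documented above (set a rate, read the rate with a negative argument, restore a previous rate) might be used like this. The helpers `currentRate` and `enableMutexProfile` are hypothetical names introduced for the sketch; only `runtime.SetMutexProfileFraction` is the real API.

```go
package main

import (
	"fmt"
	"runtime"
)

// currentRate reads the mutex profile fraction without changing it,
// relying on the documented behavior for rate < 0.
func currentRate() int {
	return runtime.SetMutexProfileFraction(-1)
}

// enableMutexProfile sets the sampling rate and returns a function that
// restores the previous rate. (Hypothetical helper, not part of the runtime.)
func enableMutexProfile(rate int) (restore func()) {
	prev := runtime.SetMutexProfileFraction(rate)
	return func() { runtime.SetMutexProfileFraction(prev) }
}

func main() {
	restore := enableMutexProfile(5)
	fmt.Println("sampling about 1 in", currentRate(), "contention events")
	restore()
	fmt.Println("restored rate:", currentRate())
}
```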
|
|
|
|
|
|
|
|
|
|
//go:linkname mutexevent sync.event
|
|
|
|
|
func mutexevent(cycles int64, skip int) {
|
2016-10-28 15:12:18 -04:00
|
|
|
if cycles < 0 {
|
|
|
|
|
cycles = 0
|
|
|
|
|
}
|
2016-09-22 09:48:30 -04:00
|
|
|
rate := int64(atomic.Load64(&mutexprofilerate))
|
math/rand, math/rand/v2: use ChaCha8 for global rand
Move ChaCha8 code into internal/chacha8rand and use it to implement
runtime.rand, which is used for the unseeded global source for
both math/rand and math/rand/v2. This also affects the calculation of
the start point for iteration over very very large maps (when the
32-bit fastrand is not big enough).
The benefit is that misuse of the global random number generators
in math/rand and math/rand/v2 in contexts where non-predictable
randomness is important for security reasons is no longer a
security problem, removing a common mistake among programmers
who are unaware of the different kinds of randomness.
The cost is an extra 304 bytes per thread stored in the m struct
plus 2-3ns more per random uint64 due to the more sophisticated
algorithm. Using PCG looks like it would cost about the same,
although I haven't benchmarked that.
Before this, the math/rand and math/rand/v2 global generator
was wyrand (https://github.com/wangyi-fudan/wyhash).
For math/rand, using wyrand instead of the Mitchell/Reeds/Thompson
ALFG was justifiable, since the latter was not any better.
But for math/rand/v2, the global generator really should be
at least as good as one of the well-studied, specific algorithms
provided directly by the package, and it's not.
(Wyrand is still reasonable for scheduling and cache decisions.)
Good randomness does have a cost: about twice wyrand.
Also rationalize the various runtime rand references.
goos: linux
goarch: amd64
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ bbb48afeb7.amd64 │ 5cf807d1ea.amd64 │
│ sec/op │ sec/op vs base │
ChaCha8-32 1.862n ± 2% 1.861n ± 2% ~ (p=0.825 n=20)
PCG_DXSM-32 1.471n ± 1% 1.460n ± 2% ~ (p=0.153 n=20)
SourceUint64-32 1.636n ± 2% 1.582n ± 1% -3.30% (p=0.000 n=20)
GlobalInt64-32 2.087n ± 1% 3.663n ± 1% +75.54% (p=0.000 n=20)
GlobalInt64Parallel-32 0.1042n ± 1% 0.2026n ± 1% +94.48% (p=0.000 n=20)
GlobalUint64-32 2.263n ± 2% 3.724n ± 1% +64.57% (p=0.000 n=20)
GlobalUint64Parallel-32 0.1019n ± 1% 0.1973n ± 1% +93.67% (p=0.000 n=20)
Int64-32 1.771n ± 1% 1.774n ± 1% ~ (p=0.449 n=20)
Uint64-32 1.863n ± 2% 1.866n ± 1% ~ (p=0.364 n=20)
GlobalIntN1000-32 3.134n ± 3% 4.730n ± 2% +50.95% (p=0.000 n=20)
IntN1000-32 2.489n ± 1% 2.489n ± 1% ~ (p=0.683 n=20)
Int64N1000-32 2.521n ± 1% 2.516n ± 1% ~ (p=0.394 n=20)
Int64N1e8-32 2.479n ± 1% 2.478n ± 2% ~ (p=0.743 n=20)
Int64N1e9-32 2.530n ± 2% 2.514n ± 2% ~ (p=0.193 n=20)
Int64N2e9-32 2.501n ± 1% 2.494n ± 1% ~ (p=0.616 n=20)
Int64N1e18-32 3.227n ± 1% 3.205n ± 1% ~ (p=0.101 n=20)
Int64N2e18-32 3.647n ± 1% 3.599n ± 1% ~ (p=0.019 n=20)
Int64N4e18-32 5.135n ± 1% 5.069n ± 2% ~ (p=0.034 n=20)
Int32N1000-32 2.657n ± 1% 2.637n ± 1% ~ (p=0.180 n=20)
Int32N1e8-32 2.636n ± 1% 2.636n ± 1% ~ (p=0.763 n=20)
Int32N1e9-32 2.660n ± 2% 2.638n ± 1% ~ (p=0.358 n=20)
Int32N2e9-32 2.662n ± 2% 2.618n ± 2% ~ (p=0.064 n=20)
Float32-32 2.272n ± 2% 2.239n ± 2% ~ (p=0.194 n=20)
Float64-32 2.272n ± 1% 2.286n ± 2% ~ (p=0.763 n=20)
ExpFloat64-32 3.762n ± 1% 3.744n ± 1% ~ (p=0.171 n=20)
NormFloat64-32 3.706n ± 1% 3.655n ± 2% ~ (p=0.066 n=20)
Perm3-32 32.93n ± 3% 34.62n ± 1% +5.13% (p=0.000 n=20)
Perm30-32 202.9n ± 1% 204.0n ± 1% ~ (p=0.482 n=20)
Perm30ViaShuffle-32 115.0n ± 1% 114.9n ± 1% ~ (p=0.358 n=20)
ShuffleOverhead-32 112.8n ± 1% 112.7n ± 1% ~ (p=0.692 n=20)
Concurrent-32 2.107n ± 0% 3.725n ± 1% +76.75% (p=0.000 n=20)
goos: darwin
goarch: arm64
pkg: math/rand/v2
│ bbb48afeb7.arm64 │ 5cf807d1ea.arm64 │
│ sec/op │ sec/op vs base │
ChaCha8-8 2.480n ± 0% 2.429n ± 0% -2.04% (p=0.000 n=20)
PCG_DXSM-8 2.531n ± 0% 2.530n ± 0% ~ (p=0.877 n=20)
SourceUint64-8 2.534n ± 0% 2.533n ± 0% ~ (p=0.732 n=20)
GlobalInt64-8 2.172n ± 1% 4.794n ± 0% +120.67% (p=0.000 n=20)
GlobalInt64Parallel-8 0.4320n ± 0% 0.9605n ± 0% +122.32% (p=0.000 n=20)
GlobalUint64-8 2.182n ± 0% 4.770n ± 0% +118.58% (p=0.000 n=20)
GlobalUint64Parallel-8 0.4307n ± 0% 0.9583n ± 0% +122.51% (p=0.000 n=20)
Int64-8 4.107n ± 0% 4.104n ± 0% ~ (p=0.416 n=20)
Uint64-8 4.080n ± 0% 4.080n ± 0% ~ (p=0.052 n=20)
GlobalIntN1000-8 2.814n ± 2% 5.643n ± 0% +100.50% (p=0.000 n=20)
IntN1000-8 4.141n ± 0% 4.139n ± 0% ~ (p=0.140 n=20)
Int64N1000-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.313 n=20)
Int64N1e8-8 4.140n ± 0% 4.139n ± 0% ~ (p=0.103 n=20)
Int64N1e9-8 4.139n ± 0% 4.140n ± 0% ~ (p=0.761 n=20)
Int64N2e9-8 4.140n ± 0% 4.140n ± 0% ~ (p=0.636 n=20)
Int64N1e18-8 5.266n ± 0% 5.326n ± 1% +1.14% (p=0.001 n=20)
Int64N2e18-8 6.052n ± 0% 6.167n ± 0% +1.90% (p=0.000 n=20)
Int64N4e18-8 8.826n ± 0% 9.051n ± 0% +2.55% (p=0.000 n=20)
Int32N1000-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20)
Int32N1e8-8 4.126n ± 0% 4.131n ± 0% +0.12% (p=0.000 n=20)
Int32N1e9-8 4.127n ± 0% 4.132n ± 0% +0.12% (p=0.000 n=20)
Int32N2e9-8 4.132n ± 0% 4.131n ± 0% ~ (p=0.017 n=20)
Float32-8 4.109n ± 0% 4.105n ± 0% ~ (p=0.379 n=20)
Float64-8 4.107n ± 0% 4.106n ± 0% ~ (p=0.867 n=20)
ExpFloat64-8 5.339n ± 0% 5.383n ± 0% +0.82% (p=0.000 n=20)
NormFloat64-8 5.735n ± 0% 5.737n ± 1% ~ (p=0.856 n=20)
Perm3-8 26.65n ± 0% 26.80n ± 1% +0.58% (p=0.000 n=20)
Perm30-8 194.8n ± 1% 197.0n ± 0% +1.18% (p=0.000 n=20)
Perm30ViaShuffle-8 156.6n ± 0% 157.6n ± 1% +0.61% (p=0.000 n=20)
ShuffleOverhead-8 124.9n ± 0% 125.5n ± 0% +0.52% (p=0.000 n=20)
Concurrent-8 2.434n ± 3% 5.066n ± 0% +108.09% (p=0.000 n=20)
goos: linux
goarch: 386
pkg: math/rand/v2
cpu: AMD Ryzen 9 7950X 16-Core Processor
│ bbb48afeb7.386 │ 5cf807d1ea.386 │
│ sec/op │ sec/op vs base │
ChaCha8-32 11.295n ± 1% 4.748n ± 2% -57.96% (p=0.000 n=20)
PCG_DXSM-32 7.693n ± 1% 7.738n ± 2% ~ (p=0.542 n=20)
SourceUint64-32 7.658n ± 2% 7.622n ± 2% ~ (p=0.344 n=20)
GlobalInt64-32 3.473n ± 2% 7.526n ± 2% +116.73% (p=0.000 n=20)
GlobalInt64Parallel-32 0.3198n ± 0% 0.5444n ± 0% +70.22% (p=0.000 n=20)
GlobalUint64-32 3.612n ± 0% 7.575n ± 1% +109.69% (p=0.000 n=20)
GlobalUint64Parallel-32 0.3168n ± 0% 0.5403n ± 0% +70.51% (p=0.000 n=20)
Int64-32 7.673n ± 2% 7.789n ± 1% ~ (p=0.122 n=20)
Uint64-32 7.773n ± 1% 7.827n ± 2% ~ (p=0.920 n=20)
GlobalIntN1000-32 6.268n ± 1% 9.581n ± 1% +52.87% (p=0.000 n=20)
IntN1000-32 10.33n ± 2% 10.45n ± 1% ~ (p=0.233 n=20)
Int64N1000-32 10.98n ± 2% 11.01n ± 1% ~ (p=0.401 n=20)
Int64N1e8-32 11.19n ± 2% 10.97n ± 1% ~ (p=0.033 n=20)
Int64N1e9-32 11.06n ± 1% 11.08n ± 1% ~ (p=0.498 n=20)
Int64N2e9-32 11.10n ± 1% 11.01n ± 2% ~ (p=0.995 n=20)
Int64N1e18-32 15.23n ± 2% 15.04n ± 1% ~ (p=0.973 n=20)
Int64N2e18-32 15.89n ± 1% 15.85n ± 1% ~ (p=0.409 n=20)
Int64N4e18-32 18.96n ± 2% 19.34n ± 2% ~ (p=0.048 n=20)
Int32N1000-32 10.46n ± 2% 10.44n ± 2% ~ (p=0.480 n=20)
Int32N1e8-32 10.46n ± 2% 10.49n ± 2% ~ (p=0.951 n=20)
Int32N1e9-32 10.28n ± 2% 10.26n ± 1% ~ (p=0.431 n=20)
Int32N2e9-32 10.50n ± 2% 10.44n ± 2% ~ (p=0.249 n=20)
Float32-32 13.80n ± 2% 13.80n ± 2% ~ (p=0.751 n=20)
Float64-32 23.55n ± 2% 23.87n ± 0% ~ (p=0.408 n=20)
ExpFloat64-32 15.36n ± 1% 15.29n ± 2% ~ (p=0.316 n=20)
NormFloat64-32 13.57n ± 1% 13.79n ± 1% +1.66% (p=0.005 n=20)
Perm3-32 45.70n ± 2% 46.99n ± 2% +2.81% (p=0.001 n=20)
Perm30-32 399.0n ± 1% 403.8n ± 1% +1.19% (p=0.006 n=20)
Perm30ViaShuffle-32 349.0n ± 1% 350.4n ± 1% ~ (p=0.909 n=20)
ShuffleOverhead-32 322.3n ± 1% 323.8n ± 1% ~ (p=0.410 n=20)
Concurrent-32 3.331n ± 1% 7.312n ± 1% +119.50% (p=0.000 n=20)
For #61716.
Change-Id: Ibdddeed85c34d9ae397289dc899e04d4845f9ed2
Reviewed-on: https://go-review.googlesource.com/c/go/+/516860
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-08-06 13:26:28 +10:00
|
|
|
if rate > 0 && cheaprand64()%rate == 0 {
|
2021-02-26 14:41:19 +01:00
|
|
|
saveblockevent(cycles, rate, skip+1, mutexProfile)
|
2016-09-22 09:48:30 -04:00
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
|
2014-09-01 00:06:26 -04:00
|
|
|
// Go interface to profile data.
|
|
|
|
|
|
2014-09-01 18:51:12 -04:00
|
|
|
// A StackRecord describes a single execution stack.
|
|
|
|
|
type StackRecord struct {
|
|
|
|
|
Stack0 [32]uintptr // stack trace for this record; ends at first 0 entry
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// Stack returns the stack trace associated with the record,
|
|
|
|
|
// a prefix of r.Stack0.
|
|
|
|
|
func (r *StackRecord) Stack() []uintptr {
|
|
|
|
|
for i, v := range r.Stack0 {
|
|
|
|
|
if v == 0 {
|
|
|
|
|
return r.Stack0[0:i]
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
return r.Stack0[0:]
|
|
|
|
|
}
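As an illustration of the "ends at first 0 entry" convention, the sketch below fills a `StackRecord` by hand with `runtime.Callers` and then reads it back through `Stack`. Filling the record this way is an assumption made for the example; in practice records are produced by the profiling functions. `capture` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"runtime"
)

// capture records the current call stack into a StackRecord. Entries of
// Stack0 that runtime.Callers does not write stay zero, which is what
// Stack uses to find the end of the trace.
func capture() runtime.StackRecord {
	var r runtime.StackRecord
	runtime.Callers(1, r.Stack0[:]) // skip runtime.Callers itself
	return r
}

func main() {
	r := capture()
	frames := runtime.CallersFrames(r.Stack())
	for {
		f, more := frames.Next()
		fmt.Println(f.Function)
		if !more {
			break
		}
	}
}
```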
|
|
|
|
|
|
|
|
|
|
// MemProfileRate controls the fraction of memory allocations
|
|
|
|
|
// that are recorded and reported in the memory profile.
|
|
|
|
|
// The profiler aims to sample an average of
|
|
|
|
|
// one allocation per MemProfileRate bytes allocated.
|
|
|
|
|
//
|
|
|
|
|
// To include every allocated block in the profile, set MemProfileRate to 1.
|
|
|
|
|
// To turn off profiling entirely, set MemProfileRate to 0.
|
|
|
|
|
//
|
|
|
|
|
// The tools that process the memory profiles assume that the
|
|
|
|
|
// profile rate is constant across the lifetime of the program
|
2016-03-01 23:21:55 +00:00
|
|
|
// and equal to the current value. Programs that change the
|
2014-09-01 18:51:12 -04:00
|
|
|
// memory profiling rate should do so just once, as early as
|
|
|
|
|
// possible in the execution of the program (for example,
|
|
|
|
|
// at the beginning of main).
|
2022-09-16 18:56:48 +08:00
|
|
|
var MemProfileRate int = 512 * 1024
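Following the advice above, a program that wants a non-default rate could set it once at the very start of `main`. This is a small sketch; `setMemProfileRate` is a hypothetical helper and the value 1 (record every allocation) is chosen only for illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// setMemProfileRate stores a new rate and returns the previous one.
// (Hypothetical helper; the runtime only exposes the variable itself.)
func setMemProfileRate(rate int) (old int) {
	old = runtime.MemProfileRate
	runtime.MemProfileRate = rate
	return old
}

func main() {
	// Do this first, before the program has allocated much, and do it once.
	old := setMemProfileRate(1)
	fmt.Println("MemProfileRate changed from", old, "to", runtime.MemProfileRate)
}
```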
|
2021-03-07 20:52:48 -08:00
|
|
|
|
|
|
|
|
// disableMemoryProfiling is set by the linker if runtime.MemProfile
|
|
|
|
|
// is not used and the link type guarantees nobody else could use it
|
|
|
|
|
// elsewhere.
|
|
|
|
|
var disableMemoryProfiling bool
|
2014-09-01 18:51:12 -04:00
|
|
|
|
|
|
|
|
// A MemProfileRecord describes the live objects allocated
|
|
|
|
|
// by a particular call sequence (stack trace).
|
|
|
|
|
type MemProfileRecord struct {
|
|
|
|
|
AllocBytes, FreeBytes int64 // number of bytes allocated, freed
|
|
|
|
|
AllocObjects, FreeObjects int64 // number of objects allocated, freed
|
|
|
|
|
Stack0 [32]uintptr // stack trace for this record; ends at first 0 entry
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// InUseBytes returns the number of bytes in use (AllocBytes - FreeBytes).
|
|
|
|
|
func (r *MemProfileRecord) InUseBytes() int64 { return r.AllocBytes - r.FreeBytes }
|
|
|
|
|
|
|
|
|
|
// InUseObjects returns the number of objects in use (AllocObjects - FreeObjects).
|
|
|
|
|
func (r *MemProfileRecord) InUseObjects() int64 {
|
|
|
|
|
return r.AllocObjects - r.FreeObjects
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// Stack returns the stack trace associated with the record,
|
|
|
|
|
// a prefix of r.Stack0.
|
|
|
|
|
func (r *MemProfileRecord) Stack() []uintptr {
|
|
|
|
|
for i, v := range r.Stack0 {
|
|
|
|
|
if v == 0 {
|
|
|
|
|
return r.Stack0[0:i]
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
return r.Stack0[0:]
|
|
|
|
|
}
|
|
|
|
|
|
2015-11-12 17:33:15 -05:00
|
|
|
// MemProfile returns a profile of memory allocated and freed per allocation
|
|
|
|
|
// site.
|
|
|
|
|
//
|
2014-08-21 08:07:42 +02:00
|
|
|
// MemProfile returns n, the number of records in the current memory profile.
|
|
|
|
|
// If len(p) >= n, MemProfile copies the profile into p and returns n, true.
|
|
|
|
|
// If len(p) < n, MemProfile does not change p and returns n, false.
|
|
|
|
|
//
|
|
|
|
|
// If inuseZero is true, the profile includes allocation records
|
|
|
|
|
// where r.AllocBytes > 0 but r.AllocBytes == r.FreeBytes.
|
|
|
|
|
// These are sites where memory was allocated, but it has all
|
|
|
|
|
// been released back to the runtime.
|
|
|
|
|
//
|
2015-11-12 17:33:15 -05:00
|
|
|
// The returned profile may be up to two garbage collection cycles old.
|
|
|
|
|
// This is to avoid skewing the profile toward allocations; because
|
|
|
|
|
// allocations happen in real time but frees are delayed until the garbage
|
|
|
|
|
// collector performs sweeping, the profile only accounts for allocations
|
|
|
|
|
// that have had a chance to be freed by the garbage collector.
|
|
|
|
|
//
|
2014-08-21 08:07:42 +02:00
|
|
|
// Most clients should use the runtime/pprof package or
|
|
|
|
|
// the testing package's -test.memprofile flag instead
|
|
|
|
|
// of calling MemProfile directly.
|
|
|
|
|
func MemProfile(p []MemProfileRecord, inuseZero bool) (n int, ok bool) {
|
2024-05-17 15:07:07 +02:00
|
|
|
return memProfileInternal(len(p), inuseZero, func(r profilerecord.MemProfileRecord) {
|
|
|
|
|
copyMemProfileRecord(&p[0], r)
|
|
|
|
|
p = p[1:]
|
|
|
|
|
})
|
|
|
|
|
}
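Because `MemProfile` reports `n, false` when the slice is too small, callers typically size the slice from a first call and retry until the copy succeeds. The loop below is a sketch of that pattern; `memProfileSnapshot` is a hypothetical helper and the headroom of 50 records is an arbitrary choice.

```go
package main

import (
	"fmt"
	"runtime"
)

// memProfileSnapshot returns a copy of the current memory profile.
func memProfileSnapshot(inuseZero bool) []runtime.MemProfileRecord {
	// A nil slice only succeeds if the profile is empty; otherwise this
	// call just reports how many records there are.
	n, _ := runtime.MemProfile(nil, inuseZero)
	for {
		// Leave headroom: more records may appear between the two calls.
		p := make([]runtime.MemProfileRecord, n+50)
		var ok bool
		n, ok = runtime.MemProfile(p, inuseZero)
		if ok {
			return p[:n]
		}
		// The profile grew past the headroom; retry with the new n.
	}
}

func main() {
	runtime.GC() // the profile may lag by up to two GC cycles
	records := memProfileSnapshot(true)
	var inUse int64
	for i := range records {
		inUse += records[i].InUseBytes()
	}
	fmt.Printf("%d records, %d sampled bytes in use\n", len(records), inUse)
}
```

As the documentation above notes, most programs should prefer `runtime/pprof` over calling `MemProfile` directly.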
|
|
|
|
|
|
|
|
|
|
// memProfileInternal returns the number of records n in the profile. If there
|
|
|
|
|
// are at most size records, copyFn is invoked for each record, and ok returns
|
|
|
|
|
// true.
|
|
|
|
|
func memProfileInternal(size int, inuseZero bool, copyFn func(profilerecord.MemProfileRecord)) (n int, ok bool) {
|
2022-04-01 12:56:49 -07:00
|
|
|
cycle := mProfCycle.read()
|
2017-03-01 13:58:22 -05:00
|
|
|
// If we're between mProf_NextCycle and mProf_Flush, take care
|
|
|
|
|
// of flushing to the active profile so we only have to look
|
|
|
|
|
// at the active profile below.
|
2022-04-01 12:56:49 -07:00
|
|
|
index := cycle % uint32(len(memRecord{}.future))
|
|
|
|
|
lock(&profMemActiveLock)
|
|
|
|
|
lock(&profMemFutureLock[index])
|
|
|
|
|
mProf_FlushLocked(index)
|
|
|
|
|
unlock(&profMemFutureLock[index])
|
2014-08-21 08:07:42 +02:00
|
|
|
clear := true
|
2022-04-01 12:56:49 -07:00
|
|
|
head := (*bucket)(mbuckets.Load())
|
|
|
|
|
for b := head; b != nil; b = b.allnext {
|
2014-09-01 18:51:12 -04:00
|
|
|
mp := b.mp()
|
2017-03-01 11:50:38 -05:00
|
|
|
if inuseZero || mp.active.alloc_bytes != mp.active.free_bytes {
|
2014-08-21 08:07:42 +02:00
|
|
|
n++
|
|
|
|
|
}
|
2017-03-01 11:50:38 -05:00
|
|
|
if mp.active.allocs != 0 || mp.active.frees != 0 {
|
2014-08-21 08:07:42 +02:00
|
|
|
clear = false
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
if clear {
|
|
|
|
|
// Absolutely no data, suggesting that a garbage collection
|
|
|
|
|
// has not yet happened. In order to allow profiling when
|
|
|
|
|
// garbage collection is disabled from the beginning of execution,
|
2017-03-01 13:58:22 -05:00
|
|
|
// accumulate all of the cycles, and recount buckets.
|
2014-08-21 08:07:42 +02:00
|
|
|
n = 0
|
2022-04-01 12:56:49 -07:00
|
|
|
for b := head; b != nil; b = b.allnext {
|
2014-09-01 18:51:12 -04:00
|
|
|
mp := b.mp()
|
2017-03-01 13:58:22 -05:00
|
|
|
for c := range mp.future {
|
2022-04-01 12:56:49 -07:00
|
|
|
lock(&profMemFutureLock[c])
|
2017-03-01 13:58:22 -05:00
|
|
|
mp.active.add(&mp.future[c])
|
|
|
|
|
mp.future[c] = memRecordCycle{}
|
2022-04-01 12:56:49 -07:00
|
|
|
unlock(&profMemFutureLock[c])
|
2017-03-01 13:58:22 -05:00
|
|
|
}
|
2017-03-01 11:50:38 -05:00
|
|
|
if inuseZero || mp.active.alloc_bytes != mp.active.free_bytes {
|
2014-08-21 08:07:42 +02:00
|
|
|
n++
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
2024-05-17 15:07:07 +02:00
|
|
|
if n <= size {
|
2014-08-21 08:07:42 +02:00
|
|
|
ok = true
|
2022-04-01 12:56:49 -07:00
|
|
|
for b := head; b != nil; b = b.allnext {
|
2014-09-01 18:51:12 -04:00
|
|
|
mp := b.mp()
|
2017-03-01 11:50:38 -05:00
|
|
|
if inuseZero || mp.active.alloc_bytes != mp.active.free_bytes {
|
2024-05-17 15:07:07 +02:00
|
|
|
r := profilerecord.MemProfileRecord{
|
|
|
|
|
AllocBytes: int64(mp.active.alloc_bytes),
|
|
|
|
|
FreeBytes: int64(mp.active.free_bytes),
|
|
|
|
|
AllocObjects: int64(mp.active.allocs),
|
|
|
|
|
FreeObjects: int64(mp.active.frees),
|
|
|
|
|
Stack: b.stk(),
|
|
|
|
|
}
|
|
|
|
|
copyFn(r)
|
2014-08-21 08:07:42 +02:00
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
2022-04-01 12:56:49 -07:00
|
|
|
unlock(&profMemActiveLock)
|
2014-08-21 08:07:42 +02:00
|
|
|
return
|
|
|
|
|
}
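The `if clear` branch above folds every pending `future` cycle into `active` before recounting, taking the matching per-cycle lock for each one. A minimal sketch of that accumulate-and-reset step follows. It is not the runtime's code: `cycle`, `record`, `futureLocks`, and `flushAll` are hypothetical stand-ins, the array length of 3 is an assumption, and `sync.Mutex` replaces the runtime's internal `lock`/`unlock`.

```go
package main

import (
	"fmt"
	"sync"
)

// cycle is a hypothetical stand-in for the runtime's memRecordCycle.
type cycle struct {
	allocs, frees         uintptr
	allocBytes, freeBytes uintptr
}

// add accumulates b into a, field by field.
func (a *cycle) add(b *cycle) {
	a.allocs += b.allocs
	a.frees += b.frees
	a.allocBytes += b.allocBytes
	a.freeBytes += b.freeBytes
}

// record holds one published total plus pending cycles.
// The length 3 is an assumption made for this sketch.
type record struct {
	active cycle
	future [3]cycle
}

// futureLocks guards the pending cycles, one lock per cycle index.
var futureLocks [3]sync.Mutex

// flushAll folds every pending cycle into active and resets it,
// holding only the lock for the cycle being drained.
func flushAll(r *record) {
	for c := range r.future {
		futureLocks[c].Lock()
		r.active.add(&r.future[c])
		r.future[c] = cycle{}
		futureLocks[c].Unlock()
	}
}

func main() {
	r := &record{}
	r.future[0] = cycle{allocs: 2, allocBytes: 64}
	r.future[2] = cycle{allocs: 1, frees: 1, allocBytes: 16, freeBytes: 16}
	flushAll(r)
	fmt.Printf("%+v\n", r.active)
}
```

Locking per cycle index rather than once for the whole record keeps writers to other cycles from waiting on the drain, which appears to be the intent of the split locks in the code above.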
|
|
|
|
|
|
2024-05-17 15:07:07 +02:00
|
|
|
func copyMemProfileRecord(dst *MemProfileRecord, src profilerecord.MemProfileRecord) {
|
|
|
|
|
dst.AllocBytes = src.AllocBytes
|
|
|
|
|
dst.FreeBytes = src.FreeBytes
|
|
|
|
|
dst.AllocObjects = src.AllocObjects
|
|
|
|
|
dst.FreeObjects = src.FreeObjects
|
2016-09-21 09:44:40 -07:00
|
|
|
if raceenabled {
|
2024-05-17 15:07:07 +02:00
|
|
|
racewriterangepc(unsafe.Pointer(&dst.Stack0[0]), unsafe.Sizeof(dst.Stack0), getcallerpc(), abi.FuncPCABIInternal(MemProfile))
|
2016-09-21 09:44:40 -07:00
|
|
|
}
|
|
|
|
|
if msanenabled {
|
2024-05-17 15:07:07 +02:00
|
|
|
msanwrite(unsafe.Pointer(&dst.Stack0[0]), unsafe.Sizeof(dst.Stack0))
|
2016-09-21 09:44:40 -07:00
|
|
|
}
|
2021-01-05 17:52:43 +08:00
|
|
|
if asanenabled {
|
2024-05-17 15:07:07 +02:00
|
|
|
asanwrite(unsafe.Pointer(&dst.Stack0[0]), unsafe.Sizeof(dst.Stack0))
|
2021-01-05 17:52:43 +08:00
|
|
|
}
|
2024-05-17 15:07:07 +02:00
|
|
|
i := copy(dst.Stack0[:], src.Stack)
|
|
|
|
|
clear(dst.Stack0[i:])
|
|
|
|
|
}
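`copyMemProfileRecord` ends with a copy-then-zero-the-tail step, so a reused destination never keeps stale program counters from an earlier, longer stack. The sketch below isolates that idiom. `fill` is a hypothetical helper, the array length of 32 is an assumption, and the `clear` builtin requires Go 1.21 or later.

```go
package main

import "fmt"

// fill copies src into the fixed-size dst and zeroes whatever
// follows the copied prefix. It returns the number of entries copied.
func fill(dst *[32]uintptr, src []uintptr) int {
	i := copy(dst[:], src) // copies min(len(dst), len(src)) entries
	clear(dst[i:])         // zero the remainder so old data cannot leak
	return i
}

func main() {
	var dst [32]uintptr
	for i := range dst {
		dst[i] = 7 // simulate leftovers from a previous use
	}
	n := fill(&dst, []uintptr{1, 2, 3})
	fmt.Println(n, dst[:5])
}
```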
|
|
|
|
|
|
|
|
|
|
//go:linkname pprof_memProfileInternal
|
|
|
|
|
func pprof_memProfileInternal(p []profilerecord.MemProfileRecord, inuseZero bool) (n int, ok bool) {
|
|
|
|
|
return memProfileInternal(len(p), inuseZero, func(r profilerecord.MemProfileRecord) {
|
|
|
|
|
p[0] = r
|
|
|
|
|
p = p[1:]
|
|
|
|
|
})
|
2014-08-21 08:07:42 +02:00
|
|
|
}
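The internal profile readers in this file share one shape: count the records in a first pass, and only if the caller's buffer is large enough, hand each record to a callback in a second pass. The wrappers pass a closure that writes to `p[0]` and then reslices `p`, so the slice itself acts as the write cursor. The sketch below reproduces that shape on a plain linked list. `node`, `walk`, and `collect` are hypothetical names introduced for illustration.

```go
package main

import "fmt"

// node is a hypothetical singly linked list element.
type node struct {
	val  int
	next *node
}

// walk counts the list; if it fits in size, it calls copyFn once per element.
func walk(head *node, size int, copyFn func(int)) (n int, ok bool) {
	for b := head; b != nil; b = b.next {
		n++
	}
	if n <= size {
		ok = true
		for b := head; b != nil; b = b.next {
			copyFn(b.val)
		}
	}
	return
}

// collect fills p from the list, using p itself as the write cursor.
// Reslicing p inside the closure changes only collect's local slice header;
// the caller's slice still starts at the first element.
func collect(head *node, p []int) (int, bool) {
	return walk(head, len(p), func(v int) {
		p[0] = v
		p = p[1:]
	})
}

func main() {
	head := &node{1, &node{2, &node{3, nil}}}
	buf := make([]int, 3)
	n, ok := collect(head, buf)
	fmt.Println(n, ok, buf)
}
```

Because the callback is never invoked when the count exceeds `size`, indexing `p[0]` cannot go out of range, and a too-small buffer is left untouched.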
|
|
|
|
|
|
2014-09-01 18:51:12 -04:00
|
|
|
func iterate_memprof(fn func(*bucket, uintptr, *uintptr, uintptr, uintptr, uintptr)) {
|
2022-04-01 12:56:49 -07:00
|
|
|
lock(&profMemActiveLock)
|
|
|
|
|
head := (*bucket)(mbuckets.Load())
|
|
|
|
|
for b := head; b != nil; b = b.allnext {
|
2014-09-01 18:51:12 -04:00
|
|
|
mp := b.mp()
|
2017-03-01 11:50:38 -05:00
|
|
|
fn(b, b.nstk, &b.stk()[0], b.size, mp.active.allocs, mp.active.frees)
|
2014-09-01 00:06:26 -04:00
|
|
|
}
|
2022-04-01 12:56:49 -07:00
|
|
|
unlock(&profMemActiveLock)
|
2014-09-01 00:06:26 -04:00
|
|
|
}
|
2014-09-01 18:51:12 -04:00
|
|
|
|
|
|
|
|
// BlockProfileRecord describes blocking events originated
|
|
|
|
|
// at a particular call sequence (stack trace).
|
|
|
|
|
type BlockProfileRecord struct {
|
|
|
|
|
Count int64
|
|
|
|
|
Cycles int64
|
|
|
|
|
StackRecord
|
|
|
|
|
}
|
2014-09-01 00:06:26 -04:00
|
|
|
|
2014-08-21 08:07:42 +02:00
|
|
|
// BlockProfile returns n, the number of records in the current blocking profile.
|
|
|
|
|
// If len(p) >= n, BlockProfile copies the profile into p and returns n, true.
|
|
|
|
|
// If len(p) < n, BlockProfile does not change p and returns n, false.
|
|
|
|
|
//
|
2023-11-09 22:04:38 +01:00
|
|
|
// Most clients should use the [runtime/pprof] package or
|
|
|
|
|
// the [testing] package's -test.blockprofile flag instead
|
2014-08-21 08:07:42 +02:00
|
|
|
// of calling BlockProfile directly.
|
|
|
|
|
func BlockProfile(p []BlockProfileRecord) (n int, ok bool) {
|
2024-05-17 15:07:07 +02:00
|
|
|
return blockProfileInternal(len(p), func(r profilerecord.BlockProfileRecord) {
|
|
|
|
|
copyBlockProfileRecord(&p[0], r)
|
|
|
|
|
p = p[1:]
|
|
|
|
|
})
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// blockProfileInternal returns the number of records n in the profile. If there
|
|
|
|
|
// are at most size records, copyFn is invoked for each record, and ok returns
|
|
|
|
|
// true.
|
|
|
|
|
func blockProfileInternal(size int, copyFn func(profilerecord.BlockProfileRecord)) (n int, ok bool) {
|
2022-04-01 12:56:49 -07:00
|
|
|
lock(&profBlockLock)
|
|
|
|
|
head := (*bucket)(bbuckets.Load())
|
|
|
|
|
for b := head; b != nil; b = b.allnext {
|
2014-08-21 08:07:42 +02:00
|
|
|
n++
|
|
|
|
|
}
|
2024-05-17 15:07:07 +02:00
|
|
|
if n <= size {
|
2014-08-21 08:07:42 +02:00
|
|
|
ok = true
|
2022-04-01 12:56:49 -07:00
|
|
|
for b := head; b != nil; b = b.allnext {
|
2014-09-01 18:51:12 -04:00
|
|
|
bp := b.bp()
|
2024-05-17 15:07:07 +02:00
|
|
|
r := profilerecord.BlockProfileRecord{
|
|
|
|
|
Count: int64(bp.count),
|
|
|
|
|
Cycles: bp.cycles,
|
|
|
|
|
Stack: b.stk(),
|
|
|
|
|
}
|
2021-02-26 14:41:19 +01:00
|
|
|
// Prevent callers from having to worry about division by zero errors.
|
|
|
|
|
// See discussion on http://golang.org/cl/299991.
|
|
|
|
|
if r.Count == 0 {
|
|
|
|
|
r.Count = 1
|
|
|
|
|
}
|
2024-05-17 15:07:07 +02:00
|
|
|
copyFn(r)
|
2014-08-21 08:07:42 +02:00
|
|
|
}
|
|
|
|
|
}
|
2022-04-01 12:56:49 -07:00
|
|
|
unlock(&profBlockLock)
|
2014-08-21 08:07:42 +02:00
|
|
|
return
|
|
|
|
|
}
|
|
|
|
|
|
2024-05-17 15:07:07 +02:00
|
|
|
func copyBlockProfileRecord(dst *BlockProfileRecord, src profilerecord.BlockProfileRecord) {
|
|
|
|
|
dst.Count = src.Count
|
|
|
|
|
dst.Cycles = src.Cycles
|
|
|
|
|
if raceenabled {
|
|
|
|
|
racewriterangepc(unsafe.Pointer(&dst.Stack0[0]), unsafe.Sizeof(dst.Stack0), getcallerpc(), abi.FuncPCABIInternal(BlockProfile))
|
|
|
|
|
}
|
|
|
|
|
if msanenabled {
|
|
|
|
|
msanwrite(unsafe.Pointer(&dst.Stack0[0]), unsafe.Sizeof(dst.Stack0))
|
|
|
|
|
}
|
|
|
|
|
if asanenabled {
|
|
|
|
|
asanwrite(unsafe.Pointer(&dst.Stack0[0]), unsafe.Sizeof(dst.Stack0))
|
|
|
|
|
}
|
|
|
|
|
i := fpunwindExpand(dst.Stack0[:], src.Stack)
|
|
|
|
|
clear(dst.Stack0[i:])
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
//go:linkname pprof_blockProfileInternal
|
|
|
|
|
func pprof_blockProfileInternal(p []profilerecord.BlockProfileRecord) (n int, ok bool) {
|
|
|
|
|
return blockProfileInternal(len(p), func(r profilerecord.BlockProfileRecord) {
|
|
|
|
|
p[0] = r
|
|
|
|
|
p = p[1:]
|
|
|
|
|
})
|
|
|
|
|
}
|
|
|
|
|
|
2016-09-22 09:48:30 -04:00
|
|
|
// MutexProfile returns n, the number of records in the current mutex profile.
|
|
|
|
|
// If len(p) >= n, MutexProfile copies the profile into p and returns n, true.
|
|
|
|
|
// Otherwise, MutexProfile does not change p, and returns n, false.
|
|
|
|
|
//
|
2023-11-07 17:35:46 +08:00
|
|
|
// Most clients should use the [runtime/pprof] package
|
2016-09-22 09:48:30 -04:00
|
|
|
// instead of calling MutexProfile directly.
|
|
|
|
|
func MutexProfile(p []BlockProfileRecord) (n int, ok bool) {
|
2024-05-17 15:07:07 +02:00
|
|
|
return mutexProfileInternal(len(p), func(r profilerecord.BlockProfileRecord) {
|
|
|
|
|
copyBlockProfileRecord(&p[0], r)
|
|
|
|
|
p = p[1:]
|
|
|
|
|
})
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// mutexProfileInternal returns the number of records n in the profile. If there
|
|
|
|
|
// are at most size records, copyFn is invoked for each record, and ok returns
|
|
|
|
|
// true.
|
|
|
|
|
func mutexProfileInternal(size int, copyFn func(profilerecord.BlockProfileRecord)) (n int, ok bool) {
|
2022-04-01 12:56:49 -07:00
|
|
|
lock(&profBlockLock)
|
|
|
|
|
head := (*bucket)(xbuckets.Load())
|
|
|
|
|
for b := head; b != nil; b = b.allnext {
|
2016-09-22 09:48:30 -04:00
|
|
|
n++
|
|
|
|
|
}
|
2024-05-17 15:07:07 +02:00
|
|
|
if n <= size {
|
2016-09-22 09:48:30 -04:00
|
|
|
ok = true
|
2022-04-01 12:56:49 -07:00
|
|
|
for b := head; b != nil; b = b.allnext {
|
2016-09-22 09:48:30 -04:00
|
|
|
bp := b.bp()
|
2024-05-17 15:07:07 +02:00
|
|
|
r := profilerecord.BlockProfileRecord{
|
|
|
|
|
Count: int64(bp.count),
|
|
|
|
|
Cycles: bp.cycles,
|
|
|
|
|
Stack: b.stk(),
|
|
|
|
|
}
|
|
|
|
|
copyFn(r)
|
2016-09-22 09:48:30 -04:00
|
|
|
}
|
|
|
|
|
}
|
2022-04-01 12:56:49 -07:00
|
|
|
unlock(&profBlockLock)
|
2016-09-22 09:48:30 -04:00
|
|
|
return
|
|
|
|
|
}
|
|
|
|
|
|
2024-05-17 15:07:07 +02:00
|
|
|
//go:linkname pprof_mutexProfileInternal
|
|
|
|
|
func pprof_mutexProfileInternal(p []profilerecord.BlockProfileRecord) (n int, ok bool) {
|
|
|
|
|
return mutexProfileInternal(len(p), func(r profilerecord.BlockProfileRecord) {
|
|
|
|
|
p[0] = r
|
|
|
|
|
p = p[1:]
|
|
|
|
|
})
|
|
|
|
|
}
|
|
|
|
|
|
2014-09-01 00:06:26 -04:00
|
|
|
// ThreadCreateProfile returns n, the number of records in the thread creation profile.
|
|
|
|
|
// If len(p) >= n, ThreadCreateProfile copies the profile into p and returns n, true.
|
|
|
|
|
// If len(p) < n, ThreadCreateProfile does not change p and returns n, false.
|
|
|
|
|
//
|
|
|
|
|
// Most clients should use the runtime/pprof package instead
|
|
|
|
|
// of calling ThreadCreateProfile directly.
|
|
|
|
|
func ThreadCreateProfile(p []StackRecord) (n int, ok bool) {
|
2024-05-17 15:07:07 +02:00
|
|
|
return threadCreateProfileInternal(len(p), func(r profilerecord.StackRecord) {
|
|
|
|
|
copy(p[0].Stack0[:], r.Stack)
|
|
|
|
|
p = p[1:]
|
|
|
|
|
})
|
|
|
|
|
}
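A minimal usage sketch of the contract documented above, assuming a caller outside the runtime. The helper name threadCreateRecords and the headroom of 10 are illustrative choices, not part of the runtime; only runtime.ThreadCreateProfile and runtime.StackRecord are real API.

```go
package main

import (
	"fmt"
	"runtime"
)

// threadCreateRecords is a hypothetical helper. It asks for the record
// count with a nil slice, then retries with some headroom until the
// profile fits and ok is true.
func threadCreateRecords() []runtime.StackRecord {
	n, _ := runtime.ThreadCreateProfile(nil)
	for {
		// Headroom covers threads created between the two calls.
		p := make([]runtime.StackRecord, n+10)
		var ok bool
		n, ok = runtime.ThreadCreateProfile(p)
		if ok {
			return p[:n]
		}
	}
}

func main() {
	recs := threadCreateRecords()
	fmt.Println("thread creation records:", len(recs))
}
```

Because a too-small slice is left untouched, the loop can safely discard it and allocate a larger one.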
|
|
|
|
|
|
|
|
|
|
// threadCreateProfileInternal returns the number of records n in the profile.
|
|
|
|
|
// If there are at most size records, copyFn is invoked for each record, and
|
|
|
|
|
// ok returns true.
|
|
|
|
|
func threadCreateProfileInternal(size int, copyFn func(profilerecord.StackRecord)) (n int, ok bool) {
|
2015-11-02 14:09:24 -05:00
|
|
|
first := (*m)(atomic.Loadp(unsafe.Pointer(&allm)))
|
2014-09-01 00:06:26 -04:00
|
|
|
for mp := first; mp != nil; mp = mp.alllink {
|
|
|
|
|
n++
|
|
|
|
|
}
|
2024-05-17 15:07:07 +02:00
|
|
|
if n <= size {
|
2014-09-01 00:06:26 -04:00
|
|
|
ok = true
|
|
|
|
|
for mp := first; mp != nil; mp = mp.alllink {
|
2024-05-17 15:07:07 +02:00
|
|
|
r := profilerecord.StackRecord{Stack: mp.createstack[:]}
|
|
|
|
|
copyFn(r)
|
2014-09-01 00:06:26 -04:00
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
return
|
|
|
|
|
}
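The count-then-copy shape of threadCreateProfileInternal can be sketched outside the runtime as follows. The node type and the profileInternal and collect names are hypothetical stand-ins for the M list and the wrappers; the sketch only illustrates how a callback lets one traversal serve several record types.

```go
package main

import "fmt"

// node is a hypothetical stand-in for the runtime's linked list of Ms.
type node struct {
	stack []uintptr
	next  *node
}

// profileInternal counts the records first. Only if all of them fit
// within size does it pass each record to copyFn.
func profileInternal(first *node, size int, copyFn func([]uintptr)) (n int, ok bool) {
	for p := first; p != nil; p = p.next {
		n++
	}
	if n <= size {
		ok = true
		for p := first; p != nil; p = p.next {
			copyFn(p.stack)
		}
	}
	return
}

// collect adapts a destination slice to the callback. The closure writes
// to dst[0] and then advances its own copy of the slice header, so the
// caller's slice keeps pointing at the first record.
func collect(first *node, dst [][]uintptr) (int, bool) {
	return profileInternal(first, len(dst), func(s []uintptr) {
		dst[0] = s
		dst = dst[1:]
	})
}

func main() {
	list := &node{stack: []uintptr{1}, next: &node{stack: []uintptr{2, 3}}}
	dst := make([][]uintptr, 2)
	n, ok := collect(list, dst)
	fmt.Println(n, ok, dst)
}
```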
|
|
|
|
|
|
2024-05-17 15:07:07 +02:00
|
|
|
//go:linkname pprof_threadCreateInternal
|
|
|
|
|
func pprof_threadCreateInternal(p []profilerecord.StackRecord) (n int, ok bool) {
|
|
|
|
|
return threadCreateProfileInternal(len(p), func(r profilerecord.StackRecord) {
|
|
|
|
|
p[0] = r
|
|
|
|
|
p = p[1:]
|
|
|
|
|
})
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
//go:linkname pprof_goroutineProfileWithLabels
|
|
|
|
|
func pprof_goroutineProfileWithLabels(p []profilerecord.StackRecord, labels []unsafe.Pointer) (n int, ok bool) {
|
2019-08-04 15:14:48 -04:00
|
|
|
return goroutineProfileWithLabels(p, labels)
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// labels may be nil. If labels is non-nil, it must have the same length as p.
|
2024-05-17 15:07:07 +02:00
|
|
|
func goroutineProfileWithLabels(p []profilerecord.StackRecord, labels []unsafe.Pointer) (n int, ok bool) {
|
2019-08-04 15:14:48 -04:00
|
|
|
if labels != nil && len(labels) != len(p) {
|
|
|
|
|
labels = nil
|
|
|
|
|
}
|
2022-02-18 10:56:16 -08:00
|
|
|
|
2023-01-26 14:49:03 -08:00
|
|
|
return goroutineProfileWithLabelsConcurrent(p, labels)
|
2022-02-18 10:56:16 -08:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
var goroutineProfile = struct {
|
|
|
|
|
sema uint32
|
|
|
|
|
active bool
|
|
|
|
|
offset atomic.Int64
|
2024-05-17 15:07:07 +02:00
|
|
|
records []profilerecord.StackRecord
|
2022-02-18 10:56:16 -08:00
|
|
|
labels []unsafe.Pointer
|
|
|
|
|
}{
|
|
|
|
|
sema: 1,
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// goroutineProfileState indicates the status of a goroutine's stack for the
|
|
|
|
|
// current in-progress goroutine profile. Goroutines' stacks are initially
|
|
|
|
|
// "Absent" from the profile, and end up "Satisfied" by the time the profile is
|
|
|
|
|
// complete. While a goroutine's stack is being captured, its
|
|
|
|
|
// goroutineProfileState will be "InProgress" and it will not be able to run
|
|
|
|
|
// until the capture completes and the state moves to "Satisfied".
|
|
|
|
|
//
|
|
|
|
|
// Some goroutines (the finalizer goroutine, which at various times can be
|
|
|
|
|
// either a "system" or a "user" goroutine, and the goroutine that is
|
|
|
|
|
// coordinating the profile, any goroutines created during the profile) move
|
|
|
|
|
// directly to the "Satisfied" state.
|
|
|
|
|
type goroutineProfileState uint32
|
|
|
|
|
|
|
|
|
|
const (
|
|
|
|
|
goroutineProfileAbsent goroutineProfileState = iota
|
|
|
|
|
goroutineProfileInProgress
|
|
|
|
|
goroutineProfileSatisfied
|
|
|
|
|
)
|
|
|
|
|
|
|
|
|
|
type goroutineProfileStateHolder atomic.Uint32
|
|
|
|
|
|
|
|
|
|
func (p *goroutineProfileStateHolder) Load() goroutineProfileState {
|
|
|
|
|
return goroutineProfileState((*atomic.Uint32)(p).Load())
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
func (p *goroutineProfileStateHolder) Store(value goroutineProfileState) {
|
|
|
|
|
(*atomic.Uint32)(p).Store(uint32(value))
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
func (p *goroutineProfileStateHolder) CompareAndSwap(old, new goroutineProfileState) bool {
|
|
|
|
|
return (*atomic.Uint32)(p).CompareAndSwap(uint32(old), uint32(new))
|
|
|
|
|
}
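The typed-wrapper pattern used by goroutineProfileStateHolder can be tried in ordinary code. This sketch assumes sync/atomic (Go 1.19 or later) in place of the runtime's internal atomic package, and the names state and stateHolder are illustrative.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// state mirrors the three-step lifecycle described above.
type state uint32

const (
	absent state = iota
	inProgress
	satisfied
)

// stateHolder wraps atomic.Uint32 so callers can only load, store, and
// swap values of type state, never raw integers.
type stateHolder atomic.Uint32

func (h *stateHolder) Load() state {
	return state((*atomic.Uint32)(h).Load())
}

func (h *stateHolder) Store(v state) {
	(*atomic.Uint32)(h).Store(uint32(v))
}

func (h *stateHolder) CompareAndSwap(old, new state) bool {
	return (*atomic.Uint32)(h).CompareAndSwap(uint32(old), uint32(new))
}

func main() {
	var h stateHolder
	fmt.Println(h.CompareAndSwap(absent, inProgress)) // first claim wins
	fmt.Println(h.CompareAndSwap(absent, inProgress)) // second claim loses
	h.Store(satisfied)
	fmt.Println(h.Load() == satisfied)
}
```

The pointer conversion costs nothing at run time; the wrapper exists purely for type safety.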
|
|
|
|
|
|
2024-05-17 15:07:07 +02:00
|
|
|
func goroutineProfileWithLabelsConcurrent(p []profilerecord.StackRecord, labels []unsafe.Pointer) (n int, ok bool) {
|
2024-01-30 11:01:05 -05:00
|
|
|
if len(p) == 0 {
|
|
|
|
|
// An empty slice is obviously too small. Return a rough
|
|
|
|
|
// allocation estimate without bothering to STW. As long as
|
|
|
|
|
// this is close, then we'll only need to STW once (on the next
|
|
|
|
|
// call).
|
|
|
|
|
return int(gcount()), false
|
|
|
|
|
}
|
|
|
|
|
|
2022-02-18 10:56:16 -08:00
|
|
|
semacquire(&goroutineProfile.sema)
|
|
|
|
|
|
|
|
|
|
ourg := getg()
|
|
|
|
|
|
2024-05-17 15:07:07 +02:00
|
|
|
pcbuf := makeProfStack() // see saveg() for explanation
|
runtime/metrics: add STW stopping and total time metrics
This CL adds four new time histogram metrics:
/sched/pauses/stopping/gc:seconds
/sched/pauses/stopping/other:seconds
/sched/pauses/total/gc:seconds
/sched/pauses/total/other:seconds
The "stopping" metrics measure the time taken to start a stop-the-world
pause. i.e., how long it takes stopTheWorldWithSema to stop all Ps.
This can be used to detect STW struggling to preempt Ps.
The "total" metrics measure the total duration of a stop-the-world
pause, from starting to stop-the-world until the world is started again.
This includes the time spent in the "start" phase.
The "gc" metrics are used for GC-related STW pauses. The "other" metrics
are used for all other STW pauses.
All of these metrics start timing in stopTheWorldWithSema only after
successfully acquiring sched.lock, thus excluding lock contention on
sched.lock. The reasoning behind this is that while waiting on
sched.lock the world is not stopped at all (all other Ps can run), so
the impact of this contention is primarily limited to the goroutine
attempting to stop-the-world. Additionally, we already have some
visibility into sched.lock contention via contention profiles (#57071).
/sched/pauses/total/gc:seconds is conceptually equivalent to
/gc/pauses:seconds, so the latter is marked as deprecated and returns
the same histogram as the former.
In the implementation, there are a few minor differences:
* For both mark and sweep termination stops, /gc/pauses:seconds started
timing prior to calling startTheWorldWithSema, thus including lock
contention.
These details are minor enough, that I do not believe the slight change
in reporting will matter. For mark termination stops, moving timing stop
into startTheWorldWithSema does have the side effect of requiring moving
other GC metric calculations outside of the STW, as they depend on the
same end time.
Fixes #63340
Change-Id: Iacd0bab11bedab85d3dcfb982361413a7d9c0d05
Reviewed-on: https://go-review.googlesource.com/c/go/+/534161
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-10-10 15:28:32 -04:00
|
|
|
stw := stopTheWorld(stwGoroutineProfile)
|
2022-02-18 10:56:16 -08:00
|
|
|
// Using gcount while the world is stopped should give us a consistent view
|
|
|
|
|
// of the number of live goroutines, minus the number of goroutines that are
|
|
|
|
|
// alive and permanently marked as "system". But to make this count agree
|
|
|
|
|
// with what we'd get from isSystemGoroutine, we need special handling for
|
|
|
|
|
// goroutines that can vary between user and system to ensure that the count
|
|
|
|
|
// doesn't change during the collection. So, check the finalizer goroutine
|
|
|
|
|
// in particular.
|
|
|
|
|
n = int(gcount())
|
2022-04-13 21:14:22 +08:00
|
|
|
if fingStatus.Load()&fingRunningFinalizer != 0 {
|
2022-02-18 10:56:16 -08:00
|
|
|
n++
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
if n > len(p) {
|
|
|
|
|
// There's not enough space in p to store the whole profile, so (per the
|
|
|
|
|
// contract of runtime.GoroutineProfile) we're not allowed to write to p
|
|
|
|
|
// at all and must return n, false.
|
2023-10-10 15:28:32 -04:00
|
|
|
startTheWorld(stw)
|
2022-02-18 10:56:16 -08:00
|
|
|
semrelease(&goroutineProfile.sema)
|
|
|
|
|
return n, false
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// Save current goroutine.
|
|
|
|
|
sp := getcallersp()
|
|
|
|
|
pc := getcallerpc()
|
|
|
|
|
systemstack(func() {
|
2024-05-17 15:07:07 +02:00
|
|
|
saveg(pc, sp, ourg, &p[0], pcbuf)
|
2022-02-18 10:56:16 -08:00
|
|
|
})
|
2023-10-24 13:10:13 -07:00
|
|
|
if labels != nil {
|
|
|
|
|
labels[0] = ourg.labels
|
|
|
|
|
}
|
2022-02-18 10:56:16 -08:00
|
|
|
ourg.goroutineProfiled.Store(goroutineProfileSatisfied)
|
|
|
|
|
goroutineProfile.offset.Store(1)
|
|
|
|
|
|
|
|
|
|
// Prepare for all other goroutines to enter the profile. Aside from ourg,
|
|
|
|
|
// every goroutine struct in the allgs list has its goroutineProfiled field
|
|
|
|
|
// cleared. Any goroutine created from this point on (while
|
|
|
|
|
// goroutineProfile.active is set) will start with its goroutineProfiled
|
|
|
|
|
// field set to goroutineProfileSatisfied.
|
|
|
|
|
goroutineProfile.active = true
|
|
|
|
|
goroutineProfile.records = p
|
|
|
|
|
goroutineProfile.labels = labels
|
2022-07-16 14:31:14 +00:00
|
|
|
// The finalizer goroutine needs special handling because it can vary over
|
2022-02-18 10:56:16 -08:00
|
|
|
// time between being a user goroutine (eligible for this profile) and a
|
|
|
|
|
// system goroutine (to be excluded). Pick one before restarting the world.
|
|
|
|
|
if fing != nil {
|
|
|
|
|
fing.goroutineProfiled.Store(goroutineProfileSatisfied)
|
2022-05-10 15:09:12 +00:00
|
|
|
if readgstatus(fing) != _Gdead && !isSystemGoroutine(fing, false) {
|
2024-05-17 15:07:07 +02:00
|
|
|
doRecordGoroutineProfile(fing, pcbuf)
|
2022-05-10 15:09:12 +00:00
|
|
|
}
|
2022-02-18 10:56:16 -08:00
|
|
|
}
|
2023-10-10 15:28:32 -04:00
|
|
|
startTheWorld(stw)
|
2022-02-18 10:56:16 -08:00
|
|
|
|
|
|
|
|
// Visit each goroutine that existed as of the startTheWorld call above.
|
|
|
|
|
//
|
|
|
|
|
// New goroutines may not be in this list, but we didn't want to know about
|
|
|
|
|
// them anyway. If they do appear in this list (via reusing a dead goroutine
|
|
|
|
|
// struct, or racing to launch between the world restarting and us getting
|
2022-05-17 21:25:43 +00:00
|
|
|
// the list), they will already have their goroutineProfiled field set to
|
2022-02-18 10:56:16 -08:00
|
|
|
// goroutineProfileSatisfied before their state transitions out of _Gdead.
|
|
|
|
|
//
|
|
|
|
|
// Any goroutine that the scheduler tries to execute concurrently with this
|
|
|
|
|
// call will start by adding itself to the profile (before the act of
|
|
|
|
|
// executing can cause any changes in its stack).
|
|
|
|
|
forEachGRace(func(gp1 *g) {
|
2024-05-17 15:07:07 +02:00
|
|
|
tryRecordGoroutineProfile(gp1, pcbuf, Gosched)
|
2022-02-18 10:56:16 -08:00
|
|
|
})
|
|
|
|
|
|
2023-10-10 15:28:32 -04:00
|
|
|
stw = stopTheWorld(stwGoroutineProfileCleanup)
|
2022-02-18 10:56:16 -08:00
|
|
|
endOffset := goroutineProfile.offset.Swap(0)
|
|
|
|
|
goroutineProfile.active = false
|
|
|
|
|
goroutineProfile.records = nil
|
|
|
|
|
goroutineProfile.labels = nil
|
2023-10-10 15:28:32 -04:00
|
|
|
startTheWorld(stw)
|
2022-02-18 10:56:16 -08:00
|
|
|
|
|
|
|
|
// Restore the invariant that every goroutine struct in allgs has its
|
|
|
|
|
// goroutineProfiled field cleared.
|
|
|
|
|
forEachGRace(func(gp1 *g) {
|
|
|
|
|
gp1.goroutineProfiled.Store(goroutineProfileAbsent)
|
|
|
|
|
})
|
|
|
|
|
|
|
|
|
|
if raceenabled {
|
|
|
|
|
raceacquire(unsafe.Pointer(&labelSync))
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
if n != int(endOffset) {
|
|
|
|
|
// It's a big surprise that the number of goroutines changed while we
|
|
|
|
|
// were collecting the profile. But probably better to return a
|
|
|
|
|
// truncated profile than to crash the whole process.
|
|
|
|
|
//
|
|
|
|
|
// For instance, needm moves a goroutine out of the _Gdead state and so
|
|
|
|
|
// might be able to change the goroutine count without interacting with
|
|
|
|
|
// the scheduler. For code like that, the race windows are small and the
|
|
|
|
|
// combination of features is uncommon, so it's hard to be (and remain)
|
|
|
|
|
// sure we've caught them all.
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
semrelease(&goroutineProfile.sema)
|
|
|
|
|
return n, true
|
|
|
|
|
}
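From the caller's side, the "return n, false without writing to p" contract mentioned above suggests a grow-and-retry loop. This is a minimal sketch using the public runtime.GoroutineProfile and runtime.NumGoroutine functions; the helper name goroutineRecords and the headroom of 10 are assumptions for illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// goroutineRecords is a hypothetical helper. It starts from a rough
// estimate and retries with the reported count plus headroom until the
// whole profile fits.
func goroutineRecords() []runtime.StackRecord {
	n := runtime.NumGoroutine()
	for {
		// Headroom covers goroutines started between the estimate and the call.
		p := make([]runtime.StackRecord, n+10)
		var ok bool
		n, ok = runtime.GoroutineProfile(p)
		if ok {
			return p[:n]
		}
	}
}

func main() {
	recs := goroutineRecords()
	fmt.Println("goroutines profiled:", len(recs))
	fmt.Println("frames in first stack:", len(recs[0].Stack()))
}
```

As the documentation above notes, most programs should prefer the runtime/pprof package over calling these functions directly.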
|
|
|
|
|
|
|
|
|
|
// tryRecordGoroutineProfileWB asserts that write barriers are allowed and calls
|
|
|
|
|
// tryRecordGoroutineProfile.
|
|
|
|
|
//
|
|
|
|
|
//go:yeswritebarrierrec
|
|
|
|
|
func tryRecordGoroutineProfileWB(gp1 *g) {
|
|
|
|
|
if getg().m.p.ptr() == nil {
|
|
|
|
|
throw("no P available, write barriers are forbidden")
|
|
|
|
|
}
|
2024-05-17 15:07:07 +02:00
|
|
|
tryRecordGoroutineProfile(gp1, nil, osyield)
|
2022-02-18 10:56:16 -08:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// tryRecordGoroutineProfile ensures that gp1 has the appropriate representation
|
|
|
|
|
// in the current goroutine profile: either that it should not be profiled, or
|
|
|
|
|
// that a snapshot of its call stack and labels is now in the profile.
|
2024-05-17 15:07:07 +02:00
|
|
|
func tryRecordGoroutineProfile(gp1 *g, pcbuf []uintptr, yield func()) {
|
2022-02-18 10:56:16 -08:00
|
|
|
if readgstatus(gp1) == _Gdead {
|
|
|
|
|
// Dead goroutines should not appear in the profile. Goroutines that
|
|
|
|
|
// start while profile collection is active will get goroutineProfiled
|
|
|
|
|
// set to goroutineProfileSatisfied before transitioning out of _Gdead,
|
|
|
|
|
// so here we check _Gdead first.
|
|
|
|
|
return
|
|
|
|
|
}
|
|
|
|
|
if isSystemGoroutine(gp1, true) {
|
|
|
|
|
// System goroutines should not appear in the profile. (The finalizer
|
|
|
|
|
// goroutine is marked as "already profiled".)
|
|
|
|
|
return
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
for {
|
|
|
|
|
prev := gp1.goroutineProfiled.Load()
|
|
|
|
|
if prev == goroutineProfileSatisfied {
|
|
|
|
|
// This goroutine is already in the profile (or is new since the
|
|
|
|
|
// start of collection, so shouldn't appear in the profile).
|
|
|
|
|
break
|
|
|
|
|
}
|
|
|
|
|
if prev == goroutineProfileInProgress {
|
|
|
|
|
// Something else is adding gp1 to the goroutine profile right now.
|
|
|
|
|
// Give that a moment to finish.
|
|
|
|
|
yield()
|
|
|
|
|
continue
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// While we have gp1.goroutineProfiled set to
|
|
|
|
|
// goroutineProfileInProgress, gp1 may appear _Grunnable but will not
|
|
|
|
|
// actually be able to run. Disable preemption for ourselves, to make
|
|
|
|
|
// sure we finish profiling gp1 right away instead of leaving it stuck
|
|
|
|
|
// in this limbo.
|
|
|
|
|
mp := acquirem()
|
|
|
|
|
if gp1.goroutineProfiled.CompareAndSwap(goroutineProfileAbsent, goroutineProfileInProgress) {
|
2024-05-17 15:07:07 +02:00
|
|
|
doRecordGoroutineProfile(gp1, pcbuf)
|
2022-02-18 10:56:16 -08:00
|
|
|
gp1.goroutineProfiled.Store(goroutineProfileSatisfied)
|
|
|
|
|
}
|
|
|
|
|
releasem(mp)
|
|
|
|
|
}
|
|
|
|
|
}
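The claim loop in tryRecordGoroutineProfile (Absent to InProgress to Satisfied, yielding while another party holds InProgress) can be sketched with sync/atomic. The item type and function names are hypothetical, and the sketch leaves out the runtime-only step of disabling preemption with acquirem.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

const (
	absent uint32 = iota
	inProgress
	satisfied
)

// item is a hypothetical record that must be captured exactly once.
type item struct {
	state    atomic.Uint32
	captures atomic.Int32
}

// tryRecord returns only once it.state is satisfied. Exactly one caller
// wins the absent -> inProgress swap and performs the capture; the others
// yield until the winner publishes satisfied.
func tryRecord(it *item) {
	for {
		switch it.state.Load() {
		case satisfied:
			return
		case inProgress:
			runtime.Gosched() // someone else is capturing; let them finish
			continue
		}
		if it.state.CompareAndSwap(absent, inProgress) {
			it.captures.Add(1) // stands in for saving the stack
			it.state.Store(satisfied)
		}
	}
}

// captureConcurrently races several workers on one item and reports how
// many captures actually happened.
func captureConcurrently(workers int) int32 {
	var it item
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			tryRecord(&it)
		}()
	}
	wg.Wait()
	return it.captures.Load()
}

func main() {
	fmt.Println("captures:", captureConcurrently(8))
}
```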
|
|
|
|
|
|
|
|
|
|
// doRecordGoroutineProfile writes gp1's call stack and labels to an in-progress
|
|
|
|
|
// goroutine profile. Preemption is disabled.
|
|
|
|
|
//
|
|
|
|
|
// This may be called via tryRecordGoroutineProfile in two ways: by the
|
|
|
|
|
// goroutine that is coordinating the goroutine profile (running on its own
|
|
|
|
|
// stack), or from the scheduler in preparation to execute gp1 (running on the
|
|
|
|
|
// system stack).
|
2024-05-17 15:07:07 +02:00
|
|
|
func doRecordGoroutineProfile(gp1 *g, pcbuf []uintptr) {
|
2022-02-18 10:56:16 -08:00
|
|
|
if readgstatus(gp1) == _Grunning {
|
|
|
|
|
print("doRecordGoroutineProfile gp1=", gp1.goid, "\n")
|
|
|
|
|
throw("cannot read stack of running goroutine")
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
offset := int(goroutineProfile.offset.Add(1)) - 1
|
|
|
|
|
|
|
|
|
|
if offset >= len(goroutineProfile.records) {
|
|
|
|
|
// Should be impossible, but better to return a truncated profile than
|
|
|
|
|
// to crash the entire process at this point. Instead, deal with it in
|
|
|
|
|
// goroutineProfileWithLabelsConcurrent where we have more context.
|
|
|
|
|
return
|
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
// saveg calls gentraceback, which may call cgo traceback functions. When
|
|
|
|
|
// called from the scheduler, this is on the system stack already so
|
|
|
|
|
// traceback.go:cgoContextPCs will avoid calling back into the scheduler.
|
|
|
|
|
//
|
|
|
|
|
// When called from the goroutine coordinating the profile, we still have
|
|
|
|
|
// set gp1.goroutineProfiled to goroutineProfileInProgress and so are still
|
|
|
|
|
// preventing it from being truly _Grunnable. So we'll use the system stack
|
|
|
|
|
// to avoid schedule delays.
|
2024-05-17 15:07:07 +02:00
|
|
|
systemstack(func() { saveg(^uintptr(0), ^uintptr(0), gp1, &goroutineProfile.records[offset], pcbuf) })
|
2022-02-18 10:56:16 -08:00
|
|
|
|
|
|
|
|
if goroutineProfile.labels != nil {
|
|
|
|
|
goroutineProfile.labels[offset] = gp1.labels
|
|
|
|
|
}
|
|
|
|
|
}
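The slot reservation in doRecordGoroutineProfile (an atomic add that hands each writer a unique index, with a bounds check instead of a crash) can be sketched as below. The buffer type and the put and fill names are assumptions made for this example.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// buffer is a hypothetical fixed-size output shared by several writers.
type buffer struct {
	offset  atomic.Int64
	records []int
}

// put reserves the next slot with a single atomic add. If the reserved
// index is past the end, it drops the value and reports false rather
// than writing out of bounds.
func (b *buffer) put(v int) bool {
	i := int(b.offset.Add(1)) - 1
	if i >= len(b.records) {
		return false
	}
	b.records[i] = v
	return true
}

// fill starts the given number of writers against a buffer with the
// given number of slots. It returns how many writes were stored and the
// final offset, which counts every reservation including the dropped ones.
func fill(slots, writers int) (stored int, reserved int64) {
	b := &buffer{records: make([]int, slots)}
	var wg sync.WaitGroup
	var okCount atomic.Int64
	for w := 0; w < writers; w++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			if b.put(v) {
				okCount.Add(1)
			}
		}(w + 1)
	}
	wg.Wait()
	return int(okCount.Load()), b.offset.Load()
}

func main() {
	stored, reserved := fill(4, 6)
	fmt.Println("stored:", stored, "reserved:", reserved)
}
```

A final offset larger than the slot count tells the coordinator that records were dropped, much as the comparison of n with endOffset does in goroutineProfileWithLabelsConcurrent.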
|
|
|
|
|
|
2024-05-17 15:07:07 +02:00
|
|
|
func goroutineProfileWithLabelsSync(p []profilerecord.StackRecord, labels []unsafe.Pointer) (n int, ok bool) {
|
2016-01-26 22:58:59 -05:00
|
|
|
gp := getg()
|
|
|
|
|
|
|
|
|
|
isOK := func(gp1 *g) bool {
|
|
|
|
|
// Checking isSystemGoroutine here makes GoroutineProfile
|
|
|
|
|
// consistent with both NumGoroutine and Stack.
|
2018-08-13 16:08:03 -04:00
|
|
|
return gp1 != gp && readgstatus(gp1) != _Gdead && !isSystemGoroutine(gp1, false)
|
2016-01-26 22:58:59 -05:00
|
|
|
}
|
|
|
|
|
|
2024-05-17 15:07:07 +02:00
|
|
|
pcbuf := makeProfStack() // see saveg() for explanation
|
2023-10-10 15:28:32 -04:00
|
|
|
stw := stopTheWorld(stwGoroutineProfile)
|
2016-01-26 22:58:59 -05:00
|
|
|
|
2020-12-23 15:05:37 -05:00
|
|
|
// World is stopped, no locking required.
|
2016-01-26 22:58:59 -05:00
|
|
|
n = 1
|
2020-12-23 15:05:37 -05:00
|
|
|
forEachGRace(func(gp1 *g) {
|
2016-01-26 22:58:59 -05:00
|
|
|
if isOK(gp1) {
|
|
|
|
|
n++
|
|
|
|
|
}
|
2020-12-23 15:05:37 -05:00
|
|
|
})
|
2014-09-01 00:06:26 -04:00
|
|
|
|
2014-09-01 18:51:12 -04:00
|
|
|
if n <= len(p) {
|
2016-01-26 22:58:59 -05:00
|
|
|
ok = true
|
2019-08-04 15:14:48 -04:00
|
|
|
r, lbl := p, labels
|
2016-01-26 22:58:59 -05:00
|
|
|
|
|
|
|
|
// Save current goroutine.
|
2018-04-26 14:06:08 -04:00
|
|
|
sp := getcallersp()
|
2017-09-22 15:16:26 -04:00
|
|
|
pc := getcallerpc()
|
2016-01-26 22:58:59 -05:00
|
|
|
systemstack(func() {
|
2024-05-17 15:07:07 +02:00
|
|
|
saveg(pc, sp, gp, &r[0], pcbuf)
|
2016-01-26 22:58:59 -05:00
|
|
|
})
|
|
|
|
|
r = r[1:]
|
|
|
|
|
|
2019-08-04 15:14:48 -04:00
|
|
|
// If we have a place to put our goroutine labelmap, insert it there.
|
|
|
|
|
if labels != nil {
|
|
|
|
|
lbl[0] = gp.labels
|
|
|
|
|
lbl = lbl[1:]
|
|
|
|
|
}
|
|
|
|
|
|
2016-01-26 22:58:59 -05:00
|
|
|
// Save other goroutines.
|
2020-12-23 15:05:37 -05:00
|
|
|
forEachGRace(func(gp1 *g) {
|
|
|
|
|
if !isOK(gp1) {
|
|
|
|
|
return
|
2014-09-01 00:06:26 -04:00
|
|
|
}
|
2020-12-23 15:05:37 -05:00
|
|
|
|
|
|
|
|
if len(r) == 0 {
|
|
|
|
|
// Should be impossible, but better to return a
|
|
|
|
|
// truncated profile than to crash the entire process.
|
|
|
|
|
return
|
|
|
|
|
}
|
2021-11-09 19:50:47 -05:00
|
|
|
// saveg calls gentraceback, which may call cgo traceback functions.
|
|
|
|
|
// The world is stopped, so it cannot use cgocall (which will be
|
|
|
|
|
// blocked at exitsyscall). Do it on the system stack so it won't
|
|
|
|
|
// call into the scheduler (see traceback.go:cgoContextPCs).
|
2024-05-17 15:07:07 +02:00
|
|
|
systemstack(func() { saveg(^uintptr(0), ^uintptr(0), gp1, &r[0], pcbuf) })
|
2020-12-23 15:05:37 -05:00
|
|
|
if labels != nil {
|
|
|
|
|
lbl[0] = gp1.labels
|
|
|
|
|
lbl = lbl[1:]
|
|
|
|
|
}
|
|
|
|
|
r = r[1:]
|
|
|
|
|
})
|
2014-09-01 00:06:26 -04:00
|
|
|
}
|
|
|
|
|
|
2022-02-14 12:16:22 -08:00
|
|
|
if raceenabled {
|
|
|
|
|
raceacquire(unsafe.Pointer(&labelSync))
|
|
|
|
|
}
|
|
|
|
|
|
runtime/metrics: add STW stopping and total time metrics
This CL adds four new time histogram metrics:
/sched/pauses/stopping/gc:seconds
/sched/pauses/stopping/other:seconds
/sched/pauses/total/gc:seconds
/sched/pauses/total/other:seconds
The "stopping" metrics measure the time taken to start a stop-the-world
pause. i.e., how long it takes stopTheWorldWithSema to stop all Ps.
This can be used to detect STW struggling to preempt Ps.
The "total" metrics measure the total duration of a stop-the-world
pause, from starting to stop-the-world until the world is started again.
This includes the time spent in the "start" phase.
The "gc" metrics are used for GC-related STW pauses. The "other" metrics
are used for all other STW pauses.
All of these metrics start timing in stopTheWorldWithSema only after
successfully acquiring sched.lock, thus excluding lock contention on
sched.lock. The reasoning behind this is that while waiting on
sched.lock the world is not stopped at all (all other Ps can run), so
the impact of this contention is primarily limited to the goroutine
attempting to stop-the-world. Additionally, we already have some
visibility into sched.lock contention via contention profiles (#57071).
/sched/pauses/total/gc:seconds is conceptually equivalent to
/gc/pauses:seconds, so the latter is marked as deprecated and returns
the same histogram as the former.
In the implementation, there are a few minor differences:
* For both mark and sweep termination stops, /gc/pauses:seconds started
timing prior to calling startTheWorldWithSema, thus including lock
contention.
These details are minor enough that I do not believe the slight change
in reporting will matter. For mark termination stops, moving timing stop
into startTheWorldWithSema does have the side effect of requiring moving
other GC metric calculations outside of the STW, as they depend on the
same end time.
Fixes #63340
Change-Id: Iacd0bab11bedab85d3dcfb982361413a7d9c0d05
Reviewed-on: https://go-review.googlesource.com/c/go/+/534161
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Pratt <mpratt@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-10-10 15:28:32 -04:00
	startTheWorld(stw)
2014-09-01 18:51:12 -04:00
	return n, ok
}
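The stop-the-world pause metrics named in the commit message above can be read from user code through the runtime/metrics package. The following is a minimal sketch, not code from the runtime: `pauseHistogram` is a hypothetical helper, and the sketch assumes that a Go release without these metrics reports them as unsupported (`metrics.KindBad`), in which case the helper returns nil.

```go
package main

import (
	"fmt"
	"runtime/metrics"
)

// pauseHistogram is a hypothetical helper (not part of the runtime).
// It returns the histogram for the named metric, or nil when the running
// Go release does not expose that metric as a float64 histogram.
func pauseHistogram(name string) *metrics.Float64Histogram {
	samples := []metrics.Sample{{Name: name}}
	metrics.Read(samples)
	if samples[0].Value.Kind() != metrics.KindFloat64Histogram {
		return nil
	}
	return samples[0].Value.Float64Histogram()
}

func main() {
	names := []string{
		"/sched/pauses/stopping/gc:seconds",
		"/sched/pauses/stopping/other:seconds",
		"/sched/pauses/total/gc:seconds",
		"/sched/pauses/total/other:seconds",
	}
	for _, name := range names {
		h := pauseHistogram(name)
		if h == nil {
			fmt.Printf("%s: not supported by this Go release\n", name)
			continue
		}
		// Sum the bucket counts to get the number of recorded pauses.
		var total uint64
		for _, c := range h.Counts {
			total += c
		}
		fmt.Printf("%s: %d pauses recorded\n", name, total)
	}
}
```

Checking the value kind before calling Float64Histogram matters because that accessor panics when the kind does not match.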
2014-09-01 00:06:26 -04:00

2019-08-04 15:14:48 -04:00
// GoroutineProfile returns n, the number of records in the active goroutine stack profile.
// If len(p) >= n, GoroutineProfile copies the profile into p and returns n, true.
// If len(p) < n, GoroutineProfile does not change p and returns n, false.
//
2023-11-07 17:35:46 +08:00
// Most clients should use the [runtime/pprof] package instead
2019-08-04 15:14:48 -04:00
// of calling GoroutineProfile directly.
func GoroutineProfile(p []StackRecord) (n int, ok bool) {
2024-05-17 15:07:07 +02:00
	records := make([]profilerecord.StackRecord, len(p))
	n, ok = goroutineProfileInternal(records)
	if !ok {
		return
	}
	for i, mr := range records[0:n] {
		copy(p[i].Stack0[:], mr.Stack)
	}
	return
}
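The n/ok contract documented for GoroutineProfile implies a retry loop on the caller's side, because the number of goroutines can grow between the sizing call and the copying call. A minimal sketch of such a caller follows; `snapshotGoroutines` and the headroom of 10 records are assumptions made for illustration, not part of the runtime.

```go
package main

import (
	"fmt"
	"runtime"
)

// snapshotGoroutines is a hypothetical helper (not part of the runtime).
// It retries until the slice is large enough to hold the whole profile.
func snapshotGoroutines() []runtime.StackRecord {
	// A nil slice never fits a non-empty profile, so this call only
	// reports how many records are needed.
	n, _ := runtime.GoroutineProfile(nil)
	for {
		// Leave headroom for goroutines started since the last call.
		p := make([]runtime.StackRecord, n+10)
		var ok bool
		n, ok = runtime.GoroutineProfile(p)
		if ok {
			return p[:n]
		}
	}
}

func main() {
	records := snapshotGoroutines()
	fmt.Println("goroutines profiled:", len(records))
	if len(records) > 0 {
		// Stack returns the program counters recorded for this goroutine.
		for _, pc := range records[0].Stack() {
			fmt.Printf("  %#x\n", pc)
		}
	}
}
```

As the doc comment says, most programs are better served by runtime/pprof; the sketch only illustrates the return-value contract.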
2019-08-04 15:14:48 -04:00

2024-05-17 15:07:07 +02:00
func goroutineProfileInternal(p []profilerecord.StackRecord) (n int, ok bool) {
2019-08-04 15:14:48 -04:00
	return goroutineProfileWithLabels(p, nil)
}

2024-05-17 15:07:07 +02:00
func saveg(pc, sp uintptr, gp *g, r *profilerecord.StackRecord, pcbuf []uintptr) {
	// To reduce memory usage, we want to allocate a r.Stack that is just big
	// enough to hold gp's stack trace. Naively we might achieve this by
	// recording our stack trace into mp.profStack, and then allocating a
	// r.Stack of the right size. However, mp.profStack is also used for
	// allocation profiling, so it could get overwritten if the slice allocation
	// gets profiled. So instead we record the stack trace into a temporary
	// pcbuf which is usually given to us by our caller. When it's not, we have
	// to allocate one here. This will only happen for goroutines that were in a
	// syscall when the goroutine profile started or for goroutines that manage
	// to execute before we finish iterating over all the goroutines.
	if pcbuf == nil {
		pcbuf = makeProfStack()
	}

2023-02-14 12:25:11 -05:00
	var u unwinder
	u.initAt(pc, sp, 0, gp, unwindSilentErrors)
2024-05-17 15:07:07 +02:00
	n := tracebackPCs(&u, 0, pcbuf)
	r.Stack = make([]uintptr, n)
	copy(r.Stack, pcbuf)
2014-09-01 00:06:26 -04:00
}
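The pattern saveg describes (unwind into a reusable scratch buffer, then allocate a slice of exactly the right length and copy into it) can be illustrated in user code with the public runtime.Callers API. This is only an analogy under stated assumptions: `captureExact` and the scratch size of 128 are hypothetical, and runtime.Callers stands in for the internal unwinder.

```go
package main

import (
	"fmt"
	"runtime"
)

// captureExact is a hypothetical user-level analogue of saveg.
// It records the caller's stack into scratch, then returns a copy that is
// exactly as long as the recorded trace.
func captureExact(scratch []uintptr) []uintptr {
	if scratch == nil {
		// No buffer supplied by the caller, so allocate one here.
		// 128 is an assumed depth limit for this sketch.
		scratch = make([]uintptr, 128)
	}
	// Skip 1 frame so the trace starts at captureExact rather than
	// at runtime.Callers itself.
	n := runtime.Callers(1, scratch)
	stack := make([]uintptr, n)
	copy(stack, scratch)
	return stack
}

func main() {
	scratch := make([]uintptr, 128) // reused across calls
	for i := 0; i < 3; i++ {
		stack := captureExact(scratch)
		fmt.Printf("call %d: %d frames, capacity %d\n", i, len(stack), cap(stack))
	}
}
```

Reusing one scratch buffer keeps the per-record cost to a single allocation sized to the trace, which is the memory saving the comment in saveg is after.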
2014-08-26 08:34:46 +02:00
// Stack formats a stack trace of the calling goroutine into buf
// and returns the number of bytes written to buf.
// If all is true, Stack formats stack traces of all other goroutines
// into buf after the trace for the current goroutine.
func Stack(buf []byte, all bool) int {
2023-10-10 15:28:32 -04:00
	var stw worldStop
2014-08-26 08:34:46 +02:00
	if all {
2023-10-10 15:28:32 -04:00
		stw = stopTheWorld(stwAllGoroutinesStack)
2014-08-26 08:34:46 +02:00
	}

	n := 0
	if len(buf) > 0 {
2014-12-15 14:39:28 -08:00
		gp := getg()
2018-04-26 14:06:08 -04:00
		sp := getcallersp()
2017-09-22 15:16:26 -04:00
		pc := getcallerpc()
[dev.cc] runtime: delete scalararg, ptrarg; rename onM to systemstack
Scalararg and ptrarg are not "signal safe".
Go code filling them out can be interrupted by a signal,
and then the signal handler runs, and if it also ends up
in Go code that uses scalararg or ptrarg, now the old
values have been smashed.
For the pieces of code that do need to run in a signal handler,
we introduced onM_signalok, which is really just onM
except that the _signalok is meant to convey that the caller
asserts that scalararg and ptrarg will be restored to their old
values after the call (instead of the usual behavior, zeroing them).
Scalararg and ptrarg are also untyped and therefore error-prone.
Go code can always pass a closure instead of using scalararg
and ptrarg; they were only really necessary for C code.
And there's no more C code.
For all these reasons, delete scalararg and ptrarg, converting
the few remaining references to use closures.
Once those are gone, there is no need for a distinction between
onM and onM_signalok, so replace both with a single function
equivalent to the current onM_signalok (that is, it can be called
on any of the curg, g0, and gsignal stacks).
The name onM and the phrase 'm stack' are misnomers,
because on most systems an M has two system stacks:
the main thread stack and the signal handling stack.
Correct the misnomer by naming the replacement function systemstack.
Fix a few references to "M stack" in code.
The main motivation for this change is to eliminate scalararg/ptrarg.
Rick and I have already seen them cause problems because
the calling sequence m.ptrarg[0] = p is a heap pointer assignment,
so it gets a write barrier. The write barrier also uses onM, so it has
all the same problems as if it were being invoked by a signal handler.
We worked around this by saving and restoring the old values
and by calling onM_signalok, but there's no point in keeping this nice
home for bugs around any longer.
This CL also changes funcline to return the file name as a result
instead of filling in a passed-in *string. (The *string signature is
left over from when the code was written in and called from C.)
That's arguably an unrelated change, except that once I had done
the ptrarg/scalararg/onM cleanup I started getting false positives
about the *string argument escaping (not allowed in package runtime).
The compiler is wrong, but the easiest fix is to write the code like
Go code instead of like C code. I am a bit worried that the compiler
is wrong because of some use of uninitialized memory in the escape
analysis. If that's the reason, it will go away when we convert the
compiler to Go. (And if not, we'll debug it the next time.)
LGTM=khr
R=r, khr
CC=austin, golang-codereviews, iant, rlh
https://golang.org/cl/174950043
2014-11-12 14:54:31 -05:00
		systemstack(func() {
runtime: avoid gentraceback of self on user goroutine stack
Gentraceback may grow the stack.
One of the gentraceback wrappers may grow the stack.
One of the gentraceback callback calls may grow the stack.
Various stack pointers are stored in various stack locations
as type uintptr during the execution of these calls.
If the stack does grow, these stack pointers will not be
updated and will start trying to decode stack memory that
is no longer valid.
It may be possible to change the type of the stack pointer
variables to be unsafe.Pointer, but that's pretty subtle and
may still have problems, even if we catch every last one.
An easier, more obviously correct fix is to require that
gentraceback of the currently running goroutine must run
on the g0 stack, not on the goroutine's own stack.
Not doing this causes faults when you set
StackFromSystem = 1
StackFaultOnFree = 1
The new check in gentraceback will catch future lapses.
The more general problem is calling getcallersp but then
calling a function that might relocate the stack, which would
invalidate the result of getcallersp. Add note to stubs.go
declaration of getcallersp explaining the problem, and
check all existing calls to getcallersp. Most needed fixes.
This affects Callers, Stack, and nearly all the runtime
profiling routines. It does not affect stack copying directly
nor garbage collection.
LGTM=khr
R=khr, bradfitz
CC=golang-codereviews, r
https://golang.org/cl/167060043
2014-11-05 23:01:48 -05:00
			g0 := getg()
2016-01-06 21:16:01 -05:00
			// Force traceback=1 to override GOTRACEBACK setting,
			// so that Stack's results are consistent.
			// GOTRACEBACK is only about crash dumps.
			g0.m.traceback = 1
2014-11-05 23:01:48 -05:00
			g0.writebuf = buf[0:0:len(buf)]
			goroutineheader(gp)
			traceback(pc, sp, 0, gp)
			if all {
				tracebackothers(gp)
			}
2016-01-06 21:16:01 -05:00
			g0.m.traceback = 0
2014-11-05 23:01:48 -05:00
			n = len(g0.writebuf)
			g0.writebuf = nil
		})
2014-08-26 08:34:46 +02:00
	}

	if all {
2023-10-10 15:28:32 -04:00
		startTheWorld(stw)
2014-08-26 08:34:46 +02:00
	}
	return n
}
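Because Stack writes at most len(buf) bytes and returns the number written, a caller cannot tell from a full buffer whether the trace was truncated. A common workaround is to grow the buffer until the result is strictly shorter than it. The sketch below assumes that approach; `dumpStacks` and the 1024-byte starting size are hypothetical, not part of the runtime.

```go
package main

import (
	"fmt"
	"runtime"
)

// dumpStacks is a hypothetical helper (not part of the runtime).
// It doubles the buffer until the formatted trace fits with room to spare.
func dumpStacks(all bool) []byte {
	buf := make([]byte, 1024) // assumed starting size
	for {
		n := runtime.Stack(buf, all)
		if n < len(buf) {
			// The trace ended before the buffer did, so it is complete.
			return buf[:n]
		}
		buf = make([]byte, 2*len(buf))
	}
}

func main() {
	// all=false formats only the calling goroutine and, per the code
	// above, does not stop the world.
	fmt.Printf("%s", dumpStacks(false))
}
```

Passing all=true captures every goroutine at the cost of a stop-the-world pause for the duration of the formatting.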