Ignored these linknames, which have not worked for a while:

github.com/xtls/xray-core:
    context.newCancelCtx removed in CL 463999 (Feb 2023)
github.com/u-root/u-root:
    funcPC removed in CL 513837 (Jul 2023)
tinygo.org/x/drivers:
    net.useNetdev never existed
For #67401.
Change-Id: I9293f4ef197bb5552b431de8939fa94988a060ce
Reviewed-on: https://go-review.googlesource.com/c/go/+/587576
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Note that this depends on the revert of CL 581395 to move zeroVal back.
For #67401.
Change-Id: I507c27c2404ad1348aabf1ffa3740e6b1957495b
Reviewed-on: https://go-review.googlesource.com/c/go/+/587217
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Currently bulkBarrierPreWrite follows a fairly slow path wherein it
calls typePointersOf, which ends up calling into fastForward. This does
some fairly heavy computation to move the iterator forward, making no
assumptions at all about where it lands. It needs to be completely
general to support splitting at arbitrary boundaries, for example for
scanning oblets.
This means that copying objects during the GC mark phase is fairly
expensive, and is a regression from before allocheaders.
However, in almost all cases bulkBarrierPreWrite and
bulkBarrierPreWriteSrcOnly have perfect type information. We can do a
lot better in these cases because we're starting on a type-size
boundary, which is exactly what the iterator is built around.
This change adds the typePointersOfType method which produces a
typePointers iterator from a pointer and a type. This change
significantly improves the performance of these bulk write barriers,
eliminating some performance regressions that were noticed on the perf
dashboard.
There are still a couple of cases where we have to use the more
general typePointersOf calls, but they're fairly rare; most bulk
barriers have perfect type information.
This change is tested by the GCInfo tests in the runtime and the GCBits
tests in the reflect package via an additional check in getgcmask.
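To illustrate the idea (a sketch with illustrative names only, not the
runtime's actual code): with perfect type information, the iterator can
assume it starts on a type-size boundary and walk the type's per-element
pointer bitmap directly, instead of fast-forwarding through span
metadata:

    // Sketch: visit the pointer words of n back-to-back values of a
    // type whose per-word pointer bitmap is ptrBits (bit i set means
    // word i holds a pointer). The real typePointers iterator must
    // additionally handle arbitrary split boundaries.
    package main

    const ptrSize = 8 // assuming a 64-bit platform

    func eachPtrWord(base, size uintptr, ptrBits uint64, n int, visit func(addr uintptr)) {
        for i := 0; i < n; i++ {
            for w := uintptr(0); w*ptrSize < size; w++ {
                if ptrBits&(1<<w) != 0 {
                    visit(base + w*ptrSize)
                }
            }
            base += size // each value starts on a type-size boundary
        }
    }

    func main() {
        // A 24-byte type laid out as {ptr, scalar, ptr}: bitmap 0b101.
        eachPtrWord(0x1000, 24, 0b101, 2, func(addr uintptr) { println(addr) })
    }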
Results for tile38 before and after allocheaders. There was previously
a regression in the p90; now it's gone. Also, the overall win has been
boosted slightly.
tile38 $ benchstat noallocheaders.results allocheaders.results
name             old time/op            new time/op            delta
Tile38QueryLoad  481µs ± 1%             468µs ± 1%             -2.71%  (p=0.000 n=10+10)

name             old average-RSS-bytes  new average-RSS-bytes  delta
Tile38QueryLoad  6.32GB ± 1%            6.23GB ± 0%            -1.38%  (p=0.000 n=9+8)

name             old peak-RSS-bytes     new peak-RSS-bytes     delta
Tile38QueryLoad  6.49GB ± 1%            6.40GB ± 1%            -1.38%  (p=0.002 n=10+10)

name             old peak-VM-bytes      new peak-VM-bytes      delta
Tile38QueryLoad  7.72GB ± 1%            7.64GB ± 1%            -1.07%  (p=0.007 n=10+10)

name             old p50-latency-ns     new p50-latency-ns     delta
Tile38QueryLoad  212k ± 1%              205k ± 0%              -3.02%  (p=0.000 n=10+9)

name             old p90-latency-ns     new p90-latency-ns     delta
Tile38QueryLoad  622k ± 1%              616k ± 1%              -1.03%  (p=0.005 n=10+10)

name             old p99-latency-ns     new p99-latency-ns     delta
Tile38QueryLoad  4.55M ± 2%             4.39M ± 2%             -3.51%  (p=0.000 n=10+10)

name             old ops/s              new ops/s              delta
Tile38QueryLoad  12.5k ± 1%             12.8k ± 1%             +2.78%  (p=0.000 n=10+10)
Change-Id: I0a48f848eae8777d0fd6769c3a1fe449f8d9d0a6
Reviewed-on: https://go-review.googlesource.com/c/go/+/542219
Reviewed-by: Cherry Mui <cherryyz@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
This change replaces the 1-bit-per-word heap bitmap for most size
classes with allocation headers for objects that contain pointers. The
header consists of a single pointer to a type. All allocations with
headers are treated as implicitly containing one or more instances of
the type in the header.
As the name implies, headers are usually stored as the first word of an
object. There are two exceptions to where headers are stored and how
they're used.
Objects smaller than 512 bytes do not have headers. Instead, a heap
bitmap is reserved at the end of spans for objects of this size. A full
word of overhead is too much for these small objects. The bitmap has
the same format as the old bitmap, minus the noMorePtrs bits, which are
unnecessary: all objects under 512 bytes have a bitmap smaller than a
pointer-word, and a pointer-word was the granularity at which noMorePtrs
could stop scanning early anyway.
Objects that are larger than 32 KiB (which have their own span) have
their headers stored directly in the span, to allow power-of-two-sized
allocations to not spill over into an extra page.
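The three cases above can be summarized in a sketch (thresholds per the
text; the function and strings are illustrative, not runtime code):

    // Where an object's pointer metadata lives under allocheaders.
    package main

    import "fmt"

    func metadataLocation(objSize uintptr) string {
        switch {
        case objSize < 512: // too small for a header word
            return "heap bitmap reserved at the end of the span"
        case objSize <= 32<<10: // fits a shared span
            return "type pointer stored as the object's first word"
        default: // has its own span
            return "type pointer stored directly in the span"
        }
    }

    func main() {
        for _, s := range []uintptr{64, 1024, 64 << 10} {
            fmt.Printf("%6d bytes: %s\n", s, metadataLocation(s))
        }
    }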
The full implementation is behind GOEXPERIMENT=allocheaders.
The purpose of this change is performance. First and foremost, with
headers we no longer have to unroll pointer/scalar data at allocation
time for most size classes. Small size classes still need some
unrolling, but their bitmaps are small so we can optimize that case
fairly well. Larger objects effectively have their pointer/scalar data
unrolled on-demand from type data, which is much more compactly
represented and results in less TLB pressure. Furthermore, since the
headers are usually right next to the object and where we're about to
start scanning, we get an additional temporal locality benefit in the
data cache when looking up type metadata. The pointer/scalar data is
now effectively unrolled on-demand, but it's also simpler to unroll than
before; that unrolled data is never written anywhere, and for arrays we
get the benefit of retreading the same data per element, as opposed to
looking it up from scratch for each pointer-word of bitmap. Lastly,
because we no longer have a heap bitmap that spans the entire heap,
there's a flat 1.5% memory use reduction. This is balanced slightly by
some objects possibly being bumped up a size class, but most objects are
not tightly optimized to size class sizes so there's some memory to
spare, making the header basically free in those cases.
See the follow-up CL (CL 538217), which turns this experiment on by
default, for benchmark results.
Change-Id: I4c9034ee200650d06d8bdecd579d5f7c1bbf1fc5
Reviewed-on: https://go-review.googlesource.com/c/go/+/437955
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
This allows reusing the slice cap computation across
specialized growslice funcs.
Updates #49480
Change-Id: Ie075d9c3075659ea14c11d51a9cd4ed46aa0e961
Reviewed-on: https://go-review.googlesource.com/c/go/+/495876
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Egon Elbre <egonelbre@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
This is a tiny optimization for growslice, probably too small to
measure easily.
Move the for loop to avoid multiple checks inside the loop.
Also, use >> 2 instead of /4, which generates fewer instructions.
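The shift and the division differ only for negative values, which is
why the compiler cannot always do this rewrite itself; a small
demonstration:

    // Go's / truncates toward zero, while >> floors, so x/4 on a
    // signed int needs a sign fix-up unless x is provably
    // non-negative. For non-negative x the two agree.
    package main

    import "fmt"

    func main() {
        fmt.Println(-7 / 4)    // -1 (truncates toward zero)
        fmt.Println(-7 >> 2)   // -2 (floors)
        fmt.Println(7/4, 7>>2) // 1 1 (agree for non-negative values)
    }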
Change-Id: I9ab09bdccb56f98ab22073f23d9e102c252238c7
Reviewed-on: https://go-review.googlesource.com/c/go/+/493795
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Egon Elbre <egonelbre@gmail.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
This touches a lot of files, which is bad, but it is also good: N
copies of this information are commoned into one.
The new files in internal/abi are copied from the end of the stack;
ultimately this will all end up being used.
Change-Id: Ia252c0055aaa72ca569411ef9f9e96e3d610889e
Reviewed-on: https://go-review.googlesource.com/c/go/+/462995
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Replace all uses of Ctz64/32/8 with TrailingZeros64/32/8, because they
are the same functions, needlessly duplicated. Also rename the CtzXX
functions in 386 assembly code.
Change-Id: I19290204858083750f4be589bb0923393950ae6d
Reviewed-on: https://go-review.googlesource.com/c/go/+/438935
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
Auto-Submit: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
The Grow method is like the proposed slices.Grow function
in that it ensures that the slice has enough capacity to append
n elements without allocating.
The implementation of Grow is a thin wrapper over runtime.growslice.
This also changes Append and AppendSlice to use growslice under the hood.
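Example use of the new method (Grow requires an addressable slice
value, so it is reached through a pointer here):

    package main

    import (
        "fmt"
        "reflect"
    )

    func main() {
        s := []int{1, 2, 3}
        v := reflect.ValueOf(&s).Elem() // Grow panics on unaddressable values
        v.Grow(100)                     // room for 100 more appends, one allocation
        fmt.Println(v.Len(), v.Cap() >= v.Len()+100) // 3 true
    }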
Fixes #48000
Change-Id: I992a58584a2ff1448c1c2bc0877fe76073609111
Reviewed-on: https://go-review.googlesource.com/c/go/+/389635
Run-TryBot: Joseph Tsai <joetsai@digital-static.net>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Instead of passing the original length and the new length, pass
the new length and the length increment. Also use the new length
in all the post-growslice calculations so that the original length
is dead and does not need to be spilled/restored around the growslice.
old: growslice(typ, oldPtr, oldLen, oldCap, newLen) (newPtr, newLen, newCap)
new: growslice(oldPtr, newLen, oldCap, inc, typ) (newPtr, newLen, newCap)
where inc = # of elements added = newLen-oldLen
Also move the element type to the end of the call. This makes register
allocation more efficient, as oldPtr and newPtr can often be in the
same register (e.g. AX on amd64) and thus the phi takes no instructions.
Makes the go binary 0.3% smaller.
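A runnable toy showing why the caller no longer needs the old length
(the real growslice is runtime-internal; toyGrowslice merely mirrors
the new argument order, minus the type, which a toy doesn't need):

    package main

    import "fmt"

    // The callee recovers oldLen as newLen-inc, so callers never pass
    // it and need not keep it alive across the call.
    func toyGrowslice(old []int, newLen, oldCap, inc int) ([]int, int, int) {
        oldLen := newLen - inc
        newCap := oldCap * 2
        if newCap < newLen {
            newCap = newLen
        }
        s := make([]int, newLen, newCap)
        copy(s, old[:oldLen])
        return s, newLen, newCap
    }

    func main() {
        s := []int{1, 2, 3}
        grown, l, c := toyGrowslice(s, len(s)+2, cap(s), 2)
        fmt.Println(grown, l, c) // [1 2 3 0 0] 5 6
    }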
Change-Id: I7295a60227dbbeecec2bf039eeef2950a72df760
Reviewed-on: https://go-review.googlesource.com/c/go/+/418554
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
[this is a retry of CL 407035 + its revert CL 422395. The content is unchanged]
Use just 1 bit per word to record the ptr/nonptr bitmap.
Use word-sized operations to manipulate the bitmap, so we can operate
on up to 64 ptr/nonptr bits at a time.
Use a separate bitmap, one bit per word of the ptr/nonptr bitmap,
to encode a no-more-pointers signal. Since we can check 64 ptr/nonptr
bits at once, knowing the exact last pointer location is not necessary.
As a follow-on CL, we should make the gcdata bitmap an array of
uintptr instead of an array of byte, so we can load 64 bits of it at once.
Similarly for the processing of gc programs.
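The scan structure this change enables looks roughly like the following
sketch (illustrative only; the real code operates on heap bitmap words
in the runtime):

    package main

    import (
        "fmt"
        "math/bits"
    )

    // ptrBits has one bit per heap word; noMorePtrs has one bit per
    // *word of ptrBits*, meaning "no pointers at or after this chunk",
    // so the exact last-pointer position never needs to be known.
    func scanChunks(ptrBits, noMorePtrs []uint64, visit func(word int)) {
        for i, w := range ptrBits {
            for b := w; b != 0; b &= b - 1 { // one iteration per set bit
                visit(i*64 + bits.TrailingZeros64(b))
            }
            if noMorePtrs[i/64]&(1<<(i%64)) != 0 {
                return // stop early: nothing to scan past this chunk
            }
        }
    }

    func main() {
        ptr := []uint64{0b1001, 0, 0} // pointers at words 0 and 3
        stop := []uint64{0b1}         // chunk 0 is the last with pointers
        scanChunks(ptr, stop, func(w int) { fmt.Println("pointer at word", w) })
    }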
Change-Id: Ica5eb622f5b87e647be64f471d67b02732ef8be6
Reviewed-on: https://go-review.googlesource.com/c/go/+/422634
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
This reverts commit b589208c8c.
Reason for revert: Bug somewhere in this code, causing wasm and maybe linux/386 to fail.
Change-Id: I5e1e501d839584e0219271bb937e94348f83c11f
Reviewed-on: https://go-review.googlesource.com/c/go/+/422395
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
There's no need for a and b to match types. The typechecker already
ensured that a and b are both slices with the same base type, or
a and b are (possibly named) []byte and string.
The optimization to treat append(b, make([], ...)) as a zeroing
slice extension doesn't fire when there's an OCONVNOP wrapping the make.
Fixes #53888
Change-Id: Ied871ed0bbb8e4a4b35d280c71acbab8103691bc
Reviewed-on: https://go-review.googlesource.com/c/go/+/418475
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Keith Randall <khr@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
Use just 1 bit per word to record the ptr/nonptr bitmap.
Use word-sized operations to manipulate the bitmap, so we can operate
on up to 64 ptr/nonptr bits at a time.
Use a separate bitmap, one bit per word of the ptr/nonptr bitmap,
to encode a no-more-pointers signal. Since we can check 64 ptr/nonptr
bits at once, knowing the exact last pointer location is not necessary.
This cleans up the bitmap implementation significantly, which will
hopefully make it faster. TODO: measure
As a follow-on CL, we should make the gcdata bitmap an array of
uintptr instead of an array of byte, so we can load 64 bits of it at once.
Similarly for the processing of gc programs.
Change-Id: I18151b1876d9543599800dec51e2a1b19df97d49
Reviewed-on: https://go-review.googlesource.com/c/go/+/407035
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@google.com>
This prevents heavy runtime call overhead, and gives the compiler a
chance to optimize the bounds check.
With this optimization, changing runtime/stack.go to use unsafe.Slice
no longer negatively impacts stack copying performance:
name                   old time/op    new time/op    delta
StackCopyWithStkobj-8  16.3ms ± 6%    16.5ms ± 5%    ~  (p=0.382 n=8+8)

name                   old alloc/op   new alloc/op   delta
StackCopyWithStkobj-8  17.0B ± 0%     17.0B ± 0%     ~  (all equal)

name                   old allocs/op  new allocs/op  delta
StackCopyWithStkobj-8  1.00 ± 0%      1.00 ± 0%      ~  (all equal)
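For reference, the builtin being open-coded (its semantics are
unchanged by this CL, only the code generation):

    package main

    import (
        "fmt"
        "unsafe"
    )

    func main() {
        arr := [4]int{10, 20, 30, 40}
        s := unsafe.Slice(&arr[0], 3)  // []int over arr's first 3 elements
        fmt.Println(s, len(s), cap(s)) // [10 20 30] 3 3
    }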
Fixes #48798
Change-Id: I731a9a4abd6dd6846f44eece7f86025b7bb1141b
Reviewed-on: https://go-review.googlesource.com/c/go/+/362934
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
The code was updated, the comments were not.
Change-Id: If387779f3abd5e8a1b487fe34c33dcf9ce5fa7ff
Reviewed-on: https://go-review.googlesource.com/c/go/+/383495
Trust: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Add explicit address sanitizer instrumentation to the runtime and
syscall packages. The compiler does not instrument the runtime
package. It does instrument the syscall package, but we need to add
a couple of cases that it can't see.
Following the implementation of the asan malloc runtime library, this
patch also allocates extra memory as a redzone around each returned
memory region and marks the redzone as unaddressable, in order to
detect overflows and underflows.
Updates #44853.
Change-Id: I2753d1cc1296935a66bf521e31ce91e35fcdf798
Reviewed-on: https://go-review.googlesource.com/c/go/+/298614
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Trust: fannie zhang <Fannie.Zhang@arm.com>
This reduces the number of branches needed to bounds-check non-empty
slices from 5 to 3. It also increases the number of branches needed to
handle empty slices from 1 to 3, but for non-panicking calls they
should all be predictable.
Updates #48798.
Change-Id: I3ffa66857096486f4dee417e1a66eb8fdf7a3777
Reviewed-on: https://go-review.googlesource.com/c/go/+/355490
Trust: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Allow the user to construct slices that are larger than the Go heap as
long as they don't overflow the address space.
Updates #48798.
Change-Id: I659c8334d04676e1f253b9c3cd499eab9b9f989a
Reviewed-on: https://go-review.googlesource.com/c/go/+/355489
Trust: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Instead of growing 2x for < 1024 elements and 1.25x for >= 1024 elements,
use a somewhat smoother formula for the growth factor. Start reducing
the growth factor after 256 elements, but slowly.
starting cap    growth factor
256             2.0
512             1.63
1024            1.44
2048            1.35
4096            1.30
(Note that the real growth factor, both before and now, is somewhat
larger because we round up to the next size class.)
This CL also makes the growth monotonic (larger initial capacities
make larger final capacities, which was not true before). See discussion
at https://groups.google.com/g/golang-nuts/c/UaVlMQ8Nz3o
256 was chosen as the threshold to roughly match the total number of
reallocations incurred when appending our way up to a very large slice.
(Compared to the old policy, we allocate smaller when appending at
capacities in [256,1024] and larger at capacities in [1024,...].)
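The transition can be written as a formula of roughly this shape (a
sketch omitting the size-class roundup and overflow checks; it
reproduces the growth factors in the table above up to rounding):

    package main

    import "fmt"

    func nextCap(oldCap int) int {
        const threshold = 256
        if oldCap < threshold {
            return oldCap * 2 // small slices still double
        }
        // Grows 2x at the threshold and approaches 1.25x for large
        // capacities; (oldCap+3*threshold)/4 interpolates between them.
        return oldCap + (oldCap+3*threshold)/4
    }

    func main() {
        for _, c := range []int{256, 512, 1024, 2048, 4096} {
            n := nextCap(c)
            fmt.Printf("%5d -> %5d (%.3fx)\n", c, n, float64(n)/float64(c))
        }
    }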
Change-Id: I876df09fdc9ae911bb94e41cb62675229cb10512
Reviewed-on: https://go-review.googlesource.com/c/go/+/347917
Trust: Keith Randall <khr@golang.org>
Trust: Martin Möhrmann <martin@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Martin Möhrmann <martin@golang.org>
Conflicts:

- src/cmd/compile/internal/walk/builtin.go

  On dev.typeparams, CL 330194 changed OCHECKNIL to not require manual
  SetTypecheck(1) anymore; while on master, CL 331070 got rid of the
  OCHECKNIL altogether by moving the check into the runtime support
  functions.

- src/internal/buildcfg/exp.go

  On master, CL 331109 refactored the logic for parsing the
  GOEXPERIMENT string, so that it could be more easily reused by
  cmd/go; while on dev.typeparams, several CLs tweaked the regabi
  experiment defaults.
Merge List:
+ 2021-06-30 4711bf30e5 doc/go1.17: linkify "language changes" in the runtime section
+ 2021-06-30 ed56ea73e8 path/filepath: deflake TestEvalSymlinksAboveRoot on darwin
+ 2021-06-30 c080d0323b cmd/dist: pass -Wno-unknown-warning-option in swig_callback_lto
+ 2021-06-30 7d0e9e6e74 image/gif: fix typo in the comment (io.ReadByte -> io.ByteReader)
+ 2021-06-30 0fa3265fe1 os: change example to avoid deprecated function
+ 2021-06-30 d19a53338f image: add Uniform.RGBA64At and Rectangle.RGBA64At
+ 2021-06-30 c45e800e0c crypto/x509: don't fail on optional auth key id fields
+ 2021-06-29 f9d50953b9 net: fix failure of TestCVE202133195
+ 2021-06-29 e294b8a49e doc/go1.17: fix typo "MacOS" -> "macOS"
+ 2021-06-29 3463852b76 math/big: fix typo of comment (`BytesScanner` to `ByteScanner`)
+ 2021-06-29 fd4b587da3 cmd/compile: suppress details error for invalid variadic argument type
+ 2021-06-29 e2e05af6e1 cmd/internal/obj/arm64: fix an encoding error of CMPW instruction
+ 2021-06-28 4bb0847b08 cmd/compile,runtime: change unsafe.Slice((*T)(nil), 0) to return []T(nil)
+ 2021-06-28 1519271a93 spec: change unsafe.Slice((*T)(nil), 0) to return []T(nil)
+ 2021-06-28 5385e2386b runtime/internal/atomic: drop Cas64 pointer indirection in comments
+ 2021-06-28 956c81bfe6 cmd/go: add GOEXPERIMENT to `go env` output
+ 2021-06-28 a1d27269d6 cmd/go: prep for 'go env' refactoring
+ 2021-06-28 901510ed4e cmd/link/internal/ld: skip the windows ASLR test when CGO_ENABLED=0
+ 2021-06-28 361159c055 cmd/cgo: fix 'see gmp.go' to 'see doc.go'
+ 2021-06-27 c95464f0ea internal/buildcfg: refactor GOEXPERIMENT parsing code somewhat
+ 2021-06-25 ed01ceaf48 runtime/race: use race build tag on syso_test.go
+ 2021-06-25 d1916e5e84 go/types: in TestCheck/issues.src, import regexp/syntax instead of cmd/compile/internal/syntax
+ 2021-06-25 5160896c69 go/types: in TestStdlib, import from source instead of export data
+ 2021-06-25 d01bc571f7 runtime: make ncgocall a global counter
Change-Id: I1ce4a3b3ff7c824c67ad66dd27d9d5f1d25c0023
This CL removes the unconditional OCHECKNIL check added in
walkUnsafeSlice by instead passing the pointer to runtime.unsafeslice
and hiding the check behind a `len == 0` check.
While here, this CL also implements checkptr functionality for
unsafe.Slice and disallows use of unsafe.Slice with //go:notinheap
types.
Updates #46742.
Change-Id: I743a445ac124304a4d7322a7fe089c4a21b9a655
Reviewed-on: https://go-review.googlesource.com/c/go/+/331070
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
At this point all funcPC references are ABIInternal functions.
Replace with the intrinsics.
Change-Id: I3ba7e485c83017408749b53f92877d3727a75e27
Reviewed-on: https://go-review.googlesource.com/c/go/+/321954
Trust: Cherry Mui <cherryyz@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
We grow the backing store on append by 2x for small sizes and 1.25x
for large sizes. The threshold we use for picking the growth factor
used to depend on the old length, not the old capacity. That's kind of
unfortunate, because then append(s, 0, 0) and append(append(s, 0), 0)
do different things. (If s has one more spot available, then
the former expression chooses its growth based on len(s) and the
latter on len(s)+1.) If we instead use the old capacity, we get more
consistent behavior. (Both expressions use len(s)+1 == cap(s) to
decide.)
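A runnable illustration of the consistency this buys (exact capacities
depend on size-class rounding, so only their equality is checked):

    package main

    import "fmt"

    func main() {
        base := make([]int, 1023, 1024) // one free slot before growth

        a := append(base, 0, 0)         // grows immediately
        b := append(append(base, 0), 0) // fills the slot, then grows

        // With cap-based growth both expressions decide the same way,
        // so they end up with equal capacity.
        fmt.Println(cap(a) == cap(b)) // true
    }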
Fixes #41239
Change-Id: I40686471d256edd72ec92aef973a89b52e235d4b
Reviewed-on: https://go-review.googlesource.com/c/go/+/257338
Trust: Keith Randall <khr@golang.org>
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Use a common runtime slicecopy function to copy strings or slices
into slices. This deduplicates similar code previously used in
reflect.slicecopy and runtime.stringslicecopy.
Change-Id: I09572ff0647a9e12bb5c6989689ce1c43f16b7f1
Reviewed-on: https://go-review.googlesource.com/c/go/+/254658
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
If the destination slice length is zero in makeslicecopy, then the
mallocgc call uses a fast path and only returns a pointer to
runtime.zerobase. There may be no heapBits for that address readable
by bulkBarrierPreWriteSrcOnly, which will cause a panic.
Protect against this by not calling bulkBarrierPreWriteSrcOnly if
there is nothing to copy, which covers all cases where the length of
the destination slice is zero.
runtime.growslice and runtime.typedslicecopy have fast paths that
do not call bulkBarrierPreWrite for zero copy lengths either.
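The failing shape is roughly this (a sketch; whether the compiler fuses
the make+copy pair into makeslicecopy depends on the pattern matching
described in the CL that introduced it):

    package main

    func main() {
        src := []*int{new(int)}
        dst := make([]*int, 0) // fused with the copy below into makeslicecopy
        copy(dst, src)         // copies nothing; must not reach the bulk barrier
        println(len(dst))      // 0
    }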
Fixes #38929
Change-Id: I78ece600203a0a8d24de5b6c9eef56f605d44e99
Reviewed-on: https://go-review.googlesource.com/c/go/+/232800
Reviewed-by: Keith Randall <khr@golang.org>
match:
    m = make([]T, x); copy(m, s)
for pointer-free T and x == len(s), rewrite to:
    m = mallocgc(x*elemsize(T), nil, false); memmove(&m, &s, x*elemsize(T))
otherwise rewrite to:
    m = makeslicecopy([]T, x, s)
This avoids memclear and shading of pointers in the newly created slice
before the copy.
With this CL, "s" is only allowed to be a variable and not a more
complex expression. This restriction could be lifted in future versions
of this optimization when it can be proven that "s" does not reference "m".
Triggers 450 times during make.bash.
Reduces go binary size by ~8 kbyte.
name                           old time/op  new time/op  delta
MakeSliceCopy/mallocmove/Byte  71.1ns ± 1%  65.8ns ± 0%  -7.49%  (p=0.000 n=10+9)
MakeSliceCopy/mallocmove/Int   71.2ns ± 1%  66.0ns ± 0%  -7.27%  (p=0.000 n=10+8)
MakeSliceCopy/mallocmove/Ptr    104ns ± 4%    99ns ± 1%  -5.13%  (p=0.000 n=10+10)
MakeSliceCopy/makecopy/Byte    70.3ns ± 0%  68.0ns ± 0%  -3.22%  (p=0.000 n=10+9)
MakeSliceCopy/makecopy/Int     70.3ns ± 0%  68.5ns ± 1%  -2.59%  (p=0.000 n=9+10)
MakeSliceCopy/makecopy/Ptr      102ns ± 0%    99ns ± 1%  -2.97%  (p=0.000 n=9+9)
MakeSliceCopy/nilappend/Byte   75.4ns ± 0%  74.9ns ± 2%  -0.63%  (p=0.015 n=9+9)
MakeSliceCopy/nilappend/Int    75.6ns ± 0%  76.4ns ± 3%  ~       (p=0.245 n=9+10)
MakeSliceCopy/nilappend/Ptr     107ns ± 0%   108ns ± 1%  +0.93%  (p=0.005 n=9+10)
Fixes #26252
Change-Id: Iec553dd1fef6ded16197216a472351c8799a8e71
Reviewed-on: https://go-review.googlesource.com/c/go/+/146719
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Some runtime calls accept a slice, but only use ptr and len.
This change modifies most such routines to accept only ptr and len.
After this change, the only runtime calls that accept an unnecessary
cap arg are concatstrings and slicerunetostring.
Neither is particularly common, and both are complicated to modify.
Negligible compiler performance impact. Shrinks binaries a little.
There are only a few regressions; the one I investigated was
due to register allocation fluctuation.
Passes 'go test -race std cmd', modulo #38265 and #38266.
Wow, does that take a long time to run.
Updates #36890
file        before     after      Δ       %
compile     19655024   19655152   +128    +0.001%
cover        5244840    5236648   -8192   -0.156%
dist         3662376    3658280   -4096   -0.112%
link         6680056    6675960   -4096   -0.061%
pprof       14789844   14777556   -12288  -0.083%
test2json    2824744    2820648   -4096   -0.145%
trace       11647876   11639684   -8192   -0.070%
vet          8260472    8256376   -4096   -0.050%
total      115163736  115118808  -44928  -0.039%
Change-Id: Idb29fa6a81d6a82bfd3b65740b98cf3275ca0a78
Reviewed-on: https://go-review.googlesource.com/c/go/+/227163
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
In rare circumstances, this helps report a race which would
otherwise go undetected.
Fixes #36794
Change-Id: I8a3c9bd6fc34efa51516393f7ee72531c34fb073
Reviewed-on: https://go-review.googlesource.com/c/go/+/220685
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
We already have the ptrdata field in a type, which encodes exactly
the same information that kindNoPointers does.
My problem with kindNoPointers is that it often leads to
double-negative code like:

    t.kind & kindNoPointers != 0

Much clearer is:

    t.ptrdata == 0
Update #27167
Change-Id: I92307d7f018a6bbe3daca4a4abb4225e359349b1
Reviewed-on: https://go-review.googlesource.com/c/go/+/169157
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Instrumenting make.bash reveals that almost half (49.54%)
of the >16 million calls to growslice for
pointer-containing slices are
growing from an empty to a non-empty slice.
In that case, there is no need to call the write barrier,
which does some work before discovering that no pointers need shading.
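The guard is conceptually just (a sketch, not the exact runtime code):

    if oldLenInBytes > 0 && writeBarrier.enabled {
        bulkBarrierPreWriteSrcOnly(newPtr, oldPtr, oldLenInBytes)
    }

so growth from an empty slice skips the barrier outright.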
Change-Id: Ide741468d8dee7ad43ea0bfbea6ccdf680030a0f
Reviewed-on: https://go-review.googlesource.com/c/go/+/168959
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Only return a pointer p to the new slice's backing array from makeslice.
Makeslice callers then construct sliceheader{p, len, cap} explicitly
instead of makeslice returning the slice.
Reduces go binary size by ~0.2%.
Removes 92 (~3.5%) panicindex calls from go binary.
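In the style of the signature notes elsewhere in this log, the contract
changes roughly like this (a sketch):

    old: makeslice(typ, len, cap) (slice)          // returns the full header
    new: makeslice(typ, len, cap) (unsafe.Pointer) // returns only p

with the caller assembling sliceheader{p, len, cap} itself, using the
len and cap values it already has.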
Change-Id: I29b7c3b5fe8b9dcec96e2c43730575071cfe8a94
Reviewed-on: https://go-review.googlesource.com/c/141822
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
All uses of these have been converted to use runtime/internal/math
functions for overflow checking.
Fixes #21588
Change-Id: I0ba57028e471803dc7d445e66d77a8f87edfdafb
Reviewed-on: https://go-review.googlesource.com/c/144037
Run-TryBot: Martin Möhrmann <martisch@uos.de>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>