mirror of
https://github.com/golang/go.git
synced 2025-10-19 19:13:18 +00:00
334 commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
![]() |
27ff0f249c |
cmd/compile/internal/ssa: eliminate string copies for calls to unique.Make
unique.Make always copies strings passed into it, so it's safe to not copy byte slices converted to strings either. Handle this just like map accesses with string(b) as keys. This CL only handles unique.Make(string(b)), not nested cases like unique.Make([2]string{string(b1), string(b2)}); this could be done in a followup CL but the map lookup code in walk is sufficiently different than the call handling code that I didn't attempt it. (SSA is much easier). Fixes #71926 Change-Id: Ic2f82f2f91963d563b4ddb1282bd49fc40da8b85 Reviewed-on: https://go-review.googlesource.com/c/go/+/672135 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
||
![]() |
7d0cb2a2ad |
cmd/compile: constant fold 128-bit multiplies
The full 64x64->128 multiply comes up when using bits.Mul64. The 64x64->64+overflow multiply comes up in unsafe.Slice when using a constant length. Change-Id: I298515162ca07d804b2d699d03bc957ca30a4ebc Reviewed-on: https://go-review.googlesource.com/c/go/+/667175 Reviewed-by: Junyang Shao <shaojunyang@google.com> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
||
![]() |
16a6b71f18 |
cmd/compile: improve store-to-load forwarding with compatible types
Improve the compiler's store-to-load forwarding optimization by relaxing the type comparison condition. Instead of requiring exact type equality (CMPeq), we now use copyCompatibleType which allows forwarding between compatible types where safe. Fix several size comparison bugs in the nested store patterns. Previously, we were comparing the size of the outer store with the load type, rather than comparing with the size of the actual store being forwarded from. Skip OpConvert in dead store elimination to help get rid of dead stores such as zeroing slices. OpConvert, like OpInlMark, doesn't really use the memory. This optimization is particularly beneficial for code that creates slices with computed pointers, such as the runtime's heapBitsSlice function, where intermediate calculations were previously causing the compiler to miss store-to-load forwarding opportunities. Local sweet run result on an x86_64 laptop: │ Orig.res │ Hopt.res │ │ sec/op │ sec/op vs base │ BiogoIgor-8 5.303 ± 1% 5.322 ± 1% ~ (p=0.190 n=10) BiogoKrishna-8 7.894 ± 1% 7.828 ± 2% ~ (p=0.190 n=10) BleveIndexBatch100-8 2.257 ± 1% 2.248 ± 2% ~ (p=0.529 n=10) EtcdPut-8 30.12m ± 1% 30.03m ± 1% ~ (p=0.796 n=10) EtcdSTM-8 127.1m ± 1% 126.2m ± 0% -0.74% (p=0.023 n=10) GoBuildKubelet-8 52.21 ± 0% 52.05 ± 1% ~ (p=0.063 n=10) GoBuildKubeletLink-8 4.342 ± 1% 4.305 ± 0% -0.85% (p=0.000 n=10) GoBuildIstioctl-8 43.33 ± 0% 43.24 ± 0% -0.22% (p=0.015 n=10) GoBuildIstioctlLink-8 4.604 ± 1% 4.598 ± 0% ~ (p=0.063 n=10) GoBuildFrontend-8 15.33 ± 0% 15.29 ± 0% ~ (p=0.143 n=10) GoBuildFrontendLink-8 740.0m ± 1% 737.7m ± 1% ~ (p=0.912 n=10) GopherLuaKNucleotide-8 9.590 ± 1% 9.656 ± 1% ~ (p=0.165 n=10) MarkdownRenderXHTML-8 96.97m ± 1% 97.26m ± 2% ~ (p=0.105 n=10) Tile38QueryLoad-8 335.9µ ± 1% 335.6µ ± 1% ~ (p=0.481 n=10) geomean 1.336 1.333 -0.22% Change-Id: I031552623e6d5a3b1b5be8325e6314706e45534f Reviewed-on: https://go-review.googlesource.com/c/go/+/662075 Reviewed-by: Carlos Amedee <carlos@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Carlos Amedee <carlos@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Keith Randall <khr@golang.org> |
||
![]() |
b60b9cf21f |
cmd/compile: add constant folding for bits.Add64
Change-Id: I0ed4ebeaaa68e274e5902485ccc1165c039440bd Reviewed-on: https://go-review.googlesource.com/c/go/+/656275 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com> Auto-Submit: Jorropo <jorropo.pgm@gmail.com> |
||
![]() |
4ff70cf868 |
cmd/compile: add MakeTuple generic SSA op to remove duplicate Select[01] rules
Change-Id: Id94a5e503f02aa29dc1e334b521770107d4261db Reviewed-on: https://go-review.googlesource.com/c/go/+/656615 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Jorropo <jorropo.pgm@gmail.com> |
||
![]() |
3e033b7553 |
cmd/compile: add constant folding for PopCount
Change-Id: I6ea3f75ddd5c7af114ef77bc48f28c7f8570997b Reviewed-on: https://go-review.googlesource.com/c/go/+/656156 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: David Chase <drchase@google.com> Auto-Submit: Jorropo <jorropo.pgm@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@golang.org> |
||
![]() |
43e6525986 |
cmd/compile: load properly constant values from itabs
While looking at the SSA of following code, i noticed
that these rules do not work properly, and the types
are loaded indirectly through an itab, instead of statically.
type M interface{ M() }
type A interface{ A() }
type Impl struct{}
func (*Impl) M() {}
func (*Impl) A() {}
func main() {
var a M = &Impl{}
a.(A).A()
}
Change-Id: Ia275993f81a2e7302102d4ff87ac28586023d13c
GitHub-Last-Rev:
|
||
![]() |
89c2f282dc |
cmd/compile: move []byte->string map key optimization to ssa
If we call slicebytetostring immediately (with no intervening writes) before calling map access or delete functions with the resulting string as the key, then we can just use the ptr/len of the slicebytetostring argument as the key. This avoids an allocation. Fixes #44898 Update #71132 There's old code in cmd/compile/internal/walk/order.go that handles some of these cases. 1. m[string(b)] 2. s := string(b); m[s] 3. m[[2]string{string(b1),string(b2)}] The old code handled cases 1&3. The new code handles cases 1&2. We'll leave the old code around to keep 3 working, although it seems not terribly common. Case 2 happens particularly after inlining, so it is pretty common. Change-Id: I8913226ca79d2c65f4e2bd69a38ac8c976a57e43 Reviewed-on: https://go-review.googlesource.com/c/go/+/640656 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
||
![]() |
072eea9b3b |
cmd/compile: avoid ifaceeq call if we know the interface is direct
We can just use == if the interface is direct. Fixes #70738 Change-Id: Ia9a644791a370fec969c04c42d28a9b58f16911f Reviewed-on: https://go-review.googlesource.com/c/go/+/635435 Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
||
![]() |
f7dbbf2519 |
cmd/compile: distribute 8 and 16-bit multiplication
Expand the existing rule to cover 8 and 16 bit variants. compilecmp linux/amd64: time time.parseStrictRFC3339.func1 80 -> 70 (-12.50%) time.Time.appendStrictRFC3339.func1 80 -> 70 (-12.50%) time.Time.appendStrictRFC3339 439 -> 428 (-2.51%) time [cmd/compile] time.parseStrictRFC3339.func1 80 -> 70 (-12.50%) time.Time.appendStrictRFC3339.func1 80 -> 70 (-12.50%) time.Time.appendStrictRFC3339 439 -> 428 (-2.51%) linux/arm64: time time.parseStrictRFC3339.func1 changed time.Time.appendStrictRFC3339.func1 changed time.Time.appendStrictRFC3339 416 -> 400 (-3.85%) time [cmd/compile] time.Time.appendStrictRFC3339 416 -> 400 (-3.85%) time.parseStrictRFC3339.func1 changed time.Time.appendStrictRFC3339.func1 changed Change-Id: I0ad3b2363a9fe8c322dd05fbc13bf151a146d8cb Reviewed-on: https://go-review.googlesource.com/c/go/+/641756 Auto-Submit: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> |
||
![]() |
f0b0109242 |
cmd/compile: pull multiple adds out of an unsafe.Pointer<->uintptr conversion
This came up in some swissmap code. Change-Id: I3c6705a5cafec8cb4953dfa9535cf0b45255cc83 Reviewed-on: https://go-review.googlesource.com/c/go/+/629516 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: David Chase <drchase@google.com> |
||
![]() |
7d2e3da243 |
cmd/compile: remove redundant nil checks
Optimize them away if we can. If not, be more careful about splicing them out after scheduling. Change-Id: I660e54649d753dc456d2e25d389d375a16d76940 Reviewed-on: https://go-review.googlesource.com/c/go/+/627418 Reviewed-by: Shengwei Zhao <wingrez@126.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
||
![]() |
32f4864216 |
cmd/compile: arithmetic optimization for shifts
Fixes #69635 Change-Id: I4f8d7dafb34ccfb943c29f96c982278ab7edcd05 Reviewed-on: https://go-review.googlesource.com/c/go/+/621357 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Michael Pratt <mpratt@google.com> |
||
![]() |
6d856a804c |
cmd/compile: generalize struct load/store
The SSA backend currently only handle struct with up to 4 fields. Thus, there are different operations corresponding to number fields of the struct. This CL generalizes these with just one OpStructMake, allow struct types with arbitrary number of fields. However, the ssa.MaxStruct is still kept as-is, and future CL will increase this value to optimize large structs. Updates #24416 Change-Id: I192ffbea881186693584476b5639394e79be45c5 Reviewed-on: https://go-review.googlesource.com/c/go/+/611075 Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Cuong Manh Le <cuong.manhle.vn@gmail.com> Reviewed-by: David Chase <drchase@google.com> |
||
![]() |
944a2ac3c7 |
cmd/compile: small cleanups to rewrite rule helpers
Change-Id: I50a19bd971176598bf8e4ef86ec98f008abe245c Reviewed-on: https://go-review.googlesource.com/c/go/+/615198 Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@google.com> |
||
![]() |
be86f09e01 |
cmd/compile: use generics for isPowerOfTwo predicates
Change-Id: I097b53e9f13de6ff6eb18ae2261842b097f26390 Reviewed-on: https://go-review.googlesource.com/c/go/+/615197 Auto-Submit: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com> |
||
![]() |
090f03fd2f |
cmd/compile: do constant folding for BitLen*
Change-Id: I56c27d606b55ea882f4db264fd4735b0cccdf7c6 Reviewed-on: https://go-review.googlesource.com/c/go/+/604015 Reviewed-by: Keith Randall <khr@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Keith Randall <khr@golang.org> |
||
![]() |
9f9dd2bfd8 |
cmd/compile: fix cmpstring rewrite rule
We need to ensure that the Select0 lives in the same block as its argument. Divide up the rule into 2 so that we can put the parts in the right places. (This would be simpler if we could use @block syntax mid-rule, but that feature currently only works at the top level.) This fixes the ssacheck builder after CL 578835 Change-Id: Id26a01d9fac0684e0b732d35d0f7999f6de07825 Reviewed-on: https://go-review.googlesource.com/c/go/+/580815 Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> |
||
![]() |
1a0b86375f |
cmd/compile: remove redundant calls to cmpstring
The results of cmpstring are reuseable if the second call has the same arguments and memory. Note that this gets rid of cmpstring, but we still generate a redundant </<= test and branch afterwards, because the compiler doesn't know that cmpstring only ever returns -1,0,1. Update #61725 Change-Id: I93a0d1ccca50d90b1e1a888240ffb75a3b10b59b Reviewed-on: https://go-review.googlesource.com/c/go/+/578835 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Cherry Mui <cherryyz@google.com> |
||
![]() |
bbd0bc22ea |
cmd/compile: improve integer comparisons with numeric bounds
This do: - Fold always false or always true comparisons for ints and uint. - Reduce < and <= where the true set is only one value to == with such value. Change-Id: Ie9e3f70efd1845bef62db56543f051a50ad2532e Reviewed-on: https://go-review.googlesource.com/c/go/+/555135 Auto-Submit: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@golang.org> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
||
![]() |
962ccbef91 |
cmd/compile: ensure pointer arithmetic happens after the nil check
Have nil checks return a pointer that is known non-nil. Users of that pointer can use the result, ensuring that they are ordered after the nil check itself. The order dependence goes away after scheduling, when we've fixed an order. At that point we move uses back to the original pointer so it doesn't change regalloc any. This prevents pointer arithmetic on nil from being spilled to the stack and then observed by a stack scan. Fixes #63657 Change-Id: I1a5fa4f2e6d9000d672792b4f90dfc1b7b67f6ea Reviewed-on: https://go-review.googlesource.com/c/go/+/537775 Reviewed-by: David Chase <drchase@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Keith Randall <khr@google.com> |
||
![]() |
45d3d10071 |
cmd/compile/internal/ssa: rename ssagen.TypeOK as CanSSA
No need to indirect through Frontend for this. Change-Id: I5812eb4dadfda79267cabc9d13aeab126c1479e3 Reviewed-on: https://go-review.googlesource.com/c/go/+/526517 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Keith Randall <khr@golang.org> Auto-Submit: Matthew Dempsky <mdempsky@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> |
||
![]() |
d0964e172b |
cmd/compile: optimize s==s for strings
s==s is always true for strings. This comes up in NaN testing in generic code, where we want x==x to compile completely away except for float types. Fixes #60777 Change-Id: I3ce054b5121354de2f9751b010fb409f148cb637 Reviewed-on: https://go-review.googlesource.com/c/go/+/503795 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Keith Randall <khr@golang.org> |
||
![]() |
bd3f44e4ff |
cmd/compile: constant-fold loads from constant dictionaries and types
Retrying the original CL with a small modification. The original CL did not handle the case of reading an itab out of a dictionary correctly. When we read an itab out of a dictionary, we must treat the type inside that itab as maybe being put in an interface. Original CL: 486895 Revert CL: 490156 Change-Id: Id2dc1699d184cd8c63dac83986a70b60b4e6cbd7 Reviewed-on: https://go-review.googlesource.com/c/go/+/491495 Reviewed-by: Matthew Dempsky <mdempsky@google.com> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> |
||
![]() |
95c4f320d5 |
cmd/compile: add De Morgan's rewrite rule
Adds rules that rewrites statements such as ~P&~Q as ~(P|Q) and ~P|~Q as ~(P&Q), removing an extraneous instruction.
Change-Id: Icedb97df741680ddf9799df79df78657173aa500
GitHub-Last-Rev:
|
||
![]() |
f046180890 |
Revert "cmd/compile: constant-fold loads from constant dictionaries and types"
This reverts CL 486895. Reason for revert: This breaks internal tests at Google, see b/280035614. Change-Id: I48772a44f5f6070a7f06b5704e9f9aa272b5f978 Reviewed-on: https://go-review.googlesource.com/c/go/+/490156 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Michael Stapelberg <stapelberg@google.com> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> Run-TryBot: Michael Stapelberg <stapelberg@google.com> Reviewed-by: David Chase <drchase@google.com> |
||
![]() |
635839a17a |
cmd/compile: constant-fold loads from constant dictionaries and types
Update #59591 Change-Id: Id250a7779c5b53776fff73f3e678fec54d92a8e3 Reviewed-on: https://go-review.googlesource.com/c/go/+/486895 Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Auto-Submit: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> |
||
![]() |
6b165577fe |
cmd/compile: remove memequal call from string compares in more cases
Add more rules to ensure that order doesn't matter. Add memequal 0 rule. Try to use a constant argument to memequal when one is available. Fixes #59684 Change-Id: I36e85ffbd949396ed700ed6e8ec2bc3ae013f5d2 Reviewed-on: https://go-review.googlesource.com/c/go/+/485535 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> |
||
![]() |
da4687923b |
cmd/compile: add rewrite rules for arithmetic operations
Add the following common local transformations
(t + x) - (t + y) == x - y
(t + x) - (y + t) == x - y
(x + t) - (y + t) == x - y
(x + t) - (t + y) == x - y
(x - t) + (t + y) == x + y
(x - t) + (y + t) == x + y
The compiler itself matches such patterns many times. This also aligns with other popular compilers.
Fixes #59111
Change-Id: Ibdfdb414782f8fcaa20b84ac5d43d0d9ae2c7b60
GitHub-Last-Rev:
|
||
![]() |
85d54a7667 |
cmd/compile: use zero constants in comparisons where possible
Some integer comparisons with 1 and -1 can be rewritten as comparisons with 0. For example, x < 1 is equivalent to x <= 0. This is an advantageous transformation on riscv64 because comparisons with zero do not require a constant to be loaded into a register. Other architectures will likely benefit too and the transformation is relatively benign on architectures that do not benefit. Change-Id: I2ce9821dd7605a660eb71d76e83a61f9bae1bf25 Reviewed-on: https://go-review.googlesource.com/c/go/+/350831 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Michael Knyszek <mknyszek@google.com> Run-TryBot: Michael Munday <mike.munday@lowrisc.org> TryBot-Result: Gopher Robot <gobot@golang.org> |
||
![]() |
ebe49f98c8 |
cmd/compile: inline constant sized memclrNoHeapPointers calls on PPC64
Update the function isInlinableMemclr for ppc64 and ppc64le to enable inlining for the constant sized cases < 512. Larger cases can use dcbz which performs better but requires alignment checking so it is best to continue using memclrNoHeapPointers for those cases. Results on p10: MemclrKnownSize1 2.07ns ± 0% 0.57ns ± 0% -72.59% MemclrKnownSize2 2.56ns ± 5% 0.57ns ± 0% -77.82% MemclrKnownSize4 5.15ns ± 0% 0.57ns ± 0% -89.00% MemclrKnownSize8 2.23ns ± 0% 0.57ns ± 0% -74.57% MemclrKnownSize16 2.23ns ± 0% 0.50ns ± 0% -77.74% MemclrKnownSize32 2.28ns ± 0% 0.56ns ± 0% -75.28% MemclrKnownSize64 2.49ns ± 0% 0.72ns ± 0% -70.95% MemclrKnownSize112 2.97ns ± 2% 1.14ns ± 0% -61.72% MemclrKnownSize128 4.64ns ± 6% 2.45ns ± 1% -47.17% MemclrKnownSize192 5.45ns ± 5% 2.79ns ± 0% -48.87% MemclrKnownSize248 4.51ns ± 0% 2.83ns ± 0% -37.12% MemclrKnownSize256 6.34ns ± 1% 3.58ns ± 0% -43.53% MemclrKnownSize512 3.64ns ± 0% 3.64ns ± 0% -0.03% MemclrKnownSize1024 4.73ns ± 0% 4.73ns ± 0% +0.01% MemclrKnownSize4096 17.1ns ± 0% 17.1ns ± 0% +0.07% MemclrKnownSize512KiB 2.12µs ± 0% 2.12µs ± 0% ~ (all equal) Change-Id: If1abf5749f4802c64523a41fe0058bd144d0ea46 Reviewed-on: https://go-review.googlesource.com/c/go/+/464340 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> Reviewed-by: Jakub Ciolek <jakub@ciolek.dev> Reviewed-by: Archana Ravindar <aravind5@in.ibm.com> Reviewed-by: Ian Lance Taylor <iant@google.com> Reviewed-by: Carlos Eduardo Seo <carlos.seo@linaro.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Than McIntosh <thanm@google.com> |
||
![]() |
0d8d181bd5 |
cmd/compile: use MakeResult in empty MakeSlice elimination
This gets eliminated by thoses rules above: // for rewriting results of some late-expanded rewrites (below) (SelectN [0] (MakeResult x ___)) => x (SelectN [1] (MakeResult x y ___)) => y (SelectN [2] (MakeResult x y z ___)) => z Fixes #58161 Change-Id: I4fbfd52c72c06b6b3db906bd9910b6dbb7fe8975 Reviewed-on: https://go-review.googlesource.com/c/go/+/463846 Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Michael Knyszek <mknyszek@google.com> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> |
||
![]() |
1fc585dc2f |
cmd/compile: inline known-size memclrNoHeapPointers calls
This patch rewrites known-size calls to memclrNoHeapPointers with an OpZero. This significantly improves performance and lets some clears get DSE'd. One of the cases where this applies is zeroing a known-size array, example: var x [256]int8 ... for a := range x { x[a] = 0 } Other cases can be found in the runtime itself where memclrNoHeapPointers is sometimes directly invoked with a constant. It seems that for some sized-clears on some architectures (AMD64, maybe others) the memcrlNoHeapPointers is more performant than OpZero. See the issue #56997 for more details. Benches ARM (M1 Pro): name old time/op new time/op delta MemclrKnownSize1-10 2.03ns ± 0% 0.31ns ± 0% -84.69% (p=0.000 n=18+19) MemclrKnownSize2-10 1.97ns ± 0% 0.31ns ± 0% -84.19% (p=0.000 n=12+19) MemclrKnownSize4-10 2.02ns ± 0% 0.31ns ± 0% -84.56% (p=0.000 n=12+20) MemclrKnownSize8-10 2.02ns ± 0% 0.31ns ± 0% -84.59% (p=0.000 n=14+19) MemclrKnownSize16-10 2.15ns ± 0% 0.31ns ± 0% -85.50% (p=0.000 n=18+19) MemclrKnownSize32-10 2.48ns ± 0% 0.31ns ± 0% -87.48% (p=0.000 n=20+19) MemclrKnownSize64-10 1.93ns ± 0% 0.62ns ± 0% -67.88% (p=0.000 n=20+19) MemclrKnownSize112-10 2.48ns ± 0% 1.80ns ± 0% -27.74% (p=0.000 n=19+20) MemclrKnownSize128-10 10.0ns ±112% 2.0ns ± 0% -79.76% (p=0.000 n=18+17) MemclrKnownSize192-10 27.4ns ±103% 2.6ns ± 0% -90.38% (p=0.000 n=16+19) MemclrKnownSize248-10 9.67ns ±43% 3.26ns ± 0% -66.29% (p=0.000 n=19+19) MemclrKnownSize256-10 85.4ns ±148% 3.3ns ± 0% -96.18% (p=0.000 n=20+20) MemclrKnownSize512-10 223ns ±54% 6ns ± 0% -97.42% (p=0.000 n=18+20) MemclrKnownSize1024-10 216ns ±26% 11ns ± 0% -95.00% (p=0.000 n=18+15) MemclrKnownSize4096-10 265ns ± 2% 88ns ± 0% -66.84% (p=0.000 n=19+17) MemclrKnownSize512KiB-10 9.91µs ± 1% 10.23µs ± 2% +3.14% (p=0.000 n=19+19) [Geo mean] 15.6ns 2.5ns -83.62% name old speed new speed delta MemclrKnownSize1-10 493MB/s ± 0% 3216MB/s ± 0% +553.04% (p=0.000 n=18+19) MemclrKnownSize2-10 1.02GB/s ± 0% 6.43GB/s ± 0% +532.33% (p=0.000 n=16+19) MemclrKnownSize4-10 1.99GB/s ± 0% 12.86GB/s ± 0% +547.67% (p=0.000 n=18+20) MemclrKnownSize8-10 3.96GB/s ± 0% 25.72GB/s ± 0% +548.81% (p=0.000 n=19+19) MemclrKnownSize16-10 7.46GB/s ± 0% 51.43GB/s ± 0% +589.72% (p=0.000 n=20+19) MemclrKnownSize32-10 12.9GB/s ± 0% 102.9GB/s ± 0% +698.60% (p=0.000 n=20+18) MemclrKnownSize64-10 33.1GB/s ± 0% 103.0GB/s ± 0% +211.34% (p=0.000 n=19+19) MemclrKnownSize112-10 45.1GB/s ± 0% 62.4GB/s ± 0% +38.38% (p=0.000 n=19+20) MemclrKnownSize128-10 13.3GB/s ±107% 63.5GB/s ± 0% +378.03% (p=0.000 n=19+18) MemclrKnownSize192-10 6.97GB/s ±139% 72.72GB/s ± 0% +943.44% (p=0.000 n=19+19) MemclrKnownSize248-10 25.9GB/s ±46% 76.1GB/s ± 0% +194.16% (p=0.000 n=20+17) MemclrKnownSize256-10 8.64GB/s ±196% 78.51GB/s ± 0% +808.19% (p=0.000 n=20+20) MemclrKnownSize512-10 2.33GB/s ±86% 89.13GB/s ± 0% +3719.50% (p=0.000 n=17+20) MemclrKnownSize1024-10 4.85GB/s ±32% 94.93GB/s ± 0% +1856.74% (p=0.000 n=18+19) MemclrKnownSize4096-10 15.4GB/s ± 2% 46.6GB/s ± 0% +201.55% (p=0.000 n=19+18) MemclrKnownSize512KiB-10 52.9GB/s ± 1% 51.3GB/s ± 2% -3.04% (p=0.000 n=19+19) [Geo mean] 7.54GB/s 42.86GB/s +468.76% Intel Alder Lake 12600k: name old time/op new time/op delta MemclrKnownSize1-16 0.59ns ± 3% 0.38ns ± 6% -36.00% (p=0.000 n=19+18) MemclrKnownSize2-16 0.57ns ± 1% 0.19ns ± 5% -66.27% (p=0.000 n=19+19) MemclrKnownSize4-16 0.66ns ± 2% 0.36ns ±21% -45.12% (p=0.000 n=19+20) MemclrKnownSize8-16 0.74ns ± 1% 0.30ns ±26% -59.81% (p=0.000 n=18+20) MemclrKnownSize16-16 1.00ns ± 7% 0.21ns ± 8% -79.51% (p=0.000 n=20+19) MemclrKnownSize32-16 0.95ns ± 1% 0.40ns ± 1% -57.61% (p=0.000 n=20+18) MemclrKnownSize64-16 1.20ns ± 2% 0.41ns ± 0% -65.82% (p=0.000 n=20+18) MemclrKnownSize112-16 1.27ns ± 2% 1.03ns ± 0% -19.35% (p=0.000 n=20+18) MemclrKnownSize128-16 1.34ns ± 2% 1.03ns ± 0% -23.02% (p=0.000 n=20+20) MemclrKnownSize192-16 1.92ns ± 2% 1.44ns ± 0% -24.89% (p=0.000 n=20+16) MemclrKnownSize248-16 2.77ns ± 1% 3.29ns ± 0% +18.81% (p=0.000 n=20+16) MemclrKnownSize256-16 1.92ns ± 1% 1.86ns ± 0% -3.49% (p=0.000 n=19+15) MemclrKnownSize512-16 2.81ns ± 2% 3.49ns ± 0% +24.15% (p=0.000 n=20+17) MemclrKnownSize1024-16 4.02ns ± 1% 6.78ns ± 0% +68.44% (p=0.000 n=20+18) MemclrKnownSize4096-16 17.2ns ± 2% 14.4ns ± 0% -16.73% (p=0.000 n=20+17) MemclrKnownSize512KiB-16 6.71µs ± 1% 6.52µs ± 0% -2.85% (p=0.000 n=20+18) [Geo mean] 2.60ns 1.71ns -34.06% name old speed new speed delta MemclrKnownSize1-16 1.71GB/s ± 3% 2.67GB/s ± 6% +56.39% (p=0.000 n=19+18) MemclrKnownSize2-16 3.52GB/s ± 2% 10.43GB/s ± 6% +196.04% (p=0.000 n=20+20) MemclrKnownSize4-16 6.06GB/s ± 1% 10.83GB/s ±11% +78.63% (p=0.000 n=19+18) MemclrKnownSize8-16 10.7GB/s ± 1% 27.0GB/s ±21% +151.49% (p=0.000 n=18+20) MemclrKnownSize16-16 16.0GB/s ± 8% 78.1GB/s ± 7% +387.24% (p=0.000 n=20+19) MemclrKnownSize32-16 33.6GB/s ± 1% 79.4GB/s ± 1% +135.89% (p=0.000 n=20+18) MemclrKnownSize64-16 53.3GB/s ± 2% 155.9GB/s ± 0% +192.58% (p=0.000 n=20+18) MemclrKnownSize112-16 88.0GB/s ± 2% 109.1GB/s ± 0% +23.97% (p=0.000 n=20+18) MemclrKnownSize128-16 95.3GB/s ± 2% 123.8GB/s ± 0% +29.88% (p=0.000 n=20+20) MemclrKnownSize192-16 100GB/s ± 2% 133GB/s ± 0% +33.12% (p=0.000 n=20+17) MemclrKnownSize248-16 89.7GB/s ± 1% 75.5GB/s ± 0% -15.84% (p=0.000 n=20+19) MemclrKnownSize256-16 133GB/s ± 1% 138GB/s ± 0% +3.61% (p=0.000 n=19+14) MemclrKnownSize512-16 182GB/s ± 2% 147GB/s ± 0% -19.46% (p=0.000 n=20+17) MemclrKnownSize1024-16 254GB/s ± 1% 151GB/s ± 0% -40.64% (p=0.000 n=20+18) MemclrKnownSize4096-16 237GB/s ± 2% 285GB/s ± 0% +20.09% (p=0.000 n=20+17) MemclrKnownSize512KiB-16 78.2GB/s ± 1% 80.4GB/s ± 0% +2.93% (p=0.000 n=20+18) [Geo mean] 42.1GB/s 63.8GB/s +51.53% compilecmp linux/amd64: runtime runtime.(*pallocData).allocAll 85 -> 45 (-47.06%) runtime.(*pageAlloc).allocRange 942 -> 923 (-2.02%) runtime.(*pageAlloc).free 798 -> 774 (-3.01%) runtime.(*pageBits).clearAll 66 -> 20 (-69.70%) runtime.startCheckmarks 255 -> 246 (-3.53%) runtime.(*pallocData).freeAll 86 -> 46 (-46.51%) runtime.(*pallocBits).freeAll 66 -> 20 (-69.70%) runtime.(*consistentHeapStats).unsafeClear 66 -> 19 (-71.21%) runtime.newproc1 965 -> 933 (-3.32%) crypto/rc4 crypto/rc4.(*Cipher).Reset 78 -> 69 (-11.54%) compress/bzip2 compress/bzip2.(*reader).readBlock 2973 -> 2941 (-1.08%) image/jpeg image/jpeg.(*decoder).processDHT 1179 -> 1166 (-1.10%) index/suffixarray index/suffixarray.bucketMax_8_32 394 -> 241 (-38.83%) index/suffixarray.freq_8_32 317 -> 185 (-41.64%) index/suffixarray.freq_8_64 317 -> 178 (-43.85%) index/suffixarray.bucketMin_8_32 394 -> 243 (-38.32%) index/suffixarray.bucketMin_8_64 398 -> 234 (-41.21%) index/suffixarray.bucketMax_8_64 398 -> 234 (-41.21%) compress/flate compress/flate.(*huffmanBitWriter).generateCodegen 965 -> 838 (-13.16%) compress/flate.(*compressor).reset 429 -> 409 (-4.66%) cmd/vendor/golang.org/x/sys/unix cmd/vendor/golang.org/x/sys/unix.(*FdSet).Zero 66 -> 60 (-9.09%) cmd/vendor/golang.org/x/sys/unix.(*Ifreq).SetInet4Addr 211 -> 129 (-38.86%) cmd/vendor/golang.org/x/sys/unix.(*Ifreq).SetUint32 98 -> 14 (-85.71%) cmd/vendor/golang.org/x/sys/unix.(*Ifreq).clear 66 -> 11 (-83.33%) cmd/vendor/golang.org/x/sys/unix.(*Ifreq).SetUint16 101 -> 15 (-85.15%) cmd/vendor/golang.org/x/sys/unix.(*CPUSet).Zero 66 -> 60 (-9.09%) internal/coverage/decodemeta internal/coverage/decodemeta.(*CoverageMetaFileReader).rdUint64 325 -> 293 (-9.85%) crypto/tls crypto/tls.(*halfConn).setTrafficSecret 253 -> 247 (-2.37%) crypto/tls.(*Conn).readRecordOrCCS 10315 -> 10283 (-0.31%) crypto/tls.(*halfConn).changeCipherSpec 271 -> 261 (-3.69%) crypto/tls.(*Conn).writeRecordLocked 1765 -> 1748 (-0.96%) file before after Δ % runtime.s 512467 512164 -303 -0.059% crypto/rc4.s 955 946 -9 -0.942% compress/bzip2.s 9586 9554 -32 -0.334% image/jpeg.s 32122 32109 -13 -0.040% index/suffixarray.s 38547 37644 -903 -2.343% compress/flate.s 46668 46521 -147 -0.315% cmd/vendor/golang.org/x/sys/unix.s 118620 118301 -319 -0.269% internal/coverage/decodemeta.s 7224 7192 -32 -0.443% crypto/tls.s 288762 288697 -65 -0.023% cmd/compile/internal/ssa.s 3639799 3640727 +928 +0.025% total 20790248 20789353 -895 -0.004% src/runtime benchmarks (Linux Alder Lake 12600k): name old time/op new time/op delta MakeChan/Byte-16 26.2ns ± 2% 25.6ns ± 3% -2.05% (p=0.003 n=9+10) MakeChan/Int-16 33.9ns ± 2% 33.3ns ± 4% -1.99% (p=0.015 n=10+10) MakeChan/Ptr-16 54.2ns ± 2% 53.7ns ± 1% -0.90% (p=0.016 n=10+9) MakeChan/Struct/0-16 23.8ns ± 3% 23.4ns ± 1% -1.72% (p=0.009 n=10+8) MakeChan/Struct/32-16 55.9ns ± 2% 53.9ns ± 1% -3.48% (p=0.000 n=10+10) MakeChan/Struct/40-16 63.5ns ± 1% 61.1ns ± 2% -3.79% (p=0.000 n=10+9) ChanNonblocking-16 0.22ns ± 0% 0.22ns ± 0% +0.40% (p=0.011 n=9+8) SelectUncontended-16 4.63ns ± 1% 4.62ns ± 0% -0.35% (p=0.001 n=10+8) SelectSyncContended-16 1.58µs ± 2% 1.59µs ± 1% ~ (p=0.540 n=10+10) SelectAsyncContended-16 290ns ± 0% 291ns ± 0% +0.14% (p=0.012 n=8+9) SelectNonblock-16 0.95ns ± 1% 0.95ns ± 1% ~ (p=0.546 n=9+9) ChanUncontended-16 239ns ± 3% 242ns ± 6% ~ (p=0.886 n=9+10) ChanContended-16 17.7µs ± 1% 18.2µs ± 1% +2.87% (p=0.000 n=10+9) ChanSync-16 109ns ± 2% 109ns ± 1% ~ (p=0.342 n=10+10) ChanSyncWork-16 6.55µs ± 1% 6.53µs ± 1% ~ (p=0.101 n=10+10) ChanProdCons0-16 502ns ± 1% 499ns ± 0% -0.55% (p=0.001 n=10+9) ChanProdCons10-16 373ns ± 2% 377ns ± 1% ~ (p=0.095 n=10+9) ChanProdCons100-16 224ns ± 2% 223ns ± 3% ~ (p=0.150 n=9+10) ChanProdConsWork0-16 491ns ± 1% 484ns ± 0% -1.26% (p=0.000 n=10+9) ChanProdConsWork10-16 451ns ± 2% 448ns ± 2% ~ (p=0.210 n=8+10) ChanProdConsWork100-16 406ns ± 0% 407ns ± 1% ~ (p=0.138 n=8+8) SelectProdCons-16 509ns ± 0% 509ns ± 0% ~ (p=0.917 n=9+9) ReceiveDataFromClosedChan-16 12.1ns ± 0% 12.1ns ± 0% ~ (p=0.780 n=10+10) ChanCreation-16 22.6ns ± 1% 22.4ns ± 0% -0.72% (p=0.001 n=10+8) ChanSem-16 165ns ± 1% 166ns ± 1% +0.72% (p=0.002 n=10+10) ChanPopular-16 500µs ± 2% 498µs ± 1% ~ (p=0.218 n=10+10) ChanClosed-16 0.29ns ± 0% 0.29ns ± 0% +0.09% (p=0.019 n=9+8) CallClosure-16 1.28ns ± 0% 1.27ns ± 0% -0.51% (p=0.000 n=9+9) CallClosure1-16 1.50ns ± 0% 1.50ns ± 0% ~ (p=0.123 n=9+9) CallClosure2-16 8.86ns ± 1% 8.86ns ± 3% ~ (p=0.590 n=9+10) CallClosure3-16 8.75ns ± 2% 8.69ns ± 2% ~ (p=0.247 n=10+10) CallClosure4-16 8.65ns ± 2% 8.56ns ± 2% ~ (p=0.105 n=10+10) Complex128DivNormal-16 2.47ns ± 0% 2.47ns ± 0% ~ (p=0.790 n=10+9) Complex128DivNisNaN-16 4.44ns ± 0% 4.43ns ± 0% ~ (p=0.564 n=10+10) Complex128DivDisNaN-16 4.48ns ± 0% 4.48ns ± 0% ~ (p=0.101 n=10+10) Complex128DivNisInf-16 2.58ns ± 0% 2.58ns ± 0% ~ (p=0.808 n=10+10) Complex128DivDisInf-16 6.30ns ± 0% 6.31ns ± 0% ~ (p=0.305 n=10+10) SetTypePtr-16 0.73ns ± 1% 0.73ns ± 3% ~ (p=0.644 n=10+10) SetTypePtr8-16 4.12ns ± 0% 4.12ns ± 0% ~ (p=0.127 n=10+10) SetTypePtr16-16 4.13ns ± 1% 4.12ns ± 0% ~ (p=0.109 n=10+10) SetTypePtr32-16 4.12ns ± 0% 4.12ns ± 0% ~ (p=0.203 n=9+10) SetTypePtr64-16 4.12ns ± 0% 4.12ns ± 0% ~ (p=0.696 n=10+10) SetTypePtr126-16 6.91ns ± 0% 6.91ns ± 0% ~ (p=0.469 n=10+10) SetTypePtr128-16 6.66ns ± 0% 6.67ns ± 0% ~ (p=0.246 n=9+10) SetTypePtrSlice-16 54.1ns ± 1% 54.1ns ± 1% ~ (p=0.509 n=9+10) SetTypeNode1-16 4.13ns ± 1% 4.12ns ± 0% ~ (p=0.342 n=10+10) SetTypeNode1Slice-16 10.1ns ± 1% 10.0ns ± 1% -1.18% (p=0.000 n=10+10) SetTypeNode8-16 4.12ns ± 0% 4.12ns ± 0% ~ (p=0.137 n=8+8) SetTypeNode8Slice-16 22.6ns ± 0% 22.6ns ± 0% ~ (p=0.423 n=10+10) SetTypeNode64-16 6.90ns ± 0% 6.91ns ± 0% ~ (p=0.275 n=10+10) SetTypeNode64Slice-16 173ns ± 0% 173ns ± 0% ~ (p=0.610 n=9+10) SetTypeNode64Dead-16 5.53ns ± 0% 5.52ns ± 0% ~ (p=0.123 n=10+6) SetTypeNode64DeadSlice-16 150ns ± 0% 150ns ± 0% ~ (p=0.398 n=10+10) SetTypeNode124-16 6.90ns ± 0% 6.90ns ± 0% ~ (p=0.779 n=10+10) SetTypeNode124Slice-16 222ns ± 5% 217ns ± 0% ~ (p=0.302 n=10+10) SetTypeNode126-16 6.66ns ± 0% 6.66ns ± 0% ~ (p=0.324 n=10+9) SetTypeNode126Slice-16 218ns ± 0% 218ns ± 0% ~ (p=0.119 n=9+10) SetTypeNode128-16 9.76ns ± 0% 9.73ns ± 0% -0.31% (p=0.003 n=9+10) SetTypeNode128Slice-16 279ns ± 0% 278ns ± 0% ~ (p=0.112 n=10+9) SetTypeNode130-16 9.77ns ± 0% 9.73ns ± 0% -0.33% (p=0.002 n=10+10) SetTypeNode130Slice-16 284ns ± 0% 284ns ± 0% ~ (p=0.668 n=10+10) SetTypeNode1024-16 51.2ns ± 0% 51.6ns ± 1% ~ (p=0.080 n=9+9) SetTypeNode1024Slice-16 1.83µs ± 0% 1.82µs ± 0% ~ (p=0.115 n=10+10) Allocation-16 4.64µs ± 1% 4.37µs ± 1% -5.69% (p=0.000 n=9+9) ReadMemStats-16 5.62µs ± 2% 5.55µs ± 5% -1.36% (p=0.050 n=10+10) WriteBarrier-16 4.95ns ± 3% 4.99ns ± 3% ~ (p=0.255 n=10+10) BulkWriteBarrier-16 1.69ns ± 2% 1.63ns ± 4% -3.77% (p=0.001 n=10+10) ScanStackNoLocals-16 12.8ms ± 2% 12.9ms ± 1% +0.72% (p=0.019 n=10+10) MSpanCountAlloc/bits=64-16 1.65ns ± 0% 1.65ns ± 0% ~ (p=0.124 n=10+10) MSpanCountAlloc/bits=128-16 2.08ns ± 1% 2.06ns ± 1% -0.87% (p=0.000 n=10+10) MSpanCountAlloc/bits=256-16 2.71ns ± 1% 2.69ns ± 1% -0.74% (p=0.001 n=10+9) MSpanCountAlloc/bits=512-16 4.15ns ± 0% 4.23ns ± 2% +2.15% (p=0.000 n=10+10) MSpanCountAlloc/bits=1024-16 7.89ns ± 1% 7.89ns ± 1% ~ (p=0.867 n=10+10) Hash5-16 1.93ns ± 1% 2.01ns ± 0% +3.99% (p=0.000 n=10+8) Hash16-16 2.04ns ± 1% 2.21ns ± 1% +8.61% (p=0.000 n=10+10) Hash64-16 2.67ns ± 0% 2.67ns ± 0% ~ (p=0.154 n=9+9) Hash1024-16 16.4ns ± 0% 16.4ns ± 0% +0.17% (p=0.020 n=9+10) Hash65536-16 886ns ± 0% 885ns ± 0% ~ (p=0.725 n=10+10) AlignedLoad-16 0.96ns ± 2% 0.95ns ± 3% ~ (p=0.123 n=10+10) UnalignedLoad-16 0.95ns ± 2% 1.01ns ± 2% +6.31% (p=0.000 n=10+10) EqEfaceConcrete-16 0.31ns ± 3% 0.33ns ± 5% +8.10% (p=0.000 n=10+10) EqIfaceConcrete-16 0.31ns ±13% 0.28ns ± 2% -9.23% (p=0.001 n=10+10) NeEfaceConcrete-16 0.29ns ± 1% 0.31ns ± 7% +5.59% (p=0.010 n=8+8) NeIfaceConcrete-16 0.28ns ± 2% 0.29ns ± 1% +4.49% (p=0.000 n=9+8) ConvT2EByteSized/bool-16 0.53ns ± 1% 0.52ns ± 1% -2.18% (p=0.000 n=10+10) ConvT2EByteSized/uint8-16 0.53ns ± 1% 0.53ns ± 0% +1.22% (p=0.000 n=10+10) ConvT2ESmall-16 1.13ns ± 0% 1.13ns ± 0% ~ (p=0.774 n=9+9) ConvT2EUintptr-16 1.03ns ± 0% 1.04ns ± 0% +0.50% (p=0.000 n=10+8) ConvT2ELarge-16 14.4ns ± 2% 14.4ns ± 1% ~ (p=0.726 n=10+10) ConvT2ISmall-16 1.13ns ± 0% 1.13ns ± 0% ~ (p=0.693 n=9+10) ConvT2IUintptr-16 1.03ns ± 0% 1.03ns ± 0% +0.44% (p=0.000 n=10+10) ConvT2ILarge-16 14.2ns ± 1% 14.4ns ± 1% +0.85% (p=0.007 n=9+10) ConvI2E-16 0.54ns ± 1% 0.54ns ± 0% -0.39% (p=0.037 n=10+8) ConvI2I-16 2.68ns ± 0% 2.70ns ± 1% +0.73% (p=0.000 n=9+10) AssertE2T-16 0.28ns ± 1% 0.39ns ± 5% +37.38% (p=0.000 n=10+10) AssertE2TLarge-16 0.42ns ± 2% 0.48ns ± 1% +14.92% (p=0.000 n=9+10) AssertE2I-16 2.67ns ± 0% 2.67ns ± 0% ~ (p=0.352 n=9+9) AssertI2T-16 0.37ns ± 3% 0.34ns ± 1% -6.16% (p=0.000 n=10+10) AssertI2I-16 2.67ns ± 0% 2.67ns ± 0% ~ (p=0.286 n=10+10) AssertI2E-16 0.54ns ± 1% 0.54ns ± 0% -0.94% (p=0.000 n=10+10) AssertE2E-16 0.41ns ± 0% 0.41ns ± 1% ~ (p=0.880 n=9+9) AssertE2T2-16 0.41ns ± 1% 0.41ns ± 1% ~ (p=0.725 n=10+10) AssertE2T2Blank-16 0.24ns ± 5% 0.21ns ± 1% -14.79% (p=0.000 n=10+9) AssertI2E2-16 0.69ns ± 0% 0.69ns ± 1% ~ (p=0.541 n=10+10) AssertI2E2Blank-16 0.26ns ± 9% 0.21ns ± 1% -18.86% (p=0.000 n=10+9) AssertE2E2-16 0.53ns ± 1% 0.53ns ± 1% +0.72% (p=0.004 n=10+10) AssertE2E2Blank-16 0.23ns ± 4% 0.21ns ± 1% -8.42% (p=0.000 n=10+10) ConvT2Ezero/zero/16-16 1.13ns ± 0% 1.14ns ± 1% ~ (p=0.583 n=9+10) ConvT2Ezero/zero/32-16 1.13ns ± 0% 1.13ns ± 0% ~ (p=0.417 n=10+10) ConvT2Ezero/zero/64-16 1.03ns ± 1% 1.03ns ± 0% ~ (p=0.051 n=10+10) ConvT2Ezero/zero/str-16 1.03ns ± 0% 1.03ns ± 0% ~ (p=0.132 n=10+10) ConvT2Ezero/zero/slice-16 1.14ns ± 0% 1.15ns ± 0% +0.49% (p=0.001 n=10+10) ConvT2Ezero/zero/big-16 123ns ± 1% 123ns ± 1% ~ (p=0.171 n=10+10) ConvT2Ezero/nonzero/str-16 19.4ns ± 1% 19.3ns ± 3% ~ (p=0.548 n=9+10) ConvT2Ezero/nonzero/slice-16 22.2ns ± 2% 22.0ns ± 2% ~ (p=0.109 n=10+10) ConvT2Ezero/nonzero/big-16 123ns ± 1% 123ns ± 1% ~ (p=0.446 n=10+8) ConvT2Ezero/smallint/16-16 1.13ns ± 0% 1.14ns ± 1% ~ (p=0.362 n=10+10) ConvT2Ezero/smallint/32-16 1.13ns ± 0% 1.13ns ± 0% ~ (p=0.907 n=10+9) ConvT2Ezero/smallint/64-16 1.04ns ± 0% 1.03ns ± 0% -0.38% (p=0.002 n=10+10) ConvT2Ezero/largeint/16-16 6.65ns ± 1% 6.65ns ± 2% ~ (p=0.618 n=10+9) ConvT2Ezero/largeint/32-16 6.75ns ± 3% 6.63ns ± 2% -1.77% (p=0.015 n=10+10) ConvT2Ezero/largeint/64-16 9.19ns ± 1% 9.26ns ± 2% ~ (p=0.123 n=10+10) Malloc8-16 8.66ns ± 1% 8.89ns ± 2% +2.74% (p=0.000 n=10+10) Malloc16-16 13.7ns ± 1% 13.8ns ± 1% +0.71% (p=0.022 n=10+8) MallocTypeInfo8-16 11.7ns ± 3% 11.6ns ± 2% ~ (p=0.469 n=10+10) MallocTypeInfo16-16 18.3ns ± 1% 18.2ns ± 2% ~ (p=0.251 n=9+10) MallocLargeStruct-16 195ns ± 1% 198ns ± 1% +1.65% (p=0.000 n=9+10) GoroutineSelect-16 1.10ms ± 1% 1.12ms ± 1% +1.36% (p=0.000 n=10+8) GoroutineBlocking-16 986µs ± 1% 998µs ± 1% +1.23% (p=0.002 n=10+10) GoroutineForRange-16 985µs ± 1% 1001µs ± 1% +1.68% (p=0.000 n=10+10) GoroutineIdle-16 679µs ± 1% 691µs ± 0% +1.74% (p=0.000 n=10+9) HashStringSpeed-16 5.33ns ± 5% 5.19ns ± 4% ~ (p=0.113 n=9+9) HashBytesSpeed-16 8.20ns ± 3% 8.24ns ± 1% ~ (p=0.497 n=10+9) HashInt32Speed-16 4.01ns ± 2% 3.90ns ± 4% -2.63% (p=0.011 n=9+10) HashInt64Speed-16 3.94ns ± 4% 3.79ns ± 1% -3.74% (p=0.003 n=10+9) HashStringArraySpeed-16 12.5ns ± 4% 12.3ns ± 1% ~ (p=0.055 n=10+10) MegMap-16 3.72ns ± 1% 3.73ns ± 1% ~ (p=0.484 n=9+10) MegOneMap-16 2.28ns ± 1% 2.27ns ± 1% ~ (p=0.287 n=10+10) MegEqMap-16 22.0µs ± 3% 22.3µs ± 2% +1.48% (p=0.028 n=10+9) MegEmptyMap-16 0.93ns ± 1% 0.92ns ± 1% -0.52% (p=0.030 n=10+10) SmallStrMap-16 3.77ns ± 0% 3.77ns ± 0% ~ (p=0.324 n=10+10) MapStringKeysEight_16-16 3.91ns ± 0% 3.91ns ± 0% ~ (p=0.088 n=9+9) MapStringKeysEight_32-16 3.58ns ± 1% 3.50ns ± 0% -2.11% (p=0.000 n=10+10) MapStringKeysEight_64-16 3.58ns ± 1% 3.50ns ± 0% -2.23% (p=0.000 n=10+10) MapStringKeysEight_1M-16 3.57ns ± 1% 3.50ns ± 0% -1.92% (p=0.000 n=10+10) IntMap-16 2.89ns ± 1% 2.89ns ± 0% ~ (p=0.381 n=10+10) MapFirst/1-16 1.60ns ± 1% 1.59ns ± 2% -0.49% (p=0.020 n=10+9) MapFirst/2-16 1.61ns ± 0% 1.59ns ± 1% -1.17% (p=0.001 n=10+10) MapFirst/3-16 1.61ns ± 1% 1.59ns ± 1% -1.45% (p=0.000 n=10+10) MapFirst/4-16 1.60ns ± 1% 1.59ns ± 1% -1.16% (p=0.000 n=10+10) MapFirst/5-16 1.60ns ± 1% 1.58ns ± 1% -0.98% (p=0.000 n=10+10) MapFirst/6-16 1.60ns ± 1% 1.59ns ± 1% -0.87% (p=0.001 n=10+10) MapFirst/7-16 1.60ns ± 1% 1.59ns ± 1% -0.79% (p=0.002 n=10+10) MapFirst/8-16 1.60ns ± 1% 1.59ns ± 1% -0.67% (p=0.017 n=9+10) MapFirst/9-16 2.83ns ± 0% 2.83ns ± 0% ~ (p=0.492 n=10+10) MapFirst/10-16 2.83ns ± 0% 2.84ns ± 0% +0.24% (p=0.017 n=10+10) MapFirst/11-16 2.83ns ± 0% 2.83ns ± 0% ~ (p=0.445 n=10+10) MapFirst/12-16 2.83ns ± 0% 2.83ns ± 0% ~ (p=0.564 n=10+10) MapFirst/13-16 2.83ns ± 0% 2.84ns ± 0% ~ (p=0.175 n=9+10) MapFirst/14-16 2.83ns ± 0% 2.83ns ± 0% ~ (p=0.322 n=10+9) MapFirst/15-16 2.83ns ± 0% 2.84ns ± 1% ~ (p=0.209 n=10+10) MapFirst/16-16 2.83ns ± 1% 2.84ns ± 0% ~ (p=0.238 n=10+10) MapMid/1-16 1.64ns ± 0% 1.64ns ± 0% ~ (p=0.453 n=10+9) MapMid/2-16 1.86ns ± 1% 1.86ns ± 0% ~ (p=0.764 n=10+9) MapMid/3-16 1.86ns ± 0% 1.86ns ± 1% ~ (p=1.000 n=10+10) MapMid/4-16 2.06ns ± 0% 2.06ns ± 0% -0.27% (p=0.014 n=10+9) MapMid/5-16 2.06ns ± 0% 2.06ns ± 0% ~ (p=0.075 n=9+10) MapMid/6-16 2.27ns ± 0% 2.27ns ± 1% ~ (p=0.898 n=10+10) MapMid/7-16 2.27ns ± 1% 2.26ns ± 0% -0.23% (p=0.049 n=10+10) MapMid/8-16 2.47ns ± 0% 2.47ns ± 1% ~ (p=0.840 n=10+10) MapMid/9-16 4.21ns ± 7% 4.13ns ±19% ~ (p=0.315 n=10+10) MapMid/10-16 4.17ns ± 7% 4.31ns ± 5% +3.37% (p=0.021 n=10+9) MapMid/11-16 4.18ns ± 7% 4.32ns ± 6% +3.50% (p=0.015 n=10+10) MapMid/12-16 4.34ns ± 7% 4.30ns ± 5% ~ (p=0.858 n=9+10) MapMid/13-16 4.25ns ± 6% 4.28ns ± 6% ~ (p=0.489 n=9+9) MapMid/14-16 3.75ns ±23% 3.90ns ±16% ~ (p=0.353 n=10+10) MapMid/15-16 3.87ns ±25% 3.95ns ±26% ~ (p=0.315 n=10+10) MapMid/16-16 4.06ns ±19% 3.94ns ±16% ~ (p=0.796 n=10+10) MapLast/1-16 1.65ns ± 0% 1.65ns ± 0% ~ (p=0.607 n=10+10) MapLast/2-16 1.86ns ± 0% 1.86ns ± 0% +0.26% (p=0.029 n=10+10) MapLast/3-16 2.06ns ± 1% 2.06ns ± 0% ~ (p=0.689 n=8+9) MapLast/4-16 2.27ns ± 1% 2.26ns ± 0% ~ (p=0.148 n=10+9) MapLast/5-16 2.47ns ± 0% 2.47ns ± 0% ~ (p=0.385 n=9+10) MapLast/6-16 2.67ns ± 0% 2.68ns ± 0% ~ (p=0.202 n=9+10) MapLast/7-16 2.88ns ± 0% 2.88ns ± 0% ~ (p=0.751 n=10+10) MapLast/8-16 3.08ns ± 0% 3.08ns ± 0% ~ (p=0.826 n=10+9) MapLast/9-16 4.31ns ± 6% 4.54ns ± 5% ~ (p=0.070 n=9+8) MapLast/10-16 4.25ns ± 5% 4.42ns ± 6% ~ (p=0.321 n=9+8) MapLast/11-16 4.59ns ±16% 5.42ns ±44% +17.99% (p=0.019 n=10+10) MapLast/12-16 5.04ns ±19% 6.11ns ±28% +21.35% (p=0.005 n=9+10) MapLast/13-16 6.00ns ±35% 5.76ns ± 3% ~ (p=0.173 n=10+8) MapLast/14-16 4.27ns ± 5% 4.53ns ± 6% +6.14% (p=0.007 n=10+10) MapLast/15-16 4.41ns ± 1% 4.44ns ± 7% ~ (p=0.515 n=8+10) MapLast/16-16 4.18ns ± 6% 4.99ns ±18% +19.48% (p=0.000 n=10+10) MapCycle-16 7.48ns ± 2% 7.46ns ± 1% ~ (p=0.699 n=10+10) RepeatedLookupStrMapKey32-16 6.98ns ± 3% 6.73ns ± 2% -3.63% (p=0.000 n=10+10) RepeatedLookupStrMapKey1M-16 14.7µs ± 5% 14.7µs ± 4% ~ (p=0.604 n=9+10) MakeMap/[Byte]Byte-16 58.5ns ± 1% 58.5ns ± 1% ~ (p=0.780 n=10+9) MakeMap/[Int]Int-16 113ns ± 0% 113ns ± 1% ~ (p=0.100 n=8+10) NewEmptyMap-16 2.47ns ± 0% 2.47ns ± 0% ~ (p=0.638 n=10+10) NewSmallMap-16 11.5ns ± 1% 11.6ns ± 0% +1.18% (p=0.000 n=10+10) MapIter-16 42.2ns ± 0% 42.8ns ± 1% +1.50% (p=0.000 n=10+10) MapIterEmpty-16 1.85ns ± 0% 1.85ns ± 0% ~ (p=0.651 n=10+10) SameLengthMap-16 1.85ns ± 1% 1.85ns ± 0% ~ (p=0.247 n=10+10) BigKeyMap-16 7.18ns ± 1% 7.42ns ± 4% +3.33% (p=0.004 n=10+10) BigValMap-16 7.03ns ± 2% 7.19ns ± 1% +2.33% (p=0.000 n=10+9) SmallKeyMap-16 5.32ns ± 1% 5.24ns ± 1% -1.41% (p=0.000 n=10+10) MapPopulate/1-16 6.30ns ± 0% 6.41ns ± 1% +1.81% (p=0.000 n=8+10) MapPopulate/10-16 239ns ± 2% 234ns ± 2% -2.05% (p=0.001 n=9+10) MapPopulate/100-16 4.19µs ± 2% 4.22µs ± 2% ~ (p=0.171 n=10+10) MapPopulate/1000-16 52.3µs ± 1% 52.5µs ± 1% ~ (p=0.133 n=9+10) MapPopulate/10000-16 459µs ± 1% 466µs ± 2% +1.45% (p=0.005 n=10+10) MapPopulate/100000-16 4.22ms ± 2% 4.25ms ± 2% ~ (p=0.393 n=10+10) ComplexAlgMap-16 12.5ns ± 1% 12.4ns ± 1% -0.95% (p=0.022 n=10+10) GoMapClear/Reflexive/1-16 9.61ns ± 1% 9.58ns ± 0% -0.27% (p=0.027 n=10+10) GoMapClear/Reflexive/10-16 10.0ns ± 1% 10.0ns ± 1% ~ (p=0.648 n=9+9) GoMapClear/Reflexive/100-16 31.4ns ± 0% 31.4ns ± 1% ~ (p=0.305 n=9+10) GoMapClear/Reflexive/1000-16 147ns ± 0% 149ns ± 2% +1.21% (p=0.000 n=10+10) GoMapClear/Reflexive/10000-16 3.99µs ± 0% 4.00µs ± 0% +0.21% (p=0.018 n=9+10) GoMapClear/NonReflexive/1-16 41.4ns ± 2% 41.7ns ± 1% +0.55% (p=0.043 n=9+10) GoMapClear/NonReflexive/10-16 50.3ns ± 1% 50.9ns ± 1% +1.16% (p=0.000 n=10+10) GoMapClear/NonReflexive/100-16 125ns ± 0% 126ns ± 0% +0.96% (p=0.000 n=8+10) GoMapClear/NonReflexive/1000-16 1.08µs ± 0% 1.08µs ± 1% ~ (p=0.097 n=10+10) GoMapClear/NonReflexive/10000-16 8.18µs ± 2% 8.10µs ± 0% -0.91% (p=0.019 n=10+8) MapStringConversion/32/simple-16 4.66ns ± 1% 4.69ns ± 3% ~ (p=0.905 n=9+10) MapStringConversion/32/struct-16 4.65ns ± 3% 4.94ns ± 2% +6.23% (p=0.000 n=10+10) MapStringConversion/32/array-16 4.69ns ± 3% 4.72ns ± 3% ~ (p=0.631 n=10+10) MapStringConversion/64/simple-16 4.14ns ± 0% 4.14ns ± 1% ~ (p=0.342 n=10+10) MapStringConversion/64/struct-16 4.13ns ± 0% 4.13ns ± 0% ~ (p=0.809 n=10+10) MapStringConversion/64/array-16 4.13ns ± 1% 4.13ns ± 1% ~ (p=0.752 n=10+10) MapInterfaceString-16 7.90ns ±23% 8.51ns ±33% ~ (p=0.604 n=9+10) MapInterfacePtr-16 7.68ns ±29% 7.10ns ±36% ~ (p=0.353 n=10+10) NewEmptyMapHintLessThan8-16 3.70ns ± 0% 3.70ns ± 0% ~ (p=0.209 n=10+10) NewEmptyMapHintGreaterThan8-16 270ns ± 1% 272ns ± 1% +0.71% (p=0.005 n=10+9) MapPop100-16 6.45µs ± 0% 6.50µs ± 1% +0.77% (p=0.000 n=10+10) MapPop1000-16 114µs ± 1% 114µs ± 1% ~ (p=0.190 n=10+10) MapPop10000-16 2.28ms ± 2% 2.28ms ± 2% ~ (p=0.912 n=10+10) MapAssign/Int32/256-16 4.75ns ± 2% 4.82ns ± 4% ~ (p=0.101 n=10+10) MapAssign/Int32/65536-16 16.4ns ± 1% 16.7ns ± 0% +1.44% (p=0.000 n=10+9) MapAssign/Int64/256-16 4.79ns ± 5% 4.79ns ± 1% ~ (p=0.616 n=10+8) MapAssign/Int64/65536-16 17.1ns ± 1% 16.8ns ± 0% -1.28% (p=0.000 n=10+8) MapAssign/Str/256-16 6.07ns ± 6% 6.24ns ± 2% +2.84% (p=0.035 n=10+9) MapAssign/Str/65536-16 21.4ns ± 0% 21.4ns ± 3% ~ (p=0.300 n=7+10) MapOperatorAssign/Int32/256-16 4.82ns ± 3% 4.81ns ± 3% ~ (p=0.684 n=10+10) MapOperatorAssign/Int32/65536-16 16.8ns ± 1% 16.5ns ± 1% -1.68% (p=0.000 n=9+10) MapOperatorAssign/Int64/256-16 4.74ns ± 1% 4.77ns ± 3% ~ (p=0.563 n=10+9) MapOperatorAssign/Int64/65536-16 16.9ns ± 1% 17.2ns ± 1% +1.88% (p=0.000 n=10+10) MapOperatorAssign/Str/256-16 1.09µs ± 1% 1.10µs ± 2% ~ (p=0.210 n=10+10) MapOperatorAssign/Str/65536-16 184ns ± 9% 184ns ± 8% ~ (p=0.922 n=10+9) MapAppendAssign/Int32/256-16 13.8ns ±10% 14.4ns ±11% ~ (p=0.190 n=10+10) MapAppendAssign/Int32/65536-16 28.9ns ± 5% 30.7ns ± 6% +6.13% (p=0.001 n=9+10) MapAppendAssign/Int64/256-16 14.5ns ±12% 13.8ns ± 8% -5.02% (p=0.037 n=10+10) MapAppendAssign/Int64/65536-16 30.9ns ± 1% 30.4ns ± 2% -1.56% (p=0.001 n=10+10) MapAppendAssign/Str/256-16 30.2ns ± 6% 30.0ns ±10% ~ (p=0.645 n=10+10) MapAppendAssign/Str/65536-16 44.5ns ± 4% 46.8ns ± 3% +5.17% (p=0.001 n=8+9) MapDelete/Int32/100-16 18.7ns ± 0% 18.7ns ± 0% -0.27% (p=0.017 n=10+10) MapDelete/Int32/1000-16 17.6ns ± 1% 17.5ns ± 1% -0.85% (p=0.000 n=9+10) MapDelete/Int32/10000-16 18.7ns ± 0% 18.3ns ± 1% -1.92% (p=0.000 n=10+10) MapDelete/Int64/100-16 19.1ns ± 0% 19.2ns ± 0% +0.68% (p=0.000 n=10+9) MapDelete/Int64/1000-16 17.7ns ± 2% 18.3ns ± 1% +3.00% (p=0.000 n=10+10) MapDelete/Int64/10000-16 18.8ns ± 1% 19.2ns ± 0% +2.01% (p=0.000 n=10+9) MapDelete/Str/100-16 26.5ns ± 0% 26.4ns ± 1% -0.73% (p=0.000 n=10+10) MapDelete/Str/1000-16 23.5ns ± 2% 23.4ns ± 1% ~ (p=0.425 n=10+10) MapDelete/Str/10000-16 25.1ns ± 0% 25.1ns ± 1% +0.28% (p=0.037 n=10+10) MapDelete/Pointer/100-16 20.6ns ± 1% 20.6ns ± 0% ~ (p=0.117 n=10+10) MapDelete/Pointer/1000-16 19.2ns ± 1% 19.4ns ± 1% +0.97% (p=0.004 n=10+10) MapDelete/Pointer/10000-16 20.0ns ± 0% 20.1ns ± 1% +0.52% (p=0.022 n=10+10) Memmove/0-16 0.21ns ± 2% 0.21ns ± 1% ~ (p=0.671 n=10+10) Memmove/1-16 0.93ns ± 0% 0.93ns ± 0% +0.21% (p=0.034 n=10+10) Memmove/2-16 0.93ns ± 0% 0.93ns ± 0% ~ (p=0.101 n=10+10) Memmove/3-16 0.93ns ± 1% 0.93ns ± 1% +0.49% (p=0.004 n=10+10) Memmove/4-16 1.03ns ± 0% 1.03ns ± 0% ~ (p=0.260 n=10+10) Memmove/5-16 1.13ns ± 0% 1.13ns ± 0% +0.20% (p=0.034 n=10+10) Memmove/6-16 1.13ns ± 0% 1.13ns ± 1% ~ (p=0.126 n=10+10) Memmove/7-16 1.13ns ± 0% 1.13ns ± 1% +0.22% (p=0.028 n=10+10) Memmove/8-16 1.13ns ± 0% 1.13ns ± 0% ~ (p=0.545 n=9+10) Memmove/9-16 1.25ns ± 0% 1.35ns ± 0% +7.98% (p=0.000 n=10+10) Memmove/10-16 1.25ns ± 0% 1.35ns ± 0% +7.96% (p=0.000 n=9+9) Memmove/11-16 1.25ns ± 0% 1.35ns ± 0% +8.53% (p=0.000 n=10+9) Memmove/12-16 1.25ns ± 0% 1.35ns ± 1% +8.24% (p=0.000 n=10+10) Memmove/13-16 1.25ns ± 0% 1.34ns ± 0% +7.75% (p=0.000 n=10+10) Memmove/14-16 1.25ns ± 0% 1.35ns ± 1% +8.28% (p=0.000 n=10+9) Memmove/15-16 1.25ns ± 0% 1.35ns ± 0% +8.07% (p=0.000 n=10+9) Memmove/16-16 1.25ns ± 0% 1.35ns ± 1% +8.35% (p=0.000 n=9+10) Memmove/32-16 1.34ns ± 0% 1.36ns ± 1% +1.22% (p=0.000 n=10+10) Memmove/64-16 1.45ns ± 0% 1.64ns ± 0% +13.07% (p=0.000 n=10+9) Memmove/128-16 1.86ns ± 0% 2.02ns ± 0% +8.64% (p=0.000 n=10+10) Memmove/256-16 2.47ns ± 0% 2.49ns ± 1% +1.14% (p=0.000 n=10+10) Memmove/512-16 3.96ns ± 1% 3.96ns ± 0% ~ (p=0.182 n=10+10) Memmove/1024-16 5.90ns ± 1% 5.87ns ± 1% ~ (p=0.258 n=9+9) Memmove/2048-16 9.62ns ± 1% 9.62ns ± 2% ~ (p=0.963 n=8+9) Memmove/4096-16 16.4ns ± 0% 17.1ns ± 4% +4.19% (p=0.003 n=8+9) MemmoveOverlap/32-16 1.62ns ± 1% 1.68ns ± 1% +3.53% (p=0.000 n=10+10) MemmoveOverlap/64-16 1.64ns ± 0% 1.65ns ± 0% +0.29% (p=0.002 n=9+9) MemmoveOverlap/128-16 2.06ns ± 0% 2.06ns ± 0% ~ (p=0.070 n=10+10) MemmoveOverlap/256-16 2.67ns ± 0% 2.67ns ± 0% +0.26% (p=0.012 n=10+10) MemmoveOverlap/512-16 6.20ns ±18% 5.74ns ± 0% ~ (p=0.645 n=10+8) MemmoveOverlap/1024-16 7.28ns ± 0% 7.30ns ± 0% +0.28% (p=0.006 n=8+10) MemmoveOverlap/2048-16 11.9ns ± 0% 12.0ns ± 1% +0.37% (p=0.014 n=9+9) MemmoveOverlap/4096-16 23.3ns ± 1% 23.1ns ± 1% -0.84% (p=0.000 n=8+10) MemmoveUnalignedDst/0-16 1.03ns ± 0% 1.03ns ± 0% +0.19% (p=0.007 n=10+10) MemmoveUnalignedDst/1-16 1.24ns ± 0% 1.25ns ± 1% +0.52% (p=0.022 n=10+10) MemmoveUnalignedDst/2-16 1.23ns ± 0% 1.23ns ± 0% ~ (p=0.051 n=10+10) MemmoveUnalignedDst/3-16 1.23ns ± 0% 1.23ns ± 0% +0.14% (p=0.006 n=9+9) MemmoveUnalignedDst/4-16 1.23ns ± 0% 1.24ns ± 1% +0.37% (p=0.004 n=10+10) MemmoveUnalignedDst/5-16 1.35ns ± 0% 1.35ns ± 0% ~ (p=0.075 n=10+10) MemmoveUnalignedDst/6-16 1.34ns ± 0% 1.34ns ± 0% ~ (p=0.779 n=10+10) MemmoveUnalignedDst/7-16 1.34ns ± 0% 1.34ns ± 0% ~ (p=1.000 n=10+10) MemmoveUnalignedDst/8-16 1.34ns ± 0% 1.35ns ± 1% +0.39% (p=0.024 n=10+10) MemmoveUnalignedDst/9-16 1.44ns ± 0% 1.44ns ± 0% ~ (p=0.849 n=10+10) MemmoveUnalignedDst/10-16 1.44ns ± 0% 1.44ns ± 0% ~ (p=0.255 n=10+10) MemmoveUnalignedDst/11-16 1.44ns ± 0% 1.44ns ± 0% ~ (p=0.304 n=10+10) MemmoveUnalignedDst/12-16 1.44ns ± 0% 1.44ns ± 0% ~ (p=0.672 n=10+10) MemmoveUnalignedDst/13-16 1.44ns ± 0% 1.44ns ± 0% ~ (p=0.435 n=10+10) MemmoveUnalignedDst/14-16 1.44ns ± 0% 1.44ns ± 0% ~ (p=0.340 n=10+10) MemmoveUnalignedDst/15-16 1.44ns ± 0% 1.44ns ± 0% ~ (p=0.911 n=10+9) MemmoveUnalignedDst/16-16 1.44ns ± 0% 1.44ns ± 0% ~ (p=0.074 n=10+10) MemmoveUnalignedDst/32-16 1.62ns ± 0% 1.63ns ± 0% ~ (p=0.059 n=10+10) MemmoveUnalignedDst/64-16 1.65ns ± 0% 1.65ns ± 0% ~ (p=0.234 n=10+10) MemmoveUnalignedDst/128-16 2.06ns ± 0% 2.06ns ± 0% ~ (p=0.709 n=10+9) MemmoveUnalignedDst/256-16 3.69ns ± 0% 3.70ns ± 0% ~ (p=0.144 n=10+10) MemmoveUnalignedDst/512-16 4.15ns ± 1% 4.14ns ± 0% ~ (p=0.778 n=10+8) MemmoveUnalignedDst/1024-16 7.52ns ± 0% 7.53ns ± 1% ~ (p=0.650 n=9+9) MemmoveUnalignedDst/2048-16 12.9ns ± 0% 12.9ns ± 1% ~ (p=0.548 n=8+8) MemmoveUnalignedDst/4096-16 25.4ns ± 0% 25.4ns ± 0% ~ (p=0.947 n=9+9) MemmoveUnalignedDstOverlap/32-16 4.08ns ± 0% 4.09ns ± 0% ~ (p=0.360 n=10+10) MemmoveUnalignedDstOverlap/64-16 4.56ns ± 0% 4.56ns ± 0% ~ (p=0.705 n=10+9) MemmoveUnalignedDstOverlap/128-16 4.67ns ± 0% 4.67ns ± 0% ~ (p=0.397 n=10+10) MemmoveUnalignedDstOverlap/256-16 5.08ns ± 0% 5.08ns ± 0% ~ (p=0.159 n=10+9) MemmoveUnalignedDstOverlap/512-16 8.45ns ± 5% 8.19ns ± 0% -3.10% (p=0.021 n=10+9) MemmoveUnalignedDstOverlap/1024-16 9.55ns ± 0% 9.56ns ± 0% ~ (p=0.221 n=8+8) MemmoveUnalignedDstOverlap/2048-16 14.0ns ± 0% 14.0ns ± 1% ~ (p=0.200 n=10+9) MemmoveUnalignedDstOverlap/4096-16 26.5ns ± 0% 26.5ns ± 0% ~ (p=0.458 n=10+9) MemmoveUnalignedSrc/0-16 1.02ns ± 1% 0.99ns ± 1% -2.67% (p=0.000 n=10+9) MemmoveUnalignedSrc/1-16 1.13ns ± 0% 1.13ns ± 1% -0.25% (p=0.027 n=10+9) MemmoveUnalignedSrc/2-16 1.13ns ± 1% 1.13ns ± 0% -0.28% (p=0.012 n=10+9) MemmoveUnalignedSrc/3-16 1.24ns ± 1% 1.23ns ± 0% -0.25% (p=0.022 n=9+10) MemmoveUnalignedSrc/4-16 1.24ns ± 0% 1.23ns ± 1% ~ (p=0.118 n=9+10) MemmoveUnalignedSrc/5-16 1.34ns ± 0% 1.34ns ± 1% ~ (p=0.564 n=8+10) MemmoveUnalignedSrc/6-16 1.34ns ± 0% 1.34ns ± 0% -0.39% (p=0.000 n=10+10) MemmoveUnalignedSrc/7-16 1.34ns ± 0% 1.34ns ± 0% ~ (p=0.235 n=10+10) MemmoveUnalignedSrc/8-16 1.34ns ± 0% 1.34ns ± 0% -0.37% (p=0.002 n=10+9) MemmoveUnalignedSrc/9-16 1.44ns ± 0% 1.44ns ± 0% ~ (p=0.579 n=10+9) MemmoveUnalignedSrc/10-16 1.44ns ± 0% 1.44ns ± 0% ~ (p=0.534 n=10+9) MemmoveUnalignedSrc/11-16 1.44ns ± 0% 1.44ns ± 1% ~ (p=0.415 n=10+10) MemmoveUnalignedSrc/12-16 1.44ns ± 0% 1.44ns ± 0% ~ (p=0.218 n=10+10) MemmoveUnalignedSrc/13-16 1.44ns ± 0% 1.44ns ± 1% ~ (p=0.693 n=10+10) MemmoveUnalignedSrc/14-16 1.44ns ± 0% 1.44ns ± 0% ~ (p=0.901 n=10+10) MemmoveUnalignedSrc/15-16 1.44ns ± 0% 1.44ns ± 0% ~ (p=0.379 n=10+10) MemmoveUnalignedSrc/16-16 1.44ns ± 1% 1.44ns ± 0% ~ (p=0.538 n=10+10) MemmoveUnalignedSrc/32-16 1.60ns ± 1% 1.60ns ± 0% ~ (p=0.491 n=10+10) MemmoveUnalignedSrc/64-16 1.65ns ± 0% 1.65ns ± 0% ~ (p=0.564 n=10+10) MemmoveUnalignedSrc/128-16 2.09ns ± 0% 2.09ns ± 0% ~ (p=0.497 n=10+9) MemmoveUnalignedSrc/256-16 2.70ns ± 0% 2.78ns ± 1% +2.81% (p=0.000 n=10+10) MemmoveUnalignedSrc/512-16 4.31ns ± 0% 4.30ns ± 0% -0.26% (p=0.031 n=8+9) MemmoveUnalignedSrc/1024-16 7.28ns ± 0% 7.21ns ± 1% -1.05% (p=0.000 n=8+10) MemmoveUnalignedSrc/2048-16 13.0ns ± 0% 13.0ns ± 0% ~ (p=0.180 n=9+8) MemmoveUnalignedSrc/4096-16 25.4ns ± 0% 25.3ns ± 1% ~ (p=0.054 n=10+10) MemmoveUnalignedSrcOverlap/32-16 4.04ns ± 0% 4.06ns ± 0% +0.62% (p=0.000 n=9+10) MemmoveUnalignedSrcOverlap/64-16 4.12ns ± 0% 4.12ns ± 0% ~ (p=0.421 n=10+10) MemmoveUnalignedSrcOverlap/128-16 4.53ns ± 0% 4.52ns ± 0% ~ (p=0.251 n=10+10) MemmoveUnalignedSrcOverlap/256-16 6.17ns ± 0% 6.15ns ± 0% -0.35% (p=0.000 n=10+9) MemmoveUnalignedSrcOverlap/512-16 7.43ns ± 0% 7.44ns ± 0% ~ (p=0.524 n=9+8) MemmoveUnalignedSrcOverlap/1024-16 8.94ns ± 0% 8.94ns ± 0% ~ (p=0.419 n=8+8) MemmoveUnalignedSrcOverlap/2048-16 13.2ns ± 0% 14.5ns ±21% ~ (p=0.107 n=8+10) MemmoveUnalignedSrcOverlap/4096-16 25.6ns ± 0% 25.6ns ± 1% ~ (p=0.650 n=9+9) Memclr/5-16 0.86ns ± 1% 0.86ns ± 2% ~ (p=0.531 n=9+9) Memclr/16-16 1.04ns ± 0% 1.04ns ± 0% +0.32% (p=0.013 n=9+10) Memclr/64-16 1.23ns ± 0% 1.26ns ± 0% +2.28% (p=0.000 n=10+10) Memclr/256-16 2.27ns ± 0% 2.27ns ± 0% ~ (p=0.127 n=10+10) Memclr/4096-16 17.1ns ± 1% 17.3ns ± 0% +0.88% (p=0.000 n=10+10) Memclr/65536-16 821ns ± 0% 822ns ± 0% ~ (p=0.516 n=10+10) Memclr/1M-16 14.1µs ± 1% 14.0µs ± 1% ~ (p=0.516 n=10+10) Memclr/4M-16 86.1µs ± 1% 85.9µs ± 0% ~ (p=0.123 n=10+10) Memclr/8M-16 174µs ± 2% 173µs ± 0% ~ (p=0.408 n=10+8) Memclr/16M-16 385µs ± 4% 387µs ± 0% ~ (p=0.173 n=10+8) Memclr/64M-16 2.18ms ± 0% 2.19ms ± 0% ~ (p=0.113 n=10+9) GoMemclr/5-16 0.82ns ± 0% 0.82ns ± 0% ~ (p=0.346 n=9+10) GoMemclr/16-16 1.02ns ± 0% 1.02ns ± 0% +0.22% (p=0.003 n=10+8) GoMemclr/64-16 1.14ns ± 0% 1.14ns ± 0% ~ (p=0.948 n=10+9) GoMemclr/256-16 2.06ns ± 0% 2.06ns ± 0% ~ (p=0.868 n=10+10) MemclrRange/1K_2K-16 457ns ± 0% 428ns ± 1% -6.38% (p=0.000 n=10+10) MemclrRange/2K_8K-16 1.46µs ± 0% 1.46µs ± 0% ~ (p=0.700 n=10+10) MemclrRange/4K_16K-16 1.16µs ± 0% 1.16µs ± 0% ~ (p=0.567 n=9+10) MemclrRange/160K_228K-16 20.7µs ± 0% 20.7µs ± 0% ~ (p=0.160 n=10+10) ClearFat7-16 0.38ns ± 5% 0.21ns ± 1% -45.79% (p=0.000 n=9+10) ClearFat8-16 0.21ns ± 3% 0.12ns ± 2% -44.16% (p=0.000 n=8+9) ClearFat11-16 0.35ns ± 3% 0.21ns ± 1% -40.46% (p=0.000 n=9+9) ClearFat12-16 0.23ns ± 9% 0.21ns ± 1% -10.23% (p=0.000 n=10+9) ClearFat13-16 0.22ns ± 6% 0.21ns ± 2% -6.53% (p=0.000 n=10+10) ClearFat14-16 0.22ns ± 4% 0.21ns ± 1% -5.97% (p=0.000 n=10+10) ClearFat15-16 0.22ns ± 4% 0.21ns ± 1% -6.96% (p=0.000 n=10+9) ClearFat16-16 0.19ns ± 9% 0.12ns ± 6% -34.89% (p=0.000 n=9+10) ClearFat24-16 0.23ns ± 6% 0.21ns ± 1% -10.26% (p=0.000 n=10+9) ClearFat32-16 0.22ns ± 5% 0.21ns ± 2% -5.31% (p=0.000 n=10+10) ClearFat40-16 0.34ns ± 4% 0.62ns ± 1% +83.00% (p=0.000 n=10+10) ClearFat48-16 0.33ns ± 2% 0.41ns ± 0% +26.71% (p=0.000 n=10+10) ClearFat56-16 0.41ns ± 1% 0.41ns ± 0% ~ (p=0.838 n=10+10) ClearFat64-16 0.41ns ± 0% 0.41ns ± 0% ~ (p=0.178 n=10+8) ClearFat72-16 0.82ns ± 0% 0.82ns ± 0% ~ (p=0.669 n=10+10) ClearFat128-16 1.04ns ± 0% 1.04ns ± 0% ~ (p=0.679 n=10+10) ClearFat256-16 1.86ns ± 0% 1.86ns ± 0% ~ (p=0.066 n=9+10) ClearFat512-16 3.50ns ± 0% 3.50ns ± 0% ~ (p=0.626 n=10+10) ClearFat1024-16 6.79ns ± 0% 6.79ns ± 0% ~ (p=0.986 n=10+10) ClearFat1032-16 13.6ns ± 0% 13.6ns ± 0% +0.13% (p=0.044 n=10+10) ClearFat1040-16 10.3ns ± 0% 10.3ns ± 0% ~ (p=0.175 n=10+9) CopyFat7-16 0.37ns ±13% 0.25ns ± 1% -31.74% (p=0.000 n=10+9) CopyFat8-16 0.17ns ± 1% 0.17ns ± 2% +1.35% (p=0.004 n=9+9) CopyFat11-16 0.26ns ± 1% 0.30ns ± 3% +12.58% (p=0.000 n=9+10) CopyFat12-16 0.28ns ± 2% 0.26ns ± 1% -5.66% (p=0.000 n=9+9) CopyFat13-16 0.26ns ± 0% 0.28ns ± 4% +7.35% (p=0.000 n=8+10) CopyFat14-16 0.29ns ± 6% 0.26ns ± 2% -10.46% (p=0.000 n=10+9) CopyFat15-16 0.26ns ± 1% 0.30ns ± 6% +14.12% (p=0.000 n=8+10) CopyFat16-16 0.21ns ± 1% 0.21ns ± 0% ~ (p=0.426 n=8+8) CopyFat24-16 0.29ns ± 3% 0.25ns ± 1% -12.27% (p=0.000 n=9+10) CopyFat32-16 0.26ns ± 4% 0.29ns ± 4% +11.71% (p=0.000 n=10+10) CopyFat64-16 0.46ns ± 8% 0.42ns ± 1% -8.37% (p=0.002 n=10+10) CopyFat72-16 0.82ns ± 0% 0.82ns ± 0% ~ (p=0.563 n=10+10) CopyFat128-16 1.53ns ± 0% 1.54ns ± 0% +0.62% (p=0.000 n=10+10) CopyFat256-16 2.68ns ± 0% 2.65ns ± 1% -1.23% (p=0.000 n=10+10) CopyFat512-16 4.93ns ± 1% 5.19ns ± 3% +5.16% (p=0.000 n=9+9) CopyFat520-16 6.99ns ± 0% 6.99ns ± 0% ~ (p=0.539 n=10+10) CopyFat1024-16 11.5ns ± 1% 9.8ns ± 1% -14.98% (p=0.000 n=9+10) CopyFat1032-16 13.6ns ± 0% 13.6ns ± 0% ~ (p=0.728 n=10+10) CopyFat1040-16 11.0ns ± 0% 11.1ns ± 0% +0.53% (p=0.000 n=10+10) Issue18740/2byte-16 10.1µs ± 0% 10.1µs ± 0% ~ (p=0.342 n=10+10) Issue18740/4byte-16 2.34µs ± 0% 2.35µs ± 0% +0.30% (p=0.002 n=10+8) Issue18740/8byte-16 1.28µs ± 0% 1.28µs ± 0% +0.32% (p=0.000 n=9+10) Finalizer-16 345µs ± 1% 336µs ± 0% -2.55% (p=0.000 n=10+9) FinalizerRun-16 450ns ± 3% 420ns ± 1% -6.65% (p=0.000 n=10+10) PallocBitsSummarize/Unpacked00-16 2.88ns ± 0% 2.88ns ± 0% ~ (p=0.358 n=10+10) PallocBitsSummarize/UnpackedFFFFFFFFFFFFFFFF-16 15.2ns ± 0% 15.2ns ± 0% ~ (p=0.925 n=10+10) PallocBitsSummarize/UnpackedAA-16 16.4ns ± 0% 16.3ns ± 0% ~ (p=0.113 n=9+9) PallocBitsSummarize/UnpackedAAAAAAAAAAAAAAAA-16 16.5ns ± 0% 16.6ns ± 0% ~ (p=0.238 n=10+10) PallocBitsSummarize/Unpacked80000000AAAAAAAA-16 37.8ns ± 1% 36.4ns ± 0% -3.70% (p=0.000 n=10+9) PallocBitsSummarize/UnpackedAAAAAAAA00000001-16 41.8ns ± 1% 39.9ns ± 0% -4.68% (p=0.000 n=9+10) PallocBitsSummarize/UnpackedBBBBBBBBBBBBBBBB-16 18.3ns ± 0% 18.3ns ± 0% ~ (p=0.781 n=10+10) PallocBitsSummarize/Unpacked80000000BBBBBBBB-16 38.8ns ± 1% 38.1ns ± 0% -1.78% (p=0.000 n=9+10) PallocBitsSummarize/UnpackedBBBBBBBB00000001-16 37.5ns ± 0% 36.1ns ± 1% -3.88% (p=0.000 n=8+10) PallocBitsSummarize/UnpackedCCCCCCCCCCCCCCCC-16 21.8ns ± 0% 21.9ns ± 0% +0.20% (p=0.018 n=10+9) PallocBitsSummarize/Unpacked4444444444444444-16 21.8ns ± 0% 21.9ns ± 0% +0.20% (p=0.029 n=10+9) PallocBitsSummarize/Unpacked4040404040404040-16 26.5ns ± 0% 26.5ns ± 0% -0.24% (p=0.001 n=9+10) PallocBitsSummarize/Unpacked4000400040004000-16 33.4ns ± 1% 31.3ns ± 0% -6.20% (p=0.000 n=9+10) PallocBitsSummarize/Unpacked1000404044CCAAFF-16 36.4ns ± 1% 35.9ns ± 0% -1.50% (p=0.000 n=10+10) FindBitRange64/Pattern00Size2-16 0.34ns ± 1% 0.35ns ± 1% +3.80% (p=0.000 n=10+9) FindBitRange64/Pattern00Size8-16 0.70ns ± 1% 0.70ns ± 0% -0.68% (p=0.000 n=10+10) FindBitRange64/Pattern00Size32-16 0.70ns ± 1% 0.69ns ± 0% -0.86% (p=0.001 n=10+8) FindBitRange64/PatternFFFFFFFFFFFFFFFFSize2-16 0.34ns ± 1% 0.35ns ± 1% +4.45% (p=0.000 n=9+8) FindBitRange64/PatternFFFFFFFFFFFFFFFFSize8-16 1.54ns ± 0% 1.54ns ± 1% ~ (p=0.914 n=9+9) FindBitRange64/PatternFFFFFFFFFFFFFFFFSize32-16 2.78ns ± 0% 2.78ns ± 0% ~ (p=0.295 n=9+10) FindBitRange64/PatternAASize2-16 0.34ns ± 2% 0.35ns ± 2% +4.61% (p=0.000 n=10+10) FindBitRange64/PatternAASize8-16 0.70ns ± 1% 0.70ns ± 1% -0.82% (p=0.005 n=10+10) FindBitRange64/PatternAASize32-16 0.70ns ± 1% 0.70ns ± 0% -0.73% (p=0.003 n=10+9) FindBitRange64/PatternAAAAAAAAAAAAAAAASize2-16 0.34ns ± 2% 0.35ns ± 2% +3.94% (p=0.000 n=10+10) FindBitRange64/PatternAAAAAAAAAAAAAAAASize8-16 0.70ns ± 1% 0.70ns ± 1% -0.67% (p=0.025 n=10+10) FindBitRange64/PatternAAAAAAAAAAAAAAAASize32-16 0.70ns ± 1% 0.70ns ± 1% ~ (p=0.118 n=9+10) FindBitRange64/Pattern80000000AAAAAAAASize2-16 0.34ns ± 1% 0.35ns ± 2% +3.72% (p=0.000 n=10+9) FindBitRange64/Pattern80000000AAAAAAAASize8-16 0.70ns ± 1% 0.70ns ± 0% ~ (p=0.102 n=10+10) FindBitRange64/Pattern80000000AAAAAAAASize32-16 0.70ns ± 1% 0.70ns ± 1% -0.55% (p=0.011 n=10+10) FindBitRange64/PatternAAAAAAAA00000001Size2-16 0.34ns ± 2% 0.35ns ± 1% +3.83% (p=0.000 n=10+9) FindBitRange64/PatternAAAAAAAA00000001Size8-16 0.70ns ± 1% 0.70ns ± 1% ~ (p=0.065 n=10+10) FindBitRange64/PatternAAAAAAAA00000001Size32-16 0.70ns ± 1% 0.70ns ± 1% -0.95% (p=0.002 n=10+10) FindBitRange64/PatternBBBBBBBBBBBBBBBBSize2-16 0.34ns ± 0% 0.35ns ± 1% +4.12% (p=0.000 n=8+10) FindBitRange64/PatternBBBBBBBBBBBBBBBBSize8-16 1.24ns ± 0% 1.23ns ± 0% -0.30% (p=0.002 n=10+9) FindBitRange64/PatternBBBBBBBBBBBBBBBBSize32-16 1.24ns ± 0% 1.24ns ± 0% -0.17% (p=0.023 n=9+10) FindBitRange64/Pattern80000000BBBBBBBBSize2-16 0.34ns ± 1% 0.35ns ± 2% +4.82% (p=0.000 n=9+10) FindBitRange64/Pattern80000000BBBBBBBBSize8-16 1.24ns ± 1% 1.24ns ± 0% ~ (p=0.063 n=10+10) FindBitRange64/Pattern80000000BBBBBBBBSize32-16 1.24ns ± 0% 1.24ns ± 0% ~ (p=0.164 n=9+10) FindBitRange64/PatternBBBBBBBB00000001Size2-16 0.34ns ± 1% 0.35ns ± 1% +4.38% (p=0.000 n=8+10) FindBitRange64/PatternBBBBBBBB00000001Size8-16 1.24ns ± 1% 1.24ns ± 0% ~ (p=0.052 n=10+10) FindBitRange64/PatternBBBBBBBB00000001Size32-16 1.24ns ± 0% 1.23ns ± 0% -0.40% (p=0.000 n=10+10) FindBitRange64/PatternCCCCCCCCCCCCCCCCSize2-16 0.34ns ± 0% 0.35ns ± 2% +3.96% (p=0.000 n=9+10) FindBitRange64/PatternCCCCCCCCCCCCCCCCSize8-16 1.24ns ± 0% 1.23ns ± 0% -0.30% (p=0.000 n=10+9) FindBitRange64/PatternCCCCCCCCCCCCCCCCSize32-16 1.24ns ± 0% 1.24ns ± 1% ~ (p=0.284 n=10+10) FindBitRange64/Pattern4444444444444444Size2-16 0.34ns ± 1% 0.35ns ± 1% +3.91% (p=0.000 n=9+9) FindBitRange64/Pattern4444444444444444Size8-16 0.70ns ± 1% 0.70ns ± 1% ~ (p=0.617 n=10+10) FindBitRange64/Pattern4444444444444444Size32-16 0.70ns ± 1% 0.70ns ± 1% -0.60% (p=0.006 n=10+10) FindBitRange64/Pattern4040404040404040Size2-16 0.34ns ± 2% 0.35ns ± 2% +3.67% (p=0.000 n=10+10) FindBitRange64/Pattern4040404040404040Size8-16 0.70ns ± 2% 0.70ns ± 1% -0.87% (p=0.014 n=10+10) FindBitRange64/Pattern4040404040404040Size32-16 0.70ns ± 1% 0.70ns ± 1% ~ (p=0.256 n=10+10) FindBitRange64/Pattern4000400040004000Size2-16 0.34ns ± 2% 0.35ns ± 3% +4.71% (p=0.000 n=10+10) FindBitRange64/Pattern4000400040004000Size8-16 0.70ns ± 1% 0.70ns ± 1% ~ (p=0.393 n=10+10) FindBitRange64/Pattern4000400040004000Size32-16 0.70ns ± 1% 0.70ns ± 1% -0.86% (p=0.014 n=10+10) NetpollBreak-16 1.49µs ± 1% 1.50µs ± 3% ~ (p=0.181 n=8+10) Syscall-16 3.68ns ± 1% 3.66ns ± 2% ~ (p=0.148 n=10+10) SyscallWork-16 5.15ns ± 1% 5.13ns ± 0% ~ (p=0.188 n=10+9) SyscallExcess-16 3.89ns ± 2% 3.83ns ± 1% -1.52% (p=0.001 n=10+10) SyscallExcessWork-16 5.34ns ± 1% 5.31ns ± 0% -0.64% (p=0.000 n=10+9) PingPongHog-16 397ns ± 7% 394ns ±11% ~ (p=0.912 n=10+10) StackGrowth-16 67.9ns ± 0% 68.8ns ± 0% +1.28% (p=0.000 n=10+8) StackGrowthDeep-16 7.70µs ± 1% 8.48µs ± 2% +10.06% (p=0.000 n=9+10) CreateGoroutines-16 124ns ± 1% 124ns ± 1% ~ (p=0.254 n=10+10) CreateGoroutinesParallel-16 25.7ns ± 1% 27.6ns ± 2% +7.51% (p=0.000 n=10+10) CreateGoroutinesCapture-16 823ns ± 1% 821ns ± 2% ~ (p=0.699 n=10+10) CreateGoroutinesSingle-16 175ns ± 3% 172ns ± 3% -1.90% (p=0.011 n=10+10) ClosureCall-16 0.11ns ± 7% 0.12ns ± 3% ~ (p=0.842 n=9+10) WakeupParallelSpinning/0s-16 11.4µs ± 0% 11.4µs ± 0% ~ (p=0.325 n=9+10) WakeupParallelSpinning/1µs-16 15.4µs ± 0% 15.4µs ± 1% ~ (p=0.955 n=10+10) WakeupParallelSpinning/2µs-16 18.7µs ± 2% 18.9µs ± 2% ~ (p=0.052 n=10+10) WakeupParallelSpinning/5µs-16 30.7µs ± 0% 30.7µs ± 0% -0.03% (p=0.003 n=10+10) WakeupParallelSpinning/10µs-16 48.8µs ± 0% 48.8µs ± 0% ~ (p=0.670 n=10+10) WakeupParallelSpinning/20µs-16 90.8µs ± 0% 90.8µs ± 0% -0.02% (p=0.004 n=10+10) WakeupParallelSpinning/50µs-16 211µs ± 0% 211µs ± 0% ~ (p=0.194 n=10+10) WakeupParallelSpinning/100µs-16 323µs ± 0% 323µs ± 0% ~ (p=1.000 n=10+9) WakeupParallelSyscall/0s-16 118µs ± 0% 118µs ± 0% ~ (p=0.447 n=10+9) WakeupParallelSyscall/1µs-16 119µs ± 2% 119µs ± 1% ~ (p=0.604 n=10+9) WakeupParallelSyscall/2µs-16 120µs ± 1% 121µs ± 3% ~ (p=0.263 n=8+10) WakeupParallelSyscall/5µs-16 126µs ± 2% 126µs ± 2% ~ (p=0.510 n=10+9) WakeupParallelSyscall/10µs-16 136µs ± 1% 137µs ± 1% ~ (p=0.095 n=9+10) WakeupParallelSyscall/20µs-16 156µs ± 2% 157µs ± 3% ~ (p=0.604 n=10+9) WakeupParallelSyscall/50µs-16 221µs ± 1% 220µs ± 1% ~ (p=0.063 n=10+10) WakeupParallelSyscall/100µs-16 326µs ± 0% 325µs ± 0% -0.26% (p=0.003 n=9+10) Matmult-16 0.67ns ± 2% 0.66ns ± 2% ~ (p=0.256 n=10+10) Fastrand-16 0.08ns ±11% 0.08ns ±13% ~ (p=0.661 n=9+10) Fastrand64-16 0.08ns ±11% 0.08ns ± 6% ~ (p=0.631 n=10+10) FastrandHashiter-16 1.76ns ± 1% 1.76ns ± 1% ~ (p=0.854 n=8+8) Fastrandn/2-16 0.86ns ± 1% 0.86ns ± 1% +1.09% (p=0.000 n=10+9) Fastrandn/3-16 0.85ns ± 1% 0.86ns ± 1% +1.23% (p=0.001 n=10+10) Fastrandn/4-16 0.85ns ± 1% 0.87ns ± 2% +1.60% (p=0.000 n=10+10) Fastrandn/5-16 0.85ns ± 1% 0.86ns ± 1% +1.05% (p=0.000 n=10+10) IfaceCmp100-16 46.6ns ± 0% 46.1ns ± 0% -1.18% (p=0.000 n=10+10) IfaceCmpNil100-16 26.8ns ± 0% 26.8ns ± 0% ~ (p=0.777 n=10+8) EfaceCmpDiff-16 132ns ± 0% 130ns ± 0% -0.95% (p=0.000 n=10+9) EfaceCmpDiffIndirect-16 209ns ± 0% 211ns ± 0% +1.14% (p=0.000 n=10+9) Defer-16 3.40ns ± 1% 3.04ns ± 0% -10.67% (p=0.000 n=10+10) Defer10-16 29.4ns ± 2% 27.2ns ± 3% -7.26% (p=0.000 n=10+10) DeferMany-16 110ns ± 6% 113ns ± 2% +3.45% (p=0.017 n=9+9) PanicRecover-16 67.6ns ± 0% 67.7ns ± 2% ~ (p=0.436 n=9+9) GoroutineProfile/small-nil/idle-16 3.90µs ± 4% 3.86µs ± 2% ~ (p=0.305 n=10+9) GoroutineProfile/small-nil/loaded-16 4.82µs ± 6% 4.82µs ± 4% ~ (p=0.905 n=10+9) GoroutineProfile/small/idle-16 103µs ± 3% 102µs ± 3% ~ (p=0.113 n=9+9) GoroutineProfile/small/loaded-16 432µs ± 5% 440µs ±13% ~ (p=0.604 n=9+10) GoroutineProfile/large-nil/idle-16 3.86µs ± 3% 3.82µs ± 3% ~ (p=0.210 n=10+10) GoroutineProfile/large-nil/loaded-16 4.90µs ± 2% 4.90µs ± 5% ~ (p=0.780 n=10+9) GoroutineProfile/large/idle-16 2.58ms ± 1% 2.52ms ± 1% -2.38% (p=0.000 n=10+10) GoroutineProfile/large/loaded-16 8.62ms ± 9% 8.90ms ±11% ~ (p=0.400 n=9+10) GoroutineProfile/sparse-nil/idle-16 3.85µs ± 4% 3.81µs ± 3% ~ (p=0.470 n=10+10) GoroutineProfile/sparse-nil/loaded-16 4.82µs ± 4% 4.69µs ± 5% ~ (p=0.052 n=10+10) GoroutineProfile/sparse/idle-16 102µs ± 4% 102µs ± 2% ~ (p=0.497 n=10+9) GoroutineProfile/sparse/loaded-16 438µs ± 7% 437µs ± 6% ~ (p=0.796 n=10+10) RWMutexUncontended-16 6.79ns ± 0% 6.78ns ± 0% ~ (p=0.228 n=10+8) RWMutexWrite100-16 85.4ns ± 0% 87.1ns ± 0% +2.00% (p=0.000 n=10+8) RWMutexWrite10-16 168ns ±25% 152ns ±11% ~ (p=0.063 n=10+10) RWMutexWorkWrite100-16 106ns ± 0% 106ns ± 3% ~ (p=0.136 n=10+10) RWMutexWorkWrite10-16 567ns ± 3% 571ns ± 1% ~ (p=0.326 n=10+9) SemTable/OneAddrCollision/n=1000-16 15.9µs ± 1% 16.0µs ± 1% +0.50% (p=0.031 n=9+9) SemTable/ManyAddrCollision/n=1000-16 56.2µs ± 1% 56.8µs ± 1% +1.06% (p=0.000 n=10+10) SemTable/OneAddrCollision/n=2000-16 32.6µs ± 2% 32.9µs ± 4% ~ (p=0.156 n=10+9) SemTable/ManyAddrCollision/n=2000-16 118µs ± 0% 119µs ± 0% +0.75% (p=0.000 n=9+10) SemTable/OneAddrCollision/n=4000-16 65.3µs ± 1% 65.6µs ± 3% ~ (p=0.497 n=9+10) SemTable/ManyAddrCollision/n=4000-16 245µs ± 0% 248µs ± 2% +1.36% (p=0.000 n=9+10) SemTable/OneAddrCollision/n=8000-16 131µs ± 1% 130µs ± 1% -1.01% (p=0.002 n=9+10) SemTable/ManyAddrCollision/n=8000-16 503µs ± 1% 508µs ± 0% +0.97% (p=0.000 n=10+10) MakeSliceCopy/mallocmove/Byte-16 67.6ns ± 1% 64.1ns ± 2% -5.20% (p=0.000 n=10+10) MakeSliceCopy/mallocmove/Int-16 65.0ns ± 7% 61.7ns ± 4% -5.08% (p=0.009 n=10+10) MakeSliceCopy/mallocmove/Ptr-16 88.1ns ± 1% 79.9ns ± 1% -9.29% (p=0.000 n=10+10) MakeSliceCopy/makecopy/Byte-16 65.2ns ± 6% 63.4ns ± 0% ~ (p=0.500 n=10+8) MakeSliceCopy/makecopy/Int-16 63.2ns ± 1% 64.1ns ± 1% +1.34% (p=0.001 n=9+9) MakeSliceCopy/makecopy/Ptr-16 88.1ns ± 1% 80.1ns ± 1% -9.09% (p=0.000 n=10+10) MakeSliceCopy/nilappend/Byte-16 69.8ns ± 1% 65.7ns ± 3% -5.80% (p=0.000 n=10+10) MakeSliceCopy/nilappend/Int-16 69.6ns ± 2% 67.2ns ± 1% -3.50% (p=0.000 n=10+9) MakeSliceCopy/nilappend/Ptr-16 91.5ns ± 1% 83.8ns ± 1% -8.42% (p=0.000 n=9+10) MakeSlice/Byte-16 6.64ns ± 3% 6.58ns ± 2% ~ (p=0.393 n=10+10) MakeSlice/Int16-16 8.60ns ± 1% 8.38ns ± 3% -2.48% (p=0.001 n=9+10) MakeSlice/Int-16 17.7ns ± 3% 16.9ns ± 1% -4.67% (p=0.000 n=10+9) MakeSlice/Ptr-16 24.0ns ± 3% 23.3ns ± 2% -3.25% (p=0.000 n=10+9) MakeSlice/Struct/24-16 34.1ns ± 1% 32.0ns ± 1% -6.11% (p=0.000 n=10+10) MakeSlice/Struct/32-16 39.1ns ± 4% 38.2ns ± 1% ~ (p=0.829 n=10+8) MakeSlice/Struct/40-16 47.0ns ± 5% 43.0ns ± 2% -8.55% (p=0.000 n=10+9) GrowSlice/Byte-16 15.3ns ± 3% 15.0ns ± 2% -1.75% (p=0.005 n=9+9) GrowSlice/Int16-16 18.9ns ± 2% 18.4ns ± 2% -2.71% (p=0.000 n=10+9) GrowSlice/Int-16 33.9ns ± 1% 32.2ns ± 1% -4.89% (p=0.000 n=10+9) GrowSlice/Ptr-16 45.3ns ± 2% 43.5ns ± 1% -4.12% (p=0.000 n=10+10) GrowSlice/Struct/24-16 61.9ns ± 2% 60.0ns ± 4% -3.10% (p=0.002 n=10+10) GrowSlice/Struct/32-16 79.9ns ± 2% 72.3ns ± 3% -9.58% (p=0.000 n=8+10) GrowSlice/Struct/40-16 97.1ns ± 7% 88.8ns ± 5% -8.49% (p=0.000 n=10+10) ExtendSlice/IntSlice-16 21.1ns ± 2% 20.3ns ± 2% -3.71% (p=0.000 n=10+10) ExtendSlice/PointerSlice-16 26.8ns ± 2% 26.3ns ± 2% -1.86% (p=0.004 n=10+10) ExtendSlice/NoGrow-16 1.23ns ± 0% 1.30ns ± 1% +5.03% (p=0.000 n=10+10) Append-16 4.58ns ± 1% 4.53ns ± 0% -1.11% (p=0.000 n=10+10) AppendGrowByte-16 1.46ms ± 8% 1.42ms ± 7% -3.24% (p=0.035 n=10+10) AppendGrowString-16 27.8ms ± 4% 27.2ms ± 5% ~ (p=0.052 n=10+10) AppendSlice/1Bytes-16 1.03ns ± 1% 1.04ns ± 1% ~ (p=0.303 n=10+10) AppendSlice/4Bytes-16 1.04ns ± 0% 1.05ns ± 0% +0.79% (p=0.000 n=9+10) AppendSlice/7Bytes-16 1.23ns ± 1% 1.24ns ± 0% +0.45% (p=0.001 n=10+10) AppendSlice/8Bytes-16 1.24ns ± 0% 1.24ns ± 0% ~ (p=0.183 n=10+10) AppendSlice/15Bytes-16 1.37ns ± 1% 1.43ns ± 1% +3.88% (p=0.000 n=10+10) AppendSlice/16Bytes-16 1.37ns ± 1% 1.42ns ± 1% +3.63% (p=0.000 n=9+10) AppendSlice/32Bytes-16 1.44ns ± 0% 1.47ns ± 1% +1.83% (p=0.000 n=10+10) AppendSliceLarge/1024Bytes-16 257ns ± 2% 234ns ± 1% -8.96% (p=0.000 n=8+9) AppendSliceLarge/4096Bytes-16 871ns ± 6% 812ns ± 1% -6.80% (p=0.000 n=10+10) AppendSliceLarge/16384Bytes-16 3.15µs ± 6% 3.04µs ± 5% ~ (p=0.052 n=10+10) AppendSliceLarge/65536Bytes-16 10.7µs ± 7% 10.8µs ± 2% ~ (p=0.278 n=10+9) AppendSliceLarge/262144Bytes-16 42.9µs ± 1% 39.6µs ± 5% -7.75% (p=0.000 n=9+10) AppendSliceLarge/1048576Bytes-16 147µs ± 4% 144µs ± 4% -2.21% (p=0.035 n=10+10) AppendStr/1Bytes-16 1.20ns ± 0% 1.20ns ± 0% ~ (p=0.755 n=10+10) AppendStr/4Bytes-16 1.13ns ± 0% 1.14ns ± 1% +1.20% (p=0.000 n=10+10) AppendStr/8Bytes-16 1.24ns ± 0% 1.25ns ± 0% +0.93% (p=0.000 n=10+10) AppendStr/16Bytes-16 1.40ns ± 0% 1.42ns ± 0% +2.10% (p=0.000 n=9+10) AppendStr/32Bytes-16 1.44ns ± 0% 1.45ns ± 0% +0.99% (p=0.000 n=10+10) AppendSpecialCase-16 8.64ns ± 1% 8.89ns ± 2% +2.90% (p=0.000 n=10+10) Copy/1Byte-16 1.24ns ± 1% 1.24ns ± 0% -0.28% (p=0.000 n=10+6) Copy/1String-16 1.24ns ± 0% 1.23ns ± 0% ~ (p=0.160 n=10+10) Copy/2Byte-16 1.24ns ± 0% 1.24ns ± 0% ~ (p=0.115 n=10+10) Copy/2String-16 1.24ns ± 0% 1.24ns ± 1% ~ (p=0.954 n=10+10) Copy/4Byte-16 1.24ns ± 0% 1.24ns ± 0% -0.44% (p=0.001 n=10+10) Copy/4String-16 1.23ns ± 0% 1.23ns ± 0% ~ (p=0.081 n=10+10) Copy/8Byte-16 1.37ns ± 0% 1.34ns ± 0% -1.79% (p=0.000 n=9+9) Copy/8String-16 1.34ns ± 0% 1.34ns ± 0% -0.58% (p=0.000 n=9+10) Copy/12Byte-16 1.44ns ± 0% 1.44ns ± 0% ~ (p=0.149 n=9+9) Copy/12String-16 1.44ns ± 0% 1.45ns ± 0% ~ (p=0.124 n=9+9) Copy/16Byte-16 1.44ns ± 0% 1.44ns ± 0% -0.19% (p=0.004 n=10+9) Copy/16String-16 1.44ns ± 0% 1.45ns ± 0% +0.30% (p=0.008 n=10+10) Copy/32Byte-16 1.63ns ± 1% 1.62ns ± 1% -0.72% (p=0.002 n=10+10) Copy/32String-16 1.60ns ± 1% 1.64ns ± 0% +2.23% (p=0.000 n=10+10) Copy/128Byte-16 2.06ns ± 0% 2.06ns ± 0% ~ (p=0.757 n=9+10) Copy/128String-16 2.07ns ± 0% 2.07ns ± 0% +0.36% (p=0.004 n=10+10) Copy/1024Byte-16 6.07ns ± 2% 6.00ns ± 1% -1.20% (p=0.000 n=9+10) Copy/1024String-16 6.05ns ± 0% 5.95ns ± 1% -1.54% (p=0.000 n=10+9) AppendInPlace/NoGrow/Byte-16 288ns ± 1% 284ns ± 1% -1.58% (p=0.000 n=10+10) AppendInPlace/NoGrow/1Ptr-16 844ns ± 1% 809ns ± 3% -4.13% (p=0.000 n=9+10) AppendInPlace/NoGrow/2Ptr-16 1.47µs ± 1% 1.46µs ± 1% ~ (p=0.388 n=9+10) AppendInPlace/NoGrow/3Ptr-16 1.87µs ± 7% 1.91µs ± 1% ~ (p=0.166 n=10+8) AppendInPlace/NoGrow/4Ptr-16 2.66µs ± 1% 2.67µs ± 3% ~ (p=0.968 n=9+10) AppendInPlace/Grow/Byte-16 126ns ± 2% 121ns ± 2% -4.06% (p=0.000 n=10+10) AppendInPlace/Grow/1Ptr-16 132ns ± 2% 127ns ± 2% -4.28% (p=0.000 n=10+9) AppendInPlace/Grow/2Ptr-16 196ns ± 2% 188ns ± 1% -4.20% (p=0.000 n=10+8) AppendInPlace/Grow/3Ptr-16 264ns ± 1% 260ns ± 1% -1.51% (p=0.000 n=9+10) AppendInPlace/Grow/4Ptr-16 297ns ± 2% 294ns ± 2% ~ (p=0.085 n=10+10) StackCopyPtr-16 36.4ms ± 2% 36.7ms ± 2% ~ (p=0.481 n=10+10) StackCopy-16 33.9ms ± 3% 32.6ms ± 1% -3.87% (p=0.000 n=10+8) StackCopyNoCache-16 1.00ms ± 5% 1.01ms ± 5% ~ (p=0.143 n=10+10) StackCopyWithStkobj-16 11.0ms ± 3% 10.9ms ± 4% ~ (p=0.579 n=10+10) Issue18138-16 49.2µs ± 5% 49.0µs ± 4% ~ (p=1.000 n=10+9) CompareStringEqual-16 1.39ns ± 1% 1.45ns ± 2% +3.80% (p=0.000 n=8+10) CompareStringIdentical-16 0.55ns ± 1% 0.55ns ± 0% +0.42% (p=0.007 n=10+10) CompareStringSameLength-16 1.03ns ± 0% 1.03ns ± 0% ~ (p=0.430 n=9+10) CompareStringDifferentLength-16 0.11ns ± 2% 0.11ns ± 3% ~ (p=0.139 n=9+10) CompareStringBigUnaligned-16 23.9µs ± 1% 24.0µs ± 1% ~ (p=0.370 n=9+8) CompareStringBig-16 22.0µs ± 3% 22.2µs ± 3% ~ (p=0.243 n=9+10) ConcatStringAndBytes-16 10.7ns ± 1% 10.0ns ± 2% -6.33% (p=0.000 n=10+10) SliceByteToString/1-16 1.34ns ± 0% 1.34ns ± 0% ~ (p=0.057 n=10+10) SliceByteToString/2-16 6.67ns ± 2% 6.60ns ± 3% ~ (p=0.101 n=10+10) SliceByteToString/4-16 7.76ns ± 2% 7.56ns ± 3% -2.59% (p=0.001 n=10+10) SliceByteToString/8-16 9.81ns ± 4% 9.57ns ± 2% -2.48% (p=0.005 n=10+10) SliceByteToString/16-16 14.0ns ± 3% 13.7ns ± 2% -2.31% (p=0.009 n=10+10) SliceByteToString/32-16 17.3ns ± 1% 16.7ns ± 2% -3.41% (p=0.000 n=10+10) SliceByteToString/64-16 25.1ns ± 1% 24.1ns ± 2% -3.93% (p=0.000 n=9+10) SliceByteToString/128-16 38.6ns ± 1% 36.5ns ± 1% -5.60% (p=0.000 n=10+10) RuneCount/lenruneslice/ASCII-16 4.12ns ± 0% 4.11ns ± 0% ~ (p=0.382 n=10+10) RuneCount/lenruneslice/Japanese-16 25.4ns ± 2% 25.6ns ± 2% ~ (p=0.138 n=9+10) RuneCount/lenruneslice/MixedLength-16 17.1ns ± 0% 17.2ns ± 0% +0.59% (p=0.000 n=9+9) RuneCount/rangeloop/ASCII-16 3.30ns ± 1% 3.29ns ± 0% ~ (p=0.267 n=10+10) RuneCount/rangeloop/Japanese-16 20.1ns ± 1% 24.9ns ± 1% +24.31% (p=0.000 n=9+9) RuneCount/rangeloop/MixedLength-16 16.5ns ± 1% 16.7ns ± 1% +1.34% (p=0.000 n=10+10) RuneCount/utf8.RuneCountInString/ASCII-16 5.71ns ± 1% 5.73ns ± 2% ~ (p=0.579 n=10+10) RuneCount/utf8.RuneCountInString/Japanese-16 22.0ns ± 6% 18.4ns ± 3% -16.41% (p=0.000 n=9+10) RuneCount/utf8.RuneCountInString/MixedLength-16 15.0ns ± 1% 14.9ns ± 1% -1.01% (p=0.004 n=9+10) RuneIterate/range/ASCII-16 2.69ns ± 1% 2.72ns ± 0% +0.94% (p=0.026 n=10+9) RuneIterate/range/Japanese-16 24.5ns ± 2% 25.3ns ± 2% +3.23% (p=0.000 n=9+10) RuneIterate/range/MixedLength-16 17.0ns ± 1% 17.1ns ± 1% +0.85% (p=0.000 n=10+10) RuneIterate/range1/ASCII-16 2.70ns ± 1% 2.72ns ± 0% ~ (p=0.058 n=9+9) RuneIterate/range1/Japanese-16 24.1ns ± 2% 25.2ns ± 3% +4.30% (p=0.000 n=10+10) RuneIterate/range1/MixedLength-16 16.9ns ± 1% 17.7ns ± 0% +5.04% (p=0.000 n=10+8) RuneIterate/range2/ASCII-16 2.84ns ± 8% 2.72ns ± 1% -4.28% (p=0.003 n=10+9) RuneIterate/range2/Japanese-16 22.7ns ± 4% 25.2ns ± 3% +10.97% (p=0.000 n=10+10) RuneIterate/range2/MixedLength-16 17.0ns ± 1% 17.2ns ± 0% +0.95% (p=0.000 n=10+10) ArrayEqual-16 0.40ns ± 5% 0.35ns ± 2% -11.83% (p=0.000 n=10+10) Func/Name-16 8.05ns ± 1% 8.09ns ± 1% +0.40% (p=0.025 n=8+10) Func/Entry-16 1.73ns ± 1% 1.66ns ± 1% -3.93% (p=0.000 n=10+10) Func/FileLine-16 27.5ns ± 2% 26.0ns ± 0% -5.50% (p=0.000 n=10+10) [Geo mean] 16.7ns 15.7ns -6.08% name old speed new speed delta SetTypePtr-16 11.0GB/s ± 1% 11.0GB/s ± 3% ~ (p=0.684 n=10+10) SetTypePtr8-16 15.5GB/s ± 0% 15.5GB/s ± 0% ~ (p=0.123 n=10+10) SetTypePtr16-16 31.0GB/s ± 1% 31.1GB/s ± 0% ~ (p=0.123 n=10+10) SetTypePtr32-16 62.1GB/s ± 0% 62.2GB/s ± 0% ~ (p=0.123 n=10+10) SetTypePtr64-16 124GB/s ± 0% 124GB/s ± 0% ~ (p=0.684 n=10+10) SetTypePtr126-16 146GB/s ± 0% 146GB/s ± 0% ~ (p=0.481 n=10+10) SetTypePtr128-16 154GB/s ± 0% 154GB/s ± 0% ~ (p=0.243 n=9+10) SetTypePtrSlice-16 151GB/s ± 1% 151GB/s ± 1% ~ (p=0.497 n=9+10) SetTypeNode1-16 5.82GB/s ± 1% 5.82GB/s ± 0% ~ (p=0.353 n=10+10) SetTypeNode1Slice-16 76.1GB/s ± 1% 77.0GB/s ± 1% +1.19% (p=0.000 n=10+10) SetTypeNode8-16 19.4GB/s ± 0% 19.4GB/s ± 0% ~ (p=0.130 n=8+8) SetTypeNode8Slice-16 113GB/s ± 0% 113GB/s ± 0% ~ (p=0.604 n=10+9) SetTypeNode64-16 76.5GB/s ± 0% 76.5GB/s ± 0% ~ (p=0.190 n=10+10) SetTypeNode64Slice-16 97.8GB/s ± 0% 97.7GB/s ± 0% ~ (p=0.549 n=9+10) SetTypeNode64Dead-16 95.5GB/s ± 0% 95.7GB/s ± 0% ~ (p=0.118 n=10+6) SetTypeNode64DeadSlice-16 112GB/s ± 0% 112GB/s ± 0% ~ (p=0.353 n=10+10) SetTypeNode124-16 146GB/s ± 0% 146GB/s ± 0% ~ (p=0.853 n=10+10) SetTypeNode124Slice-16 146GB/s ± 5% 149GB/s ± 0% ~ (p=0.315 n=10+10) SetTypeNode126-16 154GB/s ± 0% 154GB/s ± 0% ~ (p=0.356 n=10+9) SetTypeNode126Slice-16 150GB/s ± 0% 150GB/s ± 0% ~ (p=0.095 n=9+10) SetTypeNode128-16 107GB/s ± 0% 107GB/s ± 0% +0.31% (p=0.003 n=9+10) SetTypeNode128Slice-16 119GB/s ± 0% 120GB/s ± 0% ~ (p=0.156 n=10+9) SetTypeNode130-16 108GB/s ± 0% 108GB/s ± 0% +0.33% (p=0.002 n=10+10) SetTypeNode130Slice-16 119GB/s ± 0% 119GB/s ± 0% ~ (p=0.739 n=10+10) SetTypeNode1024-16 160GB/s ± 0% 159GB/s ± 1% ~ (p=0.113 n=9+9) SetTypeNode1024Slice-16 144GB/s ± 0% 144GB/s ± 0% ~ (p=0.063 n=10+10) Hash5-16 2.59GB/s ± 1% 2.49GB/s ± 0% -3.90% (p=0.000 n=10+9) Hash16-16 7.85GB/s ± 1% 7.23GB/s ± 1% -7.92% (p=0.000 n=10+10) Hash64-16 24.0GB/s ± 0% 23.9GB/s ± 0% ~ (p=0.190 n=9+9) Hash1024-16 62.4GB/s ± 0% 62.3GB/s ± 0% -0.16% (p=0.017 n=9+10) Hash65536-16 74.0GB/s ± 0% 74.0GB/s ± 0% ~ (p=0.796 n=10+10) Memmove/1-16 1.08GB/s ± 0% 1.08GB/s ± 0% -0.21% (p=0.035 n=10+10) Memmove/2-16 2.16GB/s ± 0% 2.15GB/s ± 0% ~ (p=0.105 n=10+10) Memmove/3-16 3.24GB/s ± 1% 3.22GB/s ± 1% -0.49% (p=0.004 n=10+10) Memmove/4-16 3.89GB/s ± 0% 3.89GB/s ± 0% ~ (p=0.218 n=10+10) Memmove/5-16 4.42GB/s ± 0% 4.42GB/s ± 0% ~ (p=0.075 n=10+10) Memmove/6-16 5.31GB/s ± 0% 5.29GB/s ± 1% ~ (p=0.218 n=10+10) Memmove/7-16 6.19GB/s ± 0% 6.18GB/s ± 0% -0.15% (p=0.035 n=10+9) Memmove/8-16 7.07GB/s ± 0% 7.07GB/s ± 0% ~ (p=0.684 n=10+10) Memmove/9-16 7.22GB/s ± 0% 6.68GB/s ± 0% -7.37% (p=0.000 n=10+10) Memmove/10-16 8.02GB/s ± 0% 7.43GB/s ± 0% -7.38% (p=0.000 n=9+9) Memmove/11-16 8.83GB/s ± 0% 8.13GB/s ± 0% -7.87% (p=0.000 n=10+9) Memmove/12-16 9.62GB/s ± 0% 8.89GB/s ± 1% -7.61% (p=0.000 n=10+10) Memmove/13-16 10.4GB/s ± 0% 9.7GB/s ± 0% -7.20% (p=0.000 n=10+10) Memmove/14-16 11.2GB/s ± 0% 10.4GB/s ± 1% -7.64% (p=0.000 n=10+9) Memmove/15-16 12.0GB/s ± 0% 11.1GB/s ± 0% -7.46% (p=0.000 n=10+9) Memmove/16-16 12.8GB/s ± 0% 11.8GB/s ± 1% -7.67% (p=0.000 n=10+10) Memmove/32-16 23.8GB/s ± 0% 23.5GB/s ± 1% -1.20% (p=0.000 n=10+10) Memmove/64-16 44.2GB/s ± 0% 39.1GB/s ± 0% -11.56% (p=0.000 n=10+9) Memmove/128-16 68.7GB/s ± 0% 63.2GB/s ± 0% -7.95% (p=0.000 n=10+10) Memmove/256-16 104GB/s ± 0% 103GB/s ± 0% -1.13% (p=0.000 n=10+10) Memmove/512-16 129GB/s ± 1% 129GB/s ± 0% ~ (p=0.165 n=10+10) Memmove/1024-16 174GB/s ± 1% 174GB/s ± 1% ~ (p=0.258 n=9+9) Memmove/2048-16 213GB/s ± 1% 213GB/s ± 2% ~ (p=0.963 n=8+9) Memmove/4096-16 250GB/s ± 1% 240GB/s ± 4% -3.83% (p=0.006 n=9+9) MemmoveOverlap/32-16 19.8GB/s ± 1% 19.1GB/s ± 1% -3.40% (p=0.000 n=10+10) MemmoveOverlap/64-16 39.0GB/s ± 0% 38.8GB/s ± 0% -0.28% (p=0.001 n=9+9) MemmoveOverlap/128-16 62.2GB/s ± 0% 62.1GB/s ± 0% ~ (p=0.063 n=10+10) MemmoveOverlap/256-16 96.0GB/s ± 0% 95.8GB/s ± 0% -0.26% (p=0.009 n=10+10) MemmoveOverlap/512-16 83.6GB/s ±16% 89.2GB/s ± 0% ~ (p=0.696 n=10+8) MemmoveOverlap/1024-16 141GB/s ± 0% 140GB/s ± 0% -0.28% (p=0.006 n=8+10) MemmoveOverlap/2048-16 172GB/s ± 0% 171GB/s ± 1% -0.38% (p=0.008 n=9+9) MemmoveOverlap/4096-16 176GB/s ± 1% 177GB/s ± 1% +0.84% (p=0.001 n=8+10) MemmoveUnalignedDst/1-16 806MB/s ± 0% 802MB/s ± 1% -0.52% (p=0.023 n=10+10) MemmoveUnalignedDst/2-16 1.62GB/s ± 0% 1.62GB/s ± 0% -0.11% (p=0.041 n=10+10) MemmoveUnalignedDst/3-16 2.43GB/s ± 0% 2.43GB/s ± 0% -0.14% (p=0.006 n=9+9) MemmoveUnalignedDst/4-16 3.24GB/s ± 0% 3.23GB/s ± 1% -0.36% (p=0.007 n=10+10) MemmoveUnalignedDst/5-16 3.71GB/s ± 0% 3.71GB/s ± 0% ~ (p=0.063 n=10+10) MemmoveUnalignedDst/6-16 4.48GB/s ± 0% 4.47GB/s ± 0% ~ (p=0.912 n=10+10) MemmoveUnalignedDst/7-16 5.22GB/s ± 0% 5.22GB/s ± 0% ~ (p=1.000 n=10+10) MemmoveUnalignedDst/8-16 5.95GB/s ± 0% 5.93GB/s ± 1% -0.40% (p=0.023 n=10+10) MemmoveUnalignedDst/9-16 6.24GB/s ± 0% 6.24GB/s ± 0% ~ (p=0.912 n=10+10) MemmoveUnalignedDst/10-16 6.94GB/s ± 0% 6.94GB/s ± 0% ~ (p=0.353 n=10+10) MemmoveUnalignedDst/11-16 7.64GB/s ± 0% 7.63GB/s ± 0% ~ (p=0.393 n=10+10) MemmoveUnalignedDst/12-16 8.33GB/s ± 0% 8.33GB/s ± 0% ~ (p=0.971 n=10+10) MemmoveUnalignedDst/13-16 9.02GB/s ± 0% 9.01GB/s ± 0% ~ (p=0.436 n=10+10) MemmoveUnalignedDst/14-16 9.71GB/s ± 0% 9.71GB/s ± 0% ~ (p=0.280 n=10+10) MemmoveUnalignedDst/15-16 10.4GB/s ± 0% 10.4GB/s ± 1% ~ (p=0.853 n=10+10) MemmoveUnalignedDst/16-16 11.1GB/s ± 0% 11.1GB/s ± 0% ~ (p=0.089 n=10+10) MemmoveUnalignedDst/32-16 19.7GB/s ± 1% 19.6GB/s ± 0% ~ (p=0.075 n=10+10) MemmoveUnalignedDst/64-16 38.9GB/s ± 0% 38.8GB/s ± 0% ~ (p=0.218 n=10+10) MemmoveUnalignedDst/128-16 62.1GB/s ± 0% 62.1GB/s ± 0% ~ (p=0.549 n=10+9) MemmoveUnalignedDst/256-16 69.4GB/s ± 0% 69.3GB/s ± 0% ~ (p=0.105 n=10+10) MemmoveUnalignedDst/512-16 124GB/s ± 1% 124GB/s ± 0% ~ (p=0.762 n=10+8) MemmoveUnalignedDst/1024-16 136GB/s ± 0% 136GB/s ± 1% ~ (p=0.666 n=9+9) MemmoveUnalignedDst/2048-16 159GB/s ± 0% 159GB/s ± 0% ~ (p=0.574 n=8+8) MemmoveUnalignedDst/4096-16 161GB/s ± 0% 161GB/s ± 0% ~ (p=1.000 n=9+9) MemmoveUnalignedDstOverlap/32-16 7.84GB/s ± 0% 7.83GB/s ± 0% ~ (p=0.353 n=10+10) MemmoveUnalignedDstOverlap/64-16 14.0GB/s ± 0% 14.0GB/s ± 0% ~ (p=0.661 n=10+9) MemmoveUnalignedDstOverlap/128-16 27.4GB/s ± 0% 27.4GB/s ± 0% ~ (p=0.353 n=10+10) MemmoveUnalignedDstOverlap/256-16 50.4GB/s ± 0% 50.4GB/s ± 0% ~ (p=0.156 n=10+9) MemmoveUnalignedDstOverlap/512-16 60.7GB/s ± 4% 62.5GB/s ± 0% +3.07% (p=0.022 n=10+9) MemmoveUnalignedDstOverlap/1024-16 107GB/s ± 0% 107GB/s ± 0% ~ (p=0.234 n=8+8) MemmoveUnalignedDstOverlap/2048-16 146GB/s ± 0% 146GB/s ± 1% ~ (p=0.182 n=10+9) MemmoveUnalignedDstOverlap/4096-16 155GB/s ± 0% 155GB/s ± 0% ~ (p=0.400 n=10+9) MemmoveUnalignedSrc/1-16 882MB/s ± 0% 884MB/s ± 1% +0.24% (p=0.033 n=10+9) MemmoveUnalignedSrc/2-16 1.76GB/s ± 1% 1.77GB/s ± 0% +0.27% (p=0.028 n=10+9) MemmoveUnalignedSrc/3-16 2.43GB/s ± 0% 2.43GB/s ± 0% +0.26% (p=0.027 n=9+10) MemmoveUnalignedSrc/4-16 3.24GB/s ± 0% 3.24GB/s ± 1% ~ (p=0.079 n=9+10) MemmoveUnalignedSrc/5-16 3.73GB/s ± 0% 3.73GB/s ± 1% ~ (p=0.829 n=8+10) MemmoveUnalignedSrc/6-16 4.47GB/s ± 0% 4.49GB/s ± 0% +0.39% (p=0.000 n=10+10) MemmoveUnalignedSrc/7-16 5.22GB/s ± 0% 5.23GB/s ± 0% ~ (p=0.280 n=10+10) MemmoveUnalignedSrc/8-16 5.95GB/s ± 0% 5.98GB/s ± 0% +0.39% (p=0.001 n=10+9) MemmoveUnalignedSrc/9-16 6.24GB/s ± 0% 6.25GB/s ± 0% ~ (p=0.549 n=10+9) MemmoveUnalignedSrc/10-16 6.93GB/s ± 0% 6.94GB/s ± 0% ~ (p=0.604 n=10+9) MemmoveUnalignedSrc/11-16 7.63GB/s ± 0% 7.63GB/s ± 1% ~ (p=0.353 n=10+10) MemmoveUnalignedSrc/12-16 8.32GB/s ± 0% 8.32GB/s ± 0% ~ (p=0.218 n=10+10) MemmoveUnalignedSrc/13-16 9.02GB/s ± 0% 9.00GB/s ± 1% ~ (p=0.684 n=10+10) MemmoveUnalignedSrc/14-16 9.71GB/s ± 0% 9.71GB/s ± 0% ~ (p=0.739 n=10+10) MemmoveUnalignedSrc/15-16 10.4GB/s ± 0% 10.4GB/s ± 0% ~ (p=0.353 n=10+10) MemmoveUnalignedSrc/16-16 11.1GB/s ± 1% 11.1GB/s ± 0% ~ (p=0.579 n=10+10) MemmoveUnalignedSrc/32-16 20.0GB/s ± 1% 20.0GB/s ± 0% ~ (p=0.631 n=10+10) MemmoveUnalignedSrc/64-16 38.8GB/s ± 0% 38.8GB/s ± 0% ~ (p=0.579 n=10+10) MemmoveUnalignedSrc/128-16 61.2GB/s ± 0% 61.2GB/s ± 0% ~ (p=0.780 n=10+9) MemmoveUnalignedSrc/256-16 94.8GB/s ± 0% 92.2GB/s ± 1% -2.73% (p=0.000 n=10+10) MemmoveUnalignedSrc/512-16 119GB/s ± 0% 119GB/s ± 0% +0.26% (p=0.027 n=8+9) MemmoveUnalignedSrc/1024-16 141GB/s ± 0% 142GB/s ± 1% +1.07% (p=0.000 n=8+10) MemmoveUnalignedSrc/2048-16 157GB/s ± 0% 157GB/s ± 0% ~ (p=0.167 n=9+8) MemmoveUnalignedSrc/4096-16 161GB/s ± 0% 162GB/s ± 1% ~ (p=0.063 n=10+10) MemmoveUnalignedSrcOverlap/32-16 7.93GB/s ± 0% 7.88GB/s ± 0% -0.63% (p=0.000 n=9+10) MemmoveUnalignedSrcOverlap/64-16 15.5GB/s ± 0% 15.5GB/s ± 0% ~ (p=0.529 n=10+10) MemmoveUnalignedSrcOverlap/128-16 28.3GB/s ± 0% 28.3GB/s ± 0% ~ (p=0.218 n=10+10) MemmoveUnalignedSrcOverlap/256-16 41.5GB/s ± 0% 41.6GB/s ± 0% +0.35% (p=0.000 n=10+9) MemmoveUnalignedSrcOverlap/512-16 68.9GB/s ± 0% 68.8GB/s ± 0% ~ (p=0.541 n=9+8) MemmoveUnalignedSrcOverlap/1024-16 115GB/s ± 0% 115GB/s ± 0% ~ (p=0.382 n=8+8) MemmoveUnalignedSrcOverlap/2048-16 155GB/s ± 0% 144GB/s ±18% ~ (p=0.101 n=8+10) MemmoveUnalignedSrcOverlap/4096-16 160GB/s ± 0% 160GB/s ± 1% ~ (p=0.605 n=9+9) Memclr/5-16 5.81GB/s ± 1% 5.80GB/s ± 2% ~ (p=0.546 n=9+9) Memclr/16-16 15.5GB/s ± 0% 15.4GB/s ± 0% -0.32% (p=0.008 n=9+10) Memclr/64-16 51.8GB/s ± 0% 50.7GB/s ± 0% -2.22% (p=0.000 n=10+10) Memclr/256-16 113GB/s ± 0% 113GB/s ± 0% ~ (p=0.143 n=10+10) Memclr/4096-16 239GB/s ± 1% 237GB/s ± 0% -0.87% (p=0.000 n=10+10) Memclr/65536-16 79.8GB/s ± 0% 79.7GB/s ± 0% ~ (p=0.529 n=10+10) Memclr/1M-16 74.6GB/s ± 1% 74.7GB/s ± 1% ~ (p=0.529 n=10+10) Memclr/4M-16 48.7GB/s ± 1% 48.8GB/s ± 0% ~ (p=0.123 n=10+10) Memclr/8M-16 48.2GB/s ± 2% 48.6GB/s ± 0% ~ (p=0.408 n=10+8) Memclr/16M-16 43.6GB/s ± 4% 43.3GB/s ± 0% ~ (p=0.173 n=10+8) Memclr/64M-16 30.7GB/s ± 0% 30.7GB/s ± 0% ~ (p=0.113 n=10+9) GoMemclr/5-16 6.07GB/s ± 0% 6.08GB/s ± 0% ~ (p=0.367 n=9+10) GoMemclr/16-16 15.6GB/s ± 0% 15.6GB/s ± 0% -0.22% (p=0.004 n=9+9) GoMemclr/64-16 56.1GB/s ± 0% 56.1GB/s ± 0% ~ (p=0.968 n=10+9) GoMemclr/256-16 125GB/s ± 0% 124GB/s ± 0% ~ (p=0.912 n=10+10) MemclrRange/1K_2K-16 210GB/s ± 0% 224GB/s ± 1% +6.81% (p=0.000 n=10+10) MemclrRange/2K_8K-16 228GB/s ± 0% 228GB/s ± 0% ~ (p=0.684 n=10+10) MemclrRange/4K_16K-16 279GB/s ± 0% 279GB/s ± 0% ~ (p=0.780 n=9+10) MemclrRange/160K_228K-16 80.3GB/s ± 0% 80.2GB/s ± 0% ~ (p=0.165 n=10+10) Copy/1Byte-16 808MB/s ± 1% 810MB/s ± 0% +0.28% (p=0.000 n=10+8) Copy/1String-16 810MB/s ± 0% 811MB/s ± 0% ~ (p=0.105 n=10+10) Copy/2Byte-16 1.62GB/s ± 0% 1.62GB/s ± 0% ~ (p=0.182 n=10+9) Copy/2String-16 1.62GB/s ± 1% 1.62GB/s ± 1% ~ (p=1.000 n=10+10) Copy/4Byte-16 3.22GB/s ± 0% 3.24GB/s ± 0% +0.46% (p=0.000 n=10+10) Copy/4String-16 3.24GB/s ± 0% 3.24GB/s ± 0% ~ (p=0.075 n=10+10) Copy/8Byte-16 5.86GB/s ± 0% 5.96GB/s ± 0% +1.82% (p=0.000 n=9+9) Copy/8String-16 5.95GB/s ± 0% 5.99GB/s ± 0% +0.59% (p=0.000 n=9+10) Copy/12Byte-16 8.32GB/s ± 0% 8.32GB/s ± 0% ~ (p=0.190 n=9+9) Copy/12String-16 8.31GB/s ± 0% 8.29GB/s ± 0% ~ (p=0.068 n=9+10) Copy/16Byte-16 11.1GB/s ± 0% 11.1GB/s ± 0% +0.18% (p=0.003 n=10+9) Copy/16String-16 11.1GB/s ± 0% 11.1GB/s ± 0% -0.31% (p=0.009 n=10+10) Copy/32Byte-16 19.6GB/s ± 1% 19.8GB/s ± 1% +0.72% (p=0.002 n=10+10) Copy/32String-16 20.0GB/s ± 0% 19.5GB/s ± 0% -2.19% (p=0.000 n=10+10) Copy/128Byte-16 62.2GB/s ± 0% 62.1GB/s ± 0% ~ (p=0.661 n=9+10) Copy/128String-16 61.9GB/s ± 0% 61.7GB/s ± 0% -0.35% (p=0.005 n=10+10) Copy/1024Byte-16 169GB/s ± 2% 171GB/s ± 1% +1.21% (p=0.000 n=9+10) Copy/1024String-16 169GB/s ± 0% 172GB/s ± 1% +1.57% (p=0.000 n=10+9) CompareStringBigUnaligned-16 43.8GB/s ± 1% 43.7GB/s ± 1% ~ (p=0.370 n=9+8) CompareStringBig-16 47.6GB/s ± 3% 47.3GB/s ± 3% ~ (p=0.243 n=9+10) [Geo mean] 25.3GB/s 28.0GB/s +10.66% name old p50-ns new p50-ns delta ReadMemStatsLatency-16 98.4k ±37% 112.7k ±64% ~ (p=0.436 n=10+10) ReadMetricsLatency-16 1.69k ± 3% 1.70k ± 2% ~ (p=0.646 n=9+10) GoroutineProfile/small-nil/idle-16 3.75k ± 4% 3.72k ± 1% ~ (p=0.447 n=10+9) GoroutineProfile/small-nil/loaded-16 4.33k ± 3% 4.31k ± 5% ~ (p=0.931 n=9+9) GoroutineProfile/small/idle-16 102k ± 3% 101k ± 4% ~ (p=0.113 n=9+9) GoroutineProfile/small/loaded-16 214k ± 3% 215k ± 6% ~ (p=0.842 n=9+10) GoroutineProfile/large-nil/idle-16 3.70k ± 2% 3.65k ± 2% ~ (p=0.075 n=10+10) GoroutineProfile/large-nil/loaded-16 4.36k ± 3% 4.31k ±10% ~ (p=0.631 n=10+10) GoroutineProfile/large/idle-16 2.56M ± 1% 2.51M ± 1% -2.28% (p=0.000 n=10+10) GoroutineProfile/large/loaded-16 6.77M ± 4% 6.85M ±19% ~ (p=0.536 n=7+10) GoroutineProfile/sparse-nil/idle-16 3.66k ± 1% 3.64k ± 2% ~ (p=0.136 n=9+9) GoroutineProfile/sparse-nil/loaded-16 4.25k ± 5% 4.15k ± 4% ~ (p=0.190 n=10+10) GoroutineProfile/sparse/idle-16 102k ± 4% 101k ± 3% ~ (p=0.447 n=10+9) GoroutineProfile/sparse/loaded-16 216k ± 4% 218k ± 4% ~ (p=0.549 n=9+10) [Geo mean] 35.8k 35.9k +0.35% name old p90-ns new p90-ns delta ReadMemStatsLatency-16 983k ±310% 200k ±34% -79.62% (p=0.034 n=10+8) ReadMetricsLatency-16 4.01k ±35% 3.75k ±17% ~ (p=0.315 n=10+10) GoroutineProfile/small-nil/idle-16 4.21k ± 4% 4.27k ± 8% ~ (p=0.968 n=9+10) GoroutineProfile/small-nil/loaded-16 5.58k ± 8% 5.35k ±12% ~ (p=0.190 n=10+10) GoroutineProfile/small/idle-16 108k ± 6% 107k ± 7% ~ (p=0.497 n=9+10) GoroutineProfile/small/loaded-16 450k ± 5% 432k ± 3% -3.92% (p=0.002 n=9+9) GoroutineProfile/large-nil/idle-16 4.13k ± 7% 4.04k ± 2% ~ (p=0.181 n=10+8) GoroutineProfile/large-nil/loaded-16 5.76k ± 4% 5.67k ± 5% ~ (p=0.190 n=10+10) GoroutineProfile/large/idle-16 2.63M ± 2% 2.58M ± 1% -1.97% (p=0.000 n=10+10) GoroutineProfile/large/loaded-16 16.9M ± 4% 17.0M ± 6% ~ (p=0.661 n=9+10) GoroutineProfile/sparse-nil/idle-16 4.21k ±10% 4.07k ± 7% ~ (p=0.128 n=10+10) GoroutineProfile/sparse-nil/loaded-16 5.55k ± 8% 5.38k ± 6% ~ (p=0.089 n=10+10) GoroutineProfile/sparse/idle-16 106k ± 4% 106k ± 3% ~ (p=0.661 n=10+9) GoroutineProfile/sparse/loaded-16 454k ± 6% 441k ± 6% -2.86% (p=0.043 n=10+10) [Geo mean] 58.4k 51.0k -12.61% name old p99-ns new p99-ns delta ReadMemStatsLatency-16 983k ±310% 200k ±34% -79.62% (p=0.034 n=10+8) ReadMetricsLatency-16 26.6k ±22% 26.6k ±17% ~ (p=0.971 n=10+10) GoroutineProfile/small-nil/idle-16 5.27k ±12% 5.19k ±12% ~ (p=0.579 n=10+10) GoroutineProfile/small-nil/loaded-16 7.28k ± 3% 7.04k ± 9% ~ (p=0.113 n=9+10) GoroutineProfile/small/idle-16 114k ± 6% 113k ± 6% ~ (p=0.604 n=9+10) GoroutineProfile/small/loaded-16 4.84M ±70% 6.06M ±131% ~ (p=0.842 n=9+10) GoroutineProfile/large-nil/idle-16 5.38k ±19% 5.26k ±13% ~ (p=0.912 n=10+10) GoroutineProfile/large-nil/loaded-16 7.38k ± 3% 7.23k ± 4% ~ (p=0.143 n=10+10) GoroutineProfile/large/idle-16 2.79M ± 5% 2.72M ± 3% ~ (p=0.089 n=10+10) GoroutineProfile/large/loaded-16 24.0M ±24% 24.9M ±29% ~ (p=0.684 n=10+10) GoroutineProfile/sparse-nil/idle-16 5.32k ±17% 5.49k ±17% ~ (p=0.684 n=10+10) GoroutineProfile/sparse-nil/loaded-16 7.25k ± 4% 6.97k ± 5% -3.90% (p=0.005 n=9+10) GoroutineProfile/sparse/idle-16 113k ± 5% 112k ± 6% ~ (p=0.631 n=10+10) GoroutineProfile/sparse/loaded-16 4.00M ±66% 4.26M ±65% ~ (p=0.489 n=9+9) [Geo mean] 107k 97k -9.55% name old alloc/op new alloc/op delta NewEmptyMap-16 0.00B 0.00B ~ (all equal) NewSmallMap-16 0.00B 0.00B ~ (all equal) MapPopulate/1-16 0.00B 0.00B ~ (all equal) MapPopulate/10-16 179B ± 0% 179B ± 0% ~ (all equal) MapPopulate/100-16 3.35kB ± 0% 3.35kB ± 0% ~ (p=0.294 n=10+8) MapPopulate/1000-16 53.3kB ± 0% 53.3kB ± 0% ~ (p=1.000 n=8+10) MapPopulate/10000-16 428kB ± 0% 428kB ± 0% ~ (p=0.469 n=10+10) MapPopulate/100000-16 3.62MB ± 0% 3.62MB ± 0% ~ (p=0.888 n=9+10) MapStringConversion/32/simple-16 0.00B 0.00B ~ (all equal) MapStringConversion/32/struct-16 0.00B 0.00B ~ (all equal) MapStringConversion/32/array-16 0.00B 0.00B ~ (all equal) MapStringConversion/64/simple-16 0.00B 0.00B ~ (all equal) MapStringConversion/64/struct-16 0.00B 0.00B ~ (all equal) MapStringConversion/64/array-16 0.00B 0.00B ~ (all equal) NewEmptyMapHintLessThan8-16 0.00B 0.00B ~ (all equal) NewEmptyMapHintGreaterThan8-16 1.15kB ± 0% 1.15kB ± 0% ~ (all equal) MapAppendAssign/Int32/256-16 41.7B ±15% 44.3B ±12% ~ (p=0.106 n=10+10) MapAppendAssign/Int32/65536-16 22.6B ± 6% 23.5B ± 6% +4.19% (p=0.025 n=9+10) MapAppendAssign/Int64/256-16 43.5B ±10% 42.9B ± 7% ~ (p=0.757 n=10+10) MapAppendAssign/Int64/65536-16 24.7B ± 7% 21.8B ± 6% -11.74% (p=0.000 n=10+10) MapAppendAssign/Str/256-16 87.6B ±10% 89.2B ± 9% ~ (p=0.379 n=10+10) MapAppendAssign/Str/65536-16 45.1B ±14% 47.2B ± 8% ~ (p=0.150 n=10+9) CreateGoroutinesCapture-16 144B ± 0% 144B ± 0% ~ (all equal) [Geo mean] 769B 770B +0.21% name old allocs/op new allocs/op delta NewEmptyMap-16 0.00 0.00 ~ (all equal) NewSmallMap-16 0.00 0.00 ~ (all equal) MapPopulate/1-16 0.00 0.00 ~ (all equal) MapPopulate/10-16 1.00 ± 0% 1.00 ± 0% ~ (all equal) MapPopulate/100-16 17.0 ± 0% 17.0 ± 0% ~ (all equal) MapPopulate/1000-16 73.0 ± 0% 73.0 ± 0% ~ (all equal) MapPopulate/10000-16 320 ± 0% 320 ± 0% ~ (p=1.000 n=10+10) MapPopulate/100000-16 4.00k ± 0% 4.00k ± 0% ~ (p=0.753 n=10+10) MapStringConversion/32/simple-16 0.00 0.00 ~ (all equal) MapStringConversion/32/struct-16 0.00 0.00 ~ (all equal) MapStringConversion/32/array-16 0.00 0.00 ~ (all equal) MapStringConversion/64/simple-16 0.00 0.00 ~ (all equal) MapStringConversion/64/struct-16 0.00 0.00 ~ (all equal) MapStringConversion/64/array-16 0.00 0.00 ~ (all equal) NewEmptyMapHintLessThan8-16 0.00 0.00 ~ (all equal) NewEmptyMapHintGreaterThan8-16 1.00 ± 0% 1.00 ± 0% ~ (all equal) MapAppendAssign/Int32/256-16 0.00 0.00 ~ (all equal) MapAppendAssign/Int32/65536-16 0.00 0.00 ~ (all equal) MapAppendAssign/Int64/256-16 0.00 0.00 ~ (all equal) MapAppendAssign/Int64/65536-16 0.00 0.00 ~ (all equal) MapAppendAssign/Str/256-16 0.00 0.00 ~ (all equal) MapAppendAssign/Str/65536-16 0.00 0.00 ~ (all equal) CreateGoroutinesCapture-16 5.00 ± 0% 5.00 ± 0% ~ (all equal) [Geo mean] 26.0 26.0 +0.00% Change-Id: I5fb03e93df8b380e04795afbdcd1c94aeeecacc6 Reviewed-on: https://go-review.googlesource.com/c/go/+/454255 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com> Run-TryBot: Jakub Ciolek <jakub@ciolek.dev> TryBot-Result: Gopher Robot <gobot@golang.org> |
||
![]() |
fc814056aa |
cmd/compile: rewrite empty makeslice to zerobase pointer
make\(\[\][a-zA-Z0-9]+, 0\) is seen 52 times in the go source. And at least 391 times on internet: https://grep.app/search?q=make%5C%28%5C%5B%5C%5D%5Ba-zA-Z0-9%5D%2B%2C%200%5C%29®exp=true This used to compile to calling runtime.makeslice. However we can copy what we do for []T{}, just use a zerobase pointer. On my machine this is 10x faster (from 3ns to 0.3ns). Note that an empty loop also runs in 0.3ns, so this really is free when you count superscallar execution. Change-Id: I1cfe7e69f5a7a4dabbc71912ce6a4f8a2d4a7f3c Reviewed-on: https://go-review.googlesource.com/c/go/+/454036 Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Keith Randall <khr@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Jakub Ciolek <jakub@ciolek.dev> |
||
![]() |
47a0d46716 |
cmd/compile/internal/ssa: generate code via a //go:generate directive
The standard way to generate code in a Go package is via //go:generate directives, which are invoked by the developer explicitly running: go generate import/path/of/said/package Switch to using that approach here. This way, developers don't need to learn and remember a custom way that each particular Go package may choose to implement its code generation. It also enables conveniences such as 'go generate -n' to discover how code is generated without running anything (this works on all packages that rely on //go:generate directives), being able to generate multiple packages at once and from any directory, and so on. Change-Id: I0e5b6a1edeff670a8e588befeef0c445613803c7 Reviewed-on: https://go-review.googlesource.com/c/go/+/460135 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Keith Randall <khr@google.com> Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org> Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> |
||
![]() |
8893da7c72 |
cmd/compile: fix wrong optimization for eliding Not in Phi
The previous rule may move the phi value into a wrong block. This CL make it only rewrite the phi value not the If block, so that the phi value will stay in old block. Fixes #56777 Change-Id: I9479a5c7f28529786968413d35b82a16181bb1f1 Reviewed-on: https://go-review.googlesource.com/c/go/+/451496 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Wayne Zuo <wdvxdr@golangcn.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: David Chase <drchase@google.com> |
||
![]() |
fd59c6cf8c |
cmd/compile: elide unnecessary Not in Phi block controls
For a BlockIf, we can change the order of the successors if all OpPhi args are an OpNot and this allows us to elide said OpNots. When compiling Go itself, there were no hits for (If (Phi (Not x) (Not y) (Not z)) or any other longer patterns. compilecmp: errors errors.As changed reflect reflect.Value.FieldByIndex changed reflect.Value.Method changed reflect.cvtI2I changed reflect.Value.FieldByIndexErr changed reflect.deepValueEqual.func1 496 -> 502 (+1.21%) reflect.deepValueEqual changed internal/fmtsort internal/fmtsort.nilCompare 652 -> 648 (-0.61%) database/sql database/sql.convertAssignRows changed encoding/json encoding/json.interfaceEncoder changed encoding/json.(*decodeState).unmarshal 574 -> 571 (-0.52%) encoding/json.mapEncoder.encode changed encoding/json.ptrEncoder.encode changed encoding/json.encodeByteSlice changed encoding/json.addrMarshalerEncoder changed encoding/json.addrTextMarshalerEncoder changed encoding/json.(*decodeState).object changed encoding/json.sliceEncoder.encode changed encoding/json.indirect 1303 -> 1286 (-1.30%) encoding/gob encoding/gob.encOpFor.func3 changed encoding/gob.encIndirect changed encoding/gob.encOpFor.func5 changed encoding/gob.(*Encoder).encodeInterface changed encoding/gob.(*Decoder).decodeMap changed encoding/xml encoding/xml.(*printer).marshalStruct changed encoding/xml.(*fieldInfo).value changed encoding/xml.(*printer).marshalAttr changed encoding/xml.indirect changed encoding/xml.(*printer).marshalValue changed encoding/asn1 encoding/asn1.UnmarshalWithParams 837 -> 845 (+0.96%) text/template text/template.indirectInterface changed text/template.indirect changed text/template.safeCall changed net/http/httptrace net/http/httptrace.(*ClientTrace).compose changed cmd/fix main.typefix.func2 changed cmd/compile/internal/ir cmd/compile/internal/ir.dumpNode changed cmd/gofmt main.match changed main.subst changed cmd/compile/internal/ssa cmd/compile/internal/ssa.rewriteBlockgeneric 626 -> 1030 (+64.54%) Change-Id: I645b3b3e37302a63e06b79ce74674882fb603ef3 Reviewed-on: https://go-review.googlesource.com/c/go/+/449055 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Joedian Reid <joedian@golang.org> Run-TryBot: Jakub Ciolek <jakub@ciolek.dev> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> |
||
![]() |
9ce27feaeb |
cmd/compile: add rule for post-decomposed growslice optimization
The recently added rule only works before decomposing slices. Add a rule that works after decomposing slices. The reason we need the latter is because although the length may be a constant, it can be hidden inside a slice that is not constant (its pointer or capacity might be changing). By applying this optimization after decomposing slices, we can find more cases where it applies. Fixes #56440 Change-Id: I0094e59eee3065ab4d210defdda8227a6e897420 Reviewed-on: https://go-review.googlesource.com/c/go/+/446277 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Keith Randall <khr@google.com> |
||
![]() |
0156b797e6 |
cmd/compile: recognize when the result of append has a constant length
Fixes a performance regression due to CL 418554. Fixes #56440 Change-Id: I6ff152e9b83084756363f49ee6b0844a7a284880 Reviewed-on: https://go-review.googlesource.com/c/go/+/445875 Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> |
||
![]() |
85196fc982 |
cmd/internal/ssa: correct references to _gen folder
The gen folder was renamed to _gen in CL 435472, but references in code and docs were not updated. This updates the references. Change-Id: Ibadc0cdcb5bed145c3257b58465a8df370487ae5 Reviewed-on: https://go-review.googlesource.com/c/go/+/444355 Reviewed-by: Bryan Mills <bcmills@google.com> Reviewed-by: David Chase <drchase@google.com> Run-TryBot: Johan Brandhorst-Satzkorn <johan.brandhorst@gmail.com> TryBot-Result: Gopher Robot <gobot@golang.org> |
||
![]() |
e283473ebb |
cmd/compile: avoid using destination pointer base type in memmove optimization
The type of the source and destination of a memmove call isn't always accurate. It will always be a pointer (or an unsafe.Pointer), but the base type might not be accurate. This comes about because multiple copies of a pointer with different base types are coalesced into a single value. In the failing example, the IData selector of the input argument is a *[32]byte in one branch of the type switch, and a *[]byte in the other branch. During the expand_calls pass both IDatas become just copies of the input register. Those copies are deduped and an arbitrary one wins (in this case, *[]byte is the unfortunate winner). Generally an op v can rely on v.Type during rewrite rules. But relying on v.Args[i].Type is discouraged. Fixes #55122 Change-Id: I348fd9accf2058a87cd191eec01d39cda612f120 Reviewed-on: https://go-review.googlesource.com/c/go/+/431496 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> Reviewed-by: Keith Randall <khr@google.com> |
||
![]() |
5b1fbfba1c |
cmd/compile: rewrite >>c<<c to &^(1<<c-1)
Fixes #54496 Change-Id: I3c2ed8cd55836d5b07c8cdec00d3b584885aca79 Reviewed-on: https://go-review.googlesource.com/c/go/+/424856 Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Heschi Kreinick <heschi@google.com> Run-TryBot: Martin Möhrmann <martin@golang.org> TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Martin Möhrmann <martin@golang.org> |
||
![]() |
33a7e5a4b4 |
cmd/compile: combine multiple rotate instructions
Rotating by c, then by d, is the same as rotating by c+d. Change-Id: I36df82261460ff80f7c6d39bcdf0e840cef1c91a Reviewed-on: https://go-review.googlesource.com/c/go/+/424894 Reviewed-by: Wayne Zuo <wdvxdr@golangcn.org> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Reviewed-by: Ruinan Sun <Ruinan.Sun@arm.com> Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com> |
||
![]() |
ab8a2c5e44 |
cmd/compile: generic constant folding: Floor Ceil Trunc RoundToEven
Change-Id: I553a6d0bd3ae45e5bf62191411e71102b3f44cd8 Reviewed-on: https://go-review.googlesource.com/c/go/+/411215 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Joedian Reid <joedian@golang.org> Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> |
||
![]() |
60ad3c48f5 |
cmd/compile: move SSA rotate instruction detection to arch-independent rules
Detect rotate instructions while still in architecture-independent form. It's easier to do here, and we don't need to repeat it in each architecture file. Change-Id: I9396954b3f3b3bfb96c160d064a02002309935bb Reviewed-on: https://go-review.googlesource.com/c/go/+/421195 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Eric Fang <eric.fang@arm.com> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> Reviewed-by: David Chase <drchase@google.com> Reviewed-by: Joedian Reid <joedian@golang.org> Reviewed-by: Ruinan Sun <Ruinan.Sun@arm.com> Run-TryBot: Keith Randall <khr@golang.org> |
||
![]() |
c2a9c55823 |
cmd/compile: optimize unsafe.Slice generated code
We don't need a multiply when the element type is size 0 or 1. The panic functions don't return, so we don't need any post-call code (register restores, etc.). Change-Id: I0dcea5df56d29d7be26554ddca966b3903c672e5 Reviewed-on: https://go-review.googlesource.com/c/go/+/419754 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com> Run-TryBot: Keith Randall <khr@golang.org> Reviewed-by: Keith Randall <khr@google.com> Reviewed-by: Than McIntosh <thanm@google.com> Reviewed-by: Matthew Dempsky <mdempsky@google.com> |
||
![]() |
e858c14f91 |
cmd/compile: more negation related generic SSA rewrite rules
The x + (-y) => x - y rule is hitted 75 times while building stage 3 and tools
and make the linux-amd64 go binary 0.007% smaller.
It transform:
NEG AX
ADD BX, AX
Into:
SUB BX, AX
Which is 2X faster (assuming this assembly in a vacum).
The x ^ (-1) => ^x rule is not hitted in the toolchain.
It transforms:
XOR $-1, AX
Into:
NOT AX
Which is more compact as it doesn't encode the immediate.
Cache usage aside, this does not affect performance
(assuming this assembly in a vacum).
On my ryzen 3600, with some surrouding code, this randomly might be 2X faster,
I guess this has to do with loading the immediate into a temporary register.
Combined to an other rule that already exists it also rewrite manual two's
complement negation from:
XOR $-1, AX
INC AX
Into:
NEG AX
Which is 2X faster.
The other rules just eliminates similar trivial cases and help constants
folding.
This should generalise to other architectures.
Change-Id: Ia1e51b340622e7ed88e5d856f3b1aa424aa039de
GitHub-Last-Rev:
|
||
![]() |
810b08b8ec |
cmd/compile: inline memequal(x, const, sz) for small sizes
This CL adds late expanded memequal(x, const, sz) inlining for 2, 4, 8 bytes size. This PoC is using the same method as CL 248404. This optimization fires about 100 times in Go compiler (1675 occurrences reduced to 1574, so -6%). Also, added unit-tests to codegen/comparisions.go file. Updates #37275 Change-Id: Ia52808d573cb706d1da8166c5746ede26f46c5da Reviewed-on: https://go-review.googlesource.com/c/go/+/328291 Reviewed-by: Cherry Mui <cherryyz@google.com> Run-TryBot: Cherry Mui <cherryyz@google.com> Trust: David Chase <drchase@google.com> |
||
![]() |
21de6bc463 |
cmd/compile: simplify less with non-negative number and constant 0 or 1
The most common cases: len(s) > 0 len(s) < 1 and they can be simplified to: len(s) != 0 len(s) == 0 Fixes #48054 Change-Id: I16e5b0cffcfab62a4acc2a09977a6cd3543dd000 Reviewed-on: https://go-review.googlesource.com/c/go/+/346050 Reviewed-by: Keith Randall <khr@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> |
||
![]() |
6cf1d5d0fa |
cmd/compile: generic SSA rules for simplifying 2 and 3 operand integer arithmetic expressions
This applies the following generic integer addition/subtraction transformations: x - (x + y) = -y (x - y) - x = -y y + (x - y) = x y + (z + (x - y) = x + z There's over 40 unique functions matching in Go. Hits 2 funcs in the runtime itself: runtime.stackfree() runtime.runqdrain() Go binary size reduced by 0.05% on Linux x86_64. StackCopy bench (perflocked Cascade Lake x86): name old time/op new time/op delta StackCopyPtr-8 87.3ms ± 1% 86.9ms ± 0% -0.45% (p=0.000 n=20+20) StackCopy-8 77.6ms ± 1% 77.0ms ± 0% -0.76% (p=0.000 n=20+20) StackCopyNoCache-8 2.28ms ± 2% 2.26ms ± 2% -0.93% (p=0.008 n=19+20) test/bench/go1 benchmarks (perflocked Cascade Lake x86): name old time/op new time/op delta BinaryTree17-8 1.88s ± 1% 1.88s ± 0% ~ (p=0.373 n=15+12) Fannkuch11-8 2.31s ± 0% 2.35s ± 0% +1.52% (p=0.000 n=15+14) FmtFprintfEmpty-8 26.6ns ± 0% 26.6ns ± 0% ~ (p=0.081 n=14+13) FmtFprintfString-8 48.6ns ± 0% 50.0ns ± 0% +2.86% (p=0.000 n=15+14) FmtFprintfInt-8 56.9ns ± 0% 54.8ns ± 0% -3.70% (p=0.000 n=15+15) FmtFprintfIntInt-8 90.4ns ± 0% 88.8ns ± 0% -1.78% (p=0.000 n=15+15) FmtFprintfPrefixedInt-8 104ns ± 0% 104ns ± 0% ~ (p=0.905 n=14+13) FmtFprintfFloat-8 148ns ± 0% 144ns ± 0% -2.19% (p=0.000 n=14+15) FmtManyArgs-8 389ns ± 0% 390ns ± 0% +0.35% (p=0.000 n=12+15) GobDecode-8 3.90ms ± 1% 3.88ms ± 0% -0.49% (p=0.000 n=15+14) GobEncode-8 2.73ms ± 0% 2.73ms ± 0% ~ (p=0.425 n=15+14) Gzip-8 169ms ± 0% 168ms ± 0% -0.52% (p=0.000 n=13+13) Gunzip-8 24.7ms ± 0% 24.8ms ± 0% +0.61% (p=0.000 n=15+15) HTTPClientServer-8 60.5µs ± 6% 60.4µs ± 7% ~ (p=0.595 n=15+15) JSONEncode-8 6.97ms ± 1% 6.93ms ± 0% -0.69% (p=0.000 n=14+14) JSONDecode-8 31.2ms ± 1% 30.8ms ± 1% -1.27% (p=0.000 n=14+14) Mandelbrot200-8 3.87ms ± 0% 3.87ms ± 0% ~ (p=0.652 n=15+14) GoParse-8 2.65ms ± 2% 2.64ms ± 1% ~ (p=0.202 n=15+15) RegexpMatchEasy0_32-8 45.1ns ± 0% 45.9ns ± 0% +1.68% (p=0.000 n=14+15) RegexpMatchEasy0_1K-8 140ns ± 0% 139ns ± 0% -0.44% (p=0.000 n=15+14) RegexpMatchEasy1_32-8 40.9ns ± 3% 40.5ns ± 0% -0.88% (p=0.000 n=15+13) RegexpMatchEasy1_1K-8 215ns ± 1% 220ns ± 1% +2.27% (p=0.000 n=15+15) RegexpMatchMedium_32-8 783ns ± 7% 738ns ± 0% ~ (p=0.361 n=15+15) RegexpMatchMedium_1K-8 24.1µs ± 6% 23.4µs ± 6% -2.94% (p=0.004 n=15+15) RegexpMatchHard_32-8 1.10µs ± 1% 1.09µs ± 1% -0.40% (p=0.006 n=15+14) RegexpMatchHard_1K-8 33.0µs ± 0% 33.0µs ± 0% ~ (p=0.535 n=12+14) Revcomp-8 354ms ± 0% 353ms ± 0% -0.23% (p=0.002 n=15+13) Template-8 42.0ms ± 1% 41.8ms ± 2% -0.37% (p=0.023 n=14+15) TimeParse-8 181ns ± 0% 180ns ± 1% -0.18% (p=0.014 n=12+13) TimeFormat-8 240ns ± 0% 242ns ± 1% +0.69% (p=0.000 n=12+15) [Geo mean] 35.2µs 35.1µs -0.43% name old speed new speed delta GobDecode-8 197MB/s ± 1% 198MB/s ± 0% +0.49% (p=0.000 n=15+14) GobEncode-8 281MB/s ± 0% 281MB/s ± 0% ~ (p=0.419 n=15+14) Gzip-8 115MB/s ± 0% 115MB/s ± 0% +0.52% (p=0.000 n=13+13) Gunzip-8 786MB/s ± 0% 781MB/s ± 0% -0.60% (p=0.000 n=15+15) JSONEncode-8 278MB/s ± 1% 280MB/s ± 0% +0.69% (p=0.000 n=14+14) JSONDecode-8 62.3MB/s ± 1% 63.1MB/s ± 1% +1.29% (p=0.000 n=14+14) GoParse-8 21.9MB/s ± 2% 22.0MB/s ± 1% ~ (p=0.205 n=15+15) RegexpMatchEasy0_32-8 709MB/s ± 0% 697MB/s ± 0% -1.65% (p=0.000 n=14+15) RegexpMatchEasy0_1K-8 7.34GB/s ± 0% 7.37GB/s ± 0% +0.43% (p=0.000 n=15+15) RegexpMatchEasy1_32-8 783MB/s ± 2% 790MB/s ± 0% +0.88% (p=0.000 n=15+13) RegexpMatchEasy1_1K-8 4.77GB/s ± 1% 4.66GB/s ± 1% -2.23% (p=0.000 n=15+15) RegexpMatchMedium_32-8 41.0MB/s ± 7% 43.3MB/s ± 0% ~ (p=0.360 n=15+15) RegexpMatchMedium_1K-8 42.5MB/s ± 6% 43.8MB/s ± 6% +3.07% (p=0.004 n=15+15) RegexpMatchHard_32-8 29.2MB/s ± 1% 29.3MB/s ± 1% +0.41% (p=0.006 n=15+14) RegexpMatchHard_1K-8 31.1MB/s ± 0% 31.1MB/s ± 0% ~ (p=0.495 n=12+14) Revcomp-8 718MB/s ± 0% 720MB/s ± 0% +0.23% (p=0.002 n=15+13) Template-8 46.3MB/s ± 1% 46.4MB/s ± 2% +0.38% (p=0.021 n=14+15) [Geo mean] 205MB/s 206MB/s +0.57% Change-Id: Ibd1afdf8b6c0b08087dcc3acd8f943637eb95ac0 Reviewed-on: https://go-review.googlesource.com/c/go/+/344930 Reviewed-by: Keith Randall <khr@golang.org> Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Go Bot <gobot@golang.org> Trust: Josh Bleecher Snyder <josharian@gmail.com> |