During walkexpr, we were assessing whether shifts were bounded.
However, that information was dropped on the floor during SSA conversion.
The SSA backend already finds all bounded shifts that walkexpr could have found,
and at negligible extra cost (0.02% in alloc, CPU undetectable).
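For instance (an illustrative case, not taken from the CL), the backend
can prove the shift below is in range and emit a plain shift instruction:
func shr(x uint64, s uint) uint64 {
	if s < 64 {
		return x >> s // provably bounded: no masking or branch needed
	}
	return 0
}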
Change-Id: Ieda1af1a2a3ec99bfdc2b0b704c9b80ce8a34486
Reviewed-on: https://go-review.googlesource.com/c/148897
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Only return a pointer p to the new slice's backing array from makeslice.
Makeslice callers then construct sliceheader{p, len, cap} explicitly
instead of makeslice returning the slice.
Reduces the go binary size by ~0.2%.
Removes 92 (~3.5%) panicindex calls from the go binary.
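In pseudocode (a sketch of the new lowering, not literal compiler
output), s := make([]int, l, c) now becomes roughly:
p := runtime.makeslice(typeOfInt, l, c) // returns only the backing-array pointer
s := sliceheader{p, l, c}               // slice header built explicitly by the caller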
Change-Id: I29b7c3b5fe8b9dcec96e2c43730575071cfe8a94
Reviewed-on: https://go-review.googlesource.com/c/141822
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
CL 114797 reworked how arguments get written to the stack.
Some type conversions got lost in the process. Restore them.
Fixes #28390
Updates #28430
Change-Id: Ia0d37428d7d615c865500bbd1a7a4167554ee34f
Reviewed-on: https://go-review.googlesource.com/c/144598
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
In append(f()), the second and subsequent return values from f()
must be converted to the element type of f()'s first return value
(which must be a slice).
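A hypothetical example of the shape involved:
func f() ([]interface{}, int, string) { return nil, 1, "x" }
var s = append(f()) // 1 and "x" are converted to interface{}, the element type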
Fixes #22327
Change-Id: I5c0a424812c82c1b95b6d124c5626cfc4408bdb6
Reviewed-on: https://go-review.googlesource.com/c/142718
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
The goal of this change is to move work from walk to SSA,
and simplify things along the way.
This is hard to accomplish cleanly with small incremental changes,
so this large commit message aims to provide a roadmap to the diff.
High level description:
Prior to this change, walk was responsible for constructing (most of) the stack for function calls.
ascompatte gathered variadic arguments into a slice.
It also rewrote n.List from a list of arguments to a list of assignments to stack slots.
ascompatte was called multiple times to handle the receiver in a method call.
reorder1 then introduced temporaries into n.List as needed to avoid smashing the stack.
adjustargs then made extra stack space for go/defer args as needed.
Node to SSA construction evaluated all the statements in n.List,
and issued the function call, assuming that the stack was correctly constructed.
Intrinsic calls had to dig around inside n.List to extract the arguments,
since intrinsics don't use the stack to make function calls.
This change moves stack construction to the SSA construction phase.
ascompatte, now called walkParams, does all the work that ascompatte and reorder1 did.
It handles variadic arguments, inserts the method receiver if needed, and allocates temporaries.
It does not, however, make any assignments to stack slots.
Instead, it moves the function arguments to n.Rlist, leaving assignments to temporaries in n.List.
(It would be better to use Ninit instead of List; future work.)
During SSA construction, after doing all the temporary assignments in n.List,
the function arguments are assigned to stack slots by
constructing the appropriate SSA Value, using (*state).storeArg.
SSA construction also now handles adjustments for go/defer args.
This change also simplifies intrinsic calls, since we no longer need to undo walk's work.
Along the way, we simplify nodarg by pushing the fp==1 case to its callers, where it fits nicely.
Generated code differences:
The old implementation applied a few optimizations along the way.
f(g()) was rewritten to do a block copy of function results to function arguments.
And reorder1 avoided introducing the final "save the stack" temporary in n.List.
The f(g()) block copy optimization never actually triggered; the order pass rewrote away g(), so that has been removed.
SSA optimizations mostly obviated the need for reorder1's optimization of avoiding the final temporary.
The exception was when the temporary's type was not SSA-able;
in that case, we got a Move into an autotmp and then an immediate Move onto the stack,
with the autotmp never read or used again.
This change introduces a new rewrite rule to detect such pointless double Moves
and collapse them into a single Move.
This is actually more powerful than the original optimization,
since the original optimization relied on the imprecise Node.HasCall calculation.
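A sketch of the kind of code affected (hypothetical example, assuming a
non-SSA-able argument type):
type big struct{ a [10]int64 }
func g() big           { return big{} }
func use(n int, b big) {}
func h(n int) {
	// previously: Move g's result into an autotmp, then Move the autotmp
	// onto the stack; the new rule collapses these into a single Move
	use(n, g())
}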
The other significant difference in the generated code is that the stack is now constructed
completely in SP-offset order. Prior to this change, the stack was constructed somewhat
haphazardly: first the final argument that Node.HasCall deemed to require a temporary,
then other arguments, then the method receiver, then the defer/go args.
SP-offset is probably a good default order. See future work.
There are a few minor object file size changes as a result of this change.
I investigated some regressions in early versions of this change.
One regression (in archive/tar) was the addition of a single CMPQ instruction,
which would be eliminated were this TODO from flagalloc to be done:
// TODO: Remove original instructions if they are never used.
One regression (in text/template) was an ADDQconstmodify that is now
a regular MOVQLoad+ADDQconst+MOVQStore, due to an unlucky change
in the order in which arguments are written. The argument change
order can also now be luckier, so this appears to be a wash.
All in all, though there will be minor winners and losers,
this change appears to be performance neutral.
Future work:
Move loading the result of function calls to SSA construction; eliminate OINDREGSP.
Consider pushing stack construction deeper into SSA world, perhaps in an arch-specific pass.
Among other benefits, this would make it easier to transition to a new calling convention.
This would require rethinking the handling of stack conflicts and is non-trivial.
Figure out some clean way to indicate that stack construction Stores/Moves
do not alias each other, so that subsequent passes may do things like
CSE+tighten shared stack setup, do DSE using non-first Stores, etc.
This would allow us to eliminate the minor text/template regression.
Possibly make assignments to stack slots not treated as statements by DWARF.
Compiler benchmarks:
name old time/op new time/op delta
Template 182ms ± 2% 179ms ± 2% -1.69% (p=0.000 n=47+48)
Unicode 86.3ms ± 5% 85.1ms ± 4% -1.36% (p=0.001 n=50+50)
GoTypes 646ms ± 1% 642ms ± 1% -0.63% (p=0.000 n=49+48)
Compiler 2.89s ± 1% 2.86s ± 2% -1.36% (p=0.000 n=48+50)
SSA 8.47s ± 1% 8.37s ± 2% -1.22% (p=0.000 n=47+50)
Flate 122ms ± 2% 121ms ± 2% -0.66% (p=0.000 n=47+45)
GoParser 147ms ± 2% 146ms ± 2% -0.53% (p=0.006 n=46+49)
Reflect 406ms ± 2% 403ms ± 2% -0.76% (p=0.000 n=48+43)
Tar 162ms ± 3% 162ms ± 4% ~ (p=0.191 n=46+50)
XML 223ms ± 2% 222ms ± 2% -0.37% (p=0.031 n=45+49)
[Geo mean] 382ms 378ms -0.89%
name old user-time/op new user-time/op delta
Template 219ms ± 3% 216ms ± 3% -1.56% (p=0.000 n=50+48)
Unicode 109ms ± 6% 109ms ± 5% ~ (p=0.190 n=50+49)
GoTypes 836ms ± 2% 828ms ± 2% -0.96% (p=0.000 n=49+48)
Compiler 3.87s ± 2% 3.80s ± 1% -1.81% (p=0.000 n=49+46)
SSA 12.0s ± 1% 11.8s ± 1% -2.01% (p=0.000 n=48+50)
Flate 142ms ± 3% 141ms ± 3% -0.85% (p=0.003 n=50+48)
GoParser 178ms ± 4% 175ms ± 4% -1.66% (p=0.000 n=48+46)
Reflect 520ms ± 2% 512ms ± 2% -1.44% (p=0.000 n=45+48)
Tar 200ms ± 3% 198ms ± 4% -0.61% (p=0.037 n=47+50)
XML 277ms ± 3% 275ms ± 3% -0.85% (p=0.000 n=49+48)
[Geo mean] 482ms 476ms -1.23%
name old alloc/op new alloc/op delta
Template 36.1MB ± 0% 35.3MB ± 0% -2.18% (p=0.008 n=5+5)
Unicode 29.8MB ± 0% 29.3MB ± 0% -1.58% (p=0.008 n=5+5)
GoTypes 125MB ± 0% 123MB ± 0% -2.13% (p=0.008 n=5+5)
Compiler 531MB ± 0% 513MB ± 0% -3.40% (p=0.008 n=5+5)
SSA 2.00GB ± 0% 1.93GB ± 0% -3.34% (p=0.008 n=5+5)
Flate 24.5MB ± 0% 24.3MB ± 0% -1.18% (p=0.008 n=5+5)
GoParser 29.4MB ± 0% 28.7MB ± 0% -2.34% (p=0.008 n=5+5)
Reflect 87.1MB ± 0% 86.0MB ± 0% -1.33% (p=0.008 n=5+5)
Tar 35.3MB ± 0% 34.8MB ± 0% -1.44% (p=0.008 n=5+5)
XML 47.9MB ± 0% 47.1MB ± 0% -1.86% (p=0.008 n=5+5)
[Geo mean] 82.8MB 81.1MB -2.08%
name old allocs/op new allocs/op delta
Template 352k ± 0% 347k ± 0% -1.32% (p=0.008 n=5+5)
Unicode 342k ± 0% 339k ± 0% -0.66% (p=0.008 n=5+5)
GoTypes 1.29M ± 0% 1.27M ± 0% -1.30% (p=0.008 n=5+5)
Compiler 4.98M ± 0% 4.87M ± 0% -2.14% (p=0.008 n=5+5)
SSA 15.7M ± 0% 15.2M ± 0% -2.86% (p=0.008 n=5+5)
Flate 233k ± 0% 231k ± 0% -0.83% (p=0.008 n=5+5)
GoParser 296k ± 0% 291k ± 0% -1.54% (p=0.016 n=5+4)
Reflect 1.05M ± 0% 1.04M ± 0% -0.65% (p=0.008 n=5+5)
Tar 343k ± 0% 339k ± 0% -0.97% (p=0.008 n=5+5)
XML 432k ± 0% 426k ± 0% -1.19% (p=0.008 n=5+5)
[Geo mean] 815k 804k -1.35%
name old object-bytes new object-bytes delta
Template 505kB ± 0% 505kB ± 0% -0.01% (p=0.008 n=5+5)
Unicode 224kB ± 0% 224kB ± 0% ~ (all equal)
GoTypes 1.82MB ± 0% 1.83MB ± 0% +0.06% (p=0.008 n=5+5)
Flate 324kB ± 0% 324kB ± 0% +0.00% (p=0.008 n=5+5)
GoParser 402kB ± 0% 402kB ± 0% +0.04% (p=0.008 n=5+5)
Reflect 1.39MB ± 0% 1.39MB ± 0% -0.01% (p=0.008 n=5+5)
Tar 449kB ± 0% 449kB ± 0% -0.02% (p=0.008 n=5+5)
XML 598kB ± 0% 597kB ± 0% -0.05% (p=0.008 n=5+5)
Change-Id: Ifc9d5c1bd01f90171414b8fb18ffe2290d271143
Reviewed-on: https://go-review.googlesource.com/c/114797
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Nowadays there are better ways to safely run untrusted Go programs, like
NaCl and gVisor.
Change-Id: I20c45f13a50dbcf35c343438b720eb93e7b4e13a
Reviewed-on: https://go-review.googlesource.com/c/142717
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
The generated code for the append builtin already checks whether the
appended-to slice is large enough and calls growslice if that is not
the case.
Trust that this ensures the slice is large enough and avoid the
implicit bounds check when slicing the slice to its new size.
Removes 365 panicslice calls (-14%) from the go binary, which
reduces the binary size by ~12 kB.
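In pseudocode (a sketch of the generated logic, not literal output),
s = append(s, x) is roughly:
n := len(s) + 1
if n > cap(s) {
	s = growslice(s, n) // guarantees cap(s) >= n on return
}
s = s[:n] // bounds check elided: n <= cap(s) is known to hold
s[n-1] = x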
Change-Id: I1b88418675ff409bc0b956853c9e95241274d5a6
Reviewed-on: https://go-review.googlesource.com/c/119315
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
When we pass these types by reference, we usually have to allocate
temporaries on the stack, initialize them, then pass their address
to the conversion functions. It's simpler to pass these types
directly by value.
This particularly applies to conversions needed for fmt.Printf
(to interface{} for constructing a [...]interface{}).
func f(a, b, c string) {
fmt.Printf("%s %s\n", a, b)
fmt.Printf("%s %s\n", b, c)
}
This function's stack frame shrinks from 200 to 136 bytes, and
its code shrinks from 535 to 453 bytes.
The go binary shrinks 0.3%.
Update #24286
Aside: for this function f, we don't really need to allocate
temporaries for the convT2E function. We could use the address
of a, b, and c directly. That might get similar (or maybe better?)
improvements. I investigated a bit, but it seemed complicated
to do it safely. This change was much easier.
Change-Id: I78cbe51b501fb41e1e324ce4203f0de56a1db82d
Reviewed-on: https://go-review.googlesource.com/c/135377
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
This CL makes sure we walk the newly generated assignment. Part of
that walk makes sure that all symbols for strings are emitted before
we start referencing them during the parallel compilation
phase. Without this change, those references during the parallel phase
do a create-if-not-exist, which leads to a data race.
I'm not 100% sure this is the fix for the issues below, but optimistically
assuming it is...
Fixes #28170
Fixes #28159
Change-Id: Ic63d5160ad9be5cb23fa6bbb2183e4848776c0ff
Reviewed-on: https://go-review.googlesource.com/c/141648
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Interface and string comparisons don't need separate Ops any more than
struct or array comparisons do.
Removing them requires shuffling some code around in walk (and a
little in order), but overall allows simplifying things a bit.
Passes toolstash-check.
Change-Id: I084b8a6c089b768dc76d220379f4daed8a35db15
Reviewed-on: https://go-review.googlesource.com/c/141637
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Do []byte(string) conversions more efficiently when the string
is a constant. Instead of calling stringtobyteslice, allocate
just the space we need and encode the initialization directly.
[]byte("foo") rewrites to the following pseudocode:
var s [3]byte // on heap or stack, depending on whether b escapes
s = *(*[3]byte)(&"foo"[0]) // initialize s from the string
b = s[:]
which generates this assembly:
0x001d 00029 (tmp1.go:9) LEAQ type.[3]uint8(SB), AX
0x0024 00036 (tmp1.go:9) MOVQ AX, (SP)
0x0028 00040 (tmp1.go:9) CALL runtime.newobject(SB)
0x002d 00045 (tmp1.go:9) MOVQ 8(SP), AX
0x0032 00050 (tmp1.go:9) MOVBLZX go.string."foo"+2(SB), CX
0x0039 00057 (tmp1.go:9) MOVWLZX go.string."foo"(SB), DX
0x0040 00064 (tmp1.go:9) MOVW DX, (AX)
0x0043 00067 (tmp1.go:9) MOVB CL, 2(AX)
// Then the slice is b = {AX, 3, 3}
The generated code is still not optimal, as it still does load/store
from read-only memory instead of constant stores. Next CL...
Update #26498
Fixes #10170
Change-Id: I4b990b19f9a308f60c8f4f148934acffefe0a5bd
Reviewed-on: https://go-review.googlesource.com/c/140698
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This was missed as part of adding a top-level VARDEF
for stack tracing (CL 134156).
Fixes #28055
Change-Id: Id14748dfccb119197d788867d2ec6a3b3c9835cf
Reviewed-on: https://go-review.googlesource.com/c/140304
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>
Node.copy used to make a shallow copy of a node. Often, this is not
correct: If a node n's Orig field pointed to itself, the copy's Orig
field has to be adjusted to point to the copy. Otherwise, if n is
modified later, the copy's Orig appears modified as well (because it
points to n).
This was fixed for one specific case with
https://go-review.googlesource.com/c/go/+/136395 (issue #26855).
This change instead addresses copy in general:
In two cases we don't want the Orig adjustment as it causes escape
analysis output to fail (not match the existing error messages).
rawcopy is used in those cases.
In several cases Orig is set to the copy immediately after making
a copy; a new function sepcopy is used there.
Updates #26855.
Fixes #27765.
Change-Id: Idaadeb5c4b9a027daabd46a2361348f7a93f2b00
Reviewed-on: https://go-review.googlesource.com/136540
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
In the compiler frontend, walkinrange indiscriminately calls Int64()
on const CTINT nodes, even though Int64's return value is undefined
for anything over 2⁶³ (in practice, it'll return a negative number).
This causes the introduction of bad constants during rewrites of
unsigned expressions, making the compiler reject valid Go programs.
This change introduces a preliminary check that Int64() is safe to
call on the consts on hand. If it isn't, walkinrange exits without
doing any rewrite.
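An illustrative function of the affected shape (walkinrange merges such
paired comparisons into a single range check):
func in(x uint64) bool {
	// both constants are >= 2⁶³, so Int64() on them would be bogus;
	// the rewrite is now skipped for such constants
	return x >= 1<<63 && x < 1<<63+10
}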
Fixes #27143
Change-Id: I2017073cae65468a521ff3262d4ea8ab0d7098d9
Reviewed-on: https://go-review.googlesource.com/130735
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Use a separate func, which needs less indentation and can use returns
instead of labelled breaks. We can also give the types better names, and
we don't have to repeat the calls to conv and mkcall.
Passes toolstash -cmp on std cmd.
Change-Id: I1071c170fa729562d70093a09b7dea003c5fe26e
Reviewed-on: https://go-review.googlesource.com/130075
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
- add comments with builtin function signatures that are instantiated
- use Nodes type from the beginning instead of
[]*Node with a later conversion to Nodes
- use conv(x, y) helper function instead of nod(OCONV, x, y)
- factor out repeated calls to Type.Elem()
This makes the function style similar to newer functions like extendslice.
passes toolstash -cmp
Change-Id: Iedab191af9e0884fb6762c9c168430c1d2246979
Reviewed-on: https://go-review.googlesource.com/112598
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Like the conv helper function but for creating OCONVNOP nodes
instead of OCONV nodes.
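A minimal sketch of the helper, assuming it mirrors conv's shape:
func convnop(n *Node, t *types.Type) *Node {
	n = nod(OCONVNOP, n, nil)
	n.Type = t
	n = typecheck(n, Erv)
	return n
}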
passes toolstash -cmp
Change-Id: Ib93ffe66590ebaa2b4fa552c81f1a2902e789d8e
Reviewed-on: https://go-review.googlesource.com/112597
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Two funcs and a field were unused. Remove them.
A few statements could be made simpler.
importsym's pos parameter was unused, so remove it.
Finally, don't use printf-like funcs with constant strings that have no
formatting directives.
Change-Id: I415452249bf2168aa353ac4f3643dfc03017ee53
Reviewed-on: https://go-review.googlesource.com/117699
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
Inlining of switch statements into a RETURNed expression
can sometimes lead to the switch being walked twice, which
results in a miscompiled switch statement. The bug depends
on:
1) multiple results
2) named results
3) a return statement whose expression includes a call to a
function containing a switch statement that is inlined.
It may also be significant that the default case of that
switch is a panic(), though that's not proven.
Rearranged the walk case for ORETURN so that double walks are
not possible. Added a test, because this is so fiddly.
Added a check against double walks and verified that it fires
without the rest of the fix.
Fixes #25776.
Change-Id: I2d594351fa082632512ef989af67eb887059729b
Reviewed-on: https://go-review.googlesource.com/118318
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
-W and -w turn on printing of Nodes for both order and walk.
I have found their output mildly incomprehensible for years.
Improve it, at long last.
Change-Id: Ia05d77e59aa741c2dfc9fcca07f45019420b655e
Reviewed-on: https://go-review.googlesource.com/114520
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Currently, range loops over slices and arrays are compiled roughly
like:
for i, x := range s { b }
⇓
for i, _n, _p := 0, len(s), &s[0]; i < _n; i, _p = i+1, _p + unsafe.Sizeof(s[0]) { b }
⇓
i, _n, _p := 0, len(s), &s[0]
goto cond
body:
{ b }
i, _p = i+1, _p + unsafe.Sizeof(s[0])
cond:
if i < _n { goto body } else { goto end }
end:
The problem with this lowering is that _p may temporarily point past
the end of the allocation the moment before the loop terminates. Right
now this isn't a problem because there's never a safe-point during
this brief moment.
We're about to introduce safe-points everywhere, so this bad pointer
is going to be a problem. We could mark the increment as an unsafe
block, but this inhibits reordering opportunities and could result in
infrequent safe-points if the body is short.
Instead, this CL fixes this by changing how we compile range loops to
never produce this past-the-end pointer. It changes the lowering to
roughly:
i, _n, _p := 0, len(s), &s[0]
if i < _n { goto body } else { goto end }
top:
_p += unsafe.Sizeof(s[0])
body:
{ b }
i++
if i < _n { goto top } else { goto end }
end:
Notably, the increment is split into two parts: we increment the index
before checking the condition, but increment the pointer only *after*
the condition check has succeeded.
The implementation builds on the OFORUNTIL construct that was
introduced during the loop preemption experiments, since OFORUNTIL
places the increment and condition after the loop body. To support the
extra "late increment" step, we further define OFORUNTIL's "List"
field to contain the late increment statements. This makes all of this
a relatively small change.
This depends on the improvements to the prove pass in CL 102603. With
the current lowering, bounds-check elimination knows that i < _n in
the body because the body block is dominated by the cond block. In the
new lowering, deriving this fact requires detecting that i < _n on
*both* paths into body and hence is true in body. CL 102603 made prove
able to detect this.
The code size effect of this is minimal. The cmd/go binary on
linux/amd64 increases by 0.17%. Performance-wise, this actually
appears to be a net win, though it's mostly noise:
name old time/op new time/op delta
BinaryTree17-12 2.80s ± 0% 2.61s ± 1% -6.88% (p=0.000 n=20+18)
Fannkuch11-12 2.41s ± 0% 2.42s ± 0% +0.05% (p=0.005 n=20+20)
FmtFprintfEmpty-12 41.6ns ± 5% 41.4ns ± 6% ~ (p=0.765 n=20+19)
FmtFprintfString-12 69.4ns ± 3% 69.3ns ± 1% ~ (p=0.084 n=19+17)
FmtFprintfInt-12 76.1ns ± 1% 77.3ns ± 1% +1.57% (p=0.000 n=19+19)
FmtFprintfIntInt-12 122ns ± 2% 123ns ± 3% +0.95% (p=0.015 n=20+20)
FmtFprintfPrefixedInt-12 153ns ± 2% 151ns ± 3% -1.27% (p=0.013 n=20+20)
FmtFprintfFloat-12 215ns ± 0% 216ns ± 0% +0.47% (p=0.000 n=20+16)
FmtManyArgs-12 486ns ± 1% 498ns ± 0% +2.40% (p=0.000 n=20+17)
GobDecode-12 6.43ms ± 0% 6.50ms ± 0% +1.08% (p=0.000 n=18+19)
GobEncode-12 5.43ms ± 1% 5.47ms ± 0% +0.76% (p=0.000 n=20+20)
Gzip-12 218ms ± 1% 218ms ± 1% ~ (p=0.883 n=20+20)
Gunzip-12 38.8ms ± 0% 38.9ms ± 0% ~ (p=0.644 n=19+19)
HTTPClientServer-12 76.2µs ± 1% 76.4µs ± 2% ~ (p=0.218 n=20+20)
JSONEncode-12 12.2ms ± 0% 12.3ms ± 1% +0.45% (p=0.000 n=19+19)
JSONDecode-12 54.2ms ± 1% 53.3ms ± 0% -1.67% (p=0.000 n=20+20)
Mandelbrot200-12 3.71ms ± 0% 3.71ms ± 0% ~ (p=0.143 n=19+20)
GoParse-12 3.22ms ± 0% 3.19ms ± 1% -0.72% (p=0.000 n=20+20)
RegexpMatchEasy0_32-12 76.7ns ± 1% 75.8ns ± 1% -1.19% (p=0.000 n=20+17)
RegexpMatchEasy0_1K-12 245ns ± 1% 243ns ± 0% -0.72% (p=0.000 n=18+17)
RegexpMatchEasy1_32-12 71.9ns ± 0% 71.7ns ± 1% -0.39% (p=0.006 n=12+18)
RegexpMatchEasy1_1K-12 358ns ± 1% 354ns ± 1% -1.13% (p=0.000 n=20+19)
RegexpMatchMedium_32-12 105ns ± 2% 105ns ± 1% -0.63% (p=0.007 n=19+20)
RegexpMatchMedium_1K-12 31.9µs ± 1% 31.9µs ± 1% ~ (p=1.000 n=17+17)
RegexpMatchHard_32-12 1.51µs ± 1% 1.52µs ± 2% +0.46% (p=0.042 n=18+18)
RegexpMatchHard_1K-12 45.3µs ± 1% 45.5µs ± 2% +0.44% (p=0.029 n=18+19)
Revcomp-12 388ms ± 1% 385ms ± 0% -0.57% (p=0.000 n=19+18)
Template-12 63.0ms ± 1% 63.3ms ± 0% +0.50% (p=0.000 n=19+20)
TimeParse-12 309ns ± 1% 307ns ± 0% -0.62% (p=0.000 n=20+20)
TimeFormat-12 328ns ± 0% 333ns ± 0% +1.35% (p=0.000 n=19+19)
[Geo mean] 47.0µs 46.9µs -0.20%
(https://perf.golang.org/search?q=upload:20180326.1)
For #10958.
For #24543.
Change-Id: Icbd52e711fdbe7938a1fea3e6baca1104b53ac3a
Reviewed-on: https://go-review.googlesource.com/102604
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Using the extendslice init node list to add the init nodes for the
memclr call could add the memclr function's init nodes before the
growslice call created by extendslice.
Since all arguments of the memclr were explicitly set in OAS nodes
before the memclr call, this does not currently change the generated
code.
./all.bash runs fine when replacing the memclr init with nil,
suggesting there are currently no additional nodes added to the init
of extendslice by the memclr call.
Add the init nodes for the memclr call directly before the node of the
memclr call to prevent future init nodes for function calls and
argument evaluations from being evaluated too early when other
compiler code is added.
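For context, an illustrative trigger: extendslice handles the
append-with-make pattern, whose zeroing is done by the memclr call.
func grow(b []byte, n int) []byte {
	// compiled via extendslice: growslice makes room, memclr zeroes
	// the n newly exposed bytes
	return append(b, make([]byte, n)...)
}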
passes toolstash -cmp
Updates #21266
Change-Id: I44bd396fe864bfda315175aa1064f9d51c5fb57a
Reviewed-on: https://go-review.googlesource.com/112595
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Pick the low-hanging fruit, which are the gotos that don't go very far
and labels that aren't used often. All of them have easy replacements
with breaks and returns.
One slightly tricky rewrite is defaultlitreuse. We cannot use a
deferred func to reset lineno, because one of its return paths must
not reset lineno; using a defer there broke toolstash -cmp.
Passes toolstash -cmp on std cmd.
Change-Id: Id1c0967868d69bb073addc7c5c3017ca91ff966f
Reviewed-on: https://go-review.googlesource.com/110063
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Some of the comments' relative paths do not exist, and reflect does
not define its own hmap structure.
Correct the paths and consistently reference paths starting from the
go src directory.
Change-Id: I5204a3a98f77d65f17dcde98b847378cea05ad8a
Reviewed-on: https://go-review.googlesource.com/94758
Run-TryBot: Martin Möhrmann <moehrmann@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
This reduces the API surface of Type slightly (for #25056), but also
makes it more consistent with the reflect and go/types APIs.
Passes toolstash-check.
Change-Id: Ief9a8eb461ae6e88895f347e2a1b7b8a62423222
Reviewed-on: https://go-review.googlesource.com/109138
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
There are a bunch of places where we generate functions: equality and
hash functions; method expression and promoted method wrappers; and
print/delete wrappers for defer/go statements.
This CL brings them in sync by:
1) Always using dclfunc and funcbody. Most were already using this,
but makepartialcall needed some changes.
2) Removing duplicate types.Markdcl/types.Popdcl calls. These are
already handled by dclfunc and funcbody.
3) Using structargs (already used by genwrapper) to construct new
param/result lists from existing types.
4) Always accessing the parameter ONAME nodes through Field.Nname
instead of poking into the ODCLFIELD. Also, since creating a slice of
the entire parameter list is common, extract this out into a
paramNnames helper function.
5) Add a Type.IsVariadic method to simplify identifying variadic
function types.
Passes toolstash-check -gcflags=-dwarf=false. DWARF output changes
because using structargs in makepartialcall changes the generated
parameter names.
Change-Id: I6661d3699afdbe7852ad60db5a4ec6eeb2b696e4
Reviewed-on: https://go-review.googlesource.com/108216
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Focus on "isfoo" funcs that take a *Node, and conver them to isFoo
methods instead. This makes for more idiomatic Go code, and also more
readable func names.
Found candidates with grep, and applied most changes with sed. The funcs
chosen were isgoconst, isnil, and isblank. All had the same signature,
func(*Node) bool.
While at it, camelCase the isliteral and iszero function names. Don't
move these to methods, as they are only used in the backend part of gc,
which might one day be split into a separate package.
Passes toolstash -cmp on std cmd.
Change-Id: I4df081b12d36c46c253167c8841c5a841f1c5a16
Reviewed-on: https://go-review.googlesource.com/105555
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Multi-byte comparison operations were used on amd64, arm64, i386
and s390x for comparisons with constant arrays, but only amd64 and
i386 for comparisons with string constants. This CL combines the
check for platform capability, since they have the same requirements,
and also enables both on ppc64le, which likewise supports load merging.
Note that these optimizations currently use little-endian byte order,
which results in byte-reversal instructions on s390x. This should
be fixed at some point.
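For example (illustrative), a comparison against a short string constant
now compiles on these platforms to a length check plus a single
multi-byte load and compare:
func isGo(s string) bool {
	return s == "go!!" // one 32-bit load/compare of the constant bytes
}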
Change-Id: Ie612d13359b50c77f4d7c6e73fea4a59fa11f322
Reviewed-on: https://go-review.googlesource.com/102558
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Insert appropriate race/msan calls before each memory operation during
SSA construction.
This is conceptually simple, but subtle because we need to be careful
that inserted instrumentation calls don't clobber arguments that are
currently being prepared for a user function call.
reorder1 already handles introducing temporary variables for arguments
in some cases. This CL changes it to use them for all arguments when
instrumenting.
Also, we can't SSA struct types with more than one field while
instrumenting. Otherwise, concurrent uses of disjoint fields within an
SSA-able struct can introduce false races.
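For example (an illustrative case), disjoint field accesses like the
following must not be reported as racing, which keeping struct fields
as separate memory operations guarantees:
func f() {
	var p struct{ x, y int }
	done := make(chan struct{})
	go func() {
		p.x = 1 // touches only p.x
		close(done)
	}()
	p.y = 2 // disjoint from p.x: no real race
	<-done
}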
This is both somewhat better and somewhat worse than the old racewalk
instrumentation pass. We're now able to easily recognize that cases
like constructing non-escaping closures on the stack or accessing
closure variables don't need instrumentation calls. On the other hand,
spilling escaping parameters to the heap now results in an
instrumentation call.
Overall, this CL results in a small net reduction in the number of
instrumentation calls, but a small net increase in binary size for
instrumented executables. cmd/go ends up with 5.6% fewer calls, but a
2.4% larger binary.
Fixes #19054.
Change-Id: I70d1dd32ad6340e6fdb691e6d5a01452f58e97f3
Reviewed-on: https://go-review.googlesource.com/102817
Reviewed-by: Cherry Zhang <cherryyz@google.com>
NumComponents is used by racewalk to decide whether reads and writes
might occur to subobjects of an address. For that purpose,
blank fields matter.
It is also used to decide whether to inline == and != for a type.
For that purpose, blank fields may be ignored.
Add a parameter to NumComponents to support this distinction.
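For example (illustrative), blank fields count for one purpose but not
the other:
// For ==, the blank field is skipped (the spec compares only non-blank
// fields), so T has two components; for racewalk, the blank field is
// still memory that accesses may touch, so it counts as a third.
type T struct {
	a int
	_ [8]byte
	b int
}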
While we're here, document NumComponents, as requested in CL 59334.
Change-Id: I8c2021b172edadd6184848a32a74774dde1805c8
Reviewed-on: https://go-review.googlesource.com/103755
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
When making a shallow copy of a node, various methods were used,
including calling nod(OXXX, nil, nil) and then overwriting it, or
"n1 := *n" and then using &n1.
Add a copy method instead, simplifying all of those and making them
consistent.
Passes toolstash -cmp on std cmd.
Change-Id: I3f3fc88bad708edc712bf6d87214cda4ddc43b01
Reviewed-on: https://go-review.googlesource.com/72710
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Extract all rewrite-to-OLITERAL expressions to use a single setconst
helper function.
Does not pass toolstash-check for two reasons:
1) We now consistently clear Left/Right/etc when rewriting Nodes into
OLITERALs, which results in their inlining complexity being correctly
computed. So more functions can now be inlined.
2) We preserve Pos, so PC line tables change somewhat.
Change-Id: I2b5c293bee7c69c2ccd704677f5aba4ec40e3155
Reviewed-on: https://go-review.googlesource.com/103860
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>