The Zero op right after newobject has been removed. But this rule
does not cover Store of constant zero (for SSA-able types). Add
rules to cover Store op as well.
Updates #19027.
Change-Id: I5d2b62eeca0aa9ce8dc7205b264b779de01c660b
Reviewed-on: https://go-review.googlesource.com/36836
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Currently the conversion from constant divides to multiplies is mostly
done during the walk pass. This is suboptimal because SSA can
determine that the value being divided by is constant more often
(e.g. after inlining).
Change-Id: If1a9b993edd71be37396b9167f77da271966f85f
Reviewed-on: https://go-review.googlesource.com/37015
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Remove rotate generation from walk. Remove OLROT and ssa.Lrot* opcodes.
Generate rotates during SSA lowering for architectures that have them.
This CL will allow rotates to be generated in more situations,
like when the shift values are determined to be constant
only after some analysis.
Fixes#18254
Change-Id: I8d6d684ff5ce2511aceaddfda98b908007851079
Reviewed-on: https://go-review.googlesource.com/34232
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This is a mostly mechanical rename followed by manual fixes where necessary.
Change-Id: Ie5c670b133db978f15dc03e50dc2da0c80fc8842
Reviewed-on: https://go-review.googlesource.com/34137
Reviewed-by: David Lazar <lazard@golang.org>
Adds the new canMergeLoad function which can be used by rules to
decide whether a load can be merged into an operation. The function
ensures that the merge will not reorder the load relative to memory
operations (for example, stores) in such a way that the block can no
longer be scheduled.
This new function enables transformations such as:
MOVD 0(R1), R2
ADD R2, R3
to:
ADD 0(R1), R3
The two-operand form of the following instructions can now read a
single memory operand:
- ADD
- ADDC
- ADDW
- MULLD
- MULLW
- SUB
- SUBC
- SUBE
- SUBW
- AND
- ANDW
- OR
- ORW
- XOR
- XORW
Improves SHA3 performance by 6-8%.
Updates #15054.
Change-Id: Ibcb9122126cd1a26f2c01c0dfdbb42fe5e7b5b94
Reviewed-on: https://go-review.googlesource.com/29272
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Inputs to store[BHW] and cmpW(U) need not be correct
in more bits than are used by the instruction.
Added a pattern tailored to what appears to be cgo boilerplate.
Added a pattern (also seen in cgo boilerplate and hashing)
to replace {EQ,NE}-CMP-ANDconst with {EQ-NE}-ANDCCconst.
Added a pattern to clean up ANDconst shift distance inputs
(this was seen in hashing).
Simplify repeated and,or,xor.
Fixes#17109.
Change-Id: I68eac83e3e614d69ffe473a08953048c8b066d88
Reviewed-on: https://go-review.googlesource.com/30455
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Added rules for compare double and word immediate,
including those that use invertflags to cope with
flipped operands.
Change-Id: I594430a210e076e52299a2cc6ab074dbb04a02bd
Reviewed-on: https://go-review.googlesource.com/29763
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
Abandoned earlier efforts to expose zero register,
but left it in numbering to decrease squirrelyness of
register allocator.
ISELrelOp used in code generation of bool := x relOp y.
Some patterns added to better elide zero case and
some sign extension.
Updates: #17109
Change-Id: Ida7839f0023ca8f0ffddc0545f0ac269e65b05d9
Reviewed-on: https://go-review.googlesource.com/29380
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
The new SSA backend modifies the ABI slightly: R0 is now a usable
general purpose register.
Fixes#16677.
Change-Id: I367435ce921e0c7e79e021c80cf8ef5d1d1466cf
Reviewed-on: https://go-review.googlesource.com/28978
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
- Use machine instructions for uint64<->float conversions
- Do not enforce alignment on Zero/Move
ARM64 supports unaligned load/stores, but only aligned offset
or small offset can be encoded into instructions.
- Do combined loads
Change-Id: Iffca7dd0f13070b17b784861ce5a30af584680eb
Reviewed-on: https://go-review.googlesource.com/27086
Reviewed-by: David Chase <drchase@google.com>
SSA compiler on AMD64 may spill Duff-adjusted address as scalar. If
the object is on stack and the stack moves, the spilled address become
invalid.
Making the spill pointer-typed does not work. The Duff-adjusted address
points to the memory before the area to be zeroed and may be invalid.
This may cause stack scanning code panic.
Fix it by doing Duff-adjustment in genValue, so the intermediate value
is not seen by the reg allocator, and will not be spilled.
Add a test to cover both cases. As it depends on allocation, it may
be not always triggered.
Fixes#16515.
Change-Id: Ia81d60204782de7405b7046165ad063384ede0db
Reviewed-on: https://go-review.googlesource.com/25309
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Support the following:
- Shifts. ARM64 machine instructions only use lowest 6 bits of the
shift (i.e. mod 64). Use conditional selection instruction to
ensure Go semantics.
- Zero/Move. Alignment is ensured.
- Hmul, Avg64u, Sqrt.
- reserve R18 (platform register in ARM64 ABI) and R29 (frame pointer
in ARM64 ABI).
Everything compiles, all.bash passed (with non-SSA test disabled).
Change-Id: Ia8ed58dae5cbc001946f0b889357b258655078b1
Reviewed-on: https://go-review.googlesource.com/25290
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Fix up zero/move code, including duff calls and rep movs.
Handle the new ops generated by dec64.rules.
Fix constant shifts.
Change-Id: I7d89194b29b04311bfafa0fd93b9f5644af04df9
Reviewed-on: https://go-review.googlesource.com/25033
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
When rules are generated with -log, log rule application to a file.
The file is opened in append mode so multiple calls to the compiler
union their logs.
Change-Id: Ib35c7c85bf58e5909ea9231043f8cbaa6bf278b7
Reviewed-on: https://go-review.googlesource.com/23406
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
It is unused, remove the clutter.
Change-Id: I51a44326b125ef79241459c463441f76a289cc08
Reviewed-on: https://go-review.googlesource.com/22586
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Make sure we don't do O(n^2) work to eliminate a chain
of n copies.
benchmark old ns/op new ns/op delta
BenchmarkCopyElim1-8 1418 1406 -0.85%
BenchmarkCopyElim10-8 5289 5162 -2.40%
BenchmarkCopyElim100-8 52618 41684 -20.78%
BenchmarkCopyElim1000-8 2473878 424339 -82.85%
BenchmarkCopyElim10000-8 269373954 6367971 -97.64%
BenchmarkCopyElim100000-8 31272781165 104357244 -99.67%
Change-Id: I680f906f70f2ee1a8615cb1046bc510c77d59284
Reviewed-on: https://go-review.googlesource.com/22535
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
Combine stores into larger widths when it is safe to do so.
Add clobber() function so stray dead uses do not impede the
above rewrites.
Fix bug in loads where all intermediate values depending on
a small load (not just the load itself) must have no other uses.
We really need the small load to be dead after the rewrite..
Fixes#14267
Change-Id: Ib25666cb19777f65082c76238fba51a76beb5d74
Reviewed-on: https://go-review.googlesource.com/22326
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
Remove the "optimization" that was causing the issue.
For the following code the "optimization" was
converting v to (OpCopy x) which is wrong because
x doesn't dominate v.
b1:
y = ...
First .. b3
b2:
x = ...
Goto b3
b3:
v = phi x y
... use v ...
That "optimization" is likely no longer needed because
we now have a second opt pass with a dce in between
which removes blocks of type First.
For pkg/tools/linux_amd64/* the binary size drops
from 82142886 to 82060034.
Change-Id: I10428abbd8b32c5ca66fec3da2e6f3686dddbe31
Reviewed-on: https://go-review.googlesource.com/22312
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
We need to make sure that when we combine loads, we only do
so if there are no other uses of the load. We can't split
one load into two because that can then lead to inconsistent
loaded values in the presence of races.
Add some aggressive copy removal code so that phantom
"dead copy" uses of values are cleaned up promptly. This lets
us use x.Uses==1 conditions reliably.
Change-Id: I9037311db85665f3868dbeb3adb3de5c20728b38
Reviewed-on: https://go-review.googlesource.com/21853
Reviewed-by: Todd Neal <todd@tneal.org>
We need to make sure all the bounds checks pass before issuing
a load which combines several others. We do this by issuing the
combined load at the last load's block, where "last" = closest to
the leaf of the dominator tree.
Fixes#15002
Change-Id: I7358116db1e039a072c12c0a73d861f3815d72af
Reviewed-on: https://go-review.googlesource.com/21246
Reviewed-by: Todd Neal <todd@tneal.org>
Previously, t.IsPtr() reported whether t was represented with a
pointer, but some of its callers expected it to report whether t is an
actual Go pointer. Resolve this by renaming t.IsPtr to t.IsPtrShaped
and adding a new t.IsPtr method to report Go pointer types.
Updated a couple callers in gc/ssa.go to use IsPtr instead of
IsPtrShaped.
Passes toolstash -cmp.
Updates #15028.
Change-Id: I0a8154b5822ad8a6ad296419126ad01a3d2a5dc5
Reviewed-on: https://go-review.googlesource.com/21232
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Previously if we were only using the low bits of AuxInt,
the high bits were ignored and could be junk. This CL
changes that behavior to define the high bits to be the
sign-extended version of the low bits for all cases.
There are 2 main benefits:
- Deterministic representation. This helps with CSE.
(Const8 [0x1]) and (Const8 [0x101]) used to be the same "value"
but CSE couldn't see them as such.
- Testability. We can check that all ops leave AuxInt in a state
consistent with the new rule. In the old scheme, it was hard
to check whether a rule correctly used only the low-order bits.
Side benefits:
- ==0 and !=0 tests are easier.
Drawbacks:
- This differs from the runtime representation in registers,
where it is important that we allow upper bits to be undefined
(so we're not sign/zero-extending all the time).
- Ops that treat AuxInt as unsigned (shifts, mostly) need to be
a bit more careful.
Change-Id: I9a685ff27e36dc03287c9ab1cecd6c0b4045c819
Reviewed-on: https://go-review.googlesource.com/21256
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Most 64-bit x86 ops can only take a signed 32-bit constant.
Clean up our rewrite rules to enforce this restriction.
Modify the assembler to fail if the offset does not fit
in the instruction.
That last check triggers a few times on weird testing code.
Suppress those errors if the compiler itself generated errors.
Fixes#14862
Change-Id: I76559af035b38483b1e59621a8029fc66b3a5d1e
Reviewed-on: https://go-review.googlesource.com/20815
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Keep track of how many uses each Value has. Each appearance in
Value.Args and in Block.Control counts once.
The number of uses of a value is generically useful to
constrain rewrite rules. For instance, we might want to
prevent merging index operations into loads if the same
index expression is used lots of times.
But I have one use in particular for which the use count is required.
We must make sure we don't combine ops with loads if the load has
more than one use. Otherwise, we may split a single load
into multiple loads and that breaks perceived behavior in
the presence of races. In particular, the load of m.state
in sync/mutex.go:Lock can't be done twice. (I have a separate
CL which triggers the mutex failure. This CL has a test which
demonstrates a similar failure.)
Change-Id: Icaafa479239f48632a069d0c3f624e6ebc6b1f0e
Reviewed-on: https://go-review.googlesource.com/20790
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
Split the auxFloat type into 32/64 bit versions and perform checking for
exactly representable float32 values. Perform const folding on
float32/64. Comment out some const negation rules that the frontend
already performs.
Change-Id: Ib3f8d59fa8b30e50fe0267786cfb3c50a06169d2
Reviewed-on: https://go-review.googlesource.com/20568
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This increases the number of matches in make.bash
from 853 to 984.
Change-Id: I12697697a50ecd86d49698200144a4c80dd3e5a4
Reviewed-on: https://go-review.googlesource.com/20274
Reviewed-by: Todd Neal <todd@tneal.org>
The tree's pretty inconsistent about single space vs double space
after a period in documentation. Make it consistently a single space,
per earlier decisions. This means contributors won't be confused by
misleading precedence.
This CL doesn't use go/doc to parse. It only addresses // comments.
It was generated with:
$ perl -i -npe 's,^(\s*// .+[a-z]\.) +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.) +([A-Z])')
$ go test go/doc -update
Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7
Reviewed-on: https://go-review.googlesource.com/20022
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dave Day <djd@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Loads of stores from the same pointer with compatible types
can be replaced with a copy.
Change-Id: I514b3ed8e5b6a9c432946880eac67a51b1607932
Reviewed-on: https://go-review.googlesource.com/19743
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
* Merge copyelim into phielim.
* Add phielimValue to rewrite. cgoIsGoPointer is, for example, 2
instructions smaller now.
Change-Id: I8baeb206d1b3ef8aba4a6e3bcdc432959bcae2d5
Reviewed-on: https://go-review.googlesource.com/19462
Reviewed-by: David Chase <drchase@google.com>
ANDs of constants whose only set bits are leading or trailing can be
rewritten as two shifts instead. This is slightly faster for 32 or
64 bit operands.
Change-Id: Id5c1ff27e5a4df22fac67b03b9bddb944871145d
Reviewed-on: https://go-review.googlesource.com/19485
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Be more consistent about this. There's no reason to do the
pointer arithmetic on a different type, as sizeof(int) >=
sizeof(ptr) on all of our platforms. It simplifies our
rewrite rules also, except for a few that need duplication.
Add some more constant folding to get constant indexing and
slicing to fold down to nothing.
Change-Id: I3e56cdb14b3dc1a6a0514f0333e883f92c19e3c7
Reviewed-on: https://go-review.googlesource.com/16586
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Store floats in AuxInt to reduce allocations.
Change-Id: I101e6322530b4a0b2ea3591593ad022c992e8df8
Reviewed-on: https://go-review.googlesource.com/14320
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Store bools in AuxInt to reduce allocations.
Change-Id: Ibd26db67fca5e1e2803f53d7ef094897968b704b
Reviewed-on: https://go-review.googlesource.com/14276
Reviewed-by: Keith Randall <khr@golang.org>
MOVXload and MOVXstore opcodes have both an auxint offset
and an aux offset (a symbol name, like a local or arg or global).
Make sure we keep those values during rewrites.
Change-Id: Ic9fd61bf295b5d1457784c281079a4fb38f7ad3b
Reviewed-on: https://go-review.googlesource.com/13849
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Added F32 and F64 load, store, and addition.
Added F32 and F64 multiply.
Added F32 and F64 subtraction and division.
Added X15 to "clobber" for FP sub/div
Added FP constants
Added separate FP test in gc/testdata
Change-Id: Ifa60dbad948a40011b478d9605862c4b0cc9134c
Reviewed-on: https://go-review.googlesource.com/13612
Reviewed-by: Keith Randall <khr@golang.org>
Generating logging code every time causes large
diffs for small changes.
Since the intent is to use this for debugging only,
generate logging code only when requested.
Committed generated code will be logging free.
Change-Id: I9ef9e29c88b76c2557bad4c6b424b9db1255ec8b
Reviewed-on: https://go-review.googlesource.com/13623
Reviewed-by: Keith Randall <khr@golang.org>