The new SSA backend modifies the ABI slightly: R0 is now a usable
general purpose register.
Fixes#16677.
Change-Id: I367435ce921e0c7e79e021c80cf8ef5d1d1466cf
Reviewed-on: https://go-review.googlesource.com/28978
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
- Use machine instructions for uint64<->float conversions
- Do not enforce alignment on Zero/Move
ARM64 supports unaligned load/stores, but only aligned offset
or small offset can be encoded into instructions.
- Do combined loads
Change-Id: Iffca7dd0f13070b17b784861ce5a30af584680eb
Reviewed-on: https://go-review.googlesource.com/27086
Reviewed-by: David Chase <drchase@google.com>
SSA compiler on AMD64 may spill Duff-adjusted address as scalar. If
the object is on stack and the stack moves, the spilled address become
invalid.
Making the spill pointer-typed does not work. The Duff-adjusted address
points to the memory before the area to be zeroed and may be invalid.
This may cause stack scanning code panic.
Fix it by doing Duff-adjustment in genValue, so the intermediate value
is not seen by the reg allocator, and will not be spilled.
Add a test to cover both cases. As it depends on allocation, it may
be not always triggered.
Fixes#16515.
Change-Id: Ia81d60204782de7405b7046165ad063384ede0db
Reviewed-on: https://go-review.googlesource.com/25309
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Support the following:
- Shifts. ARM64 machine instructions only use lowest 6 bits of the
shift (i.e. mod 64). Use conditional selection instruction to
ensure Go semantics.
- Zero/Move. Alignment is ensured.
- Hmul, Avg64u, Sqrt.
- reserve R18 (platform register in ARM64 ABI) and R29 (frame pointer
in ARM64 ABI).
Everything compiles, all.bash passed (with non-SSA test disabled).
Change-Id: Ia8ed58dae5cbc001946f0b889357b258655078b1
Reviewed-on: https://go-review.googlesource.com/25290
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Fix up zero/move code, including duff calls and rep movs.
Handle the new ops generated by dec64.rules.
Fix constant shifts.
Change-Id: I7d89194b29b04311bfafa0fd93b9f5644af04df9
Reviewed-on: https://go-review.googlesource.com/25033
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
When rules are generated with -log, log rule application to a file.
The file is opened in append mode so multiple calls to the compiler
union their logs.
Change-Id: Ib35c7c85bf58e5909ea9231043f8cbaa6bf278b7
Reviewed-on: https://go-review.googlesource.com/23406
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
It is unused, remove the clutter.
Change-Id: I51a44326b125ef79241459c463441f76a289cc08
Reviewed-on: https://go-review.googlesource.com/22586
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Make sure we don't do O(n^2) work to eliminate a chain
of n copies.
benchmark old ns/op new ns/op delta
BenchmarkCopyElim1-8 1418 1406 -0.85%
BenchmarkCopyElim10-8 5289 5162 -2.40%
BenchmarkCopyElim100-8 52618 41684 -20.78%
BenchmarkCopyElim1000-8 2473878 424339 -82.85%
BenchmarkCopyElim10000-8 269373954 6367971 -97.64%
BenchmarkCopyElim100000-8 31272781165 104357244 -99.67%
Change-Id: I680f906f70f2ee1a8615cb1046bc510c77d59284
Reviewed-on: https://go-review.googlesource.com/22535
Reviewed-by: Alexandru Moșoi <alexandru@mosoi.ro>
Combine stores into larger widths when it is safe to do so.
Add clobber() function so stray dead uses do not impede the
above rewrites.
Fix bug in loads where all intermediate values depending on
a small load (not just the load itself) must have no other uses.
We really need the small load to be dead after the rewrite..
Fixes#14267
Change-Id: Ib25666cb19777f65082c76238fba51a76beb5d74
Reviewed-on: https://go-review.googlesource.com/22326
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
Remove the "optimization" that was causing the issue.
For the following code the "optimization" was
converting v to (OpCopy x) which is wrong because
x doesn't dominate v.
b1:
y = ...
First .. b3
b2:
x = ...
Goto b3
b3:
v = phi x y
... use v ...
That "optimization" is likely no longer needed because
we now have a second opt pass with a dce in between
which removes blocks of type First.
For pkg/tools/linux_amd64/* the binary size drops
from 82142886 to 82060034.
Change-Id: I10428abbd8b32c5ca66fec3da2e6f3686dddbe31
Reviewed-on: https://go-review.googlesource.com/22312
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
We need to make sure that when we combine loads, we only do
so if there are no other uses of the load. We can't split
one load into two because that can then lead to inconsistent
loaded values in the presence of races.
Add some aggressive copy removal code so that phantom
"dead copy" uses of values are cleaned up promptly. This lets
us use x.Uses==1 conditions reliably.
Change-Id: I9037311db85665f3868dbeb3adb3de5c20728b38
Reviewed-on: https://go-review.googlesource.com/21853
Reviewed-by: Todd Neal <todd@tneal.org>
We need to make sure all the bounds checks pass before issuing
a load which combines several others. We do this by issuing the
combined load at the last load's block, where "last" = closest to
the leaf of the dominator tree.
Fixes#15002
Change-Id: I7358116db1e039a072c12c0a73d861f3815d72af
Reviewed-on: https://go-review.googlesource.com/21246
Reviewed-by: Todd Neal <todd@tneal.org>
Previously, t.IsPtr() reported whether t was represented with a
pointer, but some of its callers expected it to report whether t is an
actual Go pointer. Resolve this by renaming t.IsPtr to t.IsPtrShaped
and adding a new t.IsPtr method to report Go pointer types.
Updated a couple callers in gc/ssa.go to use IsPtr instead of
IsPtrShaped.
Passes toolstash -cmp.
Updates #15028.
Change-Id: I0a8154b5822ad8a6ad296419126ad01a3d2a5dc5
Reviewed-on: https://go-review.googlesource.com/21232
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Previously if we were only using the low bits of AuxInt,
the high bits were ignored and could be junk. This CL
changes that behavior to define the high bits to be the
sign-extended version of the low bits for all cases.
There are 2 main benefits:
- Deterministic representation. This helps with CSE.
(Const8 [0x1]) and (Const8 [0x101]) used to be the same "value"
but CSE couldn't see them as such.
- Testability. We can check that all ops leave AuxInt in a state
consistent with the new rule. In the old scheme, it was hard
to check whether a rule correctly used only the low-order bits.
Side benefits:
- ==0 and !=0 tests are easier.
Drawbacks:
- This differs from the runtime representation in registers,
where it is important that we allow upper bits to be undefined
(so we're not sign/zero-extending all the time).
- Ops that treat AuxInt as unsigned (shifts, mostly) need to be
a bit more careful.
Change-Id: I9a685ff27e36dc03287c9ab1cecd6c0b4045c819
Reviewed-on: https://go-review.googlesource.com/21256
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Most 64-bit x86 ops can only take a signed 32-bit constant.
Clean up our rewrite rules to enforce this restriction.
Modify the assembler to fail if the offset does not fit
in the instruction.
That last check triggers a few times on weird testing code.
Suppress those errors if the compiler itself generated errors.
Fixes#14862
Change-Id: I76559af035b38483b1e59621a8029fc66b3a5d1e
Reviewed-on: https://go-review.googlesource.com/20815
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Keep track of how many uses each Value has. Each appearance in
Value.Args and in Block.Control counts once.
The number of uses of a value is generically useful to
constrain rewrite rules. For instance, we might want to
prevent merging index operations into loads if the same
index expression is used lots of times.
But I have one use in particular for which the use count is required.
We must make sure we don't combine ops with loads if the load has
more than one use. Otherwise, we may split a single load
into multiple loads and that breaks perceived behavior in
the presence of races. In particular, the load of m.state
in sync/mutex.go:Lock can't be done twice. (I have a separate
CL which triggers the mutex failure. This CL has a test which
demonstrates a similar failure.)
Change-Id: Icaafa479239f48632a069d0c3f624e6ebc6b1f0e
Reviewed-on: https://go-review.googlesource.com/20790
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
Split the auxFloat type into 32/64 bit versions and perform checking for
exactly representable float32 values. Perform const folding on
float32/64. Comment out some const negation rules that the frontend
already performs.
Change-Id: Ib3f8d59fa8b30e50fe0267786cfb3c50a06169d2
Reviewed-on: https://go-review.googlesource.com/20568
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This increases the number of matches in make.bash
from 853 to 984.
Change-Id: I12697697a50ecd86d49698200144a4c80dd3e5a4
Reviewed-on: https://go-review.googlesource.com/20274
Reviewed-by: Todd Neal <todd@tneal.org>
The tree's pretty inconsistent about single space vs double space
after a period in documentation. Make it consistently a single space,
per earlier decisions. This means contributors won't be confused by
misleading precedence.
This CL doesn't use go/doc to parse. It only addresses // comments.
It was generated with:
$ perl -i -npe 's,^(\s*// .+[a-z]\.) +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.) +([A-Z])')
$ go test go/doc -update
Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7
Reviewed-on: https://go-review.googlesource.com/20022
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dave Day <djd@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Loads of stores from the same pointer with compatible types
can be replaced with a copy.
Change-Id: I514b3ed8e5b6a9c432946880eac67a51b1607932
Reviewed-on: https://go-review.googlesource.com/19743
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
* Merge copyelim into phielim.
* Add phielimValue to rewrite. cgoIsGoPointer is, for example, 2
instructions smaller now.
Change-Id: I8baeb206d1b3ef8aba4a6e3bcdc432959bcae2d5
Reviewed-on: https://go-review.googlesource.com/19462
Reviewed-by: David Chase <drchase@google.com>
ANDs of constants whose only set bits are leading or trailing can be
rewritten as two shifts instead. This is slightly faster for 32 or
64 bit operands.
Change-Id: Id5c1ff27e5a4df22fac67b03b9bddb944871145d
Reviewed-on: https://go-review.googlesource.com/19485
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Be more consistent about this. There's no reason to do the
pointer arithmetic on a different type, as sizeof(int) >=
sizeof(ptr) on all of our platforms. It simplifies our
rewrite rules also, except for a few that need duplication.
Add some more constant folding to get constant indexing and
slicing to fold down to nothing.
Change-Id: I3e56cdb14b3dc1a6a0514f0333e883f92c19e3c7
Reviewed-on: https://go-review.googlesource.com/16586
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Store floats in AuxInt to reduce allocations.
Change-Id: I101e6322530b4a0b2ea3591593ad022c992e8df8
Reviewed-on: https://go-review.googlesource.com/14320
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Store bools in AuxInt to reduce allocations.
Change-Id: Ibd26db67fca5e1e2803f53d7ef094897968b704b
Reviewed-on: https://go-review.googlesource.com/14276
Reviewed-by: Keith Randall <khr@golang.org>
MOVXload and MOVXstore opcodes have both an auxint offset
and an aux offset (a symbol name, like a local or arg or global).
Make sure we keep those values during rewrites.
Change-Id: Ic9fd61bf295b5d1457784c281079a4fb38f7ad3b
Reviewed-on: https://go-review.googlesource.com/13849
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Added F32 and F64 load, store, and addition.
Added F32 and F64 multiply.
Added F32 and F64 subtraction and division.
Added X15 to "clobber" for FP sub/div
Added FP constants
Added separate FP test in gc/testdata
Change-Id: Ifa60dbad948a40011b478d9605862c4b0cc9134c
Reviewed-on: https://go-review.googlesource.com/13612
Reviewed-by: Keith Randall <khr@golang.org>
Generating logging code every time causes large
diffs for small changes.
Since the intent is to use this for debugging only,
generate logging code only when requested.
Committed generated code will be logging free.
Change-Id: I9ef9e29c88b76c2557bad4c6b424b9db1255ec8b
Reviewed-on: https://go-review.googlesource.com/13623
Reviewed-by: Keith Randall <khr@golang.org>
This makes it easier to investigate and
understand rewrite behavior.
Change-Id: I790e8964922caf98362ce8a6d6972f52d83eefa8
Reviewed-on: https://go-review.googlesource.com/13588
Reviewed-by: Keith Randall <khr@golang.org>
Use a version of Floyd's cycle finding algorithm,
but advance by 1 and 1/2 steps per cycle rather
than by 1 and 2. It is simpler and should be cheaper
in the normal, acyclic case.
This should fix the 386 and arm builds,
which are currently hung.
Change-Id: If8bd443011b28a5ecb004a549239991d3dfc862b
Reviewed-on: https://go-review.googlesource.com/13473
Reviewed-by: Keith Randall <khr@golang.org>
Fix an issue where doasm fails if trying to multiply by a larger
than 32 bit const (doasm: notfound ft=9 tt=14 00008 IMULQ
$34359738369, CX 9 14). Fix truncation of 64 to 32 bit integer
when generating LEA causing incorrect values to be computed.
Change-Id: I1e65b63cc32ac673a9bb5a297b578b44c2f1ac8f
Reviewed-on: https://go-review.googlesource.com/12678
Reviewed-by: Keith Randall <khr@golang.org>
Handle multiplication with -1, 0, 3, 5, 9 and all powers of two.
Change-Id: I8e87e7670dae389aebf6f446d7a56950cacb59e0
Reviewed-on: https://go-review.googlesource.com/12350
Reviewed-by: Keith Randall <khr@golang.org>
removePredecessor can change which blocks are live.
However, it cannot remove dead blocks from the function's
slice of blocks because removePredecessor may have been
called from within a function doing a walk of the blocks.
CL 11879 did not handle this correctly and broke the build.
To fix this, mark the block as dead but leave its actual
removal for a deadcode pass. Blocks that are dead must have
no successors, predecessors, values, or control values,
so they will generally be ignored by other passes.
To be safe, we add a deadcode pass after the opt pass,
which is the only other pass that calls removePredecessor.
Two alternatives that I considered and discarded:
(1) Make all call sites aware of the fact that removePrecessor
might make arbitrary changes to the list of blocks. This
will needlessly complicate callers.
(2) Handle the things that can go wrong in practice when
we encounter a dead-but-not-removed block. CL 11930 takes
this approach (and the tests are stolen from that CL).
However, this is just patching over the problem.
Change-Id: Icf0687b0a8148ce5e96b2988b668804411b05bd8
Reviewed-on: https://go-review.googlesource.com/12004
Reviewed-by: Todd Neal <todd@tneal.org>
Reviewed-by: Michael Matloob <michaelmatloob@gmail.com>
Use *Node of type ONAME instead of string as the key for variable maps.
This will prevent aliasing between two identically named but
differently scoped variables.
Introduce an Aux value that encodes the offset of a variable
from a base pointer (either global base pointer or stack pointer).
Allow LEAQ and derivatives (MOVQ, etc.) to also have such an Aux field.
Allocate space for AUTO variables in stackalloc.
Change-Id: Ibdccdaea4bbc63a1f4882959ac374f2b467e3acd
Reviewed-on: https://go-review.googlesource.com/11238
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
The SSA implementation logs for three purposes:
* debug logging
* fatal errors
* unimplemented features
Separating these three uses lets us attempt an SSA
implementation for all functions, not just
_ssa functions. This turns the entire standard
library into a compilation test, and makes it
easy to figure out things like
"how much coverage does SSA have now" and
"what should we do next to get more coverage?".
Functions called _ssa are still special.
They log profusely by default and
the output of the SSA implementation
is used. For all other functions,
logging is off, and the implementation
is built and discarded, due to lack of
support for the runtime.
While we're here, fix a few minor bugs and
add some extra Unimplementeds to allow
all.bash to pass.
As of now, SSA handles 20.79% of the functions
in the standard library (689 of 3314).
The top missing features are:
10.03% 2597 SSA unimplemented: zero for type error not implemented
7.79% 2016 SSA unimplemented: addr: bad op DOTPTR
7.33% 1898 SSA unimplemented: unhandled expr EQ
6.10% 1579 SSA unimplemented: unhandled expr OROR
4.91% 1271 SSA unimplemented: unhandled expr NE
4.49% 1163 SSA unimplemented: unhandled expr LROT
4.00% 1036 SSA unimplemented: unhandled expr LEN
3.56% 923 SSA unimplemented: unhandled stmt CALLFUNC
2.37% 615 SSA unimplemented: zero for type []byte not implemented
1.90% 492 SSA unimplemented: unhandled stmt CALLMETH
1.74% 450 SSA unimplemented: unhandled expr CALLINTER
1.74% 450 SSA unimplemented: unhandled expr DOT
1.71% 444 SSA unimplemented: unhandled expr ANDAND
1.65% 426 SSA unimplemented: unhandled expr CLOSUREVAR
1.54% 400 SSA unimplemented: unhandled expr CALLMETH
1.51% 390 SSA unimplemented: unhandled stmt SWITCH
1.47% 380 SSA unimplemented: unhandled expr CONV
1.33% 345 SSA unimplemented: addr: bad op *
1.30% 336 SSA unimplemented: unhandled OLITERAL 6
Change-Id: I4ca07951e276714dc13c31de28640aead17a1be7
Reviewed-on: https://go-review.googlesource.com/11160
Reviewed-by: Keith Randall <khr@golang.org>