Explcitly block fused multiply-add pattern matching when a cast is used
after the multiplication, for example:
- (a * b) + c // can emit fused multiply-add
- float64(a * b) + c // cannot emit fused multiply-add
float{32,64} and complex{64,128} casts of matching types are now kept
as OCONV operations rather than being replaced with OCONVNOP operations
because they now imply a rounding operation (and therefore aren't a
no-op anymore).
Operations (for example, multiplication) on complex types may utilize
fused multiply-add and -subtract instructions internally. There is no
way to disable this behavior at the moment.
Improves the performance of the floating point implementation of
poly1305:
name old speed new speed delta
64 246MB/s ± 0% 275MB/s ± 0% +11.48% (p=0.000 n=10+8)
1K 312MB/s ± 0% 357MB/s ± 0% +14.41% (p=0.000 n=10+10)
64Unaligned 246MB/s ± 0% 274MB/s ± 0% +11.43% (p=0.000 n=10+10)
1KUnaligned 312MB/s ± 0% 357MB/s ± 0% +14.39% (p=0.000 n=10+8)
Updates #17895.
Change-Id: Ia771d275bb9150d1a598f8cc773444663de5ce16
Reviewed-on: https://go-review.googlesource.com/36963
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Keith pointed out that these rules should zero extend during the review
of CL 36845. In practice the generic rules are responsible for eliminating
most load-hit-stores and they do not have this problem. When the s390x
rules are triggered any cast following the elided load-hit-store is
kept because of the sequence the rules are applied in (i.e. the load is
removed before the zero extension gets a chance to be merged into the load).
It is therefore not clear that this issue results in any functional bugs.
This CL includes a test, but it only tests the generic rules currently.
Change-Id: Idbc43c782097a3fb159be293ec3138c5b36858ad
Reviewed-on: https://go-review.googlesource.com/37154
Run-TryBot: Michael Munday <munday@ca.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
If a program wants to evaluate math.Sqrt for any constant value
(for example, math.Sqrt(3)), we can replace that expression with
its evaluation (1.7320508075688772) at compile time, instead of
generating a SQRT assembly command or equivalent.
Adds tests that math.Sqrt generates the correct values. I also
compiled a short program and verified that the Sqrt expression was
replaced by a constant value in the "after opt" step.
Adds a short doc to the top of generic.rules explaining what the file
does and how other files interact with it.
Fixes#15543.
Change-Id: I6b6e63ac61cec50763a09ba581024adeee03d4fa
Reviewed-on: https://go-review.googlesource.com/27457
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
We were generating format strings containing
a lone %. Vet legitimately complains:
cmd/compile/internal/gc/constFold_test.go:339: unrecognized printf verb ' '
The fix doesn't make for very readable code,
but it is simple and obviously correct.
Updates #11041
Change-Id: I90bd2d1d140887f5229752a279f7e46921472fbb
Reviewed-on: https://go-review.googlesource.com/27115
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This CL implements the following optimizations for ARM:
- use shifted ops (e.g. ADD R1<<2, R2) and indexed load/stores
- break up shift ops. Shifts used to be one SSA op that generates
multiple instructions. We break them up to multiple ops, which
allows constant folding and CSE for comparisons. Conditional moves
are introduced for this.
- simplify zero/sign-extension ops.
Updates #15365.
Change-Id: I55e262a776a7ef2a1505d75e04d1208913c35d39
Reviewed-on: https://go-review.googlesource.com/24512
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
This CL adds support of Div, Mod, Convert, GetClosurePtr and 64-bit indexing
support to SSA backend for ARM.
Add tests for 64-bit indexing to cmd/compile/internal/gc/testdata/string_ssa.go.
Tests cmd/compile/internal/gc/testdata/*_ssa.go passed, except compound_ssa.go
and fp_ssa.go.
Progress on SSA for ARM. Still not complete. Essentially the only unsupported
part is floating point.
Updates #15365.
Change-Id: I269e88b67f641c25e7a813d910c96d356d236bff
Reviewed-on: https://go-review.googlesource.com/23542
Reviewed-by: David Chase <drchase@google.com>
Covers a bunch of constant-folding rules in generic.rules that aren't
being covered currently.
Increases coverage in generic.rules from 65% to 72%.
Change-Id: I7bf58809faf22e97070183b42e6dd7d3f35bf5f9
Reviewed-on: https://go-review.googlesource.com/23407
Run-TryBot: Todd Neal <todd@tneal.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
Why not? Because the 386 backend can't handle one of them.
But other than that, it should work.
Change-Id: Iaeb9735f8c3c281136a0734376dec5ddba21be3b
Reviewed-on: https://go-review.googlesource.com/22748
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Combine stores into larger widths when it is safe to do so.
Add clobber() function so stray dead uses do not impede the
above rewrites.
Fix bug in loads where all intermediate values depending on
a small load (not just the load itself) must have no other uses.
We really need the small load to be dead after the rewrite..
Fixes#14267
Change-Id: Ib25666cb19777f65082c76238fba51a76beb5d74
Reviewed-on: https://go-review.googlesource.com/22326
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
We need to make sure that when we combine loads, we only do
so if there are no other uses of the load. We can't split
one load into two because that can then lead to inconsistent
loaded values in the presence of races.
Add some aggressive copy removal code so that phantom
"dead copy" uses of values are cleaned up promptly. This lets
us use x.Uses==1 conditions reliably.
Change-Id: I9037311db85665f3868dbeb3adb3de5c20728b38
Reviewed-on: https://go-review.googlesource.com/21853
Reviewed-by: Todd Neal <todd@tneal.org>
No point in doing anything for x=x assignments.
In addition, skipping these assignments prevents generating:
VARDEF x
COPY x -> x
which is bad because x is incorrectly considered
dead before the vardef.
Fixes#14904
Change-Id: I6817055ec20bcc34a9648617e0439505ee355f82
Reviewed-on: https://go-review.googlesource.com/21470
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>
In b we only need the division by 0 check.
func b(i uint, v []byte) byte {
return v[i%uint(len(v))]
}
Updates #15079.
Change-Id: Ic7491e677dd57cd6ba577efbce576dbb6e023cbd
Reviewed-on: https://go-review.googlesource.com/21502
Run-TryBot: Alexandru Moșoi <alexandru@mosoi.ro>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Ahmed Waheed <oneofone@gmail.com>
Don't write back parts of a slicing operation if they
are unchanged from the source of the slice. For example:
x.s = x.s[0:5] // don't write back pointer or cap
x.s = x.s[:5] // don't write back pointer or cap
x.s = x.s[:5:7] // don't write back pointer
There is more to be done here, for example:
x.s = x.s[:len(x.s):7] // don't write back ptr or len
This CL can't handle that one yet.
Fixes#14855
Change-Id: Id1e1a4fa7f3076dc1a76924a7f1cd791b81909bb
Reviewed-on: https://go-review.googlesource.com/20954
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Keith Randall <khr@golang.org>
Make sure symbol gets carried along by load-combining rule.
Add the new load into the right block where we know that
mem is live.
Use auxInt field to carry i along instead of an explicit ADDQ.
Incorporate LEA ops into MOVBQZX and friends.
Change-Id: I587f7c6120b98fd2a0d48ddd6ddd13345d4421b4
Reviewed-on: https://go-review.googlesource.com/20732
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Todd Neal <todd@tneal.org>
When the pointer offset is non-zero in the small loads, we need to add the offset
when converting to the larger load.
Fixes#14694
Change-Id: I5ba8bcb3b9ce26c7fae0c4951500b9ef0fed54cd
Reviewed-on: https://go-review.googlesource.com/20333
Reviewed-by: Keith Randall <khr@golang.org>
The tree's pretty inconsistent about single space vs double space
after a period in documentation. Make it consistently a single space,
per earlier decisions. This means contributors won't be confused by
misleading precedence.
This CL doesn't use go/doc to parse. It only addresses // comments.
It was generated with:
$ perl -i -npe 's,^(\s*// .+[a-z]\.) +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.) +([A-Z])')
$ go test go/doc -update
Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7
Reviewed-on: https://go-review.googlesource.com/20022
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dave Day <djd@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The upper bits of 8/16/32 bit constants are undefined. We need to
truncate in order to prevent x86.oclass misidentifying the size of the
constant.
Fixes#14389
Change-Id: I3e5ff79cd904376572a93f489ba7e152a5cb6e60
Reviewed-on: https://go-review.googlesource.com/19740
Reviewed-by: Keith Randall <khr@golang.org>
This adds a test case with aliased pointers to ensure modifications to
dse don't remove valid stores.
Change-Id: I143653250f46a403835218ec685bcd336d5087ef
Reviewed-on: https://go-review.googlesource.com/19795
Reviewed-by: Keith Randall <khr@golang.org>
* Phis can have variable number of arguments, but rulegen assumed that
each operation has fixed number of arguments.
* Rewriting Phis is necessary to handle the following case:
func f1_ssa(a bool, x int) int {
v := 0
if a {
v = -1
} else {
v = -1
}
return x|v
}
Change-Id: Iff6bd411b854f3d1d6d3ce21934bf566757094f2
Reviewed-on: https://go-review.googlesource.com/19412
Reviewed-by: Keith Randall <khr@golang.org>
* Simplify comparisons of form a + const1 == const2 or a + const1 != const2.
* Canonicalize Eq, Neq, Add, Sub to have a constant as first argument.
Needed for the above new rules and helps constant folding.
Change-Id: I8078702a5daa706da57106073a3e9f640a67f486
Reviewed-on: https://go-review.googlesource.com/19192
Reviewed-by: Keith Randall <khr@golang.org>
In code that does:
var x, z int32
var y int64
z = phi(x, int32(y))
We silently drop the int32 cast because truncation is a no-op.
The phi operation needs to make sure it uses the size of the
phi, not the size of its arguments, when generating spills.
Change-Id: I1f7baf44f019256977a46fdd3dad1972be209042
Reviewed-on: https://go-review.googlesource.com/18390
Reviewed-by: David Chase <drchase@google.com>
Make sure that when a pointer value is live across a function
call, we save it as a pointer. (And similarly a uintptr
live across a function call should not be saved as a pointer.)
Add a nasty test case.
This is probably what is preventing the merge from master
to dev.ssa. Signs point to something like this bug happening
in mallocgc.
Change-Id: Ib23fa1251b8d1c50d82c6a448cb4a4fc28219029
Reviewed-on: https://go-review.googlesource.com/16830
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
This improves cse and works correctly now that divide by zero is checked
explicitly.
Change-Id: If54fbe403ed5230b897afc5def644ba9f0056dfd
Reviewed-on: https://go-review.googlesource.com/16454
Run-TryBot: Todd Neal <todd@tneal.org>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Replace REP MOVSB with all the copying techniques used by the
old compiler. Copy in chunks, DUFFCOPY, etc.
Introduces MOVO opcodes and an Int128 type to move around
16 bytes at a time.
Change-Id: I1e73e68ca1d8b3dd58bb4af2f4c9e5d9bf13a502
Reviewed-on: https://go-review.googlesource.com/16174
Reviewed-by: Todd Neal <todd@tneal.org>
Run-TryBot: Keith Randall <khr@golang.org>
Cleaned up first-block-in-function code.
Added cases for |PHEAP for PPARAM and PAUTO.
Made PPARAMOUT act more like PAUTO for purposes
of address generation and vardef placement.
Added cases for OCLOSUREVAR and Ops for getting closure
pointer. Closure ops are scheduled at top of entry block
to capture DX.
Wrote test that seems to show proper behavior for addressed
parameters, locals, and returns.
Change-Id: Iee93ebf9e3d9f74cfb4d1c1da8038eb278d8a857
Reviewed-on: https://go-review.googlesource.com/14650
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: David Chase <drchase@google.com>
Load-and-sign-extend opcodes were being generated in the
wrong block, leading to having more than one memory variable
live at once. Fix the rules + add a test.
Change-Id: Iadf80e55ea901549c15c628ae295c2d0f1f64525
Reviewed-on: https://go-review.googlesource.com/14591
Reviewed-by: Todd Neal <todd@tneal.org>
Run-TryBot: Todd Neal <todd@tneal.org>
They were using the result type to look up the op, not the arg type.
Change-Id: I0641cba363fa6e7a66ad0860aa340106c10c2cea
Reviewed-on: https://go-review.googlesource.com/14469
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Using the main package causes a binary to be generated. That binary
clutters up git listings.
Use a non-main package instead, so the results of a successful
compilation are thrown away.
Change-Id: I3ac91fd69ad297a5c0fe035c22fdef290b7dfbc4
Reviewed-on: https://go-review.googlesource.com/14399
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
The after-defer test jumps to a deferreturn site. Some functions
(those with infinite loops) have no deferreturn site. Add one
so we have one to jump to.
Change-Id: I505e7f3f888f5e7d03ca49a3477b41cf1f78eb8a
Reviewed-on: https://go-review.googlesource.com/14349
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
complex128 was being treated as a complex64
Fixes math/cmplx.
Change-Id: I2996915b4cb6b94198d41cf08a30bd8531b9fec5
Reviewed-on: https://go-review.googlesource.com/14206
Reviewed-by: Keith Randall <khr@golang.org>
liblink was rewriting xor by a negative zero (used by SSA
for negation) as XORPS reg,reg.
Fixes strconv.
Change-Id: I627a0a7366618e6b07ba8f0ad0db0e102340c5e3
Reviewed-on: https://go-review.googlesource.com/14200
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Modified to use the truncating conversion.
Fixes reflect.
Change-Id: I47bf3200abc2d2c662939a2a2351e2ff84168f4a
Reviewed-on: https://go-review.googlesource.com/14167
Reviewed-by: David Chase <drchase@google.com>
liblink rewrites MOV $0, reg into XOR reg, reg. Make MOVxconst clobber
flags so we don't generate invalid code in the unlikely case that it
matters. In testing, this change leads to no additional regenerated
flags due to a scheduling fix in CL14042.
Change-Id: I7bc1cfee94ef83beb2f97c31ec6a97e19872fb89
Reviewed-on: https://go-review.googlesource.com/14043
Reviewed-by: Keith Randall <khr@golang.org>