Commit graph

192 commits

Author SHA1 Message Date
David Chase
7bda6154ca cmd/compile: add generic optimization patterns for late-expanded calls.
Repeats existing patterns for old calls, so that these will apply
during the optimization phases that precede call expansion.

Change-Id: I1ca0a78c159aa1a51004db217edde4ecc772b646
Reviewed-on: https://go-review.googlesource.com/c/go/+/248190
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
2020-10-12 22:39:53 +00:00
Lynn Boger
cc2a5cf4b8 cmd/compile,cmd/internal/obj/ppc64: fix some shift rules due to a regression
A recent change to improve shifts was generating some
invalid cases when the rule was based on an AND. The
extended mnemonics CLRLSLDI and CLRLSLWI only allow
certain values for the operands and in the mask case
those values were not being checked properly. This
adds a check to those rules to verify that the
'b' and 'n' values used when an AND was part of the rule
have correct values.

There was a bug in some diag messages in asm9. The
message expected 3 values but only provided 2. Those are
corrected here also.

The test/codegen/shift.go was updated to add a few more
cases to check for the case mentioned here.

Some of the comments that mention the order of operands
in these extended mnemonics were wrong and those have been
corrected.

Fixes #41683.

Change-Id: If5bb860acaa5051b9e0cd80784b2868b85898c31
Reviewed-on: https://go-review.googlesource.com/c/go/+/258138
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
Reviewed-by: Paul Murphy <murp@ibm.com>
Reviewed-by: Carlos Eduardo Seo <carlos.seo@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
2020-10-01 18:51:18 +00:00
David Chase
adef4deeb8 cmd/compile: enable late expansion for interface calls
Includes a few tweaks to Value.copyOf(a) (make it a no-op for
a self-copy) and new pattern hack "___" (3 underscores) is
like ellipsis, except the replacement doesn't need to have
matching ellipsis/underscores.

Moved the arg-length check in generated pattern-matching code
BEFORE the args are probed, because not all instances of
variable length OpFoo will have all the args mentioned in
some rule for OpFoo, and when that happens, the compiler
panics without the early check.

Change-Id: I66de40672b3794a6427890ff96c805a488d783f4
Reviewed-on: https://go-review.googlesource.com/c/go/+/247537
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-10-01 17:47:21 +00:00
Lynn Boger
967465da29 cmd/compile: use combined shifts to improve array addressing on ppc64x
This change adds rules to find pairs of instructions that can
be combined into a single shifts. These instruction sequences
are common in array addressing within loops. Improvements can
be seen in many crypto packages and the hash packages.

These are based on the extended mnemonics found in the ISA
sections C.8.1 and C.8.2.

Some rules in PPC64.rules were moved because the ordering prevented
some matching.

The following results were generated on power9.

hash/crc32:
    CRC32/poly=Koopman/size=40/align=0          195ns ± 0%     163ns ± 0%  -16.41%
    CRC32/poly=Koopman/size=40/align=1          200ns ± 0%     163ns ± 0%  -18.50%
    CRC32/poly=Koopman/size=512/align=0        1.98µs ± 0%    1.67µs ± 0%  -15.46%
    CRC32/poly=Koopman/size=512/align=1        1.98µs ± 0%    1.69µs ± 0%  -14.80%
    CRC32/poly=Koopman/size=1kB/align=0        3.90µs ± 0%    3.31µs ± 0%  -15.27%
    CRC32/poly=Koopman/size=1kB/align=1        3.85µs ± 0%    3.31µs ± 0%  -14.15%
    CRC32/poly=Koopman/size=4kB/align=0        15.3µs ± 0%    13.1µs ± 0%  -14.22%
    CRC32/poly=Koopman/size=4kB/align=1        15.4µs ± 0%    13.1µs ± 0%  -14.79%
    CRC32/poly=Koopman/size=32kB/align=0        137µs ± 0%     105µs ± 0%  -23.56%
    CRC32/poly=Koopman/size=32kB/align=1        137µs ± 0%     105µs ± 0%  -23.53%

crypto/rc4:
    RC4_128    733ns ± 0%    650ns ± 0%  -11.32%  (p=1.000 n=1+1)
    RC4_1K    5.80µs ± 0%   5.17µs ± 0%  -10.89%  (p=1.000 n=1+1)
    RC4_8K    45.7µs ± 0%   40.8µs ± 0%  -10.73%  (p=1.000 n=1+1)

crypto/sha1:
    Hash8Bytes       635ns ± 0%     613ns ± 0%   -3.46%  (p=1.000 n=1+1)
    Hash320Bytes    2.30µs ± 0%    2.18µs ± 0%   -5.38%  (p=1.000 n=1+1)
    Hash1K          5.88µs ± 0%    5.38µs ± 0%   -8.62%  (p=1.000 n=1+1)
    Hash8K          42.0µs ± 0%    37.9µs ± 0%   -9.75%  (p=1.000 n=1+1)

There are other improvements found in golang.org/x/crypto which are all in the
range of 5-15%.

Change-Id: I193471fbcf674151ffe2edab212799d9b08dfb8c
Reviewed-on: https://go-review.googlesource.com/c/go/+/252097
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
2020-09-17 12:37:40 +00:00
David Chase
acde81e0a9 cmd/compile: initialize ACArgs and ACResults AuxCall fields for static and interface calls.
Extend use of AuxCall

Change-Id: I68b6d9bad09506532e1415fd70d44cf6c15b4b93
Reviewed-on: https://go-review.googlesource.com/c/go/+/239081
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-09-16 20:59:03 +00:00
David Chase
3c85e995ef cmd/compile: extend ssa.AuxCall to closure and interface calls
Also introduce helper methods.

Change-Id: I11a744ed002bae0ca9ebabba3206e1c14147e03d
Reviewed-on: https://go-review.googlesource.com/c/go/+/239080
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-09-16 20:58:14 +00:00
David Chase
b4ef49e527 cmd/compile: introduce special ssa Aux type for calls
This is prerequisite to moving call expansion later into SSA,
and probably a good idea anyway.  Passes tests.

This is the first minimal CL that does a 1-for-1 substitution
of *ssa.AuxCall for *obj.LSym.  Next step (next CL) is to make
this change for all calls so that additional information can
be stored in AuxCall.

Change-Id: Ia3a7715648fd9fb1a176850767a726e6f5b959eb
Reviewed-on: https://go-review.googlesource.com/c/go/+/237680
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-09-16 20:57:24 +00:00
fanzha02
ae658cb19a cmd/compile: store the comparison pseudo-ops of arm64 conditional instructions in AuxInt
The current implementation stores the comparison pseudo-ops of arm64
conditional instructions (CSEL/CSEL0) in Aux, this patch modifies it
and stores it in AuxInt, which can avoid the allocation.

Change-Id: I0b69e51f63acd84c6878c6a59ccf6417501a8cfc
Reviewed-on: https://go-review.googlesource.com/c/go/+/252517
Run-TryBot: fannie zhang <Fannie.Zhang@arm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-09-03 14:45:27 +00:00
Keith Randall
cdc635547f cmd/compile: invalidate zero-use values during rewrite
This helps remove uses that aren't needed any more.
That in turn helps other rules with Uses==1 conditions fire.

Update #39918

Change-Id: I68635b675472f1d59e59604e4d34b949a0016533
Reviewed-on: https://go-review.googlesource.com/c/go/+/249463
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2020-08-27 22:56:29 +00:00
fanzha02
85902b6786 cmd/compile: convert rest ARM64.rules lines to typed aux mode
This patch adds the ARM6464Bitfield auxInt to auxIntType() and
returns its Go type as "arm64Bitfield" type, which is defined
as int16 type.

And the Go type of SymOff auxInt is int32, but some functions
(such as min(), areAdjacentOffsets() and read16/32/64(),etc.)
use SymOff as an input parameter and treat its type as int64,
this patch adds the type conversion for these rules.

Passes toolstash-check -all.

Change-Id: Ib234b48d0a97ef244dd37878e06b5825316dd782
Reviewed-on: https://go-review.googlesource.com/c/go/+/234378
Reviewed-by: Keith Randall <khr@golang.org>
2020-08-24 14:38:38 +00:00
fanzha02
ffbd8524ac cmd/compile: convert typed aux to CCop for ARM64 rules
Add a new conversion function to convert aux type to Op type.

Passes toolstash-check -all.

Change-Id: I25d649a5296f6f178d64320dfc5d291e0a597e24
Reviewed-on: https://go-review.googlesource.com/c/go/+/233739
Reviewed-by: Keith Randall <khr@golang.org>
2020-08-24 06:03:06 +00:00
Keith Randall
8c39bbf9c9 cmd/compile: stop race instrumentation from clobbering frame pointer
There is an optimization rule that removes calls to racefuncenter and
racefuncexit, if there are no other race calls in the function. The
rule removes the call to racefuncenter, but it does *not* remove the
store of its argument to the outargs section of the frame. If the
outargs section is now size 0 (because the calls to racefuncenter/exit
were the only calls), then that argument store clobbers the frame
pointer instead.

The fix is to remove the argument store when removing the call to
racefuncenter.  (Racefuncexit doesn't have an argument.)

Change-Id: I183ec4d92bbb4920200e1be27b7b8f66b89a2a0a
Reviewed-on: https://go-review.googlesource.com/c/go/+/248262
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-08-16 17:05:44 +00:00
Keith Randall
a07e28194a cmd/compile: redo flag constant ops for arm64
Fixes the *noov opcodes so they handle a constant argument properly.

Most of the infrastructure for this CL is in CL 238077 (the arm32 one).

Fixes #39505

Change-Id: Id424a4e18964b848f05aa42f4d78e5f2e2cdf43b
Reviewed-on: https://go-review.googlesource.com/c/go/+/237999
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-06-18 20:58:26 +00:00
Keith Randall
40ef1faabc cmd/compile: redo flag constant ops for arm
Encode the flag results in an auxint field instead of having
one opcode per flag state. This helps us handle the new *noov
branches in a unified manner.

This is only for arm, arm64 is in a subsequent CL.

We could extend to other architectures as well, athough it would
only be cleanup, no behavioral change.

Update #39505

Change-Id: Ia46cea596faad540d1496c5915ab1274571543f0
Reviewed-on: https://go-review.googlesource.com/c/go/+/238077
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-06-18 20:57:49 +00:00
Alberto Donizetti
666c9aedd4 cmd/compile: switch to typed auxint for arm64 TBZ/TBNZ block
This CL changes the arm64 TBZ/TBNZ block from using Aux to using
a (typed) AuxInt. The corresponding rules have also been changed
to be typed.

Passes

  GOARCH=arm64 gotip build -toolexec 'toolstash -cmp' -a std

Change-Id: I98d0cd2a791948f1db13259c17fb1b9b2807a043
Reviewed-on: https://go-review.googlesource.com/c/go/+/230839
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-04-30 17:30:54 +00:00
Alberto Donizetti
c8b42f9aa5 cmd/compile: convert CCop arm64 rules to typed aux
Passes

  GOARCH=arm64 gotip build -toolexec 'toolstash -cmp' -a std

Change-Id: Id88141dfc0dfb02c3e958e48ab4adfbac7ba1380
Reviewed-on: https://go-review.googlesource.com/c/go/+/230837
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-04-29 20:32:48 +00:00
Josh Bleecher Snyder
4a7e363288 cmd/compile: optimize Move with all-zero ro sym src to Zero
We set up static symbols during walk that
we later make copies of to initialize local variables.
It is difficult to ascertain at that time exactly
when copying a symbol is profitable vs locally
initializing an autotmp.

During SSA, we are much better placed to optimize.
This change recognizes when we are copying from a
global readonly all-zero symbol and replaces it with
direct zeroing.

This often allows the all-zero symbol to be
deadcode eliminated at link time.
This is not ideal--it makes for large object files,
and longer link times--but it is the cleanest fix I could find.

This makes the final binary for the program in #38554
shrink from >500mb to ~2.2mb.

It also shrinks the standard binaries:

file      before    after     Δ       %
addr2line 4412496   4404304   -8192   -0.186%
buildid   2893816   2889720   -4096   -0.142%
cgo       4841048   4832856   -8192   -0.169%
compile   19926480  19922432  -4048   -0.020%
cover     5281816   5277720   -4096   -0.078%
link      6734648   6730552   -4096   -0.061%
nm        4366240   4358048   -8192   -0.188%
objdump   4755968   4747776   -8192   -0.172%
pprof     14653060  14612100  -40960  -0.280%
trace     11805940  11777268  -28672  -0.243%
vet       7185560   7181416   -4144   -0.058%
total     113588440 113465560 -122880 -0.108%

And not just by removing unnecessary symbols;
the program text shrinks a bit as well.

Fixes #38554

Change-Id: I8381ae6084ae145a5e0cd9410c451e52c0dc51c8
Reviewed-on: https://go-review.googlesource.com/c/go/+/229704
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
2020-04-24 23:58:10 +00:00
Josh Bleecher Snyder
b0a8754475 cmd/compile: convert race cleanup rule to typed aux
Passes toolstash-check.

Change-Id: I3005210cc156d01a6ac1ccaafb4311c607681bf0
Reviewed-on: https://go-review.googlesource.com/c/go/+/229691
Reviewed-by: Keith Randall <khr@golang.org>
2020-04-24 23:12:43 +00:00
Josh Bleecher Snyder
32467e677f cmd/compile: convert Move and Zero optimizations to typed aux
Passes toolstash-check.

Change-Id: I9bd722ce19b2ef39931658a02663aeb7db575939
Reviewed-on: https://go-review.googlesource.com/c/go/+/229690
Reviewed-by: Keith Randall <khr@golang.org>
2020-04-24 23:12:34 +00:00
Josh Bleecher Snyder
6b5ab20b65 cmd/compile: convert devirtualization rule to typed aux
Passes toolstash-check.

Change-Id: Iebcdc35f1a2112d5384c70eb3fdbd92ebb3d248e
Reviewed-on: https://go-review.googlesource.com/c/go/+/229689
Reviewed-by: Keith Randall <khr@golang.org>
2020-04-24 23:12:23 +00:00
Josh Bleecher Snyder
47b5efad5d cmd/compile: convert nilcheck elim rules to typed aux
Passes toolstash-check.

Change-Id: Ic7efb0e4778844366f581c6310a1a2f3bfc1868a
Reviewed-on: https://go-review.googlesource.com/c/go/+/229686
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-04-24 23:06:48 +00:00
Josh Bleecher Snyder
e354309e1e cmd/compile: add ssa.Block.truncateValues
It is a common operation.

Passes toolstash-check.

Change-Id: Icc34600b0f79d0ecb19f257e3c7f23b6f01a26ab
Reviewed-on: https://go-review.googlesource.com/c/go/+/229599
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2020-04-23 14:59:55 +00:00
Cuong Manh Le
79395c55e2 cmd/compile: remove ntz function
Use ntzX variants instead.

Passes toolstash-check -a.

Change-Id: I7a627f46f75c3d339034bd3e81c190cea5409c88
Reviewed-on: https://go-review.googlesource.com/c/go/+/229140
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-04-22 08:04:58 +00:00
Cuong Manh Le
05db7de1c1 cmd/compile: remove unused nlo function
Change-Id: I858d666d491f649f78581a43437408ffab33863b
Reviewed-on: https://go-review.googlesource.com/c/go/+/229139
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
2020-04-21 18:41:44 +00:00
Cuong Manh Le
65c9b57566 cmd/compile: remove nlz function
Use nlzX variants instead. While at it, also remove tests involve
nlz/nlo/nto/log2, since when we are calling directly "math/bits"
functions.

Passes toolstash-check.

Change-Id: I83899741a29e05bc2c19d73652961ac795001781
Reviewed-on: https://go-review.googlesource.com/c/go/+/229138
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-04-21 18:15:14 +00:00
Alberto Donizetti
17fbc818ff cmd/compile: switch to typed aux for 386 optimization rules
Convert first section of 386 optimization rules to the typed aux form.

Adds addOffset{32,64} functions that returns ValAndOffs and a
ValAndOff.canAdd32 function that takes an int32.

Passes

  GOARCH=386 gotip build -toolexec 'toolstash -cmp' -a std

Change-Id: I69d2a8ace6936d5e8ba6ba047183002bf07dd5be
Reviewed-on: https://go-review.googlesource.com/c/go/+/228825
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-04-21 08:08:49 +00:00
Josh Bleecher Snyder
4974ac6874 cmd/compile: use cheaper implementation of oneBit
This is the second attempt. The first attempt was CL 229127,
which got rolled back by CL 229177, because it caused
an infinite loop during compilation on some platforms.
I didn't notice that the trybots hadn't completed when I submitted; mea culpa.

The bug was that we were checking x&(x-1)==0, which is also true of 0,
which does not have exactly one bit set.
This caused an infinite rewrite rule loop.

Updates #38547

file    before    after     Δ       %
compile 19678112  19669808  -8304   -0.042%
total   113143160 113134856 -8304   -0.007%

Change-Id: I417a4f806e1ba61277e31bab2e57dd3f1ac7e835
Reviewed-on: https://go-review.googlesource.com/c/go/+/229197
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
2020-04-21 05:56:02 +00:00
Josh Bleecher Snyder
9255163091 Revert "cmd/compile: use cheaper implementation of oneBit"
This reverts commit 066c47ca5f.

Reason for revert: This appears to have broken a bunch of builders.

Change-Id: I68b4decf3c1892766e195d8eb018844cdff69443
Reviewed-on: https://go-review.googlesource.com/c/go/+/229177
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2020-04-21 04:28:59 +00:00
Cuong Manh Le
0f14c2a042 cmd/compile: rewrite shift rules to use typed aux fields
Passes toolstash-check.

Change-Id: I02e78591fe46e19a43dc36913baef0338a014a3d
Reviewed-on: https://go-review.googlesource.com/c/go/+/228860
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-04-21 03:13:22 +00:00
Josh Bleecher Snyder
066c47ca5f cmd/compile: use cheaper implementation of oneBit
Updates #38547

file    before    after     Δ       %       
compile 19678112  19669808  -8304   -0.042% 
total   113143160 113134856 -8304   -0.007% 

Change-Id: I5f8afe17401dbdb7c7b3d66d95fe40821c499a92
Reviewed-on: https://go-review.googlesource.com/c/go/+/229127
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2020-04-21 00:38:53 +00:00
Josh Bleecher Snyder
50b11318fe cmd/compile: use oneBit instead of isPowerOfTwo in bit optimization
This optimization works on any integer with exactly one bit set.
This is identical to being a power of two, except in the
most negative number. Use oneBit instead.

The rule now triggers in a few more places in std+cmd,
in packages encoding/asn1, crypto/elliptic, and
vendor/golang.org/x/crypto/cryptobyte.

This change obviates the need for CL 222479
by doing this optimization consistently in the compiler.

Change-Id: I983c6235290fdc634fda5e11b10f1f8ce041272f
Reviewed-on: https://go-review.googlesource.com/c/go/+/229124
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
2020-04-21 00:38:34 +00:00
David Finkel
1cca496c5e Revert "Revert "cmd/compile: adjust RISCV64 rewrite rules to use typed aux fields""
This reverts commit 98c32670fd454939794504225dca1d4ec55045d5.

Rolling-forward with trivial format-string fix

cmd/compile: adjust RISCV64 rewrite rules to use typed aux fields

Also add a typed version of mergeSym to rewrite.go to assist with a few
rules that used mergeSym in the untyped-form.

Remove a few extra int32 overflow checks that no longer make sense, as
adding two int8s or int16s should never overflow an int32.

Passes toolstash-check -all.

Original review: https://go-review.googlesource.com/c/go/+/228882

Change-Id: Ib63db4ee1687446f0f3d9f11575a40dd85cbce55
Reviewed-on: https://go-review.googlesource.com/c/go/+/229126
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
2020-04-20 23:30:29 +00:00
Than McIntosh
0cffc95109 Revert "cmd/compile: adjust RISCV64 rewrite rules to use typed aux fields"
This reverts commit 7004be998b.

Reason for revert: causing failures on many builders

Change-Id: I9216bd5409bb6814bac18a6a13ef5115db01b5fe
Reviewed-on: https://go-review.googlesource.com/c/go/+/229120
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
2020-04-20 22:38:59 +00:00
David Finkel
7004be998b cmd/compile: adjust RISCV64 rewrite rules to use typed aux fields
Also add a typed version of mergeSym to rewrite.go to assist with a few
rules that used mergeSym in the untyped-form.

Remove a few extra int32 overflow checks that no longer make sense, as
adding two int8s or int16s should never overflow an int32.

Passes toolstash-check -all.

Change-Id: I72ddd2b0d9001faa87ad0ab54f500057164661b7
Reviewed-on: https://go-review.googlesource.com/c/go/+/228882
Reviewed-by: Keith Randall <khr@golang.org>
2020-04-20 20:22:51 +00:00
Michael Munday
b1cae8cd1d cmd/compile: make some s390x rules use strongly typed aux values
This first pass makes the rules using the condition code mask
(CCMask) and rotate parameters (RotateParams) aux values strongly
typed. This required adding strongly typed aux handling to the
block rulegen.

More CLs like this to follow, but this is probably the most
complex.

Passes toolstash-check -all.

Change-Id: Ie513b07d527f0c1b398d7748331442dcb5f7b17d
Reviewed-on: https://go-review.googlesource.com/c/go/+/228518
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-04-17 14:54:05 +00:00
Cherry Zhang
1b15c7f102 cmd/compile: debug rewrite
If -d=ssa/PASS/debug=N is specified (N >= 2) for a rewrite pass
(e.g. lower), when a Value (or Block) is rewritten, print the
Value (or Block) before and after.

For #31915.
Updates #19013.

Change-Id: I80eadd44302ae736bc7daed0ef68529ab7a16776
Reviewed-on: https://go-review.googlesource.com/c/go/+/176718
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-04-13 21:56:15 +00:00
Keith Randall
dc9879e8fd cmd/compile: convert more AMD64.rules lines to typed aux mode
Change-Id: Idded860128b1a23680520d8c2b9f6d8620dcfcc7
Reviewed-on: https://go-review.googlesource.com/c/go/+/228077
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-04-13 15:41:13 +00:00
Keith Randall
2f84caebe3 cmd/compile: rewrite some AMD64 rules to use typed aux fields
Surprisingly many rules needed no modification.

Use wrapper functions for aux like we did for auxint.
Simplifies things a bit.

Change-Id: I2e852e77f1585dcb306a976ab9335f1ac5b4a770
Reviewed-on: https://go-review.googlesource.com/c/go/+/227961
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
2020-04-12 23:46:56 +00:00
Keith Randall
7580937524 cmd/compile: move more generic rewrites to the typed version
Change-Id: I22d0644710d12c7efc509fd2a15789e2e073e6a3
Reviewed-on: https://go-review.googlesource.com/c/go/+/227869
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2020-04-12 19:41:35 +00:00
Keith Randall
a1b802bde7 cmd/compile: move some generic rules to strongly typed
Move a lot of the constant folding rules to use strongly
typed AuxInt fields.

We need more than a cast to convert AuxInt to, e.g., float32.
Make conversion functions for converting back and forth.

Change-Id: Ia3d95ee3583ee2179a10938e20210a7617358c88
Reviewed-on: https://go-review.googlesource.com/c/go/+/227866
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2020-04-11 15:49:38 +00:00
Lynn Boger
815509ae31 cmd/compile: improve lowered moves and zeros for ppc64le
This change includes the following:
- Generate LXV/STXV sequences instead of LXVD2X/STXVD2X on power9.
These instructions do not require an index register, which
allows more loads and stores within a loop without initializing
multiple index registers. The LoweredQuadXXX generate LXV/STXV.
- Create LoweredMoveXXXShort and LoweredZeroXXXShort for short
moves that don't generate loops, and therefore don't clobber the
address registers or flags.
- Use registers other than R3 and R4 to avoid conflicting with
registers that have already been allocated to avoid unnecessary
register moves.
- Eliminate the use of R14 as scratch register and use R31
instead.
- Add PCALIGN when the LoweredMoveXXX or LoweredZeroXXX generates a
loop with more than 3 iterations.

This performance opportunity was noticed in github.com/golang/snappy
benchmarks. Results on power9:

WordsDecode1e1    54.1ns ± 0%    53.8ns ± 0%   -0.51%  (p=0.029 n=4+4)
WordsDecode1e2     287ns ± 0%     282ns ± 1%   -1.83%  (p=0.029 n=4+4)
WordsDecode1e3    3.98µs ± 0%    3.64µs ± 0%   -8.52%  (p=0.029 n=4+4)
WordsDecode1e4    66.9µs ± 0%    67.0µs ± 0%   +0.20%  (p=0.029 n=4+4)
WordsDecode1e5     723µs ± 0%     723µs ± 0%   -0.01%  (p=0.200 n=4+4)
WordsDecode1e6    7.21ms ± 0%    7.21ms ± 0%   -0.02%  (p=1.000 n=4+4)
WordsEncode1e1    29.9ns ± 0%    29.4ns ± 0%   -1.51%  (p=0.029 n=4+4)
WordsEncode1e2    2.12µs ± 0%    1.75µs ± 0%  -17.70%  (p=0.029 n=4+4)
WordsEncode1e3    11.7µs ± 0%    11.2µs ± 0%   -4.61%  (p=0.029 n=4+4)
WordsEncode1e4     119µs ± 0%     120µs ± 0%   +0.36%  (p=0.029 n=4+4)
WordsEncode1e5    1.21ms ± 0%    1.22ms ± 0%   +0.41%  (p=0.029 n=4+4)
WordsEncode1e6    12.0ms ± 0%    12.0ms ± 0%   +0.57%  (p=0.029 n=4+4)
RandomEncode       286µs ± 0%     203µs ± 0%  -28.82%  (p=0.029 n=4+4)
ExtendMatch       47.4µs ± 0%    47.0µs ± 0%   -0.85%  (p=0.029 n=4+4)

Change-Id: Iecad3a39ae55280286e42760a5c9d5c1168f5858
Reviewed-on: https://go-review.googlesource.com/c/go/+/226539
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-04-06 12:09:39 +00:00
David Chase
47ade08141 cmd/compile: add logging for large (>= 128 byte) copies
For 1.15, unless someone really wants it in 1.14.

A performance-sensitive user thought this would be useful,
though "large" was not well-defined.  If 128 is large,
there are 139 static instances of "large" copies in the compiler
itself.

Includes test.

Change-Id: I81f20c62da59d37072429f3a22c1809e6fb2946d
Reviewed-on: https://go-review.googlesource.com/c/go/+/205066
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-04-03 17:24:48 +00:00
Josh Bleecher Snyder
82253ddc7a cmd/compile: constant fold CtzNN
Change-Id: I3ecd2c7ed3c8ae35c2bb9562aed09f7ade5c8cdd
Reviewed-on: https://go-review.googlesource.com/c/go/+/221609
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-03-31 22:36:54 +00:00
Giovanni Bajo
14ad23d1f5 cmd/compile: avoid zero extensions after 32-bit shifts
zeroUpper32Bits wasn't checking for shift-extension ops. This would not
check shifts that were marking as bounded by prove (normally, shifts
are wrapped in a sequence that ends with an ANDL, and zeroUpper32Bits
would see the ANDL).

This produces no changes on generated output right now, but will be
important once CL196679 lands because many shifts will be marked
as bounded, and lower will stop generating the masking code sequence
around them.

Change-Id: Iaea94acc5b60bb9a5021c9fb7e4a1e2e5244435e
Reviewed-on: https://go-review.googlesource.com/c/go/+/226338
Reviewed-by: Keith Randall <khr@golang.org>
2020-03-30 19:58:50 +00:00
Keith Randall
33b648c0e9 cmd/compile: fix ephemeral pointer problem on amd64
Make sure we don't use the rewrite ptr + (c + x) -> c + (ptr + x), as
that may create an ephemeral out-of-bounds pointer.

I have not seen an actual bug caused by this yet, but we've seen
them in the 386 port so I'm fixing this issue for amd64 as well.

The load-combining rules needed to be reworked somewhat to still
work without the above broken rule.

Update #37881

Change-Id: I8046d170e89e2035195f261535e34ca7d8aca68a
Reviewed-on: https://go-review.googlesource.com/c/go/+/226437
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-03-30 17:25:29 +00:00
Keith Randall
af7eafd150 cmd/compile: convert 386 port to use addressing modes pass (take 2)
Retrying CL 222782, with a fix that will hopefully stop the random crashing.

The issue with the previous CL is that it does pointer arithmetic
in a way that may briefly generate an out-of-bounds pointer. If an
interrupt happens to occur in that state, the referenced object may
be collected incorrectly.

Suppose there was code that did s[x+c].  The previous CL had a rule
to the effect of ptr + (x + c) -> c + (ptr + x).  But ptr+x is not
guaranteed to point to the same object as ptr. In contrast,
ptr+(x+c) is guaranteed to point to the same object as ptr, because
we would have already checked that x+c is in bounds.

For example, strconv.trim used to have this code:
  MOVZX -0x1(BX)(DX*1), BP
  CMPL $0x30, AL
After CL 222782, it had this code:
  LEAL 0(BX)(DX*1), BP
  CMPB $0x30, -0x1(BP)

An interrupt between those last two instructions could see BP pointing
outside the backing store of the slice involved.

It's really hard to actually demonstrate a bug. First, you need to
have an interrupt occur at exactly the right time. Then, there must
be no other pointers to the object in question. Since the interrupted
frame will be scanned conservatively, there can't even be a dead
pointer in another register or on the stack. (In the example above,
a bug can't happen because BX still holds the original pointer.)
Then, the object in question needs to be collected (or at least
scanned?) before the interrupted code continues.

This CL needs to handle load combining somewhat differently than CL 222782
because of the new restriction on arithmetic. That's the only real
difference (other than removing the bad rules) from that old CL.

This bug is also present in the amd64 rewrite rules, and we haven't
seen any crashing as a result. I will fix up that code similarly to
this one in a separate CL.

Update #37881

Change-Id: I5f0d584d9bef4696bfe89a61ef0a27c8d507329f
Reviewed-on: https://go-review.googlesource.com/c/go/+/225798
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
2020-03-27 18:54:45 +00:00
Keith Randall
98cb76799c cmd/compile: insert complicated x86 addressing modes as a separate pass
Use a separate compiler pass to introduce complicated x86 addressing
modes.  Loads in the normal architecture rules (for x86 and all other
platforms) can have constant offsets (AuxInt values) and symbols (Aux
values), but no more.

The complex addressing modes (x+y, x+2*y, etc.) are introduced in a
separate pass that combines loads with LEAQx ops.

Organizing rewrites this way simplifies the number of rewrites
required, as there are lots of different rule orderings that have to
be specified to ensure these complex addressing modes are always found
if they are possible.

Update #36468

Change-Id: I5b4bf7b03a1e731d6dfeb9ef19b376175f3b4b44
Reviewed-on: https://go-review.googlesource.com/c/go/+/217097
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2020-03-10 00:13:21 +00:00
Keith Randall
cd9fd640db cmd/compile: don't allow NaNs in floating-point constant ops
Trying this CL again, with a fixed test that allows platforms
to disagree on the exact behavior of converting NaNs.

We store 32-bit floating point constants in a 64-bit field, by
converting that 32-bit float to 64-bit float to store it, and convert
it back to use it.

That works for *almost* all floating-point constants. The exception is
signaling NaNs. The round trip described above means we can't represent
a 32-bit signaling NaN, because conversions strip the signaling bit.

To fix this issue, just forbid NaNs as floating-point constants in SSA
form. This shouldn't affect any real-world code, as people seldom
constant-propagate NaNs (except in test code).

Additionally, NaNs are somewhat underspecified (which of the many NaNs
do you get when dividing 0/0?), so when cross-compiling there's a
danger of using the compiler machine's NaN regime for some math, and
the target machine's NaN regime for other math. Better to use the
target machine's NaN regime always.

Update #36400

Change-Id: Idf203b688a15abceabbd66ba290d4e9f63619ecb
Reviewed-on: https://go-review.googlesource.com/c/go/+/221790
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
2020-03-04 04:49:54 +00:00
Josh Bleecher Snyder
eb5cd0fb40 cmd/compile: mark Lsyms as readonly earlier
The SSA backend has rules to read the contents of readonly Lsyms.
However, this rule was failing to trigger for many readonly Lsyms.
This is because the readonly attribute that was set on the Node.Name
was not propagated to its Lsym until the dump globals phase, after SSA runs.

To work around this phase ordering problem, introduce Node.SetReadonly,
which sets Node.Name.Readonly and also configures the Lsym
enough that SSA can use it.

This change also fixes a latent problem in the rewrite rule function,
namely that reads past the end of lsym.P were treated as entirely zero,
instead of merely requiring padding with trailing zeros.

This change also adds an amd64 rule needed to fully optimize
the results of this change. It would be better not to need this,
but the zero extension that should handle this for us
gets optimized away too soon (see #36897 for a similar problem).
I have not investigated whether other platforms also need new
rules to take full advantage of the new optimizations.

Compiled code for (interface{})(true) on amd64 goes from:

LEAQ	type.bool(SB), AX
MOVBLZX	""..stmp_0(SB), BX
LEAQ	runtime.staticbytes(SB), CX
ADDQ	CX, BX

to

LEAQ	type.bool(SB), AX
LEAQ	runtime.staticbytes+1(SB), BX

Prior to this change, the readonly symbol rewrite rules
fired a total of 884 times during make.bash.
Afterwards they fire 1807 times.

file    before    after     Δ       %
cgo     4827832   4823736   -4096   -0.085%
compile 24907768  24895656  -12112  -0.049%
fix     3376952   3368760   -8192   -0.243%
pprof   14751700  14747604  -4096   -0.028%
total   120343528 120315032 -28496  -0.024%

Change-Id: I59ea52138276c37840f69e30fb109fd376d579ec
Reviewed-on: https://go-review.googlesource.com/c/go/+/220499
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
2020-02-26 19:30:21 +00:00
Josh Bleecher Snyder
390c096ee9 cmd/compile: make clobber variadic
There are often many values to clobber.
Allow passing them all in at once.
The goal is increased rule readability.
As a bonus, it shrinks cmd/compile by ~97k, almost half a percent.
Package SSA requires 1.2% less memory to compile.

The single-line changes were make via regex,
and the remaining multi-line clobbers were manually combined.

Passes toolstash-check -all.

Change-Id: Ib310e9265d3616211f8192c9040b4c8933824d19
Reviewed-on: https://go-review.googlesource.com/c/go/+/220691
Reviewed-by: Michael Munday <mike.munday@ibm.com>
2020-02-26 18:59:58 +00:00