The go/build package needs access to this configuration,
so move it into a new package available to the standard library.
Change-Id: I868a94148b52350c76116451f4ad9191246adcff
Reviewed-on: https://go-review.googlesource.com/c/go/+/310731
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Jay Conrod <jayconrod@google.com>
The rule that inlines memmove expects SSA ops that calls memmove
with arguments in memory. This CL adds a version that matches
it with arguments in registers, so the optimization works for
both situations.
Change-Id: Ideb64f65b7521481ab2ca7c9975a6cf7b70d5966
Reviewed-on: https://go-review.googlesource.com/c/go/+/309332
Trust: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Reviewed-by: David Chase <drchase@google.com>
If I change a rule in ARM64.rules to use the variable name "b" in a
conflicting way, rulegen would previously not complain, and the compiler
would later give a confusing error:
$ go run *.go && go build cmd/compile/internal/ssa
# cmd/compile/internal/ssa
../rewriteARM64.go:24236:10: b.NewValue0 undefined (type int64 has no field or method NewValue0)
Make rulegen complain early about those cases. Sometimes they might
happen to be harmless, but in general they can easily cause confusion or
unintended effect due to shadowing.
After the change, with the same conflicting rule:
$ go run *.go && go build cmd/compile/internal/ssa
2021/03/22 11:31:49 rule ARM64.rules:495 uses the reserved name b
exit status 1
Note that 24 existing rules were using reserved names. It seems like the
shadowing was harmless, as it wasn't causing typechecking issues nor did
it seem to cause unintended behavior when the rule rewrite code ran.
The bool values "b" were renamed "t", since that seems to have a
precedent in other rules and in the fmt package.
Sequential values like "a b c" were renamed to "x y z", since "b" is
reserved.
Finally, "typ" was renamed to "_typ", since there doesn't seem to be an
obviously better answer.
Passes all three of:
$ GOARCH=amd64 go build -toolexec 'toolstash -cmp' -a std
$ GOARCH=arm64 go build -toolexec 'toolstash -cmp' -a std
$ GOARCH=mips64 go build -toolexec 'toolstash -cmp' -a std
Fixes#45154.
Change-Id: I1cce194dc7b477886a9c218c17973e996bcedccf
Reviewed-on: https://go-review.googlesource.com/c/go/+/303549
Trust: Daniel Martí <mvdan@mvdan.cc>
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Add generic rule to rewrite the single-precision square root expression
with one single-precision instruction. The optimization will reduce two
times of precision converting between double-precision and single-precision.
On arm64 flatform.
previous:
FCVTSD F0, F0
FSQRTD F0, F0
FCVTDS F0, F0
optimized:
FSQRTS S0, S0
And this patch adds the test case to check the correctness.
This patch refers to CL 241877, contributed by Alice Xu
(dianhong.xu@arm.com)
Change-Id: I6de5d02281c693017ac4bd4c10963dd55989bd7e
Reviewed-on: https://go-review.googlesource.com/c/go/+/276873
Trust: fannie zhang <Fannie.Zhang@arm.com>
Run-TryBot: fannie zhang <Fannie.Zhang@arm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
StaticLECall (multiple value in +mem, multiple value result +mem) ->
StaticCall (multiple ergister value in +mem,
multiple register-sized-value result +mem) ->
ARCH CallStatic (multiple ergister value in +mem,
multiple register-sized-value result +mem)
But the architecture-dependent stuff is indifferent to whether
it is mem->mem or (mem)->(mem) until Prog generation.
Deal with OpSelectN -> Prog in ssagen/ssa.go, others, as they
appear.
For #40724.
Change-Id: I1d0436f6371054f1881862641d8e7e418e4a6a16
Reviewed-on: https://go-review.googlesource.com/c/go/+/293391
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Current optimization: When we copy a->b and then b->c, we might as well
copy a->c instead of b->c (then b might be dead and go away).
*Except* if a is a volatile location (might be clobbered by a call).
In that case, we really do want to copy a immediately, because there
might be a call before we can do the a->c copy.
User calls can't happen in between, because the rule matches up the
memory states. But calls inserted for memory barriers, particularly
runtime.typedmemmove, can.
(I guess we could introduce a register-calling-convention version
of runtime.typedmemmove, but that seems a bigger change than this one.)
Fixes#43570
Change-Id: Ifa518bb1a6f3a8dd46c352d4fd54ea9713b3eb1a
Reviewed-on: https://go-review.googlesource.com/c/go/+/282492
Trust: Keith Randall <khr@golang.org>
Trust: Josh Bleecher Snyder <josharian@gmail.com>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Proof of concept; also an actual optimization that fires 180 times
in the Go source base.
Change-Id: I5cb87474be764264cde6e4cbcb471ef109306f08
Reviewed-on: https://go-review.googlesource.com/c/go/+/248404
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Rulegen was not run again between patchsets 2 and 3 of CL 264683, so
the rewritegeneric.go file is out of sync. Run rulegen.
Change-Id: I67d8d267c4f666943ed9bf8c33aac2ac6013336a
Reviewed-on: https://go-review.googlesource.com/c/go/+/266080
Trust: Alberto Donizetti <alb.donizetti@gmail.com>
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Rémy Oudompheng <remyoudompheng@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
This is done by decomposing the number to be divided in 32-bit
components and using the 32-bit magic multiply. For the lowering to be
effective the constant must fit in 16 bits.
On ARM the expression n / 5 compiles to 25 instructions.
Benchmark for GOARCH=arm (Cortex-A53)
name old time/op new time/op delta
DivconstU64/3-6 1.19µs ± 0% 0.03µs ± 1% -97.40% (p=0.000 n=9+9)
DivconstU64/5-6 1.18µs ± 1% 0.03µs ± 1% -97.38% (p=0.000 n=10+8)
DivconstU64/37-6 1.13µs ± 1% 0.04µs ± 1% -96.51% (p=0.000 n=10+8)
DivconstU64/1234567-6 852ns ± 0% 901ns ± 1% +5.73% (p=0.000 n=8+9)
Benchmark for GOARCH=386 (Haswell)
name old time/op new time/op delta
DivconstU64/3-4 18.0ns ± 2% 5.6ns ± 1% -69.06% (p=0.000 n=10+10)
DivconstU64/5-4 17.8ns ± 1% 5.5ns ± 1% -68.87% (p=0.000 n=9+10)
DivconstU64/37-4 17.8ns ± 1% 7.3ns ± 0% -58.90% (p=0.000 n=10+10)
DivconstU64/1234567-4 17.5ns ± 1% 16.0ns ± 0% -8.55% (p=0.000 n=10+9)
Change-Id: I38a19b4d59093ec021ef2e5241364a3dad4eae73
Reviewed-on: https://go-review.googlesource.com/c/go/+/264683
Run-TryBot: Emmanuel Odeke <emmanuel@orijtech.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Trust: Emmanuel Odeke <emmanuel@orijtech.com>
Repeats existing patterns for old calls, so that these will apply
during the optimization phases that precede call expansion.
Change-Id: I1ca0a78c159aa1a51004db217edde4ecc772b646
Reviewed-on: https://go-review.googlesource.com/c/go/+/248190
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Includes a few tweaks to Value.copyOf(a) (make it a no-op for
a self-copy) and new pattern hack "___" (3 underscores) is
like ellipsis, except the replacement doesn't need to have
matching ellipsis/underscores.
Moved the arg-length check in generated pattern-matching code
BEFORE the args are probed, because not all instances of
variable length OpFoo will have all the args mentioned in
some rule for OpFoo, and when that happens, the compiler
panics without the early check.
Change-Id: I66de40672b3794a6427890ff96c805a488d783f4
Reviewed-on: https://go-review.googlesource.com/c/go/+/247537
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Extend use of AuxCall
Change-Id: I68b6d9bad09506532e1415fd70d44cf6c15b4b93
Reviewed-on: https://go-review.googlesource.com/c/go/+/239081
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This is prerequisite to moving call expansion later into SSA,
and probably a good idea anyway. Passes tests.
This is the first minimal CL that does a 1-for-1 substitution
of *ssa.AuxCall for *obj.LSym. Next step (next CL) is to make
this change for all calls so that additional information can
be stored in AuxCall.
Change-Id: Ia3a7715648fd9fb1a176850767a726e6f5b959eb
Reviewed-on: https://go-review.googlesource.com/c/go/+/237680
Trust: David Chase <drchase@google.com>
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
With this patch, opt pass can expose more obvious constant-folding
opportunites.
Example:
func test(i int) int {return (i+8)-(i+4)}
The previous version:
MOVD "".i(FP), R0
ADD $8, R0, R1
ADD $4, R0, R0
SUB R0, R1, R0
MOVD R0, "".~r1+8(FP)
RET (R30)
The optimized version:
MOVD $4, R0
MOVD R0, "".~r1+8(FP)
RET (R30)
This patch removes some existing reassociation rules, such as "x+(z-C)",
because the current generic rewrite rules will canonicalize "x-const"
to "x+(-const)", making "x+(z-C)" equal to "x+(z+(-C))".
This patch also adds test cases.
Change-Id: I857108ba0b5fcc18a879eeab38e2551bc4277797
Reviewed-on: https://go-review.googlesource.com/c/go/+/237137
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
There are some architecture-independent rules in #21439, since an
unsigned integer >= 0 is always true and < 0 is always false. This CL
adds these optimizations to generic rules.
Updates #21439
Change-Id: Iec7e3040b761ecb1e60908f764815fdd9bc62495
Reviewed-on: https://go-review.googlesource.com/c/go/+/246617
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This triggers in 131 functions in std+cmd.
In those functions, it often helps considerably
(2-10% text size reduction).
Noticed while working on #38554.
Change-Id: Id0dbb8e7cb21d469ec08ec3d5be9beb9e8291e9c
Reviewed-on: https://go-review.googlesource.com/c/go/+/229707
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
We set up static symbols during walk that
we later make copies of to initialize local variables.
It is difficult to ascertain at that time exactly
when copying a symbol is profitable vs locally
initializing an autotmp.
During SSA, we are much better placed to optimize.
This change recognizes when we are copying from a
global readonly all-zero symbol and replaces it with
direct zeroing.
This often allows the all-zero symbol to be
deadcode eliminated at link time.
This is not ideal--it makes for large object files,
and longer link times--but it is the cleanest fix I could find.
This makes the final binary for the program in #38554
shrink from >500mb to ~2.2mb.
It also shrinks the standard binaries:
file before after Δ %
addr2line 4412496 4404304 -8192 -0.186%
buildid 2893816 2889720 -4096 -0.142%
cgo 4841048 4832856 -8192 -0.169%
compile 19926480 19922432 -4048 -0.020%
cover 5281816 5277720 -4096 -0.078%
link 6734648 6730552 -4096 -0.061%
nm 4366240 4358048 -8192 -0.188%
objdump 4755968 4747776 -8192 -0.172%
pprof 14653060 14612100 -40960 -0.280%
trace 11805940 11777268 -28672 -0.243%
vet 7185560 7181416 -4144 -0.058%
total 113588440 113465560 -122880 -0.108%
And not just by removing unnecessary symbols;
the program text shrinks a bit as well.
Fixes#38554
Change-Id: I8381ae6084ae145a5e0cd9410c451e52c0dc51c8
Reviewed-on: https://go-review.googlesource.com/c/go/+/229704
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
Keep track of all expressions encountered while
generating a rewrite result, and re-use them whenever possible.
Named expressions may still be used for clarity when desired.
Change-Id: I640dca108763eb8baeff8f9a4169300af3445b82
Reviewed-on: https://go-review.googlesource.com/c/go/+/229800
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
This is followup of CL 228860, which rewrite shift rules to use typed
aux. That CL introduced nlz* functions, to refactor left shift rules.
While at it, we realize there's a bug in old rules with both right/left
shift rules, but only fix for left shift rules only.
This CL fixes the bug for right shift rules.
Passes toolstash-check.
Change-Id: Id8f2158b1b66c9e87f3fdeaa7ae3e35dc0666f8b
Reviewed-on: https://go-review.googlesource.com/c/go/+/229137
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Emmanuel Odeke <emm.odeke@gmail.com>
Reviewed-by: Keith Randall <khr@golang.org>
This optimization works on any integer with exactly one bit set.
This is identical to being a power of two, except in the
most negative number. Use oneBit instead.
The rule now triggers in a few more places in std+cmd,
in packages encoding/asn1, crypto/elliptic, and
vendor/golang.org/x/crypto/cryptobyte.
This change obviates the need for CL 222479
by doing this optimization consistently in the compiler.
Change-Id: I983c6235290fdc634fda5e11b10f1f8ce041272f
Reviewed-on: https://go-review.googlesource.com/c/go/+/229124
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Surprisingly many rules needed no modification.
Use wrapper functions for aux like we did for auxint.
Simplifies things a bit.
Change-Id: I2e852e77f1585dcb306a976ab9335f1ac5b4a770
Reviewed-on: https://go-review.googlesource.com/c/go/+/227961
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Move a lot of the constant folding rules to use strongly
typed AuxInt fields.
We need more than a cast to convert AuxInt to, e.g., float32.
Make conversion functions for converting back and forth.
Change-Id: Ia3d95ee3583ee2179a10938e20210a7617358c88
Reviewed-on: https://go-review.googlesource.com/c/go/+/227866
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Missed as part of CL 221790. It isn't just * and / that can make NaNs.
Update #36400Fixes#38359
Change-Id: I3fa562f772fe03b510793a6dc0cf6189c0c3e652
Reviewed-on: https://go-review.googlesource.com/c/go/+/227860
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>
Reviewed-by: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Will help with strongly typed rewrite rules.
Change-Id: Ifbf316a49f4081322b3b8f13bc962713437d9aba
Reviewed-on: https://go-review.googlesource.com/c/go/+/227785
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Right now the Aux and AuxInt fields of ssa.Values are typed as
interface{} and int64, respectively. Each rule that uses these values
must cast them to the type they actually are (*obj.LSym, or int32, or
ValAndOff, etc.), use them, and then cast them back to interface{} or
int64.
We know for each opcode what the types of the Aux and AuxInt fields
should be. So let's modify the rule generator to declare the types to
be what we know they should be, autoconverting to and from the generic
types for us. That way we can make the rules more type safe.
It's difficult to make a single CL for this, so I've coopted the "=>"
token to indicate a rule that is strongly typed. "->" rules are
processed as before. That will let us migrate a few rules at a time in
separate CLs. Hopefully we can reach a state where all rules are
strongly typed and we can drop the distinction.
This CL changes just a few rules to get a feel for what this
transition would look like.
I've decided not to put explicit types in the rules. I think it
makes the rules somewhat clearer, but definitely more verbose.
In particular, the passthrough rules that don't modify the fields
in question are verbose for no real reason.
Change-Id: I63a1b789ac5702e7caf7934cd49f784235d1d73d
Reviewed-on: https://go-review.googlesource.com/c/go/+/190197
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Extend CL 220417 (which removed the integer Greater and Geq ops) to
floating point comparisons. Greater and Geq can always be
implemented using Less and Leq.
Fixes#37316.
Change-Id: Ieaddb4877dd0ff9037a1dd11d0a9a9e45ced71e7
Reviewed-on: https://go-review.googlesource.com/c/go/+/222397
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Retrying CL 222782, with a fix that will hopefully stop the random crashing.
The issue with the previous CL is that it does pointer arithmetic
in a way that may briefly generate an out-of-bounds pointer. If an
interrupt happens to occur in that state, the referenced object may
be collected incorrectly.
Suppose there was code that did s[x+c]. The previous CL had a rule
to the effect of ptr + (x + c) -> c + (ptr + x). But ptr+x is not
guaranteed to point to the same object as ptr. In contrast,
ptr+(x+c) is guaranteed to point to the same object as ptr, because
we would have already checked that x+c is in bounds.
For example, strconv.trim used to have this code:
MOVZX -0x1(BX)(DX*1), BP
CMPL $0x30, AL
After CL 222782, it had this code:
LEAL 0(BX)(DX*1), BP
CMPB $0x30, -0x1(BP)
An interrupt between those last two instructions could see BP pointing
outside the backing store of the slice involved.
It's really hard to actually demonstrate a bug. First, you need to
have an interrupt occur at exactly the right time. Then, there must
be no other pointers to the object in question. Since the interrupted
frame will be scanned conservatively, there can't even be a dead
pointer in another register or on the stack. (In the example above,
a bug can't happen because BX still holds the original pointer.)
Then, the object in question needs to be collected (or at least
scanned?) before the interrupted code continues.
This CL needs to handle load combining somewhat differently than CL 222782
because of the new restriction on arithmetic. That's the only real
difference (other than removing the bad rules) from that old CL.
This bug is also present in the amd64 rewrite rules, and we haven't
seen any crashing as a result. I will fix up that code similarly to
this one in a separate CL.
Update #37881
Change-Id: I5f0d584d9bef4696bfe89a61ef0a27c8d507329f
Reviewed-on: https://go-review.googlesource.com/c/go/+/225798
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Trying this CL again, with a fixed test that allows platforms
to disagree on the exact behavior of converting NaNs.
We store 32-bit floating point constants in a 64-bit field, by
converting that 32-bit float to 64-bit float to store it, and convert
it back to use it.
That works for *almost* all floating-point constants. The exception is
signaling NaNs. The round trip described above means we can't represent
a 32-bit signaling NaN, because conversions strip the signaling bit.
To fix this issue, just forbid NaNs as floating-point constants in SSA
form. This shouldn't affect any real-world code, as people seldom
constant-propagate NaNs (except in test code).
Additionally, NaNs are somewhat underspecified (which of the many NaNs
do you get when dividing 0/0?), so when cross-compiling there's a
danger of using the compiler machine's NaN regime for some math, and
the target machine's NaN regime for other math. Better to use the
target machine's NaN regime always.
Update #36400
Change-Id: Idf203b688a15abceabbd66ba290d4e9f63619ecb
Reviewed-on: https://go-review.googlesource.com/c/go/+/221790
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
In cases in which we had a named value whose args were all _,
like this rule from ARM.rules:
(MOVBUreg x:(MOVBUload _ _)) -> (MOVWreg x)
We previously inserted
_ = x.Args[1]
even though it is unnecessary.
This change eliminates this pointless bounds check.
And in other cases, we now check bounds just as far as strictly necessary.
No significant movement on any compiler metrics.
Just nicer (and less) code.
Passes toolstash-check -all.
Change-Id: I075dfe9f926cc561cdc705e9ddaab563164bed3a
Reviewed-on: https://go-review.googlesource.com/c/go/+/221781
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This:
* Simplifies and shortens the generated code for rewrite rules.
* Shrinks cmd/compile by 86k (0.4%) and makes it easier to compile.
* Removes the stmt boundary code wrangling from Value.reset,
in favor of doing it in the one place where it actually does some work,
namely the writebarrier pass. (This was ascertained by inspecting the
code for cases in which notStmtBoundary values were generated.)
Passes toolstash-check -all.
Change-Id: I25671d4c4bbd772f235195d11da090878ea2cc07
Reviewed-on: https://go-review.googlesource.com/c/go/+/221421
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>