Use nlzX variants instead. While at it, also remove tests involve
nlz/nlo/nto/log2, since when we are calling directly "math/bits"
functions.
Passes toolstash-check.
Change-Id: I83899741a29e05bc2c19d73652961ac795001781
Reviewed-on: https://go-review.googlesource.com/c/go/+/229138
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Convert first section of 386 optimization rules to the typed aux form.
Adds addOffset{32,64} functions that returns ValAndOffs and a
ValAndOff.canAdd32 function that takes an int32.
Passes
GOARCH=386 gotip build -toolexec 'toolstash -cmp' -a std
Change-Id: I69d2a8ace6936d5e8ba6ba047183002bf07dd5be
Reviewed-on: https://go-review.googlesource.com/c/go/+/228825
Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
This is the second attempt. The first attempt was CL 229127,
which got rolled back by CL 229177, because it caused
an infinite loop during compilation on some platforms.
I didn't notice that the trybots hadn't completed when I submitted; mea culpa.
The bug was that we were checking x&(x-1)==0, which is also true of 0,
which does not have exactly one bit set.
This caused an infinite rewrite rule loop.
Updates #38547
file before after Δ %
compile 19678112 19669808 -8304 -0.042%
total 113143160 113134856 -8304 -0.007%
Change-Id: I417a4f806e1ba61277e31bab2e57dd3f1ac7e835
Reviewed-on: https://go-review.googlesource.com/c/go/+/229197
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
This reverts commit 066c47ca5f.
Reason for revert: This appears to have broken a bunch of builders.
Change-Id: I68b4decf3c1892766e195d8eb018844cdff69443
Reviewed-on: https://go-review.googlesource.com/c/go/+/229177
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
This optimization works on any integer with exactly one bit set.
This is identical to being a power of two, except in the
most negative number. Use oneBit instead.
The rule now triggers in a few more places in std+cmd,
in packages encoding/asn1, crypto/elliptic, and
vendor/golang.org/x/crypto/cryptobyte.
This change obviates the need for CL 222479
by doing this optimization consistently in the compiler.
Change-Id: I983c6235290fdc634fda5e11b10f1f8ce041272f
Reviewed-on: https://go-review.googlesource.com/c/go/+/229124
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This reverts commit 98c32670fd454939794504225dca1d4ec55045d5.
Rolling-forward with trivial format-string fix
cmd/compile: adjust RISCV64 rewrite rules to use typed aux fields
Also add a typed version of mergeSym to rewrite.go to assist with a few
rules that used mergeSym in the untyped-form.
Remove a few extra int32 overflow checks that no longer make sense, as
adding two int8s or int16s should never overflow an int32.
Passes toolstash-check -all.
Original review: https://go-review.googlesource.com/c/go/+/228882
Change-Id: Ib63db4ee1687446f0f3d9f11575a40dd85cbce55
Reviewed-on: https://go-review.googlesource.com/c/go/+/229126
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Than McIntosh <thanm@google.com>
Also add a typed version of mergeSym to rewrite.go to assist with a few
rules that used mergeSym in the untyped-form.
Remove a few extra int32 overflow checks that no longer make sense, as
adding two int8s or int16s should never overflow an int32.
Passes toolstash-check -all.
Change-Id: I72ddd2b0d9001faa87ad0ab54f500057164661b7
Reviewed-on: https://go-review.googlesource.com/c/go/+/228882
Reviewed-by: Keith Randall <khr@golang.org>
This first pass makes the rules using the condition code mask
(CCMask) and rotate parameters (RotateParams) aux values strongly
typed. This required adding strongly typed aux handling to the
block rulegen.
More CLs like this to follow, but this is probably the most
complex.
Passes toolstash-check -all.
Change-Id: Ie513b07d527f0c1b398d7748331442dcb5f7b17d
Reviewed-on: https://go-review.googlesource.com/c/go/+/228518
Run-TryBot: Michael Munday <mike.munday@ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
If -d=ssa/PASS/debug=N is specified (N >= 2) for a rewrite pass
(e.g. lower), when a Value (or Block) is rewritten, print the
Value (or Block) before and after.
For #31915.
Updates #19013.
Change-Id: I80eadd44302ae736bc7daed0ef68529ab7a16776
Reviewed-on: https://go-review.googlesource.com/c/go/+/176718
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Surprisingly many rules needed no modification.
Use wrapper functions for aux like we did for auxint.
Simplifies things a bit.
Change-Id: I2e852e77f1585dcb306a976ab9335f1ac5b4a770
Reviewed-on: https://go-review.googlesource.com/c/go/+/227961
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>
Move a lot of the constant folding rules to use strongly
typed AuxInt fields.
We need more than a cast to convert AuxInt to, e.g., float32.
Make conversion functions for converting back and forth.
Change-Id: Ia3d95ee3583ee2179a10938e20210a7617358c88
Reviewed-on: https://go-review.googlesource.com/c/go/+/227866
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Giovanni Bajo <rasky@develer.com>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
For 1.15, unless someone really wants it in 1.14.
A performance-sensitive user thought this would be useful,
though "large" was not well-defined. If 128 is large,
there are 139 static instances of "large" copies in the compiler
itself.
Includes test.
Change-Id: I81f20c62da59d37072429f3a22c1809e6fb2946d
Reviewed-on: https://go-review.googlesource.com/c/go/+/205066
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
zeroUpper32Bits wasn't checking for shift-extension ops. This would not
check shifts that were marking as bounded by prove (normally, shifts
are wrapped in a sequence that ends with an ANDL, and zeroUpper32Bits
would see the ANDL).
This produces no changes on generated output right now, but will be
important once CL196679 lands because many shifts will be marked
as bounded, and lower will stop generating the masking code sequence
around them.
Change-Id: Iaea94acc5b60bb9a5021c9fb7e4a1e2e5244435e
Reviewed-on: https://go-review.googlesource.com/c/go/+/226338
Reviewed-by: Keith Randall <khr@golang.org>
Make sure we don't use the rewrite ptr + (c + x) -> c + (ptr + x), as
that may create an ephemeral out-of-bounds pointer.
I have not seen an actual bug caused by this yet, but we've seen
them in the 386 port so I'm fixing this issue for amd64 as well.
The load-combining rules needed to be reworked somewhat to still
work without the above broken rule.
Update #37881
Change-Id: I8046d170e89e2035195f261535e34ca7d8aca68a
Reviewed-on: https://go-review.googlesource.com/c/go/+/226437
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Retrying CL 222782, with a fix that will hopefully stop the random crashing.
The issue with the previous CL is that it does pointer arithmetic
in a way that may briefly generate an out-of-bounds pointer. If an
interrupt happens to occur in that state, the referenced object may
be collected incorrectly.
Suppose there was code that did s[x+c]. The previous CL had a rule
to the effect of ptr + (x + c) -> c + (ptr + x). But ptr+x is not
guaranteed to point to the same object as ptr. In contrast,
ptr+(x+c) is guaranteed to point to the same object as ptr, because
we would have already checked that x+c is in bounds.
For example, strconv.trim used to have this code:
MOVZX -0x1(BX)(DX*1), BP
CMPL $0x30, AL
After CL 222782, it had this code:
LEAL 0(BX)(DX*1), BP
CMPB $0x30, -0x1(BP)
An interrupt between those last two instructions could see BP pointing
outside the backing store of the slice involved.
It's really hard to actually demonstrate a bug. First, you need to
have an interrupt occur at exactly the right time. Then, there must
be no other pointers to the object in question. Since the interrupted
frame will be scanned conservatively, there can't even be a dead
pointer in another register or on the stack. (In the example above,
a bug can't happen because BX still holds the original pointer.)
Then, the object in question needs to be collected (or at least
scanned?) before the interrupted code continues.
This CL needs to handle load combining somewhat differently than CL 222782
because of the new restriction on arithmetic. That's the only real
difference (other than removing the bad rules) from that old CL.
This bug is also present in the amd64 rewrite rules, and we haven't
seen any crashing as a result. I will fix up that code similarly to
this one in a separate CL.
Update #37881
Change-Id: I5f0d584d9bef4696bfe89a61ef0a27c8d507329f
Reviewed-on: https://go-review.googlesource.com/c/go/+/225798
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Use a separate compiler pass to introduce complicated x86 addressing
modes. Loads in the normal architecture rules (for x86 and all other
platforms) can have constant offsets (AuxInt values) and symbols (Aux
values), but no more.
The complex addressing modes (x+y, x+2*y, etc.) are introduced in a
separate pass that combines loads with LEAQx ops.
Organizing rewrites this way simplifies the number of rewrites
required, as there are lots of different rule orderings that have to
be specified to ensure these complex addressing modes are always found
if they are possible.
Update #36468
Change-Id: I5b4bf7b03a1e731d6dfeb9ef19b376175f3b4b44
Reviewed-on: https://go-review.googlesource.com/c/go/+/217097
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
Trying this CL again, with a fixed test that allows platforms
to disagree on the exact behavior of converting NaNs.
We store 32-bit floating point constants in a 64-bit field, by
converting that 32-bit float to 64-bit float to store it, and convert
it back to use it.
That works for *almost* all floating-point constants. The exception is
signaling NaNs. The round trip described above means we can't represent
a 32-bit signaling NaN, because conversions strip the signaling bit.
To fix this issue, just forbid NaNs as floating-point constants in SSA
form. This shouldn't affect any real-world code, as people seldom
constant-propagate NaNs (except in test code).
Additionally, NaNs are somewhat underspecified (which of the many NaNs
do you get when dividing 0/0?), so when cross-compiling there's a
danger of using the compiler machine's NaN regime for some math, and
the target machine's NaN regime for other math. Better to use the
target machine's NaN regime always.
Update #36400
Change-Id: Idf203b688a15abceabbd66ba290d4e9f63619ecb
Reviewed-on: https://go-review.googlesource.com/c/go/+/221790
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
The SSA backend has rules to read the contents of readonly Lsyms.
However, this rule was failing to trigger for many readonly Lsyms.
This is because the readonly attribute that was set on the Node.Name
was not propagated to its Lsym until the dump globals phase, after SSA runs.
To work around this phase ordering problem, introduce Node.SetReadonly,
which sets Node.Name.Readonly and also configures the Lsym
enough that SSA can use it.
This change also fixes a latent problem in the rewrite rule function,
namely that reads past the end of lsym.P were treated as entirely zero,
instead of merely requiring padding with trailing zeros.
This change also adds an amd64 rule needed to fully optimize
the results of this change. It would be better not to need this,
but the zero extension that should handle this for us
gets optimized away too soon (see #36897 for a similar problem).
I have not investigated whether other platforms also need new
rules to take full advantage of the new optimizations.
Compiled code for (interface{})(true) on amd64 goes from:
LEAQ type.bool(SB), AX
MOVBLZX ""..stmp_0(SB), BX
LEAQ runtime.staticbytes(SB), CX
ADDQ CX, BX
to
LEAQ type.bool(SB), AX
LEAQ runtime.staticbytes+1(SB), BX
Prior to this change, the readonly symbol rewrite rules
fired a total of 884 times during make.bash.
Afterwards they fire 1807 times.
file before after Δ %
cgo 4827832 4823736 -4096 -0.085%
compile 24907768 24895656 -12112 -0.049%
fix 3376952 3368760 -8192 -0.243%
pprof 14751700 14747604 -4096 -0.028%
total 120343528 120315032 -28496 -0.024%
Change-Id: I59ea52138276c37840f69e30fb109fd376d579ec
Reviewed-on: https://go-review.googlesource.com/c/go/+/220499
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
There are often many values to clobber.
Allow passing them all in at once.
The goal is increased rule readability.
As a bonus, it shrinks cmd/compile by ~97k, almost half a percent.
Package SSA requires 1.2% less memory to compile.
The single-line changes were make via regex,
and the remaining multi-line clobbers were manually combined.
Passes toolstash-check -all.
Change-Id: Ib310e9265d3616211f8192c9040b4c8933824d19
Reviewed-on: https://go-review.googlesource.com/c/go/+/220691
Reviewed-by: Michael Munday <mike.munday@ibm.com>
This reverts CL 213477.
Reason for revert: tests are failing on linux-mips*-rtrk builders.
Change-Id: I8168f7450890233f1bd7e53930b73693c26d4dc0
Reviewed-on: https://go-review.googlesource.com/c/go/+/220897
Run-TryBot: Bryan C. Mills <bcmills@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
We store 32-bit floating point constants in a 64-bit field, by
converting that 32-bit float to 64-bit float to store it, and convert
it back to use it.
That works for *almost* all floating-point constants. The exception is
signaling NaNs. The round trip described above means we can't represent
a 32-bit signaling NaN, because conversions strip the signaling bit.
To fix this issue, just forbid NaNs as floating-point constants in SSA
form. This shouldn't affect any real-world code, as people seldom
constant-propagate NaNs (except in test code).
Additionally, NaNs are somewhat underspecified (which of the many NaNs
do you get when dividing 0/0?), so when cross-compiling there's a
danger of using the compiler machine's NaN regime for some math, and
the target machine's NaN regime for other math. Better to use the
target machine's NaN regime always.
This has been a bug since 1.10, and there's an easy workaround
(declare a global varaible containing the signaling NaN pattern, and
use that as the argument to math.Float32frombits) so we'll fix it in
1.15.
Fixes#36400
Update #36399
Change-Id: Icf155e743281560eda2eed953d19a829552ccfda
Reviewed-on: https://go-review.googlesource.com/c/go/+/213477
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
The name of the function should mention division.
Eliminate double negatives from the comment describing it.
Change-Id: Icef1a5139b3a91b86acb930af97938f5160f7342
Reviewed-on: https://go-review.googlesource.com/c/go/+/217001
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Daniel Martí <mvdan@mvdan.cc>
Part 1: CL 199499 (GOOS nacl)
Part 2: CL 200077 (amd64p32 files, toolchain)
Part 3: stuff that arguably should've been part of Part 2, but I forgot
one of my grep patterns when splitting the original CL up into
two parts.
This one might also have interesting stuff to resurrect for any future
x32 ABI support.
Updates #30439
Change-Id: I2b4143374a253a003666f3c69e776b7e456bdb9c
Reviewed-on: https://go-review.googlesource.com/c/go/+/200318
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Just the low-hanging fruit; code that was either never used, or its last
use was removed a while ago.
Change-Id: Ic7f90aeee3e6daf7edc16cde3c0a767d67e617ee
Reviewed-on: https://go-review.googlesource.com/c/go/+/192618
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
This reduces allocations and also resolves some
lurking inliner/inlinee line-number match problems.
However, it does add about 1.5% to compile time.
This fixes compiler OOMs seen compiling some large protobuf-
derived inputs. For compiling the compiler itself,
compilebench -pkg cmd/compile/internal/ssa -memprofile withcl.prof
the numberlines-related memory consumption is reduced from 129MB
to 29MB (about a 5% overall reduction in allocation).
Additionally modified after going over changes with Austin
to remove unused code (nobody called size()) and correct
the cache-clearing code.
I've attempted to speed this up by not using maps, and have
not succeeded. I'd rather get correct code in now, speed it
up later if I can.
Updates #27739.
Fixes#29279.
Change-Id: I098005de4e45196a5f5b10c0886a49f88e9f8fd5
Reviewed-on: https://go-review.googlesource.com/c/go/+/154617
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
When the SSA rules are re-generated to log rules,
they write output like:
rewrite AMD64.rules:527
rewrite AMD64.rules:427
rewrite AMD64.rules:494
This is silly; there are no non-rewrite lines in the file.
Furthermore, the rulelog file tends to be gigantic
for any non-trivial compilation (measured in gigabytes).
Remove the "rewrite " prefix.
No impact to normal builds.
Change-Id: I955995c1cc5f27a4a6a3849e19082ecb3e40bd4f
Reviewed-on: https://go-review.googlesource.com/c/go/+/176677
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
noteRule is useful when you're trying to debug
a particular rule, or get a general sense for
how often a rule fires overall.
It is less useful if you're trying to figure
out which functions might be useful to benchmark
to ascertain the impact of a newly added rule.
Enter countRule. You use it like noteRule,
except that you get per-function summaries.
Sample output:
# runtime
(*mspan).sweep: idx1=1
evacuate_faststr: idx1=1
evacuate_fast32: idx1=1
evacuate: idx1=2
evacuate_fast64: idx1=1
sweepone: idx1=1
purgecachedstats: idx1=1
mProf_Free: idx1=1
This suggests that the map benchmarks
might be good to run for this added rule.
Change-Id: Id471c3231f1736165f2020f6979ff01c29677808
Reviewed-on: https://go-review.googlesource.com/c/go/+/167088
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
"Division by invariant integers using multiplication" paper
by Granlund and Montgomery contains a method for directly computing
divisibility (x%c == 0 for c constant) by means of the modular inverse.
The method is further elaborated in "Hacker's Delight" by Warren Section 10-17
This general rule can compute divisibilty by one multiplication and a compare
for odd divisors and an additional rotate for even divisors.
To apply the divisibility rule, we must take into account
the rules to rewrite x%c = x-((x/c)*c) and (x/c) for c constant on the first
optimization pass "opt". This complicates the matching as we want to match
only in the cases where the result of (x/c) is not also available.
So, we must match on the expanded form of (x/c) in the expression x == c*(x/c)
in the "late opt" pass after common subexpresion elimination.
Note, that if there is an intermediate opt pass introduced in the future we
could simplify these rules by delaying the magic division rewrite to "late opt"
and matching directly on (x/c) in the intermediate opt pass.
Additional rules to lower the generic RotateLeft* ops were also applied.
On amd64, the divisibility check is 25-50% faster.
name old time/op new time/op delta
DivconstI64-4 2.08ns ± 0% 2.08ns ± 1% ~ (p=0.881 n=5+5)
DivisibleconstI64-4 2.67ns ± 0% 2.67ns ± 1% ~ (p=1.000 n=5+5)
DivisibleWDivconstI64-4 2.67ns ± 0% 2.67ns ± 0% ~ (p=0.683 n=5+5)
DivconstU64-4 2.08ns ± 1% 2.08ns ± 1% ~ (p=1.000 n=5+5)
DivisibleconstU64-4 2.77ns ± 1% 1.55ns ± 2% -43.90% (p=0.008 n=5+5)
DivisibleWDivconstU64-4 2.99ns ± 1% 2.99ns ± 1% ~ (p=1.000 n=5+5)
DivconstI32-4 1.53ns ± 2% 1.53ns ± 0% ~ (p=1.000 n=5+5)
DivisibleconstI32-4 2.23ns ± 0% 2.25ns ± 3% ~ (p=0.167 n=5+5)
DivisibleWDivconstI32-4 2.27ns ± 1% 2.27ns ± 1% ~ (p=0.429 n=5+5)
DivconstU32-4 1.78ns ± 0% 1.78ns ± 1% ~ (p=1.000 n=4+5)
DivisibleconstU32-4 2.52ns ± 2% 1.26ns ± 0% -49.96% (p=0.000 n=5+4)
DivisibleWDivconstU32-4 2.63ns ± 0% 2.85ns ±10% +8.29% (p=0.016 n=4+5)
DivconstI16-4 1.54ns ± 0% 1.54ns ± 0% ~ (p=0.333 n=4+5)
DivisibleconstI16-4 2.10ns ± 0% 2.10ns ± 1% ~ (p=0.571 n=4+5)
DivisibleWDivconstI16-4 2.22ns ± 0% 2.23ns ± 1% ~ (p=0.556 n=4+5)
DivconstU16-4 1.09ns ± 0% 1.01ns ± 1% -7.74% (p=0.000 n=4+5)
DivisibleconstU16-4 1.83ns ± 0% 1.26ns ± 0% -31.52% (p=0.008 n=5+5)
DivisibleWDivconstU16-4 1.88ns ± 0% 1.89ns ± 1% ~ (p=0.365 n=5+5)
DivconstI8-4 1.54ns ± 1% 1.54ns ± 1% ~ (p=1.000 n=5+5)
DivisibleconstI8-4 2.10ns ± 0% 2.11ns ± 0% ~ (p=0.238 n=5+4)
DivisibleWDivconstI8-4 2.22ns ± 0% 2.23ns ± 2% ~ (p=0.762 n=5+5)
DivconstU8-4 0.92ns ± 1% 0.94ns ± 1% +2.65% (p=0.008 n=5+5)
DivisibleconstU8-4 1.66ns ± 0% 1.26ns ± 1% -24.28% (p=0.008 n=5+5)
DivisibleWDivconstU8-4 1.79ns ± 0% 1.80ns ± 1% ~ (p=0.079 n=4+5)
A follow-up change will address the signed division case.
Updates #30282
Change-Id: I7e995f167179aa5c76bb10fbcbeb49c520943403
Reviewed-on: https://go-review.googlesource.com/c/go/+/168037
Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
These new calls should not prevent NOSPLIT promotion, like the old ones.
These new calls should not prevent racefuncenter/exit removal.
(The latter was already true, as the new calls are not yet lowered
to StaticCalls at the point where racefuncenter/exit removal is done.)
Add tests to make sure we don't regress (again).
Fixes#31219
Change-Id: I3fb6b17cdd32c425829f1e2498defa813a5a9ace
Reviewed-on: https://go-review.googlesource.com/c/go/+/170639
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
The CL 164718 adds new condition flags for floating-point comparisons
in arm64 backend, but dose not add the handling in rewrite.go for
corresponding Ops, which causes issue 30679. And this CL fixes this
issue.
Fixes#30679
Change-Id: I8acc749f78227c3e9e74fa7938f05fb442fb62c6
Reviewed-on: https://go-review.googlesource.com/c/go/+/166579
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This CL adds two rules to turn patterns like ((x<<8) | (x>>8)) (the type of
x is uint16, "|" can also be "+" or "^") to a REV16 instruction on arm v6+.
This optimization rule can be used for math/bits.ReverseBytes16.
Benchmarks on arm v6:
name old time/op new time/op delta
ReverseBytes-32 2.86ns ± 0% 2.86ns ± 0% ~ (all equal)
ReverseBytes16-32 2.86ns ± 0% 2.86ns ± 0% ~ (all equal)
ReverseBytes32-32 1.29ns ± 0% 1.29ns ± 0% ~ (all equal)
ReverseBytes64-32 1.43ns ± 0% 1.43ns ± 0% ~ (all equal)
Change-Id: I819e633c9a9d308f8e476fb0c82d73fb73dd019f
Reviewed-on: https://go-review.googlesource.com/c/go/+/159019
Reviewed-by: Cherry Zhang <cherryyz@google.com>
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
CL 154057 adds guards agaist out-of-bound reads from readonly
constants. It turns out that in dead code, the offset can also
be negative. Guard against negative offset as well.
Fixes#30257.
Change-Id: I47c2a2e434dd466c08ae6f50f213999a358c796e
Reviewed-on: https://go-review.googlesource.com/c/162819
Reviewed-by: Keith Randall <khr@golang.org>
Allow shifts by signed amounts. Panic if the shift amount is negative.
TODO: We end up doing two compares per shift, see Ian's comment
https://github.com/golang/go/issues/19113#issuecomment-443241799 that
we could do it with a single comparison in the normal case.
The prove pass mostly handles this code well. For instance, it removes the
<0 check for cases like this:
if s >= 0 { _ = x << s }
_ = x << len(a)
This case isn't handled well yet:
_ = x << (y & 0xf)
I'll do followon CLs for unhandled cases as needed.
Update #19113
R=go1.13
Change-Id: I839a5933d94b54ab04deb9dd5149f32c51c90fa1
Reviewed-on: https://go-review.googlesource.com/c/158719
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
We can't remove race instrumentation unless there are no calls,
not just no static calls. Closure and interface calls also count.
The problem in issue 29329 is that there was a racefuncenter, an
InterCall, and a racefuncexit. The racefuncenter was removed, then
the InterCall was rewritten to a StaticCall. That prevented the
racefuncexit from being removed. That caused an imbalance in
racefuncenter/racefuncexit calls, which made the race detector barf.
Bug introduced at CL 121235
Fixes#29329
Change-Id: I2c94ac6cf918dd910b74b2a0de5dc2480d236f16
Reviewed-on: https://go-review.googlesource.com/c/155917
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Out-of-bounds reads of globals can happen in dead code. For code
like this:
s := "a"
if len(s) == 3 {
load s[0], s[1], and s[2]
}
The out-of-bounds loads are dead code, but aren't removed yet
when lowering. We need to not panic when compile-time evaluating
those loads. This can only happen for dead code, so the result
doesn't matter.
Fixes#29215
Change-Id: I7fb765766328b9524c6f2a1e6ab8d8edd9875097
Reviewed-on: https://go-review.googlesource.com/c/154057
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alberto Donizetti <alb.donizetti@gmail.com>
By combining the load+op, we may force the op to happen earlier in
the store chain. That might force the SymAddr operation earlier, and
in particular earlier than its corresponding VarDef. That leads to
an invalid schedule, so avoid that.
This is kind of a hack to work around the issue presented. I think
the underlying problem, that LEAQ is not directly ordered with respect
to its vardef, is the real problem. The benefit of this CL is that
it fixes the immediate issue, is small, and obviously won't break
anything. A real fix for this issue is much more invasive.
The go binary is unchanged in size.
This situation just doesn't occur very often.
Fixes#28445
Change-Id: I13a765e13f075d5b6808a355ef3c43cdd7cd47b6
Reviewed-on: https://go-review.googlesource.com/c/153641
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
This optimization is not sound if A, B, or C
might overlap with each other.
Thanks to Michael Munday for pointing this
out during the review of CL 143479.
This reduces the number of times this optimization
triggers during make.bash from 386 to 74.
This is unfortunate, but I don't see an obvious way around it,
short of souping up the disjointness analysis.
name old object-bytes new object-bytes delta
Template 507kB ± 0% 507kB ± 0% +0.13% (p=0.008 n=5+5)
Unicode 225kB ± 0% 225kB ± 0% ~ (all equal)
GoTypes 1.85MB ± 0% 1.85MB ± 0% +0.02% (p=0.008 n=5+5)
Flate 328kB ± 0% 328kB ± 0% ~ (all equal)
GoParser 402kB ± 0% 402kB ± 0% ~ (all equal)
Reflect 1.41MB ± 0% 1.41MB ± 0% ~ (all equal)
Tar 457kB ± 0% 458kB ± 0% +0.20% (p=0.008 n=5+5)
XML 600kB ± 0% 601kB ± 0% +0.03% (p=0.008 n=5+5)
Change-Id: Ida408cb627145ba9faf473a78606f050c2f3f51c
Reviewed-on: https://go-review.googlesource.com/c/145208
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Michael Munday <mike.munday@ibm.com>