Commit graph

7 commits

Author SHA1 Message Date
Jorropo
ec92bc6d63 cmd/compile: rewrite Rsh to RshU if arguments are proved positive
Fixes #76332
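
The subject line relies on the fact that an arithmetic (signed) right shift and a logical (unsigned) right shift give the same result whenever the shifted value is non-negative. The sketch below only checks that fact at run time; the helper names are hypothetical and it does not show the compiler's rewrite rule.

```go
package main

import "fmt"

// signedShift performs an arithmetic (sign-extending) right shift.
func signedShift(x int64, s uint) int64 { return x >> s }

// unsignedShift performs a logical (zero-filling) right shift on the same bits.
func unsignedShift(x int64, s uint) int64 { return int64(uint64(x) >> s) }

func main() {
	// For non-negative x the two shifts agree for every shift count below 64.
	for _, x := range []int64{0, 1, 12345, 1<<62 + 7} {
		for s := uint(0); s < 64; s++ {
			if signedShift(x, s) != unsignedShift(x, s) {
				fmt.Println("mismatch", x, s)
			}
		}
	}
	// For a negative x they differ, which is why a proof is required first.
	fmt.Println(signedShift(-8, 1), unsignedShift(-8, 1))
}
```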

Change-Id: I9044025d5dc599531c7f88ed2870bcf3d8b0acbd
Reviewed-on: https://go-review.googlesource.com/c/go/+/721206
Reviewed-by: Mark Freeman <markfreeman@google.com>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Jorropo <jorropo.pgm@gmail.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Keith Randall <khr@golang.org>
2025-11-21 12:37:30 -08:00
Russ Cox
235b4e729d cmd/compile/internal/ssa: model right shift more precisely
Prove currently checks for 0 sign bit extraction (x>>63) at the
end of the pass, but it is more general and more useful
(and not really more work) to model right shift during
value range tracking. This handles sign bit extraction (both 0 and -1)
but also makes the value ranges available for proving bounds checks.

'go build -a -gcflags=-d=ssa/prove/debug=1 std'
finds 105 new things to prove.
https://gist.github.com/rsc/8ac41176e53ed9c2f1a664fc668e8336

For example, the compiler now recognizes that this code in
strconv does not need to check the second shift for being ≥ 64.

	msb := xHi >> 63
	retMantissa := xHi >> (msb + 38)

nor does this code in regexp:

	return b < utf8.RuneSelf && specialBytes[b%16]&(1<<(b/16)) != 0

This code in math no longer has a bounds check on the first index:

	if 0 <= n && n <= 308 {
		return pow10postab32[uint(n)/32] * pow10tab[uint(n)%32]
	}
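
The value ranges behind the strconv and math snippets above can be checked by hand. The following is a minimal sketch with hypothetical helper names (shiftAmount, signBit, tableIndex); it demonstrates the ranges only and says nothing about how prove derives them.

```go
package main

import "fmt"

// shiftAmount mirrors the strconv snippet: for a uint64, xHi>>63 is 0 or 1,
// so the second shift count is 38 or 39 and always below 64.
func shiftAmount(xHi uint64) uint64 {
	msb := xHi >> 63
	return msb + 38
}

// signBit shows the signed case: x>>63 is 0 for non-negative x and -1 otherwise.
func signBit(x int64) int64 { return x >> 63 }

// tableIndex mirrors the math snippet: with 0 <= n <= 308,
// uint(n)/32 is at most 9 and uint(n)%32 is at most 31.
func tableIndex(n int) (hi, lo uint) {
	return uint(n) / 32, uint(n) % 32
}

func main() {
	fmt.Println(shiftAmount(0), shiftAmount(1<<63))
	fmt.Println(signBit(42), signBit(-42))
	fmt.Println(tableIndex(308))
}
```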

The diff shows one "lost" proof in ycbcr.go but it's not really lost:
the expression was folded to a constant instead, and that only shows
up with debug=2. A diff of that output is at
https://gist.github.com/rsc/9139ed46c6019ae007f5a1ba4bb3250f

Change-Id: I84087311e0a303f00e2820d957a6f8b29ee22519
Reviewed-on: https://go-review.googlesource.com/c/go/+/716140
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Russ Cox <rsc@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2025-10-30 09:17:59 -07:00
Russ Cox
9bbda7c99d cmd/compile: make prove understand div, mod better
This CL introduces new divisible and divmod passes that rewrite
divisibility checks and div, mod, and mul. These happen after
prove, so that prove can make better sense of the code for
deriving bounds, and they must run before decompose, so that
64-bit ops can be lowered to 32-bit ops on 32-bit systems.
And then they need another generic pass as well, to optimize
the generated code before decomposing.

The three opt passes are "opt", "middle opt", and "late opt".
(Perhaps instead they should be "generic", "opt", and "late opt"?)

The "late opt" pass repeats the "middle opt" work on any new code
that has been generated in the interim.
There will not be new divs or mods, but there may be new muls.

The x%c==0 rewrite rules are much simpler now, since they can
match before divs have been rewritten. This has the effect of
applying them more consistently and making the rewrite rules
independent of the exact div rewrites.
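
For readers unfamiliar with division-free divisibility checks, here is a sketch of the well-known multiply, rotate, and compare technique for unsigned 32-bit values. It is an illustration under that assumption, not a copy of the compiler's rewrite rules, and inverse32 and divisible are hypothetical names.

```go
package main

import (
	"fmt"
	"math/bits"
)

// inverse32 returns the multiplicative inverse of odd c modulo 2^32.
// Newton's iteration doubles the number of correct low bits per step.
func inverse32(c uint32) uint32 {
	inv := c // correct to 3 bits, because c*c == 1 (mod 8) for odd c
	for i := 0; i < 4; i++ {
		inv *= 2 - c*inv
	}
	return inv
}

// divisible reports whether x%c == 0 using a multiply, a rotate, and a compare.
// c must be non-zero. Writing c = odd << k, x is a multiple of c exactly when
// x*inverse32(odd), rotated right by k, does not exceed (2^32-1)/c.
func divisible(x, c uint32) bool {
	k := bits.TrailingZeros32(c)
	odd := c >> k
	limit := ^uint32(0) / c
	return bits.RotateLeft32(x*inverse32(odd), -k) <= limit
}

func main() {
	mismatches := 0
	for _, c := range []uint32{1, 3, 6, 7, 12, 100, 641} {
		for x := uint32(0); x < 100000; x++ {
			if divisible(x, c) != (x%c == 0) {
				mismatches++
			}
		}
	}
	fmt.Println("mismatches:", mismatches)
}
```

In a compiler the inverse and the limit would be computed once from the constant divisor, leaving only the multiply, rotate, and compare at run time.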

Prove is also now charged with marking signed div/mod as
unsigned when the arguments call for it, allowing simpler
code to be emitted in various cases. For example,
t.Seconds()/2 and len(x)/2 are now recognized as unsigned,
meaning they compile to a simple shift (unsigned division),
avoiding the more complex fixup we need for signed values.
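
The difference between the two cases comes from Go's signed division truncating toward zero while an arithmetic shift rounds toward negative infinity. A small sketch, with hypothetical function names, of the fixup a signed division by 2 needs and the plain shift that suffices for non-negative values:

```go
package main

import "fmt"

// halfSigned matches x/2 for any int64: negative values get +1 added
// before the shift so that the result truncates toward zero.
func halfSigned(x int64) int64 {
	bias := int64(uint64(x) >> 63) // 1 if x is negative, else 0
	return (x + bias) >> 1
}

// halfNonNegative matches x/2 only when x >= 0, as is the case for len(x).
func halfNonNegative(x int64) int64 { return x >> 1 }

func main() {
	for _, x := range []int64{-8, -7, -1, 0, 1, 7, 8} {
		fmt.Println(x, x/2, halfSigned(x), halfNonNegative(x))
	}
}
```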

https://gist.github.com/rsc/99d9d3bd99cde87b6a1a390e3d85aa32
shows a diff of 'go build -a -gcflags=-d=ssa/prove/debug=1 std'
output before and after. "Proved Rsh64x64 shifts to zero" is replaced
by the higher-level "Proved Div64 is unsigned" (the shift was in the
signed expansion of div by constant), but otherwise prove is only
finding more things to prove.

One short example, in code that does x[i%len(x)]:

< runtime/mfinal.go:131:34: Proved Rsh64x64 shifts to zero
---
> runtime/mfinal.go:131:34: Proved Div64 is unsigned
> runtime/mfinal.go:131:38: Proved IsInBounds

A longer example:

< crypto/internal/fips140/sha3/shake.go:28:30: Proved Rsh64x64 shifts to zero
< crypto/internal/fips140/sha3/shake.go:38:27: Proved Rsh64x64 shifts to zero
< crypto/internal/fips140/sha3/shake.go:53:46: Proved Rsh64x64 shifts to zero
< crypto/internal/fips140/sha3/shake.go:55:46: Proved Rsh64x64 shifts to zero
---
> crypto/internal/fips140/sha3/shake.go:28:30: Proved Div64 is unsigned
> crypto/internal/fips140/sha3/shake.go:28:30: Proved IsInBounds
> crypto/internal/fips140/sha3/shake.go:28:30: Proved IsSliceInBounds
> crypto/internal/fips140/sha3/shake.go:38:27: Proved Div64 is unsigned
> crypto/internal/fips140/sha3/shake.go:45:7: Proved IsSliceInBounds
> crypto/internal/fips140/sha3/shake.go:46:4: Proved IsInBounds
> crypto/internal/fips140/sha3/shake.go:53:46: Proved Div64 is unsigned
> crypto/internal/fips140/sha3/shake.go:53:46: Proved IsInBounds
> crypto/internal/fips140/sha3/shake.go:53:46: Proved IsSliceInBounds
> crypto/internal/fips140/sha3/shake.go:55:46: Proved Div64 is unsigned
> crypto/internal/fips140/sha3/shake.go:55:46: Proved IsInBounds
> crypto/internal/fips140/sha3/shake.go:55:46: Proved IsSliceInBounds

These diffs are due to the smaller opt being better
and taking work away from prove:

< image/jpeg/dct.go:307:5: Proved IsInBounds
< image/jpeg/dct.go:308:5: Proved IsInBounds
...
< image/jpeg/dct.go:442:5: Proved IsInBounds

In the old opt, Mul by 8 was rewritten to Lsh by 3 early.
This CL delays that rule to help prove recognize mods,
but it also helps opt constant-fold the slice x[8*i:8*i+8:8*i+8].
Specifically, computing the length, opt can now do:

	(Sub64 (Add (Mul 8 i) 8) (Mul 8 i)) ->
	(Add 8 (Sub (Mul 8 i) (Mul 8 i))) ->
	(Add 8 (Mul 8 (Sub i i))) ->
	(Add 8 (Mul 8 0)) ->
	(Add 8 0) ->
	8

The key step is (Sub (Mul x y) (Mul x z)) -> (Mul x (Sub y z)).
Leaving the multiply as Mul enables using that step; the old
rewrite to Lsh blocked it, leaving prove to figure out the length
and then remove the bounds checks. But now opt can evaluate
the length down to a constant 8 and then constant-fold away
the bounds checks 0 < 8, 1 < 8, and so on. After that,
the compiler has nothing left to prove.
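
The algebra in this derivation holds for Go's wrapping integer arithmetic, so it is safe even when 8*i overflows. The sketch below checks both the slice length (high minus low for x[8*i:8*i+8]) and the factoring identity at run time; the function names are hypothetical.

```go
package main

import "fmt"

// sliceLen computes high-low for x[8*i : 8*i+8], the value opt folds to 8.
func sliceLen(i int64) int64 { return (8*i + 8) - 8*i }

// expanded and factored are the two sides of the key rewrite
// (Sub (Mul x y) (Mul x z)) -> (Mul x (Sub y z)).
func expanded(x, y, z int64) int64 { return x*y - x*z }
func factored(x, y, z int64) int64 { return x * (y - z) }

func main() {
	// 1<<60 and 1<<62 make 8*i wrap around; the length is still 8.
	for _, i := range []int64{0, 1, 1000, 1 << 60, 1 << 62, -5} {
		fmt.Println(i, sliceLen(i))
	}
	fmt.Println(expanded(1<<40, 1<<30, 3) == factored(1<<40, 1<<30, 3))
}
```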

Benchmarks are noisy in general; I checked the assembly for the many
large increases below, and the vast majority are unchanged and
presumably hitting the caches differently in some way.

The divisibility optimizations were not reliably triggering before.
This leads to a very large improvement in some cases, like
DivisiblePow2constI64, DivisibleconstI64 on 64-bit systems
and DivisibleconstU64 on 32-bit systems.

Another way the divisibility optimizations were unreliable before
was incorrectly triggering for x/3, x%3 even though they are
written not to do that. There is a real but small slowdown
in the DivisibleWDivconst benchmarks on Mac because in the cases
used in the benchmark, it is still faster (on Mac) to do the
divisibility check than to remultiply.
This may be worth further study. Perhaps when there is no rotate
(meaning the divisor is odd), the divisibility optimization
should be enabled always. In any event, this CL makes it possible
to study that.

benchmark \ host                          s7  linux-amd64      mac  linux-arm64  linux-ppc64le  linux-386  s7:GOARCH=386  linux-arm
                                     vs base      vs base  vs base      vs base        vs base    vs base        vs base    vs base
LoadAdd                                    ~            ~        ~            ~              ~     -1.59%              ~          ~
ExtShift                                   ~            ~  -42.14%       +0.10%              ~     +1.44%         +5.66%     +8.50%
Modify                                     ~            ~        ~            ~              ~          ~              ~     -1.53%
MullImm                                    ~            ~        ~            ~              ~    +37.90%        -21.87%     +3.05%
ConstModify                                ~            ~        ~            ~        -49.14%          ~              ~          ~
BitSet                                     ~            ~        ~            ~        -15.86%    -14.57%         +6.44%     +0.06%
BitClear                                   ~            ~        ~            ~              ~     +1.78%         +3.50%     +0.06%
BitToggle                                  ~            ~        ~            ~              ~    -16.09%         +2.91%          ~
BitSetConst                                ~            ~        ~            ~              ~          ~              ~     -0.49%
BitClearConst                              ~            ~        ~            ~        -28.29%          ~              ~     -0.40%
BitToggleConst                             ~            ~        ~       +8.89%        -31.19%          ~              ~     -0.77%
MulNeg                                     ~            ~        ~            ~              ~          ~              ~          ~
Mul2Neg                                    ~            ~   -4.83%            ~              ~    -13.75%         -5.92%          ~
DivconstI64                                ~            ~        ~            ~              ~    -30.12%              ~     +0.50%
ModconstI64                                ~            ~   -9.94%       -4.63%              ~     +3.15%              ~     +5.32%
DivisiblePow2constI64                -34.49%      -12.58%        ~            ~        -12.25%          ~              ~          ~
DivisibleconstI64                    -24.69%      -25.06%   -0.40%       -2.27%        -42.61%     -3.31%              ~     +1.63%
DivisibleWDivconstI64                      ~            ~        ~            ~              ~    -17.55%              ~     -0.60%
DivconstU64/3                              ~            ~        ~            ~              ~     +1.51%              ~          ~
DivconstU64/5                              ~            ~        ~            ~              ~          ~              ~          ~
DivconstU64/37                             ~            ~   -0.18%            ~              ~     +2.70%              ~          ~
DivconstU64/1234567                        ~            ~        ~            ~              ~          ~              ~     +0.12%
ModconstU64                                ~            ~        ~       -0.24%              ~     -5.10%         -1.07%     -1.56%
DivisibleconstU64                          ~            ~        ~            ~              ~    -29.01%        -59.13%    -50.72%
DivisibleWDivconstU64                      ~            ~  -12.18%      -18.88%              ~     -5.50%         -3.91%     +5.17%
DivconstI32                                ~            ~   -0.48%            ~        -34.69%    +89.01%         -6.01%    -16.67%
ModconstI32                                ~       +2.95%   -0.33%            ~              ~     -2.98%         -5.40%     -8.30%
DivisiblePow2constI32                      ~            ~        ~            ~              ~          ~              ~    -16.22%
DivisibleconstI32                          ~            ~        ~            ~              ~    -37.27%        -47.75%    -25.03%
DivisibleWDivconstI32                -11.59%       +5.22%  -12.99%      -23.83%              ~    +45.95%         -7.03%    -10.01%
DivconstU32                                ~            ~        ~            ~              ~    +74.71%         +4.81%          ~
ModconstU32                                ~            ~   +0.53%       +0.18%              ~    +51.16%              ~          ~
DivisibleconstU32                          ~            ~        ~       -0.62%              ~     -4.25%              ~          ~
DivisibleWDivconstU32                 -2.77%       +5.56%  +11.12%       -5.15%              ~    +48.70%        +25.11%     -4.07%
DivconstI16                           -6.06%            ~   -0.33%       +0.22%              ~          ~         -9.68%     +5.47%
ModconstI16                                ~            ~   +4.44%       +2.82%              ~          ~              ~     +5.06%
DivisiblePow2constI16                      ~            ~        ~            ~              ~          ~              ~     -0.17%
DivisibleconstI16                          ~            ~   -0.23%            ~              ~          ~         +4.60%     +6.64%
DivisibleWDivconstI16                 -1.44%       -0.43%  +13.48%       -5.76%              ~     +1.62%        -23.15%     -9.06%
DivconstU16                           +1.61%            ~   -0.35%       -0.47%              ~          ~        +15.59%          ~
ModconstU16                                ~            ~        ~            ~              ~     -0.72%              ~    +14.23%
DivisibleconstU16                          ~            ~   -0.05%       +3.00%              ~          ~              ~     +5.06%
DivisibleWDivconstU16                +52.10%       +0.75%  +17.28%       +4.79%              ~    -37.39%         +5.28%     -9.06%
DivconstI8                                 ~            ~   -0.34%       -0.96%              ~          ~         -9.20%          ~
ModconstI8                            +2.29%            ~   +4.38%       +2.96%              ~          ~              ~          ~
DivisiblePow2constI8                       ~            ~        ~            ~              ~          ~              ~          ~
DivisibleconstI8                           ~            ~        ~            ~              ~          ~         +6.04%          ~
DivisibleWDivconstI8                 -26.44%       +1.69%  +17.03%       +4.05%              ~    +32.48%        -24.90%          ~
DivconstU8                            -4.50%      +14.06%   -0.28%            ~              ~          ~         +4.16%     +0.88%
ModconstU8                                 ~            ~  +25.84%       -0.64%              ~          ~              ~          ~
DivisibleconstU8                           ~            ~   -5.70%            ~              ~          ~              ~          ~
DivisibleWDivconstU8                 +49.55%       +9.07%        ~       +4.03%        +53.87%    -40.03%        +39.72%     -3.01%
Mul2                                       ~            ~        ~            ~              ~          ~              ~          ~
MulNeg2                                    ~            ~        ~            ~        -11.73%          ~              ~     -0.02%
EfaceInteger                               ~            ~        ~            ~              ~    +18.11%              ~     +2.53%
TypeAssert                           +33.90%       +2.86%        ~            ~              ~     -1.07%         -5.29%     -1.04%
Div64UnsignedSmall                         ~            ~        ~            ~              ~          ~              ~          ~
Div64Small                                 ~            ~        ~            ~              ~     -0.88%              ~     +2.39%
Div64SmallNegDivisor                       ~            ~        ~            ~              ~          ~              ~     +0.35%
Div64SmallNegDividend                      ~            ~        ~            ~              ~     -0.84%              ~     +3.57%
Div64SmallNegBoth                          ~            ~        ~            ~              ~     -0.86%              ~     +3.55%
Div64Unsigned                              ~            ~        ~            ~              ~          ~              ~     -0.11%
Div64                                      ~            ~        ~            ~              ~          ~              ~     +0.11%
Div64NegDivisor                            ~            ~        ~            ~              ~     -1.29%              ~          ~
Div64NegDividend                           ~            ~        ~            ~              ~     -1.44%              ~          ~
Div64NegBoth                               ~            ~        ~            ~              ~          ~              ~     +0.28%
Mod64UnsignedSmall                         ~            ~        ~            ~              ~     +0.48%              ~     +0.93%
Mod64Small                                 ~            ~        ~            ~              ~          ~              ~          ~
Mod64SmallNegDivisor                       ~            ~        ~            ~              ~          ~              ~     +1.44%
Mod64SmallNegDividend                      ~            ~        ~            ~              ~     +0.22%              ~     +1.37%
Mod64SmallNegBoth                          ~            ~        ~            ~              ~          ~              ~     -2.22%
Mod64Unsigned                              ~            ~        ~            ~              ~     -0.95%              ~     +0.11%
Mod64                                      ~            ~        ~            ~              ~          ~              ~          ~
Mod64NegDivisor                            ~            ~        ~            ~              ~          ~              ~     -0.02%
Mod64NegDividend                           ~            ~        ~            ~              ~          ~              ~          ~
Mod64NegBoth                               ~            ~        ~            ~              ~          ~              ~     -0.02%
MulconstI32/3                              ~            ~        ~      -25.00%              ~          ~              ~    +47.37%
MulconstI32/5                              ~            ~        ~      +33.28%              ~          ~              ~    +32.21%
MulconstI32/12                             ~            ~        ~       -2.13%              ~          ~              ~     -0.02%
MulconstI32/120                            ~            ~        ~       +2.93%              ~          ~              ~     -0.03%
MulconstI32/-120                           ~            ~        ~       -2.17%              ~          ~              ~     -0.03%
MulconstI32/65537                          ~            ~        ~            ~              ~          ~              ~     +0.03%
MulconstI32/65538                          ~            ~        ~            ~              ~    -33.38%              ~     +0.04%
MulconstI64/3                              ~            ~        ~      +33.35%              ~     -0.37%              ~     -0.13%
MulconstI64/5                              ~            ~        ~      -25.00%              ~     -0.34%              ~          ~
MulconstI64/12                             ~            ~        ~       +2.13%              ~    +11.62%              ~     +2.30%
MulconstI64/120                            ~            ~        ~       -1.98%              ~          ~              ~          ~
MulconstI64/-120                           ~            ~        ~       +0.75%              ~          ~              ~          ~
MulconstI64/65537                          ~            ~        ~            ~              ~     +5.61%              ~          ~
MulconstI64/65538                          ~            ~        ~            ~              ~     +5.25%              ~          ~
MulconstU32/3                              ~       +0.81%        ~      +33.39%              ~    +77.92%              ~    -32.31%
MulconstU32/5                              ~            ~        ~      -24.97%              ~    +77.92%              ~    -24.47%
MulconstU32/12                             ~            ~        ~       +2.06%              ~          ~              ~     +0.03%
MulconstU32/120                            ~            ~        ~       -2.74%              ~          ~              ~     +0.03%
MulconstU32/65537                          ~            ~        ~            ~              ~          ~              ~     +0.03%
MulconstU32/65538                          ~            ~        ~            ~              ~    -33.42%              ~     -0.03%
MulconstU64/3                              ~            ~        ~      +33.33%              ~     -0.28%              ~     +1.22%
MulconstU64/5                              ~            ~        ~      -25.00%              ~          ~              ~     -0.64%
MulconstU64/12                             ~            ~        ~       +2.30%              ~    +11.59%              ~     +0.14%
MulconstU64/120                            ~            ~        ~       -2.82%              ~          ~              ~     +0.04%
MulconstU64/65537                          ~       +0.37%        ~            ~              ~     +5.58%              ~          ~
MulconstU64/65538                          ~            ~        ~            ~              ~     +5.16%              ~          ~
ShiftArithmeticRight                       ~            ~        ~            ~              ~    -10.81%              ~     +0.31%
Switch8Predictable                   +14.69%            ~        ~            ~              ~    -24.85%              ~          ~
Switch8Unpredictable                       ~       -0.58%   -3.80%            ~              ~    -11.78%              ~     -0.79%
Switch32Predictable                  -10.33%      +17.89%        ~            ~              ~     +5.76%              ~          ~
Switch32Unpredictable                 -3.15%       +1.19%   +9.42%            ~              ~    -10.30%         -5.09%     +0.44%
SwitchStringPredictable              +70.88%      +20.48%        ~            ~              ~     +2.39%              ~     +0.31%
SwitchStringUnpredictable                  ~       +3.91%   -5.06%       -0.98%              ~     +0.61%         +2.03%          ~
SwitchTypePredictable               +146.58%       -1.10%        ~      -12.45%              ~     -0.46%         -3.81%          ~
SwitchTypeUnpredictable               +0.46%       -0.83%        ~       +4.18%              ~     +0.43%              ~     +0.62%
SwitchInterfaceTypePredictable       -13.41%      -10.13%  +11.03%            ~              ~     -4.38%              ~     +0.75%
SwitchInterfaceTypeUnpredictable      -6.37%       -2.14%        ~       -3.21%              ~     -4.20%              ~     +1.08%

Fixes #63110.
Fixes #75954.

Change-Id: I55a876f08c6c14f419ce1a8cbba2eaae6c6efbf0
Reviewed-on: https://go-review.googlesource.com/c/go/+/714160
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-10-29 18:49:40 -07:00
Jorropo
1a72920f09 cmd/compile: learn transitive proofs for safe positive signed adds
I've split this into its own CL to make git bisect more effective.
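
The message gives no example. One plausible reading of the subject line, stated here as an assumption rather than taken from the CL, is that when two signed operands are non-negative and their sum cannot overflow, the sum is at least as large as each operand. The hypothetical addBounded below shows that fact:

```go
package main

import "fmt"

// addBounded adds two values that are first restricted to [0, 1000].
// The sum is at most 2000, so it cannot overflow, and with both operands
// non-negative it follows that s >= x and s >= y.
func addBounded(x, y int) (int, bool) {
	if x < 0 || x > 1000 || y < 0 || y > 1000 {
		return 0, false
	}
	s := x + y
	// Assumed to be the kind of relation the CL lets prove learn.
	return s, s >= x && s >= y
}

func main() {
	fmt.Println(addBounded(400, 600))
	fmt.Println(addBounded(-1, 5))
}
```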

Change-Id: I3fbb42ec7d29169a29f7f55ef2c188317512f532
Reviewed-on: https://go-review.googlesource.com/c/go/+/685819
Auto-Submit: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2025-07-24 13:48:59 -07:00
khr@golang.org
3b96eebcbd cmd/compile: rewrite the constant parts of the prove pass
Handles a lot more cases where constant ranges can eliminate
various (mostly bounds failure) paths.
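
As an illustration of a constant range making a bounds-failure path unreachable, consider the hypothetical lookup below. After the guard, i lies in [0, 9], so the index into a 10-element array cannot fail. Whether this exact case was already handled before this CL is not stated in the message; the output of 'go build -gcflags=-d=ssa/prove/debug=1', the flag quoted elsewhere in this log, can be inspected to see what prove reports.

```go
package main

import "fmt"

var table = [10]int{0, 1, 4, 9, 16, 25, 36, 49, 64, 81}

// lookup guards the index with constant bounds, so on the indexing path
// i is known to be in [0, 9] and the implicit bounds check cannot fail.
func lookup(i int) (int, bool) {
	if i < 0 || i >= len(table) {
		return 0, false
	}
	return table[i], true
}

func main() {
	fmt.Println(lookup(3))
	fmt.Println(lookup(10))
}
```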

Fixes #66826
Fixes #66692
Fixes #48213
Update #57959

TODO: remove constant logic from poset code, no longer needed.

Change-Id: Id196436fcd8a0c84c7d59c04f93bd92e26a0fd7e
Reviewed-on: https://go-review.googlesource.com/c/go/+/599096
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
2024-08-07 16:07:33 +00:00
Dmitri Shuralyov
b2fd76ab8d test: migrate remaining files to go:build syntax
Most of the test cases in the test directory use the new go:build syntax
already. Convert the rest. In general, try to place the build constraint
line below the test directive comment in more places.

For #41184.
For #60268.

Change-Id: I11c41a0642a8a26dc2eda1406da908645bbc005b
Cq-Include-Trybots: luci.golang.try:gotip-linux-386-longtest,gotip-linux-amd64-longtest,gotip-windows-amd64-longtest
Reviewed-on: https://go-review.googlesource.com/c/go/+/536236
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Dmitri Shuralyov <dmitshur@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
2023-10-19 23:33:25 +00:00
Jorropo
e1e056fa6a cmd/compile: fold constants found by prove
It is hit ~70k times building go.
This makes the go binary 0.04% smaller.
I didn't include benchmarks because this is just constant folding
and is hard to measure objectively.

For example, this enables rewriting things like:
  if x == 20 {
    return x + 30 + z
  }

Into:
  if x == 20 {
    return 50 + z
  }

It's not just fixing programmers' code;
the SSA generator sometimes generates code like this.

Change-Id: I0861f342b27f7227b5f1c34d8267fa0057b1bbbc
GitHub-Last-Rev: 4c2f9b5216
GitHub-Pull-Request: golang/go#52669
Reviewed-on: https://go-review.googlesource.com/c/go/+/403735
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Chase <drchase@google.com>
2022-05-04 20:30:17 +00:00