cmd/compile: make prove understand div, mod better

This CL introduces new divisible and divmod passes that rewrite
divisibility checks and div, mod, and mul. These happen after
prove, so that prove can make better sense of the code for
deriving bounds, and they must run before decompose, so that
64-bit ops can be lowered to 32-bit ops on 32-bit systems.
And then they need another generic pass as well, to optimize
the generated code before decomposing.

The three opt passes are "opt", "middle opt", and "late opt".
(Perhaps instead they should be "generic", "opt", and "late opt"?)

The "late opt" pass repeats the "middle opt" work on any new code
that has been generated in the interim.
There will not be new divs or mods, but there may be new muls.

The x%c==0 rewrite rules are much simpler now, since they can
match before divs have been rewritten. This has the effect of
applying them more consistently and making the rewrite rules
independent of the exact div rewrites.
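
For example, a divisibility check like the following (a sketch, not
taken from the CL's tests) is now matched while the div is still a
Div op:

	// isMultipleOf5 reports whether x is divisible by 5.
	// The divisible pass turns the x%5 == 0 check into the
	// multiply/rotate/compare form from divisible.rules rather
	// than a real division.
	func isMultipleOf5(x uint32) bool {
		return x%5 == 0
	}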

Prove is also now charged with marking signed div/mod as
unsigned when the arguments call for it, allowing simpler
code to be emitted in various cases. For example,
t.Seconds()/2 and len(x)/2 are now recognized as unsigned,
meaning they compile to a simple shift (unsigned division),
avoiding the more complex fixup we need for signed values.
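
For example (a sketch of the kind of code that benefits, not taken
from the CL's tests):

	// len(x) is known to be non-negative, so prove marks the signed
	// division as unsigned and the compiler emits a plain right
	// shift instead of the signed rounding fixup.
	func half(x []byte) int {
		return len(x) / 2
	}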

https://gist.github.com/rsc/99d9d3bd99cde87b6a1a390e3d85aa32
shows a diff of 'go build -a -gcflags=-d=ssa/prove/debug=1 std'
output before and after. "Proved Rsh64x64 shifts to zero" is replaced
by the higher-level "Proved Div64 is unsigned" (the shift was in the
signed expansion of div by constant), but otherwise prove is only
finding more things to prove.

One short example, in code that does x[i%len(x)]:

< runtime/mfinal.go:131:34: Proved Rsh64x64 shifts to zero
---
> runtime/mfinal.go:131:34: Proved Div64 is unsigned
> runtime/mfinal.go:131:38: Proved IsInBounds
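
The code at that site is, roughly (a sketch, not the actual
mfinal.go source):

	// i is a loop index, so prove knows i >= 0; together with the
	// non-panicking mod, 0 <= i%len(x) < len(x), which makes the
	// Div64 unsigned and the index in bounds.
	func sumWrapped(x []int, n int) int {
		s := 0
		for i := 0; i < n; i++ {
			s += x[i%len(x)]
		}
		return s
	}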

A longer example:

< crypto/internal/fips140/sha3/shake.go:28:30: Proved Rsh64x64 shifts to zero
< crypto/internal/fips140/sha3/shake.go:38:27: Proved Rsh64x64 shifts to zero
< crypto/internal/fips140/sha3/shake.go:53:46: Proved Rsh64x64 shifts to zero
< crypto/internal/fips140/sha3/shake.go:55:46: Proved Rsh64x64 shifts to zero
---
> crypto/internal/fips140/sha3/shake.go:28:30: Proved Div64 is unsigned
> crypto/internal/fips140/sha3/shake.go:28:30: Proved IsInBounds
> crypto/internal/fips140/sha3/shake.go:28:30: Proved IsSliceInBounds
> crypto/internal/fips140/sha3/shake.go:38:27: Proved Div64 is unsigned
> crypto/internal/fips140/sha3/shake.go:45:7: Proved IsSliceInBounds
> crypto/internal/fips140/sha3/shake.go:46:4: Proved IsInBounds
> crypto/internal/fips140/sha3/shake.go:53:46: Proved Div64 is unsigned
> crypto/internal/fips140/sha3/shake.go:53:46: Proved IsInBounds
> crypto/internal/fips140/sha3/shake.go:53:46: Proved IsSliceInBounds
> crypto/internal/fips140/sha3/shake.go:55:46: Proved Div64 is unsigned
> crypto/internal/fips140/sha3/shake.go:55:46: Proved IsInBounds
> crypto/internal/fips140/sha3/shake.go:55:46: Proved IsSliceInBounds

These diffs are due to the smaller opt being better
and taking work away from prove:

< image/jpeg/dct.go:307:5: Proved IsInBounds
< image/jpeg/dct.go:308:5: Proved IsInBounds
...
< image/jpeg/dct.go:442:5: Proved IsInBounds

In the old opt, Mul by 8 was rewritten to Lsh by 3 early.
This CL delays that rule to help prove recognize mods,
but it also helps opt constant-fold the slice x[8*i:8*i+8:8*i+8].
Specifically, when computing the length, opt can now do:

	(Sub64 (Add (Mul 8 i) 8) (Mul 8 i)) ->
	(Add 8 (Sub (Mul 8 i) (Mul 8 i))) ->
	(Add 8 (Mul 8 (Sub i i))) ->
	(Add 8 (Mul 8 0)) ->
	(Add 8 0) ->
	8

The key step is (Sub (Mul x y) (Mul x z)) -> (Mul x (Sub y z)).
Leaving the multiply as Mul enables that step; the old
rewrite to Lsh blocked it, leaving prove to figure out the length
and then remove the bounds checks. But now opt can evaluate
the length down to a constant 8 and then constant-fold away
the bounds checks 0 < 8, 1 < 8, and so on. After that,
the compiler has nothing left to prove.
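
A sketch of the kind of code this helps, in the spirit of the
dct.go loops (not the actual source):

	func sumRow(x []float32, i int) float32 {
		row := x[8*i : 8*i+8 : 8*i+8]
		// The row length folds to the constant 8, so the checks
		// for row[0] ... row[7] (0 < 8, 1 < 8, ...) constant-fold
		// away as well.
		return row[0] + row[1] + row[2] + row[3] +
			row[4] + row[5] + row[6] + row[7]
	}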

Benchmarks are noisy in general; I checked the assembly for the many
large increases below, and the vast majority are unchanged and
presumably hitting the caches differently in some way.

The divisibility optimizations were not triggering reliably before.
Fixing that leads to a very large improvement in some cases, like
DivisiblePow2constI64 and DivisibleconstI64 on 64-bit systems
and DivisibleconstU64 on 32-bit systems.

Another way the divisibility optimizations were unreliable before
is that they triggered incorrectly for x/3, x%3 even though the
rules are written not to do that. There is a real but small slowdown
in the DivisibleWDivconst benchmarks on Mac because, in the cases
used in the benchmark, it is still faster (on Mac) to do the
divisibility check than to remultiply.
This may be worth further study. Perhaps when there is no rotate
(meaning the divisor is odd), the divisibility optimization
should always be enabled. In any event, this CL makes it possible
to study that.
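
For reference, the shape of code the DivisibleWDivconst benchmarks
exercise is roughly this (a sketch; the real benchmark bodies may
differ):

	// Both the quotient and the divisibility result are needed,
	// so div.Uses != 1 and the rules leave the check as
	// x == (x/7)*7 (a remultiply) rather than emitting the
	// separate multiply/rotate/compare divisibility test.
	func quotRem7(x int64) (int64, bool) {
		q := x / 7
		return q, x%7 == 0
	}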

benchmark \ host                          s7  linux-amd64      mac  linux-arm64  linux-ppc64le  linux-386  s7:GOARCH=386  linux-arm
                                     vs base      vs base  vs base      vs base        vs base    vs base        vs base    vs base
LoadAdd                                    ~            ~        ~            ~              ~     -1.59%              ~          ~
ExtShift                                   ~            ~  -42.14%       +0.10%              ~     +1.44%         +5.66%     +8.50%
Modify                                     ~            ~        ~            ~              ~          ~              ~     -1.53%
MullImm                                    ~            ~        ~            ~              ~    +37.90%        -21.87%     +3.05%
ConstModify                                ~            ~        ~            ~        -49.14%          ~              ~          ~
BitSet                                     ~            ~        ~            ~        -15.86%    -14.57%         +6.44%     +0.06%
BitClear                                   ~            ~        ~            ~              ~     +1.78%         +3.50%     +0.06%
BitToggle                                  ~            ~        ~            ~              ~    -16.09%         +2.91%          ~
BitSetConst                                ~            ~        ~            ~              ~          ~              ~     -0.49%
BitClearConst                              ~            ~        ~            ~        -28.29%          ~              ~     -0.40%
BitToggleConst                             ~            ~        ~       +8.89%        -31.19%          ~              ~     -0.77%
MulNeg                                     ~            ~        ~            ~              ~          ~              ~          ~
Mul2Neg                                    ~            ~   -4.83%            ~              ~    -13.75%         -5.92%          ~
DivconstI64                                ~            ~        ~            ~              ~    -30.12%              ~     +0.50%
ModconstI64                                ~            ~   -9.94%       -4.63%              ~     +3.15%              ~     +5.32%
DivisiblePow2constI64                -34.49%      -12.58%        ~            ~        -12.25%          ~              ~          ~
DivisibleconstI64                    -24.69%      -25.06%   -0.40%       -2.27%        -42.61%     -3.31%              ~     +1.63%
DivisibleWDivconstI64                      ~            ~        ~            ~              ~    -17.55%              ~     -0.60%
DivconstU64/3                              ~            ~        ~            ~              ~     +1.51%              ~          ~
DivconstU64/5                              ~            ~        ~            ~              ~          ~              ~          ~
DivconstU64/37                             ~            ~   -0.18%            ~              ~     +2.70%              ~          ~
DivconstU64/1234567                        ~            ~        ~            ~              ~          ~              ~     +0.12%
ModconstU64                                ~            ~        ~       -0.24%              ~     -5.10%         -1.07%     -1.56%
DivisibleconstU64                          ~            ~        ~            ~              ~    -29.01%        -59.13%    -50.72%
DivisibleWDivconstU64                      ~            ~  -12.18%      -18.88%              ~     -5.50%         -3.91%     +5.17%
DivconstI32                                ~            ~   -0.48%            ~        -34.69%    +89.01%         -6.01%    -16.67%
ModconstI32                                ~       +2.95%   -0.33%            ~              ~     -2.98%         -5.40%     -8.30%
DivisiblePow2constI32                      ~            ~        ~            ~              ~          ~              ~    -16.22%
DivisibleconstI32                          ~            ~        ~            ~              ~    -37.27%        -47.75%    -25.03%
DivisibleWDivconstI32                -11.59%       +5.22%  -12.99%      -23.83%              ~    +45.95%         -7.03%    -10.01%
DivconstU32                                ~            ~        ~            ~              ~    +74.71%         +4.81%          ~
ModconstU32                                ~            ~   +0.53%       +0.18%              ~    +51.16%              ~          ~
DivisibleconstU32                          ~            ~        ~       -0.62%              ~     -4.25%              ~          ~
DivisibleWDivconstU32                 -2.77%       +5.56%  +11.12%       -5.15%              ~    +48.70%        +25.11%     -4.07%
DivconstI16                           -6.06%            ~   -0.33%       +0.22%              ~          ~         -9.68%     +5.47%
ModconstI16                                ~            ~   +4.44%       +2.82%              ~          ~              ~     +5.06%
DivisiblePow2constI16                      ~            ~        ~            ~              ~          ~              ~     -0.17%
DivisibleconstI16                          ~            ~   -0.23%            ~              ~          ~         +4.60%     +6.64%
DivisibleWDivconstI16                 -1.44%       -0.43%  +13.48%       -5.76%              ~     +1.62%        -23.15%     -9.06%
DivconstU16                           +1.61%            ~   -0.35%       -0.47%              ~          ~        +15.59%          ~
ModconstU16                                ~            ~        ~            ~              ~     -0.72%              ~    +14.23%
DivisibleconstU16                          ~            ~   -0.05%       +3.00%              ~          ~              ~     +5.06%
DivisibleWDivconstU16                +52.10%       +0.75%  +17.28%       +4.79%              ~    -37.39%         +5.28%     -9.06%
DivconstI8                                 ~            ~   -0.34%       -0.96%              ~          ~         -9.20%          ~
ModconstI8                            +2.29%            ~   +4.38%       +2.96%              ~          ~              ~          ~
DivisiblePow2constI8                       ~            ~        ~            ~              ~          ~              ~          ~
DivisibleconstI8                           ~            ~        ~            ~              ~          ~         +6.04%          ~
DivisibleWDivconstI8                 -26.44%       +1.69%  +17.03%       +4.05%              ~    +32.48%        -24.90%          ~
DivconstU8                            -4.50%      +14.06%   -0.28%            ~              ~          ~         +4.16%     +0.88%
ModconstU8                                 ~            ~  +25.84%       -0.64%              ~          ~              ~          ~
DivisibleconstU8                           ~            ~   -5.70%            ~              ~          ~              ~          ~
DivisibleWDivconstU8                 +49.55%       +9.07%        ~       +4.03%        +53.87%    -40.03%        +39.72%     -3.01%
Mul2                                       ~            ~        ~            ~              ~          ~              ~          ~
MulNeg2                                    ~            ~        ~            ~        -11.73%          ~              ~     -0.02%
EfaceInteger                               ~            ~        ~            ~              ~    +18.11%              ~     +2.53%
TypeAssert                           +33.90%       +2.86%        ~            ~              ~     -1.07%         -5.29%     -1.04%
Div64UnsignedSmall                         ~            ~        ~            ~              ~          ~              ~          ~
Div64Small                                 ~            ~        ~            ~              ~     -0.88%              ~     +2.39%
Div64SmallNegDivisor                       ~            ~        ~            ~              ~          ~              ~     +0.35%
Div64SmallNegDividend                      ~            ~        ~            ~              ~     -0.84%              ~     +3.57%
Div64SmallNegBoth                          ~            ~        ~            ~              ~     -0.86%              ~     +3.55%
Div64Unsigned                              ~            ~        ~            ~              ~          ~              ~     -0.11%
Div64                                      ~            ~        ~            ~              ~          ~              ~     +0.11%
Div64NegDivisor                            ~            ~        ~            ~              ~     -1.29%              ~          ~
Div64NegDividend                           ~            ~        ~            ~              ~     -1.44%              ~          ~
Div64NegBoth                               ~            ~        ~            ~              ~          ~              ~     +0.28%
Mod64UnsignedSmall                         ~            ~        ~            ~              ~     +0.48%              ~     +0.93%
Mod64Small                                 ~            ~        ~            ~              ~          ~              ~          ~
Mod64SmallNegDivisor                       ~            ~        ~            ~              ~          ~              ~     +1.44%
Mod64SmallNegDividend                      ~            ~        ~            ~              ~     +0.22%              ~     +1.37%
Mod64SmallNegBoth                          ~            ~        ~            ~              ~          ~              ~     -2.22%
Mod64Unsigned                              ~            ~        ~            ~              ~     -0.95%              ~     +0.11%
Mod64                                      ~            ~        ~            ~              ~          ~              ~          ~
Mod64NegDivisor                            ~            ~        ~            ~              ~          ~              ~     -0.02%
Mod64NegDividend                           ~            ~        ~            ~              ~          ~              ~          ~
Mod64NegBoth                               ~            ~        ~            ~              ~          ~              ~     -0.02%
MulconstI32/3                              ~            ~        ~      -25.00%              ~          ~              ~    +47.37%
MulconstI32/5                              ~            ~        ~      +33.28%              ~          ~              ~    +32.21%
MulconstI32/12                             ~            ~        ~       -2.13%              ~          ~              ~     -0.02%
MulconstI32/120                            ~            ~        ~       +2.93%              ~          ~              ~     -0.03%
MulconstI32/-120                           ~            ~        ~       -2.17%              ~          ~              ~     -0.03%
MulconstI32/65537                          ~            ~        ~            ~              ~          ~              ~     +0.03%
MulconstI32/65538                          ~            ~        ~            ~              ~    -33.38%              ~     +0.04%
MulconstI64/3                              ~            ~        ~      +33.35%              ~     -0.37%              ~     -0.13%
MulconstI64/5                              ~            ~        ~      -25.00%              ~     -0.34%              ~          ~
MulconstI64/12                             ~            ~        ~       +2.13%              ~    +11.62%              ~     +2.30%
MulconstI64/120                            ~            ~        ~       -1.98%              ~          ~              ~          ~
MulconstI64/-120                           ~            ~        ~       +0.75%              ~          ~              ~          ~
MulconstI64/65537                          ~            ~        ~            ~              ~     +5.61%              ~          ~
MulconstI64/65538                          ~            ~        ~            ~              ~     +5.25%              ~          ~
MulconstU32/3                              ~       +0.81%        ~      +33.39%              ~    +77.92%              ~    -32.31%
MulconstU32/5                              ~            ~        ~      -24.97%              ~    +77.92%              ~    -24.47%
MulconstU32/12                             ~            ~        ~       +2.06%              ~          ~              ~     +0.03%
MulconstU32/120                            ~            ~        ~       -2.74%              ~          ~              ~     +0.03%
MulconstU32/65537                          ~            ~        ~            ~              ~          ~              ~     +0.03%
MulconstU32/65538                          ~            ~        ~            ~              ~    -33.42%              ~     -0.03%
MulconstU64/3                              ~            ~        ~      +33.33%              ~     -0.28%              ~     +1.22%
MulconstU64/5                              ~            ~        ~      -25.00%              ~          ~              ~     -0.64%
MulconstU64/12                             ~            ~        ~       +2.30%              ~    +11.59%              ~     +0.14%
MulconstU64/120                            ~            ~        ~       -2.82%              ~          ~              ~     +0.04%
MulconstU64/65537                          ~       +0.37%        ~            ~              ~     +5.58%              ~          ~
MulconstU64/65538                          ~            ~        ~            ~              ~     +5.16%              ~          ~
ShiftArithmeticRight                       ~            ~        ~            ~              ~    -10.81%              ~     +0.31%
Switch8Predictable                   +14.69%            ~        ~            ~              ~    -24.85%              ~          ~
Switch8Unpredictable                       ~       -0.58%   -3.80%            ~              ~    -11.78%              ~     -0.79%
Switch32Predictable                  -10.33%      +17.89%        ~            ~              ~     +5.76%              ~          ~
Switch32Unpredictable                 -3.15%       +1.19%   +9.42%            ~              ~    -10.30%         -5.09%     +0.44%
SwitchStringPredictable              +70.88%      +20.48%        ~            ~              ~     +2.39%              ~     +0.31%
SwitchStringUnpredictable                  ~       +3.91%   -5.06%       -0.98%              ~     +0.61%         +2.03%          ~
SwitchTypePredictable               +146.58%       -1.10%        ~      -12.45%              ~     -0.46%         -3.81%          ~
SwitchTypeUnpredictable               +0.46%       -0.83%        ~       +4.18%              ~     +0.43%              ~     +0.62%
SwitchInterfaceTypePredictable       -13.41%      -10.13%  +11.03%            ~              ~     -4.38%              ~     +0.75%
SwitchInterfaceTypeUnpredictable      -6.37%       -2.14%        ~       -3.21%              ~     -4.20%              ~     +1.08%

Fixes #63110.
Fixes #75954.

Change-Id: I55a876f08c6c14f419ce1a8cbba2eaae6c6efbf0
Reviewed-on: https://go-review.googlesource.com/c/go/+/714160
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Auto-Submit: Russ Cox <rsc@golang.org>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Russ Cox 2025-10-22 22:22:51 -04:00 committed by Gopher Robot
parent 915c1839fe
commit 9bbda7c99d
25 changed files with 6190 additions and 4205 deletions


@ -4,7 +4,7 @@
// This file contains rules to decompose builtin compound types
// (complex,string,slice,interface) into their constituent
// types. These rules work together with the decomposeBuiltIn
// types. These rules work together with the decomposeBuiltin
// pass which handles phis of these types.
(Store {t} _ _ mem) && t.Size() == 0 => mem


@ -3,7 +3,7 @@
// license that can be found in the LICENSE file.
// This file contains rules to decompose [u]int64 types on 32-bit
// architectures. These rules work together with the decomposeBuiltIn
// architectures. These rules work together with the decomposeBuiltin
// pass which handles phis of these types.
(Int64Hi (Int64Make hi _)) => hi
@ -217,11 +217,32 @@
(Rsh8x64 x y) => (Rsh8x32 x (Or32 <typ.UInt32> (Zeromask (Int64Hi y)) (Int64Lo y)))
(Rsh8Ux64 x y) => (Rsh8Ux32 x (Or32 <typ.UInt32> (Zeromask (Int64Hi y)) (Int64Lo y)))
(RotateLeft64 x (Int64Make hi lo)) => (RotateLeft64 x lo)
(RotateLeft32 x (Int64Make hi lo)) => (RotateLeft32 x lo)
(RotateLeft16 x (Int64Make hi lo)) => (RotateLeft16 x lo)
(RotateLeft8 x (Int64Make hi lo)) => (RotateLeft8 x lo)
// RotateLeft64 by constant, for use in divmod.
(RotateLeft64 <t> x (Const(64|32|16|8) [c])) && c&63 == 0 => x
(RotateLeft64 <t> x (Const(64|32|16|8) [c])) && c&63 == 32 => (Int64Make <t> (Int64Lo x) (Int64Hi x))
(RotateLeft64 <t> x (Const(64|32|16|8) [c])) && 0 < c&63 && c&63 < 32 =>
(Int64Make <t>
(Or32 <typ.UInt32>
(Lsh32x32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(c&31)]))
(Rsh32Ux32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(32-c&31)])))
(Or32 <typ.UInt32>
(Lsh32x32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(c&31)]))
(Rsh32Ux32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(32-c&31)]))))
(RotateLeft64 <t> x (Const(64|32|16|8) [c])) && 32 < c&63 && c&63 < 64 =>
(Int64Make <t>
(Or32 <typ.UInt32>
(Lsh32x32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(c&31)]))
(Rsh32Ux32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(32-c&31)])))
(Or32 <typ.UInt32>
(Lsh32x32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(c&31)]))
(Rsh32Ux32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(32-c&31)]))))
// Clean up constants a little
(Or32 <typ.UInt32> (Zeromask (Const32 [c])) y) && c == 0 => y
(Or32 <typ.UInt32> (Zeromask (Const32 [c])) y) && c != 0 => (Const32 <typ.UInt32> [-1])
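
For intuition, the 0 < c&63 < 32 case above amounts to the following
32-bit computation, written as a hypothetical Go helper (not part of
the CL):

	// rotl64via32 rotates the 64-bit value hi:lo left by k bits,
	// 0 < k < 32, using only 32-bit shifts and ors, mirroring the
	// Int64Make/Or32/Lsh32x32/Rsh32Ux32 rule above.
	func rotl64via32(hi, lo uint32, k uint) (newHi, newLo uint32) {
		return hi<<k | lo>>(32-k), lo<<k | hi>>(32-k)
	}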


@ -0,0 +1,167 @@
// Copyright 2025 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Divisibility checks (x%c == 0 or x%c != 0) convert to multiply, rotate, compare.
// The opt pass rewrote x%c to x-(x/c)*c
// and then also rewrote x-(x/c)*c == 0 to x == (x/c)*c.
// If x/c is being used for a division already (div.Uses != 1)
// then we leave the expression alone.
//
// See ../magic.go for a detailed description of these algorithms.
// See test/codegen/divmod.go for tests.
// See divmod.rules for other division rules that run after these.
// Divisibility by unsigned or signed power of two.
(Eq(8|16|32|64) x (Mul(8|16|32|64) <t> (Div(8|16|32|64)u x (Const(8|16|32|64) [c])) (Const(8|16|32|64) [c])))
&& x.Op != OpConst64 && isPowerOfTwo(c) =>
(Eq(8|16|32|64) (And(8|16|32|64) <t> x (Const(8|16|32|64) <t> [c-1])) (Const(8|16|32|64) <t> [0]))
(Eq(8|16|32|64) x (Mul(8|16|32|64) <t> (Div(8|16|32|64) x (Const(8|16|32|64) [c])) (Const(8|16|32|64) [c])))
&& x.Op != OpConst64 && isPowerOfTwo(c) =>
(Eq(8|16|32|64) (And(8|16|32|64) <t> x (Const(8|16|32|64) <t> [c-1])) (Const(8|16|32|64) <t> [0]))
(Neq(8|16|32|64) x (Mul(8|16|32|64) <t> (Div(8|16|32|64)u x (Const(8|16|32|64) [c])) (Const(8|16|32|64) [c])))
&& x.Op != OpConst64 && isPowerOfTwo(c) =>
(Neq(8|16|32|64) (And(8|16|32|64) <t> x (Const(8|16|32|64) <t> [c-1])) (Const(8|16|32|64) <t> [0]))
(Neq(8|16|32|64) x (Mul(8|16|32|64) <t> (Div(8|16|32|64) x (Const(8|16|32|64) [c])) (Const(8|16|32|64) [c])))
&& x.Op != OpConst64 && isPowerOfTwo(c) =>
(Neq(8|16|32|64) (And(8|16|32|64) <t> x (Const(8|16|32|64) <t> [c-1])) (Const(8|16|32|64) <t> [0]))
// Divisibility by unsigned.
(Eq8 x (Mul8 <t> div:(Div8u x (Const8 [c])) (Const8 [c])))
&& div.Uses == 1
&& x.Op != OpConst8 && udivisibleOK8(c) =>
(Leq8U
(RotateLeft8 <t>
(Mul8 <t> x (Const8 <t> [int8(udivisible8(c).m)]))
(Const8 <t> [int8(8 - udivisible8(c).k)]))
(Const8 <t> [int8(udivisible8(c).max)]))
(Neq8 x (Mul8 <t> div:(Div8u x (Const8 [c])) (Const8 [c])))
&& div.Uses == 1
&& x.Op != OpConst8 && udivisibleOK8(c) =>
(Less8U
(Const8 <t> [int8(udivisible8(c).max)])
(RotateLeft8 <t>
(Mul8 <t> x (Const8 <t> [int8(udivisible8(c).m)]))
(Const8 <t> [int8(8 - udivisible8(c).k)])))
(Eq16 x (Mul16 <t> div:(Div16u x (Const16 [c])) (Const16 [c])))
&& div.Uses == 1
&& x.Op != OpConst16 && udivisibleOK16(c) =>
(Leq16U
(RotateLeft16 <t>
(Mul16 <t> x (Const16 <t> [int16(udivisible16(c).m)]))
(Const16 <t> [int16(16 - udivisible16(c).k)]))
(Const16 <t> [int16(udivisible16(c).max)]))
(Neq16 x (Mul16 <t> div:(Div16u x (Const16 [c])) (Const16 [c])))
&& div.Uses == 1
&& x.Op != OpConst16 && udivisibleOK16(c) =>
(Less16U
(Const16 <t> [int16(udivisible16(c).max)])
(RotateLeft16 <t>
(Mul16 <t> x (Const16 <t> [int16(udivisible16(c).m)]))
(Const16 <t> [int16(16 - udivisible16(c).k)])))
(Eq32 x (Mul32 <t> div:(Div32u x (Const32 [c])) (Const32 [c])))
&& div.Uses == 1
&& x.Op != OpConst32 && udivisibleOK32(c) =>
(Leq32U
(RotateLeft32 <t>
(Mul32 <t> x (Const32 <t> [int32(udivisible32(c).m)]))
(Const32 <t> [int32(32 - udivisible32(c).k)]))
(Const32 <t> [int32(udivisible32(c).max)]))
(Neq32 x (Mul32 <t> div:(Div32u x (Const32 [c])) (Const32 [c])))
&& div.Uses == 1
&& x.Op != OpConst32 && udivisibleOK32(c) =>
(Less32U
(Const32 <t> [int32(udivisible32(c).max)])
(RotateLeft32 <t>
(Mul32 <t> x (Const32 <t> [int32(udivisible32(c).m)]))
(Const32 <t> [int32(32 - udivisible32(c).k)])))
(Eq64 x (Mul64 <t> div:(Div64u x (Const64 [c])) (Const64 [c])))
&& div.Uses == 1
&& x.Op != OpConst64 && udivisibleOK64(c) =>
(Leq64U
(RotateLeft64 <t>
(Mul64 <t> x (Const64 <t> [int64(udivisible64(c).m)]))
(Const64 <t> [int64(64 - udivisible64(c).k)]))
(Const64 <t> [int64(udivisible64(c).max)]))
(Neq64 x (Mul64 <t> div:(Div64u x (Const64 [c])) (Const64 [c])))
&& div.Uses == 1
&& x.Op != OpConst64 && udivisibleOK64(c) =>
(Less64U
(Const64 <t> [int64(udivisible64(c).max)])
(RotateLeft64 <t>
(Mul64 <t> x (Const64 <t> [int64(udivisible64(c).m)]))
(Const64 <t> [int64(64 - udivisible64(c).k)])))
// Divisibility by signed.
(Eq8 x (Mul8 <t> div:(Div8 x (Const8 [c])) (Const8 [c])))
&& div.Uses == 1
&& x.Op != OpConst8 && sdivisibleOK8(c) =>
(Leq8U
(RotateLeft8 <t>
(Add8 <t> (Mul8 <t> x (Const8 <t> [int8(sdivisible8(c).m)]))
(Const8 <t> [int8(sdivisible8(c).a)]))
(Const8 <t> [int8(8 - sdivisible8(c).k)]))
(Const8 <t> [int8(sdivisible8(c).max)]))
(Neq8 x (Mul8 <t> div:(Div8 x (Const8 [c])) (Const8 [c])))
&& div.Uses == 1
&& x.Op != OpConst8 && sdivisibleOK8(c) =>
(Less8U
(Const8 <t> [int8(sdivisible8(c).max)])
(RotateLeft8 <t>
(Add8 <t> (Mul8 <t> x (Const8 <t> [int8(sdivisible8(c).m)]))
(Const8 <t> [int8(sdivisible8(c).a)]))
(Const8 <t> [int8(8 - sdivisible8(c).k)])))
(Eq16 x (Mul16 <t> div:(Div16 x (Const16 [c])) (Const16 [c])))
&& div.Uses == 1
&& x.Op != OpConst16 && sdivisibleOK16(c) =>
(Leq16U
(RotateLeft16 <t>
(Add16 <t> (Mul16 <t> x (Const16 <t> [int16(sdivisible16(c).m)]))
(Const16 <t> [int16(sdivisible16(c).a)]))
(Const16 <t> [int16(16 - sdivisible16(c).k)]))
(Const16 <t> [int16(sdivisible16(c).max)]))
(Neq16 x (Mul16 <t> div:(Div16 x (Const16 [c])) (Const16 [c])))
&& div.Uses == 1
&& x.Op != OpConst16 && sdivisibleOK16(c) =>
(Less16U
(Const16 <t> [int16(sdivisible16(c).max)])
(RotateLeft16 <t>
(Add16 <t> (Mul16 <t> x (Const16 <t> [int16(sdivisible16(c).m)]))
(Const16 <t> [int16(sdivisible16(c).a)]))
(Const16 <t> [int16(16 - sdivisible16(c).k)])))
(Eq32 x (Mul32 <t> div:(Div32 x (Const32 [c])) (Const32 [c])))
&& div.Uses == 1
&& x.Op != OpConst32 && sdivisibleOK32(c) =>
(Leq32U
(RotateLeft32 <t>
(Add32 <t> (Mul32 <t> x (Const32 <t> [int32(sdivisible32(c).m)]))
(Const32 <t> [int32(sdivisible32(c).a)]))
(Const32 <t> [int32(32 - sdivisible32(c).k)]))
(Const32 <t> [int32(sdivisible32(c).max)]))
(Neq32 x (Mul32 <t> div:(Div32 x (Const32 [c])) (Const32 [c])))
&& div.Uses == 1
&& x.Op != OpConst32 && sdivisibleOK32(c) =>
(Less32U
(Const32 <t> [int32(sdivisible32(c).max)])
(RotateLeft32 <t>
(Add32 <t> (Mul32 <t> x (Const32 <t> [int32(sdivisible32(c).m)]))
(Const32 <t> [int32(sdivisible32(c).a)]))
(Const32 <t> [int32(32 - sdivisible32(c).k)])))
(Eq64 x (Mul64 <t> div:(Div64 x (Const64 [c])) (Const64 [c])))
&& div.Uses == 1
&& x.Op != OpConst64 && sdivisibleOK64(c) =>
(Leq64U
(RotateLeft64 <t>
(Add64 <t> (Mul64 <t> x (Const64 <t> [int64(sdivisible64(c).m)]))
(Const64 <t> [int64(sdivisible64(c).a)]))
(Const64 <t> [int64(64 - sdivisible64(c).k)]))
(Const64 <t> [int64(sdivisible64(c).max)]))
(Neq64 x (Mul64 <t> div:(Div64 x (Const64 [c])) (Const64 [c])))
&& div.Uses == 1
&& x.Op != OpConst64 && sdivisibleOK64(c) =>
(Less64U
(Const64 <t> [int64(sdivisible64(c).max)])
(RotateLeft64 <t>
(Add64 <t> (Mul64 <t> x (Const64 <t> [int64(sdivisible64(c).m)]))
(Const64 <t> [int64(sdivisible64(c).a)]))
(Const64 <t> [int64(64 - sdivisible64(c).k)])))
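
As a concrete instance of the multiply-and-compare form these rules
produce, a divisibility-by-5 check looks roughly like the sketch
below; the constants are worked out by hand from the
multiplicative-inverse view and are illustrative, not read out of
magic.go:

	// For odd c, x%c == 0 exactly when x*m (mod 2^32) <= max, where
	// m is the inverse of c mod 2^32 and max = (2^32-1)/c; even
	// divisors additionally need the rotate the rules emit.
	// For c = 5: m = 0xCCCCCCCD (5*m == 1 mod 2^32), max = 0x33333333.
	func divisibleBy5(x uint32) bool {
		const m, max = 0xCCCCCCCD, 0x33333333
		return x*m <= max
	}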


@ -0,0 +1,18 @@
// Copyright 2025 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package main
var divisibleOps = []opData{}
var divisibleBlocks = []blockData{}
func init() {
archs = append(archs, arch{
name: "divisible",
ops: divisibleOps,
blocks: divisibleBlocks,
generic: true,
})
}


@ -0,0 +1,288 @@
// Copyright 2025 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Lowering of mul, div, and mod operations.
// Runs after prove, so that prove can analyze div and mod ops
// directly instead of these obscured expansions,
// but before decompose builtin, so that 32-bit systems
// can still lower 64-bit ops to 32-bit ones.
//
// See ../magic.go for a detailed description of these algorithms.
// See test/codegen/divmod.go for tests.
// Unsigned div and mod by power of 2 handled in generic.rules.
// (The equivalent unsigned right shift and mask are simple enough for prove to analyze.)
// Signed divide by power of 2.
// n / c = n >> log(c) if n >= 0
// = (n+c-1) >> log(c) if n < 0
// We conditionally add c-1 by adding n>>63>>(64-log(c)) (first shift signed, second shift unsigned).
(Div8 <t> n (Const8 [c])) && isPowerOfTwo(c) =>
(Rsh8x64
(Add8 <t> n (Rsh8Ux64 <t> (Rsh8x64 <t> n (Const64 <typ.UInt64> [ 7])) (Const64 <typ.UInt64> [int64( 8-log8(c))])))
(Const64 <typ.UInt64> [int64(log8(c))]))
(Div16 <t> n (Const16 [c])) && isPowerOfTwo(c) =>
(Rsh16x64
(Add16 <t> n (Rsh16Ux64 <t> (Rsh16x64 <t> n (Const64 <typ.UInt64> [15])) (Const64 <typ.UInt64> [int64(16-log16(c))])))
(Const64 <typ.UInt64> [int64(log16(c))]))
(Div32 <t> n (Const32 [c])) && isPowerOfTwo(c) =>
(Rsh32x64
(Add32 <t> n (Rsh32Ux64 <t> (Rsh32x64 <t> n (Const64 <typ.UInt64> [31])) (Const64 <typ.UInt64> [int64(32-log32(c))])))
(Const64 <typ.UInt64> [int64(log32(c))]))
(Div64 <t> n (Const64 [c])) && isPowerOfTwo(c) =>
(Rsh64x64
(Add64 <t> n (Rsh64Ux64 <t> (Rsh64x64 <t> n (Const64 <typ.UInt64> [63])) (Const64 <typ.UInt64> [int64(64-log64(c))])))
(Const64 <typ.UInt64> [int64(log64(c))]))
// Divide, not a power of 2, by strength reduction to double-width multiply and shift.
//
// umagicN(c) computes m, s such that N-bit unsigned divide
// x/c = (x*((1<<N)+m))>>N>>s = ((x*m)>>N+x)>>s
// where the multiplies are unsigned.
// Note that the returned m is always N+1 bits; umagicN omits the high 1<<N bit.
// The difficult part is implementing the 2N+1-bit multiply,
// since in general we have only a 2N-bit multiply available.
//
// smagic(c) computes m, s such that N-bit signed divide
// x/c = (x*m)>>N>>s - bool2int(x < 0).
// Here m is an unsigned N-bit number but x is signed.
//
// In general the division cases are:
//
// 1. A signed divide where 2N ≤ the register size.
// This form can use the signed algorithm directly.
//
// 2. A signed divide where m is even.
// This form can use a signed double-width multiply with m/2,
// shifting by s-1.
//
// 3. A signed divide where m is odd.
// This form can use x*m = ((x*(m-2^N))>>N+x) with a signed multiply.
// Since intN(m) is m-2^N < 0, the product and x have different signs,
// so there can be no overflow on the addition.
//
// 4. An unsigned divide where we know x < 1<<(N-1).
// This form can use the signed algorithm without the bool2int fixup,
// and since we know the product is only 2N-1 bits, we can use an
// unsigned multiply to obtain the high N bits directly, regardless
// of whether m is odd or even.
//
// 5. An unsigned divide where 2N+1 ≤ the register size.
// This form uses the unsigned algorithm with an explicit (1<<N)+m.
//
// 6. An unsigned divide where the N+1-bit m is even.
// This form can use an N-bit m/2 instead and shift one less bit.
//
// 7. An unsigned divide where m is odd but c is even.
// This form can shift once and then divide by (c/2) instead.
// The magic number m for c is ⌈2^k/c⌉, so we can use
// (m+1)/2 = ⌈2^k/(c/2)⌉ instead.
//
// 8. An unsigned divide on systems with an avg instruction.
// We noted above that (x*((1<<N)+m))>>N>>s = ((x*m)>>N+x)>>s.
// Let hi = (x*m)>>N, so we want (hi+x) >> s = avg(hi, x) >> (s-1).
//
// 9. Unsigned 64-bit divide by 16-bit constant on 32-bit systems.
// Use long division with 16-bit digits.
//
// Note: All systems have Hmul and Avg except for wasm, and the
// wasm JITs may well apply all these optimizations already anyway,
// so it may be worth looking into avoiding this pass entirely on wasm
// and dropping all the useAvg useHmul uncertainty.
// Case 1. Signed divides where 2N ≤ register size.
(Div8 <t> x (Const8 [c])) && smagicOK8(c) =>
(Sub8 <t>
(Rsh32x64 <t>
(Mul32 <typ.UInt32> (SignExt8to32 x) (Const32 <typ.UInt32> [int32(smagic8(c).m)]))
(Const64 <typ.UInt64> [8 + smagic8(c).s]))
(Rsh32x64 <t> (SignExt8to32 x) (Const64 <typ.UInt64> [31])))
(Div16 <t> x (Const16 [c])) && smagicOK16(c) =>
(Sub16 <t>
(Rsh32x64 <t>
(Mul32 <typ.UInt32> (SignExt16to32 x) (Const32 <typ.UInt32> [int32(smagic16(c).m)]))
(Const64 <typ.UInt64> [16 + smagic16(c).s]))
(Rsh32x64 <t> (SignExt16to32 x) (Const64 <typ.UInt64> [31])))
(Div32 <t> x (Const32 [c])) && smagicOK32(c) && config.RegSize == 8 =>
(Sub32 <t>
(Rsh64x64 <t>
(Mul64 <typ.UInt64> (SignExt32to64 x) (Const64 <typ.UInt64> [int64(smagic32(c).m)]))
(Const64 <typ.UInt64> [32 + smagic32(c).s]))
(Rsh64x64 <t> (SignExt32to64 x) (Const64 <typ.UInt64> [63])))
// Case 2. Signed divides where m is even.
(Div32 <t> x (Const32 [c])) && smagicOK32(c) && config.RegSize == 4 && smagic32(c).m&1 == 0 && config.useHmul =>
(Sub32 <t>
(Rsh32x64 <t>
(Hmul32 <t> x (Const32 <typ.UInt32> [int32(smagic32(c).m/2)]))
(Const64 <typ.UInt64> [smagic32(c).s - 1]))
(Rsh32x64 <t> x (Const64 <typ.UInt64> [31])))
(Div64 <t> x (Const64 [c])) && smagicOK64(c) && config.RegSize == 8 && smagic64(c).m&1 == 0 && config.useHmul =>
(Sub64 <t>
(Rsh64x64 <t>
(Hmul64 <t> x (Const64 <typ.UInt64> [int64(smagic64(c).m/2)]))
(Const64 <typ.UInt64> [smagic64(c).s - 1]))
(Rsh64x64 <t> x (Const64 <typ.UInt64> [63])))
// Case 3. Signed divides where m is odd.
(Div32 <t> x (Const32 [c])) && smagicOK32(c) && config.RegSize == 4 && smagic32(c).m&1 != 0 && config.useHmul =>
(Sub32 <t>
(Rsh32x64 <t>
(Add32 <t> x (Hmul32 <t> x (Const32 <typ.UInt32> [int32(smagic32(c).m)])))
(Const64 <typ.UInt64> [smagic32(c).s]))
(Rsh32x64 <t> x (Const64 <typ.UInt64> [31])))
(Div64 <t> x (Const64 [c])) && smagicOK64(c) && config.RegSize == 8 && smagic64(c).m&1 != 0 && config.useHmul =>
(Sub64 <t>
(Rsh64x64 <t>
(Add64 <t> x (Hmul64 <t> x (Const64 <typ.UInt64> [int64(smagic64(c).m)])))
(Const64 <typ.UInt64> [smagic64(c).s]))
(Rsh64x64 <t> x (Const64 <typ.UInt64> [63])))
// Case 4. Unsigned divide where x < 1<<(N-1).
// Skip Div8u since case 5's handling is just as good.
(Div16u <t> x (Const16 [c])) && t.IsSigned() && smagicOK16(c) =>
(Rsh32Ux64 <t>
(Mul32 <typ.UInt32> (SignExt16to32 x) (Const32 <typ.UInt32> [int32(smagic16(c).m)]))
(Const64 <typ.UInt64> [16 + smagic16(c).s]))
(Div32u <t> x (Const32 [c])) && t.IsSigned() && smagicOK32(c) && config.RegSize == 8 =>
(Rsh64Ux64 <t>
(Mul64 <typ.UInt64> (SignExt32to64 x) (Const64 <typ.UInt64> [int64(smagic32(c).m)]))
(Const64 <typ.UInt64> [32 + smagic32(c).s]))
(Div32u <t> x (Const32 [c])) && t.IsSigned() && smagicOK32(c) && config.RegSize == 4 && config.useHmul =>
(Rsh32Ux64 <t>
(Hmul32u <typ.UInt32> x (Const32 <typ.UInt32> [int32(smagic32(c).m)]))
(Const64 <typ.UInt64> [smagic32(c).s]))
(Div64u <t> x (Const64 [c])) && t.IsSigned() && smagicOK64(c) && config.RegSize == 8 && config.useHmul =>
(Rsh64Ux64 <t>
(Hmul64u <typ.UInt64> x (Const64 <typ.UInt64> [int64(smagic64(c).m)]))
(Const64 <typ.UInt64> [smagic64(c).s]))
// Case 5. Unsigned divide where 2N+1 ≤ register size.
(Div8u <t> x (Const8 [c])) && umagicOK8(c) =>
(Trunc32to8 <t>
(Rsh32Ux64 <typ.UInt32>
(Mul32 <typ.UInt32> (ZeroExt8to32 x) (Const32 <typ.UInt32> [int32(1<<8 + umagic8(c).m)]))
(Const64 <typ.UInt64> [8 + umagic8(c).s])))
(Div16u <t> x (Const16 [c])) && umagicOK16(c) && config.RegSize == 8 =>
(Trunc64to16 <t>
(Rsh64Ux64 <typ.UInt64>
(Mul64 <typ.UInt64> (ZeroExt16to64 x) (Const64 <typ.UInt64> [int64(1<<16 + umagic16(c).m)]))
(Const64 <typ.UInt64> [16 + umagic16(c).s])))
// Case 6. Unsigned divide where m is even.
(Div16u <t> x (Const16 [c])) && umagicOK16(c) && umagic16(c).m&1 == 0 =>
(Trunc32to16 <t>
(Rsh32Ux64 <typ.UInt32>
(Mul32 <typ.UInt32> (ZeroExt16to32 x) (Const32 <typ.UInt32> [int32(1<<15 + umagic16(c).m/2)]))
(Const64 <typ.UInt64> [16 + umagic16(c).s - 1])))
(Div32u <t> x (Const32 [c])) && umagicOK32(c) && umagic32(c).m&1 == 0 && config.RegSize == 8 =>
(Trunc64to32 <t>
(Rsh64Ux64 <typ.UInt64>
(Mul64 <typ.UInt64> (ZeroExt32to64 x) (Const64 <typ.UInt64> [int64(1<<31 + umagic32(c).m/2)]))
(Const64 <typ.UInt64> [32 + umagic32(c).s - 1])))
(Div32u <t> x (Const32 [c])) && umagicOK32(c) && umagic32(c).m&1 == 0 && config.RegSize == 4 && config.useHmul =>
(Rsh32Ux64 <t>
(Hmul32u <typ.UInt32> x (Const32 <typ.UInt32> [int32(1<<31 + umagic32(c).m/2)]))
(Const64 <typ.UInt64> [umagic32(c).s - 1]))
(Div64u <t> x (Const64 [c])) && umagicOK64(c) && umagic64(c).m&1 == 0 && config.RegSize == 8 && config.useHmul =>
(Rsh64Ux64 <t>
(Hmul64u <typ.UInt64> x (Const64 <typ.UInt64> [int64(1<<63 + umagic64(c).m/2)]))
(Const64 <typ.UInt64> [umagic64(c).s - 1]))
// Case 7. Unsigned divide where c is even.
(Div16u <t> x (Const16 [c])) && umagicOK16(c) && config.RegSize == 4 && c&1 == 0 =>
(Trunc32to16 <t>
(Rsh32Ux64 <typ.UInt32>
(Mul32 <typ.UInt32>
(Rsh32Ux64 <typ.UInt32> (ZeroExt16to32 x) (Const64 <typ.UInt64> [1]))
(Const32 <typ.UInt32> [int32(1<<15 + (umagic16(c).m+1)/2)]))
(Const64 <typ.UInt64> [16 + umagic16(c).s - 2])))
(Div32u <t> x (Const32 [c])) && umagicOK32(c) && config.RegSize == 8 && c&1 == 0 =>
(Trunc64to32 <t>
(Rsh64Ux64 <typ.UInt64>
(Mul64 <typ.UInt64>
(Rsh64Ux64 <typ.UInt64> (ZeroExt32to64 x) (Const64 <typ.UInt64> [1]))
(Const64 <typ.UInt64> [int64(1<<31 + (umagic32(c).m+1)/2)]))
(Const64 <typ.UInt64> [32 + umagic32(c).s - 2])))
(Div32u <t> x (Const32 [c])) && umagicOK32(c) && config.RegSize == 4 && c&1 == 0 && config.useHmul =>
(Rsh32Ux64 <t>
(Hmul32u <typ.UInt32>
(Rsh32Ux64 <typ.UInt32> x (Const64 <typ.UInt64> [1]))
(Const32 <typ.UInt32> [int32(1<<31 + (umagic32(c).m+1)/2)]))
(Const64 <typ.UInt64> [umagic32(c).s - 2]))
(Div64u <t> x (Const64 [c])) && umagicOK64(c) && config.RegSize == 8 && c&1 == 0 && config.useHmul =>
(Rsh64Ux64 <t>
(Hmul64u <typ.UInt64>
(Rsh64Ux64 <typ.UInt64> x (Const64 <typ.UInt64> [1]))
(Const64 <typ.UInt64> [int64(1<<63 + (umagic64(c).m+1)/2)]))
(Const64 <typ.UInt64> [umagic64(c).s - 2]))
// Case 8. Unsigned divide on systems with avg.
(Div16u <t> x (Const16 [c])) && umagicOK16(c) && config.RegSize == 4 && config.useAvg =>
(Trunc32to16 <t>
(Rsh32Ux64 <typ.UInt32>
(Avg32u
(Lsh32x64 <typ.UInt32> (ZeroExt16to32 x) (Const64 <typ.UInt64> [16]))
(Mul32 <typ.UInt32> (ZeroExt16to32 x) (Const32 <typ.UInt32> [int32(umagic16(c).m)])))
(Const64 <typ.UInt64> [16 + umagic16(c).s - 1])))
(Div32u <t> x (Const32 [c])) && umagicOK32(c) && config.RegSize == 8 && config.useAvg =>
(Trunc64to32 <t>
(Rsh64Ux64 <typ.UInt64>
(Avg64u
(Lsh64x64 <typ.UInt64> (ZeroExt32to64 x) (Const64 <typ.UInt64> [32]))
(Mul64 <typ.UInt64> (ZeroExt32to64 x) (Const64 <typ.UInt32> [int64(umagic32(c).m)])))
(Const64 <typ.UInt64> [32 + umagic32(c).s - 1])))
(Div32u <t> x (Const32 [c])) && umagicOK32(c) && config.RegSize == 4 && config.useAvg && config.useHmul =>
(Rsh32Ux64 <t>
(Avg32u x (Hmul32u <typ.UInt32> x (Const32 <typ.UInt32> [int32(umagic32(c).m)])))
(Const64 <typ.UInt64> [umagic32(c).s - 1]))
(Div64u <t> x (Const64 [c])) && umagicOK64(c) && config.RegSize == 8 && config.useAvg && config.useHmul =>
(Rsh64Ux64 <t>
(Avg64u x (Hmul64u <typ.UInt64> x (Const64 <typ.UInt64> [int64(umagic64(c).m)])))
(Const64 <typ.UInt64> [umagic64(c).s - 1]))
// Case 9. For unsigned 64-bit divides on 32-bit machines,
// if the constant fits in 16 bits (so that the last term
// fits in 32 bits), convert to three 32-bit divides by a constant.
//
// If 1<<32 = Q * c + R
// and x = hi << 32 + lo
//
// Then x = (hi/c*c + hi%c) << 32 + lo
// = hi/c*c<<32 + hi%c<<32 + lo
// = hi/c*c<<32 + (hi%c)*(Q*c+R) + lo/c*c + lo%c
// = hi/c*c<<32 + (hi%c)*Q*c + lo/c*c + (hi%c*R+lo%c)
// and x / c = (hi/c)<<32 + (hi%c)*Q + lo/c + (hi%c*R+lo%c)/c
(Div64u x (Const64 [c])) && c > 0 && c <= 0xFFFF && umagicOK32(int32(c)) && config.RegSize == 4 && config.useHmul =>
(Add64
(Add64 <typ.UInt64>
(Add64 <typ.UInt64>
(Lsh64x64 <typ.UInt64>
(ZeroExt32to64
(Div32u <typ.UInt32>
(Trunc64to32 <typ.UInt32> (Rsh64Ux64 <typ.UInt64> x (Const64 <typ.UInt64> [32])))
(Const32 <typ.UInt32> [int32(c)])))
(Const64 <typ.UInt64> [32]))
(ZeroExt32to64 (Div32u <typ.UInt32> (Trunc64to32 <typ.UInt32> x) (Const32 <typ.UInt32> [int32(c)]))))
(Mul64 <typ.UInt64>
(ZeroExt32to64 <typ.UInt64>
(Mod32u <typ.UInt32>
(Trunc64to32 <typ.UInt32> (Rsh64Ux64 <typ.UInt64> x (Const64 <typ.UInt64> [32])))
(Const32 <typ.UInt32> [int32(c)])))
(Const64 <typ.UInt64> [int64((1<<32)/c)])))
(ZeroExt32to64
(Div32u <typ.UInt32>
(Add32 <typ.UInt32>
(Mod32u <typ.UInt32> (Trunc64to32 <typ.UInt32> x) (Const32 <typ.UInt32> [int32(c)]))
(Mul32 <typ.UInt32>
(Mod32u <typ.UInt32>
(Trunc64to32 <typ.UInt32> (Rsh64Ux64 <typ.UInt64> x (Const64 <typ.UInt64> [32])))
(Const32 <typ.UInt32> [int32(c)]))
(Const32 <typ.UInt32> [int32((1<<32)%c)])))
(Const32 <typ.UInt32> [int32(c)]))))
// Repeated from generic.rules, for expanding the expression above
// (which can then be further expanded to handle the nested Div32u).
(Mod32u <t> x (Const32 [c])) && x.Op != OpConst32 && c > 0 && umagicOK32(c)
=> (Sub32 x (Mul32 <t> (Div32u <t> x (Const32 <t> [c])) (Const32 <t> [c])))
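
A small worked instance of the umagicN identity quoted at the top of
this file, for N=32 and c=3 (m and s are computed by hand from that
formula, not read out of magic.go):

	// x/c = ((x*m)>>N + x) >> s, with (1<<N)+m = ceil(2^(N+s)/c).
	// Taking s = 2 gives (1<<32)+m = ceil(2^34/3), so m = 0x55555556.
	// The add is done in 64 bits here; the rules above use the
	// Avg/Hmul forms to avoid needing the extra bit.
	func div3(x uint32) uint32 {
		const m, s = uint64(0x55555556), 2
		t := (uint64(x) * m) >> 32 // (x*m)>>N
		return uint32((t + uint64(x)) >> s)
	}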


@ -0,0 +1,18 @@
// Copyright 2025 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package main
var divmodOps = []opData{}
var divmodBlocks = []blockData{}
func init() {
archs = append(archs, arch{
name: "divmod",
ops: divmodOps,
blocks: divmodBlocks,
generic: true,
})
}


@ -199,16 +199,6 @@
(And(8|16|32|64) <t> (Com(8|16|32|64) x) (Com(8|16|32|64) y)) => (Com(8|16|32|64) (Or(8|16|32|64) <t> x y))
(Or(8|16|32|64) <t> (Com(8|16|32|64) x) (Com(8|16|32|64) y)) => (Com(8|16|32|64) (And(8|16|32|64) <t> x y))
// Convert multiplication by a power of two to a shift.
(Mul8 <t> n (Const8 [c])) && isPowerOfTwo(c) => (Lsh8x64 <t> n (Const64 <typ.UInt64> [log8(c)]))
(Mul16 <t> n (Const16 [c])) && isPowerOfTwo(c) => (Lsh16x64 <t> n (Const64 <typ.UInt64> [log16(c)]))
(Mul32 <t> n (Const32 [c])) && isPowerOfTwo(c) => (Lsh32x64 <t> n (Const64 <typ.UInt64> [log32(c)]))
(Mul64 <t> n (Const64 [c])) && isPowerOfTwo(c) => (Lsh64x64 <t> n (Const64 <typ.UInt64> [log64(c)]))
(Mul8 <t> n (Const8 [c])) && t.IsSigned() && isPowerOfTwo(-c) => (Neg8 (Lsh8x64 <t> n (Const64 <typ.UInt64> [log8(-c)])))
(Mul16 <t> n (Const16 [c])) && t.IsSigned() && isPowerOfTwo(-c) => (Neg16 (Lsh16x64 <t> n (Const64 <typ.UInt64> [log16(-c)])))
(Mul32 <t> n (Const32 [c])) && t.IsSigned() && isPowerOfTwo(-c) => (Neg32 (Lsh32x64 <t> n (Const64 <typ.UInt64> [log32(-c)])))
(Mul64 <t> n (Const64 [c])) && t.IsSigned() && isPowerOfTwo(-c) => (Neg64 (Lsh64x64 <t> n (Const64 <typ.UInt64> [log64(-c)])))
(Mod8 (Const8 [c]) (Const8 [d])) && d != 0 => (Const8 [c % d])
(Mod16 (Const16 [c]) (Const16 [d])) && d != 0 => (Const16 [c % d])
(Mod32 (Const32 [c]) (Const32 [d])) && d != 0 => (Const32 [c % d])
@ -380,13 +370,15 @@
// Distribute multiplication c * (d+x) -> c*d + c*x. Useful for:
// a[i].b = ...; a[i+1].b = ...
(Mul64 (Const64 <t> [c]) (Add64 <t> (Const64 <t> [d]) x)) =>
// The !isPowerOfTwo is a kludge to keep a[i+1] using an index by a multiply,
// which turns into an index by a shift, which can use a shifted operand on ARM systems.
(Mul64 (Const64 <t> [c]) (Add64 <t> (Const64 <t> [d]) x)) && !isPowerOfTwo(c) =>
(Add64 (Const64 <t> [c*d]) (Mul64 <t> (Const64 <t> [c]) x))
(Mul32 (Const32 <t> [c]) (Add32 <t> (Const32 <t> [d]) x)) =>
(Mul32 (Const32 <t> [c]) (Add32 <t> (Const32 <t> [d]) x)) && !isPowerOfTwo(c) =>
(Add32 (Const32 <t> [c*d]) (Mul32 <t> (Const32 <t> [c]) x))
(Mul16 (Const16 <t> [c]) (Add16 <t> (Const16 <t> [d]) x)) =>
(Mul16 (Const16 <t> [c]) (Add16 <t> (Const16 <t> [d]) x)) && !isPowerOfTwo(c) =>
(Add16 (Const16 <t> [c*d]) (Mul16 <t> (Const16 <t> [c]) x))
(Mul8 (Const8 <t> [c]) (Add8 <t> (Const8 <t> [d]) x)) =>
(Mul8 (Const8 <t> [c]) (Add8 <t> (Const8 <t> [d]) x)) && !isPowerOfTwo(c) =>
(Add8 (Const8 <t> [c*d]) (Mul8 <t> (Const8 <t> [c]) x))
// Rewrite x*y ± x*z to x*(y±z)
@ -1034,176 +1026,9 @@
// We must ensure that no intermediate computations are invalid pointers.
(Convert a:(Add(64|32) (Add(64|32) (Convert ptr mem) off1) off2) mem) => (AddPtr ptr (Add(64|32) <a.Type> off1 off2))
// strength reduction of divide by a constant.
// See ../magic.go for a detailed description of these algorithms.
// Unsigned divide by power of 2. Strength reduce to a shift.
(Div8u n (Const8 [c])) && isUnsignedPowerOfTwo(uint8(c)) => (Rsh8Ux64 n (Const64 <typ.UInt64> [log8u(uint8(c))]))
(Div16u n (Const16 [c])) && isUnsignedPowerOfTwo(uint16(c)) => (Rsh16Ux64 n (Const64 <typ.UInt64> [log16u(uint16(c))]))
(Div32u n (Const32 [c])) && isUnsignedPowerOfTwo(uint32(c)) => (Rsh32Ux64 n (Const64 <typ.UInt64> [log32u(uint32(c))]))
(Div64u n (Const64 [c])) && isUnsignedPowerOfTwo(uint64(c)) => (Rsh64Ux64 n (Const64 <typ.UInt64> [log64u(uint64(c))]))
// Signed non-negative divide by power of 2.
(Div8 n (Const8 [c])) && isNonNegative(n) && isPowerOfTwo(c) => (Rsh8Ux64 n (Const64 <typ.UInt64> [log8(c)]))
(Div16 n (Const16 [c])) && isNonNegative(n) && isPowerOfTwo(c) => (Rsh16Ux64 n (Const64 <typ.UInt64> [log16(c)]))
(Div32 n (Const32 [c])) && isNonNegative(n) && isPowerOfTwo(c) => (Rsh32Ux64 n (Const64 <typ.UInt64> [log32(c)]))
(Div64 n (Const64 [c])) && isNonNegative(n) && isPowerOfTwo(c) => (Rsh64Ux64 n (Const64 <typ.UInt64> [log64(c)]))
(Div64 n (Const64 [-1<<63])) && isNonNegative(n) => (Const64 [0])
// Unsigned divide, not a power of 2. Strength reduce to a multiply.
// For 8-bit divides, we just do a direct 9-bit by 8-bit multiply.
(Div8u x (Const8 [c])) && umagicOK8(c) =>
(Trunc32to8
(Rsh32Ux64 <typ.UInt32>
(Mul32 <typ.UInt32>
(Const32 <typ.UInt32> [int32(1<<8+umagic8(c).m)])
(ZeroExt8to32 x))
(Const64 <typ.UInt64> [8+umagic8(c).s])))
// For 16-bit divides on 64-bit machines, we do a direct 17-bit by 16-bit multiply.
(Div16u x (Const16 [c])) && umagicOK16(c) && config.RegSize == 8 =>
(Trunc64to16
(Rsh64Ux64 <typ.UInt64>
(Mul64 <typ.UInt64>
(Const64 <typ.UInt64> [int64(1<<16+umagic16(c).m)])
(ZeroExt16to64 x))
(Const64 <typ.UInt64> [16+umagic16(c).s])))
// For 16-bit divides on 32-bit machines
(Div16u x (Const16 [c])) && umagicOK16(c) && config.RegSize == 4 && umagic16(c).m&1 == 0 =>
(Trunc32to16
(Rsh32Ux64 <typ.UInt32>
(Mul32 <typ.UInt32>
(Const32 <typ.UInt32> [int32(1<<15+umagic16(c).m/2)])
(ZeroExt16to32 x))
(Const64 <typ.UInt64> [16+umagic16(c).s-1])))
(Div16u x (Const16 [c])) && umagicOK16(c) && config.RegSize == 4 && c&1 == 0 =>
(Trunc32to16
(Rsh32Ux64 <typ.UInt32>
(Mul32 <typ.UInt32>
(Const32 <typ.UInt32> [int32(1<<15+(umagic16(c).m+1)/2)])
(Rsh32Ux64 <typ.UInt32> (ZeroExt16to32 x) (Const64 <typ.UInt64> [1])))
(Const64 <typ.UInt64> [16+umagic16(c).s-2])))
(Div16u x (Const16 [c])) && umagicOK16(c) && config.RegSize == 4 && config.useAvg =>
(Trunc32to16
(Rsh32Ux64 <typ.UInt32>
(Avg32u
(Lsh32x64 <typ.UInt32> (ZeroExt16to32 x) (Const64 <typ.UInt64> [16]))
(Mul32 <typ.UInt32>
(Const32 <typ.UInt32> [int32(umagic16(c).m)])
(ZeroExt16to32 x)))
(Const64 <typ.UInt64> [16+umagic16(c).s-1])))
// For 32-bit divides on 32-bit machines
(Div32u x (Const32 [c])) && umagicOK32(c) && config.RegSize == 4 && umagic32(c).m&1 == 0 && config.useHmul =>
(Rsh32Ux64 <typ.UInt32>
(Hmul32u <typ.UInt32>
(Const32 <typ.UInt32> [int32(1<<31+umagic32(c).m/2)])
x)
(Const64 <typ.UInt64> [umagic32(c).s-1]))
(Div32u x (Const32 [c])) && umagicOK32(c) && config.RegSize == 4 && c&1 == 0 && config.useHmul =>
(Rsh32Ux64 <typ.UInt32>
(Hmul32u <typ.UInt32>
(Const32 <typ.UInt32> [int32(1<<31+(umagic32(c).m+1)/2)])
(Rsh32Ux64 <typ.UInt32> x (Const64 <typ.UInt64> [1])))
(Const64 <typ.UInt64> [umagic32(c).s-2]))
(Div32u x (Const32 [c])) && umagicOK32(c) && config.RegSize == 4 && config.useAvg && config.useHmul =>
(Rsh32Ux64 <typ.UInt32>
(Avg32u
x
(Hmul32u <typ.UInt32>
(Const32 <typ.UInt32> [int32(umagic32(c).m)])
x))
(Const64 <typ.UInt64> [umagic32(c).s-1]))
// For 32-bit divides on 64-bit machines
// We'll use a regular (non-hi) multiply for this case.
(Div32u x (Const32 [c])) && umagicOK32(c) && config.RegSize == 8 && umagic32(c).m&1 == 0 =>
(Trunc64to32
(Rsh64Ux64 <typ.UInt64>
(Mul64 <typ.UInt64>
(Const64 <typ.UInt64> [int64(1<<31+umagic32(c).m/2)])
(ZeroExt32to64 x))
(Const64 <typ.UInt64> [32+umagic32(c).s-1])))
(Div32u x (Const32 [c])) && umagicOK32(c) && config.RegSize == 8 && c&1 == 0 =>
(Trunc64to32
(Rsh64Ux64 <typ.UInt64>
(Mul64 <typ.UInt64>
(Const64 <typ.UInt64> [int64(1<<31+(umagic32(c).m+1)/2)])
(Rsh64Ux64 <typ.UInt64> (ZeroExt32to64 x) (Const64 <typ.UInt64> [1])))
(Const64 <typ.UInt64> [32+umagic32(c).s-2])))
(Div32u x (Const32 [c])) && umagicOK32(c) && config.RegSize == 8 && config.useAvg =>
(Trunc64to32
(Rsh64Ux64 <typ.UInt64>
(Avg64u
(Lsh64x64 <typ.UInt64> (ZeroExt32to64 x) (Const64 <typ.UInt64> [32]))
(Mul64 <typ.UInt64>
(Const64 <typ.UInt32> [int64(umagic32(c).m)])
(ZeroExt32to64 x)))
(Const64 <typ.UInt64> [32+umagic32(c).s-1])))
// For unsigned 64-bit divides on 32-bit machines,
// if the constant fits in 16 bits (so that the last term
// fits in 32 bits), convert to three 32-bit divides by a constant.
//
// If 1<<32 = Q * c + R
// and x = hi << 32 + lo
//
// Then x = (hi/c*c + hi%c) << 32 + lo
// = hi/c*c<<32 + hi%c<<32 + lo
// = hi/c*c<<32 + (hi%c)*(Q*c+R) + lo/c*c + lo%c
// = hi/c*c<<32 + (hi%c)*Q*c + lo/c*c + (hi%c*R+lo%c)
// and x / c = (hi/c)<<32 + (hi%c)*Q + lo/c + (hi%c*R+lo%c)/c
(Div64u x (Const64 [c])) && c > 0 && c <= 0xFFFF && umagicOK32(int32(c)) && config.RegSize == 4 && config.useHmul =>
(Add64
(Add64 <typ.UInt64>
(Add64 <typ.UInt64>
(Lsh64x64 <typ.UInt64>
(ZeroExt32to64
(Div32u <typ.UInt32>
(Trunc64to32 <typ.UInt32> (Rsh64Ux64 <typ.UInt64> x (Const64 <typ.UInt64> [32])))
(Const32 <typ.UInt32> [int32(c)])))
(Const64 <typ.UInt64> [32]))
(ZeroExt32to64 (Div32u <typ.UInt32> (Trunc64to32 <typ.UInt32> x) (Const32 <typ.UInt32> [int32(c)]))))
(Mul64 <typ.UInt64>
(ZeroExt32to64 <typ.UInt64>
(Mod32u <typ.UInt32>
(Trunc64to32 <typ.UInt32> (Rsh64Ux64 <typ.UInt64> x (Const64 <typ.UInt64> [32])))
(Const32 <typ.UInt32> [int32(c)])))
(Const64 <typ.UInt64> [int64((1<<32)/c)])))
(ZeroExt32to64
(Div32u <typ.UInt32>
(Add32 <typ.UInt32>
(Mod32u <typ.UInt32> (Trunc64to32 <typ.UInt32> x) (Const32 <typ.UInt32> [int32(c)]))
(Mul32 <typ.UInt32>
(Mod32u <typ.UInt32>
(Trunc64to32 <typ.UInt32> (Rsh64Ux64 <typ.UInt64> x (Const64 <typ.UInt64> [32])))
(Const32 <typ.UInt32> [int32(c)]))
(Const32 <typ.UInt32> [int32((1<<32)%c)])))
(Const32 <typ.UInt32> [int32(c)]))))
// For 64-bit divides on 64-bit machines
// (64-bit divides on 32-bit machines are lowered to a runtime call by the walk pass.)
(Div64u x (Const64 [c])) && umagicOK64(c) && config.RegSize == 8 && umagic64(c).m&1 == 0 && config.useHmul =>
(Rsh64Ux64 <typ.UInt64>
(Hmul64u <typ.UInt64>
(Const64 <typ.UInt64> [int64(1<<63+umagic64(c).m/2)])
x)
(Const64 <typ.UInt64> [umagic64(c).s-1]))
(Div64u x (Const64 [c])) && umagicOK64(c) && config.RegSize == 8 && c&1 == 0 && config.useHmul =>
(Rsh64Ux64 <typ.UInt64>
(Hmul64u <typ.UInt64>
(Const64 <typ.UInt64> [int64(1<<63+(umagic64(c).m+1)/2)])
(Rsh64Ux64 <typ.UInt64> x (Const64 <typ.UInt64> [1])))
(Const64 <typ.UInt64> [umagic64(c).s-2]))
(Div64u x (Const64 [c])) && umagicOK64(c) && config.RegSize == 8 && config.useAvg && config.useHmul =>
(Rsh64Ux64 <typ.UInt64>
(Avg64u
x
(Hmul64u <typ.UInt64>
(Const64 <typ.UInt64> [int64(umagic64(c).m)])
x))
(Const64 <typ.UInt64> [umagic64(c).s-1]))
// Simplification of divisions.
// Only trivial, easily analyzed (by prove) rewrites here.
// Strength reduction of div to mul is delayed to divmod.rules.
// Signed divide by a negative constant. Rewrite to divide by a positive constant.
(Div8 <t> n (Const8 [c])) && c < 0 && c != -1<<7 => (Neg8 (Div8 <t> n (Const8 <t> [-c])))
@ -1214,107 +1039,41 @@
// Dividing by the most-negative number. Result is always 0 except
// if the input is also the most-negative number.
// We can detect that using the sign bit of x & -x.
(Div64 x (Const64 [-1<<63])) && isNonNegative(x) => (Const64 [0])
(Div8 <t> x (Const8 [-1<<7 ])) => (Rsh8Ux64 (And8 <t> x (Neg8 <t> x)) (Const64 <typ.UInt64> [7 ]))
(Div16 <t> x (Const16 [-1<<15])) => (Rsh16Ux64 (And16 <t> x (Neg16 <t> x)) (Const64 <typ.UInt64> [15]))
(Div32 <t> x (Const32 [-1<<31])) => (Rsh32Ux64 (And32 <t> x (Neg32 <t> x)) (Const64 <typ.UInt64> [31]))
(Div64 <t> x (Const64 [-1<<63])) => (Rsh64Ux64 (And64 <t> x (Neg64 <t> x)) (Const64 <typ.UInt64> [63]))
// Signed divide by power of 2.
// n / c = n >> log(c) if n >= 0
// = (n+c-1) >> log(c) if n < 0
// We conditionally add c-1 by adding n>>63>>(64-log(c)) (first shift signed, second shift unsigned).
(Div8 <t> n (Const8 [c])) && isPowerOfTwo(c) =>
(Rsh8x64
(Add8 <t> n (Rsh8Ux64 <t> (Rsh8x64 <t> n (Const64 <typ.UInt64> [ 7])) (Const64 <typ.UInt64> [int64( 8-log8(c))])))
(Const64 <typ.UInt64> [int64(log8(c))]))
(Div16 <t> n (Const16 [c])) && isPowerOfTwo(c) =>
(Rsh16x64
(Add16 <t> n (Rsh16Ux64 <t> (Rsh16x64 <t> n (Const64 <typ.UInt64> [15])) (Const64 <typ.UInt64> [int64(16-log16(c))])))
(Const64 <typ.UInt64> [int64(log16(c))]))
(Div32 <t> n (Const32 [c])) && isPowerOfTwo(c) =>
(Rsh32x64
(Add32 <t> n (Rsh32Ux64 <t> (Rsh32x64 <t> n (Const64 <typ.UInt64> [31])) (Const64 <typ.UInt64> [int64(32-log32(c))])))
(Const64 <typ.UInt64> [int64(log32(c))]))
(Div64 <t> n (Const64 [c])) && isPowerOfTwo(c) =>
(Rsh64x64
(Add64 <t> n (Rsh64Ux64 <t> (Rsh64x64 <t> n (Const64 <typ.UInt64> [63])) (Const64 <typ.UInt64> [int64(64-log64(c))])))
(Const64 <typ.UInt64> [int64(log64(c))]))
// Unsigned divide by power of 2. Strength reduce to a shift.
(Div8u n (Const8 [c])) && isUnsignedPowerOfTwo(uint8(c)) => (Rsh8Ux64 n (Const64 <typ.UInt64> [log8u(uint8(c))]))
(Div16u n (Const16 [c])) && isUnsignedPowerOfTwo(uint16(c)) => (Rsh16Ux64 n (Const64 <typ.UInt64> [log16u(uint16(c))]))
(Div32u n (Const32 [c])) && isUnsignedPowerOfTwo(uint32(c)) => (Rsh32Ux64 n (Const64 <typ.UInt64> [log32u(uint32(c))]))
(Div64u n (Const64 [c])) && isUnsignedPowerOfTwo(uint64(c)) => (Rsh64Ux64 n (Const64 <typ.UInt64> [log64u(uint64(c))]))
// Signed divide, not a power of 2. Strength reduce to a multiply.
(Div8 <t> x (Const8 [c])) && smagicOK8(c) =>
(Sub8 <t>
(Rsh32x64 <t>
(Mul32 <typ.UInt32>
(Const32 <typ.UInt32> [int32(smagic8(c).m)])
(SignExt8to32 x))
(Const64 <typ.UInt64> [8+smagic8(c).s]))
(Rsh32x64 <t>
(SignExt8to32 x)
(Const64 <typ.UInt64> [31])))
(Div16 <t> x (Const16 [c])) && smagicOK16(c) =>
(Sub16 <t>
(Rsh32x64 <t>
(Mul32 <typ.UInt32>
(Const32 <typ.UInt32> [int32(smagic16(c).m)])
(SignExt16to32 x))
(Const64 <typ.UInt64> [16+smagic16(c).s]))
(Rsh32x64 <t>
(SignExt16to32 x)
(Const64 <typ.UInt64> [31])))
(Div32 <t> x (Const32 [c])) && smagicOK32(c) && config.RegSize == 8 =>
(Sub32 <t>
(Rsh64x64 <t>
(Mul64 <typ.UInt64>
(Const64 <typ.UInt64> [int64(smagic32(c).m)])
(SignExt32to64 x))
(Const64 <typ.UInt64> [32+smagic32(c).s]))
(Rsh64x64 <t>
(SignExt32to64 x)
(Const64 <typ.UInt64> [63])))
(Div32 <t> x (Const32 [c])) && smagicOK32(c) && config.RegSize == 4 && smagic32(c).m&1 == 0 && config.useHmul =>
(Sub32 <t>
(Rsh32x64 <t>
(Hmul32 <t>
(Const32 <typ.UInt32> [int32(smagic32(c).m/2)])
x)
(Const64 <typ.UInt64> [smagic32(c).s-1]))
(Rsh32x64 <t>
x
(Const64 <typ.UInt64> [31])))
(Div32 <t> x (Const32 [c])) && smagicOK32(c) && config.RegSize == 4 && smagic32(c).m&1 != 0 && config.useHmul =>
(Sub32 <t>
(Rsh32x64 <t>
(Add32 <t>
(Hmul32 <t>
(Const32 <typ.UInt32> [int32(smagic32(c).m)])
x)
x)
(Const64 <typ.UInt64> [smagic32(c).s]))
(Rsh32x64 <t>
x
(Const64 <typ.UInt64> [31])))
(Div64 <t> x (Const64 [c])) && smagicOK64(c) && smagic64(c).m&1 == 0 && config.useHmul =>
(Sub64 <t>
(Rsh64x64 <t>
(Hmul64 <t>
(Const64 <typ.UInt64> [int64(smagic64(c).m/2)])
x)
(Const64 <typ.UInt64> [smagic64(c).s-1]))
(Rsh64x64 <t>
x
(Const64 <typ.UInt64> [63])))
(Div64 <t> x (Const64 [c])) && smagicOK64(c) && smagic64(c).m&1 != 0 && config.useHmul =>
(Sub64 <t>
(Rsh64x64 <t>
(Add64 <t>
(Hmul64 <t>
(Const64 <typ.UInt64> [int64(smagic64(c).m)])
x)
x)
(Const64 <typ.UInt64> [smagic64(c).s]))
(Rsh64x64 <t>
x
(Const64 <typ.UInt64> [63])))
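To make the signed magic shape concrete, here is the Div32 rule (RegSize == 8 form) instantiated by hand for c = 7; the constant is ceil(2^34/7), computed inline rather than through the compiler's smagic32 helper.

	func div32by7(x int32) int32 {
		const s = 2                     // floor(log2 7)
		const m = (1<<(32+s) + 6) / 7   // ceil(2^(32+s)/7) = 2454267027
		q := (int64(x) * m) >> (32 + s) // floor(m*x / 2^34)
		q -= int64(x) >> 63             // adds 1 when x < 0: converts floor to truncation
		return int32(q)                 // equals x / 7 for every int32 x
	}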
// Strength reduce multiplication by a power of two to a shift.
// Excluded from early opt so that prove can recognize mod
// by the x - (x/d)*d pattern.
// (Runs during "middle opt" and "late opt".)
(Mul8 <t> x (Const8 [c])) && isPowerOfTwo(c) && v.Block.Func.pass.name != "opt" =>
(Lsh8x64 <t> x (Const64 <typ.UInt64> [log8(c)]))
(Mul16 <t> x (Const16 [c])) && isPowerOfTwo(c) && v.Block.Func.pass.name != "opt" =>
(Lsh16x64 <t> x (Const64 <typ.UInt64> [log16(c)]))
(Mul32 <t> x (Const32 [c])) && isPowerOfTwo(c) && v.Block.Func.pass.name != "opt" =>
(Lsh32x64 <t> x (Const64 <typ.UInt64> [log32(c)]))
(Mul64 <t> x (Const64 [c])) && isPowerOfTwo(c) && v.Block.Func.pass.name != "opt" =>
(Lsh64x64 <t> x (Const64 <typ.UInt64> [log64(c)]))
(Mul8 <t> x (Const8 [c])) && t.IsSigned() && isPowerOfTwo(-c) && v.Block.Func.pass.name != "opt" =>
(Neg8 (Lsh8x64 <t> x (Const64 <typ.UInt64> [log8(-c)])))
(Mul16 <t> x (Const16 [c])) && t.IsSigned() && isPowerOfTwo(-c) && v.Block.Func.pass.name != "opt" =>
(Neg16 (Lsh16x64 <t> x (Const64 <typ.UInt64> [log16(-c)])))
(Mul32 <t> x (Const32 [c])) && t.IsSigned() && isPowerOfTwo(-c) && v.Block.Func.pass.name != "opt" =>
(Neg32 (Lsh32x64 <t> x (Const64 <typ.UInt64> [log32(-c)])))
(Mul64 <t> x (Const64 [c])) && t.IsSigned() && isPowerOfTwo(-c) && v.Block.Func.pass.name != "opt" =>
(Neg64 (Lsh64x64 <t> x (Const64 <typ.UInt64> [log64(-c)])))
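The reason for the pass.name guard, spelled out as a sketch (not compiler code): after the mod has been expanded, prove still needs to see the multiply as a multiply.

	func mod8shape(x int64) int64 {
		q := x / 8     // stays a Div64 until the divmod pass
		return x - q*8 // detectMod in prove.go matches this Sub(x, Mul(Div(x, 8), 8)) shape;
		               // turning the Mul into a shift in the first opt would break the match
	}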
// Strength reduction of mod to div.
// Strength reduction of div to mul is delayed to genericlateopt.rules.
// Unsigned mod by power of 2 constant.
(Mod8u <t> n (Const8 [c])) && isUnsignedPowerOfTwo(uint8(c)) => (And8 n (Const8 <t> [c-1]))
@ -1323,6 +1082,7 @@
(Mod64u <t> n (Const64 [c])) && isUnsignedPowerOfTwo(uint64(c)) => (And64 n (Const64 <t> [c-1]))
// Signed non-negative mod by power of 2 constant.
// TODO: Replace ModN with ModNu in prove.
(Mod8 <t> n (Const8 [c])) && isNonNegative(n) && isPowerOfTwo(c) => (And8 n (Const8 <t> [c-1]))
(Mod16 <t> n (Const16 [c])) && isNonNegative(n) && isPowerOfTwo(c) => (And16 n (Const16 <t> [c-1]))
(Mod32 <t> n (Const32 [c])) && isNonNegative(n) && isPowerOfTwo(c) => (And32 n (Const32 <t> [c-1]))
@ -1355,7 +1115,9 @@
(Mod64u <t> x (Const64 [c])) && x.Op != OpConst64 && c > 0 && umagicOK64(c)
=> (Sub64 x (Mul64 <t> (Div64u <t> x (Const64 <t> [c])) (Const64 <t> [c])))
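The arithmetic behind these rules is just the remainder identity, kept in this shape so that prove and the later divmod pass can still see the division (quick illustration):

	// x % c == x - (x/c)*c, e.g. 37 % 10 == 37 - (37/10)*10 == 37 - 30 == 7.
	func mod64u(x, c uint64) uint64 {
		return x - (x/c)*c
	}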
// For architectures without rotates on less than 32-bits, promote these checks to 32-bit.
// Set up for mod->mul+rot optimization in genericlateopt.rules.
// For architectures without rotates on less than 32-bits, promote to 32-bit.
// TODO: Also != 0 case?
(Eq8 (Mod8u x (Const8 [c])) (Const8 [0])) && x.Op != OpConst8 && udivisibleOK8(c) && !hasSmallRotate(config) =>
(Eq32 (Mod32u <typ.UInt32> (ZeroExt8to32 <typ.UInt32> x) (Const32 <typ.UInt32> [int32(uint8(c))])) (Const32 <typ.UInt32> [0]))
(Eq16 (Mod16u x (Const16 [c])) (Const16 [0])) && x.Op != OpConst16 && udivisibleOK16(c) && !hasSmallRotate(config) =>
@ -1365,557 +1127,6 @@
(Eq16 (Mod16 x (Const16 [c])) (Const16 [0])) && x.Op != OpConst16 && sdivisibleOK16(c) && !hasSmallRotate(config) =>
(Eq32 (Mod32 <typ.Int32> (SignExt16to32 <typ.Int32> x) (Const32 <typ.Int32> [int32(c)])) (Const32 <typ.Int32> [0]))
// Divisibility checks x%c == 0 convert to multiply and rotate.
// Note, x%c == 0 is rewritten as x == c*(x/c) during the opt pass
// where (x/c) is performed using multiplication with magic constants.
// To rewrite x%c == 0 requires pattern matching the rewritten expression
// and checking that the division by the same constant wasn't already calculated.
// This check is made by counting uses of the magic constant multiplication.
// Note that if there were an intermediate opt pass, this rule could be applied
// directly on the Div op and magic division rewrites could be delayed to late opt.
// Unsigned divisibility checks convert to multiply and rotate.
(Eq8 x (Mul8 (Const8 [c])
(Trunc32to8
(Rsh32Ux64
mul:(Mul32
(Const32 [m])
(ZeroExt8to32 x))
(Const64 [s])))
)
)
&& v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int32(1<<8+umagic8(c).m) && s == 8+umagic8(c).s
&& x.Op != OpConst8 && udivisibleOK8(c)
=> (Leq8U
(RotateLeft8 <typ.UInt8>
(Mul8 <typ.UInt8>
(Const8 <typ.UInt8> [int8(udivisible8(c).m)])
x)
(Const8 <typ.UInt8> [int8(8-udivisible8(c).k)])
)
(Const8 <typ.UInt8> [int8(udivisible8(c).max)])
)
(Eq16 x (Mul16 (Const16 [c])
(Trunc64to16
(Rsh64Ux64
mul:(Mul64
(Const64 [m])
(ZeroExt16to64 x))
(Const64 [s])))
)
)
&& v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int64(1<<16+umagic16(c).m) && s == 16+umagic16(c).s
&& x.Op != OpConst16 && udivisibleOK16(c)
=> (Leq16U
(RotateLeft16 <typ.UInt16>
(Mul16 <typ.UInt16>
(Const16 <typ.UInt16> [int16(udivisible16(c).m)])
x)
(Const16 <typ.UInt16> [int16(16-udivisible16(c).k)])
)
(Const16 <typ.UInt16> [int16(udivisible16(c).max)])
)
(Eq16 x (Mul16 (Const16 [c])
(Trunc32to16
(Rsh32Ux64
mul:(Mul32
(Const32 [m])
(ZeroExt16to32 x))
(Const64 [s])))
)
)
&& v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int32(1<<15+umagic16(c).m/2) && s == 16+umagic16(c).s-1
&& x.Op != OpConst16 && udivisibleOK16(c)
=> (Leq16U
(RotateLeft16 <typ.UInt16>
(Mul16 <typ.UInt16>
(Const16 <typ.UInt16> [int16(udivisible16(c).m)])
x)
(Const16 <typ.UInt16> [int16(16-udivisible16(c).k)])
)
(Const16 <typ.UInt16> [int16(udivisible16(c).max)])
)
(Eq16 x (Mul16 (Const16 [c])
(Trunc32to16
(Rsh32Ux64
mul:(Mul32
(Const32 [m])
(Rsh32Ux64 (ZeroExt16to32 x) (Const64 [1])))
(Const64 [s])))
)
)
&& v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int32(1<<15+(umagic16(c).m+1)/2) && s == 16+umagic16(c).s-2
&& x.Op != OpConst16 && udivisibleOK16(c)
=> (Leq16U
(RotateLeft16 <typ.UInt16>
(Mul16 <typ.UInt16>
(Const16 <typ.UInt16> [int16(udivisible16(c).m)])
x)
(Const16 <typ.UInt16> [int16(16-udivisible16(c).k)])
)
(Const16 <typ.UInt16> [int16(udivisible16(c).max)])
)
(Eq16 x (Mul16 (Const16 [c])
(Trunc32to16
(Rsh32Ux64
(Avg32u
(Lsh32x64 (ZeroExt16to32 x) (Const64 [16]))
mul:(Mul32
(Const32 [m])
(ZeroExt16to32 x)))
(Const64 [s])))
)
)
&& v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int32(umagic16(c).m) && s == 16+umagic16(c).s-1
&& x.Op != OpConst16 && udivisibleOK16(c)
=> (Leq16U
(RotateLeft16 <typ.UInt16>
(Mul16 <typ.UInt16>
(Const16 <typ.UInt16> [int16(udivisible16(c).m)])
x)
(Const16 <typ.UInt16> [int16(16-udivisible16(c).k)])
)
(Const16 <typ.UInt16> [int16(udivisible16(c).max)])
)
(Eq32 x (Mul32 (Const32 [c])
(Rsh32Ux64
mul:(Hmul32u
(Const32 [m])
x)
(Const64 [s]))
)
)
&& v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int32(1<<31+umagic32(c).m/2) && s == umagic32(c).s-1
&& x.Op != OpConst32 && udivisibleOK32(c)
=> (Leq32U
(RotateLeft32 <typ.UInt32>
(Mul32 <typ.UInt32>
(Const32 <typ.UInt32> [int32(udivisible32(c).m)])
x)
(Const32 <typ.UInt32> [int32(32-udivisible32(c).k)])
)
(Const32 <typ.UInt32> [int32(udivisible32(c).max)])
)
(Eq32 x (Mul32 (Const32 [c])
(Rsh32Ux64
mul:(Hmul32u
(Const32 <typ.UInt32> [m])
(Rsh32Ux64 x (Const64 [1])))
(Const64 [s]))
)
)
&& v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int32(1<<31+(umagic32(c).m+1)/2) && s == umagic32(c).s-2
&& x.Op != OpConst32 && udivisibleOK32(c)
=> (Leq32U
(RotateLeft32 <typ.UInt32>
(Mul32 <typ.UInt32>
(Const32 <typ.UInt32> [int32(udivisible32(c).m)])
x)
(Const32 <typ.UInt32> [int32(32-udivisible32(c).k)])
)
(Const32 <typ.UInt32> [int32(udivisible32(c).max)])
)
(Eq32 x (Mul32 (Const32 [c])
(Rsh32Ux64
(Avg32u
x
mul:(Hmul32u
(Const32 [m])
x))
(Const64 [s]))
)
)
&& v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int32(umagic32(c).m) && s == umagic32(c).s-1
&& x.Op != OpConst32 && udivisibleOK32(c)
=> (Leq32U
(RotateLeft32 <typ.UInt32>
(Mul32 <typ.UInt32>
(Const32 <typ.UInt32> [int32(udivisible32(c).m)])
x)
(Const32 <typ.UInt32> [int32(32-udivisible32(c).k)])
)
(Const32 <typ.UInt32> [int32(udivisible32(c).max)])
)
(Eq32 x (Mul32 (Const32 [c])
(Trunc64to32
(Rsh64Ux64
mul:(Mul64
(Const64 [m])
(ZeroExt32to64 x))
(Const64 [s])))
)
)
&& v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int64(1<<31+umagic32(c).m/2) && s == 32+umagic32(c).s-1
&& x.Op != OpConst32 && udivisibleOK32(c)
=> (Leq32U
(RotateLeft32 <typ.UInt32>
(Mul32 <typ.UInt32>
(Const32 <typ.UInt32> [int32(udivisible32(c).m)])
x)
(Const32 <typ.UInt32> [int32(32-udivisible32(c).k)])
)
(Const32 <typ.UInt32> [int32(udivisible32(c).max)])
)
(Eq32 x (Mul32 (Const32 [c])
(Trunc64to32
(Rsh64Ux64
mul:(Mul64
(Const64 [m])
(Rsh64Ux64 (ZeroExt32to64 x) (Const64 [1])))
(Const64 [s])))
)
)
&& v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int64(1<<31+(umagic32(c).m+1)/2) && s == 32+umagic32(c).s-2
&& x.Op != OpConst32 && udivisibleOK32(c)
=> (Leq32U
(RotateLeft32 <typ.UInt32>
(Mul32 <typ.UInt32>
(Const32 <typ.UInt32> [int32(udivisible32(c).m)])
x)
(Const32 <typ.UInt32> [int32(32-udivisible32(c).k)])
)
(Const32 <typ.UInt32> [int32(udivisible32(c).max)])
)
(Eq32 x (Mul32 (Const32 [c])
(Trunc64to32
(Rsh64Ux64
(Avg64u
(Lsh64x64 (ZeroExt32to64 x) (Const64 [32]))
mul:(Mul64
(Const64 [m])
(ZeroExt32to64 x)))
(Const64 [s])))
)
)
&& v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int64(umagic32(c).m) && s == 32+umagic32(c).s-1
&& x.Op != OpConst32 && udivisibleOK32(c)
=> (Leq32U
(RotateLeft32 <typ.UInt32>
(Mul32 <typ.UInt32>
(Const32 <typ.UInt32> [int32(udivisible32(c).m)])
x)
(Const32 <typ.UInt32> [int32(32-udivisible32(c).k)])
)
(Const32 <typ.UInt32> [int32(udivisible32(c).max)])
)
(Eq64 x (Mul64 (Const64 [c])
(Rsh64Ux64
mul:(Hmul64u
(Const64 [m])
x)
(Const64 [s]))
)
) && v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int64(1<<63+umagic64(c).m/2) && s == umagic64(c).s-1
&& x.Op != OpConst64 && udivisibleOK64(c)
=> (Leq64U
(RotateLeft64 <typ.UInt64>
(Mul64 <typ.UInt64>
(Const64 <typ.UInt64> [int64(udivisible64(c).m)])
x)
(Const64 <typ.UInt64> [64-udivisible64(c).k])
)
(Const64 <typ.UInt64> [int64(udivisible64(c).max)])
)
(Eq64 x (Mul64 (Const64 [c])
(Rsh64Ux64
mul:(Hmul64u
(Const64 [m])
(Rsh64Ux64 x (Const64 [1])))
(Const64 [s]))
)
) && v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int64(1<<63+(umagic64(c).m+1)/2) && s == umagic64(c).s-2
&& x.Op != OpConst64 && udivisibleOK64(c)
=> (Leq64U
(RotateLeft64 <typ.UInt64>
(Mul64 <typ.UInt64>
(Const64 <typ.UInt64> [int64(udivisible64(c).m)])
x)
(Const64 <typ.UInt64> [64-udivisible64(c).k])
)
(Const64 <typ.UInt64> [int64(udivisible64(c).max)])
)
(Eq64 x (Mul64 (Const64 [c])
(Rsh64Ux64
(Avg64u
x
mul:(Hmul64u
(Const64 [m])
x))
(Const64 [s]))
)
) && v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int64(umagic64(c).m) && s == umagic64(c).s-1
&& x.Op != OpConst64 && udivisibleOK64(c)
=> (Leq64U
(RotateLeft64 <typ.UInt64>
(Mul64 <typ.UInt64>
(Const64 <typ.UInt64> [int64(udivisible64(c).m)])
x)
(Const64 <typ.UInt64> [64-udivisible64(c).k])
)
(Const64 <typ.UInt64> [int64(udivisible64(c).max)])
)
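For readers following the unsigned rules just above: the target form is the classic multiply-by-modular-inverse-and-rotate divisibility test. Below is a self-contained uint64 sketch with my own helper names and inverse computation, not the compiler's udivisible64 fields, though the final comparison matches the Leq64U/RotateLeft64/Mul64 result above.

	import "math/bits"

	// x is divisible by c (c > 0) iff rotating x * inverse(oddPart(c)) right by
	// trailingZeros(c) lands at or below floor((2^64-1)/c).
	func divisibleBy(x, c uint64) bool {
		k := bits.TrailingZeros64(c)
		d := c >> k // odd part of c
		inv := d    // Newton iteration for the inverse of d mod 2^64:
		for i := 0; i < 5; i++ {
			inv *= 2 - d*inv // doubles the number of correct low bits each time
		}
		max := ^uint64(0) / c
		return bits.RotateLeft64(x*inv, -k) <= max
	}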
// Signed divisibility checks convert to multiply, add and rotate.
(Eq8 x (Mul8 (Const8 [c])
(Sub8
(Rsh32x64
mul:(Mul32
(Const32 [m])
(SignExt8to32 x))
(Const64 [s]))
(Rsh32x64
(SignExt8to32 x)
(Const64 [31])))
)
)
&& v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int32(smagic8(c).m) && s == 8+smagic8(c).s
&& x.Op != OpConst8 && sdivisibleOK8(c)
=> (Leq8U
(RotateLeft8 <typ.UInt8>
(Add8 <typ.UInt8>
(Mul8 <typ.UInt8>
(Const8 <typ.UInt8> [int8(sdivisible8(c).m)])
x)
(Const8 <typ.UInt8> [int8(sdivisible8(c).a)])
)
(Const8 <typ.UInt8> [int8(8-sdivisible8(c).k)])
)
(Const8 <typ.UInt8> [int8(sdivisible8(c).max)])
)
(Eq16 x (Mul16 (Const16 [c])
(Sub16
(Rsh32x64
mul:(Mul32
(Const32 [m])
(SignExt16to32 x))
(Const64 [s]))
(Rsh32x64
(SignExt16to32 x)
(Const64 [31])))
)
)
&& v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int32(smagic16(c).m) && s == 16+smagic16(c).s
&& x.Op != OpConst16 && sdivisibleOK16(c)
=> (Leq16U
(RotateLeft16 <typ.UInt16>
(Add16 <typ.UInt16>
(Mul16 <typ.UInt16>
(Const16 <typ.UInt16> [int16(sdivisible16(c).m)])
x)
(Const16 <typ.UInt16> [int16(sdivisible16(c).a)])
)
(Const16 <typ.UInt16> [int16(16-sdivisible16(c).k)])
)
(Const16 <typ.UInt16> [int16(sdivisible16(c).max)])
)
(Eq32 x (Mul32 (Const32 [c])
(Sub32
(Rsh64x64
mul:(Mul64
(Const64 [m])
(SignExt32to64 x))
(Const64 [s]))
(Rsh64x64
(SignExt32to64 x)
(Const64 [63])))
)
)
&& v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int64(smagic32(c).m) && s == 32+smagic32(c).s
&& x.Op != OpConst32 && sdivisibleOK32(c)
=> (Leq32U
(RotateLeft32 <typ.UInt32>
(Add32 <typ.UInt32>
(Mul32 <typ.UInt32>
(Const32 <typ.UInt32> [int32(sdivisible32(c).m)])
x)
(Const32 <typ.UInt32> [int32(sdivisible32(c).a)])
)
(Const32 <typ.UInt32> [int32(32-sdivisible32(c).k)])
)
(Const32 <typ.UInt32> [int32(sdivisible32(c).max)])
)
(Eq32 x (Mul32 (Const32 [c])
(Sub32
(Rsh32x64
mul:(Hmul32
(Const32 [m])
x)
(Const64 [s]))
(Rsh32x64
x
(Const64 [31])))
)
)
&& v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int32(smagic32(c).m/2) && s == smagic32(c).s-1
&& x.Op != OpConst32 && sdivisibleOK32(c)
=> (Leq32U
(RotateLeft32 <typ.UInt32>
(Add32 <typ.UInt32>
(Mul32 <typ.UInt32>
(Const32 <typ.UInt32> [int32(sdivisible32(c).m)])
x)
(Const32 <typ.UInt32> [int32(sdivisible32(c).a)])
)
(Const32 <typ.UInt32> [int32(32-sdivisible32(c).k)])
)
(Const32 <typ.UInt32> [int32(sdivisible32(c).max)])
)
(Eq32 x (Mul32 (Const32 [c])
(Sub32
(Rsh32x64
(Add32
mul:(Hmul32
(Const32 [m])
x)
x)
(Const64 [s]))
(Rsh32x64
x
(Const64 [31])))
)
)
&& v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int32(smagic32(c).m) && s == smagic32(c).s
&& x.Op != OpConst32 && sdivisibleOK32(c)
=> (Leq32U
(RotateLeft32 <typ.UInt32>
(Add32 <typ.UInt32>
(Mul32 <typ.UInt32>
(Const32 <typ.UInt32> [int32(sdivisible32(c).m)])
x)
(Const32 <typ.UInt32> [int32(sdivisible32(c).a)])
)
(Const32 <typ.UInt32> [int32(32-sdivisible32(c).k)])
)
(Const32 <typ.UInt32> [int32(sdivisible32(c).max)])
)
(Eq64 x (Mul64 (Const64 [c])
(Sub64
(Rsh64x64
mul:(Hmul64
(Const64 [m])
x)
(Const64 [s]))
(Rsh64x64
x
(Const64 [63])))
)
)
&& v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int64(smagic64(c).m/2) && s == smagic64(c).s-1
&& x.Op != OpConst64 && sdivisibleOK64(c)
=> (Leq64U
(RotateLeft64 <typ.UInt64>
(Add64 <typ.UInt64>
(Mul64 <typ.UInt64>
(Const64 <typ.UInt64> [int64(sdivisible64(c).m)])
x)
(Const64 <typ.UInt64> [int64(sdivisible64(c).a)])
)
(Const64 <typ.UInt64> [64-sdivisible64(c).k])
)
(Const64 <typ.UInt64> [int64(sdivisible64(c).max)])
)
(Eq64 x (Mul64 (Const64 [c])
(Sub64
(Rsh64x64
(Add64
mul:(Hmul64
(Const64 [m])
x)
x)
(Const64 [s]))
(Rsh64x64
x
(Const64 [63])))
)
)
&& v.Block.Func.pass.name != "opt" && mul.Uses == 1
&& m == int64(smagic64(c).m) && s == smagic64(c).s
&& x.Op != OpConst64 && sdivisibleOK64(c)
=> (Leq64U
(RotateLeft64 <typ.UInt64>
(Add64 <typ.UInt64>
(Mul64 <typ.UInt64>
(Const64 <typ.UInt64> [int64(sdivisible64(c).m)])
x)
(Const64 <typ.UInt64> [int64(sdivisible64(c).a)])
)
(Const64 <typ.UInt64> [64-sdivisible64(c).k])
)
(Const64 <typ.UInt64> [int64(sdivisible64(c).max)])
)
// Divisibility checks for signed integers by a power-of-two constant reduce to a simple mask.
// However, we must match against the rewritten n%c == 0 -> n - c*(n/c) == 0 -> n == c*(n/c)
// where n/c contains fixup code to handle signed n.
((Eq8|Neq8) n (Lsh8x64
(Rsh8x64
(Add8 <t> n (Rsh8Ux64 <t> (Rsh8x64 <t> n (Const64 <typ.UInt64> [ 7])) (Const64 <typ.UInt64> [kbar])))
(Const64 <typ.UInt64> [k]))
(Const64 <typ.UInt64> [k]))
) && k > 0 && k < 7 && kbar == 8 - k
=> ((Eq8|Neq8) (And8 <t> n (Const8 <t> [1<<uint(k)-1])) (Const8 <t> [0]))
((Eq16|Neq16) n (Lsh16x64
(Rsh16x64
(Add16 <t> n (Rsh16Ux64 <t> (Rsh16x64 <t> n (Const64 <typ.UInt64> [15])) (Const64 <typ.UInt64> [kbar])))
(Const64 <typ.UInt64> [k]))
(Const64 <typ.UInt64> [k]))
) && k > 0 && k < 15 && kbar == 16 - k
=> ((Eq16|Neq16) (And16 <t> n (Const16 <t> [1<<uint(k)-1])) (Const16 <t> [0]))
((Eq32|Neq32) n (Lsh32x64
(Rsh32x64
(Add32 <t> n (Rsh32Ux64 <t> (Rsh32x64 <t> n (Const64 <typ.UInt64> [31])) (Const64 <typ.UInt64> [kbar])))
(Const64 <typ.UInt64> [k]))
(Const64 <typ.UInt64> [k]))
) && k > 0 && k < 31 && kbar == 32 - k
=> ((Eq32|Neq32) (And32 <t> n (Const32 <t> [1<<uint(k)-1])) (Const32 <t> [0]))
((Eq64|Neq64) n (Lsh64x64
(Rsh64x64
(Add64 <t> n (Rsh64Ux64 <t> (Rsh64x64 <t> n (Const64 <typ.UInt64> [63])) (Const64 <typ.UInt64> [kbar])))
(Const64 <typ.UInt64> [k]))
(Const64 <typ.UInt64> [k]))
) && k > 0 && k < 63 && kbar == 64 - k
=> ((Eq64|Neq64) (And64 <t> n (Const64 <t> [1<<uint(k)-1])) (Const64 <t> [0]))
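And the signed power-of-two case really is just a mask, since in two's complement the remainder mod 2^k depends only on the low k bits (illustration):

	func divisibleByPow2(n int64, k uint) bool {
		return n&(1<<k-1) == 0 // equivalent to n % (1<<k) == 0 for signed n as well
	}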
(Eq(8|16|32|64) s:(Sub(8|16|32|64) x y) (Const(8|16|32|64) [0])) && s.Uses == 1 => (Eq(8|16|32|64) x y)
(Neq(8|16|32|64) s:(Sub(8|16|32|64) x y) (Const(8|16|32|64) [0])) && s.Uses == 1 => (Neq(8|16|32|64) x y)
@ -1925,6 +1136,20 @@
(Neq(8|16|32|64) (And(8|16|32|64) <t> x (Const(8|16|32|64) <t> [y])) (Const(8|16|32|64) <t> [y])) && oneBit(y)
=> (Eq(8|16|32|64) (And(8|16|32|64) <t> x (Const(8|16|32|64) <t> [y])) (Const(8|16|32|64) <t> [0]))
// Mark newly generated bounded shifts as bounded, for opt passes after prove.
(Lsh64x(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 64 => (Lsh64x(8|16|32|64) [true] x con)
(Rsh64x(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 64 => (Rsh64x(8|16|32|64) [true] x con)
(Rsh64Ux(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 64 => (Rsh64Ux(8|16|32|64) [true] x con)
(Lsh32x(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 32 => (Lsh32x(8|16|32|64) [true] x con)
(Rsh32x(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 32 => (Rsh32x(8|16|32|64) [true] x con)
(Rsh32Ux(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 32 => (Rsh32Ux(8|16|32|64) [true] x con)
(Lsh16x(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 16 => (Lsh16x(8|16|32|64) [true] x con)
(Rsh16x(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 16 => (Rsh16x(8|16|32|64) [true] x con)
(Rsh16Ux(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 16 => (Rsh16Ux(8|16|32|64) [true] x con)
(Lsh8x(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 8 => (Lsh8x(8|16|32|64) [true] x con)
(Rsh8x(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 8 => (Rsh8x(8|16|32|64) [true] x con)
(Rsh8Ux(8|16|32|64) [false] x con:(Const(8|16|32|64) [c])) && 0 < c && c < 8 => (Rsh8Ux(8|16|32|64) [true] x con)
// Reassociate expressions involving
// constants such that constants come first,
// exposing obvious constant-folding opportunities.


@ -461,7 +461,7 @@ var passes = [...]pass{
{name: "short circuit", fn: shortcircuit},
{name: "decompose user", fn: decomposeUser, required: true},
{name: "pre-opt deadcode", fn: deadcode},
{name: "opt", fn: opt, required: true}, // NB: some generic rules know the name of the opt pass. TODO: split required rules and optimizing rules
{name: "opt", fn: opt, required: true},
{name: "zero arg cse", fn: zcse, required: true}, // required to merge OpSB values
{name: "opt deadcode", fn: deadcode, required: true}, // remove any blocks orphaned during opt
{name: "generic cse", fn: cse},
@ -469,12 +469,15 @@ var passes = [...]pass{
{name: "gcse deadcode", fn: deadcode, required: true}, // clean out after cse and phiopt
{name: "nilcheckelim", fn: nilcheckelim},
{name: "prove", fn: prove},
{name: "divisible", fn: divisible, required: true},
{name: "divmod", fn: divmod, required: true},
{name: "middle opt", fn: opt, required: true},
{name: "early fuse", fn: fuseEarly},
{name: "expand calls", fn: expandCalls, required: true},
{name: "decompose builtin", fn: postExpandCallsDecompose, required: true},
{name: "softfloat", fn: softfloat, required: true},
{name: "branchelim", fn: branchelim},
{name: "late opt", fn: opt, required: true}, // TODO: split required rules and optimizing rules
{name: "late opt", fn: opt, required: true},
{name: "dead auto elim", fn: elimDeadAutosGeneric},
{name: "sccp", fn: sccp},
{name: "generic deadcode", fn: deadcode, required: true}, // remove dead stores, which otherwise mess up store chain
@ -529,6 +532,12 @@ var passOrder = [...]constraint{
{"generic cse", "prove"},
// deadcode after prove to eliminate all new dead blocks.
{"prove", "generic deadcode"},
// divisible after prove to let prove analyze div and mod
{"prove", "divisible"},
// divmod after divisible to avoid rewriting subexpressions of the ones that divisible will handle
{"divisible", "divmod"},
// divmod before decompose builtin to handle 64-bit on 32-bit systems
{"divmod", "decompose builtin"},
// common-subexpression before dead-store elim, so that we recognize
// when two address expressions are the same.
{"generic cse", "dse"},
@ -538,7 +547,7 @@ var passOrder = [...]constraint{
{"nilcheckelim", "generic deadcode"},
// nilcheckelim generates sequences of plain basic blocks
{"nilcheckelim", "late fuse"},
// nilcheckelim relies on opt to rewrite user nil checks
// nilcheckelim relies on the first opt to rewrite user nil checks
{"opt", "nilcheckelim"},
// tighten will be most effective when as many values have been removed as possible
{"generic deadcode", "tighten"},


@ -13,14 +13,14 @@ import (
// decompose converts phi ops on compound builtin types into phi
// ops on simple types, then invokes rewrite rules to decompose
// other ops on those types.
func decomposeBuiltIn(f *Func) {
func decomposeBuiltin(f *Func) {
// Decompose phis
for _, b := range f.Blocks {
for _, v := range b.Values {
if v.Op != OpPhi {
continue
}
decomposeBuiltInPhi(v)
decomposeBuiltinPhi(v)
}
}
@ -121,7 +121,7 @@ func maybeAppend2(f *Func, ss []*LocalSlot, s1, s2 *LocalSlot) []*LocalSlot {
return maybeAppend(f, maybeAppend(f, ss, s1), s2)
}
func decomposeBuiltInPhi(v *Value) {
func decomposeBuiltinPhi(v *Value) {
switch {
case v.Type.IsInteger() && v.Type.Size() > v.Block.Func.Config.RegSize:
decomposeInt64Phi(v)


@ -15,7 +15,7 @@ import (
func postExpandCallsDecompose(f *Func) {
decomposeUser(f) // redo user decompose to cleanup after expand calls
decomposeBuiltIn(f) // handles both regular decomposition and cleanup.
decomposeBuiltin(f) // handles both regular decomposition and cleanup.
}
func expandCalls(f *Func) {


@ -8,3 +8,11 @@ package ssa
func opt(f *Func) {
applyRewrite(f, rewriteBlockgeneric, rewriteValuegeneric, removeDeadValues)
}
func divisible(f *Func) {
applyRewrite(f, rewriteBlockdivisible, rewriteValuedivisible, removeDeadValues)
}
func divmod(f *Func) {
applyRewrite(f, rewriteBlockdivmod, rewriteValuedivmod, removeDeadValues)
}


@ -1946,7 +1946,7 @@ func (ft *factsTable) flowLimit(v *Value) bool {
a := ft.limits[v.Args[0].ID]
b := ft.limits[v.Args[1].ID]
sub := ft.newLimit(v, a.sub(b, uint(v.Type.Size())*8))
mod := ft.detectSignedMod(v)
mod := ft.detectMod(v)
inferred := ft.detectSliceLenRelation(v)
return sub || mod || inferred
case OpNeg64, OpNeg32, OpNeg16, OpNeg8:
@ -1984,6 +1984,10 @@ func (ft *factsTable) flowLimit(v *Value) bool {
lim = lim.unsignedMax(a.umax / b.umin)
}
return ft.newLimit(v, lim)
case OpMod64, OpMod32, OpMod16, OpMod8:
return ft.modLimit(true, v, v.Args[0], v.Args[1])
case OpMod64u, OpMod32u, OpMod16u, OpMod8u:
return ft.modLimit(false, v, v.Args[0], v.Args[1])
case OpPhi:
// Compute the union of all the input phis.
@ -2008,32 +2012,6 @@ func (ft *factsTable) flowLimit(v *Value) bool {
return false
}
// See if we can get any facts because v is the result of signed mod by a constant.
// The mod operation has already been rewritten, so we have to try and reconstruct it.
//
// x % d
//
// is rewritten as
//
// x - (x / d) * d
//
// furthermore, the divide itself gets rewritten. If d is a power of 2 (d == 1<<k), we do
//
// (x / d) * d = ((x + adj) >> k) << k
// = (x + adj) & (-1<<k)
//
// with adj being an adjustment in case x is negative (see below).
// if d is not a power of 2, we do
//
// x / d = ... TODO ...
func (ft *factsTable) detectSignedMod(v *Value) bool {
if ft.detectSignedModByPowerOfTwo(v) {
return true
}
// TODO: non-powers-of-2
return false
}
// detectSliceLenRelation matches the pattern where
// 1. v := slicelen - index, OR v := slicecap - index
// AND
@ -2095,102 +2073,64 @@ func (ft *factsTable) detectSliceLenRelation(v *Value) (inferred bool) {
return inferred
}
func (ft *factsTable) detectSignedModByPowerOfTwo(v *Value) bool {
// We're looking for:
//
// x % d ==
// x - (x / d) * d
//
// which for d a power of 2, d == 1<<k, is done as
//
// x - ((x + (x>>(w-1))>>>(w-k)) & (-1<<k))
//
// w = bit width of x.
// (>> = signed shift, >>> = unsigned shift).
// See ./_gen/generic.rules, search for "Signed divide by power of 2".
var w int64
var addOp, andOp, constOp, sshiftOp, ushiftOp Op
// x%d has been rewritten to x - (x/d)*d.
func (ft *factsTable) detectMod(v *Value) bool {
var opDiv, opDivU, opMul, opConst Op
switch v.Op {
case OpSub64:
w = 64
addOp = OpAdd64
andOp = OpAnd64
constOp = OpConst64
sshiftOp = OpRsh64x64
ushiftOp = OpRsh64Ux64
opDiv = OpDiv64
opDivU = OpDiv64u
opMul = OpMul64
opConst = OpConst64
case OpSub32:
w = 32
addOp = OpAdd32
andOp = OpAnd32
constOp = OpConst32
sshiftOp = OpRsh32x64
ushiftOp = OpRsh32Ux64
opDiv = OpDiv32
opDivU = OpDiv32u
opMul = OpMul32
opConst = OpConst32
case OpSub16:
w = 16
addOp = OpAdd16
andOp = OpAnd16
constOp = OpConst16
sshiftOp = OpRsh16x64
ushiftOp = OpRsh16Ux64
opDiv = OpDiv16
opDivU = OpDiv16u
opMul = OpMul16
opConst = OpConst16
case OpSub8:
w = 8
addOp = OpAdd8
andOp = OpAnd8
constOp = OpConst8
sshiftOp = OpRsh8x64
ushiftOp = OpRsh8Ux64
default:
return false
opDiv = OpDiv8
opDivU = OpDiv8u
opMul = OpMul8
opConst = OpConst8
}
x := v.Args[0]
and := v.Args[1]
if and.Op != andOp {
mul := v.Args[1]
if mul.Op != opMul {
return false
}
var add, mask *Value
if and.Args[0].Op == addOp && and.Args[1].Op == constOp {
add = and.Args[0]
mask = and.Args[1]
} else if and.Args[1].Op == addOp && and.Args[0].Op == constOp {
add = and.Args[1]
mask = and.Args[0]
} else {
return false
}
var ushift *Value
if add.Args[0] == x {
ushift = add.Args[1]
} else if add.Args[1] == x {
ushift = add.Args[0]
} else {
return false
}
if ushift.Op != ushiftOp {
return false
}
if ushift.Args[1].Op != OpConst64 {
return false
}
k := w - ushift.Args[1].AuxInt // Now we know k!
d := int64(1) << k // divisor
sshift := ushift.Args[0]
if sshift.Op != sshiftOp {
return false
}
if sshift.Args[0] != x {
return false
}
if sshift.Args[1].Op != OpConst64 || sshift.Args[1].AuxInt != w-1 {
return false
}
if mask.AuxInt != -d {
div, con := mul.Args[0], mul.Args[1]
if div.Op == opConst {
div, con = con, div
}
if con.Op != opConst || (div.Op != opDiv && div.Op != opDivU) || div.Args[0] != v.Args[0] || div.Args[1].Op != opConst || div.Args[1].AuxInt != con.AuxInt {
return false
}
return ft.modLimit(div.Op == opDiv, v, v.Args[0], con)
}
// All looks ok. x % d is at most +/- d-1.
return ft.signedMinMax(v, -d+1, d-1)
// modLimit sets v with facts derived from v = p % q.
func (ft *factsTable) modLimit(signed bool, v, p, q *Value) bool {
a := ft.limits[p.ID]
b := ft.limits[q.ID]
if signed {
if a.min < 0 && b.min > 0 {
return ft.signedMinMax(v, -(b.max - 1), b.max-1)
}
if !(a.nonnegative() && b.nonnegative()) {
// TODO: we could handle signed limits but I didn't bother.
return false
}
if a.min >= 0 && b.min > 0 {
ft.setNonNegative(v)
}
}
// Underflow in the arithmetic below is ok; it gives MaxUint64, which does nothing to the limit.
return ft.unsignedMax(v, min(a.umax, b.umax-1))
}
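A worked instance of the unsigned bound modLimit derives, mirroring the modbound1 test added further down (names here are illustrative): for d = u % 100 the limit is min(umax(u), 100-1) = 99, so an index of d*2+1 is at most 199 and a bounds check against a 200-element array can be dropped.

	var table [200]int

	func f(u uint64) int {
		d := u % 100        // modLimit: d <= 99
		return table[d*2+1] // 2*99+1 = 199 < 200, so prove removes the bounds check
	}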
// getBranch returns the range restrictions added by p
@ -2468,6 +2408,10 @@ func addLocalFacts(ft *factsTable, b *Block) {
// TODO: investigate how to always add facts without much slowdown, see issue #57959
//ft.update(b, v, v.Args[0], unsigned, gt|eq)
//ft.update(b, v, v.Args[1], unsigned, gt|eq)
case OpDiv64, OpDiv32, OpDiv16, OpDiv8:
if ft.isNonNegative(v.Args[0]) && ft.isNonNegative(v.Args[1]) {
ft.update(b, v, v.Args[0], unsigned, lt|eq)
}
case OpDiv64u, OpDiv32u, OpDiv16u, OpDiv8u,
OpRsh8Ux64, OpRsh8Ux32, OpRsh8Ux16, OpRsh8Ux8,
OpRsh16Ux64, OpRsh16Ux32, OpRsh16Ux16, OpRsh16Ux8,
@ -2510,10 +2454,7 @@ func addLocalFacts(ft *factsTable, b *Block) {
}
ft.update(b, v, v.Args[0], unsigned, lt|eq)
case OpMod64, OpMod32, OpMod16, OpMod8:
a := ft.limits[v.Args[0].ID]
b := ft.limits[v.Args[1].ID]
if !(a.nonnegative() && b.nonnegative()) {
// TODO: we could handle signed limits but I didn't bother.
if !ft.isNonNegative(v.Args[0]) || !ft.isNonNegative(v.Args[1]) {
break
}
fallthrough
@ -2631,14 +2572,30 @@ func addLocalFactsPhi(ft *factsTable, v *Value) {
ft.update(b, v, y, dom, rel)
}
var ctzNonZeroOp = map[Op]Op{OpCtz8: OpCtz8NonZero, OpCtz16: OpCtz16NonZero, OpCtz32: OpCtz32NonZero, OpCtz64: OpCtz64NonZero}
var ctzNonZeroOp = map[Op]Op{
OpCtz8: OpCtz8NonZero,
OpCtz16: OpCtz16NonZero,
OpCtz32: OpCtz32NonZero,
OpCtz64: OpCtz64NonZero,
}
var mostNegativeDividend = map[Op]int64{
OpDiv16: -1 << 15,
OpMod16: -1 << 15,
OpDiv32: -1 << 31,
OpMod32: -1 << 31,
OpDiv64: -1 << 63,
OpMod64: -1 << 63}
OpMod64: -1 << 63,
}
var unsignedOp = map[Op]Op{
OpDiv8: OpDiv8u,
OpDiv16: OpDiv16u,
OpDiv32: OpDiv32u,
OpDiv64: OpDiv64u,
OpMod8: OpMod8u,
OpMod16: OpMod16u,
OpMod32: OpMod32u,
OpMod64: OpMod64u,
}
var bytesizeToConst = [...]Op{
8 / 8: OpConst8,
@ -2746,34 +2703,51 @@ func simplifyBlock(sdom SparseTree, ft *factsTable, b *Block) {
b.Func.Warnl(v.Pos, "Proved %v bounded", v.Op)
}
}
case OpDiv16, OpDiv32, OpDiv64, OpMod16, OpMod32, OpMod64:
// On amd64 and 386 fix-up code can be avoided if we know
// the divisor is not -1 or the dividend > MinIntNN.
// Don't modify AuxInt on other architectures,
// as that can interfere with CSE.
// TODO: add other architectures?
if b.Func.Config.arch != "386" && b.Func.Config.arch != "amd64" {
case OpDiv8, OpDiv16, OpDiv32, OpDiv64, OpMod8, OpMod16, OpMod32, OpMod64:
p, q := ft.limits[v.Args[0].ID], ft.limits[v.Args[1].ID] // p/q
if p.nonnegative() && q.nonnegative() {
if b.Func.pass.debug > 0 {
b.Func.Warnl(v.Pos, "Proved %v is unsigned", v.Op)
}
v.Op = unsignedOp[v.Op]
v.AuxInt = 0
break
}
divr := v.Args[1]
divrLim := ft.limits[divr.ID]
divd := v.Args[0]
divdLim := ft.limits[divd.ID]
if divrLim.max < -1 || divrLim.min > -1 || divdLim.min > mostNegativeDividend[v.Op] {
// Fixup code can be avoided on x86 if we know
// the divisor is not -1 or the dividend > MinIntNN.
if v.Op != OpDiv8 && v.Op != OpMod8 && (q.max < -1 || q.min > -1 || p.min > mostNegativeDividend[v.Op]) {
// See DivisionNeedsFixUp in rewrite.go.
// v.AuxInt = 1 means we have proved both that the divisor is not -1
// and that the dividend is not the most negative integer,
// v.AuxInt = 1 means we have proved that the divisor is not -1
// or that the dividend is not the most negative integer,
// so we do not need to add fix-up code.
v.AuxInt = 1
if b.Func.pass.debug > 0 {
b.Func.Warnl(v.Pos, "Proved %v does not need fix-up", v.Op)
}
// Only usable on amd64 and 386, and only for ≥ 16-bit ops.
// Don't modify AuxInt on other architectures, as that can interfere with CSE.
// (Print the debug info above always, so that test/prove.go can be
// checked on non-x86 systems.)
// TODO: add other architectures?
if b.Func.Config.arch == "386" || b.Func.Config.arch == "amd64" {
v.AuxInt = 1
}
}
case OpMul64, OpMul32, OpMul16, OpMul8:
if vl := ft.limits[v.ID]; vl.min == vl.max || vl.umin == vl.umax {
// v is going to be constant folded away; don't "optimize" it.
break
}
x := v.Args[0]
xl := ft.limits[x.ID]
y := v.Args[1]
yl := ft.limits[y.ID]
if xl.umin == xl.umax && isPowerOfTwo(int64(xl.umin)) ||
xl.min == xl.max && isPowerOfTwo(xl.min) ||
yl.umin == yl.umax && isPowerOfTwo(int64(yl.umin)) ||
yl.min == yl.max && isPowerOfTwo(yl.min) {
// 0,1 * a power of two is better done as a shift
break
}
switch xOne, yOne := xl.umax <= 1, yl.umax <= 1; {
case xOne && yOne:
v.Op = bytesizeToAnd[v.Type.Size()]
@ -2807,6 +2781,7 @@ func simplifyBlock(sdom SparseTree, ft *factsTable, b *Block) {
}
}
}
// Fold provable constant results.
// Helps in cases where we reuse a value after branching on its equality.
for i, arg := range v.Args {


@ -57,11 +57,15 @@ func applyRewrite(f *Func, rb blockRewriter, rv valueRewriter, deadcode deadValu
var iters int
var states map[string]bool
for {
if debug > 1 {
fmt.Printf("%s: iter %d\n", f.pass.name, iters)
}
change := false
deadChange := false
for _, b := range f.Blocks {
var b0 *Block
if debug > 1 {
fmt.Printf("%s: start block\n", f.pass.name)
b0 = new(Block)
*b0 = *b
b0.Succs = append([]Edge{}, b.Succs...) // make a new copy, not aliasing
@ -79,6 +83,9 @@ func applyRewrite(f *Func, rb blockRewriter, rv valueRewriter, deadcode deadValu
}
}
for j, v := range b.Values {
if debug > 1 {
fmt.Printf("%s: consider %v\n", f.pass.name, v.LongString())
}
var v0 *Value
if debug > 1 {
v0 = new(Value)
@ -1260,10 +1267,8 @@ func logRule(s string) {
}
ruleFile = w
}
_, err := fmt.Fprintln(ruleFile, s)
if err != nil {
panic(err)
}
// Ignore errors in case of multiple processes fighting over the file.
fmt.Fprintln(ruleFile, s)
}
var ruleFile io.Writer


@ -1310,6 +1310,8 @@ func rewriteValuedec64_OpRotateLeft32(v *Value) bool {
func rewriteValuedec64_OpRotateLeft64(v *Value) bool {
v_1 := v.Args[1]
v_0 := v.Args[0]
b := v.Block
typ := &b.Func.Config.Types
// match: (RotateLeft64 x (Int64Make hi lo))
// result: (RotateLeft64 x lo)
for {
@ -1322,6 +1324,458 @@ func rewriteValuedec64_OpRotateLeft64(v *Value) bool {
v.AddArg2(x, lo)
return true
}
// match: (RotateLeft64 <t> x (Const64 [c]))
// cond: c&63 == 0
// result: x
for {
x := v_0
if v_1.Op != OpConst64 {
break
}
c := auxIntToInt64(v_1.AuxInt)
if !(c&63 == 0) {
break
}
v.copyOf(x)
return true
}
// match: (RotateLeft64 <t> x (Const32 [c]))
// cond: c&63 == 0
// result: x
for {
x := v_0
if v_1.Op != OpConst32 {
break
}
c := auxIntToInt32(v_1.AuxInt)
if !(c&63 == 0) {
break
}
v.copyOf(x)
return true
}
// match: (RotateLeft64 <t> x (Const16 [c]))
// cond: c&63 == 0
// result: x
for {
x := v_0
if v_1.Op != OpConst16 {
break
}
c := auxIntToInt16(v_1.AuxInt)
if !(c&63 == 0) {
break
}
v.copyOf(x)
return true
}
// match: (RotateLeft64 <t> x (Const8 [c]))
// cond: c&63 == 0
// result: x
for {
x := v_0
if v_1.Op != OpConst8 {
break
}
c := auxIntToInt8(v_1.AuxInt)
if !(c&63 == 0) {
break
}
v.copyOf(x)
return true
}
// match: (RotateLeft64 <t> x (Const64 [c]))
// cond: c&63 == 32
// result: (Int64Make <t> (Int64Lo x) (Int64Hi x))
for {
t := v.Type
x := v_0
if v_1.Op != OpConst64 {
break
}
c := auxIntToInt64(v_1.AuxInt)
if !(c&63 == 32) {
break
}
v.reset(OpInt64Make)
v.Type = t
v0 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32)
v0.AddArg(x)
v1 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32)
v1.AddArg(x)
v.AddArg2(v0, v1)
return true
}
// match: (RotateLeft64 <t> x (Const32 [c]))
// cond: c&63 == 32
// result: (Int64Make <t> (Int64Lo x) (Int64Hi x))
for {
t := v.Type
x := v_0
if v_1.Op != OpConst32 {
break
}
c := auxIntToInt32(v_1.AuxInt)
if !(c&63 == 32) {
break
}
v.reset(OpInt64Make)
v.Type = t
v0 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32)
v0.AddArg(x)
v1 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32)
v1.AddArg(x)
v.AddArg2(v0, v1)
return true
}
// match: (RotateLeft64 <t> x (Const16 [c]))
// cond: c&63 == 32
// result: (Int64Make <t> (Int64Lo x) (Int64Hi x))
for {
t := v.Type
x := v_0
if v_1.Op != OpConst16 {
break
}
c := auxIntToInt16(v_1.AuxInt)
if !(c&63 == 32) {
break
}
v.reset(OpInt64Make)
v.Type = t
v0 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32)
v0.AddArg(x)
v1 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32)
v1.AddArg(x)
v.AddArg2(v0, v1)
return true
}
// match: (RotateLeft64 <t> x (Const8 [c]))
// cond: c&63 == 32
// result: (Int64Make <t> (Int64Lo x) (Int64Hi x))
for {
t := v.Type
x := v_0
if v_1.Op != OpConst8 {
break
}
c := auxIntToInt8(v_1.AuxInt)
if !(c&63 == 32) {
break
}
v.reset(OpInt64Make)
v.Type = t
v0 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32)
v0.AddArg(x)
v1 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32)
v1.AddArg(x)
v.AddArg2(v0, v1)
return true
}
// match: (RotateLeft64 <t> x (Const64 [c]))
// cond: 0 < c&63 && c&63 < 32
// result: (Int64Make <t> (Or32 <typ.UInt32> (Lsh32x32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(c&31)])) (Rsh32Ux32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(32-c&31)]))) (Or32 <typ.UInt32> (Lsh32x32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(c&31)])) (Rsh32Ux32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(32-c&31)]))))
for {
t := v.Type
x := v_0
if v_1.Op != OpConst64 {
break
}
c := auxIntToInt64(v_1.AuxInt)
if !(0 < c&63 && c&63 < 32) {
break
}
v.reset(OpInt64Make)
v.Type = t
v0 := b.NewValue0(v.Pos, OpOr32, typ.UInt32)
v1 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32)
v2 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32)
v2.AddArg(x)
v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32)
v3.AuxInt = int32ToAuxInt(int32(c & 31))
v1.AddArg2(v2, v3)
v4 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32)
v5 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32)
v5.AddArg(x)
v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32)
v6.AuxInt = int32ToAuxInt(int32(32 - c&31))
v4.AddArg2(v5, v6)
v0.AddArg2(v1, v4)
v7 := b.NewValue0(v.Pos, OpOr32, typ.UInt32)
v8 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32)
v8.AddArg2(v5, v3)
v9 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32)
v9.AddArg2(v2, v6)
v7.AddArg2(v8, v9)
v.AddArg2(v0, v7)
return true
}
// match: (RotateLeft64 <t> x (Const32 [c]))
// cond: 0 < c&63 && c&63 < 32
// result: (Int64Make <t> (Or32 <typ.UInt32> (Lsh32x32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(c&31)])) (Rsh32Ux32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(32-c&31)]))) (Or32 <typ.UInt32> (Lsh32x32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(c&31)])) (Rsh32Ux32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(32-c&31)]))))
for {
t := v.Type
x := v_0
if v_1.Op != OpConst32 {
break
}
c := auxIntToInt32(v_1.AuxInt)
if !(0 < c&63 && c&63 < 32) {
break
}
v.reset(OpInt64Make)
v.Type = t
v0 := b.NewValue0(v.Pos, OpOr32, typ.UInt32)
v1 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32)
v2 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32)
v2.AddArg(x)
v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32)
v3.AuxInt = int32ToAuxInt(int32(c & 31))
v1.AddArg2(v2, v3)
v4 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32)
v5 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32)
v5.AddArg(x)
v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32)
v6.AuxInt = int32ToAuxInt(int32(32 - c&31))
v4.AddArg2(v5, v6)
v0.AddArg2(v1, v4)
v7 := b.NewValue0(v.Pos, OpOr32, typ.UInt32)
v8 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32)
v8.AddArg2(v5, v3)
v9 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32)
v9.AddArg2(v2, v6)
v7.AddArg2(v8, v9)
v.AddArg2(v0, v7)
return true
}
// match: (RotateLeft64 <t> x (Const16 [c]))
// cond: 0 < c&63 && c&63 < 32
// result: (Int64Make <t> (Or32 <typ.UInt32> (Lsh32x32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(c&31)])) (Rsh32Ux32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(32-c&31)]))) (Or32 <typ.UInt32> (Lsh32x32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(c&31)])) (Rsh32Ux32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(32-c&31)]))))
for {
t := v.Type
x := v_0
if v_1.Op != OpConst16 {
break
}
c := auxIntToInt16(v_1.AuxInt)
if !(0 < c&63 && c&63 < 32) {
break
}
v.reset(OpInt64Make)
v.Type = t
v0 := b.NewValue0(v.Pos, OpOr32, typ.UInt32)
v1 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32)
v2 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32)
v2.AddArg(x)
v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32)
v3.AuxInt = int32ToAuxInt(int32(c & 31))
v1.AddArg2(v2, v3)
v4 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32)
v5 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32)
v5.AddArg(x)
v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32)
v6.AuxInt = int32ToAuxInt(int32(32 - c&31))
v4.AddArg2(v5, v6)
v0.AddArg2(v1, v4)
v7 := b.NewValue0(v.Pos, OpOr32, typ.UInt32)
v8 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32)
v8.AddArg2(v5, v3)
v9 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32)
v9.AddArg2(v2, v6)
v7.AddArg2(v8, v9)
v.AddArg2(v0, v7)
return true
}
// match: (RotateLeft64 <t> x (Const8 [c]))
// cond: 0 < c&63 && c&63 < 32
// result: (Int64Make <t> (Or32 <typ.UInt32> (Lsh32x32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(c&31)])) (Rsh32Ux32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(32-c&31)]))) (Or32 <typ.UInt32> (Lsh32x32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(c&31)])) (Rsh32Ux32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(32-c&31)]))))
for {
t := v.Type
x := v_0
if v_1.Op != OpConst8 {
break
}
c := auxIntToInt8(v_1.AuxInt)
if !(0 < c&63 && c&63 < 32) {
break
}
v.reset(OpInt64Make)
v.Type = t
v0 := b.NewValue0(v.Pos, OpOr32, typ.UInt32)
v1 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32)
v2 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32)
v2.AddArg(x)
v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32)
v3.AuxInt = int32ToAuxInt(int32(c & 31))
v1.AddArg2(v2, v3)
v4 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32)
v5 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32)
v5.AddArg(x)
v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32)
v6.AuxInt = int32ToAuxInt(int32(32 - c&31))
v4.AddArg2(v5, v6)
v0.AddArg2(v1, v4)
v7 := b.NewValue0(v.Pos, OpOr32, typ.UInt32)
v8 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32)
v8.AddArg2(v5, v3)
v9 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32)
v9.AddArg2(v2, v6)
v7.AddArg2(v8, v9)
v.AddArg2(v0, v7)
return true
}
// match: (RotateLeft64 <t> x (Const64 [c]))
// cond: 32 < c&63 && c&63 < 64
// result: (Int64Make <t> (Or32 <typ.UInt32> (Lsh32x32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(c&31)])) (Rsh32Ux32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(32-c&31)]))) (Or32 <typ.UInt32> (Lsh32x32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(c&31)])) (Rsh32Ux32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(32-c&31)]))))
for {
t := v.Type
x := v_0
if v_1.Op != OpConst64 {
break
}
c := auxIntToInt64(v_1.AuxInt)
if !(32 < c&63 && c&63 < 64) {
break
}
v.reset(OpInt64Make)
v.Type = t
v0 := b.NewValue0(v.Pos, OpOr32, typ.UInt32)
v1 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32)
v2 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32)
v2.AddArg(x)
v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32)
v3.AuxInt = int32ToAuxInt(int32(c & 31))
v1.AddArg2(v2, v3)
v4 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32)
v5 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32)
v5.AddArg(x)
v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32)
v6.AuxInt = int32ToAuxInt(int32(32 - c&31))
v4.AddArg2(v5, v6)
v0.AddArg2(v1, v4)
v7 := b.NewValue0(v.Pos, OpOr32, typ.UInt32)
v8 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32)
v8.AddArg2(v5, v3)
v9 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32)
v9.AddArg2(v2, v6)
v7.AddArg2(v8, v9)
v.AddArg2(v0, v7)
return true
}
// match: (RotateLeft64 <t> x (Const32 [c]))
// cond: 32 < c&63 && c&63 < 64
// result: (Int64Make <t> (Or32 <typ.UInt32> (Lsh32x32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(c&31)])) (Rsh32Ux32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(32-c&31)]))) (Or32 <typ.UInt32> (Lsh32x32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(c&31)])) (Rsh32Ux32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(32-c&31)]))))
for {
t := v.Type
x := v_0
if v_1.Op != OpConst32 {
break
}
c := auxIntToInt32(v_1.AuxInt)
if !(32 < c&63 && c&63 < 64) {
break
}
v.reset(OpInt64Make)
v.Type = t
v0 := b.NewValue0(v.Pos, OpOr32, typ.UInt32)
v1 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32)
v2 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32)
v2.AddArg(x)
v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32)
v3.AuxInt = int32ToAuxInt(int32(c & 31))
v1.AddArg2(v2, v3)
v4 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32)
v5 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32)
v5.AddArg(x)
v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32)
v6.AuxInt = int32ToAuxInt(int32(32 - c&31))
v4.AddArg2(v5, v6)
v0.AddArg2(v1, v4)
v7 := b.NewValue0(v.Pos, OpOr32, typ.UInt32)
v8 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32)
v8.AddArg2(v5, v3)
v9 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32)
v9.AddArg2(v2, v6)
v7.AddArg2(v8, v9)
v.AddArg2(v0, v7)
return true
}
// match: (RotateLeft64 <t> x (Const16 [c]))
// cond: 32 < c&63 && c&63 < 64
// result: (Int64Make <t> (Or32 <typ.UInt32> (Lsh32x32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(c&31)])) (Rsh32Ux32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(32-c&31)]))) (Or32 <typ.UInt32> (Lsh32x32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(c&31)])) (Rsh32Ux32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(32-c&31)]))))
for {
t := v.Type
x := v_0
if v_1.Op != OpConst16 {
break
}
c := auxIntToInt16(v_1.AuxInt)
if !(32 < c&63 && c&63 < 64) {
break
}
v.reset(OpInt64Make)
v.Type = t
v0 := b.NewValue0(v.Pos, OpOr32, typ.UInt32)
v1 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32)
v2 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32)
v2.AddArg(x)
v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32)
v3.AuxInt = int32ToAuxInt(int32(c & 31))
v1.AddArg2(v2, v3)
v4 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32)
v5 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32)
v5.AddArg(x)
v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32)
v6.AuxInt = int32ToAuxInt(int32(32 - c&31))
v4.AddArg2(v5, v6)
v0.AddArg2(v1, v4)
v7 := b.NewValue0(v.Pos, OpOr32, typ.UInt32)
v8 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32)
v8.AddArg2(v5, v3)
v9 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32)
v9.AddArg2(v2, v6)
v7.AddArg2(v8, v9)
v.AddArg2(v0, v7)
return true
}
// match: (RotateLeft64 <t> x (Const8 [c]))
// cond: 32 < c&63 && c&63 < 64
// result: (Int64Make <t> (Or32 <typ.UInt32> (Lsh32x32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(c&31)])) (Rsh32Ux32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(32-c&31)]))) (Or32 <typ.UInt32> (Lsh32x32 <typ.UInt32> (Int64Hi x) (Const32 <typ.UInt32> [int32(c&31)])) (Rsh32Ux32 <typ.UInt32> (Int64Lo x) (Const32 <typ.UInt32> [int32(32-c&31)]))))
for {
t := v.Type
x := v_0
if v_1.Op != OpConst8 {
break
}
c := auxIntToInt8(v_1.AuxInt)
if !(32 < c&63 && c&63 < 64) {
break
}
v.reset(OpInt64Make)
v.Type = t
v0 := b.NewValue0(v.Pos, OpOr32, typ.UInt32)
v1 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32)
v2 := b.NewValue0(v.Pos, OpInt64Lo, typ.UInt32)
v2.AddArg(x)
v3 := b.NewValue0(v.Pos, OpConst32, typ.UInt32)
v3.AuxInt = int32ToAuxInt(int32(c & 31))
v1.AddArg2(v2, v3)
v4 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32)
v5 := b.NewValue0(v.Pos, OpInt64Hi, typ.UInt32)
v5.AddArg(x)
v6 := b.NewValue0(v.Pos, OpConst32, typ.UInt32)
v6.AuxInt = int32ToAuxInt(int32(32 - c&31))
v4.AddArg2(v5, v6)
v0.AddArg2(v1, v4)
v7 := b.NewValue0(v.Pos, OpOr32, typ.UInt32)
v8 := b.NewValue0(v.Pos, OpLsh32x32, typ.UInt32)
v8.AddArg2(v5, v3)
v9 := b.NewValue0(v.Pos, OpRsh32Ux32, typ.UInt32)
v9.AddArg2(v2, v6)
v7.AddArg2(v8, v9)
v.AddArg2(v0, v7)
return true
}
return false
}
func rewriteValuedec64_OpRotateLeft8(v *Value) bool {

(Three file diffs suppressed because they are too large.)


@ -73,7 +73,7 @@ func softfloat(f *Func) {
if newInt64 && f.Config.RegSize == 4 {
// On 32bit arch, decompose Uint64 introduced in the switch above.
decomposeBuiltIn(f)
decomposeBuiltin(f)
applyRewrite(f, rewriteBlockdec64, rewriteValuedec64, removeDeadValues)
}


@ -1390,11 +1390,17 @@ func div19_int64(n int64) bool {
return n%19 == 0
}
var (
// These have to be global to avoid getting constant-folded in the function body:
// as locals, prove can see that they are actually constants.
sixU, nineteenU uint64 = 6, 19
sixS, nineteenS int64 = 6, 19
)
// testDivisibility confirms that the rewrite rules for x%c == 0 with c constant are correct.
func testDivisibility(t *testing.T) {
// unsigned tests
// test an even and an odd divisor
var sixU, nineteenU uint64 = 6, 19
// test all inputs for uint8, uint16
for i := uint64(0); i <= math.MaxUint16; i++ {
if i <= math.MaxUint8 {
@ -1402,7 +1408,7 @@ func testDivisibility(t *testing.T) {
t.Errorf("div6_uint8(%d) = %v want %v", i, got, want)
}
if want, got := uint8(i)%uint8(nineteenU) == 0, div19_uint8(uint8(i)); got != want {
t.Errorf("div6_uint19(%d) = %v want %v", i, got, want)
t.Errorf("div19_uint8(%d) = %v want %v", i, got, want)
}
}
if want, got := uint16(i)%uint16(sixU) == 0, div6_uint16(uint16(i)); got != want {
@ -1450,7 +1456,6 @@ func testDivisibility(t *testing.T) {
// signed tests
// test an even and an odd divisor
var sixS, nineteenS int64 = 6, 19
// test all inputs for int8, int16
for i := int64(math.MinInt16); i <= math.MaxInt16; i++ {
if math.MinInt8 <= i && i <= math.MaxInt8 {
@ -1458,7 +1463,7 @@ func testDivisibility(t *testing.T) {
t.Errorf("div6_int8(%d) = %v want %v", i, got, want)
}
if want, got := int8(i)%int8(nineteenS) == 0, div19_int8(int8(i)); got != want {
t.Errorf("div6_int19(%d) = %v want %v", i, got, want)
t.Errorf("div19_int8(%d) = %v want %v", i, got, want)
}
}
if want, got := int16(i)%int16(sixS) == 0, div6_int16(int16(i)); got != want {


@ -26,7 +26,7 @@ func f1(a [256]int, i int) {
var j int
useInt(a[i]) // ERROR "Found IsInBounds$"
j = i % 256
useInt(a[j]) // ERROR "Found IsInBounds$"
useInt(a[j])
j = i & 255
useInt(a[j])
j = i & 17


@ -10,9 +10,7 @@ package codegen
// simplifications and optimizations on integer types.
// For codegen tests on float types, see floats.go.
// ----------------- //
// Addition //
// ----------------- //
// Addition
func AddLargeConst(a uint64, out []uint64) {
// ppc64x/power10:"ADD [$]4294967296,"
@ -56,9 +54,7 @@ func AddLargeConst2(a int, out []int) {
out[0] = a + 0x10000
}
// ----------------- //
// Subtraction //
// ----------------- //
// Subtraction
var ef int
@ -90,58 +86,58 @@ func SubMem(arr []int, b, c, d int) int {
func SubFromConst(a int) int {
// ppc64x: `SUBC R[0-9]+,\s[$]40,\sR`
// riscv64: "ADDI \\$-40" "NEG"
// riscv64: "ADDI [$]-40" "NEG"
b := 40 - a
return b
}
func SubFromConstNeg(a int) int {
// arm64: "ADD \\$40"
// loong64: "ADDV[U] \\$40"
// mips: "ADD[U] \\$40"
// mips64: "ADDV[U] \\$40"
// arm64: "ADD [$]40"
// loong64: "ADDV[U] [$]40"
// mips: "ADD[U] [$]40"
// mips64: "ADDV[U] [$]40"
// ppc64x: `ADD [$]40,\sR[0-9]+,\sR`
// riscv64: "ADDI \\$40" -"NEG"
// riscv64: "ADDI [$]40" -"NEG"
c := 40 - (-a)
return c
}
func SubSubFromConst(a int) int {
// arm64: "ADD \\$20"
// loong64: "ADDV[U] \\$20"
// mips: "ADD[U] \\$20"
// mips64: "ADDV[U] \\$20"
// arm64: "ADD [$]20"
// loong64: "ADDV[U] [$]20"
// mips: "ADD[U] [$]20"
// mips64: "ADDV[U] [$]20"
// ppc64x: `ADD [$]20,\sR[0-9]+,\sR`
// riscv64: "ADDI \\$20" -"NEG"
// riscv64: "ADDI [$]20" -"NEG"
c := 40 - (20 - a)
return c
}
func AddSubFromConst(a int) int {
// ppc64x: `SUBC R[0-9]+,\s[$]60,\sR`
// riscv64: "ADDI \\$-60" "NEG"
// riscv64: "ADDI [$]-60" "NEG"
c := 40 + (20 - a)
return c
}
func NegSubFromConst(a int) int {
// arm64: "SUB \\$20"
// loong64: "ADDV[U] \\$-20"
// mips: "ADD[U] \\$-20"
// mips64: "ADDV[U] \\$-20"
// arm64: "SUB [$]20"
// loong64: "ADDV[U] [$]-20"
// mips: "ADD[U] [$]-20"
// mips64: "ADDV[U] [$]-20"
// ppc64x: `ADD [$]-20,\sR[0-9]+,\sR`
// riscv64: "ADDI \\$-20"
// riscv64: "ADDI [$]-20"
c := -(20 - a)
return c
}
func NegAddFromConstNeg(a int) int {
// arm64: "SUB \\$40" "NEG"
// loong64: "ADDV[U] \\$-40" "SUBV"
// mips: "ADD[U] \\$-40" "SUB"
// mips64: "ADDV[U] \\$-40" "SUBV"
// arm64: "SUB [$]40" "NEG"
// loong64: "ADDV[U] [$]-40" "SUBV"
// mips: "ADD[U] [$]-40" "SUB"
// mips64: "ADDV[U] [$]-40" "SUBV"
// ppc64x: `SUBC R[0-9]+,\s[$]40,\sR`
// riscv64: "ADDI \\$-40" "NEG"
// riscv64: "ADDI [$]-40" "NEG"
c := -(-40 + a)
return c
}
@ -361,16 +357,16 @@ func Pow2Divs(n1 uint, n2 int) (uint, int) {
// Check that constant divisions get turned into MULs
func ConstDivs(n1 uint, n2 int) (uint, int) {
// amd64:"MOVQ [$]-1085102592571150095" "MULQ" -"DIVQ"
// 386:"MOVL [$]-252645135" "MULL" -"DIVL"
// arm64:`MOVD`,`UMULH`,-`DIV`
// arm:`MOVW`,`MUL`,-`.*udiv`
// amd64: "MOVQ [$]-1085102592571150095" "MULQ" -"DIVQ"
// 386: "MOVL [$]-252645135" "MULL" -"DIVL"
// arm64: `MOVD`,`UMULH`,-`DIV`
// arm: `MOVW`,`MUL`,-`.*udiv`
a := n1 / 17 // unsigned
// amd64:"MOVQ [$]-1085102592571150095" "IMULQ" -"IDIVQ"
// 386:"IMULL" -"IDIVL"
// arm64:`SMULH`,-`DIV`
// arm:`MOVW`,`MUL`,-`.*udiv`
// amd64: "MOVQ [$]-1085102592571150095" "IMULQ" -"IDIVQ"
// 386: "IMULL" "SARL [$]4," "SARL [$]31," "SUBL" -".*DIV"
// arm64: `SMULH` -`DIV`
// arm: `MOVW` `MUL` -`.*udiv`
b := n2 / 17 // signed
return a, b
@ -421,16 +417,16 @@ func Pow2DivisibleSigned(n1, n2 int) (bool, bool) {
// Check that constant modulo divs get turned into MULs
func ConstMods(n1 uint, n2 int) (uint, int) {
// amd64:"MOVQ [$]-1085102592571150095" "MULQ" -"DIVQ"
// 386:"MOVL [$]-252645135" "MULL" -"DIVL"
// arm64:`MOVD`,`UMULH`,-`DIV`
// arm:`MOVW`,`MUL`,-`.*udiv`
// amd64: "MOVQ [$]-1085102592571150095" "MULQ" -"DIVQ"
// 386: "MOVL [$]-252645135" "MULL" -".*DIVL"
// arm64: `MOVD` `UMULH` -`DIV`
// arm: `MOVW` `MUL` -`.*udiv`
a := n1 % 17 // unsigned
// amd64:"MOVQ [$]-1085102592571150095" "IMULQ" -"IDIVQ"
// 386: "IMULL" -"IDIVL"
// arm64:`SMULH`,-`DIV`
// arm:`MOVW`,`MUL`,-`.*udiv`
// amd64: "MOVQ [$]-1085102592571150095" "IMULQ" -"IDIVQ"
// 386: "IMULL" "SARL [$]4," "SARL [$]31," "SUBL" "SHLL [$]4," "SUBL" -".*DIV"
// arm64: `SMULH` -`DIV`
// arm: `MOVW` `MUL` -`.*udiv`
b := n2 % 17 // signed
return a, b
@ -675,12 +671,13 @@ func addSpecial(a, b, c uint32) (uint32, uint32, uint32) {
}
// Divide -> shift rules usually require fixup for negative inputs.
// If the input is non-negative, make sure the fixup is eliminated.
// If the input is non-negative, make sure the unsigned form is generated.
func divInt(v int64) int64 {
if v < 0 {
return 0
// amd64:`SARQ.*63,`, `SHRQ.*56,`, `SARQ.*8,`
return v / 256
}
// amd64:-`.*SARQ.*63,`, -".*SHRQ", ".*SARQ.*[$]9,"
// amd64:-`.*SARQ`, `SHRQ.*9,`
return v / 512
}
@ -721,9 +718,7 @@ func constantFold3(i, j int) int {
return r
}
// ----------------- //
// Integer Min/Max //
// ----------------- //
// Integer Min/Max
func Int64Min(a, b int64) int64 {
// amd64: "CMPQ" "CMOVQLT"

test/codegen/divmod.go (new file, 1115 lines; diff suppressed because it is too large)


@ -1,6 +1,6 @@
// errorcheck -0 -d=ssa/prove/debug=1
//go:build amd64
//go:build amd64 || arm64
// Copyright 2016 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
@ -1018,21 +1018,21 @@ func divShiftClean(n int) int {
if n < 0 {
return n
}
return n / int(8) // ERROR "Proved Rsh64x64 shifts to zero"
return n / int(8) // ERROR "Proved Div64 is unsigned$"
}
func divShiftClean64(n int64) int64 {
if n < 0 {
return n
}
return n / int64(16) // ERROR "Proved Rsh64x64 shifts to zero"
return n / int64(16) // ERROR "Proved Div64 is unsigned$"
}
func divShiftClean32(n int32) int32 {
if n < 0 {
return n
}
return n / int32(16) // ERROR "Proved Rsh32x64 shifts to zero"
return n / int32(16) // ERROR "Proved Div32 is unsigned$"
}
// Bounds check elimination
@ -1112,7 +1112,7 @@ func modu2(x, y uint) int {
}
func issue57077(s []int) (left, right []int) {
middle := len(s) / 2
middle := len(s) / 2 // ERROR "Proved Div64 is unsigned$"
left = s[:middle] // ERROR "Proved IsSliceInBounds$"
right = s[middle:] // ERROR "Proved IsSliceInBounds$"
return
@ -1501,7 +1501,7 @@ func mod64sPositiveWithSmallerDividendMax(a, b int64, ensureBothBranchesCouldHap
a = min(a, 0xff)
b = min(b, 0xfff)
z := a % b // ERROR "Proved Mod64 does not need fix-up$"
z := a % b // ERROR "Proved Mod64 is unsigned$"
if ensureBothBranchesCouldHappen {
if z > 0xff { // ERROR "Disproved Less64$"
@ -1521,7 +1521,7 @@ func mod64sPositiveWithSmallerDivisorMax(a, b int64, ensureBothBranchesCouldHapp
a = min(a, 0xfff)
b = min(b, 0xff)
z := a % b // ERROR "Proved Mod64 does not need fix-up$"
z := a % b // ERROR "Proved Mod64 is unsigned$"
if ensureBothBranchesCouldHappen {
if z > 0xff-1 { // ERROR "Disproved Less64$"
@ -1541,7 +1541,7 @@ func mod64sPositiveWithIdenticalMax(a, b int64, ensureBothBranchesCouldHappen bo
a = min(a, 0xfff)
b = min(b, 0xfff)
z := a % b // ERROR "Proved Mod64 does not need fix-up$"
z := a % b // ERROR "Proved Mod64 is unsigned$"
if ensureBothBranchesCouldHappen {
if z > 0xfff-1 { // ERROR "Disproved Less64$"
@ -1586,7 +1586,7 @@ func div64s(a, b int64, ensureAllBranchesCouldHappen func() bool) int64 {
b = min(b, 0xff)
b = max(b, 0xf)
z := a / b // ERROR "(Proved Div64 does not need fix-up|Proved Neq64)$"
z := a / b // ERROR "Proved Div64 is unsigned|Proved Neq64"
if ensureAllBranchesCouldHappen() && z > 0xffff/0xf { // ERROR "Disproved Less64$"
return 42
@ -2507,6 +2507,7 @@ func mulIntoAnd(a, b uint) uint {
}
return a * b // ERROR "Rewrote Mul v[0-9]+ into And$"
}
func mulIntoCondSelect(a, b uint) uint {
if a > 1 {
return 0
@ -2514,6 +2515,75 @@ func mulIntoCondSelect(a, b uint) uint {
return a * b // ERROR "Rewrote Mul v[0-9]+ into CondSelect"
}
func div7pos(x int32) bool {
if x > 0 {
return x%7 == 0 // ERROR "Proved Div32 is unsigned"
}
return false
}
func div2pos(x []int) int {
return len(x) / 2 // ERROR "Proved Div64 is unsigned"
}
func div3pos(x []int) int {
return len(x) / 3 // ERROR "Proved Div64 is unsigned"
}
var len200 [200]int
func modbound1(u uint64) int {
s := 0
for u > 0 {
var d uint64
u, d = u/100, u%100
s += len200[d*2+1] // ERROR "Proved IsInBounds"
}
return s
}
func modbound2(p *[10]int, x uint) int {
return p[x%9+1] // ERROR "Proved IsInBounds"
}
func shiftbound(x int) int {
return 1 << (x % 11) // ERROR "Proved Lsh(32x32|64x64) bounded" "Proved Div64 does not need fix-up"
}
func shiftbound2(x int) int {
return 1 << (x % 8) // ERROR "Proved Lsh(32x32|64x64) bounded" "Proved Div64 does not need fix-up"
}
func rangebound1(x []int) int {
s := 0
for i := range 1000 { // ERROR "Induction variable"
if i < len(x) {
s += x[i] // ERROR "Proved IsInBounds"
}
}
return s
}
func rangebound2(x []int) int {
s := 0
if len(x) > 0 {
for i := range 1000 { // ERROR "Induction variable"
s += x[i%len(x)] // ERROR "Proved Mod64 is unsigned" "Proved Neq64" "Proved IsInBounds"
}
}
return s
}
func swapbound(v []int) {
for i := 0; i < len(v)/2; i++ { // ERROR "Proved Div64 is unsigned|Induction variable"
v[i], // ERROR "Proved IsInBounds"
v[len(v)-1-i] = // ERROR "Proved IsInBounds"
v[len(v)-1-i],
v[i] // ERROR "Proved IsInBounds"
}
}
//go:noinline
func useInt(a int) {
}


@ -1,6 +1,6 @@
// errorcheck -0 -d=ssa/prove/debug=2
//go:build amd64
//go:build amd64 || arm64
// Copyright 2022 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
@ -17,7 +17,7 @@ func f0i(x int) int {
return x + 5 // ERROR "Proved.+is constant 0$" "Proved.+is constant 5$" "x\+d >=? w"
}
return x / 2
return x + 1
}
func f0u(x uint) uint {
@ -29,5 +29,5 @@ func f0u(x uint) uint {
return x + 5 // ERROR "Proved.+is constant 0$" "Proved.+is constant 5$" "x\+d >=? w"
}
return x / 2
return x + 1
}


@ -1,6 +1,6 @@
// errorcheck -0 -d=ssa/prove/debug=1
//go:build amd64
//go:build amd64 || arm64
package main