go/src/cmd/compile/internal/test/divconst_test.go

326 lines
7.7 KiB
Go
Raw Normal View History

// Copyright 2016 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package test
import (
"testing"
)
cmd/compile: add signed divisibility by power of 2 rules For powers of two (c=1<<k), the divisibility check x%c == 0 can be made just by checking the trailing zeroes via a mask x&(c-1) == 0 even for signed integers. This avoids division fix-ups when just divisibility check is needed. To apply this rule, we match on the fixed-up version of the division. This is neccessary because the mod and division rewrite rules are already applied during the initial opt pass. The speed up on amd64 due to elimination of unneccessary fix-up code is ~55%: name old time/op new time/op delta DivconstI64-4 2.08ns ± 0% 2.09ns ± 1% ~ (p=0.730 n=5+5) DivisiblePow2constI64-4 1.78ns ± 1% 0.81ns ± 1% -54.66% (p=0.008 n=5+5) DivconstU64-4 2.08ns ± 0% 2.08ns ± 0% ~ (p=0.683 n=5+5) DivconstI32-4 1.53ns ± 0% 1.53ns ± 1% ~ (p=0.968 n=4+5) DivisiblePow2constI32-4 1.79ns ± 1% 0.81ns ± 1% -54.97% (p=0.008 n=5+5) DivconstU32-4 1.78ns ± 1% 1.80ns ± 2% ~ (p=0.206 n=5+5) DivconstI16-4 1.54ns ± 2% 1.54ns ± 0% ~ (p=0.238 n=5+4) DivisiblePow2constI16-4 1.78ns ± 0% 0.81ns ± 1% -54.72% (p=0.000 n=4+5) DivconstU16-4 1.00ns ± 5% 1.01ns ± 1% ~ (p=0.119 n=5+5) DivconstI8-4 1.54ns ± 0% 1.54ns ± 2% ~ (p=0.571 n=4+5) DivisiblePow2constI8-4 1.78ns ± 0% 0.82ns ± 8% -53.71% (p=0.008 n=5+5) DivconstU8-4 0.93ns ± 1% 0.93ns ± 1% ~ (p=0.643 n=5+5) A follow-up CL will address the general case of x%c == 0 for signed integers. Updates #15806 Change-Id: Iabadbbe369b6e0998c8ce85d038ebc236142e42a Reviewed-on: https://go-review.googlesource.com/c/go/+/173557 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-04-23 22:04:38 -06:00
var boolres bool
var i64res int64
func BenchmarkDivconstI64(b *testing.B) {
for i := 0; i < b.N; i++ {
i64res = int64(i) / 7
}
}
cmd/compile: add unsigned divisibility rules "Division by invariant integers using multiplication" paper by Granlund and Montgomery contains a method for directly computing divisibility (x%c == 0 for c constant) by means of the modular inverse. The method is further elaborated in "Hacker's Delight" by Warren Section 10-17 This general rule can compute divisibilty by one multiplication and a compare for odd divisors and an additional rotate for even divisors. To apply the divisibility rule, we must take into account the rules to rewrite x%c = x-((x/c)*c) and (x/c) for c constant on the first optimization pass "opt". This complicates the matching as we want to match only in the cases where the result of (x/c) is not also available. So, we must match on the expanded form of (x/c) in the expression x == c*(x/c) in the "late opt" pass after common subexpresion elimination. Note, that if there is an intermediate opt pass introduced in the future we could simplify these rules by delaying the magic division rewrite to "late opt" and matching directly on (x/c) in the intermediate opt pass. Additional rules to lower the generic RotateLeft* ops were also applied. On amd64, the divisibility check is 25-50% faster. name old time/op new time/op delta DivconstI64-4 2.08ns ± 0% 2.08ns ± 1% ~ (p=0.881 n=5+5) DivisibleconstI64-4 2.67ns ± 0% 2.67ns ± 1% ~ (p=1.000 n=5+5) DivisibleWDivconstI64-4 2.67ns ± 0% 2.67ns ± 0% ~ (p=0.683 n=5+5) DivconstU64-4 2.08ns ± 1% 2.08ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstU64-4 2.77ns ± 1% 1.55ns ± 2% -43.90% (p=0.008 n=5+5) DivisibleWDivconstU64-4 2.99ns ± 1% 2.99ns ± 1% ~ (p=1.000 n=5+5) DivconstI32-4 1.53ns ± 2% 1.53ns ± 0% ~ (p=1.000 n=5+5) DivisibleconstI32-4 2.23ns ± 0% 2.25ns ± 3% ~ (p=0.167 n=5+5) DivisibleWDivconstI32-4 2.27ns ± 1% 2.27ns ± 1% ~ (p=0.429 n=5+5) DivconstU32-4 1.78ns ± 0% 1.78ns ± 1% ~ (p=1.000 n=4+5) DivisibleconstU32-4 2.52ns ± 2% 1.26ns ± 0% -49.96% (p=0.000 n=5+4) DivisibleWDivconstU32-4 2.63ns ± 0% 2.85ns ±10% +8.29% (p=0.016 n=4+5) DivconstI16-4 1.54ns ± 0% 1.54ns ± 0% ~ (p=0.333 n=4+5) DivisibleconstI16-4 2.10ns ± 0% 2.10ns ± 1% ~ (p=0.571 n=4+5) DivisibleWDivconstI16-4 2.22ns ± 0% 2.23ns ± 1% ~ (p=0.556 n=4+5) DivconstU16-4 1.09ns ± 0% 1.01ns ± 1% -7.74% (p=0.000 n=4+5) DivisibleconstU16-4 1.83ns ± 0% 1.26ns ± 0% -31.52% (p=0.008 n=5+5) DivisibleWDivconstU16-4 1.88ns ± 0% 1.89ns ± 1% ~ (p=0.365 n=5+5) DivconstI8-4 1.54ns ± 1% 1.54ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstI8-4 2.10ns ± 0% 2.11ns ± 0% ~ (p=0.238 n=5+4) DivisibleWDivconstI8-4 2.22ns ± 0% 2.23ns ± 2% ~ (p=0.762 n=5+5) DivconstU8-4 0.92ns ± 1% 0.94ns ± 1% +2.65% (p=0.008 n=5+5) DivisibleconstU8-4 1.66ns ± 0% 1.26ns ± 1% -24.28% (p=0.008 n=5+5) DivisibleWDivconstU8-4 1.79ns ± 0% 1.80ns ± 1% ~ (p=0.079 n=4+5) A follow-up change will address the signed division case. Updates #30282 Change-Id: I7e995f167179aa5c76bb10fbcbeb49c520943403 Reviewed-on: https://go-review.googlesource.com/c/go/+/168037 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-03-09 21:58:16 -07:00
func BenchmarkModconstI64(b *testing.B) {
for i := 0; i < b.N; i++ {
i64res = int64(i) % 7
}
}
cmd/compile: add signed divisibility by power of 2 rules For powers of two (c=1<<k), the divisibility check x%c == 0 can be made just by checking the trailing zeroes via a mask x&(c-1) == 0 even for signed integers. This avoids division fix-ups when just divisibility check is needed. To apply this rule, we match on the fixed-up version of the division. This is neccessary because the mod and division rewrite rules are already applied during the initial opt pass. The speed up on amd64 due to elimination of unneccessary fix-up code is ~55%: name old time/op new time/op delta DivconstI64-4 2.08ns ± 0% 2.09ns ± 1% ~ (p=0.730 n=5+5) DivisiblePow2constI64-4 1.78ns ± 1% 0.81ns ± 1% -54.66% (p=0.008 n=5+5) DivconstU64-4 2.08ns ± 0% 2.08ns ± 0% ~ (p=0.683 n=5+5) DivconstI32-4 1.53ns ± 0% 1.53ns ± 1% ~ (p=0.968 n=4+5) DivisiblePow2constI32-4 1.79ns ± 1% 0.81ns ± 1% -54.97% (p=0.008 n=5+5) DivconstU32-4 1.78ns ± 1% 1.80ns ± 2% ~ (p=0.206 n=5+5) DivconstI16-4 1.54ns ± 2% 1.54ns ± 0% ~ (p=0.238 n=5+4) DivisiblePow2constI16-4 1.78ns ± 0% 0.81ns ± 1% -54.72% (p=0.000 n=4+5) DivconstU16-4 1.00ns ± 5% 1.01ns ± 1% ~ (p=0.119 n=5+5) DivconstI8-4 1.54ns ± 0% 1.54ns ± 2% ~ (p=0.571 n=4+5) DivisiblePow2constI8-4 1.78ns ± 0% 0.82ns ± 8% -53.71% (p=0.008 n=5+5) DivconstU8-4 0.93ns ± 1% 0.93ns ± 1% ~ (p=0.643 n=5+5) A follow-up CL will address the general case of x%c == 0 for signed integers. Updates #15806 Change-Id: Iabadbbe369b6e0998c8ce85d038ebc236142e42a Reviewed-on: https://go-review.googlesource.com/c/go/+/173557 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-04-23 22:04:38 -06:00
func BenchmarkDivisiblePow2constI64(b *testing.B) {
for i := 0; i < b.N; i++ {
boolres = int64(i)%16 == 0
}
}
cmd/compile: add unsigned divisibility rules "Division by invariant integers using multiplication" paper by Granlund and Montgomery contains a method for directly computing divisibility (x%c == 0 for c constant) by means of the modular inverse. The method is further elaborated in "Hacker's Delight" by Warren Section 10-17 This general rule can compute divisibilty by one multiplication and a compare for odd divisors and an additional rotate for even divisors. To apply the divisibility rule, we must take into account the rules to rewrite x%c = x-((x/c)*c) and (x/c) for c constant on the first optimization pass "opt". This complicates the matching as we want to match only in the cases where the result of (x/c) is not also available. So, we must match on the expanded form of (x/c) in the expression x == c*(x/c) in the "late opt" pass after common subexpresion elimination. Note, that if there is an intermediate opt pass introduced in the future we could simplify these rules by delaying the magic division rewrite to "late opt" and matching directly on (x/c) in the intermediate opt pass. Additional rules to lower the generic RotateLeft* ops were also applied. On amd64, the divisibility check is 25-50% faster. name old time/op new time/op delta DivconstI64-4 2.08ns ± 0% 2.08ns ± 1% ~ (p=0.881 n=5+5) DivisibleconstI64-4 2.67ns ± 0% 2.67ns ± 1% ~ (p=1.000 n=5+5) DivisibleWDivconstI64-4 2.67ns ± 0% 2.67ns ± 0% ~ (p=0.683 n=5+5) DivconstU64-4 2.08ns ± 1% 2.08ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstU64-4 2.77ns ± 1% 1.55ns ± 2% -43.90% (p=0.008 n=5+5) DivisibleWDivconstU64-4 2.99ns ± 1% 2.99ns ± 1% ~ (p=1.000 n=5+5) DivconstI32-4 1.53ns ± 2% 1.53ns ± 0% ~ (p=1.000 n=5+5) DivisibleconstI32-4 2.23ns ± 0% 2.25ns ± 3% ~ (p=0.167 n=5+5) DivisibleWDivconstI32-4 2.27ns ± 1% 2.27ns ± 1% ~ (p=0.429 n=5+5) DivconstU32-4 1.78ns ± 0% 1.78ns ± 1% ~ (p=1.000 n=4+5) DivisibleconstU32-4 2.52ns ± 2% 1.26ns ± 0% -49.96% (p=0.000 n=5+4) DivisibleWDivconstU32-4 2.63ns ± 0% 2.85ns ±10% +8.29% (p=0.016 n=4+5) DivconstI16-4 1.54ns ± 0% 1.54ns ± 0% ~ (p=0.333 n=4+5) DivisibleconstI16-4 2.10ns ± 0% 2.10ns ± 1% ~ (p=0.571 n=4+5) DivisibleWDivconstI16-4 2.22ns ± 0% 2.23ns ± 1% ~ (p=0.556 n=4+5) DivconstU16-4 1.09ns ± 0% 1.01ns ± 1% -7.74% (p=0.000 n=4+5) DivisibleconstU16-4 1.83ns ± 0% 1.26ns ± 0% -31.52% (p=0.008 n=5+5) DivisibleWDivconstU16-4 1.88ns ± 0% 1.89ns ± 1% ~ (p=0.365 n=5+5) DivconstI8-4 1.54ns ± 1% 1.54ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstI8-4 2.10ns ± 0% 2.11ns ± 0% ~ (p=0.238 n=5+4) DivisibleWDivconstI8-4 2.22ns ± 0% 2.23ns ± 2% ~ (p=0.762 n=5+5) DivconstU8-4 0.92ns ± 1% 0.94ns ± 1% +2.65% (p=0.008 n=5+5) DivisibleconstU8-4 1.66ns ± 0% 1.26ns ± 1% -24.28% (p=0.008 n=5+5) DivisibleWDivconstU8-4 1.79ns ± 0% 1.80ns ± 1% ~ (p=0.079 n=4+5) A follow-up change will address the signed division case. Updates #30282 Change-Id: I7e995f167179aa5c76bb10fbcbeb49c520943403 Reviewed-on: https://go-review.googlesource.com/c/go/+/168037 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-03-09 21:58:16 -07:00
func BenchmarkDivisibleconstI64(b *testing.B) {
for i := 0; i < b.N; i++ {
boolres = int64(i)%7 == 0
}
}
func BenchmarkDivisibleWDivconstI64(b *testing.B) {
for i := 0; i < b.N; i++ {
i64res = int64(i) / 7
boolres = int64(i)%7 == 0
}
}
cmd/compile: add signed divisibility by power of 2 rules For powers of two (c=1<<k), the divisibility check x%c == 0 can be made just by checking the trailing zeroes via a mask x&(c-1) == 0 even for signed integers. This avoids division fix-ups when just divisibility check is needed. To apply this rule, we match on the fixed-up version of the division. This is neccessary because the mod and division rewrite rules are already applied during the initial opt pass. The speed up on amd64 due to elimination of unneccessary fix-up code is ~55%: name old time/op new time/op delta DivconstI64-4 2.08ns ± 0% 2.09ns ± 1% ~ (p=0.730 n=5+5) DivisiblePow2constI64-4 1.78ns ± 1% 0.81ns ± 1% -54.66% (p=0.008 n=5+5) DivconstU64-4 2.08ns ± 0% 2.08ns ± 0% ~ (p=0.683 n=5+5) DivconstI32-4 1.53ns ± 0% 1.53ns ± 1% ~ (p=0.968 n=4+5) DivisiblePow2constI32-4 1.79ns ± 1% 0.81ns ± 1% -54.97% (p=0.008 n=5+5) DivconstU32-4 1.78ns ± 1% 1.80ns ± 2% ~ (p=0.206 n=5+5) DivconstI16-4 1.54ns ± 2% 1.54ns ± 0% ~ (p=0.238 n=5+4) DivisiblePow2constI16-4 1.78ns ± 0% 0.81ns ± 1% -54.72% (p=0.000 n=4+5) DivconstU16-4 1.00ns ± 5% 1.01ns ± 1% ~ (p=0.119 n=5+5) DivconstI8-4 1.54ns ± 0% 1.54ns ± 2% ~ (p=0.571 n=4+5) DivisiblePow2constI8-4 1.78ns ± 0% 0.82ns ± 8% -53.71% (p=0.008 n=5+5) DivconstU8-4 0.93ns ± 1% 0.93ns ± 1% ~ (p=0.643 n=5+5) A follow-up CL will address the general case of x%c == 0 for signed integers. Updates #15806 Change-Id: Iabadbbe369b6e0998c8ce85d038ebc236142e42a Reviewed-on: https://go-review.googlesource.com/c/go/+/173557 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-04-23 22:04:38 -06:00
var u64res uint64
func TestDivmodConstU64(t *testing.T) {
// Test division by c. Function f must be func(n) { return n/c, n%c }
testdiv := func(c uint64, f func(uint64) (uint64, uint64)) func(*testing.T) {
return func(t *testing.T) {
x := uint64(12345)
for i := 0; i < 10000; i++ {
x += x << 2
q, r := f(x)
if r < 0 || r >= c || q*c+r != x {
t.Errorf("divmod(%d, %d) returned incorrect (%d, %d)", x, c, q, r)
}
}
max := uint64(1<<64-1) / c * c
xs := []uint64{0, 1, c - 1, c, c + 1, 2*c - 1, 2 * c, 2*c + 1,
c*c - 1, c * c, c*c + 1, max - 1, max, max + 1, 1<<64 - 1}
for _, x := range xs {
q, r := f(x)
if r < 0 || r >= c || q*c+r != x {
t.Errorf("divmod(%d, %d) returned incorrect (%d, %d)", x, c, q, r)
}
}
}
}
t.Run("2", testdiv(2, func(n uint64) (uint64, uint64) { return n / 2, n % 2 }))
t.Run("3", testdiv(3, func(n uint64) (uint64, uint64) { return n / 3, n % 3 }))
t.Run("4", testdiv(4, func(n uint64) (uint64, uint64) { return n / 4, n % 4 }))
t.Run("5", testdiv(5, func(n uint64) (uint64, uint64) { return n / 5, n % 5 }))
t.Run("6", testdiv(6, func(n uint64) (uint64, uint64) { return n / 6, n % 6 }))
t.Run("7", testdiv(7, func(n uint64) (uint64, uint64) { return n / 7, n % 7 }))
t.Run("8", testdiv(8, func(n uint64) (uint64, uint64) { return n / 8, n % 8 }))
t.Run("9", testdiv(9, func(n uint64) (uint64, uint64) { return n / 9, n % 9 }))
t.Run("10", testdiv(10, func(n uint64) (uint64, uint64) { return n / 10, n % 10 }))
t.Run("11", testdiv(11, func(n uint64) (uint64, uint64) { return n / 11, n % 11 }))
t.Run("12", testdiv(12, func(n uint64) (uint64, uint64) { return n / 12, n % 12 }))
t.Run("13", testdiv(13, func(n uint64) (uint64, uint64) { return n / 13, n % 13 }))
t.Run("14", testdiv(14, func(n uint64) (uint64, uint64) { return n / 14, n % 14 }))
t.Run("15", testdiv(15, func(n uint64) (uint64, uint64) { return n / 15, n % 15 }))
t.Run("16", testdiv(16, func(n uint64) (uint64, uint64) { return n / 16, n % 16 }))
t.Run("17", testdiv(17, func(n uint64) (uint64, uint64) { return n / 17, n % 17 }))
t.Run("255", testdiv(255, func(n uint64) (uint64, uint64) { return n / 255, n % 255 }))
t.Run("256", testdiv(256, func(n uint64) (uint64, uint64) { return n / 256, n % 256 }))
t.Run("257", testdiv(257, func(n uint64) (uint64, uint64) { return n / 257, n % 257 }))
t.Run("65535", testdiv(65535, func(n uint64) (uint64, uint64) { return n / 65535, n % 65535 }))
t.Run("65536", testdiv(65536, func(n uint64) (uint64, uint64) { return n / 65536, n % 65536 }))
t.Run("65537", testdiv(65537, func(n uint64) (uint64, uint64) { return n / 65537, n % 65537 }))
t.Run("1<<32-1", testdiv(1<<32-1, func(n uint64) (uint64, uint64) { return n / (1<<32 - 1), n % (1<<32 - 1) }))
t.Run("1<<32+1", testdiv(1<<32+1, func(n uint64) (uint64, uint64) { return n / (1<<32 + 1), n % (1<<32 + 1) }))
t.Run("1<<64-1", testdiv(1<<64-1, func(n uint64) (uint64, uint64) { return n / (1<<64 - 1), n % (1<<64 - 1) }))
}
func BenchmarkDivconstU64(b *testing.B) {
b.Run("3", func(b *testing.B) {
x := uint64(123456789123456789)
for i := 0; i < b.N; i++ {
x += x << 4
u64res = uint64(x) / 3
}
})
b.Run("5", func(b *testing.B) {
x := uint64(123456789123456789)
for i := 0; i < b.N; i++ {
x += x << 4
u64res = uint64(x) / 5
}
})
b.Run("37", func(b *testing.B) {
x := uint64(123456789123456789)
for i := 0; i < b.N; i++ {
x += x << 4
u64res = uint64(x) / 37
}
})
b.Run("1234567", func(b *testing.B) {
x := uint64(123456789123456789)
for i := 0; i < b.N; i++ {
x += x << 4
u64res = uint64(x) / 1234567
}
})
}
cmd/compile: add unsigned divisibility rules "Division by invariant integers using multiplication" paper by Granlund and Montgomery contains a method for directly computing divisibility (x%c == 0 for c constant) by means of the modular inverse. The method is further elaborated in "Hacker's Delight" by Warren Section 10-17 This general rule can compute divisibilty by one multiplication and a compare for odd divisors and an additional rotate for even divisors. To apply the divisibility rule, we must take into account the rules to rewrite x%c = x-((x/c)*c) and (x/c) for c constant on the first optimization pass "opt". This complicates the matching as we want to match only in the cases where the result of (x/c) is not also available. So, we must match on the expanded form of (x/c) in the expression x == c*(x/c) in the "late opt" pass after common subexpresion elimination. Note, that if there is an intermediate opt pass introduced in the future we could simplify these rules by delaying the magic division rewrite to "late opt" and matching directly on (x/c) in the intermediate opt pass. Additional rules to lower the generic RotateLeft* ops were also applied. On amd64, the divisibility check is 25-50% faster. name old time/op new time/op delta DivconstI64-4 2.08ns ± 0% 2.08ns ± 1% ~ (p=0.881 n=5+5) DivisibleconstI64-4 2.67ns ± 0% 2.67ns ± 1% ~ (p=1.000 n=5+5) DivisibleWDivconstI64-4 2.67ns ± 0% 2.67ns ± 0% ~ (p=0.683 n=5+5) DivconstU64-4 2.08ns ± 1% 2.08ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstU64-4 2.77ns ± 1% 1.55ns ± 2% -43.90% (p=0.008 n=5+5) DivisibleWDivconstU64-4 2.99ns ± 1% 2.99ns ± 1% ~ (p=1.000 n=5+5) DivconstI32-4 1.53ns ± 2% 1.53ns ± 0% ~ (p=1.000 n=5+5) DivisibleconstI32-4 2.23ns ± 0% 2.25ns ± 3% ~ (p=0.167 n=5+5) DivisibleWDivconstI32-4 2.27ns ± 1% 2.27ns ± 1% ~ (p=0.429 n=5+5) DivconstU32-4 1.78ns ± 0% 1.78ns ± 1% ~ (p=1.000 n=4+5) DivisibleconstU32-4 2.52ns ± 2% 1.26ns ± 0% -49.96% (p=0.000 n=5+4) DivisibleWDivconstU32-4 2.63ns ± 0% 2.85ns ±10% +8.29% (p=0.016 n=4+5) DivconstI16-4 1.54ns ± 0% 1.54ns ± 0% ~ (p=0.333 n=4+5) DivisibleconstI16-4 2.10ns ± 0% 2.10ns ± 1% ~ (p=0.571 n=4+5) DivisibleWDivconstI16-4 2.22ns ± 0% 2.23ns ± 1% ~ (p=0.556 n=4+5) DivconstU16-4 1.09ns ± 0% 1.01ns ± 1% -7.74% (p=0.000 n=4+5) DivisibleconstU16-4 1.83ns ± 0% 1.26ns ± 0% -31.52% (p=0.008 n=5+5) DivisibleWDivconstU16-4 1.88ns ± 0% 1.89ns ± 1% ~ (p=0.365 n=5+5) DivconstI8-4 1.54ns ± 1% 1.54ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstI8-4 2.10ns ± 0% 2.11ns ± 0% ~ (p=0.238 n=5+4) DivisibleWDivconstI8-4 2.22ns ± 0% 2.23ns ± 2% ~ (p=0.762 n=5+5) DivconstU8-4 0.92ns ± 1% 0.94ns ± 1% +2.65% (p=0.008 n=5+5) DivisibleconstU8-4 1.66ns ± 0% 1.26ns ± 1% -24.28% (p=0.008 n=5+5) DivisibleWDivconstU8-4 1.79ns ± 0% 1.80ns ± 1% ~ (p=0.079 n=4+5) A follow-up change will address the signed division case. Updates #30282 Change-Id: I7e995f167179aa5c76bb10fbcbeb49c520943403 Reviewed-on: https://go-review.googlesource.com/c/go/+/168037 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-03-09 21:58:16 -07:00
func BenchmarkModconstU64(b *testing.B) {
for i := 0; i < b.N; i++ {
u64res = uint64(i) % 7
}
}
func BenchmarkDivisibleconstU64(b *testing.B) {
for i := 0; i < b.N; i++ {
boolres = uint64(i)%7 == 0
}
}
func BenchmarkDivisibleWDivconstU64(b *testing.B) {
for i := 0; i < b.N; i++ {
u64res = uint64(i) / 7
boolres = uint64(i)%7 == 0
}
}
var i32res int32
func BenchmarkDivconstI32(b *testing.B) {
for i := 0; i < b.N; i++ {
i32res = int32(i) / 7
}
}
cmd/compile: add unsigned divisibility rules "Division by invariant integers using multiplication" paper by Granlund and Montgomery contains a method for directly computing divisibility (x%c == 0 for c constant) by means of the modular inverse. The method is further elaborated in "Hacker's Delight" by Warren Section 10-17 This general rule can compute divisibilty by one multiplication and a compare for odd divisors and an additional rotate for even divisors. To apply the divisibility rule, we must take into account the rules to rewrite x%c = x-((x/c)*c) and (x/c) for c constant on the first optimization pass "opt". This complicates the matching as we want to match only in the cases where the result of (x/c) is not also available. So, we must match on the expanded form of (x/c) in the expression x == c*(x/c) in the "late opt" pass after common subexpresion elimination. Note, that if there is an intermediate opt pass introduced in the future we could simplify these rules by delaying the magic division rewrite to "late opt" and matching directly on (x/c) in the intermediate opt pass. Additional rules to lower the generic RotateLeft* ops were also applied. On amd64, the divisibility check is 25-50% faster. name old time/op new time/op delta DivconstI64-4 2.08ns ± 0% 2.08ns ± 1% ~ (p=0.881 n=5+5) DivisibleconstI64-4 2.67ns ± 0% 2.67ns ± 1% ~ (p=1.000 n=5+5) DivisibleWDivconstI64-4 2.67ns ± 0% 2.67ns ± 0% ~ (p=0.683 n=5+5) DivconstU64-4 2.08ns ± 1% 2.08ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstU64-4 2.77ns ± 1% 1.55ns ± 2% -43.90% (p=0.008 n=5+5) DivisibleWDivconstU64-4 2.99ns ± 1% 2.99ns ± 1% ~ (p=1.000 n=5+5) DivconstI32-4 1.53ns ± 2% 1.53ns ± 0% ~ (p=1.000 n=5+5) DivisibleconstI32-4 2.23ns ± 0% 2.25ns ± 3% ~ (p=0.167 n=5+5) DivisibleWDivconstI32-4 2.27ns ± 1% 2.27ns ± 1% ~ (p=0.429 n=5+5) DivconstU32-4 1.78ns ± 0% 1.78ns ± 1% ~ (p=1.000 n=4+5) DivisibleconstU32-4 2.52ns ± 2% 1.26ns ± 0% -49.96% (p=0.000 n=5+4) DivisibleWDivconstU32-4 2.63ns ± 0% 2.85ns ±10% +8.29% (p=0.016 n=4+5) DivconstI16-4 1.54ns ± 0% 1.54ns ± 0% ~ (p=0.333 n=4+5) DivisibleconstI16-4 2.10ns ± 0% 2.10ns ± 1% ~ (p=0.571 n=4+5) DivisibleWDivconstI16-4 2.22ns ± 0% 2.23ns ± 1% ~ (p=0.556 n=4+5) DivconstU16-4 1.09ns ± 0% 1.01ns ± 1% -7.74% (p=0.000 n=4+5) DivisibleconstU16-4 1.83ns ± 0% 1.26ns ± 0% -31.52% (p=0.008 n=5+5) DivisibleWDivconstU16-4 1.88ns ± 0% 1.89ns ± 1% ~ (p=0.365 n=5+5) DivconstI8-4 1.54ns ± 1% 1.54ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstI8-4 2.10ns ± 0% 2.11ns ± 0% ~ (p=0.238 n=5+4) DivisibleWDivconstI8-4 2.22ns ± 0% 2.23ns ± 2% ~ (p=0.762 n=5+5) DivconstU8-4 0.92ns ± 1% 0.94ns ± 1% +2.65% (p=0.008 n=5+5) DivisibleconstU8-4 1.66ns ± 0% 1.26ns ± 1% -24.28% (p=0.008 n=5+5) DivisibleWDivconstU8-4 1.79ns ± 0% 1.80ns ± 1% ~ (p=0.079 n=4+5) A follow-up change will address the signed division case. Updates #30282 Change-Id: I7e995f167179aa5c76bb10fbcbeb49c520943403 Reviewed-on: https://go-review.googlesource.com/c/go/+/168037 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-03-09 21:58:16 -07:00
func BenchmarkModconstI32(b *testing.B) {
for i := 0; i < b.N; i++ {
i32res = int32(i) % 7
}
}
cmd/compile: add signed divisibility by power of 2 rules For powers of two (c=1<<k), the divisibility check x%c == 0 can be made just by checking the trailing zeroes via a mask x&(c-1) == 0 even for signed integers. This avoids division fix-ups when just divisibility check is needed. To apply this rule, we match on the fixed-up version of the division. This is neccessary because the mod and division rewrite rules are already applied during the initial opt pass. The speed up on amd64 due to elimination of unneccessary fix-up code is ~55%: name old time/op new time/op delta DivconstI64-4 2.08ns ± 0% 2.09ns ± 1% ~ (p=0.730 n=5+5) DivisiblePow2constI64-4 1.78ns ± 1% 0.81ns ± 1% -54.66% (p=0.008 n=5+5) DivconstU64-4 2.08ns ± 0% 2.08ns ± 0% ~ (p=0.683 n=5+5) DivconstI32-4 1.53ns ± 0% 1.53ns ± 1% ~ (p=0.968 n=4+5) DivisiblePow2constI32-4 1.79ns ± 1% 0.81ns ± 1% -54.97% (p=0.008 n=5+5) DivconstU32-4 1.78ns ± 1% 1.80ns ± 2% ~ (p=0.206 n=5+5) DivconstI16-4 1.54ns ± 2% 1.54ns ± 0% ~ (p=0.238 n=5+4) DivisiblePow2constI16-4 1.78ns ± 0% 0.81ns ± 1% -54.72% (p=0.000 n=4+5) DivconstU16-4 1.00ns ± 5% 1.01ns ± 1% ~ (p=0.119 n=5+5) DivconstI8-4 1.54ns ± 0% 1.54ns ± 2% ~ (p=0.571 n=4+5) DivisiblePow2constI8-4 1.78ns ± 0% 0.82ns ± 8% -53.71% (p=0.008 n=5+5) DivconstU8-4 0.93ns ± 1% 0.93ns ± 1% ~ (p=0.643 n=5+5) A follow-up CL will address the general case of x%c == 0 for signed integers. Updates #15806 Change-Id: Iabadbbe369b6e0998c8ce85d038ebc236142e42a Reviewed-on: https://go-review.googlesource.com/c/go/+/173557 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-04-23 22:04:38 -06:00
func BenchmarkDivisiblePow2constI32(b *testing.B) {
for i := 0; i < b.N; i++ {
boolres = int32(i)%16 == 0
}
}
cmd/compile: add unsigned divisibility rules "Division by invariant integers using multiplication" paper by Granlund and Montgomery contains a method for directly computing divisibility (x%c == 0 for c constant) by means of the modular inverse. The method is further elaborated in "Hacker's Delight" by Warren Section 10-17 This general rule can compute divisibilty by one multiplication and a compare for odd divisors and an additional rotate for even divisors. To apply the divisibility rule, we must take into account the rules to rewrite x%c = x-((x/c)*c) and (x/c) for c constant on the first optimization pass "opt". This complicates the matching as we want to match only in the cases where the result of (x/c) is not also available. So, we must match on the expanded form of (x/c) in the expression x == c*(x/c) in the "late opt" pass after common subexpresion elimination. Note, that if there is an intermediate opt pass introduced in the future we could simplify these rules by delaying the magic division rewrite to "late opt" and matching directly on (x/c) in the intermediate opt pass. Additional rules to lower the generic RotateLeft* ops were also applied. On amd64, the divisibility check is 25-50% faster. name old time/op new time/op delta DivconstI64-4 2.08ns ± 0% 2.08ns ± 1% ~ (p=0.881 n=5+5) DivisibleconstI64-4 2.67ns ± 0% 2.67ns ± 1% ~ (p=1.000 n=5+5) DivisibleWDivconstI64-4 2.67ns ± 0% 2.67ns ± 0% ~ (p=0.683 n=5+5) DivconstU64-4 2.08ns ± 1% 2.08ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstU64-4 2.77ns ± 1% 1.55ns ± 2% -43.90% (p=0.008 n=5+5) DivisibleWDivconstU64-4 2.99ns ± 1% 2.99ns ± 1% ~ (p=1.000 n=5+5) DivconstI32-4 1.53ns ± 2% 1.53ns ± 0% ~ (p=1.000 n=5+5) DivisibleconstI32-4 2.23ns ± 0% 2.25ns ± 3% ~ (p=0.167 n=5+5) DivisibleWDivconstI32-4 2.27ns ± 1% 2.27ns ± 1% ~ (p=0.429 n=5+5) DivconstU32-4 1.78ns ± 0% 1.78ns ± 1% ~ (p=1.000 n=4+5) DivisibleconstU32-4 2.52ns ± 2% 1.26ns ± 0% -49.96% (p=0.000 n=5+4) DivisibleWDivconstU32-4 2.63ns ± 0% 2.85ns ±10% +8.29% (p=0.016 n=4+5) DivconstI16-4 1.54ns ± 0% 1.54ns ± 0% ~ (p=0.333 n=4+5) DivisibleconstI16-4 2.10ns ± 0% 2.10ns ± 1% ~ (p=0.571 n=4+5) DivisibleWDivconstI16-4 2.22ns ± 0% 2.23ns ± 1% ~ (p=0.556 n=4+5) DivconstU16-4 1.09ns ± 0% 1.01ns ± 1% -7.74% (p=0.000 n=4+5) DivisibleconstU16-4 1.83ns ± 0% 1.26ns ± 0% -31.52% (p=0.008 n=5+5) DivisibleWDivconstU16-4 1.88ns ± 0% 1.89ns ± 1% ~ (p=0.365 n=5+5) DivconstI8-4 1.54ns ± 1% 1.54ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstI8-4 2.10ns ± 0% 2.11ns ± 0% ~ (p=0.238 n=5+4) DivisibleWDivconstI8-4 2.22ns ± 0% 2.23ns ± 2% ~ (p=0.762 n=5+5) DivconstU8-4 0.92ns ± 1% 0.94ns ± 1% +2.65% (p=0.008 n=5+5) DivisibleconstU8-4 1.66ns ± 0% 1.26ns ± 1% -24.28% (p=0.008 n=5+5) DivisibleWDivconstU8-4 1.79ns ± 0% 1.80ns ± 1% ~ (p=0.079 n=4+5) A follow-up change will address the signed division case. Updates #30282 Change-Id: I7e995f167179aa5c76bb10fbcbeb49c520943403 Reviewed-on: https://go-review.googlesource.com/c/go/+/168037 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-03-09 21:58:16 -07:00
func BenchmarkDivisibleconstI32(b *testing.B) {
for i := 0; i < b.N; i++ {
boolres = int32(i)%7 == 0
}
}
func BenchmarkDivisibleWDivconstI32(b *testing.B) {
for i := 0; i < b.N; i++ {
i32res = int32(i) / 7
boolres = int32(i)%7 == 0
}
}
var u32res uint32
func BenchmarkDivconstU32(b *testing.B) {
for i := 0; i < b.N; i++ {
u32res = uint32(i) / 7
}
}
cmd/compile: add unsigned divisibility rules "Division by invariant integers using multiplication" paper by Granlund and Montgomery contains a method for directly computing divisibility (x%c == 0 for c constant) by means of the modular inverse. The method is further elaborated in "Hacker's Delight" by Warren Section 10-17 This general rule can compute divisibilty by one multiplication and a compare for odd divisors and an additional rotate for even divisors. To apply the divisibility rule, we must take into account the rules to rewrite x%c = x-((x/c)*c) and (x/c) for c constant on the first optimization pass "opt". This complicates the matching as we want to match only in the cases where the result of (x/c) is not also available. So, we must match on the expanded form of (x/c) in the expression x == c*(x/c) in the "late opt" pass after common subexpresion elimination. Note, that if there is an intermediate opt pass introduced in the future we could simplify these rules by delaying the magic division rewrite to "late opt" and matching directly on (x/c) in the intermediate opt pass. Additional rules to lower the generic RotateLeft* ops were also applied. On amd64, the divisibility check is 25-50% faster. name old time/op new time/op delta DivconstI64-4 2.08ns ± 0% 2.08ns ± 1% ~ (p=0.881 n=5+5) DivisibleconstI64-4 2.67ns ± 0% 2.67ns ± 1% ~ (p=1.000 n=5+5) DivisibleWDivconstI64-4 2.67ns ± 0% 2.67ns ± 0% ~ (p=0.683 n=5+5) DivconstU64-4 2.08ns ± 1% 2.08ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstU64-4 2.77ns ± 1% 1.55ns ± 2% -43.90% (p=0.008 n=5+5) DivisibleWDivconstU64-4 2.99ns ± 1% 2.99ns ± 1% ~ (p=1.000 n=5+5) DivconstI32-4 1.53ns ± 2% 1.53ns ± 0% ~ (p=1.000 n=5+5) DivisibleconstI32-4 2.23ns ± 0% 2.25ns ± 3% ~ (p=0.167 n=5+5) DivisibleWDivconstI32-4 2.27ns ± 1% 2.27ns ± 1% ~ (p=0.429 n=5+5) DivconstU32-4 1.78ns ± 0% 1.78ns ± 1% ~ (p=1.000 n=4+5) DivisibleconstU32-4 2.52ns ± 2% 1.26ns ± 0% -49.96% (p=0.000 n=5+4) DivisibleWDivconstU32-4 2.63ns ± 0% 2.85ns ±10% +8.29% (p=0.016 n=4+5) DivconstI16-4 1.54ns ± 0% 1.54ns ± 0% ~ (p=0.333 n=4+5) DivisibleconstI16-4 2.10ns ± 0% 2.10ns ± 1% ~ (p=0.571 n=4+5) DivisibleWDivconstI16-4 2.22ns ± 0% 2.23ns ± 1% ~ (p=0.556 n=4+5) DivconstU16-4 1.09ns ± 0% 1.01ns ± 1% -7.74% (p=0.000 n=4+5) DivisibleconstU16-4 1.83ns ± 0% 1.26ns ± 0% -31.52% (p=0.008 n=5+5) DivisibleWDivconstU16-4 1.88ns ± 0% 1.89ns ± 1% ~ (p=0.365 n=5+5) DivconstI8-4 1.54ns ± 1% 1.54ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstI8-4 2.10ns ± 0% 2.11ns ± 0% ~ (p=0.238 n=5+4) DivisibleWDivconstI8-4 2.22ns ± 0% 2.23ns ± 2% ~ (p=0.762 n=5+5) DivconstU8-4 0.92ns ± 1% 0.94ns ± 1% +2.65% (p=0.008 n=5+5) DivisibleconstU8-4 1.66ns ± 0% 1.26ns ± 1% -24.28% (p=0.008 n=5+5) DivisibleWDivconstU8-4 1.79ns ± 0% 1.80ns ± 1% ~ (p=0.079 n=4+5) A follow-up change will address the signed division case. Updates #30282 Change-Id: I7e995f167179aa5c76bb10fbcbeb49c520943403 Reviewed-on: https://go-review.googlesource.com/c/go/+/168037 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-03-09 21:58:16 -07:00
func BenchmarkModconstU32(b *testing.B) {
for i := 0; i < b.N; i++ {
u32res = uint32(i) % 7
}
}
func BenchmarkDivisibleconstU32(b *testing.B) {
for i := 0; i < b.N; i++ {
boolres = uint32(i)%7 == 0
}
}
func BenchmarkDivisibleWDivconstU32(b *testing.B) {
for i := 0; i < b.N; i++ {
u32res = uint32(i) / 7
boolres = uint32(i)%7 == 0
}
}
var i16res int16
func BenchmarkDivconstI16(b *testing.B) {
for i := 0; i < b.N; i++ {
i16res = int16(i) / 7
}
}
cmd/compile: add unsigned divisibility rules "Division by invariant integers using multiplication" paper by Granlund and Montgomery contains a method for directly computing divisibility (x%c == 0 for c constant) by means of the modular inverse. The method is further elaborated in "Hacker's Delight" by Warren Section 10-17 This general rule can compute divisibilty by one multiplication and a compare for odd divisors and an additional rotate for even divisors. To apply the divisibility rule, we must take into account the rules to rewrite x%c = x-((x/c)*c) and (x/c) for c constant on the first optimization pass "opt". This complicates the matching as we want to match only in the cases where the result of (x/c) is not also available. So, we must match on the expanded form of (x/c) in the expression x == c*(x/c) in the "late opt" pass after common subexpresion elimination. Note, that if there is an intermediate opt pass introduced in the future we could simplify these rules by delaying the magic division rewrite to "late opt" and matching directly on (x/c) in the intermediate opt pass. Additional rules to lower the generic RotateLeft* ops were also applied. On amd64, the divisibility check is 25-50% faster. name old time/op new time/op delta DivconstI64-4 2.08ns ± 0% 2.08ns ± 1% ~ (p=0.881 n=5+5) DivisibleconstI64-4 2.67ns ± 0% 2.67ns ± 1% ~ (p=1.000 n=5+5) DivisibleWDivconstI64-4 2.67ns ± 0% 2.67ns ± 0% ~ (p=0.683 n=5+5) DivconstU64-4 2.08ns ± 1% 2.08ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstU64-4 2.77ns ± 1% 1.55ns ± 2% -43.90% (p=0.008 n=5+5) DivisibleWDivconstU64-4 2.99ns ± 1% 2.99ns ± 1% ~ (p=1.000 n=5+5) DivconstI32-4 1.53ns ± 2% 1.53ns ± 0% ~ (p=1.000 n=5+5) DivisibleconstI32-4 2.23ns ± 0% 2.25ns ± 3% ~ (p=0.167 n=5+5) DivisibleWDivconstI32-4 2.27ns ± 1% 2.27ns ± 1% ~ (p=0.429 n=5+5) DivconstU32-4 1.78ns ± 0% 1.78ns ± 1% ~ (p=1.000 n=4+5) DivisibleconstU32-4 2.52ns ± 2% 1.26ns ± 0% -49.96% (p=0.000 n=5+4) DivisibleWDivconstU32-4 2.63ns ± 0% 2.85ns ±10% +8.29% (p=0.016 n=4+5) DivconstI16-4 1.54ns ± 0% 1.54ns ± 0% ~ (p=0.333 n=4+5) DivisibleconstI16-4 2.10ns ± 0% 2.10ns ± 1% ~ (p=0.571 n=4+5) DivisibleWDivconstI16-4 2.22ns ± 0% 2.23ns ± 1% ~ (p=0.556 n=4+5) DivconstU16-4 1.09ns ± 0% 1.01ns ± 1% -7.74% (p=0.000 n=4+5) DivisibleconstU16-4 1.83ns ± 0% 1.26ns ± 0% -31.52% (p=0.008 n=5+5) DivisibleWDivconstU16-4 1.88ns ± 0% 1.89ns ± 1% ~ (p=0.365 n=5+5) DivconstI8-4 1.54ns ± 1% 1.54ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstI8-4 2.10ns ± 0% 2.11ns ± 0% ~ (p=0.238 n=5+4) DivisibleWDivconstI8-4 2.22ns ± 0% 2.23ns ± 2% ~ (p=0.762 n=5+5) DivconstU8-4 0.92ns ± 1% 0.94ns ± 1% +2.65% (p=0.008 n=5+5) DivisibleconstU8-4 1.66ns ± 0% 1.26ns ± 1% -24.28% (p=0.008 n=5+5) DivisibleWDivconstU8-4 1.79ns ± 0% 1.80ns ± 1% ~ (p=0.079 n=4+5) A follow-up change will address the signed division case. Updates #30282 Change-Id: I7e995f167179aa5c76bb10fbcbeb49c520943403 Reviewed-on: https://go-review.googlesource.com/c/go/+/168037 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-03-09 21:58:16 -07:00
func BenchmarkModconstI16(b *testing.B) {
for i := 0; i < b.N; i++ {
i16res = int16(i) % 7
}
}
cmd/compile: add signed divisibility by power of 2 rules For powers of two (c=1<<k), the divisibility check x%c == 0 can be made just by checking the trailing zeroes via a mask x&(c-1) == 0 even for signed integers. This avoids division fix-ups when just divisibility check is needed. To apply this rule, we match on the fixed-up version of the division. This is neccessary because the mod and division rewrite rules are already applied during the initial opt pass. The speed up on amd64 due to elimination of unneccessary fix-up code is ~55%: name old time/op new time/op delta DivconstI64-4 2.08ns ± 0% 2.09ns ± 1% ~ (p=0.730 n=5+5) DivisiblePow2constI64-4 1.78ns ± 1% 0.81ns ± 1% -54.66% (p=0.008 n=5+5) DivconstU64-4 2.08ns ± 0% 2.08ns ± 0% ~ (p=0.683 n=5+5) DivconstI32-4 1.53ns ± 0% 1.53ns ± 1% ~ (p=0.968 n=4+5) DivisiblePow2constI32-4 1.79ns ± 1% 0.81ns ± 1% -54.97% (p=0.008 n=5+5) DivconstU32-4 1.78ns ± 1% 1.80ns ± 2% ~ (p=0.206 n=5+5) DivconstI16-4 1.54ns ± 2% 1.54ns ± 0% ~ (p=0.238 n=5+4) DivisiblePow2constI16-4 1.78ns ± 0% 0.81ns ± 1% -54.72% (p=0.000 n=4+5) DivconstU16-4 1.00ns ± 5% 1.01ns ± 1% ~ (p=0.119 n=5+5) DivconstI8-4 1.54ns ± 0% 1.54ns ± 2% ~ (p=0.571 n=4+5) DivisiblePow2constI8-4 1.78ns ± 0% 0.82ns ± 8% -53.71% (p=0.008 n=5+5) DivconstU8-4 0.93ns ± 1% 0.93ns ± 1% ~ (p=0.643 n=5+5) A follow-up CL will address the general case of x%c == 0 for signed integers. Updates #15806 Change-Id: Iabadbbe369b6e0998c8ce85d038ebc236142e42a Reviewed-on: https://go-review.googlesource.com/c/go/+/173557 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-04-23 22:04:38 -06:00
func BenchmarkDivisiblePow2constI16(b *testing.B) {
for i := 0; i < b.N; i++ {
boolres = int16(i)%16 == 0
}
}
cmd/compile: add unsigned divisibility rules "Division by invariant integers using multiplication" paper by Granlund and Montgomery contains a method for directly computing divisibility (x%c == 0 for c constant) by means of the modular inverse. The method is further elaborated in "Hacker's Delight" by Warren Section 10-17 This general rule can compute divisibilty by one multiplication and a compare for odd divisors and an additional rotate for even divisors. To apply the divisibility rule, we must take into account the rules to rewrite x%c = x-((x/c)*c) and (x/c) for c constant on the first optimization pass "opt". This complicates the matching as we want to match only in the cases where the result of (x/c) is not also available. So, we must match on the expanded form of (x/c) in the expression x == c*(x/c) in the "late opt" pass after common subexpresion elimination. Note, that if there is an intermediate opt pass introduced in the future we could simplify these rules by delaying the magic division rewrite to "late opt" and matching directly on (x/c) in the intermediate opt pass. Additional rules to lower the generic RotateLeft* ops were also applied. On amd64, the divisibility check is 25-50% faster. name old time/op new time/op delta DivconstI64-4 2.08ns ± 0% 2.08ns ± 1% ~ (p=0.881 n=5+5) DivisibleconstI64-4 2.67ns ± 0% 2.67ns ± 1% ~ (p=1.000 n=5+5) DivisibleWDivconstI64-4 2.67ns ± 0% 2.67ns ± 0% ~ (p=0.683 n=5+5) DivconstU64-4 2.08ns ± 1% 2.08ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstU64-4 2.77ns ± 1% 1.55ns ± 2% -43.90% (p=0.008 n=5+5) DivisibleWDivconstU64-4 2.99ns ± 1% 2.99ns ± 1% ~ (p=1.000 n=5+5) DivconstI32-4 1.53ns ± 2% 1.53ns ± 0% ~ (p=1.000 n=5+5) DivisibleconstI32-4 2.23ns ± 0% 2.25ns ± 3% ~ (p=0.167 n=5+5) DivisibleWDivconstI32-4 2.27ns ± 1% 2.27ns ± 1% ~ (p=0.429 n=5+5) DivconstU32-4 1.78ns ± 0% 1.78ns ± 1% ~ (p=1.000 n=4+5) DivisibleconstU32-4 2.52ns ± 2% 1.26ns ± 0% -49.96% (p=0.000 n=5+4) DivisibleWDivconstU32-4 2.63ns ± 0% 2.85ns ±10% +8.29% (p=0.016 n=4+5) DivconstI16-4 1.54ns ± 0% 1.54ns ± 0% ~ (p=0.333 n=4+5) DivisibleconstI16-4 2.10ns ± 0% 2.10ns ± 1% ~ (p=0.571 n=4+5) DivisibleWDivconstI16-4 2.22ns ± 0% 2.23ns ± 1% ~ (p=0.556 n=4+5) DivconstU16-4 1.09ns ± 0% 1.01ns ± 1% -7.74% (p=0.000 n=4+5) DivisibleconstU16-4 1.83ns ± 0% 1.26ns ± 0% -31.52% (p=0.008 n=5+5) DivisibleWDivconstU16-4 1.88ns ± 0% 1.89ns ± 1% ~ (p=0.365 n=5+5) DivconstI8-4 1.54ns ± 1% 1.54ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstI8-4 2.10ns ± 0% 2.11ns ± 0% ~ (p=0.238 n=5+4) DivisibleWDivconstI8-4 2.22ns ± 0% 2.23ns ± 2% ~ (p=0.762 n=5+5) DivconstU8-4 0.92ns ± 1% 0.94ns ± 1% +2.65% (p=0.008 n=5+5) DivisibleconstU8-4 1.66ns ± 0% 1.26ns ± 1% -24.28% (p=0.008 n=5+5) DivisibleWDivconstU8-4 1.79ns ± 0% 1.80ns ± 1% ~ (p=0.079 n=4+5) A follow-up change will address the signed division case. Updates #30282 Change-Id: I7e995f167179aa5c76bb10fbcbeb49c520943403 Reviewed-on: https://go-review.googlesource.com/c/go/+/168037 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-03-09 21:58:16 -07:00
func BenchmarkDivisibleconstI16(b *testing.B) {
for i := 0; i < b.N; i++ {
boolres = int16(i)%7 == 0
}
}
func BenchmarkDivisibleWDivconstI16(b *testing.B) {
for i := 0; i < b.N; i++ {
i16res = int16(i) / 7
boolres = int16(i)%7 == 0
}
}
var u16res uint16
func BenchmarkDivconstU16(b *testing.B) {
for i := 0; i < b.N; i++ {
u16res = uint16(i) / 7
}
}
cmd/compile: add unsigned divisibility rules "Division by invariant integers using multiplication" paper by Granlund and Montgomery contains a method for directly computing divisibility (x%c == 0 for c constant) by means of the modular inverse. The method is further elaborated in "Hacker's Delight" by Warren Section 10-17 This general rule can compute divisibilty by one multiplication and a compare for odd divisors and an additional rotate for even divisors. To apply the divisibility rule, we must take into account the rules to rewrite x%c = x-((x/c)*c) and (x/c) for c constant on the first optimization pass "opt". This complicates the matching as we want to match only in the cases where the result of (x/c) is not also available. So, we must match on the expanded form of (x/c) in the expression x == c*(x/c) in the "late opt" pass after common subexpresion elimination. Note, that if there is an intermediate opt pass introduced in the future we could simplify these rules by delaying the magic division rewrite to "late opt" and matching directly on (x/c) in the intermediate opt pass. Additional rules to lower the generic RotateLeft* ops were also applied. On amd64, the divisibility check is 25-50% faster. name old time/op new time/op delta DivconstI64-4 2.08ns ± 0% 2.08ns ± 1% ~ (p=0.881 n=5+5) DivisibleconstI64-4 2.67ns ± 0% 2.67ns ± 1% ~ (p=1.000 n=5+5) DivisibleWDivconstI64-4 2.67ns ± 0% 2.67ns ± 0% ~ (p=0.683 n=5+5) DivconstU64-4 2.08ns ± 1% 2.08ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstU64-4 2.77ns ± 1% 1.55ns ± 2% -43.90% (p=0.008 n=5+5) DivisibleWDivconstU64-4 2.99ns ± 1% 2.99ns ± 1% ~ (p=1.000 n=5+5) DivconstI32-4 1.53ns ± 2% 1.53ns ± 0% ~ (p=1.000 n=5+5) DivisibleconstI32-4 2.23ns ± 0% 2.25ns ± 3% ~ (p=0.167 n=5+5) DivisibleWDivconstI32-4 2.27ns ± 1% 2.27ns ± 1% ~ (p=0.429 n=5+5) DivconstU32-4 1.78ns ± 0% 1.78ns ± 1% ~ (p=1.000 n=4+5) DivisibleconstU32-4 2.52ns ± 2% 1.26ns ± 0% -49.96% (p=0.000 n=5+4) DivisibleWDivconstU32-4 2.63ns ± 0% 2.85ns ±10% +8.29% (p=0.016 n=4+5) DivconstI16-4 1.54ns ± 0% 1.54ns ± 0% ~ (p=0.333 n=4+5) DivisibleconstI16-4 2.10ns ± 0% 2.10ns ± 1% ~ (p=0.571 n=4+5) DivisibleWDivconstI16-4 2.22ns ± 0% 2.23ns ± 1% ~ (p=0.556 n=4+5) DivconstU16-4 1.09ns ± 0% 1.01ns ± 1% -7.74% (p=0.000 n=4+5) DivisibleconstU16-4 1.83ns ± 0% 1.26ns ± 0% -31.52% (p=0.008 n=5+5) DivisibleWDivconstU16-4 1.88ns ± 0% 1.89ns ± 1% ~ (p=0.365 n=5+5) DivconstI8-4 1.54ns ± 1% 1.54ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstI8-4 2.10ns ± 0% 2.11ns ± 0% ~ (p=0.238 n=5+4) DivisibleWDivconstI8-4 2.22ns ± 0% 2.23ns ± 2% ~ (p=0.762 n=5+5) DivconstU8-4 0.92ns ± 1% 0.94ns ± 1% +2.65% (p=0.008 n=5+5) DivisibleconstU8-4 1.66ns ± 0% 1.26ns ± 1% -24.28% (p=0.008 n=5+5) DivisibleWDivconstU8-4 1.79ns ± 0% 1.80ns ± 1% ~ (p=0.079 n=4+5) A follow-up change will address the signed division case. Updates #30282 Change-Id: I7e995f167179aa5c76bb10fbcbeb49c520943403 Reviewed-on: https://go-review.googlesource.com/c/go/+/168037 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-03-09 21:58:16 -07:00
func BenchmarkModconstU16(b *testing.B) {
for i := 0; i < b.N; i++ {
u16res = uint16(i) % 7
}
}
func BenchmarkDivisibleconstU16(b *testing.B) {
for i := 0; i < b.N; i++ {
boolres = uint16(i)%7 == 0
}
}
func BenchmarkDivisibleWDivconstU16(b *testing.B) {
for i := 0; i < b.N; i++ {
u16res = uint16(i) / 7
boolres = uint16(i)%7 == 0
}
}
var i8res int8
func BenchmarkDivconstI8(b *testing.B) {
for i := 0; i < b.N; i++ {
i8res = int8(i) / 7
}
}
cmd/compile: add unsigned divisibility rules "Division by invariant integers using multiplication" paper by Granlund and Montgomery contains a method for directly computing divisibility (x%c == 0 for c constant) by means of the modular inverse. The method is further elaborated in "Hacker's Delight" by Warren Section 10-17 This general rule can compute divisibilty by one multiplication and a compare for odd divisors and an additional rotate for even divisors. To apply the divisibility rule, we must take into account the rules to rewrite x%c = x-((x/c)*c) and (x/c) for c constant on the first optimization pass "opt". This complicates the matching as we want to match only in the cases where the result of (x/c) is not also available. So, we must match on the expanded form of (x/c) in the expression x == c*(x/c) in the "late opt" pass after common subexpresion elimination. Note, that if there is an intermediate opt pass introduced in the future we could simplify these rules by delaying the magic division rewrite to "late opt" and matching directly on (x/c) in the intermediate opt pass. Additional rules to lower the generic RotateLeft* ops were also applied. On amd64, the divisibility check is 25-50% faster. name old time/op new time/op delta DivconstI64-4 2.08ns ± 0% 2.08ns ± 1% ~ (p=0.881 n=5+5) DivisibleconstI64-4 2.67ns ± 0% 2.67ns ± 1% ~ (p=1.000 n=5+5) DivisibleWDivconstI64-4 2.67ns ± 0% 2.67ns ± 0% ~ (p=0.683 n=5+5) DivconstU64-4 2.08ns ± 1% 2.08ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstU64-4 2.77ns ± 1% 1.55ns ± 2% -43.90% (p=0.008 n=5+5) DivisibleWDivconstU64-4 2.99ns ± 1% 2.99ns ± 1% ~ (p=1.000 n=5+5) DivconstI32-4 1.53ns ± 2% 1.53ns ± 0% ~ (p=1.000 n=5+5) DivisibleconstI32-4 2.23ns ± 0% 2.25ns ± 3% ~ (p=0.167 n=5+5) DivisibleWDivconstI32-4 2.27ns ± 1% 2.27ns ± 1% ~ (p=0.429 n=5+5) DivconstU32-4 1.78ns ± 0% 1.78ns ± 1% ~ (p=1.000 n=4+5) DivisibleconstU32-4 2.52ns ± 2% 1.26ns ± 0% -49.96% (p=0.000 n=5+4) DivisibleWDivconstU32-4 2.63ns ± 0% 2.85ns ±10% +8.29% (p=0.016 n=4+5) DivconstI16-4 1.54ns ± 0% 1.54ns ± 0% ~ (p=0.333 n=4+5) DivisibleconstI16-4 2.10ns ± 0% 2.10ns ± 1% ~ (p=0.571 n=4+5) DivisibleWDivconstI16-4 2.22ns ± 0% 2.23ns ± 1% ~ (p=0.556 n=4+5) DivconstU16-4 1.09ns ± 0% 1.01ns ± 1% -7.74% (p=0.000 n=4+5) DivisibleconstU16-4 1.83ns ± 0% 1.26ns ± 0% -31.52% (p=0.008 n=5+5) DivisibleWDivconstU16-4 1.88ns ± 0% 1.89ns ± 1% ~ (p=0.365 n=5+5) DivconstI8-4 1.54ns ± 1% 1.54ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstI8-4 2.10ns ± 0% 2.11ns ± 0% ~ (p=0.238 n=5+4) DivisibleWDivconstI8-4 2.22ns ± 0% 2.23ns ± 2% ~ (p=0.762 n=5+5) DivconstU8-4 0.92ns ± 1% 0.94ns ± 1% +2.65% (p=0.008 n=5+5) DivisibleconstU8-4 1.66ns ± 0% 1.26ns ± 1% -24.28% (p=0.008 n=5+5) DivisibleWDivconstU8-4 1.79ns ± 0% 1.80ns ± 1% ~ (p=0.079 n=4+5) A follow-up change will address the signed division case. Updates #30282 Change-Id: I7e995f167179aa5c76bb10fbcbeb49c520943403 Reviewed-on: https://go-review.googlesource.com/c/go/+/168037 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-03-09 21:58:16 -07:00
func BenchmarkModconstI8(b *testing.B) {
for i := 0; i < b.N; i++ {
i8res = int8(i) % 7
}
}
cmd/compile: add signed divisibility by power of 2 rules For powers of two (c=1<<k), the divisibility check x%c == 0 can be made just by checking the trailing zeroes via a mask x&(c-1) == 0 even for signed integers. This avoids division fix-ups when just divisibility check is needed. To apply this rule, we match on the fixed-up version of the division. This is neccessary because the mod and division rewrite rules are already applied during the initial opt pass. The speed up on amd64 due to elimination of unneccessary fix-up code is ~55%: name old time/op new time/op delta DivconstI64-4 2.08ns ± 0% 2.09ns ± 1% ~ (p=0.730 n=5+5) DivisiblePow2constI64-4 1.78ns ± 1% 0.81ns ± 1% -54.66% (p=0.008 n=5+5) DivconstU64-4 2.08ns ± 0% 2.08ns ± 0% ~ (p=0.683 n=5+5) DivconstI32-4 1.53ns ± 0% 1.53ns ± 1% ~ (p=0.968 n=4+5) DivisiblePow2constI32-4 1.79ns ± 1% 0.81ns ± 1% -54.97% (p=0.008 n=5+5) DivconstU32-4 1.78ns ± 1% 1.80ns ± 2% ~ (p=0.206 n=5+5) DivconstI16-4 1.54ns ± 2% 1.54ns ± 0% ~ (p=0.238 n=5+4) DivisiblePow2constI16-4 1.78ns ± 0% 0.81ns ± 1% -54.72% (p=0.000 n=4+5) DivconstU16-4 1.00ns ± 5% 1.01ns ± 1% ~ (p=0.119 n=5+5) DivconstI8-4 1.54ns ± 0% 1.54ns ± 2% ~ (p=0.571 n=4+5) DivisiblePow2constI8-4 1.78ns ± 0% 0.82ns ± 8% -53.71% (p=0.008 n=5+5) DivconstU8-4 0.93ns ± 1% 0.93ns ± 1% ~ (p=0.643 n=5+5) A follow-up CL will address the general case of x%c == 0 for signed integers. Updates #15806 Change-Id: Iabadbbe369b6e0998c8ce85d038ebc236142e42a Reviewed-on: https://go-review.googlesource.com/c/go/+/173557 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-04-23 22:04:38 -06:00
func BenchmarkDivisiblePow2constI8(b *testing.B) {
for i := 0; i < b.N; i++ {
boolres = int8(i)%16 == 0
}
}
cmd/compile: add unsigned divisibility rules "Division by invariant integers using multiplication" paper by Granlund and Montgomery contains a method for directly computing divisibility (x%c == 0 for c constant) by means of the modular inverse. The method is further elaborated in "Hacker's Delight" by Warren Section 10-17 This general rule can compute divisibilty by one multiplication and a compare for odd divisors and an additional rotate for even divisors. To apply the divisibility rule, we must take into account the rules to rewrite x%c = x-((x/c)*c) and (x/c) for c constant on the first optimization pass "opt". This complicates the matching as we want to match only in the cases where the result of (x/c) is not also available. So, we must match on the expanded form of (x/c) in the expression x == c*(x/c) in the "late opt" pass after common subexpresion elimination. Note, that if there is an intermediate opt pass introduced in the future we could simplify these rules by delaying the magic division rewrite to "late opt" and matching directly on (x/c) in the intermediate opt pass. Additional rules to lower the generic RotateLeft* ops were also applied. On amd64, the divisibility check is 25-50% faster. name old time/op new time/op delta DivconstI64-4 2.08ns ± 0% 2.08ns ± 1% ~ (p=0.881 n=5+5) DivisibleconstI64-4 2.67ns ± 0% 2.67ns ± 1% ~ (p=1.000 n=5+5) DivisibleWDivconstI64-4 2.67ns ± 0% 2.67ns ± 0% ~ (p=0.683 n=5+5) DivconstU64-4 2.08ns ± 1% 2.08ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstU64-4 2.77ns ± 1% 1.55ns ± 2% -43.90% (p=0.008 n=5+5) DivisibleWDivconstU64-4 2.99ns ± 1% 2.99ns ± 1% ~ (p=1.000 n=5+5) DivconstI32-4 1.53ns ± 2% 1.53ns ± 0% ~ (p=1.000 n=5+5) DivisibleconstI32-4 2.23ns ± 0% 2.25ns ± 3% ~ (p=0.167 n=5+5) DivisibleWDivconstI32-4 2.27ns ± 1% 2.27ns ± 1% ~ (p=0.429 n=5+5) DivconstU32-4 1.78ns ± 0% 1.78ns ± 1% ~ (p=1.000 n=4+5) DivisibleconstU32-4 2.52ns ± 2% 1.26ns ± 0% -49.96% (p=0.000 n=5+4) DivisibleWDivconstU32-4 2.63ns ± 0% 2.85ns ±10% +8.29% (p=0.016 n=4+5) DivconstI16-4 1.54ns ± 0% 1.54ns ± 0% ~ (p=0.333 n=4+5) DivisibleconstI16-4 2.10ns ± 0% 2.10ns ± 1% ~ (p=0.571 n=4+5) DivisibleWDivconstI16-4 2.22ns ± 0% 2.23ns ± 1% ~ (p=0.556 n=4+5) DivconstU16-4 1.09ns ± 0% 1.01ns ± 1% -7.74% (p=0.000 n=4+5) DivisibleconstU16-4 1.83ns ± 0% 1.26ns ± 0% -31.52% (p=0.008 n=5+5) DivisibleWDivconstU16-4 1.88ns ± 0% 1.89ns ± 1% ~ (p=0.365 n=5+5) DivconstI8-4 1.54ns ± 1% 1.54ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstI8-4 2.10ns ± 0% 2.11ns ± 0% ~ (p=0.238 n=5+4) DivisibleWDivconstI8-4 2.22ns ± 0% 2.23ns ± 2% ~ (p=0.762 n=5+5) DivconstU8-4 0.92ns ± 1% 0.94ns ± 1% +2.65% (p=0.008 n=5+5) DivisibleconstU8-4 1.66ns ± 0% 1.26ns ± 1% -24.28% (p=0.008 n=5+5) DivisibleWDivconstU8-4 1.79ns ± 0% 1.80ns ± 1% ~ (p=0.079 n=4+5) A follow-up change will address the signed division case. Updates #30282 Change-Id: I7e995f167179aa5c76bb10fbcbeb49c520943403 Reviewed-on: https://go-review.googlesource.com/c/go/+/168037 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-03-09 21:58:16 -07:00
func BenchmarkDivisibleconstI8(b *testing.B) {
for i := 0; i < b.N; i++ {
boolres = int8(i)%7 == 0
}
}
func BenchmarkDivisibleWDivconstI8(b *testing.B) {
for i := 0; i < b.N; i++ {
i8res = int8(i) / 7
boolres = int8(i)%7 == 0
}
}
var u8res uint8
func BenchmarkDivconstU8(b *testing.B) {
for i := 0; i < b.N; i++ {
u8res = uint8(i) / 7
}
}
cmd/compile: add unsigned divisibility rules "Division by invariant integers using multiplication" paper by Granlund and Montgomery contains a method for directly computing divisibility (x%c == 0 for c constant) by means of the modular inverse. The method is further elaborated in "Hacker's Delight" by Warren Section 10-17 This general rule can compute divisibilty by one multiplication and a compare for odd divisors and an additional rotate for even divisors. To apply the divisibility rule, we must take into account the rules to rewrite x%c = x-((x/c)*c) and (x/c) for c constant on the first optimization pass "opt". This complicates the matching as we want to match only in the cases where the result of (x/c) is not also available. So, we must match on the expanded form of (x/c) in the expression x == c*(x/c) in the "late opt" pass after common subexpresion elimination. Note, that if there is an intermediate opt pass introduced in the future we could simplify these rules by delaying the magic division rewrite to "late opt" and matching directly on (x/c) in the intermediate opt pass. Additional rules to lower the generic RotateLeft* ops were also applied. On amd64, the divisibility check is 25-50% faster. name old time/op new time/op delta DivconstI64-4 2.08ns ± 0% 2.08ns ± 1% ~ (p=0.881 n=5+5) DivisibleconstI64-4 2.67ns ± 0% 2.67ns ± 1% ~ (p=1.000 n=5+5) DivisibleWDivconstI64-4 2.67ns ± 0% 2.67ns ± 0% ~ (p=0.683 n=5+5) DivconstU64-4 2.08ns ± 1% 2.08ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstU64-4 2.77ns ± 1% 1.55ns ± 2% -43.90% (p=0.008 n=5+5) DivisibleWDivconstU64-4 2.99ns ± 1% 2.99ns ± 1% ~ (p=1.000 n=5+5) DivconstI32-4 1.53ns ± 2% 1.53ns ± 0% ~ (p=1.000 n=5+5) DivisibleconstI32-4 2.23ns ± 0% 2.25ns ± 3% ~ (p=0.167 n=5+5) DivisibleWDivconstI32-4 2.27ns ± 1% 2.27ns ± 1% ~ (p=0.429 n=5+5) DivconstU32-4 1.78ns ± 0% 1.78ns ± 1% ~ (p=1.000 n=4+5) DivisibleconstU32-4 2.52ns ± 2% 1.26ns ± 0% -49.96% (p=0.000 n=5+4) DivisibleWDivconstU32-4 2.63ns ± 0% 2.85ns ±10% +8.29% (p=0.016 n=4+5) DivconstI16-4 1.54ns ± 0% 1.54ns ± 0% ~ (p=0.333 n=4+5) DivisibleconstI16-4 2.10ns ± 0% 2.10ns ± 1% ~ (p=0.571 n=4+5) DivisibleWDivconstI16-4 2.22ns ± 0% 2.23ns ± 1% ~ (p=0.556 n=4+5) DivconstU16-4 1.09ns ± 0% 1.01ns ± 1% -7.74% (p=0.000 n=4+5) DivisibleconstU16-4 1.83ns ± 0% 1.26ns ± 0% -31.52% (p=0.008 n=5+5) DivisibleWDivconstU16-4 1.88ns ± 0% 1.89ns ± 1% ~ (p=0.365 n=5+5) DivconstI8-4 1.54ns ± 1% 1.54ns ± 1% ~ (p=1.000 n=5+5) DivisibleconstI8-4 2.10ns ± 0% 2.11ns ± 0% ~ (p=0.238 n=5+4) DivisibleWDivconstI8-4 2.22ns ± 0% 2.23ns ± 2% ~ (p=0.762 n=5+5) DivconstU8-4 0.92ns ± 1% 0.94ns ± 1% +2.65% (p=0.008 n=5+5) DivisibleconstU8-4 1.66ns ± 0% 1.26ns ± 1% -24.28% (p=0.008 n=5+5) DivisibleWDivconstU8-4 1.79ns ± 0% 1.80ns ± 1% ~ (p=0.079 n=4+5) A follow-up change will address the signed division case. Updates #30282 Change-Id: I7e995f167179aa5c76bb10fbcbeb49c520943403 Reviewed-on: https://go-review.googlesource.com/c/go/+/168037 Run-TryBot: Brian Kessler <brian.m.kessler@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Keith Randall <khr@golang.org>
2019-03-09 21:58:16 -07:00
func BenchmarkModconstU8(b *testing.B) {
for i := 0; i < b.N; i++ {
u8res = uint8(i) % 7
}
}
func BenchmarkDivisibleconstU8(b *testing.B) {
for i := 0; i < b.N; i++ {
boolres = uint8(i)%7 == 0
}
}
func BenchmarkDivisibleWDivconstU8(b *testing.B) {
for i := 0; i < b.N; i++ {
u8res = uint8(i) / 7
boolres = uint8(i)%7 == 0
}
}