cmd/compile: intrinsics for math/bits.OnesCount

Popcount instructions on amd64 are not guaranteed to be
present, so their use must be guarded by a runtime check.
Rewrite rules can't currently generate control flow, so the
intrinsifier needs to generate that code itself.
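
As a rough source-level illustration of the guard described above, the
sketch below branches on a feature flag and falls back to a portable
bit-counting loop. The names hasPopcnt, softwarePopcount and
guardedOnesCount64 are hypothetical and exist only for this example; the
compiler emits the equivalent branch in SSA form against a runtime flag
(the tests below look for "support_popcnt"), not as Go source.

```go
package main

import (
	"fmt"
	"math/bits"
)

// hasPopcnt is a hypothetical stand-in for the runtime's CPU feature
// flag. It is an assumption for illustration, not the real variable.
var hasPopcnt = false

// softwarePopcount is a portable fallback: it clears the lowest set
// bit on each iteration and counts how many iterations are needed.
func softwarePopcount(x uint64) int {
	n := 0
	for x != 0 {
		x &= x - 1
		n++
	}
	return n
}

// guardedOnesCount64 mirrors the shape of the guarded code: use the
// fast path only when the feature flag says it is available.
func guardedOnesCount64(x uint64) int {
	if hasPopcnt {
		// Stands in for the hardware POPCNTQ instruction.
		return bits.OnesCount64(x)
	}
	return softwarePopcount(x)
}

func main() {
	fmt.Println(guardedOnesCount64(0xF0F0))
}
```

Both paths must return the same count for every input; only the cost
differs, which is what the benchmark numbers below measure.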

name           old time/op  new time/op  delta
OnesCount-8    2.47ns ± 5%  1.04ns ± 2%  -57.70%  (p=0.000 n=10+10)
OnesCount16-8  1.05ns ± 1%  0.78ns ± 0%  -25.56%    (p=0.000 n=9+8)
OnesCount32-8  1.63ns ± 5%  1.04ns ± 2%  -35.96%  (p=0.000 n=10+10)
OnesCount64-8  2.45ns ± 0%  1.04ns ± 1%  -57.55%   (p=0.000 n=6+10)

Update #18616

Change-Id: I4aff2cc9aa93787898d7b22055fe272a7cf95673
Reviewed-on: https://go-review.googlesource.com/38320
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Keith Randall 2017-03-16 21:33:03 -07:00 committed by Keith Randall
parent 59f6549d1c
commit 5cadc91b3c
10 changed files with 228 additions and 0 deletions

@@ -699,6 +699,34 @@ var linuxAMD64Tests = []*asmTest{
`,
[]string{"\tBSRQ\t"},
},
{
`
func pop1(x uint64) int {
return bits.OnesCount64(x)
}`,
[]string{"\tPOPCNTQ\t", "support_popcnt"},
},
{
`
func pop2(x uint32) int {
return bits.OnesCount32(x)
}`,
[]string{"\tPOPCNTL\t", "support_popcnt"},
},
{
`
func pop3(x uint16) int {
return bits.OnesCount16(x)
}`,
[]string{"\tPOPCNTL\t", "support_popcnt"},
},
{
`
func pop4(x uint) int {
return bits.OnesCount(x)
}`,
[]string{"\tPOPCNTQ\t", "support_popcnt"},
},
// see issue 19595.
// We want to merge load+op in f58, but not in f59.
{